Apache Cassandra

What is Apache Cassandra:

Apache Cassandra is an open-source, decentralized, distributed storage system for managing very large amounts of data spread across servers around the world. It is designed to provide a highly available service with no single point of failure. Some of the notable points of Apache Cassandra are:

  • Cassandra is scalable and fault-tolerant, and its consistency is tunable: each read or write can trade consistency against availability and latency.

  • It is a partitioned wide-column store, often loosely described as a column-oriented database.

  • Its data model is based on Google’s Bigtable.

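  • It differs from relational database management systems: for example, it has no joins or foreign keys, and tables are designed around the queries they serve.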

  • It implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model.

  • It is used by some of the biggest companies, such as Facebook, Cisco, Rackspace, eBay, Twitter, Netflix, and more.
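To make the Dynamo-style replication model more concrete, here is a minimal sketch of how a key can be placed on a hash ring and copied to several nodes. It is an illustration only, not Cassandra's implementation: the `Ring` class, the `token` helper, the node names, and the use of MD5 are all assumptions made for this example, and real clusters use a different default hash and assign many tokens to each node.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical hash ring: one token per node, replicas chosen clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by token so the ring can be walked in order.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key):
        """Return the nodes that would hold copies of the given key."""
        # First node whose token is greater than the key's token (wraps around).
        start = bisect.bisect_right(self.tokens, token(partition_key))
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)  # three distinct nodes, the same ones on every run
```

Because every node can compute the same placement from the key alone, no central coordinator is needed to find the data, which is what removes the single point of failure.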

Features of Cassandra:

Cassandra has become popular because of its technical features. Some of these features are:

  • Elastic scalability - You can add more hardware to accommodate more customers and more data as required.

  • Always-on architecture - Cassandra has no single point of failure, so it stays available for business-critical applications that cannot afford downtime.

  • Fast linear-scale performance - Cassandra scales close to linearly, which means throughput increases as you add nodes to the cluster. This helps it maintain a quick response time as load grows.

  • Flexible data storage - It accommodates structured, semi-structured, and unstructured data. It can also adapt to changes in your data structures as your needs evolve.

  • Easy data distribution - Cassandra lets you distribute data where you need it by replicating data across multiple data centers.

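  • Transaction support - Cassandra does not provide full ACID (Atomicity, Consistency, Isolation, and Durability) transactions in the relational sense. Writes are atomic and isolated within a single partition and are durable, consistency is tunable per operation, and lightweight transactions offer compare-and-set updates.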

  • Fast writes - Cassandra was designed to run on cheap commodity hardware. It performs fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.

Components of Cassandra:

The key components of Cassandra are:

  • Node - It is the place where data is stored.

  • Data center - It is a collection of related nodes.

  • Commit log - The commit log is a crash-recovery mechanism in Cassandra. Every write operation is appended to the commit log.

  • Cluster - A cluster is a component that contains one or more data centers.

  • Mem-table - A mem-table is a memory-resident data structure. After a write is recorded in the commit log, the data is written to the mem-table. Sometimes a single column family will have multiple mem-tables.

  • SSTable - It is a disk file to which data is flushed from the mem-table when its contents reach a threshold value. SSTables are immutable once written.

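  • Bloom filter - A Bloom filter is a fast, probabilistic data structure for testing whether an element is a member of a set; it can return false positives but never false negatives. It acts as a special kind of cache: Cassandra consults it on reads to skip SSTables that cannot contain the requested data.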

  • Compaction - Compaction is the process of freeing up space by merging large accumulated data files. During compaction, the data is merged, indexed, sorted, and stored in a new SSTable. It also reduces the number of required seeks.
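The way the commit log, mem-table, SSTables, Bloom filter, and compaction fit together can be sketched with a toy in-memory model. This is a simplified illustration under stated assumptions, not Cassandra's storage engine: the class names (`ToyStore`, `SSTable`, `BloomFilter`) are hypothetical, the "disk" is a Python list, the flush threshold counts entries rather than bytes, and deletes are left out.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Immutable, key-sorted snapshot of a flushed mem-table."""

    def __init__(self, rows):
        self.rows = dict(sorted(rows.items()))
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


class ToyStore:
    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.commit_log = []  # stands in for the on-disk commit log
        self.memtable = {}
        self.sstables = []    # oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the write for recovery
        self.memtable[key] = value            # 2. then update the mem-table
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. flush once the threshold is hit

    def flush(self):
        self.sstables.append(SSTable(self.memtable))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest data wins
            # The Bloom filter lets us skip tables that cannot hold the key.
            if table.bloom.might_contain(key) and key in table.rows:
                return table.rows[key]
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table.rows)
        self.sstables = [SSTable(merged)] if merged else []


store = ToyStore(flush_threshold=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
    store.write(k, v)
tables_before = len(store.sstables)
store.compact()
```

In this run, two flushes produce two SSTables that both contain the key "a"; after `compact()` there is one SSTable holding only the newest value, so a read has fewer files to check.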

