[this announcement is available online at https://s.apache.org/d30v9 ]
Open Source enterprise-grade Big Data distributed database powers
mission-critical deployments with improved performance and unparalleled levels
of scale in the Cloud
Wilmington, DE —27 July 2021— The Apache Cassandra Project released today v4.0
of Apache® Cassandra™, the Open Source, highly performant, distributed Big Data
database management platform.
"A long time coming, Cassandra 4.0 is the most thoroughly tested Cassandra
yet," said Nate McCall, Vice President of Apache Cassandra. "The latest version
is faster, more scalable, and bolstered with enterprise security features,
ready-for-production with unprecedented scale in the Cloud."
As a NoSQL database, Apache Cassandra handles massive amounts of data across
load-intensive applications with high availability and no single point of
failure. Cassandra’s largest production deployments include Apple (more than
160,000 instances storing over 100 petabytes of data across 1,000+ clusters),
Huawei (more than 30,000 instances across 300+ clusters), and Netflix (more
than 10,000 instances storing 6 petabytes across 100+ clusters, with over 1
trillion requests per day), among many others. Cassandra originated at Facebook
in 2008, entered the Apache Incubator in January 2009, and graduated as an
Apache Top-Level Project in February 2010.
Apache Cassandra v4.0
Cassandra v4.0 effortlessly handles unstructured data, with thousands of writes
per second. Three years in the making, v4.0 reflects more than 1,000 bug fixes,
improvements, and new features that include:
- Increased speed and scalability – streams data up to 5 times faster during
scaling operations and delivers up to 25% higher throughput on reads and
writes, resulting in a more elastic architecture, particularly in Cloud and
Kubernetes deployments.
- Improved consistency – keeps data replicas in sync to optimize incremental
repair for faster, more efficient operation and consistency across data
replicas.
- Enhanced security and observability – audit logging tracks user access and
activity with minimal impact to workload performance. New capture and replay
enables analysis of production workloads to help ensure regulatory and security
compliance with SOX, PCI, GDPR, or other requirements.
- New configuration settings – exposed system metrics and configuration
settings provide flexibility for operators to ensure they have easy access to
data that optimize deployments.
- Minimized latency – garbage collector pause times are reduced to a few
milliseconds with no latency degradation as heap sizes increase.
- Better compression – improved compression efficiency eases unnecessary
strain on disk space and improves read performance.
Cassandra 4.0 is community-hardened and tested by Amazon, Apple, DataStax,
Instaclustr, iland, Netflix, and others that routinely run clusters as large as
1,000 nodes, with hundreds of real-world use cases and schemas.
The Apache Cassandra community applied several testing and quality assurance
(QA) projects and methodologies to deliver the most stable release yet. During
the testing and QA period, the community generated reproducible workloads that
are as close to real-life as possible, while effectively verifying the cluster
state against the model without pausing the workload itself.
"In our experience, nothing beats Apache Cassandra for write scaling, and we're
looking forward to the performance and management improvements in the 4.0
release," said Elliott Sims, Senior Systems Administrator at Backblaze. "We
rely on Cassandra to manage over one exabyte of customer data and serve over 50
billion files for our customers across 175 countries, so optimizing Cassandra's
capabilities and performance means a lot to us."
"Since 2016, software engineers at Bloomberg have turned to Apache Cassandra
because it’s easy to use, easy to scale, and always available," said Isaac
Reath, Software Engineering Team Lead, NoSQL Infrastructure at Bloomberg.
"Today, Cassandra is used to support a variety of our applications, from
low-latency storage of intraday financial market data to high-throughput
storage for fixed income index publication. We serve up more than 20 billion
requests per day on a nearly 1 PB dataset across a fleet of 1,700+ Cassandra
nodes."
"Netflix uses Apache Cassandra heavily to satisfy its ever-growing persistence
needs on its mission to entertain the world. We have been experimenting and
partially using the 4.0 beta in our environments and its features like Audit
Logging and backpressure," said Vinay Chella, Netflix Engineering Manager and
Apache Cassandra Committer. "Apache Cassandra 4.0's improved performance helps
us reduce infrastructure costs. 4.0's stability and correctness allow us to
focus on building higher-level abstractions on top of data store compositions,
which results in increased developer velocity and optimized data