Hello Apache DataSketches PMC and Community,

This is a call for vote to release Apache DataSketches-java candidate
version: 9.0.0-RC1

   - This is the core Java component of the DataSketches library that
   includes all the sketch algorithms in production-ready packages. These
   sketches can be called directly from this component or used in conjunction
   with the adaptor components such as Hadoop Pig, Hadoop Hive, or the
   aggregator adaptors built into Apache Druid.

Major changes with this release:

This release is a major release where we took the opportunity to do some
significant refactoring that will constitute incompatible changes from
previous releases.  Any incompatibility with prior releases is always an
inconvenience to users who wish to just upgrade to the latest release and
run.  However, some of the code in this library was written in 2013, and
the Java language has evolved enormously since then.  We chose to
use this major release as the opportunity to modernize some of the code to
achieve the following goals:

*Remove the dependency on the DataSketches-Memory component and use FFM
instead.*

   - The DataSketches-Memory component was originally developed in 2014 to
   address the need for fast access to off-heap memory data structures and
   used Unsafe and other JVM internals as there were no satisfactory Java
   language features to do this at the time.
   - The FFM (Foreign Function & Memory API) capabilities, introduced into
   the language in Java 22, are now part of the Java 25 LTS release, which we
   support. Since the capabilities
   of FFM are a superset of the original DataSketches-Memory component, it
   made sense to rewrite the code to eliminate the dependency on
   DataSketches-Memory and use FFM instead.  This impacted code across the
   entire library.
   - This provided several advantages to the code base. By removing this
   dependency on DataSketches-Memory, there are now no runtime dependencies!
   This should make integrating this library into other Java systems much
   simpler. Since FFM is tightly integrated into the Java language, it has
   improved performance, especially with bulk operations.
   - As an added note: There are numerous other improvements to the Java
   language that we could perhaps take advantage of in a rewrite, e.g.,
   Records, text blocks, switch expressions, sealed, var, modules, patterns,
   etc.  However, faced with the risk of accidentally creating bugs due to too
   many changes at one time, we focused on FFM, which actually improved
   performance as opposed to just creating syntactic sugar.
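
As a rough illustration of the kind of off-heap access FFM provides, here is
a minimal sketch that uses only the standard java.lang.foreign API (Java 22
or later). The class name FfmOffHeapExample and the method sumOffHeap are
hypothetical and are not taken from the DataSketches code base; the example
only shows a bulk copy into native memory followed by typed reads.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class; not part of the DataSketches library.
public class FfmOffHeapExample {

  // Copies the values into off-heap memory, then reads them back and sums them.
  static long sumOffHeap(long[] values) {
    // A confined arena releases its native memory deterministically on close.
    try (Arena arena = Arena.ofConfined()) {
      // Allocate room for values.length longs outside the Java heap.
      MemorySegment seg = arena.allocate(ValueLayout.JAVA_LONG, values.length);

      // Bulk copy from the heap array into the off-heap segment.
      MemorySegment.copy(values, 0, seg, ValueLayout.JAVA_LONG, 0L, values.length);

      long sum = 0L;
      for (int i = 0; i < values.length; i++) {
        // Bounds-checked, typed read at element index i.
        sum += seg.getAtIndex(ValueLayout.JAVA_LONG, i);
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    System.out.println(sumOffHeap(new long[] {1L, 2L, 3L}));
  }
}
```

Unlike Unsafe-based access, every read and write here is bounds-checked, and
the segment becomes inaccessible once the arena is closed.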

*Align public sketch class names so that the sketch family name is part of
the class name.*

   - For example, the Theta sketch family was the first family written for
   the library and its base class was called *Sketch*.  The Tuple sketch
   family evolved soon after and its base class was also called *Sketch*.
   If a user wanted to use both the Theta and Tuple families in the same class
   one of them had to be fully qualified every time it was referenced.
   - Unfortunately, this style propagated to some of the other early
   sketch families, where we ended up with two different sketch families
   each having an *ItemsSketch*, etc. For the more recent additions to the
   library we started including the sketch family name in all the relevant
   sketch-like public classes of a sketch family.
   - In this release we have refactored these older sketches with new names
   that now include the sketch family name.  This is an incompatible change
   for user code moving from earlier releases, but this can be readily fixed
   with search-and-replace tools. The naming in this release is not perfect,
   but it is hopefully more consistent across all the different sketch
   families.

Known Issues:

*SpotBugs*

   - Make sure you configure SpotBugs with the
   /tools/FindBugsExcludeFilter.xml file. Otherwise, you may get a lot of
   false-positive or low-risk issues that we have already examined and
   suppressed with this exclusion file.

*Checkstyle*

   - At the time of this writing, Checkstyle had not been upgraded to
   handle Java 25 features.

References for this release:

*Source repository:  *
https://github.com/apache/datasketches-java

*Git Tag for this release: *
https://github.com/apache/datasketches-java/releases/tag/9.0.0-RC1  on
branch 9.0.X

*Git HashId for this release starts with: *
f3b334b on branch 9.0.X

*The Release Candidate / Zip Repository: *
https://dist.apache.org/repos/dist/dev/datasketches/java/9.0.0-RC1

*The public signing key can be found in the KEYS file: *
https://dist.apache.org/repos/dist/dev/datasketches/KEYS

*The artifacts have been signed with --keyid-format SHORT:*
8CD4A902

*Repository: Maven Central [Nexus](http://repository.apache.org) (Jar Artifacts):*
https://repository.apache.org/content/groups/staging/org/apache/datasketches/datasketches-java/9.0.0/

*Build & Test Guide:*
https://github.com/apache/datasketches-java/blob/9.0.0-RC1/README.md

The vote will be performed as follows:
This letter will be published on dev@ and remain open for at least 72 hours
(excluding weekends and holidays), AND until at least 3 (+1) PMC votes or a
majority of (+1) PMC votes are acquired. Anyone in the community can vote.
This vote will close no earlier than Monday Dec 1, 2025, 6:00 PM PST.

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason

Thanks,
Lee Rhodes
[email protected]
