We are overdue for a release, and we should discuss a couple of things as a 
community before we start the release process. A few blocker issues are still 
being worked on, so the release is a ways off, but reaching community consensus 
on a few things now would help move things along.

-       What is the next appropriate version number?
-       What is the state of replication and how should we handle it moving 
forward?
-       ???

We may want to have single-topic discussions in their own threads for 
continuity, but I wanted to break the ice and raise these as general issues and 
to solicit other issues that you feel need to be addressed for the next release.

Version number:  There have been substantial changes since 2.0 was released.   
The next version was expected to be 2.1, but with the number and the scope of 
changes that have been made and some that are in the pipeline, maybe we should 
signal this with a major version bump to 3.0?  

-       With semver, we might be able to go either way, depending on 
interpretation.
-       With the adoption of LTM releases, whatever the next version is 
numbered, it will be an LTM release candidate.
-       There have been over 800 changes committed.
-       Notable major changes:
   o    Name changes to inclusive language (Manager instead of Master,…)
   o    Enabling external compactions.
   o    Changes in the storage of properties in ZooKeeper to reduce watchers 
(in progress, issues #1225, #1809)
   o    Change tracing to use OpenTelemetry instead of HTrace (PR #2259)
   o    Change metrics to use Micrometer (micrometer.io) instead of Hadoop 
Metrics2 (PR #2305)
   o    Changes to enable per-table encryption and other improvements (PR #2197)
   o    ???

Replication: It is hard to know what the state of replication is and maybe we 
need to mark it as either experimental or deprecated to convey that to users. 
The replication tests have been unstable, failing with transient errors, and 
have been removed from the regular build process. Removing them reduced the 
automated build time by over 2 hours. A recent example is accumulo-testing 
issue #164 (https://github.com/apache/accumulo-testing/issues/164). Without the 
tests running regularly, it is hard to state with any confidence that 
replication works reliably in a production environment. This should not be 
interpreted as 
advocating that we remove replication at this point, but we need a way forward. 
Maybe someone volunteers to examine the tests and fix them so that they run 
reliably and in a reasonable time, or maybe we begin to explore other 
approaches, for example some kind of NiFi connector or something else 
entirely. I really don’t know, but it seems that in the next release we need 
to clearly communicate the current state of replication to any users who are 
using it or considering it, and to signal possible future intentions.

These are the topics I am aware of; please raise any additional issues that 
you may have concerning the next release. I will start separate threads for the 
version discussion and for replication. Please use this thread or create a new 
thread if you want to raise other issues.

Ed Coleman