We are overdue for a release, and we should discuss a couple of things as a community before we start the release process. There are still a few blocker issues being worked on, so the release is a ways off, but reaching community consensus on a few things would help move things along.
- What is the next appropriate version number?
- What is the state of replication and how should we handle it moving forward?
- ???

We may want to have single-topic discussions in their own threads for continuity, but I wanted to break the ice, raise these as general issues, and solicit other issues that you feel need to be addressed for the next release.

Version number:

There have been substantial changes since 2.0 was released. The next version was expected to be 2.1, but given the number and scope of the changes that have been made, and some that are in the pipeline, maybe we should signal this with a major version bump to 3.0?

- With semver, we might be able to go either way, depending on interpretation.
- With the adoption of LTM releases, whatever the next version is numbered, it will be an LTM release candidate.
- There have been over 800 changes committed.
- Notable major changes:
  o Name changes to inclusive language (Manager instead of Master,…)
  o Enabling external compactions.
  o Changes in the storage of properties in ZooKeeper to reduce watchers (in progress, issues #1225, #1809)
  o Change tracing to use OpenTracing instead of HTrace (PR #2259)
  o Change metrics to use micrometer.io instead of Hadoop-metrics2 (PR #2305)
  o Changes to enable per-table encryption and other improvements (PR #2197)
  o ???

Replication:

It is hard to know what the state of replication is, and maybe we need to mark it as either experimental or deprecated to convey that to users. The replication tests have been unstable, failing with transient errors, and have been removed from the regular build process, which reduced the automated build time by over 2 hours. A recent example is accumulo-testing issue #164 (https://github.com/apache/accumulo-testing/issues/164).

Without the tests running regularly, it is hard to state with any confidence that replication works reliably in a production environment.
This should not be interpreted as advocating that we remove replication at this point, but we need a way forward. Maybe someone volunteers to examine the tests and fix them so that they run reliably and in a reasonable time, or maybe we begin to explore other approaches: for example, some kind of NiFi connector, or something else entirely. I really don't know, but it seems that in the next release we need to clearly communicate the current state to any users who may be using or considering replication, and to signal possible future intentions.

These topics are the ones I am aware of; please include any additional issues that you may have concerning the next release. I will start separate threads for the version discussion and for replication. Please use this thread or create a new thread if you want to raise other issues.

Ed Coleman
