Another dimension to this discussion that I'd like to address is the provision for a 1.10 version. In fact, I lean towards nominating 1.10 as the pre-2.x LTS version instead of a 1.9.x release. I am in favor of the basic LTS proposal, but I think that additional accommodations to ease the upgrade path from pre-2.x to 2.x must be considered before any LTS plan is adopted.
The largest change that I'd like to propose for 1.10 is that the minimum Java language version be bumped to Java 8, so that code merged between versions can use the same language constructs. As it is now, code written for 1.9.x cannot use lambdas, streams, or any of the other "modern" features. When merging the code forward, one is left with the option of not using those features, or of changing the code, which, if not done perfectly, could introduce a different set of bugs between versions. Likewise, if someone wanted to backport a feature from 2.x into the 1.9.x code base, additional changes, beyond those required because of the 2.x restructuring, are likely to be necessary.
The migration from Accumulo 1.9.x to 2.x is not straightforward and will require changes to Accumulo clients. However, the largest obstacle to upgrading to 2.x is the Hadoop 3 requirement. This is a major, non-trivial requirement change, and it is going to take significant effort (and time) for large-scale deployments to develop against Hadoop 3 and then upgrade to it. There is going to be significant work required to adequately test the necessary client changes, and then to upgrade the deployed systems, first to Hadoop 3 and then to Accumulo 2.x. Until they can, they are going to be on a pre-2.x Accumulo version.
With code frozen at 1.9.x, large deployments are going to need to make some hard decisions: do they continue to use 1.9.x as released, or do they build some patched Frankenstein version? If they find that they need to patch aggressively to get features that improve current operations, how much additional work is going to be required if and when they are in a position to upgrade? How much of that work would further delay upgrading to Hadoop 3 / Accumulo 2.x? Having features released by the community eases support across the whole ecosystem.
We will all have access to the same code base, the code will be exercised by the continuous integration tests, and it provides greater assurance that those features will be available once an upgrade to 2.x is possible. Otherwise, reasoning about what "version" is actually running, and what that implies when requesting support from the community, is just that much harder for everyone. My opinion is that if we can accommodate some feature improvements while groups work toward adopting a Hadoop 3 / 2.x deployment, then we can reduce the work required across the community and the users. Freezing at 1.9.x for pre-2.x would instead place an additional burden on the users. I am in favor of adopting an LTS, but I think the LTS plan really needs to consider the impact that requiring Hadoop 3 is having on upgrading to Accumulo 2.x.
