Based on recent feedback on ACCUMULO-1792 and ACCUMULO-1795, I want to
resurrect this thread to make sure everyone's concerns are addressed.

For context, here's a link to the start of the last thread:

http://bit.ly/1aPqKuH

 From ACCUMULO-1792, ctubbsii:

I'd be reluctant to accept any Hadoop 2.x support in the 1.4 release
line that breaks compatibility with 0.20. I don't think breaking 0.20
and then possibly fixing it again as a second step is acceptable (because
that subsequent work may not ever be done, and I don't think
we should break the compatibility contract that we've established with
1.4.0).

Chris, I believe keeping all of the work in a branch under the umbrella
jira of ACCUMULO-1790 will ensure that we don't end up with a 1.4 release
that doesn't have proper support for 0.20.203.

Is there something beyond making sure the branch passes a full set of
release tests on 0.20.203 that you'd like to see? In the event that the
branch only ever contains the work for adding Hadoop 2, it's a simple
matter to abandon it without rolling it into the 1.4 development line.

 From ACCUMULO-1795, bills (and +1ed by elserj and ctubbsii):

I'm very uncomfortable with risking breaking continuity in such an old
release, and I don't think managing two lines of 1.4 releases is
worth the effort. Though we have no official EOL policy, 1.3 was
practically dead in the water once 1.4 was around, and I hope we start
encouraging more adoption of 1.5 (and soon 1.6) versus continually
propping up 1.4.

I'd love to get people to move off of 1.4. However, I think adding Hadoop 2
support to 1.4 encourages this more than leaving it out.

I'm not sure I agree that adding Hadoop 2 support to 1.4 encourages people to upgrade Accumulo. My gut reaction would be that it allows people to completely ignore Accumulo updates (aside from moving to 1.4.5, which would allow them to run Hadoop 2 with your proposed changes).

Accumulo 1.5.x places a higher burden on HDFS than 1.4 did, and I'm not
surprised people find relying on 0.20 for the 1.5 WAL intimidating.
Upgrading both HDFS and Accumulo across major versions at once is asking
them to take on a bunch of risk. By adding in Hadoop 2 support to 1.4 we
allow them to break the risk up into steps: they can upgrade HDFS versions
first, get comfortable, then upgrade Accumulo to 1.5.

Personally, maintaining 0.20 compatibility is not a big concern on my radar. If you're still running a 0.20 release, I'd *really* hope that you have an upgrade path to 1.2.x (if not 2.2.x) scheduled.

I think claiming that 1.5 places a higher burden on HDFS than 1.4 did is a bit of a fallacy. There were many problems and pains regarding WALs in <=1.4 that are very difficult to work with in a large environment (try finding WALs in server failure cases). I think the increased I/O on HDFS is a much smaller cost than the completely different I/O path that the old loggers have.

I also think upgrading Accumulo is much less scary than upgrading HDFS, but that's just me.

To me, it seems like the argument may be coming down to whether or not we break Hadoop 0.20 compatibility in a bug-fix release and how concerned we are about letting users lag behind upstream development.

I think the existing tickets under the umbrella of ACCUMULO-1790 should
ensure that we end up with a single 1.4 line that can work with either the
existing 0.20.203.0 claimed in releases or against 2.2.0.

Bill (or Josh or Chris), is there stronger language you'd like to see
around docs / packaging (area #3 in the original plan and currently
ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
0.20.203.0? Are you looking for something beyond a full release suite to
ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?


Again, my biggest concern here is not following our own guidelines on breaking changes across minor releases, but I'd hope 0.20 users have an upgrade path outlined for themselves.
