Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset

Arun C Murthy Tue, 18 Jan 2011 02:00:22 -0800

Thanks for the clarifications Roy.

I considered both b) and c).

As I mentioned, the reason I think b) wasn't useful in this context is that we have, in several cases, 5-6 patches per jira (a bug-fix on top of a bug-fix), and several jiras didn't make sense for trunk since the bug didn't exist in trunk, etc. Also, I was considering a scenario where I would squash relevant patches together to produce a minimal set of coherent patches. Then there is the work to remove Yahoo!-specific commits.

IAC, I agree - we've spent too much time talking and too little doing actual work. Let me get the job done and folks can then weigh in on the release at a later point. Folks might be willing to consider this more positively once they see the branch, the change-log, etc.

Of course we need to get the small number of remaining patches into trunk asap for 0.22 and beyond.


Arun

On Jan 18, 2011, at 12:20 AM, Roy T. Fielding wrote:

I thought that this discussion would have reached some sensible
understanding of how Apache projects work by now, but it seems not.

On Jan 13, 2011, at 6:12 PM, Arun C Murthy wrote:

The version I'm offering to push to the community has fixed all of them, *plus* the added benefit of several stability and performance fixes we have done since 20.104.3, almost 10 internal releases. This is a battle tested and hardened version which we have deployed on 40,000+ nodes. It is a significant upgrade on 0.20.104.3 which we never deployed. I'm pretty sure *some* users will find that valuable. ;)
Also, I've offered to push individual patches as a background activity on a branch - that should suffice, no? Or, do you consider this a blocker?
Again, my goal in this exercise is to get a stable, improved version of Hadoop into the hands of our users asap, and focus on 0.22 and beyond.


So, you have a bunch of changes that you want to contribute.
Please do so.  There are several ways:

a) break the changes down into a sequence of patches, create jira
   issues for each one (or append to the existing issue), and then
   provide the group with a list of the issue links so that people
   can quickly +1 each one.  When it seems worthwhile to you, create
   a branch off of some prior Apache release point in svn and commit
   each patch to it until the branch is identical to (or, in your own
   opinion, better than) the source code that you have tested locally.
   Then RM a tarball and start a release vote.  Since all of this is
   being done in jira and svn, others can help you do all but the
   first part (breaking down the big patch).

or

b) create a branch off of some prior Apache release point in svn
   and replay the internal Y! commits on that branch until the branch
   source code is identical to what you have tested locally.  Then
   RM a tarball based on that branch and start a release vote.
   Since the history is now in svn, others could do the RM bit if
   you don't have time.

or

c) create a branch off of some prior Apache release point in svn
   and apply one big ugly patch to it.  Then RM a tarball based
   on that branch and ask for a release vote.

You will note that none of the above requires a discussion on this
list prior to the release vote, though (a) would likely result in
more +1s than (b), and (b) would likely receive more +1s than (c).
Regardless, the release vote is a lazy majority decision.
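
As a rough illustration of the "lazy majority" rule mentioned above, here is a minimal sketch. It assumes the usual Apache reading of the term (at least three binding +1 votes, and more binding +1 votes than binding -1 votes). The function name and the vote encoding (+1, 0, -1 as integers) are made up for this example.

```python
def lazy_majority_passes(binding_votes):
    """Return True if a list of binding votes (+1, 0 or -1) passes.

    Assumption: "lazy majority" means at least three binding +1 votes
    and strictly more binding +1 votes than binding -1 votes.
    """
    plus = sum(1 for vote in binding_votes if vote > 0)
    minus = sum(1 for vote in binding_votes if vote < 0)
    return plus >= 3 and plus > minus
```

Under that assumption, a -1 on a release vote is not a veto: a release with a few -1s can still pass if enough binding +1s are cast.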

I do not believe that there is any rational reason to apply a
single big patch.  "It takes too long" is nonsense -- you have
already spent far more time discussing it than would be required
to do it.  "It is too hard" is also nonsense -- use your version
control system to extract the set of changes and just replay them
(with appropriate changelog editing).  "It has already been tested
at Y!" is simply irrelevant -- the source code has been tested, not
the order in which the patches have been applied, so all you should
care about is that the final branch code is comparable to the tested
source code (i.e., use diff).
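
The "use diff" check above can be sketched roughly as follows. This is only an illustration, not the procedure anyone in this thread actually ran: it assumes both trees (the replayed branch and the locally tested source) are checked out as plain directories, and the function name and paths are hypothetical.

```python
import filecmp
import os


def tree_differences(left, right):
    """Return sorted relative paths that differ between two directory trees.

    A path is reported if it exists on only one side, if it is a directory
    on one side and a file on the other, or if the file contents differ.
    """
    diffs = []

    def walk(rel):
        left_names = set(os.listdir(os.path.join(left, rel)))
        right_names = set(os.listdir(os.path.join(right, rel)))
        # Entries present on only one side.
        for name in left_names ^ right_names:
            diffs.append(os.path.join(rel, name))
        # Entries present on both sides: recurse or compare contents.
        for name in left_names & right_names:
            sub = os.path.join(rel, name)
            left_path = os.path.join(left, sub)
            right_path = os.path.join(right, sub)
            left_is_dir = os.path.isdir(left_path)
            right_is_dir = os.path.isdir(right_path)
            if left_is_dir and right_is_dir:
                walk(sub)
            elif left_is_dir or right_is_dir:
                diffs.append(sub)
            elif not filecmp.cmp(left_path, right_path, shallow=False):
                # shallow=False forces a byte-by-byte comparison.
                diffs.append(sub)

    walk("")
    return sorted(diffs)
```

An empty result from something like tree_differences("apache-branch", "tested-source") would mean the branch matches the tested code regardless of the order in which the patches were applied. In practice, version-control metadata directories (such as .svn) would need to be skipped; this sketch does not do that.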

Nevertheless, all contributions at Apache are voluntary.  Do what
you have time for, now, with the understanding that others may or
may not complete the task, and may or may not vote for the release.

You can make a branch, apply the big patch, and stand by
while the rest of the group chooses whether to just accept it
as a big change.  Someone else might create a parallel branch to
apply the specific changes in an orderly manner, or perhaps you'll
discover an easy way to do that a few days from now.  Or it
might just sit there and never be released.

There is no need for the group to agree to a plan up front, just
as there is no need for the group to approve a release just because
someone did the work of RMing a tarball.  Sure, it might save
a lot of time if potential disagreements can be resolved before
work is done, but the fact is that people tend to disagree less
with actual work products than with abstract plans.  After all,
everyone has a plan.  It is also far easier to convince people
to fix their own problems if the problem is right in front of them.

When the release vote happens, encourage folks to test and +1
the release.  If it passes, woohoo!  If not, then listen to the
reasons given by the other PMC members and see if you can make
enough changes to the release to get those extra +1s.

In other words, collaborate.

....Roy
