Thanks for the clarifications, Roy.
I considered both b) and c).
As I mentioned, the reason I think b) wasn't useful in this context is
that we have, in several cases, 5-6 patches per jira (bug-fix on
top of a bug-fix) and several jiras didn't make sense for trunk since
the bug didn't exist in trunk etc. etc. Also, I was considering a
scenario where I would squash relevant patches together to produce a
minimal set of coherent patches. Then there is work to remove Yahoo!
specific commits.
IAC, I agree - we've spent too much time talking and too little doing
actual work. Let me get the job done, and folks can then weigh in on
the release at a later point; they might be willing to consider this
more positively once they see the branch, the change-log, etc.
Of course we need to get the small number of remaining patches into
trunk asap for 0.22 and beyond.
Arun
On Jan 18, 2011, at 12:20 AM, Roy T. Fielding wrote:
I thought that this discussion would have reached some sensible
understanding of how Apache projects work by now, but it seems not.
On Jan 13, 2011, at 6:12 PM, Arun C Murthy wrote:
The version I'm offering to push to the community has fixed all of
them, *plus* the added benefit of several stability and performance
fixes we have done since 20.104.3, almost 10 internal releases.
This is a battle tested and hardened version which we have deployed
on 40,000+ nodes. It is a significant upgrade on 0.20.104.3 which
we never deployed. I'm pretty sure *some* users will find that
valuable. ;)
Also, I've offered to push individual patches as a background
activity on a branch - that should suffice, no? Or, do you consider
this a blocker?
Again, my goal in this exercise is to get a stable, improved
version of Hadoop into the hands of our users asap, and focus on
0.22 and beyond.
So, you have a bunch of changes that you want to contribute.
Please do so. There are several ways:
a) break the changes down into a sequence of patches, create jira
issues for each one (or append to the existing issue), and then
provide the group with a list of the issue links so that people
can quickly +1 each one. When it seems worthwhile to you, create
a branch off of some prior Apache release point in svn and commit
each patch to it until the branch is identical to (or, in your own
opinion, better than) the source code that you have tested locally.
Then RM a tarball and start a release vote. Since all of this is
being done in jira and svn, others can help you do all but the
first part (breaking down the big patch).
or
b) create a branch off of some prior Apache release point in svn
and replay the internal Y! commits on that branch until the branch
source code is identical to what you have tested locally. Then
RM a tarball based on that branch and start a release vote.
Since the history is now in svn, others could do the RM bit if
you don't have time.
or
c) create a branch off of some prior Apache release point in svn
and apply one big ugly patch to it. Then RM a tarball based
on that branch and ask for a release vote.
You will note that none of the above requires a discussion on this
list prior to the release vote, though (a) would likely result in
more +1s than (b), and (b) would likely receive more +1s than (c).
Regardless, the release vote is a lazy majority decision.
I do not believe that there is any rational reason to apply a
single big patch. "It takes too long" is nonsense -- you have
already spent far more time discussing it than would be required
to do it. "It is too hard" is also nonsense -- use your version
control system to extract the set of changes and just replay them
(with appropriate changelog editing). "It has already been tested
at Y!" is simply irrelevant -- the source code has been tested, not
the order in which the patches have been applied, so all you should
care about is that the final branch code is comparable to the tested
source code (i.e., use diff).
Nevertheless, all contributions at Apache are voluntary. Do what
you have time for, now, with the understanding that others may or
may not complete the task, and may or may not vote for the release.
You can make a branch, apply the big patch, and stand by
while the rest of the group chooses whether to just accept it
as a big change. Someone else might create a parallel branch to
apply the specific changes in an orderly manner, or perhaps you'll
discover an easy way to do that a few days from now. Or it
might just sit there and never be released.
There is no need for the group to agree to a plan up front, just
as there is no need for the group to approve a release just because
someone did the work of RMing a tarball. Sure, it might save
a lot of time if potential disagreements can be resolved before
work is done, but the fact is that people tend to disagree less
with actual work products than with abstract plans. After all,
everyone has a plan. It is also far easier to convince people
to fix their own problems if the problem is right in front of them.
When the release vote happens, encourage folks to test and +1
the release. If it passes, woohoo! If not, then listen to the
reasons given by the other PMC members and see if you can make
enough changes to the release to get those extra +1s.
In other words, collaborate.
....Roy