Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Jean-Daniel Cryans
Roy,

On Wed, May 4, 2011 at 7:22 PM, Roy T. Fielding  wrote:
> The ASF is a vehicle for whomever wishes to collaborate on a
> given project.  Collaboration means helping do the work.  Those
> who do the work may do so for whatever reasons that they think
> are good, whether it is because they feel like being charitable
> today, they get paid a salary and the big boss said "work on
> this part", or because they just have an itch worth scratching.
>
> Apache does not care why people choose to collaborate or
> how they choose to apply their own intellectual efforts.  We
> welcome all forms of contribution under the terms of our license.

I don't think I was arguing against the contribution of the code in
that branch, it's very welcome, but I'm questioning (and ranting
about) the motivation for releasing a version that even just by name
is a weird hula hoop around the usual development practices that
Hadoop has had in the past (not that it's set in stone).

So I wanted to contribute my negative non-binding vote to highlight
that this release is probably very confusing for the general user.
This is 0.20, but it's not. Also it has more numbers, and it starts at
203. Why do this at all instead of just moving on with 0.22? Or is
0.22 bound to be like 0.21? It almost begs the question if this should
be called 0.22.0 then.

>
> What we do require is a certain amount of civility regarding
> our voting procedures and an emphasis on individual responsibility
> for your votes.  Anyone caught *voting* a particular way just
> because the boss says so will be dealt with severely.  Votes
> are how we do quality control and make decisions, and no other
> company can be allowed to make decisions for our non-profit.

Yeah, I don't think that's a problem here; everyone seems to have their
very own strong opinions.


RE: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Jane Chen
Agree.  As a newcomer, I had trouble figuring out which version to adopt -- 
0.20.2 vs. 0.21. This new release candidate seems to add more confusion to 
general users.

Jane

-Original Message-
From: Matei Zaharia [mailto:ma...@eecs.berkeley.edu] 
Sent: Wednesday, May 04, 2011 11:21 PM
To: general@hadoop.apache.org
Subject: Re: [VOTE] Release candidate 0.20.203.0-rc1

I'm not going to cast a vote, but I'm concerned about this for the same reasons 
Eli brought up -- in particular, compatibility with 0.22. I'm an author of 
several patches that have gone into 0.21 and trunk, only to stay on hiatus for 
2 years because the project hasn't made a stable release since 0.20. (Today, 
many of these patches are being used through CDH, which is great, but it would 
be nice to see them in an Apache release too.) This push of features into 
0.20.203 makes a widely used 0.22 seem even more distant. Can we at least get a 
confirmation that these changes will be included in 0.22, as well as a timeline?

To support a vibrant developer community, Apache Hadoop should not just be a 
mechanism for Yahoo and Cloudera to publish patches. It should include a 
well-defined process for smaller third-party contributors to push changes that 
will make it into a stable release within a reasonable time horizon. The lack 
of such a process has been a major cause for the slowdown in the project in my 
perspective.

Matei



On May 4, 2011, at 10:47 PM, Eric Sammer wrote:

> (non-binding) -1 for similar reasons to what Jeff and others have laid out,
> and certainly if we're going to change the development process as a side
> effect of a release vote.
> 
> On Wed, May 4, 2011 at 9:54 PM, Jeff Hammerbacher wrote:
> 
>> -1.
>> 
>> As Roy says, "whatever gets released will define the new norm by which
>> policies are assumed", and I certainly don't want this project to change
>> its
>> norms to accommodate bad practices. In particular, Eli presented three very
>> reasonable technical objections to this release. To summarize:
>> 
>> 1) Let's get the JIRAs that are going into this release into trunk first.
>> 2) Let's create a JIRA for each issue in the release.
>> 3) Let's stick to the release numbering conventions established for this
>> project.
>> 
>> I know the folks at Yahoo! are all professional engineers and have done
>> tremendous work to help get the project to this point. There's no doubt in
>> my mind they understand the validity of the above three technical
>> objections. In fact, many of them helped author our "How to Contribute"
>> page, which established these conventions:
>> wiki.apache.org/hadoop/HowToContribute. We develop new features against
>> trunk, we create JIRAs for each issue, we review code before it goes into
>> trunk, and we only update old releases with bug fixes.
>> 
>> I couldn't be more excited to have Yahoo! once again doing development in
>> Apache, and I hope that we can work together to get the work that you've
>> done in this branch into one of our upcoming feature releases.
>> 
>> I hope those who voted +1 before Roy clarified what a release vote will
>> mean
>> for future project norms will reconsider their votes.
>> 
>> While there may be many competing agendas in this community, we all wish to
>> see Apache Hadoop releases of the highest quality. Changing our norms to
>> allow huge, unreviewed patch sets introducing new features into a past
>> release is a step in the wrong direction.
>> 
>> With a little bit of elbow grease, we can get the work done in this branch
>> into trunk, get 0.22 out the door, and be ready for a great 0.23 release.
>> 
>> Later,
>> Jeff
>> 
>> On Wed, May 4, 2011 at 9:17 PM, Nigel Daley  wrote:
>> 
>>> I'm really not sure yet how to vote here.  I was going to vote +1 for
>> what
>>> I was told by a number of Yahoo! committers would be a one time release
>> as
>>> Yahoo! "comes back to Apache" after a hiatus last fall/winter and ended
>>> their own distribution.  Clearly this code was not all developed as a
>>> community process, but I was going to support a one time release of what
>>> they had developed in exclusion.
>>> 
>>> Then I read Roy's email, which confused me.  Why would he or I or anyone
>>> else support this release setting precedent or policy since it would walk
>>> all over our bylaws, community process, and the consensus nature of our
>>> foundation?  This release vote is a lazy majority of the PMC, but other
>>> decisions rolled up in this are supposed to be lazy majority of active
>>> committers or, in the case of code changes, a lazy consensus.  Setting
>>> policy by this release means any sufficiently large group of committers
>>> could go off and develop on their own and then commit it to a branch and
>>> call a release.
>>> 
>>> Furthermore, it now sounds like this is possibly the first in a line of
>>> feature releases off this branch.  Bug fix releases, sure.  But feature
>>> releases?  What's wrong with trunk?
>>> 
>>> Nige
>>> 
>>> On May 4, 201

Re: [DISCUSSION] Release rules

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 5:59 PM, Tom White  wrote:
> One year ago (to the day!) Chris started a discussion about the
> release manager role
> (http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%3ch2q1267dd3b1005041331r7d8f696di370a279ff6058...@mail.gmail.com%3E).
> In light of today's disagreements, I think we should restart this
> discussion and incorporate these rules into the bylaws, since it
> formalizes our practices.
>
> I'm happy to drive this. We could start by discussing Chris' proposal
> (see clarifications in
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%3ct2y1267dd3b1005051201h7116e4caud75673ac9d512...@mail.gmail.com%3E),
> then when we get consensus we can put the document on the website.
> (BTW does anyone know if the bylaws were checked into SVN anywhere?
> These belong together.)

Sounds good to me. I like Chris' proposal, he was clear that "nothing
should be in (unreleased) 0.x that isn't also in trunk." so that may
need to be revisited if we want to be consistent with today's vote.

I don't think the bylaws were checked in, we should do that first. How
about checking them into the site repo so they get generated as part
of the docs? Eg this is how Pig does it:
http://pig.apache.org/bylaws.html

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Sanjay Radia

+1
 downloaded, built, deployed on one node cluster.


sanjay

On May 4, 2011, at 10:31 AM, Owen O'Malley wrote:

Here's an updated release candidate for 0.20.203.0. I've  
incorporated the feedback and included all of the patches from  
0.20.2, which is the last stable release. I also fixed the eclipse- 
plugin problem.


The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/

Please download it, inspect it, compile it, and test it. Clearly,  
I'm +1.


-- Owen




Re: [DISCUSSION] development process of Hadoop

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 7:39 PM, Eric Yang  wrote:
> Let's reflect on how the Hadoop development community ended up in its 
> current state.  Development is happening rapidly and being tested in all 
> kinds of organizations.  However, Hadoop committers are only committing code 
> that is of interest to the sponsoring companies.  People are coding 
> defensively to ensure that only self-serving patches get committed, and 
> helping others and resolving merge problems always take second priority.  
> While the world demands agility, the "review then commit" process is 
> preventing progress.  Committers are afraid to commit patches because review 
> hasn't taken place.  By the time a patch is reviewed, it no longer applies 
> cleanly.  People end up having to generate multiple versions of a patch to 
> ensure the code can be applied.  The long lag between patch generation and 
> review is taking a significant toll on the community and its progress.
>
> Yahoo has a great team of developers who improve Hadoop at a faster pace 
> with its own fork of the source code.  Yahoo was able to add features faster 
> because it could use source code repository tools properly.  Unfortunately 
> for Yahoo, its source code repository was not Apache svn trunk.  I applaud 
> Owen and Arun's effort in manually backporting and forward-porting the 
> changes between the Yahoo github and Apache svn.  There might be some jiras 
> that need to be merged into the Hadoop 0.20.203 branch to ensure the lineage 
> is correct.  The community should offer to help with a detailed list of what 
> is missing rather than voting -1 without concise reasoning about what is 
> missing.
>
> JIRA is meant as a discussion and collaboration tool, but the Hadoop 
> community is using it as the source code version control system, with a 
> human-powered diff maker.  While I was spending time in the incubator with 
> another project, the mentors explained that it is not the ASF's philosophy to 
> use "review then commit".

ASF's policy is that projects make this decision for themselves:
http://www.apache.org/dev/project-creation.html

The Hadoop bylaws specify that code changes are lazy consensus, ie you
need a +1 from a committer. Technically the code doesn't have to be
reviewed before committing it, that's just been the norm.

I don't think jira is technically required either, it's just been the
norm. The vote for the patch has to happen on the lists, that happens
as a side effect of jira traffic going to the dev lists.

> The Hadoop community should rethink whether it is using the right tools for 
> the right task.
>
> Use JIRA if there is a large feature set that requires brainstorming, and 
> let developers make small incremental changes without RTC.  This will ensure 
> developers help each other rather than policing each other.
>
> Any thoughts?
>

I think you can move quickly with RTC or CTR, I've worked on RTC
projects that have moved quickly. It requires people dedicate
bandwidth to reviewing changes. If you do want all your code reviewed
(at some point) then you're ultimately limited by review bandwidth,
with either RTC or CTR.

The time it takes to file a jira is normally insignificant compared to
the time to create and test a change. The idea with using jira is that
you propose/discuss a change before creating code. You could do that
on the lists too. I agree using just a code review tool for small
stuff would be faster, eg things that don't require a bug #, release
note, etc.

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Matei Zaharia
I'm not going to cast a vote, but I'm concerned about this for the same reasons 
Eli brought up -- in particular, compatibility with 0.22. I'm an author of 
several patches that have gone into 0.21 and trunk, only to stay on hiatus for 
2 years because the project hasn't made a stable release since 0.20. (Today, 
many of these patches are being used through CDH, which is great, but it would 
be nice to see them in an Apache release too.) This push of features into 
0.20.203 makes a widely used 0.22 seem even more distant. Can we at least get a 
confirmation that these changes will be included in 0.22, as well as a timeline?

To support a vibrant developer community, Apache Hadoop should not just be a 
mechanism for Yahoo and Cloudera to publish patches. It should include a 
well-defined process for smaller third-party contributors to push changes that 
will make it into a stable release within a reasonable time horizon. The lack 
of such a process has been a major cause for the slowdown in the project in my 
perspective.

Matei



On May 4, 2011, at 10:47 PM, Eric Sammer wrote:

> (non-binding) -1 for similar reasons to what Jeff and others have laid out,
> and certainly if we're going to change the development process as a side
> effect of a release vote.
> 
> On Wed, May 4, 2011 at 9:54 PM, Jeff Hammerbacher wrote:
> 
>> -1.
>> 
>> As Roy says, "whatever gets released will define the new norm by which
>> policies are assumed", and I certainly don't want this project to change
>> its
>> norms to accommodate bad practices. In particular, Eli presented three very
>> reasonable technical objections to this release. To summarize:
>> 
>> 1) Let's get the JIRAs that are going into this release into trunk first.
>> 2) Let's create a JIRA for each issue in the release.
>> 3) Let's stick to the release numbering conventions established for this
>> project.
>> 
>> I know the folks at Yahoo! are all professional engineers and have done
>> tremendous work to help get the project to this point. There's no doubt in
>> my mind they understand the validity of the above three technical
>> objections. In fact, many of them helped author our "How to Contribute"
>> page, which established these conventions:
>> wiki.apache.org/hadoop/HowToContribute. We develop new features against
>> trunk, we create JIRAs for each issue, we review code before it goes into
>> trunk, and we only update old releases with bug fixes.
>> 
>> I couldn't be more excited to have Yahoo! once again doing development in
>> Apache, and I hope that we can work together to get the work that you've
>> done in this branch into one of our upcoming feature releases.
>> 
>> I hope those who voted +1 before Roy clarified what a release vote will
>> mean
>> for future project norms will reconsider their votes.
>> 
>> While there may be many competing agendas in this community, we all wish to
>> see Apache Hadoop releases of the highest quality. Changing our norms to
>> allow huge, unreviewed patch sets introducing new features into a past
>> release is a step in the wrong direction.
>> 
>> With a little bit of elbow grease, we can get the work done in this branch
>> into trunk, get 0.22 out the door, and be ready for a great 0.23 release.
>> 
>> Later,
>> Jeff
>> 
>> On Wed, May 4, 2011 at 9:17 PM, Nigel Daley  wrote:
>> 
>>> I'm really not sure yet how to vote here.  I was going to vote +1 for
>> what
>>> I was told by a number of Yahoo! committers would be a one time release
>> as
>>> Yahoo! "comes back to Apache" after a hiatus last fall/winter and ended
>>> their own distribution.  Clearly this code was not all developed as a
>>> community process, but I was going to support a one time release of what
>>> they had developed in exclusion.
>>> 
>>> Then I read Roy's email, which confused me.  Why would he or I or anyone
>>> else support this release setting precedent or policy since it would walk
>>> all over our bylaws, community process, and the consensus nature of our
>>> foundation?  This release vote is a lazy majority of the PMC, but other
>>> decisions rolled up in this are supposed to be lazy majority of active
>>> committers or, in the case of code changes, a lazy consensus.  Setting
>>> policy by this release means any sufficiently large group of committers
>>> could go off and develop on their own and then commit it to a branch and
>>> call a release.
>>> 
>>> Furthermore, it now sounds like this is possibly the first in a line of
>>> feature releases off this branch.  Bug fix releases, sure.  But feature
>>> releases?  What's wrong with trunk?
>>> 
>>> Nige
>>> 
>>> On May 4, 2011, at 6:56 PM, Roy T. Fielding wrote:
>>> 
>>>> On May 4, 2011, at 5:39 PM, Eli Collins wrote:
>>>> 
>>>>> The point is that these discussions should be sorted out, i.e. you don't
>>>>> change your development and release model on a release VOTE thread,
>>>>> you change it on a DISCUSSION thread.
>>>> 
>>>> That is no different than saying you have a right to veto a
>>>> release u

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eric Sammer
(non-binding) -1 for similar reasons to what Jeff and others have laid out,
and certainly if we're going to change the development process as a side
effect of a release vote.

On Wed, May 4, 2011 at 9:54 PM, Jeff Hammerbacher wrote:

> -1.
>
> As Roy says, "whatever gets released will define the new norm by which
> policies are assumed", and I certainly don't want this project to change
> its
> norms to accommodate bad practices. In particular, Eli presented three very
> reasonable technical objections to this release. To summarize:
>
> 1) Let's get the JIRAs that are going into this release into trunk first.
> 2) Let's create a JIRA for each issue in the release.
> 3) Let's stick to the release numbering conventions established for this
> project.
>
> I know the folks at Yahoo! are all professional engineers and have done
> tremendous work to help get the project to this point. There's no doubt in
> my mind they understand the validity of the above three technical
> objections. In fact, many of them helped author our "How to Contribute"
> page, which established these conventions:
> wiki.apache.org/hadoop/HowToContribute. We develop new features against
> trunk, we create JIRAs for each issue, we review code before it goes into
> trunk, and we only update old releases with bug fixes.
>
> I couldn't be more excited to have Yahoo! once again doing development in
> Apache, and I hope that we can work together to get the work that you've
> done in this branch into one of our upcoming feature releases.
>
> I hope those who voted +1 before Roy clarified what a release vote will
> mean
> for future project norms will reconsider their votes.
>
> While there may be many competing agendas in this community, we all wish to
> see Apache Hadoop releases of the highest quality. Changing our norms to
> allow huge, unreviewed patch sets introducing new features into a past
> release is a step in the wrong direction.
>
> With a little bit of elbow grease, we can get the work done in this branch
> into trunk, get 0.22 out the door, and be ready for a great 0.23 release.
>
> Later,
> Jeff
>
> On Wed, May 4, 2011 at 9:17 PM, Nigel Daley  wrote:
>
> > I'm really not sure yet how to vote here.  I was going to vote +1 for
> what
> > I was told by a number of Yahoo! committers would be a one time release
> as
> > Yahoo! "comes back to Apache" after a hiatus last fall/winter and ended
> > their own distribution.  Clearly this code was not all developed as a
> > community process, but I was going to support a one time release of what
> > they had developed in exclusion.
> >
> > Then I read Roy's email, which confused me.  Why would he or I or anyone
> > else support this release setting precedent or policy since it would walk
> > all over our bylaws, community process, and the consensus nature of our
> > foundation?  This release vote is a lazy majority of the PMC, but other
> > decisions rolled up in this are supposed to be lazy majority of active
> > committers or, in the case of code changes, a lazy consensus.  Setting
> > policy by this release means any sufficiently large group of committers
> > could go off and develop on their own and then commit it to a branch and
> > call a release.
> >
> > Furthermore, it now sounds like this is possibly the first in a line of
> > feature releases off this branch.  Bug fix releases, sure.  But feature
> > releases?  What's wrong with trunk?
> >
> > Nige
> >
> > On May 4, 2011, at 6:56 PM, Roy T. Fielding wrote:
> >
> > > On May 4, 2011, at 5:39 PM, Eli Collins wrote:
> > >
> > >> The point is that these discussions should be sorted out, i.e. you don't
> > >> change your development and release model on a release VOTE thread,
> > >> you change it on a DISCUSSION thread.
> > >
> > > That is no different than saying you have a right to veto a
> > > release until the issue is addressed, which you don't have.
> > >
> > > A release vote is a majority decision.  If the majority
> > > decides to release, then whatever gets released will define
> > > the new norm by which policies are assumed.  If not released,
> > > then I suggest collaborating more on the policies before
> > > trying to vote again.
> > >
> > > Either way, we don't hold up a vote for the sake of a
> > > policy discussion because voting is a more efficient
> > > means of discovering if the policy really matters.
> > >
> > > Roy
> > >
> >
> >
>



-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Jeff Hammerbacher
-1.

As Roy says, "whatever gets released will define the new norm by which
policies are assumed", and I certainly don't want this project to change its
norms to accommodate bad practices. In particular, Eli presented three very
reasonable technical objections to this release. To summarize:

1) Let's get the JIRAs that are going into this release into trunk first.
2) Let's create a JIRA for each issue in the release.
3) Let's stick to the release numbering conventions established for this
project.

I know the folks at Yahoo! are all professional engineers and have done
tremendous work to help get the project to this point. There's no doubt in
my mind they understand the validity of the above three technical
objections. In fact, many of them helped author our "How to Contribute"
page, which established these conventions:
wiki.apache.org/hadoop/HowToContribute. We develop new features against
trunk, we create JIRAs for each issue, we review code before it goes into
trunk, and we only update old releases with bug fixes.

I couldn't be more excited to have Yahoo! once again doing development in
Apache, and I hope that we can work together to get the work that you've
done in this branch into one of our upcoming feature releases.

I hope those who voted +1 before Roy clarified what a release vote will mean
for future project norms will reconsider their votes.

While there may be many competing agendas in this community, we all wish to
see Apache Hadoop releases of the highest quality. Changing our norms to
allow huge, unreviewed patch sets introducing new features into a past
release is a step in the wrong direction.

With a little bit of elbow grease, we can get the work done in this branch
into trunk, get 0.22 out the door, and be ready for a great 0.23 release.

Later,
Jeff

On Wed, May 4, 2011 at 9:17 PM, Nigel Daley  wrote:

> I'm really not sure yet how to vote here.  I was going to vote +1 for what
> I was told by a number of Yahoo! committers would be a one time release as
> Yahoo! "comes back to Apache" after a hiatus last fall/winter and ended
> their own distribution.  Clearly this code was not all developed as a
> community process, but I was going to support a one time release of what
> they had developed in exclusion.
>
> Then I read Roy's email, which confused me.  Why would he or I or anyone
> else support this release setting precedent or policy since it would walk
> all over our bylaws, community process, and the consensus nature of our
> foundation?  This release vote is a lazy majority of the PMC, but other
> decisions rolled up in this are supposed to be lazy majority of active
> committers or, in the case of code changes, a lazy consensus.  Setting
> policy by this release means any sufficiently large group of committers
> could go off and develop on their own and then commit it to a branch and
> call a release.
>
> Furthermore, it now sounds like this is possibly the first in a line of
> feature releases off this branch.  Bug fix releases, sure.  But feature
> releases?  What's wrong with trunk?
>
> Nige
>
> On May 4, 2011, at 6:56 PM, Roy T. Fielding wrote:
>
> > On May 4, 2011, at 5:39 PM, Eli Collins wrote:
> >
> >> The point is that these discussions should be sorted out, i.e. you don't
> >> change your development and release model on a release VOTE thread,
> >> you change it on a DISCUSSION thread.
> >
> > That is no different than saying you have a right to veto a
> > release until the issue is addressed, which you don't have.
> >
> > A release vote is a majority decision.  If the majority
> > decides to release, then whatever gets released will define
> > the new norm by which policies are assumed.  If not released,
> > then I suggest collaborating more on the policies before
> > trying to vote again.
> >
> > Either way, we don't hold up a vote for the sake of a
> > policy discussion because voting is a more efficient
> > means of discovering if the policy really matters.
> >
> > Roy
> >
>
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Nigel Daley
I'm really not sure yet how to vote here.  I was going to vote +1 for what I 
was told by a number of Yahoo! committers would be a one time release as Yahoo! 
"comes back to Apache" after a hiatus last fall/winter and ended their own 
distribution.  Clearly this code was not all developed as a community process, 
but I was going to support a one time release of what they had developed in 
exclusion.

Then I read Roy's email, which confused me.  Why would he or I or anyone else 
support this release setting precedent or policy since it would walk all over 
our bylaws, community process, and the consensus nature of our foundation?  
This release vote is a lazy majority of the PMC, but other decisions rolled up 
in this are supposed to be lazy majority of active committers or, in the case 
of code changes, a lazy consensus.  Setting policy by this release means any 
sufficiently large group of committers could go off and develop on their own 
and then commit it to a branch and call a release.

Furthermore, it now sounds like this is possibly the first in a line of feature 
releases off this branch.  Bug fix releases, sure.  But feature releases?  
What's wrong with trunk?

Nige

On May 4, 2011, at 6:56 PM, Roy T. Fielding wrote:

> On May 4, 2011, at 5:39 PM, Eli Collins wrote:
> 
>> The point is that these discussions should be sorted out, i.e. you don't
>> change your development and release model on a release VOTE thread,
>> you change it on a DISCUSSION thread.
> 
> That is no different than saying you have a right to veto a
> release until the issue is addressed, which you don't have.
> 
> A release vote is a majority decision.  If the majority
> decides to release, then whatever gets released will define
> the new norm by which policies are assumed.  If not released,
> then I suggest collaborating more on the policies before
> trying to vote again.
> 
> Either way, we don't hold up a vote for the sake of a
> policy discussion because voting is a more efficient
> means of discovering if the policy really matters.
> 
> Roy
> 



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Ian Holsman
Just as a tally, we have
6 +1's (Andy, is yours binding? If so, 7)
and 3 -1's.

So according to the votes so far we are releasing, but according to our 
bylaws we need to wait 7 days for everyone to chime in.

--I
On May 5, 2011, at 12:22 PM, Roy T. Fielding wrote:

> On May 4, 2011, at 6:24 PM, Jean-Daniel Cryans wrote:
> 
>> Non-binding -1.
>> 
>> I did download it and checked it out, but when I look at the
>> documentation I see it says "Hadoop 0.20 documentation" in the tab on
>> top. From what I can tell this isn't the branch 0.20 so I think it's
>> an error and from a user point of view this looks more like something
>> I would call 0.22 (although yes I understand this is 0.20 +security
>> +whatever).
>> 
>> Why would a single company push so hard to go against the "normal"
>> release process just for "the benefit of putting our work in the hands
>> of all hadoop users" is beyond me. It's not like people were begging
>> on the mailing lists to be able to get their hands on such a release
>> to the point where an emergency point release including tons of new
>> features is needed.
>> 
>> So to me the more logical reason would be monetary gains, that I would
>> understand better from a for-profit company. But then why go through
>> the hurdles of having such an ASF release when Y! isn't even selling
>> anything remotely related to Hadoop services? And why now?
>> 
>> But then there's this spinoff thing and it suddenly makes a lot more sense.
>> 
>> E14 said earlier that "That is how apache works."
>> 
>> I would say yes, maybe this is how it works, but I'm not sure I want
>> to see it working like _that_. The ASF shouldn't be the vehicle for a
>> single (future) company's wishes.
> 
> The ASF is a vehicle for whomever wishes to collaborate on a
> given project.  Collaboration means helping do the work.  Those
> who do the work may do so for whatever reasons that they think
> are good, whether it is because they feel like being charitable
> today, they get paid a salary and the big boss said "work on
> this part", or because they just have an itch worth scratching.
> 
> Apache does not care why people choose to collaborate or
> how they choose to apply their own intellectual efforts.  We
> welcome all forms of contribution under the terms of our license.
> 
> What we do require is a certain amount of civility regarding
> our voting procedures and an emphasis on individual responsibility
> for your votes.  Anyone caught *voting* a particular way just
> because the boss says so will be dealt with severely.  Votes
> are how we do quality control and make decisions, and no other
> company can be allowed to make decisions for our non-profit.
> 
> Roy



[DISCUSSION] development process of Hadoop

2011-05-04 Thread Eric Yang
Let's reflect on how the Hadoop development community ended up in its current 
state.  Development is happening rapidly and being tested in all kinds of 
organizations.  However, Hadoop committers are only committing code that is of 
interest to the sponsoring companies.  People are coding defensively to ensure 
that only self-serving patches get committed, and helping others and resolving 
merge problems always take second priority.  While the world demands agility, 
the "review then commit" process is preventing progress.  Committers are afraid 
to commit patches because review hasn't taken place.  By the time a patch is 
reviewed, it no longer applies cleanly.  People end up having to generate 
multiple versions of a patch to ensure the code can be applied.  The long lag 
between patch generation and review is taking a significant toll on the 
community and its progress.

Yahoo has a great team of developers who improve Hadoop at a faster pace with 
its own fork of the source code.  The reason that Yahoo was able to achieve 
faster improvement with features was its ability to use source code 
repository tools properly.  Unfortunately for Yahoo, their source code 
repository was not Apache svn trunk.  I applaud Owen and Arun's effort in 
manually backward/forward porting the changes between the Yahoo github and 
Apache svn.  There might be some jiras that need to be merged into the Hadoop 
0.20.203 branch to ensure the lineage is correct.  The community should offer 
to help with a detailed listing of what is missing rather than vote -1 without 
concise reasoning of what is missing.

JIRA is meant as a discussion and collaboration tool, but the Hadoop community 
tends to use it as the source code version control system, with a manually 
powered diff maker.  While I was spending time in the incubator with another 
project, the mentors explained that it is not ASF's philosophy to use "review 
then commit".  The Hadoop community should rethink whether it is using the 
right tools for the right task.

Use JIRA if there is a large feature set that requires brainstorming, and 
developers should have the ability to make small incremental changes without 
RTC.  This will ensure developers help each other rather than policing each 
other.

Any thoughts?

Regards,
Eric


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Roy T. Fielding
On May 4, 2011, at 6:24 PM, Jean-Daniel Cryans wrote:

> Non-binding -1.
> 
> I did download it and checked it out, but when I look at the
> documentation I see it says "Hadoop 0.20 documentation" in the tab on
> top. From what I can tell this isn't the branch 0.20 so I think it's
> an error and from a user point of view this looks more like something
> I would call 0.22 (although yes I understand this is 0.20 +security
> +whatever).
> 
> Why would a single company push so hard to go against the "normal"
> release process just for "the benefit of putting our work in the hands
> of all hadoop users" is beyond me. It's not like people were begging
> on the mailing lists to be able to get their hands on such a release
> to the point where an emergency point release including tons of new
> features is needed.
> 
> So to me the more logical reason would be monetary gains, that I would
> understand better from a for-profit company. But then why go through
> the hurdles of having such an ASF release when Y! isn't even selling
> anything remotely related to Hadoop services? And why now?
> 
> But then there's this spinoff thing and it suddenly makes a lot more sense.
> 
> E14 said earlier that "That is how apache works."
> 
> I would say yes, maybe this is how it works, but I'm not sure I want
> to see it working like _that_. The ASF shouldn't be the vehicle for a
> single (future) company's wishes.

The ASF is a vehicle for whomever wishes to collaborate on a
given project.  Collaboration means helping do the work.  Those
who do the work may do so for whatever reasons that they think
are good, whether it is because they feel like being charitable
today, they get paid a salary and the big boss said "work on
this part", or because they just have an itch worth scratching.

Apache does not care why people choose to collaborate or
how they choose to apply their own intellectual efforts.  We
welcome all forms of contribution under the terms of our license.

What we do require is a certain amount of civility regarding
our voting procedures and an emphasis on individual responsibility
for your votes.  Anyone caught *voting* a particular way just
because the boss says so will be dealt with severely.  Votes
are how we do quality control and make decisions, and no other
company can be allowed to make decisions for our non-profit.

Roy


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Dhruba Borthakur
+1.

I downloaded the bits, compiled and ran unit tests. Also, looked at the
source code to some extent. Looks good.

-dhruba

On Wed, May 4, 2011 at 6:56 PM, Roy T. Fielding  wrote:

> On May 4, 2011, at 5:39 PM, Eli Collins wrote:
>
> > The point is that these discussions should be sorted out, ie you don't
> > change your development and release model on a release VOTE thread,
> > you change it on a DISCUSSION thread.
>
> That is no different than saying you have a right to veto a
> release until the issue is addressed, which you don't have.
>
> A release vote is a majority decision.  If the majority
> decides to release, then whatever gets released will define
> the new norm by which policies are assumed.  If not released,
> then I suggest collaborating more on the policies before
> trying to vote again.
>
> Either way, we don't hold up a vote for the sake of a
> policy discussion because voting is a more efficient
> means of discovering if the policy really matters.
>
> Roy
>
>


-- 
Connect to me at http://www.facebook.com/dhruba


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Andrew Purtell
Speculation either on the motives of those objecting to a release or of those 
making contributions or proposing a release does not advance progress. The 
accusations and counter-accusations seen on this thread are regrettable and I 
feel less and less confident in the future of Apache Hadoop as time goes on. As 
a strong believer in and advocate of open source as an answer to technical and 
architectural challenges, I am pained to see the members of what should be a 
vibrant community litigating in an ultimately self-defeating way. If only this 
energy put into argument could be channeled into code or patches...

In open source, if opinions were code we would rule the world.

So what of this candidate?

Artifact looks good, DFS tests are good, MR tests are good. Looked over some of 
the documentation and found no errors. To my knowledge this is now a superset 
of branch-0.20, addressing the reasonably determined deficit of rc0.

There seems no reason other issues cannot be addressed subsequently.

There has not been a release of Apache Hadoop 0.20 since at least Feb 6 2010, 
yet since then important security enhancements have been contributed. In the 
form of an Apache product, however, these are only available as patches on a 
non-release branch. Forward progress of the Apache product seems more important 
than achieving the perfect release in all eyes.

For example, append features remain on a non-release branch. I would really 
have liked to see the append changes included in this candidate, but this is 
not grounds for objection, merely regret, and I hope this can be covered by a 
subsequent release, perhaps soon.

After security and append features are in 0.20, in my personal humble opinion 
the 0.20 release in total is sufficient and all attention should be paid to the 
next release (0.22 or whatever), except for critical bug fixes.

+1

Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
Tom White)



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Roy T. Fielding
On May 4, 2011, at 5:39 PM, Eli Collins wrote:

> The point is that these discussions should be sorted out, ie you don't
> change your development and release model on a release VOTE thread,
> you change it on a DISCUSSION thread.

That is no different than saying you have a right to veto a
release until the issue is addressed, which you don't have.

A release vote is a majority decision.  If the majority
decides to release, then whatever gets released will define
the new norm by which policies are assumed.  If not released,
then I suggest collaborating more on the policies before
trying to vote again.

Either way, we don't hold up a vote for the sake of a
policy discussion because voting is a more efficient
means of discovering if the policy really matters.

Roy



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Milind Bhandarkar
My (non-binding) vote for 0.20.203.0-rc1 is +1.

I downloaded, compiled, ran tests, ran my bigrams example, all ran
perfectly.
(I did a single node test without security on.)

The voting criteria I used are:

1. Is this a working release? : Yes
2. Does it take the codebase forward? : Yes
3. Does it have features that the user community might find valuable? : Yes

- milind

-- 
Milind Bhandarkar
mbhandar...@linkedin.com
+1-650-776-3167






On 5/4/11 6:10 PM, "Devaraj Das"  wrote:

>+1 based on some single node tests I did (with security ON).
>
>
>On 5/4/11 10:31 AM, "Owen O'Malley"  wrote:
>
>Here's an updated release candidate for 0.20.203.0. I've incorporated the
>feedback and included all of the patches from 0.20.2, which is the last
>stable release. I also fixed the eclipse-plugin problem.
>
>The candidate is at:
>http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
>Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>
>-- Owen
>



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Jean-Daniel Cryans
Non-binding -1.

I did download it and checked it out, but when I look at the
documentation I see it says "Hadoop 0.20 documentation" in the tab on
top. From what I can tell this isn't the branch 0.20 so I think it's
an error and from a user point of view this looks more like something
I would call 0.22 (although yes I understand this is 0.20 +security
+whatever).

Why would a single company push so hard to go against the "normal"
release process just for "the benefit of putting our work in the hands
of all hadoop users" is beyond me. It's not like people were begging
on the mailing lists to be able to get their hands on such a release
to the point where an emergency point release including tons of new
features is needed.

So to me the more logical reason would be monetary gains, that I would
understand better from a for-profit company. But then why go through
the hurdles of having such an ASF release when Y! isn't even selling
anything remotely related to Hadoop services? And why now?

But then there's this spinoff thing and it suddenly makes a lot more sense.

E14 said earlier that "That is how apache works."

I would say yes, maybe this is how it works, but I'm not sure I want
to see it working like _that_. The ASF shouldn't be the vehicle for a
single (future) company's wishes.

J-D

On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem.
>
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>
> -- Owen


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 6:18 PM, Eric Baldeschwieler
 wrote:
> Ok. I'll bite.
>
> The point of a vote is to learn what everyone thinks. So far we have learned:
>
> 1 - the team that is trying to contribute code and do a release thinks it is 
> ready.
>
> 2 - Cloudera does not think the release is a good idea.
>

I don't think that's true.  There's a difference between not
supporting a given rc and not supporting a release from this branch in
general.

With both of my hats on, I want code to be reviewed before being
released, I want releases to not regress against previous releases, I
don't want the next major release to regress against a stable release,
and I want the community to discuss new version schemes and development
models rather than adopting them by accident just because we voted on a
particular release.

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
> Entertaining concerns like a one-to-one
> correspondence between commits and JIRA issues is bizarre in this
> context.

It's not about whether there's a jira, it's about whether the code was
reviewed.  We think code should be reviewed and voted on by the
community before releasing it. That's how we've always rolled.

Everyone agrees releases are too infrequent, that's not an excuse for
steam rolling the community.

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eric Baldeschwieler
Ok. I'll bite.

The point of a vote is to learn what everyone thinks. So far we have learned:

1 - the team that is trying to contribute code and do a release thinks it is 
ready. 

2 - Cloudera does not think the release is a good idea. 

No more talk between the team contributing code and Cloudera will educate us 
further. Let's hear from the rest of the community. 

In parallel on other threads, let's work out how to address concerns. That will 
be useful however the vote goes. I promise to continue to work with everyone to 
help drive releases. 

We've called a vote, so let it proceed. That is how apache works. 

Thanks!

---
E14 - typing on glass

PS this is my last comment on this thread. Start new ones if you are not 
casting a vote. 

On May 4, 2011, at 5:45 PM, "Konstantin Boudnik"  wrote:

> I tend to agree. Changing the release model of the Apache Hadoop train
> isn't something that should be done in haste or as part of a release
> vote.
> 
> If these questions aren't addressed - let's postpone the vote and
> discuss all the complications or implications until they are sorted out
> or a consensus/compromise is reached.
> 
> Cos
> 
> On Wed, May 4, 2011 at 17:39, Eli Collins  wrote:
>> The point is that these discussions should be sorted out, ie you don't
>> change your development and release model on a release VOTE thread,
>> you change it on a DISCUSSION thread.
>> 
>> Ie before we release this we should understand what that means. What
>> is being proposed is not just another release from branch-0.20 or
>> branch-0.22.
>> 
>> Thanks,
>> Eli
>> 
>> On Wed, May 4, 2011 at 5:30 PM, Mahadev Konar  wrote:
>>> Eli,
>>>  I think the intent from the email was to just vote on this thread,
>>> which I agree with.
>>>  Discussions should be done in separate threads. Hopefully we can
>>> all stick to just voting!
>>> 
>>> thanks
>>> mahadev
>>> 
>>> On Wed, May 4, 2011 at 5:22 PM, Eli Collins  wrote:
>>>> Good suggestion, it would be helpful to hash out the issues around
>>>> compatibility, feature branches, version numbers, how to contribute at
>>>> Apache before putting up new votes that would be helpful, ie the vote
>>>> would go much smoother if all the issues with the previous vote were
>>>> addressed before starting a new one.
>>>>
>>>> Thanks,
>>>> Eli
>>>>
>>>> On Wed, May 4, 2011 at 5:05 PM, Eric Baldeschwieler
>>>>  wrote:
>>>>> Hi folks,
>>>>>
>>>>> Let's stay focused. Let's take the other threads onto other threads. This
>>>>> is a vote.
>>>>>
>>>>> To the extent naming is a problem, let's take that to a thread and find
>>>>> an acceptable proposal.
>>>>>
>>>>> To the extent folks want to collaborate on certifying the release for
>>>>> total lack of regression or collaborate on the cleanest possible merge, I
>>>>> think all interested parties should take these topics to another thread
>>>>> and divide up the work.
>>>>>
>>>>> If you've voted, you don't need to comment further on this thread, no
>>>>> matter what company you work for!
>>>>>
>>>>> Thanks,
>>>>>
>>>>> ---
>>>>> E14 - typing on glass
>>>>>
>>>>> On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:
>>>>>
>>>>>> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
>>>>>>
>>>>>>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>>>>>>
>>>>>>> The list seems highly inaccurate.  Checked the first few N/A items.  All
>>>>>>>> are
>>>>>>>> false positives.
>>>>>>>>
>>>>>>>>
>>>>>>> Also,  can you please provide a list on features which are not related
>>>>>>> to
>>>>>>> gridmix benchmarks or herriot tests?
>>>>>>>
>>>>>>
>>>>>> Here are a few I quickly pulled up:
>>>>>> MAPREDUCE-2316 (docs for improved capacity scheduler)
>>>>>> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
>>>>>>
>>>>>> "   BZ-4182948. Add statistics logging to Fred for better visibility into
>>>>>> startup time costs. (Matt Foley)"
>>>>>> - I believe I saw a note from Matt on the JIRA yesterday about this
>>>>>> feature,
>>>>>> where he decided that the version done in 203 wasn't a good approach, and
>>>>>> it's done differently in trunk (not sure if done yet).
>>>>>>
>>>>>> MAPREDUCE-2364 (important bug fix for localization)
>>>>>> - in fact most of localization is different in this branch compared to
>>>>>> trunk
>>>>>> due to inclusion of MAPREDUCE-2378, the trunk version of which is still
>>>>>> on
>>>>>> the "yahoo-merge" branch,.
>>>>>>
>>>>>> "New cunters for FileInput/OutputFormat. New Counter
>>>>>>        MAP_OUTPUT_MATERIALZIED_BYTES. Related bugs: 4241034, 3418543,
>>>>>> 4217546"
>>>>>> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but
>>>>>> not
>>>>>> committed.
>>>>>>
>>>>>> - MAPREDUCE-1904, committed without JIRA as:
>>>>>> ". Reducing new Path(), RawFileStatus() creation overhead in
>>>>>> LocalDirAllocator"
>>>>>> not in trunk
>>>>>>
>>>>>> +BZ4101537 .  When a queue is built without any access rights we
>>>>>> explain
>>>>>> the
>>>>>> +problem.  (dking, rv

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Chris Douglas
I'm +1 on releasing rc1. The signature and hashes match on the
artifact, ran some of the more aggressive MR tests. Reviewed changes
from rc0.

It looks like we need a FAQ for this release, if only to prevent the
same questions from being asked and answered across different threads
and lists. Reservations, regressions, and pending work can also be
documented there.

Right now, Apache Hadoop releases are not recommended by its
community. Instead, not only our end users, but other Apache projects
run Cloudera's distribution. From all those wearing their Apache hat,
I would like to see more effort directed toward a release that we can
recommend soon and less time spent compiling tasks to delay it.

Releasing this will complicate the documented process. However, that
process *has not produced a usable release* for the last two out of
six years. This is failure. Entertaining concerns like a one-to-one
correspondence between commits and JIRA issues is bizarre in this
context. Let's find a way to make progress instead of tossing
pharisaic accusations of illegitimacy. -C

On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem.
>
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>
> -- Owen


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Devaraj Das
+1 based on some single node tests I did (with security ON).


On 5/4/11 10:31 AM, "Owen O'Malley"  wrote:

Here's an updated release candidate for 0.20.203.0. I've incorporated the 
feedback and included all of the patches from 0.20.2, which is the last stable 
release. I also fixed the eclipse-plugin problem.

The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/

Please download it, inspect it, compile it, and test it. Clearly, I'm +1.

-- Owen



[DISCUSSION] Release rules

2011-05-04 Thread Tom White
One year ago (to the day!) Chris started a discussion about the
release manager role
(http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%3ch2q1267dd3b1005041331r7d8f696di370a279ff6058...@mail.gmail.com%3E).
In light of today's disagreements, I think we should restart this
discussion and incorporate these rules into the bylaws, since it
formalizes our practices.

I'm happy to drive this. We could start by discussing Chris' proposal
(see clarifications in
http://mail-archives.apache.org/mod_mbox/hadoop-general/201005.mbox/%3ct2y1267dd3b1005051201h7116e4caud75673ac9d512...@mail.gmail.com%3E),
then when we get consensus we can put the document on the website.
(BTW does anyone know if the bylaws were checked into SVN anywhere?
These belong together.)

Cheers,
Tom


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Konstantin Boudnik
I tend to agree. Changing the release model of the Apache Hadoop train
isn't something that should be done in haste or as part of a release
vote.

If these questions aren't addressed - let's postpone the vote and
discuss all the complications or implications until they are sorted out
or a consensus/compromise is reached.

Cos

On Wed, May 4, 2011 at 17:39, Eli Collins  wrote:
> The point is that these discussions should be sorted out, ie you don't
> change your development and release model on a release VOTE thread,
> you change it on a DISCUSSION thread.
>
> Ie before we release this we should understand what that means. What
> is being proposed is not just another release from branch-0.20 or
> branch-0.22.
>
> Thanks,
> Eli
>
> On Wed, May 4, 2011 at 5:30 PM, Mahadev Konar  wrote:
>> Eli,
>>  I think the intent from the email was to just vote on this thread,
>> which I agree with.
>>  Discussions should be done in separate threads. Hopefully we can
>> all stick to just voting!
>>
>> thanks
>> mahadev
>>
>> On Wed, May 4, 2011 at 5:22 PM, Eli Collins  wrote:
>>> Good suggestion, it would be helpful to hash out the issues around
>>> compatibility, feature branches, version numbers, how to contribute at
>>> Apache before putting up new votes that would be helpful, ie the vote
>>> would go much smoother if all the issues with the previous vote were
>>> addressed before starting a new one.
>>>
>>> Thanks,
>>> Eli
>>>
>>> On Wed, May 4, 2011 at 5:05 PM, Eric Baldeschwieler
>>>  wrote:
>>>> Hi folks,
>>>>
>>>> Let's stay focused. Let's take the other threads onto other threads. This
>>>> is a vote.
>>>>
>>>> To the extent naming is a problem, let's take that to a thread and find an
>>>> acceptable proposal.
>>>>
>>>> To the extent folks want to collaborate on certifying the release for
>>>> total lack of regression or collaborate on the cleanest possible merge, I
>>>> think all interested parties should take these topics to another thread
>>>> and divide up the work.
>>>>
>>>> If you've voted, you don't need to comment further on this thread, no
>>>> matter what company you work for!
>>>>
>>>> Thanks,
>>>>
>>>> ---
>>>> E14 - typing on glass
>>>>
>>>> On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:
>>>>
>>>>> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
>>>>>
>>>>>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>>>>>
>>>>>> The list seems highly inaccurate.  Checked the first few N/A items.  All
>>>>>>> are
>>>>>>> false positives.
>>>>>>>
>>>>>>>
>>>>>> Also,  can you please provide a list on features which are not related to
>>>>>> gridmix benchmarks or herriot tests?
>>>>>>
>>>>>
>>>>> Here are a few I quickly pulled up:
>>>>> MAPREDUCE-2316 (docs for improved capacity scheduler)
>>>>> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
>>>>>
>>>>> "   BZ-4182948. Add statistics logging to Fred for better visibility into
>>>>> startup time costs. (Matt Foley)"
>>>>> - I believe I saw a note from Matt on the JIRA yesterday about this
>>>>> feature,
>>>>> where he decided that the version done in 203 wasn't a good approach, and
>>>>> it's done differently in trunk (not sure if done yet).
>>>>>
>>>>> MAPREDUCE-2364 (important bug fix for localization)
>>>>> - in fact most of localization is different in this branch compared to
>>>>> trunk
>>>>> due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
>>>>> the "yahoo-merge" branch,.
>>>>>
>>>>> "New cunters for FileInput/OutputFormat. New Counter
>>>>>        MAP_OUTPUT_MATERIALZIED_BYTES. Related bugs: 4241034, 3418543,
>>>>> 4217546"
>>>>> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
>>>>> committed.
>>>>>
>>>>> - MAPREDUCE-1904, committed without JIRA as:
>>>>> "        . Reducing new Path(), RawFileStatus() creation overhead in
>>>>> LocalDirAllocator"
>>>>> not in trunk
>>>>>
>>>>> +    BZ4101537 .  When a queue is built without any access rights we
>>>>> explain
>>>>> the
>>>>> +    problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
>>>>> seems to be on trunk as MR-2411, but not committed, best I can tell,
>>>>> despite
>>>>> the JIRA there being resolved (based on looking at QueueManager in trunk)
>>>>>
>>>>> "        . Remove unnecessary reference to user configuration from
>>>>> TaskDistributedCacheManager causing memory leaks"
>>>>> Not in trunk, not sure which JIRA it might be.. probably part of 2178.
>>>>>
>>>>> Major new feature: MAPREDUCE-323 - very large rework of how job history
>>>>> files are managed
>>>>> Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
>>>>> probably will be attacked by different JIRAs
>>>>> Major new ops-visible feature: "metrics2" system
>>>>> Major new ops-visible feature: MAPREDUCE-291 job history can be viewed
>>>>> from
>>>>> a separate server
>>>>> Major new set of user-visible configurations: MAPREDUCE-1943 and friends
>>>>> which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)
>>>>>

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
The point is that these discussions should be sorted out, ie you don't
change your development and release model on a release VOTE thread,
you change it on a DISCUSSION thread.

Ie before we release this we should understand what that means. What
is being proposed is not just another release from branch-0.20 or
branch-0.22.

Thanks,
Eli

On Wed, May 4, 2011 at 5:30 PM, Mahadev Konar  wrote:
> Eli,
>  I think the intent from the email was to just vote on this thread,
> which I agree with.
>  Discussions should be done in separate threads. Hopefully we can
> all stick to just voting!
>
> thanks
> mahadev
>
> On Wed, May 4, 2011 at 5:22 PM, Eli Collins  wrote:
>> Good suggestion, it would be helpful to hash out the issues around
>> compatibility, feature branches, version numbers, how to contribute at
>> Apache before putting up new votes that would be helpful, ie the vote
>> would go much smoother if all the issues with the previous vote were
>> addressed before starting a new one.
>>
>> Thanks,
>> Eli
>>
>> On Wed, May 4, 2011 at 5:05 PM, Eric Baldeschwieler
>>  wrote:
>>> Hi folks,
>>>
>>> Let's stay focused. Let's take the other threads onto other threads. This 
>>> is a vote.
>>>
>>> To the extent naming is a problem, let's take that to a thread and find an 
>>> acceptable proposal.
>>>
>>> To the extent folks want to collaborate on certifying the release for total 
>>> lack of regression or collaborate on the cleanest possible merge, I think 
>>> all interested parties should take these topics to another thread and 
>>> divide up the work.
>>>
>>> If you've voted, you don't need to comment further on this thread, no 
>>> matter what company you work for!
>>>
>>> Thanks,
>>>
>>> ---
>>> E14 - typing on glass
>>>
>>> On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:
>>>
>>>> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
>>>>
>>>>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>>>>
>>>>> The list seems highly inaccurate.  Checked the first few N/A items.  All
>>>>>> are
>>>>>> false positives.
>>>>>>
>>>>>>
>>>>> Also,  can you please provide a list on features which are not related to
>>>>> gridmix benchmarks or herriot tests?
>>>>>
>>>>
>>>> Here are a few I quickly pulled up:
>>>> MAPREDUCE-2316 (docs for improved capacity scheduler)
>>>> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
>>>>
>>>> "   BZ-4182948. Add statistics logging to Fred for better visibility into
>>>> startup time costs. (Matt Foley)"
>>>> - I believe I saw a note from Matt on the JIRA yesterday about this
>>>> feature,
>>>> where he decided that the version done in 203 wasn't a good approach, and
>>>> it's done differently in trunk (not sure if done yet).
>>>>
>>>> MAPREDUCE-2364 (important bug fix for localization)
>>>> - in fact most of localization is different in this branch compared to
>>>> trunk
>>>> due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
>>>> the "yahoo-merge" branch,.
>>>>
>>>> "New cunters for FileInput/OutputFormat. New Counter
>>>>        MAP_OUTPUT_MATERIALZIED_BYTES. Related bugs: 4241034, 3418543,
>>>> 4217546"
>>>> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
>>>> committed.
>>>>
>>>> - MAPREDUCE-1904, committed without JIRA as:
>>>> "        . Reducing new Path(), RawFileStatus() creation overhead in
>>>> LocalDirAllocator"
>>>> not in trunk
>>>>
>>>> +    BZ4101537 .  When a queue is built without any access rights we
>>>> explain
>>>> the
>>>> +    problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
>>>> seems to be on trunk as MR-2411, but not committed, best I can tell,
>>>> despite
>>>> the JIRA there being resolved (based on looking at QueueManager in trunk)
>>>>
>>>> "        . Remove unnecessary reference to user configuration from
>>>> TaskDistributedCacheManager causing memory leaks"
>>>> Not in trunk, not sure which JIRA it might be.. probably part of 2178.
>>>>
>>>> Major new feature: MAPREDUCE-323 - very large rework of how job history
>>>> files are managed
>>>> Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
>>>> probably will be attacked by different JIRAs
>>>> Major new ops-visible feature: "metrics2" system
>>>> Major new ops-visible feature: MAPREDUCE-291 job history can be viewed from
>>>> a separate server
>>>> Major new set of user-visible configurations: MAPREDUCE-1943 and friends
>>>> which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)
>>>>
>>>> I have code to work on, so I won't keep going, but this is from looking at
>>>> the last couple months of 203.
>>>>
>>>> -Todd
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>
>>
>
>
>
> --
> thanks
> mahadev
> @mahadevkonar
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Mahadev Konar
Eli,
  I think the intent from the email was to just vote on this thread,
which I agree with.
 Discussions should be done in separate threads. Hopefully we can
all stick to just voting!

thanks
mahadev

On Wed, May 4, 2011 at 5:22 PM, Eli Collins  wrote:
> Good suggestion, it would be helpful to hash out the issues around
> compatibility, feature branches, version numbers, how to contribute at
> Apache before putting up new votes that would be helpful, ie the vote
> would go much smoother if all the issues with the previous vote were
> addressed before starting a new one.
>
> Thanks,
> Eli
>
> On Wed, May 4, 2011 at 5:05 PM, Eric Baldeschwieler
>  wrote:
>> Hi folks,
>>
>> Let's stay focused. Let's take the other threads onto other threads. This is 
>> a vote.
>>
>> To the extent naming is a problem, let's take that to a thread and find an 
>> acceptable proposal.
>>
>> To the extent folks want to collaborate on certifying the release for total 
>> lack of regression or collaborate on the cleanest possible merge, I think 
>> all interested parties should take these topics to another thread and divide 
>> up the work.
>>
>> If you've voted, you don't need to comment further on this thread, no matter 
>> what company you work for!
>>
>> Thanks,
>>
>> ---
>> E14 - typing on glass
>>
>> On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:
>>
>>> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
>>>
>>>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>>>
>>>> The list seems highly inaccurate.  Checked the first few N/A items.  All
>>>>> are
>>>>> false positives.
>>>>>
>>>>>
>>>> Also,  can you please provide a list on features which are not related to
>>>> gridmix benchmarks or herriot tests?
>>>>
>>>
>>> Here are a few I quickly pulled up:
>>> MAPREDUCE-2316 (docs for improved capacity scheduler)
>>> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
>>>
>>> "   BZ-4182948. Add statistics logging to Fred for better visibility into
>>> startup time costs. (Matt Foley)"
>>> - I believe I saw a note from Matt on the JIRA yesterday about this feature,
>>> where he decided that the version done in 203 wasn't a good approach, and
>>> it's done differently in trunk (not sure if done yet).
>>>
>>> MAPREDUCE-2364 (important bug fix for localization)
>>> - in fact most of localization is different in this branch compared to trunk
>>> due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
>>> the "yahoo-merge" branch,.
>>>
>>> "New cunters for FileInput/OutputFormat. New Counter
>>>        MAP_OUTPUT_MATERIALZIED_BYTES. Related bugs: 4241034, 3418543,
>>> 4217546"
>>> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
>>> committed.
>>>
>>> - MAPREDUCE-1904, committed without JIRA as:
>>> "        . Reducing new Path(), RawFileStatus() creation overhead in
>>> LocalDirAllocator"
>>> not in trunk
>>>
>>> +    BZ4101537 .  When a queue is built without any access rights we explain
>>> the
>>> +    problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
>>> seems to be on trunk as MR-2411, but not committed, best I can tell, despite
>>> the JIRA there being resolved (based on looking at QueueManager in trunk)
>>>
>>> "        . Remove unnecessary reference to user configuration from
>>> TaskDistributedCacheManager causing memory leaks"
>>> Not in trunk, not sure which JIRA it might be.. probably part of 2178.
>>>
>>> Major new feature: MAPREDUCE-323 - very large rework of how job history
>>> files are managed
>>> Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
>>> probably will be attacked by different JIRAs
>>> Major new ops-visible feature: "metrics2" system
>>> Major new ops-visible feature: MAPREDUCE-291 job history can be viewed from
>>> a separate server
>>> Major new set of user-visible configurations: MAPREDUCE-1943 and friends
>>> which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)
>>>
>>> I have code to work on, so I won't keep going, but this is from looking at
>>> the last couple months of 203.
>>>
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>



-- 
thanks
mahadev
@mahadevkonar


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
Good suggestion. It would be helpful to hash out the issues around
compatibility, feature branches, version numbers, and how to contribute at
Apache before putting up new votes, i.e. the vote would go much smoother
if all the issues with the previous vote were addressed before starting
a new one.

Thanks,
Eli

On Wed, May 4, 2011 at 5:05 PM, Eric Baldeschwieler
 wrote:
> Hi folks,
>
> Let's stay focused. Let's take the other threads onto other threads. This is 
> a vote.
>
> To the extent naming is a problem, let's take that to a thread and find an 
> acceptable proposal.
>
> To the extent folks want to collaborate on certifying the release for total 
> lack of regression or collaborate on the cleanest possible merge, I think all 
> interested parties should take these topics to another thread and divide up 
> the work.
>
> If you've voted, you don't need to comment further on this thread, no matter 
> what company you work for!
>
> Thanks,
>
> ---
> E14 - typing on glass
>
> On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:
>
>> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
>>
>>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>>
>>>> The list seems highly inaccurate.  Checked the first few N/A items.  All are
>>>> false positives.
>>>
>>> Also,  can you please provide a list on features which are not related to
>>> gridmix benchmarks or herriot tests?
>>>
>>
>> Here are a few I quickly pulled up:
>> MAPREDUCE-2316 (docs for improved capacity scheduler)
>> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
>>
>> "   BZ-4182948. Add statistics logging to Fred for better visibility into
>> startup time costs. (Matt Foley)"
>> - I believe I saw a note from Matt on the JIRA yesterday about this feature,
>> where he decided that the version done in 203 wasn't a good approach, and
>> it's done differently in trunk (not sure if done yet).
>>
>> MAPREDUCE-2364 (important bug fix for localization)
>> - in fact most of localization is different in this branch compared to trunk
>> due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
>> the "yahoo-merge" branch.
>>
>> "New counters for FileInput/OutputFormat. New Counter
>>        MAP_OUTPUT_MATERIALIZED_BYTES. Related bugs: 4241034, 3418543,
>> 4217546"
>> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
>> committed.
>>
>> - MAPREDUCE-1904, committed without JIRA as:
>> "        . Reducing new Path(), RawFileStatus() creation overhead in
>> LocalDirAllocator"
>> not in trunk
>>
>> +    BZ4101537 .  When a queue is built without any access rights we explain
>> the
>> +    problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
>> seems to be on trunk as MR-2411, but not committed, best I can tell, despite
>> the JIRA there being resolved (based on looking at QueueManager in trunk)
>>
>> "        . Remove unnecessary reference to user configuration from
>> TaskDistributedCacheManager causing memory leaks"
>> Not in trunk, not sure which JIRA it might be.. probably part of 2178.
>>
>> Major new feature: MAPREDUCE-323 - very large rework of how job history
>> files are managed
>> Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
>> probably will be attacked by different JIRAs
>> Major new ops-visible feature: "metrics2" system
>> Major new ops-visible feature: MAPREDUCE-291 job history can be viewed from
>> a separate server
>> Major new set of user-visible configurations: MAPREDUCE-1943 and friends
>> which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)
>>
>> I have code to work on, so I won't keep going, but this is from looking at
>> the last couple months of 203.
>>
>> -Todd
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eric Baldeschwieler
Hi folks,

Let's stay focused. Let's take the other threads onto other threads. This is a 
vote. 

To the extent naming is a problem, let's take that to a thread and find an 
acceptable proposal. 

To the extent folks want to collaborate on certifying the release for total 
lack of regression or collaborate on the cleanest possible merge, I think all 
interested parties should take these topics to another thread and divide up the 
work. 

If you've voted, you don't need to comment further on this thread, no matter 
what company you work for!

Thanks,

---
E14 - typing on glass

On May 4, 2011, at 4:46 PM, "Todd Lipcon"  wrote:

> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:
> 
>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>> 
>>> The list seems highly inaccurate.  Checked the first few N/A items.  All are
>>> false positives.
>>
>> Also,  can you please provide a list on features which are not related to
>> gridmix benchmarks or herriot tests?
>> 
> 
> Here are a few I quickly pulled up:
> MAPREDUCE-2316 (docs for improved capacity scheduler)
> MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)
> 
> "   BZ-4182948. Add statistics logging to Fred for better visibility into
> startup time costs. (Matt Foley)"
> - I believe I saw a note from Matt on the JIRA yesterday about this feature,
> where he decided that the version done in 203 wasn't a good approach, and
> it's done differently in trunk (not sure if done yet).
> 
> MAPREDUCE-2364 (important bug fix for localization)
> - in fact most of localization is different in this branch compared to trunk
> due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
> the "yahoo-merge" branch.
> 
> "New counters for FileInput/OutputFormat. New Counter
> MAP_OUTPUT_MATERIALIZED_BYTES. Related bugs: 4241034, 3418543,
> 4217546"
> - not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
> committed.
> 
> - MAPREDUCE-1904, committed without JIRA as:
> ". Reducing new Path(), RawFileStatus() creation overhead in
> LocalDirAllocator"
> not in trunk
> 
> +BZ4101537 .  When a queue is built without any access rights we explain
> the
> +problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
> seems to be on trunk as MR-2411, but not committed, best I can tell, despite
> the JIRA there being resolved (based on looking at QueueManager in trunk)
> 
> ". Remove unnecessary reference to user configuration from
> TaskDistributedCacheManager causing memory leaks"
> Not in trunk, not sure which JIRA it might be.. probably part of 2178.
> 
> Major new feature: MAPREDUCE-323 - very large rework of how job history
> files are managed
> Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
> probably will be attacked by different JIRAs
> Major new ops-visible feature: "metrics2" system
> Major new ops-visible feature: MAPREDUCE-291 job history can be viewed from
> a separate server
> Major new set of user-visible configurations: MAPREDUCE-1943 and friends
> which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)
> 
> I have code to work on, so I won't keep going, but this is from looking at
> the last couple months of 203.
> 
> -Todd
> -- 
> Todd Lipcon
> Software Engineer, Cloudera


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Arun C Murthy

On May 4, 2011, at 4:44 PM, Todd Lipcon wrote:

> On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy wrote:
>
>> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>>
>>> The list seems highly inaccurate.  Checked the first few N/A items.  All are
>>> false positives.
>>
>> Also, can you please provide a list on features which are not related to
>> gridmix benchmarks or herriot tests?
>
> Here are a few I quickly pulled up:


So, it's around 10? Approximately? Also, the ones you put up were  
reviewed via jira.


Please note that several of the ones you are pointing out are already
in the y-merge branch, which is nearly trunk, including MR-2378 as you
pointed out.


Thanks for the list, I'll ensure we work on forward porting them.

Arun



Re: [VOTE] Release candidate 0.20.203.0-rc0

2011-05-04 Thread Doug Cutting
On 05/03/2011 06:01 PM, Arun C Murthy wrote:
> On May 3, 2011, at 5:17 PM, "Doug Cutting"  wrote:
> 
>> On 05/02/2011 02:33 PM, Arun C Murthy wrote:
>>> Are you simply asking for someone to go through the 450 odd jiras and
>>> set 'fix-for' fields?
>>
>> Every other release we've made is well-correlated with Jira.  It should
>> not be difficult to achieve that for this one.  We could write a script
>> to take all 450 bug IDs from the change log and use Jira's command-line
>> tool to set the "fix-for" to be this 0.20+security release.  Would you
>> like help with that?
>>
> 
> Yes please, that would be great. Thanks!

Please find below a script that will add a fix-version to issues.

Doug

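#!/bin/bash

# Reads bug ids from standard input, one per line,
# and adds the fixVersion named on the command line to each.

if [ $# -ne 1 ]
then
  echo "Usage: $0 fixVersion < bugids" >&2
  exit 1
fi

fix=$1
echo "Setting fix version to $fix."

server=https://issues.apache.org/jira
jira=./jira-cli-2.0.0/jira.sh

set -e

# Prompt on the terminal: standard input carries the bug ids,
# so reading the credentials from it would swallow the first two ids.
read -r -p "Jira username: " user < /dev/tty
read -r -s -p "Jira password: " password < /dev/tty
echo

while read -r issue
do
  # skip blank lines
  [ -n "$issue" ] || continue

  # first read the old fix versions
  # (stdin is redirected so the tool cannot consume the remaining bug ids)
  old=$("$jira" -a getFieldValue --server "$server" \
    --password "$password" --user "$user" \
    --issue "$issue" --field fixVersions < /dev/null | \
    tail -n 1 | sed 's/([0-9]*)//g' | sed "s/'//g")

  # now update, adding new value
  # jira will ignore if this value is already present;
  # the leading comma is left out when there was no old value
  "$jira" -a updateIssue --server "$server" \
    --password "$password" --user "$user" \
    --issue "$issue" --fixVersions "${old:+${old},}${fix}" < /dev/null
done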
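
The bug ids fed to the script above have to come from the change log. What follows is only a minimal sketch of that step, under stated assumptions: the change log is a plain-text file (called CHANGES.txt here), its entries mention issue keys such as HADOOP-6304 or MAPREDUCE-2316, and only the HADOOP, HDFS and MAPREDUCE projects matter. The function name extract_issue_ids and the script name set-fix-version.sh are made up for illustration. Entries that carry only an internal id (for example the "BZ-" ones) are not matched and would still need to be mapped by hand.

```shell
#!/bin/bash

# Print every distinct Jira issue key found on standard input, one per line.
# Assumption: only the HADOOP, HDFS and MAPREDUCE projects are of interest.
extract_issue_ids() {
  grep -oE '(HADOOP|HDFS|MAPREDUCE)-[0-9]+' | sort -u
}

# Hypothetical usage (file and script names are assumptions):
#   extract_issue_ids < CHANGES.txt | ./set-fix-version.sh 0.20.203.0
```

Sorting with -u keeps each issue from being updated more than once when the change log mentions it in several entries.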


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Todd Lipcon
On Wed, May 4, 2011 at 4:11 PM, Arun C Murthy  wrote:

> On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:
>
>> The list seems highly inaccurate.  Checked the first few N/A items.  All are
>> false positives.
>
> Also,  can you please provide a list on features which are not related to
> gridmix benchmarks or herriot tests?
>

Here are a few I quickly pulled up:
MAPREDUCE-2316 (docs for improved capacity scheduler)
MAPREDUCE-2355 (adds new config for heartbeat dampening in MR)

"   BZ-4182948. Add statistics logging to Fred for better visibility into
startup time costs. (Matt Foley)"
- I believe I saw a note from Matt on the JIRA yesterday about this feature,
where he decided that the version done in 203 wasn't a good approach, and
it's done differently in trunk (not sure if done yet).

MAPREDUCE-2364 (important bug fix for localization)
- in fact most of localization is different in this branch compared to trunk
due to inclusion of MAPREDUCE-2378, the trunk version of which is still on
the "yahoo-merge" branch.

"New counters for FileInput/OutputFormat. New Counter
MAP_OUTPUT_MATERIALIZED_BYTES. Related bugs: 4241034, 3418543,
4217546"
- not sure which JIRA this is, I think I've seen a JIRA for trunk, but not
committed.

- MAPREDUCE-1904, committed without JIRA as:
". Reducing new Path(), RawFileStatus() creation overhead in
LocalDirAllocator"
not in trunk

+BZ4101537 .  When a queue is built without any access rights we explain
the
+problem.  (dking, rvw ramach)  [attachment of 2010-11-24]
seems to be on trunk as MR-2411, but not committed, best I can tell, despite
the JIRA there being resolved (based on looking at QueueManager in trunk)

". Remove unnecessary reference to user configuration from
TaskDistributedCacheManager causing memory leaks"
Not in trunk, not sure which JIRA it might be.. probably part of 2178.

Major new feature: MAPREDUCE-323 - very large rework of how job history
files are managed
Major change: MAPREDUCE-1100/MAPREDUCE-1176: unresolved on trunk, though
probably will be attacked by different JIRAs
Major new ops-visible feature: "metrics2" system
Major new ops-visible feature: MAPREDUCE-291 job history can be viewed from
a separate server
Major new set of user-visible configurations: MAPREDUCE-1943 and friends
which implement new limits in MapReduce (eg MAPREDUCE-1872 as well)

I have code to work on, so I won't keep going, but this is from looking at
the last couple months of 203.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Konstantin Boudnik
On Wed, May 4, 2011 at 15:06, Suresh Srinivas  wrote:
> Eli,
>
> How many of these patches that you find troublesome are in CDH already?

How is that relevant to the release vote and discrepancies listed in
Eli's email?

> Regards,
> Suresh
>
>
> On 5/4/11 3:03 PM, "Eli Collins"  wrote:
>
>> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
>>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>>> feedback and included all of the patches from 0.20.2, which is the last
>>> stable release. I also fixed the eclipse-plugin problem.
>>>
>>> The candidate is at: 
>>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>>>
>>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>>>
>>> -- Owen
>>
>> While rc2 is an improvement on rc1, I am -1 on this particular rc.  
>> Rationale:
>>
>> This rc contains many patches not yet committed to trunk. This would
>> cause the next major release (0.22) to be a feature regression against
>> our latest stable release (203), were 0.22 released soon.
>>
>> This rc contains many patches not yet reviewed by the community via
>> the normal process (jira, patch against trunk, merge to a release
>> branch). I think we should respect the existing community process that
>> has been used for all previous releases.
>>
>> This rc introduces a new development and branching model (new feature
>> development outside trunk) and Hadoop versioning scheme without
>> sufficient discussion or proposal of these changes with the community.
>>
>> We should establish new process before the release, a release is not
>> the appropriate mechanism for changing our review and development
>> process or versioning.
>>
>> I do support a release from branch-0.20-security that follows the
>> existing, established community process.
>>
>> Thanks,
>> Eli
>
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Mahadev Konar
+1 for the release.

I downloaded the release, verified checksums, built and deployed. Ran
randomwriter jobs on it.

Everything passes.

-- 
thanks
mahadev
@mahadevkonar

On Wed, May 4, 2011 at 3:05 PM, Arun C Murthy  wrote:
> On May 4, 2011, at 10:31 AM, Owen O'Malley wrote:
>
>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>> feedback and included all of the patches from 0.20.2, which is the last
>> stable release. I also fixed the eclipse-plugin problem.
>>
>> The candidate is at:
>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>>
>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>>
>
> +1
>
> Downloaded release, checked checksums, built, deployed single-node cluster.
>
> Arun
>
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Owen O'Malley

On May 4, 2011, at 1:17 PM, Allen Wittenauer wrote:

>   Am I misreading this, or are the MR protocols out of sync between 
> 0.20.203 and 0.21?  It would also appear that this is marked stable in 0.21. 
> What is the user impact?

The names of the protocols were changed, but the names of the protocols aren't 
user-facing. The protocols themselves also changed, as with all Hadoop major 
versions. (We need to switch to protobuf or something for RPC to provide wire 
compatibility.) 

-- Owen

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Arun C Murthy

On May 4, 2011, at 4:09 PM, Tsz Wo (Nicholas), Sze wrote:

> The list seems highly inaccurate.  Checked the first few N/A items.  All are
> false positives.



Also,  can you please provide a list on features which are not related  
to gridmix benchmarks or herriot tests?


Please remember, and I have said this on list and off-list, that many  
of the forward ports obviated the need for multiple patches which show  
up in the commit logs.


thanks,
Arun

> < HADOOP-6304 N/A -- fixed in trunk via HADOOP-7110 (Todd, it was fixed by you.
> Forgot?)
> < HADOOP-6598 N/A -- moved to HADOOP-6763 and committed to trunk
> < HADOOP-6653 N/A -- not applicable in trunk
> < HADOOP-6716 N/A -- as part of HADOOP-6815 which was committed to trunk
> < HADOOP-6718 N/A -- Incorporated in HADOOP-6706 for 0.22.
> < HADOOP-6776 N/A -- Tom White said "This is fixed in trunk, so can be closed."
>
> Regards,
> Nicholas
>
> From: Eli Collins
> To: general@hadoop.apache.org
> Sent: Wed, May 4, 2011 3:36:16 PM
> Subject: Re: [VOTE] Release candidate 0.20.203.0-rc1
>
> On Wed, May 4, 2011 at 3:29 PM, Jakob Homan wrote:
>> @Eli >> This rc contains many patches not yet committed to trunk.
>> If you've compiled this list, can you post it?
>
> Here's the list Todd posted yesterday:
>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201105.mbox/%3CBANLkTimKKbkuPCz61TU=8-no8z6pyhf...@mail.gmail.com%3E
>
> Thanks,
> Eli




Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 4:09 PM, Tsz Wo (Nicholas), Sze
 wrote:
> The list seems highly inaccurate.  Checked the first few N/A items.  All are
> false positives.

Yes, that's why those are marked N/A ie "Not applicable". Check out
the non N/A ones.

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Tsz Wo (Nicholas), Sze
The list seems highly inaccurate.  Checked the first few N/A items.  All are 
false positives.

< HADOOP-6304 N/A -- fixed in trunk via HADOOP-7110 (Todd, it was fixed by you. 
Forgot?)
< HADOOP-6598 N/A -- moved to HADOOP-6763 and committed to trunk
< HADOOP-6653 N/A -- not applicable in trunk
< HADOOP-6716 N/A -- as part of HADOOP-6815 which was committed to trunk
< HADOOP-6718 N/A -- Incorporated in HADOOP-6706 for 0.22.
< HADOOP-6776 N/A -- Tom White said "This is fixed in trunk, so can be closed."

Regards,
Nicholas






From: Eli Collins 
To: general@hadoop.apache.org
Sent: Wed, May 4, 2011 3:36:16 PM
Subject: Re: [VOTE] Release candidate 0.20.203.0-rc1

On Wed, May 4, 2011 at 3:29 PM, Jakob Homan  wrote:
> @Eli >> This rc contains many patches not yet committed to trunk.
> If you've compiled this list, can you post it?
>

Here's the list Todd posted yesterday:

http://mail-archives.apache.org/mod_mbox/hadoop-general/201105.mbox/%3CBANLkTimKKbkuPCz61TU=8-no8z6pyhf...@mail.gmail.com%3E


Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
> Your -1 vote essentially blocks the changes that are already available in
> CDH to be available from Apache open source!

As Eric mentioned, this thread is about an Apache release, not CDH.

My -1 vote does not block these changes from being released via
Apache. You cannot veto a release. Releases are lazy majority: the
release is only blocked if there are more -1 votes than +1 votes.
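
The counting rule described above can be sketched in a few lines of shell. This is only an illustration, not project tooling: the function name release_vote_passes and the vote counts are made up, and the floor of three binding +1 votes is taken from the ASF's general release-voting policy rather than from this thread.

```shell
#!/bin/bash

# Succeeds (exit status 0) when a release vote passes by majority approval:
# at least three binding +1 votes and more +1 votes than -1 votes.
# Arguments: number of binding +1 votes, number of binding -1 votes.
release_vote_passes() {
  local plus=$1 minus=$2
  [ "$plus" -ge 3 ] && [ "$plus" -gt "$minus" ]
}

# Example with made-up counts: five +1 votes against three -1 votes.
if release_vote_passes 5 3
then
  echo "release passes"
else
  echo "release does not pass"
fi
```

A single -1 therefore changes the tally but cannot stop a release on its own, which is the difference from a veto on a code change.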

If these changes are contributed on jira, discussed and reviewed, and
committed to trunk I'm happy to support the release.  There's a big
difference between asking that a release respect the Apache community
process and blocking it. If you want to get the release out, how about
contributing the work via the normal means so the community can review
it like we review all other code changes?

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Suresh Srinivas
Here is a snippet from your blog -
http://www.cloudera.com/blog/2010/10/cdh3-beta-3-now-available/

--

Security Enhancements
As one of the primary contributors and largest production users of Hadoop,
Yahoo! publishes the source tree for the version of Hadoop that they run on
their production clusters. We are pleased to announce that we have merged
Yahoo's source tree into CDH3b3. This merge brings many improvements
developed at Yahoo! into CDH, including improvements for MapReduce
scalability on 1000+-node clusters and several new tools for benchmarking
and testing Hadoop.
--

It would be great if you could list how many of the 192 changes were reviewed
and became part of CDH.

Your -1 vote essentially blocks the changes that are already available in
CDH to be available from Apache open source!


On 5/4/11 3:30 PM, "Todd Lipcon"  wrote:

> With Cloudera hat on, I agree with Eli's assessment.
> 
> With Apache hat on, I don't see how this is at all relevant to the task at
> hand. I would make the same arguments against taking CDH3 and releasing it
> as an ASF artifact -- we'd also have a certain amount of work to do to make
> sure that all of the patches are in trunk, first. Additionally, I'd want to
> outline what the inclusion criteria would be for that branch.
> 
> -Todd
> 
> On Wed, May 4, 2011 at 3:24 PM, Eli Collins  wrote:
> 
>> With my Cloudera hat on..
>> 
>> When we went through the 10x and 20x patches we only pulled a subset
>> of them, primarily for security and the general improvements that we
>> thought were good.  We found both incompatible changes and some
>> sketchy changes that we did not pull in from a quality perspective.
>> There is a big difference between a patch set that's acceptable for
>> Yahoo!'s user base and one that's a more general artifact.
>> 
>> When we evaluated the YDH patch sets we were using that frame of mind.
>> I'm now looking at it in terms of an Apache release. And the place to
>> review changes for an Apache release is on jira.
>> 
>> CDH3 is based on the latest stable Apache release (20.2) so it doesn't
>> regress against it.  I'm nervous about rebasing future releases on 203
>> because of the compatibility and quality implications.
>> 
>> Thanks,
>> Eli
>> 
>> 
>> On Wed, May 4, 2011 at 3:06 PM, Suresh Srinivas 
>> wrote:
>>> Eli,
>>> 
>>> How many of these patches that you find troublesome are in CDH already?
>>> 
>>> Regards,
>>> Suresh
>>> 
>>> 
>>> On 5/4/11 3:03 PM, "Eli Collins"  wrote:
>>> 
>>>> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley wrote:
>>>>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>>>>> feedback and included all of the patches from 0.20.2, which is the last
>>>>> stable release. I also fixed the eclipse-plugin problem.
>>>>>
>>>>> The candidate is at:
>>>>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>>>>>
>>>>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>>>>>
>>>>> -- Owen
>>>>
>>>> While rc2 is an improvement on rc1, I am -1 on this particular rc.
>>>> Rationale:
>>>>
>>>> This rc contains many patches not yet committed to trunk. This would
>>>> cause the next major release (0.22) to be a feature regression against
>>>> our latest stable release (203), were 0.22 released soon.
>>>>
>>>> This rc contains many patches not yet reviewed by the community via
>>>> the normal process (jira, patch against trunk, merge to a release
>>>> branch). I think we should respect the existing community process that
>>>> has been used for all previous releases.
>>>>
>>>> This rc introduces a new development and branching model (new feature
>>>> development outside trunk) and Hadoop versioning scheme without
>>>> sufficient discussion or proposal of these changes with the community.
>>>>
>>>> We should establish new process before the release, a release is not
>>>> the appropriate mechanism for changing our review and development
>>>> process or versioning.
>>>>
>>>> I do support a release from branch-0.20-security that follows the
>>>> existing, established community process.
>>>>
>>>> Thanks,
>>>> Eli
>>> 
>>> 
>> 
> 
> 



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 3:29 PM, Jakob Homan  wrote:
> @Eli >> This rc contains many patches not yet committed to trunk.
> If you've compiled this list, can you post it?
>

Here's the list Todd posted yesterday:

http://mail-archives.apache.org/mod_mbox/hadoop-general/201105.mbox/%3CBANLkTimKKbkuPCz61TU=8-no8z6pyhf...@mail.gmail.com%3E

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Todd Lipcon
With Cloudera hat on, I agree with Eli's assessment.

With Apache hat on, I don't see how this is at all relevant to the task at
hand. I would make the same arguments against taking CDH3 and releasing it
as an ASF artifact -- we'd also have a certain amount of work to do to make
sure that all of the patches are in trunk, first. Additionally, I'd want to
outline what the inclusion criteria would be for that branch.

-Todd

On Wed, May 4, 2011 at 3:24 PM, Eli Collins  wrote:

> With my Cloudera hat on..
>
> When we went through the 10x and 20x patches we only pulled a subset
> of them, primarily for security and the general improvements that we
> thought were good.  We found both incompatible changes and some
> sketchy changes that we did not pull in from a quality perspective.
> There is a big difference between a patch set that's acceptable for
> Yahoo!'s user base and one that's a more general artifact.
>
> When we evaluated the YDH patch sets we were using that frame of mind.
> I'm now looking at it in terms of an Apache release. And the place to
> review changes for an Apache release is on jira.
>
> CDH3 is based on the latest stable Apache release (20.2) so it doesn't
> regress against it.  I'm nervous about rebasing future releases on 203
> because of the compatibility and quality implications.
>
> Thanks,
> Eli
>
>
> On Wed, May 4, 2011 at 3:06 PM, Suresh Srinivas 
> wrote:
> > Eli,
> >
> > How many of these patches that you find troublesome are in CDH already?
> >
> > Regards,
> > Suresh
> >
> >
> > On 5/4/11 3:03 PM, "Eli Collins"  wrote:
> >
> >> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley wrote:
> >>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
> >>> feedback and included all of the patches from 0.20.2, which is the last
> >>> stable release. I also fixed the eclipse-plugin problem.
> >>>
> >>> The candidate is at:
> >>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
> >>>
> >>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
> >>>
> >>> -- Owen
> >>
> >> While rc2 is an improvement on rc1, I am -1 on this particular rc.
> >> Rationale:
> >>
> >> This rc contains many patches not yet committed to trunk. This would
> >> cause the next major release (0.22) to be a feature regression against
> >> our latest stable release (203), were 0.22 released soon.
> >>
> >> This rc contains many patches not yet reviewed by the community via
> >> the normal process (jira, patch against trunk, merge to a release
> >> branch). I think we should respect the existing community process that
> >> has been used for all previous releases.
> >>
> >> This rc introduces a new development and branching model (new feature
> >> development outside trunk) and Hadoop versioning scheme without
> >> sufficient discussion or proposal of these changes with the community.
> >>
> >> We should establish new process before the release, a release is not
> >> the appropriate mechanism for changing our review and development
> >> process or versioning.
> >>
> >> I do support a release from branch-0.20-security that follows the
> >> existing, established community process.
> >>
> >> Thanks,
> >> Eli
> >
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Jakob Homan
@Eli >> This rc contains many patches not yet committed to trunk.
If you've compiled this list, can you post it?

On Wed, May 4, 2011 at 3:24 PM, Eli Collins  wrote:
> With my Cloudera hat on..
>
> When we went through the 10x and 20x patches we only pulled a subset
> of them, primarily for security and the general improvements that we
> thought were good.  We found both incompatible changes and some
> sketchy changes that we did not pull in from a quality perspective.
> There is a big difference between a patch set that's acceptable for
> Yahoo!'s user base and one that's a more general artifact.
>
> When we evaluated the YDH patch sets we were using that frame of mind.
> I'm now looking at it in terms of an Apache release. And the place to
> review changes for an Apache release is on jira.
>
> CDH3 is based on the latest stable Apache release (20.2) so it doesn't
> regress against it.  I'm nervous about rebasing future releases on 203
> because of the compatibility and quality implications.
>
> Thanks,
> Eli
>
>
> On Wed, May 4, 2011 at 3:06 PM, Suresh Srinivas  
> wrote:
>> Eli,
>>
>> How many of these patches that you find troublesome are in CDH already?
>>
>> Regards,
>> Suresh
>>
>>
>> On 5/4/11 3:03 PM, "Eli Collins"  wrote:
>>
>>> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
>>>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>>>> feedback and included all of the patches from 0.20.2, which is the last
>>>> stable release. I also fixed the eclipse-plugin problem.
>>>>
>>>> The candidate is at:
>>>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>>>>
>>>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>>>>
>>>> -- Owen
>>>
>>> While rc2 is an improvement on rc1, I am -1 on this particular rc.  
>>> Rationale:
>>>
>>> This rc contains many patches not yet committed to trunk. This would
>>> cause the next major release (0.22) to be a feature regression against
>>> our latest stable release (203), were 0.22 released soon.
>>>
>>> This rc contains many patches not yet reviewed by the community via
>>> the normal process (jira, patch against trunk, merge to a release
>>> branch). I think we should respect the existing community process that
>>> has been used for all previous releases.
>>>
>>> This rc introduces a new development and branching model (new feature
>>> development outside trunk) and Hadoop versioning scheme without
>>> sufficient discussion or proposal of these changes with the community.
>>>
>>> We should establish new process before the release, a release is not
>>> the appropriate mechanism for changing our review and development
>>> process or versioning.
>>>
>>> I do support a release from branch-0.20-security that follows the
>>> existing, established community process.
>>>
>>> Thanks,
>>> Eli
>>
>>
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Todd Lipcon
-1 for the same reasons I outlined in my email yesterday. This is not a
community artifact following the community's processes, and thus should not
be an official release until those issues are addressed.

On Wed, May 4, 2011 at 3:17 PM, Doug Cutting  wrote:

> -1
>
> This candidate has lots of patches that are not in trunk, potentially
> adding regressions to 0.22 and 0.23.  This should be addressed before we
> release from 0.20-security.  We should also not move to four-component
> version numbering.  A release from the 0.20-security branch should
> perhaps be called 0.20.100.
>
> Doug
>
> On 05/04/2011 10:31 AM, Owen O'Malley wrote:
> > Here's an updated release candidate for 0.20.203.0. I've incorporated the
> feedback and included all of the patches from 0.20.2, which is the last
> stable release. I also fixed the eclipse-plugin problem.
> >
> > The candidate is at:
> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
> >
> > Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
> >
> > -- Owen
>



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
With my Cloudera hat on..

When we went through the 10x and 20x patches we only pulled a subset
of them, primarily for security and the general improvements that we
thought were good.  We found both incompatible changes and some
sketchy changes that we did not pull in from a quality perspective.
There is a big difference between a patch set that's acceptable for
Yahoo!'s user base and one that's a more general artifact.

When we evaluated the YDH patch sets we were using that frame of mind.
I'm now looking at it in terms of an Apache release. And the place to
review changes for an Apache release is on jira.

CDH3 is based on the latest stable Apache release (20.2) so it doesn't
regress against it.  I'm nervous about rebasing future releases on 203
because of the compatibility and quality implications.

Thanks,
Eli


On Wed, May 4, 2011 at 3:06 PM, Suresh Srinivas  wrote:
> Eli,
>
> How many of these patches that you find troublesome are in CDH already?
>
> Regards,
> Suresh
>
>
> On 5/4/11 3:03 PM, "Eli Collins"  wrote:
>
>> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
>>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>>> feedback and included all of the patches from 0.20.2, which is the last
>>> stable release. I also fixed the eclipse-plugin problem.
>>>
>>> The candidate is at: 
>>> http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>>>
>>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>>>
>>> -- Owen
>>
>> While rc2 is an improvement on rc1, I am -1 on this particular rc.  
>> Rationale:
>>
>> This rc contains many patches not yet committed to trunk. This would
>> cause the next major release (0.22) to be a feature regression against
>> our latest stable release (203), were 0.22 released soon.
>>
>> This rc contains many patches not yet reviewed by the community via
>> the normal process (jira, patch against trunk, merge to a release
>> branch). I think we should respect the existing community process that
>> has been used for all previous releases.
>>
>> This rc introduces a new development and branching model (new feature
>> development outside trunk) and Hadoop versioning scheme without
>> sufficient discussion or proposal of these changes with the community.
>>
>> We should establish a new process before the release; a release is not
>> the appropriate mechanism for changing our review and development
>> process or versioning.
>>
>> I do support a release from branch-0.20-security that follows the
>> existing, established community process.
>>
>> Thanks,
>> Eli
>
>


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Doug Cutting
-1

This candidate has lots of patches that are not in trunk, potentially
adding regressions to 0.22 and 0.23.  This should be addressed before we
release from 0.20-security.  We should also not move to four-component
version numbering.  A release from the 0.20-security branch should
perhaps be called 0.20.100.
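
To make the numbering question concrete, here is a small sketch of how dotted version strings order when compared component by component. The helper name `version_key` is hypothetical, and the comparison rule (numeric, left to right) is an assumption about how a reader or tool would sort them, not something the project defines.

```python
def version_key(version):
    """Turn a dotted version such as '0.20.203.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Version strings mentioned in this thread, in arbitrary order.
candidates = ["0.22.0", "0.20.203.0", "0.20.2", "0.20.100", "0.21.0"]

# Sort numerically per component rather than as plain strings.
for version in sorted(candidates, key=version_key):
    print(version)
```

Under that assumed ordering, both 0.20.100 and 0.20.203.0 sort after 0.20.2 and before 0.21.0; the three-component form simply avoids the extra field.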

Doug

On 05/04/2011 10:31 AM, Owen O'Malley wrote:
> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem. 
> 
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
> 
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
> 
> -- Owen


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Suresh Srinivas
Eli,

How many of these patches that you find troublesome are in CDH already?

Regards,
Suresh


On 5/4/11 3:03 PM, "Eli Collins"  wrote:

> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
>> Here's an updated release candidate for 0.20.203.0. I've incorporated the
>> feedback and included all of the patches from 0.20.2, which is the last
>> stable release. I also fixed the eclipse-plugin problem.
>> 
>> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>> 
>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>> 
>> -- Owen
> 
> While rc2 is an improvement on rc1, I am -1 on this particular rc.  Rationale:
> 
> This rc contains many patches not yet committed to trunk. This would
> cause the next major release (0.22) to be a feature regression against
> our latest stable release (203), were 0.22 released soon.
> 
> This rc contains many patches not yet reviewed by the community via
> the normal process (jira, patch against trunk, merge to a release
> branch). I think we should respect the existing community process that
> has been used for all previous releases.
> 
> This rc introduces a new development and branching model (new feature
> development outside trunk) and Hadoop versioning scheme without
> sufficient discussion or proposal of these changes with the community.
> 
> We should establish a new process before the release; a release is not
> the appropriate mechanism for changing our review and development
> process or versioning.
> 
> I do support a release from branch-0.20-security that follows the
> existing, established community process.
> 
> Thanks,
> Eli



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Arun C Murthy

On May 4, 2011, at 10:31 AM, Owen O'Malley wrote:

> Here's an updated release candidate for 0.20.203.0. I've
> incorporated the feedback and included all of the patches from
> 0.20.2, which is the last stable release. I also fixed the
> eclipse-plugin problem.
>
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
> Please download it, inspect it, compile it, and test it. Clearly,
> I'm +1.




+1

Downloaded release, checked checksums, built, deployed single-node  
cluster.


Arun



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem.
>
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>
> -- Owen

While rc2 is an improvement on rc1, I am -1 on this particular rc.  Rationale:

This rc contains many patches not yet committed to trunk. This would
cause the next major release (0.22) to be a feature regression against
our latest stable release (203), were 0.22 released soon.

This rc contains many patches not yet reviewed by the community via
the normal process (jira, patch against trunk, merge to a release
branch). I think we should respect the existing community process that
has been used for all previous releases.

This rc introduces a new development and branching model (new feature
development outside trunk) and Hadoop versioning scheme without
sufficient discussion or proposal of these changes with the community.

We should establish a new process before the release; a release is not
the appropriate mechanism for changing our review and development
process or versioning.

I do support a release from branch-0.20-security that follows the
existing, established community process.

Thanks,
Eli


Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eric Baldeschwieler
Hi Folks,

This is a release vote, let's stay focused.  On this thread I think appropriate 
responses are either 

+1 and some short commentary  (assuming you've tried it and it works)

or

-1 and some short commentary.  It would also be cool if you noted if you've 
tried it.



In the spirit of my feedback, I'll respond to this under another subject.

Thanks,

E14

On May 4, 2011, at 12:17 PM, Eli Collins wrote:

> On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
>> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
>> feedback and included all of the patches from 0.20.2, which is the last 
>> stable release. I also fixed the eclipse-plugin problem.
>> 
>> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>> 
>> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>> 
>> -- Owen
> 
> Hey Owen,
> 
> Thanks for incorporating all the feedback and additional changes. It's
> great that this release won't be a regression against our previous
> stable release.
> 
> I would like to call out that we are not just voting to adopt a
> particular release, we are starting a new version scheme for the
> project, doing new feature development on maintenance release branches
> (before trunk), and we're saying it's OK to release software that
> hasn't been reviewed by the community.
> 
> I'd like to hear from our development community not just that we want
> to do a release from this branch but that we want to adopt these other
> changes as well. Here's a summary of the major *remaining* issues and
> a recommendation on how to proceed:
> 
> 1. There are about 50 changes that have jiras and are committed to
> the branch but are not yet in trunk. The next release (0.22) will be
> a regression against this release with respect to these particular
> changes. Recommendation: we should get these changes into trunk before
> releasing so that new features do not show up in maintenance branches
> first.
> 
> 2. There are 192 patches that were committed to the branch without
> reference to any Jira in the commit message. Some of these may have
> already been forward ported, but it is very difficult to match them up
> and evaluate which ones have been committed. Some are troublesome:
> when spot-checking the commits I found some made by non-committers
> with no public review that introduced apparent performance
> regressions (e.g. see HADOOP-7255). Recommendation: we should update
> the commit log to make sure there is a jira for each issue, and that
> all changes have been reviewed/committed. This is the way we've
> always done releases.
> 
> 3. In the new versioning scheme, major.minor.point.X, the new "X"
> component allows for new feature development on point releases.
> Recommendation: we should discuss in a separate thread whether we want
> to do new feature development on maintenance branches and, if so,
> whether to adopt this new version scheme.
> 
> Thanks,
> Eli



Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Allen Wittenauer

On May 4, 2011, at 10:31 AM, Owen O'Malley wrote:

> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem. 
> 
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
> 
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.


Am I misreading this, or are the MR protocols out of sync between 
0.20.203 and 0.21?  It would also appear that this is marked stable in 0.21. 
What is the user impact?




Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Eli Collins
On Wed, May 4, 2011 at 10:31 AM, Owen O'Malley  wrote:
> Here's an updated release candidate for 0.20.203.0. I've incorporated the 
> feedback and included all of the patches from 0.20.2, which is the last 
> stable release. I also fixed the eclipse-plugin problem.
>
> The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/
>
> Please download it, inspect it, compile it, and test it. Clearly, I'm +1.
>
> -- Owen

Hey Owen,

Thanks for incorporating all the feedback and additional changes. It's
great that this release won't be a regression against our previous
stable release.

I would like to call out that we are not just voting to adopt a
particular release, we are starting a new version scheme for the
project, doing new feature development on maintenance release branches
(before trunk), and we're saying it's OK to release software that
hasn't been reviewed by the community.

I'd like to hear from our development community not just that we want
to do a release from this branch but that we want to adopt these other
changes as well. Here's a summary of the major *remaining* issues and
a recommendation on how to proceed:

1. There are about 50 changes that have jiras and are committed to
the branch but are not yet in trunk. The next release (0.22) will be
a regression against this release with respect to these particular
changes. Recommendation: we should get these changes into trunk before
releasing so that new features do not show up in maintenance branches
first.

2. There are 192 patches that were committed to the branch without
reference to any Jira in the commit message. Some of these may have
already been forward ported, but it is very difficult to match them up
and evaluate which ones have been committed. Some are troublesome:
when spot-checking the commits I found some made by non-committers
with no public review that introduced apparent performance
regressions (e.g. see HADOOP-7255). Recommendation: we should update
the commit log to make sure there is a jira for each issue, and that
all changes have been reviewed/committed. This is the way we've
always done releases.

3. In the new versioning scheme, major.minor.point.X, the new "X"
component allows for new feature development on point releases.
Recommendation: we should discuss in a separate thread whether we want
to do new feature development on maintenance branches and, if so,
whether to adopt this new version scheme.
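
The kind of audit behind item 2 could be sketched roughly as follows. This is only an illustration, not the script that produced the numbers above: the function name, the sample commit messages, and the list of Jira project prefixes (HADOOP, HDFS, MAPREDUCE) are all assumptions.

```python
import re

# Assumed Jira project keys; adjust to whatever projects the branch draws from.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|MAPREDUCE)-\d+\b")


def split_by_jira_reference(messages):
    """Partition commit messages into (with_jira, without_jira) lists."""
    with_jira, without_jira = [], []
    for message in messages:
        if JIRA_KEY.search(message):
            with_jira.append(message)
        else:
            without_jira.append(message)
    return with_jira, without_jira


# Hypothetical commit messages, for illustration only.
sample = [
    "HADOOP-7255. Placeholder summary for a change with a Jira key.",
    "Placeholder summary for a change with no Jira key",
    "HDFS-0000. Another placeholder summary.",
]
tracked, untracked = split_by_jira_reference(sample)
print(len(tracked), "with a Jira key;", len(untracked), "without")
```

In practice the messages would come from the branch's commit log; the commits in the second list are the ones that would need a jira opened or matched before release.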

Thanks,
Eli


[VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Owen O'Malley
Here's an updated release candidate for 0.20.203.0. I've incorporated the 
feedback and included all of the patches from 0.20.2, which is the last stable 
release. I also fixed the eclipse-plugin problem. 

The candidate is at: http://people.apache.org/~omalley/hadoop-0.20.203.0-rc1/

Please download it, inspect it, compile it, and test it. Clearly, I'm +1.

-- Owen