Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-21 Thread Steve Loughran
On 17/11/11 19:31, Roman Shaposhnik wrote: On Thu, Nov 17, 2011 at 11:09 AM, Arun C Murthy a...@hortonworks.com wrote: I don't know which are the ones in 'every single downstream component' - care to enumerate? The ones I'm aware of, which have since been fixed are:

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-17 Thread Roman Shaposhnik
On Thu, Nov 17, 2011 at 2:45 AM, Steve Loughran ste...@apache.org wrote: -0.23 is a superset of the MR and HDFS APIs compatible with previous versions (I don't know or care whether or not it is a proper superset or not). The goal here is that end user apps and higher levels in the stack

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-17 Thread Alejandro Abdelnur
On Thu, Nov 17, 2011 at 2:45 AM, Steve Loughran ste...@apache.org wrote: ... What I will miss in 0.23 is the MiniMRCluster, which I consider to be part of the API. Certainly its why I pull in hadoop-common-test-0.20.20x.jar into downstream builds, because it is the simplest way to do basic

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-17 Thread Roman Shaposhnik
On Thu, Nov 17, 2011 at 11:09 AM, Arun C Murthy a...@hortonworks.com wrote: I don't know which are the ones in 'every single downstream component' - care to enumerate? The ones I'm aware of, which have since been fixed are: https://issues.apache.org/jira/browse/HBASE-4510 -

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-17 Thread Andrew Purtell
From: Arun C Murthy a...@hortonworks.com Now, a downstream project such as HBase, Hive or Pig isn't the 'normal end-user application'. These projects can choose to use undocumented/non-public (e.g. LimitedPrivate) apis and we are committed to working with them to ensure a smooth

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Doug Cutting
On 11/15/2011 06:06 PM, Konstantin Boudnik wrote: Are you suggesting to drop 0.22 out of the picture all together? Any reason for that? By no means. I thought that we might, as Scott Carey said, treat 0.22 as a minor release in the 1.x series. I'd prefer that we consistently rename branches

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Konstantin Boudnik
On Wed, Nov 16, 2011 at 09:15AM, Doug Cutting wrote: On 11/15/2011 06:06 PM, Konstantin Boudnik wrote: Are you suggesting to drop 0.22 out of the picture all together? Any reason for that? By no means. I thought that we might, as Scott Carey said, treat 0.22 as a minor release in the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Scott Carey
On 11/16/11 9:24 AM, Konstantin Boudnik c...@apache.org wrote: On Wed, Nov 16, 2011 at 09:15AM, Doug Cutting wrote: On 11/15/2011 06:06 PM, Konstantin Boudnik wrote: Are you suggesting to drop 0.22 out of the picture all together? Any reason for that? By no means. I thought that we

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Matt Foley
I support giving all three active code branches a clean start, on an equal footing: - The next release of 0.20-security (formerly expected as 0.20.205.1) to be 1.0.0, establishing branch-1.0 - The next release of 0.22 to be 2.0.0, establishing branch-2.0 - The recent release of 0.23.0 to be

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Joe Stein
+1 to Owen's slight modification and to Matt's proposal with a minor (no pun intended) suggestion: branch-0.20-security -> branch-1.0, branch-0.20-security-205 -> branch-1.1.0. On Wed, Nov 16, 2011 at 4:37 PM, Owen O'Malley o...@hortonworks.com wrote: +1 to Matt's proposal, although I'd modify it

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Arun C Murthy
On Nov 16, 2011, at 11:57 AM, Doug Cutting wrote: On 11/16/2011 10:15 AM, Scott Carey wrote: - Should hadoop adopt a new clear definition of major.minor.patch number significance? Would you care to call a vote on one or both of these? Great points Scott and Doug. I agree about the need

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Doug Cutting
On 11/16/2011 02:43 PM, Arun C Murthy wrote: I propose we adopt the convention that a new major version should be a superset of the previous major version, features-wise. That means that we could never discard a feature, no? One definition is that a major release includes some fundamental

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Roman Shaposhnik
On Wed, Nov 16, 2011 at 1:11 PM, Matt Foley mfo...@hortonworks.com wrote: I support giving all three active code branches a clean start, on an equal footing: - The next release of 0.20-security (formerly expected as 0.20.205.1) to be 1.0.0, establishing branch-1.0 - The next release of 0.22

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Arun C Murthy
Agreed. We will discard features as we go along, but we need to have consensus to discard major features. Is that fair? And we discard them for reasons you outlined... Arun On Nov 16, 2011, at 3:02 PM, Doug Cutting wrote: On 11/16/2011 02:43 PM, Arun C Murthy wrote: I propose we adopt the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Arun C Murthy
On Nov 16, 2011, at 3:02 PM, Doug Cutting wrote: Another definition is that a major release permits incompatible changes, either in APIs, wire-formats, on-disk formats, etc. This is more objective measure. For example, one might in release X+1 deprecate features of release X but still

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Andrew Purtell
On Wed, Nov 16, 2011 at 1:11 PM, Matt Foley wrote: I support giving all three active code branches a clean start, on an equal footing: - The next release of 0.20-security (formerly expected as 0.20.205.1) to be 1.0.0, establishing branch-1.0 - The next release of 0.22 to be 2.0.0,

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Sanjay Radia
On Nov 16, 2011, at 3:02 PM, Doug Cutting wrote: Another definition is that a major release permits incompatible changes, either in APIs, wire-formats, on-disk formats, etc. This is more objective measure. For example, one might in release X+1 deprecate features of release X but still

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-16 Thread Konstantin Boudnik
On Wed, Nov 16, 2011 at 01:11PM, Matt Foley wrote: I support giving all three active code branches a clean start, on an equal footing: - The next release of 0.20-security (formerly expected as 0.20.205.1) to be 1.0.0, establishing branch-1.0 - The next release of 0.22 to be 2.0.0,

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Steve Loughran
On 15/11/11 06:07, Dhruba Borthakur wrote: +1 to making the upcoming 0.23 release as 2.0. +1 And leave the 0.20.20x chain as is, just because people are used to it

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Todd Lipcon
On Tue, Nov 15, 2011 at 1:57 AM, Steve Loughran ste...@apache.org wrote: On 15/11/11 06:07, Dhruba Borthakur wrote: +1 to making the upcoming 0.23 release as 2.0. +1 And leave the 0.20.20x chain as is, just because people are used to it +1 to Steve's proposal. Renaming 0.20 is too big a

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Owen O'Malley
On Tue, Nov 15, 2011 at 1:43 PM, Todd Lipcon t...@cloudera.com wrote: On Tue, Nov 15, 2011 at 1:57 AM, Steve Loughran ste...@apache.org wrote: On 15/11/11 06:07, Dhruba Borthakur wrote: +1 to making the upcoming 0.23 release as 2.0. +1 And leave the 0.20.20x chain as is, just

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Ted Dunning
On Tue, Nov 15, 2011 at 2:17 PM, Owen O'Malley o...@hortonworks.com wrote: On Tue, Nov 15, 2011 at 1:43 PM, Todd Lipcon t...@cloudera.com wrote: On Tue, Nov 15, 2011 at 1:57 AM, Steve Loughran ste...@apache.org wrote: On 15/11/11 06:07, Dhruba Borthakur wrote: +1 to making the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Arun C Murthy
I don't see this as 'renaming', I propose we just look forward and make the next release from branch-0.20-security as 1.0 to keep things simple. IMHO, going back to rename existing releases (0.21 etc.) isn't productive. Arun On Nov 15, 2011, at 1:43 PM, Todd Lipcon wrote: On Tue, Nov 15,

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Luke Lu
+1 on *new* releases from 0.20.2xx branches as 1.x; 0.22 branch as 2.x and 0.23/24 branches as 3.x. On Tue, Nov 15, 2011 at 2:32 PM, Arun C Murthy a...@hortonworks.com wrote: I don't see this as 'renaming', I propose we just look forward and make the next release from branch-0.20-security as

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Doug Cutting
On 11/15/2011 01:43 PM, Todd Lipcon wrote: +1 to Steve's proposal. Renaming 0.20 is too big a pain at this point. Everyone seems to agree that we should rename 0.23 to either 2.0 or 3.0. There are a number of different views about what to do with 0.20, 0.21 and 0.22. So maybe we should proceed

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Ahmed Radwan
+1 Can we agree to 0.23 -> 2.0? That's consistent with the MR2 nomenclature. Best Regards Ahmed On Tue, Nov 15, 2011 at 5:37 PM, Doug Cutting cutt...@apache.org wrote: On 11/15/2011 01:43 PM, Todd Lipcon wrote: +1 to Steve's proposal. Renaming 0.20 is too big a pain at this point. Everyone

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Eli Collins
On Tue, Nov 15, 2011 at 5:37 PM, Doug Cutting cutt...@apache.org wrote: On 11/15/2011 01:43 PM, Todd Lipcon wrote: +1 to Steve's proposal. Renaming 0.20 is too big a pain at this point. Everyone seems to agree that we should rename 0.23 to either 2.0 or 3.0.  There are a number of different

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Doug Cutting
On 11/15/2011 05:49 PM, Eli Collins wrote: Are you suggesting a two part version scheme? Ie 0.23.0 -> 2.0, 0.23.1 -> 2.1 I didn't specify. We could either do that or: 0.23.0 -> 2.0.0, 0.23.1 -> 2.0.1, ... 0.24.0 -> 2.1.0 ... I don't care which much. Do you? fwiw I'd map
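
The two numbering schemes discussed in this message can be compared with a small sketch. This is only an illustration of the mappings quoted above; the function names `two_part` and `three_part` are hypothetical, and the three-part rule assumes that each 0.2x minor after 0.23 becomes the next 2.x minor, which the message implies only through the 0.24.0 example.

```python
def _split(version: str) -> tuple[int, int, int]:
    """Split an old-style '0.minor.patch' version into integers."""
    zero, minor, patch = (int(part) for part in version.split("."))
    if zero != 0:
        raise ValueError(f"expected a 0.x.y version, got {version!r}")
    return zero, minor, patch


def two_part(version: str) -> str:
    """Two-part scheme: 0.23.y becomes 2.y (only defined for 0.23.y)."""
    _, minor, patch = _split(version)
    if minor != 23:
        raise ValueError("the two-part example only covers 0.23.y")
    return f"2.{patch}"


def three_part(version: str) -> str:
    """Three-part scheme: 0.(23+k).y becomes 2.k.y (assumed rule)."""
    _, minor, patch = _split(version)
    if minor < 23:
        raise ValueError("the three-part example starts at 0.23")
    return f"2.{minor - 23}.{patch}"


if __name__ == "__main__":
    for old in ("0.23.0", "0.23.1"):
        print(old, "->", two_part(old), "or", three_part(old))
    print("0.24.0", "->", three_part("0.24.0"))
```

The practical difference is where patch releases land: the two-part scheme spends the minor position on them, while the three-part scheme keeps it free for the next feature branch.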

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Konstantin Boudnik
I believe it has been advocated a number of times in that thread to release 0.22 as 2.0. Are you suggesting to drop 0.22 out of the picture all together? Any reason for that? Thanks, Cos On Tue, Nov 15, 2011 at 05:37PM, Doug Cutting wrote: On 11/15/2011 01:43 PM, Todd Lipcon wrote: +1 to

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Matt Foley
I agree with some prior posters that renaming the 0.20-security sustaining branch could be confusing. How about the following (pseudo-code)? ## Just before we are ready to make rc0 for release 0.20.205.1, do: svn copy branch-0.20-security-205 branch-1.0 ## and actually release it from branch-1.0

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Konstantin Boudnik
And once again - 0.22 seems to be forgotten for an unexplained reason. I urge to stick to original Arun's proposal and use 0.22 as 2.0. With the correction I like the following proposal. Cos On Tue, Nov 15, 2011 at 06:42PM, Matt Foley wrote: I agree with some prior posters that renaming the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Joe Stein
Consistency between supported branches and releases from trunk in some logical order would be helpful for those outside of the community coming in, labeled however works best for the active community. My 0.235689 cents. /* Joe Stein http://www.medialets.com Twitter: @allthingshadoop */ On Nov

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Arun Murthy
I think this discussion is getting too wide, can we tease them apart? Do we agree we should call the forthcoming releases off branch-0.20-security as 1.x.x? Let me start a vote for just that. Arun Sent from my iPhone On Nov 15, 2011, at 6:43 PM, Matt Foley mfo...@hortonworks.com wrote: I

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Arun Murthy
On Nov 15, 2011, at 6:03 PM, Eli Collins e...@cloudera.com wrote: On Tue, Nov 15, 2011 at 5:56 PM, Doug Cutting cutt...@apache.org wrote: On 11/15/2011 05:49 PM, Eli Collins wrote: Are you suggesting a two part version scheme? Ie 0.23.0 -> 2.0, 0.23.1 -> 2.1 I didn't specify. We could

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Eli Collins
On Tue, Nov 15, 2011 at 8:14 PM, Arun Murthy a...@hortonworks.com wrote: I think this discussion is getting too wide, can we tease them apart? Do we agree we should call the forthcoming releases off branch-0.20-security as 1.x.x? Let me start a vote for just that. +1 IMO the values of x.x

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Joe Stein
If trunk releases would then mean 2.x.x then the branch 1.x.x (0.20.206.0 being 1.6.0) makes total sense +1 (not binding) so the current trunk release = 2.0.0 and the branch release 0.20.206.0 = 1.6.0 speaking from those of us that have 4,000 nodes in our cluster and want to proliferate the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Arun Murthy
Eli, Seems to me that trying to 'carry over' numbers from 0.20.2xx would, at best, lead to confusion... similar to folks asking for non-existent 0.20.201/202. I propose we look forward with hadoop-1.0.0 as the supported release with security+append to keep things simple. Thoughts? thanks, Arun

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Eli Collins
On Tue, Nov 15, 2011 at 9:53 PM, Arun Murthy a...@hortonworks.com wrote: Eli, Seems to me that trying to 'carry over' numbers from 0.20.2xx would, at best, lead to confusion... similar to folks asking for non-existent 0.20.201/202. I propose we look forward with hadoop-1.0.0 as the

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Scott Carey
On 11/15/11 6:47 PM, Konstantin Boudnik c...@apache.org wrote: And once again - 0.22 seems to be forgotten for an unexplained reason. I urge to stick to original Arun's proposal and use 0.22 as 2.0. With the correction I like the following proposal. If 0.20.20x ends up in the 1.0.x line, then

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Arun Murthy
Thanks Eli. In keeping with the theme of 'looking ahead' I was thinking of upcoming 0.20.205.1 as 1.0.0. I'll clarify in the voting thread too. Sent from my iPhone On Nov 15, 2011, at 10:13 PM, Eli Collins e...@cloudera.com wrote: On Tue, Nov 15, 2011 at 9:53 PM, Arun Murthy

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-15 Thread Konstantin Shvachko
Consistency of naming the releases is a very valid point and should be the main concern in the decision making. If 0.20.205 is called Hadoop 1, and 0.23 called Hadoop 2, then releasing 0.22 under 0.22 will be confusing. If we vote only on renaming 0.20.205 to 1.0 then the 0.23 release becomes

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Doug Cutting
To be specific, I think one of the possible could be sensible: A. Rename as follows: 0.20 -> 1.0, 0.21 -> 1.1, 0.22 -> 1.2, 0.23 -> 2.0, 0.24 -> 2.1 B. Just drop the leading zero, e.g., 0.23.0 becomes 23.0. Doug On 11/14/2011 02:11 PM, Arun C Murthy wrote: Folks, Apache Hadoop has come a
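
As a rough sketch of how options A and B would translate existing version strings: the table below is copied from option A, while the function names and the choice to carry any trailing patch number over unchanged are assumptions made for illustration, since the message does not say what happens to patch numbers.

```python
# Option A: explicit rename table, taken from the message above.
OPTION_A = {
    "0.20": "1.0",
    "0.21": "1.1",
    "0.22": "1.2",
    "0.23": "2.0",
    "0.24": "2.1",
}


def rename_option_a(version: str) -> str:
    """Apply the option A table to the first two components.

    Any remaining components (e.g. a patch number) are kept as they
    are; that part is an assumption, not something the proposal states.
    """
    parts = version.split(".")
    prefix = ".".join(parts[:2])
    if prefix not in OPTION_A:
        raise ValueError(f"no option A mapping for {version!r}")
    return ".".join([OPTION_A[prefix]] + parts[2:])


def rename_option_b(version: str) -> str:
    """Option B: drop the leading zero, so 0.23.0 becomes 23.0."""
    if not version.startswith("0."):
        raise ValueError(f"expected a 0.x version, got {version!r}")
    return version[len("0."):]


if __name__ == "__main__":
    for old in ("0.20", "0.22", "0.23.0"):
        print(old, "A:", rename_option_a(old), "B:", rename_option_b(old))
```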

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Milind.Bhandarkar
Arun, You beat me to start this discussion :-) I was at Apachecon recently, and based on the questions and comments from several attendees for the hadoop sessions, as well as the hadoop meetup afterwards, it was clear that users are perplexed about our versioning strategies. In addition, Doug

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Mattmann, Chris A (388J)
On Nov 14, 2011, at 2:41 PM, Doug Cutting wrote: To be specific, I think one of the possible could be sensible: A. Rename as follows: 0.20 -> 1.0, 0.21 -> 1.1, 0.22 -> 1.2, 0.23 -> 2.0, 0.24 -> 2.1 I like this one, Doug. +1. Cheers, Chris B. Just drop the leading zero, e.g., 0.23.0

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Todd Papaioannou
A) is MUCH better from a product branding stand point.. which is what this is mostly about. I would go for something along those lines. ToddP On Nov 14, 2011, at 2:41 PM, Doug Cutting wrote: To be specific, I think one of the possible could be sensible: A. Rename as follows: 0.20 -> 1.0

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Konstantin Boudnik
+1 on graduating .205 as 1.0. It is a very mature and widely used version of Hadoop and really has a significant bang for a buck! It seems that making 0.22 to be 2.0 has a lot of sense because its coming release carries a number of significant changes qualifying it to be a major release. .23

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Sharad Agarwal
+1 remembering and understanding current release numbering and attributing it to stable/compatible etc. is really painful. On Tue, Nov 15, 2011 at 3:41 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, Apache Hadoop has come a long way since our humble beginnings. As a community we've

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Owen O'Malley
I think this is great. Thanks, Arun. Since the 2xx line is clearly a major branch, we should designate it as 1.0. I don't think there is any need to rename current releases, so let's just rename the upcoming ones: 0.20.205.1 -> 1.0.0, 0.20.206.0 -> 1.1.0. 0.21 is dead and we should just leave it as

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Andreas Neumann
+1 for not renaming past releases, that would really start confusion. If .20.20x.y corresponds to 1.z.y, then z=x-5 and: 0.20.205.1 -> 1.0.1, 0.20.206.0 -> 1.1.0 -Andreas. On 11/14/11 9:47 PM, Owen O'Malley o...@hortonworks.com wrote: I think this is great. Thanks, Arun. Since the 2xx line is
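
The rule in this message (0.20.20x.y corresponds to 1.z.y with z = x - 5) is simple enough to write down directly. The sketch below is an illustration only; the function name is hypothetical, and it accepts just the single-digit 0.20.20x form that the message uses.

```python
import re

# Matches old-style 0.20.20x.y versions, capturing x and y.
_OLD_VERSION = re.compile(r"0\.20\.20(\d)\.(\d+)")


def map_security_branch_version(version: str) -> str:
    """Map 0.20.20x.y to 1.z.y with z = x - 5, as suggested above."""
    match = _OLD_VERSION.fullmatch(version)
    if match is None:
        raise ValueError(f"not a 0.20.20x.y version: {version!r}")
    x, y = int(match.group(1)), int(match.group(2))
    z = x - 5
    if z < 0:
        raise ValueError("releases before 0.20.205 have no 1.z.y equivalent")
    return f"1.{z}.{y}"


if __name__ == "__main__":
    for old in ("0.20.205.1", "0.20.206.0"):
        print(old, "->", map_security_branch_version(old))
```

Under this rule 0.20.205.1 maps to 1.0.1, whereas the quoted proposal from Owen O'Malley names the same release 1.0.0; the two suggestions differ on whether the patch number is carried over.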

Re: [DISCUSS] Apache Hadoop 1.0?

2011-11-14 Thread Mahadev Konar
+1 for 0.20.2xx as 1.0. mahadev On Mon, Nov 14, 2011 at 9:37 PM, Sharad Agarwal sharad.apa...@gmail.com wrote: +1 remembering and understanding current release numbering and attributing it to stable/compatible etc. is really painful. On Tue, Nov 15, 2011 at 3:41 AM, Arun C Murthy