Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.

Jarcec

On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
> I agree -- this is a good release. The bugs Kai pointed out should be
> fixed, but as they are not critical regressions, we can fix them in 0.11.1
> (if someone wants to roll 0.11.1 the minute these fixes are committed, I
> won't mind and will dutifully vote for the release).
>
> I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
> fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
>
> FWIW Twitter's running CDH3 and this release works in our environment.
>
> At this point things that block a release are critical regressions in
> performance or correctness.
>
> D
>
> On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <ga...@hortonworks.com> wrote:
>
> > No. Bugs like these are supposed to be found and fixed after we branch
> > from trunk (which happened several months ago in the case of 0.11). The
> > point of RCs is to check that it's a good build, licenses are right, etc.
> > Any bugs found this late in the game have to be seen as failures of
> > earlier testing.
> >
> > Alan.
> >
> > On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
> >
> > > Isn't the point of an RC to find and fix bugs like these?
> > >
> > > On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgra...@gmail.com> wrote:
> > >
> > >> Regarding Pig 11 rc2, I propose we continue with the current vote as is
> > >> (which closes today EOD). Patches for 0.20.2 issues can be rolled into a
> > >> Pig 0.11.1 release whenever they're available and tested.
> > >>
> > >> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> > >>
> > >>> I agree that supporting as much as we can is a good goal. The issue is
> > >>> who is going to be testing against all these versions? We found the
> > >>> issues under discussion because of a customer report, not because we
> > >>> consistently test against all versions.
> > >>> Perhaps when we decide which versions to support for next release we
> > >>> need also to agree who is going to be testing and maintaining
> > >>> compatibility with a particular version.
> > >>>
> > >>> For instance since Hadoop 23 compatibility is important for us at Yahoo
> > >>> we have been maintaining compatibility with this version for 0.9, 0.10
> > >>> and will do the same for 0.11 and going forward. I think we would need
> > >>> others to step in and claim the versions of their interest.
> > >>>
> > >>> Olga
> > >>>
> > >>> ________________________________
> > >>> From: Kai Londenberg <kai.londenb...@googlemail.com>
> > >>> To: dev@pig.apache.org
> > >>> Sent: Wednesday, February 20, 2013 1:51 AM
> > >>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> > >>>
> > >>> Hi,
> > >>>
> > >>> I strongly agree with Jonathan here. If there are good reasons why you
> > >>> can't support an older version of Hadoop any more, that's one thing.
> > >>> But having to change 2 lines of code doesn't really qualify as such in
> > >>> my point of view ;)
> > >>>
> > >>> At least for me, pig support for 0.20.2 is essential - without it, I
> > >>> can't use it. If it doesn't support it, I'll have to branch pig and
> > >>> hack it myself, or stop using it.
> > >>>
> > >>> I guess, there are a lot of people still running 0.20.2 Clusters. If
> > >>> you really have lots of data stored on HDFS and a continuously busy
> > >>> cluster, an upgrade is nothing you do "just because".
> > >>>
> > >>> 2013/2/20 Jonathan Coveney <jcove...@gmail.com>:
> > >>>> I agree that we shouldn't have to support old versions forever. That
> > >>>> said, I also don't think we should be too blase about supporting older
> > >>>> versions where it is not odious to do so. We have a lot of competition
> > >>>> in the language space and the broader the versions we can support, the
> > >>>> better (assuming it isn't too odious to do so).
> > >>>> In this case, I don't think it should be too hard to change
> > >>>> ObjectSerializer so that the commons-codec code used is compatible
> > >>>> with both versions...we could just in-line some of the Base64 code,
> > >>>> and comment accordingly.
> > >>>>
> > >>>> That said, we also should be clear about what versions we support, but
> > >>>> 6-12 months seems short. The upgrade cycles on Hadoop are really,
> > >>>> really long.
> > >>>>
> > >>>> 2013/2/20 Prashant Kommireddi <prash1...@gmail.com>
> > >>>>
> > >>>>> Agreed, that makes sense. Probably supporting older hadoop version
> > >>>>> for 1 or 2 pig releases before moving to a newer/stable version?
> > >>>>>
> > >>>>> Having said that, should we use 0.11 period to communicate the same
> > >>>>> to the community and start moving on 0.12 onwards? I know we are way
> > >>>>> past 6-12 months (1-2 release) time frame with 0.20.2, but we also
> > >>>>> need to make sure users are aware and plan accordingly.
> > >>>>>
> > >>>>> I'd also be interested to hear how other projects (Hive, Oozie) are
> > >>>>> handling this.
> > >>>>>
> > >>>>> -Prashant
> > >>>>>
> > >>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> > >>>>>
> > >>>>>> It seems that for each Pig release we need to agree and clearly
> > >>>>>> state which Hadoop versions it will support. I guess the main
> > >>>>>> question is how we decide on this. Perhaps we should say that Pig no
> > >>>>>> longer supports older Hadoop versions once the newer one is out for
> > >>>>>> at least 6-12 months to make sure it is stable. I don't think we can
> > >>>>>> support old versions indefinitely. It is in everybody's interest to
> > >>>>>> keep moving forward.
> > >>>>>>
> > >>>>>> Olga
> > >>>>>>
> > >>>>>> ________________________________
> > >>>>>> From: Prashant Kommireddi <prash1...@gmail.com>
> > >>>>>> To: dev@pig.apache.org
> > >>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
> > >>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> > >>>>>>
> > >>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility
> > >>>>>> (PIG-3194)? I am interested in discussing the strategy around
> > >>>>>> backward compatibility as this is something that would haunt us each
> > >>>>>> time we move to the next hadoop version. For eg, we might be in a
> > >>>>>> similar situation while moving to Hadoop 2.0, when some of the stuff
> > >>>>>> might break for 1.0.
> > >>>>>>
> > >>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2
> > >>>>>> users might be caught unaware. Of course, I must admit there is
> > >>>>>> selfish interest here and it's probably easier for us to have a
> > >>>>>> workaround on Pig rather than upgrade hadoop in all our production
> > >>>>>> DCs.
> > >>>>>>
> > >>>>>> -Prashant
> > >>>>>>
> > >>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> I think someone should step up and fix the easy ones, if possible.
> > >>>>>>>
> > >>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgra...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks Kai for reporting these.
> > >>>>>>>>
> > >>>>>>>> What do people think about the severity of these issues w.r.t.
> > >>>>>>>> Pig 11? I see a few possible options:
> > >>>>>>>>
> > >>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc.
> > >>>>>>>> We'd want to make sure that they don't destabilize the current
> > >>>>>>>> branch.
> > >>>>>>>> This approach makes sense if we think Pig 11 wouldn't be a good
> > >>>>>>>> release without one or more of these included.
> > >>>>>>>>
> > >>>>>>>> 2. We continue with the Pig 11 release without these, but then
> > >>>>>>>> include one or more in a 0.11.1 release.
> > >>>>>>>>
> > >>>>>>>> 3. We continue with the Pig 11 release without these, but then
> > >>>>>>>> include them in a 0.12 release.
> > >>>>>>>>
> > >>>>>>>> Jon has a patch for the MAP issue
> > >>>>>>>> (PIG-3144<https://issues.apache.org/jira/browse/PIG-3144>)
> > >>>>>>>> ready, which seems like the most pressing of the three to me.
> > >>>>>>>>
> > >>>>>>>> thanks,
> > >>>>>>>> Bill
> > >>>>>>>>
> > >>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenb...@googlemail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I just subscribed to the dev mailing list in order to give you
> > >>>>>>>>> some feedback on pig 0.11 candidate 2.
> > >>>>>>>>>
> > >>>>>>>>> The following three issues are currently present in 0.11
> > >>>>>>>>> candidate 2:
> > >>>>>>>>>
> > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map
> > >>>>>>>>> entry alias resolution leading to "Duplicate schema alias" errors'
> > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to
> > >>>>>>>>> ObjectSerializer.java break compatibility with Hadoop 0.20.2
> > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
> > >>>>>>>>> PhysicalOperator leads to ExecException "Error while trying to get
> > >>>>>>>>> next result in POStream"
> > >>>>>>>>>
> > >>>>>>>>> The last two of these are easily solvable (see the tickets for
> > >>>>>>>>> details on that).
> > >>>>>>>>> The first one is a bit trickier I think, but at least there is a
> > >>>>>>>>> workaround for it (pass Map fields through a UDF)
> > >>>>>>>>>
> > >>>>>>>>> In my personal opinion, each of these problems is pretty severe,
> > >>>>>>>>> but opinions about the importance of the MAP Datatype and STREAM
> > >>>>>>>>> Operator, as well as Hadoop 0.20.2 compatibility might differ.
> > >>>>>>>>>
> > >>>>>>>>> so far ..
> > >>>>>>>>>
> > >>>>>>>>> Kai Londenberg
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>> *Note that I'm no longer using my Yahoo! email address. Please
> > >>>>>>>> email me at billgra...@gmail.com going forward.*
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
> > >>>>>>> datasyndrome.com
> > >>
> > >> --
> > >> *Note that I'm no longer using my Yahoo! email address. Please email me
> > >> at billgra...@gmail.com going forward.*
> > >
> > > --
> > > Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
> > > datasyndrome.com