Looks like Lohit found a critical bug we should fix for 0.11.1: https://issues.apache.org/jira/browse/PIG-3241 (only observed in Hadoop 2.0)
D

On Wed, Mar 6, 2013 at 12:57 PM, Prashant Kommireddi <prash1...@gmail.com> wrote:

> Dmitriy, are the gc fixes all in for 0.11.1? PIG-3148 and PIG-3212 are the
> 2 JIRAs I know were fixed, any others?
>
> I have a patch up for 3194, I think we should be good for a release once
> that makes it in.
>
> -Prashant
>
> On Sat, Mar 2, 2013 at 11:16 AM, Prashant Kommireddi <prash1...@gmail.com> wrote:
>
> > Great.
> >
> > I have commented regarding a possible approach for PIG-3194
> > http://goo.gl/UQ3zs. Please take a look when you folks have a chance.
> >
> > On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
> >
> >> I'd like to get the gc fix in as well, but looks like Rohini is about to
> >> commit it so we are good there.
> >>
> >> On Mar 1, 2013, at 11:33 AM, Bill Graham <billgra...@gmail.com> wrote:
> >>
> >> > +1 to releasing Pig 0.11.1 when this is addressed. I should be able to help
> >> > with the release again.
> >> >
> >> > On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <prash1...@gmail.com> wrote:
> >> >
> >> >> Hey Guys,
> >> >>
> >> >> I wanted to start a conversation on this again. If Kai is not looking at
> >> >> PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If
> >> >> everyone agrees, we should roll out 0.11.1 sooner than usual and I
> >> >> volunteer to help with it in any way possible.
> >> >>
> >> >> Any objections to getting 0.11.1 out soon after 3194 is fixed?
> >> >>
> >> >> -Prashant
> >> >>
> >> >> On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >> >>
> >> >>> I stand corrected. Cool, 0.11 is good!
> >> >>>
> >> >>> On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <jar...@apache.org> wrote:
> >> >>>
> >> >>>> Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.
> >> >>>>
> >> >>>> Jarcec
> >> >>>>
> >> >>>> On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
> >> >>>>> I agree -- this is a good release. The bugs Kai pointed out should be
> >> >>>>> fixed, but as they are not critical regressions, we can fix them in 0.11.1
> >> >>>>> (if someone wants to roll 0.11.1 the minute these fixes are committed, I
> >> >>>>> won't mind and will dutifully vote for the release).
> >> >>>>>
> >> >>>>> I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
> >> >>>>> fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
> >> >>>>>
> >> >>>>> FWIW Twitter's running CDH3 and this release works in our environment.
> >> >>>>>
> >> >>>>> At this point things that block a release are critical regressions in
> >> >>>>> performance or correctness.
> >> >>>>>
> >> >>>>> D
> >> >>>>>
> >> >>>>> On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <ga...@hortonworks.com> wrote:
> >> >>>>>
> >> >>>>>> No. Bugs like these are supposed to be found and fixed after we branch
> >> >>>>>> from trunk (which happened several months ago in the case of 0.11). The
> >> >>>>>> point of RCs is to check that it's a good build, licenses are right, etc.
> >> >>>>>> Any bugs found this late in the game have to be seen as failures of
> >> >>>>>> earlier testing.
> >> >>>>>>
> >> >>>>>> Alan.
> >> >>>>>>
> >> >>>>>> On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
> >> >>>>>>
> >> >>>>>>> Isn't the point of an RC to find and fix bugs like these?
> >> >>>>>>>
> >> >>>>>>> On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgra...@gmail.com> wrote:
> >> >>>>>>>
> >> >>>>>>>> Regarding Pig 11 rc2, I propose we continue with the current vote as is
> >> >>>>>>>> (which closes today EOD).
> >> >>>>>>>> Patches for 0.20.2 issues can be rolled into a
> >> >>>>>>>> Pig 0.11.1 release whenever they're available and tested.
> >> >>>>>>>>
> >> >>>>>>>> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> I agree that supporting as much as we can is a good goal. The issue is who
> >> >>>>>>>>> is going to be testing against all these versions? We found the issues
> >> >>>>>>>>> under discussion because of a customer report, not because we consistently
> >> >>>>>>>>> test against all versions. Perhaps when we decide which versions to support
> >> >>>>>>>>> for the next release we also need to agree on who is going to be testing and
> >> >>>>>>>>> maintaining compatibility with a particular version.
> >> >>>>>>>>>
> >> >>>>>>>>> For instance, since Hadoop 23 compatibility is important for us at Yahoo, we
> >> >>>>>>>>> have been maintaining compatibility with this version for 0.9, 0.10 and
> >> >>>>>>>>> will do the same for 0.11 and going forward. I think we would need others
> >> >>>>>>>>> to step in and claim the versions of their interest.
> >> >>>>>>>>>
> >> >>>>>>>>> Olga
> >> >>>>>>>>>
> >> >>>>>>>>> ________________________________
> >> >>>>>>>>> From: Kai Londenberg <kai.londenb...@googlemail.com>
> >> >>>>>>>>> To: dev@pig.apache.org
> >> >>>>>>>>> Sent: Wednesday, February 20, 2013 1:51 AM
> >> >>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> >> >>>>>>>>>
> >> >>>>>>>>> Hi,
> >> >>>>>>>>>
> >> >>>>>>>>> I strongly agree with Jonathan here. If there are good reasons why you
> >> >>>>>>>>> can't support an older version of Hadoop any more, that's one thing.
> >> >>>>>>>>> But having to change 2 lines of code doesn't really qualify as such in
> >> >>>>>>>>> my point of view ;)
> >> >>>>>>>>>
> >> >>>>>>>>> At least for me, pig support for 0.20.2 is essential - without it, I
> >> >>>>>>>>> can't use it. If it doesn't support it, I'll have to branch pig and
> >> >>>>>>>>> hack it myself, or stop using it.
> >> >>>>>>>>>
> >> >>>>>>>>> I guess, there are a lot of people still running 0.20.2 Clusters. If
> >> >>>>>>>>> you really have lots of data stored on HDFS and a continuously busy
> >> >>>>>>>>> cluster, an upgrade is nothing you do "just because".
> >> >>>>>>>>>
> >> >>>>>>>>> 2013/2/20 Jonathan Coveney <jcove...@gmail.com>:
> >> >>>>>>>>>> I agree that we shouldn't have to support old versions forever. That said,
> >> >>>>>>>>>> I also don't think we should be too blase about supporting older versions
> >> >>>>>>>>>> where it is not odious to do so. We have a lot of competition in the
> >> >>>>>>>>>> language space and the broader the versions we can support, the better
> >> >>>>>>>>>> (assuming it isn't too odious to do so). In this case, I don't think it
> >> >>>>>>>>>> should be too hard to change ObjectSerializer so that the commons-codec
> >> >>>>>>>>>> code used is compatible with both versions...we could just in-line some of
> >> >>>>>>>>>> the Base64 code, and comment accordingly.
> >> >>>>>>>>>>
> >> >>>>>>>>>> That said, we also should be clear about what versions we support, but 6-12
> >> >>>>>>>>>> months seems short. The upgrade cycles on Hadoop are really, really long.
> >> >>>>>>>>>>
> >> >>>>>>>>>> 2013/2/20 Prashant Kommireddi <prash1...@gmail.com>
> >> >>>>>>>>>>
> >> >>>>>>>>>>> Agreed, that makes sense. Probably supporting older hadoop version for 1
> >> >>>>>>>>>>> or 2 pig releases before moving to a newer/stable version?
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Having said that, should we use 0.11 period to communicate the same to the
> >> >>>>>>>>>>> community and start moving on 0.12 onwards? I know we are way past 6-12
> >> >>>>>>>>>>> months (1-2 releases) time frame with 0.20.2, but we also need to make sure
> >> >>>>>>>>>>> users are aware and plan accordingly.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I'd also be interested to hear how other projects (Hive, Oozie) are
> >> >>>>>>>>>>> handling this.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> -Prashant
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>> It seems that for each Pig release we need to agree and clearly state
> >> >>>>>>>>>>>> which Hadoop versions it will support. I guess the main question is how we
> >> >>>>>>>>>>>> decide on this. Perhaps we should say that Pig no longer supports older
> >> >>>>>>>>>>>> Hadoop versions once the newer one is out for at least 6-12 months to make
> >> >>>>>>>>>>>> sure it is stable. I don't think we can support old versions indefinitely.
> >> >>>>>>>>>>>> It is in everybody's interest to keep moving forward.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Olga
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> ________________________________
> >> >>>>>>>>>>>> From: Prashant Kommireddi <prash1...@gmail.com>
> >> >>>>>>>>>>>> To: dev@pig.apache.org
> >> >>>>>>>>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
> >> >>>>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility
> >> >>>>>>>>>>>> (PIG-3194)? I am interested in discussing the strategy around backward
> >> >>>>>>>>>>>> compatibility as this is something that would haunt us each time we move to
> >> >>>>>>>>>>>> the next hadoop version. For example, we might be in a similar situation while
> >> >>>>>>>>>>>> moving to Hadoop 2.0, when some of the stuff might break for 1.0.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
> >> >>>>>>>>>>>> might be caught unaware. Of course, I must admit there is selfish interest
> >> >>>>>>>>>>>> here and it's probably easier for us to have a workaround on Pig rather
> >> >>>>>>>>>>>> than upgrade hadoop in all our production DCs.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> -Prashant
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>> I think someone should step up and fix the easy ones, if possible.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgra...@gmail.com> wrote:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Thanks Kai for reporting these.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> What do people think about the severity of these issues w.r.t. Pig 11? I
> >> >>>>>>>>>>>>>> see a few possible options:
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc. We'd want to
> >> >>>>>>>>>>>>>> make sure that they don't destabilize the current branch. This approach
> >> >>>>>>>>>>>>>> makes sense if we think Pig 11 wouldn't be a good release without one or
> >> >>>>>>>>>>>>>> more of these included.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> 2. We continue with the Pig 11 release without these, but then include one
> >> >>>>>>>>>>>>>> or more in a 0.11.1 release.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> 3. We continue with the Pig 11 release without these, but then include them
> >> >>>>>>>>>>>>>> in a 0.12 release.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Jon has a patch for the MAP issue
> >> >>>>>>>>>>>>>> (PIG-3144<https://issues.apache.org/jira/browse/PIG-3144>)
> >> >>>>>>>>>>>>>> ready, which seems like the most pressing of the three to me.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> thanks,
> >> >>>>>>>>>>>>>> Bill
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenb...@googlemail.com> wrote:
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> I just subscribed to the dev mailing list in order to give you some
> >> >>>>>>>>>>>>>>> feedback on pig 0.11 candidate 2.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> The following three issues are currently present in 0.11 candidate 2:
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
> >> >>>>>>>>>>>>>>> alias resolution leading to "Duplicate schema alias" errors'
> >> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to
> >> >>>>>>>>>>>>>>> ObjectSerializer.java break compatibility with Hadoop 0.20.2
> >> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
> >> >>>>>>>>>>>>>>> PhysicalOperator leads to ExecException "Error while trying to get
> >> >>>>>>>>>>>>>>> next result in POStream"
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> The last two of these are easily solvable (see the tickets for
> >> >>>>>>>>>>>>>>> details on that). The first one is a bit trickier I think, but at
> >> >>>>>>>>>>>>>>> least there is a workaround for it (pass Map fields through a UDF).
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> In my personal opinion, each of these problems is pretty severe, but
> >> >>>>>>>>>>>>>>> opinions about the importance of the MAP Datatype and STREAM Operator,
> >> >>>>>>>>>>>>>>> as well as Hadoop 0.20.2 compatibility might differ.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> so far ..
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Kai Londenberg
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> --
> >> >>>>>>>>>>>>>> *Note that I'm no longer using my Yahoo! email address.
> >> >>>>>>>>>>>>>> Please email me at billgra...@gmail.com going forward.*
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> --
> >> >>>>>>>>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
> >> >>>>>>>>>>>>> datasyndrome.com