Great. I've commented on PIG-3194 (http://goo.gl/UQ3zs) with a possible approach. Please take a look when you folks have a chance.
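
For anyone following along, here is a rough sketch of what Jonathan's "in-line some of the Base64 code" idea (quoted below) could look like. This is not the patch on the JIRA and not what ObjectSerializer actually does; the class name InlineBase64 and its encode method are made up for illustration. The only point is that an encoder like this depends on nothing outside the JDK, so it should behave the same whichever commons-codec jar ends up on the Hadoop classpath.

```java
/**
 * Hypothetical helper, not part of Pig or commons-codec.
 * Encodes bytes as standard Base64 (RFC 4648 alphabet, '=' padding, no line breaks)
 * using only JDK classes.
 */
final class InlineBase64 {
    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    private InlineBase64() {}

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder(((data.length + 2) / 3) * 4);
        for (int i = 0; i < data.length; i += 3) {
            // Pack up to three input bytes into one 24-bit chunk; missing bytes count as zero.
            int b0 = data[i] & 0xFF;
            int b1 = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;
            int b2 = (i + 2 < data.length) ? data[i + 2] & 0xFF : 0;
            int chunk = (b0 << 16) | (b1 << 8) | b2;

            // Emit four 6-bit symbols, replacing the ones with no input behind them by '='.
            out.append(ALPHABET[(chunk >> 18) & 0x3F]);
            out.append(ALPHABET[(chunk >> 12) & 0x3F]);
            out.append(i + 1 < data.length ? ALPHABET[(chunk >> 6) & 0x3F] : '=');
            out.append(i + 2 < data.length ? ALPHABET[chunk & 0x3F] : '=');
        }
        return out.toString();
    }
}
```

A caller would use InlineBase64.encode(bytes) wherever it currently goes through commons-codec. A matching decoder would be needed too, and whether the output has to be URL-safe or chunked is something I have not checked against what Pig expects.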

On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> I'd like to get the gc fix in as well, but looks like Rohini is about to
> commit it so we are good there.
>
> On Mar 1, 2013, at 11:33 AM, Bill Graham <billgra...@gmail.com> wrote:
>
> > +1 to releasing Pig 0.11.1 when this is addressed. I should be able to help
> > with the release again.
> >
> > On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <prash1...@gmail.com> wrote:
> >
> >> Hey Guys,
> >>
> >> I wanted to start a conversation on this again. If Kai is not looking at
> >> PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If
> >> everyone agrees, we should roll out 0.11.1 sooner than usual and I
> >> volunteer to help with it in any way possible.
> >>
> >> Any objections to getting 0.11.1 out soon after 3194 is fixed?
> >>
> >> -Prashant
> >>
> >> On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >>
> >>> I stand corrected. Cool, 0.11 is good!
> >>>
> >>> On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <jar...@apache.org> wrote:
> >>>
> >>>> Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.
> >>>>
> >>>> Jarcec
> >>>>
> >>>> On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
> >>>>> I agree -- this is a good release. The bugs Kai pointed out should be
> >>>>> fixed, but as they are not critical regressions, we can fix them in 0.11.1
> >>>>> (if someone wants to roll 0.11.1 the minute these fixes are committed, I
> >>>>> won't mind and will dutifully vote for the release).
> >>>>>
> >>>>> I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
> >>>>> fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
> >>>>>
> >>>>> FWIW Twitter's running CDH3 and this release works in our environment.
> >>>>>
> >>>>> At this point things that block a release are critical regressions in
> >>>>> performance or correctness.
> >>>>>
> >>>>> D
> >>>>>
> >>>>> On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <ga...@hortonworks.com> wrote:
> >>>>>
> >>>>>> No. Bugs like these are supposed to be found and fixed after we branch
> >>>>>> from trunk (which happened several months ago in the case of 0.11). The
> >>>>>> point of RCs is to check that it's a good build, licenses are right, etc.
> >>>>>> Any bugs found this late in the game have to be seen as failures of
> >>>>>> earlier testing.
> >>>>>>
> >>>>>> Alan.
> >>>>>>
> >>>>>> On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
> >>>>>>
> >>>>>>> Isn't the point of an RC to find and fix bugs like these?
> >>>>>>>
> >>>>>>> On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgra...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Regarding Pig 11 rc2, I propose we continue with the current vote as is
> >>>>>>>> (which closes today EOD). Patches for 0.20.2 issues can be rolled into a
> >>>>>>>> Pig 0.11.1 release whenever they're available and tested.
> >>>>>>>>
> >>>>>>>> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> >>>>>>>>
> >>>>>>>>> I agree that supporting as much as we can is a good goal. The issue is who
> >>>>>>>>> is going to be testing against all these versions? We found the issues
> >>>>>>>>> under discussion because of a customer report, not because we consistently
> >>>>>>>>> test against all versions. Perhaps when we decide which versions to support
> >>>>>>>>> for the next release we also need to agree who is going to be testing and
> >>>>>>>>> maintaining compatibility with a particular version.
> >>>>>>>>>
> >>>>>>>>> For instance since Hadoop 23 compatibility is important for us at Yahoo we
> >>>>>>>>> have been maintaining compatibility with this version for 0.9, 0.10 and
> >>>>>>>>> will do the same for 0.11 and going forward. I think we would need others
> >>>>>>>>> to step in and claim the versions of their interest.
> >>>>>>>>>
> >>>>>>>>> Olga
> >>>>>>>>>
> >>>>>>>>> ________________________________
> >>>>>>>>> From: Kai Londenberg <kai.londenb...@googlemail.com>
> >>>>>>>>> To: dev@pig.apache.org
> >>>>>>>>> Sent: Wednesday, February 20, 2013 1:51 AM
> >>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I strongly agree with Jonathan here. If there are good reasons why you
> >>>>>>>>> can't support an older version of Hadoop any more, that's one thing.
> >>>>>>>>> But having to change 2 lines of code doesn't really qualify as such in
> >>>>>>>>> my point of view ;)
> >>>>>>>>>
> >>>>>>>>> At least for me, pig support for 0.20.2 is essential - without it, I
> >>>>>>>>> can't use it. If it doesn't support it, I'll have to branch pig and
> >>>>>>>>> hack it myself, or stop using it.
> >>>>>>>>>
> >>>>>>>>> I guess there are a lot of people still running 0.20.2 clusters. If
> >>>>>>>>> you really have lots of data stored on HDFS and a continuously busy
> >>>>>>>>> cluster, an upgrade is nothing you do "just because".
> >>>>>>>>>
> >>>>>>>>> 2013/2/20 Jonathan Coveney <jcove...@gmail.com>:
> >>>>>>>>>> I agree that we shouldn't have to support old versions forever. That said,
> >>>>>>>>>> I also don't think we should be too blase about supporting older versions
> >>>>>>>>>> where it is not odious to do so. We have a lot of competition in the
> >>>>>>>>>> language space and the broader the versions we can support, the better
> >>>>>>>>>> (assuming it isn't too odious to do so). In this case, I don't think it
> >>>>>>>>>> should be too hard to change ObjectSerializer so that the commons-codec
> >>>>>>>>>> code used is compatible with both versions...we could just in-line some of
> >>>>>>>>>> the Base64 code, and comment accordingly.
> >>>>>>>>>>
> >>>>>>>>>> That said, we also should be clear about what versions we support, but 6-12
> >>>>>>>>>> months seems short. The upgrade cycles on Hadoop are really, really long.
> >>>>>>>>>>
> >>>>>>>>>> 2013/2/20 Prashant Kommireddi <prash1...@gmail.com>
> >>>>>>>>>>
> >>>>>>>>>>> Agreed, that makes sense. Probably supporting an older hadoop version for
> >>>>>>>>>>> 1 or 2 pig releases before moving to a newer/stable version?
> >>>>>>>>>>>
> >>>>>>>>>>> Having said that, should we use the 0.11 period to communicate the same to
> >>>>>>>>>>> the community and start moving on from 0.12 onwards? I know we are way past
> >>>>>>>>>>> the 6-12 month (1-2 release) time frame with 0.20.2, but we also need to
> >>>>>>>>>>> make sure users are aware and plan accordingly.
> >>>>>>>>>>>
> >>>>>>>>>>> I'd also be interested to hear how other projects (Hive, Oozie) are
> >>>>>>>>>>> handling this.
> >>>>>>>>>>>
> >>>>>>>>>>> -Prashant
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> It seems that for each Pig release we need to agree and clearly state
> >>>>>>>>>>>> which Hadoop versions it will support. I guess the main question is how we
> >>>>>>>>>>>> decide on this. Perhaps we should say that Pig no longer supports older
> >>>>>>>>>>>> Hadoop versions once the newer one is out for at least 6-12 months to make
> >>>>>>>>>>>> sure it is stable. I don't think we can support old versions indefinitely.
> >>>>>>>>>>>> It is in everybody's interest to keep moving forward.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Olga
> >>>>>>>>>>>>
> >>>>>>>>>>>> ________________________________
> >>>>>>>>>>>> From: Prashant Kommireddi <prash1...@gmail.com>
> >>>>>>>>>>>> To: dev@pig.apache.org
> >>>>>>>>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
> >>>>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> >>>>>>>>>>>>
> >>>>>>>>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility
> >>>>>>>>>>>> (PIG-3194)? I am interested in discussing the strategy around backward
> >>>>>>>>>>>> compatibility as this is something that would haunt us each time we move to
> >>>>>>>>>>>> the next hadoop version. For example, we might be in a similar situation
> >>>>>>>>>>>> while moving to Hadoop 2.0, when some of the stuff might break for 1.0.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
> >>>>>>>>>>>> might be caught unaware. Of course, I must admit there is selfish interest
> >>>>>>>>>>>> here and it's probably easier for us to have a workaround on Pig rather
> >>>>>>>>>>>> than upgrade hadoop in all our production DCs.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Prashant
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I think someone should step up and fix the easy ones, if possible.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgra...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Kai for reporting these.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> What do people think about the severity of these issues w.r.t. Pig 11? I
> >>>>>>>>>>>>>> see a few possible options:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc. We'd want to
> >>>>>>>>>>>>>> make sure that they don't destabilize the current branch. This approach
> >>>>>>>>>>>>>> makes sense if we think Pig 11 wouldn't be a good release without one or
> >>>>>>>>>>>>>> more of these included.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2. We continue with the Pig 11 release without these, but then include one
> >>>>>>>>>>>>>> or more in a 0.11.1 release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 3. We continue with the Pig 11 release without these, but then include them
> >>>>>>>>>>>>>> in a 0.12 release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Jon has a patch for the MAP issue
> >>>>>>>>>>>>>> (PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>)
> >>>>>>>>>>>>>> ready, which seems like the most pressing of the three to me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> thanks,
> >>>>>>>>>>>>>> Bill
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenb...@googlemail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I just subscribed to the dev mailing list in order to give you some
> >>>>>>>>>>>>>>> feedback on pig 0.11 candidate 2.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The following three issues are currently present in 0.11 candidate 2:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
> >>>>>>>>>>>>>>> alias resolution leading to "Duplicate schema alias" errors'
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to
> >>>>>>>>>>>>>>> ObjectSerializer.java break compatibility with Hadoop 0.20.2
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
> >>>>>>>>>>>>>>> PhysicalOperator leads to ExecException "Error while trying to get
> >>>>>>>>>>>>>>> next result in POStream"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The last two of these are easily solvable (see the tickets for
> >>>>>>>>>>>>>>> details on that). The first one is a bit trickier I think, but at
> >>>>>>>>>>>>>>> least there is a workaround for it (pass Map fields through a UDF).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In my personal opinion, each of these problems is pretty severe, but
> >>>>>>>>>>>>>>> opinions about the importance of the MAP Datatype and STREAM Operator,
> >>>>>>>>>>>>>>> as well as Hadoop 0.20.2 compatibility might differ.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> so far ..
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Kai Londenberg
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at
> >>>>>>>>>>>>>> billgra...@gmail.com going forward.*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com