I'd like to get the GC fix in as well, but it looks like Rohini is about to commit it, so we are good there.
On Mar 1, 2013, at 11:33 AM, Bill Graham <billgra...@gmail.com> wrote:

> +1 to releasing Pig 0.11.1 when this is addressed. I should be able to help
> with the release again.
>
> On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <prash1...@gmail.com> wrote:
>
>> Hey Guys,
>>
>> I wanted to start a conversation on this again. If Kai is not looking at
>> PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If
>> everyone agrees, we should roll out 0.11.1 sooner than usual and I
>> volunteer to help with it in any way possible.
>>
>> Any objections to getting 0.11.1 out soon after 3194 is fixed?
>>
>> -Prashant
>>
>> On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>
>>> I stand corrected. Cool, 0.11 is good!
>>>
>>> On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <jar...@apache.org> wrote:
>>>
>>>> Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.
>>>>
>>>> Jarcec
>>>>
>>>> On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
>>>>> I agree -- this is a good release. The bugs Kai pointed out should be
>>>>> fixed, but as they are not critical regressions, we can fix them in
>>>>> 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are
>>>>> committed, I won't mind and will dutifully vote for the release).
>>>>>
>>>>> I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
>>>>> fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
>>>>>
>>>>> FWIW Twitter's running CDH3 and this release works in our environment.
>>>>>
>>>>> At this point things that block a release are critical regressions in
>>>>> performance or correctness.
>>>>>
>>>>> D
>>>>>
>>>>> On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <ga...@hortonworks.com> wrote:
>>>>>
>>>>>> No.
>>>>>> Bugs like these are supposed to be found and fixed after we branch
>>>>>> from trunk (which happened several months ago in the case of 0.11). The
>>>>>> point of RCs is to check that it's a good build, licenses are right, etc.
>>>>>> Any bugs found this late in the game have to be seen as failures of
>>>>>> earlier testing.
>>>>>>
>>>>>> Alan.
>>>>>>
>>>>>> On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
>>>>>>
>>>>>>> Isn't the point of an RC to find and fix bugs like these?
>>>>>>>
>>>>>>> On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgra...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Regarding Pig 11 rc2, I propose we continue with the current vote as is
>>>>>>>> (which closes today EOD). Patches for 0.20.2 issues can be rolled into a
>>>>>>>> Pig 0.11.1 release whenever they're available and tested.
>>>>>>>>
>>>>>>>> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkov...@yahoo.com> wrote:
>>>>>>>>
>>>>>>>>> I agree that supporting as much as we can is a good goal. The issue is who
>>>>>>>>> is going to be testing against all these versions? We found the issues
>>>>>>>>> under discussion because of a customer report, not because we consistently
>>>>>>>>> test against all versions. Perhaps when we decide which versions to support
>>>>>>>>> for next release we need also to agree who is going to be testing and
>>>>>>>>> maintaining compatibility with a particular version.
>>>>>>>>>
>>>>>>>>> For instance since Hadoop 23 compatibility is important for us at Yahoo we
>>>>>>>>> have been maintaining compatibility with this version for 0.9, 0.10 and
>>>>>>>>> will do the same for 0.11 and going forward. I think we would need others
>>>>>>>>> to step in and claim the versions of their interest.
>>>>>>>>>
>>>>>>>>> Olga
>>>>>>>>>
>>>>>>>>> ________________________________
>>>>>>>>> From: Kai Londenberg <kai.londenb...@googlemail.com>
>>>>>>>>> To: dev@pig.apache.org
>>>>>>>>> Sent: Wednesday, February 20, 2013 1:51 AM
>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I strongly agree with Jonathan here. If there are good reasons why you
>>>>>>>>> can't support an older version of Hadoop any more, that's one thing.
>>>>>>>>> But having to change 2 lines of code doesn't really qualify as such in
>>>>>>>>> my point of view ;)
>>>>>>>>>
>>>>>>>>> At least for me, pig support for 0.20.2 is essential - without it, I
>>>>>>>>> can't use it. If it doesn't support it, I'll have to branch pig and
>>>>>>>>> hack it myself, or stop using it.
>>>>>>>>>
>>>>>>>>> I guess, there are a lot of people still running 0.20.2 Clusters. If
>>>>>>>>> you really have lots of data stored on HDFS and a continuously busy
>>>>>>>>> cluster, an upgrade is nothing you do "just because".
>>>>>>>>>
>>>>>>>>> 2013/2/20 Jonathan Coveney <jcove...@gmail.com>:
>>>>>>>>>> I agree that we shouldn't have to support old versions forever. That said,
>>>>>>>>>> I also don't think we should be too blase about supporting older versions
>>>>>>>>>> where it is not odious to do so. We have a lot of competition in the
>>>>>>>>>> language space and the broader the versions we can support, the better
>>>>>>>>>> (assuming it isn't too odious to do so). In this case, I don't think it
>>>>>>>>>> should be too hard to change ObjectSerializer so that the commons-codec
>>>>>>>>>> code used is compatible with both versions...we could just in-line some of
>>>>>>>>>> the Base64 code, and comment accordingly.
>>>>>>>>>>
>>>>>>>>>> That said, we also should be clear about what versions we support, but
>>>>>>>>>> 6-12 months seems short. The upgrade cycles on Hadoop are really, really
>>>>>>>>>> long.
>>>>>>>>>>
>>>>>>>>>> 2013/2/20 Prashant Kommireddi <prash1...@gmail.com>
>>>>>>>>>>
>>>>>>>>>>> Agreed, that makes sense. Probably supporting an older hadoop version for
>>>>>>>>>>> 1 or 2 pig releases before moving to a newer/stable version?
>>>>>>>>>>>
>>>>>>>>>>> Having said that, should we use the 0.11 period to communicate the same to
>>>>>>>>>>> the community and start moving on 0.12 onwards? I know we are way past the
>>>>>>>>>>> 6-12 month (1-2 release) time frame with 0.20.2, but we also need to make
>>>>>>>>>>> sure users are aware and plan accordingly.
>>>>>>>>>>>
>>>>>>>>>>> I'd also be interested to hear how other projects (Hive, Oozie) are
>>>>>>>>>>> handling this.
>>>>>>>>>>>
>>>>>>>>>>> -Prashant
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkov...@yahoo.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It seems that for each Pig release we need to agree and clearly state
>>>>>>>>>>>> which Hadoop versions it will support. I guess the main question is how we
>>>>>>>>>>>> decide on this. Perhaps we should say that Pig no longer supports older
>>>>>>>>>>>> Hadoop versions once the newer one has been out for at least 6-12 months
>>>>>>>>>>>> to make sure it is stable. I don't think we can support old versions
>>>>>>>>>>>> indefinitely. It is in everybody's interest to keep moving forward.
>>>>>>>>>>>>
>>>>>>>>>>>> Olga
>>>>>>>>>>>>
>>>>>>>>>>>> ________________________________
>>>>>>>>>>>> From: Prashant Kommireddi <prash1...@gmail.com>
>>>>>>>>>>>> To: dev@pig.apache.org
>>>>>>>>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
>>>>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
>>>>>>>>>>>>
>>>>>>>>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility
>>>>>>>>>>>> (PIG-3194)? I am interested in discussing the strategy around backward
>>>>>>>>>>>> compatibility as this is something that would haunt us each time we move to
>>>>>>>>>>>> the next hadoop version. For example, we might be in a similar situation
>>>>>>>>>>>> while moving to Hadoop 2.0, when some of the stuff might break for 1.0.
>>>>>>>>>>>>
>>>>>>>>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
>>>>>>>>>>>> might be caught unaware. Of course, I must admit there is selfish interest
>>>>>>>>>>>> here and it's probably easier for us to have a workaround on Pig rather
>>>>>>>>>>>> than upgrade hadoop in all our production DCs.
>>>>>>>>>>>>
>>>>>>>>>>>> -Prashant
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I think someone should step up and fix the easy ones, if possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgra...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Kai for reporting these.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do people think about the severity of these issues w.r.t. Pig 11? I
>>>>>>>>>>>>>> see a few possible options:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc.
>>>>>>>>>>>>>> We'd want to make sure that they don't destabilize the current branch. This
>>>>>>>>>>>>>> approach makes sense if we think Pig 11 wouldn't be a good release without
>>>>>>>>>>>>>> one or more of these included.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. We continue with the Pig 11 release without these, but then include one
>>>>>>>>>>>>>> or more in a 0.11.1 release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. We continue with the Pig 11 release without these, but then include them
>>>>>>>>>>>>>> in a 0.12 release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jon has a patch for the MAP issue
>>>>>>>>>>>>>> (PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>)
>>>>>>>>>>>>>> ready, which seems like the most pressing of the three to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenb...@googlemail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I just subscribed to the dev mailing list in order to give you some
>>>>>>>>>>>>>>> feedback on pig 0.11 candidate 2.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The following three issues are currently present in 0.11 candidate 2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
>>>>>>>>>>>>>>> alias resolution leading to "Duplicate schema alias" errors'
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to
>>>>>>>>>>>>>>> ObjectSerializer.java break compatibility with Hadoop 0.20.2
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
>>>>>>>>>>>>>>> PhysicalOperator leads to ExecException "Error while trying to get
>>>>>>>>>>>>>>> next result in POStream"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The last two of these are easily solvable (see the tickets for
>>>>>>>>>>>>>>> details on that). The first one is a bit trickier I think, but at
>>>>>>>>>>>>>>> least there is a workaround for it (pass Map fields through a UDF)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my personal opinion, each of these problems is pretty severe, but
>>>>>>>>>>>>>>> opinions about the importance of the MAP Datatype and STREAM Operator,
>>>>>>>>>>>>>>> as well as Hadoop 0.20.2 compatibility might differ.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> so far ..
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kai Londenberg
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> *Note that I'm no longer using my Yahoo! email address.
>>>>>>>>>>>>>> Please email me at billgra...@gmail.com going forward.*
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>>>>>>>>>>>>> datasyndrome.com
>>>>>>>>>>>>>
>>>>>>>> --
>>>>>>>> *Note that I'm no longer using my Yahoo! email address. Please email me
>>>>>>>> at billgra...@gmail.com going forward.*
>>>>>>>>
>>>>>>> --
>>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>>>>>>> datasyndrome.com
>>>>>>>
>>> --
>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>>> datasyndrome.com