Re: Elephant Bird released

2010-03-30 Thread Dmitriy Ryaboy
Rohan, Yes. I think. Let us know if it is not. -Dmitriy On Mon, Mar 29, 2010 at 8:42 PM, Rohan Rai wrote: > Hey... > > I am so excited seeing this... > I am at the edge of my seat... > I cant even wait to see what it is... > So just looking and hoping for a heads up.. > Is this the same thing f

Re: not in via join

2010-03-30 Thread Alan Gates
What you gave seems like it should work. But I'd try it as: C = COGROUP A BY id, B BY id; D = FILTER C BY COUNT(A) == 0; E = FOREACH D GENERATE FLATTEN(B); Alan. On Mar 29, 2010, at 7:06 PM, Kent Shi wrote: Hi, I am trying to get the elements of B not in A. My code is like this C = JOIN A

Re: not in via join

2010-03-30 Thread hc busy
Just saw a response to this recently, the "right way" is to use co-group to join A and B and then to check IsEmpty(A) instead of doing an outer join and checking "is null" On Mon, Mar 29, 2010 at 7:06 PM, Kent Shi wrote: > Hi, > > I am trying to get the elements of B not in A. My code is like
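
For readers unfamiliar with the idiom, here is a rough Python sketch of what the COGROUP approach described in this thread computes: group both relations by key, keep the groups whose A bag is empty, and flatten the B bag. The helper names (cogroup, not_in) and the sample rows are made up for illustration and are not part of Pig; Pig's handling of null keys is not modelled.

```python
from collections import defaultdict

def cogroup(a_rows, b_rows, key):
    """Group two relations by key, roughly like COGROUP A BY id, B BY id."""
    groups = defaultdict(lambda: ([], []))
    for row in a_rows:
        groups[key(row)][0].append(row)   # bag of A rows for this key
    for row in b_rows:
        groups[key(row)][1].append(row)   # bag of B rows for this key
    return groups

def not_in(a_rows, b_rows, key):
    """Rows of B whose key never appears in A (empty A bag, then flatten B)."""
    result = []
    for a_bag, b_bag in cogroup(a_rows, b_rows, key).values():
        if not a_bag:                     # plays the role of IsEmpty(A)
            result.extend(b_bag)          # plays the role of FLATTEN(B)
    return result

# Hypothetical sample data: (id, value) tuples.
A = [(1, "x"), (2, "y")]
B = [(2, "p"), (3, "q"), (3, "r"), (4, "s")]
only_in_b = not_in(A, B, key=lambda row: row[0])
```

Unlike an outer join followed by an "is null" check, this never inspects individual field values, so a legitimately null field in A cannot be mistaken for a missing match.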

Re: Compiling 0.7.0 to run against Hadoop 0.19.x

2010-03-30 Thread Alan Gates
Since 0.5 Pig has run against Hadoop 0.20, and since 0.6 it has used the new Hadoop APIs (available only in 20+). Reverting this would be very difficult. There is a patch for Pig 0.4 that will make it run against Hadoop 19 (https://issues.apache.org/jira/browse/PIG-573). Alan. On Mar 29,

Re: not in via join

2010-03-30 Thread Kent Shi
Thanks, that worked. I also found out the reason why my code didn't work here https://issues.apache.org/jira/browse/PIG-1289 On Mar 30, 2010, at 10:58 AM, Alan Gates wrote: > What you gave seems like it should work. But I'd try it as: > > C = COGROUP A BY id, B BY id; > D = FILTER C BY COUN

InputSplit in UDF

2010-03-30 Thread Sandesh Devaraju
Hi All, Is there a way to get current InputSplit in a UDF (more specifically, a filter function)? I have a filter function that validates input rows according to certain criteria and I would like to report the source of failures (if any). Thanks in advance. - Sandesh

Re: InputSplit in UDF

2010-03-30 Thread Ashutosh Chauhan
Try: PigSplit pigSplit = ((PigSplit)((Context)PigMapReduce.sJobContext).getInputSplit()); InputSplit is = pigSplit.getWrappedSplit(); Ashutosh On Tue, Mar 30, 2010 at 13:52, Sandesh Devaraju wrote: > Hi All, > > Is there a way to get current InputSplit in a UDF (more specificall

Re: MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN

2010-03-30 Thread Ashutosh Chauhan
After Pig compiles a query into a series of map-reduce plans, it iterates through those jobs once more, looking for opportunities for further optimization. While it visits such a compiled plan, certain invariants must always hold. If Pig finds that one does not, it backs out and doe

Re: InputSplit in UDF

2010-03-30 Thread Mridul Muralidharan
You might want to be careful with this ... the UDF could get used on both the map and reduce sides, no? Regards, Mridul On Wednesday 31 March 2010 02:22 AM, Sandesh Devaraju wrote: Hi All, Is there a way to get current InputSplit in a UDF (more specifically, a filter function)? I have a filter f