date:20140325

Re: Reply: Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota

Unfortunately, the Enumerate UDF from DataFu would not work in this case. The UDF works on Bags and in this case, we want to enumerate a relation. Implementing RANK is a very tricky thing to do correctly. I'm not even sure if it's doable just by using Pig operators, UDFs or macros. Best option is p

Reply: Re: Any way to join two aliases without using CROSS

2014-03-25 Thread James

Hello, There is a similar UDF in DataFu named Enumerate. http://datafu.incubator.apache.org/docs/datafu/1.2.0/datafu/pig/bags/Enumerate.html I hope it helps. James

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Andrew Musselman

In that situation you could write a script that tacks on the equivalent value that rank does, and stream the ordered relations through it. I'm assuming you have a sense of order on both these relations. After that join like you would after rank. I'm not at a computer so can't type up an example
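A minimal sketch of the kind of script described in the message above, assuming it is written in Python and that every row of an ordered relation passes through a single instance of the script (if the stream is split across several tasks, each instance would restart its numbering). The function name `add_row_numbers` and the comma delimiter used in the example are hypothetical and do not come from the thread.

```python
import sys


def add_row_numbers(lines, delimiter="\t", start=1):
    """Yield each input line with a sequential row number prepended.

    The number plays the role of the value RANK would add, so two
    relations numbered this way can be joined on their first field.
    """
    for number, line in enumerate(lines, start=start):
        # Strip only the trailing newline so field contents are untouched.
        yield f"{number}{delimiter}{line.rstrip(chr(10))}"


if __name__ == "__main__":
    # Read rows from standard input and write the numbered rows back out.
    for row in add_row_numbers(sys.stdin):
        print(row)
```

For instance, `list(add_row_numbers(["10,11\n", "10,12\n"], delimiter=","))` gives `["1,10,11", "2,10,12"]`.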

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage

I don't think my version of PIG supports the rank function, I keep getting Internal Error. I would update it, but I am not in control of the cluster. On Tue, Mar 25, 2014 at 4:16 PM, Andrew Musselman < andrew.mussel...@gmail.com> wrote: > John's answer about RANK sounds like it should solve your

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota

CROSS is by definition a very very expensive operation. Regardless, CROSS is the wrong operator for what you're trying to do. As was suggested by others, you want to RANK the relations then do a JOIN by the rank. On Tue, Mar 25, 2014 at 1:27 PM, wrote: > Here is how to use rank and join for th

RE: Any way to join two aliases without using CROSS

2014-03-25 Thread william.dowling

Here is how to use rank and join for this problem: sh cat xxx 1,2,3,4,5 1,2,4,5,7 1,5,7,8,9 sh cat yyy 10,11 10,12 10,13 a= load 'xxx' using PigStorage(','); b= load 'yyy' using PigStorage(','); a2 = rank a; b2 = rank b; c = join a2 by $0, b2 by $0; c2 = order c by $6; c3 = foreach c2 generat
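The rank-then-join idea in the message above can be illustrated outside Pig. The sketch below is plain Python standing in for the Pig operators, not Pig code: `rank` and `join_on_rank` are hypothetical helpers that only mimic prepending a row position and joining on it, using the sample rows from the message.

```python
def rank(rows):
    """Prepend a 1-based row position to every tuple, in input order."""
    return [(i,) + tuple(row) for i, row in enumerate(rows, start=1)]


def join_on_rank(left, right):
    """Inner-join two ranked relations on the rank, then drop the rank."""
    right_by_rank = {row[0]: row[1:] for row in right}
    return [
        row[1:] + right_by_rank[row[0]]
        for row in left
        if row[0] in right_by_rank
    ]


# Sample data from the files xxx and yyy in the message.
xxx = [(1, 2, 3, 4, 5), (1, 2, 4, 5, 7), (1, 5, 7, 8, 9)]
yyy = [(10, 11), (10, 12), (10, 13)]

result = join_on_rank(rank(xxx), rank(yyy))
for row in result:
    print(row)
```

Each row of the first relation is paired only with the row at the same position in the second, so the result has three rows rather than the nine a cross product would produce.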

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Andrew Musselman

John's answer about RANK sounds like it should solve your problem > On Mar 25, 2014, at 1:13 PM, Christopher Surage wrote: > > @ pradeep, I know what the cross product will do, but I have many lines in > many files. So the cross will take far too long to complete. > > > On Tue, Mar 25, 2014 at

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage

@ pradeep, I know what the cross product will do, but I have many lines in many files. So the cross will take far too long to complete. On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota wrote: > I don't understand what you're trying to do from your example. > > If you perform a cross on the dat

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage

yes On Tue, Mar 25, 2014 at 4:07 PM, Shahab Yunus wrote: > Oh, sorry. This new example is something different from what I understood > before. I thought you were only trying to append one relation (with one > tuple) to another (which has more than one tuple). > > So essentially you want to loop

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread John Meagher

Try this: http://pig.apache.org/docs/r0.11.0/basic.html#rank Rank each data set then join on the rank. On Tue, Mar 25, 2014 at 4:03 PM, Christopher Surage wrote: > The output I would like to see is > > (1,2,3,4,5,10,11) > (1,2,4,5,7,10,12) > (1,5,7,8,9,10,13) > > > On Tue, Mar 25, 2014 at 3:58 P

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Shahab Yunus

Oh, sorry. This new example is something different from what I understood before. I thought you were only trying to append one relation (with one tuple) to another (which has more than one tuple). So essentially you want to loop over two collections and combine their tuples. Are they always going to

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage

The output I would like to see is (1,2,3,4,5,10,11) (1,2,4,5,7,10,12) (1,5,7,8,9,10,13) On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota wrote: > I don't understand what you're trying to do from your example. > > If you perform a cross on the data you have, the output will be the > following:

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota

I don't understand what you're trying to do from your example. If you perform a cross on the data you have, the output will be the following: (1,2,3,4,5,10,11) (1,2,3,4,5,10,11) (1,2,3,4,5,10,11) (1,2,4,5,7,10,11) (1,2,4,5,7,10,11) (1,2,4,5,7,10,11) (1,5,7,8,9,10,11) (1,5,7,8,9,10,11) (1,5,7,8,9,

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Shahab Yunus

Have you tried iterating over the first relation and in the nested *generate* clause, always appending the second relation? Your top level looping is on first relation but in the nested block you are sort of hardcoding appending of second relation. I am referring to the examples like in "Example:

Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage

I am trying to perform the following action, but the only solution I have been able to come up with is using a CROSS, but I don't want to use that statement as it is a very expensive process. (1,2,3,4,5) (10,11) (1,2,4,5,7) (10,11) (1,5,7,8,9) (10,11) I want to make it

generic union types in piggybank

2014-03-25 Thread Liliang Li

Hi: I have a record of union type of union {TypeA, TypeB, TypeC, TypeD, TypeE} mydata; I have the serialized data in avro format, however when I am trying to use piggybank.jar's AvroStorage function to load the avro data, it gives me the following error: Caused by: java.io.IOException: We don't

Recordings from Pig user meetup at Linkedin, Mar 14

2014-03-25 Thread Jarek Jarcec Cecho

Sadly I was not able to attend the last bay area user meetup at Linkedin that was held on March 14. I'm very interested to see some of the presentations, so I'm wondering if there are plans to publish the recordings? Jarcec

Re: Could not estimate number of reducers

2014-03-25 Thread Vincent Barat

I hit https://issues.apache.org/jira/browse/PIG-3512 On 24/03/2014 14:40, Vincent Barat wrote: Hi, Since I moved from Pig 0.10.0 to 0.11.0 or 0.12.0, the estimation of the number of reducers no longer works. My script: A = load 'data'; B = group A by $0; store B into 'out'; My data: gru

pig-0.12.0+PIG-3285: Encounter "NoClassDefFoundError: org.cloudera.htrace.Trace" during reading hbase table in pig grunt

2014-03-25 Thread lulynn_2008

Hi All, I am reading an HBase table as follows: A = LOAD 'APE1_RATED_EVENT' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('', '-loadKey true') AS (id:bytearray); B = GROUP A BY id; X = FOREACH B GENERATE COUNT_STAR(A); DUMP X The job failed, and I found the following error in hadoop task l

19 matches
