Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage
I am trying to perform the following action. The only solution I have been able to come up with is CROSS, but I don't want to use it because it is a very expensive operation. (1,2,3,4,5) (10,11) (1,2,4,5,7) (10,11) (1,5,7,8,9) (10,11) I want to make

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota
I don't understand what you're trying to do from your example. If you perform a cross on the data you have, the output will be the following: (1,2,3,4,5,10,11) (1,2,3,4,5,10,11) (1,2,3,4,5,10,11) (1,2,4,5,7,10,11) (1,2,4,5,7,10,11) (1,2,4,5,7,10,11) (1,5,7,8,9,10,11) (1,5,7,8,9,10,11)

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage
The output I would like to see is (1,2,3,4,5,10,11) (1,2,4,5,7,10,12) (1,5,7,8,9,10,13) On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota pradeep...@gmail.com wrote: I don't understand what you're trying to do from your example. If you perform a cross on the data you have, the output will

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread John Meagher
Try this: http://pig.apache.org/docs/r0.11.0/basic.html#rank Rank each data set then join on the rank. On Tue, Mar 25, 2014 at 4:03 PM, Christopher Surage csur...@gmail.com wrote: The output I would like to see is (1,2,3,4,5,10,11) (1,2,4,5,7,10,12) (1,5,7,8,9,10,13) On Tue, Mar 25, 2014
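
To illustrate what "rank each data set, then join on the rank" computes, here is a minimal Python sketch of the semantics only. It is not Pig code and not John's implementation. The helper names `rank` and `join_on_rank` are hypothetical, and the contents of the second relation are an assumption inferred from the desired output quoted above.

```python
from typing import List, Sequence, Tuple

Row = Tuple[int, ...]


def rank(rows: Sequence[Row]) -> List[Tuple[int, Row]]:
    """Pair every row with its 1-based position, like a sequential rank."""
    return [(position, row) for position, row in enumerate(rows, start=1)]


def join_on_rank(left: Sequence[Row], right: Sequence[Row]) -> List[Row]:
    """Inner-join two relations on their rank and concatenate the fields."""
    right_by_rank = dict(rank(right))
    return [
        row + right_by_rank[position]
        for position, row in rank(left)
        if position in right_by_rank
    ]


# Sample data: the left relation is from the thread; the right relation
# is assumed from the desired output, not stated in the original post.
left = [(1, 2, 3, 4, 5), (1, 2, 4, 5, 7), (1, 5, 7, 8, 9)]
right = [(10, 11), (10, 12), (10, 13)]

result = join_on_rank(left, right)
for row in result:
    print(row)
```

Each row is matched with exactly one partner, so the work grows with the number of rows instead of with the product of the two relation sizes, which is what makes CROSS expensive.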

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage
Yes. On Tue, Mar 25, 2014 at 4:07 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Oh, sorry. This new example is something different from what I understood before. I thought you were only trying to append one relation (with one tuple) to another (which has more than one tuple). So essentially

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage
@Pradeep, I know what the cross product will do, but I have many lines in many files, so the cross will take far too long to complete. On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota pradeep...@gmail.com wrote: I don't understand what you're trying to do from your example. If you perform

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Andrew Musselman
John's answer about RANK sounds like it should solve your problem. On Mar 25, 2014, at 1:13 PM, Christopher Surage csur...@gmail.com wrote: @ pradeep, I know what the cross product will do, but I have many lines in many files. So the cross will take far too long to complete. On Tue, Mar

RE: Any way to join two aliases without using CROSS

2014-03-25 Thread william.dowling
The output I would like to see is (1,2,3,4,5,10,11) (1,2,4,5,7,10,12) (1,5,7,8,9,10,13) On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota pradeep...@gmail.com wrote: I don't understand what you're trying to do from your example. If you perform

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota
The output I would like to see is (1,2,3,4,5,10,11) (1,2,4,5,7,10,12) (1,5,7,8,9,10,13) On Tue, Mar 25, 2014 at 3:58 PM, Pradeep Gollakota pradeep...@gmail.com wrote: I don't understand what you're trying to do from your

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Christopher Surage
I don't think my version of Pig supports the RANK function; I keep getting an Internal Error. I would update it, but I am not in control of the cluster. On Tue, Mar 25, 2014 at 4:16 PM, Andrew Musselman andrew.mussel...@gmail.com wrote: John's answer about RANK sounds like it should solve your

Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Andrew Musselman
In that situation you could write a script that tacks on the same value that RANK would add, and stream the ordered relations through it. I'm assuming you have a sense of order on both these relations. After that, join as you would after RANK. I'm not at a computer so can't type up an
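
A minimal sketch of the kind of script Andrew describes, written in Python. It is not from the thread. It assumes tab-delimited rows (one per line), and it assumes a single process sees the whole ordered relation; if the stream is split across several tasks, each task would restart its numbering at 1. The names `add_row_numbers` and `main` are hypothetical.

```python
import io
import sys
from typing import Iterable, Iterator, TextIO


def add_row_numbers(lines: Iterable[str], start: int = 1) -> Iterator[str]:
    """Prepend a sequential number and a tab to every input line."""
    for number, line in enumerate(lines, start=start):
        fields = line.rstrip("\n")
        yield f"{number}\t{fields}"


def main(stream_in: TextIO = sys.stdin, stream_out: TextIO = sys.stdout) -> None:
    """Number every line read from stream_in and write it to stream_out."""
    for numbered in add_row_numbers(stream_in):
        stream_out.write(numbered + "\n")


# Demonstration on in-memory data. A standalone streaming script would
# call main() with no arguments so that it reads stdin and writes stdout.
demo_in = io.StringIO("1\t2\t3\n4\t5\t6\n")
demo_out = io.StringIO()
main(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Once both relations carry this extra leading field, they can be joined on it in the same way as after RANK.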

Reply: Re: Any way to join two aliases without using CROSS

2014-03-25 Thread James
Hello, there is a similar UDF in DataFu named Enumerate. http://datafu.incubator.apache.org/docs/datafu/1.2.0/datafu/pig/bags/Enumerate.html I hope it helps. James

Re: Reply: Re: Any way to join two aliases without using CROSS

2014-03-25 Thread Pradeep Gollakota
Unfortunately, the Enumerate UDF from DataFu would not work in this case. The UDF works on Bags and in this case, we want to enumerate a relation. Implementing RANK is a very tricky thing to do correctly. I'm not even sure if it's doable just by using Pig operators, UDFs or macros. Best option is