subject:"Attach bag for each tuple and pass to UDF"

Re: Attach bag for each tuple and pass to UDF

2013-10-23 Thread Pradeep Gollakota

A replicated cross (implemented as a replicated join on a synthetic key) is probably your best bet.

On Wed, Oct 23, 2013 at 2:09 PM, Daniel Dai wrote:
> Can you do a cross?
>
> On Mon, Oct 21, 2013 at 2:21 PM, Serega Sheypak wrote:
> > Hi, I have two relations:
> > relation *rows* (>10GB)
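
To make the suggestion above concrete, here is a minimal plain-Python sketch of what a replicated join on a synthetic key amounts to: the small relation is collapsed into one bag under a constant key, every row of the large relation carries the same key, and so each row ends up paired with the whole bag. It only models the data flow and is not Pig code. In Pig this would presumably be a `JOIN ... USING 'replicated'` with the small relation listed last. The sample data, the helper `replicated_cross`, and the lookup logic inside `my_python_udf` are assumptions made for illustration, not details from the thread.

```python
# Plain-Python model of a replicated join on a synthetic key.
# Sample data, replicated_cross, and the UDF body are hypothetical.

SYNTHETIC_KEY = 1  # constant key shared by both sides of the join


def replicated_cross(rows, tiny_dictionary):
    """Pair every row with the whole small relation."""
    # Small side: one bag holding the entire dictionary, stored under the
    # constant key (comparable to grouping the small relation into one bag).
    small_side = {SYNTHETIC_KEY: list(tiny_dictionary)}
    for row in rows:
        # Large side: every row uses the same key, so it always matches
        # the single in-memory entry of the small side.
        yield row, small_side[SYNTHETIC_KEY]


def my_python_udf(row, dictionary_bag):
    """Hypothetical UDF body: look up the row's first field in the bag."""
    lookup = dict(dictionary_bag)
    return (row[0], lookup.get(row[0]))


rows = [("a", 10), ("b", 20), ("z", 30)]
tiny_dictionary = [("a", "alpha"), ("b", "beta")]

result = [my_python_udf(row, bag)
          for row, bag in replicated_cross(rows, tiny_dictionary)]
```

The point of this shape is that the small relation is held in memory once and never shuffled, while the large relation is streamed past it. That matches the sizes described in the question (>10GB against <1MB).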

Re: Attach bag for each tuple and pass to UDF

2013-10-23 Thread Daniel Dai

Can you do a cross?

On Mon, Oct 21, 2013 at 2:21 PM, Serega Sheypak wrote:
> Hi, I have two relations:
> relation *rows* (>10GB)
> relation *tinyDictionary* (<1MB)
>
> I want to take each tuple from *rows* and attach *tinyDictionary* to it.
> And then pass it to python UDF:
>
> result = FOREACH

Attach bag for each tuple and pass to UDF

2013-10-22 Thread Serega Sheypak

Hi, I have two relations:
relation *rows* (>10GB)
relation *tinyDictionary* (<1MB)

I want to take each tuple from *rows* and attach *tinyDictionary* to it, and then pass it to a Python UDF:

result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_*Rows*, whole*TinyDictionary*);

Ho