A replicated cross (implemented as a replicated join on a synthetic key) is
probably your best bet.
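
A rough sketch of what that could look like, in case it helps. The paths, schemas, field names (id, payload, term, value, entries, k) and the script name my_udfs.py are all made up for illustration; only the relation names and udf.my_python_udf come from the original question. The idea is to collapse the dictionary into a single tuple holding one bag, give both sides the same constant key, and let the replicated join attach that bag to every row on the map side.

```pig
-- Hypothetical script name; the thread only shows the alias `udf`.
REGISTER 'my_udfs.py' USING jython AS udf;

-- Placeholder paths and schemas, not taken from the thread.
rows = LOAD 'rows' AS (id:chararray, payload:chararray);
tinyDictionary = LOAD 'tinyDictionary' AS (term:chararray, value:chararray);

-- Collapse the whole dictionary into one tuple that holds a single bag,
-- and tag it with a constant synthetic key.
dictAll = GROUP tinyDictionary ALL;
dictKeyed = FOREACH dictAll GENERATE 1 AS k, tinyDictionary AS entries;

-- Tag every row with the same synthetic key.
rowsKeyed = FOREACH rows GENERATE 1 AS k, id, payload;

-- Replicated (fragment-replicate) join: the small relation is listed last
-- and is loaded into memory by each map task.
joined = JOIN rowsKeyed BY k, dictKeyed BY k USING 'replicated';

-- Each output tuple now carries one row plus the whole dictionary bag.
result = FOREACH joined GENERATE
    udf.my_python_udf(TOTUPLE(rowsKeyed::id, rowsKeyed::payload), dictKeyed::entries);
```

Since the dictionary is under 1MB it should fit in memory comfortably, and the big relation never has to go through a reduce phase.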
On Wed, Oct 23, 2013 at 2:09 PM, Daniel Dai wrote:
> Can you do a cross?
>
>
> On Mon, Oct 21, 2013 at 2:21 PM, Serega Sheypak wrote:
>
> > Hi, I have two relations:
> > relation *rows* (>10GB)
Can you do a cross?
On Mon, Oct 21, 2013 at 2:21 PM, Serega Sheypak wrote:
> Hi, I have two relations:
> relation *rows* (>10GB)
> relation *tinyDictionary* (<1MB)
>
> I want to take each tuple from *rows* and attach *tinyDictionary* to it.
> And then pass it to python UDF:
>
> result = FOREACH
Hi, I have two relations:
relation *rows* (>10GB)
relation *tinyDictionary* (<1MB)
I want to take each tuple from *rows* and attach *tinyDictionary* to it.
And then pass it to python UDF:
result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_*Rows*, whole*TinyDictionary*);
Ho