Re: does EvalFunc generate the entire bag always ?

2010-06-02 Thread Alan Gates
I don't think it pushes limit yet in this case. Alan. On Jun 1, 2010, at 1:44 PM, hc busy wrote: Well, see, that's the thing: the 'sort A by $0' is already n lg(n). Ahh, I see, my own example suffers from this problem. I guess I'm wondering how 'limit' works in conjunction with UDFs... A pract

Re: does EvalFunc generate the entire bag always ?

2010-06-01 Thread hc busy
Well, see, that's the thing: the 'sort A by $0' is already n lg(n). Ahh, I see, my own example suffers from this problem. I guess I'm wondering how 'limit' works in conjunction with UDFs... A practical application escapes me right now, but if I do C = foreach B{ C1 = MyUdf(B.bag_on_b); C2 =

Re: does EvalFunc generate the entire bag always ?

2010-05-27 Thread Alan Gates
The default case is that UDFs that take bags (such as COUNT, etc.) are handed the entire bag at once. In the case where all UDFs in a foreach implement the algebraic interface and the expression itself is algebraic, then the combiner will be used, thus significantly limiting the size of t
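
The difference between the whole-bag path and the algebraic path described above can be sketched roughly as follows. This is a plain-Python model, not Pig's API: the function names (count_whole_bag, count_initial, count_intermediate, count_final, algebraic_count) and the way the bag is split into chunks are assumptions made for illustration only.

```python
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[int, ...]


def count_whole_bag(bag: Sequence[Row]) -> int:
    """Default path: the function is handed the complete bag at once."""
    return len(bag)


def count_initial(chunk: Iterable[Row]) -> int:
    """First phase: count the tuples in one partial chunk of the bag."""
    return sum(1 for _ in chunk)


def count_intermediate(partials: Iterable[int]) -> int:
    """Combiner phase: merge partial counts into a smaller partial count."""
    return sum(partials)


def count_final(partials: Iterable[int]) -> int:
    """Last phase: merge whatever partial counts remain into the answer."""
    return sum(partials)


def algebraic_count(chunks: Sequence[Sequence[Row]]) -> int:
    """Hypothetical driver: no phase ever sees the whole bag."""
    partials: List[int] = [count_initial(c) for c in chunks]
    # Merge partial counts two at a time to mimic a combiner pass.
    combined = [
        count_intermediate(partials[i:i + 2])
        for i in range(0, len(partials), 2)
    ]
    return count_final(combined)


bag = [(i,) for i in range(10)]
chunks = [bag[0:4], bag[4:7], bag[7:10]]
# Both paths agree, but only the first one needs the full bag in one call.
assert algebraic_count(chunks) == count_whole_bag(bag)
```

The point of the split is that each phase only handles a chunk or a list of small partial results, which is what lets a combiner shrink the data before the final step.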

does EvalFunc generate the entire bag always ?

2010-05-26 Thread hc busy
Hey, guys, how are Bags passed to EvalFunc stored? I was looking at the Accumulator interface and it says that the reason why this is needed for COUNT and SUM is because EvalFunc always gives you the entire bag when the EvalFunc is run on a bag. I always thought if I did COUNT(TABLE) or SUM(TABLE.FI
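
For readers unfamiliar with the accumulator idea raised in this question, here is a minimal sketch of the pattern. It is plain Python rather than Pig code: the method names loosely mirror the accumulate / getValue / cleanup methods of Pig's Java Accumulator interface, while the class name SumAccumulator and the chunking of the bag are assumptions made for illustration.

```python
from typing import Iterable, Tuple


class SumAccumulator:
    """Hypothetical SUM that consumes a bag one chunk at a time."""

    def __init__(self) -> None:
        self._total = 0

    def accumulate(self, chunk: Iterable[Tuple[int]]) -> None:
        """Fold one partial chunk of the bag into the running total."""
        for (value,) in chunk:
            self._total += value

    def get_value(self) -> int:
        """Return the result once every chunk has been seen."""
        return self._total

    def cleanup(self) -> None:
        """Reset state so the instance can be reused for the next key."""
        self._total = 0


acc = SumAccumulator()
# The caller feeds the bag in pieces; the full bag is never held at once.
for chunk in ([(1,), (2,)], [(3,), (4,)], [(5,)]):
    acc.accumulate(chunk)
total = acc.get_value()
acc.cleanup()
```

Only the running total is kept between calls, so memory use stays independent of the size of the bag.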