Yes, it's tough, and no, it's not common :)
Scale brings limitations...
On Tue, Dec 14, 2010 at 4:05 AM, Rajesh Balamohan
wrote:
Thanks for the quick reply, Dmitriy. Does that mean it's tough to do
non-equi joins between 2 datasets in Pig? Isn't that a common
scenario in production systems?
On Tue, Dec 14, 2010 at 6:59 AM, Dmitriy Ryaboy wrote:
Rajesh, that's not a MapReduce-friendly computation, as it is essentially a
cross product.
That is also how you would implement something like this: cross, then
filter. It would be awfully slow, or just not computable, for very large
datasets.
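A minimal sketch of cross-then-filter in Pig Latin, using the T1/T2 and f1/f2 names from the original question. The input paths, the single-int schemas, and the `crossed` and `matched` aliases are assumptions for illustration only:

```pig
-- Hypothetical inputs: paths and schemas are assumptions, not from the thread.
T1 = LOAD 't1.txt' AS (f1:int);
T2 = LOAD 't2.txt' AS (f2:int);

-- CROSS pairs every T1 row with every T2 row,
-- so the intermediate result has |T1| * |T2| rows.
crossed = CROSS T1, T2;

-- Keep only the pairs that satisfy the non-equi condition.
matched = FILTER crossed BY T2::f2 > T1::f1;

DUMP matched;
```

With a million rows on each side, the CROSS step alone would produce 10^12 pairs before the filter discards anything, which is why this does not scale.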
-Dmitriy
On Mon, Dec 13, 2010 at 5:04 PM, Rajesh Balamohan wrote:
Hi Folks,
I have 2 datasets (T1, T2) to be joined.
I need to join T1 with T2 based on some criteria, but COGROUP only matches
on an == condition.
Example: COGROUP T1 BY f1, T2 BY f2 (but I need to keep only the rows where
T2.f2 > T1.f1).
Is there a way to specify such conditions in Pig?
~Rajesh.B