Hi Yang,
A couple of points: GROUP z ALL creates exactly one input group for
the reducers. Since there is only one group, more reducers won't help. There
are accumulator and algebraic UDFs, but SIZE is not one of them because
SIZE can also take data types other than bags (you can't split the
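
For what it's worth, here is a minimal sketch of the same count using the built-in COUNT_STAR, which is algebraic, so the combiner can compute partial counts on the map side. The load path is made up for illustration. The job still ends with a single reducer, because GROUP ALL yields one group, but that reducer only adds up partial counts instead of receiving every tuple. COUNT_STAR is used rather than COUNT because COUNT skips tuples whose first field is null.

```
-- Hypothetical input path; replace with your own data.
z = LOAD 'input_data';
-- GROUP ALL still produces a single group, hence a single reducer.
y = GROUP z ALL;
-- COUNT_STAR is algebraic, so partial counts can be computed in the
-- combiner before the lone reducer adds them up.
x = FOREACH y GENERATE COUNT_STAR(z);
DUMP x;
```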
I set default_parallel=15,
but when I ran
y = group z ALL;
x = foreach y generate SIZE(z);
those 2 lines generated an MR job with only 1 reducer.
I guess it's because SIZE() needs to count all the groups, but don't we
have some sort of cumulative/additive UDFs?
it would be faster if we could pa
Dan,
I implemented most of the jruby stuff. Glad to hear you're trying it out!
Please let us know what your experience is like.
I definitely had plans to upgrade to jruby 1.7, and am not sure why I never
did...hmm.
Ahh, ok, it's part of this patch...which still isn't committed...you should
bump
Hey,
With the last release, support for JRuby was added to Pig. I've started
using it for some work I'm doing, but there are a few details missing
that are hard to pull out of the Pig code.
I can't see anywhere that we specify the version of Ruby that the
embedded JRuby engine is using. In