Re: SIZE() always leads to 1 reducer?

2013-04-11 Thread Mark Wagner
Hi Yang, a couple of points: grouping z with ALL creates exactly one input group for the reducers. Since there's only one group, adding more reducers doesn't help. There are accumulator and algebraic UDFs, but SIZE is not one of them, because SIZE can also take data types other than bags (you can't split the
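
To illustrate the algebraic idea mentioned above, here is a minimal Python sketch (not Pig code, and not how Pig implements it internally). It assumes the bag can be split into partitions, counts each partition separately (as a combiner would), and then sums the partial counts in a single final step. The names `initial`, `final`, and `algebraic_count` are hypothetical and chosen only for illustration. As far as I know, Pig's built-in COUNT works along these lines, which is why it can use the combiner even under GROUP ... ALL.

```python
def initial(chunk):
    """Combiner side: count the tuples in one partition of the bag."""
    return len(chunk)


def final(partial_counts):
    """Reducer side: add up the partial counts from every partition."""
    return sum(partial_counts)


def algebraic_count(bag, num_partitions):
    """Count a bag by splitting it, counting each piece, and summing.

    The split here is a simple round-robin slice; a real job would
    partition by input split instead (this is an assumption of the sketch).
    """
    chunks = [bag[i::num_partitions] for i in range(num_partitions)]
    partial_counts = [initial(chunk) for chunk in chunks]
    return final(partial_counts)
```

The single reducer still runs, but it only has to add a handful of small numbers rather than walk the whole bag.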

SIZE() always leads to 1 reducer?

2013-04-11 Thread Yang
I set default_parallel=15, but when I ran y = group z ALL; x = foreach y generate SIZE(z); those 2 lines generated an MR job with only 1 reducer. I guess it's because SIZE() needs to count all the groups. But don't we have some sort of cumulative/additive UDFs? It would be faster if we could pa

Re: Ruby 1.9 and jRuby 1.7

2013-04-11 Thread Jonathan Coveney
Dan, I implemented most of the jruby stuff. Glad to hear you're trying it out! Please let us know what your experience is like. I definitely had plans to upgrade to jruby 1.7, and am not sure why I never did...hmm. Ahh, ok, it's part of this patch...which still isn't committed...you should bump

Ruby 1.9 and jRuby 1.7

2013-04-11 Thread Dan
Hey, with the last release, support for jRuby was added to Pig. I've started using it for some work I'm doing, but a few details are missing that are hard to pull out of the Pig code. I can't see anywhere that we specify the version of Ruby that the embedded jRuby engine uses. In