Re: CROSS optimization

2014-02-09 Thread Enns, Steven
ce one of my relations is small enough to fit in memory, I can force it to use a map side (replicated) join. Now the plan looks like this: Map(LOAD A, LOAD B, JOIN, FILTER) -> Combine(COUNT) -> Reduce(COUNT) On 2/9/14 12:53 PM, "Enns, Steven" wrote: >I am trying to ag

CROSS optimization

2014-02-09 Thread Enns, Steven
I am trying to aggregate on the cross product of two relations. It can be done using a single M/R job but pig is using two. The pig code looks like this: C = cross A, B; C = filter C by ...; G = group C by x; G = foreach G generate group, COUNT(G); The resulting M/

Re: Override input schema in AvroStorage

2013-05-01 Thread Enns, Steven
m a file. > > >This jira is work in progress, but hopefully it will be in next major >released. > >Thanks, >Cheolsoo > > > >On Sat, Apr 27, 2013 at 3:24 PM, Enns, Steven wrote: > >> Resending now that I am subscribed :) >> >> On 4/25/13 4:01 P

Re: Override input schema in AvroStorage

2013-04-27 Thread Enns, Steven
Resending now that I am subscribed :) On 4/25/13 4:01 PM, "Enns, Steven" wrote: >Hi everyone, > >I would like to override the input schema in AvroStorage to make a pig >script robust to schema evolution. For example, suppose a new field is >added to an avro schema wit

Override input schema in AvroStorage

2013-04-25 Thread Enns, Steven
Hi everyone, I would like to override the input schema in AvroStorage to make a pig script robust to schema evolution. For example, suppose a new field is added to an avro schema with a default value of null. If the input to a pig script using this field includes both old and new data, AvroStora