Re: Split Indexes

2010-07-22 Thread Alan Gates
Pig has implemented map side merge joins in this way. If the storage mechanism contains an index (e.g. Zebra) it can use it.

Alan.

On Jul 21, 2010, at 5:22 PM, Deem, Mike wrote:
> We are planning to use Hadoop to run a number of recurring jobs that involve map side joins. Rather than requi

Split Indexes

2010-07-21 Thread Deem, Mike
We are planning to use Hadoop to run a number of recurring jobs that involve map side joins. Rather than requiring that the joined datasets be partitioned into separate part-* files, we are considering the following solution.

Our concerns with the partitioned approach include:

* All t