Re: Can anyone point me to a good Map Reduce in memory Join implementation?

2013-02-15 Thread Prashant Kommireddi
Specifically, replicated join - http://pig.apache.org/docs/r0.10.0/perf.html#replicated-joins

On Fri, Feb 15, 2013 at 6:22 PM, David Boyd wrote:
> Use PIG it has specific directives for in memory joins of small
> data sets. The whole thing might require a half a dozen lines
> of code.
>
> On

Re: Can anyone point me to a good Map Reduce in memory Join implementation?

2013-02-15 Thread David Boyd
Use Pig; it has specific directives for in-memory joins of small data sets. The whole thing might require half a dozen lines of code.

On 2/15/2013 4:25 PM, Yunming Zhang wrote:
Hi, I am trying to do some work with in memory Join Map Reduce implementation, it can be summarized as a join be

Re: Can anyone point me to a good Map Reduce in memory Join implementation?

2013-02-15 Thread Viral Bajaria
Why not look at Hive? It already implements the JOIN that you are looking for and supports MAPJOIN, i.e. loading the small file into memory.

On Fri, Feb 15, 2013 at 1:25 PM, Yunming Zhang wrote:
> Hi,
>
> I am trying to do some work with in memory Join Map Reduce implementation,
>
> it can be

Can anyone point me to a good Map Reduce in memory Join implementation?

2013-02-15 Thread Yunming Zhang
Hi, I am trying to do some work with an in-memory join implementation in MapReduce. It can be summarized as a join between two data sets, R and S: one of them is too large to fit into memory, while the other fits into memory reasonably well (size of R << size of S). The typical implementation
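
For readers of this thread, the kind of join being asked about (small R held in memory, large S streamed past it) can be sketched in plain Python. This is only an illustration of the general hash-join idea behind replicated joins and MAPJOIN, not code from Pig, Hive, or Hadoop; the function names `build_index` and `map_side_join` and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

Record = Tuple[Hashable, str]  # (join key, payload); an assumed record shape


def build_index(small_r: Iterable[Record]) -> Dict[Hashable, List[str]]:
    """Load the small data set R fully into memory, grouped by join key."""
    index: Dict[Hashable, List[str]] = defaultdict(list)
    for key, value in small_r:
        index[key].append(value)
    return index


def map_side_join(
    index: Dict[Hashable, List[str]], large_s: Iterable[Record]
) -> Iterator[Tuple[Hashable, str, str]]:
    """Stream the large data set S once, probing the in-memory index of R.

    In a MapReduce setting this loop would be the body of each mapper,
    with every mapper holding its own copy of the index; no reduce phase
    is needed for an inner join.
    """
    for key, s_value in large_s:
        # .get avoids inserting empty lists for keys that are absent from R
        for r_value in index.get(key, ()):
            yield key, r_value, s_value


# Hypothetical sample data: R is small, S stands in for the large input.
R = [(1, "r-one"), (2, "r-two"), (2, "r-two-bis")]
S = [(2, "s-a"), (3, "s-b"), (1, "s-c")]

joined = list(map_side_join(build_index(R), S))
```

Only R has to fit in memory; S can be any iterator, such as lines read from a file split, so memory use is independent of the size of S.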