Pig 0.8.1.

On Mon, Aug 22, 2011 at 10:58 PM, Thejas Nair <[email protected]> wrote:
> Hi Byambajargal,
> What version of Pig does your distribution use?
> -Thejas
>
> On 8/22/11 3:42 AM, byambaa wrote:
>
>> Hello,
>> I have a cluster of 11 nodes, each with 16 GB RAM, a 6-core CPU, and a
>> 1 TB HDD, and I am using the Cloudera distribution CHD4b with Pig. I
>> have two Pig join queries, a parallel and a replicated version of the
>> Pig join, and MapReduce reduce-side and map-side joins.
>>
>> Theoretically, a replicated join could be faster than a parallel join,
>> but in my case the parallel join is faster.
>> I have two questions:
>>
>> 1. Why is the replicated join so slow? How does it work, and what
>>    happens behind the replicated join?
>> 2. The MR reduce-side join was faster than the parallel Pig join. How is
>>    the parallel Pig join implemented in the background? I guess Pig
>>    also implements an MR reduce-side join.
>>
>> Could you explain how the Pig joins work and what runs behind the Pig
>> scripts?
>>
>> Dataset (size)                | Replicated join | Replicated join | MR reduce-side | MR joins
>>                               | in HDFS         | in HBase        | join           | (Singleton pattern)
>> obr_wp_annotation (1786 MB)   | 29 sec          | 50 sec          | 36 sec         | 19
>> obr_ct_annotation (5916 MB)   | 799 sec         | 523 sec         | 108 sec        | 69
>> obr_pm_annotation (16983 MB)  | 1794 sec        | 707 sec         | 248 sec        | 138
>>
>> The relation file is 659 MB.
>>
>> Thank you very much,
>>
>> Byambajargal
