Yeah, maybe I should have said "right or full outer join".  What I wanted to 
make clear is that if you want to identify non-matches in the large input (the 
fragment, or left side), you can still use fragment-replicate join.  If you 
want to identify non-matches in the small input (the replicated, or right 
side), you cannot.
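
As a rough illustration of why, here is a small Python simulation of what a 
single mapper does in a fragment-replicate join. It is a sketch only, not 
Pig's actual implementation; the function name replicated_join, the field 
names other than locid, and the sample rows are made up for the example.

```python
def replicated_join(fragment_rows, small_rows, key, left_outer=False):
    """Simulate ONE mapper of a fragment-replicate join (hypothetical helper).

    fragment_rows: this mapper's slice of the large (left) input.
    small_rows:    the whole small (right) input, replicated to every mapper.
    """
    # Build an in-memory hash table from the replicated small input.
    table = {}
    for row in small_rows:
        table.setdefault(row[key], []).append(row)

    joined = []
    matched_keys = set()
    for left in fragment_rows:
        matches = table.get(left[key], [])
        if matches:
            matched_keys.add(left[key])
            joined.extend((left, right) for right in matches)
        elif left_outer:
            # The mapper owns this left row, so it can safely emit the non-match.
            joined.append((left, None))

    # Small-side keys that found no partner in THIS mapper's fragment only.
    locally_unmatched = set(table) - matched_keys
    return joined, locally_unmatched


# Made-up sample data: the small input and two splits of the large input.
locations = [{"locid": 1, "city": "Paris"}, {"locid": 2, "city": "Rennes"}]
split_a = [{"locid": 1, "sid": "s1"}, {"locid": 9, "sid": "s2"}]
split_b = [{"locid": 2, "sid": "s3"}]

joined_a, unmatched_a = replicated_join(split_a, locations, "locid", left_outer=True)
joined_b, unmatched_b = replicated_join(split_b, locations, "locid", left_outer=True)
```

Each mapper sees all of its own left rows, so mapper A can emit the unmatched 
session s2 by itself. It cannot do the same for locations: mapper A believes 
locid 2 is unmatched and mapper B believes locid 1 is, although both match 
somewhere in the full input. No single mapper can tell which small-side rows 
are unmatched globally.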

Alan.

On Jan 30, 2012, at 6:09 AM, Vincent Barat wrote:

> I understand your point and it makes sense.
> 
> The graph in Alan's book says that if you "outer join on the small input" you 
> should not use replicated join.
> 
> Maybe this sentence is not clear enough :)
> 
> 
> On 28/01/12 00:21, Alex Rovner wrote:
>> From what I understand, replicated should not be used with a full outer 
>> join, since full outer means both tables' records will be in the output 
>> regardless of whether they exist in the joined table. In your case you only 
>> care about sessions, which is a left join and not a full outer.
>> 
>> The reason is Pig's and Hadoop's semantics for this join: the "small" table 
>> is loaded into each mapper, so its records are not meant to appear in the 
>> output on their own.
>> 
>> Alex
>> 
>> On Jan 27, 2012, at 8:15 AM, Vincent Barat <vincent.ba...@gmail.com> wrote:
>> 
>>> Hi folks,
>>> 
>>> I use replicated joins, and recently I encountered an issue: my rightmost 
>>> relation seems to have become too big and, even though I don't get any 
>>> "Java heap space" error, the time it takes for the maps to finish becomes 
>>> exponentially long (I cannot figure out why exactly).
>>> 
>>> Removing "replicated" fixes the issue, but it raises several questions.
>>> 
>>> In Alan's book, "Figure 8.1. Choosing a Join Implementation" says that 
>>> replicated joins should NOT BE USED for outer joins.
>>> 
>>> Nevertheless, it seems to work in the following case, and is faster than 
>>> a regular join. So why?
>>> 
>>> sessions = JOIN sessions BY locid LEFT, locations BY locid USING 
>>> 'replicated';
>>> 
>>> (not all sessions have a location in this case)
>>> 
>>> Thanks for your advice.
>>> 
>>> 
>>> 
>>> 
