Re: CompositeInputFormat
Map Side joins will use the CompositeInputFormat. They will only really be worth doing if one data set is small, and the other is large.

This is a good example: http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/

The trick is to google for CompositeInputFormat.compose() :)

On Thu, Jul 11, 2013 at 5:02 PM, Botelho, Andrew wrote:
> Hi,
>
> I want to perform a JOIN on two sets of data with Hadoop. I read that the
> class CompositeInputFormat can be used to perform joins on data, but I
> can't find any examples of how to do it.
>
> Could someone help me out? It would be much appreciated. :)
>
> Thanks in advance,
>
> Andrew

--
Jay Vyas
http://jayunit100.blogspot.com
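For readers new to the idea: CompositeInputFormat joins data sources that are sorted and partitioned the same way, so matching keys can be paired up in a single pass on the map side. The following is a minimal plain-Java sketch of that merge step only, not Hadoop's implementation. The `Row` type, the `innerJoin` helper, and the sample data are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a sort-merge inner join over two key-sorted lists.
// This is NOT Hadoop code; Row and innerJoin are hypothetical names.
final class MergeJoinSketch {

    /** One record: a join key plus a payload. */
    static final class Row {
        final String key;
        final String value;

        Row(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Inner join of two lists that are ALREADY sorted by key.
     * Emits "key<TAB>leftValue<TAB>rightValue" for every matching pair.
     */
    static List<String> innerJoin(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).key.compareTo(right.get(j).key);
            if (cmp < 0) {
                i++; // left key is smaller: no partner on the right, skip it
            } else if (cmp > 0) {
                j++; // right key is smaller: no partner on the left, skip it
            } else {
                // Same key on both sides: pair every left row of this key
                // with every right row of this key.
                String key = left.get(i).key;
                int groupStart = j;
                while (i < left.size() && left.get(i).key.equals(key)) {
                    j = groupStart;
                    while (j < right.size() && right.get(j).key.equals(key)) {
                        out.add(key + "\t" + left.get(i).value + "\t" + right.get(j).value);
                        j++;
                    }
                    i++;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, both sides sorted by key.
        List<Row> customers = Arrays.asList(
                new Row("c1", "Alice"), new Row("c2", "Bob"), new Row("c4", "Dana"));
        List<Row> orders = Arrays.asList(
                new Row("c2", "order-17"), new Row("c3", "order-18"), new Row("c4", "order-19"));
        for (String line : innerJoin(customers, orders)) {
            System.out.println(line);
        }
    }
}
```

Because neither side is buffered in full or re-sorted, a join of this kind needs no shuffle or reduce phase, provided the inputs really are sorted and split identically.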
RE: CompositeInputFormat
Sorry, I should've specified that I need an example of CompositeInputFormat that uses the new API. The example linked below uses old API objects like JobConf.

Any known examples of CompositeInputFormat using the new API?

Thanks in advance,

Andrew

From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: Thursday, July 11, 2013 5:10 PM
To: common-u...@hadoop.apache.org
Subject: Re: CompositeInputFormat

Map Side joins will use the CompositeInputFormat. They will only really be worth doing if one data set is small, and the other is large.

This is a good example: http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/

The trick is to google for CompositeInputFormat.compose() :)
RE: CompositeInputFormat
Hi Andrew,

You could use the Hadoop data join classes to perform the join, or refer to these classes for a better idea of how to perform a join.

http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-datajoin

Thanks
Devaraj k

From: Botelho, Andrew [mailto:andrew.bote...@emc.com]
Sent: 12 July 2013 03:33
To: user@hadoop.apache.org
Subject: RE: CompositeInputFormat

Sorry, I should've specified that I need an example of CompositeInputFormat that uses the new API. The example linked below uses old API objects like JobConf.

Any known examples of CompositeInputFormat using the new API?

Thanks in advance,

Andrew
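On the new-API question raised in this thread: in Hadoop releases that ship the `org.apache.hadoop.mapreduce.lib.join` package, the join is, as far as I know, driven by a string expression stored in the job configuration under `mapreduce.join.expr`, and `CompositeInputFormat.compose()` is just a helper that builds that string. The sketch below rebuilds the expression by hand in plain Java (no Hadoop on the classpath) to show its shape. The class and method names here are hypothetical, the two paths are made-up examples, and the exact format should be checked against the Javadoc of your Hadoop version.

```java
// Illustration only: hand-builds the kind of join expression that
// CompositeInputFormat.compose() is understood to produce.
// JoinExpressionSketch, table() and compose() are hypothetical names.
final class JoinExpressionSketch {

    // Assumed configuration key for the new-API join expression
    // (believed to match CompositeInputFormat.JOIN_EXPR).
    static final String JOIN_EXPR_KEY = "mapreduce.join.expr";

    /** One data source: tbl(<input format class name>,"<path>"). */
    static String table(String inputFormatClass, String path) {
        return "tbl(" + inputFormatClass + ",\"" + path + "\")";
    }

    /** Wraps all sources in a join operator such as "inner" or "outer". */
    static String compose(String op, String inputFormatClass, String... paths) {
        StringBuilder sb = new StringBuilder(op).append('(');
        for (int i = 0; i < paths.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(table(inputFormatClass, paths[i]));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // Hypothetical input paths; both inputs must be sorted by key and
        // partitioned identically for a map-side join to be valid.
        String expr = compose(
                "inner",
                "org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat",
                "/data/customers",
                "/data/orders");
        System.out.println(JOIN_EXPR_KEY + " = " + expr);
    }
}
```

In an actual new-API driver, the equivalent would presumably be to call `job.setInputFormatClass(CompositeInputFormat.class)` and store the result of `CompositeInputFormat.compose("inner", KeyValueTextInputFormat.class, paths...)` in `job.getConfiguration()` under `CompositeInputFormat.JOIN_EXPR`, with the mapper receiving a `TupleWritable` value per joined key. Treat that as an outline to verify, not a tested recipe.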