Re: How HIVE manages a join

Edward Capriolo Tue, 10 Aug 2010 14:58:10 -0700

Sorry.
$hive_root/docs/xdocs/language_manual/joins.xml


On Tue, Aug 10, 2010 at 5:57 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> This page is already in version control.
>
> /home/edward/cassandra-handler/docs/xdocs/language_manual/joins.xml
>
> Edward
>
> On Tue, Aug 10, 2010 at 5:15 PM, Carl Steinbach <c...@cloudera.com> wrote:
>> Hi Yongqiang,
>> Please go ahead and update the wiki page. I will copy it over to version
>> control when you are done.
>> Thanks.
>> Carl
>>
>> On Tue, Aug 10, 2010 at 2:11 PM, yongqiang he <heyongqiang...@gmail.com>
>> wrote:
>>>
>>> In the Hive Join wiki page, it says
>>> "THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT EDIT!Join Syntax"
>>>
>>> Where should I do the update?
>>>
>>> On Fri, Aug 6, 2010 at 11:46 PM, yongqiang he <heyongqiang...@gmail.com>
>>> wrote:
>>> > Yeah. The sort merge bucket mapjoin has been finished for some time
>>> > and seems stable now. I did one skew join but haven't had a chance to
>>> > look at another skew join Namit mentioned to me. But I definitely should
>>> > have updated the wiki earlier. My bad.
>>> >
>>> > On Fri, Aug 6, 2010 at 8:32 PM, Jeff Hammerbacher <ham...@cloudera.com>
>>> > wrote:
>>> >> Yongqiang mentioned he was going to update the wiki with this
>>> >> information in the thread at
>>> >> http://hadoop.markmail.org/thread/hxd4uwwukuo46lgw.
>>> >>
>>> >> Yongqiang, have you gotten a chance to complete the sort merge bucket
>>> >> map join and the other skew join you mention in the above thread?
>>> >>
>>> >> Thanks,
>>> >> Jeff
>>> >>
>>> >> On Fri, Aug 6, 2010 at 3:43 AM, bharath vissapragada
>>> >> <bhara...@students.iiit.ac.in> wrote:
>>> >>>
>>> >>> Roberto,
>>> >>>
>>> >>> You may find these links useful:
>>> >>>
>>> >>> http://www.slideshare.net/ragho/hive-icde-2010?src=related_normal&rel=2374551
>>> >>> - Simple joins and optimizations
>>> >>>
>>> >>> http://www.slideshare.net/zshao/hive-user-meeting-march-2010-hive-team
>>> >>> - New kinds of joins / features of Hive
>>> >>>
>>> >>> Thanks
>>> >>>
>>> >>> Bharath.V
>>> >>> 4th year Undergraduate..
>>> >>> IIIT Hyderabad
>>> >>>
>>> >>> On Fri, Aug 6, 2010 at 12:16 PM, Cappa Roberto
>>> >>> <roberto.ca...@guest.telecomitalia.it> wrote:
>>> >>>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> I cannot find any documentation about which algorithm Hive uses to
>>> >>>> translate JOIN clauses into Map-Reduce tasks.
>>> >>>>
>>> >>>> In particular, suppose I have two tables A and B, each table is stored
>>> >>>> in a separate file, and each file is split across the Hadoop nodes.
>>> >>>> When I perform a JOIN with A.column = B.column, the framework has to
>>> >>>> compare the full data from the first file with the full data from the
>>> >>>> second file. How can Hadoop perform a full scan of all possible
>>> >>>> combinations of values? If each node contains only a portion of each
>>> >>>> file, a complete comparison does not seem possible. Is one of the two
>>> >>>> files entirely replicated on each node? Or does Hive use another kind
>>> >>>> of strategy/optimization?
>>> >>>>
>>> >>>> Thanks.
>>> >>
>>> >>
>>> >
>>
>>
>
