Re: mapjoin with left join

2015-09-22 Thread Steve Howard
Hi Gopal/All, Yep, I absolutely understand the limitation of what we are trying to do. We will try the settings you suggested. Thanks, Steve On Tue, Sep 22, 2015 at 1:44 PM, Gopal Vijayaraghavan wrote: > > > select small.* from small s left join large l on s.id = l.id

Re: mapjoin with left join

2015-09-22 Thread Gopal Vijayaraghavan
> select small.* from small s left join large l on s.id = l.id where l.id is null; ... > We simply want to load the 81K rows into RAM, then for each row in large, check the small hash table and, if the row in small is not in large, add it to
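For readers skimming the thread, the approach described in the quoted request can be sketched in a single process roughly as follows. This is only an illustration in Python, not Hive code: the function name, the dict-based row layout, and the assumption that `id` is unique in the small table are all hypothetical.

```python
def small_rows_missing_from_large(small_rows, large_ids):
    """Return the rows of `small` whose id never appears in `large`.

    Assumes (hypothetically) that each small row is a dict with a unique "id".
    """
    # Build the in-memory hash table from the small table, keyed by id.
    unmatched = {row["id"]: row for row in small_rows}
    # Stream the large table once; every hit removes the matching small row.
    for large_id in large_ids:
        if large_id is not None:  # a NULL id never equals anything in SQL
            unmatched.pop(large_id, None)
    # Whatever is left had no match anywhere in the large table.
    return list(unmatched.values())


small = [{"id": 1}, {"id": 2}, {"id": 3}]
missing = small_rows_missing_from_large(small, iter([2, 2, 5]))
```

The sketch only works because one process sees every row of the large table before it answers; nothing can be emitted until the stream is exhausted.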

Re: mapjoin with left join

2015-09-20 Thread Noam Hasson
Not sure if this will help you, but you can try the map-join hint, which tells Hive to put a specific table in memory: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization#LanguageManualJoinOptimization-PriorSupportforMAPJOIN On Fri, Sep 11, 2015 at 11:16 PM, S

Re: mapjoin with left join

2015-09-11 Thread Sergey Shelukhin
As far as I know it’s not currently supported. The large table will be streamed in multiple tasks with the small table in memory, so there’s not one place that knows for sure there was no row in the large table for a particular small table row in any of the locations. It could have no match in o
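The point about multiple tasks can be illustrated with a toy simulation. This is a sketch in Python under stated assumptions, not a description of Hive internals: `map_task`, `merge_unmatched`, and the sample ids are hypothetical names and data chosen for the example.

```python
def map_task(small_ids, large_split):
    """One task: the small table is in memory, one split of large streams by."""
    seen = set(large_split)
    # Ids that found no match *in this split only*.
    return {sid for sid in small_ids if sid not in seen}


def merge_unmatched(per_task_unmatched):
    """An id is truly unmatched only if every task failed to match it.

    Expects at least one task result.
    """
    return set.intersection(*per_task_unmatched)


small_ids = {1, 2, 3}
large_splits = [[1, 7], [2, 8]]  # the large table, spread over two tasks

per_task = [map_task(small_ids, split) for split in large_splits]
naive = set().union(*per_task)       # each task emitting its local misses
correct = merge_unmatched(per_task)  # requires a step that sees all tasks
```

Here the first task thinks ids 2 and 3 are unmatched and the second thinks ids 1 and 3 are, so emitting local results would wrongly return all three rows. Only id 3 is missing from the large table as a whole, and finding that out needs some extra step that combines the per-task results.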