Hi,

Did you set hive.auto.convert.join=true and try? It does a runtime check of
input table sizes and autoconverts join operators to mapjoin if possible.
I'm not sure if it does it for subqueries too, need to look into the code
base. If any of the devs can confirm, that would be great.

- Bharath


On Wed, Dec 4, 2013 at 10:35 PM, Sukhendu Chakraborty <
sukhendu.chakrabo...@gmail.com> wrote:

> Hi,
>
> Is there anyway mapjoin works on the subquery(not the underlying table). I
> have the following query:
>
> select external_id,count(category_id) from
> catalog_products_in_categories_orc pc inner join (select * from
> catalog_products_orc where s_id=118) p on pc.product_id=p.id   group by
> external_id;
>
>
> Now, even though catalog_products_orc is a big table, after filtering
> (s_id=118) it results in very few number of rows which can be easily
> optimized to a mapjoin (with catalog_products_in_categories_orc as the big
> table and the subquery result as the small table) . However, when I try to
> specify /*+MAPJOIN(p)*/ to enforce this, it results in a mapjoin for the
> table catalog_products_orc (and not on the subquery after filtering).
>
> Any ideas to achieve mapjoin on a subquery (and not the underlying table)?
>
>
> -Sukhendu
>

Reply via email to