[ https://issues.apache.org/jira/browse/HIVE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900397#action_12900397 ]

Ning Zhang commented on HIVE-1567:
----------------------------------

The hive.mapjoin.maxsize parameter is there not for speed but to limit memory 
consumption. We saw OOM exceptions quite often before it was introduced. 
Rather than increasing it blindly, a better approach may be to estimate how 
many rows can fit into memory, based on the row size and the available memory, 
and adjust this parameter automatically.
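The automatic-estimation idea above could be sketched roughly as follows. This is a hypothetical helper, not Hive code; the class name, method name, and the safety factor are all assumptions:

```java
// Hypothetical sketch of the adaptive approach: derive a row cap for the
// small table of a map join from available memory and average row size,
// instead of using a fixed hive.mapjoin.maxsize value.
public class MapJoinSizeEstimator {

    /**
     * Estimate the maximum number of rows the in-memory side of a map join
     * can hold.
     *
     * @param availableMemBytes heap memory the join may use, in bytes
     * @param avgRowSizeBytes   average in-memory row size of the small table
     * @param safetyFactor      fraction of memory to actually use (e.g. 0.5),
     *                          leaving headroom for other allocations
     */
    public static long estimateMaxRows(long availableMemBytes,
                                       long avgRowSizeBytes,
                                       double safetyFactor) {
        if (avgRowSizeBytes <= 0) {
            throw new IllegalArgumentException("row size must be positive");
        }
        return (long) (availableMemBytes * safetyFactor) / avgRowSizeBytes;
    }

    public static void main(String[] args) {
        // Example: 1 GB available, 200-byte rows, use half the memory.
        long rows = estimateMaxRows(1L << 30, 200, 0.5);
        System.out.println(rows); // prints 2684354
    }
}
```

With these example numbers the cap lands around 2.7 million rows, between the current 100k default and the proposed 10 million, which illustrates why a derived limit may be safer than a fixed one.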

> increase hive.mapjoin.maxsize to 10 million
> -------------------------------------------
>
>                 Key: HIVE-1567
>                 URL: https://issues.apache.org/jira/browse/HIVE-1567
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>
> I saw that on a very wide table, Hive can process 1 million rows in less 
> than one minute (selecting all columns).
> Setting hive.mapjoin.maxsize to 100k is too restrictive. Let's increase it 
> to 10 million.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
