[ https://issues.apache.org/jira/browse/HIVE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900397#action_12900397 ]
Ning Zhang commented on HIVE-1567:
----------------------------------

The hive.mapjoin.maxsize parameter is not there for speed; it is there to limit memory consumption. We saw OOM exceptions quite often before this parameter was introduced. Rather than increasing it blindly, a better approach may be to estimate how many rows can fit into memory, based on the row size and the available memory, and adjust this parameter automatically.

> increase hive.mapjoin.maxsize to 10 million
> -------------------------------------------
>
>                 Key: HIVE-1567
>                 URL: https://issues.apache.org/jira/browse/HIVE-1567
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>
> I saw that on a very wide table, Hive can process 1 million rows in less than one minute (select all columns).
> Setting hive.mapjoin.maxsize to 100k is kind of too restrictive. Let's increase it to 10 million.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
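The auto-adjustment idea in the comment above could be sketched roughly as follows. This is a hypothetical illustration, not Hive's actual implementation: the class and method names are made up, and it simply divides a fraction of the JVM's remaining heap by an estimated per-row size to suggest a row limit.

```java
// Hedged sketch (not actual Hive code): estimate how many rows of the
// small table can fit in memory for a map join, based on the JVM's
// remaining heap and an estimated per-row size in bytes.
public class MapJoinRowLimitEstimator {

    /**
     * @param avgRowSizeBytes estimated in-memory size of one row, including
     *                        hash-table overhead (caller-supplied estimate)
     * @param memoryFraction  fraction of the remaining heap to budget for
     *                        the in-memory hash table (e.g. 0.5)
     * @return a suggested value for hive.mapjoin.maxsize (at least 1)
     */
    public static long estimateMaxRows(long avgRowSizeBytes, double memoryFraction) {
        Runtime rt = Runtime.getRuntime();
        // Heap still claimable: configured max minus what is already in use.
        long usedBytes = rt.totalMemory() - rt.freeMemory();
        long availableBytes = rt.maxMemory() - usedBytes;
        long budgetBytes = (long) (availableBytes * memoryFraction);
        return Math.max(1L, budgetBytes / avgRowSizeBytes);
    }

    public static void main(String[] args) {
        // Example: assume 200-byte rows, budget half of the remaining heap.
        long maxRows = estimateMaxRows(200L, 0.5);
        System.out.println("suggested hive.mapjoin.maxsize = " + maxRows);
    }
}
```

In practice the hard part is the per-row size estimate, which would have to come from table statistics or from sampling the serialized row width of the wide table the reporter mentions.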