wan kun created HIVE-18362:
------------------------------
Summary: Introduce a parameter to control the max row number for
map join conversion
Key: HIVE-18362
URL: https://issues.apache.org/jira/browse/HIVE-18362
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: wan kun
Assignee: wan kun
Priority: Minor
ORC files can reach a very high compression ratio in some cases.
The test table has three int columns and twelve million records, but the
compressed file is only 4M. Because the file is so small, Hive automatically
converts the join to a map join, and the map join then runs out of memory.
So I think it is better to have a parameter that limits the total number of
table records allowed for map join conversion: if the table has more records
than this limit, the join is not converted to a map join.
*hive.auto.convert.join.max.number = 2500000L*
The default value for this parameter is 2500000, because that many records
occupy about 700M of memory in the client JVM, and a table with 2500000
records is already large for a map join.
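
The proposed check can be sketched roughly as follows. This is only an
illustration of the idea, not Hive's actual optimizer code: the class name,
method names, and the 25000000-byte size threshold are assumptions made for
the example, and only the 2500000 row limit comes from the proposal above.
The per-row estimate simply divides the reported 700M by 2500000 rows.

```java
// Hypothetical sketch: gate map join conversion on row count as well as file size.
class MapJoinRowLimitSketch {
    // Assumed size threshold for a "small" table, in bytes (illustrative value only).
    static final long MAX_SMALL_TABLE_BYTES = 25_000_000L;
    // Proposed default for hive.auto.convert.join.max.number.
    static final long MAX_MAP_JOIN_ROWS = 2_500_000L;

    /** Convert only if the table passes both the size check and the row-count check. */
    static boolean canConvertToMapJoin(long fileSizeBytes, long rowCount) {
        return fileSizeBytes <= MAX_SMALL_TABLE_BYTES && rowCount <= MAX_MAP_JOIN_ROWS;
    }

    /** Rough heap cost per row implied by a measured heap size and row count. */
    static long approxBytesPerRow(long heapBytes, long rows) {
        return heapBytes / rows;
    }

    public static void main(String[] args) {
        long compressedSize = 4L * 1024 * 1024; // the 4M ORC file from the report
        long rows = 12_000_000L;                // twelve million records

        // The size check alone passes, but the row limit blocks the conversion.
        System.out.println("convert to map join: " + canConvertToMapJoin(compressedSize, rows));

        // 700M for 2500000 rows gives the per-row cost behind the default.
        long perRow = approxBytesPerRow(700L * 1024 * 1024, MAX_MAP_JOIN_ROWS);
        System.out.println("approx bytes per row: " + perRow);
        System.out.println("estimated heap for test table: " + (perRow * rows) + " bytes");
    }
}
```

With a size check alone, the 4M test table would be accepted; adding the row
limit rejects it, which is the behavior the new parameter is meant to provide.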
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)