wan kun created HIVE-18362:
------------------------------

             Summary: Introduce a parameter to control the max row number for 
map join convertion
                 Key: HIVE-18362
                 URL: https://issues.apache.org/jira/browse/HIVE-18362
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: wan kun
            Assignee: wan kun
            Priority: Minor


The compression ratio of the Orc compressed file will be very high in some 
cases.

The test table has three Int columns, with twelve million records, but the 
compressed file size is only 4M. Hive will automatically converts the Join to 
Map join, but this will cause memory overflow. So I think it is better to have 
a parameter to limit to the total number of table records in the Map Join 
convertion, and if the total number of records is larger than that, it can not 
be converted to Map join.

*hive.auto.convert.join.max.number = 2500000L*

The default value for this parameter is 2500000, because so many records occupy 
about 700M memory in clint JVM, and 2500000 records for Map Join are also large 
tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to