Hi,
I have a development cluster running Hive 0.12 on top of Hadoop 2.2.0. I have set the following property values in the .hiverc file, and also when running queries in the Hive command line:
set hive.exec.mode.local.auto=true;
set hive.auto.convert.join=true;
set hive.mapjoin.smalltable.filesize=25000000;
From what I see, Hive is not taking the value of the smalltable.filesize property into account. Even when I set the property to a very small value like 3, Hive still converts the join to a local map join, and the query fails due to memory exhaustion with the following error:
2013-11-24 09:45:15,098 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(323)) - Hive Runtime Error: Map local work exhausted memory
org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2013-11-24 09:45:14 Processing rows: 1000000 Hashtable size: 999999 Memory usage: 1031866880 percentage: 0.968
        at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
        at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
        at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:136)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
        ...
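In case it helps narrow things down: I assume I could work around this by disabling the automatic join conversion entirely for the affected sessions, e.g.:

set hive.auto.convert.join=false;

but that would force a common (shuffle) join for every query, so I would rather understand why the hive.mapjoin.smalltable.filesize threshold seems to be ignored.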
Any ideas about what might cause this error or how to fix this?