Hi,
I'm running into oom issue trying to do a Union all on a bunch of AVRO
files.
The query is something like this:
with gold as ( select * from table1 where local_date=2019-01-01),
delta ss ( select * from table2 where local_date=2019-01-01)
insert overwrite table3 PARTITION ('local_date')
select * from gold
union distinct
select * from delta;
UNION ALL works. The data size is in the low gigabytes and I'm running on 6
16 GB Nodes (I've tried larger and set memory settings higher but that just
postpones the error).
Mappers fail with erros (stacktraces not all the same)
2019-03-11 13:37:22,381 ERROR [main]
org.apache.hadoop.mapred.YarnChild: Error running child :
java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
at org.apache.hadoop.io.Text.set(Text.java:224)
at org.apache.hadoop.io.Text.set(Text.java:214)
at org.apache.hadoop.io.Text.<init>(Text.java:93)
at
org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
at
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:418)
at
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:442)
at
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:428)
at
org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:152)
at
org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:144)
at
org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.copyKey(KeyWrapperFactory.java:121)
at
org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:805)
at
org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:719)
at
org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
at
org.apache.hadoop.hive.ql.exec.UnionOperator.process(UnionOperator.java:148)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
at
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
at
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
at
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
I've tried Hive 2.3.2 and Hive 2.3.4, both tez and mr engines.
I've tried running with more and less mappers, always hitting oom.
I'm running similar query on different (much larger) data without
issues so suspect something with the actual data.
The table schema is this:
c1 string
c2 bigint
c3 array<map<string,string>>
local_date string
I've narrowed it down and (not surprisingly) the 3rd column seems to
be the cause of the issue, If I remove that the union works again just
fine.
Anyone has similar experiences? Perhaps any pointers on how to tackle this?
Kind regards,
Patrick