Venkatesh: Increasing the memory: I've tried even bigger settings; that only made the error appear after roughly twice as much time.
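
For context, "bigger settings" here means the YARN container size and the JVM heap inside it. A minimal sketch of the usual knobs (the values are placeholders, not recommendations; -Xmx is conventionally kept at roughly 80% of the container size):

    -- MapReduce engine
    SET mapreduce.map.memory.mb=8192;
    SET mapreduce.map.java.opts=-Xmx6554m;
    -- Tez engine equivalents
    SET hive.tez.container.size=8192;
    SET hive.tez.java.opts=-Xmx6554m;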
Dev: So I know which table is giving the issue. Following your previous suggestion I did a SELECT DISTINCT * FROM DELTA, which caused the same issue, so I think the DISTINCT is the likely cause. This would make sense, since my original query worked with a UNION ALL. I'll try the simplest query I can reduce it to, with loads of memory, and see if that gets anywhere. Other pointers are much appreciated. Thanks for the help!

On Mon, 11 Mar 2019 at 19:35, Devopam Mittra <devo...@gmail.com> wrote:

> hi Patrick,
> If it sounds worth trying, please do the same:
>
> 1. Create a physical table from table 1 (with the filter clause).
> 2. Create a physical table from table 2 (with the filter clause).
> 3. Create interim table 2_1 with the DISTINCT clause.
> 4. Create interim table 2_2 with the UNION clause.
> 5. Do an INSERT OVERWRITE into the target table from 2_2 above.
>
> This should help with isolating the error step.
> Yes, increasing # mappers or memory helps - but I am usually averse to
> this till I have exhausted all other options, to be sure.
>
> regards
> Dev
>
>
> On Mon, Mar 11, 2019 at 9:21 PM Patrick Duin <patd...@gmail.com> wrote:
>
>> Very good question. Yes, that does give the same problem.
>>
>> On Mon, 11 Mar 2019 at 16:28, Devopam Mittra <devo...@gmail.com> wrote:
>>
>>> Can you please try doing SELECT DISTINCT * FROM DELTA into a physical
>>> table first?
>>> regards
>>> Dev
>>>
>>>
>>> On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin <patd...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm running into an OOM issue trying to do a union of a bunch of Avro
>>>> files.
>>>>
>>>> The query is something like this:
>>>>
>>>> with gold as (select * from table1 where local_date='2019-01-01'),
>>>> delta as (select * from table2 where local_date='2019-01-01')
>>>> insert overwrite table table3 partition (local_date)
>>>> select * from gold
>>>> union distinct
>>>> select * from delta;
>>>>
>>>> UNION ALL works. The data size is in the low gigabytes and I'm running
>>>> on six 16 GB nodes (I've tried larger and set memory settings higher,
>>>> but that just postpones the error).
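
Devopam's numbered isolation steps above can be read as roughly the following HiveQL. This is a sketch only: the stage_* table names are hypothetical, and each step is materialized separately so that the first statement to fail identifies the problem operator.

    CREATE TABLE stage_gold  AS SELECT * FROM table1 WHERE local_date = '2019-01-01';  -- step 1
    CREATE TABLE stage_delta AS SELECT * FROM table2 WHERE local_date = '2019-01-01';  -- step 2
    CREATE TABLE stage_2_1   AS SELECT DISTINCT * FROM stage_delta;                    -- step 3
    CREATE TABLE stage_2_2   AS                                                        -- step 4
      SELECT * FROM (
        SELECT * FROM stage_gold
        UNION
        SELECT * FROM stage_2_1
      ) u;
    SET hive.exec.dynamic.partition.mode=nonstrict;  -- allow the dynamic partition insert
    INSERT OVERWRITE TABLE table3 PARTITION (local_date)                               -- step 5
    SELECT * FROM stage_2_2;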
>>>>
>>>> Mappers fail with errors (the stack traces are not all the same):
>>>>
>>>> 2019-03-11 13:37:22,381 ERROR [main] org.apache.hadoop.mapred.YarnChild:
>>>> Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>   at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
>>>>   at org.apache.hadoop.io.Text.set(Text.java:224)
>>>>   at org.apache.hadoop.io.Text.set(Text.java:214)
>>>>   at org.apache.hadoop.io.Text.<init>(Text.java:93)
>>>>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
>>>>   at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:418)
>>>>   at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:442)
>>>>   at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:428)
>>>>   at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:152)
>>>>   at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.deepCopyElements(KeyWrapperFactory.java:144)
>>>>   at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.copyKey(KeyWrapperFactory.java:121)
>>>>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:805)
>>>>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:719)
>>>>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
>>>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>>>>   at org.apache.hadoop.hive.ql.exec.UnionOperator.process(UnionOperator.java:148)
>>>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>>>>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>>>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>>>>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>>>>   at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
>>>>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
>>>>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>>>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>>>>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
>>>>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
>>>>
>>>> I've tried Hive 2.3.2 and Hive 2.3.4, with both the Tez and MR engines.
>>>>
>>>> I've tried running with more and with fewer mappers, always hitting OOM.
>>>>
>>>> I'm running a similar query on different (much larger) data without
>>>> issues, so I suspect something with the actual data.
>>>>
>>>> The table schema is this:
>>>> c1 string
>>>> c2 bigint
>>>> c3 array<map<string,string>>
>>>> local_date string
>>>>
>>>> I've narrowed it down and (not surprisingly) the 3rd column seems to be
>>>> the cause of the issue; if I remove it, the union works just fine.
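
One note on the trace above: it dies inside GroupByOperator.processHashAggr while deep-copying keys, i.e. in the map-side hash table Hive builds to implement the DISTINCT, and the array<map<string,string>> keys are what fill the heap. Shrinking or disabling that hash table trades speed for memory and may be worth trying (these are real Hive settings; the value below is illustrative):

    -- Cap the map-side aggregation hash table at a smaller share of the heap
    SET hive.map.aggr.hash.percentmemory=0.25;  -- default 0.5
    -- Or skip map-side aggregation entirely and let the reducers deduplicate
    SET hive.map.aggr=false;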
>>>>
>>>> Has anyone had similar experiences? Any pointers on how to tackle this?
>>>>
>>>> Kind regards,
>>>>
>>>> Patrick
>>>>
>>>>
>>>
>>> --
>>> Devopam Mittra
>>> Life and Relations are not binary
>>
>
> --
> Devopam Mittra
> Life and Relations are not binary
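
If the thread's diagnosis is right and the complex third column is what breaks the DISTINCT, one possible rewrite is to union with UNION ALL and deduplicate on the scalar columns with a window function, so the array<map<string,string>> never becomes a grouping key. This is a sketch only, and it assumes (c1, c2) uniquely determines c3 within a date, which the thread does not confirm:

    INSERT OVERWRITE TABLE table3 PARTITION (local_date)
    SELECT c1, c2, c3, local_date
    FROM (
      SELECT c1, c2, c3, local_date,
             row_number() OVER (PARTITION BY c1, c2, local_date ORDER BY c1) AS rn
      FROM (
        SELECT * FROM table1 WHERE local_date = '2019-01-01'
        UNION ALL
        SELECT * FROM table2 WHERE local_date = '2019-01-01'
      ) unioned
    ) deduped
    WHERE rn = 1;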