Hi all,

Still having trouble with OOM / "GC overhead limit exceeded" errors on queries, and it happens very frequently. The tables being joined aren't terribly large (~6M rows, ~12 columns), and we've upped HADOOP_HEAPSIZE and several of the other heap-related settings to 12GB. A sample of one of the more recent errors is below. The error seems to occur any time I join two or more tables with several million rows each, and it occurs both in the Hue Hive shell and via the Hive CLI. Is there anything else I should be checking? Any other info I can provide to help isolate the issue?

Running Hive 0.11


2014-02-05 17:29:44,603 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:84)
        at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.<init>(HiveDecimalWritable.java:51)
        at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveDecimalObjectInspector.copyObject(WritableHiveDecimalObjectInspector.java:44)
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:260)
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:296)
        at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:105)
        at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1791)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at java.util.HashMap.readObject(HashMap.java:1030)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.initilizePersistentHash(HashMapWrapper.java:128)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:194)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1377)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:611)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
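
Since the trace dies in MapJoinOperator.loadHashTable, the next things we plan to try are below. This is just a sketch of standard Hive/MR1 settings (the values are guesses for our data sizes, not a confirmed fix):

-- Simplest test: disable automatic map-join conversion and run a plain
-- reduce-side join instead
SET hive.auto.convert.join=false;

-- Or keep the map join but give the map-task JVMs more heap (MR1 setting)
SET mapred.child.java.opts=-Xmx4096m;

-- And let the local hash-table build use more of its heap before bailing out
SET hive.mapjoin.localtask.max.memory.usage=0.95;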


From: Martin, Nick [mailto:nimar...@pssd.com]
Sent: Tuesday, October 08, 2013 3:21 PM
To: user@hive.apache.org
Subject: RE: Execution failed with exit status: 3

Thanks for the suggestion. We have the user permissions configured appropriately; we're able to run other Hive queries that kick off MR jobs and complete them successfully.

MR1

HWX suggested it might have something to do with hive.auto.convert.join=true and hive.mapjoin.smalltable.filesize=25MB. The smaller table in the join is 250MB, so we're going to experiment with those settings and see if that fixes anything; the variants we plan to try are sketched below. I'll update this thread after we test, in case anyone else hits this issue down the road.
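
Concretely, the two variants we'll test (first-guess values, not verified yet):

-- Variant 1: turn off automatic map-join conversion so the join runs
-- reduce-side regardless of table size
SET hive.auto.convert.join=false;

-- Variant 2: raise the "small table" threshold above our 250MB table so the
-- map-join conversion is attempted deliberately (only sensible if the task
-- heap can actually hold the hash table)
SET hive.mapjoin.smalltable.filesize=300000000;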


From: Sanjay Subramanian [mailto:sanjay.subraman...@wizecommerce.com]
Sent: Tuesday, October 08, 2013 2:53 PM
To: user@hive.apache.org
Subject: Re: Execution failed with exit status: 3

Hi

Are you running this through Beeswax in Hue? If I recall correctly, you might need to give the "hue" user access to submit and run MR jobs on the cluster.
Also, are you using YARN or MR1?

Thanks
Regards

sanjay


From: Martin, Nick <nimar...@pssd.com>
Reply-To: user@hive.apache.org
Date: Tuesday, October 8, 2013 10:27 AM
To: user@hive.apache.org
Subject: RE: Execution failed with exit status: 3

Update on this...

When I run this in the Hive CLI it works perfectly; the failure only shows up in Hue. I'll send this thread over to hue_user@ and see what they say.

From: Martin, Nick [mailto:nimar...@pssd.com]
Sent: Tuesday, October 08, 2013 12:17 PM
To: user@hive.apache.org
Subject: RE: Execution failed with exit status: 3

Hi Sanjay, thanks for the suggestion.

There are no partitions on either table.

From: Sanjay Subramanian [mailto:sanjay.subraman...@wizecommerce.com]
Sent: Monday, October 07, 2013 8:19 PM
To: user@hive.apache.org
Subject: Re: Execution failed with exit status: 3

Hi Nick

How many partitions are there in tables t1 and t2? If there are many partitions in either or both, can you modify your query as follows and see if the error still comes up?


SELECT
    T1.somecolumn,
    T2.someothercolumn
FROM
    (SELECT * FROM t1 WHERE partition_column1='<some_val>') T1
JOIN
    (SELECT * FROM t2 WHERE partition_column2='<some_val>') T2
ON (T1.idfield = T2.idfield)


Thanks
Regards

sanjay

 email : sanjay.subraman...@wizecommerce.com
   Irc : sanjaysub (channel #noc)
 skype : sanjaysubramanian
mobile : (925) 399 2692

From: Martin, Nick <nimar...@pssd.com>
Reply-To: user@hive.apache.org
Date: Monday, October 7, 2013 5:13 PM
To: user@hive.apache.org
Subject: Execution failed with exit status: 3

Hi all,

I'm doing a very basic join in Hive and getting the error below. The HiveQL 
join syntax I'm using is:


SELECT
    T1.somecolumn,
    T2.someothercolumn
FROM t1
JOIN t2
ON (t1.idfield = t2.idfield)



Driver returned: 3.  Errors: OK
Total MapReduce jobs = 1
setting HADOOP_USER_NAME  someuser
Execution failed with exit status: 3
Obtaining error information

Task failed!
Task ID:
  Stage-4

Logs:

FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.MapredLocalTask

I searched through the logs and couldn't find anything terribly useful, although perhaps I'm missing something. Is this a common error I'm just now coming across?
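
One thing I'm going to try while I dig (assuming I'm reading the Hive docs right about map joins): run EXPLAIN on the query to see whether Hive plans a local hash-table build, since that's the step MapredLocalTask runs.

-- Inspect the plan for the same join as above; a "Map Reduce Local Work"
-- stage would mean Hive builds the map-join hash table in a local task
-- before launching the MR job
EXPLAIN
SELECT
    T1.somecolumn,
    T2.someothercolumn
FROM t1
JOIN t2
ON (t1.idfield = t2.idfield);

If that stage shows up, the failing Stage-4 is presumably that local hash-table build.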

On Hive 0.11

Thanks!
Nick
