Hi Sarath,
I had a similar stack trace (see below). The problem was intermittent and only
happened when I was running on large data sets (100s of GB). I resolved the
problem by changing a GROUP BY key from a tuple of (chararray, chararray, long)
to a single chararray. To do that I made the group key an ugly concatenation:
...
CONCAT(clusters::cluster_id,
    CONCAT(' ',
        CONCAT(tfidf::att,
            CONCAT(' ', (chararray) clusters::block_size)))) as group_key,
...
I'm not sure if that's relevant for you but it solved my problem.
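In case it helps, here is a minimal sketch of that workaround in context. The relation and field names below (joined, cluster_id, att, block_size) are hypothetical stand-ins, not from my actual script; the point is just to show grouping on one synthesized chararray key instead of a composite tuple key:

```pig
-- Hypothetical relations/fields, for illustration only.
-- Instead of grouping on a composite key:
--   grouped = GROUP joined BY (cluster_id, att, block_size);
-- build a single chararray key first, so the shuffle key is a plain string:
keyed = FOREACH joined GENERATE
            CONCAT(cluster_id,
                CONCAT(' ',
                    CONCAT(att,
                        CONCAT(' ', (chararray) block_size)))) AS group_key,
            *;
grouped = GROUP keyed BY group_key;
```

The nested CONCATs are because CONCAT takes two arguments; the space separator assumes the fields themselves contain no spaces.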
My stack trace:
2015-12-09 16:18:49,511 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.rangeCheck(ArrayList.java:638)
    at java.util.ArrayList.get(ArrayList.java:414)
    at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:118)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getValueTuple(POPackage.java:348)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:269)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:421)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:412)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:256)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
William F Dowling
Senior Technologist
Thomson Reuters
-----Original Message-----
From: Sarath Sasidharan [mailto:[email protected]]
Sent: Friday, May 06, 2016 9:13 AM
To: [email protected]
Subject: Re: Array Out Of Bounds Exception in Nested foreach
Hi Guys,
Has anyone faced a similar issue?
Any help would be really appreciated.
Regards,
Sarath
On 06/05/16 09:45, "Sarath Sasidharan" <[email protected]> wrote:
>Hi All,
>
>I was executing the following Pig code when I encountered an
>IndexOutOfBoundsException inside a nested foreach expression.
>
>This is the snippet which I used :
>
>
>
>generateHistory = foreach (group filterFinalData by (PrimaryKey)) {
>    offersInDesc = order filterFinalData by PubTime desc;
>    latestOffer  = limit offersInDesc 1;
>    generate flatten(latestOffer);
>}
>
>
>
>
>This is the stack trace which I had in the job tracker logs :
>
>
> org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>     at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>     at java.util.ArrayList.get(ArrayList.java:429)
>     at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:117)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.getValueTuple(Packager.java:234)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:431)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:415)
>     at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:151)
>     at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:137)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.attachInput(Packager.java:125)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:290)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:431)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:422)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:269)
>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>
>
>
>Could someone help me with this? How can I overcome this issue?
>
>
>Thanks and Regards,
>
>
>Sarath