Bah, I forgot to paste the Pig script like an idiot. Here it is:

 -- load only the row keys in the timestamp window
 table1 = LOAD 'hbase://table1'
     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
         '',
         '-loadKey -noWAL=true -minTimestamp=1451624400000 -maxTimestamp=1454302800000')
     AS (uid:chararray);

 -- same window, plus a regex filter on the row key
 table2 = LOAD 'hbase://table2'
     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
         '',
         '-loadKey -noWAL=true -regex=\\\\|ago=156\\\\| -minTimestamp=1451624400000 -maxTimestamp=1454302800000')
     AS (uid:chararray);

 -- merge join: HBase scans return rows sorted by row key, so both inputs
 -- satisfy the merge join's sorted-input requirement
 user_segment_with_event = JOIN table1 BY uid, table2 BY uid USING 'merge';

 -- fails with TableSplitComparable cannot be cast to TableSplit
 -- http://ip-10-0-1-180.ec2.internal:19888/jobhistory/logs/ip-10-0-1-14.ec2.internal:45454/container_e10_1457365475473_0248_01_000029/attempt_1457365475473_0248_r_000000_0/hadoop/syslog/?start=0
 ones = FOREACH user_segment_with_event GENERATE (int) 1 AS one:int;
 c = GROUP ones ALL;
 c = FOREACH c GENERATE COUNT(ones);
 DUMP c;

William Watson
Lead Software Engineer

On Thu, Mar 10, 2016 at 11:11 AM, Billy Watson <williamrwat...@gmail.com>
wrote:

> Thanks to a bug fix put in by a colleague of mine, merge joins work for
> tables loaded into Pig via HBaseStorage. In our test environment, and in
> Pig's own test environment, I'm able to run all sorts of fairly complex
> data merges without issue.
>
> However, when I use that same code on larger data sets in a production
> environment, the merge join fails. If I run it on the same exact tables on
> the same cluster after trimming the data down to just a few rows, the merge
> join works fine.
>
> Here is the most basic version of the Pig script I've been able to get it
> down to. I've been taking out pieces and parts trying to narrow it down,
> but it still fails:
>
>
>
> If I change the count portion to a LIMIT 5 or something, I'm able to dump
> the relation.
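> For reference, a sketch of that working variant (the intermediate relation
> name is just for illustration):
>
>  limited = LIMIT user_segment_with_event 5;
>  DUMP limited;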
>
> The merge join finishes all of its mappers, but when it gets to the reduce
> step and starts doing a sort (don't ask me why it's even doing a sort on
> pre-sorted data), it throws the following error:
>
> 2016-03-09 19:36:01,738 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: Error while doing final merge
>       at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:160)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.hbase.TableSplitComparable cannot be cast to org.apache.hadoop.hbase.mapreduce.TableSplit
>       at org.apache.pig.backend.hadoop.hbase.TableSplitComparable.compareTo(TableSplitComparable.java:26)
>       at org.apache.pig.data.DataType.compare(DataType.java:566)
>       at org.apache.pig.data.DataType.compare(DataType.java:464)
>       at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareDatum(BinInterSedes.java:1106)
>       at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:1082)
>       at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:787)
>       at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:728)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTupleSortComparator.compare(PigTupleSortComparator.java:100)
>       at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:587)
>       at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:128)
>       at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:55)
>       at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:678)
>       at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:596)
>       at org.apache.hadoop.mapred.Merger.merge(Merger.java:131)
>       at org.apache.hadoop.mapred.Merger.merge(Merger.java:115)
>       at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:722)
>       at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.close(MergeManagerImpl.java:370)
>       at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:158)
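>
> (Reading the trace: Pig's raw tuple comparator ends up in DataType.compare
> with two TableSplitComparable values, and compareTo apparently casts its
> argument straight to TableSplit, hence the ClassCastException. So it looks
> like the split comparison itself, not our data, is what blows up,
> presumably while the shuffle sorts the index tuples the merge join builds
> over the right-hand relation.)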
>
>
>
> If I switch the order of the two relations in the merge join, I get a
> different error that looks more promising, but I still don't know what to
> do about it. Swapped, the join reads:
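>
>  user_segment_with_event = JOIN table2 BY uid, table1 BY uid USING 'merge';
>
> and the error becomes: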
>
> 2016-03-09 19:55:24,789 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: c: Local Rearrange[tuple]{chararray}(false) - scope-334 Operator Key: scope-334): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while executing ForEach at [c[62,4]]
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:316)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:291)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:279)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while executing ForEach at [c[62,4]]
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>       ... 12 more
> Caused by: java.lang.NullPointerException
>       at org.apache.pig.impl.builtin.DefaultIndexableLoader.seekNear(DefaultIndexableLoader.java:190)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.seekInRightStream(POMergeJoin.java:542)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNextTuple(POMergeJoin.java:299)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPreCombinerLocalRearrange.getNextTuple(POPreCombinerLocalRearrange.java:126)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:252)
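>
> (The NPE is inside DefaultIndexableLoader.seekNear, i.e. while the left
> side is seeking into the right-side index, which suggests the merge-join
> index for the right relation comes back null or incomplete at runtime;
> presumably the same split-comparison problem, just surfacing at index
> lookup time instead of during the sort.)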
>
>
> Again, I've tried replicating the exact scenario (and more complicated
> ones) in local environments and I can't get it to fail. I think it's
> related to YARN/MapReduce, but I can't figure out why that would matter or
> what it's really doing.
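>
> In the meantime, the obvious fallback is to drop the 'merge' hint and let
> Pig do its default hash (shuffle) join, which avoids the merge-join index
> and seek machinery entirely; an untested sketch against these tables:
>
>  user_segment_with_event = JOIN table1 BY uid, table2 BY uid;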
>
> I'm trying to set up the e2e (end-to-end) tests in the Pig repo, but I'm
> not having any luck there, either. If I can't reproduce the failure in a
> test, I'm afraid I won't be able to fix the bug.
>
> Can anyone point me in the right direction on next debugging steps, or on
> what might be wrong?
>
>
> William Watson
> Lead Software Engineer
>
