[ https://issues.apache.org/jira/browse/PIG-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4722: ------------------------------------ Description: DefaultSorter in Tez calls Combiner from two different threads - during spill in SpillThread and flush in the main thread. If both run the combiner, one ends up with NPE as Reporter is set on only one thread in initialization. {code} java.lang.NullPointerException at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:366) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:332) at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:128) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:175) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:50) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) at org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191) at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115) at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:279) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:854) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:780) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$SpillThread.run(DefaultSorter.java:708) Caused by: java.lang.NullPointerException at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:303) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNextTuple(POProject.java:403) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:361) ... 12 more {code} was: DefaultSorter in Tez calls Combiner from two different threads - during spill in SpillThread and flush in the main thread. If both run the combiner, one ends up with NPE as Reporter is set on only one thread in initialization. {code} java.lang.NullPointerException at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:366) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:332) at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:128) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:175) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:50) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) at org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191) at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115) at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:279) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:854) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:780) at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$SpillThread.run(DefaultSorter.java:708) {code} > [Pig on Tez] NPE while running Combiner > --------------------------------------- > > Key: PIG-4722 > URL: https://issues.apache.org/jira/browse/PIG-4722 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > > DefaultSorter in Tez calls Combiner from two different threads - during spill > in SpillThread and flush in the main thread. If both run the combiner, one > ends up with NPE as Reporter is set on only one thread in initialization. > {code} > java.lang.NullPointerException > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:366) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:332) > at > org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:128) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:197) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:175) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:50) > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) > at > org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191) > at > org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115) > at > org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:279) > at > org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:854) > at > org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:780) > at > org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$SpillThread.run(DefaultSorter.java:708) > Caused by: java.lang.NullPointerException > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:303) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNextTuple(POProject.java:403) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:361) > ... 12 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)