This has been fixed in https://issues.apache.org/jira/browse/PIG-4635. If you cannot run with the latest code, you can try running with the setting

set pig.tez.grace.parallelism false;

Pig will still try to auto-determine parallelism with Tez, but it will use a different algorithm which is not as dynamic.

Regards,
Rohini

On Tue, Feb 23, 2016 at 3:18 PM, Jan Morlock <[email protected]> wrote:

> Hi,
>
> removing the statement
>
> set default_parallel 24;
>
> and all other information about parallelism from my script causes Pig
> on Tez (0.8.2) to fail with the following stack trace:
>
> org.apache.tez.dag.api.TezException: Vertex failed,
> vertexName=scope-5824, vertexId=vertex_1456239615940_0236_1_61,
> diagnostics=[Vertex vertex_1456239615940_0236_1_61 [scope-5824]
> killed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager,
> vertex:vertex_1456239615940_0236_1_61 [scope-5824],
> org.apache.tez.dag.api.TezUncheckedException:
> org.apache.pig.impl.plan.VisitorException: ERROR 0:
> java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigGraceShuffleVertexManager.onVertexStateUpdated(PigGraceShuffleVertexManager.java:162)
>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventOnVertexStateUpdate.invoke(VertexManager.java:564)
>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:647)
>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:642)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:642)
>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:631)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0:
> java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter.visitTezOp(ParallelismSetter.java:201)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:246)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:53)
>     at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>     at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigGraceShuffleVertexManager.onVertexStateUpdated(PigGraceShuffleVertexManager.java:159)
>     ... 12 more
> Caused by: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.TezOperDependencyParallelismEstimator.estimateParallelism(TezOperDependencyParallelismEstimator.java:114)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter.visitTezOp(ParallelismSetter.java:138)
>     ... 17 more
> ]
>
> My primary intention behind removing that statement was that I thought
> it would be best to leave the steering of parallelism up to Tez.
> Is this basic assumption correct? And if yes, how can I avoid getting
> the exception shown above?
>
> Thank you very much in advance.
> With best regards
> Jan
