[ https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nandor Kollar updated PIG-5320: ------------------------------- Attachment: PIG-5320_2.patch > TestCubeOperator#testRollupBasic is flaky on Spark 2.2 > ------------------------------------------------------ > > Key: PIG-5320 > URL: https://issues.apache.org/jira/browse/PIG-5320 > Project: Pig > Issue Type: Bug > Components: spark > Reporter: Nandor Kollar > Assignee: Nandor Kollar > Attachments: PIG-5320_1.patch, PIG-5320_2.patch > > > TestCubeOperator#testRollupBasic occasionally fails with > {code} > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias c > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) > at org.apache.pig.PigServer.registerScript(PigServer.java:781) > at org.apache.pig.PigServer.registerScript(PigServer.java:858) > at org.apache.pig.PigServer.registerScript(PigServer.java:821) > at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972) > at > org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1475) > at > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460) > at org.apache.pig.PigServer.execute(PigServer.java:1449) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {code} > I think the problem is that in JobStatisticCollector#waitForJobToEnd > {{sparkListener.wait()}} is not inside a loop, like suggested in wait's > javadoc: > {code} > * As in the one argument version, interrupts and spurious wakeups are > * possible, and this method should always be used in a loop: > {code} > Thus due to a spurious wakeup, the wait might pass without a notify getting > called. -- This message was sent by Atlassian JIRA (v6.4.14#64029)