[ https://issues.apache.org/jira/browse/SPARK-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183812#comment-15183812 ]
Shixiong Zhu commented on SPARK-10548:
--------------------------------------

[~nicerobot] The issue was reintroduced by https://github.com/apache/spark/pull/9264, which calls Await.ready in "runJob" to wait for results. "par" uses a ForkJoin thread pool by default, and a ForkJoin pool will try to run a new task on the blocked thread when Await.ready is called. In that case, the new task sees another task's "spark.sql.execution.id". For now, just don't use a ForkJoin thread pool to launch Spark jobs until a fix is out.

> Concurrent execution in SQL does not work
> -----------------------------------------
>
>                 Key: SPARK-10548
>                 URL: https://issues.apache.org/jira/browse/SPARK-10548
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Andrew Or
>            Assignee: Andrew Or
>            Priority: Blocker
>             Fix For: 1.5.1, 1.6.0
>
>
> From the mailing list:
> {code}
> future { df1.count() }
> future { df2.count() }
>
> java.lang.IllegalArgumentException: spark.sql.execution.id is already set
>         at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
>         at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:1904)
>         at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1385)
> {code}
> === edit ===
> Simple reproduction:
> {code}
> (1 to 100).par.foreach { _ =>
>   sc.parallelize(1 to 5).map { i => (i, i) }.toDF("a", "b").count()
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
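A minimal sketch of the suggested workaround, assuming a live SparkContext `sc` and the sqlContext implicits as in the reproduction above: instead of `.par` (which runs on the ForkJoin-backed default pool), submit the jobs as Futures on a plain fixed-size thread pool. A blocked thread in a plain ThreadPoolExecutor does not steal other tasks, so each job keeps its own "spark.sql.execution.id".

{code}
// Workaround sketch (assumes `sc` and sqlContext implicits are in scope).
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// A dedicated fixed thread pool: when Await.ready blocks one of these
// threads, the pool does NOT run another task on it, unlike ForkJoinPool.
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

val jobs = (1 to 100).map { _ =>
  Future {
    sc.parallelize(1 to 5).map { i => (i, i) }.toDF("a", "b").count()
  }
}
jobs.foreach(Await.result(_, Duration.Inf))
{code}

The pool size (8 here) is an arbitrary illustration; any non-work-stealing ExecutionContext avoids the execution-id leak described in the comment.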