[ https://issues.apache.org/jira/browse/SPARK-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183812#comment-15183812 ]

Shixiong Zhu commented on SPARK-10548:
--------------------------------------

[~nicerobot] The issue was reintroduced by 
https://github.com/apache/spark/pull/9264

It calls Await.ready in "runJob" to wait for results. "par" uses a ForkJoin
thread pool by default, and while Await.ready is blocking, a ForkJoin pool may
run another queued task on the same thread. In that case, the new task sees the
other task's "spark.sql.execution.id".

For now, just don't use a ForkJoin thread pool to launch Spark jobs until a fix
is out.
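Along those lines, concurrent jobs can be launched from plain Futures on a fixed-size thread pool instead of the ForkJoin-backed global pool. This is only a sketch of the idea: the squaring computation is a stand-in for a real `df.count()`, and the pool size of 4 is arbitrary. A thread blocked in Await on such a pool simply blocks, so no other job's task can run on it and see its "spark.sql.execution.id".

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// A plain fixed thread pool: blocking a worker never causes it to
// pick up and run another queued task (unlike a ForkJoinPool).
val pool = Executors.newFixedThreadPool(4)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

val jobs = (1 to 10).map { i =>
  Future { i * i } // stand-in for a Spark job such as df.count()
}
val results = jobs.map(f => Await.result(f, Duration.Inf))
pool.shutdown()
```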

> Concurrent execution in SQL does not work
> -----------------------------------------
>
>                 Key: SPARK-10548
>                 URL: https://issues.apache.org/jira/browse/SPARK-10548
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Andrew Or
>            Assignee: Andrew Or
>            Priority: Blocker
>             Fix For: 1.5.1, 1.6.0
>
>
> From the mailing list:
> {code}
> future { df1.count() } 
> future { df2.count() } 
> java.lang.IllegalArgumentException: spark.sql.execution.id is already set 
>         at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
>  
>         at 
> org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:1904) 
>         at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1385) 
> {code}
> === edit ===
> Simple reproduction:
> {code}
> (1 to 100).par.foreach { _ =>
>   sc.parallelize(1 to 5).map { i => (i, i) }.toDF("a", "b").count()
> }
> {code}
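If parallel collections must be used, the collection's task support can be swapped away from the default ForkJoin-based one. This is a hedged sketch against the `scala.collection.parallel` API (how `ExecutionContextTaskSupport` treats a non-ForkJoin executor differs across Scala versions, and on Scala 2.13+ `.par` additionally needs `scala.collection.parallel.CollectionConverters._`); the counter in the loop body is a placeholder for the Spark job above.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.parallel.ExecutionContextTaskSupport
import scala.concurrent.ExecutionContext

// Back the parallel collection with a plain executor so its tasks
// do not run on ForkJoin worker threads.
val pool = Executors.newFixedThreadPool(8)
val range = (1 to 100).par
range.tasksupport = new ExecutionContextTaskSupport(
  ExecutionContext.fromExecutorService(pool))

val count = new AtomicInteger(0)
range.foreach { _ =>
  count.incrementAndGet() // placeholder for toDF("a", "b").count()
}
pool.shutdown()
```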



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
