[ 
https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001356#comment-16001356
 ] 

Saif Addin commented on SPARK-13747:
------------------------------------

Sorry for the confusion. No, it doesnt work. I am currently trying out with 
using different execution contexts.

My issue happens always, it is 100% reproducible.

To simplify what I am doing:

1. akka-http server is started and REST DSL is setup
2. Inside a get dsl, I call a Spark dataframe which calls collect action from 
within a Future
3. the object containing the future calls await.result, as I need the dataframe 
to respond a 200 to http
4. the collect method is passed through as an annonymous function. runtime 
exception poinst at such annonymous function as the callback starter of my 
exception

When my website starts, 4 collects are called simoultaneously. Only one get 
call returns 200. The others are internal server errors.

> Concurrent execution in SQL doesn't work with Scala ForkJoinPool
> ----------------------------------------------------------------
>
>                 Key: SPARK-13747
>                 URL: https://issues.apache.org/jira/browse/SPARK-13747
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1
>            Reporter: Shixiong Zhu
>            Assignee: Shixiong Zhu
>
> Run the following codes may fail
> {code}
> (1 to 100).par.foreach { _ =>
>   println(sc.parallelize(1 to 5).map { i => (i, i) }.toDF("a", "b").count())
> }
> java.lang.IllegalArgumentException: spark.sql.execution.id is already set 
>         at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
>  
>         at 
> org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:1904) 
>         at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1385) 
> {code}
> This is because SparkContext.runJob can be suspended when using a 
> ForkJoinPool (e.g.,scala.concurrent.ExecutionContext.Implicits.global) as it 
> calls Await.ready (introduced by https://github.com/apache/spark/pull/9264).
> So when SparkContext.runJob is suspended, ForkJoinPool will run another task 
> in the same thread, however, the local properties has been polluted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to