[jira] [Commented] (SPARK-24005) Remove usage of Scala’s parallel collection

2018-07-29 Thread Apache Spark (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-24005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561221#comment-16561221 ]

Apache Spark commented on SPARK-24005:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/21913

> Remove usage of Scala’s parallel collection
> ---
>
> Key: SPARK-24005
> URL: https://issues.apache.org/jira/browse/SPARK-24005
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 2.3.0
>Reporter: Xiao Li
>Priority: Major
>  Labels: starter
>
> {noformat}
> val par = (1 to 100).par.flatMap { i =>
>   Thread.sleep(1000)
>   1 to 1000
> }.toSeq
> {noformat}
> We are unable to interrupt the execution of Scala parallel collections. We 
> need a common utility function that supports interruption, instead of using 
> Scala parallel collections directly.
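
As a rough sketch of the kind of common utility the description calls for, one option is to run the work on an explicit thread pool and interrupt that pool when a task fails or the caller is interrupted. The object ParallelUtils and method parmap below are hypothetical names, not existing Spark APIs; this is only an illustration of the idea, not the change proposed in the linked PR.
{code:scala}
import java.util.concurrent.{Executors, TimeUnit}

import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}

object ParallelUtils {
  // Hypothetical helper: applies f to every element of in using numThreads
  // worker threads. Because the pool is owned by the helper, shutdownNow()
  // in the finally block interrupts work that is still running (e.g. blocked
  // in Thread.sleep), which plain .par collections do not allow.
  def parmap[I, O](in: Seq[I], numThreads: Int)(f: I => O): Seq[O] = {
    val pool = Executors.newFixedThreadPool(numThreads)
    implicit val ec: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(pool)
    try {
      val futures = in.map(i => Future(f(i)))
      futures.map(Await.result(_, Duration.Inf))
    } finally {
      pool.shutdownNow()
      pool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }
}

// Usage, mirroring the snippet above:
// val res = ParallelUtils.parmap(1 to 100, numThreads = 8) { i =>
//   Thread.sleep(1000)
//   1 to 1000
// }.flatten
{code}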






[jira] [Commented] (SPARK-24005) Remove usage of Scala’s parallel collection

2018-06-12 Thread Maxim Gekk (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-24005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509667#comment-16509667 ]

Maxim Gekk commented on SPARK-24005:


[~smilegator] I am trying to reproduce the issue, but so far I have not been 
lucky. The following test passes successfully:
{code:scala}
  test("canceling of parallel collections") {
val conf = new SparkConf()
sc = new SparkContext("local[1]", "par col", conf)

val f = sc.parallelize(0 to 1, 1).map { i =>
  val par = (1 to 100).par
  val pool = ThreadUtils.newForkJoinPool("test pool", 2)
  par.tasksupport = new ForkJoinTaskSupport(pool)
  try {
par.flatMap { j =>
  Thread.sleep(1000)
  1 to 100
}.seq
  } finally {
pool.shutdown()
  }
}.takeAsync(100)

val sem = new Semaphore(0)
sc.addSparkListener(new SparkListener {
  override def onTaskStart(taskStart: SparkListenerTaskStart) {
sem.release()
  }
})

// Wait until some tasks were launched before we cancel the job.
sem.acquire()
// Wait until a task executes parallel collection.
Thread.sleep(1)
f.cancel()

val e = intercept[SparkException] { f.get() }.getCause
assert(e.getMessage.contains("cancelled") || 
e.getMessage.contains("killed"))
  }
{code}
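
Independently of whether the kill is observed by the ForkJoinPool workers, one way to make the body of a parallel collection cooperatively cancellable is to capture the TaskContext on the Spark task thread and poll it inside the closure. The snippet below is only a sketch of that idea, reusing the names from the test above; it is not part of the test:
{code:scala}
// Sketch only: capture the TaskContext on the Spark task thread, because
// TaskContext.get() returns null on the ForkJoinPool worker threads that
// execute the parallel collection.
val context = org.apache.spark.TaskContext.get()
par.flatMap { j =>
  // Cooperative check: fail fast if the Spark task was killed.
  if (context != null && context.isInterrupted()) {
    throw new InterruptedException("Spark task was killed")
  }
  Thread.sleep(1000)
  1 to 100
}.seq
{code}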







[jira] [Commented] (SPARK-24005) Remove usage of Scala’s parallel collection

2018-04-19 Thread Xiao Li (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-24005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444930#comment-16444930 ]

Xiao Li commented on SPARK-24005:
-

cc [~maxgekk]



