[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569202#comment-16569202 ]

Jiang Xingbo commented on SPARK-24375:
--------------------------------------

[~mridulm80] You are right that we currently cannot identify which barrier it 
is until the barrier() function is actually executed. We've thought hard about 
the issue and don't see a way to do it unless we force users to explicitly pass 
a number to each barrier() call (which is actually not a good idea, because it 
adds more burden to maintaining the code).

The current decision is that we don't distinguish between barrier() calls from 
the same task; users are responsible for ensuring that the same number of 
barrier() calls happens in all possible code branches, otherwise the job may 
hang, or fail with a SparkException after a timeout.

We've added the following note to the description of 
`BarrierTaskContext.barrier()`; I hope it is useful:
{quote}
   * CAUTION! In a barrier stage, each task must have the same number of barrier()
   * calls, in all possible code branches. Otherwise, you may get the job hanging
   * or a SparkException after timeout.
{quote}
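
For example, the following illustrates the rule (a minimal sketch using the RDD 
barrier API from this SPIP; the SparkContext setup, RDD, partition count, and 
branch condition are made up for illustration):

{code:scala}
import org.apache.spark.{BarrierTaskContext, SparkContext}

val sc = SparkContext.getOrCreate()
val rdd = sc.parallelize(1 to 100, numSlices = 4)

rdd.barrier().mapPartitions { iter =>
  val context = BarrierTaskContext.get()

  // OK: every task reaches exactly one barrier() call, whichever branch
  // it takes.
  if (context.partitionId() == 0) {
    context.barrier()
  } else {
    context.barrier()
  }

  // NOT OK (commented out): only task 0 would call barrier(), so it would
  // wait for the other tasks forever, and the job would hang or fail with
  // a SparkException after the timeout.
  // if (context.partitionId() == 0) {
  //   context.barrier()
  // }

  iter
}.collect()
{code}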

> Design sketch: support barrier scheduling in Apache Spark
> ---------------------------------------------------------
>
>                 Key: SPARK-24375
>                 URL: https://issues.apache.org/jira/browse/SPARK-24375
>             Project: Spark
>          Issue Type: Story
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Xiangrui Meng
>            Assignee: Jiang Xingbo
>            Priority: Major
>
> This task is to outline a design sketch for the barrier scheduling SPIP 
> discussion. It doesn't need to be a complete design before the vote. But it 
> should at least cover both Scala/Java and PySpark.


