[ 
https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569363#comment-16569363
 ] 

Mridul Muralidharan edited comment on SPARK-24375 at 8/5/18 4:42 AM:
---------------------------------------------------------------------

{quote} We've thought hard on the issue and don't feel we can make it unless we 
force users to explicitly set a number in a barrier() call (actually it's not a 
good idea because it brings more borden to manage the code).{quote}

I am not sure where the additional burden exists.

Make it an optional param to barrier.
* If not defined, it would be analogous to what exists right now.
* If specified, fail the stage if different tasks in stage end up waiting on 
different barrier names (or some have a name and others dont).

In example usecases I have seen, there is usually partition specific code paths 
(if partition 0, do some initialization/teardown, etc) - which results in 
divergent codepaths : and so increases potential for this issue.

It will be very difficult to reason about the state when this happens.


was (Author: mridulm80):
{quote} We've thought hard on the issue and don't feel we can make it unless we 
force users to explicitly set a number in a barrier() call (actually it's not a 
good idea because it brings more borden to manage the code).{quote}

I am not sure where the additional burden exists.

Make it an optional param to barrier.
* If not defined, it would be analogous to what exists right now.
* If specified, fail the stage if different tasks in stage end up waiting on 
different barrier names (or some have a name and others dont).

In example usecases I have seen, there is usually partition specific code paths 
(if partition 0, do some initialization/teardown, etc) - which results in 
divergent codepaths : and so increases potential for this issue.

It will be very difficult to reason about the state what happens.

> Design sketch: support barrier scheduling in Apache Spark
> ---------------------------------------------------------
>
>                 Key: SPARK-24375
>                 URL: https://issues.apache.org/jira/browse/SPARK-24375
>             Project: Spark
>          Issue Type: Story
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Xiangrui Meng
>            Assignee: Jiang Xingbo
>            Priority: Major
>
> This task is to outline a design sketch for the barrier scheduling SPIP 
> discussion. It doesn't need to be a complete design before the vote. But it 
> should at least cover both Scala/Java and PySpark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to