Xingbo,

Please reference the SPIP and JIRA ticket next time: [SPARK-24374] SPIP:
Support Barrier Scheduling in Apache Spark

On Sun, Jul 8, 2018 at 9:45 AM Xingbo Jiang <jiangxb1...@gmail.com> wrote:

> Hi All,
>
> I would like to invite you to review the design document for Barrier
> Execution Mode:
>
> https://docs.google.com/document/d/1GvcYR6ZFto3dOnjfLjZMtTezX0W5VYN9w1l4-tQXaZk/edit#
>
> TL;DR: We announced Project Hydrogen at the recent Spark+AI Summit. A
> major part of the project involves significant changes to the execution
> mode of Spark. This design doc proposes new APIs as well as a new
> execution mode (known as barrier execution mode) to provide
> high-performance support for DL workloads.
>
> Major changes include:
>
>    - Add RDDBarrier to support gang scheduling.
>    - Add BarrierTaskContext to support global sync of all tasks in a
>    stage.
>    - Add a better fault tolerance approach for barrier stages: if some
>    tasks fail midway, retry all tasks in the same stage.
>    - Integrate barrier execution mode with Standalone cluster manager.
>
> Please feel free to review and discuss the design proposal.
>
> Thanks,
> Xingbo
>
>
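
For readers new to the idea, the barrier semantics listed above (launch all tasks together, sync them at a global barrier, and retry the whole stage if any task fails) can be roughly illustrated outside Spark. The sketch below is only an analogy in plain Python using `threading.Barrier`. It is not the API proposed in the design doc, and the names `run_barrier_stage` and `example_task` are hypothetical.

```python
import threading


def run_barrier_stage(task_fn, num_tasks, max_attempts=3):
    """Run num_tasks tasks together; if any task fails, rerun all of them.

    Hypothetical helper for illustration only, not a Spark API.
    """
    for attempt in range(1, max_attempts + 1):
        barrier = threading.Barrier(num_tasks)
        results = [None] * num_tasks
        errors = []

        def worker(task_id):
            try:
                results[task_id] = task_fn(task_id, attempt, barrier)
            except Exception as exc:
                errors.append(exc)
                # Break the barrier so peers waiting on it are released
                # instead of blocking forever.
                barrier.abort()

        # Start every task of the stage at once (the "gang" part).
        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(num_tasks)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        if not errors:
            return attempt, results
        # Otherwise discard partial results and retry the whole stage.

    raise RuntimeError(f"stage failed after {max_attempts} attempts")


def example_task(task_id, attempt, barrier):
    # Simulate one task failing on the first attempt only.
    if attempt == 1 and task_id == 2:
        raise ValueError("simulated task failure")
    barrier.wait(timeout=5)  # global sync point across all tasks
    return task_id * 10
```

Calling `run_barrier_stage(example_task, 4)` reruns all four tasks after the simulated failure rather than only the failed one, which is the key difference from ordinary per-task retry.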
