This was stated in the other thread: Unified/Universal Shuffle
On 5/24/22, 10:04 PM, "XiaoYu" <[email protected]> wrote:
Hi
What does the project name "Uniffle" mean?
thanks
On Wed, May 25, 2022 at 12:57, Weiwei Yang <[email protected]> wrote:
>
> +1 (binding)
> Good luck!
>
> On Tue, May 24, 2022 at 8:49 PM Ye Xianjin <[email protected]> wrote:
>
> > +1 (non-binding).
> >
> > > On May 25, 2022, at 9:59 AM, Goson zhang <[email protected]> wrote:
> > >
> > > +1 (non-binding)
> > >
> > > Good luck!
> > >
> > > On Wed, May 25, 2022 at 09:53, Daniel Widdis <[email protected]> wrote:
> > >
> > >> +1 (non-binding) from me! Good luck!
> > >>
> > >> On 5/24/22, 9:05 AM, "Jerry Shao" <[email protected]> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> Due to the naming issue raised in the thread (
> > >> https://lists.apache.org/thread/y07xjkqzvpchncym9zr1hgm3c4l4ql0f),
> > >> we came up with a new project name, "Uniffle", and started a new
> > >> thread. Please help discuss it.
> > >>
> > >> We would like to propose Uniffle[1] as a new Apache Incubator
> > >> project. You can find more details in the proposal [2].
> > >>
> > >> Uniffle is a high-performance, general-purpose Remote Shuffle Service
> > >> for distributed compute engines such as Apache Spark
> > >> <https://spark.apache.org/>, Apache Hadoop MapReduce
> > >> <https://hadoop.apache.org/>, Apache Flink
> > >> <https://flink.apache.org/>, and so on. We aim to make Uniffle a
> > >> universal shuffle service for distributed compute engines.
> > >>
> > >> Shuffle is the key mechanism by which a distributed compute engine
> > >> exchanges data between distributed tasks, so its performance and
> > >> stability directly affect the whole job. The current “local file
> > >> pull-like shuffle style” has several limitations:
> > >>
> > >> 1. The current shuffle struggles to support very large workloads,
> > >> especially in a high-load environment. The major problem is IO
> > >> (random disk IO, network congestion, and timeouts).
> > >> 2. The current shuffle is hard to deploy in a disaggregated
> > >> compute-storage environment, as disk capacity is quite limited on
> > >> compute nodes.
> > >> 3. The constraint of storing shuffle data locally makes it hard to
> > >> scale elastically.
> > >>
> > >> Remote Shuffle Service is the key technology for enterprises to build
> > >> big data platforms, to expand big data applications to disaggregated,
> > >> online-offline hybrid environments, and to solve the above problems.
> > >>
> > >> The implementation of Remote Shuffle Service, “Uniffle”, is heavily
> > >> adopted at Tencent and has shown its advantages in production. Other
> > >> enterprises have also adopted or are preparing to adopt Uniffle in
> > >> their environments.
> > >>
> > >> Uniffle's key idea is borrowed from Sailfish shuffle
> > >> <https://www.researchgate.net/publication/262241541_Sailfish_a_framework_for_large_scale_data_processing>.
> > >> It has several key design goals:
> > >>
> > >> 1. High performance. Uniffle’s performance is close to that of the
> > >> local file based shuffle style for small workloads. For large
> > >> workloads, it is far better than the current shuffle style.
> > >> 2. Fault tolerance. Uniffle provides high availability for
> > >> Coordinator nodes, and failover for Shuffle nodes.
> > >> 3. Pluggable. Uniffle is highly pluggable and can be adapted to
> > >> different compute engines, different backend storages, and different
> > >> wire protocols.
> > >>
> > >> We believe that the Uniffle project will provide great value to the
> > >> community if it is accepted by the Apache Incubator.
> > >>
> > >> I will help this project as champion, and many thanks to the 5
> > >> mentors:
> > >>
> > >> - Felix Cheung ([email protected])
> > >> - Junping Du ([email protected])
> > >> - Weiwei Yang ([email protected])
> > >> - Xun Liu ([email protected])
> > >> - Zhankun Tang ([email protected])
> > >>
> > >>
> > >> [1] https://github.com/Tencent/Firestorm
> > >> [2] https://cwiki.apache.org/confluence/display/INCUBATOR/UniffleProposal
> > >>
> > >> Best regards,
> > >> Jerry
> > >>
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: [email protected]
> > >> For additional commands, e-mail: [email protected]
> > >>
> > >>