+1 (non-binding).
> On May 25, 2022, at 9:59 AM, Goson zhang <gosonzh...@apache.org> wrote:
>
> +1 (non-binding)
>
> Good luck!
>
> Daniel Widdis <wid...@gmail.com> wrote on Wed, May 25, 2022 at 09:53:
>
>> +1 (non-binding) from me! Good luck!
>>
>> On 5/24/22, 9:05 AM, "Jerry Shao" <js...@apache.org> wrote:
>>
>> Hi all,
>>
>> Due to the name issue in thread
>> (https://lists.apache.org/thread/y07xjkqzvpchncym9zr1hgm3c4l4ql0f), we
>> figured out a new project name "Uniffle" and created a new Thread.
>> Please help to discuss.
>>
>> We would like to propose Uniffle[1] as a new Apache incubator project,
>> you can find the proposal here [2] for more details.
>>
>> Uniffle is a high performance, general purpose Remote Shuffle Service
>> for distributed compute engines like Apache Spark
>> <https://spark.apache.org/>, Apache Hadoop MapReduce
>> <https://hadoop.apache.org/>, Apache Flink <https://flink.apache.org/>
>> and so on. We are aiming to make Firestorm a universal shuffle service
>> for distributed compute engines.
>>
>> Shuffle is the key part for a distributed compute engine to exchange
>> the data between distributed tasks, the performance and stability of
>> shuffle will directly affect the whole job. Current “local file
>> pull-like shuffle style” has several limitations:
>>
>> 1. Current shuffle is hard to support super large workloads,
>>    especially in a high load environment, the major problem is IO
>>    problem (random disk IO issue, network congestion and timeout).
>> 2. Current shuffle is hard to deploy on the disaggregated compute
>>    storage environment, as disk capacity is quite limited on compute
>>    nodes.
>> 3. The constraint of storing shuffle data locally makes it hard to
>>    scale elastically.
>>
>> Remote Shuffle Service is the key technology for enterprises to build
>> big data platforms, to expand big data applications to disaggregated,
>> online-offline hybrid environments, and to solve above problems.
>>
>> The implementation of Remote Shuffle Service - “Uniffle” - is heavily
>> adopted in Tencent, and shows its advantages in production. Other
>> enterprises also adopted or prepared to adopt Firestorm in their
>> environments.
>>
>> Uniffle's key idea is brought from Sailfish shuffle
>> <https://www.researchgate.net/publication/262241541_Sailfish_a_framework_for_large_scale_data_processing>,
>> it has several key design goals:
>>
>> 1. High performance. Firestorm’s performance is close enough to local
>>    file based shuffle style for small workloads. For large workloads,
>>    it is far better than the current shuffle style.
>> 2. Fault tolerance. Firestorm provides high availability for
>>    Coordinated nodes, and failover for Shuffle nodes.
>> 3. Pluggable. Firestorm is highly pluggable, which could be suited to
>>    different compute engines, different backend storages, and
>>    different wire-protocols.
>>
>> We believe that Uniffle project will provide the great value for the
>> community if it is accepted by the Apache incubator.
>>
>> I will help this project as champion and many thanks to the 3 mentors:
>>
>> - Felix Cheung (felixche...@apache.org)
>> - Junping du (junping...@apache.org)
>> - Weiwei Yang (w...@apache.org)
>> - Xun liu (liu...@apache.org)
>> - Zhankun Tang (zt...@apache.org)
>>
>> [1] https://github.com/Tencent/Firestorm
>> [2] https://cwiki.apache.org/confluence/display/INCUBATOR/UniffleProposal
>>
>> Best regards,
>> Jerry
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org