Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Daniel Widdis
This was stated in the other thread: Unified/Universal Shuffle

On 5/24/22, 10:04 PM, "XiaoYu"  wrote:

Hi

What does "Uniffle" mean as a project name?

thanks


Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread XiaoYu
Hi

What does "Uniffle" mean as a project name?

thanks


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Weiwei Yang
+1 (binding)
Good luck!



Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Ye Xianjin
+1 (non-binding).

Sent from my iPhone




Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Goson zhang
+1 (non-binding)

Good luck!



Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Daniel Widdis
+1 (non-binding) from me!  Good luck!




Re: [DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Xun Liu
Hi,

+1 (binding) from me.

After several discussions, we coined a new word, Uniffle, based on the
characteristics of the project. After checking, we determined that Uniffle
has not been used as a software name yet.

Let's start a new journey. :-)



Re: [DISCUSSION] Incubating Proposal of Firestorm

2022-05-24 Thread Jerry Shao
Hi all,

I'm going to close this thread and create a new proposal thread (
https://lists.apache.org/thread/fyyhkjvhzl4hpzr52hd64csh5lt2wm6h) with
the project name "Uniffle". Please join the discussion there.

Best regards,
Jerry

On Tue, May 24, 2022 at 01:24, Daniel B. Widdis wrote:

> This conjures up a mental image of a unicorn with the Apache feather
> tickling its nose... I like it.
>
> On Mon, May 23, 2022 at 8:36 AM Jerry Shao  wrote:
>
> > Hi team,
> >
> > After discussing with the team, we came up with a new name, "Uniffle"
> > (Unified/Universal Shuffle). We ran a name search, and there seems to be
> > no conflict with this new name.
> >
> > So we'd go with "Uniffle" as the new project name. What do you think?
> >
> > Best regards,
> > Jerry
> >
> > -
> > Uniffle Naming Search
> > Github:
> > Search for Uniffle returns 0 results
> > https://github.com/search?q=uniffle
> >
> > SF.net:
> > Search for Uniffle returns 0 results
> > https://sourceforge.net/directory/os:mac/freshness:recently-updated/?q=Uniffle
> >
> > openhub.net
> > Search for Uniffle returns 0 results
> > https://www.openhub.net/p?ref=homepage=Uniffle
> >
> > Google code:
> > Search for Uniffle returns 0 results
> > https://opensource.google/s/results?q=Uniffle
> >
> > USPTO:
> > I used the `Word and/or Design Mark Search (Free Form)` option of TESS
> > and entered the query
> > '(uniffle)[BI,TI] and (software or computer)[GS] and (live)[LD]'.
> > There are 0 results.
> > I also searched
> > https://search.uspto.gov/search?affiliate=web-sdmg-uspto.gov_by==uniffle
> > There are 0 results, too.
> >
> >
> > Trademarkia:
> > Search for Uniffle returns 0 results
> > https://www.trademarkia.com/trademarks-search.aspx?tn=uniffle
> >
> > EU Organization for Harmonization
> > Search for Uniffle returns 0 results
> > https://euipo.europa.eu/eSearch/#basic/1+1+1+1/uniffle
> >
> > Google:
> > Search for Uniffle returns 0 results
> > https://www.google.com/search?q=uniffle
> >
> > Bing:
> > Search for Uniffle returns 0 results
> > https://www.bing.com/search?q=uniffle
> >
> > Yahoo:
> > Search for Uniffle returns 0 results
> > https://search.yahoo.com/search?p=uniffle=1
> >
> > Stackoverflow:
> > Search for Uniffle returns 0 results
> > https://stackoverflow.com/search?q=uniffle
> >
> > On Mon, May 23, 2022 at 20:50, Saisai Shao wrote:
> >
> > > Thanks Justin for the explanation.
> > >
> > > We discussed internally and think that a new name would be better to avoid
> > > potential issues.
> > >
> > > Will figure out a new name.
> > >
> > > Best regards,
> > > Jerry
> > >
> > > On Mon, May 23, 2022 at 20:29, Justin Mclean wrote:
> > >
> > >> Hi,
> > >>
> > >> > There’s a typo in this paragraph that makes it impossible to
> > >> > understand/changes the original meaning.
> > >>
> > >> Apologies, I meant to say "I don’t think that is the case." From what I
> > >> can see, trademarks have not approved FireStorm as a name. If the project
> > >> wants to enter the incubator with that name, and understands the risks that
> > >> involves, then that is OK. Just be aware there is a risk that this may stop
> > >> the project from graduating from the Incubator under that name.
> > >>
> > >> Kind Regards,
> > >> Justin
> > >
> > >
> >
>
>
> --
> Dan Widdis
>


[DISCUSSION] Incubating Proposal of Uniffle

2022-05-24 Thread Jerry Shao
Hi all,

Due to the name issue raised in the earlier thread (
https://lists.apache.org/thread/y07xjkqzvpchncym9zr1hgm3c4l4ql0f), we
chose a new project name, "Uniffle", and created a new thread. Please
join the discussion.

We would like to propose Uniffle [1] as a new Apache Incubator project. You
can find the proposal here [2] for more details.

Uniffle is a high-performance, general-purpose Remote Shuffle Service for
distributed compute engines like Apache Spark, Apache Hadoop MapReduce,
Apache Flink, and so on. We are aiming to make Uniffle (formerly named
Firestorm) a universal shuffle service for distributed compute engines.

Shuffle is the key part of a distributed compute engine that exchanges
data between distributed tasks; its performance and stability directly
affect the whole job. The current "local file, pull-like" shuffle style
has several limitations:

   1. The current shuffle struggles to support very large workloads,
   especially in a high-load environment. The major problem is IO (random
   disk IO, network congestion, and timeouts).
   2. The current shuffle is hard to deploy in a disaggregated compute and
   storage environment, as disk capacity is quite limited on compute nodes.
   3. The constraint of storing shuffle data locally makes it hard to scale
   elastically.

Remote Shuffle Service is a key technology for enterprises to build big
data platforms, to expand big data applications to disaggregated,
online-offline hybrid environments, and to solve the problems above.
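
For readers less familiar with the idea, here is a minimal sketch of the
remote shuffle pattern described above. It is not Uniffle's actual API:
the ShuffleServer class, the partitioner, and every function name below are
hypothetical, and the "server" is just an in-memory object. Map tasks push
records to the server grouped by partition instead of writing local files,
and each reduce task fetches its whole partition from the server.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical number of reduce partitions


def partition_of(key: str) -> int:
    # Deterministic toy partitioner; a real engine uses its own hash function.
    return sum(key.encode("utf-8")) % NUM_PARTITIONS


class ShuffleServer:
    """Hypothetical in-memory stand-in for a remote shuffle server."""

    def __init__(self):
        self.blocks = defaultdict(list)  # partition id -> list of records

    def push(self, partition, record):
        # Map tasks send records here instead of writing local shuffle files.
        self.blocks[partition].append(record)

    def fetch(self, partition):
        # A reduce task reads its whole partition from one place.
        return list(self.blocks[partition])


def run_map_task(records, server):
    for key, value in records:
        server.push(partition_of(key), (key, value))


def run_reduce_task(partition, server):
    totals = defaultdict(int)
    for key, value in server.fetch(partition):
        totals[key] += value
    return dict(totals)


def word_count(map_inputs):
    server = ShuffleServer()
    for records in map_inputs:
        run_map_task(records, server)
    result = {}
    for partition in range(NUM_PARTITIONS):
        result.update(run_reduce_task(partition, server))
    return result
```

Because the shuffle data lives on the server rather than on the compute
nodes, the map-side nodes in this sketch hold no shuffle state once their
tasks finish, which is the property that limitations 2 and 3 above are about.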

The implementation of Remote Shuffle Service, "Uniffle", is heavily
adopted at Tencent and has shown its advantages in production. Other
enterprises have also adopted, or are preparing to adopt, Uniffle in their
environments.

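Uniffle's key idea is borrowed from Sailfish shuffle
<https://www.researchgate.net/publication/262241541_Sailfish_a_framework_for_large_scale_data_processing>.
It has several key design goals: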

   1. High performance. Uniffle’s performance is close to that of the local
   file based shuffle style for small workloads. For large workloads, it is
   far better than the current shuffle style.
   2. Fault tolerance. Uniffle provides high availability for Coordinator
   nodes, and failover for Shuffle nodes.
   3. Pluggable. Uniffle is highly pluggable and can be adapted to
   different compute engines, different backend storages, and different
   wire protocols.

We believe that the Uniffle project will provide great value to the
community if it is accepted by the Apache Incubator.

I will help this project as champion, and many thanks to the mentors:

   - Felix Cheung (felixche...@apache.org)
   - Junping Du (junping...@apache.org)
   - Weiwei Yang (w...@apache.org)
   - Xun Liu (liu...@apache.org)
   - Zhankun Tang (zt...@apache.org)


[1] https://github.com/Tencent/Firestorm
[2] https://cwiki.apache.org/confluence/display/INCUBATOR/UniffleProposal

Best regards,
Jerry


[RESULT][VOTE] Release Apache Linkis (Incubating) 1.1.1-RC2

2022-05-24 Thread peacewong
Hi all,

The vote to release Apache Linkis (Incubating) 1.1.1-RC2 has passed
with 3 +1 binding and 2 +1 non-binding votes, and no +0 or -1 votes.

Binding votes:
ShaoFeng Shi
Justin Mclean
Lidong Dai

Non-binding votes:
Zhen Wang
Heng Du

Vote thread:
https://lists.apache.org/thread/gdk5ys3sqg75hl1r455lxtk59o7do1wk

Many thanks to all our mentors for helping us with the release procedure, and
to all IPMC members who helped review and vote on the Apache Linkis
(Incubating) release. I will be working on publishing the artifacts soon.

Thanks
On behalf of Apache Linkis(Incubating) community


Re: [VOTE] Release Apache Linkis (Incubating) 1.1.1-RC2

2022-05-24 Thread peacewong
Hi all:

With a total of 3 +1 binding votes, 2 +1 non-binding votes, and no 0 or -1
votes, I will close the vote.

Best Regards.
Peace Wong



Re: [VOTE] Release Apache Linkis (Incubating) 1.1.1-RC2

2022-05-24 Thread Lidong Dai
+1 binding

I checked,
- incubating in name
- disclaimer exists
- download links are valid
- can compile from source


Best Regards



---
Apache DolphinScheduler PMC Chair
Lidong Dai
lidong...@apache.org
Linkedin: https://www.linkedin.com/in/dailidong
Twitter: @WorkflowEasy 

---


On Mon, May 23, 2022 at 8:33 PM Justin Mclean wrote:

> Hi,
>
> +1 binding
>
> I checked:
> - incubating in name
> - signatures and hashes correct
> - disclaimer exists
> - LICENSE and NOTICE are fine
> - all ASF source files have ASF headers
> - no unexpected binary files
> - can compile from source
>
> Kind Regards,
> Justin