+1

Bests,
Dongjoon.

On Sun, Jun 16, 2019 at 9:41 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:

> +1 (binding)
>
> Thanks
> Saisai
>
> On Sat, Jun 15, 2019 at 3:46 AM Imran Rashid <im...@therashids.com> wrote:
>
>> +1 (binding)
>>
>> I think this is a really important feature for Spark.
>>
>> First, there is already a lot of interest in alternative shuffle storage
>> in the community, from dynamic allocation in Kubernetes to even just
>> improving stability in standard on-premise use of Spark. However, people
>> are often stuck doing this in forks of Spark, and in ways that are not
>> maintainable (because they copy-paste many Spark internals) or are
>> incorrect (because they do not correctly handle speculative execution and
>> stage retries).
>>
>> Second, I think the specific proposal strikes a good balance between
>> flexibility and complexity, and allows incremental improvements. A lot of
>> work has been put into this already to try to figure out which pieces are
>> essential to make alternative shuffle storage implementations feasible.
>>
>> Of course, that means it doesn't include everything imaginable; some
>> things still aren't supported, and some will still choose to use the older
>> ShuffleManager API to get total control over all of shuffle. But we know
>> there is a reasonable set of things which can be implemented behind the
>> API as the first step, and it can continue to evolve.
>>
>> On Fri, Jun 14, 2019 at 12:13 PM Ilan Filonenko <i...@cornell.edu> wrote:
>>
>>> +1 (non-binding). This API is versatile and flexible enough to handle
>>> Bloomberg's internal use-cases. The ability for us to vary implementation
>>> strategies is quite appealing. It is also worth noting the minimal changes
>>> to Spark core needed to make it work. This is a much-needed addition
>>> to the Spark shuffle story.
>>>
>>> On Fri, Jun 14, 2019 at 9:59 AM bo yang <bobyan...@gmail.com> wrote:
>>>
>>>> +1 This is great work, allowing plugin of different sort shuffle
>>>> write/read implementation! Also great to see it retain the current Spark
>>>> configuration
>>>> (spark.shuffle.manager=org.apache.spark.shuffle.YourShuffleManagerImpl).
>>>>
>>>> On Thu, Jun 13, 2019 at 2:58 PM Matt Cheah <mch...@palantir.com> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I would like to call a vote for the SPIP for SPARK-25299
>>>>> <https://issues.apache.org/jira/browse/SPARK-25299>, which proposes
>>>>> to introduce a pluggable storage API for temporary shuffle data.
>>>>>
>>>>> You may find the SPIP document here
>>>>> <https://docs.google.com/document/d/1d6egnL6WHOwWZe8MWv3m8n4PToNacdx7n_0iMSWwhCQ/edit>.
>>>>>
>>>>> The discussion thread for the SPIP was conducted here
>>>>> <https://lists.apache.org/thread.html/2fe82b6b86daadb1d2edaef66a2d1c4dd2f45449656098ee38c50079@%3Cdev.spark.apache.org%3E>.
>>>>>
>>>>> Please vote on whether or not this proposal is agreeable to you.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -Matt Cheah