Thanks for clarifying, Ethan. Sounds good to me!

Regards,
Mridul

On Wed, Sep 20, 2023 at 8:58 PM Ethan Feng <ethanf...@apache.org> wrote:

> Hi Mridul,
>
> Thank you for your email and your positive feedback on the proposed
> enhancement to Celeborn. I'm glad you find it promising.
>
> To address your queries:
>
> a) The proposed enhancement is intended to act as a storage tier, not
> as a cache. However, it may have certain elements of both. Celeborn
> currently does not store a whole shuffle file in memory; it requires a
> shuffle file to be written to disk or HDFS before the client can
> read it. This proposal will allow the client to read a shuffle file
> directly from the worker's memory. I hope this clarifies things for you.
>
> b) While your suggestion for a tiered storage layer is interesting, it
> is a superset of this proposal. As you can see, there is an
> issue (https://github.com/apache/incubator-celeborn/issues/146).
> Celeborn treats a shuffle partition as a single shuffle file rather than
> as segments, so a shuffle partition will not be distributed across
> multiple storage tiers. There will be another proposal to discuss how
> Celeborn will move existing shuffle files between storage tiers.
>
> c) As mentioned above, the enhancement is intended to act as a storage
> tier, which is why I explained the details of how it is handled
> internally.
>
> Thanks again for your email. Please let me know if you have any
> further questions or concerns.
>
> Regards,
> Ethan
>
> On Thu, Sep 21, 2023 at 01:09, Mridul Muralidharan <mri...@gmail.com> wrote:
> >
> > Hi,
> >
> >   This should be a nontrivial improvement to Celeborn imo, thanks Ethan !
> >
> > I had a few queries:
> >
> > a) Are we viewing this enhancement as a cache or as a tiered storage
> > layer?
> > When going over it, I felt the proposal might be doing both - though
> > leaning more toward a cache, but wanted to get clarity.
> >
> > b) If we are modelling it as a tiered storage layer, it would be good to
> > also think about what the right abstractions should be and not special
> > case it just for memory.
> > For example:
> > Memory -> NVMe/SSD -> Spinning Disk -> HDFS/S3
> > (With one or more being missing in a deployment)
> >
> > This would unify the way we handle evictions from one level to the next
> > with a tiered view of the storage layer.
> > Complexity of the implementation is definitely a consideration here
> > though.
> >
> > Note, this might be out of scope for this proposal and work for the
> > future as well - wanted to get your thoughts if it was considered!
> >
> > c) If modelling as a cache, we should change the abstractions in the
> > proposal slightly and hide the details behind the cache implementation.
> > Read and write path would not need to worry about how it is handled
> > internally.
> >
> >
> > Regards,
> > Mridul
> >
> >
> >
> >
> >
> > On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng <ethanf...@apache.org> wrote:
> >
> > > Hello Celeborn community,
> > >
> > > I have a proposal to support memory file storage in Celeborn:
> > >
> > > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> > >
> > > Would really appreciate feedback from the community on this proposal.
> > >
> > >
> > > Thanks
> > > Ethan
> > >
>
