Re: [DISCUSSION] Support memory file storage.

2023-09-21 Thread Mridul Muralidharan
Thanks for clarifying, Ethan. Sounds good to me!

Regards,
Mridul

On Wed, Sep 20, 2023 at 8:58 PM Ethan Feng  wrote:

> Hi Mridul,
>
> Thank you for your email and your positive feedback on the proposed
> enhancement to Celeborn. I'm glad you find it promising.
>
> To address your queries:
>
> a) The proposed enhancement is intended to act as a storage tier, not
> as a cache. However, it may have certain elements of both. Celeborn
> currently won't store a whole shuffle file in memory and requires a
> shuffle file to be written to disks or HDFS before the client can
> read. This proposal will allow the client to read a shuffle file from
> the worker's memory directly. I hope this clarifies things for you.
>
> b) While your suggestion for a tiered storage layer is interesting, it
> is a superset of this proposal. As you can see, there is an
> issue (https://github.com/apache/incubator-celeborn/issues/146).
> Celeborn treats a shuffle partition as one shuffle file rather than as
> segments, so a shuffle partition will not be distributed across multiple
> storage tiers. There will be another proposal to discuss how Celeborn
> will move existing shuffle files to different storage tiers.
>
> c) As mentioned above, the enhancement is intended to act as a storage
> tier, which is why I explained the details of how it is handled
> internally.
>
> Thanks again for your email. Please let me know if you have any
> further questions or concerns.
>
> Regards,
> Ethan
>
> Mridul Muralidharan  wrote on Thu, Sep 21, 2023 at 01:09:
> >
> > Hi,
> >
> >   This should be a nontrivial improvement to Celeborn imo, thanks Ethan !
> >
> > I had a few queries:
> >
> > a) Are we viewing this enhancement as a cache or as a tiered storage layer ?
> > When going over it, I felt the proposal might be doing both - though
> > leaning more as a cache, but wanted to get clarity.
> >
> > b) If we are modelling it as a tiered storage layer, it would be good to
> > also think about what the right abstractions should be and not special case
> > it just for memory.
> > For example:
> > Memory -> NVME/SSD -> Spinning Disk -> HDFS/S3
> > (With one or more being missing in a deployment)
> >
> > This would unify the way we handle evictions from one level to the next
> > with a tiered view of the storage layer.
> > Complexity of the implementation is definitely a consideration here though.
> >
> > Note, this might be out of scope for this proposal and work for the future
> > as well - wanted to get your thoughts if it was considered !
> >
> > c) If modelling as a cache, we should change the abstractions in the
> > proposal slightly and hide the details behind the cache implementation.
> > Read and write path would not need to worry about how it is handled
> > internally.
> >
> >
> > Regards,
> > Mridul
> >
> >
> >
> >
> >
> > On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng  wrote:
> >
> > > Hello Celeborn community,
> > >
> > > I have a proposal to support memory file storage in Celeborn:
> > >
> > > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> > >
> > > Would really appreciate feedback from the community on this proposal.
> > >
> > >
> > > Thanks
> > > Ethan
> > >
>


Re: [DISCUSSION] Support memory file storage.

2023-09-20 Thread Ethan Feng
Hi Gabriel,

Thank you for bringing this to my attention. I will change the sharing
permissions on the Google Doc so that you can comment. I apologize for
any inconvenience this may have caused.

Regards,
Ethan

Gabriel Lee  wrote on Thu, Sep 21, 2023 at 10:50:
>
> Hi Ethan,
>
> After viewing this Google Doc, I noticed I don't have access to write a
> comment. Could you please change this doc's permissions?
>
> Best,
> Gabriel
>
>
> On Wed, 20 Sept 2023 at 11:27, Ethan Feng  wrote:
>
> > Hello Celeborn community,
> >
> > I have a proposal to support memory file storage in Celeborn:
> >
> > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> >
> > Would really appreciate feedback from the community on this proposal.
> >
> >
> > Thanks
> > Ethan
> >


Re: [DISCUSSION] Support memory file storage.

2023-09-20 Thread Gabriel Lee
Hi Ethan,

After viewing this Google Doc, I noticed I don't have access to write a
comment. Could you please change this doc's permissions?

Best,
Gabriel


On Wed, 20 Sept 2023 at 11:27, Ethan Feng  wrote:

> Hello Celeborn community,
>
> I have a proposal to support memory file storage in Celeborn:
>
> https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
>
> Would really appreciate feedback from the community on this proposal.
>
>
> Thanks
> Ethan
>


Re: [DISCUSSION] Support memory file storage.

2023-09-20 Thread Ethan Feng
Hi Mridul,

Thank you for your email and your positive feedback on the proposed
enhancement to Celeborn. I'm glad you find it promising.

To address your queries:

a) The proposed enhancement is intended to act as a storage tier, not
as a cache. However, it may have certain elements of both. Celeborn
currently won't store a whole shuffle file in memory and requires a
shuffle file to be written to disks or HDFS before the client can
read. This proposal will allow the client to read a shuffle file from
the worker's memory directly. I hope this clarifies things for you.

b) While your suggestion for a tiered storage layer is interesting, it
is a superset of this proposal. As you can see, there is an
issue (https://github.com/apache/incubator-celeborn/issues/146).
Celeborn treats a shuffle partition as one shuffle file rather than as
segments, so a shuffle partition will not be distributed across multiple
storage tiers. There will be another proposal to discuss how Celeborn
will move existing shuffle files to different storage tiers.
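
To illustrate the whole-file rule in (b), here is a minimal Python sketch
rather than Celeborn code: a partition's shuffle file is placed in exactly
one tier and is never split into segments. The function name
`place_partition_file`, the tier names, and the free-space figures are all
hypothetical.

```python
def place_partition_file(file_size, tiers):
    """Pick the single tier that will hold an entire shuffle file.

    `tiers` is a list of (name, free_bytes) pairs ordered fastest first.
    Hypothetical helper for illustration; not part of Celeborn.
    """
    for name, free_bytes in tiers:
        # The file is kept whole, so a tier qualifies only if it can hold all of it.
        if file_size <= free_bytes:
            return name
    raise ValueError("no tier has room for the whole file")


# Made-up free space per tier, fastest first.
tiers = [("memory", 64), ("disk", 1024), ("hdfs", 10**9)]
small = place_partition_file(32, tiers)   # fits in the fastest tier
large = place_partition_file(200, tiers)  # skips memory, lands whole on disk
```

A file that does not fit in memory goes to the next tier as a whole, so
no partition ever ends up spread over two tiers.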

c) As mentioned above, the enhancement is intended to act as a storage
tier, which is why I explained the details of how it is handled
internally.

Thanks again for your email. Please let me know if you have any
further questions or concerns.

Regards,
Ethan

Mridul Muralidharan  wrote on Thu, Sep 21, 2023 at 01:09:
>
> Hi,
>
>   This should be a nontrivial improvement to Celeborn imo, thanks Ethan !
>
> I had a few queries:
>
> a) Are we viewing this enhancement as a cache or as a tiered storage layer ?
> When going over it, I felt the proposal might be doing both - though
> leaning more as a cache, but wanted to get clarity.
>
> b) If we are modelling it as a tiered storage layer, it would be good to
> also think about what the right abstractions should be and not special case
> it just for memory.
> For example:
> Memory -> NVME/SSD -> Spinning Disk -> HDFS/S3
> (With one or more being missing in a deployment)
>
> This would unify the way we handle evictions from one level to the next
> with a tiered view of the storage layer.
> Complexity of the implementation is definitely a consideration here though.
>
> Note, this might be out of scope for this proposal and work for the future
> as well - wanted to get your thoughts if it was considered !
>
> c) If modelling as a cache, we should change the abstractions in the
> proposal slightly and hide the details behind the cache implementation.
> Read and write path would not need to worry about how it is handled
> internally.
>
>
> Regards,
> Mridul
>
>
>
>
>
> On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng  wrote:
>
> > Hello Celeborn community,
> >
> > I have a proposal to support memory file storage in Celeborn:
> >
> > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> >
> > Would really appreciate feedback from the community on this proposal.
> >
> >
> > Thanks
> > Ethan
> >


Re: [DISCUSSION] Support memory file storage.

2023-09-20 Thread Mridul Muralidharan
Hi,

  This should be a nontrivial improvement to Celeborn imo, thanks Ethan !

I had a few queries:

a) Are we viewing this enhancement as a cache or as a tiered storage layer ?
When going over it, I felt the proposal might be doing both - though
leaning more as a cache, but wanted to get clarity.

b) If we are modelling it as a tiered storage layer, it would be good to
also think about what the right abstractions should be and not special case
it just for memory.
For example:
Memory -> NVME/SSD -> Spinning Disk -> HDFS/S3
(With one or more being missing in a deployment)

This would unify the way we handle evictions from one level to the next
with a tiered view of the storage layer.
Complexity of the implementation is definitely a consideration here though.
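
A rough Python sketch of the tiered view described above, under stated
assumptions: each level has a byte capacity, files are evicted
oldest-first to the next slower level, and levels missing from a
deployment are simply left out of the chain. The names `Tier`,
`build_chain`, and `locate`, and the eviction policy itself, are
hypothetical and not taken from the proposal or from Celeborn.

```python
import math
from collections import OrderedDict


class Tier:
    """One storage level; hypothetical, for illustration only."""

    def __init__(self, name, capacity, next_tier=None):
        self.name = name
        self.capacity = capacity    # bytes; math.inf for an unbounded level
        self.next_tier = next_tier  # next slower level, or None for the last
        self.files = OrderedDict()  # file id -> size, oldest first
        self.used = 0

    def _require_next(self):
        if self.next_tier is None:
            raise RuntimeError(f"{self.name} is full and has no slower level")
        return self.next_tier

    def put(self, file_id, size):
        # A file larger than this whole level goes straight to the next one.
        if size > self.capacity:
            self._require_next().put(file_id, size)
            return
        # Evict the oldest files to the next level until the new file fits.
        while self.used + size > self.capacity:
            target = self._require_next()
            old_id, old_size = self.files.popitem(last=False)
            self.used -= old_size
            target.put(old_id, old_size)
        self.files[file_id] = size
        self.used += size


def build_chain(levels):
    """Link (name, capacity) pairs, fastest first, and return the fastest tier."""
    head = None
    for name, capacity in reversed(levels):
        head = Tier(name, capacity, next_tier=head)
    return head


def locate(head, file_id):
    """Return the name of the tier holding file_id, or None if absent."""
    tier = head
    while tier is not None:
        if file_id in tier.files:
            return tier.name
        tier = tier.next_tier
    return None


# A deployment without spinning disks: that level is left out of the chain.
memory = build_chain([("memory", 10), ("ssd", 20), ("hdfs", math.inf)])
memory.put("a", 6)
memory.put("b", 6)   # memory is full, so "a" is evicted to ssd
memory.put("c", 30)  # too large for memory and ssd, so it lands on hdfs
```

Read and write paths only talk to the head of the chain, and the same
eviction step applies at every level.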

Note, this might be out of scope for this proposal and work for the future
as well - wanted to get your thoughts if it was considered !

c) If modelling as a cache, we should change the abstractions in the
proposal slightly and hide the details behind the cache implementation.
Read and write path would not need to worry about how it is handled
internally.


Regards,
Mridul





On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng  wrote:

> Hello Celeborn community,
>
> I have a proposal to support memory file storage in Celeborn:
>
> https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
>
> Would really appreciate feedback from the community on this proposal.
>
>
> Thanks
> Ethan
>