Re: [DISCUSSION] Support memory file storage.
Hi Gabriel,

Thank you for bringing this to my attention. I will change the permissions on the Google Doc so that you can comment. I apologize for the inconvenience.

Regards,
Ethan

On Thu, Sep 21, 2023 at 10:50, Gabriel Lee wrote:
>
> Hi Ethan,
>
> After viewing this Google Doc, I noticed that I don't have permission to
> comment. Could you please change the doc's permissions?
>
> Best,
> Gabriel
>
>
> On Wed, 20 Sept 2023 at 11:27, Ethan Feng wrote:
>
> > Hello Celeborn community,
> >
> > I have a proposal to support memory file storage in Celeborn:
> >
> > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> >
> > Would really appreciate feedback from the community on this proposal.
> >
> >
> > Thanks
> > Ethan
> >
Re: [DISCUSSION] Support memory file storage.
Hi Ethan,

After viewing this Google Doc, I noticed that I don't have permission to comment. Could you please change the doc's permissions?

Best,
Gabriel

On Wed, 20 Sept 2023 at 11:27, Ethan Feng wrote:

> Hello Celeborn community,
>
> I have a proposal to support memory file storage in Celeborn:
>
> https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
>
> Would really appreciate feedback from the community on this proposal.
>
>
> Thanks
> Ethan
>
Re: [DISCUSSION] Support memory file storage.
Hi Mridul,

Thank you for your email and your positive feedback on the proposed enhancement to Celeborn. I'm glad you find it promising. To address your queries:

a) The proposed enhancement is intended to act as a storage tier, not as a cache, although it may have certain elements of both. Celeborn currently does not keep a whole shuffle file in memory: a shuffle file must be written to disk or HDFS before a client can read it. This proposal will allow the client to read a shuffle file directly from the worker's memory. I hope this clarifies things for you.

b) Your suggestion for a tiered storage layer is interesting, but it is a superset of this proposal. As you can see, there is an issue (https://github.com/apache/incubator-celeborn/issues/146). Celeborn treats a shuffle partition as a single shuffle file rather than as a set of segments, so a shuffle partition will not be distributed across multiple storage tiers. There will be another proposal to discuss how Celeborn will move existing shuffle files between storage tiers.

c) As mentioned above, the enhancement is intended to act as a storage tier, which is why I explained the details of how it is handled internally.

Thanks again for your email. Please let me know if you have any further questions or concerns.

Regards,
Ethan

On Thu, Sep 21, 2023 at 01:09, Mridul Muralidharan wrote:
>
> Hi,
>
> This should be a nontrivial improvement to Celeborn imo, thanks Ethan!
>
> I had a few queries:
>
> a) Are we viewing this enhancement as a cache or as a tiered storage layer?
> When going over it, I felt the proposal might be doing both - though
> leaning more toward a cache, but wanted to get clarity.
>
> b) If we are modelling it as a tiered storage layer, it would be good to
> also think about what the right abstractions should be and not special-case
> it just for memory.
> For example:
> Memory -> NVMe/SSD -> Spinning Disk -> HDFS/S3
> (With one or more being missing in a deployment)
>
> This would unify the way we handle evictions from one level to the next
> with a tiered view of the storage layer.
> Complexity of the implementation is definitely a consideration here though.
>
> Note, this might be out of scope for this proposal and work for the future
> as well - wanted to get your thoughts if it was considered!
>
> c) If modelling as a cache, we should change the abstractions in the
> proposal slightly and hide the details behind the cache implementation.
> Read and write path would not need to worry about how it is handled
> internally.
>
>
> Regards,
> Mridul
>
>
> On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng wrote:
>
> > Hello Celeborn community,
> >
> > I have a proposal to support memory file storage in Celeborn:
> >
> > https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
> >
> > Would really appreciate feedback from the community on this proposal.
> >
> >
> > Thanks
> > Ethan
> >
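
To make point (b) of Ethan's reply concrete, here is a minimal Python sketch of file-granular placement: a shuffle partition is one file, so it lives entirely in one tier and is never split across tiers. This is not Celeborn code. The names `Tier`, `place_whole_file`, and `locate`, the byte capacities, and the "first tier with room" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """Hypothetical storage tier that tracks whole files only."""
    name: str
    capacity: int                               # bytes this tier can hold
    files: dict = field(default_factory=dict)   # file name -> size in bytes

    def used(self) -> int:
        return sum(self.files.values())

    def fits(self, size: int) -> bool:
        return self.used() + size <= self.capacity


def place_whole_file(tiers, name, size):
    """Store the file in the first tier that can hold all of it.

    The file is never split: if the fastest tier lacks room for the
    entire file, the entire file goes to the next tier instead.
    """
    for tier in tiers:
        if tier.fits(size):
            tier.files[name] = size
            return tier.name
    raise RuntimeError(f"no tier can hold {name} ({size} bytes)")


def locate(tiers, name):
    """Return the name of the single tier holding the file, or None."""
    for tier in tiers:
        if name in tier.files:
            return tier.name
    return None


# Tiers are listed fastest first; sizes are illustrative.
tiers = [Tier("memory", 100), Tier("disk", 1000)]
first = place_whole_file(tiers, "partition-0", 60)   # fits in memory
second = place_whole_file(tiers, "partition-1", 60)  # memory has only 40 bytes left
print(first, second)
```

In this sketch a reader calls `locate` once and then reads the whole file from that tier, which is the situation the reply describes: a file held in memory can be served from memory without first being written to disk or HDFS.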
Re: [DISCUSSION] Support memory file storage.
Hi,

This should be a nontrivial improvement to Celeborn imo, thanks Ethan!

I had a few queries:

a) Are we viewing this enhancement as a cache or as a tiered storage layer? When going over it, I felt the proposal might be doing both - though leaning more toward a cache, but wanted to get clarity.

b) If we are modelling it as a tiered storage layer, it would be good to also think about what the right abstractions should be and not special-case it just for memory.
For example:
Memory -> NVMe/SSD -> Spinning Disk -> HDFS/S3
(With one or more being missing in a deployment)

This would unify the way we handle evictions from one level to the next with a tiered view of the storage layer.
Complexity of the implementation is definitely a consideration here though.

Note, this might be out of scope for this proposal and work for the future as well - wanted to get your thoughts if it was considered!

c) If modelling as a cache, we should change the abstractions in the proposal slightly and hide the details behind the cache implementation. Read and write path would not need to worry about how it is handled internally.

Regards,
Mridul

On Tue, Sep 19, 2023 at 10:27 PM Ethan Feng wrote:

> Hello Celeborn community,
>
> I have a proposal to support memory file storage in Celeborn:
>
> https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing
>
> Would really appreciate feedback from the community on this proposal.
>
>
> Thanks
> Ethan
>
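
The tiered view in query (b) can be sketched roughly as follows. This is a hedged Python illustration only, not a proposed Celeborn design: `StorageTier`, `build_chain`, the byte capacities, and the oldest-first eviction policy are all hypothetical. It shows the two properties the message asks for: every level evicts to the next level through the same code path, and a deployment can simply leave a level out of the chain.

```python
from collections import OrderedDict


class StorageTier:
    """Hypothetical tier: holds whole files and evicts to the next tier."""

    def __init__(self, name, capacity, next_tier=None):
        self.name = name
        self.capacity = capacity        # bytes this tier can hold
        self.next_tier = next_tier      # slower tier, or None for the last one
        self.files = OrderedDict()      # file id -> size, oldest first

    def used(self):
        return sum(self.files.values())

    def put(self, file_id, size):
        if size > self.capacity:
            # The file can never fit here, so hand it straight down.
            self._demote(file_id, size)
            return
        # Assumed policy: evict the oldest files until the new one fits.
        while self.used() + size > self.capacity:
            victim, victim_size = self.files.popitem(last=False)
            self._demote(victim, victim_size)
        self.files[file_id] = size

    def _demote(self, file_id, size):
        if self.next_tier is None:
            raise RuntimeError(f"{self.name} is the last tier; cannot evict {file_id}")
        self.next_tier.put(file_id, size)

    def find(self, file_id):
        """Walk the chain and return the name of the tier holding the file."""
        tier = self
        while tier is not None:
            if file_id in tier.files:
                return tier.name
            tier = tier.next_tier
        return None


def build_chain(specs):
    """Link (name, capacity) pairs, fastest first; omit any level not deployed."""
    head = None
    for name, capacity in reversed(specs):
        head = StorageTier(name, capacity, head)
    return head


# A deployment without a spinning-disk level: Memory -> SSD -> HDFS.
chain = build_chain([("memory", 100), ("ssd", 200), ("hdfs", 10**9)])
chain.put("a", 60)
chain.put("b", 60)    # memory is full, so "a" is evicted to ssd
chain.put("c", 250)   # too large for memory and ssd, so it lands in hdfs
print(chain.find("a"), chain.find("b"), chain.find("c"))
```

Because each tier only knows about `next_tier`, adding or removing a level changes the list passed to `build_chain` and nothing else. The cost noted in the message still applies: a real implementation would also have to handle concurrent readers, in-flight writes, and failures during a move.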