Re: [GSoC 2024] ovfs proposal discussion

Manjusaka Wed, 20 Mar 2024 22:35:05 -0700

On 2024/3/21 10:27, Xuanwo wrote:
>> As we know, observability is very important for a service. I think
>> we might need to define and export some metrics to let people know the
>> status of the ovfs daemon service.
>>
>> For now, OpenDAL has some layers available out of the box (like Prometheus,
>> OpenTelemetry, DTrace). I'm not sure whether we should add more metrics,
>> such as cache hit rate.
> 
> Hi, I agree that observability is important. However, it seems we're
> going too far. The feature requests keep growing, which would make this a
> never-ending project.
> 
> I suggest we leave them as points for future expansion.
> 
> On Wed, Mar 20, 2024, at 23:24, Manjusaka wrote:
>> On 2024/3/20 22:33, Runjie Yu wrote:
>>> Got it, thanks for your suggestions, I'll keep this in mind.
>>>
>>> I've put the main content of the proposal in a Google Doc. Here's the link:
>>> https://docs.google.com/document/d/1huy8vHcoCTf-GausabR3PCwXIXfRJAEckeyTulp2gDU/edit?usp=sharing
>>>
>>> I've specified in the deliverables section a description of the target for
>>> each storage type, S3 for object storage and HDFS for file storage.
>>>
>>> ```
>>> 1) A code repository that implements the functions described in the project
>>> details. The services implemented by OVFS in the code repository need to
>>> meet the following requirements: (1) VirtioFS implementation, well
>>> integrated with VMs and QEMU, able to correctly handle VM read and write
>>> requests to the file system. (2) Supports the use of distributed object
>>> storage systems and distributed file systems as storage backends, and
>>> provides complete and correct support for at least one specific storage
>>> service type for each storage system type. S3 can be used as the target for
>>> object storage systems, and HDFS can be used as the target for distributed
>>> file systems. (3) Supports related configurations of various storage
>>> systems. Users can configure storage system access and use according to
>>> actual needs. When an error occurs, users can use the configuration file to
>>> restart services.
>>> ```
>>>
>>> On Tue, Mar 19, 2024 at 10:41 PM Manjusaka <[email protected]> wrote:
>>>
>>>> On 2024/3/19 20:57, 余润杰 wrote:
>>>>> Thank you for your suggestion.
>>>>>
>>>>> For the first point, I will update the proposal with an exact goal
>>>>> for each storage system type. For the second point, I assume this
>>>>> cache is shared by all VMs running on the same host OS.
>>>>>
>>>>> Regarding cloud documents, I think this is a very good suggestion. Yes,
>>>>> I need to create and maintain a cloud document. It is not only easy to
>>>>> browse; updating and maintaining it during the GSoC cycle will also
>>>>> help us focus on our goals and demonstrate the phased results of
>>>>> development.
>>>>>
>>>>> For now, I will create a Google Doc tomorrow to display the content of
>>>>> the existing proposal.
>>>>>
>>>>> Thanks again for your advice!
>>>>>
>>>>> Manjusaka <[email protected]> wrote on Tue, Mar 19, 2024 at 17:48:
>>>>>
>>>>>     On 2024/3/19 16:58, 余润杰 wrote:
>>>>>     > Hi, Xuanwo and Manjusaka.
>>>>>     >
>>>>>     > I hope this email doesn't bother you!
>>>>>     >
>>>>>     > Applications for GSoC 2024 contributors opened today, and I hope
>>>>>     > to join the GSoC project in Apache OpenDAL as a candidate. I have added you
>>>>>     > to the list of mentors for the ovfs project proposal and hope to have the
>>>>>     > opportunity to be mentored by you!
>>>>>     >
>>>>>     > /Project Mentors: Xuanwo ([email protected]), Manjusaka ([email protected])/
>>>>>     >
>>>>>     > I have supplemented and modified some of the content based on
>>>>>     > the previous proposal, mainly including the following points:
>>>>>     >
>>>>>     > 1) Based on the discussion in the previous email, the name of the
>>>>>     > project was changed from ovirtiofs to ovfs.
>>>>>     >
>>>>>     > 2) Added an explanation of the ovfs design philosophy.
>>>>>     >
>>>>>     > 3) Avoided having ovfs persist any metadata.
>>>>>     >
>>>>>     > 4) Added potential application scenarios.
>>>>>     >
>>>>>     > 5) Added project deliverables.
>>>>>     >
>>>>>     > 6) Added a "Why Me And Why Do I Wish To Take Part In GSoC 2024"
>>>>>     > section.
>>>>>     >
>>>>>     > I hope to submit the proposal this week. I'd like to know if there
>>>>>     > are still areas that need to be revised or discussed before the proposal is
>>>>>     > formally submitted.
>>>>>     >
>>>>>     > Have a nice day!
>>>>>
>>>>>     Hi Runjie
>>>>>
>>>>>     Glad to hear from you!
>>>>>
>>>>>     Nice proposal! BTW, maybe you can upload the document to a website
>>>>>     like Google Docs or Gist, so other people can preview it online (LOL).
>>>>>
>>>>>     This version of the proposal mostly LGTM. I have a few
>>>>>     questions/suggestions:
>>>>>
>>>>>     1. We can set our target to implement only one service backend for
>>>>>     each category of service (like S3 for blob and HDFS for file-like; KV
>>>>>     storage is not in the plan). This will help us focus on the functionality,
>>>>>     not the corner-case behavior.
>>>>>     It would also help us reach the goal of fully tested functionality (I
>>>>>     think it's important for us).
>>>>>
>>>>>     2. About the cache, I would like to ask: is the cache shared by all
>>>>>     the VMs, or would each VM have its own cache?
>>>>>
>>>>>
>>>>>     Thanks for your nice proposal, have a nice day
>>>>>
>>>>>     Best
>>>>>
>>>>>     Manjusaka
>>>>>
>>>>
>>>> Sorry, I forgot something in the previous email.
>>>>
>>>> About the cache, I have another suggestion. I think we should split it
>>>> into two parts: the read cache and the write cache. Users can choose
>>>> which cache to enable based on their circumstances.
>>>>
>>>> For example, if the user mounts an S3 bucket as a backend that is modified at
>>>> high frequency (by other services), they shouldn't enable the
>>>> read cache.
>>>>
>>>> I think this would be good for production usage.
>>>>
>>>> Best
>>>>
>>>> Manjusaka
>>>>
>>>
>>
>> Thanks for making the docs public. I would like to say this is the most
>> extraordinary proposal I have ever seen. Great job!
>>
>> BTW, I'm not sure whether the following should be included in the original
>> proposal; for now, this is just a personal idea.
>>
>> As we know, observability is very important for a service. I think
>> we might need to define and export some metrics to let people know the
>> status of the ovfs daemon service.
>>
>> For now, OpenDAL has some layers available out of the box (like Prometheus,
>> OpenTelemetry, DTrace). I'm not sure whether we should add more metrics,
>> such as cache hit rate.
>>
>> WDYT?
>>
>> Best
>>
>> Manjusaka
>


SGTM
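
As a side note for later reference, the read/write cache split suggested
earlier in this thread could be sketched roughly as below. This is only an
illustration in Python: `CacheConfig`, `recommend_cache`, and the
`externally_modified` flag are hypothetical names, not part of ovfs or
OpenDAL, and keeping the write cache enabled for a shared bucket is an
assumption rather than something decided in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical per-mount cache switches (not an ovfs or OpenDAL API)."""

    read_cache: bool = True
    write_cache: bool = True


def recommend_cache(externally_modified: bool) -> CacheConfig:
    """Suggest cache switches for a mounted backend.

    If other services modify the backend frequently, cached reads could
    return stale data, so the read cache is turned off. Leaving the write
    cache on in that case is an assumption made for this sketch.
    """
    if externally_modified:
        return CacheConfig(read_cache=False, write_cache=True)
    return CacheConfig()


# An S3 bucket that other services also write to.
shared_bucket = recommend_cache(externally_modified=True)
# A bucket used only through this mount.
private_bucket = recommend_cache(externally_modified=False)
```

The two switches are kept independent so that a user can disable only the
side of the cache that is unsafe for their workload.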
