> For example, we store the metadata into S3 (standard S3, S3-like
> service), and we have Swift, HDFS and many other backend services.
The primary focus of idea 2 is to cache metadata rather than storing it.
Thus, metadata is fetched lazily, allowing users to restart the daemon
whenever necessary.

On Tue, Mar 12, 2024, at 21:44, Manjusaka wrote:
> On 2024/3/12 20:56, 余润杰 wrote:
>> Thank you very much for your suggestions. This should be a topic that
>> requires thorough discussion, involving the design philosophy and
>> positioning of ovirtiofs.
>>
>> My previous design was based on referencing the classic architecture of
>> distributed file systems. The previous proposal stored additional metadata
>> through a KV database. Thanks to KV databases like leveldb, rocksdb, which
>> are used in the form of library and database files, users or ovirtiofs do
>> not need to maintain a separate service process. I believe that, at this
>> point, ovirtiofs transfers its state to the database files and the backend
>> object storage, making the process itself stateless. With configuration
>> files and database files (including additional metadata), ovirtiofs does
>> not maintain any state information in memory, allowing it to be started and
>> restarted arbitrarily. However, users still need to be aware of the
>> existence and significance of the database files, and it is challenging to
>> maintain the state synchronization overhead introduced by external changes
>> to the object storage system. And external changes may be difficult to
>> directly incorporate into the ovirtiofs directory tree, requiring special
>> handling rules.
>>
>> If we do not consider metadata persistence, ovirtiofs needs to retrieve
>> file system state information from the object storage when restarting. In
>> this scenario, we need to make some assumptions to restore the file system
>> interface. For example, the name of a bucket represents a complete
>> directory path, and the objects in the bucket represent the files in the
>> directory.
>> The implementation of a file system based on such assumptions
>> has certain limitations, including potential performance issues such as
>> uneven object distribution and escaping of metadata operations such as
>> directory traversal. However, the benefit is that ovirtiofs only needs a
>> configuration file to restart and recover, providing a stateless service
>> that can share directories among multiple virtual machines on multiple
>> physical nodes, which is difficult to achieve in the first design. In this
>> case, we do not need to consider the state changes brought about by
>> external modifications to the storage system data, as all state information
>> is managed through the object storage system.
>>
>> After thinking about it, I now support the second idea, which is to
>> implement the file system interface through assumptions and without
>> persistence, because at this time ovirtiofs has greater usage prospects. I
>> would like to modify the proposal in these directions, modify the metadata
>> management design, add a description of stateless service support, and add
>> a description of document writing and usage scenarios.
>>
>> On Mon, Mar 11, 2024, at 22:50, Xuanwo <[email protected]> wrote:
>>
>>> Great proposal.
>>>
>>> My only question is, can we avoid the persistence of metadata?
>>>
>>> I'm thinking of two things:
>>>
>>> - I expect virtio to be stateless and easy to recover and deploy, users
>>>   don't need to maintain extra stateful services.
>>> - External changes to storage services such as S3 and GCS can create
>>>   additional synchronization work.
>>>
>>> On Mon, Mar 11, 2024, at 22:44, 余润杰 wrote:
>>>> Greetings, everyone!
>>>>
>>>> I'm Runjie Yu, a student at Huazhong University of Science and
>>>> Technology. I would like to participate in the OpenDAL GSoC project as a
>>>> candidate, and I've already prepared a proposal draft. I plan to refine
>>>> this draft further and would appreciate receiving suggestions before
>>>> proceeding.
>>>> I also hope to verify if my design makes sense and meets
>>>> expectations.
>>>>
>>>> This project aims to implement shared directories for virtual machines
>>>> based on OpenDAL using virtio technology. This is the relevant issue
>>>> link: https://github.com/apache/opendal/issues/4133.
>>>>
>>>> *Attachments:*
>>>> • zjregee_GSoC_2024_OpenDAL_Project_Proposal_Draft.md
>>>
>>> Xuanwo
>>>
>>
>
> Great discussion! I have some experience with VirtioFS, so I
> can mentor this proposal together with Xuanwo.
>
> For me, here's a tough question about your second draft design:
>
>> ovirtiofs needs to retrieve file system state information from the object
>> storage when restarting
>
> I think this means that the service depends on the network status.
>
> For example, we store the metadata into S3 (standard S3, S3-like
> service), and we have Swift, HDFS and many other backend services.
> I think we need to keep the minimal functionality working if S3
> has crashed or we have network issues.
>
> This is just my personal thought. Feel free to ask if you have any questions.
>
> Best
>
> Manjusaka

--
Xuanwo
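
To make the "cache rather than store" point at the top of this reply concrete, here is a minimal Python sketch of lazily fetched, in-memory metadata. It is not taken from the proposal or from OpenDAL: `Metadata`, `LazyMetadataCache`, `fetch_from_backend` and the dict-based `FAKE_BACKEND` are hypothetical names used only for illustration, and the dict stands in for a real object store such as S3.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Metadata:
    size: int
    is_dir: bool


class LazyMetadataCache:
    """In-memory metadata cache filled on first access (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], Metadata]) -> None:
        self._fetch = fetch                      # asks the backend for metadata
        self._entries: Dict[str, Metadata] = {}  # never written to disk
        self.backend_calls = 0                   # counter kept only for illustration

    def stat(self, path: str) -> Metadata:
        cached = self._entries.get(path)
        if cached is not None:
            return cached                        # hit: served from memory
        self.backend_calls += 1
        meta = self._fetch(path)                 # miss: go to the backend
        self._entries[path] = meta
        return meta


# Stand-in for the object store (assumption: a plain dict instead of S3, GCS, ...).
FAKE_BACKEND = {
    "/data/a.txt": Metadata(size=12, is_dir=False),
    "/data/": Metadata(size=0, is_dir=True),
}


def fetch_from_backend(path: str) -> Metadata:
    return FAKE_BACKEND[path]


cache = LazyMetadataCache(fetch_from_backend)
cache.stat("/data/a.txt")  # miss, fetched from the backend
cache.stat("/data/a.txt")  # hit, no second backend call

# A "restart" is just a new, empty cache: nothing has to be restored from disk.
restarted = LazyMetadataCache(fetch_from_backend)
```

Because the cache holds nothing that the backend cannot provide again, dropping it on restart loses no information. The trade-off raised earlier in the thread still applies: every cache miss depends on the backend being reachable.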
