Thank you very much for your suggestions. This topic deserves thorough
discussion, since it touches on the design philosophy and positioning of
ovirtiofs.

My previous design followed the classic architecture of distributed file
systems and stored additional metadata in a KV database. Because KV
databases such as leveldb and rocksdb are used as an embedded library plus
database files, neither users nor ovirtiofs need to maintain a separate
service process. In that design, ovirtiofs moves its state into the
database files and the backend object storage, which makes the process
itself stateless. Given the configuration files and database files
(including the additional metadata), ovirtiofs does not depend on any
in-memory state, so it can be started and restarted at any time. However,
users still need to be aware of the existence and significance of the
database files, and it is hard to keep the metadata synchronized with
external changes to the object storage system. Such changes may also be
difficult to incorporate directly into the ovirtiofs directory tree and
would require special handling rules.
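
To make the first design concrete, here is a minimal sketch of the
"state lives in a database file" idea. It is only an illustration under
assumptions: Python's sqlite3 stands in for an embedded KV library such as
leveldb or rocksdb, and the MetadataStore class, the table layout, the file
name and the object key are all hypothetical, not part of the proposal.

```python
import os
import sqlite3
import tempfile


class MetadataStore:
    """Hypothetical embedded metadata store backed by a single database file.

    sqlite3 is used only as a stand-in for an embedded KV library; no
    separate service process is involved.
    """

    def __init__(self, db_path):
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS meta "
            "(path TEXT PRIMARY KEY, object_key TEXT)"
        )
        self._conn.commit()

    def put(self, path, object_key):
        # Map a file system path to the object that holds its data.
        self._conn.execute(
            "INSERT OR REPLACE INTO meta (path, object_key) VALUES (?, ?)",
            (path, object_key),
        )
        self._conn.commit()

    def get(self, path):
        row = self._conn.execute(
            "SELECT object_key FROM meta WHERE path = ?", (path,)
        ).fetchone()
        return row[0] if row else None

    def close(self):
        self._conn.close()


def demo_restart():
    """Write metadata, simulate a process restart, and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "ovirtiofs-meta.db")  # hypothetical name

        first = MetadataStore(db_path)
        first.put("/docs/readme.txt", "obj-0001")
        first.close()  # the "process" exits; nothing is kept in memory

        second = MetadataStore(db_path)  # the "restarted" process
        value = second.get("/docs/readme.txt")
        second.close()
        return value
```

The sketch also shows the drawback mentioned above: if an object is changed
or deleted directly in the object storage, the mapping in the database file
is not updated unless extra synchronization logic is added.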

If we drop metadata persistence, ovirtiofs has to retrieve the file system
state from the object storage when it restarts. In this scenario, we need
to make some assumptions in order to reconstruct the file system
interface. For example, the name of a bucket represents a complete
directory path, and the objects in the bucket represent the files in that
directory. A file system built on such assumptions has certain
limitations, including potential performance issues such as uneven object
distribution and escaping of metadata operations such as directory
traversal. The benefit, however, is that ovirtiofs only needs a
configuration file to restart and recover. It becomes a stateless service
that can share directories among multiple virtual machines on multiple
physical nodes, which is difficult to achieve with the first design. In
this case, we also do not need to handle the state changes caused by
external modifications to the stored data, because all state information
is managed by the object storage system.
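
The bucket-as-directory assumption can be sketched roughly as follows. This
is a hypothetical illustration, not the actual ovirtiofs design: the
build_tree function and the shape of the listing (a plain dict from
directory path to object names) are assumptions, the listing is an
in-memory stand-in for whatever the object storage returns, and how a
directory path would be encoded into a legal bucket name is left open.

```python
def build_tree(listing):
    """Rebuild a directory tree purely from an object storage listing.

    listing maps a directory path (represented by a bucket name in this
    sketch) to the names of the objects (files) stored in that bucket.
    Directories become nested dicts and files become None entries.
    """
    root = {}
    for dir_path, object_names in listing.items():
        node = root
        # Walk (and create) the directories along the path.
        for part in (p for p in dir_path.split("/") if p):
            child = node.setdefault(part, {})
            if child is None:
                raise ValueError(f"{part!r} is both a file and a directory")
            node = child
        # Every object in the bucket is a file in that directory.
        for name in object_names:
            if isinstance(node.get(name), dict):
                raise ValueError(f"{name!r} is both a file and a directory")
            node[name] = None
    return root


# Hypothetical listing; a real one would come from the storage backend.
example_listing = {
    "/": ["top.txt"],
    "/docs": ["a.txt"],
    "/docs/img": ["logo.png"],
}
example_tree = build_tree(example_listing)
```

Because the tree is derived from the listing on every start, nothing has to
be persisted locally, and an object added externally simply shows up the
next time the listing is read.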

After thinking it over, I now favor the second approach: implementing the
file system interface through assumptions and without metadata
persistence, because it gives ovirtiofs broader usage prospects. I would
like to revise the proposal accordingly: rework the metadata management
design, add a description of stateless service support, and add
descriptions of the documentation work and of usage scenarios.

On Mon, Mar 11, 2024 at 22:50, Xuanwo <[email protected]> wrote:

> Great proposal.
>
> My only question is, can we avoid the persistence of metadata?
>
> I'm thinking of two things:
>
> - I expect virtio to be stateless and easy to recover and deploy, users
> don't need to maintain extra stateful services.
> - External changes to storage services such as S3 and GCS can create
> additional synchronization work.
>
> On Mon, Mar 11, 2024, at 22:44, 余润杰 wrote:
> > Greetings, everyone!
> >
> > I'm Runjie Yu, a student at Huazhong University of Science and
> > Technology. I would like to participate in the OpenDAL GSoC project as a
> > candidate, and I've already prepared a proposal draft. I plan to refine
> > this draft further and would appreciate to receive suggestions before
> > proceeding. I also hope to verify if my design makes sense and meets
> > expectations.
> >
> > This project aims to implement shared directories for virtual machines
> > based on the OpenDAL using virtio technology. This is the relevant issue
> > link: https://github.com/apache/opendal/issues/4133.
> >
> > *Attachments:*
> >  • zjregee_GSoC_2024_OpenDAL_Project_Proposal_Draft.md
> Xuanwo
>
