Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Peter-Jan Gootzen
Hi Xiaoguang and others,
 
You have identified an issue that we are also running into and had also
planned to address with the community.
We currently solve this by explicitly telling the virtio-fs device which
architecture is running on the host through a side-channel so that it
can correctly interpret the flags (e.g. O_DIRECT).
As others are also running into this with the increased popularity of
hardware virtio-fs devices, let's get into this issue.
I have included the FUSE maintainer Milkos Szeredi and the Virtio
maintainer+specification chair Michael S. Tsirkin in this thread.

The core issue here lies in the fact that these fcntl.h definitions are
different per architecture (as of now there are 9 fcntl.h headers). More
specifically, the real numbers defined in the header are different, on
top of possible endianness differences.
FUSE has a mechanism to detect endianness differences with the 32bit
in.opcode value (via FUSE_INIT_BSWAP_RESERVED).
However, there is no mechanism in FUSE or virtio-fs to tell the FUSE
file system or the virtio-fs device which architecture is running on the
host. Thus, the virtio-fs device currently cannot know how to correctly
interpret the fcntl.h flags.
For example, we are dealing with systems that have ARM64 host and ARM64
virtio-fs device, or x86 host and ARM64 virtio-fs device.
As FUSE already contains the mechanism for endianness, it would make
sense to also include a mechanism for the architecture in FUSE to solve
this issue.


We would like to make a proposal regarding our idea for solving this
issue before sending in a patch:
Use a uint32_t from the unused array in FUSE_INIT to encode an `uint32_t
arch_indicator` that contains one of the architecture IDs specified in a
new enum (is there an existing enum like such?):
enum fuse_arch_indicator {
FUSE_ARCH_NONE = 0,
FUSE_ARCH_X86 = 1,
FUSE_ARCH_ARM64 = 2,
...
}
Through this the host tells the FUSE file system which version of
fcntl.h it will use.
The FUSE file system should keep a copy of all the possible fcntl
headers and use the one indicated by the `fuse_init_in.arch_indicator`.

For backwards compatibility, a minor version bump is needed. A new file
system implementation connected to an old driver will see the
FUSE_ARCH_NONE or the old minor version, and it will know that it cannot
read the `arch_indicator` and that it cannot make better assumptions
than previously possible.

This would be a minimal, backwards compatible change that extends the
current FUSE portability scheme and doesn't require any specification
changes. When the time comes that a new architecture is introduced with
its own fcntl.h we must simply add another enumerator and an ifdef to
the code setting the `arch_indicator`.

 
- Peter-Jan

On Thu, 2024-05-30 at 09:31 +, Lege Wang wrote:
> External email: Use caution opening links or attachments
> 
> 
> Hello,
> 
> I see that you have added multi-queue support for virtio-fs, thanks
> for this work.
> From your patch's commit log, your host is x86-64, dpu is arm64, but
> there're
> differences about O_DIRECT and O_DIRECTORY between these two
> architectures.
> 
> Test program:
> #define _GNU_SOURCE
> 
> #include 
> #include 
> 
> int main(void)
> {
>     printf("O_DIRECT:%o\n", O_DIRECT);
>     printf("O_DIRECTORY:%o\n", O_DIRECTORY);
>     return 0;
> }
> 
> In x86-64, this test program outputs:
> O_DIRECT:4
> O_DIRECTORY:20
> 
> But in arm64, it outpus:
> O_DIRECT:20
> O_DIRECTORY:4
> 
> In kernel fuse module, fuse_create_in->flags will used to hold
> open(2)'s flags, then
> a O_DIRECT flag from host(x86) would be treated as O_DIRECTORY in
> dpu(arm64), which
> seems a serious bug.
> 
> From your fio job, you use libaio engine, so it's assumed that direct-
> io is
> enabled, so I wonder why you don't get errors. Could you please try
> below
> command in your virtio-fs mount point:
>   dd if=/dev/zero of=tst_file bs=4096 count=1 oflag=direct
> to see whether it occurs any error.
> 
> Regards,
> Xiaoguang Wang




Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Miklos Szeredi
On Mon, 3 Jun 2024 at 10:02, Peter-Jan Gootzen  wrote:

> We would like to make a proposal regarding our idea for solving this
> issue before sending in a patch:
> Use a uint32_t from the unused array in FUSE_INIT to encode an `uint32_t
> arch_indicator` that contains one of the architecture IDs specified in a
> new enum (is there an existing enum like such?):
> enum fuse_arch_indicator {
> FUSE_ARCH_NONE = 0,
> FUSE_ARCH_X86 = 1,
> FUSE_ARCH_ARM64 = 2,
> ...
> }
> Through this the host tells the FUSE file system which version of
> fcntl.h it will use.
> The FUSE file system should keep a copy of all the possible fcntl
> headers and use the one indicated by the `fuse_init_in.arch_indicator`.

To be clear: you propose that the fuse client (in the VM kernel) sets
the arch indicator and the server (on the host) translates constants?

That sounds like a good plan.

Alternatively the client would optionally translate to a common set of
constants (x86 would be a good choice) and then the server would only
need to know the translation between x86 and native.

What about errno?  Any other arch specific constants?

Thanks,
Miklos



Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Peter-Jan Gootzen
On Mon, 2024-06-03 at 10:24 +0200, Miklos Szeredi wrote:
> On Mon, 3 Jun 2024 at 10:02, Peter-Jan Gootzen 
> wrote:
> 
> > We would like to make a proposal regarding our idea for solving this
> > issue before sending in a patch:
> > Use a uint32_t from the unused array in FUSE_INIT to encode an
> > `uint32_t
> > arch_indicator` that contains one of the architecture IDs specified
> > in a
> > new enum (is there an existing enum like such?):
> > enum fuse_arch_indicator {
> >     FUSE_ARCH_NONE = 0,
> >     FUSE_ARCH_X86 = 1,
> >     FUSE_ARCH_ARM64 = 2,
> >     ...
> > }
> > Through this the host tells the FUSE file system which version of
> > fcntl.h it will use.
> > The FUSE file system should keep a copy of all the possible fcntl
> > headers and use the one indicated by the
> > `fuse_init_in.arch_indicator`.
> 
> To be clear: you propose that the fuse client (in the VM kernel) sets
> the arch indicator and the server (on the host) translates constants?
Correct. Or in our case of virtio-fs, the FUSE server running behind the
virtio-fs device (possibly on another architecture) will translate
constants before sending them to the host.

> 
> That sounds like a good plan.
> 
> Alternatively the client would optionally translate to a common set of
> constants (x86 would be a good choice) and then the server would only
> need to know the translation between x86 and native.
We also considered this idea, it would kind of be like locking FUSE into
being x86. However I think this is not backwards compatible. Currently
an ARM64 client and ARM64 server work just fine. But making such a
change would break if the client has the new driver version and the
server is not updated to know that it should interpret x86 specifically.

> 
> What about errno?  Any other arch specific constants?
Yes there may be other arch specific constants that we need to consider.
I don't think errno is already an issue for us as there x86 and ARM64
are mostly the same, and everything in errno-base.h is already
compatible and are luckily most used in file systems.
But this proposed change would also help possible issues there.

> 
> Thanks,
> Miklos

- Peter-Jan



Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Miklos Szeredi
On Mon, 3 Jun 2024 at 10:53, Peter-Jan Gootzen  wrote:

> We also considered this idea, it would kind of be like locking FUSE into
> being x86. However I think this is not backwards compatible. Currently
> an ARM64 client and ARM64 server work just fine. But making such a
> change would break if the client has the new driver version and the
> server is not updated to know that it should interpret x86 specifically.

This would need to be negotiated, of course.

But it's certainly simpler to just indicate the client arch in the
INIT request.   Let's go with that for now.

Thanks,
Miklos



Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Stefan Hajnoczi
On Mon, Jun 03, 2024 at 11:06:19AM +0200, Miklos Szeredi wrote:
> On Mon, 3 Jun 2024 at 10:53, Peter-Jan Gootzen  wrote:
> 
> > We also considered this idea, it would kind of be like locking FUSE into
> > being x86. However I think this is not backwards compatible. Currently
> > an ARM64 client and ARM64 server work just fine. But making such a
> > change would break if the client has the new driver version and the
> > server is not updated to know that it should interpret x86 specifically.
> 
> This would need to be negotiated, of course.
> 
> But it's certainly simpler to just indicate the client arch in the
> INIT request.   Let's go with that for now.

In the long term it would be cleanest to choose a single canonical
format instead of requiring drivers and devices to implement many
arch-specific formats. I liked the single canonical format idea you
suggested.

My only concern is whether there are more commands/fields in FUSE that
operate in an arch-specific way (e.g. ioctl)? If there really are parts
that need to be arch-specific, then it might be necessary to negotiate
an architecture after all.

Stefan

> 
> Thanks,
> Miklos
> 


signature.asc
Description: PGP signature


Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Miklos Szeredi
On Mon, Jun 3, 2024 at 3:44 PM Stefan Hajnoczi  wrote:
>
> On Mon, Jun 03, 2024 at 11:06:19AM +0200, Miklos Szeredi wrote:
> > On Mon, 3 Jun 2024 at 10:53, Peter-Jan Gootzen  wrote:
> >
> > > We also considered this idea, it would kind of be like locking FUSE into
> > > being x86. However I think this is not backwards compatible. Currently
> > > an ARM64 client and ARM64 server work just fine. But making such a
> > > change would break if the client has the new driver version and the
> > > server is not updated to know that it should interpret x86 specifically.
> >
> > This would need to be negotiated, of course.
> >
> > But it's certainly simpler to just indicate the client arch in the
> > INIT request.   Let's go with that for now.
>
> In the long term it would be cleanest to choose a single canonical
> format instead of requiring drivers and devices to implement many
> arch-specific formats. I liked the single canonical format idea you
> suggested.
>
> My only concern is whether there are more commands/fields in FUSE that
> operate in an arch-specific way (e.g. ioctl)? If there really are parts
> that need to be arch-specific, then it might be necessary to negotiate
> an architecture after all.

How about something like this:

 - by default fall back to no translation for backward compatibility
 - server may request matching by sending its own arch identifier in
fuse_init_in
 - client sends back its arch identifier in fuse_init_out
 - client also sends back a flag indicating whether it will transform
to canonical or not

This means the client does not have to implement translation for every
architecture, only ones which are frequently used as guest.  The
server may opt to implement its own translation if it's lacking in the
client, or it can just fail.

We need to look at all the requests, if there are some other constants
that need to be transformed.

As for ioctl, the client cannot promise to transform everything, since
most are not interpreted by the kernel.  Ones which are, should be
transformed.

Thanks,
Miklos




Re: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Stefan Hajnoczi
On Mon, Jun 03, 2024 at 04:56:14PM +0200, Miklos Szeredi wrote:
> On Mon, Jun 3, 2024 at 3:44 PM Stefan Hajnoczi  wrote:
> >
> > On Mon, Jun 03, 2024 at 11:06:19AM +0200, Miklos Szeredi wrote:
> > > On Mon, 3 Jun 2024 at 10:53, Peter-Jan Gootzen  
> > > wrote:
> > >
> > > > We also considered this idea, it would kind of be like locking FUSE into
> > > > being x86. However I think this is not backwards compatible. Currently
> > > > an ARM64 client and ARM64 server work just fine. But making such a
> > > > change would break if the client has the new driver version and the
> > > > server is not updated to know that it should interpret x86 specifically.
> > >
> > > This would need to be negotiated, of course.
> > >
> > > But it's certainly simpler to just indicate the client arch in the
> > > INIT request.   Let's go with that for now.
> >
> > In the long term it would be cleanest to choose a single canonical
> > format instead of requiring drivers and devices to implement many
> > arch-specific formats. I liked the single canonical format idea you
> > suggested.
> >
> > My only concern is whether there are more commands/fields in FUSE that
> > operate in an arch-specific way (e.g. ioctl)? If there really are parts
> > that need to be arch-specific, then it might be necessary to negotiate
> > an architecture after all.
> 
> How about something like this:
> 
>  - by default fall back to no translation for backward compatibility
>  - server may request matching by sending its own arch identifier in
> fuse_init_in
>  - client sends back its arch identifier in fuse_init_out
>  - client also sends back a flag indicating whether it will transform
> to canonical or not
> 
> This means the client does not have to implement translation for every
> architecture, only ones which are frequently used as guest.  The
> server may opt to implement its own translation if it's lacking in the
> client, or it can just fail.

From the client perspective:

1. Do not negotiate arch in fuse_init_out - hopefully client and server
   know what they are doing :). This is the current behavior.
2. Reply to fuse_init_in with server's arch in fuse_init_out - client
   translates according to server's arch.
3. Reply to fuse_init_in with canonical flag set in fuse_init_out -
   client and server use canonical format.

From the server perspective:

1. Client does not negotiate arch - the current behavior (good luck!).
2. Arch received in fuse_init_out from client - must be equal to
   server's arch since there is no way for the server to reject the
   arch.
3. Canonical flag received in fuse_init_out from client - client and
   server use canonical format.

Is this what you had in mind?

Stefan


signature.asc
Description: PGP signature


RE: Addressing architectural differences between FUSE driver and fs - Re: virtio-fs tests between host(x86) and dpu(arm64)

2024-06-03 Thread Lege Wang
Hello,

> 
> On Mon, Jun 03, 2024 at 11:06:19AM +0200, Miklos Szeredi wrote:
> > On Mon, 3 Jun 2024 at 10:53, Peter-Jan Gootzen 
> wrote:
> >
> > > We also considered this idea, it would kind of be like locking FUSE into
> > > being x86. However I think this is not backwards compatible. Currently
> > > an ARM64 client and ARM64 server work just fine. But making such a
> > > change would break if the client has the new driver version and the
> > > server is not updated to know that it should interpret x86 specifically.
> >
> > This would need to be negotiated, of course.
> >
> > But it's certainly simpler to just indicate the client arch in the
> > INIT request.   Let's go with that for now.
> 
> In the long term it would be cleanest to choose a single canonical
> format instead of requiring drivers and devices to implement many
> arch-specific formats. I liked the single canonical format idea you
> suggested.
Agree, I also think we should use canonical format for cases that client and
server have different arches.

Regards,
Xiaoguang Wang
> 
> My only concern is whether there are more commands/fields in FUSE that
> operate in an arch-specific way (e.g. ioctl)? If there really are parts
> that need to be arch-specific, then it might be necessary to negotiate
> an architecture after all.
> 
> Stefan
> 
> >
> > Thanks,
> > Miklos
> >