Re: Emulating missing linux syscalls project

2023-03-29 Thread Martin Husemann
On Tue, Mar 28, 2023 at 03:32:35PM -0700, Zophiel wrote:
>  - What binary's in the current NetBSD stack does not run compact_test as a
> test case increasing order of missing test cases ?

The only compat_* tests that run regularly are those for compat_netbsd32,
e.g. testing a NetBSD/i386 userland on a NetBSD/amd64 machine.

Martin


Re: Emulating missing linux syscalls

2022-04-20 Thread Piyush Sachdeva
On Tue, Apr 19, 2022 at 4:38 AM Joerg Sonnenberger  wrote:
>
> On Tue, Apr 19, 2022 at 02:39:44AM +0530, Piyush Sachdeva wrote:
> > On Sat, Apr 16, 2022 at 2:06 AM Joerg Sonnenberger  wrote:
> > >
> > > On Wed, Apr 13, 2022 at 09:51:31PM - Christos Zoulas wrote:
> > > > In article , Joerg Sonnenberger  
> > > >  wrote:
> > > > > >On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
> > > >
> > > > >splice(2) as a concept is much older than the current Linux 
> > > > >implementation.
> > > > >There is no reason why zero-copying for sockets should require a
> > > > >different system call for zero-copying from/to pipes. There are valid
> > > > >reasons for other combinations, too. Consider /bin/cp for example.
> > > >
> > > > You don't need two system calls because the kernel knows the type of
> > > > the file descriptors and can dispatch to different implementations.
> > > > One of the questions is do you provide the means to pass an additional
> > > > header/trailer to the output data like FreeBSD does for its sendfile(2)
> > > > implementation?
> > > >
> > > > int
> > > > splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len,
> > > > const struct {
> > > >   struct iov *head;
> > > >   size_t headcnt;
> > > >   struct iov *tail;
> > > >   size_t tailcnt;
> > > > } *ht, int flags);
> > >
> > > There are essentially two use cases here:
> > > (1) I want a simple interface to transfer data from one fd to another
> > > without extra copies.
> > >
> > > > (2) I want to avoid copies AND I want to avoid system calls.
> > >
> > > For the former:
> > > int splice(int dstfd, int srcfd, off_t *len);
> > >
> > > is more than good enough. "Transfer up to [*len] octets from srcfd to
> > > dstfd, updating [len] with the actually transferred amount and returning
> > > > the first error if any."
> > >
> > > For the second category, an interface more like the posix_spawn
> > > interface (but without all the extra allocations) would be useful.
> > >
> >
> > Therefore, having the above const struct *ht to support
> > mmap() will be a good option I guess.
>
> It covers a very limited subset of the desired options. Basically, what
> you want in this case is something like:
>
> int splicev(int dstfd, struct spliceop ops[], size_t *lenops, off_t
> *outoff);
>
> where struct spliceop is used to specify the supported operations:
> - read from a fd with possible seek
> - read from memory
> - seek output
> and maybe other operations I can't think of right now. lenops provides
> the number of operations in input and the remaining operations on
> return, outoff is the remaining output in the current block. Some
> variant of this might be possible.

Thank you Joerg and Christos for helping me with this.
I have successfully submitted a proposal for this project through the GSoC
portal. Hope to make the cut this time :)

-- 
Regards,
Piyush


Re: Emulating missing linux syscalls

2022-04-18 Thread Joerg Sonnenberger
On Tue, Apr 19, 2022 at 02:39:44AM +0530, Piyush Sachdeva wrote:
> On Sat, Apr 16, 2022 at 2:06 AM Joerg Sonnenberger  wrote:
> >
> > On Wed, Apr 13, 2022 at 09:51:31PM - Christos Zoulas wrote:
> > > In article , Joerg Sonnenberger   
> > > wrote:
> > > >On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
> > >
> > > >splice(2) as a concept is much older than the current Linux 
> > > >implementation.
> > > >There is no reason why zero-copying for sockets should require a
> > > >different system call for zero-copying from/to pipes. There are valid
> > > >reasons for other combinations, too. Consider /bin/cp for example.
> > >
> > > You don't need two system calls because the kernel knows the type of
> > > the file descriptors and can dispatch to different implementations.
> > > One of the questions is do you provide the means to pass an additional
> > > header/trailer to the output data like FreeBSD does for its sendfile(2)
> > > implementation?
> > >
> > > int
> > > splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len,
> > > const struct {
> > >   struct iov *head;
> > >   size_t headcnt;
> > >   struct iov *tail;
> > >   size_t tailcnt;
> > > } *ht, int flags);
> >
> > There are essentially two use cases here:
> > (1) I want a simple interface to transfer data from one fd to another
> > without extra copies.
> >
> > > (2) I want to avoid copies AND I want to avoid system calls.
> >
> > For the former:
> > int splice(int dstfd, int srcfd, off_t *len);
> >
> > is more than good enough. "Transfer up to [*len] octets from srcfd to
> > dstfd, updating [len] with the actually transferred amount and returning
> > > the first error if any."
> >
> > For the second category, an interface more like the posix_spawn
> > interface (but without all the extra allocations) would be useful.
> >
> 
> Therefore, having the above const struct *ht to support
> mmap() will be a good option I guess.

It covers a very limited subset of the desired options. Basically, what
you want in this case is something like:

int splicev(int dstfd, struct spliceop ops[], size_t *lenops, off_t
*outoff);

where struct spliceop is used to specify the supported operations:
- read from a fd with possible seek
- read from memory
- seek output
and maybe other operations I can't think of right now. lenops provides
the number of operations in input and the remaining operations on
return, outoff is the remaining output in the current block. Some
variant of this might be possible.
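
To make the shape of such an interface more concrete, here is a rough
userland model of it. Everything beyond the splicev() prototype above is
an assumption made for illustration: the fields of struct spliceop, the
SPLICEOP_* names, the function name splicev_model(), and the reading of
outoff as "octets still unsent in the operation that stopped". It copies
through a bounce buffer, so it shows only the control flow, not zero-copy.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

enum spliceop_kind { SPLICEOP_READ_FD, SPLICEOP_READ_MEM, SPLICEOP_SEEK_OUT };

/* Hypothetical layout; the thread names the struct but not its fields. */
struct spliceop {
	enum spliceop_kind kind;
	int fd;			/* READ_FD: source descriptor */
	off_t off;		/* READ_FD: source offset, -1 = no seek; SEEK_OUT: target */
	const void *base;	/* READ_MEM: source buffer */
	size_t len;		/* READ_FD, READ_MEM: octets to transfer */
};

/* Write n octets, retrying short writes; *done counts what went out. */
static int
write_all(int fd, const char *p, size_t n, size_t *done)
{
	for (*done = 0; *done < n; ) {
		ssize_t nw = write(fd, p + *done, n - *done);
		if (nw == -1 && errno == EINTR)
			continue;
		if (nw == -1)
			return errno;
		*done += (size_t)nw;
	}
	return 0;
}

/* Run one READ_FD op through a bounce buffer; *left counts unsent octets. */
static int
run_read_fd(int dstfd, const struct spliceop *op, size_t *left)
{
	char buf[4096];
	size_t w;
	int error;

	if (op->off != -1 && lseek(op->fd, op->off, SEEK_SET) == -1)
		return errno;
	while (*left > 0) {
		size_t chunk = *left < sizeof(buf) ? *left : sizeof(buf);
		ssize_t nr = read(op->fd, buf, chunk);
		if (nr == -1 && errno == EINTR)
			continue;
		if (nr == -1)
			return errno;
		if (nr == 0) {		/* EOF: treat the op as complete */
			*left = 0;
			break;
		}
		error = write_all(dstfd, buf, (size_t)nr, &w);
		*left -= w;
		if (error != 0)
			return error;
	}
	return 0;
}

/* Returns 0 or the first errno value; updates *lenops and *outoff. */
int
splicev_model(int dstfd, struct spliceop ops[], size_t *lenops, off_t *outoff)
{
	size_t i, n, w, left = 0;
	int error = 0;

	assert(lenops != NULL && outoff != NULL);
	n = *lenops;
	for (i = 0; i < n; i++) {
		left = 0;
		switch (ops[i].kind) {
		case SPLICEOP_SEEK_OUT:
			if (lseek(dstfd, ops[i].off, SEEK_SET) == -1)
				error = errno;
			break;
		case SPLICEOP_READ_MEM:
			error = write_all(dstfd, ops[i].base, ops[i].len, &w);
			left = ops[i].len - w;
			break;
		case SPLICEOP_READ_FD:
			left = ops[i].len;
			error = run_read_fd(dstfd, &ops[i], &left);
			break;
		}
		if (error != 0)
			break;
	}
	*lenops = n - i;	/* ops not completed, including a failed one */
	*outoff = (off_t)left;	/* octets still unsent in the op that stopped */
	return error;
}
```

A caller would fill an array of operations, pass its length in *lenops,
and on error resume from the first unfinished entry.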

Joerg


Re: Emulating missing linux syscalls

2022-04-18 Thread Piyush Sachdeva
On Sat, Apr 16, 2022 at 2:06 AM Joerg Sonnenberger  wrote:
>
> On Wed, Apr 13, 2022 at 09:51:31PM - Christos Zoulas wrote:
> > In article , Joerg Sonnenberger   
> > wrote:
> > >On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
> >
> > >splice(2) as a concept is much older than the current Linux implementation.
> > >There is no reason why zero-copying for sockets should require a
> > >different system call for zero-copying from/to pipes. There are valid
> > >reasons for other combinations, too. Consider /bin/cp for example.
> >
> > You don't need two system calls because the kernel knows the type of
> > the file descriptors and can dispatch to different implementations.
> > One of the questions is do you provide the means to pass an additional
> > header/trailer to the output data like FreeBSD does for its sendfile(2)
> > implementation?
> >
> > int
> > splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len,
> > const struct {
> >   struct iov *head;
> >   size_t headcnt;
> >   struct iov *tail;
> >   size_t tailcnt;
> > } *ht, int flags);
>
> There are essentially two use cases here:
> (1) I want a simple interface to transfer data from one fd to another
> without extra copies.
>
> > (2) I want to avoid copies AND I want to avoid system calls.
>
> For the former:
> int splice(int dstfd, int srcfd, off_t *len);
>
> is more than good enough. "Transfer up to [*len] octets from srcfd to
> dstfd, updating [len] with the actually transferred amount and returning
> > the first error if any."
>
> For the second category, an interface more like the posix_spawn
> interface (but without all the extra allocations) would be useful.
>

Therefore, having the above const struct *ht to support
mmap() will be a good option I guess.

> > >I was saying that the Linux system call can be implemented without a
> > >kernel backend, because I don't consider zero copy a necessary part of
> > >the interface contract. It's a perfectly valid, if a bit slower
> > > >implementation to allocate a kernel buffer and do I/O via that.
> >
> > Of course, but how do you make an existing binary use it? LD_PRELOAD
> > a binary to override the symbol in the linux glibc? By that logic you
> > don't need an in kernel linux emulation, you can do it all in userland :-)
>
> You still provide the system call as front end, but internally implement
> it on top of regular read/write to a temporary buffer.
>

Got it, thank you Joerg!

-- 
Regards,
Piyush


Re: Emulating missing linux syscalls

2022-04-15 Thread Joerg Sonnenberger
On Wed, Apr 13, 2022 at 09:51:31PM - Christos Zoulas wrote:
> In article , Joerg Sonnenberger   
> wrote:
> >On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
> 
> >splice(2) as a concept is much older than the current Linux implementation.
> >There is no reason why zero-copying for sockets should require a
> >different system call for zero-copying from/to pipes. There are valid
> >reasons for other combinations, too. Consider /bin/cp for example.
> 
> You don't need two system calls because the kernel knows the type of
> the file descriptors and can dispatch to different implementations.
> One of the questions is do you provide the means to pass an additional
> header/trailer to the output data like FreeBSD does for its sendfile(2)
> implementation?
> 
> int
> splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len, 
> const struct {
>   struct iov *head;
>   size_t headcnt;
>   struct iov *tail;
>   size_t tailcnt;
> } *ht, int flags);

There are essentially two use cases here:
(1) I want a simple interface to transfer data from one fd to another
without extra copies.

(2) I want to avoid copies AND I want to avoid system calls.

For the former:
int splice(int dstfd, int srcfd, off_t *len);

is more than good enough. "Transfer up to [*len] octets from srcfd to
dstfd, updating [len] with the actually transferred amount and returning
the first error if any."
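
As an illustration of those semantics only, a minimal userland sketch
follows. The name splice_simple() and the 8 KiB buffer are assumptions,
and the copy goes through a bounce buffer, so this models the contract
(up to *len octets, *len updated, first error returned) rather than a
zero-copy implementation.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userland model of the proposed
 *     int splice(int dstfd, int srcfd, off_t *len);
 * Returns 0 on success or the first errno value encountered;
 * *len is updated to the number of octets actually transferred.
 */
int
splice_simple(int dstfd, int srcfd, off_t *len)
{
	char buf[8192];
	off_t want, done = 0;
	int error = 0;

	assert(len != NULL);
	want = *len;

	while (done < want) {
		size_t chunk = sizeof(buf);
		ssize_t nr, off = 0;

		if ((off_t)chunk > want - done)
			chunk = (size_t)(want - done);

		nr = read(srcfd, buf, chunk);
		if (nr == 0)
			break;			/* end of input */
		if (nr == -1) {
			if (errno == EINTR)
				continue;
			error = errno;
			break;
		}

		/* write(2) may be short; loop until the chunk is out. */
		while (off < nr) {
			ssize_t nw = write(dstfd, buf + off, (size_t)(nr - off));
			if (nw == -1) {
				if (errno == EINTR)
					continue;
				error = errno;
				break;
			}
			off += nw;
		}
		done += off;
		if (error != 0)
			break;
	}

	*len = done;
	return error;
}
```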

For the second category, an interface more like the posix_spawn
interface (but without all the extra allocations) would be useful.

> >I was saying that the Linux system call can be implemented without a
> >kernel backend, because I don't consider zero copy a necessary part of
> >the interface contract. It's a perfectly valid, if a bit slower
> >implementation to allocate a kernel buffer and do I/O via that.
> 
> Of course, but how do you make an existing binary use it? LD_PRELOAD
> a binary to override the symbol in the linux glibc? By that logic you
> don't need an in kernel linux emulation, you can do it all in userland :-)

You still provide the system call as front end, but internally implement
it on top of regular read/write to a temporary buffer.

Joerg


Re: Emulating missing linux syscalls

2022-04-15 Thread Piyush Sachdeva
On Thu, Apr 14, 2022 at 3:22 AM Christos Zoulas  wrote:
>
> In article , Joerg Sonnenberger   
> wrote:
> >On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
>
> >splice(2) as a concept is much older than the current Linux implementation.
> >There is no reason why zero-copying for sockets should require a
> >different system call for zero-copying from/to pipes. There are valid
> >reasons for other combinations, too. Consider /bin/cp for example.
>

I was under the assumption that zero-copying would be a preference.
I did go through /bin/cp and the important copy_file() function. There
mmap() is being used and then data is written to the destination using
write(2) in chunks. Thanks for this Joerg!
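
The pattern described above (map the source, then write it out in
chunks) can be sketched roughly as follows. This is not the actual
/bin/cp source; the function name mmap_copy_path(), the 64 KiB chunk
size, and mapping the whole file at once are assumptions made to keep
the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_CHUNK	(64 * 1024)	/* hypothetical chunk size */

/* Copy the regular file at src to dstfd; returns 0 or an errno value. */
int
mmap_copy_path(const char *src, int dstfd)
{
	struct stat st;
	int fd, error = 0;

	assert(src != NULL);
	if ((fd = open(src, O_RDONLY)) == -1)
		return errno;
	if (fstat(fd, &st) == -1) {
		error = errno;
		close(fd);
		return error;
	}

	if (st.st_size > 0) {
		size_t size = (size_t)st.st_size, off = 0;
		char *p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);

		if (p == MAP_FAILED) {
			error = errno;
		} else {
			/* The source is never read(2); only the write side copies. */
			while (off < size) {
				size_t n = size - off;
				ssize_t nw;

				if (n > COPY_CHUNK)
					n = COPY_CHUNK;
				nw = write(dstfd, p + off, n);
				if (nw == -1) {
					if (errno == EINTR)
						continue;
					error = errno;
					break;
				}
				off += (size_t)nw;
			}
			munmap(p, size);
		}
	}
	close(fd);
	return error;
}
```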

> You don't need two system calls because the kernel knows the type of
> the file descriptors and can dispatch to different implementations.

Yes. Therefore, I am assuming that only one general splice(2) function
will be implemented and, if it is given a socket fd, it will behave like
sendfile(2) (as is also clear from the function definition you have
provided below). sendfile(2) would then be added as well and would simply
invoke splice(2).

> One of the questions is do you provide the means to pass an additional
> header/trailer to the output data like FreeBSD does for its sendfile(2)
> implementation?
>
> int
> splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len,
> const struct {
> struct iov *head;
> size_t headcnt;
> struct iov *tail;
> size_t tailcnt;
> } *ht, int flags);
>

I will be more than happy to provide the functionality
(taking reference from writev(2) for the struct iovec
and the FreeBSD implementation of sendfile(2)).

> >I was saying that the Linux system call can be implemented without a
> >kernel backend, because I don't consider zero copy a necessary part of
> >the interface contract. It's a perfectly valid, if a bit slower
> >implementation to allocate a kernel buffer and do I/O via that.
>
Right, Joerg!


I was initially also hoping to broaden the project by actually adding the
syscall to the NetBSD kernel as well (which adds a feature) and then have
the compat_linux layer profit from that call, unless that is something you
are trying to avoid or steer away from.

Now the final question for me is about the splice() prototype that you
just mentioned above, Christos. Is that for a NetBSD syscall (as I would
hope, given the struct iovec parameter), with both splice(2) and
sendfile(2) implemented in the compat_linux layer on top of this syscall?
Or is it just splice(2) and sendfile(2) (which will call splice(2)), both
in the Linux layer only?

> Of course, but how do you make an existing binary use it? LD_PRELOAD
> a binary to override the symbol in the linux glibc? By that logic you
> don't need an in kernel linux emulation, you can do it all in userland :-)
>

Christos, could you shed some more light on this?

I guess this will make a great proposal, and I hope to send you something
by Monday for a first pass.

Hope to hear from you soon.
-- 
Regards,
Piyush


Re: Emulating missing linux syscalls

2022-04-13 Thread Christos Zoulas
In article , Joerg Sonnenberger   wrote:
>On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:

>splice(2) as a concept is much older than the current Linux implementation.
>There is no reason why zero-copying for sockets should require a
>different system call for zero-copying from/to pipes. There are valid
>reasons for other combinations, too. Consider /bin/cp for example.

You don't need two system calls because the kernel knows the type of
the file descriptors and can dispatch to different implementations.
One of the questions is do you provide the means to pass an additional
header/trailer to the output data like FreeBSD does for its sendfile(2)
implementation?

int
splice(int infd, off_t *inoff, int outfd, off_t *outoff, size_t len, 
const struct {
struct iov *head;
size_t headcnt;
struct iov *tail;
size_t tailcnt;
} *ht, int flags);
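
For illustration, the header/trailer idea can be modelled in userland
with writev(2) around a plain copy loop. The names struct splice_ht,
put_iov() and copy_with_ht() are hypothetical, struct iovec is used in
place of the "struct iov" shorthand above, offsets and flags are left
out, and a short writev(2) is simply reported as EIO to keep the sketch
small.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical named version of the anonymous struct in the prototype. */
struct splice_ht {
	struct iovec *head;
	size_t headcnt;
	struct iovec *tail;
	size_t tailcnt;
};

/* Emit one iovec array; a short writev(2) is reported as EIO for brevity. */
static int
put_iov(int fd, const struct iovec *iov, size_t cnt)
{
	size_t i, want = 0;
	ssize_t n;

	if (cnt == 0)
		return 0;
	assert(iov != NULL);
	for (i = 0; i < cnt; i++)
		want += iov[i].iov_len;
	if ((n = writev(fd, iov, (int)cnt)) == -1)
		return errno;
	return (size_t)n == want ? 0 : EIO;
}

/* Send head, then up to len octets from infd, then tail. */
int
copy_with_ht(int outfd, int infd, size_t len, const struct splice_ht *ht)
{
	char buf[4096];
	int error;

	if (ht != NULL && (error = put_iov(outfd, ht->head, ht->headcnt)) != 0)
		return error;

	while (len > 0) {
		size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t off = 0, nr = read(infd, buf, chunk);

		if (nr == 0)
			break;
		if (nr == -1) {
			if (errno == EINTR)
				continue;
			return errno;
		}
		while (off < nr) {
			ssize_t nw = write(outfd, buf + off, (size_t)(nr - off));
			if (nw == -1) {
				if (errno == EINTR)
					continue;
				return errno;
			}
			off += nw;
		}
		len -= (size_t)nr;
	}

	return ht != NULL ? put_iov(outfd, ht->tail, ht->tailcnt) : 0;
}
```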

>I was saying that the Linux system call can be implemented without a
>kernel backend, because I don't consider zero copy a necessary part of
>the interface contract. It's a perfectly valid, if a bit slower
>implementation to allocate a kernel buffer and do I/O via that.

Of course, but how do you make an existing binary use it? LD_PRELOAD
a binary to override the symbol in the linux glibc? By that logic you
don't need an in kernel linux emulation, you can do it all in userland :-)

christos



Re: Emulating missing linux syscalls

2022-04-13 Thread Joerg Sonnenberger
On Tue, Apr 12, 2022 at 04:56:05PM - Christos Zoulas wrote:
> In article , Joerg Sonnenberger   
> wrote:
> >On Tue, Apr 12, 2022 at 12:29:21PM - Christos Zoulas wrote:
> >> In article
> >,
> >> Piyush Sachdeva   wrote:
> >> >
> >> >Dear Stephen Borrill,
> >> >My name is Piyush, and I was looking into the
> >> >'Emulating missing Linux syscalls' project hoping to contribute
> >> >to this year's GSoC.
> >> >
> >> >I wanted to be sure of a few basic things before I go ahead:
> >> >- linux binaries are found in- src/sys/compat/linux
> >> >- particular implementation in - src/sys/compat/linux/common
> >> >- a few architecture-specific implementations in-
> >> >  src/sys/compat/linux/arch/.
> >> >- The src/sys/compat/linux/arch//linux_syscalls.c file
> >> >   lists of system calls, and states if a particular syscall is present or
> >> >not.
> >> >
> >> >I was planning to work on the 'sendfile()' syscall, which I believe
> >> >is unimplemented for amd64 and a few other architectures as well.
> >> >
> >> >Considering the above points, I was hoping you could point me in
> >> >the right direction for this project. Hope to hear from you soon.
> >> 
> >> I would look into porting the FreeBSD implementation of sendfile to NetBSD.
> >
> >sendfile(2) for Linux compat can be emulated in the kernel without
> >backing. That said, a real splice(2) or even splicev(2) would be really
> >nice to have. But that's a different project and arguably a potentially
> >more generally useful one, too.
> 
> 
> Yes, splice is more general (as opposed to send a file to a socket), but I
> think splice has limitations too (one of the fds needs to be a pipe).
> Is that true only for linux?

splice(2) as a concept is much older than the current Linux implementation.
There is no reason why zero-copying for sockets should require a
different system call for zero-copying from/to pipes. There are valid
reasons for other combinations, too. Consider /bin/cp for example.

I was saying that the Linux system call can be implemented without a
kernel backend, because I don't consider zero copy a necessary part of
the interface contract. It's a perfectly valid, if a bit slower
implementation to allocate a kernel buffer and do I/O via that.

Joerg


Re: Emulating missing linux syscalls

2022-04-13 Thread Piyush Sachdeva
Thank you Christos and Joerg!

On Tue, Apr 12, 2022 at 10:26 PM Christos Zoulas  wrote:
>
> In article , Joerg Sonnenberger   
> wrote:
>
> >>> I would look into porting the FreeBSD implementation of sendfile to 
> >>> NetBSD.
>

I had a look at the FreeBSD implementation. What I found was:
- The linux_sendfile() function in
  freebsd-src/sys/compat/linux/linux_socket.c ends up calling
  linux_sendfile_common() (in the same file), which in turn calls
  fo_sendfile(). I am guessing this is supported by the sendfile(2)
  syscall which is present in the FreeBSD kernel.

- Therefore, if the implementation just needs to be ported, it would
  make for a very simple project.

>
> >> sendfile(2) for Linux compat can be emulated in the kernel without
> >> backing.


Joerg, would you please explain to me how this will be possible?
As I understand, anything in the compat layer needs backing functions,
and I didn't find anything pertaining to sendfile(2) as a syscall in
the NetBSD kernel.

Or maybe you are talking about using in-kernel support functions, which
would have been used to support sendfile(2) had it been present? In this
case, I guess, these functions support zero-copy, and it would be great
if you could point me to them.

>
> That said, a real splice(2) or even splicev(2) would be really
> >> nice to have. But that's a different project and arguable, a potentially
> >> more generally useful one, too.
>
>
>
> > Yes, splice is more general (as opposed to send a file to a socket), but I
> > think splice has limitations too (one of the fds needs to be a pipe).
> > Is that true only for linux?
>
splice(2) for sure is an amazing project and, as Christos said, splice(2)
requires one of the fds to be a pipe. Given that splice is only
implemented in Linux (from what I found), we can have a slightly
different implementation in the NetBSD kernel according to requirements
(if allowed).

I haven't found a sendfile(2) or splice(2) syscall in the NetBSD kernel.
I did find a reference to sendfile(), but that was for the tftp daemon.

It would make an interesting project to add support for these calls in
the NetBSD kernel first. Later, these very syscalls can back the
functionality for the Linux compat layer as well.

What I wish to know is what other zero-copy functionality is already
present in the NetBSD kernel that can support these two system calls.

I hope this makes some sense; please do correct me where I have made a
wrong assumption.

Hope to hear from you soon
-- 
Regards,
Piyush


Re: Emulating missing linux syscalls

2022-04-12 Thread Christos Zoulas
In article , Joerg Sonnenberger   wrote:
>On Tue, Apr 12, 2022 at 12:29:21PM - Christos Zoulas wrote:
>> In article
>,
>> Piyush Sachdeva   wrote:
>> >
>> >Dear Stephen Borrill,
>> >My name is Piyush, and I was looking into the
>> >'Emulating missing Linux syscalls' project hoping to contribute
>> >to this year's GSoC.
>> >
>> >I wanted to be sure of a few basic things before I go ahead:
>> >- linux binaries are found in- src/sys/compat/linux
>> >- particular implementation in - src/sys/compat/linux/common
>> >- a few architecture-specific implementations in-
>> >  src/sys/compat/linux/arch/.
>> >- The src/sys/compat/linux/arch//linux_syscalls.c file
>> >   lists of system calls, and states if a particular syscall is present or
>> >not.
>> >
>> >I was planning to work on the 'sendfile()' syscall, which I believe
>> >is unimplemented for amd64 and a few other architectures as well.
>> >
>> >Considering the above points, I was hoping you could point me in
>> >the right direction for this project. Hope to hear from you soon.
>> 
>> I would look into porting the FreeBSD implementation of sendfile to NetBSD.
>
>sendfile(2) for Linux compat can be emulated in the kernel without
>backing. That said, a real splice(2) or even splicev(2) would be really
>nice to have. But that's a different project and arguably a potentially
>more generally useful one, too.


Yes, splice is more general (as opposed to send a file to a socket), but I
think splice has limitations too (one of the fds needs to be a pipe).
Is that true only for linux?

christos



Re: Emulating missing linux syscalls

2022-04-12 Thread Joerg Sonnenberger
On Tue, Apr 12, 2022 at 12:29:21PM - Christos Zoulas wrote:
> In article 
> ,
> Piyush Sachdeva   wrote:
> >
> >Dear Stephen Borrill,
> >My name is Piyush, and I was looking into the
> >'Emulating missing Linux syscalls' project hoping to contribute
> >to this year's GSoC.
> >
> >I wanted to be sure of a few basic things before I go ahead:
> >- linux binaries are found in- src/sys/compat/linux
> >- particular implementation in - src/sys/compat/linux/common
> >- a few architecture-specific implementations in-
> >  src/sys/compat/linux/arch/.
> >- The src/sys/compat/linux/arch//linux_syscalls.c file
> >   lists of system calls, and states if a particular syscall is present or
> >not.
> >
> >I was planning to work on the 'sendfile()' syscall, which I believe
> >is unimplemented for amd64 and a few other architectures as well.
> >
> >Considering the above points, I was hoping you could point me in
> >the right direction for this project. Hope to hear from you soon.
> 
> I would look into porting the FreeBSD implementation of sendfile to NetBSD.

sendfile(2) for Linux compat can be emulated in the kernel without
backing. That said, a real splice(2) or even splicev(2) would be really
nice to have. But that's a different project and arguably a potentially
more generally useful one, too.

Joerg


Re: Emulating missing linux syscalls

2022-04-12 Thread Christos Zoulas
In article 
,
Piyush Sachdeva   wrote:
>
>Dear Stephen Borrill,
>My name is Piyush, and I was looking into the
>'Emulating missing Linux syscalls' project hoping to contribute
>to this year's GSoC.
>
>I wanted to be sure of a few basic things before I go ahead:
>- linux binaries are found in- src/sys/compat/linux
>- particular implementation in - src/sys/compat/linux/common
>- a few architecture-specific implementations in-
>  src/sys/compat/linux/arch/.
>- The src/sys/compat/linux/arch//linux_syscalls.c file
>   lists of system calls, and states if a particular syscall is present or
>not.
>
>I was planning to work on the 'sendfile()' syscall, which I believe
>is unimplemented for amd64 and a few other architectures as well.
>
>Considering the above points, I was hoping you could point me in
>the right direction for this project. Hope to hear from you soon.

I would look into porting the FreeBSD implementation of sendfile to NetBSD.

christos