Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
On Sat, 2006-08-12 at 15:28 -0400, Jakub Jelinek wrote: > On Sat, Aug 12, 2006 at 12:10:35PM -0700, Ulrich Drepper wrote: > > > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > > > part of the ABI, but only internal to the kernel implementation. I think > > > Zach had suggested inferring THREAD_ID notification if the pid specified > > > is not zero. But, I don't see why ->sigev_notify couldn't used directly > > > (just like the POSIX timers code does) thus doing away with the > > > new constants altogether. Sebestian/Laurent, do you recall? > > > > I suggest to model the implementation after the timer code which does > > exactly what we need. > > Yeah, and if at all possible we want to use just one helper thread for > SIGEV_THREAD notification of timers/aio/etc., so it really should behave the > same as timer thread notification. > That's exactly what is done in libposix-aio. Sébastien. -- - Sébastien DuguéBULL/FREC:B1-247 phone: (+33) 476 29 77 70 Bullcom: 229-7770 mailto:[EMAIL PROTECTED] Linux POSIX AIO: http://www.bullopensource.org/posix http://sourceforge.net/projects/paiol - - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
On Sat, 2006-08-12 at 12:10 -0700, Ulrich Drepper wrote: > Suparna Bhattacharya wrote: > > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > > part of the ABI, but only internal to the kernel implementation. I think > > Zach had suggested inferring THREAD_ID notification if the pid specified > > is not zero. But, I don't see why ->sigev_notify couldn't used directly > > (just like the POSIX timers code does) thus doing away with the > > new constants altogether. Sebestian/Laurent, do you recall? > > I suggest to model the implementation after the timer code which does > exactly what we need. > Will do. > > > I'm guessing they are being used for validation of permissions at the time > > of sending the signal, but maybe saving the task pointer in the iocb instead > > of the pid would suffice ? > > Why should any verification be necessary? The requests are generated in > the same process which will receive the notification. Even if the POSIX > process (aka, kernel process group) changes the IDs the notifications > should be set. The key is that notifications cannot be sent to another > POSIX process. > > Adding this as a feature just makes things so much more complicated. > Agreed. Sébastien. -- - Sébastien DuguéBULL/FREC:B1-247 phone: (+33) 476 29 77 70 Bullcom: 229-7770 mailto:[EMAIL PROTECTED] Linux POSIX AIO: http://www.bullopensource.org/posix http://sourceforge.net/projects/paiol - - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
Hi, just came back from vacation, sorry for the delay. On Sat, 2006-08-12 at 23:59 +0530, Suparna Bhattacharya wrote: > BTW, if anyone would like to be dropped off this growing cc list, please > let us know. > > On Fri, Aug 11, 2006 at 12:45:55PM -0700, Ulrich Drepper wrote: > > Sébastien Dugué wrote: > > >aio completion notification > > > > I looked over this now but I don't think I understand everything. Or I > > don't see how it all is integrated. And no, I'm not looking at the > > proposed glibc code since would mean being tainted. > > Oh, I didn't realise that. > I'll make an attempt to clarify parts that I understand based on what I > have gleaned from my reading of the code and intent, but hopefully Sebastien, > Ben, Zach et al will be able to pitch in for a more accurate and complete > picture. > > > > > > > > Details: > > > --- > > > > > > A struct sigevent *aio_sigeventp is added to struct iocb in > > > include/linux/aio_abi.h > > > > > > An enum {IO_NOTIFY_SIGNAL = 0, IO_NOTIFY_THREAD_ID = 1} is added in > > > include/linux/aio.h: > > > > > > - IO_NOTIFY_SIGNAL means that the signal is to be sent to the > > > requesting thread > > > > > > - IO_NOTIFY_THREAD_ID means that the signal is to be sent to a > > > specifi thread. > > > > This has been proved to be sufficient in the timer code which basically > > has the same problem. But why do you need separate constants? We have > > the various SIGEV_* constants, among them SIGEV_THREAD_ID. Just use > > these constants for the values of ki_notify. > > > > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > part of the ABI, but only internal to the kernel implementation. I think > Zach had suggested inferring THREAD_ID notification if the pid specified > is not zero. But, I don't see why ->sigev_notify couldn't used directly > (just like the POSIX timers code does) thus doing away with the > new constants altogether. Sebestian/Laurent, do you recall? As I see it, those IO_NOTIFY_* constants are uneeded and we could use ->sigev_notify directly. I will change this so that we use the same mechanism as the POSIX timers code. > > > > > > The following fields are added to struct kiocb in include/linux/aio.h: > > > > > > - pid_t ki_pid: target of the signal > > > > > > - __u16 ki_signo: signal number > > > > > > - __u16 ki_notify: kind of notification, IO_NOTIFY_SIGNAL or > > > IO_NOTIFY_THREAD_ID > > > > > > - uid_t ki_uid, ki_euid: filled with the submitter credentials > > > > These two fields aren't needed for the POSIX interfaces. Where does the > > requirement come from? I don't say they should be removed, they might > > be useful, but if the costs are non-negligible then they could go away. > > I'm guessing they are being used for validation of permissions at the time > of sending the signal, but maybe saving the task pointer in the iocb instead > of the pid would suffice ? IIRC, Ben added these for that exact reason. Is this really needed? Ben? > > > > > > > > - check whether the submitting thread wants to be notified directly > > > (sigevent->sigev_notify_thread_id is 0) or wants the signal to be sent > > > to another thread. > > > In the latter case a check is made to assert that the target thread > > > is in the same thread group > > > > Is this really how it's implemented? This is not how it should be. > > Either a signal is sent to a specific thread in the same process (this > > is what SIGEV_THREAD_ID is for) or the signal is sent to a calling > > process. Sending a signal to the process means that from the kernel's > > POV any thread which doesn't have the signal blocked can receive it. > > The final decision is made by the kernel. There is no mechanism to send > > the signal to another process. > > The code seems to be set up to call specific_send_sig_info() in the case > of *_THREAD_ID , and __group_send_sig_info() otherwise. So I think the > intended behaviour is as you describe it should be (__group_send_sig_info > does the equivalent of sending a signal to the process and so any thread > which doesn't have signals blocked can receive it, while > specific_send_sig_info > sends it to a particular thread). > > But, I should really leave it to Sebestian to confirm that. That's right, but I think that part needs to be reworked to follow the same logic as the POSIX timers. > > > listio support > > > > > > > I really don't understand the kernel interface for this feature. > > I'm sorry this is confusing. This probably means that we need to > separate the external interface description more clearly and completely > from the internals. > > > > > > > > Details: > > > --- > > > > > > An IOCB_CMD_GROUP is added to the IOCB_CMD enum in > > > include/linux/aio_abi.h > > > > > > A struct lio_event is added in include/linux/aio.h > > > > > > A struct
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
On Mon, 2006-08-14 at 09:38 -0700, Ulrich Drepper wrote: > Suparna Bhattacharya wrote: > > Is there a (remote) possibility that the thread could have died and its > > pid got reused by a new thread in another process ? Or is there a mechanism > > that prevents such a possibility from arising (not just in NPTL library, > > but at the kernel level) ? > > The UID/GID won't help you with dying processes. What if the same user > creates a process with the same PID? That process will not expect the > notification and mustn't receive it. If you cannot detect whether the > issuing process died you have problems which cannot be solved with a > uid/gid pair. > > Eric W. Biederman sent a series of patches that introduced a struct task_ref specifically to solve this sort of problem on January 28 of this year, but I don't think it went anywhere. -- Nicholas Miell <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
Suparna Bhattacharya wrote: > Is there a (remote) possibility that the thread could have died and its > pid got reused by a new thread in another process ? Or is there a mechanism > that prevents such a possibility from arising (not just in NPTL library, > but at the kernel level) ? The UID/GID won't help you with dying processes. What if the same user creates a process with the same PID? That process will not expect the notification and mustn't receive it. If you cannot detect whether the issuing process died you have problems which cannot be solved with a uid/gid pair. > AIO for pipes should not be a problem - Chris Mason had a patch, so we can > just bring it up to the current levels, possibly with some additional > improvements. Good. > I'm not sure what would be the right thing to do for the sockets case. While > we could put together a patch for basic aio_read/write (based on the same > model used for files), given the whole ongoing kevent effort, its not yet > clear to me what would make the most sense ... > > Ben had a patch to do a fallback to kernel threads for AIO operations that > are not yet supported natively. I had some concerns about the approach, but > I guess he had intended it as an interim path for cases like this. A fallback solution would be sufficient. Nobody _should_ use POSIX AIO for networking but people do and just giving them something that works is good enough. It cannot really be worse than the userlevel emulation we have know. The alternative, separately and sequentially handling network sockets at userlevel is horrible. We'd have to go over every file descriptor and check whether it's a socket and then take if out of the request list for the kernel. Then they need to be handled separately before or after the kernel AIO code. This would punish unduly all the 99.9% of the programs which don't use POSIX AIO for network I/O. -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ signature.asc Description: OpenPGP digital signature
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
On Sat, Aug 12, 2006 at 12:10:35PM -0700, Ulrich Drepper wrote: > Suparna Bhattacharya wrote: > > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > > part of the ABI, but only internal to the kernel implementation. I think > > Zach had suggested inferring THREAD_ID notification if the pid specified > > is not zero. But, I don't see why ->sigev_notify couldn't used directly > > (just like the POSIX timers code does) thus doing away with the > > new constants altogether. Sebestian/Laurent, do you recall? > > I suggest to model the implementation after the timer code which does > exactly what we need. Agreed. > > > > I'm guessing they are being used for validation of permissions at the time > > of sending the signal, but maybe saving the task pointer in the iocb instead > > of the pid would suffice ? > > Why should any verification be necessary? The requests are generated in > the same process which will receive the notification. Even if the POSIX > process (aka, kernel process group) changes the IDs the notifications > should be set. The key is that notifications cannot be sent to another > POSIX process. Is there a (remote) possibility that the thread could have died and its pid got reused by a new thread in another process ? Or is there a mechanism that prevents such a possibility from arising (not just in NPTL library, but at the kernel level) ? I think the timer code saves a reference to the task pointer instead of the pid, which is what I was suggesting above (instead of the euid checks), as way to avoid the above situation. > > Adding this as a feature just makes things so much more complicated. > > > > So I think the > > intended behaviour is as you describe it should be > > Then the documentation needs to be adjusted. *Nod* > > > > The way it works (and better ideas are welcome) is that, since the > > io_submit() > > syscall already accepts an array of iocbs[], no new syscall was introduced. > > To implement lio_listio, one has to set up such an array, with the first > > iocb > > in the array having the special (new) grouping opcode of IOCB_CMD_GROUP > > which > > specifies the sigev notification to be associated with group completion > > (a NULL value of the sigev notification pointer would imply equivalent of > > LIO_WAIT). > > OK, this seems OK. We have to construct the iocb arrays dynamically anyway. > > > > My thought here was that it should be possible to include M as a parameter > > to the IOCB_CMD_GROUP opcode iocb, and thus incorporated in the lio control > > block ... then whatever semantics are agreed upon can be implemented. > > If you have room for the parameter this is fine. For the beginning we > can enforce the number to be the same as the total number of requests. > Sounds good. > > > Let us know what you think about the listio interface ... hopefully the > > other issues are mostly simple to resolve. > > It should be fine and I would support adding all this assuming the > normal file support (as opposed to direct I/O only) is added, too. OK. I updated my patchset against 2618-rc3 just after OLS. > > > But I have one last question: sockets, pipes and the like are already > supported, right? If this is not the case we have a problem with the > currently proposed lio_listio interface. AIO for pipes should not be a problem - Chris Mason had a patch, so we can just bring it up to the current levels, possibly with some additional improvements. I'm not sure what would be the right thing to do for the sockets case. While we could put together a patch for basic aio_read/write (based on the same model used for files), given the whole ongoing kevent effort, its not yet clear to me what would make the most sense ... Ben had a patch to do a fallback to kernel threads for AIO operations that are not yet supported natively. I had some concerns about the approach, but I guess he had intended it as an interim path for cases like this. Suggestions would be much appreciated ? DaveM, Ingo, Andrew ? Regards Suparna > > -- > ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ > -- Suparna Bhattacharya ([EMAIL PROTECTED]) Linux Technology Center IBM Software Lab, India - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
On Sat, Aug 12, 2006 at 12:10:35PM -0700, Ulrich Drepper wrote: > > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > > part of the ABI, but only internal to the kernel implementation. I think > > Zach had suggested inferring THREAD_ID notification if the pid specified > > is not zero. But, I don't see why ->sigev_notify couldn't used directly > > (just like the POSIX timers code does) thus doing away with the > > new constants altogether. Sebestian/Laurent, do you recall? > > I suggest to model the implementation after the timer code which does > exactly what we need. Yeah, and if at all possible we want to use just one helper thread for SIGEV_THREAD notification of timers/aio/etc., so it really should behave the same as timer thread notification. Jakub - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile)
Suparna Bhattacharya wrote: > I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not > part of the ABI, but only internal to the kernel implementation. I think > Zach had suggested inferring THREAD_ID notification if the pid specified > is not zero. But, I don't see why ->sigev_notify couldn't used directly > (just like the POSIX timers code does) thus doing away with the > new constants altogether. Sebestian/Laurent, do you recall? I suggest to model the implementation after the timer code which does exactly what we need. > I'm guessing they are being used for validation of permissions at the time > of sending the signal, but maybe saving the task pointer in the iocb instead > of the pid would suffice ? Why should any verification be necessary? The requests are generated in the same process which will receive the notification. Even if the POSIX process (aka, kernel process group) changes the IDs the notifications should be set. The key is that notifications cannot be sent to another POSIX process. Adding this as a feature just makes things so much more complicated. > So I think the > intended behaviour is as you describe it should be Then the documentation needs to be adjusted. > The way it works (and better ideas are welcome) is that, since the io_submit() > syscall already accepts an array of iocbs[], no new syscall was introduced. > To implement lio_listio, one has to set up such an array, with the first iocb > in the array having the special (new) grouping opcode of IOCB_CMD_GROUP which > specifies the sigev notification to be associated with group completion > (a NULL value of the sigev notification pointer would imply equivalent of > LIO_WAIT). OK, this seems OK. We have to construct the iocb arrays dynamically anyway. > My thought here was that it should be possible to include M as a parameter > to the IOCB_CMD_GROUP opcode iocb, and thus incorporated in the lio control > block ... then whatever semantics are agreed upon can be implemented. If you have room for the parameter this is fine. For the beginning we can enforce the number to be the same as the total number of requests. > Let us know what you think about the listio interface ... hopefully the > other issues are mostly simple to resolve. It should be fine and I would support adding all this assuming the normal file support (as opposed to direct I/O only) is added, too. But I have one last question: sockets, pipes and the like are already supported, right? If this is not the case we have a problem with the currently proposed lio_listio interface. -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ signature.asc Description: OpenPGP digital signature