Re: regarding generic AIO, async syscalls precedent + some benchmarks by lighttpd
Davide Libenzi wrote:
> Yes, that is some very interesting data IMO. I did not bench the GUASI
> (userspace async thread library) against AIO, but those numbers show that a
> *userspace* async syscall wrapper interface performs in the ballpark of AIO.
> This leads to some hope about the ability to effectively deploy the kernel
> generic async AIO (be it fibril- or kthread-based) as a low-impact async
> provider for basically anything.

SGI's kaio patch to Linux kind of went that route (using kthreads) for non-SCSI async I/O. It wasn't a bad way to go, but at least for disk-based access they achieved much better results when they could go right to the hardware.

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: A signal fairy tale
Just to confirm, Dan: I was a fool and did not install the dummy handler for the masked signal I was using. I added the proper code over the weekend with no noticeable effect (JDK 1.3 still sigtimedwait()'s on the signal :-().

--Chris
Re: A signal fairy tale
At 01:11 PM 6/28/2001 -0700, Daniel R. Kegel wrote:
> AFAIK, there's no 'read with a timeout' system call for file descriptors,
> so if you needed the equivalent of sigtimedwait(), you might end up doing a
> select on the sigopen fd, which is an extra system call. (I wish Posix had
> invented sigopen() and readtimedwait() instead of sigtimedwait...)

What, you don't want to use AIO to collect your AIO signals? ;-)

In my apps, what I'd do is spawn a few worker threads to do blocking reads. Particularly if the sigopen() API allows me to grab multiple signals from one fd, this should work well enough for most high-performance needs. Alternatively, one could use select()/poll() to check whether data was ready before doing the read. I agree, though, that it'd be nice to have a timed read.

--Chris
Re: A signal fairy tale
At 10:59 AM 6/28/2001 -0400, Dan Maas wrote:
> life-threatening things like SIGTERM, SIGKILL, and SIGSEGV. The mutation
> into queued, information-carrying siginfo signals just shows how badly we
> need a more robust event model... (what would truly kick butt is a unified
> interface that could deliver everything from fd events to AIO completions
> to semaphore/msgqueue events, etc, with explicit binding between event
> queues and threads).

I guess this is my thinking: it's really not that much of a stretch to make signals behave like GetMessage(). Indeed, sigopen() brings them sufficiently close. By doing this, you DO provide the unified interface, working much like GetMessage(), for all the different event types you described. So, by adding a couple of syscalls you avoid having to implement a whole new set of APIs for doing AIO, semaphores, msgqueues, etc.

--Chris

P.S.: What do you mean by explicit binding between event queues and threads? I'm not sure I see what this gains you.
Re: A signal fairy tale
At 01:58 PM 6/28/2001 +0100, John Fremlin wrote:
> Dan Kegel <[EMAIL PROTECTED]> writes:
> > A signal number cannot be opened more than once concurrently;
> > sigopen() thus provides a way to avoid signal usage clashes
> > in large programs.
>
> Signals are a pretty dopey API anyway - so instead of trying to patch
> them up, why not think of something better for AIO?

You assume that this issue only comes up when you're doing AIO. If we do something that makes signals work better, we can have a much broader impact than just AIO. If nothing else, the signal usage clashing issue has nothing to do with AIO.

--Chris
Re: A signal fairy tale
At 07:49 PM 6/27/2001 -0700, Daniel R. Kegel wrote:
> Balbir Singh <[EMAIL PROTECTED]> wrote:
> > sigopen() should be selective about the signals it allows
> > as argument. Try and make sigopen() thread specific, so that if one
> > thread does a sigopen(), it does not imply it will do all the signal
> > handling for all the threads.
>
> IMHO sigopen()/read() should behave just like sigwait() with respect
> to threads. That means that in Posix, it would not be thread specific,
> but in Linux, it would be thread specific, because that's how signals
> and threads work there at the moment.

Actually, I believe with IBM's new Posix threads implementation, Linux finally does signal delivery "the right way". In general, I think it'd be nice if this API *always* sucked up signals from all threads. This makes sense particularly since the fd is accessible by all threads.

--Chris
RE: A signal fairy tale - a little comphist
At 03:57 PM 6/28/2001 +0200, Heusden, Folkert van wrote:
> [...]
> > A signal number cannot be opened more than once concurrently;
> > sigopen() thus provides a way to avoid signal usage clashes
> > in large programs.
>
> YOU> Signals are a pretty dopey API anyway -
>
> Exactly. When signals were made up, signal handlers were supposed to do
> not much more than give a last cry and then exit the application. SIGHUP
> to re-read the config was not supposed to happen.
>
> YOU> so instead of trying to patch
> YOU> them up, why not think of something better for AIO?
>
> Yeah, a select() on excepfds.

The POSIX AIO APIs are significantly more powerful than using select(), particularly for certain types of applications. select() doesn't provide you with a good way to perform I/O operations at different offsets simultaneously, doesn't allow for I/O priority, etc.

--Chris
Re: A signal fairy tale
At 07:57 PM 6/27/2001 -0700, Daniel R. Kegel wrote:
> From: Christopher Smith <[EMAIL PROTECTED]>
> > I guess the main thing I'm thinking is this could require some
> > significant changes to the way the kernel behaves. Still, it's worth
> > taking a "try it and see" approach. If anyone else thinks this is a
> > good idea I may hack together a sample patch and give it a whirl.
>
> What's the biggest change you see? From my (two-martini-lunch-tainted)
> viewpoint, it's just another kind of signal masking, sorta...

Yeah, the more I think about it, the more I think this is just another branch in the signal delivery code. Not necessarily too huge a change. I'll hack on this over the weekend, I think.

--Chris
Re: A signal fairy tale
--On Wednesday, June 27, 2001 11:18:28 +0200 Jamie Lokier <[EMAIL PROTECTED]> wrote:
> Btw, this functionality is already available using sigaction(). Just
> search for a signal whose handler is SIG_DFL. If you then block that
> signal before changing, checking the result, and unblocking the signal,
> you can avoid race conditions too. (This is what my programs do.)

It's more than whether a signal is blocked or not, unfortunately. Lots of applications will invoke sigwaitinfo() on whatever the current signal mask is, which means you can't rely on sigaction() to solve your problems. :-(

--Chris
Re: A signal fairy tale
--On Wednesday, June 27, 2001 11:51:36 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote:
> Shouldn't there be a sigclose() and other operations to make the API

Wouldn't the existing close() be good enough for that?

> orthogonal. sigopen() should be selective about the signals it allows
> as argument. Try and make sigopen() thread specific, so that if one
> thread does a sigopen(), it does not imply it will do all the signal
> handling for all the threads.

Actually, this is exactly what you do want to happen. Linux's existing signals + threads semantics are not exactly ideal for high-performance computing. Of course, fd's are shared by all threads, so all of the threads would be able to read the siginfo structures into memory.

> Does using sigopen() imply that signal(), sigaction(), etc cannot be
> used. In the same process one could do a sigopen() in the library, but
> the process could use sigaction()/signal() without knowing what the
> library does (which signals it handles, etc).

If I understood Dan's intentions correctly, you could use signal() and sigaction(), but while the fd is open, signals would be queued up to the fd rather than passed off to a signal handler or sigwaitinfo(). Care to comment, Dan?

> Let me know, when somebody has a patch or needs help, I would like to
> help or take a look at it.

Maybe we can both hack on this.

--Chris
Re: A signal fairy tale
--On Tuesday, June 26, 2001 05:54:37 -0700 Dan Kegel <[EMAIL PROTECTED]> wrote:
> Once upon a time a hacker named Xman
> wrote a library that used aio, and decided
> to use sigtimedwait() to pick up completion
> notifications. It worked well, and his I/O
> was blazing fast (since he was using a copy
> of Linux that was patched to have good aio).
> But when he tried to integrate his library
> into a large application someone else had
> written, woe! that application's use of signals
> conflicted with his library. "Fsck!" said Xman.
> At that moment a fairy appeared, and said
> "Young man, watch your language, or I'm going to
> have to turn you into a goon! I'm the good fairy Eunice.
> Can I help you?" Xman explained his problem to Eunice,
> who smiled and said "All you need is right here,
> just type 'man 2 sigopen'". Xman did, and saw:

I must thank the good fairy Eunice. ;-)

From a programming standpoint, this looks like a really nice approach. I must say I prefer it to the various "event" strategies I've seen to date, as it fixes the primary problem with signals while still allowing us to hook into all the standard POSIX APIs that already use signals.

It'd be nice if I could pass in 0 for signum and have the kernel select from unused signals (the problem being that "unused" is not necessarily easy to define), although I guess an inefficient version of this could be handled in userland. I presume the fd could be shared between threads and otherwise behave like a normal fd, which would be super nice.

I guess the main thing I'm thinking is this could require some significant changes to the way the kernel behaves. Still, it's worth taking a "try it and see" approach. If anyone else thinks this is a good idea I may hack together a sample patch and give it a whirl.

Thanks again, good fairy Dan/Eunice. ;-)

--Chris
Re: Asynchronous IO
--On Friday, April 13, 2001 04:45:07 -0400 Dan Maas <[EMAIL PROTECTED]> wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today
> is that the filesystem code assumes synchronous I/O operations that block
> the whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard
> this was a long-term goal, but nobody's done the work yet (aside from
> maybe the SGI folks with XFS?). Or maybe I don't know what I'm talking
> about...

If the FS supports generic read then this is not a problem. This is what SGI's KAIO does, as well as Bart's work.

> Bart, glad to hear you are working on an event interface, sounds cool!
> One feature that I really, really, *really* want to see implemented is
> the ability to block on a set of any "waitable kernel objects" with one
> syscall - not just file descriptors, but also SysV semaphores and message
> queues, UNIX signals and child processes, file locks, pthreads condition
> variables, asynch disk I/O completions, etc. I am dying for a clean way
> to accomplish this that doesn't require more than one thread... (Win32
> and FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
> kevent()...) IMHO cleaning up this API deficiency is just as important as
> optimizing the extreme case of socket I/O with zillions of file
> descriptors...

Actually, sigwaitinfo() has zero problem waiting on multiple signals. If you are using real-time signals, each signal can pass a pointer to the relevant object, so even if you're only blocking on a single signal you can receive info about several objects.

--Chris
Re: a quest for a better scheduler
--On Thursday, April 05, 2001 15:38:41 -0700 "Timothy D. Witham" <[EMAIL PROTECTED]> wrote:
> Database performance:
>   Raw storage I/O performance
>   OLTP workload

You probably want to add an OLAP scenario as well.

--Chris
Re: Signal Handling Performance?
--On Wednesday, April 04, 2001 21:30:51 -0400 "Carey B. Stortz" <[EMAIL PROTECTED]> wrote:
> either stayed the same or had a performance increase. A general decrease
> started around kernel 2.1.32, then performance drastically fell at kernel
> 2.3.20. There is an Excel graph which shows the trend at:
>
> http://euclid.nmu.edu/~benchmark/Carey/signalhandling.gif
>
> I was wondering if anybody had any ideas why this is happening, and what
> happened in kernel 2.3.20 to cause such a decrease in performance?

Lies, damn lies, and benchmarks. ;-)

Seriously though, I'm not clear on what you are measuring or how you are measuring it. It looks like this is measuring signal latency, which is important, but what about throughput?

--Chris
Re: a quest for a better scheduler
--On Wednesday, April 04, 2001 15:16:32 -0700 Tim Wright <[EMAIL PROTECTED]> wrote: > On Wed, Apr 04, 2001 at 03:23:34PM +0200, Ingo Molnar wrote: >> nope. The goal is to satisfy runnable processes in the range of NR_CPUS. >> You are playing word games by suggesting that the current behavior >> prefers 'low end'. 'thousands of runnable processes' is not 'high end' >> at all, it's 'broken end'. Thousands of runnable processes are the sign >> of a broken application design, and 'fixing' the scheduler to perform >> better in that case is just fixing the symptom. [changing the scheduler >> to perform better in such situations is possible too, but all solutions >> proposed so far had strings attached.] > > Ingo, you continue to assert this without giving much evidence to back it > up. All the world is not a web server. If I'm running a large OLTP > database with thousands of clients, it's not at all unreasonable to > expect periods where several hundred (forget the thousands) want to be > serviced by the database engine. That sounds like hundreds of schedulable > entities be they processes or threads or whatever. This sort of load is > regularly run on machine with 16-64 CPUs. Actually, it's not just OLTP, anytime you are doing time sharing between hundreds of users (something POSIX systems are supposed to be good at) this will happen. > Now I will admit that it is conceivable that you can design an > application that finds out how many CPUs are available, creates threads > to match that number and tries to divvy up the work between them using > some combination of polling and asynchronous I/O etc. There are, however > a number of problems with this approach: Actually, one way to semi-support this approach is to implement many-to-many threads as per the Solaris approach. This also requires significant hacking of both the kernel and the runtime, and certainly is significantly more error prone than trying to write a flexible scheduler. 
One problem you didn't highlight, and that even the above design does not address, is that for security reasons you may very well need each user's requests to take place in a different process. If you don't, then you have to implement a very well tested and secure user-level security mechanism to ensure things like privacy (above and beyond the time-sharing). The world is filled with a wide variety of types of applications, and unless you know two programming approaches are functionally equivalent (and event-driven/polling I/O vs. tons of running processes are NOT), you shouldn't say one approach is "broken". You could say it's a "broken" approach to building web servers. Things like kernels and standard libraries, though, should work well in the general case.

--Chris
Re: a quest for a better scheduler
--On Tuesday, April 03, 2001 18:17:30 -0700 Fabio Riccardi <[EMAIL PROTECTED]> wrote:

> Alan Cox wrote:
> Indeed, I'm using RT sigio/sigwait event scheduling, bare clone threads
> and zero-copy io.

Fabio, I'm working on a similar solution, although I'm experimenting with SGI's KAIO patch to see what it can do. I've had to patch the kernel to implement POSIX-style signal dispatch semantics (so that the thread which posted an I/O request doesn't have to be the one which catches the signal). Are you taking a similar approach, or is the lack of this behavior the reason you are using so many threads?

--Chris