Re: regarding generic AIO, async syscalls precedent + some benchmarks by lighttpd

2007-02-04 Thread Christopher Smith

Davide Libenzi wrote:
> Yes, that is some very interesting data IMO. I did not bench the GUASI
> (userspace async thread library) against AIO, but those numbers show that a
> *userspace* async syscall wrapper interface performs in the ballpark of AIO.
> This leads to some hope about the ability to effectively deploy the kernel's
> generic async AIO (be it fibril or kthread based) as a low-impact async
> provider for basically anything.


SGI's KAIO patch to Linux went much the same route (using kthreads) for
non-SCSI async I/O. It wasn't a bad way to go, but at least for
disk-based access they got much better results when they could go
straight to the hardware.
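
I haven't reproduced GUASI's actual interface here, but the basic shape of a
userspace async syscall wrapper is easy to sketch: hand the blocking syscall
(pread() in this case) to a helper thread and collect the result later. A
minimal illustration, with invented names and one thread per request rather
than a real pool:

/* Minimal illustration of a userspace async syscall wrapper: a helper
 * thread runs a blocking pread() while the caller keeps working.
 * Names are invented for the example; this is not GUASI's real API.
 * Build with: cc -pthread example.c */
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

struct async_pread_req {
    int fd;
    void *buf;
    size_t count;
    off_t offset;
    ssize_t result;              /* filled in by the helper thread */
    int done;
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

static void *pread_worker(void *arg)
{
    struct async_pread_req *req = arg;
    ssize_t r = pread(req->fd, req->buf, req->count, req->offset);

    pthread_mutex_lock(&req->lock);
    req->result = r;
    req->done = 1;
    pthread_cond_signal(&req->cond);
    pthread_mutex_unlock(&req->lock);
    return NULL;
}

static void async_pread_submit(struct async_pread_req *req, pthread_t *tid)
{
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->cond, NULL);
    req->done = 0;
    pthread_create(tid, NULL, pread_worker, req);
}

static ssize_t async_pread_wait(struct async_pread_req *req, pthread_t tid)
{
    pthread_mutex_lock(&req->lock);
    while (!req->done)
        pthread_cond_wait(&req->cond, &req->lock);
    pthread_mutex_unlock(&req->lock);
    pthread_join(tid, NULL);
    return req->result;
}

int main(void)
{
    char buf[256];
    struct async_pread_req req = { .fd = open("/etc/hostname", O_RDONLY),
                                   .buf = buf, .count = sizeof(buf) };
    pthread_t tid;

    async_pread_submit(&req, &tid);
    /* ... the caller is free to submit more requests or poll sockets ... */
    printf("pread returned %zd\n", async_pread_wait(&req, tid));
    return 0;
}

A real wrapper would of course keep a fixed pool of helper threads and a
request queue rather than spawning a thread per call, but the completion
model is the same.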


--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: A signal fairy tale

2001-07-02 Thread Christopher Smith

Just to confirm, Dan: I was a fool and did not install the dummy handler for
the masked signal I was using. I added the proper code over the weekend with
no noticeable effect (JDK 1.3 still sigtimedwait()'s on the signal :-().

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-29 Thread Christopher Smith

At 01:11 PM 6/28/2001 -0700, Daniel R. Kegel wrote:
> AFAIK, there's no 'read with a timeout' system call for file descriptors, so
> if you needed the equivalent of sigtimedwait(), you might end up doing a
> select on the sigopen fd, which is an extra system call.  (I wish Posix had
> invented sigopen() and readtimedwait() instead of sigtimedwait...)

What, you don't want to use AIO or to collect your AIO signals? ;-)

In my apps, what I'd do is spawn a few worker threads to do blocking reads.
Particularly if the sigopen() API allows me to grab multiple signals from
one fd, this should work well enough for most high-performance needs.
Alternatively, one could use select/poll to check whether data was ready
before doing the read. I agree, though, that it'd be nice to have a timed
read.
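
For what it's worth, the select-before-read variant wraps up easily as a poor
man's timed read; a quick sketch (returning -1 with errno set to ETIMEDOUT on
timeout is just this example's convention):

/* Poor man's "read with a timeout": wait for readability with select(),
 * then do the ordinary blocking read. */
#include <sys/select.h>
#include <unistd.h>
#include <errno.h>

ssize_t read_timed(int fd, void *buf, size_t count, struct timeval *timeout)
{
    fd_set rfds;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    int rc = select(fd + 1, &rfds, NULL, NULL, timeout);
    if (rc < 0)
        return -1;              /* select() itself failed */
    if (rc == 0) {
        errno = ETIMEDOUT;      /* nothing arrived in time */
        return -1;
    }
    return read(fd, buf, count);
}

It costs the extra system call Dan mentions, but it gets you sigtimedwait()
semantics on any fd.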

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-29 Thread Christopher Smith

At 10:59 AM 6/28/2001 -0400, Dan Maas wrote:
>life-threatening things like SIGTERM, SIGKILL, and SIGSEGV. The mutation
>into queued, information-carrying siginfo signals just shows how badly we
>need a more robust event model... (what would truly kick butt is a unified
>interface that could deliver everything from fd events to AIO completions to
>semaphore/msgqueue events, etc, with explicit binding between event queues
>and threads).

I guess this is my thinking: it's really not that much of a stretch to make
signals behave like GetMessage(). Indeed, sigopen() brings them
sufficiently close. By doing this, you DO provide a unified, GetMessage()-like
interface for all the different types of events you described. So, by adding
a couple of syscalls you avoid having to implement a whole new set of APIs
for AIO, semaphores, msgqueues, etc.
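
To make that concrete, here's roughly how I picture the loop looking. Bear in
mind that sigopen() is still just Dan's proposal, so the call, the
one-siginfo_t-per-read() record format, and the handler names are all
assumptions; this won't build against any real kernel today:

/* Hypothetical GetMessage()-style loop over the proposed sigopen() fd.
 * Neither sigopen() nor the "one siginfo_t per read()" format exists;
 * both are assumptions for illustration only. */
#include <signal.h>
#include <unistd.h>

extern int sigopen(int signum);   /* proposed syscall, not implemented */

void event_loop(void)
{
    int fd = sigopen(SIGRTMIN);   /* all SIGRTMIN events now queue to fd */
    siginfo_t si;

    while (read(fd, &si, sizeof(si)) == sizeof(si)) {
        switch (si.si_code) {
        case SI_ASYNCIO:          /* AIO completion */
            /* handle_aio_completion(si.si_value.sival_ptr); */
            break;
        case SI_MESGQ:            /* message queue notification */
            /* handle_mq_event(si.si_value.sival_ptr); */
            break;
        default:                  /* timers, sigqueue(), etc. */
            break;
        }
    }
    close(fd);
}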

--Chris

P.S.: What do you mean by explicit binding between event queues and 
threads? I'm not sure I see what this gains you.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-29 Thread Christopher Smith

At 01:58 PM 6/28/2001 +0100, John Fremlin wrote:
> Dan Kegel <[EMAIL PROTECTED]> writes:
> >A signal number cannot be opened more than once concurrently;
> >sigopen() thus provides a way to avoid signal usage clashes
> >in large programs.
>Signals are a pretty dopey API anyway - so instead of trying to patch
>them up, why not think of something better for AIO?

You assume that this issue only comes up when you're doing AIO. If we do
something that makes signals work better, we can have a much broader impact
than just AIO. If nothing else, the signal-usage-clashing issue has nothing
to do with AIO.

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-29 Thread Christopher Smith

At 07:49 PM 6/27/2001 -0700, Daniel R. Kegel wrote:
>Balbir Singh <[EMAIL PROTECTED]> wrote:
> >sigopen() should be selective about the signals it allows
> >as argument. Try and make sigopen() thread specific, so that if one
> >thread does a sigopen(), it does not imply it will do all the signal
> >handling for all the threads.
>
>IMHO sigopen()/read() should behave just like sigwait() with respect
>to threads.  That means that in Posix, it would not be thread specific,
>but in Linux, it would be thread specific, because that's how signals
>and threads work there at the moment.

Actually, I believe with IBM's new Posix threads implementation, Linux 
finally does signal delivery "the right way". In general, I think it'd be 
nice if this API *always* sucked up signals from all threads. This makes 
sense particularly since the FD is accessible by all threads.

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



RE: A signal fairy tale - a little comphist

2001-06-29 Thread Christopher Smith

At 03:57 PM 6/28/2001 +0200, Heusden, Folkert van wrote:
>[...]
> >A signal number cannot be opened more than once concurrently;
> >sigopen() thus provides a way to avoid signal usage clashes
> >in large programs.
>YOU> Signals are a pretty dopey API anyway -
>
>Exactly. When signals were made up, signal handlers were supposed to do
>not much more than give a last cry and then exit the application. SIGHUP
>to re-read the config was not supposed to happen.
>
>YOU> so instead of trying to patch
>YOU> them up, why not think of something better for AIO?
>
>Yeah, a select() on excepfds.

POSIX AIO APIs are significantly more powerful than using select(),
particularly for certain types of applications. select() doesn't provide
you with a good way to perform I/O operations at different offsets
simultaneously, doesn't allow for I/O priority, etc.
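
To give a concrete flavor of the difference, here's a minimal sketch of two
aio_read()s against the same fd at different offsets with different
(advisory) priorities, which a select() loop over a single file position
simply can't express (the file path is a placeholder; link with -lrt on
older glibc):

/* Two aio_read()s in flight against the same file, at different offsets
 * and with different advisory priorities. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    int fd = open("/var/tmp/datafile", O_RDONLY);  /* example path only */
    static char buf1[4096], buf2[4096];
    struct aiocb a1, a2;

    memset(&a1, 0, sizeof(a1));
    a1.aio_fildes  = fd;
    a1.aio_buf     = buf1;
    a1.aio_nbytes  = sizeof(buf1);
    a1.aio_offset  = 0;            /* read the first block... */
    a1.aio_reqprio = 0;

    a2 = a1;
    a2.aio_buf     = buf2;
    a2.aio_offset  = 1024 * 1024;  /* ...and one a megabyte in, concurrently */
    a2.aio_reqprio = 1;            /* lowered priority (advisory) */

    aio_read(&a1);
    aio_read(&a2);

    /* wait for each request to complete, then collect the results */
    const struct aiocb *w1[1] = { &a1 };
    const struct aiocb *w2[1] = { &a2 };
    while (aio_error(&a1) == EINPROGRESS)
        aio_suspend(w1, 1, NULL);
    while (aio_error(&a2) == EINPROGRESS)
        aio_suspend(w2, 1, NULL);
    printf("first=%zd second=%zd\n", aio_return(&a1), aio_return(&a2));
    return 0;
}

Whether the kernel actually honors aio_reqprio is implementation-dependent,
but the API at least gives you a way to say it.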

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-29 Thread Christopher Smith

At 07:57 PM 6/27/2001 -0700, Daniel R. Kegel wrote:
>From: Christopher Smith <[EMAIL PROTECTED]>
> >I guess the main thing I'm thinking is this could require some significant
> >changes to the way the kernel behaves. Still, it's worth taking a "try it
> >and see approach". If anyone else thinks this is a good idea I may hack
> >together a sample patch and give it a whirl.
>
>What's the biggest change you see?  From my (two-martini-lunch-tainted)
>viewpoint, it's just another kind of signal masking, sorta...

Yeah, the more I think about it, the more I think this is just another 
branch in the signal delivery code. Not necessarily too huge a change. I'll 
hack on this over the weekend I think.

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-27 Thread Christopher Smith

--On Wednesday, June 27, 2001 11:18:28 +0200 Jamie Lokier 
<[EMAIL PROTECTED]> wrote:
> Btw, this functionality is already available using sigaction().  Just
> search for a signal whose handler is SIG_DFL.  If you then block that
> signal before changing, checking the result, and unblocking the signal,
> you can avoid race conditions too.  (This is what my programs do).

It's more than whether a signal is blocked or not, unfortunately. Lots of 
applications will invoke sigwaitinfo() on whatever the current signal mask 
is, which means you can't rely on sigaction to solve your problems. :-(
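
For reference, the sigaction()-probing trick being described comes out
roughly like this; as per the caveat above, it only sees signals that have a
disposition installed, not ones somebody is quietly picking up with
sigwaitinfo():

/* Scan the realtime signals for one whose disposition is still SIG_DFL
 * and claim it.  This can't see a signal another part of the program is
 * merely blocking and collecting with sigwaitinfo(), so it is only a
 * heuristic. */
#include <signal.h>

int claim_free_rt_signal(void)
{
    for (int sig = SIGRTMIN; sig <= SIGRTMAX; sig++) {
        struct sigaction sa;

        if (sigaction(sig, NULL, &sa) < 0)
            continue;
        if (sa.sa_handler == SIG_DFL) {
            /* Block it before doing anything else, to close the race
             * window Jamie mentions, then take ownership. */
            sigset_t set;
            sigemptyset(&set);
            sigaddset(&set, sig);
            sigprocmask(SIG_BLOCK, &set, NULL);
            return sig;
        }
    }
    return -1;   /* nothing free */
}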

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-27 Thread Christopher Smith

--On Wednesday, June 27, 2001 11:51:36 +0530 Balbir Singh 
<[EMAIL PROTECTED]> wrote:
> Shouldn't there be a sigclose() and other operations to make the API
Wouldn't the existing close() be good enough for that?

> orthogonal. sigopen() should be selective about the signals it allows
> as argument. Try and make sigopen() thread specific, so that if one
> thread does a sigopen(), it does not imply it will do all the signal
> handling for all the threads.

Actually, this is exactly what you do want to happen. Linux's existing 
signals + threads semantics are not exactly ideal for high-performance 
computing. Of course, fd's are shared by all threads, so all of the threads 
would be able to read the siginfo structures into memory.

> Does using sigopen() imply that signal(), sigaction(), etc cannot be used.
> In the same process one could do a sigopen() in the library, but the
> process could use sigaction()/signal() without knowing what the library
> does (which signals it handles, etc).

If I understood Dan's intentions correctly, you could use signal() and 
sigaction(), but while the fd is open, signals would be queued up to the fd 
rather than passed off to a signal handler or sigwaitinfo(). Care to 
comment, Dan?

> Let me know, when somebody has a patch or needs help, I would like to
> help or take a look at it.

Maybe we can both hack on this.

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: A signal fairy tale

2001-06-26 Thread Christopher Smith

--On Tuesday, June 26, 2001 05:54:37 -0700 Dan Kegel <[EMAIL PROTECTED]> wrote:
> Once upon a time a hacker named Xman
> wrote a library that used aio, and decided
> to use sigtimedwait() to pick up completion
> notifications.  It worked well, and his I/O
> was blazing fast (since was using a copy
> of Linux that was patched to have good aio).
> But when he tried to integrate his library
> into a large application someone else had
> written, woe! that application's use of signals
> conflicted with his library.  "Fsck!" said Xman.
> At that moment a fairy appeared, and said
> "Young man, watch your language, or I'm going to
> have to turn you into a goon!  I'm the good fairy Eunice.
> Can I help you?"  Xman explained his problem to Eunice,
> who smiled and said "All you need is right here,
> just type 'man 2 sigopen'".  Xman did, and saw:

I must thank the good fairy Eunice. ;-) From a programming standpoint, this
looks like a really nice approach. I must say I prefer it to the
various "event" strategies I've seen to date, as it fixes the primary
problem with signals while still allowing us to hook into all the
standard POSIX APIs that already use signals. It'd be nice if I could pass
in a 0 for signum and have the kernel select from unused signals (the
problem being that "unused" is not necessarily easy to define), although I
guess an inefficient version of this could be handled in userland.

I presume the fd could be shared between threads and otherwise behave like
a normal fd, which would be super nice.

I guess the main thing I'm thinking is this could require some significant
changes to the way the kernel behaves. Still, it's worth taking a "try it
and see" approach. If anyone else thinks this is a good idea I may hack
together a sample patch and give it a whirl.

Thanks again good fairy Dan/Eunice. ;-)

--Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Asynchronous IO

2001-04-13 Thread Christopher Smith

--On Friday, April 13, 2001 04:45:07 -0400 Dan Maas <[EMAIL PROTECTED]> wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today
> is that the filesystem code assumes synchronous I/O operations that block
> the whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard
> this was a long-term goal, but nobody's done the work yet (aside from
> maybe the SGI folks with XFS?). Or maybe I don't know what I'm talking
> about...

If the FS supports generic read then this is not a problem. This is what 
SGI's KAIO does as well as Bart's work.

> Bart, glad to hear you are working on an event interface, sounds cool! One
> feature that I really, really, *really* want to see implemented is the
> ability to block on a set of any "waitable kernel objects" with one
> syscall - not just file descriptors, but also SysV semaphores and message
> queues, UNIX signals and child proceses, file locks, pthreads condition
> variables, asynch disk I/O completions, etc. I am dying for a clean way to
> accomplish this that doesn't require more than one thread... (Win32 and
> FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
> kevent()...) IMHO cleaning up this API deficiency is just as important as
> optimizing the extreme case of socket I/O with zillions of file
> descriptors...

Actually, sigwaitinfo() has zero problem waiting on multiple signals. If you
are using real-time signals, each signal can pass a pointer to the relevant
object, so even if you're only blocking on a single signal you can receive
info about several objects.
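
In other words, whoever raises the signal stuffs a pointer to the object into
the siginfo, and the waiter pulls it back out. A minimal sketch using
sigqueue() (an AIO completion would carry the pointer the same way via
sigev_value; the struct and field names here are invented):

/* One signal number, many objects: a realtime signal carries a pointer
 * to the object it refers to, and sigwaitinfo() hands it back. */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>
#include <stdio.h>

struct my_request { int id; };   /* invented example payload */

int main(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &set, NULL);   /* collect it synchronously */

    /* Whoever completes the work attaches the object to the signal. */
    struct my_request req = { .id = 42 };
    union sigval val = { .sival_ptr = &req };
    sigqueue(getpid(), SIGRTMIN, val);

    siginfo_t si;
    sigwaitinfo(&set, &si);
    struct my_request *done = si.si_value.sival_ptr;
    printf("request %d completed\n", done->id);
    return 0;
}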

<insert thread about how signals suck here>

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: a quest for a better scheduler

2001-04-05 Thread Christopher Smith

--On Thursday, April 05, 2001 15:38:41 -0700 "Timothy D. Witham" 
<[EMAIL PROTECTED]> wrote:
>   Database performance:
>   Raw storage I/O performance
>OLTP workload
You probably want to add an OLAP scenario as well.

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Signal Handling Performance?

2001-04-04 Thread Christopher Smith

--On Wednesday, April 04, 2001 21:30:51 -0400 "Carey B. Stortz" 
<[EMAIL PROTECTED]> wrote:
> either stayed the same or had a performance increase. A general decrease
> started around kernel 2.1.32, then performance drastically fell at kernel
> 2.3.20. There is an Excel graph which shows the trend at:
>
> http://euclid.nmu.edu/~benchmark/Carey/signalhandling.gif
>
> I was wondering if anybody had any ideas why this is happening, and what
> happened in kernel 2.3.20 to cause such a decrease in performance?

Lies, damn lies, and benchmarks. ;-) Seriously though, I'm not clear on
what you are measuring or how you are measuring it. It looks like this is
measuring signal latency, which is important, but what about throughput?

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: a quest for a better scheduler

2001-04-04 Thread Christopher Smith

--On Wednesday, April 04, 2001 15:16:32 -0700 Tim Wright <[EMAIL PROTECTED]> 
wrote:
> On Wed, Apr 04, 2001 at 03:23:34PM +0200, Ingo Molnar wrote:
>> nope. The goal is to satisfy runnable processes in the range of NR_CPUS.
>> You are playing word games by suggesting that the current behavior
>> prefers 'low end'. 'thousands of runnable processes' is not 'high end'
>> at all, it's 'broken end'. Thousands of runnable processes are the sign
>> of a broken application design, and 'fixing' the scheduler to perform
>> better in that case is just fixing the symptom. [changing the scheduler
>> to perform better in such situations is possible too, but all solutions
>> proposed so far had strings attached.]
>
> Ingo, you continue to assert this without giving much evidence to back it
> up. All the world is not a web server. If I'm running a large OLTP
> database with thousands of clients, it's not at all unreasonable to
> expect periods where several hundred (forget the thousands) want to be
> serviced by the database engine. That sounds like hundreds of schedulable
> entities be they processes or threads or whatever. This sort of load is
> regularly run on machine with 16-64 CPUs.

Actually, it's not just OLTP; any time you are doing time-sharing among
hundreds of users (something POSIX systems are supposed to be good at),
this will happen.

> Now I will admit that it is conceivable that you can design an
> application that finds out how many CPUs are available, creates threads
> to match that number and tries to divvy up the work between them using
> some combination of polling and asynchronous I/O etc. There are, however
> a number of problems with this approach:

Actually, one way to semi-support this approach is to implement
many-to-many threads as per the Solaris approach. That also requires
significant hacking of both the kernel and the runtime, and is certainly
more error-prone than trying to write a flexible scheduler.

One problem you didn't highlight, and that even the above approach does not
address, is that for security reasons you may very well need each user's
requests to run in a separate process. If you don't, then you have to
implement a very well-tested and secure user-level security mechanism to
ensure things like privacy (above and beyond the time-sharing).

The world is filled with a wide variety of applications, and unless you
know two programming approaches are functionally equivalent (and
event-driven/polling I/O vs. tons of running processes are NOT), you
shouldn't say one approach is "broken". You could say it's a "broken"
approach to building web servers. Unfortunately, things like kernels and
standard libraries have to work well in the general case.

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: a quest for a better scheduler

2001-04-03 Thread Christopher Smith

--On Tuesday, April 03, 2001 18:17:30 -0700 Fabio Riccardi 
<[EMAIL PROTECTED]> wrote:
> Alan Cox wrote:
> Indeed, I'm using RT sigio/sigwait event scheduling, bare clone threads
> and zero-copy io.

Fabio, I'm working on a similar solution, although I'm experimenting with
SGI's KAIO patch to see what it can do. I've had to patch the kernel to
implement POSIX-style signal dispatch semantics (so that the thread which
posted an I/O request doesn't have to be the one which catches the signal).
Are you taking a similar approach, or is the lack of this behavior the
reason you are using so many threads?
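
For anyone without such a patch, the usual userspace workaround is to block
the completion signal everywhere and funnel it to one dedicated thread
sitting in sigwaitinfo(), so it no longer matters which thread submitted the
request. A rough sketch (dispatch() is just a placeholder):

/* Block the AIO completion signal in every thread and let one dedicated
 * thread pull completions off with sigwaitinfo().  Build with -pthread. */
#include <pthread.h>
#include <signal.h>

static void dispatch(siginfo_t *si)
{
    /* hand si->si_value.sival_ptr (e.g. the aiocb) to a worker */
    (void)si;
}

static void *signal_catcher(void *arg)
{
    sigset_t *set = arg;
    siginfo_t si;

    for (;;) {
        if (sigwaitinfo(set, &si) > 0)
            dispatch(&si);
    }
    return NULL;
}

int main(void)
{
    sigset_t set;
    pthread_t tid;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    /* Block before creating any threads so every thread inherits the mask
     * and only the catcher ever consumes SIGRTMIN. */
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    pthread_create(&tid, NULL, signal_catcher, &set);

    /* ... other threads submit aio_read()/aio_write() with
     *     sigev_signo = SIGRTMIN and carry on ... */

    pthread_join(tid, NULL);
    return 0;
}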

--Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


