Re: Fwd: timerfd read only gets single byte?

2007-07-17 Thread Davi Arnaut
340.004: read: 1; total=84 > 341.004: read: 1; total=85 > ^C > == > > Then after bringing the program back into the foreground, I would have > expected to get an overrun count of 334 or thereabouts, but it looks as > though I'm only getting the least signifi
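For context on the overrun count discussed above, here is a minimal sketch of reading an expiration count from a timer file descriptor. It assumes the timerfd_create()/timerfd_settime() interface as documented in current Linux man pages, which is not the 2007 prototype this thread was testing; the helper name count_missed_ticks is made up for illustration. The point it shows is that read() is given a full 8-byte buffer and returns the number of expirations since the previous read.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/timerfd.h>

/*
 * Hypothetical helper: arm a periodic timer, sleep through several
 * expirations, then read the accumulated count in one read() call.
 * Returns the expiration count, or 0 on any error.
 */
uint64_t count_missed_ticks(long interval_ms, long sleep_ms)
{
    /* A zero interval would disarm the timer, so require a positive one. */
    assert(interval_ms > 0 && sleep_ms >= 0);

    int fd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (fd < 0)
        return 0;

    struct timespec period = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L,
    };
    struct itimerspec its = { .it_interval = period, .it_value = period };
    if (timerfd_settime(fd, 0, &its, NULL) < 0) {
        close(fd);
        return 0;
    }

    struct timespec nap = {
        .tv_sec = sleep_ms / 1000,
        .tv_nsec = (sleep_ms % 1000) * 1000000L,
    };
    nanosleep(&nap, NULL);

    /* The counter is a 64-bit value; the buffer must hold all 8 bytes. */
    uint64_t expirations = 0;
    ssize_t n = read(fd, &expirations, sizeof expirations);
    close(fd);

    return n == (ssize_t)sizeof expirations ? expirations : 0;
}
```

With a 10 ms interval and a 105 ms sleep, the returned count should be around 10 rather than 1, which is the behaviour the poster expected after the stopped program resumed.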

[PATCH] timerfd/eventfd context lock doesn't protect against poll_wait

2007-05-17 Thread Davi Arnaut
Hi, poll_wait() callback may modify the waitqueue without holding the context private lock. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> diff --git a/fs/eventfd.c b/fs/eventfd.c index 480e2b3..9c672be 100644 --- a/fs/eventfd.c +++ b/fs/eventfd.c @@ -50,7 +50,7 @@ int eventfd_signal(struc

[PATCH] signalfd: retrieve multiple signals with one read() call

2007-05-19 Thread Davi Arnaut
Hi, Gathering signals in bulk enables server applications to drain a signal queue (almost full of realtime signals) more efficiently by reducing the syscall and file look-up overhead. Very similar to the sigtimedwait4() call described by Niels Provos, Chuck Lever, and Stephen Tweedie in a paper e
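As an illustration of the bulk retrieval described above, the following sketch uses the signalfd() interface as documented in current Linux man pages, where one read() returns as many struct signalfd_siginfo records as are pending and fit in the buffer. It is not the code from this patch; the helper name drain_signals and the MAX_BATCH limit are assumptions made for the example, and it assumes a single-threaded caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <unistd.h>
#include <sys/signalfd.h>
#include <sys/types.h>

#define MAX_BATCH 16   /* hypothetical batch size for this sketch */

/*
 * Hypothetical helper: queue `how_many` realtime signals to ourselves,
 * then collect them with a single read() on a signalfd.
 * Returns the number of records read, or -1 on error.
 * Note: `signo` is left blocked in the caller's signal mask.
 */
int drain_signals(int signo, int how_many)
{
    assert(how_many >= 0 && how_many <= MAX_BATCH);

    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, signo);

    /* The signal must be blocked so it stays pending for the fd. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;

    int fd = signalfd(-1, &mask, SFD_NONBLOCK);
    if (fd < 0)
        return -1;

    for (int i = 0; i < how_many; i++) {
        union sigval value = { .sival_int = i };
        if (sigqueue(getpid(), signo, value) < 0) {
            close(fd);
            return -1;
        }
    }

    /* One system call fetches every pending record that fits. */
    struct signalfd_siginfo batch[MAX_BATCH];
    ssize_t n = read(fd, batch, sizeof batch);
    close(fd);

    return n < 0 ? -1 : (int)(n / (ssize_t)sizeof batch[0]);
}
```

Calling drain_signals(SIGRTMIN, 3) should report three records from one read(), which is the saving in syscall and file look-up overhead that the patch description refers to.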

[PATCH] shrink task_struct by 16 bytes

2007-05-19 Thread Davi Arnaut
Hi, Shrink task_struct by replacing the notifier callback with a notifier list. The block_all_signals() function (and the signal notifier mechanism) has only one user at the moment, which is drm. Pahole output for task_struct: i386 before: /* size: 2640, cachelines: 42 */ /* sum members:

Re: [PATCH] shrink task_struct by 16 bytes

2007-05-21 Thread Davi Arnaut
Christoph Hellwig wrote: > On Sun, May 20, 2007 at 12:40:11AM -0300, Davi Arnaut wrote: >> Hi, >> >> Shrink task_struct by replacing the notifier callback with a >> notifier list. The block_all_signals() function (and the signal >> notifier mechanism) has only one

Re: + signalfd-retrieve-multiple-signals-with-one-read-call.patch added to -mm tree

2007-05-21 Thread Davi Arnaut
Davide Libenzi wrote: > On Mon, 21 May 2007, Oleg Nesterov wrote: > >>> + schedule(); >>> + locked = signalfd_lock(ctx, &lk); >>> + if (unlikely(!locked)) { >>> + /* >>> +* Let the caller read zero byte, ala socket >>> +

[PATCH] rfc: threaded epoll_wait thundering herd

2007-05-04 Thread Davi Arnaut
Hi, If multiple threads are parked on epoll_wait (on a single epoll fd) and events become available, epoll performs a wake up of all threads of the poll wait list, causing a thundering herd of processes trying to grab the eventpoll lock. This patch addresses this by using exclusive waiters (wake

Re: [PATCH] rfc: threaded epoll_wait thundering herd

2007-05-05 Thread Davi Arnaut
Davide Libenzi wrote: > On Fri, 4 May 2007, Davi Arnaut wrote: > >> Hi, >> >> If multiple threads are parked on epoll_wait (on a single epoll fd) and >> events become available, epoll performs a wake up of all threads of the >> poll wait list, causing a thunder

Re: [patch 14/22] pollfs: pollable futex

2007-05-06 Thread Davi Arnaut
livery machinery is not necessary. And it makes me wonder why I hadn't followed its "watch" approach for futexes: futex_init(); // Davide's anon fd futex_add_watch(int fd, void *addr, int val, uint32_t mask); futex_rm_watch(int fd, uint32_t wd); Anyway, this unifying event machin

Re: [PATCH] rfc: threaded epoll_wait thundering herd

2007-05-07 Thread Davi Arnaut
Ulrich Drepper wrote: > On 5/5/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> A google search turns up a few users. It also addresses some complaints >> from Drepper. > > There is a huge problem with this approach and we're back at the > inadequate interface

Re: [PATCH] rfc: threaded epoll_wait thundering herd

2007-05-07 Thread Davi Arnaut
Ulrich Drepper wrote: > On 5/7/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> See Linus's message on this same thread. > > No. I'm talking about the userlevel side, not kernel side. So you probably knew the answer before asking the question. > If a thread is ca

Re: [PATCH] rfc: threaded epoll_wait thundering herd

2007-05-07 Thread Davi Arnaut
Ulrich Drepper wrote: > On 5/7/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> Anyway, we could extend epoll to be mmapable... > > Welcome to kevent, well, except with a lot more ballast and awkward > interfaces. So an mmapable epoll is equivalent to kevent.. great! Well

[patch 01/22] pollfs: kernel-side API header

2007-05-01 Thread Davi Arnaut
Add pollfs_fs.h header which contains the kernel-side declarations and auxiliary macros for type safety checks. Those macros can be simplified later. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/pollfs_fs.h | 57 ++ 1 file c

[patch 00/22] pollfs: filesystem abstraction for pollable objects

2007-05-01 Thread Davi Arnaut
upling the core filesystem from the "subsystems" (mere push and pop operations). Currently implemented waitable "objects" are: signals, futexes, ai/o blocks and timers. More details at each patch. http://haxent.com/~davi/pollfs/ Comments are welcome. -- Davi Arnaut - To

[patch 22/22] pollfs: x86_64, wire up the plaio system call

2007-05-01 Thread Davi Arnaut
Make the plaio syscall available to user-space on x86_64. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/x86_64/ia32/ia32entry.S |1 + include/asm-x86_64/unistd.h |4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) Index: linux-2.6/arch/x86_64/ia32/ia32entry.S =

[patch 20/22] pollfs: export the plaio system call

2007-05-01 Thread Davi Arnaut
Export the new plaio syscall prototype. While there, make it "conditional". Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/syscalls.h |2 ++ kernel/sys_ni.c |1 + 2 files changed, 3 insertions(+) Index: linux-2.6/include/linux/syscalls.h

[patch 12/22] pollfs: x86_64, wire up the pltimer system call

2007-05-01 Thread Davi Arnaut
Make the pltimer syscall available to user-space on x86_64. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/x86_64/ia32/ia32entry.S |1 + include/asm-x86_64/unistd.h |4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) Index: linux-2.6/arch/x86_64/ia32/ia32entry.S ===

[patch 11/22] pollfs: x86, wire up the pltimer system call

2007-05-01 Thread Davi Arnaut
Make the pltimer syscall available to user-space on x86. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/i386/kernel/syscall_table.S |1 + include/asm-i386/unistd.h|3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) Index: linux-2.6/include/asm-i386/unistd.h ==

[patch 14/22] pollfs: pollable futex

2007-05-01 Thread Davi Arnaut
Asynchronously wait for FUTEX_WAKE operation on a futex if it still contains a given value. There can be only one futex wait per file descriptor. However, it can be rearmed (possibly at a different address) anytime. The pollable futex approach is far superior (send and receive events from userspac

[patch 15/22] pollfs: export the plfutex system call

2007-05-01 Thread Davi Arnaut
Export the new plfutex syscall prototype. While there, make it "conditional". Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/syscalls.h |2 ++ kernel/sys_ni.c |1 + 2 files changed, 3 insertions(+) Index: linux-2.6/include/linux/syscalls.h ==

[patch 16/22] pollfs: x86, wire up the plfutex system call

2007-05-01 Thread Davi Arnaut
Make the plfutex syscall available to user-space on x86. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/i386/kernel/syscall_table.S |1 + include/asm-i386/unistd.h|3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) Index: linux-2.6/include/asm-i386/unistd.h ==

[patch 18/22] pollfs: check if a AIO event ring is empty

2007-05-01 Thread Davi Arnaut
The aio_ring_empty() function returns true if the AIO event ring has no elements, false otherwise. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- fs/aio.c| 17 + include/linux/aio.h |1 + 2 files changed, 18 insertions(+) Index: linux-2.6/fs/aio.c ===
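The emptiness test described above can be sketched in user space as follows. This is not the kernel's aio_ring_empty(); the struct event_ring layout and the ring_* helpers are hypothetical, and the sketch only assumes the usual head/tail ring convention in which the ring is empty when both indexes are equal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring bookkeeping: only the indexes, no event payload. */
struct event_ring {
    unsigned nr;    /* number of slots; one slot is always kept free */
    unsigned head;  /* next slot to consume */
    unsigned tail;  /* next slot to fill */
};

/* True if the ring holds no elements, false otherwise. */
bool ring_empty(const struct event_ring *ring)
{
    return ring->head == ring->tail;
}

/* Record one produced event; returns false if the ring is full. */
bool ring_push(struct event_ring *ring)
{
    assert(ring->nr > 1);
    unsigned next = (ring->tail + 1) % ring->nr;
    if (next == ring->head)
        return false;
    ring->tail = next;
    return true;
}

/* Consume one event; returns false if there is nothing to consume. */
bool ring_pop(struct event_ring *ring)
{
    if (ring_empty(ring))
        return false;
    ring->head = (ring->head + 1) % ring->nr;
    return true;
}
```

A poll handler built on such a check would report the descriptor as readable only while ring_empty() returns false.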

[patch 10/22] pollfs: export the pltimer system call

2007-05-01 Thread Davi Arnaut
Export the new pltimer syscall prototype. While there, make it "conditional". Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/syscalls.h |2 ++ kernel/sys_ni.c |1 + 2 files changed, 3 insertions(+) Index: linux-2.6/include/linux/syscalls.h ==

[patch 07/22] pollfs: x86, wire up the plsignal system call

2007-05-01 Thread Davi Arnaut
Make the plsignal syscall available to user-space on x86. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/i386/kernel/syscall_table.S |1 + include/asm-i386/unistd.h|3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) Index: linux-2.6/include/asm-i386/unistd.h =

[patch 19/22] pollfs: pollable aio

2007-05-01 Thread Davi Arnaut
Submit, retrieve, or poll aio requests for completion through a file descriptor. User supplies an aio_context_t that is used to fetch a reference to the kioctx. Once the file descriptor is closed, the reference is decremented. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- fs/pollfs/Mak

[patch 03/22] pollfs: asynchronously wait for a signal

2007-05-01 Thread Davi Arnaut
Add a wait queue to the task_struct in order to be able to associate (wait for) a signal with other resources. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/init_task.h |1 + include/linux/sched.h |1 + kernel/fork.c |1 + kernel/signal.c

[patch 21/22] pollfs: x86, wire up the plaio system call

2007-05-01 Thread Davi Arnaut
Make the plaio syscall available to user-space on x86. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/i386/kernel/syscall_table.S |1 + include/asm-i386/unistd.h|3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) Index: linux-2.6/include/asm-i386/unistd.h

[patch 09/22] pollfs: pollable hrtimers

2007-05-01 Thread Davi Arnaut
Per file descriptor high-resolution timers. A classic unix file interface for the POSIX timer_(create|settime|gettime|delete) family of functions. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- fs/pollfs/Makefile |1 fs/pollfs/timer.c | 198 ++
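To show what a file interface to a timer looks like from user space, here is a minimal sketch that waits for a one-shot timer with poll(). It uses timerfd as documented in current Linux man pages as a stand-in, since the pltimer call from this series cannot be assumed to exist on any given system; the helper name timer_fired_within is invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>
#include <sys/timerfd.h>

/*
 * Hypothetical helper: arm a one-shot timer for `delay_ms`, then poll
 * the descriptor for at most `wait_ms`.
 * Returns 1 if the timer became readable, 0 on timeout, -1 on error.
 */
int timer_fired_within(long delay_ms, int wait_ms)
{
    /* A zero it_value would disarm the timer instead of firing it. */
    assert(delay_ms > 0);

    int fd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (fd < 0)
        return -1;

    struct itimerspec its = {
        .it_value = {
            .tv_sec = delay_ms / 1000,
            .tv_nsec = (delay_ms % 1000) * 1000000L,
        },
    };
    if (timerfd_settime(fd, 0, &its, NULL) < 0) {
        close(fd);
        return -1;
    }

    /* The timer is now just another descriptor in the poll set. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, wait_ms);
    close(fd);

    if (ready < 0)
        return -1;
    return (ready > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Because the timer is an ordinary descriptor, the same poll() call could also watch sockets or pipes, which is the kind of unification the series is after.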

[patch 08/22] pollfs: x86_64, wire up the plsignal system call

2007-05-01 Thread Davi Arnaut
Make the plsignal syscall available to user-space on x86_64. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/x86_64/ia32/ia32entry.S |3 ++- include/asm-x86_64/unistd.h |4 +++- 2 files changed, 5 insertions(+), 2 deletions(-) Index: linux-2.6/include/asm-x86_64/unistd.h

[patch 17/22] pollfs: x86_64, wire up the plfutex system call

2007-05-01 Thread Davi Arnaut
Make the plfutex syscall available to user-space on x86_64. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- arch/x86_64/ia32/ia32entry.S |1 + include/asm-x86_64/unistd.h |4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) Index: linux-2.6/arch/x86_64/ia32/ia32entry.S ===

[patch 13/22] pollfs: asynchronous futex wait

2007-05-01 Thread Davi Arnaut
Break apart and export the futex_wait function in order to be able to associate (wait for) a futex with other resources. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/futex.h | 80 ++ kernel/futex.c| 130 ++--

[patch 06/22] pollfs: export the plsignal system call

2007-05-01 Thread Davi Arnaut
Export the new plsignal syscall prototype. While there, make it "conditional". Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- include/linux/syscalls.h |2 ++ kernel/sys_ni.c |1 + 2 files changed, 3 insertions(+) Index: linux-2.6/include/linux/syscalls.h =

[patch 04/22] pollfs: pollable signal

2007-05-01 Thread Davi Arnaut
Retrieve multiple per-process signals through a file descriptor. The mask of signals can be changed at any time. Also, the compat code can be kept very simple. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- fs/pollfs/Makefile |2 fs/pollfs/signal.c | 144 +
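The remark that the signal mask can be changed at any time can be illustrated with signalfd as documented in current Linux man pages, where passing an existing descriptor to signalfd() replaces its mask. This is an analogy, not the plsignal code from this patch; the helper name signo_after_mask_change is hypothetical and the sketch assumes a single-threaded caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <unistd.h>
#include <sys/signalfd.h>
#include <sys/types.h>

/*
 * Hypothetical helper: open a signalfd watching `first`, raise `second`,
 * confirm nothing is readable, then switch the mask to `second` and read.
 * Returns the signal number read after the mask change, or -1 on error.
 * Note: both signals are left blocked in the caller's signal mask.
 */
int signo_after_mask_change(int first, int second)
{
    assert(first != second);

    sigset_t both, mask;
    sigemptyset(&both);
    sigaddset(&both, first);
    sigaddset(&both, second);
    if (sigprocmask(SIG_BLOCK, &both, NULL) < 0)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, first);
    int fd = signalfd(-1, &mask, SFD_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Pending for the process, but not yet in the descriptor's mask. */
    if (kill(getpid(), second) < 0) {
        close(fd);
        return -1;
    }

    struct signalfd_siginfo info;
    if (read(fd, &info, sizeof info) > 0) {
        close(fd);          /* unexpected: nothing should match yet */
        return -1;
    }

    /* Re-use the same descriptor with a new mask. */
    sigemptyset(&mask);
    sigaddset(&mask, second);
    if (signalfd(fd, &mask, 0) < 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, &info, sizeof info);
    close(fd);
    return n == (ssize_t)sizeof info ? (int)info.ssi_signo : -1;
}
```

Calling signo_after_mask_change(SIGUSR1, SIGUSR2) should return SIGUSR2: the signal was already pending, and it became readable only once the descriptor's mask included it.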

[patch 05/22] pollfs: pollable signal compat code

2007-05-01 Thread Davi Arnaut
Compat handlers for the pollable signal operations. Later the compat operations can operate on a per call basis. Signed-off-by: Davi E. M. Arnaut <[EMAIL PROTECTED]> --- fs/pollfs/signal.c | 85 + 1 file changed, 85 insertions(+) Index: lin

[patch 02/22] pollfs: file system operations

2007-05-01 Thread Davi Arnaut
The key feature of the pollfs file operations is to internally handle pollable (waitable) resources as files without exporting complex and bug-prone underlying (VFS) implementation details. All resource handlers are required to implement the read, write, poll, release operations and must not block

Re: [patch 14/22] pollfs: pollable futex

2007-05-01 Thread Davi Arnaut
Eric Dumazet wrote: > Davi Arnaut a écrit : >> Asynchronously wait for FUTEX_WAKE operation on a futex if it still contains >> a given value. There can be only one futex wait per file descriptor. However, >> it can be rearmed (possibly at a different address) anytime. >

Re: [patch 14/22] pollfs: pollable futex

2007-05-01 Thread Davi Arnaut
Eric Dumazet wrote: > Davi Arnaut a écrit : >> Eric Dumazet wrote: >>> Davi Arnaut a écrit : >>>> Asynchronously wait for FUTEX_WAKE operation on a futex if it still >>>> contains >>>> a given value. There can be only one futex wait per

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Eric Dumazet wrote: > Davi Arnaut a écrit : >> Eric Dumazet wrote: >>> Davi Arnaut a écrit : >>>> Asynchronously wait for FUTEX_WAKE operation on a futex if it still >>>> contains >>>> a given value. There can be only one futex wait per

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Ulrich Drepper wrote: > On 5/1/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> The pollable futex approach is far superior (send and receive events from >> userspace or kernel) to eventfd and fixes (supersedes) FUTEX_FD at the same >> time. >> [...] > >

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Ulrich Drepper wrote: > On 5/1/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> The pollable futex approach is far superior (send and receive events from >> userspace or kernel) to eventfd and fixes (supersedes) FUTEX_FD at the same >> time. >> [...] > >

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
ner and safer to do it right instead of piling on more and more workarounds for special situations. It's simple as is; there is no need to overdesign. -- Davi Arnaut

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Ulrich Drepper wrote: On 5/2/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: It's quite easy to implement this scheme by write()ing the futexes all at once but that would break the one futex per fd association. For atomicity: if one of the futexes can't be queued, we would rollba

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Ulrich Drepper wrote: On 5/2/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: thread A: int fd = plfutex(addr, 0); do poll(fdset+fd); process network events queue obj to thread B if fd: job processed thread B: wait_job(); proce

Re: [patch 00/22] pollfs: filesystem abstraction for pollable objects

2007-05-02 Thread Davi Arnaut
vent_ (address/value). -- Davi Arnaut

Re: [patch 14/22] pollfs: pollable futex

2007-05-02 Thread Davi Arnaut
Ulrich Drepper wrote: On 5/2/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: NO! Every single waiter of the _file descriptor_ is waked, not of the futex. And how is this better? In this world of yours a program must have one file descriptor for each single futex which is used like thi

Re: [patch 00/22] pollfs: filesystem abstraction for pollable objects

2007-05-02 Thread Davi Arnaut
Davide Libenzi wrote: On Wed, 2 May 2007, Davi Arnaut wrote: Davide Libenzi wrote: On Tue, 1 May 2007, Andrew Morton wrote: David, could you provide some feedback please? The patches are stunningly free of comments, but you used to do that to me pretty often so my

Re: [patch 00/22] pollfs: filesystem abstraction for pollable objects

2007-05-02 Thread Davi Arnaut
Davide Libenzi wrote: On Wed, 2 May 2007, Davi Arnaut wrote: So in this case I may borrow some signalfd code :-) I really like the signalfd approach, but IMHO the code is quite ugly and duplicates a lot of hairy code. Ugly, really? Please ... + while (!mutex_trylock(&

Re: [patch 09/22] pollfs: pollable hrtimers

2007-05-02 Thread Davi Arnaut
Thomas Gleixner wrote: > On Wed, 2007-05-02 at 02:22 -0300, Davi Arnaut wrote: >> plain text document attachment (pollfs-timer.patch) >> Per file descriptor high-resolution timers. A classic unix file interface for >> the POSIX timer_(create|settime|gettime|delete