Re: [take29 0/8] kevent: Generic event handling mechanism.

2006-12-29 Thread Evgeniy Polyakov
On Thu, Dec 28, 2006 at 04:56:45PM +0100, Ingo Molnar ([EMAIL PROTECTED]) wrote:
 
 * Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 
  Generic event handling mechanism.
 
 it would be /very/ helpful to state against which kernel tree the 
 patch-queue is. It does not apply to 2.6.20-rc1 nor to -rc2 nor to 
 2.6.19. At which point i gave up ...

It was against the 2.6.18 git tree (d4397acde6fd047f13c744e5471a9bfe287f78a3).
The next patchset, which adds the possibility to inject an already-read event
and userspace reserved notifications, will be against the latest tree (I will
push it today, before the final New Year celebration starts).

   Ingo

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [take29 0/8] kevent: Generic event handling mechanism.

2006-12-29 Thread Evgeniy Polyakov
On Thu, Dec 28, 2006 at 05:01:37PM +0100, Ingo Molnar ([EMAIL PROTECTED]) wrote:
 
 * Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 
  Generic event handling mechanism.
 
 i see it covers a lot of event sources, but i cannot see block IO
 notifications. Am i missing something?

Depending on what it is :)
If you mean kevent-based AIO, it was dropped to reduce the size of the
patchset, and in favour of the new AIO design.
Other kinds of read/write notifications can be handled by poll/select
notifications.

   Ingo

-- 
Evgeniy Polyakov


Re: [PATCH] igmp: spin_lock_bh in timer (Re: BUG: soft lockup detected on CPU#0!)

2006-12-29 Thread Jarek Poplawski
On Wed, Dec 27, 2006 at 08:16:10AM -0800, Ben Greear wrote:
 Jarek Poplawski wrote:
 On Fri, Dec 22, 2006 at 06:05:18AM -0800, Ben Greear wrote:
 Jarek Poplawski wrote:
 On Fri, Dec 22, 2006 at 08:13:08AM +0100, Jarek Poplawski wrote:
 On 20-12-2006 03:13, Ben Greear wrote:
 This is from 2.6.18.2 kernel with my patch set.  The MAC-VLANs are in 
 active use.
 From the backtrace, I am thinking this might be a generic problem, 
 however.
 
 Any ideas about what this could be?  It seems to be reproducible every 
 day or
 ...
 If it doesn't help, I hope lockdep will be more
 precise when you'll upgrade to 2.6.19 or higher.
 ... or when you enable lockdep in 2.6.18 (I've
 forgotten it's there already!).
 I got lucky..the system was available by ssh still.  I see this in the 
 boot logs..I assume
 this means lockdep is enabled?  Should I have expected to see a lockdep 
 trace in the case of
 this soft-lockup then?
 
 .
 Dec 19 04:33:48 localhost kernel: Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
 Dec 19 04:33:48 localhost kernel: ... MAX_LOCKDEP_SUBCLASSES:8
 
 Yes, you got it enabled in the config.
 
 If there is no message later about validator
 turning off and no warnings which could point
 at lockdep then it is working.
 
 But then, IMHO, there is a rather small probability that
 this bug really comes from a lockup. Another possibility
 is that hardware irqs (the timer in particular) are turned
 off by something (maybe those hacks?) for an extremely
 long time (~10 sec.).
 
 The system hangs and does not recover (well, a few processes
 continue on the other processor for a few minutes before they
 too deadlock...)
 
 I am guessing this problem has been around for a while, but it
 is only triggered when interfaces are created, and probably only
 when UDP traffic is already running heavily on the system.  Most
 systems w/out virtual devices will not trigger this sort of
 race.

I had one more look at this, considering the info about
creating interfaces, and here are some of my doubts about
possible races (I hope you'll forgive me if I totally
miss some point):

- During the register procedure the real device seems to
be up and running; vlan_rx_register is used, but I see that
drivers differ here: some of them do netif_stop and
disable irqs while others only lock. It seems they
can start calling vlan_hwaccel_rx directly after
this (sometimes even during registration, if an
irq happens).

- vlan_hwaccel_rx is checking skb_bond_should_drop
but I'm not sure it is really useful here, so
probably at least broadcasts and multicasts can
use netif_rx even before vlan_dev is up (and your
log accidentally shows multicast receive).

- Preemption is blocked for quite a long time in
vlan_skb_recv and during netif_receive; I guess
this could also be a possible reason for triggering
the softlockup bug. I wonder if lowering the
value of netdev_max_backlog wouldn't improve
scheduling times.

Happy New Year,

Jarek P.


[take30 5/9] kevent: Timer notifications.

2006-12-29 Thread Evgeniy Polyakov

Timer notifications.

Timer notifications can be used for fine-grained per-process time
management, since interval timers are very inconvenient to use,
and they are limited.

This subsystem uses high-resolution timers.
id.raw[0] is used as the number of seconds.
id.raw[1] is used as the number of nanoseconds.
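
As an illustration of the id.raw[] layout described above, here is a minimal
userspace-side sketch that splits a period given in milliseconds into the
seconds/nanoseconds pair. The names sketch_timer_id and timer_id_from_ms are
hypothetical and not part of the patch; in the real interface the field lives
inside struct ukevent.

```c
#include <stdint.h>

/* Hypothetical stand-in for the id field of struct ukevent. */
struct sketch_timer_id {
	uint32_t raw[2];	/* raw[0]: seconds, raw[1]: nanoseconds */
};

/*
 * Split a period in milliseconds into seconds and nanoseconds.
 * Periods longer than UINT32_MAX seconds are truncated by the cast.
 */
static struct sketch_timer_id timer_id_from_ms(uint64_t ms)
{
	struct sketch_timer_id id;

	id.raw[0] = (uint32_t)(ms / 1000);
	id.raw[1] = (uint32_t)((ms % 1000) * 1000000u);
	return id;
}
```

A 1500 ms period would thus be requested as raw[0] = 1 and raw[1] = 500000000.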

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/kernel/kevent/kevent_timer.c b/kernel/kevent/kevent_timer.c
new file mode 100644
index 000..c21a155
--- /dev/null
+++ b/kernel/kevent/kevent_timer.c
@@ -0,0 +1,114 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/hrtimer.h>
+#include <linux/jiffies.h>
+#include <linux/kevent.h>
+
+struct kevent_timer
+{
+	struct hrtimer		ktimer;
+	struct kevent_storage	ktimer_storage;
+	struct kevent		*ktimer_event;
+};
+
+static int kevent_timer_func(struct hrtimer *timer)
+{
+	struct kevent_timer *t = container_of(timer, struct kevent_timer, ktimer);
+	struct kevent *k = t->ktimer_event;
+
+	kevent_storage_ready(&t->ktimer_storage, NULL, KEVENT_MASK_ALL);
+	hrtimer_forward(timer, timer->base->softirq_time,
+			ktime_set(k->event.id.raw[0], k->event.id.raw[1]));
+	return HRTIMER_RESTART;
+}
+
+static struct lock_class_key kevent_timer_key;
+
+static int kevent_timer_enqueue(struct kevent *k)
+{
+	int err;
+	struct kevent_timer *t;
+
+	t = kmalloc(sizeof(struct kevent_timer), GFP_KERNEL);
+	if (!t)
+		return -ENOMEM;
+
+	hrtimer_init(&t->ktimer, CLOCK_MONOTONIC, HRTIMER_REL);
+	t->ktimer.expires = ktime_set(k->event.id.raw[0], k->event.id.raw[1]);
+	t->ktimer.function = kevent_timer_func;
+	t->ktimer_event = k;
+
+	err = kevent_storage_init(&t->ktimer, &t->ktimer_storage);
+	if (err)
+		goto err_out_free;
+	lockdep_set_class(&t->ktimer_storage.lock, &kevent_timer_key);
+
+	err = kevent_storage_enqueue(&t->ktimer_storage, k);
+	if (err)
+		goto err_out_st_fini;
+
+	hrtimer_start(&t->ktimer, t->ktimer.expires, HRTIMER_REL);
+
+	return 0;
+
+err_out_st_fini:
+	kevent_storage_fini(&t->ktimer_storage);
+err_out_free:
+	kfree(t);
+
+	return err;
+}
+
+static int kevent_timer_dequeue(struct kevent *k)
+{
+	struct kevent_storage *st = k->st;
+	struct kevent_timer *t = container_of(st, struct kevent_timer, ktimer_storage);
+
+	hrtimer_cancel(&t->ktimer);
+	kevent_storage_dequeue(st, k);
+	kfree(t);
+
+	return 0;
+}
+
+static int kevent_timer_callback(struct kevent *k)
+{
+	k->event.ret_data[0] = jiffies_to_msecs(jiffies);
+	return 1;
+}
+
+static int __init kevent_init_timer(void)
+{
+	struct kevent_callbacks tc = {
+		.callback = kevent_timer_callback,
+		.enqueue = kevent_timer_enqueue,
+		.dequeue = kevent_timer_dequeue,
+		.flags = 0,
+	};
+
+	return kevent_add_callbacks(&tc, KEVENT_TIMER);
+}
+module_init(kevent_init_timer);
+



[take30 4/9] kevent: Socket notifications.

2006-12-29 Thread Evgeniy Polyakov

Socket notifications.

This patch includes socket send/recv/accept notifications.
With a trivial web server based on kevent and these features
instead of epoll, its performance increased more than noticeably.
More details about various benchmarks and the server itself
(evserver_kevent.c) can be found on the project's homepage.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/inode.c b/fs/inode.c
index bf21dc6..82817b1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -21,6 +21,7 @@
 #include <linux/cdev.h>
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
+#include <linux/kevent.h>
 #include <linux/mount.h>
 
 /*
@@ -164,12 +165,18 @@ static struct inode *alloc_inode(struct super_block *sb)
 		}
 		inode->i_private = NULL;
 		inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+		kevent_storage_init(inode, &inode->st);
+#endif
 	}
 	return inode;
 }
 
 void destroy_inode(struct inode *inode) 
 {
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+	kevent_storage_fini(&inode->st);
+#endif
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
 	if (inode->i_sb->s_op->destroy_inode)
diff --git a/include/net/sock.h b/include/net/sock.h
index 03684e7..d840399 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -49,6 +49,7 @@
 #include <linux/skbuff.h>	/* struct sk_buff */
 #include <linux/mm.h>
 #include <linux/security.h>
+#include <linux/kevent.h>
 
 #include <linux/filter.h>
 
@@ -451,6 +452,21 @@ static inline int sk_stream_memory_free(struct sock *sk)
 
 extern void sk_stream_rfree(struct sk_buff *skb);
 
+struct socket_alloc {
+	struct socket socket;
+	struct inode vfs_inode;
+};
+
+static inline struct socket *SOCKET_I(struct inode *inode)
+{
+	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
+}
+
+static inline struct inode *SOCK_INODE(struct socket *socket)
+{
+	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
+}
+
 static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk)
 {
 	skb->sk = sk;
@@ -478,6 +494,7 @@ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb)
 		sk->sk_backlog.tail = skb;
 	}
 	skb->next = NULL;
+	kevent_socket_notify(sk, KEVENT_SOCKET_RECV);
 }
 
 #define sk_wait_event(__sk, __timeo, __condition)  \
@@ -679,21 +696,6 @@ static inline struct kiocb *siocb_to_kiocb(struct sock_iocb *si)
 	return si->kiocb;
 }
 
-struct socket_alloc {
-	struct socket socket;
-	struct inode vfs_inode;
-};
-
-static inline struct socket *SOCKET_I(struct inode *inode)
-{
-	return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
-}
-
-static inline struct inode *SOCK_INODE(struct socket *socket)
-{
-	return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
-}
-
 extern void __sk_stream_mem_reclaim(struct sock *sk);
 extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index b7d8317..2763b30 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -864,6 +864,7 @@ static inline int tcp_prequeue(struct sock *sk, struct sk_buff *skb)
 			tp->ucopy.memory = 0;
 		} else if (skb_queue_len(&tp->ucopy.prequeue) == 1) {
 			wake_up_interruptible(sk->sk_sleep);
+			kevent_socket_notify(sk, KEVENT_SOCKET_RECV|KEVENT_SOCKET_SEND);
 			if (!inet_csk_ack_scheduled(sk))
 				inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 							  (3 * TCP_RTO_MIN) / 4,
diff --git a/kernel/kevent/kevent_socket.c b/kernel/kevent/kevent_socket.c
new file mode 100644
index 000..d1a2701
--- /dev/null
+++ b/kernel/kevent/kevent_socket.c
@@ -0,0 +1,144 @@
+/*
+ * kevent_socket.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>

[take30 1/9] kevent: Description.

2006-12-29 Thread Evgeniy Polyakov

Description.


diff --git a/Documentation/kevent.txt b/Documentation/kevent.txt
new file mode 100644
index 000..95cb36e
--- /dev/null
+++ b/Documentation/kevent.txt
@@ -0,0 +1,244 @@
+Description.
+
+int kevent_init(struct kevent_ring *ring, unsigned int ring_size, 
+   unsigned int flags);
+
+num - size of the ring buffer in events 
+ring - pointer to allocated ring buffer
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value: kevent control file descriptor or negative error value.
+
+ struct kevent_ring
+ {
+   unsigned int ring_kidx, ring_over;
+   struct ukevent event[0];
+ }
+
+ring_kidx - index in the ring buffer where kernel will put new events 
+   when kevent_wait() or kevent_get_events() is called 
+ring_over - number of overflows of ring_uidx that have happened since the
+	start. The overflow counter is used to prevent the situation where
+	two threads are going to free the same events, but one of them was
+	scheduled away for so long that the ring indexes wrapped; when that
+	thread is awakened, it would otherwise free events other than those
+	it is supposed to free.
+
+Example userspace code (ring_buffer.c) can be found on project's homepage.
+
+Each kevent syscall can be a so-called cancellation point in glibc, i.e. when
+a thread has been cancelled in a kevent syscall, the thread can be safely
+removed and no events will be lost, since each syscall (kevent_wait() or
+kevent_get_events()) will copy events into a special ring buffer, accessible
+from other threads or even processes (if shared memory is used).
+
+When a kevent is removed (not dequeued when it is ready, but just removed),
+it is not copied into the ring buffer even if it was ready: if it is being
+removed, no one cares about it (otherwise the user would wait until it becomes
+ready and get it the usual way using kevent_get_events() or kevent_wait()),
+so there is no need to copy it to the ring buffer.
+
+---
+
+
+int kevent_ctl(int fd, unsigned int cmd, unsigned int num, struct ukevent *arg);
+
+fd - is the file descriptor referring to the kevent queue to manipulate. 
+It is created by opening /dev/kevent char device, which is created with 
+dynamic minor number and major number assigned for misc devices. 
+
+cmd - is the requested operation. It can be one of the following:
+KEVENT_CTL_ADD - add event notification 
+KEVENT_CTL_REMOVE - remove event notification 
+KEVENT_CTL_MODIFY - modify existing notification 
+KEVENT_CTL_READY - mark existing events as ready; if the number of events is
+	zero, it just wakes up a thread parked in the syscall
+
+num - number of struct ukevent in the array pointed to by arg 
+arg - array of struct ukevent
+
+Return value: 
+ number of events processed or negative error value.
+
+When called, kevent_ctl will carry out the operation specified in the 
+cmd parameter.
+---
+
+ int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr, 
+   struct timespec timeout, struct ukevent *buf, unsigned flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+min_nr - minimum number of completed events that kevent_get_events will block 
+waiting for 
+max_nr - number of struct ukevent in buf 
+timeout - time to wait before returning less than min_nr 
+ events. If this is -1, then wait forever. 
+buf - pointer to an array of struct ukevent. 
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied or negative error value.
+
+kevent_get_events will wait timeout milliseconds for at least min_nr completed 
+events, copying completed struct ukevents to buf and deleting any 
+KEVENT_REQ_ONESHOT event requests. In nonblocking mode it returns as many 
+events as possible, but not more than max_nr. In blocking mode it waits until
+the timeout expires or at least min_nr events are ready.
+
+This function copies events into the ring buffer if it was initialized; if the
+ring buffer is full, the KEVENT_RET_COPY_FAILED flag is set in the ret_flags
+field.
+---
+
+ int kevent_wait(int ctl_fd, unsigned int num, unsigned int old_uidx, 
+   struct timespec timeout, unsigned int flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+num - number of processed kevents 
+old_uidx - the last index user is aware of
+timeout - time to wait until there is free space in kevent queue
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied into ring buffer or negative error value.
+
+This syscall waits until either the timeout expires or at least one event
+becomes ready. It also copies events into the special ring buffer. If the ring
+buffer is full, it waits until there are ready events and then returns.
+If kevent is one-shot kevent it is 
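
To make the role of the ring_over counter described above more concrete, here
is a small single-threaded sketch of the idea: a request to free events is
honoured only if the caller's snapshot of both the index and the overflow
counter is still current. All names (sketch_ring, sketch_ring_advance,
sketch_ring_commit, SKETCH_RING_SIZE) are hypothetical, and this is not the
patch's actual implementation; real code would also need locking or atomics.

```c
#include <stdbool.h>

#define SKETCH_RING_SIZE 4u	/* hypothetical ring size, in events */

/* Hypothetical view of the ring indexes. */
struct sketch_ring {
	unsigned int uidx;	/* next slot userspace will consume */
	unsigned int over;	/* number of times uidx has wrapped */
};

/* Advance uidx by num slots, counting every wrap in over. */
static void sketch_ring_advance(struct sketch_ring *r, unsigned int num)
{
	unsigned int pos = r->uidx + num;

	r->over += pos / SKETCH_RING_SIZE;
	r->uidx = pos % SKETCH_RING_SIZE;
}

/*
 * Free num events only if the caller's snapshot is still current.
 * A thread that slept through a full wrap sees the same uidx but a
 * different over, so its stale request is rejected.
 */
static bool sketch_ring_commit(struct sketch_ring *r, unsigned int seen_uidx,
			       unsigned int seen_over, unsigned int num)
{
	if (r->uidx != seen_uidx || r->over != seen_over)
		return false;
	sketch_ring_advance(r, num);
	return true;
}
```

Without the over comparison, a thread that was scheduled away while the ring
wrapped exactly once would pass the uidx check and free the wrong events.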

[take30 6/9] kevent: Pipe notifications.

2006-12-29 Thread Evgeniy Polyakov

Pipe notifications.


diff --git a/fs/pipe.c b/fs/pipe.c
index 68090e8..0c75bf1 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -16,6 +16,7 @@
 #include <linux/uio.h>
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
+#include <linux/kevent.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -313,6 +314,7 @@ redo:
 			break;
 		}
 		if (do_wakeup) {
+			kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
 			wake_up_interruptible_sync(&pipe->wait);
 			kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 		}
@@ -322,6 +324,7 @@ redo:
 
 	/* Signal writers asynchronously that there is more room. */
 	if (do_wakeup) {
+		kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
 		wake_up_interruptible(&pipe->wait);
 		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 	}
@@ -484,6 +487,7 @@ redo2:
 			break;
 		}
 		if (do_wakeup) {
+			kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
 			wake_up_interruptible_sync(&pipe->wait);
 			kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 			do_wakeup = 0;
@@ -495,6 +499,7 @@ redo2:
 out:
 	mutex_unlock(&inode->i_mutex);
 	if (do_wakeup) {
+		kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
 		wake_up_interruptible(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 	}
@@ -590,6 +595,7 @@ pipe_release(struct inode *inode, int decr, int decw)
 		free_pipe_info(inode);
 	} else {
 		wake_up_interruptible(&pipe->wait);
+		kevent_pipe_notify(inode, KEVENT_SOCKET_SEND|KEVENT_SOCKET_RECV);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 	}
diff --git a/kernel/kevent/kevent_pipe.c b/kernel/kevent/kevent_pipe.c
new file mode 100644
index 000..91dc1eb
--- /dev/null
+++ b/kernel/kevent/kevent_pipe.c
@@ -0,0 +1,123 @@
+/*
+ * kevent_pipe.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kevent.h>
+#include <linux/pipe_fs_i.h>
+
+static int kevent_pipe_callback(struct kevent *k)
+{
+	struct inode *inode = k->st->origin;
+	struct pipe_inode_info *pipe = inode->i_pipe;
+	int nrbufs = pipe->nrbufs;
+
+	if (k->event.event & KEVENT_SOCKET_RECV && nrbufs > 0) {
+		if (!pipe->writers)
+			return -1;
+		return 1;
+	}
+
+	if (k->event.event & KEVENT_SOCKET_SEND && nrbufs < PIPE_BUFFERS) {
+		if (!pipe->readers)
+			return -1;
+		return 1;
+	}
+
+	return 0;
+}
+
+int kevent_pipe_enqueue(struct kevent *k)
+{
+	struct file *pipe;
+	int err = -EBADF;
+	struct inode *inode;
+
+	pipe = fget(k->event.id.raw[0]);
+	if (!pipe)
+		goto err_out_exit;
+
+	inode = igrab(pipe->f_dentry->d_inode);
+	if (!inode)
+		goto err_out_fput;
+
+	err = -EINVAL;
+	if (!S_ISFIFO(inode->i_mode))
+		goto err_out_iput;
+
+	err = kevent_storage_enqueue(&inode->st, k);
+	if (err)
+		goto err_out_iput;
+
+	if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE) {
+		kevent_requeue(k);
+		err = 0;
+	} else {
+		err = k->callbacks.callback(k);
+		if (err)
+			goto err_out_dequeue;
+	}
+
+	fput(pipe);
+
+	return err;
+
+err_out_dequeue:
+	kevent_storage_dequeue(k->st, k);
+err_out_iput:
+	iput(inode);
+err_out_fput:
+	fput(pipe);
+err_out_exit:
+	return err;
+}
+
+int kevent_pipe_dequeue(struct kevent *k)
+{
+	struct inode *inode = k->st->origin;
+
+	kevent_storage_dequeue(k->st, k);
+	iput(inode);
+
+	return 0;
+}
+
+void kevent_pipe_notify(struct inode 

[take30 7/9] kevent: Signal notifications.

2006-12-29 Thread Evgeniy Polyakov

Signal notifications.

This type of notification allows signals to be delivered through the kevent
queue. An example application, signal.c, can be found on the project homepage.

If the KEVENT_SIGNAL_NOMASK bit is set in the raw_u64 id, the signal will be
delivered only through the queue; otherwise both delivery types are used: the
old one, through an update of the mask of pending signals, and the queue.

If a signal is delivered only through the kevent queue, the mask of pending
signals is not updated at all, which is equivalent to putting the signal into
the blocked mask, but with delivery of that signal through the kevent queue.
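
The two delivery modes can be sketched in standalone C as follows. This is
only an illustration of the decision described above, modelling a single
registered kevent per task; the names sketch_task, sketch_signal_callback and
sketch_signal_notify are hypothetical, and SKETCH_QUEUE_ONLY simply reuses the
marker value as it is printed in this posting.

```c
#include <stdint.h>

/* Marker bit meaning "delivered through the queue only" (assumed value). */
#define SKETCH_QUEUE_ONLY	0x8000u

/* Hypothetical stand-in for the per-task state added by the patch. */
struct sketch_task {
	uint32_t kevent_signals;	/* last signal offered to the queue */
};

/* Returns 1 if the registered signal matches the one being delivered. */
static int sketch_signal_callback(struct sketch_task *tsk, int reg_sig,
				  int reg_nomask)
{
	int ret = ((uint32_t)reg_sig == tsk->kevent_signals);

	/* A matching NOMASK registration claims the signal for the queue. */
	if (ret && reg_nomask)
		tsk->kevent_signals |= SKETCH_QUEUE_ONLY;
	return ret;
}

/*
 * Offer signal sig to the queue. A nonzero return means the caller should
 * skip the classic update of the pending-signal mask.
 */
static int sketch_signal_notify(struct sketch_task *tsk, int sig,
				int reg_sig, int reg_nomask)
{
	tsk->kevent_signals = (uint32_t)sig;
	sketch_signal_callback(tsk, reg_sig, reg_nomask);
	return (tsk->kevent_signals & SKETCH_QUEUE_ONLY) != 0;
}
```

With a NOMASK registration for signal 10, delivering signal 10 returns nonzero
(queue only); without NOMASK, or for a different signal, it returns zero and
the usual pending-mask path would also run.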

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]


diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4463735..e7372f2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -82,6 +82,7 @@ struct sched_param {
 #include <linux/resource.h>
 #include <linux/timer.h>
 #include <linux/hrtimer.h>
+#include <linux/kevent_storage.h>
 #include <linux/task_io_accounting.h>
 
 #include <asm/processor.h>
@@ -1048,6 +1049,10 @@ struct task_struct {
 #ifdef CONFIG_TASK_DELAY_ACCT
 	struct task_delay_info *delays;
 #endif
+#ifdef CONFIG_KEVENT_SIGNAL
+	struct kevent_storage st;
+	u32 kevent_signals;
+#endif
 #ifdef CONFIG_FAULT_INJECTION
 	int make_it_fail;
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index fc723e5..fd7c749 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -49,6 +49,7 @@
 #include <linux/delayacct.h>
 #include <linux/taskstats_kern.h>
 #include <linux/random.h>
+#include <linux/kevent.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -118,6 +119,9 @@ void __put_task_struct(struct task_struct *tsk)
 	WARN_ON(atomic_read(&tsk->usage));
 	WARN_ON(tsk == current);
 
+#ifdef CONFIG_KEVENT_SIGNAL
+	kevent_storage_fini(&tsk->st);
+#endif
 	security_task_free(tsk);
 	free_uid(tsk->user);
 	put_group_info(tsk->group_info);
@@ -1126,6 +1130,10 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	if (retval)
 		goto bad_fork_cleanup_namespaces;
 
+#ifdef CONFIG_KEVENT_SIGNAL
+	kevent_storage_init(p, &p->st);
+#endif
+
 	p->set_child_tid = (clone_flags & CLONE_CHILD_SETTID) ? child_tidptr : NULL;
 	/*
 	 * Clear TID on mm_release()?
diff --git a/kernel/kevent/kevent_signal.c b/kernel/kevent/kevent_signal.c
new file mode 100644
index 000..abe3972
--- /dev/null
+++ b/kernel/kevent/kevent_signal.c
@@ -0,0 +1,94 @@
+/*
+ * kevent_signal.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kevent.h>
+
+static int kevent_signal_callback(struct kevent *k)
+{
+	struct task_struct *tsk = k->st->origin;
+	int sig = k->event.id.raw[0];
+	int ret = 0;
+
+	if (sig == tsk->kevent_signals)
+		ret = 1;
+
+	if (ret && (k->event.id.raw_u64 & KEVENT_SIGNAL_NOMASK))
+		tsk->kevent_signals |= 0x8000;
+
+	return ret;
+}
+
+int kevent_signal_enqueue(struct kevent *k)
+{
+	int err;
+
+	err = kevent_storage_enqueue(&current->st, k);
+	if (err)
+		goto err_out_exit;
+
+	if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE) {
+		kevent_requeue(k);
+		err = 0;
+	} else {
+		err = k->callbacks.callback(k);
+		if (err)
+			goto err_out_dequeue;
+	}
+
+	return err;
+
+err_out_dequeue:
+	kevent_storage_dequeue(k->st, k);
+err_out_exit:
+	return err;
+}
+
+int kevent_signal_dequeue(struct kevent *k)
+{
+	kevent_storage_dequeue(k->st, k);
+	return 0;
+}
+
+int kevent_signal_notify(struct task_struct *tsk, int sig)
+{
+	tsk->kevent_signals = sig;
+	kevent_storage_ready(&tsk->st, NULL, KEVENT_SIGNAL_DELIVERY);
+	return (tsk->kevent_signals & 0x8000);
+}
+
+static int __init kevent_init_signal(void)
+{
+	struct kevent_callbacks sc = {
+		.callback = kevent_signal_callback,
+		.enqueue = kevent_signal_enqueue,
+		.dequeue = kevent_signal_dequeue,
+		.flags = 0,
+   

[take30 3/9] kevent: poll/select() notifications.

2006-12-29 Thread Evgeniy Polyakov

poll/select() notifications.

This patch includes generic poll/select notifications.
kevent_poll works similarly to epoll and has the same issues (the callback
is invoked not from the internal state machine of the caller but through
process wakeup, there are a lot of allocations, and so on).

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/file_table.c b/fs/file_table.c
index 4c17a18..46f458c 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -20,6 +20,7 @@
 #include <linux/cdev.h>
 #include <linux/fsnotify.h>
 #include <linux/sysctl.h>
+#include <linux/kevent.h>
 #include <linux/percpu_counter.h>
 
 #include <asm/atomic.h>
@@ -119,6 +120,7 @@ struct file *get_empty_filp(void)
 	f->f_uid = tsk->fsuid;
 	f->f_gid = tsk->fsgid;
 	eventpoll_init_file(f);
+	kevent_init_file(f);
 	/* f->f_version: 0 */
 	return f;
 
@@ -164,6 +166,7 @@ void fastcall __fput(struct file *file)
 	 * in the file cleanup chain.
 	 */
 	eventpoll_release(file);
+	kevent_cleanup_file(file);
 	locks_remove_flock(file);
 
 	if (file->f_op && file->f_op->release)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 186da81..62ef137 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -280,6 +280,7 @@ extern int dir_notify_enable;
 #include <linux/init.h>
 #include <linux/pid.h>
 #include <linux/mutex.h>
+#include <linux/kevent_storage.h>
 
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
@@ -578,6 +579,10 @@ struct inode {
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -737,6 +742,9 @@ struct file {
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c
new file mode 100644
index 000..58129fa
--- /dev/null
+++ b/kernel/kevent/kevent_poll.c
@@ -0,0 +1,234 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/timer.h>
+#include <linux/file.h>
+#include <linux/kevent.h>
+#include <linux/poll.h>
+#include <linux/fs.h>
+
+static struct kmem_cache *kevent_poll_container_cache;
+static struct kmem_cache *kevent_poll_priv_cache;
+
+struct kevent_poll_ctl
+{
+	struct poll_table_struct	pt;
+	struct kevent			*k;
+};
+
+struct kevent_poll_wait_container
+{
+	struct list_head		container_entry;
+	wait_queue_head_t		*whead;
+	wait_queue_t			wait;
+	struct kevent			*k;
+};
+
+struct kevent_poll_private
+{
+	struct list_head		container_list;
+	spinlock_t			container_lock;
+};
+
+static int kevent_poll_enqueue(struct kevent *k);
+static int kevent_poll_dequeue(struct kevent *k);
+static int kevent_poll_callback(struct kevent *k);
+
+static int kevent_poll_wait_callback(wait_queue_t *wait,
+		unsigned mode, int sync, void *key)
+{
+	struct kevent_poll_wait_container *cont =
+		container_of(wait, struct kevent_poll_wait_container, wait);
+	struct kevent *k = cont->k;
+
+	kevent_storage_ready(k->st, NULL, KEVENT_MASK_ALL);
+	return 0;
+}
+
+static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead,
+		struct poll_table_struct *poll_table)
+{
+	struct kevent *k =
+		container_of(poll_table, struct kevent_poll_ctl, pt)->k;
+	struct kevent_poll_private *priv = k->priv;
+	struct kevent_poll_wait_container *cont;
+	unsigned long flags;
+
+	cont = kmem_cache_alloc(kevent_poll_container_cache, GFP_KERNEL);
+	if (!cont) {
+		kevent_break(k);
+		return;
+	}
+
+	cont->k = k;
+	init_waitqueue_func_entry(&cont->wait, kevent_poll_wait_callback);
+	cont->whead = whead;
+
+	spin_lock_irqsave(&priv->container_lock, flags);
+   

[take30 9/9] kevent: Private userspace notifications.

2006-12-29 Thread Evgeniy Polyakov

Private userspace notifications.

Allows registering notifications of any private userspace
events over kevent. Events can be marked as ready using the
kevent_ctl(KEVENT_READY) command.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/kernel/kevent/kevent_unotify.c b/kernel/kevent/kevent_unotify.c
new file mode 100644
index 000..618c09c
--- /dev/null
+++ b/kernel/kevent/kevent_unotify.c
@@ -0,0 +1,62 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/kevent.h>
+
+static int kevent_unotify_callback(struct kevent *k)
+{
+   return 1;
+}
+
+int kevent_unotify_enqueue(struct kevent *k)
+{
+	int err;
+
+	err = kevent_storage_enqueue(&k->user->st, k);
+	if (err)
+		goto err_out_exit;
+
+	if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE)
+		kevent_requeue(k);
+
+	return 0;
+
+err_out_exit:
+	return err;
+}
+
+int kevent_unotify_dequeue(struct kevent *k)
+{
+	kevent_storage_dequeue(k->st, k);
+	return 0;
+}
+
+static int __init kevent_init_unotify(void)
+{
+	struct kevent_callbacks sc = {
+		.callback = kevent_unotify_callback,
+		.enqueue = kevent_unotify_enqueue,
+		.dequeue = kevent_unotify_dequeue,
+		.flags = 0,
+	};
+
+	return kevent_add_callbacks(&sc, KEVENT_UNOTIFY);
+}
+module_init(kevent_init_unotify);



degradation in network performance upto 20 % in 2.6.18 when compared to 2.6.17

2006-12-29 Thread kalyan tejaswi

Hi all,
I have been comparing routing and bridging performance tests for
2.6.16,2.6.17,2.6.18,2.6.19 kernels.
The setup is:


  
[Setup diagram: Smartbits connected to the Malta 4Kc board through eth0 and eth1.]


I use D-Link cards for the Malta 4Kc board.
I see that 2.6.18 and 2.6.19 have almost identical performance
figures. The same holds for 2.6.16 and 2.6.17.

But 2.6.16/2.6.17 is roughly 20% better than 2.6.18/2.6.19.

Are these results expected? Has anyone observed similar behaviour?
What could be the reason?


Regards
Kalyan


Re: [take29 0/8] kevent: Generic event handling mechanism.

2006-12-29 Thread Ingo Molnar

* Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 On Thu, Dec 28, 2006 at 05:01:37PM +0100, Ingo Molnar ([EMAIL PROTECTED]) 
 wrote:
  
  * Evgeniy Polyakov [EMAIL PROTECTED] wrote:
  
   Generic event handling mechanism.
  
  i see it covers alot of event sources, but i cannot see block IO 
  notifications. Am i missing something?
 
 Depending on what it is :) If you mean kevent based AIO, then it was 
 dropped to reduce size of the patchset, and in favour of new AIO 
 design.

yes, kevent based AIO. Could you please re-add it, preferably on top of 
Suparna's AIO patchset? I don't see how a generic event handling 
mechanism can exclude block IO, because we really need to see how it 
plugs into (and plays along with) block AIO and how it performs relative 
to block AIO to be able to judge whether this API and infrastructure 
should be included in the kernel in its current form.

 Other kinds of read/write notifications can be handled by poll/select 
 notifications.

but poll/select notifications are just a second-degree way of doing an 
IO state machine, and they are mostly there in kevents for completeness 
and compatibility.

To be able to judge a generic event mechanism it really must support 
block IO as well, natively. Without that we'd have the following obscene 
API situation:

 - poll()/select(): supports everything but is slow and inaccurate
 - epoll(): more modern API on top of poll notifications
 - async IO: supports block IO
 - kevent supports almost everything /except/ block IO

so what we need is for kevents to support /all/ the important 
high-performance event types natively:

 - networking
 - block IO
 - VFS namespace
 - timers

(rarer things like mouse/input events can stay with poll notifications)

and it is /especially/ important to include block IO events in kevents 
to be able to judge its performance and scalability relative to the 
async IO API and infrastructure.
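
To make the readiness-versus-completion distinction concrete, here is a
minimal userspace sketch, assuming a Linux system with epoll. The helper
name pipe_becomes_readable() is hypothetical and is not part of kevent or
of any patch in this thread. It shows that a poll-style notification only
says "this descriptor is readable"; the application still has to issue the
read() itself, which is why such notifications are one step removed from
the IO they describe.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if epoll reports the read end of a
 * pipe as readable after one byte has been written to it, 0 if it
 * does not, and -1 if the setup fails.
 */
int pipe_becomes_readable(void)
{
	int fds[2];
	int epfd;
	int ret = -1;
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out;

	if (pipe(fds) < 0)
		return -1;

	epfd = epoll_create(1);
	if (epfd < 0)
		goto out_pipe;

	ev.data.fd = fds[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
		goto out_epoll;

	/* Nothing written yet: a zero-timeout wait must report no events. */
	if (epoll_wait(epfd, &out, 1, 0) != 0)
		goto out_epoll;

	if (write(fds[1], "x", 1) != 1)
		goto out_epoll;

	/* Readiness only: the byte still has to be fetched with read(). */
	ret = (epoll_wait(epfd, &out, 1, 1000) == 1 &&
	       out.data.fd == fds[0] && (out.events & EPOLLIN)) ? 1 : 0;

out_epoll:
	close(epfd);
out_pipe:
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

A completion-style interface would instead hand back the data (or the
result of the write) together with the event.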

Ingo


Re: [take29 0/8] kevent: Generic event handling mechanism.

2006-12-29 Thread Evgeniy Polyakov
On Fri, Dec 29, 2006 at 01:54:27PM +0100, Ingo Molnar ([EMAIL PROTECTED]) wrote:
Generic event handling mechanism.
   
   i see it covers alot of event sources, but i cannot see block IO 
   notifications. Am i missing something?
  
  Depending on what it is :) If you mean kevent based AIO, then it was 
  dropped to reduce size of the patchset, and in favour of new AIO 
  design.
 
 yes, kevent based AIO. Could you please re-add it, preferably on top of 
 Suparna's AIO patchset? I don't see how a generic event handling 
 mechanism can exclude block IO because we really need to see how it 
 plugs into (and plays along with) block AIO and how it performs relative 
 to block AIO to be able to judge whether this API and infrastructure 
 should be included in the kernel in its current form.

I like the new design much more than my previous kevent based approach and
the existing repeated call approach. I plan to start working on it just after
the New Year vacations are over (in about a week or two; these are the longest
vacations of the year in Russia, and they are spent in a way which does not
allow hacking or any other useful work).
Kevent AIO was a completely different thing from Suparna's AIO, and
although it hooked into the block/fs subsystem at a slightly different layer (I
exported the ->get_block() callback), it was possible to fully separate AIO
from the main code.

  Other kinds of read/write notifications can be handled by poll/select 
  notifications.
 
 but poll/select notifications are just a second-degree way of doing an 
 IO state machine, and they are mostly there in kevents for completeness 
 and compatibility.

Yes, indeed.

 To be able to judge a generic event mechanism it really must support 
 block IO as well, natively. Without that we'd have the following obscene 
 API situation:
 
  - poll()/select(): supports everything but is slow and inaccurate
  - epoll(): more modern API on top of poll notifications
  - async IO: supports block IO

Network AIO should not be different from block IO - it is essentially
the same mechanism, just with a different lower layer from which the
callbacks are invoked.

  - kevent supports almost everything /except/ block IO
 
 so what we need is for kevents to support /all/ the important 
 high-performance event types natively:
 
  - networking
  - block IO
  - VFS namespace
  - timers
 
 (rarer things like mouse/input events can stay with poll notifications)
 
 and it is /especially/ important to include block IO events in kevents 
 to be able to judge its performance and scalability relative to the 
 async IO API and infrastructure.

Yes, async IO is a significant part, and will be implemented. IMHO, the new
design I highlighted on linux-fsdevel@ in the AIO related thread is the way
to go (at least I will implement it that way).

   Ingo

-- 
Evgeniy Polyakov


Re: [take29 0/8] kevent: Generic event handling mechanism.

2006-12-29 Thread Ingo Molnar

* Evgeniy Polyakov [EMAIL PROTECTED] wrote:

  (rarer things like mouse/input events can stay with poll 
  notifications)
  
  and it is /especially/ important to include block IO events in 
  kevents to be able to judge its performance and scalability relative 
  to the async IO API and infrastructure.
 
 Yes, async IO is a significant part, and will be implemented. IMHO, the 
 new design I highlighted on linux-fsdevel@ in the AIO related thread is 
 the way to go (at least I will implement it that way).

yes. Note that a prototype exists already: take a look at Tux's work 
atom infrastructure to see how you can build a relatively straightforward 
state-machine that can be programmed and can be driven even from IRQ 
contexts. Via that i implemented fully asynchronous IO for networking 5 
years ago, and programmed it to handle HTTP and FTP protocol server 
logic, fully asynchronously. (For block IO it also does emulation of 
event handling via the 'cachemiss' kernel threads. State-machine driven 
filesystems are quite hard - but not impossible in the long run.)

It would be a natural thing to extend that fundamental concept to 
user-space as well. /That/ i'd call a generic, ground-up event handling 
infrastructure. That would be a worthwhile unification point for all 
existing IO APIs.

Ingo


Re. Please pull 'upstream' branch of wireless-2.6

2006-12-29 Thread Roger While



 Roy Marples (1):
   prism54: set carrier flags correctly

Why is this not #upstream-fixes material?  What's the impact?


Actually, I think the patch is incorrect.
At best it is insufficient and at worst it
stops the driver working correctly.

I can't see why we do carrier_off after start_queue in the open.
Other drivers (eg. ipw2100) do carrier_on.

We should also look at other places where eg. stop_queue
is called and do a carrier_off eg. the close routine.
(Amongst others)

Also according to Documentation/networking/operstates.txt
(netif_carrier_on/off) -
It is guaranteed that only the driver has write access,
 however, if different layers of the driver manipulate the same flag,
 the driver has to provide the synchronisation needed.

The trap routine in isl_ioctl.c however is lockless.
Assuming that the doc is correct, I would have thought
that putting carrier_on/off here is buggy or ?

Roger While




[Fwd: [Netem] [PATCH 2.6.18 0/2] LARTC: trace control for netem]

2006-12-29 Thread Rainer Baumann
Hi Stephen

I just wanted to ask you, if you already had time to test our trace
extension for netem as discussed on the 13th of December.

Cheers
Rainer

Rainer Baumann wrote:
 Hi Stephen

 As discussed yesterday, here our patches to integrate trace control into netem



 Trace Control for Netem: Emulate network properties such as long range 
 dependency and self-similarity of cross-traffic.

 A new option (trace) has been added to the netem command. If the trace option 
 is used, the values for packet delay etc. are read from a pregenerated trace 
 file, afterwards the packets are processed by the normal netem functions. The 
 packet action values are readout from the trace file in user space and sent 
 to kernel space via configfs.







 ___
 Netem mailing list
 [EMAIL PROTECTED]
 https://lists.osdl.org/mailman/listinfo/netem
   



Netchannel netfilter usage example.

2006-12-29 Thread Evgeniy Polyakov
On Fri, Dec 29, 2006 at 05:35:27PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 * netfilter netchannel backend (only NAT is supported as the most interesting
   user, NAT caches appropriate route, so essentially routing becomes part
   of the netchannel trie)

Source NAT example:

connctl -p 6-6 -t 0 -T 10.0.0.1 -s 192.168.0.0-255.255.248.0 -d
10.0.0.0-255.255.255.0 -S 80-80 -D 1234-1234

which is equivalent to

iptables -t nat -I POSTROUTING -p tcp -s 192.168.0.0/21 -d 10.0.0.0/24
--sport 80 --dport 1234 -j SNAT --to-source 10.0.0.1
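
The two forms differ mainly in how the networks are written: reading the
connctl arguments as address-netmask pairs (which the iptables equivalent
suggests), 255.255.248.0 corresponds to /21 and 255.255.255.0 to /24. A
minimal sketch of that conversion follows; mask_to_prefix() is a
hypothetical helper for illustration, not part of connctl or iptables.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a netmask, given as a 32-bit value in
 * host byte order, to a CIDR prefix length. Returns -1 if the set
 * bits are not contiguous from the most significant bit down.
 */
int mask_to_prefix(uint32_t mask)
{
	int prefix = 0;

	/* Count the leading one bits. */
	while (mask & 0x80000000u) {
		prefix++;
		mask <<= 1;
	}

	/* Any bit still set means the mask had a hole in it. */
	return mask ? -1 : prefix;
}
```

With this, mask_to_prefix(0xFFFFF800) gives 21 and
mask_to_prefix(0xFFFFFF00) gives 24, matching the prefixes used in the
iptables rule.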

Netchannels scale easily to tens of thousands of entries; a userspace test
showed scaling to millions of entries (consider a system with one million
netfilter rules, or at least that many sockets).

-- 
Evgeniy Polyakov


Patches not in Linus tree yet

2006-12-29 Thread Tjernlund
These two patches aren't in Linus tree yet:
http://ozlabs.org/pipermail/linuxppc-dev/2006-December/029247.html
http://ozlabs.org/pipermail/linuxppc-dev/2006-December/029248.html

They really need to be in the next release, since the mpc83xx ucc_geth driver
won't compile otherwise.



Re: [patch sungem] improved locking

2006-12-29 Thread Benjamin Herrenschmidt
On Thu, 2006-12-28 at 21:05 -0800, David Miller wrote:
 From: Benjamin Herrenschmidt [EMAIL PROTECTED]
 Date: Wed, 13 Dec 2006 15:07:24 +1100
 
  tg3 says
  
  tg3: eth0: Link is up at 1000 Mbps, full duplex.
  tg3: eth0: Flow control is on for TX and on for RX.
  
  but sungem says
  
  eth0: Link is up at 1000 Mbps, full-duplex.
  eth0: Pause is disabled
  
  Hrm... I suppose I need to dig more. No time to do that today though.
 
 I was about to try and debug this, and noticed immediately that I
 didn't recognize any of the code.
 
 Could you look into this, you rewrote all of this stuff and this
 looks like a regression added, because I know this pause stuff
 used to work perfectly when I wrote the original GEM driver. :-)

Heh, it's very possible it's a regression I added indeed. I'll try to
have a look next week. Do you know of anybody who can verify on non-mii
hardware or is pause irrelevant there ?

Ben.




Re: [patch sungem] improved locking

2006-12-29 Thread David Miller
From: Benjamin Herrenschmidt [EMAIL PROTECTED]
Date: Sat, 30 Dec 2006 08:36:07 +1100

 Heh, it's very possible it's a regression I added indeed. I'll try to
 have a look next week. Do you know of anybody who can verify on non-mii
 hardware or is pause irrelevant there ?

For PCS based PHY's most of the logic is taken care of internally,
and the pause-present signal is just sampled from one of the
PCS registers.

I think I have one here using non-mii that I could check at
some point :)


Re: [BUG KERNEL 2.6.20-rc1] ftp: get or put stops during file-transfer

2006-12-29 Thread Komuro
Hi,

I investigated the ftp file-transfer stall using git-bisect,
and found that the problem was introduced by the
"[TCP]: MD5 Signature Option (RFC2385) support" patch.

Mr.YOSHIFUJI san, please fix this problem.

commit cfb6eeb4c860592edd123fdea908d23c6ad1c7dc
Author: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date:   Tue Nov 14 19:07:45 2006 -0800

[TCP]: MD5 Signature Option (RFC2385) support.

Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

Best Regards
Komuro


 On Sun, Dec 17, 2006 at 11:23:11PM +0900, Komuro wrote:
  On Sun, 17 Dec 2006 04:02:22 +
  Al Viro [EMAIL PROTECTED] wrote:
  
   On Sun, Dec 17, 2006 at 09:27:52PM +0900, Komuro wrote:

Hello,

On kernel 2.6.20-rc1, ftp (get or put) stops
during file-transfer.

Client: ftp-0.17-33.fc6  (192.168.1.1)
Server: vsftpd-2.0.5-8   (192.168.1.3)

This problem does _not_ happen on kernel-2.6.19.
is it caused by network-subsystem change on 2.6.20-rc1??
   
   Do you have NAT between you and server?
  
  No. I don't have NAT between the client and the server.
  Actually, the client and the sever is located in same room.
  
  client -- 100MbpsHub -- server.



Re: [BUG KERNEL 2.6.20-rc1] ftp: get or put stops during file-transfer

2006-12-29 Thread YOSHIFUJI Hideaki / 吉藤英明
In article [EMAIL PROTECTED] (at Sat, 30 Dec 2006 18:50:43 +0900), Komuro 
[EMAIL PROTECTED] says:

 I investigated the ftp-file-transfer-stop problem by git-bisect method,
 and found this problem was introduced by
 [TCP]: MD5 Signature Option (RFC2385) support patch.
 
 Mr.YOSHIFUJI san, please fix this problem.

Hmm, have you tried disabling CONFIG_TCP_MD5SIG?
(Is it already disabled?)

Is there any specific transfer size that reproduces this?
Do you see a similar issue with other simple applications?

--yoshfuji


Re: [BUG KERNEL 2.6.20-rc1] ftp: get or put stops during file-transfer

2006-12-29 Thread Komuro

 
  I investigated the ftp-file-transfer-stop problem by git-bisect method,
  and found this problem was introduced by
  [TCP]: MD5 Signature Option (RFC2385) support patch.
  
  Mr.YOSHIFUJI san, please fix this problem.
 
 Hmm, have you try disabling CONFIG_TCP_MD5SIG?
 (Is it already disabled?)

This problem happens whether CONFIG_TCP_MD5SIG is disabled or enabled.

 Are there any specific size of transfer to reproduce this?

When I transfer a 40 Mbyte file by ftp 5 times or more,
this problem happens.


 Do you see similar issue with other simple application?

Sorry, I cannot reproduce this problem with other applications.

Thanks,

Best Regards
Komuro.


wake on lan with skge?

2006-12-29 Thread David Liontooth
Back in Jan 2005, Stephen Hemminger added wake-on-lan support to the
skge driver -- cf. http://lwn.net/Articles/120679/

I'm unable to make it wake up from a powered-down state -- is this
expected to work? I have it working fine on an e1000 nic,
using the wakeonlan package --

$ wakeonlan -i 192.168.1.255 01:02:03:04:05:06
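
For reference, the payload such a tool sends is the standard Wake-on-LAN
"magic packet": six bytes of 0xff followed by the target MAC address
repeated sixteen times (102 bytes in total), normally sent as a UDP
broadcast. A minimal sketch of building that payload follows;
wol_build_magic() and the WOL_* macros are hypothetical names used only
for illustration and are not taken from the wakeonlan package or the skge
driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WOL_MAC_LEN	6
#define WOL_PKT_LEN	(6 + 16 * WOL_MAC_LEN)	/* 102 bytes */

/*
 * Hypothetical helper: fill buf with a Wake-on-LAN magic packet
 * payload for the given MAC address. Returns the payload length,
 * or 0 if the buffer is too small.
 */
size_t wol_build_magic(uint8_t *buf, size_t len,
		       const uint8_t mac[WOL_MAC_LEN])
{
	size_t i;

	if (len < WOL_PKT_LEN)
		return 0;

	/* Synchronization stream: six bytes of 0xff. */
	memset(buf, 0xff, 6);

	/* Followed by the target MAC address repeated sixteen times. */
	for (i = 0; i < 16; i++)
		memcpy(buf + 6 + i * WOL_MAC_LEN, mac, WOL_MAC_LEN);

	return WOL_PKT_LEN;
}
```

The NIC has to be left powered and armed to match this pattern for the
wake-up to happen, which is the part that depends on driver and BIOS
support.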

The motherboard is a Gigabyte K8NSC-939. I don't see an option to turn
it on or off in the BIOS.

The specs show a Marvell 8001 Gigabit Ethernet controller --  lspci:

02:0b.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001
Gigabit Ethernet Controller (rev 13)

The detailed specs at
http://www.marvell.com/products/pcconn/yukon/Yukon_88E8001_10_073103_final.pdf
state that it supports WOL.

Dave

# modinfo skge
filename:   /lib/modules/2.6.19.1/kernel/drivers/net/skge.ko
description:SysKonnect Gigabit Ethernet driver
author: Stephen Hemminger shemminger osdl.org
license:GPL
version:1.9
vermagic:   2.6.19.1 SMP mod_unload
depends:
alias:  pci:v10B7d1700sv*sd*bc*sc*i*
alias:  pci:v10B7d80EBsv*sd*bc*sc*i*
alias:  pci:v1148d4300sv*sd*bc*sc*i*
alias:  pci:v1148d4320sv*sd*bc*sc*i*
alias:  pci:v1186d4C00sv*sd*bc*sc*i*
alias:  pci:v1186d4B01sv*sd*bc*sc*i*
alias:  pci:v11ABd4320sv*sd*bc*sc*i*
alias:  pci:v11ABd5005sv*sd*bc*sc*i*
alias:  pci:v1371d434Esv*sd*bc*sc*i*
alias:  pci:v1737d1064sv*sd*bc*sc*i*
alias:  pci:v1737d1032sv*sd0015bc*sc*i*
srcversion: 34A66981E89644C61A48CB9
parm:   debug:Debug level (0=none,...,16=all) (int)



Re: [RFC][PATCH -mm take2 3/5] add interface for netconsole using sysfs

2006-12-29 Thread Stephen Hemminger
Keiichi KII wrote:
 From: Keiichi KII [EMAIL PROTECTED]

 This patch contains the following changes.

 create a sysfs entry for netconsole in /sys/class/misc.
 This entry has elements related to netconsole as follows.
 You can change configuration of netconsole(writable attributes such as IP
 address, port number and so on) and check current configuration of netconsole.

 -+- /sys/class/misc/
  |-+- netconsole/
|-+- port1/
| |--- id  [r--r--r--]  unique port id
| |--- remove  [-w---]  if you write something to remove,
| | this port is removed.
   
IMHO this kind of magic side effect is a misuse of sysfs, and would
make proper locking impossible. How do you deal with the dangling
reference to the netconsole object?
f= open (... netconsole/port1/remove)
write(f, , 1)
sleep(2)
write(f, , 1)  this probably would crash...


Maybe have a state variable/sysfs file so you could set up the port and
turn it on/off with a write.
| |--- dev_name[r--r--r--]  network interface name
   

Please don't use dev_name; use a symlink instead. If the device is
renamed, dev_name will be wrong, but the symlink to the net_device
kobject should be okay.
| |--- local_ip[rw-r--r--]  source IP to use, writable
| |--- local_port  [rw-r--r--]  source port number for UDP packets, 
 writable
| |--- local_mac   [r--r--r--]  source MAC address
| |--- remote_ip   [rw-r--r--]  IP address for logging agent, writable
| |--- remote_port [rw-r--r--]  port number for logging agent, writable
|  remote_mac  [rw-r--r--]  MAC address for logging agent, writable
|--- port2/
|--- port3/
...

 Signed-off-by: Keiichi KII [EMAIL PROTECTED]
 Signed-off-by: Takayoshi Kochi [EMAIL PROTECTED]
   
