We've run into an issue (on 2.6.10) where calling "lsof" triggers lost packets on our server. Preempt is disabled, and NAPI is enabled.

It appears that for some reason the networking softirq is not being handled in a timely fashion, which means that the rx ring buffer fills up and packets overflow.

It appears that the problem path is:

seq_read
        tcp_seq_next
                established_get_next
                        read_lock/read_unlock

The issue appears to be related to the amount of time that this syscall takes. While we're in the syscall we cannot run the softirqd thread, and so the rx buffer is not being cleaned.

The fact that there are kmalloc(GFP_KERNEL) calls in seq_read() seems to indicate that sleeping is safe, so would it be reasonable to call schedule() periodically (maybe based on elapsed time) to ensure that system latency is kept under control?

Thanks,

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to