Re: bad networking related lag in v2.6.22-rc2

2007-05-24 Thread Anant Nitya
On Thursday 24 May 2007 03:00:56 David Miller wrote: > From: Ingo Molnar <[EMAIL PROTECTED]> > Date: Wed, 23 May 2007 13:40:21 +0200 > > > * Herbert Xu <[EMAIL PROTECTED]> wrote: > > > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > > > My previous patch that changed the r

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Thu, 24 May 2007 07:41:00 +0200 > David Miller wrote: > >>* Herbert Xu <[EMAIL PROTECTED]> wrote: > >> > >>>[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > Applied, thanks everyone. > > > Even though it didn't fix this pro

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Patrick McHardy
David Miller wrote: >>* Herbert Xu <[EMAIL PROTECTED]> wrote: >> >>>[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > Applied, thanks everyone. Even though it didn't fix this problem, this patch I sent earlier is also needed. [NET_SCHED]: sch_htb: fix event cache time calcula

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Wed, 23 May 2007 13:40:21 +0200 > > * Herbert Xu <[EMAIL PROTECTED]> wrote: > > > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > My previous patch that changed the return value of qdisc_restart > > incorrectly made the case

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Patrick McHardy
Linus Torvalds wrote: > There appear to be other obvious problems in the recent "cleanups" in this > area.. > > Look at > > psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, > psched_time_t bound) > { > return min(tv1 - tv2, bound); > } > > and compare

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Linus Torvalds
On Wed, 23 May 2007, Patrick McHardy wrote: > > Yes, that looks better, thanks. There appear to be other obvious problems in the recent "cleanups" in this area.. Look at psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound) { return
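
[Editor's illustration, not part of the original message.] The snippet above is cut off before the comparison it introduces, so the following does not reconstruct that argument. It is only a minimal sketch of one general hazard with a helper of this shape, under the assumption that the time type is unsigned: if tv1 is earlier than tv2, the subtraction wraps to a huge value and the result is clamped to the bound instead of coming out negative. The names toy_time_t, tdiff_bounded_unsigned and tdiff_bounded_signed are hypothetical and are not kernel identifiers.

```c
#include <stdint.h>

/* Assumption for illustration only: an unsigned 64-bit time type. */
typedef uint64_t toy_time_t;

/* Unsigned variant: the difference and the comparison are both unsigned. */
static toy_time_t tdiff_bounded_unsigned(toy_time_t tv1, toy_time_t tv2,
                                         toy_time_t bound)
{
    toy_time_t diff = tv1 - tv2;   /* wraps around if tv1 < tv2 */
    return diff < bound ? diff : bound;
}

/* Signed variant: the difference is reinterpreted as signed before the
 * comparison, so a "negative" difference stays negative.
 * (Relies on the usual two's-complement conversion.) */
static int64_t tdiff_bounded_signed(toy_time_t tv1, toy_time_t tv2,
                                    toy_time_t bound)
{
    int64_t diff = (int64_t)(tv1 - tv2);
    int64_t b = (int64_t)bound;
    return diff < b ? diff : b;
}
```

With tv1 = 10, tv2 = 15 and bound = 100, the unsigned variant yields the bound and the signed variant yields -5. For tv1 >= tv2 the two agree. Which behaviour callers expect is exactly the kind of thing worth checking when such a helper is rewritten.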

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Ingo Molnar
* Herbert Xu <[EMAIL PROTECTED]> wrote: > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > My previous patch that changed the return value of qdisc_restart > incorrectly made the case where dequeue returns empty continue > processing packets. > > This patch is based on di

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Patrick McHardy
Herbert Xu wrote: > On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote: > >>Looking at the recent changes to __qdisc_run, this indeed seems >>to be the case, when the qdisc is throttled and has packets queued >>we return a value != 0, causing __qdisc_run to loop until all >>packets ha

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Herbert Xu
On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote: > > Looking at the recent changes to __qdisc_run, this indeed seems > to be the case, when the qdisc is throttled and has packets queued > we return a value != 0, causing __qdisc_run to loop until all > packets have been sent, which
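
[Editor's illustration, not part of the original message.] The failure mode described above (a throttled qdisc with packets queued returns a value != 0, so the run loop keeps going) can be sketched with a toy model. This is not the kernel code: struct toy_qdisc, toy_restart_buggy, toy_restart_fixed and toy_run are hypothetical names, and the loop is capped with max_iters only so that the demonstration terminates.

```c
#include <stdbool.h>

/* Toy stand-in for a queueing discipline; not the kernel's struct Qdisc. */
struct toy_qdisc {
    int  queued;     /* packets waiting in the queue */
    bool throttled;  /* while true, dequeue hands out nothing */
};

/* Returns 1 if a packet was dequeued (and notionally sent), else 0. */
static int toy_dequeue(struct toy_qdisc *q)
{
    if (q->throttled || q->queued == 0)
        return 0;
    q->queued--;
    return 1;
}

/* Buggy variant: reports the remaining backlog even when dequeue
 * returned nothing, so a throttled queue always returns != 0. */
static int toy_restart_buggy(struct toy_qdisc *q)
{
    toy_dequeue(q);
    return q->queued;
}

/* Fixed variant: an empty dequeue returns 0 and ends the run. */
static int toy_restart_fixed(struct toy_qdisc *q)
{
    if (!toy_dequeue(q))
        return 0;
    return q->queued;
}

/* Run loop: keeps calling restart while it returns != 0.
 * Returns the number of iterations performed (at most max_iters). */
static int toy_run(struct toy_qdisc *q,
                   int (*restart)(struct toy_qdisc *), int max_iters)
{
    int iters = 0;
    while (iters < max_iters) {
        iters++;
        if (restart(q) == 0)
            break;
    }
    return iters;
}
```

With a throttled queue holding three packets, toy_run with toy_restart_buggy spins until the iteration cap, while toy_restart_fixed returns after a single call. In the toy model the cap stands in for whatever eventually ends the spin on a real system, such as the throttle expiring.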

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Ingo Molnar
* Patrick McHardy <[EMAIL PROTECTED]> wrote: > How is this trace to be understood? Is it simply a call trace in > execution-order? [...] yeah. There's a help section at the top of the trace which explains the other fields too: _--=> CPU# / _-=> irqs-of

Re: bad networking related lag in v2.6.22-rc2

2007-05-23 Thread Patrick McHardy
Ingo Molnar wrote: > if you feel inclined to try the git-bisection then by all means please > do it (it will certainly be helpful and educative), but it's optional: i > dont think you should 'need' to go through extra debugging chores, my > analysis based on the excellent trace you provided stil

Re: bad networking related lag in v2.6.22-rc2

2007-05-22 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > > could you also apply the fix for the softirq problem below, to make > > sure it does not interact? > Above patch does solve __ soft_irq_pending __ problem. I am running > this patch with kernel 2.6.21.1 since last day doing all kinda things > but h

Re: bad networking related lag in v2.6.22-rc2

2007-05-22 Thread Anant Nitya
On Tuesday 22 May 2007 11:52:33 Ingo Molnar wrote: > * Anant Nitya <[EMAIL PROTECTED]> wrote: > > > I think I already found the bug, please try if this patch helps. > > > > Sorry, but this patch is not helping here. I recompiled the kernel > > with this patch but same load pattern still make system

Re: bad networking related lag in v2.6.22-rc2

2007-05-22 Thread Anant Nitya
On Tuesday 22 May 2007 14:47:47 Patrick McHardy wrote: > Anant Nitya wrote: > >>Patrick McHardy wrote: > >> > >>I think I already found the bug, please try if this patch helps. > > > > Sorry, but this patch is not helping here. I recompiled the kernel with > > this patch but same load pattern still

Re: bad networking related lag in v2.6.22-rc2

2007-05-22 Thread Patrick McHardy
Anant Nitya wrote: >>Patrick McHardy wrote: >> >>I think I already found the bug, please try if this patch helps. > > > Sorry, but this patch is not helping here. I recompiled the kernel with this > patch but same load pattern still make system to crawl. > > Here is the link for script I use to

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > > I think I already found the bug, please try if this patch helps. > > Sorry, but this patch is not helping here. [...] btw., could you please send this patch on-list too please? Ingo - To unsubscribe from this list: send the line "unsubscribe

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > Sorry, but this patch is not helping here. [...] > > btw., could you please send this patch on-list too please? disregard this - just found Patrick's patch. Ingo

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > > I think I already found the bug, please try if this patch helps. > > Sorry, but this patch is not helping here. I recompiled the kernel > with this patch but same load pattern still make system to crawl. > > Here is the link for script I use to shap

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya
On Monday 21 May 2007 15:50:09 Ingo Molnar wrote: > * Anant Nitya <[EMAIL PROTECTED]> wrote: > > Tcp: > > 5 connections established > > hm, this does not explain the /proc/net/tcp overhead i think - although > it could be a red herring. Will have a closer look at your new trace. > > if possible

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya
On Tuesday 22 May 2007 03:00:31 Patrick McHardy wrote: > Patrick McHardy wrote: > > Ingo Molnar wrote: > >>* Anant Nitya <[EMAIL PROTECTED]> wrote: > >>>I am posting links to the information you asked for. One more thing, > >>>after digging a bit more I found its QoS shaping that is making the > >>

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Patrick McHardy
Patrick McHardy wrote: > Ingo Molnar wrote: > >>* Anant Nitya <[EMAIL PROTECTED]> wrote: >> >>>I am posting links to the information you asked for. One more thing, >>>after digging a bit more I found its QoS shaping that is making the >>>box crawl. Once I disabled the traffic shaping everything

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Patrick McHardy
Ingo Molnar wrote: > * Anant Nitya <[EMAIL PROTECTED]> wrote: > > >>I am posting links to the information you asked for. One more thing, >>after digging a bit more I found its QoS shaping that is making the >>box crawl. Once I disabled the traffic shaping everything comes back >>to smooth and

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > I am posting links to the information you asked for. One more thing, > after digging a bit more I found its QoS shaping that is making the > box crawl. Once I disabled the traffic shaping everything comes back > to smooth and normal. Shaping being don

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya
On Monday 21 May 2007 13:42:01 Ingo Molnar wrote: > * Ingo Molnar <[EMAIL PROTECTED]> wrote: > > > ouch! a nearly 1 second delay got observed by the scheduler - something > > > is really killing your system! > > > > ah, you got the latency tracer from Thomas, as part of the -hrt patchset > > - that

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > Tcp: > 5 connections established hm, this does not explain the /proc/net/tcp overhead i think - although it could be a red herring. Will have a closer look at your new trace. if possible please try to generate the automatic softirq trace for Tho

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya
On Monday 21 May 2007 13:42:01 Ingo Molnar wrote: > * Ingo Molnar <[EMAIL PROTECTED]> wrote: > > > ouch! a nearly 1 second delay got observed by the scheduler - something > > > is really killing your system! > > > > ah, you got the latency tracer from Thomas, as part of the -hrt patchset > > - that

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* David Miller <[EMAIL PROTECTED]> wrote: > > gkrellm-5977 0..s. 0us : cond_resched_softirq > > (established_get_next) > > So it's not the 3c59x bug :-) > > If you have a lot of sockets, there is not way to make the performance > of dumping /proc/net/tcp not suck, use the netlink socket du

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 21 May 2007 10:28:05 +0200 > the problem first showed up in v2.6.22-rc1 and he didnt have it in > v2.6.21 - does that still qualify his box for the 3c59x problem? If the latency is showing up in /proc/net/tcp dumping, it's not the 3c59x problem.

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* David Miller <[EMAIL PROTECTED]> wrote: > From: Ingo Molnar <[EMAIL PROTECTED]> > Date: Mon, 21 May 2007 09:58:24 +0200 > > > what does 'top' show during an upload? Is any system related task > > out of whack? Could you try to get a readprofile or an oprofile > > output from the kernel, so tha

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 21 May 2007 10:12:01 +0200 > and ... you already did a trace for Thomas, for the softirq problem: > >http://cybertek.info/taitai/trace.txt.bz2 > > this trace shows really bad networking related kernel activities! > > gkrellm-5977 does this at

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]> Date: Mon, 21 May 2007 09:58:24 +0200 > what does 'top' show during an upload? Is any system related task > out of whack? Could you try to get a readprofile or an oprofile > output from the kernel, so that we can see what is slowing it down > so much? It could

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > ouch! a nearly 1 second delay got observed by the scheduler - something > > is really killing your system! > > ah, you got the latency tracer from Thomas, as part of the -hrt patchset > - that makes it quite a bit easier to debug. [...] and ... you

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > ah, you got the latency tracer from Thomas, as part of the -hrt patchset > - that makes it quite a bit easier to debug. Could you run the attached > trace-it-10sec utility: > > trace-it-10sec > trace-to-ingo.txt attached ... Ingo /*

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > ok, i got your -rc2 debug numbers (off-list), and it doesnt look pretty: > > before-lag: > > sleep_max: 259502076 > block_max:27690921 > wait_max :16381558 > > after-

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Ingo Molnar
* Anant Nitya <[EMAIL PROTECTED]> wrote: > Please ignore my last report about lag problem while using CFS-v13, it > is working perfectly fine with 2.6.21.1 and the lag I used to see in > v12 is not there with v13 anymore. After digging in a bit I found that > problem is only occurring in 2.6.2