RE: [patch] sched: improve pinned task handling again!
Siddha, Suresh B wrote on Friday, April 01, 2005 8:05 PM
> On Sat, Apr 02, 2005 at 01:11:20PM +1000, Nick Piggin wrote:
> > How important is this? Any application to real workloads? Even if
> > not, I agree it would be nice to improve this more. I don't know
> > if I really like this approach - I guess due to what it adds to
> > fastpaths.
>
> Ken initially observed with older kernels (2.4 kernel with Ingo's sched)
> that it was happening with a few hundred processes. 2.6 is not that bad,
> and it improved with recent fixes. It is not very important. We want to
> raise the flag and see if we can come up with a decent solution.

The livelock is observed with an in-house stress test suite. The original
intent of that test is only remotely connected to stressing the kernel; it
was by accident that it triggered a kernel issue. Though, we are now
worried that this could be used as a DoS attack.

Nick Piggin wrote on Friday, April 01, 2005 7:11 PM
> Now presumably if the all_pinned logic is working properly in the
> first place, and it is correctly causing balancing to back off, you
> could tweak that a bit to avoid livelocks? Perhaps the all_pinned
> case should back off faster than the usual doubling of the interval,
> and be allowed to exceed max_interval?

This sounds plausible, though my first try did not yield the desired
result (i.e., it still hangs the kernel; I might have missed a few
things here and there).

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
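Nick's back-off suggestion quoted above can be sketched in C. This is a
hedged, standalone illustration, not the actual 2.6 load-balancer code:
the field names balance_interval and max_interval follow the 2.6
sched_domain fields of the same names, but the backoff() helper and the
multipliers (4x growth, a 16 * max_interval cap) are assumptions chosen
only to demonstrate the idea.

```c
#include <assert.h>

/*
 * Hedged sketch of the suggested tweak: when every task on the busiest
 * queue is pinned, back off faster than the usual doubling and allow the
 * interval to exceed max_interval, so repeated futile scans cannot
 * livelock the machine.  Standalone illustration; not kernel code.
 */
struct sched_domain {
	unsigned int balance_interval;	/* ms between balance attempts */
	unsigned int max_interval;	/* normal cap on the interval */
};

/* Called when load_balance() failed to move any task. */
static void backoff(struct sched_domain *sd, int all_pinned)
{
	if (all_pinned) {
		/* Quadruple, and let it grow well past max_interval. */
		sd->balance_interval *= 4;
		if (sd->balance_interval > 16 * sd->max_interval)
			sd->balance_interval = 16 * sd->max_interval;
	} else {
		/* Usual behaviour: double, capped at max_interval. */
		sd->balance_interval *= 2;
		if (sd->balance_interval > sd->max_interval)
			sd->balance_interval = sd->max_interval;
	}
}
```

As the rest of the thread notes, the hard part is exactly the magic cap:
how far past max_interval to go depends on the number of cpus and how
fast they traverse the runqueue.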
Re: [patch] sched: improve pinned task handling again!
Siddha, Suresh B wrote:
> On Sat, Apr 02, 2005 at 01:11:20PM +1000, Nick Piggin wrote:
> > How important is this? Any application to real workloads? Even if
> > not, I agree it would be nice to improve this more. I don't know
> > if I really like this approach - I guess due to what it adds to
> > fastpaths.
>
> Ken initially observed with older kernels (2.4 kernel with Ingo's sched)
> that it was happening with a few hundred processes. 2.6 is not that bad,
> and it improved with recent fixes. It is not very important. We want to
> raise the flag and see if we can come up with a decent solution.

OK.

> We changed nr_running from "unsigned long" to "unsigned int". So on
> 64-bit architectures, our change to the fastpath is not a big deal.

Yeah, I see. You are looking at data from remote runqueues a bit more
often too, although I think they're all places where the remote
cacheline would have already been touched recently.

> > Now presumably if the all_pinned logic is working properly in the
> > first place, and it is correctly causing balancing to back off, you
> > could tweak that a bit to avoid livelocks? Perhaps the all_pinned
> > case should back off faster than the usual doubling of the interval,
> > and be allowed to exceed max_interval?
>
> Coming up with that number (how much to exceed) will be a big task.
> It depends on the number of cpus, how fast they traverse the
> runqueue, ...

Well, we probably don't need to fine-tune it a great deal. Just pick a
large number that should work OK on most CPU speeds and CPU counts.

--
SUSE Labs, Novell Inc.
Re: [patch] sched: improve pinned task handling again!
On Sat, Apr 02, 2005 at 01:11:20PM +1000, Nick Piggin wrote:
> How important is this? Any application to real workloads? Even if
> not, I agree it would be nice to improve this more. I don't know
> if I really like this approach - I guess due to what it adds to
> fastpaths.

Ken initially observed with older kernels (2.4 kernel with Ingo's sched)
that it was happening with a few hundred processes. 2.6 is not that bad,
and it improved with recent fixes. It is not very important. We want to
raise the flag and see if we can come up with a decent solution.

We changed nr_running from "unsigned long" to "unsigned int". So on
64-bit architectures, our change to the fastpath is not a big deal.

> Now presumably if the all_pinned logic is working properly in the
> first place, and it is correctly causing balancing to back off, you
> could tweak that a bit to avoid livelocks? Perhaps the all_pinned
> case should back off faster than the usual doubling of the interval,
> and be allowed to exceed max_interval?

Coming up with that number (how much to exceed) will be a big task. It
depends on the number of cpus, how fast they traverse the runqueue, ...

thanks,
suresh
Re: [patch] sched: improve pinned task handling again!
Siddha, Suresh B wrote:
> This time Ken Chen brought up this issue -- no, it has nothing to do
> with an industry db benchmark ;-)
>
> Even with the above mentioned Nick's patch in -mm, I see system
> livelocks if, for example, I have 7000 processes pinned onto one cpu
> (this is on the fastest 8-way system I have access to). I am sure
> there will be other systems where this problem can be encountered
> even with a lower pin count.

Thanks for testing these patches in -mm, by the way.

> We tried to fix this issue, but as you know there is no good mechanism
> for fixing it without letting the regular paths know about it. Our
> proposed solution is appended, and we tried to minimize the effect on
> the fast path. It builds on Nick's patch: once this situation is
> detected, it will not do any more move_tasks as long as the busiest
> cpu is always the same cpu and the cpu affinity of the processes
> queued on busiest_cpu remains the same (found out by the runqueue's
> "generation_num").

7000 running processes pinned onto one CPU. I guess that isn't a great
deal :(

How important is this? Any application to real workloads? Even if not,
I agree it would be nice to improve this more. I don't know if I really
like this approach - I guess due to what it adds to fastpaths.

Now presumably if the all_pinned logic is working properly in the first
place, and it is correctly causing balancing to back off, you could
tweak that a bit to avoid livelocks? Perhaps the all_pinned case should
back off faster than the usual doubling of the interval, and be allowed
to exceed max_interval?

Any thoughts, Ingo?

--
SUSE Labs, Novell Inc.
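The generation_num idea Suresh describes can be sketched as follows. This
is a hedged illustration of the mechanism, not the patch itself: apart
from generation_num (a per-runqueue counter the quoted text mentions),
every name here (balance_state, note_all_pinned, should_skip_move_tasks)
is a hypothetical helper invented for the example.

```c
#include <assert.h>

/*
 * Illustrative sketch of the generation_num mechanism: skip move_tasks()
 * while the busiest cpu is unchanged and nothing on its queue (tasks or
 * their affinities) has changed since the last all-pinned failure.
 * Names other than generation_num are assumptions, not the patch's.
 */
struct runqueue {
	unsigned long generation_num;	/* bumped on enqueue/dequeue and
					 * on task affinity changes */
};

struct balance_state {
	int last_busiest_cpu;		/* -1: no failed balance recorded */
	unsigned long last_generation;
};

/*
 * Record which queue a balance attempt found fully pinned, and its
 * generation at that moment.
 */
static void note_all_pinned(struct balance_state *st, int busiest_cpu,
			    const struct runqueue *busiest)
{
	st->last_busiest_cpu = busiest_cpu;
	st->last_generation = busiest->generation_num;
}

/*
 * True while the busiest cpu is still the same one and its queue has
 * the same generation_num, i.e. re-scanning it would find the same
 * pinned tasks and fail again.
 */
static int should_skip_move_tasks(const struct balance_state *st,
				  int busiest_cpu,
				  const struct runqueue *busiest)
{
	return st->last_busiest_cpu == busiest_cpu &&
	       st->last_generation == busiest->generation_num;
}
```

The point of keying on generation_num is that the skip self-cancels: any
enqueue, dequeue, or affinity change on the busiest queue bumps the
counter, so balancing resumes exactly when a retry could succeed.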