Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Friday 02 March 2007 00:30, Ingo Molnar wrote:
> * Con Kolivas <[EMAIL PROTECTED]> wrote:
> > [...] Even though I'm finding myself defending code that has already
> > been softly tagged for redundancy, let's be clear here; we're talking
> > about at most a further 70ms delay in scheduling a niced task in the
> > presence of a nice 0 task, which is a reasonable delay for ksoftirqd
> > which we nice the eyeballs out of in mainline. Considering under load
> > our scheduler has been known to cause scheduling delays of 10 seconds
> > I still don't see this as a bug. Dynticks just "points it out to us".
>
> well, not running softirqs when we could is a bug. It's not a big bug,
> but it's a bug nevertheless. It doesn't matter that softirqs could be
> delayed even worse under high load - there was no 'high load' here.

Gotcha. I'll prepare an smt-nice removal patch shortly.

--
-ck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Con Kolivas <[EMAIL PROTECTED]> wrote:

> [...] Even though I'm finding myself defending code that has already
> been softly tagged for redundancy, let's be clear here; we're talking
> about at most a further 70ms delay in scheduling a niced task in the
> presence of a nice 0 task, which is a reasonable delay for ksoftirqd
> which we nice the eyeballs out of in mainline. Considering under load
> our scheduler has been known to cause scheduling delays of 10 seconds
> I still don't see this as a bug. Dynticks just "points it out to us".

well, not running softirqs when we could is a bug. It's not a big bug,
but it's a bug nevertheless. It doesn't matter that softirqs could be
delayed even worse under high load - there was no 'high load' here.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 23:05 +1100, Con Kolivas wrote:
> > > And that's the depressing part because of course I was interested in
> > > that as the original approach to the problem (and it was a big
> > > problem). When I spoke to Intel and AMD (of course to date no SMT AMD
> > > chip exists) at kernel summit they said it was too hard to implement
> > > hardware priorities well. Which is real odd since IBM have already
> > > done it with Power...
> > >
> > > Still I think it has been working fine in software till now, but now
> > > it has to deal with the added confusion of dynticks, so I already
> > > know what will happen to it.
> >
> > Well, it's not a dyntick problem in the first place. Even w/o dynticks
> > we go idle with local_softirq_pending(). Dynticks contains an explicit
> > check for that, which makes it visible.
>
> Oops, I'm sorry if I made it sound like there's a dynticks problem. That
> was not my intent and I said as much in an earlier email. Even though I'm
> finding myself defending code that has already been softly tagged for
> redundancy, let's be clear here; we're talking about at most a further
> 70ms delay in scheduling a niced task in the presence of a nice 0 task,
> which is a reasonable delay for ksoftirqd which we nice the eyeballs out
> of in mainline. Considering under load our scheduler has been known to
> cause scheduling delays of 10 seconds I still don't see this as a bug.
> Dynticks just "points it out to us".

Well, dyntick might end up delaying it for X seconds as well, which _is_
observable, and that's why the check was put there in the first place.

	tglx
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thursday 01 March 2007 22:33, Thomas Gleixner wrote:
> On Thu, 2007-03-01 at 22:13 +1100, Con Kolivas wrote:
> > > if then there should be a mechanism /in the hardware/ to set the
> > > priority of a CPU - and then the hardware could decide how to
> > > prioritize between siblings. Doing this in software is really hard.
> >
> > And that's the depressing part because of course I was interested in
> > that as the original approach to the problem (and it was a big
> > problem). When I spoke to Intel and AMD (of course to date no SMT AMD
> > chip exists) at kernel summit they said it was too hard to implement
> > hardware priorities well. Which is real odd since IBM have already
> > done it with Power...
> >
> > Still I think it has been working fine in software till now, but now
> > it has to deal with the added confusion of dynticks, so I already know
> > what will happen to it.
>
> Well, it's not a dyntick problem in the first place. Even w/o dynticks
> we go idle with local_softirq_pending(). Dynticks contains an explicit
> check for that, which makes it visible.

Oops, I'm sorry if I made it sound like there's a dynticks problem. That
was not my intent and I said as much in an earlier email. Even though I'm
finding myself defending code that has already been softly tagged for
redundancy, let's be clear here; we're talking about at most a further
70ms delay in scheduling a niced task in the presence of a nice 0 task,
which is a reasonable delay for ksoftirqd which we nice the eyeballs out
of in mainline. Considering under load our scheduler has been known to
cause scheduling delays of 10 seconds I still don't see this as a bug.
Dynticks just "points it out to us".

--
-ck
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 22:13 +1100, Con Kolivas wrote:
> > if then there should be a mechanism /in the hardware/ to set the
> > priority of a CPU - and then the hardware could decide how to
> > prioritize between siblings. Doing this in software is really hard.
>
> And that's the depressing part because of course I was interested in
> that as the original approach to the problem (and it was a big problem).
> When I spoke to Intel and AMD (of course to date no SMT AMD chip exists)
> at kernel summit they said it was too hard to implement hardware
> priorities well. Which is real odd since IBM have already done it with
> Power...
>
> Still I think it has been working fine in software till now, but now it
> has to deal with the added confusion of dynticks, so I already know what
> will happen to it.

Well, it's not a dyntick problem in the first place. Even w/o dynticks
we go idle with local_softirq_pending(). Dynticks contains an explicit
check for that, which makes it visible.

	tglx
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thursday 01 March 2007 19:46, Ingo Molnar wrote:
> * Mike Galbraith <[EMAIL PROTECTED]> wrote:
> > I see no real difference between the two assertions. Nice is just a
> > mechanism to set priority, so I applied your assertion to a different
> > range of priorities than nice covers, and returned it to show that the
> > code contradicts itself. It can't be bad for a nice 1 task to run
> > with a nice 0 task, but OK for a minimum RT task to run with a maximum
> > RT task. Iff HT without corrective measures breaks nice, then it
> > breaks realtime priorities as well.
>
> i'm starting to lean towards your view that we should not artificially
> keep tasks from running, when there's a free CPU available. We should
> still keep the 'other half' of SMT scheduling: the immediate pushing of
> tasks to a related core, but this bit of 'do not run tasks on this CPU'
> dependent-sleeper logic is i think a bit fragile. Plus these days SMT
> siblings do not tend to influence each other in such a negative way as
> older P4 ones where a HT sibling would slow down the other sibling
> significantly.

Well, it is meant to be tuned to the cpu type via per_cpu_gain, so it
should be easy to set to the appropriate scaling. It was never meant to
be one-value-fits-all as the processors changed.

> plus with an increasing number of siblings (which seems like an
> inevitable thing on the hardware side), the dependent-sleeper logic
> becomes less and less scalable. We'd have to cross-check every other
> 'related' CPU's current priority to decide what to run.

Yes, even I've commented before that the current system is unworkable
once there are multiple shared power threads. This I do see as a real
problem with it - in the future.

> if then there should be a mechanism /in the hardware/ to set the
> priority of a CPU - and then the hardware could decide how to
> prioritize between siblings. Doing this in software is really hard.

And that's the depressing part because of course I was interested in that
as the original approach to the problem (and it was a big problem). When
I spoke to Intel and AMD (of course to date no SMT AMD chip exists) at
kernel summit they said it was too hard to implement hardware priorities
well. Which is real odd since IBM have already done it with Power...

Still I think it has been working fine in software till now, but now it
has to deal with the added confusion of dynticks, so I already know what
will happen to it.

Hrm, it's been a good time for my code all round... I think I'll just
swap prefetch myself up the staircase to some pluggable scheduler that
would hyperthread me to sleep as an idle priority task.

--
-ck
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> I see no real difference between the two assertions. Nice is just a
> mechanism to set priority, so I applied your assertion to a different
> range of priorities than nice covers, and returned it to show that the
> code contradicts itself. It can't be bad for a nice 1 task to run
> with a nice 0 task, but OK for a minimum RT task to run with a maximum
> RT task. Iff HT without corrective measures breaks nice, then it
> breaks realtime priorities as well.

i'm starting to lean towards your view that we should not artificially
keep tasks from running, when there's a free CPU available. We should
still keep the 'other half' of SMT scheduling: the immediate pushing of
tasks to a related core, but this bit of 'do not run tasks on this CPU'
dependent-sleeper logic is i think a bit fragile. Plus these days SMT
siblings do not tend to influence each other in such a negative way as
older P4 ones where a HT sibling would slow down the other sibling
significantly.

plus with an increasing number of siblings (which seems like an
inevitable thing on the hardware side), the dependent-sleeper logic
becomes less and less scalable. We'd have to cross-check every other
'related' CPU's current priority to decide what to run.

if then there should be a mechanism /in the hardware/ to set the
priority of a CPU - and then the hardware could decide how to prioritize
between siblings. Doing this in software is really hard.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 09:01 +1100, Con Kolivas wrote:
> On Wednesday 28 February 2007 15:21, Mike Galbraith wrote:
> > On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> > > On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > > > Agreed.
> > > >
> > > > I was recently looking at that spot because I found that niced
> > > > tasks were taking latency hits, and disabled it, which helped a
> > > > bunch.
> > >
> > > Ok... as I said above to Ingo, nice means more latency too, and
> > > there is no doubt that if we disable nice as a working feature then
> > > the niced tasks will have less latency. Of course, this ends up
> > > meaning that un-niced tasks no longer receive their better cpu
> > > performance.. You're basically saying that you prefer nice not to
> > > work in the setting of HT.
> >
> > No I'm not, but let's go further in that direction just for the sake
> > of argument. You're then saying that you prefer realtime priorities to
> > not work in the HT setting, given that realtime tasks don't
> > participate in the 'single stream me' program.
>
> Where do I say that? I do not presume to manage realtime priorities in
> any way. You're turning my argument about nice levels around and somehow
> saying that because hyperthreading breaks the single stream me semantics
> by parallelising them that I would want to stop that happening. Nowhere
> have I argued that realtime semantics should be changed to somehow work
> around hyperthreading. SMT nice is about managing nice only, and not
> realtime priorities which are independent entities.

I see no real difference between the two assertions. Nice is just a
mechanism to set priority, so I applied your assertion to a different
range of priorities than nice covers, and returned it to show that the
code contradicts itself. It can't be bad for a nice 1 task to run with a
nice 0 task, but OK for a minimum RT task to run with a maximum RT task.
Iff HT without corrective measures breaks nice, then it breaks realtime
priorities as well.

> > I'm saying only that we're defeating the purpose of HT, and overriding
> > a user decision every time we force a sibling idle.
> >
> > > > I also can't understand why it would be OK to interleave a normal
> > > > task with an RT task sometimes, but not others.. that's
> > > > meaningless to the RT task.
> > >
> > > Clearly there would be a reason that code is there... The whole
> > > problem with HT is that as soon as you load up a sibling, you slow
> > > down the logical sibling, hence why this code is there in the first
> > > place. Since I know you're one to test things for yourself, I will
> > > put it to you this way:
> > >
> > > Boot into UP. Time how long it takes to do a million of these in a
> > > real time task:
> > > asm volatile("" : : : "memory");
> > >
> > > Then start up a SCHED_NORMAL task fully cpu bound such as
> > > "yes > /dev/null" and time that again. Obviously the former being a
> > > realtime task will take the same amount of time and the SCHED_NORMAL
> > > task will be starved until the realtime task finishes.
> >
> > Sure.
> >
> > > Now try the same experiment with hyperthreading enabled and an
> > > ordinary SMP kernel. You'll find the realtime task runs at only ~60%
> > > performance.
> >
> > So? User asked for HT. That's hardware multiplexing. It ain't free.
> > Buyer beware.
>
> But the buyer is not aware. You are aware because you tinker, but the
> vast majority of users who enable hyperthreading in their shiny PCs are
> not aware.

Then we need to make them aware of what they're enabling?

> The only thing they know is that if they enable hyperthreading their
> programs run slower in multitasking environments no matter how much they
> nice the other processes. Buyers do not buy hardware knowing that the
> internal design breaks something as fundamental as 'nice'. You seem to
> presume that most people who get hyperthreading are happy to compromise
> 'nice' in order to get their second core working and I put it to you
> that they do not make that decision.

To me it's pretty much black and white. Either you want to split your cpu
into logical units, which means each has less to offer than the total, or
you want all your processing power in one bucket.

> > > That's a serious performance hit for realtime tasks considering
> > > you're running a SCHED_NORMAL task. The SMT code that you seem to
> > > dislike fixes this problem.
> >
> > I don't think it does actually. Let your RT task sleep regularly, and
> > ever so briefly. We don't evict lower priority tasks from siblings
> > upon wakeup, we only prevent entry... sometimes.
>
> Well you know as well as I do that you're selecting out the exception
> rather than the rule, and statistically overall, it does work.

I don't agree that it's the exception, and if you look at this HT thing
from the split cpu perspective, I'm not sure there's even a problem.

Scrolling down, I see that this is getting too long, and we aren't commun
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Wednesday 28 February 2007 15:21, Mike Galbraith wrote:
> (hrmph. having to copy/paste/try again. evolution seems to be broken..
> RCPT TO <[EMAIL PROTECTED]> failed: Cannot resolve your domain {mp049}
> ..caused me to be unable to send despite receipts being disabled)

Apologies for mangling the email address as I said :-(

> On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> > On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > > Agreed.
> > >
> > > I was recently looking at that spot because I found that niced tasks
> > > were taking latency hits, and disabled it, which helped a bunch.
> >
> > Ok... as I said above to Ingo, nice means more latency too, and there
> > is no doubt that if we disable nice as a working feature then the
> > niced tasks will have less latency. Of course, this ends up meaning
> > that un-niced tasks no longer receive their better cpu performance..
> > You're basically saying that you prefer nice not to work in the
> > setting of HT.
>
> No I'm not, but let's go further in that direction just for the sake of
> argument. You're then saying that you prefer realtime priorities to not
> work in the HT setting, given that realtime tasks don't participate in
> the 'single stream me' program.

Where do I say that? I do not presume to manage realtime priorities in
any way. You're turning my argument about nice levels around and somehow
saying that because hyperthreading breaks the single stream me semantics
by parallelising them that I would want to stop that happening. Nowhere
have I argued that realtime semantics should be changed to somehow work
around hyperthreading. SMT nice is about managing nice only, and not
realtime priorities which are independent entities.

> I'm saying only that we're defeating the purpose of HT, and overriding a
> user decision every time we force a sibling idle.
>
> > I also can't understand why it would be OK to interleave a normal task
> > with an RT task sometimes, but not others.. that's meaningless to the
> > RT task.
>
> Clearly there would be a reason that code is there... The whole problem
> with HT is that as soon as you load up a sibling, you slow down the
> logical sibling, hence why this code is there in the first place. Since
> I know you're one to test things for yourself, I will put it to you this
> way:
>
> Boot into UP. Time how long it takes to do a million of these in a real
> time task:
> asm volatile("" : : : "memory");
>
> Then start up a SCHED_NORMAL task fully cpu bound such as
> "yes > /dev/null" and time that again. Obviously the former being a
> realtime task will take the same amount of time and the SCHED_NORMAL
> task will be starved until the realtime task finishes.

Sure.

> Now try the same experiment with hyperthreading enabled and an ordinary
> SMP kernel. You'll find the realtime task runs at only ~60% performance.

So? User asked for HT. That's hardware multiplexing. It ain't free.
Buyer beware.

But the buyer is not aware. You are aware because you tinker, but the
vast majority of users who enable hyperthreading in their shiny PCs are
not aware. The only thing they know is that if they enable hyperthreading
their programs run slower in multitasking environments no matter how much
they nice the other processes. Buyers do not buy hardware knowing that
the internal design breaks something as fundamental as 'nice'. You seem
to presume that most people who get hyperthreading are happy to
compromise 'nice' in order to get their second core working and I put it
to you that they do not make that decision.

> > That's a serious performance hit for realtime tasks considering you're
> > running a SCHED_NORMAL task. The SMT code that you seem to dislike
> > fixes this problem.
>
> I don't think it does actually. Let your RT task sleep regularly, and
> ever so briefly. We don't evict lower priority tasks from siblings upon
> wakeup, we only prevent entry... sometimes.

Well you know as well as I do that you're selecting out the exception
rather than the rule, and statistically overall, it does work.

> > The reason for interleaving is that there are a few cycles to be
> > gained by using the second core for a separate SCHED_NORMAL task, and
> > you don't want to disable access to the second core entirely for the
> > duration the realtime task is running. Since there is no simple
> > relationship between SCHED_NORMAL timeslices and realtime timeslices,
> > we have to use some form of interleaving based on the expected extra
> > cycles and HZ is the obvious choice.
>
> To me, the reason for interleaving is solely about keeping the core
> busy. It has nothing to do with SCHED_POLICY_X whatsoever.
>
> > > IMHO, SMT scheduling should be a buyer beware thing. Maximizing your
> > > core utilization comes at a price, but so does disabling it, so I
> > > think letting the user decide what he wants is the right thing to
> > > do.
> >
> > To me this is like arguing that we should not implement 'nice' within
> > the cpu scheduler at all and only allow nice to work on the few
> > architectures that support hardware priorities in the cpu (like
> > power5).
Re: 2.6.21-rc1: known regressions (v2) (part 2)
(hrmph. having to copy/paste/try again. evolution seems to be broken..
RCPT TO <[EMAIL PROTECTED]> failed: Cannot resolve your domain {mp049}
..caused me to be unable to send despite receipts being disabled)

On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > Agreed.
> >
> > I was recently looking at that spot because I found that niced tasks
> > were taking latency hits, and disabled it, which helped a bunch.
>
> Ok... as I said above to Ingo, nice means more latency too, and there is
> no doubt that if we disable nice as a working feature then the niced
> tasks will have less latency. Of course, this ends up meaning that
> un-niced tasks no longer receive their better cpu performance.. You're
> basically saying that you prefer nice not to work in the setting of HT.

No I'm not, but let's go further in that direction just for the sake of
argument. You're then saying that you prefer realtime priorities to not
work in the HT setting, given that realtime tasks don't participate in
the 'single stream me' program.

I'm saying only that we're defeating the purpose of HT, and overriding a
user decision every time we force a sibling idle.

> > I also can't understand why it would be OK to interleave a normal task
> > with an RT task sometimes, but not others.. that's meaningless to the
> > RT task.
>
> Clearly there would be a reason that code is there... The whole problem
> with HT is that as soon as you load up a sibling, you slow down the
> logical sibling, hence why this code is there in the first place. Since
> I know you're one to test things for yourself, I will put it to you this
> way:
>
> Boot into UP. Time how long it takes to do a million of these in a real
> time task:
> asm volatile("" : : : "memory");
>
> Then start up a SCHED_NORMAL task fully cpu bound such as
> "yes > /dev/null" and time that again. Obviously the former being a
> realtime task will take the same amount of time and the SCHED_NORMAL
> task will be starved until the realtime task finishes.

Sure.

> Now try the same experiment with hyperthreading enabled and an ordinary
> SMP kernel. You'll find the realtime task runs at only ~60% performance.

So? User asked for HT. That's hardware multiplexing. It ain't free.
Buyer beware.

> That's a serious performance hit for realtime tasks considering you're
> running a SCHED_NORMAL task. The SMT code that you seem to dislike fixes
> this problem.

I don't think it does actually. Let your RT task sleep regularly, and
ever so briefly. We don't evict lower priority tasks from siblings upon
wakeup, we only prevent entry... sometimes.

> The reason for interleaving is that there are a few cycles to be gained
> by using the second core for a separate SCHED_NORMAL task, and you don't
> want to disable access to the second core entirely for the duration the
> realtime task is running. Since there is no simple relationship between
> SCHED_NORMAL timeslices and realtime timeslices, we have to use some
> form of interleaving based on the expected extra cycles and HZ is the
> obvious choice.

To me, the reason for interleaving is solely about keeping the core busy.
It has nothing to do with SCHED_POLICY_X whatsoever.

> > IMHO, SMT scheduling should be a buyer beware thing. Maximizing your
> > core utilization comes at a price, but so does disabling it, so I
> > think letting the user decide what he wants is the right thing to do.
>
> To me this is like arguing that we should not implement 'nice' within
> the cpu scheduler at all and only allow nice to work on the few
> architectures that support hardware priorities in the cpu (like power5).
> Now there is no doubt that if we do away with nice entirely everywhere
> in the scheduler we'll gain some throughput. However, nice is a basic
> unix/linux function and if hardware comes along that breaks it working
> we should be working to make sure that it keeps working in software.
> That is why smt nice and smp nice was implemented. Of course it is our
> duty to ensure we do that at minimal overhead at all times. That's a
> different argument to what you are debating here. The throughput should
> not be adversely affected by this SMT priority code because although the
> nice 19 task gets less throughput, the higher priority task gets more as
> a result, which is essentially what nice is meant to do.

Re-read this paragraph with realtime task priorities in mind, or for that
matter, dynamic priorities. If you carry your priority/throughput
argument to its logical conclusion, only instruction streams of
absolutely equal priority should be able to share the core at any given
time. You may as well just disable HT and be done with it.

To me, siblings are logically separate units, and should be treated as
such (as they mostly are). They share an important resource, but so do
physically discrete units.

	-Mike
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Apologies for the resend, lkml address got mangled...

On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> On Tue, 2007-02-27 at 09:33 +0100, Ingo Molnar wrote:
> > * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> > > Thomas Gleixner wrote:
> > > > Adrian,
> > > >
> > > > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> > > >> Subject    : kernel BUG at kernel/time/tick-sched.c:168
> > > >>              (CONFIG_NO_HZ)
> > > >> References : http://lkml.org/lkml/2007/2/16/346
> > > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >
> > > I can confirm that the bug is fixed (over 20 hours of testing should
> > > be enough).
> >
> > thanks a lot! I think this thing was a long-term performance/latency
> > regression in HT scheduling as well.

Ingo, I'm going to have to partially disagree with you on this. This has
only become a problem because of what happens with dynticks now when
rq->curr == rq->idle. Prior to this, that particular SMT code only leads
to relative delays in scheduling for lower priority tasks. Whether or not
that task is ksoftirqd should not matter because it is not like they are
starved indefinitely; it is only that nice 19 tasks are relatively
delayed, which by definition is implied with the usage of nice as a
scheduler hint, wouldn't you say? I know it has been discussed many times
before as to whether 'nice' means less cpu and/or more latency, but in
our current implementation, nice means both less cpu and more latency. So
to me, the kernels without dynticks do not have a regression. This seems
to only be a problem in the setting of the new dynticks code IMHO. That's
not to say it isn't a bug! Nor am I saying that dynticks is a problem!
Please don't misinterpret that.

The second issue is that this is a problem because of the fuzzy
definition of what idle is for a runqueue in the setting of this SMT
code. Normally, rq->curr == rq->idle means the runqueue is idle, but not
with this code, since there are still rq->nr_running on that runqueue.
What dynticks in this implementation is doing is trying to idle a
hyperthread sibling on a cpu whose logical partner is busy. I did not
find that added any power saving on my earlier dynticks implementation,
and found it easier to keep that sibling ticking at the same rate as its
partner. Of course you may have found something different, and I
definitely agree with what you are likely to say in response to this - we
shouldn't have to special case logical siblings as having a different
definition of idle than any other smp case. Ultimately, that leaves us
with your simple patch as a reasonable solution for the dynticks case
even though it does change the behaviour dramatically for a more loaded
cpu. I don't see this code as presenting a problem without or prior to
the dynticks implementation. You being the scheduler maintainer means you
get to choose what is the best way to tackle this problem.

> Agreed.
>
> I was recently looking at that spot because I found that niced tasks
> were taking latency hits, and disabled it, which helped a bunch.

Ok... as I said above to Ingo, nice means more latency too, and there is
no doubt that if we disable nice as a working feature then the niced
tasks will have less latency. Of course, this ends up meaning that
un-niced tasks no longer receive their better cpu performance.. You're
basically saying that you prefer nice not to work in the setting of HT.

> I also can't understand why it would be OK to interleave a normal task
> with an RT task sometimes, but not others.. that's meaningless to the RT
> task.

Clearly there would be a reason that code is there... The whole problem
with HT is that as soon as you load up a sibling, you slow down the
logical sibling, hence why this code is there in the first place. Since I
know you're one to test things for yourself, I will put it to you this
way:

Boot into UP. Time how long it takes to do a million of these in a real
time task:
asm volatile("" : : : "memory");

Then start up a SCHED_NORMAL task fully cpu bound such as
"yes > /dev/null" and time that again. Obviously the former being a
realtime task will take the same amount of time and the SCHED_NORMAL task
will be starved until the realtime task finishes.

Now try the same experiment with hyperthreading enabled and an ordinary
SMP kernel. You'll find the realtime task runs at only ~60% performance.
That's a serious performance hit for realtime tasks considering you're
running a SCHED_NORMAL task. The SMT code that you seem to dislike fixes
this problem.

The reason for interleaving is that there are a few cycles to be gained
by using the second core for a separate SCHED_NORMAL task, and you don't
want to disable access to the second core entirely for the duration the
realtime task is running. Since there is no simple relationship between
SCHED_NORMAL timeslices and realtime timeslices, we have to use some form
of interleaving based on the expected extra cycles and HZ is the obvious
choice.
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Tue, 2007-02-27 at 09:33 +0100, Ingo Molnar wrote:
> * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > Thomas Gleixner wrote:
> > > Adrian,
> > >
> > > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> > >> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> > >> References : http://lkml.org/lkml/2007/2/16/346
> > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> > >> Status     : problem is being debugged
> > >
> > > The BUG_ON() was replaced by a warning printk(). The BUG_ON()
> > > exposed a problem with the SMT scheduler. See below.
> > >
> > >> Subject    : BUG: soft lockup detected on CPU#0
> > >>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> > >> References : http://lkml.org/lkml/2007/2/20/257
> > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> > >>              Ingo Molnar <[EMAIL PROTECTED]>
> > >> Status     : problem is being debugged
> > >
> > > Patch available, not confirmed yet.
> >
> > I can confirm that the bug is fixed (over 20 hours of testing should
> > be enough).
>
> thanks a lot! I think this thing was a long-term performance/latency
> regression in HT scheduling as well.

Agreed. I was recently looking at that spot because I found that niced
tasks were taking latency hits, and disabled it, which helped a bunch.
I also can't understand why it would be OK to interleave a normal task
with an RT task sometimes, but not others.. that's meaningless to the
RT task.

IMHO, SMT scheduling should be a buyer-beware thing. Maximizing your
core utilization comes at a price, but so does disabling it, so I think
letting the user decide what he wants is the right thing to do.

	-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Thomas Gleixner wrote:
> > Adrian,
> >
> > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> >> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> >> References : http://lkml.org/lkml/2007/2/16/346
> >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> >> Status     : problem is being debugged
> >
> > The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
> > a problem with the SMT scheduler. See below.
> >
> >> Subject    : BUG: soft lockup detected on CPU#0
> >>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> >> References : http://lkml.org/lkml/2007/2/20/257
> >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> >>              Ingo Molnar <[EMAIL PROTECTED]>
> >> Status     : problem is being debugged
> >
> > Patch available, not confirmed yet.
>
> I can confirm that the bug is fixed (over 20 hours of testing should
> be enough).

thanks a lot! I think this thing was a long-term performance/latency
regression in HT scheduling as well.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Michal Piotrowski wrote:
> Thomas Gleixner wrote:
>> Adrian,
>>
>> On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
>>> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
>>> References : http://lkml.org/lkml/2007/2/16/346
>>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>> Status     : problem is being debugged
>>
>> The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
>> a problem with the SMT scheduler. See below.
>>
>>> Subject    : BUG: soft lockup detected on CPU#0
>>>              NOHZ: local_softirq_pending 20 (SMT scheduler)
>>> References : http://lkml.org/lkml/2007/2/20/257
>>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>>              Ingo Molnar <[EMAIL PROTECTED]>
>>> Status     : problem is being debugged
>>
>> Patch available, not confirmed yet.
>
> I can confirm that the bug is fixed (over 20 hours of testing should be
> enough).

almost ;)

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Thomas Gleixner wrote:
> Adrian,
>
> On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
>> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
>> References : http://lkml.org/lkml/2007/2/16/346
>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>> Status     : problem is being debugged
>
> The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
> a problem with the SMT scheduler. See below.
>
>> Subject    : BUG: soft lockup detected on CPU#0
>>              NOHZ: local_softirq_pending 20 (SMT scheduler)
>> References : http://lkml.org/lkml/2007/2/20/257
>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>              Ingo Molnar <[EMAIL PROTECTED]>
>> Status     : problem is being debugged
>
> Patch available, not confirmed yet.

I can confirm that the bug is fixed (over 20 hours of testing should be
enough). Huge thanks!

Regards,
Michal
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Adrian,

On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> References : http://lkml.org/lkml/2007/2/16/346
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> Status     : problem is being debugged

The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed a
problem with the SMT scheduler. See below.

> Subject    : BUG: soft lockup detected on CPU#0
>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> References : http://lkml.org/lkml/2007/2/20/257
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>              Ingo Molnar <[EMAIL PROTECTED]>
> Status     : problem is being debugged

Patch available, not confirmed yet.

	tglx
2.6.21-rc1: known regressions (v2) (part 2)
This email lists some known regressions in 2.6.21-rc1 compared to 2.6.20
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of
one of the bugs, the maintainer of an affected subsystem or driver, a
patch of yours caused a breakage, or I'm considering you in some other
way possibly involved with one or more of these issues. Due to the huge
amount of recipients, please trim the Cc when answering.

Subject    : forcedeth no longer works
References : http://bugzilla.kernel.org/show_bug.cgi?id=8090
Submitter  : David P. Reed <[EMAIL PROTECTED]>
Caused-By  : Ayaz Abdulla <[EMAIL PROTECTED]>
Status     : unknown

Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <[EMAIL PROTECTED]>
Status     : unknown

Subject    : natsemi ethernet card not detected correctly
References : http://lkml.org/lkml/2007/2/23/4
             http://lkml.org/lkml/2007/2/23/7
Submitter  : Bob Tracy <[EMAIL PROTECTED]>
Caused-By  : Mark Brown <[EMAIL PROTECTED]>
Handled-By : Mark Brown <[EMAIL PROTECTED]>
Patch      : http://lkml.org/lkml/2007/2/23/142
Status     : patch available

Subject    : ThinkPad T60: system doesn't come out of suspend to RAM
             (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/22/391
Submitter  : Michael S. Tsirkin <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
             Ingo Molnar <[EMAIL PROTECTED]>
Status     : unknown

Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/16/346
Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
Status     : problem is being debugged

Subject    : BUG: soft lockup detected on CPU#0
             NOHZ: local_softirq_pending 20 (SMT scheduler)
References : http://lkml.org/lkml/2007/2/20/257
Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
             Ingo Molnar <[EMAIL PROTECTED]>
Status     : problem is being debugged

Subject    : i386: no boot with nmi_watchdog=1 (clockevents)
References : http://lkml.org/lkml/2007/2/21/208
Submitter  : Daniel Walker <[EMAIL PROTECTED]>
Caused-By  : Thomas Gleixner <[EMAIL PROTECTED]>
             commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
Status     : problem is being debugged