Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Friday 02 March 2007 00:30, Ingo Molnar wrote:
> * Con Kolivas <[EMAIL PROTECTED]> wrote:
> > [...] Even though I'm finding myself defending code that has already
> > been softly tagged for redundancy, let's be clear here; we're talking
> > about at most a further 70ms delay in scheduling a niced task in the
> > presence of a nice 0 task, which is a reasonable delay for ksoftirqd
> > which we nice the eyeballs out of in mainline. Considering under load
> > our scheduler has been known to cause scheduling delays of 10 seconds
> > I still don't see this as a bug. Dynticks just "points it out to us".
>
> well, not running softirqs when we could is a bug. It's not a big bug,
> but it's a bug nevertheless. It doesn't matter that softirqs could be
> delayed even worse under high load - there was no 'high load' here.

Gotcha. I'll prepare an smt-nice removal patch shortly.

--
-ck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Con Kolivas <[EMAIL PROTECTED]> wrote:

> [...] Even though I'm finding myself defending code that has already
> been softly tagged for redundancy, let's be clear here; we're talking
> about at most a further 70ms delay in scheduling a niced task in the
> presence of a nice 0 task, which is a reasonable delay for ksoftirqd
> which we nice the eyeballs out of in mainline. Considering under load
> our scheduler has been known to cause scheduling delays of 10 seconds
> I still don't see this as a bug. Dynticks just "points it out to us".

well, not running softirqs when we could is a bug. It's not a big bug,
but it's a bug nevertheless. It doesn't matter that softirqs could be
delayed even worse under high load - there was no 'high load' here.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 23:05 +1100, Con Kolivas wrote:
> > > And that's the depressing part because of course I was interested in
> > > that as the original approach to the problem (and it was a big
> > > problem). When I spoke to Intel and AMD (of course to date no SMT AMD
> > > chip exists) at kernel summit they said it was too hard to implement
> > > hardware priorities well. Which is real odd since IBM have already
> > > done it with Power...
> > >
> > > Still I think it has been working fine in software till now, but now
> > > it has to deal with the added confusion of dynticks, so I already
> > > know what will happen to it.
> >
> > Well, it's not a dyntick problem in the first place. Even w/o dynticks
> > we go idle with local_softirq_pending(). Dynticks contains an explicit
> > check for that, which makes it visible.
>
> Oops, I'm sorry if I made it sound like there's a dynticks problem. That
> was not my intent and I said as much in an earlier email. Even though I'm
> finding myself defending code that has already been softly tagged for
> redundancy, let's be clear here; we're talking about at most a further
> 70ms delay in scheduling a niced task in the presence of a nice 0 task,
> which is a reasonable delay for ksoftirqd which we nice the eyeballs out
> of in mainline. Considering under load our scheduler has been known to
> cause scheduling delays of 10 seconds I still don't see this as a bug.
> Dynticks just "points it out to us".

Well, dyntick might end up delaying it for X seconds as well, which _is_
observable, and that's why the check was put there in the first place.

	tglx
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thursday 01 March 2007 22:33, Thomas Gleixner wrote:
> On Thu, 2007-03-01 at 22:13 +1100, Con Kolivas wrote:
> > > if then there should be a mechanism /in the hardware/ to set the
> > > priority of a CPU - and then the hardware could decide how to
> > > prioritize between siblings. Doing this in software is really hard.
> >
> > And that's the depressing part because of course I was interested in
> > that as the original approach to the problem (and it was a big
> > problem). When I spoke to Intel and AMD (of course to date no SMT AMD
> > chip exists) at kernel summit they said it was too hard to implement
> > hardware priorities well. Which is real odd since IBM have already
> > done it with Power...
> >
> > Still I think it has been working fine in software till now, but now
> > it has to deal with the added confusion of dynticks, so I already know
> > what will happen to it.
>
> Well, it's not a dyntick problem in the first place. Even w/o dynticks
> we go idle with local_softirq_pending(). Dynticks contains an explicit
> check for that, which makes it visible.

Oops, I'm sorry if I made it sound like there's a dynticks problem. That
was not my intent and I said as much in an earlier email. Even though I'm
finding myself defending code that has already been softly tagged for
redundancy, let's be clear here; we're talking about at most a further
70ms delay in scheduling a niced task in the presence of a nice 0 task,
which is a reasonable delay for ksoftirqd which we nice the eyeballs out
of in mainline. Considering under load our scheduler has been known to
cause scheduling delays of 10 seconds I still don't see this as a bug.
Dynticks just "points it out to us".

--
-ck
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 22:13 +1100, Con Kolivas wrote:
> > if then there should be a mechanism /in the hardware/ to set the
> > priority of a CPU - and then the hardware could decide how to
> > prioritize between siblings. Doing this in software is really hard.
>
> And that's the depressing part because of course I was interested in
> that as the original approach to the problem (and it was a big problem).
> When I spoke to Intel and AMD (of course to date no SMT AMD chip exists)
> at kernel summit they said it was too hard to implement hardware
> priorities well. Which is real odd since IBM have already done it with
> Power...
>
> Still I think it has been working fine in software till now, but now it
> has to deal with the added confusion of dynticks, so I already know what
> will happen to it.

Well, it's not a dyntick problem in the first place. Even w/o dynticks
we go idle with local_softirq_pending(). Dynticks contains an explicit
check for that, which makes it visible.

	tglx
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thursday 01 March 2007 19:46, Ingo Molnar wrote:
> * Mike Galbraith <[EMAIL PROTECTED]> wrote:
> > I see no real difference between the two assertions. Nice is just a
> > mechanism to set priority, so I applied your assertion to a different
> > range of priorities than nice covers, and returned it to show that the
> > code contradicts itself. It can't be bad for a nice 1 task to run
> > with a nice 0 task, but OK for a minimum RT task to run with a maximum
> > RT task. Iff HT without corrective measures breaks nice, then it
> > breaks realtime priorities as well.
>
> i'm starting to lean towards your view that we should not artificially
> keep tasks from running, when there's a free CPU available. We should
> still keep the 'other half' of SMT scheduling: the immediate pushing of
> tasks to a related core, but this bit of 'do not run tasks on this CPU'
> dependent-sleeper logic is i think a bit fragile. Plus these days SMT
> siblings do not tend to influence each other in such a negative way as
> older P4 ones where a HT sibling would slow down the other sibling
> significantly.

Well, it is meant to be tuned to the cpu type via per_cpu_gain, so it
should be easy to set to the appropriate scaling. It was never meant to
be one-value-fits-all as the processors changed.

> plus with an increasing number of siblings (which seems like an
> inevitable thing on the hardware side), the dependent-sleeper logic
> becomes less and less scalable. We'd have to cross-check every other
> 'related' CPU's current priority to decide what to run.

Yes, even I've commented before that the current system is unworkable
once there are multiple shared power threads. This I do see as a real
problem with it - in the future.

> if then there should be a mechanism /in the hardware/ to set the
> priority of a CPU - and then the hardware could decide how to
> prioritize between siblings. Doing this in software is really hard.

And that's the depressing part because of course I was interested in that
as the original approach to the problem (and it was a big problem). When
I spoke to Intel and AMD (of course to date no SMT AMD chip exists) at
kernel summit they said it was too hard to implement hardware priorities
well. Which is real odd since IBM have already done it with Power...

Still I think it has been working fine in software till now, but now it
has to deal with the added confusion of dynticks, so I already know what
will happen to it.

Hrm, it's been a good time for my code all round... I think I'll just
swap prefetch myself up the staircase to some pluggable scheduler that
would hyperthread me to sleep as an idle priority task.

--
-ck
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> I see no real difference between the two assertions. Nice is just a
> mechanism to set priority, so I applied your assertion to a different
> range of priorities than nice covers, and returned it to show that the
> code contradicts itself. It can't be bad for a nice 1 task to run
> with a nice 0 task, but OK for a minimum RT task to run with a maximum
> RT task. Iff HT without corrective measures breaks nice, then it
> breaks realtime priorities as well.

i'm starting to lean towards your view that we should not artificially
keep tasks from running, when there's a free CPU available. We should
still keep the 'other half' of SMT scheduling: the immediate pushing of
tasks to a related core, but this bit of 'do not run tasks on this CPU'
dependent-sleeper logic is i think a bit fragile. Plus these days SMT
siblings do not tend to influence each other in such a negative way as
older P4 ones where a HT sibling would slow down the other sibling
significantly.

plus with an increasing number of siblings (which seems like an
inevitable thing on the hardware side), the dependent-sleeper logic
becomes less and less scalable. We'd have to cross-check every other
'related' CPU's current priority to decide what to run.

if then there should be a mechanism /in the hardware/ to set the
priority of a CPU - and then the hardware could decide how to prioritize
between siblings. Doing this in software is really hard.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Thu, 2007-03-01 at 09:01 +1100, Con Kolivas wrote:
> On Wednesday 28 February 2007 15:21, Mike Galbraith wrote:
> > On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> > > On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > > > Agreed.
> > > >
> > > > I was recently looking at that spot because I found that niced
> > > > tasks were taking latency hits, and disabled it, which helped a
> > > > bunch.
> > >
> > > Ok... as I said above to Ingo, nice means more latency too, and
> > > there is no doubt that if we disable nice as a working feature then
> > > the niced tasks will have less latency. Of course, this ends up
> > > meaning that un-niced tasks no longer receive their better cpu
> > > performance.. You're basically saying that you prefer nice not to
> > > work in the setting of HT.
> >
> > No I'm not, but let's go further in that direction just for the sake
> > of argument. You're then saying that you prefer realtime priorities to
> > not work in the HT setting, given that realtime tasks don't
> > participate in the 'single stream me' program.
>
> Where do I say that? I do not presume to manage realtime priorities in
> any way. You're turning my argument about nice levels around and somehow
> saying that because hyperthreading breaks the single stream me semantics
> by parallelising them that I would want to stop that happening. Nowhere
> have I argued that realtime semantics should be changed to somehow work
> around hyperthreading. SMT nice is about managing nice only, and not
> realtime priorities which are independent entities.

I see no real difference between the two assertions. Nice is just a
mechanism to set priority, so I applied your assertion to a different
range of priorities than nice covers, and returned it to show that the
code contradicts itself. It can't be bad for a nice 1 task to run with a
nice 0 task, but OK for a minimum RT task to run with a maximum RT task.
Iff HT without corrective measures breaks nice, then it breaks realtime
priorities as well.

> > I'm saying only that we're defeating the purpose of HT, and overriding
> > a user decision every time we force a sibling idle.
> >
> > > > I also can't understand why it would be OK to interleave a normal
> > > > task with an RT task sometimes, but not others.. that's
> > > > meaningless to the RT task.
> > >
> > > Clearly there would be a reason that code is there... The whole
> > > problem with HT is that as soon as you load up a sibling, you slow
> > > down the logical sibling, hence why this code is there in the first
> > > place. Since I know you're one to test things for yourself, I will
> > > put it to you this way:
> > >
> > > Boot into UP. Time how long it takes to do a million of these in a
> > > real time task:
> > > asm volatile("" : : : "memory");
> > >
> > > Then start up a SCHED_NORMAL task fully cpu bound such as
> > > "yes > /dev/null" and time that again. Obviously the former being a
> > > realtime task will take the same amount of time and the SCHED_NORMAL
> > > task will be starved until the realtime task finishes.
> >
> > Sure.
> >
> > > Now try the same experiment with hyperthreading enabled and an
> > > ordinary SMP kernel. You'll find the realtime task runs at only ~60%
> > > performance.
> >
> > So? User asked for HT. That's hardware multiplexing. It ain't free.
> > Buyer beware.
>
> But the buyer is not aware. You are aware because you tinker, but the
> vast majority of users who enable hyperthreading in their shiny PCs are
> not aware.

Then we need to make them aware of what they're enabling?

> The only thing they know is that if they enable hyperthreading their
> programs run slower in multitasking environments no matter how much they
> nice the other processes. Buyers do not buy hardware knowing that the
> internal design breaks something as fundamental as 'nice'. You seem to
> presume that most people who get hyperthreading are happy to compromise
> 'nice' in order to get their second core working and I put it to you
> that they do not make that decision.

To me it's pretty much black and white. Either you want to split your cpu
into logical units, which means each has less to offer than the total, or
you want all your processing power in one bucket.

> > > That's a serious performance hit for realtime tasks considering
> > > you're running a SCHED_NORMAL task. The SMT code that you seem to
> > > dislike fixes this problem.
> >
> > I don't think it does actually. Let your RT task sleep regularly, and
> > ever so briefly. We don't evict lower priority tasks from siblings
> > upon wakeup, we only prevent entry... sometimes.
>
> Well you know as well as I do that you're selecting out the exception
> rather than the rule, and statistically overall, it does work.

I don't agree that it's the exception, and if you look at this HT thing
from the split cpu perspective, I'm not sure there's even a problem.

Scrolling down, I see that this is getting too long, and we aren't commun
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Wednesday 28 February 2007 15:21, Mike Galbraith wrote:
> (hrmph. having to copy/paste/try again. evolution seems to be broken..
> RCPT TO <[EMAIL PROTECTED]> failed: Cannot resolve your domain {mp049}
> ..caused me to be unable to send despite receipts being disabled)

Apologies for mangling the email address as I said :-(

> On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> > On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > > Agreed.
> > >
> > > I was recently looking at that spot because I found that niced tasks
> > > were taking latency hits, and disabled it, which helped a bunch.
> >
> > Ok... as I said above to Ingo, nice means more latency too, and there
> > is no doubt that if we disable nice as a working feature then the
> > niced tasks will have less latency. Of course, this ends up meaning
> > that un-niced tasks no longer receive their better cpu performance..
> > You're basically saying that you prefer nice not to work in the
> > setting of HT.
>
> No I'm not, but let's go further in that direction just for the sake of
> argument. You're then saying that you prefer realtime priorities to not
> work in the HT setting, given that realtime tasks don't participate in
> the 'single stream me' program.

Where do I say that? I do not presume to manage realtime priorities in
any way. You're turning my argument about nice levels around and somehow
saying that because hyperthreading breaks the single stream me semantics
by parallelising them that I would want to stop that happening. Nowhere
have I argued that realtime semantics should be changed to somehow work
around hyperthreading. SMT nice is about managing nice only, and not
realtime priorities which are independent entities.

> I'm saying only that we're defeating the purpose of HT, and overriding a
> user decision every time we force a sibling idle.
>
> > I also can't understand why it would be OK to interleave a normal task
> > with an RT task sometimes, but not others.. that's meaningless to the
> > RT task.
>
> Clearly there would be a reason that code is there... The whole problem
> with HT is that as soon as you load up a sibling, you slow down the
> logical sibling, hence why this code is there in the first place. Since
> I know you're one to test things for yourself, I will put it to you this
> way:
>
> Boot into UP. Time how long it takes to do a million of these in a real
> time task:
> asm volatile("" : : : "memory");
>
> Then start up a SCHED_NORMAL task fully cpu bound such as
> "yes > /dev/null" and time that again. Obviously the former being a
> realtime task will take the same amount of time and the SCHED_NORMAL
> task will be starved until the realtime task finishes.

Sure.

> Now try the same experiment with hyperthreading enabled and an ordinary
> SMP kernel. You'll find the realtime task runs at only ~60% performance.

So? User asked for HT. That's hardware multiplexing. It ain't free.
Buyer beware.

But the buyer is not aware. You are aware because you tinker, but the
vast majority of users who enable hyperthreading in their shiny PCs are
not aware. The only thing they know is that if they enable hyperthreading
their programs run slower in multitasking environments no matter how much
they nice the other processes. Buyers do not buy hardware knowing that
the internal design breaks something as fundamental as 'nice'. You seem
to presume that most people who get hyperthreading are happy to
compromise 'nice' in order to get their second core working and I put it
to you that they do not make that decision.

> > That's a serious performance hit for realtime tasks considering you're
> > running a SCHED_NORMAL task. The SMT code that you seem to dislike
> > fixes this problem.
>
> I don't think it does actually. Let your RT task sleep regularly, and
> ever so briefly. We don't evict lower priority tasks from siblings upon
> wakeup, we only prevent entry... sometimes.

Well you know as well as I do that you're selecting out the exception
rather than the rule, and statistically overall, it does work.

> > The reason for interleaving is that there are a few cycles to be
> > gained by using the second core for a separate SCHED_NORMAL task, and
> > you don't want to disable access to the second core entirely for the
> > duration the realtime task is running. Since there is no simple
> > relationship between SCHED_NORMAL timeslices and realtime timeslices,
> > we have to use some form of interleaving based on the expected extra
> > cycles and HZ is the obvious choice.
>
> To me, the reason for interleaving is solely about keeping the core
> busy. It has nothing to do with SCHED_POLICY_X whatsoever.
>
> > > IMHO, SMT scheduling should be a buyer beware thing. Maximizing your
> > > core utilization comes at a price, but so does disabling it, so I
> > > think letting the user decide what he wants is the right thing to
> > > do.
> >
> > To me this is like arguing that we should not implement 'nice' within
> > the cpu scheduler at all and only allow nice to work on the few
> > architectures that support hardware priorities in the cpu (like
> > power5).
Re: 2.6.21-rc1: known regressions (v2) (part 2)
(hrmph. having to copy/paste/try again. evolution seems to be broken..
RCPT TO <[EMAIL PROTECTED]> failed: Cannot resolve your domain {mp049}
..caused me to be unable to send despite receipts being disabled)

On Wed, 2007-02-28 at 09:58 +1100, Con Kolivas wrote:
> On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> > Agreed.
> >
> > I was recently looking at that spot because I found that niced tasks
> > were taking latency hits, and disabled it, which helped a bunch.
>
> Ok... as I said above to Ingo, nice means more latency too, and there is
> no doubt that if we disable nice as a working feature then the niced
> tasks will have less latency. Of course, this ends up meaning that
> un-niced tasks no longer receive their better cpu performance.. You're
> basically saying that you prefer nice not to work in the setting of HT.

No I'm not, but let's go further in that direction just for the sake of
argument. You're then saying that you prefer realtime priorities to not
work in the HT setting, given that realtime tasks don't participate in
the 'single stream me' program.

I'm saying only that we're defeating the purpose of HT, and overriding a
user decision every time we force a sibling idle.

> > I also can't understand why it would be OK to interleave a normal task
> > with an RT task sometimes, but not others.. that's meaningless to the
> > RT task.
>
> Clearly there would be a reason that code is there... The whole problem
> with HT is that as soon as you load up a sibling, you slow down the
> logical sibling, hence why this code is there in the first place. Since
> I know you're one to test things for yourself, I will put it to you this
> way:
>
> Boot into UP. Time how long it takes to do a million of these in a real
> time task:
> asm volatile("" : : : "memory");
>
> Then start up a SCHED_NORMAL task fully cpu bound such as
> "yes > /dev/null" and time that again. Obviously the former being a
> realtime task will take the same amount of time and the SCHED_NORMAL
> task will be starved until the realtime task finishes.

Sure.

> Now try the same experiment with hyperthreading enabled and an ordinary
> SMP kernel. You'll find the realtime task runs at only ~60% performance.

So? User asked for HT. That's hardware multiplexing. It ain't free.
Buyer beware.

> That's a serious performance hit for realtime tasks considering you're
> running a SCHED_NORMAL task. The SMT code that you seem to dislike fixes
> this problem.

I don't think it does actually. Let your RT task sleep regularly, and
ever so briefly. We don't evict lower priority tasks from siblings upon
wakeup, we only prevent entry... sometimes.

> The reason for interleaving is that there are a few cycles to be gained
> by using the second core for a separate SCHED_NORMAL task, and you don't
> want to disable access to the second core entirely for the duration the
> realtime task is running. Since there is no simple relationship between
> SCHED_NORMAL timeslices and realtime timeslices, we have to use some
> form of interleaving based on the expected extra cycles and HZ is the
> obvious choice.

To me, the reason for interleaving is solely about keeping the core busy.
It has nothing to do with SCHED_POLICY_X whatsoever.

> > IMHO, SMT scheduling should be a buyer beware thing. Maximizing your
> > core utilization comes at a price, but so does disabling it, so I
> > think letting the user decide what he wants is the right thing to do.
>
> To me this is like arguing that we should not implement 'nice' within
> the cpu scheduler at all and only allow nice to work on the few
> architectures that support hardware priorities in the cpu (like power5).
> Now there is no doubt that if we do away with nice entirely everywhere
> in the scheduler we'll gain some throughput. However, nice is a basic
> unix/linux function and if hardware comes along that breaks it working
> we should be working to make sure that it keeps working in software.
> That is why smt nice and smp nice was implemented. Of course it is our
> duty to ensure we do that at minimal overhead at all times. That's a
> different argument to what you are debating here. The throughput should
> not be adversely affected by this SMT priority code because although the
> nice 19 task gets less throughput, the higher priority task gets more as
> a result, which is essentially what nice is meant to do.

Re-read this paragraph with realtime task priorities in mind, or for that
matter, dynamic priorities. If you carry your priority/throughput
argument to its logical conclusion, only instruction streams of
absolutely equal priority should be able to share the core at any given
time. You may as well just disable HT and be done with it.

To me, siblings are logically separate units, and should be treated as
such (as they mostly are). They share an important resource, but so do
physically discrete units.

	-Mike
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Apologies for the resend, lkml address got mangled...

On Tuesday 27 February 2007 19:54, Mike Galbraith wrote:
> On Tue, 2007-02-27 at 09:33 +0100, Ingo Molnar wrote:
> > * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> > > Thomas Gleixner wrote:
> > > > Adrian,
> > > >
> > > > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> > > >> Subject    : kernel BUG at kernel/time/tick-sched.c:168
> > > >>              (CONFIG_NO_HZ)
> > > >> References : http://lkml.org/lkml/2007/2/16/346
> > > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >
> > > I can confirm that the bug is fixed (over 20 hours of testing should
> > > be enough).
> >
> > thanks a lot! I think this thing was a long-term performance/latency
> > regression in HT scheduling as well.

Ingo, I'm going to have to partially disagree with you on this. This has
only become a problem because of what happens with dynticks now when
rq->curr == rq->idle. Prior to this, that particular SMT code only leads
to relative delays in scheduling for lower priority tasks. Whether or not
that task is ksoftirqd should not matter because it is not like they are
starved indefinitely; it is only that nice 19 tasks are relatively
delayed, which by definition is implied with the usage of nice as a
scheduler hint, wouldn't you say? I know it has been discussed many times
before as to whether 'nice' means less cpu and/or more latency, but in
our current implementation, nice means both less cpu and more latency. So
to me, the kernels without dynticks do not have a regression. This seems
to only be a problem in the setting of the new dynticks code IMHO. That's
not to say it isn't a bug! Nor am I saying that dynticks is a problem!
Please don't misinterpret that.

The second issue is that this is a problem because of the fuzzy
definition of what idle is for a runqueue in the setting of this SMT
code. Normally, rq->curr == rq->idle means the runqueue is idle, but not
with this code, since there are still rq->nr_running on that runqueue.
What dynticks in this implementation is doing is trying to idle a
hyperthread sibling on a cpu whose logical partner is busy. I did not
find that added any power saving on my earlier dynticks implementation,
and found it easier to keep that sibling ticking at the same rate as its
partner. Of course you may have found something different, and I
definitely agree with what you are likely to say in response to this - we
shouldn't have to special case logical siblings as having a different
definition of idle than any other smp case. Ultimately, that leaves us
with your simple patch as a reasonable solution for the dynticks case
even though it does change the behaviour dramatically for a more loaded
cpu. I don't see this code as presenting a problem without or prior to
the dynticks implementation. You being the scheduler maintainer means you
get to choose what is the best way to tackle this problem.

> Agreed.
>
> I was recently looking at that spot because I found that niced tasks
> were taking latency hits, and disabled it, which helped a bunch.

Ok... as I said above to Ingo, nice means more latency too, and there is
no doubt that if we disable nice as a working feature then the niced
tasks will have less latency. Of course, this ends up meaning that
un-niced tasks no longer receive their better cpu performance.. You're
basically saying that you prefer nice not to work in the setting of HT.

> I also can't understand why it would be OK to interleave a normal task
> with an RT task sometimes, but not others.. that's meaningless to the RT
> task.

Clearly there would be a reason that code is there... The whole problem
with HT is that as soon as you load up a sibling, you slow down the
logical sibling, hence why this code is there in the first place. Since I
know you're one to test things for yourself, I will put it to you this
way:

Boot into UP. Time how long it takes to do a million of these in a real
time task:
asm volatile("" : : : "memory");

Then start up a SCHED_NORMAL task fully cpu bound such as
"yes > /dev/null" and time that again. Obviously the former being a
realtime task will take the same amount of time and the SCHED_NORMAL task
will be starved until the realtime task finishes.

Now try the same experiment with hyperthreading enabled and an ordinary
SMP kernel. You'll find the realtime task runs at only ~60% performance.
That's a serious performance hit for realtime tasks considering you're
running a SCHED_NORMAL task. The SMT code that you seem to dislike fixes
this problem.

The reason for interleaving is that there are a few cycles to be gained
by using the second core for a separate SCHED_NORMAL task, and you don't
want to disable access to the second core entirely for the duration the
realtime task is running. Since there is no simple relationship between
SCHED_NORMAL timeslices and realtime timeslices, we have to use some form
of interleaving based on the expected extra cycles and HZ is the obvious
choice.
Re: 2.6.21-rc1: known regressions (v2) (part 2)
On Tue, 2007-02-27 at 09:33 +0100, Ingo Molnar wrote:
> * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > Thomas Gleixner wrote:
> > > Adrian,
> > >
> > > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> > >> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> > >> References : http://lkml.org/lkml/2007/2/16/346
> > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> > >> Status     : problem is being debugged
> > >
> > > The BUG_ON() was replaced by a warning printk(). The BUG_ON()
> > > exposed a problem with the SMT scheduler. See below.
> > >
> > >> Subject    : BUG: soft lockup detected on CPU#0
> > >>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> > >> References : http://lkml.org/lkml/2007/2/20/257
> > >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> > >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> > >>              Ingo Molnar <[EMAIL PROTECTED]>
> > >> Status     : problem is being debugged
> > >
> > > Patch available, not confirmed yet.
> >
> > I can confirm that the bug is fixed (over 20 hours of testing should
> > be enough).
>
> thanks a lot! I think this thing was a long-term performance/latency
> regression in HT scheduling as well.

Agreed. I was recently looking at that spot because I found that niced
tasks were taking latency hits, and disabled it, which helped a bunch.
I also can't understand why it would be OK to interleave a normal task
with an RT task sometimes, but not others.. that's meaningless to the
RT task.

IMHO, SMT scheduling should be a buyer-beware thing. Maximizing your
core utilization comes at a price, but so does disabling it, so I think
letting the user decide what he wants is the right thing to do.

	-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc1: known regressions (v2) (part 2)
* Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Thomas Gleixner wrote:
> > Adrian,
> >
> > On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> >> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> >> References : http://lkml.org/lkml/2007/2/16/346
> >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> >> Status     : problem is being debugged
> >
> > The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
> > a problem with the SMT scheduler. See below.
> >
> >> Subject    : BUG: soft lockup detected on CPU#0
> >>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> >> References : http://lkml.org/lkml/2007/2/20/257
> >> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> >> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> >>              Ingo Molnar <[EMAIL PROTECTED]>
> >> Status     : problem is being debugged
> >
> > Patch available, not confirmed yet.
>
> I can confirm that the bug is fixed (over 20 hours of testing should
> be enough).

thanks a lot! I think this thing was a long-term performance/latency
regression in HT scheduling as well.

	Ingo
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Michal Piotrowski wrote:
> Thomas Gleixner wrote:
>> Adrian,
>>
>> On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
>>> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
>>> References : http://lkml.org/lkml/2007/2/16/346
>>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>> Status     : problem is being debugged
>>
>> The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
>> a problem with the SMT scheduler. See below.
>>
>>> Subject    : BUG: soft lockup detected on CPU#0
>>>              NOHZ: local_softirq_pending 20 (SMT scheduler)
>>> References : http://lkml.org/lkml/2007/2/20/257
>>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>>              Ingo Molnar <[EMAIL PROTECTED]>
>>> Status     : problem is being debugged
>>
>> Patch available, not confirmed yet.
>
> I can confirm that the bug is fixed (over 20 hours of testing should be
> enough).

almost ;)

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Thomas Gleixner wrote:
> Adrian,
>
> On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
>> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
>> References : http://lkml.org/lkml/2007/2/16/346
>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>> Status     : problem is being debugged
>
> The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed
> a problem with the SMT scheduler. See below.
>
>> Subject    : BUG: soft lockup detected on CPU#0
>>              NOHZ: local_softirq_pending 20 (SMT scheduler)
>> References : http://lkml.org/lkml/2007/2/20/257
>> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>>              Ingo Molnar <[EMAIL PROTECTED]>
>> Status     : problem is being debugged
>
> Patch available, not confirmed yet.

I can confirm that the bug is fixed (over 20 hours of testing should be
enough). Huge thanks!

Regards,
Michal
Re: 2.6.21-rc1: known regressions (v2) (part 2)
Adrian,

On Mon, 2007-02-26 at 23:05 +0100, Adrian Bunk wrote:
> Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
> References : http://lkml.org/lkml/2007/2/16/346
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> Status     : problem is being debugged

The BUG_ON() was replaced by a warning printk(). The BUG_ON() exposed a
problem with the SMT scheduler. See below.

> Subject    : BUG: soft lockup detected on CPU#0
>              NOHZ: local_softirq_pending 20 (SMT scheduler)
> References : http://lkml.org/lkml/2007/2/20/257
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
> Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
>              Ingo Molnar <[EMAIL PROTECTED]>
> Status     : problem is being debugged

Patch available, not confirmed yet.

	tglx
2.6.21-rc1: known regressions (v2) (part 2)
This email lists some known regressions in 2.6.21-rc1 compared to 2.6.20
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of
one of the bugs, the maintainer of an affected subsystem or driver, a
patch of yours caused a breakage, or I'm considering you in some other
way possibly involved with one or more of these issues. Due to the huge
amount of recipients, please trim the Cc when answering.

Subject    : forcedeth no longer works
References : http://bugzilla.kernel.org/show_bug.cgi?id=8090
Submitter  : David P. Reed <[EMAIL PROTECTED]>
Caused-By  : Ayaz Abdulla <[EMAIL PROTECTED]>
Status     : unknown

Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <[EMAIL PROTECTED]>
Status     : unknown

Subject    : natsemi ethernet card not detected correctly
References : http://lkml.org/lkml/2007/2/23/4
             http://lkml.org/lkml/2007/2/23/7
Submitter  : Bob Tracy <[EMAIL PROTECTED]>
Caused-By  : Mark Brown <[EMAIL PROTECTED]>
Handled-By : Mark Brown <[EMAIL PROTECTED]>
Patch      : http://lkml.org/lkml/2007/2/23/142
Status     : patch available

Subject    : ThinkPad T60: system doesn't come out of suspend to RAM
             (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/22/391
Submitter  : Michael S. Tsirkin <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
             Ingo Molnar <[EMAIL PROTECTED]>
Status     : unknown

Subject    : kernel BUG at kernel/time/tick-sched.c:168 (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/2/16/346
Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
Status     : problem is being debugged

Subject    : BUG: soft lockup detected on CPU#0
             NOHZ: local_softirq_pending 20 (SMT scheduler)
References : http://lkml.org/lkml/2007/2/20/257
Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
             Ingo Molnar <[EMAIL PROTECTED]>
Status     : problem is being debugged

Subject    : i386: no boot with nmi_watchdog=1 (clockevents)
References : http://lkml.org/lkml/2007/2/21/208
Submitter  : Daniel Walker <[EMAIL PROTECTED]>
Caused-By  : Thomas Gleixner <[EMAIL PROTECTED]>
             commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
Status     : problem is being debugged