Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Willy Tarreau
On Fri, Sep 14, 2007 at 03:10:47PM +0200, Roman Zippel wrote:
> Hi,
> Getting credit is indeed not really that important to me, but apparently 
> some lousy credit notes is the only way to get any kind of 
> acknowledgement.

Acknowledgement has always been one of the kernel's weaknesses it seems,
given the recent issues on other subjects. But it's not always easy either,
especially when you just change sparse parts of code based on someone else's
analysis. I personally do credit people in the GIT changelogs for their ideas
or patches, but that does not appear in the code.

> I want to get more attention, but not in a way you suspect. I don't think 
> that a mouse is a really good analogy, is he really that defenseless?

I don't know. But I observe that you're very efficient at building the road
you want him to walk on.

> All I want is to be taken a bit more seriously, the communication aspect I 
> mentioned is really important. From my perspective Ingo is somewhere up on 
> his pedestal and I have to scream to get any kind of attention.

In my opinion, you're screaming in a language he does not understand, and
when he proposes random responses, you don't understand them either. That
game can last very long. You want to speak maths, he cannot. He wants to
speak patches, you cannot. I'm not saying one is better than the other, but
I know for sure that the common language here on LKML is patches. So my
conclusion is that you need someone to act as a translator when you want
to communicate here. That would be very hard work, BTW!

> I skip the rest of the mail, it's one big attempt trying to prove that I'm 
> dishonest, but you only look at the issue from one side and thus making it 
> yourself very easy.

Maybe there was a very prominent side then. I might be wrong in my analysis,
but I cannot find any other interpretation, there are too many coincidences,
and I don't believe in that, especially from smart people ;-)

Willy

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Arjan van de Ven wrote:

> > There is actually a very simple reason for that, the actual patch is
> > not my primary focus, 
> 
> 
> for someone who's not focused on patches/code,  you make quite a bit of
> noise when someone does turn your discussion into smaller patches and
> only credits you three times.

As I said before, it's not really the lack of credit, it's the lack of 
discussion.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Arjan van de Ven
On Fri, 14 Sep 2007 16:50:22 +0200 (CEST)
Roman Zippel <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> On Fri, 14 Sep 2007, Arjan van de Ven wrote:
> 
> > > This is ridiculous, I asked you multiple times to explain to me
> > > some of the differences relative to CFS as response to the splitup
> > > requests. Not once did you react, you didn't even ask what I'd
> > > like to know specifically.
> > 
> > Roman,
> > 
> > this is... a strange comment. It almost sounds like you were holding
> > the splitup hostage depending on some other thing happening
> > that's not a good attitude in my book. Having big-blob patches that
> > do many things at the same time leads to them being impossible to
> > apply. Linux works by having smaller incrementals. You know that;
> > you've been around for a long time.
> 
> There is actually a very simple reason for that, the actual patch is
> not my primary focus, 


for someone who's not focused on patches/code,  you make quite a bit of
noise when someone does turn your discussion into smaller patches and
only credits you three times.


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Arjan van de Ven wrote:

> > This is ridiculous, I asked you multiple times to explain to me some
> > of the differences relative to CFS as response to the splitup
> > requests. Not once did you react, you didn't even ask what I'd like
> > to know specifically.
> 
> Roman,
> 
> this is... a strange comment. It almost sounds like you were holding
> the splitup hostage depending on some other thing happening that's
> not a good attitude in my book. Having big-blob patches that do many
> things at the same time leads to them being impossible to apply. Linux
> works by having smaller incrementals. You know that; you've been around
> for a long time.

There is actually a very simple reason for that, the actual patch is not
my primary focus, for me it's actually more an afterthought of the actual 
design to show that it actually works.
My primary interest is a _discussion_ of the scheduler design, but Ingo 
insists on patches. Sorry, but I don't really work this way, I want to 
think things through _first_, I need a solid concept and I don't like to 
rely on guesswork.

How much response would I have gotten if I had only posted the example 
program and the math description as I initially planned?

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Arjan van de Ven
On Thu, 13 Sep 2007 18:50:12 +0200 (CEST)
Roman Zippel <[EMAIL PROTECTED]> wrote:
> > You never directly replied to these pretty explicit requests, all
> > you did was this side remark 5 days later in one of your patch 
> > announcements:
> 
> This is ridiculous, I asked you multiple times to explain to me some
> of the differences relative to CFS as response to the splitup
> requests. Not once did you react, you didn't even ask what I'd like
> to know specifically.

Roman,

this is... a strange comment. It almost sounds like you were holding
the splitup hostage depending on some other thing happening that's
not a good attitude in my book. Having big-blob patches that do many
things at the same time leads to them being impossible to apply. Linux
works by having smaller incrementals. You know that; you've been around
for a long time.

Complaining that someone finally did splitup work after you refused,
and even puts credit in for you... that's beyond my comprehension.
Sorry.



Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Willy Tarreau wrote:

> On Aug 10th, I was disappointed to see that you still had not provided
> the critical information that Ingo had been requesting from you for 9 days
> (cfs-sched-debug output). Your motivations in this work started to
> become a bit fuzzy to me, since people who behave like this generally
> do so to get all the lights on them and you really don't need this.
> 
> Your explanation was kind of "show me yours and only then I'll show
> you mine". Pretty childish but you finally sent that long-requested
> information.

Well, I admit it was a rather fruitless attempt to get some information out 
of Ingo, but I only did it _once_.
The problem is that the flow of information hasn't improved since. Later 
I actually answered his questions, but my information requests still go to 
/dev/null.

> I'm now fairly convinced that you're not seeking credits either. There
> are more credits to your name per line of patch here than there is in
> your own code in the kernel. That complaint does not stand by itself.
> 
> In fact, I'm beginning to think that you're like a cat who has found a mouse.
> Why kill it if you can play with it ? Each of your "will I get a response"
> are just like a small kick in the mouse's back to make it move. But by dint
> of doing this, you're slowly pushing the mouse to the door where it risks
> to escape from you, and you're losing your toy.

Getting credit is indeed not really that important to me, but apparently 
some lousy credit notes is the only way to get any kind of 
acknowledgement.
I want to get more attention, but not in a way you suspect. I don't think 
that a mouse is a really good analogy, is he really that defenseless?
All I want is to be taken a bit more seriously, the communication aspect I 
mentioned is really important. From my perspective Ingo is somewhere up on 
his pedestal and I have to scream to get any kind of attention.

I skip the rest of the mail, it's one big attempt trying to prove that I'm 
dishonest, but you only look at the issue from one side and thus making it 
yourself very easy.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Sam Ravnborg wrote:

> I have read the announcement from Ingo and after reading it I concluded
> that it was good to see that Ingo had taken into consideration the feedback
> from you and improved the scheduler based on this.
> And when I read that he removed a lot of stuff I smiled. This reminded
> me of countless monkey aka code review sessions where I repeatedly do
> like my children and ask why so many times that the author realizes that
> something is not needed or no longer used.
> 
> 
> The above were my impression after reading the announcement with
> respect to your influence and that goes far beyond "two cleanups".
> I bet many others read it roughly like I did.

Sam, in a way you actually prove my point. Thanks. :)
The primary thing you remember here from the announcements is the cleanup 
and tuning part, during which he picked up some small parts from my patch.

That's what I was afraid of, most people won't realize what was added in 
this process and even if they notice it, Ingo describes it somewhat 
"similar", but actually "different". That part is pretty important to me, 
but Ingo treats it more as a minor matter.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Peter Zijlstra
On Fri, 2007-09-14 at 14:04 +0200, Roman Zippel wrote:

> AFAICT the compensation part is already done by the scaling part, without 
> the load part it largely mirrors what __update_stats_wait_end() does, i.e. 
> it gets the same time as other tasks, which have been on the rq.

All it tried to do was approximate the situation where the task never
left the rq. I'm not saying it makes sense or is the right thing to do,
just what the thought behind that particular bit was.

There is a reason it was turned off by default:

-   SCHED_FEAT_SLEEPER_AVG  *0 |






Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Peter Zijlstra wrote:

> On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:
> 
> > I never claimed to understand every detail of CFS, I can _guess_ what 
> > _might_ have been intended, but from that it's impossible to know for 
> > certain how important they are. Let's take this patch fragment:
> > 
> 
>   delta_fair = se->delta_fair_sleep;
> 
> we slept that much
> 
> > -       /*
> > -        * Fix up delta_fair with the effect of us running
> > -        * during the whole sleep period:
> > -        */
> > -       if (sched_feat(SLEEPER_AVG))
> > -               delta_fair = div64_likely32((u64)delta_fair * load,
> > -                                               load + se->load.weight);
> 
> if we would have ran we would not have been removed from the rq and the
> weight would have been: rq_weight + weight
> 
> so compensate for us having been removed from the rq by scaling the
> delta with: rq_weight/(rq_weight + weight)
> 
> > -   delta_fair = calc_weighted(delta_fair, se);
> 
> scale for nice levels

AFAICT the compensation part is already done by the scaling part, without 
the load part it largely mirrors what __update_stats_wait_end() does, i.e. 
it gets the same time as other tasks, which have been on the rq.

bye, Roman
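
[Editor's sketch, not part of the thread.] The compensation Peter describes
in the quoted text, scaling the sleep delta by rq_weight/(rq_weight + weight),
can be illustrated in a few lines of C. The helper name scale_sleep_delta()
and the sample weights are assumptions made for illustration only. The removed
kernel code used div64_likely32() and then calc_weighted() for the nice-level
step, which is not modeled here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's code): scale the time a task
 * slept by rq_weight / (rq_weight + weight), approximating the share the
 * other tasks would have had if the sleeper had stayed on the runqueue.
 * Assumes delta_fair * rq_weight fits in 64 bits.
 */
static uint64_t scale_sleep_delta(uint64_t delta_fair,
                                  uint64_t rq_weight,
                                  uint64_t weight)
{
    uint64_t total = rq_weight + weight;

    if (total == 0)         /* nothing on the rq and no weight: no credit */
        return 0;
    return delta_fair * rq_weight / total;
}
```

With a 10 ms sleep (10000000 ns), a runqueue weight of 2048 and a sleeper
weight of 1024, scale_sleep_delta(10000000, 2048, 1024) yields 6666666,
i.e. two thirds of the raw delta.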


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

> > The rest of the math is indeed different - it's simply missing. What 
> > is there is IMO not really adequate. I guess you will see the 
> > differences, once you test a bit more with different nice levels.
> 
> Roman, i disagree strongly. I did test with different nice levels. Here 
> are some hard numbers: the CPU usage table of 40 busy loops started at 
> once, all running at a different nice level, from nice -20 to nice +19:

Ingo, you should have read the rest of the paragraph too, I said "it's 
needed for a good task placement", I didn't say anything about time 
distribution.
Try to start a few niced busy loops and then try some interactivity tests.
You should also increase the granularity, the rather small time slices can
cover up a lot of bad scheduling decisions.

> In the announcement of your "Really Fair Scheduler" patch you used the 
> following very strong statement:
> 
> " This model is far more accurate than CFS is [...]"
> 
> http://lkml.org/lkml/2007/8/30/307
> 
> but when i stressed you for actual real-world proof of CFS misbehavior, 

You're forgetting that only a few days before that announcement, the worst 
issues had been fixed, which at that time I hadn't taken into account yet.

> you said:
> 
> "[...] they have indeed little effect in the short term, [...] "
> 
> http://lkml.org/lkml/2007/9/2/282
> 
> so how can CFS be "far less accurate" (paraphrased) while it has "little 
> effect in the short term"?
> 
> so to repeat my question: my (and Peter's) claim is that there is no 
> real-world significance of much of the complexity you added to avoid 
> rounding effects. You do disagree with that, so our follow-up question 
> is: what actual real-world significance does it have in your opinion? 
> What is the worst-case effect? Do we even care? We have measured it 
> every which way and it just does not matter. (but we could easily be 
> wrong, so please be specific if you know about something that we 
> overlooked.) Thanks,

Did you read the rest of the mail? I said a little bit more than that, which 
actually explains this already in large parts.
(BTW this mail also has one example where I almost begged you to explain 
to me some of the CFS features in response to your splitup request - no 
response.)

Accuracy is an important aspect, but it's not really the primary goal. 
As I said I wanted a correct mathematical model of CFS, but due to the 
complexity of CFS (of which a lot has been removed now in CFS-devel) it 
was rather difficult to produce such a model.
Producing an accurate model is meant as a _tool_ for further 
transformations, e.g. to analyze where further simplifications are 
possible and where the 64bit math can be replaced with something simpler 
without reducing scheduling quality significantly.
The added accuracy of course increases the complexity, but compared to the 
already existing complexity it was still less (at least according to the 
lmbench numbers), so IMO it's worth it. The advantage is that I didn't have 
to worry about any effects of unexpected rounding errors. This scheduler 
has to work with a wide range of clock implementations and AFAICT it's 
impossible to guarantee that it works in any situation, it may not 
break down completely, but I couldn't exclude unexplainable anomalies, 
especially after seeing the problems in the early CFS version, which got 
merged.
As I also mentioned this is only part of the problem (but to which the early 
CFS version significantly contributed). The main problem was the limits: 
once the limits are exceeded, that overflow/underflow time is simply lost 
and that is what finally resulted in the misbehaviour. The rounding 
problems were one possible cause but not the only one. Other possibilities 
would require more complex scheduling patterns, where de-/enqueuing of 
tasks would push some tasks into these limits. Prime suspect here was the 
sleeper bonus and the question was: is it possible to accumulate the 
bonus, is it possible to force the punishment onto specific tasks.

The complexity of CFS now makes it hard to quantify the problem. It's easy
to say that it will work in most cases, but e.g. the rounding fixes 
mostly changed the common case but not really the worst case. The point is 
what it would cost to be a little more accurate, and as my patch showed, 
not much. In the end we would have a more reliable scheduler, one that 
works well not only in the common cases.

Anyway, as I said already earlier, with the step to an absolute virtual 
time the biggest error source is gone, so in a way you also proved my 
point that it's worth it, even if you don't want to admit it.

bye, Roman
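
[Editor's sketch, not part of the thread.] Roman's point that time beyond a
limit "is simply lost" can be pictured with a clamped accumulator. The code
below is only an illustration under assumed names (add_clamped(), struct
clamp_result, a symmetric limit); it is not taken from CFS or from Roman's
patch. It shows how clamping a per-task time balance silently discards the
excess unless that excess is tracked separately.

```c
#include <stdint.h>

/* Result of one clamped update: the new balance and what was discarded. */
struct clamp_result {
    int64_t value;  /* balance after clamping to [-limit, +limit] */
    int64_t lost;   /* signed amount that did not fit inside the limit */
};

/*
 * Hypothetical helper: add delta to a time balance and clamp it.
 * Whatever exceeds the limit is reported in 'lost'; a scheduler that
 * clamps without recording this amount drops that time for good.
 */
static struct clamp_result add_clamped(int64_t value, int64_t delta,
                                       int64_t limit)
{
    struct clamp_result r = { value + delta, 0 };

    if (r.value > limit) {
        r.lost = r.value - limit;   /* overflow beyond +limit */
        r.value = limit;
    } else if (r.value < -limit) {
        r.lost = r.value + limit;   /* underflow beyond -limit */
        r.value = -limit;
    }
    return r;
}
```

For instance, add_clamped(15, 10, 20) returns a balance of 20 with 5 units
lost, while add_clamped(5, 10, 20) stays inside the limit and loses nothing.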


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:44 +0200, Peter Zijlstra wrote:

> > If you look at the math, you'll see that I took the overflow into account, 
> > I even expected it. If you see this effect in my implementation, it would 
> > be a bug.
> 
> Ah, ok, I shall look to your patches in more detail, it was not obvious
> from the formulae you posted.

You indeed outlined it in your email, I must have forgotten it.
I have the attention span of a goldfish these days :-/



Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm <[EMAIL PROTECTED]> wrote:

> and here's something a bit more intrusive.
> 
> The initial idea was to completely get rid of 'se->fair_key'. It's 
> always equal to 'se->vruntime' for all runnable tasks but the 
> 'current'. The exact key within the tree for the 'current' has to be 
> known in order for __enqueue_entity() to work properly (if we just use 
> 'vruntime', we may go a wrong way down the tree while looking for the 
> correct position for a new element). Sure, it's possible to cache the 
> current's key in the 'cfs_rq' and add a few additional checks, but 
> that's not very nice... so what if we don't keep the 'current' within 
> the tree? :-)
> 
> The illustration is below. Some bits can be missed so far but a 
> patched kernel boots/works (haven't done real regression tests yet... 
> can say that the mail client is still working at this very moment :-).
> 
> There are 2 benefits:
> 
> (1) no more 'fair_key' ;
> (2) entity_tick() is simpler/more effective : 'update_curr()' now vs.
> 'dequeue_entity() + enqueue_entity()' before.

cool patch - i like it! It removes some code as well, besides shrinking 
struct task_struct with another 64-bit variable - so it's a nice 
speedup:

   text    data     bss     dec     hex filename
  34467    3466      24   37957    9445 sched.o.before
  34414    3466      24   37904    9410 sched.o.after

i've applied it to the end of the queue - it depends on whether 
->vruntime works out well.

Ingo
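
[Editor's sketch, not part of the thread.] The idea quoted above, keeping
'current' out of the ordered structure so that a tick only has to update its
vruntime, can be illustrated with a toy runqueue. This sketch substitutes a
sorted singly linked list for the kernel's red-black tree, and every name in
it (struct entity, struct runqueue, enqueue(), pick_next(), tick(),
put_prev()) is hypothetical rather than the actual patch.

```c
#include <stddef.h>
#include <stdint.h>

struct entity {
    uint64_t vruntime;      /* the only sort key; no separate fair_key */
    struct entity *next;    /* link in the sorted list of waiting entities */
};

struct runqueue {
    struct entity *head;    /* waiting entities, ascending vruntime */
    struct entity *curr;    /* running entity, deliberately NOT in the list */
};

/* Insert into the sorted list, ordered by vruntime. */
static void enqueue(struct runqueue *rq, struct entity *se)
{
    struct entity **link = &rq->head;

    while (*link && (*link)->vruntime <= se->vruntime)
        link = &(*link)->next;
    se->next = *link;
    *link = se;
}

/* Pick the leftmost entity and remove it from the list while it runs. */
static struct entity *pick_next(struct runqueue *rq)
{
    struct entity *se = rq->head;

    if (se) {
        rq->head = se->next;
        se->next = NULL;
    }
    rq->curr = se;
    return se;
}

/* Tick: only the running entity's vruntime changes; no dequeue/enqueue. */
static void tick(struct runqueue *rq, uint64_t delta)
{
    if (rq->curr)
        rq->curr->vruntime += delta;
}

/* Re-insert the running entity when it stops running. */
static void put_prev(struct runqueue *rq)
{
    if (rq->curr) {
        enqueue(rq, rq->curr);
        rq->curr = NULL;
    }
}
```

Because the running entity is outside the list, its key can change on every
tick without invalidating the ordering of the waiting entities. It is sorted
back in only once, in put_prev().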


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm <[EMAIL PROTECTED]> wrote:

> Better placement of #ifdef CONFIG_SCHEDSTAT block in dequeue_entity().

thanks, applied.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm <[EMAIL PROTECTED]> wrote:

> unless we are very eager to keep an additional layer of abstraction, 
> 'struct load_stat' is redundant now so let's get rid of it.

yeah - good one, it's indeed redundant. Applied.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Kyle Moffett

On Sep 13, 2007, at 21:47:25, Rob Hussey wrote:

> On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > are you sure this is happening with the latest iteration of the
> > patch too? (with the combo-3.patch?) You can pick it up from here:
> >
> >    http://people.redhat.com/mingo/cfs-scheduler/devel/sched-cfs-v2.6.23-rc6-v21-combo-3.patch
>
> I managed to work it all out (it was my fault after all), and I've now
> made the changes you suggested to my .configs for 2.6.23-rc1 and
> 2.6.23-rc6. I've done the benchmarks all over, including tests with
> the task bound to a single core. Without further ado, the numbers I
> promised:
>
> [...]
>
> I've made graphs like last time:
> http://www.healthcarelinen.com/misc/lat_ctx_benchmark.png
> http://www.healthcarelinen.com/misc/hackbench_benchmark.png
> http://www.healthcarelinen.com/misc/pipe-test_benchmark.png
> http://www.healthcarelinen.com/misc/BOUND_lat_ctx_benchmark.png
> http://www.healthcarelinen.com/misc/BOUND_hackbench_benchmark.png
> http://www.healthcarelinen.com/misc/BOUND_pipe-test_benchmark.png


Well, looking at these graphs (and the fixed one from your second
email), it sure looks a lot like CFS is doing at *least* as well as
the old scheduler in every single test, and doing much better in most
of them (in addition it's much more consistent between runs). This
seems to jibe with all the other benchmarks and overall empirical
testing that everyone has been doing. Overall I have to say a job
well done for Ingo, Peter, Con, and all the other major contributors
to this impressive endeavor.


Cheers,
Kyle Moffett



Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Kyle Moffett

On Sep 13, 2007, at 21:47:25, Rob Hussey wrote:

On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
are you sure this is happening with the latest iteration of the  
patch too? (with the combo-3.patch?) You can pick it up from here:


   http://people.redhat.com/mingo/cfs-scheduler/devel/sched-cfs- 
v2.6.23-rc6-v21-combo-3.patch


I managed to work it all out (it was my fault after all), and I've now
made the changes you suggested to my .configs for 2.6.23-rc1 and
2.6.23-rc6. I've done the benchmarks all over, including tests with
the task bound to a single core. Without further ado, the numbers I
promised:

[...]

I've made graphs like last time:
http://www.healthcarelinen.com/misc/lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/hackbench_benchmark.png
http://www.healthcarelinen.com/misc/pipe-test_benchmark.png
http://www.healthcarelinen.com/misc/BOUND_lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/BOUND_hackbench_benchmark.png
http://www.healthcarelinen.com/misc/BOUND_pipe-test_benchmark.png


Well looking at these graphs (and the fixed one from your second  
email), it sure looks a lot like CFS is doing at *least* as well as  
the old scheduler in every single test, and doing much better in most  
of them (in addition it's much more consistent between runs).  This  
seems to jive with all the other benchmarks and overall empirical  
testing that everyone has been doing.  Overall I have to say a job  
well done for Ingo, Peter, Con, and all the other major contributors  
to this impressive endeavor.


Cheers,
Kyle Moffett

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm [EMAIL PROTECTED] wrote:

 and here's something a bit more intrusive.
 
 The initial idea was to completely get rid of 'se-fair_key'. It's 
 always equal to 'se-vruntime' for all runnable tasks but the 
 'current'. The exact key within the tree for the 'current' has to be 
 known in order for __enqueue_entity() to work properly (if we just use 
 'vruntime', we may go a wrong way down the tree while looking for the 
 correct position for a new element). Sure, it's possible to cache the 
 current's key in the 'cfs_rq' and add a few additional checks, but 
 that's not very nice... so what if we don't keep the 'current' within 
 the tree? :-)
 
 The illustration is below. Some bits can be missed so far but a 
 patched kernel boots/works (haven't done real regression tests yet... 
 can say that the mail client is still working at this very moment :-).
 
 There are 2 benefits:
 
 (1) no more 'fair_key' ;
 (2) entity_tick() is simpler/more effective : 'update_curr()' now vs.
 'dequeue_entity() + enqueue_entity()' before.

cool patch - i like it! It removes some code as well, besides shrinking 
struct task_struct with another 64-bit variable - so it's a nice 
speedup:

   textdata bss dec hex filename
  344673466  24   379579445 sched.o.before
  344143466  24   379049410 sched.o.after

i've applied it to the end of the queue - it depends on whether 
-vruntime works out well.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm [EMAIL PROTECTED] wrote:

 Better placement of #ifdef CONFIG_SCHEDSTAT block in dequeue_entity().

thanks, applied.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Ingo Molnar

* dimm [EMAIL PROTECTED] wrote:

 unless we are very eager to keep an additional layer of abstraction, 
 'struct load_stat' is redundant now so let's get rid of it.

yeah - good one, it's indeed redundant. Applied.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:44 +0200, Peter Zijlstra wrote:

  If you look at the math, you'll see that I took the overflow into account, 
  I even expected it. If you see this effect in my implementation, it would 
  be a bug.
 
 Ah, ok, I shall look to your patches in more detail, it was not obvious
 from the formulae you posted.

You indeed outlined it in your email, I must have forgotten it.
I have the attention span of a goldfish these days :-/

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

  The rest of the math is indeed different - it's simply missing. What 
  is there is IMO not really adequate. I guess you will see the 
  differences, once you test a bit more with different nice levels.
 
 Roman, i disagree strongly. I did test with different nice levels. Here 
 are some hard numbers: the CPU usage table of 40 busy loops started at 
 once, all running at a different nice level, from nice -20 to nice +19:

Ingo, you should have read the rest of the paragraph too, I said it's 
needed for a good task placement, I didn't say anything about time 
distribution.
Try to start a few niced busy loops and then try some interactivity tests.
You should also increase the granularity, the rather small time slices can
cover up a lot of bad scheduling decisions.

 In the announcement of your Really Fair Scheduler patch you used the 
 following very strong statement:
 
  This model is far more accurate than CFS is [...]
 
 http://lkml.org/lkml/2007/8/30/307
 
 but when i stressed you for actual real-world proof of CFS misbehavior, 

You're forgetting that only a few days before that announcement, the worst 
issues had been fixed, which at that time I hadn't taken into account yet.

 you said:
 
 [...] they have indeed little effect in the short term, [...] 
 
 http://lkml.org/lkml/2007/9/2/282
 
 so how can CFS be far less accurate (paraphrased) while it has little 
 effect in the short term?
 
 so to repeat my question: my (and Peter's) claim is that there is no 
 real-world significance of much of the complexity you added to avoid 
 rounding effects. You do disagree with that, so our follow-up question 
 is: what actual real-world significance does it have in your opinion? 
 What is the worst-case effect? Do we even care? We have measured it 
 every which way and it just does not matter. (but we could easily be 
 wrong, so please be specific if you know about something that we 
 overlooked.) Thanks,

Did you read the rest of mail? I said a little bit more than that, which 
actually explains this already in large parts.
(BTW this mail also has one example where I almost begged you to explain 
me some of the CFS features in response to your splitup request - no 
response.)

Accuracy is an important aspect, but it's not really the primary goal. 
As I said I wanted a correct mathematical model of CFS, but due to the 
complexity of CFS (of which a lot has been removed now in CFS-devel) it 
was rather difficult to produce such a model.
Producing an accurate model is meant as a _tool_ for further 
transformations, e.g. to analyze where further simplifications are 
possible and where the 64bit math can be replaced with something simpler 
without reducing scheduling quality significantly.
The added accuracy of course increases the complexity, but compared to the 
already existing complexity it was still less (at least according to the 
lmbench numbers), so IMO it's worth it. The advantage is that I didn't have 
to worry about any effects of unexpected rounding errors. This scheduler 
has to work with a wide range of clock implementations and AFAICT it's 
impossible to guarantee that it works in every situation; it may not 
break down completely, but I couldn't exclude unexplainable anomalies, 
especially after seeing the problems in the early CFS version, which got 
merged.
As I also mentioned this is only part of the problem (but to which early 
CFS version significantly contributed). The main problem was the limits: 
once the limits are exceeded, that overflow/underflow time is simply lost 
and that is what finally resulted in the misbehaviour. The rounding 
problems were one possible cause but not the only one. Other possibilities 
would require more complex scheduling pattern, where de-/enqueuing of 
tasks would push some tasks into these limits. Prime suspect here was the 
sleeper bonus and the question was: is it possible to accumulate the 
bonus, is it possible to force the punishment onto specific tasks.
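
The point about limits can be illustrated with a small stand-alone sketch.
This is not the CFS code; the struct, the function name and the symmetric
limit are assumptions made only for illustration. It shows how clamping an
accumulated value to a limit silently discards whatever exceeds that limit:

```c
#include <stdint.h>

/* Hypothetical per-task accounting value, clamped to [-limit, +limit]. */
struct credit {
    int64_t value;  /* accumulated over-/under-service, in ns */
    int64_t lost;   /* time thrown away by clamping, kept only for illustration */
};

/* Add delta to the credit and clamp it; anything beyond the limit is lost. */
static void credit_add(struct credit *c, int64_t delta, int64_t limit)
{
    int64_t v = c->value + delta;

    if (v > limit) {
        c->lost += v - limit;       /* overflow beyond +limit is discarded */
        v = limit;
    } else if (v < -limit) {
        c->lost += -limit - v;      /* underflow beyond -limit is discarded */
        v = -limit;
    }
    c->value = v;
}
```

With a limit of 10, adding 15 to an empty credit leaves a value of 10 and
5 units unaccounted for, which is the kind of lost time described above.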

The complexity of CFS now makes it hard to quantify the problem. It's easy
to say that it will work in most cases, but e.g. the rounding fixes 
changed the common case more than the worst case. The point is 
what it would cost to be a little more accurate and, as proved with my patch, 
not much, but in the end we would have a more reliable scheduler that 
works well not only in the common cases.

Anyway, as I said already earlier, with the step to an absolute virtual 
time the biggest error source is gone, so in a way you also proved my 
point that it's worth it, even if you don't want to admit it.

bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Peter Zijlstra wrote:

> On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:
> 
> > I never claimed to understand every detail of CFS, I can _guess_ what 
> > _might_ have been intended, but from that it's impossible to know for 
> > certain how important they are. Let's take this patch fragment:
> > 
> 
> >  delta_fair = se->delta_fair_sleep;
> 
> we slept that much
> 
> > -   /*
> > -    * Fix up delta_fair with the effect of us running
> > -    * during the whole sleep period:
> > -    */
> > -   if (sched_feat(SLEEPER_AVG))
> > -           delta_fair = div64_likely32((u64)delta_fair * load,
> > -                                       load + se->load.weight);
> 
> if we would have ran we would not have been removed from the rq and the
> weight would have been: rq_weight + weight
> 
> so compensate for us having been removed from the rq by scaling the
> delta with: rq_weight/(rq_weight + weight)
> 
> > -   delta_fair = calc_weighted(delta_fair, se);
> 
> scale for nice levels

AFAICT the compensation part is already done by the scaling part, without 
the load part it largely mirrors what __update_stats_wait_end() does, i.e. 
it gets the same time as other tasks, which have been on the rq.
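
The compensation step discussed above, scaling by rq_weight/(rq_weight +
weight), can be worked through in isolation. The helper below is a
hypothetical stand-alone sketch, not the kernel function; its name and the
zero-weight guard are assumptions for illustration only:

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale a sleep delta by rq_weight / (rq_weight + weight),
 * the factor named in the quoted explanation. The inputs are assumed small
 * enough that the multiplication does not overflow 64 bits.
 */
static uint64_t scale_sleep_delta(uint64_t delta_fair, uint64_t rq_weight,
                                  uint64_t weight)
{
    uint64_t total = rq_weight + weight;

    if (total == 0)
        return 0;   /* no weight at all: avoid dividing by zero */
    return delta_fair * rq_weight / total;
}
```

For example, with a runqueue weight of 2048 and an own weight of 1024, a
delta of 3000 is scaled by 2048/3072 to 2000.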

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Sam Ravnborg wrote:

> I have read the announcement from Ingo and after reading it I concluded
> that it was good to see that Ingo had taken in consideration the feedback
> from you and improved the scheduler based on this.
> And when I read that he removed a lot of stuff I smiled. This reminded
> me of countless monkey aka code review sessions where I repeatedly do
> like my children and asks why so many times that the author realize that
> something is not needed or no longer used.
> 
> 
> The above were my impression after reading the announcement with
> respect to your influence and that goes far beyond "two cleanups".
> I bet many others read it roughly like I did.

Sam, in a way you actually prove my point. Thanks. :)
The primary thing you remember here from the announcements is the cleanup 
and tuning part, during which he picked up some small parts from my patch.

That's what I was afraid of, most people won't realize what was added in 
this process and even if they notice it, Ingo describes it in a somewhat 
similar, but actually different way. That part is pretty important to me, 
but Ingo treats it more as a minor matter.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Willy Tarreau wrote:

> On Aug 10th, I was disappointed to see that you still had not provided
> the critical information that Ingo had been asking to you for 9 days
> (cfs-sched-debug output). Your motivations in this work started to
> become a bit fuzzy to me, since people who behave like this generally
> do so to get all the lights on them and you really don't need this.
> 
> Your explanation was kind of "show me yours and only then I'll show
> you mine". Pretty childish but you finally sent that long-requested
> information.

Well, I admit it was a rather fruitless attempt to get some information out 
of Ingo, but I only did it _once_.
The problem is that the flow of information hasn't improved since; later 
I actually answered his questions, but my information requests still go to 
/dev/null.

> I'm now fairly convinced that you're not seeking credits either. There
> are more credits to your name per line of patch here than there is in
> your own code in the kernel. That complaint does not stand by itself.
> 
> In fact, I'm beginning to think that you're like a cat who has found a mouse.
> Why kill it if you can play with it ? Each of your "will I get a response"
> are just like a small kick in the mouse's back to make it move. But by dint
> of doing this, you're slowly pushing the mouse to the door where it risks
> to escape from you, and you're losing your toy.

Getting credit is indeed not really that important to me, but apparently 
some lousy credit notes is the only way to get any kind of 
acknowledgement.
I want to get more attention, but not in a way you suspect. I don't think 
that a mouse is a really good analogy, is he really that defenseless?
All I want is to be taken a bit more seriously, the communication aspect I 
mentioned is really important. From my perspective Ingo is somewhere up on 
his pedestal and I have to scream to get any kind of attention.

I skip the rest of the mail, it's one big attempt trying to prove that I'm 
dishonest, but you only look at the issue from one side and thus making it 
yourself very easy.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Arjan van de Ven
On Thu, 13 Sep 2007 18:50:12 +0200 (CEST)
Roman Zippel <[EMAIL PROTECTED]> wrote:
> > You never directly replied to these pretty explicit requests, all
> > you did was this side remark 5 days later in one of your patch 
> > announcements:
> 
> This is ridiculous, I asked you multiple times to explain to me some
> of the differences relative to CFS as response to the splitup
> requests. Not once did you react, you didn't even ask what I'd like
> to know specifically.

Roman,

this is... a strange comment. It almost sounds like you were holding
the splitup hostage depending on some other thing happening that's
not a good attitude in my book. Having big-blob patches that do many
things at the same time leads to them being impossible to apply. Linux
works by having smaller incrementals. You know that; you've been around
for a long time.

Complaining that someone finally did splitup work after you refused,
and even puts credit in for you... that's beyond my comprehension.
Sorry.



Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Arjan van de Ven wrote:

> > This is ridiculous, I asked you multiple times to explain to me some
> > of the differences relative to CFS as response to the splitup
> > requests. Not once did you react, you didn't even ask what I'd like
> > to know specifically.
> 
> Roman,
> 
> this is... a strange comment. It almost sounds like you were holding
> the splitup hostage depending on some other thing happening that's
> not a good attitude in my book. Having big-blob patches that do many
> things at the same time leads to them being impossible to apply. Linux
> works by having smaller incrementals. You know that; you've been around
> for a long time.

There is actually a very simple reason for that, the actual patch is not
my primary focus, for me it's actually more an afterthought of the actual 
design to show that it actually works.
My primary interest is a _discussion_ of the scheduler design, but Ingo 
insists on patches. Sorry, but I don't really work this way, I want to 
think things through _first_, I need a solid concept and I don't like to 
rely on guesswork.

How much response would I have gotten if I had only posted the example 
program and the math description as I initially planned?

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Arjan van de Ven
On Fri, 14 Sep 2007 16:50:22 +0200 (CEST)
Roman Zippel <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> On Fri, 14 Sep 2007, Arjan van de Ven wrote:
> 
> > > This is ridiculous, I asked you multiple times to explain to me
> > > some of the differences relative to CFS as response to the splitup
> > > requests. Not once did you react, you didn't even ask what I'd
> > > like to know specifically.
> > 
> > Roman,
> > 
> > this is... a strange comment. It almost sounds like you were holding
> > the splitup hostage depending on some other thing happening
> > that's not a good attitude in my book. Having big-blob patches that
> > do many things at the same time leads to them being impossible to
> > apply. Linux works by having smaller incrementals. You know that;
> > you've been around for a long time.
> 
> There is actually a very simple reason for that, the actual patch is
> not my primary focus, 


for someone who's not focused on patches/code,  you make quite a bit of
noise when someone does turn your discussion into smaller patches and
only credits you three times.


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Roman Zippel
Hi,

On Fri, 14 Sep 2007, Arjan van de Ven wrote:

> > There is actually a very simple reason for that, the actual patch is
> > not my primary focus, 
> 
> 
> for someone who's not focused on patches/code,  you make quite a bit of
> noise when someone does turn your discussion into smaller patches and
> only credits you three times.

As I said before, it's not really the lack of credit, it's the lack of 
discussion.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-14 Thread Willy Tarreau
On Fri, Sep 14, 2007 at 03:10:47PM +0200, Roman Zippel wrote:
> Hi,
> Getting credit is indeed not really that important to me, but apparently 
> some lousy credit notes is the only way to get any kind of 
> acknowledgement.

Acknowledgement has always been one of the kernel's weaknesses it seems,
given the recent issues on other subjects. But it's not always easy either,
especially when you just change sparse parts of code based on someone else's
analysis. I personally do credit people in the GIT changelogs for their ideas
or patches, but that does not appear in the code.

> I want to get more attention, but not in a way you suspect. I don't think 
> that a mouse is a really good analogy, is he really that defenseless?

I don't know. But I observe that you're very efficient at building the road
you want him to walk on.

> All I want is to be taken a bit more seriously, the communication aspect I 
> mentioned is really important. From my perspective Ingo is somewhere up on 
> his pedestal and I have to scream to get any kind of attention.

In my opinion, you're screaming in a language he does not understand, and
when he proposes random responses, you don't understand them either. That
game can last very long. You want to speak maths, he cannot. He wants to
speak patches, you cannot. I'm not saying one is better than the other, but
I know for sure that the common language here on LKML is patches. So my
conclusion is that you need someone to act as a translator when you want
to communicate here. It should be a very hard work, BTW!

> I skip the rest of the mail, it's one big attempt trying to prove that I'm 
> dishonest, but you only look at the issue from one side and thus making it 
> yourself very easy.

Maybe there was a very prominent side then. I might be wrong in my analysis,
but I cannot find any other interpretation, there are too many coincidences,
and I don't believe in that, especially from smart people ;-)

Willy



Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> Bound to single core:
...
> hackbench 50
> #  rc1   rc6   cfs-devel
> 1  7.528 7.950 7.538
> 2  7.649 8.026 7.548
> 3  7.613 8.160 7.580
> 4  7.550 8.054 7.558
> 5  7.563 8.373 7.559
> 6  7.617 8.152 7.550
> 7  7.593 7.831 7.562
> 8  7.602 8.311 7.588
> 9  7.589 8.010 7.552
> 10 7.682 8.059 7.556
>

I knew there was no way I'd post all these numbers and not screw
something up. Switch rc6 and rc1 for hackbench 50 (bound to single
core). Updated graph:
http://www.healthcarelinen.com/misc/BOUND_hackbench_benchmark_fixed.png

Also attached.

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread dimm

and here's something a bit more intrusive.

The initial idea was to completely get rid of 'se->fair_key'. It's always equal 
to 'se->vruntime' for
all runnable tasks but the 'current'. The exact key within the tree for the 
'current' has to be known in
order for __enqueue_entity() to work properly (if we just use 'vruntime', we 
may go a wrong way down the tree
while looking for the correct position for a new element).
Sure, it's possible to cache the current's key in the 'cfs_rq' and add a few 
additional checks, but that's
not very nice... so what if we don't keep the 'current' within the tree? :-)

The illustration is below. Some bits can be missed so far but a patched kernel 
boots/works
(haven't done real regression tests yet... can say that the mail client is 
still working
at this very moment :-).

There are 2 benefits:

(1) no more 'fair_key' ;
(2) entity_tick() is simpler/more effective : 'update_curr()' now vs.
'dequeue_entity() + enqueue_entity()' before.

anyway, consider it as mainly an illustration of idea so far.
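
The idea of keeping 'current' out of the tree can be sketched with a toy
run queue. This is only an illustration under stated assumptions: a sorted
singly linked list stands in for the rb-tree, all names are hypothetical,
and the plain '<=' comparison ignores vruntime wraparound:

```c
#include <stddef.h>
#include <stdint.h>

struct entity {
    uint64_t vruntime;
    struct entity *next;    /* sorted list stands in for the rb-tree */
};

struct runqueue {
    struct entity *queued;  /* waiting entities, ordered by vruntime */
    struct entity *curr;    /* running entity, NOT linked into 'queued' */
};

/* Insert an entity at its sorted position among the waiting entities. */
static void rq_insert(struct runqueue *rq, struct entity *se)
{
    struct entity **link = &rq->queued;

    while (*link && (*link)->vruntime <= se->vruntime)
        link = &(*link)->next;
    se->next = *link;
    *link = se;
}

/* Pick the leftmost entity and take it out of the queue while it runs. */
static struct entity *rq_pick_next(struct runqueue *rq)
{
    struct entity *se = rq->queued;

    if (se) {
        rq->queued = se->next;
        se->next = NULL;
    }
    rq->curr = se;
    return se;
}

/* Tick: only the running entity's vruntime moves; no dequeue + enqueue. */
static void rq_tick(struct runqueue *rq, uint64_t delta)
{
    if (rq->curr)
        rq->curr->vruntime += delta;
}

/* On preemption the running entity goes back to its sorted position. */
static void rq_put_prev(struct runqueue *rq)
{
    if (rq->curr) {
        rq_insert(rq, rq->curr);
        rq->curr = NULL;
    }
}
```

Because the running entity is not in the ordered structure, its key may
change on every tick without the structure ever becoming inconsistent,
which is why no separate cached key is needed in this sketch.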

---
diff -upr linux-2.6.23-rc6/include/linux/sched.h 
linux-2.6.23-rc6-my/include/linux/sched.h
--- linux-2.6.23-rc6/include/linux/sched.h  2007-09-13 21:38:49.0 
+0200
+++ linux-2.6.23-rc6-my/include/linux/sched.h   2007-09-13 23:01:21.0 
+0200
@@ -890,7 +890,6 @@ struct load_weight {
  * 6 se->load.weight
  */
 struct sched_entity {
-   s64 fair_key;
struct load_weight  load;   /* for load-balancing */
struct rb_node  run_node;
unsigned inton_rq;
diff -upr linux-2.6.23-rc6/kernel/sched.c linux-2.6.23-rc6-my/kernel/sched.c
--- linux-2.6.23-rc6/kernel/sched.c 2007-09-13 21:52:13.0 +0200
+++ linux-2.6.23-rc6-my/kernel/sched.c  2007-09-13 23:00:19.0 +0200
@@ -6534,7 +6534,6 @@ void normalize_rt_tasks(void)
 
read_lock_irq(&tasklist_lock);
do_each_thread(g, p) {
-   p->se.fair_key  = 0;
p->se.exec_start= 0;
 #ifdef CONFIG_SCHEDSTATS
p->se.wait_start= 0;
diff -upr linux-2.6.23-rc6/kernel/sched_debug.c 
linux-2.6.23-rc6-my/kernel/sched_debug.c
--- linux-2.6.23-rc6/kernel/sched_debug.c   2007-09-13 21:52:13.0 
+0200
+++ linux-2.6.23-rc6-my/kernel/sched_debug.c2007-09-13 23:00:50.0 
+0200
@@ -38,7 +38,7 @@ print_task(struct seq_file *m, struct rq
 
SEQ_printf(m, "%15s %5d %15Ld %13Ld %5d ",
p->comm, p->pid,
-   (long long)p->se.fair_key,
+   (long long)p->se.vruntime,
(long long)(p->nvcsw + p->nivcsw),
p->prio);
 #ifdef CONFIG_SCHEDSTATS
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c 
linux-2.6.23-rc6-my/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c2007-09-13 21:52:13.0 
+0200
+++ linux-2.6.23-rc6-my/kernel/sched_fair.c 2007-09-13 23:48:02.0 
+0200
@@ -125,7 +125,7 @@ set_leftmost(struct cfs_rq *cfs_rq, stru
 
 s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-   return se->fair_key - cfs_rq->min_vruntime;
+   return se->vruntime - cfs_rq->min_vruntime;
 }
 
 /*
@@ -167,9 +167,6 @@ __enqueue_entity(struct cfs_rq *cfs_rq, 
 
rb_link_node(&se->run_node, parent, link);
rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
-   update_load_add(&cfs_rq->load, se->load.weight);
-   cfs_rq->nr_running++;
-   se->on_rq = 1;
 }
 
 static void
@@ -179,9 +176,6 @@ __dequeue_entity(struct cfs_rq *cfs_rq, 
set_leftmost(cfs_rq, rb_next(&se->run_node));
 
rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
-   update_load_sub(&cfs_rq->load, se->load.weight);
-   cfs_rq->nr_running--;
-   se->on_rq = 0;
 }
 
 static inline struct rb_node *first_fair(struct cfs_rq *cfs_rq)
@@ -320,10 +314,6 @@ static void update_stats_enqueue(struct 
 */
if (se != cfs_rq->curr)
update_stats_wait_start(cfs_rq, se);
-   /*
-* Update the key:
-*/
-   se->fair_key = se->vruntime;
 }
 
 static void
@@ -371,6 +361,22 @@ update_stats_curr_end(struct cfs_rq *cfs
  * Scheduling class queueing methods:
  */
 
+static void
+account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+   update_load_add(&cfs_rq->load, se->load.weight);
+   cfs_rq->nr_running++;
+   se->on_rq = 1;
+}
+
+static void
+account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+   update_load_sub(&cfs_rq->load, se->load.weight);
+   cfs_rq->nr_running--;
+   se->on_rq = 0;
+}
+
 static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 #ifdef CONFIG_SCHEDSTATS
@@ -446,7 +452,9 @@ enqueue_entity(struct cfs_rq *cfs_rq, st
}
 
update_stats_enqueue(cfs_rq, se);
-   __enqueue_entity(cfs_rq, se);
+   if (se != cfs_rq->curr)
+   __enqueue_entity(cfs_rq, se);
+   

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Willy Tarreau
Roman,

I've been trying to follow your mails about CFS since your review posted
on Aug 1st. Back to that date, I was thinking "cool, an in-depth review
by someone who understands schedulers and mathematics very well, we'll
quickly have a very solid design".

On Aug 10th, I was disappointed to see that you still had not provided
the critical information that Ingo had been asking to you for 9 days
(cfs-sched-debug output). Your motivations in this work started to
become a bit fuzzy to me, since people who behave like this generally
do so to get all the lights on them and you really don't need this.

Your explanation was kind of "show me yours and only then I'll show
you mine". Pretty childish but you finally sent that long-requested
information.

Since then, I've been noticing your now popular "will I get a response
to my questions" stuffed in most of your mails. That was getting very
suspicious from someone who can write down mathematics equations to
prove his design is right, especially considering the fact that your
"question" only relates to what a few lines were supposed to do. Nobody
believes that someone as smart as you is still blocked on the same
line of code after one month!

And if getting CFS fixed wasn't your real motivation...

On Thu, Sep 13, 2007 at 12:17:42AM +0200, Roman Zippel wrote:
> On Tue, 11 Sep 2007, Ingo Molnar wrote:
> 
> > fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to 
> > announce the latest iteration of the CFS scheduler development tree. Our 
> > main focus has been on simplifications and performance - and as part of 
> > that we've also picked up some ideas from Roman Zippel's 'Really Fair 
> > Scheduler' patch as well and integrated them into CFS. We'd like to ask 
> > people go give these patches a good workout, especially with an eye on 
> > any interactivity regressions.
> 
> I must really say, I'm quite impressed by your efforts to give me as 
> little credit as possible.
> On the one hand it's of course positive to see so much sudden activity, on 
> the other hand I'm not sure how much would have happened if I hadn't posted my 
> patch, I don't really think it was my complaints about CFS's complexity 
> that finally led to the improvements in this area.

I'm now fairly convinced that you're not seeking credits either. There
are more credits to your name per line of patch here than there is in
your own code in the kernel. That complaint does not stand by itself.

In fact, I'm beginning to think that you're like a cat who has found a mouse.
Why kill it if you can play with it ? Each of your "will I get a response"
are just like a small kick in the mouse's back to make it move. But by dint
of doing this, you're slowly pushing the mouse to the door where it risks
to escape from you, and you're losing your toy.

So right now, I'm sure you really do not want to get any code merged. It's
so much fun for you to say "hey, Ingo, respond to me" that you would lose
this ability would your code get merged.

> I presented the basic 
> concepts of my patch already with my first CFS review, but at that time 
> you didn't show any interest and instead you were rather quick to simply 
> dismiss it.

At that time, if my memory serves me, you were complaining about a fairness
problem you had with a few programs that you already took days to show the
sources. Proposing an alternate design with a bug report generally has no
chance to be considered because the developer mostly focuses on the bug
report. You should have spent time explaining how your design would work
*after* your problems were solved.

> My patch did not add that much new, it's mostly a conceptual 
> improvement and describes the math in more detail

- why those details were never explained in pure english when nobody could
  understand your maths, then ?

- if you have no problem reading code and translating it to concepts, without
  any comment around it, then how is it believable that you have a problem
  understanding 10 lines of code after 1 month ?

>, but it also demonstrated a number of improvements.

Very likely, which is why Ingo and Peter agreed to take parts of those
improvements. But do you realize that your lack of ability to communicate
on this list has probably delayed mainline integration of parts of your
work, because it was required to get a patch to try to understand your
intents ? It's not sci.math here, it's linux-kernel, the _development_
mailing list, where the raw material and common language between people
is the _code_. Some people do not have the skills required to code their
excellent ideas, but they can spend time explaining those to other people.

In your case, it was just a guess game. It does not work like this and
you know it. I really think that you deliberately slowed all the process
down in order to stay on the scene playing this game.

> > The sched-devel.git tree can be pulled from:
> > 
> >
> > 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread dimm

Hi,

please find a couple of minor cleanups below (on top of 
sched-cfs-v2.6.23-rc6-v21-combo-3.patch):


(1)

Better placement of #ifdef CONFIG_SCHEDSTAT block in dequeue_entity().

Signed-off-by: Dmitry Adamushko <[EMAIL PROTECTED]>

---
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c 
linux-2.6.23-rc6-my/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c2007-09-13 21:38:49.0 
+0200
+++ linux-2.6.23-rc6-my/kernel/sched_fair.c 2007-09-13 21:48:50.0 
+0200
@@ -453,8 +453,8 @@ static void
 dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep)
 {
update_stats_dequeue(cfs_rq, se);
-   if (sleep) {
 #ifdef CONFIG_SCHEDSTATS
+   if (sleep) {
if (entity_is_task(se)) {
struct task_struct *tsk = task_of(se);
 
@@ -463,8 +463,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
if (tsk->state & TASK_UNINTERRUPTIBLE)
se->block_start = rq_of(cfs_rq)->clock;
}
-#endif
}
+#endif
__dequeue_entity(cfs_rq, se);
 }
 
---


(2)

unless we are very eager to keep an additional layer of abstraction,
'struct load_stat' is redundant now so let's get rid of it.

Signed-off-by: Dmitry Adamushko <[EMAIL PROTECTED]>


---
diff -upr linux-2.6.23-rc6/kernel/sched.c 
linux-2.6.23-rc6-sched-dev/kernel/sched.c
--- linux-2.6.23-rc6/kernel/sched.c 2007-09-12 21:37:41.0 +0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched.c   2007-09-12 21:26:10.0 
+0200
@@ -170,10 +170,6 @@ struct rt_prio_array {
struct list_head queue[MAX_RT_PRIO];
 };
 
-struct load_stat {
-   struct load_weight load;
-};
-
 /* CFS-related fields in a runqueue */
 struct cfs_rq {
struct load_weight load;
@@ -232,7 +228,7 @@ struct rq {
 #ifdef CONFIG_NO_HZ
unsigned char in_nohz_recently;
 #endif
-   struct load_stat ls;/* capture load from *all* tasks on this cpu */
+   struct load_weight load;/* capture load from *all* tasks on 
this cpu */
unsigned long nr_load_updates;
u64 nr_switches;
 
@@ -804,7 +800,7 @@ static int balance_tasks(struct rq *this
  * Update delta_exec, delta_fair fields for rq.
  *
  * delta_fair clock advances at a rate inversely proportional to
- * total load (rq->ls.load.weight) on the runqueue, while
+ * total load (rq->load.weight) on the runqueue, while
  * delta_exec advances at the same rate as wall-clock (provided
  * cpu is not idle).
  *
@@ -812,17 +808,17 @@ static int balance_tasks(struct rq *this
  * runqueue over any given interval. This (smoothened) load is used
  * during load balance.
  *
- * This function is called /before/ updating rq->ls.load
+ * This function is called /before/ updating rq->load
  * and when switching tasks.
  */
 static inline void inc_load(struct rq *rq, const struct task_struct *p)
 {
-   update_load_add(&rq->ls.load, p->se.load.weight);
+   update_load_add(&rq->load, p->se.load.weight);
 }
 
 static inline void dec_load(struct rq *rq, const struct task_struct *p)
 {
-   update_load_sub(&rq->ls.load, p->se.load.weight);
+   update_load_sub(&rq->load, p->se.load.weight);
 }
 
 static void inc_nr_running(struct task_struct *p, struct rq *rq)
@@ -967,7 +963,7 @@ inline int task_curr(const struct task_s
 /* Used instead of source_load when we know the type == 0 */
 unsigned long weighted_cpuload(const int cpu)
 {
-   return cpu_rq(cpu)->ls.load.weight;
+   return cpu_rq(cpu)->load.weight;
 }
 
 static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
@@ -1933,7 +1929,7 @@ unsigned long nr_active(void)
  */
 static void update_cpu_load(struct rq *this_rq)
 {
-   unsigned long this_load = this_rq->ls.load.weight;
+   unsigned long this_load = this_rq->load.weight;
int i, scale;
 
this_rq->nr_load_updates++;
diff -upr linux-2.6.23-rc6/kernel/sched_debug.c 
linux-2.6.23-rc6-sched-dev/kernel/sched_debug.c
--- linux-2.6.23-rc6/kernel/sched_debug.c   2007-09-12 21:37:41.0 
+0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched_debug.c 2007-09-12 
21:36:04.0 +0200
@@ -137,7 +137,7 @@ static void print_cpu(struct seq_file *m
 
P(nr_running);
SEQ_printf(m, "  .%-30s: %lu\n", "load",
-  rq->ls.load.weight);
+  rq->load.weight);
P(nr_switches);
P(nr_load_updates);
P(nr_uninterruptible);
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c 
linux-2.6.23-rc6-sched-dev/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c2007-09-12 21:37:41.0 
+0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched_fair.c  2007-09-12 
21:35:27.0 +0200
@@ -499,7 +499,7 @@ set_next_entity(struct cfs_rq *cfs_rq, s
 * least twice that of our own weight (i.e. dont track it
 * when there are only lesser-weight tasks around):
 */
-   if (rq_of(cfs_rq)->ls.load.weight >= 2*se->load.weight) {

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:28 -0400, Kyle Moffett wrote:

>  with the exception of one patch that's missing a changelog entry.

Ah, that would have been one of mine.

---
From: Peter Zijlstra <[EMAIL PROTECTED]>

Handle vruntime overflow by centering the key space around min_vruntime.

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c |   15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index a306f05..b8e2a0d 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -116,11 +116,18 @@ set_leftmost(struct cfs_rq *cfs_rq, struct rb_node 
*leftmost)
cfs_rq->rb_leftmost = leftmost;
if (leftmost) {
se = rb_entry(leftmost, struct sched_entity, run_node);
-   cfs_rq->min_vruntime = max(se->vruntime,
-   cfs_rq->min_vruntime);
+   if ((se->vruntime > cfs_rq->min_vruntime) ||
+   (cfs_rq->min_vruntime > (1ULL << 61) &&
+se->vruntime < (1ULL << 50)))
+   cfs_rq->min_vruntime = se->vruntime;
}
 }
 
+s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+   return se->fair_key - cfs_rq->min_vruntime;
+}
+
 /*
  * Enqueue an entity into the rb-tree:
  */
@@ -130,7 +137,7 @@ __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity 
*se)
struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
struct rb_node *parent = NULL;
struct sched_entity *entry;
-   s64 key = se->fair_key;
+   s64 key = entity_key(cfs_rq, se);
int leftmost = 1;
 
/*
@@ -143,7 +150,7 @@ __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity 
*se)
 * We dont care about collisions. Nodes with
 * the same key stay together.
 */
-   if (key - entry->fair_key < 0) {
+   if (key < entity_key(cfs_rq, entry)) {
link = &parent->rb_left;
} else {
link = &parent->rb_right;
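
The idea behind "centering the key space around min_vruntime" can be tried
outside the kernel with a minimal sketch. The function names below are
hypothetical, the counters are modelled as unsigned 64-bit values, and the
cast assumes the usual two's complement conversion. Comparing offsets from
a common base keeps the ordering correct across a counter wrap, as long as
all live values stay within half the counter range of that base:

```c
#include <stdint.h>

/* Offset of a vruntime from the queue's minimum, as a signed value. */
static int64_t key_of(uint64_t vruntime, uint64_t min_vruntime)
{
    return (int64_t)(vruntime - min_vruntime);
}

/* Returns 1 if a sorts before b relative to min_vruntime, else 0. */
static int key_before(uint64_t a, uint64_t b, uint64_t min_vruntime)
{
    return key_of(a, min_vruntime) < key_of(b, min_vruntime);
}
```

With min_vruntime just below the wrap point, a value that has already
wrapped to a small number still sorts after one that has not, whereas a
raw comparison of the two values would order them the wrong way round.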




Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Sam Ravnborg
Hi Roman.

On Thu, Sep 13, 2007 at 02:35:35PM +0200, Roman Zippel wrote:
> Hi,
> 
> On Thu, 13 Sep 2007, Ingo Molnar wrote:
> 
> > > Out of curiosity: will I ever get answers to my questions?
> > 
> > the last few weeks/months have been pretty hectic - i get more than 50 
> > non-list emails a day so i could easily have missed some.
> 
> Well, let's just take the recent "Really Simple Really Fair Scheduler" 
> thread. You had the time to ask me questions about my scheduler, I even 
> explained to you how the sleeping bonus works in my model. At the end I 
> was sort of hoping you would start answering my questions and explaining 
> things how the same things work in CFS - but nothing.
> Then you had the time to reimplement the very things you've just asked me 
> about and what do I get credit for - "two cleanups from RFS".

I have read the announcement from Ingo and after reading it I concluded
that it was good to see that Ingo had taken into consideration the feedback
from you and improved the scheduler based on it.
And when I read that he removed a lot of stuff I smiled. This reminded
me of countless monkey aka code review sessions where I repeatedly do
like my children and ask why so many times that the author realizes that
something is not needed or no longer used.


The above was my impression after reading the announcement with
respect to your influence, and that goes far beyond "two cleanups".
I bet many others read it roughly like I did.

And no - I did not go back and re-read it. So do not answer
by quoting the announcement or stuff like this.
Because that will NOT change what my first impression was.

So keep up the review - we get a better scheduler this way.

Sam


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Kyle Moffett

On Sep 13, 2007, at 12:50:12, Roman Zippel wrote:

> On Thu, 13 Sep 2007, Ingo Molnar wrote:
>> And you are duly credited in 3 patches:
>
> This needs a little perspective, as I couldn't clone the repository
> (and you know that), all I had was this announcement, so using the
> patch descriptions now as defense is unfair by you.


How the hell is that unfair?  The fact that nobody could clone the  
repo for about 24 hours is *totally* *irrelevant* to the whole  
discussion as it's simply a matter of a technical glitch.  His point  
in referencing patch descriptions is to clear up matters of credit.   
Ingo has never in this discussion been "out to get you".  From the  
point of view of a sideline observer it's been *you* that has been  
demanding answers and refusing to answer questions directed at you.


The most brilliant mathematician in the world would have nothing to  
contribute to the Linux scheduler if he couldn't describe, code, and  
comment his algorithm in detail so that others (even code-monkeys  
like myself) could grok at least the basic outline and be able to  
give useful commentary and suggestions.



> In this announcement you make relatively few references to how this
> relates to my work.  Maybe someone else can show me how to read
> that announcement differently, but IMO the casual reader is likely
> to get the impression, that you only picked some minor cleanups
> from my patch, but it's rather unclear that you already
> reimplemented key aspects of my patch.


As a casual reader and reviewer I have yet to actually see you post  
readable/reviewable patches in this thread.  I was basically  
completely unable to follow the detailed math you go into (even with  
a math minor) due to your *complete* lack of comments.  The fact that  
you renamed files and didn't split up your patch made it useless for  
actual practical kernel development, its only value was as a  
comparison point.  I did however get the impression that Ingo got  
something significantly useful out of your code despite the problems,  
but I still haven't had time to read through his and Peter's patches  
in detail to understand exactly what it was.  From personal  
inspection of a fair percentage of the changes that Ingo and Peter  
committed, they certainly appear to be deleting a lot more code than  
they add.  More specifically they appear to describe in detail what  
they are deleting and why, with the exception of one patch that's  
missing a changelog entry.


So yeah, I get the impression that Ingo re-implemented some ideas
that you had because you refused to do so in a way that was
acceptable for the upstream kernel.  How exactly is this a bad
thing?  You came up with a great idea that worked and somebody else
did the ugly grunt work to get it ready to go upstream!  On the other
hand, given the "pleasant" attitude that you've shown Ingo during
this whole thing I doubt he'd be likely to do it again.



>> You never directly replied to these pretty explicit requests, all
>> you did was this side remark 5 days later in one of your patch
>> announcements:
>
> This is ridiculous, I asked you multiple times to explain to me
> some of the differences relative to CFS as response to the splitup
> requests. Not once did you react, you didn't even ask what I'd like
> to know specifically.


How exactly is Ingo supposed to explain to YOU the differences  
between his scheduler and your modified one?  Completely ignoring the  
fact that you merged all your changes into a single patch and didn't  
add a single comment, it's not *his* algorithm that I have trouble  
understanding.  From a relatively basic scan of the source-code and  
comments I was able to figure out how the algorithm works in general,  
enough to ask much more specific questions than yours.  If anything,  
Ingo should have been asking *you* how your scheduler differed from  
the one it was based on.



> I never claimed to understand every detail of CFS, I can _guess_
> what _might_ have been intended, but from that it's impossible to
> know for certain how important they are. Let's take this patch
> fragment:


Oh come on, you appear to be quite knowledgeable about CPU scheduling  
and the algorithms involved, surely as such you should have a much  
easier time with reading the comments and asking specific questions.   
For example, your below question specifically about the sleep  
averaging could have been answered in fifteen minutes had you  
actually *ASKED* that.  You'll notice that in fact Peter Zijlstra's  
email response did come almost exactly 15 minutes after you sent this  
email, and for a casual reader like me it seems perfectly  
sufficient;  it does depend on you asking specific questions instead  
of "how does it differ from my hundred-kbyte patch".


As for that specific patch, it's very clear that the affected logic  
is controlled by one of the sched-feature tweaking tools, so you  
could very easily experiment with it yourself to see what 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 19:06 +0200, Peter Zijlstra wrote:
> On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:
> 
> > I never claimed to understand every detail of CFS, I can _guess_ what 
> > _might_ have been intended, but from that it's impossible to know for 
> > certain how important they are. Let's take this patch fragment:
> > 
> 
>   delta_fair = se->delta_fair_sleep;
> 
> we slept that much
> 
> > -   /*
> > -* Fix up delta_fair with the effect of us running
> > -* during the whole sleep period:
> > -*/
> > -   if (sched_feat(SLEEPER_AVG))
> > -   delta_fair = div64_likely32((u64)delta_fair * load,
> > -   load + se->load.weight);
> 
> if we would have ran we would not have been removed from the rq and the
> weight would have been: rq_weight + weight
> 
> so compensate for us having been removed from the rq by scaling the
> delta with: rq_weight/(rq_weight + weight)
> 
> > -   delta_fair = calc_weighted(delta_fair, se);
> 
> scale for nice levels
> 

Or at least, I think that is how to read it :-)



Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:

> I never claimed to understand every detail of CFS, I can _guess_ what 
> _might_ have been intended, but from that it's impossible to know for 
> certain how important they are. Let's take this patch fragment:
> 

delta_fair = se->delta_fair_sleep;

we slept that much

> -   /*
> -* Fix up delta_fair with the effect of us running
> -* during the whole sleep period:
> -*/
> -   if (sched_feat(SLEEPER_AVG))
> -   delta_fair = div64_likely32((u64)delta_fair * load,
> -   load + se->load.weight);

if we would have ran we would not have been removed from the rq and the
weight would have been: rq_weight + weight

so compensate for us having been removed from the rq by scaling the
delta with: rq_weight/(rq_weight + weight)

> -   delta_fair = calc_weighted(delta_fair, se);

scale for nice levels
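
Peter's reading of the removed SLEEPER_AVG fragment comes down to a single scaling step, delta * rq_weight / (rq_weight + weight). Here is a small user-space sketch of that arithmetic. The function name scale_sleep_delta() is made up for illustration, plain 64-bit division stands in for the kernel's div64_likely32(), and the sketch ignores the possibility that the intermediate product overflows.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a sleep delta by rq_weight / (rq_weight + weight), i.e. by the
 * fraction of the total weight that the rest of the runqueue would
 * have had if the sleeper had stayed on it.
 *
 * Illustrative only: delta_fair * rq_weight is assumed to fit in
 * 64 bits.
 */
static uint64_t scale_sleep_delta(uint64_t delta_fair,
				  uint64_t rq_weight, uint64_t weight)
{
	assert(rq_weight + weight > 0);
	return delta_fair * rq_weight / (rq_weight + weight);
}
```

As an example with made-up numbers, a delta of 3000000 with rq_weight 2048 and weight 1024 is scaled by 2/3 to 2000000. With an otherwise empty runqueue (rq_weight 0) the result is 0.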






Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

> > Then you had the time to reimplement the very things you've just asked 
> > me about and what do I get credit for - "two cleanups from RFS".
> 
> i'm sorry to say this, but you must be reading some other email list and 
> a different git tree than what i am reading.
> 
> Firstly, about communications - in the past 3 months i've written you 40 
> emails regarding CFS - and that's more emails than my wife (or any 
> member of my family) got in that timeframe :-( I just ran a quick 
> script: i sent more CFS related emails to you than to any other person 
> on this planet. I bent over backwards trying to somehow get you to cooperate 
> with us (and i still haven't given up on that!) - instead of you 
> disparaging CFS and me frequently :-(
> 
> Secondly, i prominently credited you as early as in the second sentence 
> of our announcement:
> 
>  | fresh back from the Kernel Summit, Peter Zijlstra and me are pleased 
>  | to announce the latest iteration of the CFS scheduler development 
>  | tree. Our main focus has been on simplifications and performance - 
>  | and as part of that we've also picked up some ideas from Roman 
>  | Zippel's 'Really Fair Scheduler' patch as well and integrated them 
>  | into CFS. We'd like to ask people to give these patches a good 
>  | workout, especially with an eye on any interactivity regressions.
> 
>http://lkml.org/lkml/2007/9/11/395
> 
> And you are duly credited in 3 patches:

This needs a little perspective, as I couldn't clone the repository (and 
you know that), all I had was this announcement, so using the patch 
descriptions now as defense is unfair by you.
In this announcement you make relatively few references to how this relates 
to my work. Maybe someone else can show me how to read that announcement 
differently, but IMO the casual reader is likely to get the impression, 
that you only picked some minor cleanups from my patch, but it's rather 
unclear that you already reimplemented key aspects of my patch. Don't 
blame me for your own ambiguity.

>--->
> 
>Subject: sched: introduce se->vruntime
> 
>introduce se->vruntime as a sum of weighted delta-exec's, and use 
>that as the key into the tree.
> 
>the idea to use absolute virtual time as the basic metric of 
>scheduling has been first raised by William Lee Irwin, advanced by 
>Tong Li and first prototyped by Roman Zippel in the "Really Fair 
>Scheduler" (RFS) patchset.
> 
>also see:
> 
>   http://lkml.org/lkml/2007/9/2/76
> 
>for a simpler variant of this patch.

Let's compare this to the relevant part of the announcement:

| The ->vruntime metric is similar to the ->time_norm metric used by
| Roman's patch (and both are loosely related to the already existing
| sum_exec_runtime metric in CFS), it's in essence the sum of CPU time
| executed by a task, in nanoseconds - weighted up or down by their nice
| level (or kept the same on the default nice 0 level). Besides this basic
| metric our implementation and math differs from RFS.

In the patch you are more explicit about the virtual time aspect, in the 
announcement you're less clear that it's all based on the same idea and 
somehow it's important to stress the point that "implementation and math 
differs", which is not untrue, but you forget to mention that the 
differences are rather small.

> You never directly replied to these pretty explicit requests, all you 
> did was this side remark 5 days later in one of your patch 
> announcements:

This is ridiculous, I asked you multiple times to explain to me some of 
the differences relative to CFS as response to the splitup requests. Not 
once did you react, you didn't even ask what I'd like to know 
specifically.

> 
>" For a split version I'm still waiting for some more explanation
>  about the CFS tuning parameter. "
> 
>  http://lkml.org/lkml/2007/9/7/87
> 
> You are an experienced kernel hacker. How can you credibly claim that 
> while you were capable of writing a new scheduler along with a series of 
> 25 complex mathematical equations that few if any lkml readers are able 
> to understand (and which scheduler came in one intermixed patch that 
> added no new comments at all!), and that you are able to maintain the 
> m68k Linux architecture code, but that at the same time some supposed 
> missing explanation from _me_ makes you magically incapable of splitting up 
> _your own fine code_? This is really beyond me.

I never claimed to understand every detail of CFS, I can _guess_ what 
_might_ have been intended, but from that it's impossible to know for 
certain how important they are. Let's take this patch fragment:

-   /*
-* Fix up delta_fair with the effect of us running
-* during the whole sleep period:
-*/
-   if (sched_feat(SLEEPER_AVG))
-   delta_fair = div64_likely32((u64)delta_fair * load,
-   

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Roman Zippel <[EMAIL PROTECTED]> wrote:

> Then you had the time to reimplement the very things you've just asked 
> me about and what do I get credit for - "two cleanups from RFS".

i'm sorry to say this, but you must be reading some other email list and 
a different git tree than what i am reading.

Firstly, about communications - in the past 3 months i've written you 40 
emails regarding CFS - and that's more emails than my wife (or any 
member of my family) got in that timeframe :-( I just ran a quick 
script: i sent more CFS related emails to you than to any other person 
on this planet. I bent over backwards trying to somehow get you to cooperate 
with us (and i still haven't given up on that!) - instead of you 
disparaging CFS and me frequently :-(

Secondly, i prominently credited you as early as in the second sentence 
of our announcement:

 | fresh back from the Kernel Summit, Peter Zijlstra and me are pleased 
 | to announce the latest iteration of the CFS scheduler development 
 | tree. Our main focus has been on simplifications and performance - 
 | and as part of that we've also picked up some ideas from Roman 
 | Zippel's 'Really Fair Scheduler' patch as well and integrated them 
 | into CFS. We'd like to ask people to give these patches a good 
 | workout, especially with an eye on any interactivity regressions.

   http://lkml.org/lkml/2007/9/11/395

And you are duly credited in 3 patches:

   --->

   Subject: sched: introduce se->vruntime

   introduce se->vruntime as a sum of weighted delta-exec's, and use 
   that as the key into the tree.

   the idea to use absolute virtual time as the basic metric of 
   scheduling has been first raised by William Lee Irwin, advanced by 
   Tong Li and first prototyped by Roman Zippel in the "Really Fair 
   Scheduler" (RFS) patchset.

   also see:

  http://lkml.org/lkml/2007/9/2/76

   for a simpler variant of this patch.

   --->

   Subject: sched: track cfs_rq->curr on !group-scheduling too

   Noticed by Roman Zippel: use cfs_rq->curr in the !group-scheduling 
   case too. Small micro-optimization and cleanup effect:

   --->

   Subject: sched: uninline __enqueue_entity()/__dequeue_entity()

   suggested by Roman Zippel: uninline __enqueue_entity() and 
   __dequeue_entity().

   --->

We could not add you as the author, because you unfortunately did not 
make your changes applicable to CFS. I've asked you _three_ separate 
times to send a nicely split up series so that we can apply your code:

  " it's far easier to review and merge stuff if it's nicely split up. "

   http://lkml.org/lkml/2007/9/2/38

  " I also think that the core math changes should be split from the 
    Bresenham optimizations. "

   http://lkml.org/lkml/2007/9/2/43

   " That's also why i've asked for a split-up patch series - it makes 
 it far easier to review and test the code and it makes it far 
 easier to quickly apply the obviously correct bits. "

   http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg204094.html

You never directly replied to these pretty explicit requests, all you 
did was this side remark 5 days later in one of your patch 
announcements:

   " For a split version I'm still waiting for some more explanation
 about the CFS tuning parameter. "

 http://lkml.org/lkml/2007/9/7/87

You are an experienced kernel hacker. How can you credibly claim that 
while you were capable of writing a new scheduler along with a series of 
25 complex mathematical equations that few if any lkml readers are able 
to understand (and which scheduler came in one intermixed patch that 
added no new comments at all!), and that you are able to maintain the 
m68k Linux architecture code, but that at the same time some supposed 
missing explanation from _me_ makes you magically incapable of splitting up 
_your own fine code_? This is really beyond me.

I even gave you the first baby step of the split-up by sending this:

http://lkml.org/lkml/2007/9/2/76

And your reaction to this was dismissive:

  " It simplifies the math too much, the nice level weighting is an 
essential part of the math and without it one can't really 
understand the problem I'm trying to solve. "

http://lkml.org/lkml/2007/9/3/174

So we advanced this whole issue by trying the vruntime concept in CFS 
and adding the 2 cleanups from RFS (we couldn't actually use any code 
from you, due to the way you shaped your patch - but we'd certainly be 
glad to!). You've seen the earliest iteration of that at:

http://lkml.org/lkml/2007/9/2/76

So far you've sent 3 updates of your patch without addressing any of the 
structural feedback we gave. We virtually begged you to make your code 
finegrained and applicable - but you did not do that.

And please understand, splitting up patches is paramount when 
cooperating with others: we are not against adding code that makes sense 
(to the contrary and we do that 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Roman Zippel <[EMAIL PROTECTED]> wrote:

> The rest of the math is indeed different - it's simply missing. What 
> is there is IMO not really adequate. I guess you will see the 
> differences, once you test a bit more with different nice levels.

Roman, i disagree strongly. I did test with different nice levels. Here 
are some hard numbers: the CPU usage table of 40 busy loops started at 
once, all running at a different nice level, from nice -20 to nice +19:

 top - 12:25:07 up 19 min,  2 users,  load average: 40.00, 39.15, 28.35
 Tasks: 172 total,  41 running, 131 sleeping,   0 stopped,   0 zombie

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2455 root       0 -20  1576  248  196 R   20  0.0   3:47.56 loop
 2456 root       1 -19  1576  244  196 R   16  0.0   3:03.96 loop
 2457 root       2 -18  1576  244  196 R   13  0.0   2:24.80 loop
 2458 root       3 -17  1576  248  196 R   10  0.0   1:58.63 loop
 2459 root       4 -16  1576  244  196 R    8  0.0   1:33.04 loop
 2460 root       5 -15  1576  248  196 R    7  0.0   1:14.73 loop
 2461 root       6 -14  1576  248  196 R    5  0.0   0:59.61 loop
 2462 root       7 -13  1576  244  196 R    4  0.0   0:47.95 loop
 2463 root       8 -12  1576  248  196 R    3  0.0   0:38.31 loop
 2464 root       9 -11  1576  244  196 R    3  0.0   0:30.54 loop
 2465 root      10 -10  1576  244  196 R    2  0.0   0:24.47 loop
 2466 root      11  -9  1576  244  196 R    2  0.0   0:19.52 loop
 2467 root      12  -8  1576  248  196 R    1  0.0   0:15.63 loop
 2468 root      13  -7  1576  248  196 R    1  0.0   0:12.56 loop
 2469 root      14  -6  1576  248  196 R    1  0.0   0:10.00 loop
 2470 root      15  -5  1576  244  196 R    1  0.0   0:07.99 loop
 2471 root      16  -4  1576  244  196 R    1  0.0   0:06.40 loop
 2472 root      17  -3  1576  244  196 R    0  0.0   0:05.09 loop
 2473 root      18  -2  1576  244  196 R    0  0.0   0:04.05 loop
 2474 root      19  -1  1576  248  196 R    0  0.0   0:03.26 loop
 2475 root      20   0  1576  244  196 R    0  0.0   0:02.61 loop
 2476 root      21   1  1576  244  196 R    0  0.0   0:02.09 loop
 2477 root      22   2  1576  244  196 R    0  0.0   0:01.67 loop
 2478 root      23   3  1576  244  196 R    0  0.0   0:01.33 loop
 2479 root      24   4  1576  248  196 R    0  0.0   0:01.07 loop
 2480 root      25   5  1576  244  196 R    0  0.0   0:00.84 loop
 2481 root      26   6  1576  248  196 R    0  0.0   0:00.68 loop
 2482 root      27   7  1576  248  196 R    0  0.0   0:00.54 loop
 2483 root      28   8  1576  248  196 R    0  0.0   0:00.43 loop
 2484 root      29   9  1576  248  196 R    0  0.0   0:00.34 loop
 2485 root      30  10  1576  244  196 R    0  0.0   0:00.27 loop
 2486 root      31  11  1576  248  196 R    0  0.0   0:00.21 loop
 2487 root      32  12  1576  244  196 R    0  0.0   0:00.17 loop
 2488 root      33  13  1576  244  196 R    0  0.0   0:00.13 loop
 2489 root      34  14  1576  244  196 R    0  0.0   0:00.10 loop
 2490 root      35  15  1576  244  196 R    0  0.0   0:00.08 loop
 2491 root      36  16  1576  248  196 R    0  0.0   0:00.06 loop
 2493 root      38  18  1576  248  196 R    0  0.0   0:00.03 loop
 2492 root      37  17  1576  244  196 R    0  0.0   0:00.04 loop
 2494 root      39  19  1576  244  196 R    0  0.0   0:00.02 loop

check a few select rows (the ratio of CPU time between adjacent nice 
levels should be 1.25 at every step) and see that CPU time is distributed 
very accurately. (and the same is true for both -rc6 and -rc6-cfs-devel)

So even in this pretty extreme example (who on this planet runs 40 busy 
loops with each loop on exactly one separate nice level, creating a load 
average of 40.0 and expects perfect distribution after just a few 
minutes?) CFS still distributes CPU time perfectly.
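
The expected distribution can be worked out independently of the scheduler. The sketch below assumes only what the mail states, namely that each nice step changes a task's share by a factor of 1.25, and computes the ideal CPU fraction for 40 busy loops at nice -20 to +19. The function name expected_shares() and the constant NR_LEVELS are made up for this illustration, and the weights are in arbitrary units, not the kernel's weight table.

```c
#include <assert.h>

#define NR_LEVELS 40	/* nice -20 .. +19 */

/*
 * Fill share[i] with the ideal CPU fraction of the task at nice level
 * (i - 20), assuming every nice step scales the weight by 1.25.
 */
static void expected_shares(double share[NR_LEVELS])
{
	double weight[NR_LEVELS];
	double total = 0.0;
	int i;

	weight[NR_LEVELS - 1] = 1.0;	/* nice +19, arbitrary unit */
	for (i = NR_LEVELS - 2; i >= 0; i--)
		weight[i] = weight[i + 1] * 1.25;

	for (i = 0; i < NR_LEVELS; i++)
		total += weight[i];
	assert(total > 0.0);

	for (i = 0; i < NR_LEVELS; i++)
		share[i] = weight[i] / total;
}
```

Under that assumption the nice -20 loop should get roughly 20% of the CPU and the nice -19 loop roughly 16%, which is in line with the first rows of the %CPU column above.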

When you first raised accuracy issues i asked you to provide 
specific real-world examples showing any of the "problems" with nice 
levels that you repeatedly implied:

http://lkml.org/lkml/2007/9/2/38

In the announcement of your "Really Fair Scheduler" patch you used the 
following very strong statement:

" This model is far more accurate than CFS is [...]"

http://lkml.org/lkml/2007/8/30/307

but when i pressed you for actual real-world proof of CFS misbehavior, 
you said:

"[...] they have indeed little effect in the short term, [...] "

http://lkml.org/lkml/2007/9/2/282

so how can CFS be "far less accurate" (paraphrased) while it has "little 
effect in the short term"?

so to repeat my question: my (and Peter's) claim is that there is no 
real-world significance of much of the complexity you added to avoid 
rounding effects. You do disagree with 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:14 +0200, Roman Zippel wrote:
> Hi,
> 
> On Thu, 13 Sep 2007, Peter Zijlstra wrote:
> 
> > >  There's a good reason 
> > > I put that much effort into maintaining a good, but still cheap average, 
> > > it's needed for a good task placement.
> > 
> > While I agree that having this average is nice, your particular
> > implementation has the problem that it quickly overflows u64 at which
> > point it becomes a huge problem (a CPU hog could basically lock up your
> > box when that happens).
> 
> If you look at the math, you'll see that I took the overflow into account, 
> I even expected it. If you see this effect in my implementation, it would 
> be a bug.

Ah, ok, I shall look at your patches in more detail, it was not obvious
from the formulae you posted.

> > >  There is of course more than one 
> > > way to implement this, so you'll have good chances to simply reimplement 
> > > it somewhat differently, but I'd be surprised if it would be something 
> > > completely different.
> > 
> > Currently we have 2 approximations in place:
> > 
> >   (leftmost + rightmost) / 2
> > 
> > and
> > 
> >   leftmost + period/2   (where period should match the span of the tree)
> > 
> > neither are perfect but they seem to work quite well.
> 
> You need more than two busy loops. 

I'm missing context here, are you referring to the nice level error or
the avg approximation?

> There's a reason I implemented a simple simulator first, so I could 
> actually study the scheduling behaviour of different load situations. That 
> doesn't protect from all surprises of course, but it gives me the 
> necessary confidence the scheduler will work reasonably even in weird 
> situations.

Right, I've built user-space simulators too, handy little things to play
with :-)

> From these tests I already know that your approximations only work with 
> rather simple loads.

I've not yet seen it go spectacularly wrong, although admittedly a
highly concurrent kbuild is the most complex task I let loose on it.

Could you perhaps be more specific about the circumstances under which it
breaks down and what the negative impact is?




Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

> > Out of curiosity: will I ever get answers to my questions?
> 
> the last few weeks/months have been pretty hectic - i get more than 50 
> non-list emails a day so i could easily have missed some.

Well, let's just take the recent "Really Simple Really Fair Scheduler" 
thread. You had the time to ask me questions about my scheduler, I even 
explained to you how the sleeping bonus works in my model. At the end I 
was sort of hoping you would start answering my questions and explaining 
things how the same things work in CFS - but nothing.
Then you had the time to reimplement the very things you've just asked me 
about and what do I get credit for - "two cleanups from RFS".
And now I get this lame ass excuse for not answering my questions? :-(

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Peter Zijlstra wrote:

> >  There's a good reason 
> > I put that much effort into maintaining a good, but still cheap average, 
> > it's needed for a good task placement.
> 
> While I agree that having this average is nice, your particular
> implementation has the problem that it quickly overflows u64 at which
> point it becomes a huge problem (a CPU hog could basically lock up your
> box when that happens).

If you look at the math, you'll see that I took the overflow into account, 
I even expected it. If you see this effect in my implementation, it would 
be a bug.

> >  There is of course more than one 
> > way to implement this, so you'll have good chances to simply reimplement 
> > it somewhat differently, but I'd be surprised if it would be something 
> > completely different.
> 
> Currently we have 2 approximations in place:
> 
>   (leftmost + rightmost) / 2
> 
> and
> 
>   leftmost + period/2   (where period should match the span of the tree)
> 
> neither are perfect but they seem to work quite well.

You need more than two busy loops. 
There's a reason I implemented a simple simulator first, so I could 
actually study the scheduling behaviour of different load situations. That 
doesn't protect from all surprises of course, but it gives me the 
necessary confidence the scheduler will work reasonably even in weird 
situations.
From these tests I already know that your approximations only work with 
rather simple loads.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey <[EMAIL PROTECTED]> wrote:

> Well, I was going over my config myself after you asked for me to post
> it, and I thought to do the same thing. Except, disabling sched_debug
> caused the same error as before:
> In file included from kernel/sched.c:794:
> kernel/sched_fair.c: In function 'task_new_fair':
> kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
> undeclared (first use in this function)
> kernel/sched_fair.c:857: error: (Each undeclared identifier is
> reported only once
> kernel/sched_fair.c:857: error: for each function it appears in.)
> make[1]: *** [kernel/sched.o] Error 1
> make: *** [kernel] Error 2
> 
> It only happens with sched_debug=y. I take it back, it wasn't my fault :)
> 
> As for everything else, I'd be happy to.

are you sure this is happening with the latest iteration of the patch 
too? (with the combo-3.patch?) You can pick it up from here:

   
http://people.redhat.com/mingo/cfs-scheduler/devel/sched-cfs-v2.6.23-rc6-v21-combo-3.patch

I tried your config and it builds fine here.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 00:17 +0200, Roman Zippel wrote:

> The rest of the math is indeed different - it's simply missing. What is 
> there is IMO not really adequate. I guess you will see the differences, 
> once you test a bit more with different nice levels.

The rounding error we still have now is cumulative over the long term
but has no real effect. The only effect is that a nice level would be a
little different than it would have been had the division been perfect,
not dissimilar to having a small error in the divisor series to begin
with. (note that in order to see this little fuzz you need amazingly
high context switch rates)

We've measured the effect with the strongest nice levels -20 and 19, a
normal loop against two yield loops (this generated 700,000 context
switches per second), and the effect is <1%. Not something worth fixing
IMHO (unless it comes for free). 

At such high switching rates the overhead of scheduling itself and
caching causes more skew than this - the small error is totally swamped
by the time lost scheduling.

>  There's a good reason 
> I put that much effort into maintaining a good, but still cheap average, 
> it's needed for a good task placement.

While I agree that having this average is nice, your particular
implementation has the problem that it quickly overflows u64 at which
point it becomes a huge problem (a CPU hog could basically lock up your
box when that happens).

I solved the wrap around problem in cfs-devel, and from that base I
_could_ probably maintain the average without overflow problems, but
have yet to try.

>  There is of course more than one 
> way to implement this, so you'll have good chances to simply reimplement 
> it somewhat differently, but I'd be surprised if it would be something 
> completely different.

Currently we have 2 approximations in place:

  (leftmost + rightmost) / 2

and

  leftmost + period/2   (where period should match the span of the tree)

Neither is perfect, but they seem to work quite well.
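
[Editorial sketch] The two approximations above can be compared against a plain average of the keys in the tree. The sketch assumes an unweighted mean and a flat list of integer keys; the mail does not say how the real average is weighted, and the function names are hypothetical.

```python
def true_average(keys):
    """Unweighted mean of all keys (an assumption; the real average may be weighted)."""
    return sum(keys) / len(keys)

def midpoint_estimate(keys):
    """(leftmost + rightmost) / 2"""
    return (min(keys) + max(keys)) / 2

def period_estimate(keys, period):
    """leftmost + period/2, where period should match the span of the tree."""
    return min(keys) + period / 2
```

For keys [100, 110, 120, 200] the mean is 132.5 while both estimates give 150.0 (with period=100, which equals the span, so the two coincide). Both are cheap because they only need the ends of the tree, and both drift from the mean when the keys are unevenly spread.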



Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> On 9/13/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> > On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > >
> > > * Rob Hussey <[EMAIL PROTECTED]> wrote:
> > >
> > > > On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > thanks for the numbers! Could you please also post the .config you 
> > > > > used?
> > > >
> > > > Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.
> > >
> > > thx! If you've got some time, could you perhaps re-measure with these
> > > disabled:
> > >
> > >   CONFIG_SCHED_DEBUG=y
> >
> > Well, I was going over my config myself after you asked for me to post
> > it, and I thought to do the same thing. Except, disabling sched_debug
> > caused the same error as before:
> > In file included from kernel/sched.c:794:
> > kernel/sched_fair.c: In function 'task_new_fair':
> > kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
> > undeclared (first use in this function)
> > kernel/sched_fair.c:857: error: (Each undeclared identifier is
> > reported only once
> > kernel/sched_fair.c:857: error: for each function it appears in.)
> > make[1]: *** [kernel/sched.o] Error 1
> > make: *** [kernel] Error 2
> >
> > It only happens with sched_debug=y. I take it back, it wasn't my fault :)
> >
> I'm trying the patches now to see if they help.
>
Current cfs-devel git compiles fine without sched_debug. Not sure how
I broke things, but I need some sleep. I know the 2.6.23-rc1 numbers
were good, but not sure about the others. I'll make the changes you
suggested, and get some new and hopefully good numbers for
2.6.23-rc6-cfs and 2.6.23-rc6-cfs-devel.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> >
> > * Rob Hussey <[EMAIL PROTECTED]> wrote:
> >
> > > On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > > >
> > > > thanks for the numbers! Could you please also post the .config you used?
> > >
> > > Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.
> >
> > thx! If you've got some time, could you perhaps re-measure with these
> > disabled:
> >
> >   CONFIG_SCHED_DEBUG=y
>
> Well, I was going over my config myself after you asked for me to post
> it, and I thought to do the same thing. Except, disabling sched_debug
> caused the same error as before:
> In file included from kernel/sched.c:794:
> kernel/sched_fair.c: In function 'task_new_fair':
> kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
> undeclared (first use in this function)
> kernel/sched_fair.c:857: error: (Each undeclared identifier is
> reported only once
> kernel/sched_fair.c:857: error: for each function it appears in.)
> make[1]: *** [kernel/sched.o] Error 1
> make: *** [kernel] Error 2
>
> It only happens with sched_debug=y. I take it back, it wasn't my fault :)
>
I'm trying the patches now to see if they help.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> * Rob Hussey <[EMAIL PROTECTED]> wrote:
>
> > On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > >
> > > thanks for the numbers! Could you please also post the .config you used?
> >
> > Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.
>
> thx! If you've got some time, could you perhaps re-measure with these
> disabled:
>
>   CONFIG_SCHED_DEBUG=y

Well, I was going over my config myself after you asked for me to post
it, and I thought to do the same thing. Except, disabling sched_debug
caused the same error as before:
In file included from kernel/sched.c:794:
kernel/sched_fair.c: In function 'task_new_fair':
kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
undeclared (first use in this function)
kernel/sched_fair.c:857: error: (Each undeclared identifier is
reported only once
kernel/sched_fair.c:857: error: for each function it appears in.)
make[1]: *** [kernel/sched.o] Error 1
make: *** [kernel] Error 2

It only happens with sched_debug=y. I take it back, it wasn't my fault :)

As for everything else, I'd be happy to.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey <[EMAIL PROTECTED]> wrote:

> On 9/13/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> >
> > thanks for the numbers! Could you please also post the .config you used?
> 
> Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.

thx! If you've got some time, could you perhaps re-measure with these 
disabled:

  CONFIG_SCHED_DEBUG=y
  CONFIG_SCHEDSTATS=y

these options mask some of the performance enhancements we made. There's 
also a new code drop at:

   http://people.redhat.com/mingo/cfs-scheduler/devel/

with some fixes for SMP. (and you've got an SMP box it appears)

also, if you want to maximize performance, it usually makes more sense 
to build with these flipped around:

  # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
  CONFIG_FORCED_INLINING=y

i.e.:

  CONFIG_CC_OPTIMIZE_FOR_SIZE=y
  # CONFIG_FORCED_INLINING is not set

because especially on modern x86 CPUs, smaller x86 code is faster. (and 
it also takes up less I-cache size)
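
[Editorial sketch] A small helper for checking a .config against the settings suggested in this mail. The option names are the ones listed above; the parser and the function names are hypothetical, and an option that is missing from the file is treated the same as "is not set".

```python
def parse_kconfig(text: str) -> dict:
    """Map option name to its value, or None for '# CONFIG_X is not set'."""
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options

# Settings suggested in the mail for benchmarking (None means "not set").
SUGGESTED = {
    "CONFIG_SCHED_DEBUG": None,
    "CONFIG_SCHEDSTATS": None,
    "CONFIG_CC_OPTIMIZE_FOR_SIZE": "y",
    "CONFIG_FORCED_INLINING": None,
}

def mismatches(text: str) -> dict:
    """Return {option: current value} for every option that differs from SUGGESTED."""
    options = parse_kconfig(text)
    return {name: options.get(name)
            for name, want in SUGGESTED.items()
            if options.get(name) != want}
```

Calling mismatches(open(".config").read()) on a build tree would list the options still to be flipped; an empty dict means the config already matches.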

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar


* Roman Zippel <[EMAIL PROTECTED]> wrote:

> > The sched-devel.git tree can be pulled from:
> >
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
> 
> Am I the only one who can't clone that thing? [...]

Ah - i have messed up my sched-devel.git script so the git-push went to 
kernel.org but into my home directory :-/ Should work now - let me know 
if it doesn't.

i've also uploaded the patch series in quilt format, to:

  http://people.redhat.com/mingo/cfs-scheduler/devel/patches.tar.gz

> [...] It can't be entirely explained with the Kernel Summit, as this 
> is not the first time patches appear out of the blue in form of a git 
> tree.

i'm not sure what you mean, but i can definitely tell you that there was 
no scheduler hacking at the Kernel Summit. (there's no good wireless in 
the pubs and not enough space for a laptop anyway ;)

The impressive linecount has been mostly achieved by dumb removal:

sched: remove wait_runtime fields and features
4 files changed, 14 insertions(+), 161 deletions(-)

sched: remove wait_runtime limit
5 files changed, 3 insertions(+), 124 deletions(-)

sched: remove precise CPU load calculations #2
1 file changed, 1 insertion(+), 31 deletions(-)

sched: remove precise CPU load
3 files changed, 9 insertions(+), 41 deletions(-)

sched: remove stat_gran
4 files changed, 15 insertions(+), 50 deletions(-)

Hack time to do them: ~10 minutes apiece. Removing stuff is _easy_ :-)

The rest is finegrained, small changes. One of the harder patches was 
this one:

commit 28c4b8ed35f0fc7050f186147da9e10b55e1e446
sched: introduce se->vruntime
3 files changed, 50 insertions(+), 33 deletions(-)

And i sent you the first variant of that already:

http://lkml.org/lkml/2007/9/2/76

we needed 2 days after the KS to put it into shape and send it out for 
feedback.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey <[EMAIL PROTECTED]> wrote:

> On 9/11/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> > Hi Ingo,
> >
> > When compiling, I get:
> 
> Yeah, this was my fault :(
> 
> I've had a chance to test this now, and everything feels great. I did
> some benchmarks for 2.6.23-rc1, 2.6.23-rc6-cfs, and
> 2.6.23-rc6-cfs-devel:

thanks for the numbers! Could you please also post the .config you used? 
Thx,

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/11/07, Rob Hussey <[EMAIL PROTECTED]> wrote:
> Hi Ingo,
>
> When compiling, I get:

Yeah, this was my fault :(

I've had a chance to test this now, and everything feels great. I did
some benchmarks for 2.6.23-rc1, 2.6.23-rc6-cfs, and
2.6.23-rc6-cfs-devel:
lat_ctx -s 0 2:
2.6.23-rc1  2.6.23-rc6-cfs  2.6.23-rc6-cfs-devel
   5.15          4.91              5.05
   5.23          5.18              4.85
   5.19          4.89              5.17
   5.36          5.23              4.86
   5.35          5.00              5.13
   5.34          5.05              5.12
   5.26          4.99              5.06
   5.11          5.04              4.96
   5.29          5.19              5.18
   5.40          4.93              5.07

hackbench 50:
2.6.23-rc1  2.6.23-rc6-cfs  2.6.23-rc6-cfs-devel
  6.301         5.963             5.837
  6.417         5.961             5.814
  6.468         5.965             5.757
  6.525         5.926             5.840
  6.320         5.929             5.751
  6.457         5.909             5.825

pipe-test (http://redhat.com/~mingo/cfs-scheduler/tools/pipe-test.c):
2.6.23-rc1  2.6.23-rc6-cfs  2.6.23-rc6-cfs-devel
  14.29         14.03             13.89
  14.31         14.01             14.10
  14.27         13.99             14.15
  14.31         14.02             14.16
  14.53         14.02             14.14
  14.53         14.27             14.16
  14.51         14.36             14.12
  14.48         14.33             14.16
  14.52         14.36             14.17
  14.47         14.36             14.15

I turned the results into graphs as well. I'll attach them, but they're also at:
http://www.healthcarelinen.com/misc/lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/hackbench_benchmark.png
http://www.healthcarelinen.com/misc/pipe-test_benchmark.png

The hackbench and pipe-test numbers are very encouraging. The avg
between the 2.6.23-rc6-cfs and 2.6.23-rc6-cfs-devel lat_ctx numbers
are nearly identical (5.041 and 5.045 respectively).
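
[Editorial sketch] The averages quoted above can be rechecked from the posted runs. The data below is copied from this mail, reading the lat_ctx columns as two-decimal values; the dictionary and function names are hypothetical.

```python
from statistics import mean

# Runs as posted in the mail: lat_ctx -s 0 2 and hackbench 50.
LAT_CTX = {
    "2.6.23-rc1":           [5.15, 5.23, 5.19, 5.36, 5.35, 5.34, 5.26, 5.11, 5.29, 5.40],
    "2.6.23-rc6-cfs":       [4.91, 5.18, 4.89, 5.23, 5.00, 5.05, 4.99, 5.04, 5.19, 4.93],
    "2.6.23-rc6-cfs-devel": [5.05, 4.85, 5.17, 4.86, 5.13, 5.12, 5.06, 4.96, 5.18, 5.07],
}
HACKBENCH = {
    "2.6.23-rc1":           [6.301, 6.417, 6.468, 6.525, 6.320, 6.457],
    "2.6.23-rc6-cfs":       [5.963, 5.961, 5.965, 5.926, 5.929, 5.909],
    "2.6.23-rc6-cfs-devel": [5.837, 5.814, 5.757, 5.840, 5.751, 5.825],
}

def averages(results):
    """Mean of each kernel's runs, rounded to three decimals."""
    return {kernel: round(mean(runs), 3) for kernel, runs in results.items()}

if __name__ == "__main__":
    for name, table in (("lat_ctx", LAT_CTX), ("hackbench", HACKBENCH)):
        for kernel, avg in averages(table).items():
            print(f"{name:10s} {kernel:22s} {avg:.3f}")
```

The lat_ctx means for 2.6.23-rc6-cfs and 2.6.23-rc6-cfs-devel come out at 5.041 and 5.045, matching the figures given in the mail.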

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Roman Zippel <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> Out of curiosity: will I ever get answers to my questions?

the last few weeks/months have been pretty hectic - i get more than 50 
non-list emails a day so i could easily have missed some. (and to take a 
line from Linus: my attention span is roughly that of a slightly 
retarded golden retriever ;)

so it would be helpful if you could please re-state any questions you 
still have, in context of our latest CFS-devel queue. I tried to answer 
the error/rounding worries you had - which seemed to be the main theme 
of your patch. There are lots of good kernel hackers on lkml who know 
the new scheduler code pretty well and who might be able to provide an 
answer even if i don't manage to answer. (Perhaps asking the questions 
without heavy math will also help more people be able to understand and 
answer your questions and their practical relevance.) In any case - if 
you see packet loss on my side then please resend :) That would be 
hugely helpful. Thanks,

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread debian developer
Please ignore the previous mail, i messed it up bad.

On 9/12/07, Roman Zippel <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On Tue, 11 Sep 2007, Ingo Molnar wrote:
>
> > fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to
> > announce the latest iteration of the CFS scheduler development tree. Our
> > main focus has been on simplifications and performance - and as part of
> > that we've also picked up some ideas from Roman Zippel's 'Really Fair
> > Scheduler' patch as well and integrated them into CFS. We'd like to ask
> > people go give these patches a good workout, especially with an eye on
> > any interactivity regressions.
>
> I must really say, I'm quite impressed by your efforts to give me as
> little credit as possible.
> On the one hand it's of course positive to see so much sudden activity, on
> the other hand I'm not sure how much would have happened if I hadn't posted
> my patch. I don't really think it was my complaints about CFS's complexity
> that finally led to the improvements in this area. I presented the basic
> concepts of my patch already with my first CFS review, but at that time
> you didn't show any interest and instead you were rather quick to simply
> dismiss it. My patch did not add that much new, it's mostly a conceptual
> improvement and describes the math in more detail, but it also
> demonstrated a number of improvements.
>
> > The combo patch against 2.6.23-rc6 can be picked up from:
> >
> >   http://people.redhat.com/mingo/cfs-scheduler/devel/
> >
> > The sched-devel.git tree can be pulled from:
> >
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
>
> Am I the only one who can't clone that thing? So I can't go into much
> detail about the individual changes here.
> The thing that makes me curious, is that it also includes patches by
> others. It can't be entirely explained with the Kernel Summit, as this is
> not the first time patches appear out of the blue in form of a git tree.
> The funny/sad thing is that at some point Linus complained about Con that
> his development activity happened on a separate mailing list, but there was
> at least a place to go to. CFS's development appears to mostly happen in
> private. Patches may be your primary form of communication, but that isn't
> true for many other people, with patches a lot of intent and motivation
> for a change is lost. I know it's rather tempting to immediately try out
> an idea first, but would it really hurt you so much to formulate an idea
> in a more conventional manner? Are you afraid it might hurt your
> ueberhacker status by occasionally screwing up in public? Patches on the
> other hand have the advantage to more easily cover that up by simply
> posting a fix - it makes it more difficult to understand what's going on.
> A more conventional way of communication would give more people a chance
> to participate, they may not understand every detail of the patch, but
> they can try to understand the general concepts and apply them to their
> own situation and eventually come up with some ideas/improvements of their
> own, they would be less dependent on you to come up with a solution to
> their problem. Unless of course that's exactly what you want - unless you
> want to be in full control of the situation and you want to be the hero
> that saves the day.
>
> > There are lots of small performance improvements in form of a
> > finegrained 29-patch series. We have removed a number of features and
> > metrics from CFS that might have been needed but ended up being
> > superfluous - while keeping the things that worked out fine, like
> > sleeper fairness. On 32-bit x86 there's a ~16% speedup (over -rc6) in
> > lmbench (lat_ctx -s 0 2) results:
>
> In the patch you really remove _a_lot_ of stuff. You also removed a lot of
> things I tried to get you to explain them to me. On the one hand I could
> be happy that these things are gone, as they were the major road block to
> splitting up my own patch. On the other hand it still leaves me somewhat
> unsatisfied, as I still don't know what that stuff was good for.
> In a more collaborative development model I would have expected that you
> tried to explain these features, which could have resulted in a discussion
> how else things can be implemented or if it's still needed at all. Instead
> of this you now simply decide unilaterally that these things are not
> needed anymore.
>
> BTW the old sleeper fairness logic "that worked out fine" is actually
> completely gone and is now conceptually closer to what I'm already doing
> in my patch (only the amount of sleeper bonus differs).
>
> >   (microseconds, lower is better)
> >  
> >   v2.6.22    2.6.23-rc6(CFS)    v2.6.23-rc6-CFS-devel
> >
> >    0.70          0.75               0.65
> >    0.62

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread debian developer
-- Forwarded message --
From: Roman Zippel <[EMAIL PROTECTED]>
Date: Sep 12, 2007 6:17 PM
Subject: Re: [announce] CFS-devel, performance improvements
To: Ingo Molnar <[EMAIL PROTECTED]>
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra
<[EMAIL PROTECTED]>, Mike Galbraith <[EMAIL PROTECTED]>


Hi,

On Tue, 11 Sep 2007, Ingo Molnar wrote:

> fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to
> announce the latest iteration of the CFS scheduler development tree. Our
> main focus has been on simplifications and performance - and as part of
> that we've also picked up some ideas from Roman Zippel's 'Really Fair
> Scheduler' patch as well and integrated them into CFS. We'd like to ask
> people go give these patches a good workout, especially with an eye on
> any interactivity regressions.

I must really say, I'm quite impressed by your efforts to give me as
little credit as possible.
On the one hand it's of course positive to see so much sudden activity, on
the other hand I'm not sure how much would have happened if I hadn't posted
my patch. I don't really think it was my complaints about CFS's complexity
that finally led to the improvements in this area. I presented the basic
concepts of my patch already with my first CFS review, but at that time
you didn't show any interest and instead you were rather quick to simply
dismiss it. My patch did not add that much new, it's mostly a conceptual
improvement and describes the math in more detail, but it also
demonstrated a number of improvements.

> The combo patch against 2.6.23-rc6 can be picked up from:
>
>   http://people.redhat.com/mingo/cfs-scheduler/devel/
>
> The sched-devel.git tree can be pulled from:
>
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git

Am I the only one who can't clone that thing? So I can't go into much
detail about the individual changes here.
The thing that makes me curious, is that it also includes patches by
others. It can't be entirely explained with the Kernel Summit, as this is
not the first time patches appear out of the blue in form of a git tree.
The funny/sad thing is that at some point Linus complained about Con that
his development activity happened on a separate mailing list, but there was
at least a place to go to. CFS's development appears to mostly happen in
private. Patches may be your primary form of communication, but that isn't
true for many other people, with patches a lot of intent and motivation
for a change is lost. I know it's rather tempting to immediately try out
an idea first, but would it really hurt you so much to formulate an idea
in a more conventional manner? Are you afraid it might hurt your
ueberhacker status by occasionally screwing up in public? Patches on the
other hand have the advantage to more easily cover that up by simply
posting a fix - it makes it more difficult to understand what's going on.
A more conventional way of communication would give more people a chance
to participate, they may not understand every detail of the patch, but
they can try to understand the general concepts and apply them to their
own situation and eventually come up with some ideas/improvements of their
own, they would be less dependent on you to come up with a solution to
their problem. Unless of course that's exactly what you want - unless you
want to be in full control of the situation and you want to be the hero
that saves the day.

> There are lots of small performance improvements in form of a
> finegrained 29-patch series. We have removed a number of features and
> metrics from CFS that might have been needed but ended up being
> superfluous - while keeping the things that worked out fine, like
> sleeper fairness. On 32-bit x86 there's a ~16% speedup (over -rc6) in
> lmbench (lat_ctx -s 0 2) results:

In the patch you really remove _a_lot_ of stuff. You also removed a lot of
things I tried to get you to explain them to me. On the one hand I could
be happy that these things are gone, as they were the major road block to
splitting up my own patch. On the other hand it still leaves me somewhat
unsatisfied, as I still don't know what that stuff was good for.
In a more collaborative development model I would have expected that you
tried to explain these features, which could have resulted in a discussion
how else things can be implemented or if it's still needed at all. Instead
of this you now simply decide unilaterally that these things are not
needed anymore.

BTW the old sleeper fairness logic "that worked out fine" is actually
completely gone and is now conceptually closer to what I'm already doing
in my patch (only the amount of sleeper bonus differs).

>   (microseconds, lower is better)
>  
>   v2.6.22    2.6.23-rc6(CFS)    v2.6.23

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/11/07, Rob Hussey [EMAIL PROTECTED] wrote:
 Hi Ingo,

 When compiling, I get:

Yeah, this was my fault :(

I've had a chance to test this now, and everything feels great. I did
some benchmarks for 2.6.23-rc1, 2.6.23-rc6-cfs, and
2.6.23-rc6-cfs-devel:
lat_ctx -s 0 2:
2.6.23-rc1  2.6.23-rc6-cfs  2.6.23-rc6-cfs-devel
   5.15        4.91            5.05
   5.23        5.18            4.85
   5.19        4.89            5.17
   5.36        5.23            4.86
   5.35        5.00            5.13
   5.34        5.05            5.12
   5.26        4.99            5.06
   5.11        5.04            4.96
   5.29        5.19            5.18
   5.40        4.93            5.07

hackbench 50:
 2.6.23-rc1 2.6.23-rc6-cfs  2.6.23-rc6-cfs-devel
6.301   5.963   5.837
6.417   5.961   5.814
6.468   5.965   5.757
6.525   5.926   5.840
6.320   5.929   5.751
6.457   5.909   5.825

pipe-test (http://redhat.com/~mingo/cfs-scheduler/tools/pipe-test.c):
 2.6.23-rc1  2.6.23-rc6-cfs 2.6.23-rc6-cfs-devel
14.29   14.03   13.89
14.31   14.01   14.10
14.27   13.99   14.15
14.31   14.02   14.16
14.53   14.02   14.14
14.53   14.27   14.16
14.51   14.36   14.12
14.48   14.33   14.16
14.52   14.36   14.17
14.47   14.36   14.15

I turned the results into graphs as well. I'll attach them, but they're also at:
http://www.healthcarelinen.com/misc/lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/hackbench_benchmark.png
http://www.healthcarelinen.com/misc/pipe-test_benchmark.png

The hackbench and pipe-test numbers are very encouraging. The avg
between the 2.6.23-rc6-cfs and 2.6.23-rc6-cfs-devel lat_ctx numbers
are nearly identical (5.041 and 5.045 respectively).
attachment: lat_ctx_benchmark.png
attachment: hackbench_benchmark.png
attachment: pipe-test_benchmark.png
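
The lat_ctx averages quoted above can be re-derived from the posted samples. The following is only a minimal sketch: the dictionary layout and variable names are illustrative, and the numbers are copied from the lat_ctx table in this mail.

```python
from statistics import mean

# lat_ctx -s 0 2 samples as posted, one list per kernel.
lat_ctx = {
    "2.6.23-rc1": [5.15, 5.23, 5.19, 5.36, 5.35, 5.34, 5.26, 5.11, 5.29, 5.40],
    "2.6.23-rc6-cfs": [4.91, 5.18, 4.89, 5.23, 5.00, 5.05, 4.99, 5.04, 5.19, 4.93],
    "2.6.23-rc6-cfs-devel": [5.05, 4.85, 5.17, 4.86, 5.13, 5.12, 5.06, 4.96, 5.18, 5.07],
}

# Average each column, rounded to three decimals as in the mail.
averages = {kernel: round(mean(samples), 3) for kernel, samples in lat_ctx.items()}

for kernel, avg in averages.items():
    print(f"{kernel:22s} {avg:.3f}")
```

With ten samples per kernel and a spread of a few percent between runs, a difference of 0.004 between the two CFS columns is well inside the noise.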

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey [EMAIL PROTECTED] wrote:

 On 9/11/07, Rob Hussey [EMAIL PROTECTED] wrote:
  Hi Ingo,
 
  When compiling, I get:
 
 Yeah, this was my fault :(
 
 I've had a chance to test this now, and everything feels great. I did
 some benchmarks for 2.6.23-rc1, 2.6.23-rc6-cfs, and
 2.6.23-rc6-cfs-devel:

thanks for the numbers! Could you please also post the .config you used? 
Thx,

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar


* Roman Zippel [EMAIL PROTECTED] wrote:

  The sched-devel.git tree can be pulled from:
 
 
  git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
 
 Am I the only one who can't clone that thing? [...]

Ah - i have messed up my sched-devel.git script so the git-push went to 
kernel.org but into my home directory :-/ Should work now - let me know 
if it doesnt.

i've also uploaded the patch series in quilt format, to:

  http://people.redhat.com/mingo/cfs-scheduler/devel/patches.tar.gz

 [...] It can't be entirely explained with the Kernel Summit, as this 
 is not the first time patches appear out of the blue in form of a git 
 tree.

i'm not sure what you mean, but i can definitely tell you that there was 
no scheduler hacking at the Kernel Summit. (there's no good wireless in 
the pubs and not enough space for a laptop anyway ;)

The impressive linecount has been mostly achieved by dumb removal:

sched: remove wait_runtime fields and features
4 files changed, 14 insertions(+), 161 deletions(-)

sched: remove wait_runtime limit
5 files changed, 3 insertions(+), 124 deletions(-)

sched: remove precise CPU load calculations #2
1 file changed, 1 insertion(+), 31 deletions(-)

sched: remove precise CPU load
3 files changed, 9 insertions(+), 41 deletions(-)

sched: remove stat_gran
4 files changed, 15 insertions(+), 50 deletions(-)

Hack time to do them: ~10 minutes apiece. Removing stuff is _easy_ :-)

The rest is finegrained, small changes. One of the harder patches was 
this one:

commit 28c4b8ed35f0fc7050f186147da9e10b55e1e446
sched: introduce se-vruntime
3 files changed, 50 insertions(+), 33 deletions(-)

And i sent you the first variant of that already:

http://lkml.org/lkml/2007/9/2/76

we needed 2 days after the KS to put it into shape and send it out for 
feedback.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey [EMAIL PROTECTED] wrote:

 On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
 
  thanks for the numbers! Could you please also post the .config you used?
 
 Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.

thx! If you've got some time, could you perhaps re-measure with these 
disabled:

  CONFIG_SCHED_DEBUG=y
  CONFIG_SCHEDSTATS=y

these options mask some of the performance enhancements we made. There's 
also a new code drop at:

   http://people.redhat.com/mingo/cfs-scheduler/devel/

with some fixes for SMP. (and you've got an SMP box it appears)

also, if you want to maximize performance, it usually makes more sense 
to build with these flipped around:

  # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
  CONFIG_FORCED_INLINING=y

i.e.:

  CONFIG_CC_OPTIMIZE_FOR_SIZE=y
  # CONFIG_FORCED_INLINING is not set

because especially on modern x86 CPUs, smaller x86 code is faster. (and 
it also takes up less I-cache size)
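
A quick way to check a .config against the suggestions above is to parse it and flag the four options mentioned. This is a rough sketch, not an existing kernel tool: `parse_kconfig`, `benchmark_notes` and `SAMPLE_CONFIG` are hypothetical names, and the sample text only mirrors the fragments quoted in this mail.

```python
# Sample fragment mirroring the options discussed above (not a full .config).
SAMPLE_CONFIG = """\
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_FORCED_INLINING=y
"""

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to None."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(NOT_SET_SUFFIX)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def benchmark_notes(options):
    """List the settings that differ from what the mail recommends."""
    notes = []
    for name in ("CONFIG_SCHED_DEBUG", "CONFIG_SCHEDSTATS"):
        if options.get(name) == "y":
            notes.append(name + " is enabled; disable it before measuring")
    if options.get("CONFIG_CC_OPTIMIZE_FOR_SIZE") != "y":
        notes.append("CONFIG_CC_OPTIMIZE_FOR_SIZE is not set")
    if options.get("CONFIG_FORCED_INLINING") == "y":
        notes.append("CONFIG_FORCED_INLINING is enabled")
    return notes


for note in benchmark_notes(parse_kconfig(SAMPLE_CONFIG)):
    print(note)
```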

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:

 * Rob Hussey [EMAIL PROTECTED] wrote:

  On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
  
   thanks for the numbers! Could you please also post the .config you used?
 
  Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.

 thx! If you've got some time, could you perhaps re-measure with these
 disabled:

   CONFIG_SCHED_DEBUG=y

Well, I was going over my config myself after you asked for me to post
it, and I thought to do the same thing. Except, disabling sched_debug
caused the same error as before:
In file included from kernel/sched.c:794:
kernel/sched_fair.c: In function 'task_new_fair':
kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
undeclared (first use in this function)
kernel/sched_fair.c:857: error: (Each undeclared identifier is
reported only once
kernel/sched_fair.c:857: error: for each function it appears in.)
make[1]: *** [kernel/sched.o] Error 1
make: *** [kernel] Error 2

It only happens with sched_debug=y. I take it back, it wasn't my fault :)

As for everything else, I'd be happy to.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey [EMAIL PROTECTED] wrote:
 On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
 
  * Rob Hussey [EMAIL PROTECTED] wrote:
 
   On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
   
thanks for the numbers! Could you please also post the .config you used?
  
   Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.
 
  thx! If you've got some time, could you perhaps re-measure with these
  disabled:
 
CONFIG_SCHED_DEBUG=y

 Well, I was going over my config myself after you asked for me to post
 it, and I thought to do the same thing. Except, disabling sched_debug
 caused the same error as before:
 In file included from kernel/sched.c:794:
 kernel/sched_fair.c: In function 'task_new_fair':
 kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
 undeclared (first use in this function)
 kernel/sched_fair.c:857: error: (Each undeclared identifier is
 reported only once
 kernel/sched_fair.c:857: error: for each function it appears in.)
 make[1]: *** [kernel/sched.o] Error 1
 make: *** [kernel] Error 2

 It only happens with sched_debug=y. I take it back, it wasn't my fault :)

I'm trying the patches now to see if they help.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey [EMAIL PROTECTED] wrote:
 On 9/13/07, Rob Hussey [EMAIL PROTECTED] wrote:
  On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:
  
   * Rob Hussey [EMAIL PROTECTED] wrote:
  
On 9/13/07, Ingo Molnar [EMAIL PROTECTED] wrote:

 thanks for the numbers! Could you please also post the .config you 
 used?
   
Sure, .config for 2.6.23-rc1 and 2.6.23-rc6 attached.
  
   thx! If you've got some time, could you perhaps re-measure with these
   disabled:
  
 CONFIG_SCHED_DEBUG=y
 
  Well, I was going over my config myself after you asked for me to post
  it, and I thought to do the same thing. Except, disabling sched_debug
  caused the same error as before:
  In file included from kernel/sched.c:794:
  kernel/sched_fair.c: In function 'task_new_fair':
  kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
  undeclared (first use in this function)
  kernel/sched_fair.c:857: error: (Each undeclared identifier is
  reported only once
  kernel/sched_fair.c:857: error: for each function it appears in.)
  make[1]: *** [kernel/sched.o] Error 1
  make: *** [kernel] Error 2
 
  It only happens with sched_debug=y. I take it back, it wasn't my fault :)
 
 I'm trying the patches now to see if they help.

Current cfs-devel git compiles fine without sched_debug. Not sure how
I broke things, but I need some sleep. I know the 2.6.23-rc1 numbers
were good, but not sure about the others. I'll make the changes you
suggested, and get some new and hopefully good numbers for
2.6.23-rc6-cfs and 2.6.23-rc6-cfs-devel.


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 00:17 +0200, Roman Zippel wrote:

 The rest of the math is indeed different - it's simply missing. What is 
 there is IMO not really adequate. I guess you will see the differences, 
 once you test a bit more with different nice levels.

The rounding error we still have accumulates over time but has no real
effect. The only effect is that a nice level would be a little different
than it would have been had the division been perfect, not dissimilar to
having a small error in the divisor series to begin with. (note that in
order to see this little fuzz you need amazingly high context switch rates)

We've measured the effect with the strongest nice levels -20 and 19, a
normal loop against two yield loops (this generated 700,000 context
switches per second), and the effect is 1%. Not something worth fixing
IMHO (unless it comes for free).

At such high switching rates the overhead of scheduling itself and
caching causes more skew than this - the small error is totally swamped
by the time lost scheduling.
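
Why the fuzz only shows up at very high switch rates can be illustrated with plain integer arithmetic. The sketch below is a toy model, not the CFS code: it assumes the weighted delta is computed as `delta * NICE_0_LOAD // weight` with truncating division, and it assumes the weights 1024 (nice 0) and 88761 (nice -20) from the mainline weight table, which may differ from what cfs-devel used. The function names are made up for this example.

```python
NICE_0_LOAD = 1024          # assumed weight of a nice-0 task
WEIGHT_NICE_MINUS_20 = 88761  # assumed weight of a nice -20 task


def weighted_delta(delta_ns, weight):
    """Scale executed time by the task weight, truncating like integer C math."""
    return delta_ns * NICE_0_LOAD // weight


def accumulate(delta_ns, weight, steps):
    """Compare many small truncated updates against one division over the total."""
    truncated = sum(weighted_delta(delta_ns, weight) for _ in range(steps))
    exact = weighted_delta(delta_ns * steps, weight)
    return truncated, exact


# ~1428 ns per slice corresponds to roughly 700,000 switches per second.
fast = accumulate(1428, WEIGHT_NICE_MINUS_20, 1000)
# 1 ms slices: the same truncation is negligible relative to each update.
slow = accumulate(1_000_000, WEIGHT_NICE_MINUS_20, 1000)

for label, (truncated, exact) in (("1428 ns slices", fast), ("1 ms slices", slow)):
    print(label, truncated, exact, exact - truncated)
```

Each update loses less than one unit to truncation, so the loss is bounded by the number of updates. It only becomes a visible fraction of the total when the individual deltas are tiny, which is the high-switch-rate case described above.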

  There's a good reason 
 I put that much effort into maintaining a good, but still cheap average, 
 it's needed for a good task placement.

While I agree that having this average is nice, your particular
implementation has the problem that it quickly overflows u64 at which
point it becomes a huge problem (a CPU hog could basically lock up your
box when that happens).

I solved the wrap around problem in cfs-devel, and from that base I
_could_ probably maintain the average without overflow problems, but
have yet to try.
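
The mail does not say how the wrap-around was handled. One common way to order counters that may wrap is to compare their signed difference rather than the raw values; whether cfs-devel does exactly this is an assumption. The sketch emulates 64-bit arithmetic in Python, and `to_s64` and `before` are hypothetical helper names.

```python
MASK64 = (1 << 64) - 1


def to_s64(value):
    """Interpret the low 64 bits of value as a signed 64-bit integer."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


def before(a, b):
    """True if u64 counter a is earlier than b, even if b has wrapped past zero."""
    return to_s64(a - b) < 0


near_max = MASK64 - 5              # counter just before wrapping
wrapped = (near_max + 10) & MASK64  # same counter 10 ticks later, now wrapped

print(near_max < wrapped)        # naive comparison gets the order wrong
print(before(near_max, wrapped))  # signed difference keeps the order
```

The trick only works while the two values being compared are less than 2^63 apart, which is why it suits keys that stay close together.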

  There is of course more than one 
 way to implement this, so you'll have good chances to simply reimplement 
 it somewhat differently, but I'd be surprised if it would be something 
 completely different.

Currently we have 2 approximations in place:

  (leftmost + rightmost) / 2

and

  leftmost + period/2   (where period should match the span of the tree)

neither are perfect but they seem to work quite well.
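
For illustration, the two approximations can be compared against the exact mean on a toy set of keys. This is only a sketch: the function names are hypothetical, the keys are made-up numbers, and nothing here is taken from the cfs-devel source.

```python
def midpoint_estimate(keys):
    """(leftmost + rightmost) / 2 over the keys in the tree."""
    return (min(keys) + max(keys)) // 2


def period_estimate(keys, period):
    """leftmost + period/2, where period is meant to match the tree's span."""
    return min(keys) + period // 2


def exact_average(keys):
    """True mean of all keys, which the two estimates try to avoid computing."""
    return sum(keys) / len(keys)


evenly_spread = [100, 200, 300, 400]
skewed = [100, 110, 120, 400]   # three tasks close together, one far right

for keys in (evenly_spread, skewed):
    span = max(keys) - min(keys)
    print(keys, midpoint_estimate(keys), period_estimate(keys, span), exact_average(keys))
```

Both estimates depend only on the extremes, so they match the true mean for evenly spread keys and drift from it when the keys are skewed. The toy numbers say nothing about how often such skew occurs under real load.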




Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Rob Hussey [EMAIL PROTECTED] wrote:

 Well, I was going over my config myself after you asked for me to post
 it, and I thought to do the same thing. Except, disabling sched_debug
 caused the same error as before:
 In file included from kernel/sched.c:794:
 kernel/sched_fair.c: In function 'task_new_fair':
 kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first'
 undeclared (first use in this function)
 kernel/sched_fair.c:857: error: (Each undeclared identifier is
 reported only once
 kernel/sched_fair.c:857: error: for each function it appears in.)
 make[1]: *** [kernel/sched.o] Error 1
 make: *** [kernel] Error 2
 
 It only happens with sched_debug=y. I take it back, it wasn't my fault :)
 
 As for everything else, I'd be happy to.

are you sure this is happening with the latest iteration of the patch 
too? (with the combo-3.patch?) You can pick it up from here:

   
http://people.redhat.com/mingo/cfs-scheduler/devel/sched-cfs-v2.6.23-rc6-v21-combo-3.patch

I tried your config and it builds fine here.

Ingo


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Peter Zijlstra wrote:

   There's a good reason 
  I put that much effort into maintaining a good, but still cheap average, 
  it's needed for a good task placement.
 
 While I agree that having this average is nice, your particular
 implementation has the problem that it quickly overflows u64 at which
 point it becomes a huge problem (a CPU hog could basically lock up your
 box when that happens).

If you look at the math, you'll see that I took the overflow into account, 
I even expected it. If you see this effect in my implementation, it would 
be a bug.

   There is of course more than one 
  way to implement this, so you'll have good chances to simply reimplement 
  it somewhat differently, but I'd be surprised if it would be something 
  completely different.
 
 Currently we have 2 approximations in place:
 
   (leftmost + rightmost) / 2
 
 and
 
   leftmost + period/2   (where period should match the span of the tree)
 
 neither are perfect but they seem to work quite well.

You need more than two busy loops. 
There's a reason I implemented a simple simulator first, so I could 
actually study the scheduling behaviour of different load situations. That 
doesn't protect from all surprises of course, but it gives me the 
necessary confidence the scheduler will work reasonably even in weird 
situations.
From these tests I already know that your approximations only work with 
rather simple loads.

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

  Out of curiosity: will I ever get answers to my questions?
 
 the last few weeks/months have been pretty hectic - i get more than 50 
 non-list emails a day so i could easily have missed some.

Well, let's just take the recent Really Simple Really Fair Scheduler 
thread. You had the time to ask me questions about my scheduler, I even 
explained to you how the sleeping bonus works in my model. At the end I 
was sort of hoping you would start answering my questions and explaining 
how the same things work in CFS - but nothing.
Then you had the time to reimplement the very things you've just asked me 
about and what do I get credit for - two cleanups from RFS.
And now I get this lame ass excuse for not answering my questions? :-(

bye, Roman


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:14 +0200, Roman Zippel wrote:
 Hi,
 
 On Thu, 13 Sep 2007, Peter Zijlstra wrote:
 
There's a good reason 
   I put that much effort into maintaining a good, but still cheap average, 
   it's needed for a good task placement.
  
  While I agree that having this average is nice, your particular
  implementation has the problem that it quickly overflows u64 at which
  point it becomes a huge problem (a CPU hog could basically lock up your
  box when that happens).
 
 If you look at the math, you'll see that I took the overflow into account, 
 I even expected it. If you see this effect in my implementation, it would 
 be a bug.

Ah, ok, I shall look at your patches in more detail; it was not obvious
from the formulae you posted.

There is of course more than one 
   way to implement this, so you'll have good chances to simply reimplement 
   it somewhat differently, but I'd be surprised if it would be something 
   completely different.
  
  Currently we have 2 approximations in place:
  
(leftmost + rightmost) / 2
  
  and
  
leftmost + period/2   (where period should match the span of the tree)
  
  neither are perfect but they seem to work quite well.
 
 You need more than two busy loops. 

I'm missing context here, are you referring to the nice level error or
the avg approximation?

 There's a reason I implemented a simple simulator first, so I could 
 actually study the scheduling behaviour of different load situations. That 
 doesn't protect from all surprises of course, but it gives me the 
 necessary confidence the scheduler will work reasonably even in weird 
 situations.

Right, I've built user-space simulators too, handy little things to play
with :-)

 From these tests I already know that your approximations only work with 
 rather simple loads.

I've not yet seen it go spectacularly wrong, although admittedly a
highly concurrent kbuild is the most complex task I let loose on it.

Could you perhaps be more specific about the circumstances in which it
breaks down and what the negative impact is?




Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Roman Zippel [EMAIL PROTECTED] wrote:

 The rest of the math is indeed different - it's simply missing. What 
 is there is IMO not really adequate. I guess you will see the 
 differences, once you test a bit more with different nice levels.

Roman, i disagree strongly. I did test with different nice levels. Here 
are some hard numbers: the CPU usage table of 40 busy loops started at 
once, all running at a different nice level, from nice -20 to nice +19:

 top - 12:25:07 up 19 min,  2 users,  load average: 40.00, 39.15, 28.35
 Tasks: 172 total,  41 running, 131 sleeping,   0 stopped,   0 zombie

  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2455 root      0 -20  1576  248  196 R   20  0.0   3:47.56 loop
 2456 root      1 -19  1576  244  196 R   16  0.0   3:03.96 loop
 2457 root      2 -18  1576  244  196 R   13  0.0   2:24.80 loop
 2458 root      3 -17  1576  248  196 R   10  0.0   1:58.63 loop
 2459 root      4 -16  1576  244  196 R    8  0.0   1:33.04 loop
 2460 root      5 -15  1576  248  196 R    7  0.0   1:14.73 loop
 2461 root      6 -14  1576  248  196 R    5  0.0   0:59.61 loop
 2462 root      7 -13  1576  244  196 R    4  0.0   0:47.95 loop
 2463 root      8 -12  1576  248  196 R    3  0.0   0:38.31 loop
 2464 root      9 -11  1576  244  196 R    3  0.0   0:30.54 loop
 2465 root     10 -10  1576  244  196 R    2  0.0   0:24.47 loop
 2466 root     11  -9  1576  244  196 R    2  0.0   0:19.52 loop
 2467 root     12  -8  1576  248  196 R    1  0.0   0:15.63 loop
 2468 root     13  -7  1576  248  196 R    1  0.0   0:12.56 loop
 2469 root     14  -6  1576  248  196 R    1  0.0   0:10.00 loop
 2470 root     15  -5  1576  244  196 R    1  0.0   0:07.99 loop
 2471 root     16  -4  1576  244  196 R    1  0.0   0:06.40 loop
 2472 root     17  -3  1576  244  196 R    0  0.0   0:05.09 loop
 2473 root     18  -2  1576  244  196 R    0  0.0   0:04.05 loop
 2474 root     19  -1  1576  248  196 R    0  0.0   0:03.26 loop
 2475 root     20   0  1576  244  196 R    0  0.0   0:02.61 loop
 2476 root     21   1  1576  244  196 R    0  0.0   0:02.09 loop
 2477 root     22   2  1576  244  196 R    0  0.0   0:01.67 loop
 2478 root     23   3  1576  244  196 R    0  0.0   0:01.33 loop
 2479 root     24   4  1576  248  196 R    0  0.0   0:01.07 loop
 2480 root     25   5  1576  244  196 R    0  0.0   0:00.84 loop
 2481 root     26   6  1576  248  196 R    0  0.0   0:00.68 loop
 2482 root     27   7  1576  248  196 R    0  0.0   0:00.54 loop
 2483 root     28   8  1576  248  196 R    0  0.0   0:00.43 loop
 2484 root     29   9  1576  248  196 R    0  0.0   0:00.34 loop
 2485 root     30  10  1576  244  196 R    0  0.0   0:00.27 loop
 2486 root     31  11  1576  248  196 R    0  0.0   0:00.21 loop
 2487 root     32  12  1576  244  196 R    0  0.0   0:00.17 loop
 2488 root     33  13  1576  244  196 R    0  0.0   0:00.13 loop
 2489 root     34  14  1576  244  196 R    0  0.0   0:00.10 loop
 2490 root     35  15  1576  244  196 R    0  0.0   0:00.08 loop
 2491 root     36  16  1576  248  196 R    0  0.0   0:00.06 loop
 2493 root     38  18  1576  248  196 R    0  0.0   0:00.03 loop
 2492 root     37  17  1576  244  196 R    0  0.0   0:00.04 loop
 2494 root     39  19  1576  244  196 R    0  0.0   0:00.02 loop

check a few select rows (the ratio of CPU time should be 1.25 at every 
step) and see that CPU time is distributed very exactly. (and the same 
is true for both -rc6 and -rc6-cfs-devel)
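
The check suggested above can be done mechanically on the TIME+ column. A small sketch, using the six rows for nice -20 to -15 copied from the table in this mail; `parse_cpu_time` is a hypothetical helper, and the 1.2 to 1.3 band used in the printout is an arbitrary tolerance, not something the mail specifies.

```python
def parse_cpu_time(text):
    """Convert top's M:SS.hh TIME+ format into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# TIME+ values for nice -20 .. -15, as posted.
times = ["3:47.56", "3:03.96", "2:24.80", "1:58.63", "1:33.04", "1:14.73"]
seconds = [parse_cpu_time(t) for t in times]

# Ratio of CPU time between each nice level and the next weaker one.
ratios = [a / b for a, b in zip(seconds, seconds[1:])]

for nice, ratio in zip(range(-20, -15), ratios):
    ok = 1.2 < ratio < 1.3
    print(f"nice {nice} / nice {nice + 1}: {ratio:.3f} {'ok' if ok else 'off'}")
```

The individual ratios scatter a few percent around 1.25, partly because TIME+ only has a resolution of a hundredth of a second and the tasks did not all accumulate time over exactly the same interval.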

So even in this pretty extreme example (who on this planet runs 40 busy 
loops with each loop on exactly one separate nice level, creating a load 
average of 40.0 and expects perfect distribution after just a few 
minutes?) CFS still distributes CPU time perfectly.

When you first raised accuracy issues i asked you to provide 
specific real-world examples showing any of the problems with nice 
levels you repeatedly implied:

http://lkml.org/lkml/2007/9/2/38

In the announcement of your Really Fair Scheduler patch you used the 
following very strong statement:

 This model is far more accurate than CFS is [...]

http://lkml.org/lkml/2007/8/30/307

but when i pressed you for actual real-world proof of CFS misbehavior, 
you said:

[...] they have indeed little effect in the short term, [...] 

http://lkml.org/lkml/2007/9/2/282

so how can CFS be far less accurate (paraphrased) while it has little 
effect in the short term?

so to repeat my question: my (and Peter's) claim is that there is no 
real-world significance of much of the complexity you added to avoid 
rounding effects. You do disagree with that, so our 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Ingo Molnar

* Roman Zippel [EMAIL PROTECTED] wrote:

 Then you had the time to reimplement the very things you've just asked 
 me about and what do I get credit for - two cleanups from RFS.

i'm sorry to say this, but you must be reading some other email list and 
a different git tree than what i am reading.

Firstly, about communications - in the past 3 months i've written you 40 
emails regarding CFS - and that's more emails than my wife (or any 
member of my family) got in that timeframe :-( I just ran a quick 
script: i sent more CFS related emails to you than to any other person 
on this planet. I bent over backwards trying to somehow get you to cooperate 
with us (and i still havent given up on that!) - instead of you 
disparaging CFS and me frequently :-(

Secondly, i prominently credited you as early as in the second sentence 
of our announcement:

 | fresh back from the Kernel Summit, Peter Zijlstra and me are pleased 
 | to announce the latest iteration of the CFS scheduler development 
 | tree. Our main focus has been on simplifications and performance - 
 | and as part of that we've also picked up some ideas from Roman 
 | Zippel's 'Really Fair Scheduler' patch as well and integrated them 
 | into CFS. We'd like to ask people to give these patches a good 
 | workout, especially with an eye on any interactivity regressions.

   http://lkml.org/lkml/2007/9/11/395

And you are duly credited in 3 patches:

   ---

   Subject: sched: introduce se-vruntime

   introduce se-vruntime as a sum of weighted delta-exec's, and use 
   that as the key into the tree.

   the idea to use absolute virtual time as the basic metric of 
   scheduling has been first raised by William Lee Irwin, advanced by 
   Tong Li and first prototyped by Roman Zippel in the Really Fair 
   Scheduler (RFS) patchset.

   also see:

  http://lkml.org/lkml/2007/9/2/76

   for a simpler variant of this patch.

   ---

   Subject: sched: track cfs_rq-curr on !group-scheduling too

   Noticed by Roman Zippel: use cfs_rq-curr in the !group-scheduling 
   case too. Small micro-optimization and cleanup effect:

   ---

   Subject: sched: uninline __enqueue_entity()/__dequeue_entity()

   suggested by Roman Zippel: uninline __enqueue_entity() and 
   __dequeue_entity().

   ---

We could not add you as the author, because you unfortunately did not 
make your changes applicable to CFS. I've asked you _three_ separate 
times to send a nicely split up series so that we can apply your code:

   it's far easier to review and merge stuff if it's nicely split up. 

   http://lkml.org/lkml/2007/9/2/38

   I also think that the core math changes should be split from the 
Bresenham optimizations. 

   http://lkml.org/lkml/2007/9/2/43

That's also why i've asked for a split-up patch series - it makes 
 it far easier to review and test the code and it makes it far 
 easier to quickly apply the obviously correct bits. 

   http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg204094.html

You never directly replied to these pretty explicit requests, all you 
did was this side remark 5 days later in one of your patch 
announcements:

For a split version I'm still waiting for some more explanation
 about the CFS tuning parameter. 

 http://lkml.org/lkml/2007/9/7/87

You are an experienced kernel hacker. How can you credibly claim that 
while you were capable of writing a new scheduler along with a series of 
25 complex mathematical equations that few if any lkml readers are able 
to understand (and which scheduler came in one intermixed patch that 
added no new comments at all!), and that you are able to maintain the 
m68k Linux architecture code, but that at the same time some supposed 
missing explanation from _me_ makes you magically incapable to split up 
_your own fine code_? This is really beyond me.

I even gave you the first baby step of the split-up by sending this:

http://lkml.org/lkml/2007/9/2/76

And your reaction to this was dismissive:

   It simplifies the math too much, the nice level weighting is an 
essential part of the math and without it one can't really 
understand the problem I'm trying to solve. 

http://lkml.org/lkml/2007/9/3/174

So we advanced this whole issue by trying the vruntime concept in CFS 
and adding the 2 cleanups from RFS (we couldnt actually use any code 
from you, due to the way you shaped your patch - but we'd certainly be 
glad to!). You've seen the earliest iteration of that at:

http://lkml.org/lkml/2007/9/2/76

So far you've sent 3 updates of your patch without addressing any of the 
structural feedback we gave. We virtually begged you to make your code 
finegrained and applicable - but you did not do that.

And please understand, splitting up patches is paramount when 
cooperating with others: we are not against adding code that makes sense 
(to the contrary and we do that every day), but it has to be 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Roman Zippel
Hi,

On Thu, 13 Sep 2007, Ingo Molnar wrote:

  Then you had the time to reimplement the very things you've just asked 
  me about and what do I get credit for - two cleanups from RFS.
 
 i'm sorry to say this, but you must be reading some other email list and 
 a different git tree than what i am reading.
 
 Firstly, about communications - in the past 3 months i've written you 40 
 emails regarding CFS - and that's more emails than my wife (or any 
 member of my family) got in that timeframe :-( I just ran a quick 
 script: i sent more CFS related emails to you than to any other person 
 on this planet. I bent over backwards trying to somehow get you to cooperate 
 with us (and i still havent given up on that!) - instead of you 
 disparaging CFS and me frequently :-(
 
 Secondly, i prominently credited you as early as in the second sentence 
 of our announcement:
 
  | fresh back from the Kernel Summit, Peter Zijlstra and me are pleased 
  | to announce the latest iteration of the CFS scheduler development 
  | tree. Our main focus has been on simplifications and performance - 
  | and as part of that we've also picked up some ideas from Roman 
  | Zippel's 'Really Fair Scheduler' patch as well and integrated them 
  | into CFS. We'd like to ask people to give these patches a good 
  | workout, especially with an eye on any interactivity regressions.
 
http://lkml.org/lkml/2007/9/11/395
 
 And you are duly credited in 3 patches:

This needs a little perspective: as I couldn't clone the repository (and 
you know that), all I had was this announcement, so using the patch 
descriptions now as a defense is unfair of you.
In this announcement you make relatively few references to how this relates 
to my work. Maybe someone else can show me how to read that announcement 
differently, but IMO the casual reader is likely to get the impression 
that you only picked some minor cleanups from my patch, and it's rather 
unclear that you already reimplemented key aspects of my patch. Don't 
blame me for your own ambiguity.

---
 
Subject: sched: introduce se-vruntime
 
introduce se-vruntime as a sum of weighted delta-exec's, and use 
that as the key into the tree.
 
the idea to use absolute virtual time as the basic metric of 
scheduling has been first raised by William Lee Irwin, advanced by 
Tong Li and first prototyped by Roman Zippel in the Really Fair 
Scheduler (RFS) patchset.
 
also see:
 
   http://lkml.org/lkml/2007/9/2/76
 
for a simpler variant of this patch.

Let's compare this to the relevant part of the announcement:

| The ->vruntime metric is similar to the ->time_norm metric used by
| Roman's patch (and both are loosely related to the already existing
| sum_exec_runtime metric in CFS), it's in essence the sum of CPU time
| executed by a task, in nanoseconds - weighted up or down by their nice
| level (or kept the same on the default nice 0 level). Besides this basic
| metric our implementation and math differs from RFS.

In the patch you are more explicit about the virtual time aspect, in the 
announcement you're less clear that it's all based on the same idea and 
somehow it's important to stress the point that implementation and math 
differs, which is not untrue, but you forget to mention that the 
differences are rather small.
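
As a rough illustration of the metric both texts describe (CPU time executed by a task, weighted up or down by its nice level and left unchanged at nice 0), here is a minimal sketch in C. It is not taken from the CFS or RFS patches: the function name vruntime_advance, the plain integer division, and the use of 1024 as the nice-0 weight are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed weight of a nice-0 task; heavier tasks have larger weights. */
#define NICE_0_LOAD 1024ULL

/*
 * Hypothetical helper: add delta_exec_ns of real CPU time to a task's
 * virtual runtime. At nice 0 (weight == NICE_0_LOAD) virtual time advances
 * at the same rate as real time; a heavier task accumulates it more slowly.
 */
uint64_t vruntime_advance(uint64_t vruntime, uint64_t delta_exec_ns,
                          uint64_t weight)
{
    assert(weight != 0);
    return vruntime + delta_exec_ns * NICE_0_LOAD / weight;
}
```

With these assumptions, a task with twice the nice-0 weight that runs for 1 ms accumulates only 0.5 ms of virtual time, so a tree ordered by this value would pick it again sooner than a nice-0 task that ran equally long.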

> You never directly replied to these pretty explicit requests, all you
> did was this side remark 5 days later in one of your patch
> announcements:

This is ridiculous, I asked you multiple times to explain to me some of 
the differences relative to CFS as response to the splitup requests. Not 
once did you react, you didn't even ask what I'd like to know 
specifically.

 
> > For a split version I'm still waiting for some more explanation
> > about the CFS tuning parameter.
>
>   http://lkml.org/lkml/2007/9/7/87
>
> You are an experienced kernel hacker. How can you credibly claim that
> while you were capable of writing a new scheduler along with a series of
> 25 complex mathematical equations that few if any lkml readers are able
> to understand (and which scheduler came in one intermixed patch that
> added no new comments at all!), and that you are able to maintain the
> m68k Linux architecture code, but that at the same time some supposed
> missing explanation from _me_ makes you magically incapable to split up
> _your own fine code_? This is really beyond me.

I never claimed to understand every detail of CFS, I can _guess_ what 
_might_ have been intended, but from that it's impossible to know for 
certain how important they are. Let's take this patch fragment:

-	/*
-	 * Fix up delta_fair with the effect of us running
-	 * during the whole sleep period:
-	 */
-	if (sched_feat(SLEEPER_AVG))
-		delta_fair = div64_likely32((u64)delta_fair * load,
-						load + se->load.weight);
-
-	delta_fair = calc_weighted(delta_fair, se);

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:

> I never claimed to understand every detail of CFS, I can _guess_ what
> _might_ have been intended, but from that it's impossible to know for
> certain how important they are. Let's take this patch fragment:

	delta_fair = se->delta_fair_sleep;

we slept that much

> -	/*
> -	 * Fix up delta_fair with the effect of us running
> -	 * during the whole sleep period:
> -	 */
> -	if (sched_feat(SLEEPER_AVG))
> -		delta_fair = div64_likely32((u64)delta_fair * load,
> -						load + se->load.weight);

if we had run, we would not have been removed from the rq, and the
weight would have been: rq_weight + weight

so compensate for us having been removed from the rq by scaling the
delta with: rq_weight/(rq_weight + weight)

> -	delta_fair = calc_weighted(delta_fair, se);

scale for nice levels
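
The scaling step described above can be sketched in isolation as follows. This is only an illustration of the arithmetic rq_weight/(rq_weight + weight), not the kernel code: scale_sleep_delta is a hypothetical name, and ordinary 64-bit division stands in for div64_likely32().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the SLEEPER_AVG fix-up: had the sleeper stayed
 * runnable, the runqueue weight would have been rq_weight + weight, so the
 * fair delta accumulated during the sleep is scaled down accordingly.
 */
uint64_t scale_sleep_delta(uint64_t delta_fair, uint64_t rq_weight,
                           uint64_t weight)
{
    assert(rq_weight + weight != 0);
    return delta_fair * rq_weight / (rq_weight + weight);
}
```

For example, a sleeper of weight 1024 waking onto a runqueue that carried weight 2048 while it slept keeps two thirds of its delta under this formula; on an otherwise empty runqueue (rq_weight of 0) it keeps none.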






Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 19:06 +0200, Peter Zijlstra wrote:
> On Thu, 2007-09-13 at 18:50 +0200, Roman Zippel wrote:
>
> > I never claimed to understand every detail of CFS, I can _guess_ what
> > _might_ have been intended, but from that it's impossible to know for
> > certain how important they are. Let's take this patch fragment:
>
> 	delta_fair = se->delta_fair_sleep;
>
> we slept that much
>
> > -	/*
> > -	 * Fix up delta_fair with the effect of us running
> > -	 * during the whole sleep period:
> > -	 */
> > -	if (sched_feat(SLEEPER_AVG))
> > -		delta_fair = div64_likely32((u64)delta_fair * load,
> > -						load + se->load.weight);
>
> if we had run, we would not have been removed from the rq, and the
> weight would have been: rq_weight + weight
>
> so compensate for us having been removed from the rq by scaling the
> delta with: rq_weight/(rq_weight + weight)
>
> > -	delta_fair = calc_weighted(delta_fair, se);
>
> scale for nice levels
 

Or at least, I think that is how to read it :-)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Kyle Moffett

On Sep 13, 2007, at 12:50:12, Roman Zippel wrote:

> On Thu, 13 Sep 2007, Ingo Molnar wrote:
>
>> And you are duly credited in 3 patches:
>
> This needs a little perspective, as I couldn't clone the repository
> (and you know that), all I had was this announcement, so using the
> patch descriptions now as defense is unfair by you.


How the hell is that unfair?  The fact that nobody could clone the  
repo for about 24 hours is *totally* *irrelevant* to the whole  
discussion as it's simply a matter of a technical glitch.  His point  
in referencing patch descriptions is to clear up matters of credit.   
Ingo has never in this discussion been out to get you.  From the  
point of view of a sideline observer it's been *you* that has been  
demanding answers and refusing to answer questions directed at you.


The most brilliant mathematician in the world would have nothing to  
contribute to the Linux scheduler if he couldn't describe, code, and  
comment his algorithm in detail so that others (even code-monkeys  
like myself) could grok at least the basic outline and be able to  
give useful commentary and suggestions.



> In this announcement you make relatively few references how this
> relates to my work.  Maybe someone else can show me how to read
> that announcement differently, but IMO the casual reader is likely
> to get the impression, that you only picked some minor cleanups
> from my patch, but it's rather unclear that you already
> reimplemented key aspects of my patch.


As a casual reader and reviewer I have yet to actually see you post  
readable/reviewable patches in this thread.  I was basically  
completely unable to follow the detailed math you go into (even with  
a math minor) due to your *complete* lack of comments.  The fact that  
you renamed files and didn't split up your patch made it useless for  
actual practical kernel development, its only value was as a  
comparison point.  I did however get the impression that Ingo got  
something significantly useful out of your code despite the problems,  
but I still haven't had time to read through his and Peter's patches  
in detail to understand exactly what it was.  From personal  
inspection of a fair percentage of the changes that Ingo and Peter  
committed, they certainly appear to be deleting a lot more code than  
they add.  More specifically they appear to describe in detail what  
they are deleting and why, with the exception of one patch that's  
missing a changelog entry.


So yeah, I get the impression that Ingo re-implemented some ideas  
that you had because you refused to do so in a way that was  
acceptable for the upstream kernel.  How exactly is this a bad  
thing?  You came up with a great idea that worked and somebody else  
did the ugly grunt work to get it ready to go upstream!  On the other  
hand, given the pleasant attitude that you've shown Ingo during  
this whole thing I doubt he'd be likely to do it again.



>> You never directly replied to these pretty explicit requests, all
>> you did was this side remark 5 days later in one of your patch
>> announcements:
>
> This is ridiculous, I asked you multiple times to explain to me
> some of the differences relative to CFS as response to the splitup
> requests. Not once did you react, you didn't even ask what I'd like
> to know specifically.


How exactly is Ingo supposed to explain to YOU the differences  
between his scheduler and your modified one?  Completely ignoring the  
fact that you merged all your changes into a single patch and didn't  
add a single comment, it's not *his* algorithm that I have trouble  
understanding.  From a relatively basic scan of the source-code and  
comments I was able to figure out how the algorithm works in general,  
enough to ask much more specific questions than yours.  If anything,  
Ingo should have been asking *you* how your scheduler differed from  
the one it was based on.



> I never claimed to understand every detail of CFS, I can _guess_
> what _might_ have been intended, but from that it's impossible to
> know for certain how important they are. Let's take this patch
> fragment:


Oh come on, you appear to be quite knowledgeable about CPU scheduling  
and the algorithms involved, surely as such you should have a much  
easier time with reading the comments and asking specific questions.   
For example, your below question specifically about the sleep  
averaging could have been answered in fifteen minutes had you  
actually *ASKED* that.  You'll notice that in fact Peter Zijlstra's  
email response did come almost exactly 15 minutes after you sent this  
email, and for a casual reader like me it seems perfectly  
sufficient;  it does depend on you asking specific questions instead  
of "how does it differ from my hundred-kbyte patch".


As for that specific patch, it's very clear that the affected logic  
is controlled by one of the sched-feature tweaking tools, so you  
could very easily experiment with it yourself to see what the  

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Sam Ravnborg
Hi Roman.

On Thu, Sep 13, 2007 at 02:35:35PM +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 13 Sep 2007, Ingo Molnar wrote:
>
> > > Out of curiosity: will I ever get answers to my questions?
> >
> > the last few weeks/months have been pretty hectic - i get more than 50
> > non-list emails a day so i could easily have missed some.
>
> Well, let's just take the recent Really Simple Really Fair Scheduler
> thread. You had the time to ask me questions about my scheduler, I even
> explained to you how the sleeping bonus works in my model. At the end I
> was sort of hoping you would start answering my questions and explaining
> things how the same things work in CFS - but nothing.
> Then you had the time to reimplement the very things you've just asked me
> about and what do I get credit for - two cleanups from RFS.

I have read the announcement from Ingo and after reading it I concluded
that it was good to see that Ingo had taken into consideration the feedback
from you and improved the scheduler based on this.
And when I read that he removed a lot of stuff I smiled. This reminded
me of countless monkey aka code review sessions where I repeatedly do
like my children and ask "why" so many times that the author realizes that
something is not needed or no longer used.


The above was my impression after reading the announcement with
respect to your influence and that goes far beyond "two cleanups".
I bet many others read it roughly like I did.

And no - I did not go back and re-read it. So do not answer
by quoting the announcement or stuff like this.
Because that will NOT change what my first impression was.

So keep up the review - we get a better scheduler this way.

Sam


Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Peter Zijlstra
On Thu, 2007-09-13 at 14:28 -0400, Kyle Moffett wrote:

> with the exception of one patch that's missing a changelog entry.

Ah, that would have been one of mine.

---
From: Peter Zijlstra [EMAIL PROTECTED]

Handle vruntime overflow by centering the key space around min_vruntime.

Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 kernel/sched_fair.c |   15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index a306f05..b8e2a0d 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -116,11 +116,18 @@ set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
 	cfs_rq->rb_leftmost = leftmost;
 	if (leftmost) {
 		se = rb_entry(leftmost, struct sched_entity, run_node);
-		cfs_rq->min_vruntime = max(se->vruntime,
-						cfs_rq->min_vruntime);
+		if ((se->vruntime > cfs_rq->min_vruntime) ||
+		    (cfs_rq->min_vruntime > (1ULL << 61) &&
+		     se->vruntime < (1ULL << 50)))
+			cfs_rq->min_vruntime = se->vruntime;
 	}
 }
 
+s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	return se->fair_key - cfs_rq->min_vruntime;
+}
+
 /*
  * Enqueue an entity into the rb-tree:
  */
@@ -130,7 +137,7 @@ __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
 	struct rb_node *parent = NULL;
 	struct sched_entity *entry;
-	s64 key = se->fair_key;
+	s64 key = entity_key(cfs_rq, se);
 	int leftmost = 1;
 
 	/*
@@ -143,7 +150,7 @@ __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 * We dont care about collisions. Nodes with
 		 * the same key stay together.
 		 */
-		if (key - entry->fair_key < 0) {
+		if (key < entity_key(cfs_rq, entry)) {
 			link = &parent->rb_left;
 		} else {
 			link = &parent->rb_right;
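
To see why centering the key space around min_vruntime helps with overflow, here is a small standalone sketch in C. It is an illustration of the general technique, not the kernel code: relative_key and key_before are hypothetical names, plain integers replace the cfs_rq and sched_entity structures, and the unsigned-to-signed conversion is assumed to wrap in two's complement, as it does on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: express a 64-bit virtual runtime as a signed offset
 * from min_vruntime. The unsigned subtraction wraps modulo 2^64, so the
 * offset stays small even when vruntime has wrapped past zero.
 */
int64_t relative_key(uint64_t vruntime, uint64_t min_vruntime)
{
    /* Exactly half the key space away is ambiguous; treat it as a bug. */
    assert(vruntime - min_vruntime != (1ULL << 63));
    return (int64_t)(vruntime - min_vruntime);
}

/* Nonzero if a sorts before b when both are keyed relative to min_vruntime. */
int key_before(uint64_t a, uint64_t b, uint64_t min_vruntime)
{
    return relative_key(a, min_vruntime) < relative_key(b, min_vruntime);
}
```

With min_vruntime just below the top of the 64-bit range, an entity whose vruntime has already wrapped to a small number still sorts after one that has not wrapped yet, whereas comparing the raw values would order them the other way round.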




Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread dimm

Hi,

please find a couple of minor cleanups below (on top of 
sched-cfs-v2.6.23-rc6-v21-combo-3.patch):


(1)

Better placement of #ifdef CONFIG_SCHEDSTATS block in dequeue_entity().

Signed-off-by: Dmitry Adamushko [EMAIL PROTECTED]

---
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c linux-2.6.23-rc6-my/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c	2007-09-13 21:38:49.0 +0200
+++ linux-2.6.23-rc6-my/kernel/sched_fair.c	2007-09-13 21:48:50.0 +0200
@@ -453,8 +453,8 @@ static void
 dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep)
 {
 	update_stats_dequeue(cfs_rq, se);
-	if (sleep) {
 #ifdef CONFIG_SCHEDSTATS
+	if (sleep) {
 		if (entity_is_task(se)) {
 			struct task_struct *tsk = task_of(se);
 
@@ -463,8 +463,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 				if (tsk->state & TASK_UNINTERRUPTIBLE)
 					se->block_start = rq_of(cfs_rq)->clock;
 		}
-#endif
 	}
+#endif
 	__dequeue_entity(cfs_rq, se);
 }
 
---


(2)

unless we are very eager to keep an additional layer of abstraction,
'struct load_stat' is redundant now so let's get rid of it.

Signed-off-by: Dmitry Adamushko [EMAIL PROTECTED]


---
diff -upr linux-2.6.23-rc6/kernel/sched.c 
linux-2.6.23-rc6-sched-dev/kernel/sched.c
--- linux-2.6.23-rc6/kernel/sched.c 2007-09-12 21:37:41.0 +0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched.c   2007-09-12 21:26:10.0 
+0200
@@ -170,10 +170,6 @@ struct rt_prio_array {
struct list_head queue[MAX_RT_PRIO];
 };
 
-struct load_stat {
-   struct load_weight load;
-};
-
 /* CFS-related fields in a runqueue */
 struct cfs_rq {
struct load_weight load;
@@ -232,7 +228,7 @@ struct rq {
 #ifdef CONFIG_NO_HZ
unsigned char in_nohz_recently;
 #endif
-	struct load_stat ls;	/* capture load from *all* tasks on this cpu */
+	struct load_weight load;	/* capture load from *all* tasks on this cpu */
unsigned long nr_load_updates;
u64 nr_switches;
 
@@ -804,7 +800,7 @@ static int balance_tasks(struct rq *this
  * Update delta_exec, delta_fair fields for rq.
  *
  * delta_fair clock advances at a rate inversely proportional to
- * total load (rq->ls.load.weight) on the runqueue, while
+ * total load (rq->load.weight) on the runqueue, while
  * delta_exec advances at the same rate as wall-clock (provided
  * cpu is not idle).
  *
@@ -812,17 +808,17 @@ static int balance_tasks(struct rq *this
  * runqueue over any given interval. This (smoothened) load is used
  * during load balance.
  *
- * This function is called /before/ updating rq->ls.load
+ * This function is called /before/ updating rq->load
  * and when switching tasks.
  */
 static inline void inc_load(struct rq *rq, const struct task_struct *p)
 {
-	update_load_add(&rq->ls.load, p->se.load.weight);
+	update_load_add(&rq->load, p->se.load.weight);
 }
 
 static inline void dec_load(struct rq *rq, const struct task_struct *p)
 {
-	update_load_sub(&rq->ls.load, p->se.load.weight);
+	update_load_sub(&rq->load, p->se.load.weight);
 }
 
 static void inc_nr_running(struct task_struct *p, struct rq *rq)
@@ -967,7 +963,7 @@ inline int task_curr(const struct task_s
 /* Used instead of source_load when we know the type == 0 */
 unsigned long weighted_cpuload(const int cpu)
 {
-	return cpu_rq(cpu)->ls.load.weight;
+	return cpu_rq(cpu)->load.weight;
 }
 
 static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
@@ -1933,7 +1929,7 @@ unsigned long nr_active(void)
  */
 static void update_cpu_load(struct rq *this_rq)
 {
-	unsigned long this_load = this_rq->ls.load.weight;
+	unsigned long this_load = this_rq->load.weight;
 	int i, scale;
 
 	this_rq->nr_load_updates++;
diff -upr linux-2.6.23-rc6/kernel/sched_debug.c 
linux-2.6.23-rc6-sched-dev/kernel/sched_debug.c
--- linux-2.6.23-rc6/kernel/sched_debug.c   2007-09-12 21:37:41.0 
+0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched_debug.c 2007-09-12 
21:36:04.0 +0200
@@ -137,7 +137,7 @@ static void print_cpu(struct seq_file *m
 
P(nr_running);
 	SEQ_printf(m, "  .%-30s: %lu\n", "load",
-		   rq->ls.load.weight);
+		   rq->load.weight);
P(nr_switches);
P(nr_load_updates);
P(nr_uninterruptible);
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c 
linux-2.6.23-rc6-sched-dev/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c2007-09-12 21:37:41.0 
+0200
+++ linux-2.6.23-rc6-sched-dev/kernel/sched_fair.c  2007-09-12 
21:35:27.0 +0200
@@ -499,7 +499,7 @@ set_next_entity(struct cfs_rq *cfs_rq, s
 * least twice that of our own weight (i.e. dont track it
 * when there are only lesser-weight tasks around):
 */
-	if (rq_of(cfs_rq)->ls.load.weight >= 2*se->load.weight) {
+   if 

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Willy Tarreau
Roman,

I've been trying to follow your mails about CFS since your review posted
on Aug 1st. Back then, I was thinking "cool, an in-depth review
by someone who understands schedulers and mathematics very well, we'll
quickly have a very solid design".

On Aug 10th, I was disappointed to see that you still had not provided
the critical information that Ingo had been requesting from you for 9 days
(cfs-sched-debug output). Your motivations in this work started to
become a bit fuzzy to me, since people who behave like this generally
do so to get all the lights on them and you really don't need this.

Your explanation was kind of "show me yours and only then I'll show
you mine". Pretty childish but you finally sent that long-requested
information.

Since then, I've been noticing your now popular "will I get a response
to my questions" stuffed in most of your mails. That was getting very
suspicious from someone who can write down mathematical equations to
prove his design is right, especially considering the fact that your
question only relates to what a few lines were supposed to do. Nobody
believes that someone as smart as you is still blocked on the same
line of code after one month!

And if getting CFS fixed wasn't your real motivation...

On Thu, Sep 13, 2007 at 12:17:42AM +0200, Roman Zippel wrote:
> On Tue, 11 Sep 2007, Ingo Molnar wrote:
>
> > fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to
> > announce the latest iteration of the CFS scheduler development tree. Our
> > main focus has been on simplifications and performance - and as part of
> > that we've also picked up some ideas from Roman Zippel's 'Really Fair
> > Scheduler' patch as well and integrated them into CFS. We'd like to ask
> > people to give these patches a good workout, especially with an eye on
> > any interactivity regressions.
>
> I must really say, I'm quite impressed by your efforts to give me as
> little credit as possible.
> On the one hand it's of course positive to see so much sudden activity, on
> the other hand I'm not sure how much would have happened if I hadn't posted
> my patch, I don't really think it were my complaints about CFS's complexity
> that finally led to the improvements in this area.

I'm now fairly convinced that you're not seeking credits either. There
are more credits to your name per line of patch here than there is in
your own code in the kernel. That complaint does not stand by itself.

In fact, I'm beginning to think that you're like a cat who has found a mouse.
Why kill it if you can play with it? Each of your "will I get a response"
is just like a small kick in the mouse's back to make it move. But by dint
of doing this, you're slowly pushing the mouse to the door where it may
escape from you, and you're losing your toy.

So right now, I'm sure you really do not want to get any code merged. It's
so much fun for you to say "hey, Ingo, respond to me" that you would lose
this ability should your code get merged.

> I presented the basic
> concepts of my patch already with my first CFS review, but at that time
> you didn't show any interest and instead you were rather quick to simply
> dismiss it.

At that time, if my memory serves me, you were complaining about a fairness
problem you had with a few programs whose sources you had already taken days
to show. Proposing an alternate design with a bug report generally has no
chance to be considered because the developer mostly focuses on the bug
report. You should have spent time explaining how your design would work
*after* your problems were solved.

> My patch did not add that much new, it's mostly a conceptual
> improvement and describes the math in more detail

- then why were those details never explained in plain English when nobody
  could understand your maths?

- if you have no problem reading code and translating it to concepts, without
  any comment around it, then how is it believable that you have a problem
  understanding 10 lines of code after 1 month?

> , but it also demonstrated a number of improvements.

Very likely, which is why Ingo and Peter agreed to take parts of those
improvements. But do you realize that your lack of ability to communicate
on this list has probably delayed mainline integration of parts of your
work, because it was required to get a patch to try to understand your
intents? It's not sci.math here, it's linux-kernel, the _development_
mailing list, where the raw material and common language between people
is the _code_. Some people do not have the skills required to code their
excellent ideas, but they can spend time explaining those to other people.

In your case, it was just a guessing game. It does not work like this and
you know it. I really think that you deliberately slowed all the process
down in order to stay on the scene playing this game.

> > The sched-devel.git tree can be pulled from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
>
> Am I the only one who can't clone

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread dimm

and here's something a bit more intrusive.

The initial idea was to completely get rid of 'se->fair_key'. It's always
equal to 'se->vruntime' for all runnable tasks but the 'current'. The exact
key within the tree for the 'current' has to be known in order for
__enqueue_entity() to work properly (if we just use 'vruntime', we may go
the wrong way down the tree while looking for the correct position for a
new element).
Sure, it's possible to cache the current's key in the 'cfs_rq' and add a
few additional checks, but that's not very nice... so what if we don't
keep the 'current' within the tree? :-)

The illustration is below. Some bits may still be missing, but a patched
kernel boots/works (haven't done real regression tests yet... can say that
the mail client is still working at this very moment :-).

There are 2 benefits:

(1) no more 'fair_key' ;
(2) entity_tick() is simpler/more effective : 'update_curr()' now vs.
'dequeue_entity() + enqueue_entity()' before.

anyway, consider it as mainly an illustration of idea so far.

---
diff -upr linux-2.6.23-rc6/include/linux/sched.h 
linux-2.6.23-rc6-my/include/linux/sched.h
--- linux-2.6.23-rc6/include/linux/sched.h  2007-09-13 21:38:49.0 
+0200
+++ linux-2.6.23-rc6-my/include/linux/sched.h   2007-09-13 23:01:21.0 
+0200
@@ -890,7 +890,6 @@ struct load_weight {
  *     6 se->load.weight
  */
 struct sched_entity {
-	s64			fair_key;
 	struct load_weight	load;		/* for load-balancing */
 	struct rb_node		run_node;
 	unsigned int		on_rq;
diff -upr linux-2.6.23-rc6/kernel/sched.c linux-2.6.23-rc6-my/kernel/sched.c
--- linux-2.6.23-rc6/kernel/sched.c 2007-09-13 21:52:13.0 +0200
+++ linux-2.6.23-rc6-my/kernel/sched.c  2007-09-13 23:00:19.0 +0200
@@ -6534,7 +6534,6 @@ void normalize_rt_tasks(void)
 
 	read_lock_irq(&tasklist_lock);
 	do_each_thread(g, p) {
-		p->se.fair_key			= 0;
 		p->se.exec_start		= 0;
 #ifdef CONFIG_SCHEDSTATS
 		p->se.wait_start		= 0;
diff -upr linux-2.6.23-rc6/kernel/sched_debug.c 
linux-2.6.23-rc6-my/kernel/sched_debug.c
--- linux-2.6.23-rc6/kernel/sched_debug.c   2007-09-13 21:52:13.0 
+0200
+++ linux-2.6.23-rc6-my/kernel/sched_debug.c2007-09-13 23:00:50.0 
+0200
@@ -38,7 +38,7 @@ print_task(struct seq_file *m, struct rq
 
 	SEQ_printf(m, "%15s %5d %15Ld %13Ld %5d ",
 		p->comm, p->pid,
-		(long long)p->se.fair_key,
+		(long long)p->se.vruntime,
 		(long long)(p->nvcsw + p->nivcsw),
 		p->prio);
 #ifdef CONFIG_SCHEDSTATS
diff -upr linux-2.6.23-rc6/kernel/sched_fair.c 
linux-2.6.23-rc6-my/kernel/sched_fair.c
--- linux-2.6.23-rc6/kernel/sched_fair.c2007-09-13 21:52:13.0 
+0200
+++ linux-2.6.23-rc6-my/kernel/sched_fair.c 2007-09-13 23:48:02.0 
+0200
@@ -125,7 +125,7 @@ set_leftmost(struct cfs_rq *cfs_rq, stru
 
 s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	return se->fair_key - cfs_rq->min_vruntime;
+	return se->vruntime - cfs_rq->min_vruntime;
 }
 
 /*
@@ -167,9 +167,6 @@ __enqueue_entity(struct cfs_rq *cfs_rq, 
 
 	rb_link_node(&se->run_node, parent, link);
 	rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
-	update_load_add(&cfs_rq->load, se->load.weight);
-	cfs_rq->nr_running++;
-	se->on_rq = 1;
 }
 
 static void
@@ -179,9 +176,6 @@ __dequeue_entity(struct cfs_rq *cfs_rq, 
 		set_leftmost(cfs_rq, rb_next(&se->run_node));
 
 	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
-	update_load_sub(&cfs_rq->load, se->load.weight);
-	cfs_rq->nr_running--;
-	se->on_rq = 0;
 }
 
 static inline struct rb_node *first_fair(struct cfs_rq *cfs_rq)
@@ -320,10 +314,6 @@ static void update_stats_enqueue(struct 
 */
 	if (se != cfs_rq->curr)
 		update_stats_wait_start(cfs_rq, se);
-	/*
-	 * Update the key:
-	 */
-	se->fair_key = se->vruntime;
 }
 
 static void
@@ -371,6 +361,22 @@ update_stats_curr_end(struct cfs_rq *cfs
  * Scheduling class queueing methods:
  */
 
+static void
+account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	update_load_add(&cfs_rq->load, se->load.weight);
+	cfs_rq->nr_running++;
+	se->on_rq = 1;
+}
+
+static void
+account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	update_load_sub(&cfs_rq->load, se->load.weight);
+	cfs_rq->nr_running--;
+	se->on_rq = 0;
+}
+
 static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 #ifdef CONFIG_SCHEDSTATS
@@ -446,7 +452,9 @@ enqueue_entity(struct cfs_rq *cfs_rq, st
}
 
update_stats_enqueue(cfs_rq, se);
-   __enqueue_entity(cfs_rq, se);
+	if (se != cfs_rq->curr)
+		__enqueue_entity(cfs_rq, se);
+   

Re: [announce] CFS-devel, performance improvements

2007-09-13 Thread Rob Hussey
On 9/13/07, Rob Hussey [EMAIL PROTECTED] wrote:
> Bound to single core:
> ...
> hackbench 50
> #  rc1   rc6   cfs-devel
> 1  7.528 7.950 7.538
> 2  7.649 8.026 7.548
> 3  7.613 8.160 7.580
> 4  7.550 8.054 7.558
> 5  7.563 8.373 7.559
> 6  7.617 8.152 7.550
> 7  7.593 7.831 7.562
> 8  7.602 8.311 7.588
> 9  7.589 8.010 7.552
> 10 7.682 8.059 7.556


I knew there was no way I'd post all these numbers and not screw
something up. Switch rc6 and rc1 for hackbench 50 (bound to single
core). Updated graph:
http://www.healthcarelinen.com/misc/BOUND_hackbench_benchmark_fixed.png

Also attached.
attachment: BOUND_hackbench_benchmark_fixed.png

Re: [announce] CFS-devel, performance improvements

2007-09-12 Thread Roman Zippel
Hi,

On Tue, 11 Sep 2007, Ingo Molnar wrote:

> fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to 
> announce the latest iteration of the CFS scheduler development tree. Our 
> main focus has been on simplifications and performance - and as part of 
> that we've also picked up some ideas from Roman Zippel's 'Really Fair 
> Scheduler' patch as well and integrated them into CFS. We'd like to ask 
> people to give these patches a good workout, especially with an eye on 
> any interactivity regressions.

I must really say, I'm quite impressed by your efforts to give me as 
little credit as possible.
On the one hand it's of course positive to see so much sudden activity, on 
the other hand I'm not sure how much would have happened if I hadn't posted 
my patch, I don't really think it were my complaints about CFS's complexity 
that finally led to the improvements in this area. I presented the basic 
concepts of my patch already with my first CFS review, but at that time 
you didn't show any interest and instead you were rather quick to simply 
dismiss it. My patch did not add that much new, it's mostly a conceptual 
improvement and describes the math in more detail, but it also 
demonstrated a number of improvements.

> The combo patch against 2.6.23-rc6 can be picked up from:
> 
>   http://people.redhat.com/mingo/cfs-scheduler/devel/
> 
> The sched-devel.git tree can be pulled from:
> 
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git

Am I the only one who can't clone that thing? So I can't go into much 
detail about the individual changes here.
The thing that makes me curious, is that it also includes patches by 
others. It can't be entirely explained with the Kernel Summit, as this is 
not the first time patches appear out of the blue in form of a git tree. 
The funny/sad thing is that at some point Linus complained about Con that 
his development activity happened on a separate mailing list, but there was 
at least a place to go to. CFS's development appears to mostly happen in 
private. Patches may be your primary form of communication, but that isn't 
true for many other people, with patches a lot of intent and motivation 
for a change is lost. I know it's rather tempting to immediately try out 
an idea first, but would it really hurt you so much to formulate an idea 
in a more conventional manner? Are you afraid it might hurt your 
ueberhacker status by occasionally screwing up in public? Patches on the 
other hand have the advantage to more easily cover that up by simply 
posting a fix - it makes it more difficult to understand what's going on.
A more conventional way of communication would give more people a chance 
to participate, they may not understand every detail of the patch, but 
they can try to understand the general concepts and apply them to their 
own situation and eventually come up with some ideas/improvements of their 
own, they would be less dependent on you to come up with a solution to 
their problem. Unless of course that's exactly what you want - unless you 
want to be in full control of the situation and you want to be the hero 
that saves the day.

> There are lots of small performance improvements in form of a 
> finegrained 29-patch series. We have removed a number of features and 
> metrics from CFS that might have been needed but ended up being 
> superfluous - while keeping the things that worked out fine, like 
> sleeper fairness. On 32-bit x86 there's a ~16% speedup (over -rc6) in 
> lmbench (lat_ctx -s 0 2) results:

In the patch you really remove _a_lot_ of stuff. You also removed a lot of 
things I tried to get you to explain to me. On the one hand I could 
be happy that these things are gone, as they were the major road block to 
splitting up my own patch. On the other hand it still leaves me somewhat 
unsatisfied, as I still don't know what that stuff was good for.
In a more collaborative development model I would have expected you to 
try to explain these features, which could have resulted in a discussion of 
how else things can be implemented or whether they're still needed at all. Instead 
of this you now simply decide unilaterally that these things are not 
needed anymore.

BTW the old sleeper fairness logic "that worked out fine" is actually 
completely gone and is now conceptually closer to what I'm already doing 
in my patch (only the amount of sleeper bonus differs).

>              (microseconds, lower is better)
>
>    v2.6.22    2.6.23-rc6(CFS)    v2.6.23-rc6-CFS-devel
>
>     0.70          0.75                  0.65
>     0.62          0.66                  0.63
>     0.60          0.72                  0.69
>     0.62          0.74                  0.61
>     0.69          0.73                  0.53
>     0.66          0.73                  0.63

Re: [announce] CFS-devel, performance improvements

2007-09-12 Thread Mike Galbraith
On Tue, 2007-09-11 at 22:04 +0200, Ingo Molnar wrote:
> fresh back from the Kernel Summit, Peter Zijlstra and me are pleased to 
> announce the latest iteration of the CFS scheduler development tree. Our 
> main focus has been on simplifications and performance - and as part of 
> that we've also picked up some ideas from Roman Zippel's 'Really Fair 
> Scheduler' patch as well and integrated them into CFS. We'd like to ask 
> people to give these patches a good workout, especially with an eye on 
> any interactivity regressions.

Initial test-drive looks good here, but I do see a regression.  First
the good news.

fairtest2 is perfect, more perfect than ever seen before in fact.  Mixed
interval sleepers/hog looks fine as well (can't say perfect due to
startup differences with the various proggies, but cpu% looks perfect).
Amarok song switch time under hefty kbuild load is fine as well.  I
haven't done heavy multimedia testing yet, but will give it a more
thorough workout later (errands).

The regression:  I see some GUI lurch, easily reproducible by running a
make -j5 and moving the mouse in a circle... perceptible (100ms or so)
lurches not present in rc5. 

-Mike

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [announce] CFS-devel, performance improvements

2007-09-11 Thread Rob Hussey
Hi Ingo,

When compiling, I get:
In file included from kernel/sched.c:794:
kernel/sched_fair.c: In function 'task_new_fair':
kernel/sched_fair.c:857: error: 'sysctl_sched_child_runs_first' undeclared (first use in this function)
kernel/sched_fair.c:857: error: (Each undeclared identifier is reported only once
kernel/sched_fair.c:857: error: for each function it appears in.)

Presumably because sched_fair.c is being included into sched.c before
sysctl_sched_child_runs_first is defined.

Regards,
Rob


Re: [announce] CFS-devel, performance improvements

2007-09-11 Thread Roman Zippel
Hi,

Out of curiosity: will I ever get answers to my questions?

bye, Roman

