Re: [ck] Re: 2.6.20-ck1
On Sun, 2007-02-18 at 00:15 -0600, Rodney Gordon II wrote:
> On Sun, 2007-02-18 at 13:38 +1100, Con Kolivas wrote:
> > mdew . writes:
> > > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > >> This patchset is designed to improve system responsiveness and
> > >> interactivity. It is configurable to any workload, but the default -ck
> > >> patch is aimed at the desktop and -cks is available with more emphasis
> > >> on serverspace.
> > >>
> > >> Apply to 2.6.20
> > >
> > > any benchmarks for 2.6.20-ck vs 2.6.20?
> >
> > Would some -ck user on the mailing list like to perform a set of
> > interbench benchmarks? They're pretty straightforward to do; see:
> >
> > http://interbench.kolivas.org
> >
> > --
> > -ck
>
> Here are some benches comparing 2.6.18-4-686 (Debian sid stock) and
> 2.6.20-ck1-mt1 (2.6.20-ck1 + sched-idleprio-1.11-2.0.patch).
>
> I know it's not what was asked for, but it might be useful for review by
> anyone using Debian kernels who is considering the -ck patches :)
>
> Take a look.
>
> -r

System specs, by the way: Pentium-D 830 3.0GHz dual-core, 1.5GB RAM, 7200RPM
16MB-cache SATA3 drive using AHCI w/ NCQ on.

--
Rodney "meff" Gordon II -*- [EMAIL PROTECTED]
Systems Administrator / Coder Geek -*- Open yourself to OpenSource

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On Sun, 2007-02-18 at 13:38 +1100, Con Kolivas wrote:
> mdew . writes:
> > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > This patchset is designed to improve system responsiveness and
> > > interactivity. It is configurable to any workload, but the default -ck
> > > patch is aimed at the desktop and -cks is available with more emphasis
> > > on serverspace.
> > >
> > > Apply to 2.6.20
> >
> > any benchmarks for 2.6.20-ck vs 2.6.20?
>
> Would some -ck user on the mailing list like to perform a set of
> interbench benchmarks? They're pretty straightforward to do; see:
>
> http://interbench.kolivas.org
>
> --
> -ck

Here are some benches comparing 2.6.18-4-686 (Debian sid stock) and
2.6.20-ck1-mt1 (2.6.20-ck1 + sched-idleprio-1.11-2.0.patch).

I know it's not what was asked for, but it might be useful for review by
anyone using Debian kernels who is considering the -ck patches :)

Take a look.

-r

--
Rodney "meff" Gordon II -*- [EMAIL PROTECTED]
Systems Administrator / Coder Geek -*- Open yourself to OpenSource

Using 1816966 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.18-4-686 at datestamp 200702172244

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.005 +/- 0.00545    0.008        100            100
Video    0.086 +/- 0.661      6.7          100            100
X        0.03  +/- 0.272      5.32         100            100
Burn     0.005 +/- 0.00565    0.01         100            100
Write    0.043 +/- 0.281      5.28         100            100
Read     0.01  +/- 0.0293     0.537        100            100
Compile  0.013 +/- 0.119      2.91         100            100
Memload  0.033 +/- 0.289      6.2          100            100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.024 +/- 0.556      16.7         100            99.9
X        0.874 +/- 3.78       16.7         100            94.9
Burn     0.005 +/- 0.00559    0.008        100            100
Write    0.128 +/- 1.36       24.6         100            99.6
Read     0.524 +/- 2.93       16.7         100            96.9
Compile  0.136 +/- 1.43       17.3         100            99.3
Memload  0.751 +/- 3.48       17.3         100            95.7

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.293 +/- 1.34       10           92.3           89.2
Video    0.606 +/- 2.32       18           89.8           84.6
Burn     0.526 +/- 1.93       10           90.6           85.1
Write    1.35  +/- 7.79       92           87.4           84
Read     2.3   +/- 7.4        44           78.8           72
Compile  2.09  +/- 7.75       72           78.5           72.9
Memload  0.767 +/- 2.82       24           87.5           82.2

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU
None     2.64  +/- 9.31       50.6         97.4
Video    0.063 +/- 0.297      5.07         99.9
X        0.061 +/- 0.377      6.48         99.9
Burn     183   +/- 194        400          35.3
Write    1.32  +/- 6.21       80.9         98.7
Read     4.98  +/- 7          34.5         95.3
Compile  210   +/- 228        449          32.3
Memload  4.57  +/- 11.2       83           95.6

Using 1816966 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20-ck1-mt1 at datestamp 200702172307

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.005 +/- 0.00517    0.009        100            100
Video    0.016 +/- 0.017      0.022        100            100
X        0.018 +/- 0.13       3.17         100            100
Burn     0.005 +/- 0.00551    0.013        100            100
Write    0.016 +/- 0.0489     1.07         100            100
Read     0.016 +/- 0.102      2.48         100            100
Compile  0.051 +/- 0.421      7            100            100
Memload  0.012 +/- 0.08       1.55         100            100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.088 +/- 1.18       16.7         100            99.5
X        0.014 +/- 0.0153     0.026        100            100
Burn     0.005 +/- 0.00553    0.016        100            100
Write    0.057 +/- 0.734      16.7         100            99.8
Read     0.016 +/- 0.0187     0.21         100            100
Compile  0.042 +/- 0.328      5.59         100            100
Memload  0.014 +/- 0.0883     1.93         100            100

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.033 +/- 0.258      2            98.7           97.7
Video    0.033 +/- 0.258      2            98.7           97.7
Burn     0.046 +/- 0.337      3            98.4           97
Write    0.129 +/- 0.777      7            96.5           94.4
Read     0.292 +/- 1.75       18           94.4           91.6
Compile  0.473 +/- 2.66       28           92             89
Memload  0.178 +/- 0.98       8            96.8           93.8

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load
Re: [ck] Re: 2.6.20-ck1
On Sunday 18 February 2007 13:38, Con Kolivas wrote:
> mdew . writes:
> > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > This patchset is designed to improve system responsiveness and
> > > interactivity. It is configurable to any workload but the default -ck
> > > patch is aimed at the desktop and -cks is available with more emphasis
> > > on serverspace.
> > >
> > > Apply to 2.6.20
> >
> > any benchmarks for 2.6.20-ck vs 2.6.20?
>
> Would some -ck user on the mailing list like to perform a set of interbench
> benchmarks? They're pretty straightforward to do; see:
>
> http://interbench.kolivas.org

I couldn't take down any lower-powered machine for these benchmarks... a
lower-powered single-cpu machine would be better for this. Feel free to throw
any other benchmarks at it. This core2 duo 2.4GHz with 2GB ram and a 7200rpm
16MB-cache hard drive is not too discriminatory, but here are the results
(use a fixed font to see them):

Using 2392573 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20 at datestamp 200702181608

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.002 +/- 0.00329    0.006        100            100
Video    0.002 +/- 0.00356    0.01         100            100
X        0.007 +/- 0.0819     2            100            100
Burn     0.002 +/- 0.00335    0.005        100            100
Write    0.105 +/- 1.55       35.5         100            100
Read     0.006 +/- 0.00707    0.014        100            100
Compile  0.312 +/- 5.61       135          99.8           99.8
Memload  0.01  +/- 0.037      0.72         100            100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.004 +/- 0.00431    0.017        100            100
X        0.006 +/- 0.00608    0.013        100            100
Burn     0.003 +/- 0.00392    0.012        100            100
Write    0.097 +/- 3.44       144          99.8           99.8
Read     0.005 +/- 0.00523    0.013        100            100
Compile  0.059 +/- 1.2        36.7         99.8           99.8
Memload  0.01  +/- 0.0767     1.85         100            100

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.056 +/- 0.379      3            98.4           96.7
Video    0.033 +/- 0.258      2            98.7           97.7
Burn     0     +/- 0          0            100            100
Write    0.051 +/- 0.671      1.2          99.3           99
Read     0.053 +/- 0.384      3            98             96.7
Compile  0.139 +/- 2.29       39           99             98.6
Memload  0.166 +/- 2.25       39           98.1           97.1

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU
None     0.551 +/- 0.553      0.665        99.5
Video    0.594 +/- 0.596      0.656        99.4
X        0.019 +/- 0.317      5.49         100
Burn     179   +/- 186        193          35.9
Write    1.16  +/- 5.87       69.2         98.9
Read     0.876 +/- 0.884      1.31         99.1
Compile  193   +/- 209        499          34.1
Memload  1.11  +/- 1.59       15.3         98.9

Using 2392573 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20-ck1 at datestamp 200702181542

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.002 +/- 0.00333    0.005        100            100
Video    0.004 +/- 0.0309     0.717        100            100
X        0.008 +/- 0.124      2.99         100            100
Burn     0.002 +/- 0.00339    0.005        100            100
Write    0.03  +/- 0.228      2.99         100            100
Read     0.005 +/- 0.00636    0.017        100            100
Compile  0.041 +/- 0.268      3.06         100            100
Memload  0.31  +/- 4.83       6.3          100            100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency  % Desired CPU  % Deadlines Met
None     0.003 +/- 0.00383    0.014        100            100
X        0.008 +/- 0.143      5.99         100            100
Burn     0.003 +/- 0.00383    0.009        100            100
Write    0.023 +/- 0.219      4.57         100            100
Read     0.004 +/- 0.0047     0.017        100            100
Compile  0.027 +/- 0.214      3.73         100            100
Memload  0.015 +/- 0.113      3            100            100

---
Re: [ck] Re: 2.6.20-ck1
On 2/18/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> Generally, the penalties for getting this stuff wrong are very very high:
> orders of magnitude slowdowns in the right situations. Which I suspect will
> make any system-wide knob ultimately unsuccessful.

Yes, they were. Now it's an extremely light and well-tuned patch. kprefetchd
should only run on a totally idle system now.

> The ideal way of getting this *right* is to change every application in the
> world to get smart about using sync_page_range() and/or posix_fadvise(),
> then to add a set of command-line options to each application in the world
> so the user can control its pagecache handling.

We don't live in a perfect world. :-)

> Obviously that isn't practical. But what _could_ be done is to put these
> pagecache smarts into glibc's read() and write() code. So the user can do:
>
>   MAX_PAGECACHE=4M MAX_DIRTY_PAGECACHE=2M rsync foo bar
>
> This will provide pagecache control for pretty much every application. It
> has limitations (fork+exec behaviour??) but will be useful.

Not too useful for interactive applications with unpredictable memory
consumption behaviour, where swap prefetch still helps.

> A kernel-based solution might use new rlimits, but would not be as flexible
> or successful as a libc-based one, I suspect.
Re: [ck] Re: 2.6.20-ck1
On Sun, 18 Feb 2007 08:00:06 +1100 Con Kolivas <[EMAIL PROTECTED]> wrote:
> On Sunday 18 February 2007 05:45, Chuck Ebbert wrote:
> ...
> > But the one I like, mm-filesize_dependant_lru_cache_add.patch,
> > has an on-off switch.
> ...
> Do you still want this patch for mainline?...

Don't think so. The problems I see are:

- It's a system-wide knob. In many situations this will do the wrong thing.
  Controlling pagecache should be per-process.

- Its heuristics for working out when to invalidate the pagecache will be
  too much for some situations and too little for others.

- Whatever we do, there will be some applications in some situations which
  are hurt badly by changes like this: they'll do heaps of extra IO.

Generally, the penalties for getting this stuff wrong are very very high:
orders of magnitude slowdowns in the right situations. Which I suspect will
make any system-wide knob ultimately unsuccessful.

The ideal way of getting this *right* is to change every application in the
world to get smart about using sync_page_range() and/or posix_fadvise(), then
to add a set of command-line options to each application in the world so the
user can control its pagecache handling.

Obviously that isn't practical. But what _could_ be done is to put these
pagecache smarts into glibc's read() and write() code. So the user can do:

  MAX_PAGECACHE=4M MAX_DIRTY_PAGECACHE=2M rsync foo bar

This will provide pagecache control for pretty much every application. It has
limitations (fork+exec behaviour??) but will be useful.

A kernel-based solution might use new rlimits, but would not be as flexible
or successful as a libc-based one, I suspect.
Re: [ck] Re: 2.6.20-ck1
On 2/17/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> On Sunday 18 February 2007 05:45, Chuck Ebbert wrote:
> > Con Kolivas wrote:
> > > Maintainers are far too busy off testing code for 16+ cpus, petabytes
> > > of disk storage and so on to try it for themselves. Plus they worry
> > > incessantly that my patches may harm those precious machines'
> > > performance...
> >
> > But the one I like, mm-filesize_dependant_lru_cache_add.patch,
> > has an on-off switch.
> >
> > In other words it adds an option to do things differently.
> > How could that possibly affect any workload if that option
> > isn't enabled?
>
> Swap prefetch not only has an on-off switch, you can even build your kernel
> without it entirely so it costs even less than this patch... I'm not going
> to support the argument that it might be built into the kernel and enabled
> unknowingly and _then_ cause overhead.

The patch, the way it's written now -- is the default to build with
swap prefetch, or to build without it? If the former, maybe it would be more
accepted if the latter were the default. (Of course, that defeats the point
for desktop users who add the patch and then wonder why it doesn't work,
but... *shrugs*)

> Oh and this patch depends on some of the code from the swap prefetch patch
> too. I guess since they're so suspicious of swap prefetch the swap prefetch
> patch can be ripped apart for the portions of code required to make this
> patch work.

While I'm all for putting Con's patches into mainline, I'm worried about what
happens if you rip swap prefetch apart and (if the unthinkable happens)
somebody accidentally omits something, or worse. Then mainline would have
even more reason to be suspicious of code from you, Con. Unless you've
already split the swap prefetch patch into the parts that
mm-filesize_dependant_lru_cache_add.patch depends on and the parts it
doesn't, and checked that it's "sane" to use them independently... (I'd be
WAY more suspicious of having "half" of swap prefetch than having all of it.
I hope that most of mainline agrees with me, but I have a sneaking suspicion
they don't.)

In any case, this "ripping"... would it make the reverse happen? i.e. swap
prefetch becoming dependent on mm-filesize_dependant_lru_cache_add.patch
instead?

--
~Mike
 - Just the crazy copy cat.
Re: [ck] Re: 2.6.20-ck1
On Sunday 18 February 2007 05:45, Chuck Ebbert wrote:
> Con Kolivas wrote:
> > Maintainers are far too busy off testing code for 16+ cpus, petabytes of
> > disk storage and so on to try it for themselves. Plus they worry
> > incessantly that my patches may harm those precious machines'
> > performance...
>
> But the one I like, mm-filesize_dependant_lru_cache_add.patch,
> has an on-off switch.
>
> In other words it adds an option to do things differently.
> How could that possibly affect any workload if that option
> isn't enabled?

Swap prefetch not only has an on-off switch, you can even build your kernel
without it entirely, so it costs even less than this patch... I'm not going
to support the argument that it might be built into the kernel and enabled
unknowingly and _then_ cause overhead.

Oh, and this patch depends on some of the code from the swap prefetch patch
too. I guess since they're so suspicious of swap prefetch, the swap prefetch
patch can be ripped apart for the portions of code required to make this
patch work.

Do you still want this patch for mainline?...

--
-ck
Re: [ck] Re: 2.6.20-ck1
Con Kolivas wrote:
> Maintainers are far too busy off testing code for 16+ cpus, petabytes of
> disk storage and so on to try it for themselves. Plus they worry incessantly
> that my patches may harm those precious machines' performance...

But the one I like, mm-filesize_dependant_lru_cache_add.patch, has an on-off
switch.

In other words it adds an option to do things differently. How could that
possibly affect any workload if that option isn't enabled?
Re: [ck] Re: 2.6.20-ck1
On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> On Saturday 17 February 2007 13:15, michael chang wrote:
> > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > I'm thru with bashing my head against the wall.
> >
> > I do hope this post isn't in any way redundant, but from what I can see,
> > this has never been suggested... (someone please do enlighten me if I'm
> > wrong.)
> >
> > Has anyone tried booting a kernel with the various patches in question
> > with a mem=###M boot flag (maybe mem=96M or some other "insanely low
> > number"?) to make the kernel think it has less memory than is physically
> > available (and then compared to vanilla with the same flags)? It might
> > more clearly demonstrate the effects of Con's patches when the kernel
> > thinks (or knows) it has relatively little memory (since many critics,
> > from what I can tell, have quite a bit of memory on their systems for
> > their workloads).
> >
> > Just my two cents.
>
> Oh that's not a bad idea of course. I've been testing it like that for
> ages,

It never hurts to point out the obvious in case someone didn't notice, so
long as one doesn't become repetitive.

> and there are many -ck users who have testified to swap prefetch helping
> in low memory situations for real as well. Now how do you turn those
> testimonies into convincing arguments? Maintainers are far too busy off
> testing code for 16+ cpus, petabytes of disk storage and so on to try it
> for themselves. Plus

Pity. What about virtualization? Surely one of these 16+ CPU machines with
petabytes of disk storage can spare one CPU for an hour for a virtual
machine: just set it up with, say, a 3 GB hard drive image (which is later
deleted), 1 GB swap (ditto), 128 MB memory, and one CPU instance -- then test
with vanilla and the patches in question? (My understanding is that one of
the major "fun things" these kinds of machines have is that they come with
interesting VM features and instruction sets.)

> they worry incessantly that my patches may harm those precious machines'
> performance...

Has anyone tested it on one of these massive multi-core "beasts" and seen if
it DOES degrade performance? I want to see numbers. Since the performance
improvements for these machines are based on numbers, I want to see any
argument for degradation also in numbers -- both absolute and relative.
(Obviously, since -ck doesn't target that kind of thing, it's not possible at
the moment to prove how useful -ck is with numbers. But surely we can measure
how much of a "negative" impact it has on everything else. If it isn't
hurting anyone, then what's wrong with it?) Unfortunately, the argument that
"xyz" is just as bad/worse is hardly useful from what I've seen in kernel
talks... maybe we're missing something here.

Is it possible to command that a program's memory be put into swap on
purpose, without "forcing" it into swap by taking other memory? (Maybe such a
feature could be used to time how long it takes to restore from swap, by
timing how long the first or second display update takes on some
typically-used GUI program that takes a while to draw its GUI.)

Just a couple of additional thoughts.

--
~Mike
 - Just the crazy copy cat.

P.S. For anyone who cares and is sending replies to my messages, I am
subscribed to the ck ML, but not linux-kernel. So if you want me to see it
and it's in the latter, CC me.
Re: [ck] Re: 2.6.20-ck1
On 2/16/07, Con Kolivas [EMAIL PROTECTED] wrote: On Saturday 17 February 2007 13:15, michael chang wrote: On 2/16/07, Con Kolivas [EMAIL PROTECTED] wrote: I'm thru with bashing my head against the wall. I do hope this post isn't in any way redundant, but from what I can see, this has never been suggested... (someone please do enlighten me if I'm wrong.) Has anyone tried booting a kernel with the various patches in question with a mem=###M boot flag (maybe mem=96M or some other insanely low number ?) to make the kernel think it has less memory than is physically available (and then compare to vanilla with the same flags)? It might more clearly demonstrate the effects of Con's patches when the kernel thinks (or knows) it has relatively little memory (since many critics, from what I can tell, have quite a bit of memory on their systems for their workloads). Just my two cents. Oh that's not a bad idea of course. I've been testing it like that for ages, It never hurts to point out the obvious in case someone didn't notice, so long as one doesn't become repetitive. and there are many -ck users who have testified to swap prefetch helping in low memory situations for real as well. Now how do you turn those testimonies into convincing arguments? Maintainers are far too busy off testing code for 16+ cpus, petabytes of disk storage and so on to try it for themselves. Plus Pity. What about virtualization? Surely one of these 16+ CPU machines with petabytes of disk storage can spare one CPU for an hour for a virtual machine, just set it with like a 3 GB hard drive image (which is later deleted), 1 GB swap (ditto), 128 MB memory, and one CPU instance -- then test with vanilla and the patches in question? (My understanding is that one of the major fun things that these kinds of machines have is that they come with interesting VM features and instruction sets.) they worry incessantly that my patches may harm those precious machines' performance... 
Has anyone tested it on one of these massive multi-core beasts and seen if it DOES degrade performance? I want to see numbers. Since the performance improvements for these machines are based on numbers, I want to see any argument for degradation also in numbers. Both absolute numbers and relative numbers. (Obviously, since -ck doesn't target that kind of thing, it's not possible at the moment to prove how useful -ck is with numbers. But surely we can measure how much of a negative impact it does have on everything else. If it isn't hurting anyone, then what's wrong with it?) Unfortunately, the argument that xyz is just as bad/worse is hardly useful from what I've seen in kernel talks... maybe we're missing something here. Is it possible to command a program's memory usage be put into swap on purpose, without forcing it into swap by taking other memory? (Maybe such a feature could be used to time how long it takes to restore from swap by timing how long the first or second display update takes on some typically-used GUI program that takes a while to draw its GUI.) Just a couple of additional thoughts. -- -ck -- ~Mike - Just the crazy copy cat. P.S. For anyone who cares and is sending replies to my messages, I am subscribed to the ck ML, but not linux-kernel. So if you want me to see it and it's in the latter, CC me. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
Con Kolivas wrote: Maintainers are far too busy off testing code for 16+ cpus, petabytes of disk storage and so on to try it for themselves. Plus they worry incessantly that my patches may harm those precious machines' performance... But the one I like, mm-filesize_dependant_lru_cache_add.patch, has an on-off switch. In other words it adds an option to do things differently. How could that possibly affect any workload if that option isn't enabled? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On Sunday 18 February 2007 05:45, Chuck Ebbert wrote: Con Kolivas wrote: Maintainers are far too busy off testing code for 16+ cpus, petabytes of disk storage and so on to try it for themselves. Plus they worry incessantly that my patches may harm those precious machines' performance... But the one I like, mm-filesize_dependant_lru_cache_add.patch, has an on-off switch. In other words it adds an option to do things differently. How could that possibly affect any workload if that option isn't enabled? Swap prefetch not only has an on-off switch, you can even build your kernel without it entirely so it costs even less than this patch... I'm not going to support the argument that it might be built into the kernel and enabled unknowingly and _then_ cause overhead. Oh and this patch depends on some of the code from the swap prefetch patch too. I guess since they're so suspicious of swap prefetch the swap prefetch patch can be ripped apart for the portions of code required to make this patch work. Do you still want this patch for mainline?... -- -ck - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On 2/17/07, Con Kolivas [EMAIL PROTECTED] wrote: On Sunday 18 February 2007 05:45, Chuck Ebbert wrote: Con Kolivas wrote: Maintainers are far too busy off testing code for 16+ cpus, petabytes of disk storage and so on to try it for themselves. Plus they worry incessantly that my patches may harm those precious machines' performance... But the one I like, mm-filesize_dependant_lru_cache_add.patch, has an on-off switch. In other words it adds an option to do things differently. How could that possibly affect any workload if that option isn't enabled? Swap prefetch not only has an on-off switch, you can even build your kernel without it entirely so it costs even less than this patch... I'm not going to support the argument that it might be built into the kernel and enabled unknowingly and _then_ cause overhead. The patch, the way it's written now -- is the default to build with swap-prefetch, or build without by default? If the former, maybe it would be more accepted if the latter was the default. (Of course, that defeats the point for desktop users who add the patch and then wonder why it doesn't work, but... *shrugs*) Oh and this patch depends on some of the code from the swap prefetch patch too. I guess since they're so suspicious of swap prefetch the swap prefetch patch can be ripped apart for the portions of code required to make this patch work. While I'm all for putting Con's patches into mainline, I'm worried about what happens if you rip swap prefetch apart and (if the unthinkable happens) somebody accidentally omits something or worse. Then mainline would have even more reason to be suspicious of code from you, Con. Unless you already ripped the swap prefetch patch into the parts that mm-filesize_dependant_lru_cache_add.patch depend on and the parts it doesn't, and check it's sane to use them independently... (I'd be WAY more suspicious of having half of swap prefetch than having all of it. 
I hope that most of mainline agrees with me, but I have a sneaking suspicion they don't.) In any case, this ripping... would it make the reverse happen? i.e. swap prefetch being dependent on mm-filesize_dependant_lru_cache_add.patch instead? -- ~Mike - Just the crazy copy cat. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On Sun, 18 Feb 2007 08:00:06 +1100 Con Kolivas [EMAIL PROTECTED] wrote: On Sunday 18 February 2007 05:45, Chuck Ebbert wrote: ... But the one I like, mm-filesize_dependant_lru_cache_add.patch, has an on-off switch. ... Do you still want this patch for mainline?... Don't think so. The problems I see are: - It's a system-wide knob. In many situations this will do the wrong thing. Controlling pagecache should be per-process. - Its heuristics for working out when to invalidate the pagecache will be too much for some situations and too little for others. - Whatever we do, there will be some applications in some situations which are hurt badly by changes like this: they'll do heaps of extra IO. Generally, the penalties for getting this stuff wrong are very very high: orders of magnitude slowdowns in the right situations. Which I suspect will make any system-wide knob ultimately unsuccessful. The ideal way of getting this *right* is to change every application in the world to get smart about using sync_page_range() and/or posix_fadvise(), then to add a set of command-line options to each application in the world so the user can control its pagecache handling. Obviously that isn't practical. But what _could_ be done is to put these pagecache smarts into glibc's read() and write() code. So the user can do: MAX_PAGECACHE=4M MAX_DIRTY_PAGECACHE=2M rsync foo bar This will provide pagecache control for pretty much every application. It has limitations (fork+exec behaviour??) but will be useful. A kernel-based solution might use new rlimits, but would not be as flexible or successful as a libc-based one, I suspect. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On 2/18/07, Andrew Morton [EMAIL PROTECTED] wrote: Generally, the penalties for getting this stuff wrong are very very high: orders of magnitude slowdowns in the right situations. Which I suspect will make any system-wide knob ultimately unsuccessful. Yes, they were. Now, it's an extremely light and well-tuned patch. kprefetchd should only run on a totally idle system now. The ideal way of getting this *right* is to change every application in the world to get smart about using sync_page_range() and/or posix_fadvise(), then to add a set of command-line options to each application in the world so the user can control its pagecache handling. We don't live in a perfect world. :-) Obviously that isn't practical. But what _could_ be done is to put these pagecache smarts into glibc's read() and write() code. So the user can do: MAX_PAGECACHE=4M MAX_DIRTY_PAGECACHE=2M rsync foo bar This will provide pagecache control for pretty much every application. It has limitations (fork+exec behaviour??) but will be useful. Not too useful for interactive applications with unpredictable memory consumption behaviour, where swap-prefetch still helps. A kernel-based solution might use new rlimits, but would not be as flexible or successful as a libc-based one, I suspect. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ck] Re: 2.6.20-ck1
On Sunday 18 February 2007 13:38, Con Kolivas wrote:
> mdew . writes:
> > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > This patchset is designed to improve system responsiveness and
> > > interactivity. It is configurable to any workload but the default
> > > -ck patch is aimed at the desktop and -cks is available with more
> > > emphasis on serverspace.
> > >
> > > Apply to 2.6.20
> >
> > any benchmarks for 2.6.20-ck vs 2.6.20?
>
> Would some -ck user on the mailing list like to perform a set of
> interbench benchmarks? They're pretty straight forward to do; see:
>
> http://interbench.kolivas.org

I couldn't take down any lower power machine for these benchmarks... A
lower power single cpu machine would be better for this. Feel free to
throw any other benchmarks at it. This core2 duo 2.4 GHz with 2GB ram and
7200 rpm 16MB cache hard drive is not too discriminatory, but here are the
results (use fixed font to see):

Using 2392573 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20 at datestamp 200702181608

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.002 +/- 0.00329        0.006             100              100
Video      0.002 +/- 0.00356         0.01             100              100
X          0.007 +/- 0.0819             2             100              100
Burn       0.002 +/- 0.00335        0.005             100              100
Write      0.105 +/- 1.55            35.5             100              100
Read       0.006 +/- 0.00707        0.014             100              100
Compile    0.312 +/- 5.61             135            99.8             99.8
Memload     0.01 +/- 0.037           0.72             100              100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.004 +/- 0.00431        0.017             100              100
X          0.006 +/- 0.00608        0.013             100              100
Burn       0.003 +/- 0.00392        0.012             100              100
Write      0.097 +/- 3.44             144            99.8             99.8
Read       0.005 +/- 0.00523        0.013             100              100
Compile    0.059 +/- 1.2             36.7            99.8             99.8
Memload     0.01 +/- 0.0767          1.85             100              100

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.056 +/- 0.379              3            98.4             96.7
Video      0.033 +/- 0.258              2            98.7             97.7
Burn           0 +/- 0                  0             100              100
Write      0.051 +/- 0.671            1.2            99.3               99
Read       0.053 +/- 0.384              3              98             96.7
Compile    0.139 +/- 2.29              39              99             98.6
Memload    0.166 +/- 2.25              39            98.1             97.1

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU
None       0.551 +/- 0.553          0.665            99.5
Video      0.594 +/- 0.596          0.656            99.4
X          0.019 +/- 0.317           5.49             100
Burn         179 +/- 186              193            35.9
Write       1.16 +/- 5.87            69.2            98.9
Read       0.876 +/- 0.884           1.31            99.1
Compile      193 +/- 209              499            34.1
Memload     1.11 +/- 1.59            15.3            98.9

Using 2392573 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20-ck1 at datestamp 200702181542

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.002 +/- 0.00333        0.005             100              100
Video      0.004 +/- 0.0309         0.717             100              100
X          0.008 +/- 0.124           2.99             100              100
Burn       0.002 +/- 0.00339        0.005             100              100
Write       0.03 +/- 0.228           2.99             100              100
Read       0.005 +/- 0.00636        0.017             100              100
Compile    0.041 +/- 0.268           3.06             100              100
Memload     0.31 +/- 4.83             6.3             100              100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.003 +/- 0.00383        0.014             100              100
X          0.008 +/- 0.143           5.99             100              100
Burn       0.003 +/- 0.00383        0.009             100              100
Write      0.023 +/- 0.219           4.57             100              100
Read       0.004 +/- 0.0047         0.017             100              100
Compile    0.027 +/- 0.214           3.73             100              100
Memload    0.015 +/- 0.113              3             100              100

--- Benchmarking simulated cpu of X in the
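[Editor's illustration: the "Latency +/- SD" columns above are scheduling wakeup jitter, i.e. how late past its requested wakeup time the simulated task actually ran. This is a simplified sketch of that measurement idea, not interbench's actual code (interbench runs real background loads and calibrated cpu-burn loops); the function name and interval are hypothetical.]

```python
# Toy interbench-style measurement: request wakeups at fixed intervals
# and record how late each one fires.
import time
import statistics

def measure_wakeup_latency(interval_ms=10, samples=50):
    latencies = []
    deadline = time.perf_counter()
    for _ in range(samples):
        deadline += interval_ms / 1000.0
        now = time.perf_counter()
        if deadline > now:
            time.sleep(deadline - now)
        # Lateness past the requested wakeup, in milliseconds.
        latencies.append(max(0.0, (time.perf_counter() - deadline) * 1000.0))
    return {
        "mean_ms": statistics.mean(latencies),
        "sd_ms": statistics.stdev(latencies),
        "max_ms": max(latencies),
    }

stats = measure_wakeup_latency()
```

Run idle versus under a `make -j` compile and the mean/SD/max shift in the same way the Write and Compile rows do in the tables above.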
Re: [ck] Re: 2.6.20-ck1
On Sun, 2007-02-18 at 13:38 +1100, Con Kolivas wrote:
> mdew . writes:
> > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > This patchset is designed to improve system responsiveness and
> > > interactivity. It is configurable to any workload but the default
> > > -ck patch is aimed at the desktop and -cks is available with more
> > > emphasis on serverspace.
> > >
> > > Apply to 2.6.20
> >
> > any benchmarks for 2.6.20-ck vs 2.6.20?
>
> Would some -ck user on the mailing list like to perform a set of
> interbench benchmarks? They're pretty straight forward to do; see:
>
> http://interbench.kolivas.org
>
> --
> -ck

Here are some benches comparing 2.6.18-4-686 (Debian sid stock) and
2.6.20-ck1-mt1 (2.6.20-ck1 + sched-idleprio-1.11-2.0.patch)

I know it's not what was asked for, but it might be useful for review of
anyone using Debian kernels considering ck patches :)

Take a look.

-r

--
Rodney "meff" Gordon II -*- [EMAIL PROTECTED]
Systems Administrator / Coder Geek -*- Open yourself to OpenSource

Using 1816966 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.18-4-686 at datestamp 200702172244

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.005 +/- 0.00545        0.008             100              100
Video      0.086 +/- 0.66            16.7             100              100
X           0.03 +/- 0.272           5.32             100              100
Burn       0.005 +/- 0.00565         0.01             100              100
Write      0.043 +/- 0.281           5.28             100              100
Read        0.01 +/- 0.0293         0.537             100              100
Compile    0.013 +/- 0.119           2.91             100              100
Memload    0.033 +/- 0.289            6.2             100              100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.024 +/- 0.556           16.7             100             99.9
X          0.874 +/- 3.78            16.7             100             94.9
Burn       0.005 +/- 0.00559        0.008             100              100
Write      0.128 +/- 1.36            24.6             100             99.6
Read       0.524 +/- 2.93            16.7             100             96.9
Compile    0.136 +/- 1.43            17.3             100             99.3
Memload    0.751 +/- 3.48            17.3             100             95.7

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.293 +/- 1.34              10            92.3             89.2
Video      0.606 +/- 2.32              18            89.8             84.6
Burn       0.526 +/- 1.93              10            90.6             85.1
Write       1.35 +/- 7.79              92            87.4               84
Read         2.3 +/- 7.4               44            78.8               72
Compile     2.09 +/- 7.75              72            78.5             72.9
Memload    0.767 +/- 2.82              24            87.5             82.2

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU
None        2.64 +/- 9.31            50.6            97.4
Video      0.063 +/- 0.297           5.07            99.9
X          0.061 +/- 0.377           6.48            99.9
Burn         183 +/- 194              400            35.3
Write       1.32 +/- 6.21            80.9            98.7
Read        4.98 +/- 7               34.5            95.3
Compile      210 +/- 228              449            32.3
Memload     4.57 +/- 11.2              83            95.6

Using 1816966 loops per ms, running every load for 30 seconds
Benchmarking kernel 2.6.20-ck1-mt1 at datestamp 200702172307

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.005 +/- 0.00517        0.009             100              100
Video      0.016 +/- 0.017          0.022             100              100
X          0.018 +/- 0.13            3.17             100              100
Burn       0.005 +/- 0.00551        0.013             100              100
Write      0.016 +/- 0.0489          1.07             100              100
Read       0.016 +/- 0.102           2.48             100              100
Compile    0.051 +/- 0.421              7             100              100
Memload    0.012 +/- 0.08            1.55             100              100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.088 +/- 1.18            16.7             100             99.5
X          0.014 +/- 0.0153         0.026             100              100
Burn       0.005 +/- 0.00553        0.016             100              100
Write      0.057 +/- 0.734           16.7             100             99.8
Read       0.016 +/- 0.0187          0.21             100              100
Compile    0.042 +/- 0.328           5.59             100              100
Memload    0.014 +/- 0.0883          1.93             100              100

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.033 +/- 0.258              2            98.7             97.7
Video      0.033 +/- 0.258              2            98.7             97.7
Burn       0.046 +/- 0.337              3            98.4               97
Write      0.129 +/- 0.777              7            96.5             94.4
Read       0.292 +/- 1.75              18            94.4             91.6
Compile    0.473 +/- 2.66              28              92               89
Memload    0.178 +/- 0.98               8            96.8             93.8

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load     Latency +/- SD (ms)  Max Latency   %
Re: [ck] Re: 2.6.20-ck1
On Sun, 2007-02-18 at 00:15 -0600, Rodney Gordon II wrote:
> On Sun, 2007-02-18 at 13:38 +1100, Con Kolivas wrote:
> > mdew . writes:
> > > On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > > > This patchset is designed to improve system responsiveness and
> > > > interactivity. It is configurable to any workload but the default
> > > > -ck patch is aimed at the desktop and -cks is available with more
> > > > emphasis on serverspace.
> > > >
> > > > Apply to 2.6.20
> > >
> > > any benchmarks for 2.6.20-ck vs 2.6.20?
> >
> > Would some -ck user on the mailing list like to perform a set of
> > interbench benchmarks? They're pretty straight forward to do; see:
> >
> > http://interbench.kolivas.org
> >
> > --
> > -ck
>
> Here are some benches comparing 2.6.18-4-686 (Debian sid stock) and
> 2.6.20-ck1-mt1 (2.6.20-ck1 + sched-idleprio-1.11-2.0.patch)
>
> I know it's not what was asked for, but it might be useful for review of
> anyone using Debian kernels considering ck patches :)
>
> Take a look.
>
> -r

System specs by the way: Pentium-D 830 3.0GHz Dualcore, 1.5GB RAM, 7200RPM
16MB Cache SATA3 using AHCI w/ NCQ on.

--
Rodney "meff" Gordon II -*- [EMAIL PROTECTED]
Systems Administrator / Coder Geek -*- Open yourself to OpenSource
Re: [ck] Re: 2.6.20-ck1
On Saturday 17 February 2007 13:15, michael chang wrote:
> On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > I'm thru with bashing my head against the wall.
>
> I do hope this post isn't in any way redundant, but from what I can see,
> this has never been suggested... (someone please do enlighten me if I'm
> wrong.)
>
> Has anyone tried booting a kernel with the various patches in question
> with a mem=###M boot flag (maybe mem=96M or some other "insanely low
> number"?) to make the kernel think it has less memory than is physically
> available (and then compare to vanilla with the same flags)? It might
> more clearly demonstrate the effects of Con's patches when the kernel
> thinks (or knows) it has relatively little memory (since many critics,
> from what I can tell, have quite a bit of memory on their systems for
> their workloads).
>
> Just my two cents.

Oh that's not a bad idea of course. I've been testing it like that for
ages, and there are many -ck users who have testified to swap prefetch
helping in low memory situations for real as well. Now how do you turn
those testimonies into convincing arguments? Maintainers are far too busy
off testing code for 16+ cpus, petabytes of disk storage and so on to try
it for themselves. Plus they worry incessantly that my patches may harm
those precious machines' performance...

--
-ck
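[Editor's illustration: one way to run the mem= comparison suggested above is to boot the same kernel twice from two boot entries, one with memory capped. A hedged sketch of a GRUB legacy menu.lst fragment, as was typical in 2007; the kernel image path and root device are placeholders, not taken from the thread.]

```
# /boot/grub/menu.lst -- second entry caps visible RAM at 96MB so the VM
# behaviour (e.g. swap prefetch) is actually exercised under memory pressure.
title  2.6.20-ck1
kernel /boot/vmlinuz-2.6.20-ck1 root=/dev/sda1 ro

title  2.6.20-ck1 (mem=96M)
kernel /boot/vmlinuz-2.6.20-ck1 root=/dev/sda1 ro mem=96M
```

After booting, `free -m` or /proc/meminfo should confirm the cap before running the benchmarks on each entry.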
Re: [ck] Re: 2.6.20-ck1
On 2/16/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> I'm thru with bashing my head against the wall.

I do hope this post isn't in any way redundant, but from what I can see,
this has never been suggested... (someone please do enlighten me if I'm
wrong.)

Has anyone tried booting a kernel with the various patches in question
with a mem=###M boot flag (maybe mem=96M or some other "insanely low
number"?) to make the kernel think it has less memory than is physically
available (and then compare to vanilla with the same flags)? It might more
clearly demonstrate the effects of Con's patches when the kernel thinks
(or knows) it has relatively little memory (since many critics, from what
I can tell, have quite a bit of memory on their systems for their
workloads).

Just my two cents.

--
~Mike
 - Just the crazy copy cat.