Re: Unpredictable performance
Nick Piggin wrote:
> On Saturday 26 January 2008 02:03, Asbjørn Sannes wrote:
>> Asbjørn Sannes wrote:
>>> Nick Piggin wrote:
>>>> On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
>>>>> Hi,
>>>>>
>>>>> I am experiencing unpredictable results with the following test
>>>>> without other processes running (exception is udev, I believe):
>>>>> cd /usr/src/test
>>>>> tar -jxf ../linux-2.6.22.12
>>>>> cp ../working-config linux-2.6.22.12/.config
>>>>> cd linux-2.6.22.12
>>>>> make oldconfig
>>>>> time make -j3 > /dev/null # This is what I note down as a "test" result
>>>>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>>>>> and then reboot
>>>>>
>>>>> The kernel is booted with the parameter mem=8192
>>>>>
>>>>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>>>>> 45m32.703s (30 runs)
>>>>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>>>>> For 2.6.22.14 also varied a lot.. but, lost results :(
>>>>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>>>>
>>>>> Any idea of what can cause this? I have tried to make the runs as equal
>>>>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>>>>
>>>>> sys and user time only varies a couple of seconds.. and the order of
>>>>> when it is "fast" and when it is "slow" is completely random, but it
>>>>> seems that the results are mostly concentrated around the mean.
>>>>
>>>> Hmm, lots of things could cause it. With such big variations in
>>>> elapsed time, and small variations on CPU time, I guess the fs/IO
>>>> layers are the prime suspects, although it could also involve the
>>>> VM if you are doing a fair amount of page reclaim.
>>>>
>>>> Can you boot with enough memory such that it never enters page
>>>> reclaim? `grep scan /proc/vmstat` to check.
>>>>
>>>> Otherwise you could mount the working directory as tmpfs to
>>>> eliminate IO.
>>>>
>>>> bisecting it down to a single patch would be really helpful if you
>>>> can spare the time.
>>>
>>> I'm going to run some tests without limiting the memory to 80 megabytes
>>> (so that it is 2 gigabyte) and see how much it varies then, but if I
>>> recall correctly it did not vary much. I'll reply to this e-mail with
>>> the results.
>>
>> 5 runs gives me:
>> real    5m58.626s
>> real    5m57.280s
>> real    5m56.584s
>> real    5m57.565s
>> real    5m56.613s
>>
>> Should I test with tmpfs as well?
>
> I wouldn't worry about it. It seems like it might be due to page reclaim
> (fs / IO can't be ruled out completely though). Hmm, I haven't been
> following reclaim so closely lately; you say it started going bad around
> 2.6.22? It may be lumpy reclaim patches?

Going to bisect it soon, but I suspect it will take some time (considering
how many runs I need to make any sense of the results).

--
Asbjorn Sannes
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Unpredictable performance
Ray Lee wrote:
> On Jan 25, 2008 12:49 PM, Asbjørn Sannes <[EMAIL PROTECTED]> wrote:
>> Ray Lee wrote:
>>> On Jan 25, 2008 3:32 AM, Asbjorn Sannes <[EMAIL PROTECTED]> wrote:
>>>> Hi,
>>>>
>>>> I am experiencing unpredictable results with the following test
>>>> without other processes running (exception is udev, I believe):
>>>> cd /usr/src/test
>>>> tar -jxf ../linux-2.6.22.12
>>>> cp ../working-config linux-2.6.22.12/.config
>>>> cd linux-2.6.22.12
>>>> make oldconfig
>>>> time make -j3 > /dev/null # This is what I note down as a "test" result
>>>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>>>> and then reboot
>>>>
>>>> The kernel is booted with the parameter mem=8192
>>>>
>>>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>>>> 45m32.703s (30 runs)
>>>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>>>> For 2.6.22.14 also varied a lot.. but, lost results :(
>>>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>>>
>>>> Any idea of what can cause this? I have tried to make the runs as equal
>>>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>>>
>>>> sys and user time only varies a couple of seconds.. and the order of
>>>> when it is "fast" and when it is "slow" is completely random, but it
>>>> seems that the results are mostly concentrated around the mean.
>>
>> .. I may have jumped the gun a "little" early saying that it is mostly
>> concentrated around the mean, grepping from memory is not always .. hm,
>> accurate :P
>
> For you (or anyone!) to have any faith in your conclusions at all, you
> need to generate the mean and the standard deviation of each of your
> runs.
>
>>> First off, not all tests are good tests. In particular, small timing
>>> differences can get magnified horrendously by heading into swap.
>>
>> So, what you are saying is that it is expected to vary this much under
>> memory pressure? That I can not do anything with this on real hardware?
>
> No, I'm saying exactly what I wrote.
>
> What you're testing is basically a bunch of processes competing for
> the CPU scheduler. Who wins that competition is essentially random.
> Whoever wins then places pressure on the IO subsystem. If you then go
> into swap, you're then placing even *more* random pressure on the IO
> system. The reason is that the order of the requests you're asking it
> to do vary *wildly* between each of your 'tests', and disk drives have
> a horrible time seeking between tracks. That's how minute differences
> in the kernel's behavior can get magnified thousands of times if you
> start hitting swap, or running a test that won't all fit into cache.

Yes, I get this, just to make it clear, the test is supposed to throw
things out from the page cache, because this is in part what I want to
test. I was under the impression that (not anymore though) a kernel
compile would be pretty deterministic in how it would pressure the page
cache and how much I/O would result from that.. so, I'm going to try and
change the test.

> So whatever you're trying to measure, you need to be aware that you're
> basically throwing a random number generator into the mix.
>
>>> That said, do you have the means and standard deviations of those
>>> runs? That's a good way to tell whether the tests are converging or
>>> not, and whether your results are telling you anything.
>>
>> I have all the numbers, I was just hoping that there was a way to
>> benchmark a small change without a lot of runs. It seems to me to be
>> quite randomly distributed ..
>
> Sure, you just keep running the tests until your standard deviation
> converges to a significant enough range, where significant is whatever
> you like it to be (+- one minute, say, or 10 seconds, or whatever).
> But beware, if your test is essentially random, then it may never
> converge. That in itself is interesting, too.
>
>> It seems to me to be quite randomly distributed ..
>> from the 2.6.23.14 runs:
>> 43m10.022s, 34m31.104s, 43m47.221s, 41m17.840s, 34m15.454s,
>> 37m54.327s, 35m6.193s, 38m16.909s, 37m45.411s, 40m13.169s
>> 38m17.414s, 34m37.561s, 43m18.181s, 35m46.233s, 34m44.414s,
>> 39m55.257s, 35m28.477s, 33m30.551s, 41m36.394s, 43m6.359s,
>> 42m42.396s, 37m44.293s, 41m6.615s, 35m43.084s, 39m25.846s,
>> 34m23.753s, 36m0.556s, 41m38.095s, 45m32.703s, 36m18.325s,
>> 42m4.840s, 43m53.759s, 35m51.138s, 40m19.001s
>>
>> Say I made a histogram of this (tilt your head :P) with 1 minute intervals:
>> 33 *
>> 34 *****
>> 35 *****
>> 36 **
>> 37 ***
>> 38 **
>> 39 **
>> 40 **
>> 41 ****
>> 42 **
>> 43 *****
>> 44
>> 45 *
>
> Mean is 2328s and standard deviation is 210s.
> Just eyeballing that, I can tell you your standard deviation is large,
> and you would therefore need to run more tests.
>
> However, let me just underscore that unless you're planning on setting
> up a compile server that you *want* to push into swap all the time,
> then this is a pretty silly test for getting good numbers, and instead
> should try testing something closer to what you're actually concerned
> about. As an example -- there's
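Ray's figures ("mean is 2328s and standard deviation is 210s") can be checked directly against the run times listed above; a quick sketch with Python's stdlib `statistics` module (the time strings are transcribed from the message, commas dropped):

```python
import re
import statistics

raw = """43m10.022s 34m31.104s 43m47.221s 41m17.840s 34m15.454s
37m54.327s 35m6.193s 38m16.909s 37m45.411s 40m13.169s
38m17.414s 34m37.561s 43m18.181s 35m46.233s 34m44.414s
39m55.257s 35m28.477s 33m30.551s 41m36.394s 43m6.359s
42m42.396s 37m44.293s 41m6.615s 35m43.084s 39m25.846s
34m23.753s 36m0.556s 41m38.095s 45m32.703s 36m18.325s
42m4.840s 43m53.759s 35m51.138s 40m19.001s""".split()

def to_seconds(t):
    # "43m10.022s" -> 2590.022
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", t).groups()
    return int(minutes) * 60 + float(seconds)

secs = [to_seconds(t) for t in raw]
print(round(statistics.mean(secs)))   # 2328, matching Ray's figure
print(round(statistics.stdev(secs)))  # ~210
```

Note this is the sample standard deviation; the relative spread is roughly 9% of the mean, which is why Ray calls it large.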
Re: Unpredictable performance
On Saturday 26 January 2008 02:03, Asbjørn Sannes wrote:
> Asbjørn Sannes wrote:
>> Nick Piggin wrote:
>>> On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
>>>> Hi,
>>>>
>>>> I am experiencing unpredictable results with the following test
>>>> without other processes running (exception is udev, I believe):
>>>> cd /usr/src/test
>>>> tar -jxf ../linux-2.6.22.12
>>>> cp ../working-config linux-2.6.22.12/.config
>>>> cd linux-2.6.22.12
>>>> make oldconfig
>>>> time make -j3 > /dev/null # This is what I note down as a "test" result
>>>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>>>> and then reboot
>>>>
>>>> The kernel is booted with the parameter mem=8192
>>>>
>>>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>>>> 45m32.703s (30 runs)
>>>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>>>> For 2.6.22.14 also varied a lot.. but, lost results :(
>>>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>>>
>>>> Any idea of what can cause this? I have tried to make the runs as equal
>>>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>>>
>>>> sys and user time only varies a couple of seconds.. and the order of
>>>> when it is "fast" and when it is "slow" is completely random, but it
>>>> seems that the results are mostly concentrated around the mean.
>>>
>>> Hmm, lots of things could cause it. With such big variations in
>>> elapsed time, and small variations on CPU time, I guess the fs/IO
>>> layers are the prime suspects, although it could also involve the
>>> VM if you are doing a fair amount of page reclaim.
>>>
>>> Can you boot with enough memory such that it never enters page
>>> reclaim? `grep scan /proc/vmstat` to check.
>>>
>>> Otherwise you could mount the working directory as tmpfs to
>>> eliminate IO.
>>>
>>> bisecting it down to a single patch would be really helpful if you
>>> can spare the time.
>>
>> I'm going to run some tests without limiting the memory to 80 megabytes
>> (so that it is 2 gigabyte) and see how much it varies then, but if I
>> recall correctly it did not vary much. I'll reply to this e-mail with
>> the results.
>
> 5 runs gives me:
> real    5m58.626s
> real    5m57.280s
> real    5m56.584s
> real    5m57.565s
> real    5m56.613s
>
> Should I test with tmpfs as well?

I wouldn't worry about it. It seems like it might be due to page reclaim
(fs / IO can't be ruled out completely though). Hmm, I haven't been
following reclaim so closely lately; you say it started going bad around
2.6.22? It may be lumpy reclaim patches?
Re: Unpredictable performance
Ray Lee wrote:
> On Jan 25, 2008 3:32 AM, Asbjorn Sannes <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I am experiencing unpredictable results with the following test
>> without other processes running (exception is udev, I believe):
>> cd /usr/src/test
>> tar -jxf ../linux-2.6.22.12
>> cp ../working-config linux-2.6.22.12/.config
>> cd linux-2.6.22.12
>> make oldconfig
>> time make -j3 > /dev/null # This is what I note down as a "test" result
>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>> and then reboot
>>
>> The kernel is booted with the parameter mem=8192
>>
>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>> 45m32.703s (30 runs)
>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>> For 2.6.22.14 also varied a lot.. but, lost results :(
>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>
>> Any idea of what can cause this? I have tried to make the runs as equal
>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>
>> sys and user time only varies a couple of seconds.. and the order of
>> when it is "fast" and when it is "slow" is completely random, but it
>> seems that the results are mostly concentrated around the mean.

.. I may have jumped the gun a "little" early saying that it is mostly
concentrated around the mean, grepping from memory is not always .. hm,
accurate :P

> First off, not all tests are good tests. In particular, small timing
> differences can get magnified horrendously by heading into swap.

So, what you are saying is that it is expected to vary this much under
memory pressure? That I can not do anything with this on real hardware?

> That said, do you have the means and standard deviations of those
> runs? That's a good way to tell whether the tests are converging or
> not, and whether your results are telling you anything.

I have all the numbers, I was just hoping that there was a way to
benchmark a small change without a lot of runs. It seems to me to be
quite randomly distributed .. from the 2.6.23.14 runs:
43m10.022s, 34m31.104s, 43m47.221s, 41m17.840s, 34m15.454s,
37m54.327s, 35m6.193s, 38m16.909s, 37m45.411s, 40m13.169s
38m17.414s, 34m37.561s, 43m18.181s, 35m46.233s, 34m44.414s,
39m55.257s, 35m28.477s, 33m30.551s, 41m36.394s, 43m6.359s,
42m42.396s, 37m44.293s, 41m6.615s, 35m43.084s, 39m25.846s,
34m23.753s, 36m0.556s, 41m38.095s, 45m32.703s, 36m18.325s,
42m4.840s, 43m53.759s, 35m51.138s, 40m19.001s

Say I made a histogram of this (tilt your head :P) with 1 minute intervals:
33 *
34 *****
35 *****
36 **
37 ***
38 **
39 **
40 **
41 ****
42 **
43 *****
44
45 *

I don't really know what to make of that.. Going to see what happens
with less memory and make -j1, perhaps it will be more stable.

> Also as you're on a uniprocessor system, make -j2 is probably going to
> be faster than make -j3. Perhaps immaterial to whatever you're trying
> to test, but there you go.

Yes, I was hoping to have a more deterministic test to get a higher
confidence in fewer runs when testing changes. Especially under memory
pressure. And I truly was not expecting this much fluctuation, which is
why I tested several kernel versions to see if this influenced it and
mailed lkml. The computer is actually a dual core amd processor, but I
compiled the kernel with no smp to see if that helped on the dispersion.

--
Asbjorn Sannes
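The minute-binned histogram above can be regenerated mechanically from the listed run times; a short Python sketch (times transcribed from the message, binned by whole minutes):

```python
from collections import Counter

raw = """43m10.022s 34m31.104s 43m47.221s 41m17.840s 34m15.454s
37m54.327s 35m6.193s 38m16.909s 37m45.411s 40m13.169s
38m17.414s 34m37.561s 43m18.181s 35m46.233s 34m44.414s
39m55.257s 35m28.477s 33m30.551s 41m36.394s 43m6.359s
42m42.396s 37m44.293s 41m6.615s 35m43.084s 39m25.846s
34m23.753s 36m0.556s 41m38.095s 45m32.703s 36m18.325s
42m4.840s 43m53.759s 35m51.138s 40m19.001s""".split()

# Bin each run by its whole-minute part ("43m10.022s" -> 43).
minutes = [int(t.split("m")[0]) for t in raw]
hist = Counter(minutes)

for m in range(min(minutes), max(minutes) + 1):
    print(m, "*" * hist[m])
```

The counts cluster in two rough groups (around 34-35 minutes and around 41-43 minutes) rather than in a single peak, which fits the "essentially random" reading of the results.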
Re: Unpredictable performance
On Jan 25, 2008 3:32 AM, Asbjorn Sannes <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am experiencing unpredictable results with the following test
> without other processes running (exception is udev, I believe):
> cd /usr/src/test
> tar -jxf ../linux-2.6.22.12
> cp ../working-config linux-2.6.22.12/.config
> cd linux-2.6.22.12
> make oldconfig
> time make -j3 > /dev/null # This is what I note down as a "test" result
> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
> and then reboot
>
> The kernel is booted with the parameter mem=8192
>
> For 2.6.23.14 the results vary from (real time) 33m30.551s to
> 45m32.703s (30 runs)
> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
> For 2.6.22.14 also varied a lot.. but, lost results :(
> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>
> Any idea of what can cause this? I have tried to make the runs as equal
> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>
> sys and user time only varies a couple of seconds.. and the order of
> when it is "fast" and when it is "slow" is completely random, but it
> seems that the results are mostly concentrated around the mean.

First off, not all tests are good tests. In particular, small timing
differences can get magnified horrendously by heading into swap.

That said, do you have the means and standard deviations of those
runs? That's a good way to tell whether the tests are converging or
not, and whether your results are telling you anything.

Also as you're on a uniprocessor system, make -j2 is probably going to
be faster than make -j3. Perhaps immaterial to whatever you're trying
to test, but there you go.
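Ray's point about checking whether the runs are converging can be made concrete with a simple stopping rule; a rough sketch (the rule and the sample numbers are illustrative, not from the thread):

```python
import math
import statistics

def mean_halfwidth(samples, z=1.96):
    # Approximate 95% confidence half-width of the mean (normal
    # approximation). Keep collecting runs until this drops below
    # whatever tolerance matters to you, e.g. 60 seconds.
    return z * statistics.stdev(samples) / math.sqrt(len(samples))

# Illustrative run times in seconds (made up, roughly thread-scale):
runs = [2590.0, 2071.1, 2627.2, 2477.8, 2055.5]
print(mean_halfwidth(runs))
```

With a spread this wide, five runs pin the mean down only to within a few minutes; halving the uncertainty requires roughly four times as many runs, which is why a noisy benchmark needs so many repetitions.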
Re: Unpredictable performance
Asbjørn Sannes wrote:
> Nick Piggin wrote:
>> On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
>>> Hi,
>>>
>>> I am experiencing unpredictable results with the following test
>>> without other processes running (exception is udev, I believe):
>>> cd /usr/src/test
>>> tar -jxf ../linux-2.6.22.12
>>> cp ../working-config linux-2.6.22.12/.config
>>> cd linux-2.6.22.12
>>> make oldconfig
>>> time make -j3 > /dev/null # This is what I note down as a "test" result
>>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>>> and then reboot
>>>
>>> The kernel is booted with the parameter mem=8192
>>>
>>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>>> 45m32.703s (30 runs)
>>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>>> For 2.6.22.14 also varied a lot.. but, lost results :(
>>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>>
>>> Any idea of what can cause this? I have tried to make the runs as equal
>>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>>
>>> sys and user time only varies a couple of seconds.. and the order of
>>> when it is "fast" and when it is "slow" is completely random, but it
>>> seems that the results are mostly concentrated around the mean.
>>
>> Hmm, lots of things could cause it. With such big variations in
>> elapsed time, and small variations on CPU time, I guess the fs/IO
>> layers are the prime suspects, although it could also involve the
>> VM if you are doing a fair amount of page reclaim.
>>
>> Can you boot with enough memory such that it never enters page
>> reclaim? `grep scan /proc/vmstat` to check.
>>
>> Otherwise you could mount the working directory as tmpfs to
>> eliminate IO.
>>
>> bisecting it down to a single patch would be really helpful if you
>> can spare the time.
>
> I'm going to run some tests without limiting the memory to 80 megabytes
> (so that it is 2 gigabyte) and see how much it varies then, but if I
> recall correctly it did not vary much. I'll reply to this e-mail with
> the results.

5 runs gives me:
real    5m58.626s
real    5m57.280s
real    5m56.584s
real    5m57.565s
real    5m56.613s

Should I test with tmpfs as well?

--
Asbjorn Sannes
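For contrast, the five unrestricted-memory runs reported here are extremely tight; a quick check of their spread (times transcribed from the message):

```python
import statistics

# real times of the five 2-gigabyte runs, converted to seconds
runs = [5 * 60 + s for s in (58.626, 57.280, 56.584, 57.565, 56.613)]

spread = statistics.stdev(runs) / statistics.mean(runs)
print(spread)  # coefficient of variation: well under one percent
```

That is under a second of scatter on a ~6 minute build, versus a standard deviation of minutes for the memory-limited runs, which is what points the suspicion at page reclaim and IO rather than the scheduler.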
Re: Unpredictable performance
Nick Piggin wrote:
> On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
>> Hi,
>>
>> I am experiencing unpredictable results with the following test
>> without other processes running (exception is udev, I believe):
>> cd /usr/src/test
>> tar -jxf ../linux-2.6.22.12
>> cp ../working-config linux-2.6.22.12/.config
>> cd linux-2.6.22.12
>> make oldconfig
>> time make -j3 > /dev/null # This is what I note down as a "test" result
>> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
>> and then reboot
>>
>> The kernel is booted with the parameter mem=8192
>>
>> For 2.6.23.14 the results vary from (real time) 33m30.551s to
>> 45m32.703s (30 runs)
>> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
>> For 2.6.22.14 also varied a lot.. but, lost results :(
>> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>>
>> Any idea of what can cause this? I have tried to make the runs as equal
>> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>>
>> sys and user time only varies a couple of seconds.. and the order of
>> when it is "fast" and when it is "slow" is completely random, but it
>> seems that the results are mostly concentrated around the mean.
>
> Hmm, lots of things could cause it. With such big variations in
> elapsed time, and small variations on CPU time, I guess the fs/IO
> layers are the prime suspects, although it could also involve the
> VM if you are doing a fair amount of page reclaim.
>
> Can you boot with enough memory such that it never enters page
> reclaim? `grep scan /proc/vmstat` to check.
>
> Otherwise you could mount the working directory as tmpfs to
> eliminate IO.
>
> bisecting it down to a single patch would be really helpful if you
> can spare the time.

I'm going to run some tests without limiting the memory to 80 megabytes
(so that it is 2 gigabyte) and see how much it varies then, but if I
recall correctly it did not vary much. I'll reply to this e-mail with
the results.

I can do some bisecting next week and see if I find anything, but it
will probably take a lot of time considering that I need to do enough
runs.. how much should this vary anyway?

The kernel is compiled as an UP kernel and there is nothing running in
parallel with it.. it is basically a .sh script running on boot
appending the output of time to a file .. formatting and rebooting.

--
Asbjørn Sannes
Re: Unpredictable performance
On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
> Hi,
>
> I am experiencing unpredictable results with the following test
> without other processes running (exception is udev, I believe):
> cd /usr/src/test
> tar -jxf ../linux-2.6.22.12
> cp ../working-config linux-2.6.22.12/.config
> cd linux-2.6.22.12
> make oldconfig
> time make -j3 > /dev/null # This is what I note down as a "test" result
> cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
> and then reboot
>
> The kernel is booted with the parameter mem=8192
>
> For 2.6.23.14 the results vary from (real time) 33m30.551s to
> 45m32.703s (30 runs)
> For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
> For 2.6.22.14 also varied a lot.. but, lost results :(
> For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)
>
> Any idea of what can cause this? I have tried to make the runs as equal
> as possible, rebooting between each run.. i/o scheduler is cfq as default.
>
> sys and user time only varies a couple of seconds.. and the order of
> when it is "fast" and when it is "slow" is completely random, but it
> seems that the results are mostly concentrated around the mean.

Hmm, lots of things could cause it. With such big variations in
elapsed time, and small variations on CPU time, I guess the fs/IO
layers are the prime suspects, although it could also involve the
VM if you are doing a fair amount of page reclaim.

Can you boot with enough memory such that it never enters page
reclaim? `grep scan /proc/vmstat` to check.

Otherwise you could mount the working directory as tmpfs to
eliminate IO.

bisecting it down to a single patch would be really helpful if you
can spare the time.

Thanks,
Nick
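Nick's `grep scan /proc/vmstat` check looks for the page-scan counters; here is a small sketch of the idea, operating on a text snapshot so it is self-contained (the counter names and values in `sample` are illustrative, and the exact names vary between kernel versions):

```python
def scan_counters(vmstat_text):
    # Pick out the reclaim scan counters from /proc/vmstat-style
    # "name value" lines. If these counters grow across a benchmark
    # run, the VM entered page reclaim during it.
    return {name: int(value)
            for name, value in (line.split() for line in vmstat_text.splitlines())
            if "scan" in name}

# Illustrative snapshot (made-up names and values, not from the thread):
sample = """\
nr_dirty 114
pgscan_kswapd_normal 52344
pgscan_kswapd_dma 0
pgscan_direct_normal 1203
pgfault 991234"""

print(scan_counters(sample))
```

In practice one would snapshot `open("/proc/vmstat").read()` before and after the build and compare the two dictionaries; any positive delta means reclaim ran.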
Unpredictable performance
Hi,

I am experiencing unpredictable results with the following test
without other processes running (exception is udev, I believe):
cd /usr/src/test
tar -jxf ../linux-2.6.22.12
cp ../working-config linux-2.6.22.12/.config
cd linux-2.6.22.12
make oldconfig
time make -j3 > /dev/null # This is what I note down as a "test" result
cd /usr/src ; umount /usr/src/test ; mkfs.ext3 /dev/cc/test
and then reboot

The kernel is booted with the parameter mem=8192

For 2.6.23.14 the results vary from (real time) 33m30.551s to
45m32.703s (30 runs)
For 2.6.23.14 with nop i/o scheduler from 29m8.827s to 55m36.744s (24 runs)
For 2.6.22.14 also varied a lot.. but, lost results :(
For 2.6.20.21 only varies from 34m32.054s to 38m1.928s (10 runs)

Any idea of what can cause this? I have tried to make the runs as equal
as possible, rebooting between each run.. i/o scheduler is cfq as default.

sys and user time only varies a couple of seconds.. and the order of
when it is "fast" and when it is "slow" is completely random, but it
seems that the results are mostly concentrated around the mean.

--
Asbjørn Sannes
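The key symptom in this report is that real time swings by many minutes while user and sys times barely move; the difference between them is time the build spent off-CPU (IO wait, page reclaim, and so on). A trivial sketch of that bookkeeping (the figures are made up for illustration, not from the thread):

```python
def off_cpu(real, user, sys):
    # Wall-clock time not accounted for by CPU time: the build was
    # waiting on disk, page reclaim, or the scheduler.
    return real - (user + sys)

# Made-up figures for a fast and a slow run with near-identical CPU time:
print(off_cpu(real=2010.0, user=1700.0, sys=150.0))  # 160.0
print(off_cpu(real=2732.0, user=1702.0, sys=151.0))  # 879.0
```

When only this off-CPU component varies between runs, the compiler workload itself is reproducible and the variance lives in the IO and memory-reclaim paths, which is the diagnosis the replies converge on.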
Re: Unpredictable performance
Ray Lee wrote:
> On Jan 25, 2008 3:32 AM, Asbjorn Sannes [EMAIL PROTECTED] wrote:
>> [... test description and timing results snipped ...]
>>
>> sys and user time only varies a couple of seconds.. and the order of
>> when it is "fast" and when it is "slow" is completely random, but it
>> seems that the results are mostly concentrated around the mean.

I may have jumped the gun a little early saying that it is mostly
concentrated around the mean; grepping from memory is not always ..
hm, accurate :P

> First off, not all tests are good tests. In particular, small timing
> differences can get magnified horrendously by heading into swap.

So, what you are saying is that it is expected to vary this much under
memory pressure? That I cannot do anything about this on real hardware?

> That said, do you have the means and standard deviations of those runs?
> That's a good way to tell whether the tests are converging or not, and
> whether your results are telling you anything.

I have all the numbers; I was just hoping that there was a way to
benchmark a small change without a lot of runs. It seems to me to be
quite randomly distributed ..
from the 2.6.23.14 runs:

43m10.022s, 34m31.104s, 43m47.221s, 41m17.840s, 34m15.454s, 37m54.327s,
35m6.193s, 38m16.909s, 37m45.411s, 40m13.169s, 38m17.414s, 34m37.561s,
43m18.181s, 35m46.233s, 34m44.414s, 39m55.257s, 35m28.477s, 33m30.551s,
41m36.394s, 43m6.359s, 42m42.396s, 37m44.293s, 41m6.615s, 35m43.084s,
39m25.846s, 34m23.753s, 36m0.556s, 41m38.095s, 45m32.703s, 36m18.325s,
42m4.840s, 43m53.759s, 35m51.138s, 40m19.001s

Say I made a histogram of this (tilt your head :P) with 1 minute intervals:

33 *
34 *****
35 *****
36 **
37 ***
38 **
39 **
40 **
41 ****
42 **
43 *****
44
45 *

I don't really know what to make of that.. Going to see what happens
with less memory and make -j1, perhaps it will be more stable.

> Also as you're on a uniprocessor system, make -j2 is probably going to
> be faster than make -j3. Perhaps immaterial to whatever you're trying
> to test, but there you go.

Yes, I was hoping to have a more deterministic test to get higher
confidence from fewer runs when testing changes, especially under memory
pressure. And I truly was not expecting this much fluctuation, which is
why I tested several kernel versions to see if that influenced it and
mailed lkml. The computer is actually a dual core AMD processor, but I
compiled the kernel without SMP to see if that helped with the
dispersion.

--
Asbjorn Sannes
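For the means and standard deviations Ray asked about, a small helper like this would do (the `stats` name is invented; it converts the "43m10.022s" wall-clock format to seconds and treats the samples as a whole population):

```shell
# Convert "MMmSS.SSSs"-style times to seconds, then print count, mean
# and standard deviation -- a quick check of whether runs converge.
stats() {
    echo "$@" | tr ' ,' '\n\n' | awk -F'[ms]' 'NF >= 2 {
        t = $1 * 60 + $2                # minutes + seconds -> seconds
        sum += t; sumsq += t * t; n++
    } END {
        mean = sum / n
        sd = sqrt(sumsq / n - mean * mean)
        printf "n=%d mean=%.1fs sd=%.1fs\n", n, mean, sd
    }'
}

# e.g. on the first five of the 2.6.23.14 runs listed above:
stats "43m10.022s 34m31.104s 43m47.221s 41m17.840s 34m15.454s"
```

Run over all 34 samples it would show whether the spread is plain noise or something more structured than the histogram suggests.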
Re: Unpredictable performance
On Saturday 26 January 2008 02:03, Asbjørn Sannes wrote:
> Asbjørn Sannes wrote:
>> Nick Piggin wrote:
>>> [... original report and Nick's suggestions snipped ...]
>>
>> I'm going to run some tests without limiting the memory to 80 megabytes
>> (so that it is 2 gigabytes) and see how much it varies then, but if I
>> recall correctly it did not vary much. I'll reply to this e-mail with
>> the results.
> 5 runs gives me:
>
> real    5m58.626s
> real    5m57.280s
> real    5m56.584s
> real    5m57.565s
> real    5m56.613s
>
> Should I test with tmpfs as well?

I wouldn't worry about it. It seems like it might be due to page reclaim
(fs / IO can't be ruled out completely, though). Hmm, I haven't been
following reclaim so closely lately; you say it started going bad around
2.6.22? It may be the lumpy reclaim patches?
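If the tmpfs variant were tried anyway, the setup would be something like the function below (hypothetical: the 2g size is a guess and must hold the tree plus build objects, and mounting needs root, so it is only defined here):

```shell
#!/bin/sh
# Build entirely in RAM so the block layer and ext3 drop out of the
# picture; any remaining variance would then be VM/scheduler-side.
setup_tmpfs_build() {
    mount -t tmpfs -o size=2g tmpfs /usr/src/test || return 1
    cd /usr/src/test || return 1
    tar -jxf ../linux-2.6.22.12
}
```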
Re: Unpredictable performance
On Jan 25, 2008 3:32 AM, Asbjorn Sannes [EMAIL PROTECTED] wrote:
> [... test description and timing results snipped ...]

First off, not all tests are good tests. In particular, small timing
differences can get magnified horrendously by heading into swap.

That said, do you have the means and standard deviations of those runs?
That's a good way to tell whether the tests are converging or not, and
whether your results are telling you anything.

Also, as you're on a uniprocessor system, make -j2 is probably going to
be faster than make -j3. Perhaps immaterial to whatever you're trying to
test, but there you go.
Re: Unpredictable performance
Nick Piggin wrote:
> On Friday 25 January 2008 22:32, Asbjorn Sannes wrote:
>> [... test description and timing results snipped ...]
>
> Hmm, lots of things could cause it. With such big variations in elapsed
> time, and small variations in CPU time, I guess the fs/IO layers are
> the prime suspects, although it could also involve the VM if you are
> doing a fair amount of page reclaim.
>
> Can you boot with enough memory such that it never enters page reclaim?
> `grep scan /proc/vmstat` to check. Otherwise you could mount the
> working directory as tmpfs to eliminate IO.
>
> Bisecting it down to a single patch would be really helpful if you can
> spare the time.

I'm going to run some tests without limiting the memory to 80 megabytes
(so that it is 2 gigabytes) and see how much it varies then, but if I
recall correctly it did not vary much. I'll reply to this e-mail with
the results.
I can do some bisecting next week and see if I find anything, but it
will probably take a lot of time considering that I need to do enough
runs.. how much should this vary anyway?

The kernel is compiled as a UP kernel and there is nothing running in
parallel with it.. it is basically a .sh script that runs on boot,
appends the output of time to a file, then formats and reboots.

--
Asbjørn Sannes
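The bisection Nick asked for would follow the usual git bisect pattern; a rough sketch (the tree path is hypothetical, and each step hides a full build-boot-measure cycle of several timed runs, which is what makes this slow):

```shell
#!/bin/sh
# Narrow the variance regression down to one commit between the
# well-behaved 2.6.20 and the noisy 2.6.22. Defined as a function
# since every verdict needs a reboot before the next step.
bisect_step() {
    cd /usr/src/linux-2.6 || return 1
    git bisect start
    git bisect bad v2.6.22      # elapsed times varied a lot here
    git bisect good v2.6.20     # elapsed times were stable here
    # build and boot the commit git suggests, time several kernel
    # compiles, then mark it: git bisect good (or: git bisect bad),
    # and repeat until git names a single commit
}
```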
Re: Unpredictable performance
Asbjørn Sannes wrote:
> Nick Piggin wrote:
>> [... original report and Nick's suggestions snipped ...]
>
> I'm going to run some tests without limiting the memory to 80 megabytes
> (so that it is 2 gigabytes) and see how much it varies then, but if I
> recall correctly it did not vary much. I'll reply to this e-mail with
> the results.
5 runs gives me:

real    5m58.626s
real    5m57.280s
real    5m56.584s
real    5m57.565s
real    5m56.613s

Should I test with tmpfs as well?

--
Asbjorn Sannes