Re: lzo2 shows insane speed gap
Kris Kennaway wrote:
> Christian Weisgerber wrote:
>> Bruce Cran br...@cran.org.uk wrote:
>>> I'm running 8.0-CURRENT amd64 here on a Turion64 X2 machine. Without malloc debugging (malloc.conf -> aj) 'make test' takes 25s; after removing malloc.conf thus turning on debugging, it takes over 10 minutes.
>> ...
>> But still. Two orders of magnitude? That is a pathological case.
> Probably it means that lzo2 is doing pathological numbers of mallocs.

Rather, the lzo2 test suite. Test suites do tend to hammer malloc() pretty hard. I see similar variations for the libarchive test suite with malloc debugging.

Tim
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: lzo2 shows insane speed gap
Christian Weisgerber wrote:
> Bruce Cran br...@cran.org.uk wrote:
>> I'm running 8.0-CURRENT amd64 here on a Turion64 X2 machine. Without malloc debugging (malloc.conf -> aj) 'make test' takes 25s; after removing malloc.conf thus turning on debugging, it takes over 10 minutes.
> Wow! That. Is. It.
>
> Toggling malloc debugging option J makes the slow machines fast and vice versa.
>
> Athlon 64 X2 5200+ 2.6 GHz, FreeBSD 8.0-CURRENT amd64 ~60 min 19 seconds.
>
> I guess that falls under the obvious configuration differences to check, but since it usually doesn't cause a significant slowdown I completely forgot about it. Embarrassing.
>
> But still. Two orders of magnitude? That is a pathological case.

Probably it means that lzo2 is doing pathological numbers of mallocs.

Kris
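To put a rough shape on why option J could hurt this much: according to malloc(3) on FreeBSD 7 and later, J initializes each byte of newly allocated memory to 0xa5 and each byte of freed memory to 0x5a. The allocator therefore writes the full size of every allocation twice, however little of it the program actually uses. The Python sketch below is only a toy cost model of that effect, not a measurement. The function name, allocation count, buffer size and bytes-used figure are all hypothetical, and nothing in this thread confirms that this is the mechanism the lzo2 test suite hits.

```python
def bytes_written(allocs, alloc_size, used_size, junk):
    """Toy model: bytes written across `allocs` malloc/free pairs."""
    per_pair = used_size              # what the program itself writes
    if junk:
        per_pair += 2 * alloc_size    # fill on malloc plus fill on free
    return allocs * per_pair


# Hypothetical workload: many large buffers, each barely used.
ALLOCS = 1000
ALLOC_SIZE = 64 * 1024 * 1024   # 64 MiB per buffer (assumed)
USED_SIZE = 4563                # bytes actually touched (assumed)

plain = bytes_written(ALLOCS, ALLOC_SIZE, USED_SIZE, junk=False)
junked = bytes_written(ALLOCS, ALLOC_SIZE, USED_SIZE, junk=True)
print(f"without J: {plain} bytes")
print(f"with J:    {junked} bytes ({junked / plain:.0f}x)")
```

In this model the extra cost depends only on how much memory is requested, not on how much is used, so a program that repeatedly allocates large, mostly idle buffers would be hit far harder than a typical one.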
Re: lzo2 shows insane speed gap
Bruce Cran br...@cran.org.uk wrote:
> I'm running 8.0-CURRENT amd64 here on a Turion64 X2 machine. Without malloc debugging (malloc.conf -> aj) 'make test' takes 25s; after removing malloc.conf thus turning on debugging, it takes over 10 minutes.

Wow! That. Is. It.

Toggling malloc debugging option J makes the slow machines fast and vice versa.

Athlon 64 X2 5200+ 2.6 GHz, FreeBSD 8.0-CURRENT amd64 ~60 min 19 seconds.

I guess that falls under the obvious configuration differences to check, but since it usually doesn't cause a significant slowdown I completely forgot about it. Embarrassing.

But still. Two orders of magnitude? That is a pathological case.

--
Christian naddy Weisgerber na...@mips.inka.de
Re: lzo2 shows insane speed gap
Christian Weisgerber na...@mips.inka.de writes:
> Oh, and everybody is invited to run
> $ cd /usr/ports/archivers/lzo2 && make

I assume you meant time make. This is insane:

3108.27 real 1215.69 user 1888.06 sys

on an E6600 with 4 GB RAM. What surprises me most is the high sys time.

DES
--
Dag-Erling Smørgrav - d...@des.no
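To make "high sys time" concrete, the time(1) figures quoted above can be broken down with a few lines of Python. This is just arithmetic on the reported numbers; the helper name `parse_time_line` is made up for the example.

```python
def parse_time_line(line):
    """Parse '<real> real <user> user <sys> sys' into a dict of floats."""
    fields = line.split()
    return {fields[i + 1]: float(fields[i]) for i in range(0, len(fields), 2)}


t = parse_time_line("3108.27 real 1215.69 user 1888.06 sys")
cpu = t["user"] + t["sys"]
print(f"sys share of CPU time: {t['sys'] / cpu:.1%}")
print(f"CPU time / wall time:  {cpu / t['real']:.2f}")
```

Roughly 61 percent of the CPU time was spent in the kernel, and CPU time accounts for almost all of the wall-clock time, so the run was not waiting on disk or swap.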
lzo2 shows insane speed gap
The archivers/lzo2 port runs a series of regression tests after the actual build. These tests show extremely divergent behavior on different machines. There are two types of machines:

Type #1: Running the tests takes roughly the same time as configure and compile did, whether it's 30 seconds on a fast machine or 10 minutes on an old slow one.

Type #2: Running the tests takes much, much, MUCH longer.

I've tried this across alpha, amd64, i386, and sparc64, partially on FreeBSD, partially on OpenBSD. The operating system doesn't matter and there is no pattern related to endianness or 32/64 bits. You can find machines that are the same architecture (e.g. amd64) and are of similar overall speed (e.g. an Intel Xeon E5405 and an AMD Phenom 9350e) and one of these machines will be type #1 and the other will be #2 and take _a hundred_ times longer to run the tests.

A hundred times. I have never seen anything like this before.

On the slow machines, the tests also consume a lot of system time. I've seen figures from 20 to 50%. However, ktrace shows nothing out of the ordinary.

My best guess at this time is that lzo2 somehow manages to induce crazy cache thrashing on some CPU models.

Ideas and explanations welcome.

--
Christian naddy Weisgerber na...@mips.inka.de
Re: lzo2 shows insane speed gap
On 2008-12-29 22:25, Christian Weisgerber wrote:
> On the slow machines, the tests also consume a lot of system time. I've seen figures from 20 to 50%. However, ktrace shows nothing out of the ordinary.

What's up with the memory on these machines? Lzo tends to take insane amounts
Re: lzo2 shows insane speed gap
On 2008-12-30 00:17, Dimitry Andric wrote:
> What's up with the memory on these machines? Lzo tends to take insane amounts

Duh, nevermind... I'm confusing this with lzma. :) Sorry for the noise.
Re: lzo2 shows insane speed gap
On Mon, 29 Dec 2008, Christian Weisgerber wrote:
> The archivers/lzo2 port runs a series of regression tests after the actual build. These tests show extremely divergent behavior on different machines. There are two types of machines:
>
> Type #1: Running the tests takes roughly the same time as configure and compile did, whether it's 30 seconds on a fast machine or 10 minutes on an old slow one.
>
> Type #2: Running the tests takes much, much, MUCH longer.
>
> I've tried this across alpha, amd64, i386, and sparc64, partially on FreeBSD, partially on OpenBSD. The operating system doesn't matter and there is no pattern related to endianness or 32/64 bits. You can find machines that are the same architecture (e.g. amd64) and are of similar overall speed (e.g. an Intel Xeon E5405 and an AMD Phenom 9350e) and one of these machines will be type #1 and the other will be #2 and take _a hundred_ times longer to run the tests.
>
> A hundred times. I have never seen anything like this before.

It might be good first to rule out compiler / library differences.

First, can you isolate a single lzo command / input combination whose time differs dramatically? This would simplify tests compared to running the whole test suite. (It should be easy because it looks like the test suite prints the time for each test.) It might also simplify things to work on one fast and one slow machine.

Then try copying the lzo binary from the fast machine to the slow machine (and vice versa) and see if the same test speeds up with the copied binary. If not, try again with the binary statically linked. If still not, it would be good to have a copy of the binary made available, along with more information about the fast and slow machines (CPU, amount of memory, load on the machine, kernel version, disk, etc).

If the copied binary isn't faster than the natively produced one, then it would be good to have information about the compiler options, versions, etc.

--
Nate Eldredge neldre...@math.ucsd.edu
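The first step suggested above, timing one isolated command on each machine, could be scripted along these lines. This is a minimal Python sketch for Unix-like systems; `time_command` is a made-up helper, and the command shown is only a placeholder, since the thread does not name the exact lzotest invocation to use.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once; return (real, user, sys) seconds for the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False, stdout=subprocess.DEVNULL)
    real = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system


# Placeholder command; substitute the single test case being compared.
cmd = [sys.executable, "-c", "pass"]
real, user, system = time_command(cmd)
print(f"{real:.2f} real {user:.2f} user {system:.2f} sys")
```

Running the same script with the native binary and with the binary copied from the other machine gives directly comparable real/user/sys triples.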
Re: lzo2 shows insane speed gap
Christian Weisgerber wrote:
> skipped
> My best guess at this time is that lzo2 somehow manages to induce crazy cache thrashing on some CPU models.
>
> Ideas and explanations welcome.

Did you ask the author? He might be the best person to ask.

Yuri
Re: lzo2 shows insane speed gap
On Monday 29 December 2008 12:25:00 Christian Weisgerber wrote:
> On the slow machines, the tests also consume a lot of system time. I've seen figures from 20 to 50%. However, ktrace shows nothing out of the ordinary.

If the program itself doesn't directly cause the system time, do interrupt rates give any hint as to what does? And to rule out the obvious, you did check swapping?

--
Mel
Re: lzo2 shows insane speed gap
Christian Weisgerber wrote:
> My best guess at this time is that lzo2 somehow manages to induce crazy cache thrashing on some CPU models.
>
> Ideas and explanations welcome

Try running a single command whose timing differs between machines under valgrind (callgrind) on each of them, and check that at least the number of instructions executed is the same.

The lzo2 documentation says that there are a lot of algorithms implemented. It might be choosing the algorithm based on the CPU, and the choice it's making might be bad.

Yuri
Re: lzo2 shows insane speed gap
I see this performance difference on my boxes.

First one has Core2Duo(E5-something), 4GB and runs RELENG_7/i386. lzotest is very fast.

Second box is Core2Quad (Q9450), 8GB RAM and runs -current as of about a week ago. lzo2 binary built from ports is *slow*. However, 32-bit binary from the first box runs very fast.

The only interesting difference I can see in ktrace is that read and munmap take much much longer in case of 64-bit lzotest.

Here are two excerpts from ktrace on the second box:

### 32-bit app - runs fast on both boxes.
59657 lzotest 0.10 CALL open(0xd91b,O_RDONLY,unused0x1b6)
59657 lzotest 0.07 NAMI ./src/lzo1_d.ch
59657 lzotest 0.12 RET open 3
59657 lzotest 0.05 CALL fstat(0x3,0xd504)
59657 lzotest 0.07 STRU struct stat {dev=102, ino=544718, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=2169160, atime=1230595144, stime=1209559909, ctime=1230588212, birthtime=1209559909, size=4563, blksize=4096, blocks=12, flags=0x0 }
59657 lzotest 0.05 RET fstat 0
59657 lzotest 0.06 CALL lseek(0x3,0,SEEK_SET,0x1)
59657 lzotest 0.05 RET lseek 0
59657 lzotest 0.05 CALL lseek(0x3,0x400,SEEK_SET,0)
59657 lzotest 0.05 RET lseek 67108864/0x400
59657 lzotest 0.06 CALL lseek(0x3,0,SEEK_SET,0)
59657 lzotest 0.05 RET lseek 0
59657 lzotest 0.05 CALL mmap(0,0x400,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0x,0,0)
59657 lzotest 0.07 RET mmap 673185792/0x2820
59657 lzotest 0.06 CALL read(0x3,0x28196000,0x1000)
59657 lzotest 0.10 GIO fd 3 read 4096 bytes
59657 lzotest 0.29 RET read 4096/0x1000
59657 lzotest 0.28 CALL read(0x3,0x28196000,0x1000)
59657 lzotest 0.10 GIO fd 3 read 467 bytes
59657 lzotest 0.05 RET read 467/0x1d3
59657 lzotest 0.10 CALL read(0x3,0x28196000,0x1000)
59657 lzotest 0.07 GIO fd 3 read 0 bytes
59657 lzotest 0.06 RET read 0
59657 lzotest 0.05 CALL close(0x3)
59657 lzotest 0.10 RET close 0
59657 lzotest 0.25 CALL getrusage(0,0xd60c)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.05 CALL getrusage(0,0xd628)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.05 CALL getrusage(0,0xd60c)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.64 CALL getrusage(0,0xd60c)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.05 CALL getrusage(0,0xd60c)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.29 CALL getrusage(0,0xd60c)
59657 lzotest 0.06 RET getrusage 0
59657 lzotest 0.12 CALL getrusage(0,0xd60c)
59657 lzotest 0.36 RET getrusage 0
59657 lzotest 0.10 CALL write(0x1,0x28194000,0x4f)
59657 lzotest 0.10 GIO fd 1 wrote 79 bytes
59657 lzotest 0.06 RET write 79/0x4f
59657 lzotest 0.06 CALL munmap(0x2820,0x400)
59657 lzotest 0.17 RET munmap 0

### same file. 64-bit app (slow). Look at read/munmap
59158 lzotest 0.15 CALL open(0x7fffe760,O_RDONLY,unused0x1b6)
59158 lzotest 0.14 NAMI ./src/lzo1_d.ch
59158 lzotest 0.24 RET open 3
59158 lzotest 0.11 CALL fstat(0x3,0x7fffe2d0)
59158 lzotest 0.11 STRU struct stat {dev=102, ino=544718, mode=-rw-r--r-- , nlink=1, uid=0, gid=0, rdev=2169160, atime=1230588427, stime=1209559909, ctime=1230588212, birthtime=1209559909, size=4563, blksize=4096, blocks=12, flags=0x0 }
59158 lzotest 0.07 RET fstat 0
59158 lzotest 0.15 CALL lseek(0x3,0,SEEK_CUR)
59158 lzotest 0.07 RET lseek 0
59158 lzotest 0.06 CALL lseek(0x3,0x400,SEEK_SET)
59158 lzotest 0.07 RET lseek 67108864/0x400
59158 lzotest 0.07 CALL lseek(0x3,0,SEEK_SET)
59158 lzotest 0.06 RET lseek 0
59158 lzotest 0.08 CALL mmap(0,0x400,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0x,0)
59158 lzotest 0.10 RET mmap 11534336/0x800b0
59158 lzotest 0.074126 CALL read(0x3,0x800a9e000,0x1000)
59158 lzotest 0.54 GIO fd 3 read 4096 bytes
59158 lzotest 0.10 RET read 4096/0x1000
59158 lzotest 0.07 CALL read(0x3,0x800a9e000,0x1000)
59158 lzotest 0.12 GIO fd 3 read 467 bytes
59158 lzotest 0.06 RET read 467/0x1d3
59158 lzotest 0.07 CALL read(0x3,0x800a9e000,0x1000)
59158 lzotest 0.09 GIO fd 3 read 0 bytes
59158 lzotest 0.06 RET read 0
59158 lzotest 0.08 CALL close(0x3)
59158 lzotest 0.20 RET close 0
59158 lzotest 0.29 CALL getrusage(0,0x7fffe3d0)
59158 lzotest 0.10 RET getrusage 0
59158 lzotest 0.07 CALL getrusage(0,0x7fffe3e0)
59158 lzotest 0.07 RET getrusage 0
59158 lzotest 0.07 CALL getrusage(0,0x7fffe3d0)
59158 lzotest 0.07 RET getrusage 0
59158 lzotest 0.69 CALL getrusage(0,0x7fffe3d0)
59158 lzotest 0.07 RET getrusage 0
59158 lzotest 0.06 CALL
Re: lzo2 shows insane speed gap
Mel fbsd.hack...@rachie.is-a-geek.net wrote:
> If the program itself doesn't directly cause the system time, do interrupt rates give any hint as to what does?

systat -vmstat shows a conspicuously large number of traps, I think. (I'm short on comparable FreeBSD machines.)

> And to rule out the obvious, you did check swapping?

No swapping.

--
Christian naddy Weisgerber na...@mips.inka.de
Re: lzo2 shows insane speed gap
Nate Eldredge:
> It might be good first to rule out compiler / library differences.

Sure. Let's cut this short:

Slow
Athlon 64 X2 5200+ 2.6 GHz, FreeBSD 8.0-CURRENT amd64              ~60 min
Phenom 9350e 2.0 GHz, OpenBSD 4.4-CURRENT amd64                    ~80 min
UltraSPARC-IIe 500 MHz (Blade 100), OpenBSD 4.4-CURRENT sparc64    10 h++

Fast
Pentium 4 3.0 GHz, FreeBSD 6.4-RELEASE i386                        36 s
Xeon E5405 2.0 GHz (PowerEdge 1950), OpenBSD 4.4-CURRENT amd64     47 s
Alpha 21164A 500 MHz (AlphaPC164), OpenBSD 4.4-CURRENT alpha       9 min

Let me draw your attention to the fact that the two amd64 systems that run different operating systems are both slow, whereas the two amd64 systems that run the same operating system (compiler, libraries) diverge in speed.

Oh, and everybody is invited to run
$ cd /usr/ports/archivers/lzo2 && make
and check for themselves.

PS: The Blade 100 is still crunching as I write this...

--
Christian naddy Weisgerber na...@mips.inka.de
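For a sense of scale, the slowdown factors implied by the figures above can be worked out in a few lines of Python. The times are taken as reported (the "~" values are approximate), the dictionary keys are shortened labels, and the Blade 100 is left out because "10 h++" is only a lower bound.

```python
# Wall-clock times as reported above, converted to seconds.
times = {
    "Athlon 64 X2 5200+ (FreeBSD amd64)": 60 * 60,
    "Phenom 9350e (OpenBSD amd64)": 80 * 60,
    "Pentium 4 3.0 GHz (FreeBSD i386)": 36,
    "Xeon E5405 (OpenBSD amd64)": 47,
    "Alpha 21164A (OpenBSD alpha)": 9 * 60,
}


def slowdown(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return times[slow] / times[fast]


# Same operating system, same architecture, same nominal clock speed.
same_os = slowdown("Phenom 9350e (OpenBSD amd64)", "Xeon E5405 (OpenBSD amd64)")
# Both FreeBSD, different release and architecture.
freebsd = slowdown("Athlon 64 X2 5200+ (FreeBSD amd64)",
                   "Pentium 4 3.0 GHz (FreeBSD i386)")
print(f"Phenom vs Xeon:     {same_os:.0f}x")
print(f"Athlon vs Pentium 4: {freebsd:.0f}x")
```

Both pairs come out at roughly a factor of one hundred, which matches the "two orders of magnitude" figure used elsewhere in the thread.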
Re: lzo2 shows insane speed gap
On Tue, 30 Dec 2008 01:47:47 +0100 Christian Weisgerber na...@mips.inka.de wrote:
> Nate Eldredge:
>> It might be good first to rule out compiler / library differences.
>
> Sure. Let's cut this short:
>
> Slow
> Athlon 64 X2 5200+ 2.6 GHz, FreeBSD 8.0-CURRENT amd64              ~60 min
> Phenom 9350e 2.0 GHz, OpenBSD 4.4-CURRENT amd64                    ~80 min
> UltraSPARC-IIe 500 MHz (Blade 100), OpenBSD 4.4-CURRENT sparc64    10 h++
>
> Fast
> Pentium 4 3.0 GHz, FreeBSD 6.4-RELEASE i386                        36 s
> Xeon E5405 2.0 GHz (PowerEdge 1950), OpenBSD 4.4-CURRENT amd64     47 s
> Alpha 21164A 500 MHz (AlphaPC164), OpenBSD 4.4-CURRENT alpha       9 min
>
> Let me draw your attention to the fact that the two amd64 systems that run different operating systems are both slow, whereas the two amd64 systems that run the same operating system (compiler, libraries) diverge in speed.
>
> Oh, and everybody is invited to run
> $ cd /usr/ports/archivers/lzo2 && make

I'm running 8.0-CURRENT amd64 here on a Turion64 X2 machine. Without malloc debugging (malloc.conf -> aj) 'make test' takes 25s; after removing malloc.conf thus turning on debugging, it takes over 10 minutes.

--
Bruce Cran
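For anyone comparing their own boxes: per malloc(3), FreeBSD's allocator takes its option flags from, among other sources, the target of the /etc/malloc.conf symlink and the MALLOC_OPTIONS environment variable. A small Python sketch that reports both is below. The helper name `malloc_options` is made up, and the demonstration uses a throwaway symlink in a temporary directory so it runs on any system with symlink support rather than touching /etc.

```python
import os
import tempfile


def malloc_options(conf_path="/etc/malloc.conf", environ=os.environ):
    """Return (symlink target, MALLOC_OPTIONS value); None where unset."""
    try:
        link = os.readlink(conf_path)
    except OSError:
        link = None          # no symlink, or not a symlink
    return link, environ.get("MALLOC_OPTIONS")


# Demonstration on a scratch symlink pointing at the option string "aj".
scratch = tempfile.mkdtemp()
demo_conf = os.path.join(scratch, "malloc.conf")
os.symlink("aj", demo_conf)
print(malloc_options(demo_conf, {}))

# On a real system, call it with the defaults:
print(malloc_options())
```

A result of `(None, None)` on a -CURRENT system would mean the allocator's defaults are in effect, which is the state described above after removing malloc.conf.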
Re: lzo2 shows insane speed gap
> Oh, and everybody is invited to run
> $ cd /usr/ports/archivers/lzo2 && make
> and check for themselves.

I've used lzo2 quite a bit in the past and never saw this, so I thought I'd try this on a few boxes we have... Output is from make fetch ; time make

8-core Opteron 2350 2.0ghz, 64GB RAM, FreeBSD 7.1-PRERELEASE (just before RC1 was tagged), amd64
41.464u 20.671s 1:02.04 100.1% 2430+1556k 0+0io 377pf+0w

4-core Opteron 280 2.4ghz, 4GB RAM, FreeBSD 7.0-RELEASE-p6, amd64
40.907u 18.638s 1:03.08 94.3% 2339+603k 182+91io 681pf+0w

Dual Athlon MP 2100+ 1.73ghz, 1GB RAM, FreeBSD 6.3-RELEASE, i386
82.812u 44.963s 2:06.89 100.6% 959+37724k 32+82io 46pf+0w

Dual P3 850mhz, 1GB RAM, FreeBSD 7.0-RELEASE-p4, i386
208.494u 84.935s 8:07.23 60.2% 2270+990k 17+87io 60pf+0w

4-core Opteron 2218 2.6ghz, 16GB RAM, FreeBSD 7.0-RELEASE-p4, amd64
38.893u 16.623s 0:55.53 99.9% 2290+591k 96+99io 48pf+0w

Dual Xeon 3.06GHz, 4GB RAM, FreeBSD 7.0-RELEASE-p4, i386
60.910u 24.667s 1:22.54 103.6% 2143+988k 146+134io 105pf+0w

Dual P3 866mhz, 2GB RAM, FreeBSD 7.0-RELEASE-p4, i386
169.135u 58.198s 3:52.71 97.6% 2443+1002k 160+99io 368pf+0w

2-core Core 2 Duo 2.33ghz, 2GB RAM, Mac OS X 10.5.6, i386
48.155u 29.896s 1:25.14 91.6% 0+0k 30+222io 1845pf+0w

4-core Xeon 2.66ghz, 6GB RAM, Mac OS X 10.5.6, i386
real 1m17.024s
user 0m44.373s
sys 0m34.249s

None of these boxes were idle, so relative times are pretty useless, but I'm not seeing anything on the order of tens of minutes or hours. Is the source .tar.gz identical on all your systems?

--
Kevin
Re: lzo2 shows insane speed gap
Christian Weisgerber na...@mips.inka.de wrote:
> Oh, and everybody is invited to run
> $ cd /usr/ports/archivers/lzo2 && make

$ cd /usr/ports/archivers/lzo2 && time sudo make
[...]
All tests passed.
Now you are ready to install LZO.

real 1m1.041s
user 0m38.087s
sys 0m17.613s

This is Intel q6600.

--
Adios