Thomas Rösner <[EMAIL PROTECTED]> posted [EMAIL PROTECTED], excerpted below, on Mon, 15 Jan 2007 16:39:30 +0100:
> Compiling the kernel with -j is a popular benchmark, because it really
> stresses the VM/disk/CPU. And before you get your hopes up too high: the
> ebuilds that really take long (mozilla, openoffice, glibc, gcc) won't
> use your makeopts anyway.
>
> My guess; going higher than -j5 won't do much for you, there will always
> be a process not waiting for IO (if your disk can handle the load, that
> is) for each CPU. -j3 will be better for cpp compiles, which hog the CPU
> longer and won't have to be scheduled out like with -j5.
>
> Other factors: is this a desktop system? Do you want to actually do
> something with it while it compiles? How much RAM do you have?
>
> (These are rethorical questions ;-))

Rhetorical or not, I've been curious about just how parallelizable things such as kernel compiles actually are. I have 8 gigs of memory now, and a dual Opteron (242, to be upgraded to dual cores soon) that I was running at -j5 to -j8 for kernel compiles (set in a patch routinely applied by my kernel maintenance scripts) for some time.

However, recently I've tried, apparently depending on make version, either -j (unlimited in some; I get a warning now and it reduces it to -j1, which isn't any fun...) or -j1000, just to see how high I could make my load average climb! =8^). I've been frustrated at being unable to find an easy way to measure load averages finer than the 1-minute rolling average, but even with it, I've been highly amused to see it climb to something over 250! (I want to say 450, as I think I remember that, but I'm keeping the claim to what I know I've seen, several times.
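As an aside on measuring load finer than the 1-minute rolling average: on Linux, /proc/loadavg itself carries one instantaneous figure alongside the rolling ones. A small sketch (assumes a Linux /proc; field meanings per the proc man page, not anything specific to Duncan's setup):

```shell
# The first three fields of /proc/loadavg are the 1/5/15-minute rolling
# averages that uptime and ksysguard plot. The fourth field is an
# instantaneous snapshot -- currently-runnable tasks / total tasks --
# which is about the closest thing to a sub-minute "load" reading.
awk '{ printf "1-min: %s  5-min: %s  15-min: %s  runnable/total: %s\n",
       $1, $2, $3, $4 }' /proc/loadavg
```

Polling that fourth field in a loop (say, once a second) gives a much twitchier picture of a -j1000 build than the 1-minute graph.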
=8^)

It's still fascinating to me to see how well the AMD64 arch and kernel cope with that, with the Linux kernel naturally scheduling interactive processes (which generally spend a lot of time idling, waiting for input) at a higher priority, thus keeping the system running amazingly smoothly even at that sort of load average, without any nicing on my part beyond what the kernel does normally. =8^)

Memory-wise, four gigs would be plenty, even for semi-contrived usage like this, but I expect to keep this system for a couple more years yet, and as I said I'll be upgrading to dual dual-cores, so I decided I might as well go for it when I did the upgrade, and went with 8 gigs. Those 250+ ready-to-run tasks do noticeably load the memory, but only by a gig or two. Often that won't even push cache off the top of the 8 gigs, so as I said, even here four gigs would still be very reasonable.

As for disk access, the average guy with a single hard drive will certainly find that the bottleneck in an unlimited-jobs scenario such as the above. I'm running a four-disk SATA-based RAID array: RAID-6 (so two-way striped, with two-way recovery as well) for my main system, but full four-way RAID-0/striped for my temp-data stuff, including both the portage tree and kernel sources. Again, while mouse movement does chop up slightly, and ksysguard does late-update the activity plots during the initial load, there's no way I'd know the system was running a 250+ load average if I wasn't actually watching the ksysguard one-minute load average graph.

As for portage, no matter the -j setting, and despite running $PORTAGE_TMPDIR on tmpfs, as you (Thomas) mention, not a whole lot even keeps the two CPUs busy all the time. In particular, the autotools configure scripts are normally serialized, so single-thread only.
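For anyone wanting to try the $PORTAGE_TMPDIR-on-tmpfs arrangement mentioned above, here's a hedged sketch of what it might look like. The mountpoint, tmpfs size, and -j value are illustrative assumptions, not Duncan's actual config:

```shell
# /etc/fstab -- hypothetical: build portage's work area in RAM.
# Size is an assumption; it has to fit your RAM alongside the build itself.
# tmpfs   /var/tmp/portage   tmpfs   size=4G,noatime   0 0

# /etc/portage/make.conf -- the knobs discussed in this thread.
# MAKEOPTS="-j5"             # parallel make jobs, per the -j discussion above
# PORTAGE_TMPDIR="/var/tmp"  # portage builds under $PORTAGE_TMPDIR/portage
# PORTAGE_NICENESS="20"      # run builds at lowest scheduling priority
```

Note tmpfs only helps the ebuilds that actually honor MAKEOPTS; as Thomas points out, the long-running ones (mozilla, openoffice, glibc, gcc) largely won't.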
If I have a lot of updates to do, as when a KDE version refresh comes along, I'll routinely run five merges in parallel, as separate konsole tabs, keeping an emerge --pretend --tree in one tab or a different console window, using the tree layout to keep the dependency trees separate so none of the emerge tabs interferes with the others.

Still, it's often just easier to run a single emerge --update --deep --newuse world with PORTAGE_NICENESS=20, let the update run on one CPU/core, and basically forget about it, going about my normal business as if the update weren't running. It doesn't take /that/ much longer: while it's not as efficient at using every bit of CPU, there's less scheduling contention, so it's more efficient there, and with dual core/CPU, $PORTAGE_TMPDIR on tmpfs, and an effectively two-way-striped (as a four-spindle RAID-6) main system, there's little I/O contention with my regular tasks. So it just runs, and I do what I'd do if it weren't running (well, I've not tried burning a CD while doing it, or something like that, but streaming Internet radio doesn't mind), and don't worry about it.

All that said, it does seem enough stuff is beginning to be designed with multi-core in mind that I can actually see a dual dual-core system being of some use, and I'm looking forward to that upgrade, both for the clock-speed bump (1.6GHz Opteron 242s to 2.6GHz Opteron 285s) and for the dual cores, giving me four cores total to work with.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

-- 
[email protected] mailing list
