Thomas Rösner <[EMAIL PROTECTED]> posted [EMAIL PROTECTED], excerpted below, on Mon, 15 Jan 2007 16:39:30 +0100:
> Compiling the kernel with -j is a popular benchmark, because it really
> stresses the VM/disk/CPU. And before you get your hopes up too high: the
> ebuilds that really take long (mozilla, openoffice, glibc, gcc) won't
> use your makeopts anyway.
>
> My guess; going higher than -j5 won't do much for you, there will always
> be a process not waiting for IO (if your disk can handle the load, that
> is) for each CPU. -j3 will be better for cpp compiles, which hog the CPU
> longer and won't have to be scheduled out like with -j5.
>
> Other factors: is this a desktop system? Do you want to actually do
> something with it while it compiles? How much RAM do you have?
>
> (These are rethorical questions ;-))

Rhetorical or not, I've been curious about just how parallelizable things such as kernel compiles actually are. I have 8 gigs of memory now, and a dual Opteron (242, to be upgraded to dual cores soon) that I was running at -j5 to -j8 for kernel compiles (set in a patch routinely applied by my kernel maintenance scripts) for some time.

However, recently I've tried, apparently depending on make version, either -j (unlimited in some; I get a warning now and it reduces it to -j1, which isn't any fun...) or -j1000, just to see how high I could make my load average climb! =8^). I've been frustrated at being unable to find an easy way to measure load averages finer than the 1-minute rolling average, but even with it, I've been highly amused to see it climb to something over 250! (I want to say 450, as I think I remember that, but I'm keeping the claim to what I know I've seen, several times.
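As an aside on measuring load finer than the 1-minute rolling average: on Linux, /proc/loadavg itself carries one instantaneous figure alongside the rolling ones. A small sketch (assumes a Linux /proc; field meanings per the proc man page, not anything specific to Duncan's setup):

```shell
# The first three fields of /proc/loadavg are the 1/5/15-minute rolling
# averages that uptime and ksysguard plot. The fourth field is an
# instantaneous snapshot -- currently-runnable tasks / total tasks --
# which is about the closest thing to a sub-minute "load" reading.
awk '{ printf "1-min: %s  5-min: %s  15-min: %s  runnable/total: %s\n",
       $1, $2, $3, $4 }' /proc/loadavg
```

Polling that fourth field in a loop (say, once a second) gives a much twitchier picture of a -j1000 build than the 1-minute graph.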
=8^)

It's still fascinating to me to see how well the AMD64 arch and kernel cope with that, with the Linux kernel naturally scheduling interactive processes (which generally spend a lot of time idling, waiting for input) at a higher priority, thus keeping the system running amazingly smoothly even at that sort of load average, without any nicing on my part beyond what the kernel does normally. =8^)

Memory-wise, four gigs would be plenty, even for semi-contrived usage like this, but I expect to keep this system for a couple more years yet, and as I said I'll be upgrading to dual dual-cores, so I decided I might as well go for it when I did the upgrade, and went with 8 gigs. Those 250+ ready-to-run tasks do noticeably load the memory, but only by a gig or two. Often that won't even push cache off the top of the 8 gigs, so as I said, even here four gigs would still be very reasonable.

As for disk access, the average guy with a single hard drive will certainly find that the bottleneck in an unlimited-jobs scenario such as the above. I'm running a four-disk SATA-based RAID array: RAID-6 (so two-way striped, with two-way recovery as well) for my main system, but full four-way RAID-0/striped for my temp-data stuff, including both the portage tree and kernel sources. Again, while mouse movement does chop up slightly, and ksysguard does late-update the activity plots during the initial load, there's no way I'd know the system was running a 250+ load average if I wasn't actually watching the ksysguard one-minute load average graph.

As for portage, no matter the -j setting, and despite running $PORTAGE_TMPDIR on tmpfs, as you (Thomas) mention, not a whole lot even keeps the two CPUs busy all the time. In particular, the autotools configure scripts are normally serialized, so single-thread only.
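For anyone wanting to try the $PORTAGE_TMPDIR-on-tmpfs arrangement mentioned above, here's a hedged sketch of what it might look like. The mountpoint, tmpfs size, and -j value are illustrative assumptions, not Duncan's actual config:

```shell
# /etc/fstab -- hypothetical: build portage's work area in RAM.
# Size is an assumption; it has to fit your RAM alongside the build itself.
# tmpfs   /var/tmp/portage   tmpfs   size=4G,noatime   0 0

# /etc/portage/make.conf -- the knobs discussed in this thread.
# MAKEOPTS="-j5"             # parallel make jobs, per the -j discussion above
# PORTAGE_TMPDIR="/var/tmp"  # portage builds under $PORTAGE_TMPDIR/portage
# PORTAGE_NICENESS="20"      # run builds at lowest scheduling priority
```

Note tmpfs only helps the ebuilds that actually honor MAKEOPTS; as Thomas points out, the long-running ones (mozilla, openoffice, glibc, gcc) largely won't.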
If I have a lot of updates to do, as when a KDE version refresh comes along, I'll routinely run five merges in parallel, as separate konsole tabs, keeping an emerge --pretend --tree in one tab or a different console window, using the tree layout to keep the dependency trees separate so none of the emerge tabs interferes with the others.

Still, it's often just easier to run a single emerge --update --deep --newuse world with PORTAGE_NICENESS=20, let the update run on one CPU/core, and basically forget about it, going about my normal business as if the update weren't running. It doesn't take /that/ much longer: while it's not as efficient at using every bit of CPU, there's less scheduling contention, so it's more efficient there, and with dual core/CPU, $PORTAGE_TMPDIR on tmpfs, and an effectively two-way-striped (as a four-spindle RAID-6) main system, there's little I/O contention with my regular tasks. So it just runs, and I do what I'd do if it weren't running (well, I've not tried burning a CD while doing it, or something like that, but streaming Internet radio doesn't mind), and don't worry about it.

All that said, it does seem enough stuff is beginning to be designed with multi-core in mind that I can actually see a dual dual-core system being of some use, and I'm looking forward to that upgrade, both for the clock-speed bump (1.6GHz Opteron 242s to 2.6GHz Opteron 285s) and for the dual cores, giving me four cores total to work with.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

-- 
[email protected] mailing list
