My BIOS doesn't allow shutting off HT, but does allow turning off
2 or 3 cores (allowing dual or single) -- I'd rather see that type
of feature at runtime, letting system load decide whether to activate
another core -- though the power consumption on my 2.6GHz machine went
from about 157 watts (according to its front panel) to over 260 when I
loaded all 8 'virtual' cores (only 4 cores x 2 HT threads/core).
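For what it's worth, Linux can already do that part at runtime via CPU
hotplug in sysfs, independent of the BIOS. A minimal sketch (the SYSFS
override variable is my own addition so it can be tried against a fake
tree without root; cpu0 typically can't be taken offline):

```shell
#!/bin/sh
# Sketch: take a CPU core offline/online at runtime via sysfs CPU hotplug.
# SYSFS is overridable only so the function can be exercised without root;
# on a real system it defaults to /sys.
SYSFS="${SYSFS:-/sys}"

set_core() {    # set_core <cpu-number> <0|1>
    cpu="$1"; state="$2"
    node="$SYSFS/devices/system/cpu/cpu$cpu/online"
    if [ -w "$node" ]; then
        echo "$state" > "$node"
        echo "cpu$cpu -> $(cat "$node")"
    else
        echo "cpu$cpu: no hotplug control at $node" >&2
        return 1
    fi
}

# Usage (as root):  set_core 3 0    # take cpu3 offline
#                   set_core 3 1    # bring it back online
```

A load-driven policy would just be a loop around that, watching the run
queue and flipping cores; the kernel only exposes the mechanism.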

That's w/8 hard disks inside (though not under load...just spinning).

Seems to be no way on my machine (Dell is so limiting sometimes) to
turn off unused hard drives, or only spin them up when I want to use
them -- some are hot spares or just unconfigured, yet they spin up.
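For drives hanging off a plain SATA/SAS HBA (not behind the PERC's own
firmware, which generally ignores this), hdparm can do exactly that; a
dry-run sketch with a placeholder device name:

```shell
#!/bin/sh
# Sketch: spin a drive down now and give it a standby timeout with hdparm.
# Dry-run by default (prints the commands); set HDPARM=hdparm to execute.
# /dev/sdb is a placeholder -- point DISK at a real idle/hot-spare drive.
HDPARM="${HDPARM:-echo hdparm}"
DISK="${DISK:-/dev/sdb}"

$HDPARM -y "$DISK"          # -y: put the drive into standby (spun down) now
$HDPARM -S241 "$DISK"       # -S241: standby timeout of (241-240)*30 = 30 min
```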

I'd also prefer my own *choice* of whether or not to use the on-disk
cache as well as the RAID controller's cache. I virtually never have
unplanned shutdowns -- (it's on a UPS that will run for >1 hour under
its load).
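The on-disk half of that choice is also exposed through hdparm's -W
flag; a dry-run sketch (the RAID controller's own cache needs the
vendor tool instead -- e.g. MegaCli on a PERC -- which I won't guess at
here):

```shell
#!/bin/sh
# Sketch: toggle / query a drive's write-back cache with hdparm.
# Dry-run by default (prints the commands); set HDPARM=hdparm to execute.
# /dev/sda is a placeholder device name.
HDPARM="${HDPARM:-echo hdparm}"
DISK="${DISK:-/dev/sda}"

$HDPARM -W0 "$DISK"         # disable the drive's write-back cache
$HDPARM -W "$DISK"          # query the current write-cache setting
```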

Maybe some of this control will get into the linux kernel -- or does
the BIOS have to support everything?

Supposedly it has temperature and electrical monitoring 'galore', but I
can't even read the DIMM temps.  I went with the 'eco' power supplies at
570W (vs. 870), but got the dual power supply backup -- I think, from
what I can measure, it splits the power usage between the supplies
unless one goes out.  Could that mean I really have 1140W available?
Dunno.  Not sure exactly what 'spare' means -- whether it limits total
consumption to the level of 1 supply even though it splits the load
(power meter hooked to one supply showed it drop to half load when the
other was plugged in).
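Dell server boards usually expose those sensors through the BMC via
IPMI even when the BIOS menus don't; a hedged sketch (dry-run by
default, since I can't assume a BMC is reachable or that ipmitool is
installed -- 'sdr type' is a standard ipmitool subcommand):

```shell
#!/bin/sh
# Sketch: query board/DIMM temperature and PSU sensors from the BMC with
# ipmitool.  Dry-run by default (prints the commands); set IPMI=ipmitool
# to really query a local BMC (needs the kernel ipmi drivers loaded).
IPMI="${IPMI:-echo ipmitool}"

$IPMI sdr type Temperature        # all temperature sensors the BMC exposes
$IPMI sdr type 'Power Supply'     # PSU status / redundancy sensors
```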

BTW, I'm running at 1333MHz, so maybe it's a heat dissipation problem
and not power?  I'm only pulling 157-160W, to a max of 260W (didn't have
disks churning though -- was just running copies of ssh-keygen -b 16384,
which seems to take it a little while... 8192 comes out in about 10
seconds though. :-).

Oblig:sa-users -- I may finally have my 'dead email' restart problem
solved.  Before, if I had a large queue, I often had to stop fetchmail
and download only 10-20 at a time so its emails wouldn't overload my
sendmail queue (it gets backed up on spamassassin).  My minimum time for
SA (w/network tests) is around 3 seconds.  But during heavy loads it can
really go high -- and my machine can just run out of memory and process
space.  (Part of it is sendmail looking up hosts of received email, and
bind starting 'cold' (no cache).)  But I started with 2700 emails...
after the number of processes got to about 900, I chickened out a bit
and paused fetchmail until they dropped under 400 (note, 'load' never
went over '2' the whole time, so it was mostly network wait time).  But
after the initial clear I had about 2200 emails left and just let it
run.  At that point, I could see it keeping up -- bind's cache was a lot
warmer by then, so not as much network traffic.
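What I did by hand could be scripted; a sketch of such a throttle (the
HIGH/LOW thresholds match the numbers above, but the process names and
the use of STOP/CONT signals are my own illustrative choices):

```shell
#!/bin/sh
# Sketch: pause fetchmail (SIGSTOP) while too many delivery processes are
# alive, resume it (SIGCONT) once the backlog drains.  Process names and
# thresholds are illustrative; run from cron or a watch loop.
HIGH=900
LOW=400

count_procs() {                 # count processes matching a name pattern
    pgrep -c -f "$1" 2>/dev/null || true
}

should_pause()  { [ "$1" -ge "$HIGH" ]; }   # too many -> stop downloading
should_resume() { [ "$1" -le "$LOW"  ]; }   # drained  -> start again

throttle_once() {
    n=$(count_procs sendmail)
    if should_pause "$n"; then
        pkill -STOP -x fetchmail
    elif should_resume "$n"; then
        pkill -CONT -x fetchmail
    fi
}
```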

I added the 'delay time' taken by spamd when running my email inputs
(it's actually my filter's delay time, but the max difference between
the two is about .01 seconds, so it's mostly spamd delay).  My stats for
today from ~9:30am are (n = # of emails):
n=4513, min=3.27s, max=208.09s, avg=35.16s, median=27.43s
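Those numbers come out of a little log-crunching; a sketch of the
computation over a file of delay values (the one-number-per-line input
format is an assumption about my filter logs):

```shell
#!/bin/sh
# Sketch: compute n/min/max/avg/median from a file of per-message spamd
# delays, one floating-point seconds value per line (assumed log format).
stats() {
    sort -n "$1" | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1
            # median: middle value, or mean of the two middle values
            med = (NR % 2) ? v[(NR+1)/2] : (v[NR/2] + v[NR/2+1]) / 2
            printf "n=%d, min=%.2fs, max=%.2fs, avg=%.2fs, median=%.2fs\n", \
                   NR, v[1], v[NR], sum/NR, med
        }'
}

# Usage:  stats /var/log/spamd-delays.log
```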

I suppose for RBLs, some of those results are cached in bind as well?

I wonder if there's any way to speed up priming the cache before
downloading a bunch of emails (not that I'm usually offline for that
long) -- but it's sort of too bad bind doesn't save its DB to disk on a
shutdown and read it back in after a reboot -- and then expire entries
as needed...
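Short of bind persisting its cache, one can at least pre-resolve the
domains likely to show up; a sketch that scrapes sender domains out of
an mbox and resolves them through the local resolver (the mbox path,
the From:-header scraping, and the MX+A lookup choice are all
assumptions for illustration):

```shell
#!/bin/sh
# Sketch: warm the local bind cache before a big fetchmail run by
# resolving the sender domains seen in a recent mbox.

extract_domains() {             # pull unique sender domains out of an mbox
    grep -h '^From: ' "$1" 2>/dev/null \
        | sed -n 's/.*@\([A-Za-z0-9.-]*\).*/\1/p' \
        | sort -u
}

prime_cache() {                 # resolve each domain so bind caches it
    extract_domains "$1" | while read -r dom; do
        dig +short "$dom" MX >/dev/null 2>&1
        dig +short "$dom" A  >/dev/null 2>&1
    done
}

# Usage:  prime_cache /var/mail/$USER
```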


Nix wrote:
On 1 Aug 2009, Linda Walsh stated:


Per Jessen wrote:
Not sure about that - AFAICT, it's exactly the same technology. (I
haven't done any exhaustive tests though).
----
        
Supposedly 'Very' different (I hope)...

Oh yes. I have a P4 here (2GHz Northwood), and two Nehalems (one 2.6GHz
Core i7 with 12GB RAM and a 2.26GHz L5520 with 24GB, hello overkill).
Compared to the P4s, the Nehalems are *searingly* fast: the performance
difference is far higher than I was expecting, and much higher than the
clock-speed difference would imply.

Things the P4 takes half an hour to do, the Nehalems often slam through
in a minute or less (!), especially things like compilations that need a
lot of cache. Surprisingly, even some non-parallelizable things (like
going into a big newsgroup in Gnus) are hugely faster (22 minutes versus
39 seconds: it's a *really* big newsgroup).

I suspect the cause is almost entirely the memory interface and cache.
The Northwood has, what, 512KB L2 cache? The Nehalem has 256KB... but it
has 8MB of shared L3 cache, and an enormously faster memory interface
(the FSB is dead, Intel has a decent competitor to HyperTransport at
last).

I was an AMD fan for years, but the Nehalem has won me back to Intel
again.

1) You can't turn it off in the BIOS

This depends on the BIOS. Both of mine provide the option: I benched
it and found a 40% speedup for the things I do leaving it on.

2) Claim of benefit from increased cache (FALSE) -- I have an older 2x2
dual-core machine with 4MB of L2 cache per dual core.
   If you only use 1 core/CPU, that's 4MB of L2 cache/core.

It's true that the cache-per-core is the same, but the FSB slows things
down a lot.

   to use memory faster than 800MHz -- only quad cores go up to
   QuickPath speeds that will support the fastest memory of 1333MHz
   (even if you only have 1 CPU).  So you are 'encouraged' to go with
   quad over

One of my machines has 1333MHz RAM, but unless I clock it down to
1066MHz I get regular machine check exceptions and random coredumps. Our
best guess is that the motherboard on that machine doesn't supply enough
power when the RAM is fully populated.

The biggest cool thing about Nehalem is power savings -- they
implemented power-stepping tech in a big way.   Quiescent cores crank
down their clocks independently to about 60% of top speed and have
efficient sleep states (I think some cores can be halted, but not sure).
Some of their processors have a 'turbo mode', which will run some small
amount faster than the speed on the chip label (does that mean the turbo
chips are really faster-rated chips... you tell me),

Nope, it's much cooler than that. The power management system on the Nehalem
is quite nifty (it's got more transistors than a 486 on its own). One of the
things it can do is track power consumption and estimate the heat dissipation
of different parts of the CPU core over time. All turbo mode does is exploit
this to briefly overclock bits of the CPU die which happen to be running
cool right now, then downclock them again to stop the die exceeding its
rated thermal dissipation figures.

This does mean that if you have crappy cooling on your Nehalem, turn off
turbo mode...

                                   BUT if fewer cores are used -- say only
2/4, the turbo boost can be a small amount greater (don't have access

That's because it realises that less heat is being dissipated.

(don't know if any is published).  If one were to go from their
marketing graphs (HAHAHAHAHA), turbo for 4 cores is about 10% more, and
if only 2/4 cores are running, it's an additional 10%.  So marketing
hype vs. reality might mean 1-3% faster?

My (admittedly crude) benchmarks ('run a GCC bootstrap out of /tmp under
/usr/bin/time') show about 6% for heavily parallelizable stuff, 9% for
serial (the same thing without 'make -j'). (So it's quite close to the
marketing figures.)
