Nix wrote:

[This is really OT for spamassassin, isn't it? Should we take it
off-list?]
--

A bit -- and somewhat not.  Much of it boils down to speed: how best to do
it, parallelism, new hardware features...lowering latency...etc.

I'd really hoped to speed up my SA processing -- at least it can handle
a sizable concurrent load now, which is an improvement.  I need to figure out
a way to cache or speed up the network requests -- I'm sure it's mostly
latency on the servers spamd is checking with.  The highest my download speed
went was about 500K (on a 768K DSL)...it's all packet latency that's the
problem.
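
   For what it's worth, a rough way to see how much of that per-message time
is the network tests (not SA's own profiling, just an outside check I'd start
with): score one saved message with local-only rules and again with the
network tests on, and compare wall-clock times.  A minimal sketch --
'sample.msg' is just a placeholder for any saved message:

  #!/usr/bin/perl
  # Rough check: how much of the per-message time is the network tests?
  # Score the same saved message once with local-only rules (-L) and once
  # with network tests enabled, and compare wall-clock time.
  use strict;
  use warnings;
  use Time::HiRes qw(gettimeofday tv_interval);

  my $msg = shift // 'sample.msg';    # placeholder: any saved message file

  for my $mode (['local only', 'spamassassin -L'],
                ['full (net)', 'spamassassin']) {
      my ($label, $cmd) = @$mode;
      my $t0 = [gettimeofday];
      system("$cmd < $msg > /dev/null 2>&1");
      printf "%-12s %6.2f s\n", $label, tv_interval($t0);
  }

If the two times come out close, the network tests aren't the problem at all.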

On 8 Aug 2009, Linda Walsh spake thusly:
OK, you've out-RAIDed me.
----

   It's a server.  Mostly unraided...sorta...4 of the disks are in 2 VDs in
mirror mode.  The system disk is a 15K SAS, but only 70G of space.  The rest
are what RAID is supposed to be -- Redundant Arrays of _Inexpensive_ Disks
(SATA).  Boy was Dell pissed.  They really don't like selling bare-bones
systems.  I had to buy the disk trays elsewhere (Dell won't sell them
separately from a disk).  Only 1 VD is a real RAID(5), with a whopping
3 disks...ooOOOOooo..   2 disks are sitting around as spares until I can
figure out how to add them to existing arrays (supposed to be 'easy' -- the
controller rebuilds -- but nooOOOOooooo....).  I'm just spending too much
time on the computer solving mail-filter problems while forcing myself up to
speed with perl 5.10's new features, CSS, and fonts again (I just hosed my
desktop's fonts, so I need to reboot...oops).


I'd also prefer my own *choice* of whether or not to use the on-disk
cache ...  Maybe some of this control will get into the kernel, or does the
BIOS have to support everything?

Well, you'll never get the option to turn off the Linux kernel's disk
cache,
---

   On-disk cache = the 16-32MB of cache on the disk itself.  It really speeds
up writes when you are writing small chunks, as the disk can coalesce the
writes to physical positions on the platters -- while the kernel only uses
a generic 'model' of all disks.  The real internal geometry is completely
hidden these days -- you can see it talked about on Tom's HW occasionally when
they bench a disk.  You see fast, constant speeds at track 0 (the outside of
the disk), then you see multiple 'drops' as the sectors/track shrink with the
smaller diameter.  But the on-disk cache -- all the kernel developers dis it
because they run unstable kernels that can leave up to 32MB in a write buffer
on a disk if it gets reset or loses power before it finishes flushing its
cache.  But on a system on a UPS, not running test kernels all the time,
unplanned shutdowns are rare, so the speed-up is worth it.  Just like the RAID
controller itself has its own battery-backed (non-extensible) RAM -- it
doesn't know about UPSes and such.  My previous server lasted 9 years...I feel
in large part due to it being on a power-conditioned UPS (an APC SmartUPS that
supposedly puts out a sine wave, despite my flaky PG&E power).
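
   If you want to see what a given drive is doing, the drive's own write
cache can usually be checked (and toggled) with hdparm on a plain SATA
device.  A minimal sketch -- /dev/sda is an illustrative default, and disks
hidden behind a hardware RAID controller (like a PERC) generally need the
vendor's tool instead:

  #!/usr/bin/perl
  # Report whether a drive's own write cache is enabled, via hdparm.
  # 'hdparm -W1 <dev>' would turn it on, 'hdparm -W0 <dev>' off.
  use strict;
  use warnings;

  my $dev = shift // '/dev/sda';      # illustrative default device
  my $out = qx(hdparm -W $dev 2>&1);
  print $out;
  print "write cache looks ",
      ($out =~ /write-caching\s*=\s*1/ ? "ON" : "OFF (or not readable)"),
      " on $dev\n";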


fast speeds -- then because executables and shared libraries run out of
it,
---

   I'm more worried about large write speeds.  There, circumventing the
system cache and using direct I/O can get you faster throughput when doing
disk-to-disk copies -- the limiting factor is the target disk's write rate,
and no kernel cache will help.   What does help is overlapping reads from
one device with writes to the other device that fit in its buffer.
Then you can theoretically get _closer to_ (but not quite) double the
throughput (as writes are slower).  But if you write in, say, 24MB chunks to
a 32MB on-disk (no RAID) buffer, it can often get the data out while you
are reading the next 24MB from the first disk.
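
   Doing true O_DIRECT from pure Perl is awkward (the buffers have to be
sector-aligned), but the overlap part is easy to sketch with two processes:
one reads big chunks from the source disk while the other writes them to the
target.  Paths and the 24MB chunk size below are placeholders:

  #!/usr/bin/perl
  # Overlap reads from one disk with writes to another by splitting the
  # copy into a reader process and a writer process joined by a pipe.
  # Paths and the 24MB chunk size are placeholders.
  use strict;
  use warnings;

  my ($src, $dst) = ('/backup/full-0.tar.gz', '/mnt/disk2/full-0.tar.gz');
  my $chunk = 24 * 1024 * 1024;

  sub write_all {                     # syswrite until the whole chunk is out
      my ($fh, $data) = @_;
      my $off = 0;
      while ($off < length $data) {
          my $n = syswrite($fh, $data, length($data) - $off, $off)
              // die "write: $!";
          $off += $n;
      }
  }

  pipe(my $rd, my $wr) or die "pipe: $!";
  my $pid = fork() // die "fork: $!";

  if ($pid == 0) {                    # child: read big chunks from $src
      close $rd;
      open my $in, '<:raw', $src or die "$src: $!";
      while (1) {
          my $n = sysread($in, my $buf, $chunk);
          die "read $src: $!" unless defined $n;
          last if $n == 0;
          write_all($wr, $buf);
      }
      exit 0;
  }

  close $wr;                          # parent: write whatever arrives to $dst
  open my $out, '>:raw', $dst or die "$dst: $!";
  while (1) {
      my $n = sysread($rd, my $buf, $chunk);
      die "pipe read: $!" unless defined $n;
      last if $n == 0;
      write_all($out, $buf);
  }
  close $out or die "close $dst: $!";
  waitpid($pid, 0);

In practice the pipe only buffers a little between the two, so a real tool
would use a bigger ring of buffers (or direct I/O), but the read and the
write really do run concurrently.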

   If you go through the kernel's system cache, it throws everything off
-- you can watch it -- the kernel will give priority to reads (as it
should; reads usually block the CPU or the user from getting things
done, while writes can *usually* be done lazily in the background).  But on
disk-to-disk copies of large multi-gig files, you want write and read to be
exactly balanced for optimal throughput.  But that's a *special* case, when
you are moving large data files around -- for example a 157G full backup, and
that's gzipped, because bzip2 uses too much CPU (lzma is much worse for speed,
though way better at compression).  On my old server (which died after 9
years; started with a p...@400MHz, ended with dual P-IIIs at 1GHz, but 256K
cache each...hardly better than a Celeron!), bzip2 would slow down backup
writing to disk to about 600K/s!  gzip only cut speed by about half (from
20MB/s for raw data to 10MB/s).  Compressed backups are nice, BUT, when you
need to access them -- if you need to unpack a 100+G level zero...ouch...
just to uncompress it would take hours!
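
   (If you ever want to sanity-check that trade-off on new hardware, a quick
timing of both compressors on one sample file tells you most of it -- the
path below is a placeholder:)

  #!/usr/bin/perl
  # Rough gzip vs. bzip2 throughput check on one sample file (path is a
  # placeholder).  Times the whole compress-to-/dev/null run for each.
  use strict;
  use warnings;
  use Time::HiRes qw(gettimeofday tv_interval);

  my $file = shift // '/backup/sample.tar';

  for my $tool ('gzip', 'bzip2') {
      my $t0 = [gettimeofday];
      system("$tool -c $file > /dev/null");
      my $secs = tv_interval($t0);
      printf "%-6s %8.1f s  (%.1f MB/s in)\n",
          $tool, $secs, (-s $file) / (1024 * 1024) / $secs;
  }
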
   So ... while my new server is relatively fast -- I sorta earned it --
sitting on my old server, feeding it power cycles through
a baby-bottle to keep it going at times (had the disk controllers updated,
all sorts of extra fans -- it was never thermally designed for the CPU and
disk load I put on her).  Oh well...

I'd be lucky if "spamc --learn" would process at 4-5 msgs/second!


Supposedly it has temp and electrical monitoring 'galore', but I can't
even read the DIMM temps.

I can on mine! Actually I can read it two different ways, via IPMI and
via the temperature sensors directly. They give completely different
readings :/ I'm inclined to trust the sensors more than IPMI: half its
figures seem to be completely fictional. (Also the IPMI engine often
locks up and refuses to do anything until you power the whole machine off
and unplug it from the wall. Not what you want in a robust monitoring
system.)
---

   Temp sensors directly?  You mean manually, on the outside of the box?
Mine has something like that too -- and I can read the CPUs' but am not
seeing the DIMMs' as I can with my older workstation (i5000k memory monitor).
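
   For comparing the two sources on a box like that, something as dumb as
dumping both side by side works -- assuming ipmitool and lm_sensors are
installed and the IPMI drivers are loaded (the usual Linux tools, nothing
Dell-specific):

  #!/usr/bin/perl
  # Dump the two temperature sources side by side: the BMC via ipmitool
  # and the on-board chips via lm_sensors.  Assumes both tools are
  # installed and the IPMI kernel drivers are loaded.
  use strict;
  use warnings;

  print "=== IPMI sensor readings ===\n";
  print qx(ipmitool sdr type Temperature 2>&1);

  print "\n=== lm_sensors readings ===\n";
  print qx(sensors 2>&1);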

I went with the 'eco' power supplies at 570W
That's 'eco'?!
---

   Compared to my workstations at 750W and 1KW, yeah!...This is a server we
are talking about, not a desktop.  The graphics cards alone on PCs these
days suck power big time -- for the GTX 260 series, they recommend no less
than a 600W PSU, and for the GTX 280/285 -- I think 750W-1KW is the norm.


    (vs. 870).

That's not a power supply, that's an electric fire.
???

 You obviously aren't running 224-core graphics cards. (That's the
low-end, revised GTX-260). Dunno about the 285, I think the 280 was 256
or so cores...(and those are on top of any general CPUs you might have).
BTW, I'm running at 1333MHz, so maybe it's a heat dissipation problem and
not power?

Nah, that machine is hugely overequipped with cooling, and clocking it
down to 1066 and slamming heat-inducing CPU-pounding stuff through it for
24 hours plus gives no problems whatsoever. I'm sure it got hotter in 24
hours of CPU churning than it did in one twenty-minute(!) GCC bootstrap,
but the latter, with faster RAM, MCEd on me more than once.
----

   What PS do you have?

I'm only pulling 157-160 to a max of 260
How can you tell without specialized power monitoring hardware?
---

   Specialized?  *cough*....they sell 'watt-minders'
(http://www.sierraenergyproducts.com/) for under $40 and 'kill-a-watts'
(http://www.amazon.com/P3-International-P4400-Electricity-Monitor/dp/B00009MDBU)
for under $30 these days.  Mine's an older model with more bells and whistles,
but the computer itself actually displays its power consumption in
BTU/hr or watts (a trade-off between that and a constant temp display in C or
F) on the front panel.  It's accurate according to my external 15-year-old
meter.  My external meter is more expensive, but it was around before the
others and has more functions, most of which people don't need.

 My external meter, a 'Brand Electronics'
(http://www.brandelectronics.com/) meter, is pretty accurate -- down to
0.1W; it also logs total usage, computes cost/month (you enter your electric
rate), and shows RMS current, power factor, VAs, and VARs.  Bought it about
10 years or more ago when I went on an energy kick and reduced my household
electric usage by 2/3rds...  (It has slowly edged up again, but is still less
than half of my highs back in the mid 90's.)  Their prices, like everything
else, have gone up in the past 15 years.  The Model 21-1850/CI includes
a computer interface to dump and examine the logging.  It's got a lifetime
warranty and it has lasted....

Oblig: sa-users -- I may finally have my 'dead email' restart problem
solved.
I had that too! Well, OK, my ISP broke my email entirely (by changing my
static IP without warning, denying me access to my DNS to adjust it
accordingly, and breaking the backup MX they provided so that it bounced
all email to me with a relaying denied message).
----

   Ouch -- at least my mail probs are *usually* recoverable.  Only had one
time when I bounced email (unknowingly) for 3 days solid...Ouch!


   I let my ISP (speakeasy.net) manage my email collection, and
I download it with IMAPS -- so if I go down, it backs up on their servers
for a while.  My quota will hold about a week or so of me being dead -- my
backup plan would be to forward my email to one of my gmail
accounts...they have 4GB each! :-)...then I can later download it from
them via IMAP!  Very Sweet!

Google for 'zetnet breathe fiasco' for more than you could possibly want
on the disaster. I got off lightly: other people have had no email or
news for many months, and some people lost all their old email too (!)

I added the 'delay time' taken by spamd when running my email inputs
(it's actually my filter's delay time, but the max difference between the two
is about .01 seconds, so it's mostly spamd delay) -- my stats for today from
~9:30am are (n=#emails): n=4513, min=3.27s, max=208.09s, ave=35.16s,
mean=27.43s
----
   I think spamd alone (with no network tests) takes well under a second.
I need to look at which network tests are taking the most time and see
how much value they add -- that might speed up my per-message rate to best
effect.
   I've been rewriting my mail filter script.  It's evolved since before
modern alternatives like procmail were around -- in fact I tried procmail and
found it too unreliable (at the time, v<1.0) for my needs.  My initial version
of the script was written in Perl 3....  Much of my recent pain with mail
backlog and spamd taking so much time has been tweaking the script for some
new addresses and realizing it was getting "too unwieldy" again...and
needed a make-over...and there are all those new features in 5.10.  Before
I knew it...insufficient testing before putting it in place, and all my
email was getting backed up in various ways ...  all sorts of new ways to
create messes.   But through it all, I don't think I sent any bounce
messages (or lost any email)....at one point I had over 1000 processes
'deadlocked', but was able to unwind them without losing email or generating
bounce messages.

   I have fantasies about perl 5.12 or 5.14 getting fancy with some of
its 'foreach(@array)' processing -- allowing for parallel threads that
can run on multiple CPUs.  The graphics processors are getting to over 200
vector processors, and rumor is Intel is going that direction as well for
their onboard graphics -- but with more powerful, 'thread'-like action.
Windows could really take off over Linux there, as it has much lower thread
overhead (though higher process overhead); because Linus decided Linux would
not have lightweight threads -- just pseudo-threads that are really just
more processes -- he optimized process creation, but hurt thread creation.

   This could bite Linux if a move comes toward more hyperthreads, which
work best with 'lightweight threads' -- threads that are cheap to spawn
and intimately intertwined in doing the same task... -- like that
'foreach(@array)' I was talking about.  Can you imagine the overhead of the
current process spawn and how it would kill something simple like that?
(Though it may require really big loops in Perl to overcome the thread
creation overhead, as it's interpreted.)
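
   For now the closest thing is doing it with plain processes.  A minimal
sketch of a 'parallel foreach' using the CPAN module Parallel::ForkManager --
the message paths and the spamc call are just illustrative, and the
per-element fork is exactly the overhead I'm complaining about, so it only
pays off when each iteration is expensive:

  #!/usr/bin/perl
  # A 'parallel foreach' with plain processes, via CPAN's
  # Parallel::ForkManager.  The message paths and the spamc call are
  # illustrative; each iteration has to be expensive enough to pay for
  # the fork.
  use strict;
  use warnings;
  use Parallel::ForkManager;

  my @messages = glob('/var/spool/filter/queue/*.msg');
  my $pm = Parallel::ForkManager->new(4);          # 4 worker processes

  foreach my $msg (@messages) {
      $pm->start and next;                         # parent: next element
      system("spamc -c < $msg > /dev/null");       # child: do the real work
      $pm->finish;                                 # child exits here
  }
  $pm->wait_all_children;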

   But thinking of parallelism -- if the bottleneck really is the speed of
the RBL servers, maybe the solution would be a distributed RBL among
independent ISPs (not biggies like AT&T/MaBell/Comcrass, as they'd be too
likely to muck it up)...but if my ISP hosted an RBL cache, for example for
their own customers, that could really speed things up.  Dunno if it is
practical or doable or not, but it sure might help the problem.
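
   The same idea works on a smaller scale with an ordinary caching resolver
on the local box (dnsmasq, named, whatever) -- the second lookup of the same
RBL entry comes straight back from the cache.  A sketch using Net::DNS; the
127.0.0.1 resolver and the standard Spamhaus test entry are assumptions:

  #!/usr/bin/perl
  # Time the same RBL lookup twice through a local caching resolver
  # (127.0.0.1 here is an assumption -- e.g. a local dnsmasq or named).
  # The second query should come back from the cache almost instantly.
  use strict;
  use warnings;
  use Net::DNS;
  use Time::HiRes qw(gettimeofday tv_interval);

  my $res  = Net::DNS::Resolver->new(nameservers => ['127.0.0.1']);
  my $name = '2.0.0.127.zen.spamhaus.org';   # standard RBL test entry

  for my $try (1, 2) {
      my $t0    = [gettimeofday];
      my $reply = $res->query($name, 'A');
      printf "lookup %d: %-18s %.4f s\n",
          $try, ($reply ? 'listed' : 'not listed/failed'), tv_interval($t0);
  }

Repeat queries during a big spam run would then cost microseconds instead of
a full round trip.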

Linda
