John Rudd wrote:
While I don't disagree with your assessment of XP systems, I have a
different hunch about why such a large percentage of the mail coming
from XP systems is spam, and a smaller percentage of mail coming from
the other systems is spam:
a) In general, XP systems are not servers, and therefore, are not mail
servers.
b) Due to (a), if you do your mail/spam/virus scanning on machines that
do not receive direct connections from your own clients (mail/spam/virus
scanning at the border), OR if you do not have a high percentage of XP
clients in your domain, then your scanning systems will not receive many
(if any) legitimate direct connections from XP clients ... because a
legitimate mail sending process on an XP system will be directly
connecting to their own domain's mail server, and not to YOUR mail
scanning systems.
c) Thus, if you meet the conditions in (b), and if we accept (a) as
true, then the vast majority of connections you receive from XP systems,
on your mail scanning systems, will be from spam/virus bots trying to
directly submit spam or virus-laden messages to your mail gateways
instead of submitting it to their own mail servers (as bots are known to
do).
We would expect to see a lower percentage of spam from server type OSes
(or OSes that can be clients or servers) because a higher percentage of
those platforms are used as legitimate mail servers.
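To put rough numbers on the argument in (a) through (c), here is a minimal
sketch. Everything in it is an assumption made up for illustration: the
function name, the simplification that every infected host and every
legitimate mail server makes one connection to your gateway, and the
fractions themselves. None of it is measured data.

```python
# Illustration only: all fractions below are hypothetical, not measurements.

def observed_spam_share(infected_fraction, mail_server_fraction):
    """Share of gateway connections from one OS family that are spam.

    Simplifying assumption: each infected host (bot) and each legitimate
    mail server of that OS family connects to the gateway once. Ordinary
    clients never connect, since they submit mail to their own domain's
    server.
    """
    bot_connections = infected_fraction
    legit_connections = mail_server_fraction
    total = bot_connections + legit_connections
    if total == 0:
        return 0.0
    return bot_connections / total


# Same hypothetical infection rate for both OS families.
INFECTED = 0.10

# Client-type OS: almost none of the hosts are legitimate mail servers.
xp_share = observed_spam_share(INFECTED, mail_server_fraction=0.01)

# Server-capable OS: many of the hosts are legitimate mail servers.
server_share = observed_spam_share(INFECTED, mail_server_fraction=0.50)

print(f"client-type OS:    {xp_share:.0%} of observed connections are spam")
print(f"server-capable OS: {server_share:.0%} of observed connections are spam")
```

With these made-up inputs the client-type OS shows about 91% spam at the
gateway and the server-capable OS about 17%, even though both were given
the same infection rate. The difference comes entirely from which hosts
have a legitimate reason to connect to your scanners.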
The other factor here is: while I _hate_ Linux, how much of the spam
being submitted by Linux boxes is merely a mail server relaying on
behalf of one of its infected clients? (The same goes for the Unix
systems and the 2000/2003 systems.) Spam of that kind is not at all
indicative of the quality of Linux systems administration out on the
internet.
I think this is one of those cases where "the statistics work as blind
observations of behavior, but attempting to describe _why_ the
statistics work is not something you can sum up with a simple and
straightforward explanation". Kinda like QM.
<ot>
I agree that statistics aren't the whole story. You can study the
percentage of thieves/criminals based on skin color and origin (some
people already do it, and many jump to conclusions without studies). But
you can do the same study based on the social situation and past history
of people. The first "researcher" will probably conclude that
black/arabic/latin/... people are "more" criminal. The second
"researcher" will instead conclude that criminality is seen more in poor
communities, but that these aren't the worst criminals (killing vs
stealing for instance).
</ot>
Back to XP and co. My feeling (no, I didn't run a study and won't) is
that even if a study showed that we get more spam from XP than from
Linux, I would not use this to classify my mail.
I am certain that if you do stats on mail dates, you'll find that some
dates correspond to more spam than others. We've already seen people
jumping to block specific mailers (The Bat! for instance) based on their
stats. I am also seeing a lot of legit mail triggering some SA rules
(*_exess, no_real_name, x_library, ...). When I see this, I check the
rule, and if I can't find a justification, I disable it.