Re: Counting tells you if you are making progress

2007-02-28 Thread Rich Kulawiec

On Wed, Feb 21, 2007 at 12:31:30AM -0500, Sean Donelan wrote:
> Counting IP addresses tends to greatly overestimate and underestimate
> the problem of compromised machines.
> 
> It tends to overestimate the problem in networks with large dynamic
> pools of IP addresses as a few compromised machines re-appear across
> multiple IP addresses.  It tends to underestimate the problem in
> networks with small NAT pools with multiple machines sharing a few IP
> addresses. Differences between networks may reflect different address
> pool management algorithms rather than different infection rates.

Yes, but (I think) we already knew that.  If the goal is to provide
a minimum estimate, then we can ignore everything that might cause
an underestimate (such as NAT).  In order to avoid an overestimate,
multiple techniques can be used.  For example, observation from multiple
points over a period of time much shorter than the average IP address
lease time for dynamic pools, use of rDNS to identify static pools,
use of rDNS to identify separate dynamic pools (e.g., a system which
appears today inside hsd1.oh.comcast.net is highly unlikely to show up
tomorrow inside hsd1.nj.comcast.net), classification by OS type (which,
BTW, is one way to detect multiple systems behind NAT), and so on.
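
To make two of those techniques concrete, here is a minimal sketch that
groups sightings by rDNS pool and counts distinct addresses inside one
short window.  The input format, the function names, and the choice of
four trailing labels as the pool key are all assumptions for
illustration, not anything a particular blocklist provides.

```python
from collections import defaultdict

def pool_key(rdns_name, labels=4):
    """Use the trailing labels of an rDNS name as a rough pool identifier.

    Assumption: names look like c-1-2-3-4.hsd1.oh.comcast.net, so the last
    four labels separate one regional pool from another.
    """
    parts = rdns_name.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])

def distinct_ips_per_pool(sightings, window_start, window_seconds):
    """Count distinct IPs per pool seen inside a single short window.

    sightings: iterable of (unix_time, ip, rdns_name) tuples (hypothetical).
    The window should be much shorter than the pool's typical lease time,
    so that one machine is unlikely to appear under two addresses.
    """
    seen = defaultdict(set)
    window_end = window_start + window_seconds
    for ts, ip, rdns in sightings:
        if window_start <= ts < window_end:
            seen[pool_key(rdns)].add(ip)
    return {pool: len(ips) for pool, ips in seen.items()}
```

Summing the per-pool counts for one window gives a number that address
churn should not inflate much, and it still ignores NAT, so it leans
toward a minimum.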

I think Gadi makes a good point: in one sense, the number doesn't really
matter, because sufficiently clueful attackers can already lay their
hands on enough to mount attacks worth paying attention to.

On the other hand, I still think that it might be worth knowing, because
I think "the fix" (or, probably more accurately, "fixes", optimistically
assuming such exist) may well be very different if we have 50M than if
we have 300M on our hands.

---Rsk


Re: Counting tells you if you are making progress

2007-02-23 Thread Todd Vierling


On 2/22/07, Sean Donelan <[EMAIL PROTECTED]> wrote:

> On Wed, 21 Feb 2007, Todd Vierling wrote:
> > I'd say it's severely biased in the overestimation direction -- but
> > that's not to say it isn't a problem, because zombies Suck.
>
> People with access to the ppp, dhcp or nat logs for a network can de-dup the
> counts based on IP addresses to come up with better surveys of infected
> computers.  They can further correlate the reports with contact with the
> computer owners of how many computers were found with known or unknown
> malware. But we rarely hear data from them.


Because this is a circular problem:  such providers want to deny the
problem until there's a sufficient number, and once they take notice,
the de-dup ... reduces the number.

This isn't a technology problem, it's a *business approach* problem.

But now I'm straying OT.

--
-- Todd Vierling <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>


Re: Counting tells you if you are making progress

2007-02-23 Thread Sean Donelan


On Wed, 21 Feb 2007, Todd Vierling wrote:

> I'd say it's severely biased in the overestimation direction -- but
> that's not to say it isn't a problem, because zombies Suck.


People with access to the ppp, dhcp or nat logs for a network can de-dup the
counts based on IP addresses to come up with better surveys of infected
computers.  They can further correlate the reports by contacting the
computer owners to learn how many computers were actually found with known
or unknown malware. But we rarely hear data from them.
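
As a rough sketch of what that de-dup could look like, the following
maps each (IP, time) report back to whoever held the address at that
moment.  The lease-record layout and the function names are
hypothetical; real ppp, dhcp or nat logs would need their own parsing.

```python
from bisect import bisect_right

def build_lease_index(leases):
    """Index lease records by IP.

    leases: iterable of (ip, start, end, subscriber_id) tuples, where the
    lease covers start <= t < end (hypothetical record layout).
    """
    index = {}
    for ip, start, end, subscriber in leases:
        index.setdefault(ip, []).append((start, end, subscriber))
    for entries in index.values():
        entries.sort()
    return index

def subscriber_at(index, ip, ts):
    """Return the subscriber holding ip at time ts, or None if unknown."""
    entries = index.get(ip, [])
    starts = [entry[0] for entry in entries]
    i = bisect_right(starts, ts) - 1
    if i >= 0 and entries[i][0] <= ts < entries[i][1]:
        return entries[i][2]
    return None

def dedup_reports(index, reports):
    """Collapse (ip, ts) reports into distinct subscribers.

    Returns (set_of_subscribers, count_of_reports_with_no_matching_lease).
    """
    subscribers = set()
    unmatched = 0
    for ip, ts in reports:
        subscriber = subscriber_at(index, ip, ts)
        if subscriber is None:
            unmatched += 1
        else:
            subscribers.add(subscriber)
    return subscribers, unmatched
```

A subscriber is still not the same thing as a computer when there is a
NAT box at the customer end, which is where contact with the owner
comes in.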


Although I disagree with some of the survey counts, finding zombies isn't 
a problem.  Figuring out if a computer is actually fixed and stays fixed 
is still the problem.  Sometimes it feels like an episode of "House." 
Except House wraps up the case in 60 minutes.




Re: Counting tells you if you are making progress

2007-02-21 Thread Todd Vierling


On 2/21/07, Sean Donelan <[EMAIL PROTECTED]> wrote:

> Counting IP addresses tends to greatly overestimate and underestimate
> the problem of compromised machines.
>
> It tends to overestimate the problem in networks with large dynamic
> pools of IP addresses as a few compromised machines re-appear across
> multiple IP addresses.


This issue is actually quite large.  Cable-based consumer broadband
tends to use DHCP with relatively long leases, so the IPs there don't
change a whole lot.  PPPoE DSL-based broadband, however, usually
changes IPs many times a day, as even a small amount of idle time
typically triggers a "disconnect" (and upon reconnect, a new IP is
assigned by whichever PPPoE concentrator "answered the call").

Some DSL providers (*cough*SBCATTBLS*wheeze*) push very hard for the
installation of their specialized connection monitoring software
(whose vendor, if expressed as initials, is also a nickname for a lewd
act ;), which further compounds the problem.  That software tries Hard
to keep the connection closed during any idle time, starting up only
on an on-demand basis when socket connection requests occur.


> It tends to underestimate the problem in
> networks with small NAT pools with multiple machines sharing a few IP
> addresses.


This problem is not nearly so huge, as "home networks" are not
particularly common compared to the scale of PPPoE deployment.  The
"home network" averages at most 2-3 machines, if that; I've seen
plenty of wireless routers installed for the sole purpose of making it
easier for a single computer to reach the DSL connection at the wall
jack.

I'd say it's severely biased in the overestimation direction -- but
that's not to say it isn't a problem, because zombies Suck.

--
-- Todd Vierling <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>


Re: Counting tells you if you are making progress

2007-02-20 Thread Gadi Evron

On Wed, 21 Feb 2007, Sean Donelan wrote:
> 
> 
> If you can't measure a problem, it's difficult to tell if you are
> making things better or worse.
> 
> On Tue, 20 Feb 2007, Rich Kulawiec wrote:
> > I don't understand why you don't believe those numbers.  The estimates
> > that people are making are based on externally-observed known-hostile
> > behavior by the systems in question: they're sending spam, performing
> > SSH attacks, participating in botnets, controlling botnets, hosting
> > spamvertised web sites, handling phisher DNS, etc.  They're not based
> > on things like mere downloads or similar.  As Joe St. Sauver pointed
> > out to me, "a million compromised systems a day is quite reasonable,
> > actually (you can track it by rsync'ing copies of the CBL and cumulating
> > the dotted quads over time)".
> 
> Counting IP addresses tends to greatly overestimate and underestimate
> the problem of compromised machines.
> 
> It tends to overestimate the problem in networks with large dynamic
> pools of IP addresses as a few compromised machines re-appear across
> multiple IP addresses.  It tends to underestimate the problem in
> networks with small NAT pools with multiple machines sharing a few IP
> addresses. Differences between networks may reflect different address
> pool management algorithms rather than different infection rates.
> 
> How do you measure if changes are actually making a difference?
> 

NAT on the one end, DHCP on the other. Time-based calculations along with
OS/Client fingerprinting often seem to produce interesting results.
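
A minimal sketch of the fingerprinting half, assuming some passive
fingerprinting tool has already labelled each observation with an OS or
client guess (the input format and function name are hypothetical):

```python
def min_hosts_by_fingerprint(observations):
    """Lower-bound the number of machines behind each IP.

    observations: iterable of (ip, os_label) pairs collected within one
    short time window (hypothetical input).  Two different labels on the
    same IP in the same window suggest at least two machines behind it.
    """
    labels_per_ip = {}
    for ip, os_label in observations:
        labels_per_ip.setdefault(ip, set()).add(os_label)
    return {ip: len(labels) for ip, labels in labels_per_ip.items()}
```

Identical machines behind one NAT still collapse to one, so this only
ever raises the floor.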



Counting tells you if you are making progress

2007-02-20 Thread Sean Donelan



If you can't measure a problem, it's difficult to tell if you are
making things better or worse.

On Tue, 20 Feb 2007, Rich Kulawiec wrote:

> I don't understand why you don't believe those numbers.  The estimates
> that people are making are based on externally-observed known-hostile
> behavior by the systems in question: they're sending spam, performing
> SSH attacks, participating in botnets, controlling botnets, hosting
> spamvertised web sites, handling phisher DNS, etc.  They're not based
> on things like mere downloads or similar.  As Joe St. Sauver pointed
> out to me, "a million compromised systems a day is quite reasonable,
> actually (you can track it by rsync'ing copies of the CBL and cumulating
> the dotted quads over time)".


Counting IP addresses tends to greatly overestimate and underestimate
the problem of compromised machines.

It tends to overestimate the problem in networks with large dynamic
pools of IP addresses as a few compromised machines re-appear across
multiple IP addresses.  It tends to underestimate the problem in
networks with small NAT pools with multiple machines sharing a few IP
addresses. Differences between networks may reflect different address
pool management algorithms rather than different infection rates.

How do you measure if changes are actually making a difference?
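
To illustrate the overestimate, here is a small sketch of cumulating
dotted quads across daily snapshots.  The snapshot format and function
name are assumptions; this is not how the CBL itself is distributed or
processed.

```python
def cumulative_distinct(daily_snapshots):
    """Running count of distinct IPs across a series of daily snapshots.

    daily_snapshots: list of sets, one set of listed IPs per day
    (hypothetical input format).
    """
    seen = set()
    totals = []
    for snapshot in daily_snapshots:
        seen |= snapshot
        totals.append(len(seen))
    return totals
```

A single machine that shows up as 192.0.2.1 on day one, 192.0.2.7 on
day two and 192.0.2.9 on day three produces running totals of 1, 2, 3,
although only one address is listed on any given day.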