FWIW, here's a du -sk directory size summary of the reports
SURBL grabbed from SpamCop Spamvertised sites over the past
4 days or so, grouped by TLD or first octet of a numeric URI:

KBytes  TLD or first octet of numeric address
======  =====================================
7       140
7       163
7       196
7       199
34      200
13      202
8       203
7       204
37      205
25      207
3       208
14      209
1       210
67      211
19      213
7       216
7       217
31      218
41      219
7       220
13      24
11      61
7       63
31      64
27      66
13      68
33      69
13      80
7       82
1       ae
1       an
5       ar
5       aspa
9       au
5       be
5550    biz
5       bogeyme
60      br
5       bz
1       ca
38      cc
3       celer
9       ch
21      cl
57      cn
7653    com
57      de
5       edu
9       es
3       f
17      fr
5       gg
9       gr
3       grand
9       hk
1       hostingp
11      il
3       imabigpimp
5       in
5798    info
21      it
9       jp
25      kr
5       mx
5       name
946     net
21      nl
5       no
3       nort
5       nu
305     org
5       pe
75      ph
1       pl
9       pt
21      ro
51      ru
5       se
1       sg
1       sk
1       st
1       st1
3       tabletswh
29      tc
5       thesed
5       tk
11      to
5       tr
51      tv
50      tw
9       ua
32      uk
1880    us
5       whole
69      ws
13      za

Looks like .com is the top spam site TLD reported to SpamCop,
followed by .info and .biz, then .us.  And 211. is the top
numeric URI.

The obviously wrong TLDs like "grand" and "tabletswh" are either
sloppy URIs or an attempt to take advantage of the implicit .com
that some browsers apparently add when no TLD is specified in
a URI.  If the latter, it could be an attempt to get around
message body scanning: sort of "obfuscation by underspecification".
We could counter this by appending ".com" to any domain lacking
a legitimate-looking TLD before processing it.
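
To make that countermeasure concrete, here is a rough Python sketch.
It is only an illustration, not what SURBL actually does: the
normalize_host name, the short KNOWN_TLDS set, and the rule that any
two-letter alphabetic label counts as a country code are all my
assumptions.

```python
# Hypothetical, deliberately short set of generic TLDs (an assumption;
# a real deployment would load a complete, current list).
KNOWN_TLDS = {"com", "net", "org", "info", "biz", "us", "edu", "name"}


def normalize_host(host, known_tlds=KNOWN_TLDS):
    """Append ".com" when the last label does not look like a TLD."""
    host = host.strip().strip(".").lower()
    if not host:
        return host
    last = host.rsplit(".", 1)[-1]
    # A numeric last label is treated as part of an IP address: leave it.
    if last.isdigit():
        return host
    # Assumption: accept listed TLDs, plus any two-letter alphabetic
    # label as a plausible country code.
    if last in known_tlds or (len(last) == 2 and last.isalpha()):
        return host
    # Otherwise guess what the browser would do and add the implicit .com.
    return host + ".com"
```

With this, "tabletswh" would be looked up as "tabletswh.com", while
"example.de" and numeric hosts pass through unchanged.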

Individual record lines vary in size somewhat, so something like a
record count (line count) would be a more accurate way to measure
the number of minute-unique spam reports, but as a general
estimate of reported activity, directory size is probably pretty good.
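
For anyone who wants the line-count version, something like the
following Python sketch should work. It assumes a local copy of the
tree in which each top-level directory is a TLD or first octet and
every file beneath it holds one report per line; that layout and the
count_records name are my assumptions, not a description of the
actual SURBL scripts.

```python
import os


def count_records(root):
    """Return {top-level directory name: total lines in all files below it}."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        top = os.path.join(root, entry)
        if not os.path.isdir(top):
            continue  # skip stray files at the top level
        total = 0
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                # Binary mode avoids decoding problems in odd report data.
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += sum(1 for _ in f)
        counts[entry] = total
    return counts
```

Calling count_records("domains") and sorting the result by value
would give a ranking comparable to the du table, but by report count
rather than by disk usage.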

Source data is the "domains" directory SURBL uses as a text
database of reports, stored into a tree of domain levels:

  http://spamcheck.freeapp.net/domains/

Hope this kind of info is not too redundant; I'm new here...

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://sc.surbl.org/
