Rob McEwen wrote:
Jesse Thompson wrote:
A word of caution. Be very careful how you use the list.
OK. I was wrong. Due to this discussion, I'm convinced that MD5 of the whole (lower case!) e-mail address is best, with the entire e-mail address still showing up in plain text in the DNS txt record. But I have some questions:
(1) Is MD5 of the entire address reasonably safe from collisions? (Consider the 'birthday paradox' before being too quick to answer.)
(2) I'm also interested in knowing more specifics about the data found at http://anti-phishing-email-reply.googlecode.com/svn/trunk/phishing_reply_addresses
(2.a.) How frequently are new scam addresses added to that list?
Every day. Contributors add addresses when they find them.
(2.b.) How long does an address take to expire after it was last used for scams "in the wild"?
They don't expire. You can use the date to make up your own policies depending on what you are doing.
We do have a 'phishing_cleared_addresses' list which we use when we get confirmation that an account has been locked down. Addresses on the cleared list are automatically removed from the 'phishing_reply_addresses' list if the activity date is older than the cleared date.
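As a rough illustration of the two rules above (removal via the cleared list, and a local age policy built on the date), here is a minimal Python sketch. It assumes each list has already been loaded into a dict mapping address to date; the function names, the dict layout, and the 90-day window in the usage note are hypothetical and not part of the project.

```python
from datetime import date, timedelta

def apply_cleared(reply, cleared):
    """Drop addresses whose last activity date is older than their cleared date.

    reply   -- dict: address -> date of last observed activity (assumed layout)
    cleared -- dict: address -> date the account was confirmed locked down
    """
    return {
        addr: last_seen
        for addr, last_seen in reply.items()
        # Keep the entry if it was never cleared, or if it was seen again
        # on or after the cleared date.
        if addr not in cleared or last_seen >= cleared[addr]
    }

def recently_active(reply, today, max_age_days):
    """Local policy (hypothetical): keep only addresses seen within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return {addr: seen for addr, seen in reply.items() if seen >= cutoff}
```

For example, `recently_active(apply_cleared(reply, cleared), date.today(), 90)` would give a site its own 90-day expiry on top of the shared list, which itself never expires entries.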
(2.c.) Is the data auto-added, or must e-mail addresses go through a manual review first?
Manually added. But I can't speak for the methods of everyone that contributes.
(2.d.) Moreover, what is a typical time between the "419" spammer's last spotted use of the e-mail, and appearance in that list?
It's reactive, so the spam must be received before it can be discovered.
(I don't need exactly precise answers which spammers might use to 'game' the system... just basic estimates will do)
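On question (1), a quick way to get a feel for the numbers is to compute the key and the birthday bound directly. The sketch below is only an illustration: the zone name is a placeholder (the thread does not name one), trimming whitespace before lower-casing is my own assumption, and the probability covers accidental collisions between random hashes, not collisions an attacker constructs on purpose, which MD5 is known to be weak against.

```python
import hashlib
import math

# Hypothetical zone; the thread does not say where the TXT records would live.
ZONE = "replyto.example.invalid"

def address_key(address: str) -> str:
    """MD5 hex digest of the whole address, lower-cased first.

    Stripping surrounding whitespace is an added assumption.
    """
    normalized = address.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def query_name(address: str, zone: str = ZONE) -> str:
    """DNS name under which the TXT record for this address would be looked up."""
    return f"{address_key(address)}.{zone}"

def collision_probability(n: int, bits: int = 128) -> float:
    """Birthday approximation: chance that any two of n random hashes match."""
    pairs = n * (n - 1) / 2
    # 1 - exp(-pairs / 2**bits), computed via expm1 to stay accurate near zero.
    return -math.expm1(-pairs / 2 ** bits)
```

With one million listed addresses, `collision_probability(10**6)` comes out on the order of 1e-27, so an accidental clash between two different addresses is not a practical concern at this list size.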
Jesse
--
Jesse Thompson
Division of Information Technology, University of Wisconsin-Madison
Email/IM: jesse.thomp...@doit.wisc.edu