On Tue, Jun 15, 2010 at 10:44:03PM -0400, Barry Warsaw wrote:
>
> Given that all signups require an email validation step, and that we'll
> rate-limit that to prevent using signups as a spam vector, what additional
> protection does captcha provide?

Are you saying that no scripts/bots can automatically sign up for mailman lists? I get plenty of signups like "qneu45...@nanke62w.net" that suggest otherwise. I should take the time to log those and send them to you, perhaps? After my master's paper...

Most of these numbers are educated guesses; if you want real, validated numbers they'll have to wait, again, until I turn in my master's paper. With that...

Let's say I have a large list that receives 16 signups a day, and of those two are actually humans and not scripts. The list owner, having had trouble with spammy signups in the past, has set the list to require moderator approval before users can post. What are the human costs? We'll say that the two human signups took 40s each (80s), and the moderator also took 40s per signup (640s), for a total of 720s = 12 minutes.

Now let's assume the reCAPTCHA adds 13s[0] to real human signups and cuts signups down to 4 per day (the two humans plus two spammy ones), and re-run our math. The two people now spend 106s and the moderator spends 160s, for a total of 266s, or 4.43 minutes. Yes, we've shifted some costs to our subscribers, but they pay that once, and the moderator gets back time daily. What's more, we've increased the subscribers' burden by about a third (40s to 53s each) and divided the moderator's burden by four (640s to 160s). And we haven't even mentioned the increased cost to the spammer, or (in the case of reCAPTCHA) the benefit to society of the CAPTCHA-solving work.

That's the real point of all this: drive up the cost to spammers as much as possible while imposing as little burden as is reasonable on list owners, moderators, subscribers, site admins, etc. We can't exactly follow the metafilter model[1] here, and I think this is as good an idea as I have seen, but I'd love for others to propose something else that imposes less of a burden on subscribers and that we know will drive up costs to spammers over the longer term.

Again, I don't even propose we turn this on by default.
I would just like to see this as a documented, tested option that can be enabled by site admins and cleanly upgraded without extra work.

Okay... now that I've put all this energy into this explanation, I'll admit: spam to list owners, especially of the "Dear $LISTNAME owner, we at $SITENAME security need you to reset your password. Please find instructions in the attached .zip file..." variety, was a much bigger problem a couple of years ago (surprisingly, even after implementing SA) until I decided to block .zip and several other MIME types at the MTA level. So if y'all have no interest in doing any reCAPTCHA integration, I'll just spend that much more time making anti-spam tweaks at the MTA level, and I'll field one or two more "I'm a moderator and I'm dealing with a lot of spam here" tickets every now and then.

That's another point, come to think of it: I've had plenty of time and experience running a couple of decently-sized mailman installs, but what about the many, many people who have less experience running mailman? The easier we make it for them to make it hard on spammers, the better.

A final note: are there any published user studies on mailman? I see your ATEC '03 and LISA '98 presentations in the ACM portal, and I see http://www.gnu.org/software/mailman/otherstuff.html ... but nothing else turns up in Google Scholar. Please point me to other research on mailman and its user base if it exists. If it doesn't, maybe I need to make that happen....

Thanks so much for all the work all of you do. It really is a pleasure and a privilege to be involved.

Cheers,
--
Cristóbal Palmer
ibiblio.org
metalab.unc.edu

[0] http://www.sciencemag.org/cgi/content/full/321/5895/1465
"reCAPTCHA: Human-Based Character Recognition via Web Security Measures."
Originally published in Science Express on 14 August 2008
Science 12 September 2008: Vol. 321. no. 5895, pp.
1465 - 1468 DOI: 10.1126/science.1160379

Quoting: User testing on our site (http://captcha.net) showed that it took 13.51 s on average (SD = 6.37) for 1000 randomly chosen users to solve a seven-letter conventional CAPTCHA (25th percentile was 8.28 s, median was 12.62 s, and 75th percentile was 17.12 s), whereas it took 13.06 s on average (SD = 7.67) for a different set of 1000 randomly chosen users (also from http://captcha.net) to solve a reCAPTCHA (25th percentile was 5.79 s, median was 12.64 s, and 75th percentile was 18.91 s).

[1] Charge five US dollars (paypal) for an account.

_______________________________________________
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-developers%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9
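
For anyone who wants to plug in their own numbers, the back-of-the-envelope moderation-cost estimate earlier in this message can be sketched in a few lines of Python. This is only an illustration: the function name `daily_cost_seconds` and its parameters are made up for this sketch and are not part of Mailman, and the inputs are the same educated guesses used above (16 or 4 signups a day, 2 humans, 40s per signup, 13s extra for the reCAPTCHA).

```python
def daily_cost_seconds(signups, humans, human_secs, moderator_secs):
    """Hypothetical helper: return (subscriber_seconds, moderator_seconds) per day.

    signups        -- total signups per day the moderator must review
    humans         -- how many of those signups are real people
    human_secs     -- seconds each real person spends signing up
    moderator_secs -- seconds the moderator spends on each signup
    """
    subscriber = humans * human_secs
    moderator = signups * moderator_secs
    return subscriber, moderator


# No CAPTCHA: 16 signups a day, 2 of them human, 40s per signup for everyone.
before = daily_cost_seconds(signups=16, humans=2, human_secs=40, moderator_secs=40)

# With reCAPTCHA (assumed effect): 4 signups a day, humans spend 40s + 13s.
after = daily_cost_seconds(signups=4, humans=2, human_secs=40 + 13, moderator_secs=40)

for label, (subscriber, moderator) in (("before", before), ("after", after)):
    total = subscriber + moderator
    print(f"{label}: subscribers {subscriber}s, moderator {moderator}s, "
          f"total {total}s = {total / 60:.2f} min")
```

Changing `signups` or the 13s overhead shows how sensitive the trade-off is to those guesses, which is the part that needs real measurements.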