Some of you may remember that I announced the release of the harvester script. We discussed the issue, and since the outcome was not to change Mailman, I have now released the script to raise public awareness of the problem.
Bernhard
-------- Original Message --------
Subject: mailman email harvester
Date: Mon, 07 Feb 2005 23:48:44 +0100
From: Bernhard Kuemel <[EMAIL PROTECTED]>
To: [email protected], [email protected], [email protected]
Hi!
Tons of email addresses from mailman mailing lists are vulnerable to being collected by spammers.
They are "protected" by obfuscation ([EMAIL PROTECTED] -> user at example.com) and access to the subscriber list can be restricted to subscribers. The obfuscation is trivially reversed and harvester scripts can subscribe to gain access to restricted lists.
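To illustrate how little the obfuscation hides, here is a minimal sketch in Python. It is not part of the attached script, and the function name `deobfuscate` is made up for this example. It assumes the obfuscated form is exactly the "user at example.com" pattern shown above.

```python
import re

def deobfuscate(text):
    """Turn 'user at example.com' back into 'user@example.com'."""
    # Replace the first ' at ' that sits between two non-space tokens with '@'.
    return re.sub(r"(\S+) at (\S+)", r"\1@\2", text, count=1)

print(deobfuscate("user at example.com"))
```

A single substitution is enough, which is why the obfuscation should not be counted as protection.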
I suggested a graphical Turing test that would bar scripts, but the mailman developers argued that spammers might hire a couple of temps to solve the test, as has already happened for the creation of email accounts. The only solution would be not to make the desired information available at all. This is already possible by restricting access to the member list to the list administrator.
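For list administrators, that restriction can be sketched as an input file for Mailman 2.1's bin/config_list, which reads plain Python assignments. I am assuming the Mailman 2.1 attribute name `private_roster` with the values 0 = anyone, 1 = list members, 2 = list admin only; please check this against your installed version. The file name and list name are placeholders.

```python
# roster.cfg (hypothetical file name), applied with something like:
#   bin/config_list -i roster.cfg mylist
#
# Assumed Mailman 2.1 semantics for private_roster:
#   0 = anyone may view the subscriber list
#   1 = only list members may view it
#   2 = only the list administrator may view it
private_roster = 2
```

The same setting should be reachable through the web admin interface under the privacy options, so the file is only needed when changing many lists at once.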
However, many lists still either publish the member list openly or make it available to the list members. To raise awareness of this issue I wrote a script that collects addresses from openly accessible lists. It stops after processing 1000 search results from Google (the maximum allowed), by which point it has collected 76772 email addresses (61124 unique). It is attached as mmxp1.
An improved version is planned that also collects addresses from lists whose rosters are restricted to subscribers, processes more lists, and runs with more parallelism.
Bye, Bernhard
#!/usr/bin/perl -w

#http://www.google.com/search?q=%22list+is+only+available+to+the+list+members%22+mailman/listinfo&start=600&num=100
#2.1.4 "current archive" "private list which" mailman/listinfo site:org

$n=0;
$u=0;
for ($i=0;1;$i+=10) {
  $#urls=-1;
  $google=`wget -qO - -U 'any browser' 'http://www.google.com/search?q=%22Click+here+for+the+list%22+mailman%2Flistinfo&start=$i'`;
#  print $google;
  @urls=($google=~m*<p class=g><a href=(http://\S+?)>*g);
#  print join("\n",@urls);
  if ($#urls==-1) {last;}
#  print "\naoeu $#urls\n";
  foreach $url (@urls) {
    $u++;
    $url=~s*/listinfo/*/roster/*;
    print STDERR "$url...\n";
    $roster=`lynx -connect_timeout=10 -dump $url`;
#    print $roster;
    @mails=$roster=~/^ +\* \(?\[\d+\](.* at .*?)\)?$/mgo;
    foreach $mail (@mails) {
      $mail=~s/ at /@/;
      print "$mail\n";
      $n++;
    }
    print STDERR "mails=".($#mails+1).", total=$n, url=$u, google=$i\n";
#    exit;
  } #foreach url
} #while google
_______________________________________________
Mailman-Developers mailing list
[email protected]
http://mail.python.org/mailman/listinfo/mailman-developers
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/archive%40jab.org
