On Tue, 4 Jan 2000, brd wrote:

> Hey all. A few weeks back I proposed putting some archived messages
> up on a page as part of Mark Reda's "Winter Project Page". There was
> concern that spiders or spammers might slurp up email addresses and spam
> the list--a very real fear.
>
> Someone put forth a robots.txt file to keep spiders from sweeping
> the web site, and someone else proposed passwd protecting it. I
> think passwds are kind of prohibitive and more work to maintain and
> distribute. I would still do the robots.txt file, but I also came up
> with another possible solution that might make everyone feel more
> comfortable.
>
> I wrote a perl script to replace all email addresses in a web page
> (.html file) with an empty string "". Then I ran it on all the pages in
> the list archive I've been playing with. All the links still work, but
> there are no email addresses anywhere in the files that a spammer could
> possibly grab.
>
> The downside is that someone looking at the archive will not be able to
> email the original poster directly. The other downside is that people
> who don't have a complete From: field (seems to only be aol addresses
> so far) get their name, which was just an email address, wiped off the
> index pages.
>
> Check out what I mean:
> http://members.home.net/vwbrd/a2/
>
> Thoughts?
>
How about instead of deleting them, just mangle them?

    $email = 'jcald@veedub.nu';
    $email =~ s/@/ at /;

Then you would have 'jcald at veedub.nu'... still readable by a human,
utterly useless to a spam-crawling robot. You could go even further to
mangle it, just in case there is a "smart" robot out there.

BTW, I'm another web dude/perl nut, so if you guys need any help with the
site, let me know. ;)

--
John Caldwell
[email protected]

_____________
List Sponsor: http://www.netsville.com
To remove yourself from this list, send mail to [email protected]
with 'unsubscribe a2_16v' in the body of your message
See us on the web at http://www.a2-16v.com
Visit the 16V Homepage at http://www.gti16v.org
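
For what it's worth, here is a rough sketch of how the mangling idea might be applied to every address in a chunk of text rather than to one variable. The address pattern is a loose assumption (nowhere near a full RFC 5322 parser), mangle_addresses is a made-up helper name, and someone@example.com is a placeholder address. It also takes the "go even further" step by spelling out the dots.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Loose pattern for things that look like email addresses.
# Assumption: good enough for a list archive, not a complete address parser.
my $ADDRESS = qr/([A-Za-z0-9._%+-]+)\@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)/;

# mangle_addresses (hypothetical helper): rewrite every address found in
# $text as "user at host dot tld" and return the modified copy.
sub mangle_addresses {
    my ($text) = @_;
    $text =~ s{$ADDRESS}{
        my ($user, $host) = ($1, $2);   # copy captures before the inner s///
        $host =~ s/\./ dot /g;          # spell out the dots in the host part
        "$user at $host";               # value of the block is the replacement
    }ge;
    return $text;
}

# Small demonstration with a placeholder address.
print mangle_addresses('Mail someone@example.com for details.'), "\n";
# prints "Mail someone at example dot com for details."
```

Returning an empty string from the replacement block instead would give the strip-everything behaviour described in the quoted message. Running it over the archive would just mean reading each .html file, passing its contents through the helper, and writing the result back out.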
