I've been watching this thread thinking I must be the pirate in the
group.
I kick bad bots off my ship as soon as I see them; I do not let them
poke around twice, and I do not use any Bayesian filters.
I try to make my site very unattractive to harvesting bots that collect
pages or email addresses or try to exploit mail scripts.
I'm on a Linux server. My .htaccess file is 100 KB, so unfortunately
every hit takes extra time while the list is checked for banned IPs or
user agents. But it also lets me block any file type requested from an
external IP or domain, so it seems to stop hotlinking as well as
exploits of CGI mail scripts.
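# Requests with neither a referer nor a user agent.
# (The closing rule here is assumed; only the conditions were posted.)
RewriteCond %{HTTP_REFERER} ^-?$
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F,L]
# Requests for cgi or wav files referred from any other site.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yoursite\.com(/.*)?$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?43\.223\.82\.45(/.*)?$
RewriteRule \.(cgi|wav)$ http://www.yourdomain.com/Stolen.jpe [R,L]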
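To see which requests the hotlink conditions catch, here is a rough
JavaScript sketch of the same referer test. It is not part of my
.htaccess file; the function name isHotlink is made up for illustration,
and the patterns have the dots escaped and the host anchored, which is
stricter than the rules as literally typed.

```javascript
// Referers allowed to request protected files (placeholder hosts from the rules).
const ALLOWED_REFERERS = [
  /^http:\/\/(www\.)?yoursite\.com(\/.*)?$/,
  /^http:\/\/(www\.)?43\.223\.82\.45(\/.*)?$/,
];

// File types the rule protects.
const PROTECTED_FILES = /\.(cgi|wav)$/;

// Returns true when a request would be redirected to the "Stolen" image.
function isHotlink(referer, path) {
  if (!PROTECTED_FILES.test(path)) return false; // rule pattern does not match
  if (referer === "") return false;              // empty referers are let through
  return !ALLOWED_REFERERS.some((pattern) => pattern.test(referer));
}
```

With these patterns a referer such as http://yoursite.com.evil.example/
is treated as external, because the host has to end right after
yoursite.com.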
I can email anyone off list my .htaccess file of banned IPs and banned
bots. There are some really well-known bad ones: rima-tide.net comes to
mind, and the user agent Wget is really suspect as well, though it can
be used honestly.
I also use a CGI script that generates endless links. It is referenced
in only a couple of spots, marked "noindex" and "nofollow", which all
the good bots seem to respect. On every few pages I put a dozen or so
fake email addresses just for the bots that do get through, hidden with
display: none in the CSS and likewise marked "noindex" and "nofollow".
http://www.hereticpress.com/Private/members.foo
Just for today I will ignore hits on the above page. The CGI is called
poison, but it is not used much anymore. Most bots do not follow the
links, but a few have been trapped in a loop for a day or so until I saw
them in the logs and banned them by IP or user agent. I also have a full
page of junk email addresses, about 500 KB of them; occasionally I
change a few details and update the page modification date. A few bots
and humans have gone for it, and then I see them in the server logs and
ban them. I reckon the fake emails will pollute their spam database,
making it less valuable.
Junk email addresses for your favourite spam bots:
http://www.hereticpress.com/Bots.html
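For anyone who wants to try the link trap and junk address idea, here is
a minimal JavaScript sketch. It is not my poison script; every name in
it (poisonPage, fakeAddress, the word list) is made up for illustration,
and the addresses use the reserved .invalid domain so they can never
reach a real mailbox. Each page links only to more generated pages, so a
bot that ignores "nofollow" keeps going.

```javascript
// Illustrative word list used to build paths and addresses.
const WORDS = ["alpha", "bravo", "cedar", "delta", "ember", "flint", "grove", "harbor"];

// Simple string hash, used to seed the generator from the request path.
function hashString(text) {
  let h = 0;
  for (const ch of text) h = (Math.imul(h, 31) + ch.codePointAt(0)) >>> 0;
  return h;
}

// Small linear congruential generator, so the same path always
// produces the same page.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

function pick(rng, list) {
  return list[Math.floor(rng() * list.length)];
}

// Build one junk address on the reserved .invalid top-level domain.
function fakeAddress(rng) {
  const user = `${pick(rng, WORDS)}.${pick(rng, WORDS)}${Math.floor(rng() * 1000)}`;
  return `${user}@${pick(rng, WORDS)}.invalid`;
}

// Build a trap page: links to deeper generated paths plus hidden junk addresses.
function poisonPage(path, linkCount = 5, addressCount = 12) {
  const rng = makeRng(hashString(path));
  const base = path.replace(/\/$/, "");
  const links = [];
  for (let i = 0; i < linkCount; i++) {
    const next = `${base}/${pick(rng, WORDS)}-${Math.floor(rng() * 10000)}`;
    links.push(`<li><a rel="nofollow" href="${next}">${next}</a></li>`);
  }
  const addresses = [];
  for (let i = 0; i < addressCount; i++) {
    const addr = fakeAddress(rng);
    addresses.push(`<a href="mailto:${addr}">${addr}</a>`);
  }
  return [
    "<!DOCTYPE html>",
    "<html><head>",
    '<meta name="robots" content="noindex,nofollow">',
    "<title>Members</title>",
    "</head><body>",
    `<ul>${links.join("")}</ul>`,
    `<div style="display:none">${addresses.join(" ")}</div>`,
    "</body></html>",
  ].join("\n");
}
```

Calling poisonPage("/trap") returns the HTML for one page; whatever
serves it would call the function again with each deeper path a bot
requests.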
Lastly, when I put an email address in forms or on pages I hide it in
some JavaScript and encode the individual letters. I know JavaScript is
not the best for accessibility, but you can break an email address into
parts with JavaScript and encode the individual letters as ASCII codes.
The address below is not a real one.
<a href="#" tabindex="131"
onclick="window.location='m'+'ail'+'to:'+'r'+'@'+'he'+'reticpssm'; return false;"
accesskey="M" class="LinkItems"
title="Contact the webmaster Tim at Heretic Press Ctrl+M">Tim</a><br />
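The link above only splits the address into pieces. Encoding the
individual letters as character codes might look like the sketch below;
the function names and the example.com address are placeholders, not
what runs on my site.

```javascript
// Turn an address into a list of character codes (done once, ahead of time).
function encodeAddress(address) {
  return Array.from(address, (ch) => ch.codePointAt(0));
}

// Rebuild the mailto: URL from the numeric codes when the link is clicked.
function decodeMailto(codes) {
  return "mai" + "lto:" + String.fromCodePoint(...codes);
}

// Placeholder address; only the numbers would appear in the page source.
const codes = encodeAddress("someone@example.com");

// In a page, the click handler would then do something like:
//   window.location = decodeMailto(codes);
```

A harvester that only scans the page text for something@something never
sees the address, only a list of numbers.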
I have used the same email address for years, and I do one more thing.
Sorry, Windows users: with Mac Mail you can bounce a message back to the
sender, so anyone with a valid return address will think you don't
exist. I discovered that some spammers leave a small window in which you
can bounce email back to them, but after some time the return address
stops working.
That's most of what I know about preventing spam and exploits of my
mail system.
Tim
On 16/02/2007, at 7:49 PM, James Crooke wrote:
but if I don't find a good alternative soon I might also be forced to
use them as the spambots out there get smarter and more capable of
getting around basic obstacles like form fields being named differently
or checks on IPs.
This was me 6 months ago ^
I just had to give in to using a CAPTCHA until a proven solution comes
along.
On 2/16/07, Michael MD <[EMAIL PROTECTED]> wrote:
> SilverStripe Newsletter
> >I personally get very frustrated with captchas, especially really
> >awkwardly hard to interpret ones. And the questions below are novel
> >for a while but wear you down after 10->20 a day!
> >
> >One reason I get frustrated with them is that there are great
> >Bayesian filters out there that just "know" if a comment is spam or
> >not. When you submit something, it asks a global webservice if the
> >text seems human or not, and it's very accurate. I only realised
> >these existed late last year, but they've been a godsend for the
> >sites we build.
>
> I don't really think that is a good solution...
> look at email spam and how much of it gets through spam filtering...
> and the risk of false positives is too high for my liking.
> ... we need a better way.
>
> I don't like captchas either and have so far avoided using them, but
> if I don't find a good alternative soon I might also be forced to use
> them as the spambots out there get smarter and more capable of getting
> around basic obstacles like form fields being named differently or
> checks on IPs. (There are even some spambots blatantly using real IP
> addresses, e.g. rbnnetwork.)
>
> I had to disable trackbacks on my site because the submission process
> for those is too open to spambots... (the standard process for
> submitting trackbacks is fundamentally flawed: it lacks an extra step
> to ask for a response from the client to check if the IP is real!)
>
> *******************************************************************
> List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm
> Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm
> Help: [EMAIL PROTECTED]
> *******************************************************************
>
>
--
James
The Editor
Heretic Press
http://www.hereticpress.com
Email [EMAIL PROTECTED]