Thank you very much, Jane!
We will certainly give fail2ban a try, though - as we use Apache - some
implementation details will probably be a bit different :-).
Linda
On 4/19/24 13:05, Jane Sandberg wrote:
Hi Linda,
It's not for Evergreen, but my colleague recently blocked claudebot
using fail2ban on our load balancer
<https://github.com/pulibrary/princeton_ansible/commit/6f9009249a168442391d90e2b75028d40a8a9e91>.
Essentially, fail2ban is configured to watch Nginx's access log, and
if more than 10 claudebot requests appear within the past minute from
a particular IP, it automatically blocks all requests from that IP for
the next 24 hours. I would think that something similar could work
for Apache's access log.
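
To illustrate the rule described above, here is a minimal Python sketch of the same counting logic. It is not the fail2ban configuration from the linked commit, and it does not block anything. It only reports which IPs would be banned. It assumes the access log uses the common "combined" format, which both Nginx and Apache can write. The constant names borrow fail2ban's maxretry, findtime, and bantime options, and the function name find_bans is hypothetical.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Thresholds taken from the description above; names mirror fail2ban options.
MAX_RETRY = 10                     # more than 10 matching requests ...
FIND_TIME = timedelta(minutes=1)   # ... within one minute ...
BAN_TIME = timedelta(hours=24)     # ... bans the IP for 24 hours

# Assumption: "combined" log format, e.g.
# 203.0.113.5 - - [19/Apr/2024:13:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def find_bans(lines, needle="claudebot"):
    """Return {ip: banned_until} for IPs exceeding the request threshold."""
    hits = defaultdict(deque)   # recent matching request times per IP
    banned_until = {}
    for line in lines:
        m = LINE_RE.match(line)
        # Skip unparseable lines and requests from other user agents.
        if not m or needle not in m["agent"].lower():
            continue
        ip = m["ip"]
        now = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        # An IP that is currently banned would not reach the server at all.
        if ip in banned_until and now < banned_until[ip]:
            continue
        window = hits[ip]
        window.append(now)
        # Drop requests that fell out of the one-minute window.
        while window and now - window[0] > FIND_TIME:
            window.popleft()
        if len(window) > MAX_RETRY:
            banned_until[ip] = now + BAN_TIME
            window.clear()
    return banned_until
```

Calling find_bans(open("access.log")) on a log file (the path is a placeholder) returns a dictionary that maps each offending IP to the time its ban would expire. fail2ban does the same bookkeeping itself and then applies the ban through a firewall action.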
Good luck with the bots!
-Jane
On Fri, Apr 19, 2024 at 3:42 a.m., Linda Jansová via
Evergreen-general (evergreen-general@list.evergreen-ils.org) wrote:
Dear all,
Have any of you encountered extensive crawling by Bytespider and
Bytedance (see, e.g.,
https://wordpress.org/support/topic/psa-bytedance-and-bytespider-bots-recommend-blocking/),
Claudebot, or other AI bots?
If so, do you have a secret recipe for keeping the crawlers from
accessing the site?
Thank you very much for sharing your experience!
Linda
_______________________________________________
Evergreen-general mailing list
Evergreen-general@list.evergreen-ils.org
http://list.evergreen-ils.org/cgi-bin/mailman/listinfo/evergreen-general