Hi Nigel,

My solution for that is a simple two-step process:

1) Use mod_security to match the UA string of each incoming request against a list of UAs I don't want, and return an HTTP 406 the first time a UA matches.
2) Have fail2ban monitor the Apache log for 406 responses and immediately ban the IP (IPv4 / IPv6) for 96 hours using an apache-badbots jail.

This strategy has so far managed to keep my servers "cool".

cheers
-idg

On Thu, Jul 25, 2024, 16:57 Nigel Titley <ni...@titley.com> wrote:

> Is anyone else getting problems with the facebook web crawler hammering
> their OPAC search function?
>
> This has been happening on and off for a couple of months but set in
> with a vengeance a couple of days ago. The crawler is hitting us with
> many OPAC search queries, beyond the capacity of our system to respond.
>
> robots.txt is being ignored
>
> I started by blocking facebook's entire IPv6 range as the queries were
> all coming in over IPv6. They responded by switching to IPv4 and because
> they have a number of blocks it wasn't practical to block each and every
> one of them.
>
> I've temporarily switched off OPAC entirely and the system has returned
> to normal and I can at least perform intranet functions but this is
> obviously non-ideal.
>
> Does anyone have any thoughts on this?
>
> I'm running 22.05.13.000 on Ubuntu.
>
> Thanks
>
> Nigel
> _______________________________________________
>
> Koha mailing list http://koha-community.org
> Koha@lists.katipo.co.nz
> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
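
For illustration only, the two-step approach described in the reply at the top can be sketched in Python. This is not the actual mod_security rule or fail2ban jail, neither of which appears in the message. It only mimics their logic under stated assumptions: the blocklist entry `facebookexternalhit` is a hypothetical example (the message does not say which UAs are blocked), the log is assumed to be in Apache's "combined" format, and the names `status_for_user_agent` and `bans_from_log` are made up for this sketch.

```python
import re
from datetime import datetime, timedelta

# Hypothetical blocklist: the original message does not list the blocked UAs.
BLOCKED_UA_SUBSTRINGS = ["facebookexternalhit"]

# Ban length taken from the message: 96 hours.
BAN_DURATION = timedelta(hours=96)

# Assumes Apache "combined" log lines: client address first,
# status code right after the quoted request line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
)


def status_for_user_agent(user_agent: str) -> int:
    """Step 1: return 406 if the UA contains a blocked substring, else 200."""
    ua = user_agent.lower()
    return 406 if any(s in ua for s in BLOCKED_UA_SUBSTRINGS) else 200


def bans_from_log(lines, now):
    """Step 2: map every client that received a 406 to its ban expiry time."""
    bans = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "406":
            # A single 406 is enough to trigger the ban.
            bans[match.group("ip")] = now + BAN_DURATION
    return bans
```

The point of splitting the work this way is that the web server only has to answer a matching request once with a cheap 406. After that the ban happens at the firewall level, so repeat requests from the same address never reach the OPAC search at all, whether they arrive over IPv4 or IPv6.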