While they do ignore robots.txt, they at least supply a recognizable user agent that you can block outright:

RewriteEngine on
# "other|bots|here" is a placeholder; extend the pattern with any
# further user agent substrings you want to block ([NC] = case-insensitive)
RewriteCond %{HTTP_USER_AGENT} "facebookexternalhit|other|bots|here" [NC]
# Exempt the 403 error page itself, or the forbidden response loops
RewriteCond %{REQUEST_URI} "!403\.pl" [NC]
RewriteRule "^.*" "-" [F]

Note that the second RewriteCond is required, or you'll end up with a redirect loop. They will still send you requests, but at least they won't tie up a Plack backend doing useless work. I haven't tried returning 5xx errors to see whether that causes them to back off, but I doubt they would take much notice.
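If you do want to experiment along those lines, a minimal sketch (untested, same placeholder pattern as above) is to swap the [F] flag for an explicit 503:

RewriteCond %{HTTP_USER_AGENT} "facebookexternalhit|other|bots|here" [NC]
# Any status outside the 3xx range makes mod_rewrite drop the
# substitution and return that status directly, so this answers 503
RewriteRule "^.*" "-" [R=503,L]

Well-behaved clients treat 503 as "try again later"; whether Facebook's crawler does is exactly the open question.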

Jason
--
Jason Boyer
Senior System Administrator
Equinox Open Library Initiative
jbo...@equinoxoli.org
+1 (877) Open-ILS (673-6457)
https://equinoxOLI.org/

On Thu, Jul 25 2024 at 01:45:56 PM +0100, Nigel Titley <ni...@titley.com> wrote:
Dear Michael

On 25/07/2024 13:28, Michael Kuhn wrote:
Hi Nigel

In such a case I would advise creating a sitemap. Unfortunately this Koha feature does not seem to be well documented, but the following may give you a start (a cron sketch follows the links):

* <https://lists.katipo.co.nz/public/koha/2020-November/055401.html>

* <https://wiki.koha-community.org/wiki/Commands_provided_by_the_Debian_packages#koha-sitemap>

* <https://koha-community.org/manual/24.05/en/html/cron_jobs.html#sitemap>
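For illustration, a sketch of running it from cron on a Debian package install. The instance name "library" and the --generate flag are assumptions on my part; check the wiki page above for the exact invocation on your version:

# /etc/cron.d/koha-sitemap -- rebuild the sitemap nightly at 03:00
# "library" is a placeholder instance name; --generate is assumed
0 3 * * * root koha-sitemap --generate library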

Thanks for this. I'll give it a go and see what happens, although if Facebook is ignoring the robots.txt file I suspect it will ignore the sitemap too.
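For reference, the standard robots.txt exclusion they are reported to ignore is nothing Koha-specific, just:

User-agent: facebookexternalhit
Disallow: /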

There's been a great deal of annoyance about this on the Facebook developer forums.

I'll let you know how it goes

Nigel

_______________________________________________

Koha mailing list  http://koha-community.org
Koha@lists.katipo.co.nz
Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
