I have tightened up the PHP pooling and split it so that the wiki and the
forums each run in their own pool.
The trouble is caused by back pressure from the database. It turns out that the
forum session tables needed a specific index, and although they are cleaned up,
overuse on the Hungarian forum was causing contention.
I plan to set up blocking for aggressive spiders in the Apache configuration
tomorrow.
Here is the list:
# Known aggressive bots
SetEnvIfNoCase User-Agent "Bytespider" bad_bot
SetEnvIfNoCase User-Agent "MJ12bot" bad_bot
SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
SetEnvIfNoCase User-Agent "SemrushBot" bad_bot
Let me know if you know others to add.
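
A minimal Python sketch for checking candidate User-Agent strings against the
list above before it goes live. It assumes the same unanchored,
case-insensitive match that SetEnvIfNoCase performs. The names
BAD_BOT_PATTERNS and is_bad_bot are hypothetical and are not part of the
existing tooling.

```python
import re

# Bot names copied from the SetEnvIfNoCase lines above.
BAD_BOT_PATTERNS = ["Bytespider", "MJ12bot", "AhrefsBot", "SemrushBot"]

# One case-insensitive, unanchored regex covering every listed name.
BAD_BOT_RE = re.compile(
    "|".join(re.escape(name) for name in BAD_BOT_PATTERNS),
    re.IGNORECASE,
)


def is_bad_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains any listed bot name."""
    return BAD_BOT_RE.search(user_agent) is not None
```

Running this over the User-Agent column of a recent access log would show how
much traffic the list covers. Note that the SetEnvIfNoCase lines only tag a
request with bad_bot; an access rule that acts on that variable is still needed
to block it.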
Let’s see what the next 24 hours brings.
Best,
Dave
> On Feb 1, 2026, at 1:42 PM, Dave Fisher <[email protected]> wrote:
>
>
>
>> On Jan 31, 2026, at 10:02 PM, Dean Webber <[email protected]>
>> wrote:
>>
>>> I’ve used a new tool from ASF Infra to block a couple of data centers with
>>> bad connection requests.
>>
>> It would be interesting to know how this is achieved.
>> One way is to use AS (autonomous system: a very large network or group of
>> networks with a single routing policy) blocking.
>> This can be achieved using local tools and a blocklist, or with
>> hardware firewalls.
>>
>> I would recommend blocking AS24940, AS16276, AS14061, AS51167, AS16509,
>> AS8075, AS396982, AS136907.
>> You can generate prefix lists with bgpq4, which supports a wide range of
>> output formats for hardware.
>> Or you can define the output format yourself, such as bgpq4 -F "%n\\n" AS24940.
>>
>> From the above list, "bgpq4 -F "%n\\n" AS24940 AS16276 AS14061 AS16509
>> AS8075 AS396982 AS136907 > output" gives a file with 36 945 addresses. Even
>> just using these tools to see where this traffic is coming from is helpful.
>
> If you wish to help, then please share more details about this output on
> [email protected] and I can at least check whether these IPs appear in
> the logs.
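
A minimal Python sketch of that check, assuming the bgpq4 output file holds one
CIDR prefix per line (as the -F "%n\\n" format suggests). The function names
load_prefixes and matching_prefix are hypothetical.

```python
import ipaddress


def load_prefixes(lines):
    """Parse one CIDR prefix per line, skipping blank lines."""
    networks = []
    for line in lines:
        line = line.strip()
        if line:
            networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def matching_prefix(ip, networks):
    """Return the first network that contains ip, or None if there is none."""
    addr = ipaddress.ip_address(ip)
    for net in networks:
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return net
    return None
```

With the prefixes loaded from the file, each client IP taken from the logs can
be passed to matching_prefix. A linear scan is slow for tens of thousands of
prefixes and many log lines, so this is only meant for a first look.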
>
>> There are many autonomous systems that should be filtered, or monitored.
>
> We do have deny lists and can support manual changes to these filters.
>
>> I also recently asked Brave (Kit) a question, and it replied with an
>> answer from the public email archive in very recent history. It would be
>> worth investigating where the traffic comes from, as well as what the
>> intention of the connection is.
>>
>> If it is a web scraper, you could feed it to a tarpit,
>> https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/.
>> This has varying levels of success.
>
> For me just setting up a tarpit is a personal “tarpit”.
>
>>
>> Otherwise, BGP filtering will allow you to get rid of a bunch of bad
>> connections. Some should be outright blocked, others can be monitored for
>> abuse, and rejected after x requests in a given time.
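
The "x requests in a given time" idea can be sketched as a per-IP sliding
window. This is only an illustration under assumed names
(SlidingWindowLimiter, max_requests, window_seconds); it is not how the
existing filters work.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        """Record a request at time now and return True if it is allowed."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Rejected requests are not recorded, so an IP that keeps hammering is allowed
again only once its earlier accepted requests age out of the window.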
>
> The tuning I am working on is starting to stop bad requests before they
> reach the PHP layer.
>
> Last year I put aggressive IP blocking in place and we have a large deny list
> (and a small allowlist) built over time.
>
> The latest set of trouble has included Slowloris connections, and that is
> mitigated. Otherwise we have the usual scanners trying phpBB-style requests
> without knowing our unique organization.
>
> I’ll need to revisit the deny list process (python code that counts IPs in
> access logs) so that it also looks in the error logs for specific errors,
> which should show bad actors more quickly.
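
A rough sketch of what that combined count could look like. It assumes the
client IP is the first field of each access log line and that error log lines
carry a "[client ip:port]" tag, as in the default Apache 2.4 formats. The
function names, the needle parameter, and the thresholds are hypothetical; this
is not the existing deny list code.

```python
import re
from collections import Counter

# Matches the "[client 203.0.113.9:54321]" tag in an error log line.
CLIENT_RE = re.compile(r"\[client ([^\]]+)\]")


def count_access_ips(lines):
    """Count requests per client IP (first whitespace-separated field)."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if fields:
            counts[fields[0]] += 1
    return counts


def count_error_ips(lines, needle):
    """Count error log lines containing needle, grouped by client IP."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = CLIENT_RE.search(line)
        if match:
            # Strip the trailing ":port" from the client tag.
            counts[match.group(1).rsplit(":", 1)[0]] += 1
    return counts


def deny_candidates(access, errors, access_threshold, error_threshold):
    """Return sorted IPs that reach either threshold."""
    flagged = {ip for ip, n in access.items() if n >= access_threshold}
    flagged |= {ip for ip, n in errors.items() if n >= error_threshold}
    return sorted(flagged)
```

A low error threshold lets an IP that triggers the chosen error be flagged long
before its raw request count would stand out.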
>
> Best,
> Dave
>
>>
>> All the best.
>>
>> ________________________________
>> From: Dave Fisher <[email protected]>
>> Sent: 30 January 2026 02:53
>> To: dev <[email protected]>
>> Subject: Re: Unable to access on forums and wiki
>>
>> I’ve used a new tool from ASF Infra to block a couple of data centers with
>> bad connection requests.
>>
>> I then restarted the services.
>>
>> The site appears to be better.
>>
>>> On Jan 29, 2026, at 12:40 AM, Dick Groskamp <[email protected]> wrote:
>>>
>>> And again today.
>>>
>>> On 2026/01/08 07:37:04 [email protected] wrote:
>>>> Since this morning, access was very slow
>>>>
>>>> But now, I got timeout
>>>>
>>>> Thanks to restore it
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>