On Thu, 29 Jan 2026 at 12:16, Christian Schulte <[email protected]> wrote:

> What about something like this in /etc/pf.conf?
>
> source limiter "default" id 1 entries 100 limit 1 rate 1/10
> pass in on egress from any to any source limiter "default"
>
> Just change 100, 1 and 1/10 as required. I somehow doubt a real human
> will need to create more than one state every 10 seconds.
>

The C10K problem has long been solved by software like nginx.

The issue here is that many of these bots are distributed and make their
requests through proxies, so limits based on the IP address are not
effective.

The way to address the issue hasn't changed since the C10K problem was
first formulated in 1999.

* If you have to spawn a separate process for each request, you'll simply
never have enough capacity to serve all the requests.

How's that AI's fault when your software was always inefficient at scale?

The way to solve it has several parts:

0. Ensure resource limits are placed appropriately.

  OpenBSD's login.conf limits help a lot here: they keep a single app
from exhausting all of the system's resources.

  Unfortunately, the defaults are far too conservative for an average
server installation these days.

  For example, the default ":openfiles-cur=512:" in /etc/login.conf means
that each process can have only 512 open file descriptors (it's even worse
by default for daemon processes, at ":openfiles-cur=128:").  Each TCP
socket is a file descriptor.  Each log file is a file descriptor.  Each
connection to each upstream is a file descriptor, too.  It follows that,
by default, OpenBSD is incapable of serving 10000 concurrent connections,
because it'll stop accepting connections at something like 50 to 240
connections (if you're proxying each nginx connection to an upstream,
those take a file descriptor as well, cutting the 512 in half to 256, or
the 128 in half to 64).
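
  To make that concrete, a minimal sketch of raising the limit for the
daemon login class (or whichever class your web server runs under); the
numbers are only an illustration, not a recommendation, and the "..."
stands for the entries already present in that class:

    daemon:\
            ...
            :openfiles-cur=1024:\
            :openfiles-max=4096:\
            ...
            :tc=default:

  Run `cap_mkdb /etc/login.conf` afterwards if /etc/login.conf.db exists,
and restart the daemon so it starts under the new class limits.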

  (This is difficult to test in a naive way, because your testing
software, pushing 1000 req/s, might never cross the boundary on concurrent
requests, provided that your caching is good and requests complete quickly
enough, only for the real users in production to quickly discover the
bottleneck: the entire thing breaks after just a dozen users load your
site, with keepalive timeouts hoarding all the available file
descriptors.)

  Can openfiles-cur be increased to 10000 without running afoul of the
kernel limits?  Nope.  The total for the entire system is just 7030 file
descriptors across all the processes per the default `sysctl
kern.maxfiles`.
https://unix.stackexchange.com/questions/104929/does-openbsd-have-a-limit-to-the-number-of-file-descriptors
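
  If you do raise openfiles-cur, the kernel ceiling has to move too.  A
sketch, where 65536 is an arbitrary example value, not a recommendation:

    # one-off, at runtime, as root
    sysctl kern.maxfiles=65536

    # persist across reboots
    echo 'kern.maxfiles=65536' >> /etc/sysctl.conf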

1. Ensure each page generation is super cheap.

  That's where caching comes in.  If 1000 people request the same page /
search string / etc. at the same time, proper caching of the search
results lets you serve everyone a response within 1s, at a cost of less
than $0.0001.  If you use the anubis proof-of-work malware in the same
scenario, you'll be wasting a whole bunch of trees, and, as per the
reports, "clicking" a link from Slack may require folks to wait more than
a minute each, running the pointless anubis proof-of-work, just to receive
the "DDoS'ed" page; not to mention the extra server load of running an
identical search query independently for each request.  A hot search is
faster than a cold one, but it's still FAR slower than a straight nginx
cache.
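
  For reference, a minimal sketch of the kind of nginx cache I mean; the
upstream address, paths, zone name and timings are placeholders, not a
drop-in config:

    # http{} context: a small on-disk cache for rendered pages
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=pages:10m
                     max_size=512m inactive=10m use_temp_path=off;

    server {
        listen 80;

        location / {
            proxy_pass http://127.0.0.1:8080;  # placeholder upstream
            proxy_cache pages;
            proxy_cache_valid 200 301 1m;      # keep good answers for a minute
            proxy_cache_lock on;               # collapse a stampede into one upstream request
            proxy_cache_use_stale updating error timeout;
            add_header X-Cache-Status $upstream_cache_status;
        }
    }

  proxy_cache_lock is what turns 1000 identical requests into a single
query against the backend; everyone else gets the cached copy.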

C.
