On Thu, 29 Jan 2026 at 12:16, Christian Schulte <[email protected]> wrote:
> What about something like this in /etc/pf.conf?
>
> source limiter "default" id 1 entries 100 limit 1 rate 1/10
> pass in on egress from any to any source limiter "default"
>
> Just change 100, 1 and 1/10 as required. I somehow doubt a real human
> will need to create more than one state every 10 seconds.

The C10K problem has long been solved by software like nginx. The
issue here is that many of these bots are distributed and use proxies
to make their requests, so limits based on the IP address are not
effective.

The way to address the issue hasn't changed since the C10K problem was
first formulated in 1999:

* If you have to spawn a separate process for each request, you'll
  simply never have enough capacity to serve all the requests. How's
  that AI's fault when your software was always inefficient at scale?

The way to solve it is multifold:

0. Ensure resource limits are set appropriately. OpenBSD's login.conf
limits help a lot here, keeping a single app from exhausting the
entire system's resources. Unfortunately, the defaults are far too
conservative for an average server installation these days.

For example, the default ":openfiles-cur=512:" in /etc/login.conf
means each process can only have 512 file descriptors (it's even worse
by default for daemon processes, at ":openfiles-cur=128:"). Each TCP
socket is a file descriptor. So is each log file, and so is each
connection to each upstream.

It follows that, by default, OpenBSD is incapable of serving 10000
concurrent connections: it'll stop accepting connections at something
like 50 to 240 of them (if you're proxying each nginx connection to an
upstream, each of those takes a file descriptor too, cutting the 512
in half to 256, or the 128 in half to 64).

(This is difficult to test in a naive way: a load test pushing
1000 req/s might never cross the boundary on concurrent requests,
provided your caching is good and requests complete quickly enough,
only for the real users in production to discover the bottleneck when
the entire thing breaks after just a dozen of them load your site,
with keepalive timeouts hoarding all the available file descriptors.)

Can openfiles-cur simply be raised to 10000 without running afoul of
the kernel limits? Nope. The default `sysctl kern.maxfiles` caps the
entire system at just 7030 file descriptors across all processes, so
that has to be raised as well.

https://unix.stackexchange.com/questions/104929/does-openbsd-have-a-limit-to-the-number-of-file-descriptors

(A login.conf / sysctl sketch follows at the end of this mail.)

1. Ensure each page generation is super cheap. That's where caching
comes in: if 1000 people request the same page / search string / etc.
at the same time, proper caching of the search results lets you serve
everyone within 1s, at a cost of less than $0.0001. If you use the
anubis proof-of-work malware in the same scenario, you'll be wasting a
whole bunch of trees, and, as per the reports, "clicking" a link from
Slack may mean each person waits more than a minute running the
pointless proof-of-work just to receive the supposedly "DDoS'ed" page;
not to mention the extra server load of running an identical search
query independently for each request. A hot search is faster than a
cold one, but it's still FAR slower than a straight nginx cache. (An
nginx cache sketch follows at the end as well.)

C.
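
For point 0, roughly -- a sketch only, with example numbers rather
than recommendations; size them for your own box and workload, and
check which login class your daemon actually runs under:

    # In /etc/login.conf, inside the existing "daemon" class entry
    # (or whatever class applies to your daemon), raise the
    # descriptor limits, e.g.:
    #
    #     :openfiles-cur=1024:\
    #     :openfiles-max=4096:\
    #
    # Run cap_mkdb /etc/login.conf afterwards if /etc/login.conf.db
    # exists, and restart the daemon so it picks up the new limits.

    # /etc/sysctl.conf -- raise the system-wide descriptor ceiling so
    # the per-process limits above can actually be used; apply it
    # right away with `sysctl kern.maxfiles=65536`
    kern.maxfiles=65536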

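For point 1, a minimal nginx cache sketch -- the zone name, sizes,
timings, cache path and the 127.0.0.1:8080 backend are placeholders
for whatever actually generates the search results:

    # /etc/nginx/nginx.conf (sketch)

    worker_rlimit_nofile 10240;        # let workers use the raised fd limit

    events {
        worker_connections 4096;
    }

    http {
        # the default proxy_cache_key includes the query string, so
        # each distinct search string gets its own cache entry
        proxy_cache_path /var/www/cache levels=1:2 keys_zone=search:10m
                         max_size=256m inactive=10m use_temp_path=off;

        server {
            listen 80;

            location /search {
                proxy_pass http://127.0.0.1:8080;  # the search backend
                proxy_cache search;
                proxy_cache_valid 200 1m;          # repeat queries come from the cache
                proxy_cache_lock on;               # 1000 identical requests -> 1 upstream hit
                proxy_cache_use_stale updating error timeout;
            }
        }
    }

With proxy_cache_lock and a short proxy_cache_valid, a burst of
identical searches costs one backend query instead of a thousand,
which is the whole point.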
