Any update on this? Is this now confirmed as a bug?

> -----Original Message-----
> From: Marc
> Sent: Monday, 31 March 2025 10:37
> To: [email protected]
> Subject: RE: [MariaDB discuss] Re: possible bug in dropping max connections
>
> Can you put this finally on the agenda? It is becoming an annoying
> issue.
>
> > If the server will delay enforcing of max_connections (that is,
> > the server will not reject connections above max_connections at
> > once), then this user in the above scenario will open all possible
> > connections your OS can handle and the computer will become
> > completely inaccessible.
> >
> > The idea about this change is to have a more useful and expected
> > implementation of max_user_connections and max_connections.
> > Currently I am using max_connections not for what it is supposed to
> > be used, just because max_user_connections is not doing as much as
> > it 'should'.
> >
> > Hi Sergei, is this something you are going to look into? I am also
> > curious about this delay between the first packet and the packet
> > with the username. I can't imagine that being such a problem; to me
> > this looks feasible currently.
> >
> > I'm afraid, I don't understand your use case.
> >
> > There are, basically, three limits now: max_user_connections,
> > max_connections, OS limit.
> >
> > An ordinary user would connect many times, hit max_user_connections
> > and stop. Or will keep connecting and get disconnects because of
> > max_user_connections.
> >
> > A malicious user would connect and wouldn't authenticate, this will
> > exhaust max_connections and nobody will be able to connect to the
> > server anymore. max_user_connections won't help here.
> >
> > Let me explain it a different way. This doesn't map directly to what
> > I understand Marc's use-case to be, but it relates, and I reckon it
> > is not a bad compromise, because I get where Marc is coming from.
> > We've had some interesting experiences with a remote party
> > effectively DoS'ing themselves from connecting to one of our haproxy
> > instances as well recently (https), so not directly related.
> >
> > TCP connection establishment is phase one. This is limited by
> > operating system receive queues (yes, two of them, one for SYN_RECV,
> > and one for ESTABLISHED but not yet accept()ed), but in the case of
> > Linux the SYN_RECV queue can be exceeded if configured to use SYN
> > cookies. Bad idea? Possibly, as it prevents TCP options from being
> > used, but it does allow a connection to be established at all in
> > case of a SYN flood, so IMHO switching to SYN cookies once the
> > SYN_RECV queue is full is a good idea, since a degraded but working
> > connection is significantly better than no connection at all. This
> > isn't MariaDB specific, nor does it relate to Marc's request, but it
> > does give some level of background. It's the same underlying issue,
> > just at a different layer.
> >
> > Once MariaDB accept()s the connection I understand MariaDB counts it
> > against max_connections. If max_connections is then exceeded the new
> > connection is dropped. This can trivially deny service to legitimate
> > well-behaved clients.
> >
> > This provides for a very, very simple DoS situation. Simply open a
> > connection from a remote side, and never send anything. Eventually
> > MariaDB will close this connection (not sure how long this would
> > take), dropping the connection count again, and only now can
> > legitimate users connect again. As I understand, this is what Marc
> > is experiencing. There are many reasons why this could happen under
> > *normal operations*, but you're right, this is a "badly behaving
> > client". You're also right that not limiting this pre-auth would
> > just move the problem to operating system limits.
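
The scenario described above can be illustrated with a toy model. This is only a sketch of the counting behaviour as understood in this thread, not MariaDB code; the class ToyServer and its methods are hypothetical names made up for the illustration.

```python
class ToyServer:
    """Toy model: every accepted connection counts against max_connections,
    whether or not it has authenticated yet (an assumption taken from the
    thread, not from the MariaDB source)."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.connections = {}   # conn_id -> True once authenticated
        self._next_id = 0

    def accept(self):
        """Return a new connection id, or None if the limit is reached."""
        if len(self.connections) >= self.max_connections:
            return None
        conn_id = self._next_id
        self._next_id += 1
        self.connections[conn_id] = False
        return conn_id

    def close(self, conn_id):
        """Free the slot, e.g. when the server gives up on an idle client."""
        self.connections.pop(conn_id, None)


server = ToyServer(max_connections=3)

# A client opens three connections and never sends anything.
idle = [server.accept() for _ in range(3)]

# A legitimate client is now turned away although nobody has logged in.
rejected = server.accept()

# Only after the server closes one of the idle connections is there room.
server.close(idle[0])
accepted = server.accept()

print(rejected, accepted)
```

In this model the legitimate client is refused until one of the idle connections is closed, which is the window being complained about.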
> >
> > Our use-cases are controlled in most cases. We have one case where
> > unfortunately MariaDB needs to be world-exposed, and we've got no
> > way around that, so this would apply to us here as well. Fortunately
> > we have other mechanisms in place to rate limit how fast untrusted
> > sources can connect, which helps to mitigate this. I think one could
> > also front this with a tool like haproxy, which can be configured to
> > close the connection if the client side doesn't send something
> > within the first X ms, which could be protection layer two.
> >
> > That said, I agree with Marc that the situation can be improved on
> > the MariaDB side. He's worried about a mix of good and bad actors
> > from the same IP address; our use-case is from different IPs, but it
> > is the same underlying problem. MariaDB can (in my opinion) help in
> > both cases.
> >
> > I would suggest having a separate max_unauthenticated_connections
> > counter, and an authenticate_timeout variable (no more than 2s for
> > most use-cases, and I can't imagine this should be higher than 5s in
> > any situation).
> >
> > I would probably run with something like:
> >
> > max_connections = 5000
> > max_user_connections = 250
> > max_unauthenticated_connections = 500
> >
> > Combining this with firewall rate-limits from untrusted sources, one
> > can then get a fairly protected setup. We normally do burst 100, max
> > 1/s connections by default; with a 5s timeout on the MariaDB side
> > this permits a "bad player" a maximum of 100 connections initially,
> > but over time a maximum of 5 connections to DoS with, per source IP.
> > So even with my suggestion one can run into trouble, but at least it
> > makes it harder.
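
The arithmetic behind those numbers can be checked with a small simulation. This is a sketch under simplifying assumptions: time advances in whole seconds, the firewall is modelled as a plain token bucket, and the attacker opens an idle connection for every token available. The function name and parameters are made up for the illustration.

```python
from collections import deque


def simulate_idle_attacker(burst, rate_per_s, auth_timeout_s, duration_s):
    """Return the number of open unauthenticated connections at each second."""
    tokens = burst          # the bucket starts full
    open_until = deque()    # expiry times of open connections, oldest first
    history = []
    for now in range(duration_s):
        # Server side: drop connections whose auth timeout has passed.
        while open_until and open_until[0] <= now:
            open_until.popleft()
        # Attacker side: spend every available token on a new idle connection.
        for _ in range(tokens):
            open_until.append(now + auth_timeout_s)
        history.append(len(open_until))
        # Firewall side: the bucket is empty now, refill it for the next second.
        tokens = min(burst, rate_per_s)
    return history


history = simulate_idle_attacker(burst=100, rate_per_s=1,
                                 auth_timeout_s=5, duration_s=60)
print(history[0], max(history), history[-1])
```

With burst 100, 1/s and a 5s timeout, the model starts at 100 open connections, briefly reaches 104 because refills continue until the first timeouts fire, and then settles at rate times timeout, i.e. 5 connections per source IP.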
> > One could adjust the relevant rate limits to something like 1/min
> > with a burst of 500, or have small + large buckets where connections
> > over time have to pass through both, but this gets complicated, and
> > if someone wants to DoS you that desperately, there honestly isn't
> > much you're going to do, but see below.
> >
> > Once max_unauthenticated_connections is reached, I can think of two
> > possible strategies:
> >
> > 1. Drop the new connection.
> > 2. Drop the connection we've been waiting for the longest to auth.
> >
> > Each has pros and cons. Possibly a third option would be to drop the
> > connection we've been waiting for longest from the same source, else
> > revert to 1 or 2.
> >
> > In this scenario, I don't think it matters significantly whether
> > unauthenticated connections count towards max_connections or not,
> > but my gut would go towards not.
> >
> > To further mitigate the multiple sources it would be great if we
> > could get logs specifically for authentication results, i.e., for
> > each incoming connection log exactly one line indicating the source
> > IP and the auth result, e.g.:
> >
> > Connection auth result: [email protected] accepted.
> > Connection auth result: a.b.c.d timed out.
> > Connection auth result: [email protected] auth failed.
> >
> > Of course a.b.c.d could also be IPv6 dead::beef, as the case may be.
> >
> > One can then feed this into fail2ban or similar to mitigate further.
> > There might be a way to log this already that I just haven't found
> > yet; I've only spent a very superficial amount of time looking for
> > this. Given this, we can adjust rate limits to higher limits once a
> > successfully authenticated connection happens, say what our current
> > defaults are, and run with even lower defaults than current (say
> > burst 10, 1/min or something).
> > Too many auth failures or timeouts in the absence of successful
> > auth can be used to outright ban source IPs for some time.
> >
> > After your suggestion of delayed max_connections check - an ordinary
> > user would still connect max_user_connections times, nothing would
> > change for him. A malicious user, not stopped by max_connections
> > anymore, would completely exhaust OS capability for opening new
> > connections making the whole OS inaccessible.
> >
> > Bingo. You're spot on. But the current mechanism does allow for a
> > very effective and trivial denial of service on any remote server.
> >
> > That's what I mean - I don't understand your use case. It doesn't
> > change much if all users behave and it makes the situation much
> > worse if a user is malicious. So, in what use case would your change
> > be an improvement?
> >
> > I hope the above helped.
> >
> > Kind regards,
> > Jaco
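
The two strategies suggested above for when max_unauthenticated_connections is reached can be sketched as follows. Note that max_unauthenticated_connections and authenticate_timeout are proposals from this thread, not existing MariaDB variables, and the class PreAuthGate is a hypothetical name used only for this illustration. Time is passed in explicitly to keep the example deterministic.

```python
from collections import OrderedDict


class PreAuthGate:
    """Tracks connections that were accepted but have not authenticated yet.

    Assumes max_unauthenticated >= 1 and non-decreasing values of `now`.
    """

    def __init__(self, max_unauthenticated, authenticate_timeout_s, evict_oldest):
        self.max_unauthenticated = max_unauthenticated
        self.authenticate_timeout_s = authenticate_timeout_s
        self.evict_oldest = evict_oldest  # True: strategy 2, False: strategy 1
        self.pending = OrderedDict()      # conn_id -> accept time, oldest first

    def expire(self, now):
        """Return ids of pending connections whose auth timeout has passed."""
        dropped = []
        while self.pending:
            conn_id, accepted_at = next(iter(self.pending.items()))
            if now - accepted_at < self.authenticate_timeout_s:
                break  # everything after this one is newer
            del self.pending[conn_id]
            dropped.append(conn_id)
        return dropped

    def on_accept(self, conn_id, now):
        """Register a new connection; return the ids the caller should close."""
        to_close = self.expire(now)
        if len(self.pending) >= self.max_unauthenticated:
            if not self.evict_oldest:
                return to_close + [conn_id]       # strategy 1: drop the newcomer
            oldest, _ = self.pending.popitem(last=False)
            to_close.append(oldest)               # strategy 2: drop longest waiter
        self.pending[conn_id] = now
        return to_close

    def on_authenticated(self, conn_id):
        """Authenticated connections stop counting against the pre-auth limit."""
        self.pending.pop(conn_id, None)


gate = PreAuthGate(max_unauthenticated=2, authenticate_timeout_s=5.0,
                   evict_oldest=True)
gate.on_accept("a", now=0.0)
gate.on_accept("b", now=1.0)
closed_on_overflow = gate.on_accept("c", now=2.0)  # "a" has waited longest
gate.on_authenticated("b")
closed_on_timeout = gate.expire(now=7.0)           # "c" never authenticated
print(closed_on_overflow, closed_on_timeout)
```

The third option mentioned above (prefer the longest-waiting connection from the same source) would need the source address stored alongside the accept time; it is left out to keep the sketch short.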
