Can you finally put this on the agenda? It is becoming an annoying issue.
>>> If the server will delay enforcing of max_connections (that is, the
>>> server will not reject connections above max_connections at once),
>>> then this user in the above scenario will open all possible
>>> connections your OS can handle and the computer will become
>>> completely inaccessible.
>>
>> The idea about this change is to have a more useful and expected
>> implementation of max_user_connections and max_connections. Currently
>> I am using max_connections not for what it is supposed to be used for,
>> just because max_user_connections is not doing as much as it "should".
>>
>> Hi Sergei, is this something you are going to look into? I am also
>> curious about this delay between the first packet and the packet with
>> the username. I can't imagine that being such a problem; to me this
>> currently looks feasible.
>
> I'm afraid I don't understand your use case.
>
> There are, basically, three limits now: max_user_connections,
> max_connections, and the OS limit.
>
> An ordinary user would connect many times, hit max_user_connections and
> stop. Or would keep connecting and get disconnects because of
> max_user_connections.
>
> A malicious user would connect and not authenticate; this will exhaust
> max_connections and nobody will be able to connect to the server
> anymore. max_user_connections won't help here.

Let me explain it a different way. This doesn't relate directly to what I
understand Marc's use case to be, but it is related, and I reckon it is
not a bad compromise, because I get where Marc is coming from. We've also
had some interesting experiences recently with a remote party effectively
DoS'ing themselves out of connecting to one of our haproxy instances
(https), so not directly related either.

TCP connection establishment is phase one.
This is limited by operating system receive queues (yes, two of them: one
for SYN_RECV, and one for ESTABLISHED but not yet accept()ed), but in the
case of Linux the SYN_RECV queue can be exceeded if configured to use SYN
cookies. Bad idea? Possibly, as it prevents TCP options from being used,
but it does allow a connection to be established at all in the case of a
SYN flood. So, IMHO, switching to SYN cookies once the SYN_RECV queue is
full is a good idea, since a degraded but working connection is
significantly better than no connection at all. This isn't MariaDB
specific, nor does it relate to Marc's request, but it does give some
level of background. It's the same underlying issue, just at a different
layer.

Once MariaDB accept()s the connection, I understand MariaDB counts it
against max_connections. If max_connections is then exceeded, the new
connection is dropped. This can trivially deny service to legitimate,
well-behaved clients.

This provides for a very, very simple DoS situation. Simply open a
connection from the remote side and never send anything. Eventually
MariaDB will close this connection (not sure how long this would take),
dropping the connection count again, and only then can legitimate users
connect again. As I understand it, this is what Marc is experiencing.
There are many reasons why this could happen under *normal operations*,
but you're right, this is a "badly behaving client". You're also right
that not limiting this pre-auth would just move the problem to operating
system limits.

Our use cases are controlled in most cases. We have one case where
unfortunately MariaDB needs to be world-exposed, and we've got no way
around that, and this would apply to us here as well. Fortunately we have
other mechanisms in place to rate-limit how fast untrusted sources can
connect, which helps to mitigate this.
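For illustration, per-source rate limiting of new connections can be done
in the firewall. A config sketch using iptables' hashlimit module; the
port, rates, and rule placement are illustrative, not our actual rules:

```shell
# Config sketch (assumed values): allow at most 1 new connection per
# second per source IP to MariaDB, with a burst of 100; drop anything
# above that before the server ever sees it.
iptables -A INPUT -p tcp --dport 3306 --syn \
    -m hashlimit --hashlimit-mode srcip --hashlimit-name mdb-new \
    --hashlimit-above 1/second --hashlimit-burst 100 \
    -j DROP
iptables -A INPUT -p tcp --dport 3306 --syn -j ACCEPT
```

Because the state is keyed on srcip, one abusive source cannot consume
another source's budget.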
I think one could also front this with a tool like haproxy, which can be
configured to close the connection if the client side doesn't send
anything within the first X ms, which could be protection layer two.

That said, I agree with Marc that the situation can be improved on the
MariaDB side. He's worried about a mix of good and bad actors from the
same IP address; our use case involves different IPs, but it's the same
underlying problem. MariaDB can (in my opinion) help in both cases.

I would suggest having a separate max_unauthenticated_connections
counter, and an authenticate_timeout variable (no more than 2s for most
use cases, and I can't imagine this should be higher than 5s in any
situation).

I would probably run with something like:

max_connections = 5000
max_user_connections = 250
max_unauthenticated_connections = 500

Combining this with firewall rate limits on untrusted sources, one can
then get a fairly protected setup. We normally do burst 100, max 1/s
connections by default; with a 5s timeout on the MariaDB side this
permits a "bad player" a maximum of 100 connections initially, but over
time at most 5 connections with which to DoS. Per source IP. So even with
my suggestion one can run into trouble, but at least it makes it harder.
One could adjust the relevant rate limits to something like 1/min with a
burst of 500, or have small + large buckets where connections have to
pass through both over time, but this gets complicated, and if someone
wants to DoS you that desperately, there honestly isn't much you're going
to do. But see below.

Once max_unauthenticated_connections is reached, I can think of two
possible strategies:

1. Drop the new connection.
2. Drop the connection we've been waiting on the longest to authenticate.

Each has pros and cons. A possible third option would be to drop the
connection we've been waiting on the longest from the same source, else
revert to 1 or 2.
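The "burst 100, max 1/s, 5s timeout" arithmetic from earlier can be
sanity-checked with a back-of-envelope script; all the numbers are the
assumed example values from this mail, not defaults anywhere:

```shell
#!/bin/sh
# How many unauthenticated connections can a single rate-limited source
# hold open at once? (example values, not real defaults)
RATE_PER_SEC=1     # firewall: sustained new connections/s per source
BURST=100          # firewall: token-bucket burst allowance
AUTH_TIMEOUT=5     # proposed authenticate_timeout in seconds

# Steady state: only connections opened in the last AUTH_TIMEOUT seconds
# are still pending; older ones have been dropped by the server.
STEADY=$((RATE_PER_SEC * AUTH_TIMEOUT))
echo "initial spike per source: $BURST"
echo "steady state per source:  $STEADY"
```

With these numbers the steady state works out to 5 pending connections
per source, which is why a 500-connection max_unauthenticated_connections
budget goes a long way.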
In this scenario I don't think it matters significantly whether
unauthenticated connections count towards max_connections or not, but my
gut says not.

To further mitigate the multiple-sources case, it would be great if we
could get logs specifically for authentication results, i.e., for each
incoming connection log exactly one line indicating the source IP and the
auth result, e.g.:

Connection auth result: [email protected] accepted.
Connection auth result: a.b.c.d timed out.
Connection auth result: [email protected] auth failed.

Of course, a.b.c.d could also be IPv6 dead::beef, as the case may be.

One can then feed this into fail2ban or similar to mitigate further.
There might be a way to log this already that I just haven't found yet;
I've only spent a very superficial amount of time looking for it. Given
this, we can raise the rate limits once a successfully authenticated
connection happens (say, to what our current defaults are) and run with
even lower defaults than current (say burst 10, 1/min or something). Too
many auth failures or timeouts in the absence of a successful auth can be
used to outright ban source IPs for some time.

> After your suggestion of a delayed max_connections check, an ordinary
> user would still connect max_user_connections times; nothing would
> change for him. A malicious user, not stopped by max_connections
> anymore, would completely exhaust the OS capability for opening new
> connections, making the whole OS inaccessible.

Bingo. You're spot on. But the current mechanism does allow for a very
effective and trivial denial of service on any remote server.

> That's what I mean - I don't understand your use case. It doesn't
> change much if all users behave, and it makes the situation much worse
> if a user is malicious. So, in what use case would your change be an
> improvement?
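Coming back to the auth-result logging idea above: a sketch of how such
lines could be scraped for fail2ban or an ipset ban list. The log format
is my proposal (it doesn't exist today), and the addresses are made-up
documentation IPs:

```shell
#!/bin/sh
# Sample of the *proposed* log format (hypothetical, with example IPs).
cat <<'EOF' > /tmp/auth.log
Connection auth result: app@192.0.2.10 accepted.
Connection auth result: 198.51.100.7 timed out.
Connection auth result: bad@203.0.113.9 auth failed.
EOF

# Extract the source addresses of failed or timed-out attempts; the
# optional "user@" prefix is stripped, leaving only the IP to ban.
grep -E '(timed out|auth failed)\.$' /tmp/auth.log |
    sed -E 's/^Connection auth result: ([^ ]*@)?([^ ]+) .*/\2/'
```

This prints one bannable address per line (here 198.51.100.7 and
203.0.113.9), which is exactly the shape a fail2ban filter or a cron'd
ipset update wants.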
I hope the above helped.

Kind regards,
Jaco

_______________________________________________
discuss mailing list -- [email protected]
To unsubscribe send an email to [email protected]
