I guess I confuse easily... still.

Either I don't understand what it's doing, or I don't understand why
it's doing what it is, or what it's doing is confused.

Sigh.

On 08-Jun-14 14:24, Evan Hunt wrote:
> On Sun, Jun 08, 2014 at 09:45:23AM -0400, Timothe Litt wrote:
>> Consider a continuous stream of queries to a slow server.  For the sake
>> of exposition, assume the incremental adjustment is 1 rather than 5.
>>
>> Named drops the 11th query, but increases the limit.
> It only increases the limit if one of the pending queries for that name
> got an answer back.  If the authoritative server's not responding at all,
> we just carry on dropping queries, but if *is* answering -- just not
> quickly enough to keep up with demand -- then we adjust our queue
> lengths.

Better, but unless this limit/accounting is per-server, it still doesn't
make sense.  And if it is, it ought to be able to add (# dropped) to the
limit rather than sneak up on it.

As described, the resolver threads are per name + record type.  I don't
see why a server would respond quickly to one record type and not
another.  I don't see why one slow responder should control the max
queue depth for all, unless one argues that for the faster ones, the
excess quota wouldn't be used.  That might open a DoS attack: if I
control a slow responder, I can make other resolvers allocate extra
resources...

> The code was written before my time, so I'm only guessing here, but I

Not blaming you :-(  Just trying to understand.

> suspect the idea was to adapt to the situation where you have a fast
> local network and a slow upstream connection.  If we're getting queries
> for popular names faster than we can resolve them, it may make sense to
> buffer more queries.

I suppose - but per name+record type (+class?)?  That does tell us what
requests have been issued - no point in making more than one request to
the same server for the same data.  (Modulo UDP lossage.)

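To pin down what is being argued about, here is a minimal sketch of the adaptive limit as I read Evan's description. It is not named's code. The class `FetchQueue`, its fields, and the starting limit of 10 (inferred from "drops the 11th query") are all assumptions made for illustration, and the `catch_up` flag models the "add (# dropped)" alternative suggested above.

```python
from dataclasses import dataclass


@dataclass
class FetchQueue:
    """Hypothetical per-(name, type) queue of clients waiting on one fetch."""
    limit: int = 10         # assumed starting limit ("drops the 11th query")
    step: int = 1           # 1 for exposition; the thread says 5 in practice
    catch_up: bool = False  # True: grow by the number dropped, not by step
    waiting: int = 0        # clients currently queued on this fetch
    dropped: int = 0        # clients dropped since the last answer

    def on_query(self) -> bool:
        """Queue a client query; return False if it is dropped."""
        if self.waiting >= self.limit:
            self.dropped += 1
            return False
        self.waiting += 1
        return True

    def on_answer(self) -> None:
        """An answer came back: release the waiters and maybe grow the limit."""
        if self.dropped:
            # Only a server that *is* answering earns a longer queue.
            self.limit += self.dropped if self.catch_up else self.step
        self.waiting = 0
        self.dropped = 0


# 15 queries arrive before the slow server answers: 10 queue, 5 are dropped.
q = FetchQueue()
results = [q.on_query() for _ in range(15)]
q.on_answer()
print(results.count(True), q.limit)   # 10 11
```

With `catch_up=True` the same burst takes the limit straight to 15. If the server never answers, `on_answer` is never called and the limit stays where it was, which matches "we just carry on dropping queries".
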
From first principles:

If a (client) request arrives that requires data not in the cache:
  a) If the data has not already been requested, clearly it needs to be
     fetched.  So you probably allocate a cache entry marked 'pending'
     and queue the requesting task.
  b) The client request must either get queued or dropped.
     When should one drop?
      -- If too many client resolver tasks are queued (for any data)
      -- If too many client+internal resolver tasks are queued for
         this *server*
  c) The assumption on dropping is that the client can retry, and will
     hit in the cache.  This must get ugly when the record has a TTL
     of 0 (or < client retry interval).

If the upstream physical connection (pipe) is slow, it doesn't matter
where the query goes - if we send a lot, it will get saturated.
Coalescing duplicate queries makes sense, up to some memory limit.  But
why would it be limited per-name with a globally tuned throttle?

If we usually request from more than one NS, a slow pipe might benefit
from dropping back to less parallelism when congested.  But this isn't
the right measurement point for that.

If the pipe is fast, but some *server* is slow, we can expect all
responses from *that server* to be equally slow, no matter what the
name/type/class is.  So in that case, we still want to coalesce the
requests if quota allows.  But the quota would want to be 'outstanding
requests to *that server*', not something per-name+type.

As for popular vs. unpopular names - we only know that a name is popular
because the queue grows.  (Or maybe we could have some cache history if
this is a ttl expired(ing prefetch) case.)  The only sensible thing one
might do is drop requests for unpopular names if a popular one hit a
limit... on the theory that limited resources should go to satisfy the
most requests.

>> For the former, a global threshold makes some sense - an abusive burst
>> of queries can be for multiple zones - or focused on one.
>> But isn't this what response rate limiting is for?
>> Given RRL, does this still make sense?
> RRL is response rate limiting -- it applies to output not input.

Isn't it pretty much the same thing?  Dropping a request is the same as
dropping a response - from the client's point of view.  It's better from
named's point of view, since it can avoid internal work.

RRL claims to limit based on 'lots of packets with similar source
addresses asking for similar or identical information'.  Isn't that
sufficient to prevent this case?

To make an internal resolver query, there must be a client request,
right?  So limiting the client requests ought to prevent unreasonable
numbers of internal resolver queries... I think.

If not, Request Rate Limiting has the same acronym, and RRL.current
could reasonably extend to cover this case.  If a given client is
creating an unacceptable request rate, the requests could be dropped.
Some people do this with iptables in front of named.  Named could be
smarter - e.g. drop duplicate requests for the same data from a
client(range) at an interval less than TTL / n.  Or something like that.

>> For the latter, separating the measurement/threshold tuning from the
>> decision to drop would seem to produce more sensible behavior than
>> dropping every 5i-th packet.  And for it to make any sense at all, it
>> must be adjusted per server, not globally...
> As it happens I'm in the middle of a research project on this very
> subject; future releases will probably have some additional per-server
> throttling and holddown controls and more finely adjustable drop
> policies.  Stay tuned.

Of course the 'servers' in this case have to be dynamically discovered;
they're not 'servers' in the sense of 'my masters and slaves', they're
servers in the sense of other authoritative servers consulted for
resolving non-local names.
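
For concreteness, this is roughly the accounting I have in mind: duplicates coalesce onto one pending fetch per name+type, while the quota counts outstanding fetches per dynamically discovered server. It is a sketch only. `Resolver`, `per_server_quota` and `max_waiters` are hypothetical names, not named options, and real code would also need timeouts and retries.

```python
from collections import defaultdict


class Resolver:
    """Hypothetical sketch: coalesce duplicate fetches, quota per server."""

    def __init__(self, per_server_quota=3, max_waiters=100):
        self.per_server_quota = per_server_quota  # assumed tunable
        self.max_waiters = max_waiters            # assumed global memory cap
        self.pending = {}                    # (name, rtype) -> (server, clients)
        self.outstanding = defaultdict(int)  # server -> fetches in flight
        self.total_waiters = 0

    def request(self, client, name, rtype, server):
        key = (name, rtype)
        if self.total_waiters >= self.max_waiters:
            return "dropped"            # too many clients queued overall
        if key in self.pending:
            self.pending[key][1].append(client)
            self.total_waiters += 1
            return "coalesced"          # same data is already being fetched
        if self.outstanding[server] >= self.per_server_quota:
            return "dropped"            # this *server* is the bottleneck
        # Servers are discovered on first use; no static configuration.
        self.pending[key] = (server, [client])
        self.outstanding[server] += 1
        self.total_waiters += 1
        return "fetching"

    def answer(self, name, rtype):
        """A fetch completed: free the server slot, return waiting clients."""
        server, clients = self.pending.pop((name, rtype))
        self.outstanding[server] -= 1
        self.total_waiters -= len(clients)
        return clients


r = Resolver(per_server_quota=2)
print(r.request("c1", "a.example", "A", "ns1"))  # fetching
print(r.request("c2", "a.example", "A", "ns1"))  # coalesced
print(r.request("c3", "b.example", "A", "ns1"))  # fetching
print(r.request("c4", "c.example", "A", "ns1"))  # dropped
print(r.request("c5", "c.example", "A", "ns2"))  # fetching
```

The slow server ns1 fills its own quota without affecting ns2, and a popular name costs one fetch however many clients ask for it.
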
Be careful with the vocabulary (doc and config items) to avoid adding
to my confusion :-)

>> Or I'm missing something, in which case the documentation needs some
>> more/different words :-(
> If the above was helpful and you feel inspired to rephrase it into
> text for the ARM, I'm always happy to take your patches. :)
>

Perhaps when I'm unconfused.

_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users