Willy Tarreau said the following on 02/20/2014 08:44 PM: [CUT] > Or you can also use this old script I used many years ago to detect such > issues : > > #!/bin/bash > old=$(date +%s); > while : ; do > new=$(date +%s); > if [ $new -lt $old ]; then > echo "Time jumps backwards : $old => $new" > elif [ $new -gt $((old+1)) ]; then > echo "Time jumps forwards : $old => $new" > fi > old=$new > done > I had this running for a few minutes - while I was able to reproduce the 408 issue.. and it echo'ed nothing. I do monitor time and normally catch timejumps (I didn't even have to have tinker panic 0 set in ntpd.conf).
[CUT] > I just checked, and the two only situations where we may report a 408 are > - when detecting a timeout waiting for a request ; > - when detecting a timeout waiting for a body in "balance url_param > check-post" > > But only the former will return "cR". So there's no risk of inheritating this > from somewhere else. > > The condition to enter this block is the following (proto_http.c:2486) : > > if (req->flags & CF_READ_TIMEOUT || tick_is_expired(req->analyse_exp, > now_ms)) > > So I'm seeing two possibilities : > - the CF_READ_TIMEOUT flag was present on the channel > - the http-request delay was really expired. > > Since the logs showed that the delay was very short, either it's the first > case, or we're facing a case there analyse_exp is not properly set. This > expiration is set from "timeout http-request" or from "timeout > http-keep-alive" > when processing the second and next requests. So here I don't think it can be > the issue. > > Thus in my opinion, the only remaining possibility is that the CF_READ_TIMEOUT > flag is set. It is normally set if the read timeout is reached. So here it > will > be "timeout client". But that doesn't seem to make much sense, especially if > you manage to reproduce it easily. > last time it took me ~20 reloads, in a new firefox, of the page to reproduce it.. but on IE (on a buddys machine) it reproduces more easily. > One thing I'm noticing is that this flag is not cleared between two > consecutive > requests over the same connection. So we could imagine a scenario where a read > timeout is detected on the first request, but an event arrives just in time, > at > the exact same moment, resulting in the data still being processed while the > flag is set. The request completes and reinitializes to handle the next > request > over the connection, but the flag is not cleared. Thus the second request > would > be the victim of this inherited timeout. It seems a bit far-fetched, but it > could explain what you're seeing. > > Could you please add the following patch to proto_http.c (1.5.22) ? It will > log > in debug level a few flags and timers to try to better find what is happeneing > when this happens. You'll need that your syslog deamon logs debug level > entries, > maybe they're stored in a differen file (eg: /var/log/debug). Otherwise you > can > change the level LOG_DEBUG for something else such as LOG_INFO. Warning, I > copy > pasted both the patches below, so you'll have to copy the lines or they won't > apply due to mangled spaces. > I'll apply the patch and build a new rpm.. will return back later today. Thank you very much for your assistance. -- Regards, Klavs Klavsen, GSEC - k...@vsen.dk - http://www.vsen.dk - Tlf. 61281200 "Those who do not understand Unix are condemned to reinvent it, poorly." --Henry Spencer