Willy Tarreau said the following on 02/20/2014 08:44 PM:
[CUT]
> Or you can also use this old script I used many years ago to detect such 
> issues :
> 
> #!/bin/bash
> old=$(date +%s);
> while : ; do
>   new=$(date +%s);
>   if [ $new -lt $old ]; then
>     echo "Time jumps backwards : $old => $new"
>   elif [ $new -gt $((old+1)) ]; then
>     echo "Time jumps forwards  : $old => $new"
>   fi
>   old=$new
> done
> 
I had this running for a few minutes - while I was able to reproduce the
408 issue.. and it echo'ed nothing. I do monitor time and normally catch
timejumps (I didn't even have to have tinker panic 0 set in ntpd.conf).

[CUT]
> I just checked, and the two only situations where we may report a 408 are
>   - when detecting a timeout waiting for a request ;
>   - when detecting a timeout waiting for a body in "balance url_param 
> check-post"
> 
> But only the former will return "cR". So there's no risk of inheritating this
> from somewhere else.
> 
> The condition to enter this block is the following (proto_http.c:2486) :
> 
>       if (req->flags & CF_READ_TIMEOUT || tick_is_expired(req->analyse_exp, 
> now_ms))
> 
> So I'm seeing two possibilities :
>   - the CF_READ_TIMEOUT flag was present on the channel
>   - the http-request delay was really expired.
> 
> Since the logs showed that the delay was very short, either it's the first
> case, or we're facing a case there analyse_exp is not properly set. This
> expiration is set from "timeout http-request" or from "timeout 
> http-keep-alive"
> when processing the second and next requests. So here I don't think it can be
> the issue.
> 
> Thus in my opinion, the only remaining possibility is that the CF_READ_TIMEOUT
> flag is set. It is normally set if the read timeout is reached. So here it 
> will
> be "timeout client". But that doesn't seem to make much sense, especially if
> you manage to reproduce it easily.
> 

last time it took me ~20 reloads, in a new firefox, of the page to
reproduce it.. but on IE (on a buddys machine) it reproduces more easily.

> One thing I'm noticing is that this flag is not cleared between two 
> consecutive
> requests over the same connection. So we could imagine a scenario where a read
> timeout is detected on the first request, but an event arrives just in time, 
> at
> the exact same moment, resulting in the data still being processed while the
> flag is set. The request completes and reinitializes to handle the next 
> request
> over the connection, but the flag is not cleared. Thus the second request 
> would
> be the victim of this inherited timeout. It seems a bit far-fetched, but it
> could explain what you're seeing.
> 

> Could you please add the following patch to proto_http.c (1.5.22) ? It will 
> log
> in debug level a few flags and timers to try to better find what is happeneing
> when this happens. You'll need that your syslog deamon logs debug level 
> entries,
> maybe they're stored in a differen file (eg: /var/log/debug). Otherwise you 
> can
> change the level LOG_DEBUG for something else such as LOG_INFO. Warning, I 
> copy
> pasted both the patches below, so you'll have to copy the lines or they won't
> apply due to mangled spaces.
> 
I'll apply the patch and build a new rpm.. will return back later today.

Thank you very much for your assistance.

-- 
Regards,
Klavs Klavsen, GSEC - k...@vsen.dk - http://www.vsen.dk - Tlf. 61281200

"Those who do not understand Unix are condemned to reinvent it, poorly."
  --Henry Spencer


Reply via email to