Here's this morning's failure. For testing, I've set http-timeout=67. I am
trying to see if the zero window is a coincidence or not.

The debug output:
Reading 8192 bytes of response body.
Got 8192 bytes.
XML: Parsing 8192 bytes.
XML: char-data (235) returns 0
XML: xmlParseChunk returned 0
Reading 8192 bytes of response body.
Got 6180 bytes.
XML: Parsing 6180 bytes.
XML: char-data (235) returns 0
XML: xmlParseChunk returned 0
Reading 8192 bytes of response body.
Aborted request (-3): Could not read response body
sess: Closing connection.
sess: Connection closed.
Request ends, status 200 class 2xx, error line:
Could not read response body: connection was closed by server
Running destroy hooks.
Request ends.
svn: E175002: REPORT of '/svn/repos/deckard-65x/!svn/me': Could not read
response body: connection was closed by server (http://foundry51.qnx.com)
sess: Destroying session.
sess: Destroying session.

The network capture:
391174    498.610439000    10.222.3.88    10.14.11.50    HTTP    1434
Continuation or non-HTTP traffic
391175    498.610495000    10.14.11.50    10.222.3.88    TCP    54    37480
> http [ACK] Seq=1945767 Ack=480574851 Win=1280 Len=0
391176    498.842179000    10.222.3.88    10.14.11.50    HTTP    1334
[TCP Window Full] Continuation or non-HTTP traffic
391177    499.042264000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391178    499.297972000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391179    499.298065000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391180    499.796762000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391181    499.796811000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391182    500.778349000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391183    500.778405000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391184    502.728191000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391185    502.728260000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391186    506.614031000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391187    506.614093000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391188    514.371638000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391189    514.371694000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391190    529.872211000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391191    *REF*    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391192    30.988601000    10.222.3.88    10.14.11.50    TCP    60    [TCP
Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0
391193    30.988645000    10.14.11.50    10.222.3.88    TCP    54    [TCP
ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0
391194    71.282655000    10.14.11.50    10.222.3.88    TCP    54    [TCP
Window Update] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=1171456
Len=0
391195    71.296598000    10.222.3.88    10.14.11.50    HTTP    1434
Continuation or non-HTTP traffic
which results in:
392598    71.906863000    10.222.3.88    10.14.11.50    HTTP    1434
Continuation or non-HTTP traffic
392599    71.906925000    10.14.11.50    10.222.3.88    TCP    54    37480
> http [ACK] Seq=1945767 Ack=482310892 Win=987904 Len=0
392600    72.043276000    10.14.11.50    10.222.3.88    TCP    54    [TCP
Window Update] 37480 > http [ACK] Seq=1945767 Ack=482310892 Win=1188096
Len=0
392601    72.375325000    10.14.11.50    10.222.3.88    TCP    54    37480
> http [FIN, ACK] Seq=1945767 Ack=482310892 Win=1188096 Len=0
392602    72.388690000    10.222.3.88    10.14.11.50    TCP    60    http >
37480 [ACK] Seq=482310892 Ack=1945768 Win=908288 Len=0

There are two interesting things:
The long gap after the *REF* packet is unusual, it's the first time I've
seen that.
The FIN packet is 72 seconds later, not 67 like I set the timeout. OTOH,
it's close, and, given the long gap no packets occur at FIN - 67 seconds.

I'm going to keep trying random timeout values.

...Stu


On Tue, Jan 14, 2014 at 9:27 AM, Stuart MacDonald
<stuartm.cod...@gmail.com>wrote:

> I don't have access to the server log, but you can see from the wireshark
> capture that the *client* closed the connection (look for the RST packet),
> not the server. The error of "socket closed" is consistent with that.
>
> I suspect that there's an issue handling the "http-timeout" timer; it gets
> triggered during a TCP zero window situation, and then not canceled
> properly, and then it fires, closing the socket.
>
> That would also fit the data of the bug being in 1.6 (reported by someone
> else, but I linked to it in my original post), 1.7 and 1.8; it's a bug in
> svn's handling of the underlying HTTP interaction layer, and not in that
> layer or the server or somewhere else.
>
> I could be wrong. When I look at the evidence so far, that's what I see.
>
> ...Stu
>
>
> On Tue, Jan 14, 2014 at 6:33 AM, Philip Martin <philip.mar...@wandisco.com
> > wrote:
>
>> stuartm.cod...@gmail.com writes:
>>
>> > I have another failure, with some debug info, and some updates.
>> >
>> > We're switching to a different system for this particular repository,
>> so I
>> > may or may not be able to continue debug efforts. :-(
>> >
>> > Since there's no debug info available for the svn 1.8 client, I haven't
>> > been running that.
>> >
>> > I went a full week without a failure, which I suspect is due to the
>> > enormous amount of debugging that neon-debug-mask=255 produces, so this
>> > week I dialed it back to 127 and got a failure on the first try.
>> >
>> > Here's the latest failure. First up, the debug info:
>> > Got 8192 bytes.
>> > XML: Parsing 8192 bytes.
>> > XML: char-data (235) returns 0
>> > XML: xmlParseChunk returned 0
>> > Reading 8192 bytes of response body.
>> > Got 6600 bytes.
>> > XML: Parsing 6600 bytes.
>> > XML: char-data (235) returns 0
>> > XML: xmlParseChunk returned 0
>> > Reading 8192 bytes of response body.
>> > Aborted request (-3): Could not read response body
>>
>> That message comes from ne_request.c and the -3 is NE_SOCK_CLOSED
>> defined in ne_socket.h:
>>
>> #define NE_SOCK_CLOSED (-3)
>> /* Connection was reset (e.g. server crashed) */
>>
>> > sess: Closing connection.
>> > sess: Connection closed.
>> > Request ends, status 200 class 2xx, error line:
>> > Could not read response body: connection was closed by server
>> > Running destroy hooks.
>> > Request ends.
>> > svn: E175002: REPORT of '/svn/repos/<name>/!svn/me': Could not read
>> > response body: connection was closed by server (http://<server>)
>> > sess: Destroying session.
>> > sess: Destroying session.
>>
>> What does the server error log show?
>>
>> --
>> Philip Martin | Subversion Committer
>> WANdisco // *Non-Stop Data*
>>
>
>

Reply via email to