On 21/09/2020 10:44, Martin Grigorov wrote:
> On Mon, Sep 21, 2020 at 12:08 PM Martin Grigorov <mgrigo...@apache.org>
> wrote:
>> On Mon, Sep 21, 2020 at 11:23 AM Mark Thomas <ma...@apache.org> wrote:

<snip/>

>>> RFC 7540 allows the connection to be closed with a protocol error if the
>>> reset is received "a significant time after sending END_STREAM".
>>>
>>> Tomcat retains state for maxConcurrentStreams + 10% with closed streams
>>> being pruned (oldest first) every 10 new streams.
>>
>> The spec talks about time but Tomcat uses the number of streams.
>> I understand that under normal usage this could be enough "time" but under
>> heavy load this number is reached quickly and the time is short.

Agreed. The closed stream is being pruned within a fraction of a second.
 However, the window where the client could legitimately send RST_STREAM
is also very small. Looking at the timings, I don't think the client is
sending the RST_STREAM within that window.

You get back to the (lack of) definition of significant. I think I could
make a case for significant being anything from a few milliseconds to
several minutes.

>>> It isn't clear at this point why the client is sending the RST_STREAM.
>>>
>>
>> I don't know either. But Vegeta is using the Golang standard libraries, so
>> any Golang application should behave the same way.
>> Also as I investigated last week - Golang/Node.js/Rust/Netty HTTP2 servers
>> do not complain about it.
>>
>>
>>> This normally indicates an error condition. From the description of the
>>> load test I don't see why the client would be sending a RST_STREAM.
>>>
>>> If the RST_STREAM is valid and it isn't received a significant time
>>> after the END_STREAM then Tomcat needs to handle this.
>>>
>>> If the RST_STREAM is valid but received a significant amount of time
>>> after the END_STREAM that would be a client error would be a client error.
>>>
>>> Of course, significant isn't defined in the specification.
>>>
>>> Any special handling for RST_STREAM also needs to be applied to
>>> WINDOW_UPDATE. It also needs to differentiate between a frame received
>>> for a past closed stream and a frame received for an stream that never
>>> existed.
>>>
>>> I think the key question here is why is the client sending the
>>> RST_STREAM in the first place. Is Tomcat doing something it shouldn't /
>>> not doing something it should to cause this? If so, we need to fix the
>>> root cause rather than tackle the symptom.
>>>
>>> There are a couple of options for debugging this to see why the client
>>> is sending the RST_STREAM:
>>> - Enable debug logging for HTTP/2 in Tomcat. This is very verbose and
>>>   if the root cause is threading related typically changes the timing
>>>   enough to mask the issue.
>>>
>>
>> Here are the logs of 20 seconds load:
>>
>> https://gist.githubusercontent.com/martin-g/37dacc12149d5cecbfd4e16dc2248cfd/raw/76693cd1dd1f37ba126edb84ebd7631802aa91a6/tomcat-http2.log

Thanks.

My first observation is that we aren't helping ourselves by having some
messages display stream IDs in 0,000 format, and others without the
commas. That should be a fairly simple fix.

I picked a request somewhere in the middle and I see the expected
sequence of frames for a normal request. Total processing time is about
30ms. Then about 50ms later, the client sends a RST_STREAM. I'd really
like to see what reason code was sent with that reset. That is another
simple fix.

>> This is with my patch in #reset(). Without it there are also errors that
>> stream XYZ is already closed

ACK.

<snip/>

>> According to https://tools.ietf.org/html/rfc7540#section-5.1 when the
>> stream is in HALF_CLOSED_REMOTE state it should send either END_STREAM or
>> RST_STREAM to the remote peer.
>> In the logs I see such sequence of events:
>>
>> ...
>> o.a.coyote.http2.StreamStateMachine      : Connection [124], Stream
>> [5,279], State changed from [OPEN] to [HALF_CLOSED_REMOTE]
>> o.a.coyote.http2.Http2UpgradeHandler     : Connection [124], Stream
>> [5,279], Writing the headers
>> org.apache.coyote.http2.Stream           : Connection [124], Stream
>> [5,279], flushing output with buffer at position [12], writeInProgress
>> [false] and closed [true]
>> org.apache.coyote.http2.AbstractStream   : Connection [124], Stream
>> [5,279], reduce flow control window by [12] to [4194292]
>> o.a.coyote.http2.Http2UpgradeHandler     : Connection [124], Stream
>> [5,279], Data length [12]
>> o.a.coyote.http2.StreamStateMachine      : Connection [124], Stream
>> [5,279], State changed from [HALF_CLOSED_REMOTE] to [CLOSED_TX]
>> org.apache.coyote.http2.Stream           : Connection [124], Stream
>> [5,279] has been recycled
>>
>>
>> From the logs it is not clear whether any of these flags is being sent to
>> the client. I will investigate!
>>
> 
> The flag is always being sent with the data!

Agreed. Tomcat's debug logs show a normal stream with a clean closure.

> One minor thing though: at
> https://github.com/apache/tomcat/blob/613787d26fe14aec775fe82e59d0e4bc079d1814/java/org/apache/coyote/http2/Http2AsyncUpgradeHandler.java#L211
> the internal state is being changed but the actual sending may not happen
> if the stream is not writeable.

That should be fine. The stream will be marked as not-writeable if the
client has sent a RST_STREAM. The test is there to stop the client
having to send additional RST_STREAM frames if the server continues
writing (which would be a protocol violation after receiving a RST_STREAM).

I think it may be time to download Vegeta to try and dig into the root
cause of these RST_STREAM frames.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to