On 21/09/2020 10:44, Martin Grigorov wrote: > On Mon, Sep 21, 2020 at 12:08 PM Martin Grigorov <mgrigo...@apache.org> > wrote: >> On Mon, Sep 21, 2020 at 11:23 AM Mark Thomas <ma...@apache.org> wrote:
<snip/> >>> RFC 7540 allows the connection to be closed with a protocol error if the >>> reset is received "a significant time after sending END_STREAM". >>> >>> Tomcat retains state for maxConcurrentStreams + 10% with closed streams >>> being pruned (oldest first) every 10 new streams. >> >> The spec talks about time but Tomcat uses the number of streams. >> I understand that under normal usage this could be enough "time" but under >> heavy load this number is reached quickly and the time is short. Agreed. The closed stream is being pruned within a fraction of a second. However, the window where the client could legitimately send RST_STREAM is also very small. Looking at the timings, I don't think the client is sending the RST_STREAM within that window. You get back to the (lack of) definition of significant. I think I could make a case for significant being anything from a few milliseconds to several minutes. >>> It isn't clear at this point why the client is sending the RST_STREAM. >>> >> >> I don't know either. But Vegeta is using the Golang standard libraries, so >> any Golang application should behave the same way. >> Also as I investigated last week - Golang/Node.js/Rust/Netty HTTP2 servers >> do not complain about it. >> >> >>> This normally indicates an error condition. From the description of the >>> load test I don't see why the client would be sending a RST_STREAM. >>> >>> If the RST_STREAM is valid and it isn't received a significant time >>> after the END_STREAM then Tomcat needs to handle this. >>> >>> If the RST_STREAM is valid but received a significant amount of time >>> after the END_STREAM that would be a client error would be a client error. >>> >>> Of course, significant isn't defined in the specification. >>> >>> Any special handling for RST_STREAM also needs to be applied to >>> WINDOW_UPDATE. It also needs to differentiate between a frame received >>> for a past closed stream and a frame received for an stream that never >>> existed. >>> >>> I think the key question here is why is the client sending the >>> RST_STREAM in the first place. Is Tomcat doing something it shouldn't / >>> not doing something it should to cause this? If so, we need to fix the >>> root cause rather than tackle the symptom. >>> >>> There are a couple of options for debugging this to see why the client >>> is sending the RST_STREAM: >>> - Enable debug logging for HTTP/2 in Tomcat. This is very verbose and >>> if the root cause is threading related typically changes the timing >>> enough to mask the issue. >>> >> >> Here are the logs of 20 seconds load: >> >> https://gist.githubusercontent.com/martin-g/37dacc12149d5cecbfd4e16dc2248cfd/raw/76693cd1dd1f37ba126edb84ebd7631802aa91a6/tomcat-http2.log Thanks. My first observation is that we aren't helping ourselves by having some messages display stream IDs in 0,000 format, and others without the commas. That should be a fairly simple fix. I picked a request somewhere in the middle and I see the expected sequence of frames for a normal request. Total processing time is about 30ms. Then about 50ms later, the client sends a RST_STREAM. I'd really like to see what reason code was sent with that reset. That is another simple fix. >> This is with my patch in #reset(). Without it there are also errors that >> stream XYZ is already closed ACK. <snip/> >> According to https://tools.ietf.org/html/rfc7540#section-5.1 when the >> stream is in HALF_CLOSED_REMOTE state it should send either END_STREAM or >> RST_STREAM to the remote peer. >> In the logs I see such sequence of events: >> >> ... >> o.a.coyote.http2.StreamStateMachine : Connection [124], Stream >> [5,279], State changed from [OPEN] to [HALF_CLOSED_REMOTE] >> o.a.coyote.http2.Http2UpgradeHandler : Connection [124], Stream >> [5,279], Writing the headers >> org.apache.coyote.http2.Stream : Connection [124], Stream >> [5,279], flushing output with buffer at position [12], writeInProgress >> [false] and closed [true] >> org.apache.coyote.http2.AbstractStream : Connection [124], Stream >> [5,279], reduce flow control window by [12] to [4194292] >> o.a.coyote.http2.Http2UpgradeHandler : Connection [124], Stream >> [5,279], Data length [12] >> o.a.coyote.http2.StreamStateMachine : Connection [124], Stream >> [5,279], State changed from [HALF_CLOSED_REMOTE] to [CLOSED_TX] >> org.apache.coyote.http2.Stream : Connection [124], Stream >> [5,279] has been recycled >> >> >> From the logs it is not clear whether any of these flags is being sent to >> the client. I will investigate! >> > > The flag is always being sent with the data! Agreed. Tomcat's debug logs show a normal stream with a clean closure. > One minor thing though: at > https://github.com/apache/tomcat/blob/613787d26fe14aec775fe82e59d0e4bc079d1814/java/org/apache/coyote/http2/Http2AsyncUpgradeHandler.java#L211 > the internal state is being changed but the actual sending may not happen > if the stream is not writeable. That should be fine. The stream will be marked as not-writeable if the client has sent a RST_STREAM. The test is there to stop the client having to send additional RST_STREAM frames if the server continues writing (which would be a protocol violation after receiving a RST_STREAM). I think it may be time to download Vegeta to try and dig into the root cause of these RST_STREAM frames. Mark --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org