On Mon, Sep 21, 2020 at 1:36 PM Martin Grigorov <mgrigo...@apache.org>
wrote:

> On Mon, Sep 21, 2020 at 12:08 PM Martin Grigorov <mgrigo...@apache.org>
> wrote:
>
> >
> >
> > On Mon, Sep 21, 2020 at 11:23 AM Mark Thomas <ma...@apache.org> wrote:
> >
> >> On 21/09/2020 08:18, Martin Grigorov wrote:
> >> > On Fri, Sep 18, 2020 at 6:16 PM Mark Thomas <ma...@apache.org> wrote:
> >> >
> >> >> On 18/09/2020 14:07, Martin Grigorov wrote:
> >> >>
> >> >> <snip/>
> >> >>
> >> >>> What is the difference
> >> >>> between org.apache.coyote.http2.StreamStateMachine.State#CLOSED_RX
> >> >>> and org.apache.coyote.http2.StreamStateMachine.State#CLOSED_TX ?
> >> >>
> >> >> Compare the parameters used to construct the enums.
> >> >>
> >> >>> I read some parts of https://tools.ietf.org/html/rfc7540 but I
> don't
> >> see
> >> >>> anything related to two types of CLOSED state.
> >> >>
> >> >> Section 5.1. Definition of the closed state (page 20) explains the
> >> >> difference between the two.
> >> >>
> >> >
> >> > Still, I do not understand what RX and TX stand for. But this is not
> >> > important.
> >>
> >> TX - transmit
> >>
> >> RX - receive
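> >>
> >> As an illustrative sketch only (invented names and a trimmed-down
> >> constructor, not Tomcat's actual enum), the two CLOSED states differ
> >> in which frame types may still legally arrive from the peer, per RFC
> >> 7540 section 5.1:
> >>
> >> import java.util.EnumSet;
> >> import java.util.Set;
> >>
> >> enum FrameType { DATA, HEADERS, PRIORITY, RST, WINDOW_UPDATE }
> >>
> >> enum State {
> >>     // CLOSED_RX: closed because END_STREAM was received. The peer
> >>     // already knows the stream is done, so only PRIORITY remains
> >>     // acceptable from it.
> >>     CLOSED_RX(EnumSet.of(FrameType.PRIORITY)),
> >>     // CLOSED_TX: closed because END_STREAM was transmitted. Frames the
> >>     // peer sent before it learned of the close may still be in flight
> >>     // and must be tolerated for a short period.
> >>     CLOSED_TX(EnumSet.of(FrameType.PRIORITY, FrameType.RST,
> >>             FrameType.WINDOW_UPDATE));
> >>
> >>     private final Set<FrameType> receivable;
> >>
> >>     State(Set<FrameType> receivable) {
> >>         this.receivable = receivable;
> >>     }
> >>
> >>     boolean isFrameTypePermitted(FrameType frameType) {
> >>         return receivable.contains(frameType);
> >>     }
> >> }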
> >>
> >
> > Thanks!
> >
> >
> >>
> >>
> >> > The following patch fixes the issue for me/Vegeta:
> >> >
> >> > @@ -1570,12 +1571,15 @@ class Http2UpgradeHandler extends AbstractStream implements InternalHttpUpgradeH
> >> >
> >> >      @Override
> >> >      public void reset(int streamId, long errorCode) throws Http2Exception {
> >> > -        Stream stream = getStream(streamId, true);
> >> > -        boolean active = stream.isActive();
> >> > -        stream.checkState(FrameType.RST);
> >> > -        stream.receiveReset(errorCode);
> >> > -        if (active) {
> >> > -            activeRemoteStreamCount.decrementAndGet();
> >> > +        boolean unknownIsError = Http2Error.CANCEL.getCode() != errorCode;
> >> > +        Stream stream = getStream(streamId, unknownIsError);
> >> > +        if (stream != null) {
> >> > +            boolean active = stream.isActive();
> >> > +            stream.checkState(FrameType.RST);
> >> > +            stream.receiveReset(errorCode);
> >> > +            if (active) {
> >> > +                activeRemoteStreamCount.decrementAndGet();
> >> > +            }
> >> >          }
> >> >      }
> >> >
> >> > I.e. do not throw an error if the remote peer is trying to cancel an
> >> > already closed stream.
> >>
> >> RFC 7540 allows the connection to be closed with a protocol error if the
> >> reset is received "a significant time after sending END_STREAM".
> >>
> >> Tomcat retains state for maxConcurrentStreams + 10% with closed streams
> >> being pruned (oldest first) every 10 new streams.
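> >>
> >> As a rough sketch of that bookkeeping (invented names; the real logic
> >> lives in Http2UpgradeHandler and is more involved):
> >>
> >> import java.util.Iterator;
> >> import java.util.Map;
> >> import java.util.TreeMap;
> >>
> >> class StreamRegistry {
> >>     // Keyed by stream id; TreeMap iterates in ascending id order,
> >>     // i.e. oldest stream first.
> >>     private final TreeMap<Integer, Boolean> closedById = new TreeMap<>();
> >>     private final int maxConcurrentStreams;
> >>     private int newStreamsSincePrune = 0;
> >>
> >>     StreamRegistry(int maxConcurrentStreams) {
> >>         this.maxConcurrentStreams = maxConcurrentStreams;
> >>     }
> >>
> >>     void register(int streamId) {
> >>         closedById.put(streamId, Boolean.FALSE);
> >>         if (++newStreamsSincePrune == 10) {   // prune every 10 new streams
> >>             newStreamsSincePrune = 0;
> >>             prune();
> >>         }
> >>     }
> >>
> >>     void markClosed(int streamId) {
> >>         closedById.replace(streamId, Boolean.TRUE);
> >>     }
> >>
> >>     private void prune() {
> >>         int limit = maxConcurrentStreams + maxConcurrentStreams / 10; // +10%
> >>         Iterator<Map.Entry<Integer, Boolean>> it =
> >>                 closedById.entrySet().iterator();
> >>         while (closedById.size() > limit && it.hasNext()) {
> >>             if (it.next().getValue()) {  // closed streams only, oldest first
> >>                 it.remove();
> >>             }
> >>         }
> >>     }
> >> }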
> >>
> >
> > The spec talks about time, but Tomcat uses the number of streams.
> > I understand that under normal usage this could amount to enough
> > "time", but under heavy load that number is reached quickly and the
> > elapsed time is short.
> >
> > I've tried these values for
> > maxConcurrentStreams/maxConcurrentStreamExecution: 8 (the number of
> > CPU cores), 16, 100 (the default), 1024 and 8196. The throughput is
> > around 10K reqs/sec only up to 100. After that it drops (as you said
> > earlier).
> >
> >
> >>
> >> It isn't clear at this point why the client is sending the RST_STREAM.
> >>
> >
> > I don't know either. But Vegeta uses the Golang standard libraries,
> > so any Golang application should behave the same way.
> > Also, as I found when I investigated last week, the
> > Golang/Node.js/Rust/Netty HTTP/2 servers do not complain about it.
> >
> >
> >> This normally indicates an error condition. From the description of the
> >> load test I don't see why the client would be sending a RST_STREAM.
> >>
> >> If the RST_STREAM is valid and it isn't received a significant time
> >> after the END_STREAM then Tomcat needs to handle this.
> >>
> >> If the RST_STREAM is valid but received a significant amount of time
> >> after the END_STREAM, that would be a client error.
> >>
> >> Of course, significant isn't defined in the specification.
> >>
> >> Any special handling for RST_STREAM also needs to be applied to
> >> WINDOW_UPDATE. It also needs to differentiate between a frame received
> >> for a past closed stream and a frame received for a stream that never
> >> existed.
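> >>
> >> One way to sketch that distinction (hypothetical types and names, not
> >> the actual getStream() logic): any stream id at or below the highest
> >> id already processed must have existed at some point, so an unknown
> >> low id means "closed and pruned" rather than "never existed".
> >>
> >> import java.util.Map;
> >> import java.util.concurrent.ConcurrentHashMap;
> >>
> >> class StreamLookup {
> >>     // Stand-in types for illustration only.
> >>     static class Stream {}
> >>     static class UnknownStreamException extends Exception {
> >>         UnknownStreamException(String msg) { super(msg); }
> >>     }
> >>
> >>     private final Map<Integer, Stream> streams = new ConcurrentHashMap<>();
> >>     private volatile int maxProcessedStreamId = 0; // highest client id seen
> >>
> >>     Stream getStream(int streamId) throws UnknownStreamException {
> >>         Stream stream = streams.get(streamId);
> >>         if (stream == null) {
> >>             if (streamId <= maxProcessedStreamId) {
> >>                 // Existed once but was pruned: treat the frame as stale
> >>                 // and ignore it instead of killing the connection.
> >>                 return null;
> >>             }
> >>             // A frame for a stream that never existed is a
> >>             // connection-level protocol error (RFC 7540, section 5.1).
> >>             throw new UnknownStreamException(
> >>                     "Unknown stream [" + streamId + "]");
> >>         }
> >>         return stream;
> >>     }
> >> }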
> >>
> >> I think the key question here is why is the client sending the
> >> RST_STREAM in the first place. Is Tomcat doing something it shouldn't /
> >> not doing something it should to cause this? If so, we need to fix the
> >> root cause rather than tackle the symptom.
> >>
> >> There are a couple of options for debugging this to see why the client
> >> is sending the RST_STREAM:
> >> - Enable debug logging for HTTP/2 in Tomcat. This is very verbose and,
> >>   if the root cause is threading related, typically changes the timing
> >>   enough to mask the issue.
> >>
> >
> > Here are the logs of a 20-second load run:
> >
> >
> https://gist.githubusercontent.com/martin-g/37dacc12149d5cecbfd4e16dc2248cfd/raw/76693cd1dd1f37ba126edb84ebd7631802aa91a6/tomcat-http2.log
> >
>
> Actually, the above contained only the most recent logs.
> The earlier logs were in .gz files which I had missed.
>
> https://gist.githubusercontent.com/martin-g/37dacc12149d5cecbfd4e16dc2248cfd/raw/00596b7dc31d30d275384334ff17c5416f5b363c/tomcat-http2.log
> - this file contains the first 7 seconds of the load run.
> There is three times as much logging for the last 13 seconds, but it is
> more or less the same information.
>
> There are many like this:
>
> 2020-09-21 14:25:04.850 DEBUG 232086 --- [https-jsse-nio-18080-exec-8] o.a.coyote.http11.Http11NioProtocol      : Found processor [null] for socket [org.apache.tomcat.util.net.SecureNioChannel@2b435926:java.nio.channels.SocketChannel[connected local=/127.0.0.1:18080 remote=/127.0.0.1:42229]]
> 2020-09-21 14:25:04.850 DEBUG 232086 --- [https-jsse-nio-18080-exec-6] o.a.coyote.http2.Http2UpgradeHandler     : Connection [2]
>
> java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
>         at org.apache.tomcat.util.net.SecureNioChannel.read(SecureNioChannel.java:766)
>         at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1468)
>

When I tested HTTP/2, I used h2c (unencrypted) to get a better idea of
what's going on. Otherwise you don't really know whether TLS or HTTP/2
has the bigger impact on performance [and then you have to test more
things, like OpenSSL, and be careful that the ciphers used are the same
and equally well optimized in each implementation, etc.].

Then once you feel happy about h2c, you can see how things go with TLS.
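
For reference, here is a minimal embedded-Tomcat sketch of such an h2c
setup (class name and port are illustrative; a test webapp still has to
be deployed separately):

import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;
import org.apache.coyote.http2.Http2Protocol;

public class H2cBenchmarkServer {
    public static void main(String[] args) throws Exception {
        Tomcat tomcat = new Tomcat();
        Connector connector =
                new Connector("org.apache.coyote.http11.Http11NioProtocol");
        connector.setPort(8080);
        // No SSLHostConfig: clients negotiate HTTP/2 via the h2c upgrade
        // or prior knowledge, so TLS is out of the picture entirely.
        connector.addUpgradeProtocol(new Http2Protocol());
        tomcat.getService().addConnector(connector);
        tomcat.start();
        tomcat.getServer().await();
    }
}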

Rémy


>         at org.apache.tomcat.util.net.SocketWrapperBase$OperationState.start(SocketWrapperBase.java:1039)
>         at org.apache.tomcat.util.net.SocketWrapperBase.vectoredOperation(SocketWrapperBase.java:1450)
>         at org.apache.tomcat.util.net.SocketWrapperBase.read(SocketWrapperBase.java:1293)
>         at org.apache.tomcat.util.net.SocketWrapperBase.read(SocketWrapperBase.java:1265)
>         at org.apache.coyote.http2.Http2AsyncParser.readConnectionPreface(Http2AsyncParser.java:55)
>         at org.apache.coyote.http2.Http2UpgradeHandler.init(Http2UpgradeHandler.java:238)
>         at org.apache.coyote.http2.Http2AsyncUpgradeHandler.init(Http2AsyncUpgradeHandler.java:39)
>         at org.apache.coyote.http2.Http2UpgradeHandler.upgradeDispatch(Http2UpgradeHandler.java:313)
>         at org.apache.coyote.http2.Http2AsyncUpgradeHandler.upgradeDispatch(Http2AsyncUpgradeHandler.java:39)
>         at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
>         at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:59)
>         at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
>         at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
>         at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
>         at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>         at java.base/java.lang.Thread.run(Thread.java:832)
>
>
>
> >
> >
> > This is with my patch in #reset(). Without it there are also errors
> > saying that stream XYZ is already closed.
> >
> >
> >> - Client side logging (I don't know what the capabilities are here).
> >> - JVM level TLS debug logging (even more verbose than Tomcat's HTTP/2
> >>   debug and harder to work with)
> >> - Wireshark (or similar)
> >>
> >> I'd recommend Wireshark. I don't think JSSE supports SSLKEYLOGFILE so
> >> I'd use the APR/Native connector with SSLKEYLOGFILE to get the cleartext
> >> in Wireshark.
> >>
> >> > With this change and Vegeta's -max-workers=100 I can get 12K reqs/sec.
> >>
> >> That is odd. If the client is sending RST_STREAM messages then I'd
> >> expect to see requests failing as RST_STREAM indicates something has
> >> gone wrong or the client has aborted the request.
> >>
> >
> > According to https://tools.ietf.org/html/rfc7540#section-5.1, when the
> > stream is in the HALF_CLOSED_REMOTE state it should be closed by sending
> > either a frame with the END_STREAM flag set or a RST_STREAM frame to the
> > remote peer.
> > In the logs I see such sequence of events:
> >
> > ...
> > o.a.coyote.http2.StreamStateMachine      : Connection [124], Stream
> > [5,279], State changed from [OPEN] to [HALF_CLOSED_REMOTE]
> > o.a.coyote.http2.Http2UpgradeHandler     : Connection [124], Stream
> > [5,279], Writing the headers
> > org.apache.coyote.http2.Stream           : Connection [124], Stream
> > [5,279], flushing output with buffer at position [12], writeInProgress
> > [false] and closed [true]
> > org.apache.coyote.http2.AbstractStream   : Connection [124], Stream
> > [5,279], reduce flow control window by [12] to [4194292]
> > o.a.coyote.http2.Http2UpgradeHandler     : Connection [124], Stream
> > [5,279], Data length [12]
> > o.a.coyote.http2.StreamStateMachine      : Connection [124], Stream
> > [5,279], State changed from [HALF_CLOSED_REMOTE] to [CLOSED_TX]
> > org.apache.coyote.http2.Stream           : Connection [124], Stream
> > [5,279] has been recycled
> >
> >
> > From the logs it is not clear whether either of them is being sent to
> > the client. I will investigate!
> >
> > Note: ", Writing the headers" is added by me. The current log entry is
> > "upgradeHandler.writeHeaders=Connection [{0}], Stream [{1}]"
> >
> >
> >>
> >> > With more workers it starts failing with:
> >> >
> >> > "status_codes": {
> >> >     "0": 1000,
> >> >     "200": 1
> >> >   },
> >> >   "errors": [
> >> >     "Get \"https://localhost:18080/testbed/plaintext\": net/http:
> >> request
> >> > canceled while waiting for connection (Client.Timeout exceeded while
> >> > awaiting headers)",
> >> >     "Get \"https://localhost:18080/testbed/plaintext\": context
> >> deadline
> >> > exceeded",
> >> >     "Get \"https://localhost:18080/testbed/plaintext\": context
> >> deadline
> >> > exceeded (Client.Timeout exceeded while awaiting headers)"
> >> >   ]
> >> > I.e. out of 1001 requests only one succeeds and the others fail with
> >> > a timeout. I will try to debug this one as well and also run a profiler.
> >>
> >> I'd probably start by enabling Tomcat's HTTP/2 debug logging for this
> >> one. I'm wondering if something is triggering a connection level error
> >> early in the processing.
> >>
> >> Mark
> >>
>
