On Wed, 2007-04-11 at 08:02 -0400, Tony Thompson wrote:
> > I am having what I consider a fairly significant performance issue
> > with a ChunkedInputStream in the 3.0.1 client. I have been packet
> > tracing a conversation using the HttpClient where the response is
> > chunked and my application is reading the response stream directly
> > from the method I executed. On the wire, frequently during a large
> > chunked response I see "ZeroWindow" responses from my client to the
> > server which would indicate that the HttpClient is not getting the
> > data off the wire fast enough. I have tried making the
> > BufferedInputStream in HttpConnection large (128K) and it still fills
> > up (it just takes a little longer).
> >
> > So, after doing some profiling of ChunkedInputStream I found that a
> > huge amount of time is spent in
> > ChunkedInputStream.getChunkSizeFromInputStream(). In the very short
> > profiling session I ran, ChunkedInputStream.read( byte[], int, int )
> > was invoked 2424 times and the time spent in that method (excluding
> > further method calls) was 870ms. getChunkSizeFromInputStream() was
> > invoked 432 times and the time spent in that method was 27762ms
> > (excluding further method calls). Does someone who understands that
> > code better than I have any idea how that can be improved?
> >
> >Tony,
> >
> >I have spent a fair amount of time and effort profiling HttpCore, a
> >set of low level HTTP transport components HttpClient 4.0 will be
> >based on.
> >Overall HttpClient 4.0 is expected to be 20 to 40% faster than
> >HttpClient 3.x due to improvements in the core HTTP components. The sad
> >truth is we simply lack resources to back-port those changes to
> >HttpClient 3.x code line.
>
> Unfortunately I need to come up with some kind of fix now. I am
> currently using this in an environment where this is causing lots of
> issues. So, I guess that means I have to dig into that myself. Any
> pointers on what I might be able to do to improve that particular piece
> of the code?
I ended up rewriting it almost completely for HttpClient 4.0. One of the
problems I found is that in lots of places HttpClient 3.x reads one byte
at a time from the input stream in order to be able to detect a CRLF / LF
line delimiter, which may be one of the factors contributing to the
performance issue you have been having. I simply do not see an easy fix
for this problem.

> > One other issue I have with that code is if I interrupt the file
> > transfer and call method.abort(), that ChunkedInputStream appears to
> > still keep pulling data from the host. Wouldn't it just make more
> > sense to just close down that connection instead of making it sit
> > there and pull data that is just dumped into the bit bucket?
> >
> >This is precisely what HttpMethod#abort() does. It simply shuts down the
> >underlying connection. I have a hard time believing any data can
> >be received after the connection socket has been closed. It is
> >plausible, though, some data may still be read from an intermediate
> >content buffer, but I find this scenario unlikely.
>
> You may want to take a look at that in the new client. In the 3.0.1
> client, after that stream is closed, exhaustInputStream() is called
> which attempts to finish reading the content so the connection can be
> ready for another request.

I do not think this is the case. #exhaustInputStream() is called ONLY if
the connection is being released back to the connection manager.
HttpMethod#abort() simply calls HttpConnection#close(), which in turn
just closes down the underlying network socket without trying to exhaust
the input stream.

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1102
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpConnection.html#1214

Hope this helps

Oleg

> In my case, that is a lot of content and so
> it continues on for a bit before the socket is just reset.
> I am not continuing to read in my application but looking at a packet
> trace I can see the client is doing it for me.
>
> Thanks
> Tony

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
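
For anyone following the thread who wants to see what "reading one byte
at a time to detect a CRLF / LF line delimiter" looks like in practice,
here is a minimal sketch. It is not the HttpClient 3.x or 4.0 code; the
class ChunkSizeLineSketch and its methods are hypothetical names made up
for illustration. It only shows the general pattern: every byte of the
chunk-size line costs one read() call on the stream, so the per-call
overhead of whatever stream sits underneath is paid once per byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical class, not part of Commons HttpClient.
public class ChunkSizeLineSketch {

    /** Reads one line terminated by CRLF or a bare LF, one byte per read() call. */
    public static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {   // single-byte read on every iteration
            if (b == '\n') {
                break;                    // LF ends the line in both CRLF and LF cases
            }
            line.append((char) b);
        }
        // Drop the CR that precedes the LF when the terminator was CRLF.
        int len = line.length();
        if (len > 0 && line.charAt(len - 1) == '\r') {
            line.setLength(len - 1);
        }
        return line.toString();
    }

    /** Parses the hexadecimal chunk size, ignoring any ";extension" suffix. */
    public static int readChunkSize(InputStream in) throws IOException {
        String line = readLine(in);
        int semi = line.indexOf(';');
        String hex = (semi >= 0 ? line.substring(0, semi) : line).trim();
        try {
            return Integer.parseInt(hex, 16);
        } catch (NumberFormatException e) {
            throw new IOException("Bad chunk size: " + hex, e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "1a;name=value\r\nabcdefghijklmnopqrstuvwxyz\r\n0\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(wire);
        System.out.println(readChunkSize(in)); // prints 26 (0x1a)
    }
}
```

Whether this pattern alone accounts for the time reported in
getChunkSizeFromInputStream() is not something the sketch can show; it
is only meant to make the mechanism concrete.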
