Igor Lubashev wrote:
1.  BufferedInputStream is working fine.  I've looked at the source, and
it correctly tries to read data only when its internal buffer is
exhausted.  Most read calls reference only the internal buffer.  When
the data does get read from the underlying stream, it tries to read it
in large chunks.  (Of course, if the underlying stream returns very
little data, it is a different problem.)
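
One way to check the claim in point 1 is to count how often the underlying stream is actually touched. The following is a minimal sketch, not taken from the thread: CountingInputStream and BufferedReadDemo are hypothetical helper names, and the data source is an in-memory array rather than a socket.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: counts every read call that reaches the wrapped stream.
class CountingInputStream extends FilterInputStream {
    int calls = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class BufferedReadDemo {
    // Reads dataSize bytes one at a time through a BufferedInputStream and
    // returns how many read calls reached the underlying stream.
    static int underlyingReads(int dataSize, int bufferSize) throws IOException {
        CountingInputStream counting =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        try (InputStream in = new BufferedInputStream(counting, bufferSize)) {
            while (in.read() != -1) {
                // consume byte by byte; most calls only touch the internal buffer
            }
        }
        return counting.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("underlying reads: " + underlyingReads(100_000, 8192));
    }
}
```

With an in-memory source, the count should be on the order of dataSize / bufferSize rather than dataSize. A socket that returns only a little data per read would give a higher count, which is the separate problem mentioned above.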

2.  It is hard to believe that reading a byte at a time is a bottleneck,
but I've just quickly written a LineReaderInputStream, which is derived
from BufferedInputStream, so all the searching for CRLF/LF happens very
quickly internally.  The source is attached.

Just call the readLine() method, and you'll get Strings out of the stream.
You can interleave all regular stream operations and readLine() calls.
However, if you wish to use readLine() *after* using the stream's read()
methods, make sure that you do not inadvertently pass this stream to
anything that is buffering the stream's data (or your strings may get
consumed via buffering).
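
The attachment is not reproduced here. As an illustration only, a class of this kind might look roughly like the sketch below. LineReaderSketch is a hypothetical name, the ISO-8859-1 decoding is an assumption, and the code relies only on the protected buf, pos and count fields of BufferedInputStream; it is not Igor's actual source.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: scans the inherited buffer for LF instead of calling read() per byte.
class LineReaderSketch extends BufferedInputStream {
    LineReaderSketch(InputStream in) {
        super(in);
    }

    // Returns the next line without its CRLF/LF terminator, or null at end of stream.
    public synchronized String readLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawAny = false;
        while (true) {
            if (pos >= count) {
                // Buffer exhausted: super.read() refills it and hands back one byte.
                int b = super.read();
                if (b == -1) {
                    break;
                }
                sawAny = true;
                if (b == '\n') {
                    return finish(line);
                }
                line.write(b);
                continue;
            }
            // Scan the bytes that are already buffered.
            int start = pos;
            int i = start;
            while (i < count && buf[i] != '\n') {
                i++;
            }
            line.write(buf, start, i - start);
            sawAny = true;
            if (i < count) {
                pos = i + 1; // skip the LF
                return finish(line);
            }
            pos = i; // no LF yet; loop around and refill
        }
        return sawAny ? finish(line) : null;
    }

    // Drops a trailing CR and decodes the bytes (ISO-8859-1 is an assumption).
    private static String finish(ByteArrayOutputStream line) {
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--;
        }
        return new String(bytes, 0, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "HTTP/1.1 200 OK\r\nHost: example\n\nbody".getBytes(StandardCharsets.ISO_8859_1);
        LineReaderSketch in = new LineReaderSketch(new ByteArrayInputStream(data));
        String s;
        while ((s = in.readLine()) != null) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Because readLine() only advances pos, ordinary read() calls can be interleaved with it, which matches the usage described above.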

- Igor



Igor,

With all due respect, given the implementation of the BufferedInputStream#read() method in Sun's JRE (see below), I just do not see how LineReaderInputStream should be any faster:

   public synchronized int read() throws IOException {
       if (pos >= count) {
           fill();
           if (pos >= count)
               return -1;
       }
       return getBufIfOpen()[pos++] & 0xff;
   }

Have you done any benchmarking comparing performance of HttpClient 3.x with and without the patch?
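
For a first rough comparison, a loop like the one below could be used. It is only a sketch under stated assumptions: ReadLoopTiming and its methods are hypothetical names, the input is synthetic in-memory data rather than real HttpClient 3.x traffic, and wall-clock timing with System.nanoTime() is a crude measure that is sensitive to JIT warm-up.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical harness: counts LF bytes via per-byte read() and via bulk read(byte[]).
class ReadLoopTiming {
    static int countLfByteAtATime(InputStream in) throws IOException {
        int lines = 0;
        int b;
        while ((b = in.read()) != -1) { // one synchronized call per byte
            if (b == '\n') {
                lines++;
            }
        }
        return lines;
    }

    static int countLfBulk(InputStream in) throws IOException {
        byte[] chunk = new byte[8192];
        int lines = 0;
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) { // one call per chunk
            for (int i = 0; i < n; i++) {
                if (chunk[i] == '\n') {
                    lines++;
                }
            }
        }
        return lines;
    }

    // Synthetic header-like data; the line content is made up for illustration.
    static byte[] sampleData(int lines) {
        byte[] line = "Header-Name: some header value\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream(lines * line.length);
        for (int i = 0; i < lines; i++) {
            out.write(line, 0, line.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = sampleData(200_000);
        for (int round = 0; round < 5; round++) { // repeat so later rounds run warmed up
            long t0 = System.nanoTime();
            int a = countLfByteAtATime(new BufferedInputStream(new ByteArrayInputStream(data)));
            long t1 = System.nanoTime();
            int b = countLfBulk(new BufferedInputStream(new ByteArrayInputStream(data)));
            long t2 = System.nanoTime();
            System.out.printf("round %d: per-byte %d us, bulk %d us (lines %d/%d)%n",
                    round, (t1 - t0) / 1000, (t2 - t1) / 1000, a, b);
        }
    }
}
```

Any numbers it prints describe only this synthetic loop; they would not settle the question for the real client with and without the patch.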

I have invested a lot of effort into optimizing the low-level HTTP components for HttpClient 4.0 [1], and most performance gains came from three factors: elimination of unnecessary synchronization, elimination of intermediate buffer copying, and reduced garbage (thus reduced GC time). Performance improvements due to the improved HTTP header parser and chunk codec were marginal at best.

Oleg

[1] http://jakarta.apache.org/httpcomponents/httpcore/index.html




I looked at the source for BufferedInputStream and it looks like
it tries to fill the empty space in the buffer each time you read
from it (for a socket connection it will read more than one packet
of data) instead of just doing a single read from the underlying
stream.

Ok, then the byte-by-byte reading in CIS when parsing the chunk
header might well be the problem. If you want to fix that, you'll
have to hack deeply into CIS. Here is what I would do if I had no
other choice:

- extend CIS by a local byte array as a buffer (needs two extra int
  for cursor and fill size)
- change the chunk header parsing to read a bunch of bytes into the
  buffer, then parsing from there
- change all read methods to return leftover bytes from the buffer
  before calling a read on the underlying stream
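
The three steps above can be sketched as follows. This is a simplified illustration, not the actual ChunkedInputStream code: BufferedChunkedInputSketch is a hypothetical class, chunk extensions are skipped, trailers are not parsed, and error handling is minimal.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a chunk decoder with its own local buffer.
class BufferedChunkedInputSketch extends InputStream {
    private final InputStream in;
    private final byte[] buf = new byte[2048]; // local buffer
    private int cursor = 0;                    // next unread position in buf
    private int fill = 0;                      // number of valid bytes in buf
    private int chunkLeft = 0;                 // data bytes left in the current chunk
    private boolean first = true;
    private boolean eof = false;

    BufferedChunkedInputSketch(InputStream in) {
        this.in = in;
    }

    // Reads a bunch of bytes from the underlying stream in one call.
    private boolean refill() throws IOException {
        cursor = 0;
        int n = in.read(buf, 0, buf.length);
        fill = Math.max(n, 0);
        return n > 0;
    }

    private int nextByte() throws IOException {
        if (cursor >= fill && !refill()) {
            return -1;
        }
        return buf[cursor++] & 0xff;
    }

    // Parses one CRLF-terminated line out of the local buffer.
    private String readHeaderLine() throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = nextByte()) != -1 && b != '\n') {
            if (b != '\r') {
                sb.append((char) b);
            }
        }
        if (b == -1) {
            throw new EOFException("truncated chunk header");
        }
        return sb.toString();
    }

    private void nextChunk() throws IOException {
        if (!first) {
            readHeaderLine(); // CRLF that follows the previous chunk's data
        }
        first = false;
        String line = readHeaderLine();
        int semi = line.indexOf(';'); // ignore chunk extensions
        if (semi >= 0) {
            line = line.substring(0, semi);
        }
        chunkLeft = Integer.parseInt(line.trim(), 16);
        if (chunkLeft == 0) {
            eof = true; // trailers are not parsed in this sketch
        }
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xff;
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        if (eof) {
            return -1;
        }
        if (len == 0) {
            return 0;
        }
        if (chunkLeft == 0) {
            nextChunk();
            if (eof) {
                return -1;
            }
        }
        if (cursor >= fill && !refill()) {
            throw new EOFException("truncated chunk data");
        }
        // Return leftover buffered bytes before touching the underlying stream again.
        int n = Math.min(len, Math.min(chunkLeft, fill - cursor));
        System.arraycopy(buf, cursor, dst, off, n);
        cursor += n;
        chunkLeft -= n;
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        InputStream body = new BufferedChunkedInputSketch(new ByteArrayInputStream(wire));
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = body.read()) != -1) {
            sb.append((char) b);
        }
        System.out.println(sb);
    }
}
```

One caveat with this design: the local buffer may read past the end of the chunked body, so on a persistent connection the leftover bytes would have to be handed back or shared with whatever reads the next response.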

hope that helps,
  Roland

Tony and Roland,

I rather strongly suspect it is BufferedInputStream that needs fixing, not ChunkedInputStream.

Oleg


------------------------------------------------------------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

