On Sat, 2014-02-15 at 17:06 +0100, Sebastiano Vigna wrote: > On 15 Feb 2014, at 12:47 PM, Oleg Kalnichevski <ol...@apache.org> wrote: > > > The problem mostly likely has been introduced by HTTPCLIENT-1432 [1]. I > > reviewed the patch once more and could not find anything obviously wrong > > with it. Try reverting the changes introduced by HTTPCLIENT-1432 and see > > Subclassing InputStream without overriding the multi-byte read() method is a > recipe for disaster... the inherited method does a byte-by-byte read. You can > see what's happening in this hprof trace: > > java.util.zip.Inflater.inflateBytes(Inflater.java:Unknown line) > java.util.zip.Inflater.inflate(Inflater.java:259) > java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152) > java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116) > java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122) > > org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:56) > java.io.InputStream.read(InputStream.java:179) > > it.unimi.di.law.warc.util.InspectableCachedHttpEntity.copyContent(InspectableCachedHttpEntity.java:67) > > copyContent() would love to read(byte[],int,int) in a buffer, but since > LazyDecompressingInputStream doesn't override it it invokes instead the > read-byte-by-byte inherited method in InputStream, which in turn now calls > for each byte the one-byte read() method from LazyDecompressingInputStream, > which invokes the one-byte read method from InflaterInputStream, which does a > multi-byte, length-one read from GZIPInputStream, which unleashes a similar > call on InflaterInputStream, which unfortunately makes a similar read using > the native inflateBytes() method. > > The result is a 10-50x decrease in speed, but I think that a trivial override > of read(byte[],int,int) in LazyDecompressingInputStream will solve the > problem. >
Sebastiano Could you please raise a JIRA for this defect and tag it as regression? Oleg --------------------------------------------------------------------- To unsubscribe, e-mail: httpclient-users-unsubscr...@hc.apache.org For additional commands, e-mail: httpclient-users-h...@hc.apache.org