[
https://issues.apache.org/jira/browse/HTTPCLIENT-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arturo Bernal updated HTTPCLIENT-2422:
--------------------------------------
Fix Version/s: 5.7-alpha1
> DecompressingEntity in 5.4+ eagerly creates decompression stream, causing
> ZipException on empty/invalid bodies (regression from 5.2 lazy behavior)
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HTTPCLIENT-2422
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2422
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 5.4, 5.5, 5.6
> Reporter: Sneha Murganoor
> Priority: Critical
> Fix For: 5.7-alpha1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In 5.2, DecompressingEntity.getContent() returned a
> LazyDecompressingInputStream that deferred GZIPInputStream creation to the
> first read() call. This allowed responses with Content-Encoding: gzip but
> empty or non-gzip bodies to be handled gracefully: either the stream was
> never read, or the error surfaced at a point where callers could handle it.
> In 5.4+, DecompressingEntity (moved to
> org.apache.hc.client5.http.entity.compress) was rewritten to eagerly call
> decoder.apply(super.getContent()) in getContent(). This immediately creates
> GZIPInputStream, which reads the gzip magic bytes in its constructor. If the
> body is empty (e.g., chunked transfer with zero-length body) or not actually
> compressed, this throws ZipException: Not in GZIP format at getContent() time
> — before the caller has any opportunity to handle it.
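For readers who want to see the eager behavior in isolation, here is a minimal sketch that uses only java.util.zip, not HttpClient. EagerGzipDemo, tryOpen and gzip are names invented for this illustration. It shows that the GZIPInputStream constructor parses the header itself, so a bad body fails before any read() is issued.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; it does not touch HttpClient at all.
class EagerGzipDemo {

    // Returns "ok" if the GZIPInputStream constructor succeeds, otherwise the
    // simple name of the exception it threw. No read() is ever called here.
    static String tryOpen(byte[] body) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return "ok";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Helper that produces a valid gzip payload for comparison.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "plain text".getBytes(StandardCharsets.UTF_8);
        System.out.println("empty body:    " + tryOpen(new byte[0]));
        System.out.println("non-gzip body: " + tryOpen(plain));
        System.out.println("valid gzip:    " + tryOpen(gzip("plain text")));
    }
}
```

As far as I know, on current JDKs a zero-byte input surfaces from the constructor as EOFException, while non-gzip bytes surface as ZipException: Not in GZIP format. Both are IOExceptions, so either way the failure happens at construction time, which is exactly what an eager getContent() exposes to the caller.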
> Reproduction:
> A backend sends:
> {quote}
> HTTP/1.1 200 OK
> Content-Encoding: gzip
> Transfer-Encoding: chunked
>
> 0\r\n\r\n
> {quote}
> (An empty chunked body with a Content-Encoding: gzip header.)
> In 5.2: entity.getContent() succeeds, returns LazyDecompressingInputStream.
> Caller reads EOF without error.
> In 5.4+: entity.getContent() throws java.util.zip.ZipException: Not in GZIP
> format.
> Stack trace:
> {quote}
> java.util.zip.ZipException: Not in GZIP format
>     at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:197)
>     at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:81)
>     at org.apache.hc.client5.http.entity.compress.DecompressingEntity.getContent(DecompressingEntity.java:63)
> {quote}
> Context:
> HTTPCLIENT-1432 reported the same class of issue (ZipException on 304
> responses with Content-Encoding: gzip). It was fixed in 4.5.5 and 5.0 Beta1
> by using LazyDecompressingInputStream. The 5.4 rewrite of DecompressingEntity
> removed lazy initialization, reintroducing this failure mode.
> While the backend is arguably misbehaving by sending Content-Encoding: gzip
> with no body, this is common in practice: some web servers add the header
> unconditionally, whether or not compression occurred. The 5.2 behavior was
> more resilient to this.
> Suggested fix:
> Restore lazy stream initialization in DecompressingEntity.getContent():
> either defer decoder.apply() to the first read(), or detect an empty
> underlying stream before attempting decompression.
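The two options in the suggested fix can be sketched with plain JDK streams. This is an illustration under assumptions, not the actual HttpClient patch: LazyDecodingInputStream, Decoder and skipIfEmpty are hypothetical names, and Decoder merely stands in for the decoder.apply(...) step described in the report.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical names throughout; this is not HttpClient's DecompressingEntity.
class LazyDecodingInputStream extends InputStream {

    // Stand-in for the decoder.apply(...) step mentioned in the report.
    interface Decoder {
        InputStream apply(InputStream raw) throws IOException;
    }

    private final InputStream raw;
    private final Decoder decoder;
    private InputStream decoded; // created on first read, not in the constructor

    LazyDecodingInputStream(InputStream raw, Decoder decoder) {
        this.raw = raw;
        this.decoder = decoder;
    }

    // Option 1: defer the decoder until the caller actually reads.
    private InputStream decoded() throws IOException {
        if (decoded == null) {
            decoded = decoder.apply(raw); // e.g. the gzip header is parsed here
        }
        return decoded;
    }

    @Override
    public int read() throws IOException {
        return decoded().read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return decoded().read(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Close whichever stream exists; never run the decoder just to close.
        if (decoded != null) {
            decoded.close();
        } else {
            raw.close();
        }
    }

    // Option 2: peek one byte and skip decompression for a zero-length body.
    static Decoder skipIfEmpty(Decoder decoder) {
        return raw -> {
            PushbackInputStream peek = new PushbackInputStream(raw, 1);
            int first = peek.read();
            if (first == -1) {
                return peek; // nothing to decompress; reads report EOF
            }
            peek.unread(first); // put the byte back for the real decoder
            return decoder.apply(peek);
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream empty = new ByteArrayInputStream(new byte[0]);
        try (InputStream in =
                new LazyDecodingInputStream(empty, skipIfEmpty(GZIPInputStream::new))) {
            System.out.println("first read: " + in.read());
        }
    }
}
```

With GZIPInputStream::new as the decoder, constructing the wrapper over an empty body does not throw; the failure moves to the first read(). Wrapping the decoder in skipIfEmpty(...) makes the same empty body read as a clean EOF. One caveat of this sketch: if the decoder fails, a later read() retries it on a partially consumed stream, which a real implementation would probably want to guard against.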