[
https://issues.apache.org/jira/browse/CAMEL-23320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Federico Mariani updated CAMEL-23320:
-------------------------------------
Description:
Since CAMEL-20097 introduced
this.binding.setUseReaderForPayload(!endpoint.isUseStreaming()), the Spring
Boot platform-http consumer reads request bodies through request.getReader() by
default (useStreaming defaults to false). This applies charset decoding to raw
bytes, which corrupts any request body containing byte sequences that are
invalid in the applied charset.
In practice, Spring Boot's CharacterEncodingFilter sets the request encoding to
UTF-8 when the request does not specify one. Binary content types
(application/octet-stream, application/pdf, image/*, protobuf, etc.) carry no
charset parameter, so UTF-8 is applied. Bytes that do not form valid UTF-8
sequences (common in binary data) are replaced with U+FFFD at decode time, and
this replacement is irreversible. Text payloads whose charset matches the
applied one (e.g., UTF-8 JSON) are not affected.
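
The irreversibility described above can be reproduced without Camel or Spring
Boot. The following is a minimal sketch using only the JDK; the class name
Utf8RoundTripDemo and the method roundTrip are hypothetical, and the decode
step merely stands in for what a Reader does to the request body.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTripDemo {

    // Decode the bytes as UTF-8 text, then encode the text back to bytes.
    // Malformed sequences become U+FFFD during decoding, so the original
    // bytes cannot be recovered afterwards.
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x89, 0xFF and 0xFE do not form valid UTF-8 sequences here.
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G', (byte) 0xFF, (byte) 0xFE};
        // Plain UTF-8 text survives the round trip unchanged.
        byte[] json = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(binary, roundTrip(binary))); // false
        System.out.println(Arrays.equals(json, roundTrip(json)));     // true
    }
}
```

Each invalid byte comes back as the three-byte encoding of U+FFFD, so the
payload also changes length, not just content.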
On Camel Spring Boot, this results in a StreamCacheException (CoyoteReader
stream closed) and a 500 error due to the async CompletableFuture.runAsync
execution model.
The original CAMEL-19177 implementation did NOT set useReaderForPayload. Since
the field defaults to false, it always used request.getInputStream(), and
binary data worked fine.
CAMEL-20097 attempted to map the useStreaming option to Spring Boot by toggling
useReaderForPayload, but this was semantically wrong: it conflated "streaming
vs buffered" with "Reader vs InputStream".
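
To illustrate why the two concerns are independent, here is a minimal sketch
(not the Camel implementation) in which both a buffered and a streaming read
operate on an InputStream and preserve the raw bytes. The class RawBodyDemo and
its methods are hypothetical, and the ByteArrayInputStream stands in for
request.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class RawBodyDemo {

    // Buffered: load the whole body into memory, still as raw bytes.
    static byte[] readBuffered(InputStream body) throws IOException {
        return body.readAllBytes();
    }

    // Streaming: forward the body in small chunks without holding it all
    // in memory. No charset is involved at any point.
    static long copyStreaming(InputStream body, OutputStream sink) throws IOException {
        byte[] chunk = new byte[4];
        long total = 0;
        int n;
        while ((n = body.read(chunk)) != -1) {
            sink.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G', (byte) 0xFF, (byte) 0xFE};

        byte[] buffered = readBuffered(new ByteArrayInputStream(binary));

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyStreaming(new ByteArrayInputStream(binary), sink);

        // Both modes return the payload byte for byte.
        System.out.println(Arrays.equals(binary, buffered));
        System.out.println(Arrays.equals(binary, sink.toByteArray()));
    }
}
```

Whether the body is buffered or streamed is a memory trade-off; whether it is
read as bytes or characters decides if binary data survives.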
was:
The Spring Boot platform-http consumer corrupts binary request bodies (PDFs,
images, protobuf, etc.) by default since CAMEL-20097 introduced the line:
_this.binding.setUseReaderForPayload(!endpoint.isUseStreaming());_
Since useStreaming defaults to false, useReaderForPayload is true, causing
_DefaultHttpBinding.parseBody()_ to return _request.getReader()_, a character
stream that applies charset encoding to the raw bytes.
On Camel Spring Boot, it results in a StreamCacheException (CoyoteReader stream
closed) and a 500 error due to the async CompletableFuture.runAsync execution
model.
The original CAMEL-19177 implementation did NOT set useReaderForPayload. Since
the field defaults to false, it always used request.getInputStream(), and
binary data worked fine.
CAMEL-20097 attempted to map the useStreaming option to Spring Boot by toggling
useReaderForPayload, but this was semantically wrong, it conflated "streaming
vs buffered" with "Reader vs InputStream".
> camel-platform-http-starter - Fix binary data corruption due to Spring Boot's
> default UTF-8 charset
> ---------------------------------------------------------------------------------------------------
>
> Key: CAMEL-23320
> URL: https://issues.apache.org/jira/browse/CAMEL-23320
> Project: Camel
> Issue Type: Improvement
> Components: camel-platform-http, camel-spring-boot
> Reporter: Federico Mariani
> Assignee: Federico Mariani
> Priority: Major
> Fix For: 4.14.7, 4.18.3, 4.20.0
>