[ 
https://issues.apache.org/jira/browse/CAMEL-23320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Federico Mariani updated CAMEL-23320:
-------------------------------------
    Description: 
Since CAMEL-20097 introduced 
this.binding.setUseReaderForPayload(!endpoint.isUseStreaming()), the Spring 
Boot platform-http consumer reads request bodies through request.getReader() by 
default (useStreaming defaults to false). This applies charset decoding to raw 
bytes, which corrupts any request body containing byte sequences that are 
invalid in the applied charset.

In practice, Spring Boot's CharacterEncodingFilter sets the request encoding to 
UTF-8 when the request does not specify one. Binary content types 
(application/octet-stream, application/pdf, image/*, protobuf, etc.) carry no 
charset parameter, so UTF-8 is applied. Bytes that do not form valid UTF-8 
sequences (common in binary data) are replaced with U+FFFD at decode time, and 
this replacement is irreversible. Text payloads whose charset matches the applied 
one (e.g., UTF-8 JSON) are not affected.
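
The corruption described above can be reproduced outside Camel with plain JDK 
classes. The following is a minimal sketch, not the component's code: the class 
and method names (ReaderCorruptionDemo, roundTripThroughReader, 
readThroughInputStream) are hypothetical, and an InputStreamReader over a byte 
array stands in for request.getReader() with UTF-8 applied.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReaderCorruptionDemo {

    // Hypothetical stand-in for the Reader path: decode the raw bytes as UTF-8,
    // then encode the decoded characters back to bytes.
    static byte[] roundTripThroughReader(byte[] body) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(body), StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString().getBytes(StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical stand-in for the InputStream path: no decoding takes place.
    static byte[] readThroughInputStream(byte[] body) {
        try (InputStream in = new ByteArrayInputStream(body)) {
            return in.readAllBytes(); // requires Java 9+
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Start of the PNG signature; 0x89 cannot begin a UTF-8 sequence.
        byte[] binary = {(byte) 0x89, 0x50, 0x4E, 0x47};
        // UTF-8 JSON text, including a non-ASCII character (e with acute accent).
        byte[] json = "{\"name\":\"caf\u00e9\"}".getBytes(StandardCharsets.UTF_8);

        byte[] viaReader = roundTripThroughReader(binary);
        // 0x89 became U+FFFD, which encodes as EF BF BD, so 4 bytes turn into 6.
        System.out.println("binary via Reader intact: " + Arrays.equals(binary, viaReader));
        System.out.println("binary via InputStream intact: "
                + Arrays.equals(binary, readThroughInputStream(binary)));
        System.out.println("JSON via Reader intact: "
                + Arrays.equals(json, roundTripThroughReader(json)));
    }
}
```

Once 0x89 has been replaced by U+FFFD, nothing in the decoded text records 
which byte was there, which is why the original payload cannot be recovered 
downstream.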

On Camel Spring Boot, this results in a StreamCacheException (CoyoteReader stream 
closed) and a 500 error due to the async CompletableFuture.runAsync execution 
model.

The original CAMEL-19177 implementation did NOT set useReaderForPayload. Since 
the field defaults to false, it always used request.getInputStream(), and 
binary data worked fine.

CAMEL-20097 attempted to map the useStreaming option to Spring Boot by toggling 
useReaderForPayload, but this was semantically wrong: it conflated "streaming 
vs buffered" with "Reader vs InputStream".
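
To illustrate why the two choices are independent, the sketch below consumes 
the same binary body from an InputStream in both a buffered and a streaming 
fashion. It is only an illustration under stated assumptions: the class and 
method names are hypothetical, a ByteArrayInputStream stands in for 
request.getInputStream(), and a CRC32 checksum stands in for whatever 
processing a route would do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

public class StreamingVsBufferedDemo {

    // Buffered: load the whole body into memory, then process it.
    static long checksumBuffered(InputStream in) {
        try {
            byte[] all = in.readAllBytes(); // requires Java 9+
            CRC32 crc = new CRC32();
            crc.update(all, 0, all.length);
            return crc.getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Streaming: process the body in small chunks as they are read.
    static long checksumStreaming(InputStream in) {
        try {
            CRC32 crc = new CRC32();
            byte[] chunk = new byte[8];
            int n;
            while ((n = in.read(chunk)) != -1) {
                crc.update(chunk, 0, n);
            }
            return crc.getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary binary payload containing bytes that are invalid as UTF-8.
        byte[] body = new byte[100];
        for (int i = 0; i < body.length; i++) {
            body[i] = (byte) (0x80 + i);
        }
        long buffered = checksumBuffered(new ByteArrayInputStream(body));
        long streaming = checksumStreaming(new ByteArrayInputStream(body));
        System.out.println("checksums match: " + (buffered == streaming));
    }
}
```

Both variants see the raw bytes unchanged, so how the body is consumed 
(buffered or streamed) does not by itself require a Reader.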

  was:
The Spring Boot platform-http consumer corrupts binary request bodies (PDFs, 
images, protobuf, etc.) by default since CAMEL-20097 introduced the line:

_this.binding.setUseReaderForPayload(!endpoint.isUseStreaming());_

Since useStreaming defaults to false, useReaderForPayload is true, causing 
_DefaultHttpBinding.parseBody()_ to return _request.getReader()_, a character 
stream that applies charset encoding to the raw bytes. 
On Camel Spring Boot, it results in a StreamCacheException (CoyoteReader stream 
closed) and a 500 error due to the async CompletableFuture.runAsync execution 
model.

The original CAMEL-19177 implementation did NOT set useReaderForPayload. Since 
the field defaults to false, it always used request.getInputStream(), and 
binary data worked fine.

CAMEL-20097 attempted to map the useStreaming option to Spring Boot by toggling 
useReaderForPayload, but this was semantically wrong, it conflated "streaming 
vs buffered" with "Reader vs InputStream".


> camel-platform-http-starter - Fix binary data corruption due to Spring Boot's 
> default UTF-8 charset
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-23320
>                 URL: https://issues.apache.org/jira/browse/CAMEL-23320
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-platform-http, camel-spring-boot
>            Reporter: Federico Mariani
>            Assignee: Federico Mariani
>            Priority: Major
>             Fix For: 4.14.7, 4.18.3, 4.20.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
