[ 
https://issues.apache.org/jira/browse/IO-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711546#comment-17711546
 ] 

Marcono1234 commented on IO-780:
--------------------------------

This is not an "underflow", at least not a {{CoderResult.UNDERFLOW}}. In the 
example snippet above, the {{StringReader}} is exhausted right after the 
unpaired high surrogate, so {{ReaderInputStream}} calls {{encode}} with 
{{endOfInput=true}}. That call returns a malformed-input {{CoderResult}} 
({{isMalformed()}} is true), which is erroneously ignored.
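
The difference can be reproduced without {{ReaderInputStream}}. The following 
is only a minimal sketch that calls the JDK's UTF-8 encoder directly; the class 
name {{LoneSurrogateDemo}} and the helper {{encodeLoneSurrogate}} are made up 
for illustration and are not part of Commons IO.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Commons IO.
class LoneSurrogateDemo {

    // Encodes a single unpaired high surrogate with a fresh UTF-8 encoder
    // and returns the CoderResult of that one encode call.
    static CoderResult encodeLoneSurrogate(boolean endOfInput) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap("\uD800");
        ByteBuffer out = ByteBuffer.allocate(16);
        return encoder.encode(in, out, endOfInput);
    }

    public static void main(String[] args) {
        // More input may follow, so the encoder asks for more characters.
        CoderResult midStream = encodeLoneSurrogate(false);
        // No more input can follow, so the lone surrogate is reported as an error.
        CoderResult atEnd = encodeLoneSurrogate(true);

        System.out.println("endOfInput=false -> underflow? " + midStream.isUnderflow());
        System.out.println("endOfInput=true  -> malformed? " + atEnd.isMalformed());
    }
}
```

With {{endOfInput=false}} the encoder reports an underflow because a low 
surrogate could still arrive; only the final call turns the leftover character 
into an error result, and that is the result which gets dropped.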

Also note that this is not limited to unpaired surrogates; a similar situation 
can probably also occur when {{encode}} returns {{CoderResult.OVERFLOW}}, which 
is likewise ignored for {{endOfInput=true}} (for a charset which does not write 
anything on {{flush}}).
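
For comparison, here is a rough sketch of an encode loop that inspects every 
{{CoderResult}} before it is overwritten, so neither an error nor an overflow 
on the final call gets lost. This is not the {{ReaderInputStream}} code and not 
a proposed patch; {{CheckedEncodeDemo}}, {{encodeFully}} and {{drain}} are 
hypothetical names, and the sketch assumes the whole input is available up 
front and that the encoder passed in is fresh.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Commons IO.
class CheckedEncodeDemo {

    // Encodes all of 'text' through a small output buffer, checking each result.
    static byte[] encodeFully(CharSequence text, CharsetEncoder encoder, int bufferSize)
            throws CharacterCodingException {
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        while (true) {
            CoderResult result = encoder.encode(in, out, true);
            if (result.isError()) {
                result.throwException(); // malformed or unmappable input
            }
            boolean wroteNothing = out.position() == 0;
            drain(out, sink);
            if (result.isUnderflow()) {
                break; // all input consumed
            }
            // OVERFLOW: the buffer was full, so call encode again.
            if (wroteNothing) {
                throw new IllegalArgumentException("bufferSize too small to make progress");
            }
        }

        while (true) {
            CoderResult result = encoder.flush(out);
            boolean wroteNothing = out.position() == 0;
            drain(out, sink);
            if (result.isUnderflow()) {
                break; // encoder has nothing left to write
            }
            if (wroteNothing) {
                throw new IllegalArgumentException("bufferSize too small to make progress");
            }
        }
        return sink.toByteArray();
    }

    // Moves the bytes written so far into the sink and empties the buffer.
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bytes = encodeFully("h\u00e9llo", StandardCharsets.UTF_8.newEncoder(), 4);
        System.out.println("Encoded " + bytes.length + " bytes");
    }
}
```

The 4-byte buffer in {{main}} is deliberately small so that {{encode}} returns 
{{CoderResult.OVERFLOW}} even though {{endOfInput}} is already {{true}}.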

> ReaderInputStream discards some encoding errors
> -----------------------------------------------
>
>                 Key: IO-780
>                 URL: https://issues.apache.org/jira/browse/IO-780
>             Project: Commons IO
>          Issue Type: Bug
>          Components: Streams/Writers
>    Affects Versions: 2.11.0
>            Reporter: Marcono1234
>            Priority: Major
>
> h3. Description
> {{org.apache.commons.io.input.ReaderInputStream}} discards encoder errors in 
> some cases instead of properly rethrowing them.
> The underlying issue is that {{lastCoderResult}} is re-assigned before it has 
> been checked for errors and overflow ([link to 
> code|https://github.com/apache/commons-io/blob/b9e4f5e6e718ec8e4156e31bef733874700d7cbf/src/main/java/org/apache/commons/io/input/ReaderInputStream.java#L267]).
> This was originally mentioned in pull request 
> [#293|https://github.com/apache/commons-io/pull/293].
> h3. Example
> The {{read()}} call in the following example should throw an exception, but 
> currently it erroneously returns -1.
> {code}
> // Encoder which throws on malformed or unmappable input
> CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
> ReaderInputStream in = new ReaderInputStream(new StringReader("\uD800"), encoder);
> // BUG: This should have thrown an exception because the input is malformed
> System.out.println("Read: " + in.read());
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
