[ https://issues.apache.org/jira/browse/IO-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18053292#comment-18053292 ]
Gary D. Gregory commented on IO-884:
------------------------------------
Good catch [~eamonnmcmanus]
Thank you for your report.
In git master now:
* I added the test {{IOUtilsTest.testContentEquals_Reader_Reader_unevenReads()}}, and
* I reverted the unreleased code to the previous implementation.
Please verify git master or a snapshot build from
[https://repository.apache.org/content/repositories/snapshots/]
Thank you!
> IOUtils.contentEquals(Reader,Reader) makes an invalid assumption about read
> return value
> ----------------------------------------------------------------------------------------
>
> Key: IO-884
> URL: https://issues.apache.org/jira/browse/IO-884
> Project: Commons IO
> Issue Type: Bug
> Components: Utilities
> Reporter: Éamonn McManus
> Priority: Minor
>
> My colleague Chris Povirk noticed this when we were importing the latest
> Commons IO code into Google's source repo. [This
> commit|https://github.com/apache/commons-io/commit/65f8e5d55d02c2503e122a1d35601eba3c315e73]
> improves the performance of IOUtils.contentEquals(Reader, Reader), but at
> the expense of an incorrect assumption about what Reader.read(char[],int,int)
> can return.
> {code:java}
> read1 = input1.read(array1, 0, DEFAULT_BUFFER_SIZE);
> read2 = input2.read(array2, 0, DEFAULT_BUFFER_SIZE);
> // If both read EOF here, they're equal.
> if (read1 == EOF && read2 == EOF) {
>     return true;
> }
> // If only one read EOF or different amounts, they're not equal.
> if (read1 != read2) {
>     return false;
> }
> {code}
> In the last `if`, it's correct to return false if one of the values is EOF.
> Otherwise, both values are positive (by the contract of Reader.read) but they
> are not required to be equal even if the contents being read are equal. A
> Reader can legitimately return any positive value up to DEFAULT_BUFFER_SIZE,
> and two Readers from different sources might not return the same value. (For
> example, a Reader coming from a network connection will typically return data
> that has already arrived at that connection, but if it has some data then it
> will not wait for further data to fill up the full array.)
> I think it may well make sense to have special logic for the `read1 == read2`
> case, since that will be common, but there needs to be a fallback for when
> that is false.
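
For illustration only, here is a minimal sketch of one way to make the comparison independent of how many chars each read call returns. It is not the code in git master and not the reverted implementation. The names ReaderCompare, readFully, chunked and BUFFER_SIZE are hypothetical. The idea is to keep reading until each buffer is full or EOF is reached, so that differing counts can only mean differing lengths. It assumes non-null Readers that follow the Reader.read contract (at least one char or EOF for a non-zero length), and it omits the fast path for the common `read1 == read2` case.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReaderCompare {

    // Hypothetical buffer size; stands in for DEFAULT_BUFFER_SIZE.
    private static final int BUFFER_SIZE = 8192;

    /** Reads until buf is full or EOF is hit; returns the number of chars read (0 at EOF). */
    static int readFully(Reader in, char[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) { // EOF
                break;
            }
            total += n;
        }
        return total;
    }

    /** Compares two non-null Readers without assuming equal read() return values. */
    static boolean contentEquals(Reader r1, Reader r2) throws IOException {
        if (r1 == r2) {
            return true;
        }
        char[] a1 = new char[BUFFER_SIZE];
        char[] a2 = new char[BUFFER_SIZE];
        while (true) {
            int n1 = readFully(r1, a1);
            int n2 = readFully(r2, a2);
            // Both buffers were filled as far as possible, so a different
            // count here means one input ended before the other.
            if (n1 != n2) {
                return false;
            }
            for (int i = 0; i < n1; i++) {
                if (a1[i] != a2[i]) {
                    return false;
                }
            }
            // A short (or empty) fill means both inputs reached EOF.
            if (n1 < BUFFER_SIZE) {
                return true;
            }
        }
    }

    /** Hypothetical helper: a Reader over s that returns at most max (> 0) chars per read. */
    static Reader chunked(String s, int max) {
        return new FilterReader(new StringReader(s)) {
            @Override
            public int read(char[] cbuf, int off, int len) throws IOException {
                return super.read(cbuf, off, Math.min(len, max));
            }
        };
    }
}
```

With this sketch, ReaderCompare.contentEquals(ReaderCompare.chunked("hello world", 3), ReaderCompare.chunked("hello world", 5)) returns true even though the two Readers never return the same count from a single read call.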