Éamonn McManus created IO-884:
---------------------------------
Summary: IOUtils.contentEquals(Reader,Reader) makes an invalid
assumption about read return value
Key: IO-884
URL: https://issues.apache.org/jira/browse/IO-884
Project: Commons IO
Issue Type: Bug
Components: Utilities
Affects Versions: 2.21.1
Reporter: Éamonn McManus
My colleague Chris Povirk noticed this when we were importing the latest
Commons IO code into Google's source repo. [This
commit|https://github.com/apache/commons-io/commit/65f8e5d55d02c2503e122a1d35601eba3c315e73]
improves the performance of IOUtils.contentEquals(Reader, Reader), but at the
expense of an incorrect assumption about what Reader.read(char[],int,int) can
return.
{code:java}
read1 = input1.read(array1, 0, DEFAULT_BUFFER_SIZE);
read2 = input2.read(array2, 0, DEFAULT_BUFFER_SIZE);
// If both read EOF here, they're equal.
if (read1 == EOF && read2 == EOF) {
    return true;
}
// If only one read EOF or different amounts, they're not equal.
if (read1 != read2) {
    return false;
}
{code}
In the last `if`, it's correct to return false if exactly one of the values is
EOF. Otherwise, both values are positive (by the contract of Reader.read, given
a nonzero requested length), but they are not required to be equal even if the
contents being read are equal. A Reader can legitimately return any positive
value up to DEFAULT_BUFFER_SIZE, and two Readers from different sources might
not return the same value. (For example, a Reader coming from a network
connection will typically return whatever data has already arrived at that
connection; if it has some data, it will not wait for further data to fill the
array.)
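To make the short-read behaviour concrete, here is a minimal sketch. The class name ShortReadDemo, the ChunkedReader wrapper and the local BUFFER_SIZE constant are all hypothetical stand-ins for illustration and are not part of Commons IO. ChunkedReader caps every read at a few chars, roughly the way a slow network source might, so two Readers over identical content report different counts on their first read.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ShortReadDemo {
    // Stand-in for the buffer size used by the code under discussion (assumption).
    static final int BUFFER_SIZE = 8192;

    /** Hypothetical wrapper: never returns more than maxChunk chars per read. */
    static final class ChunkedReader extends FilterReader {
        private final int maxChunk;

        ChunkedReader(Reader in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            // Ask the underlying reader for at most maxChunk chars.
            return super.read(buf, off, Math.min(len, maxChunk));
        }
    }

    /** Returns whatever a single bulk read reports for this reader. */
    static int firstReadCount(Reader reader) throws IOException {
        return reader.read(new char[BUFFER_SIZE], 0, BUFFER_SIZE);
    }

    public static void main(String[] args) throws IOException {
        String content = "identical content";
        int full = firstReadCount(new StringReader(content));
        int partial = firstReadCount(new ChunkedReader(new StringReader(content), 5));
        // Same content, different counts from the first read.
        System.out.println(full + " vs " + partial); // prints "17 vs 5"
    }
}
```

Both calls honour the Reader.read contract, yet a comparison that returns false whenever the two counts differ would report these equal contents as unequal.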
I think it may well make sense to have special logic for the `read1 == read2`
case, since that will be common, but there needs to be a fallback for when that
is false.
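One possible shape for such a fallback, offered as a sketch and not as the actual or proposed Commons IO fix: keep reading until each buffer is full or its Reader reaches EOF, and only then compare. After that, differing counts really do mean differing lengths. The names ContentEqualsSketch, fill and limited are hypothetical, and this version skips the `read1 == read2` fast path entirely for simplicity.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ContentEqualsSketch {
    private static final int EOF = -1;
    private static final int BUFFER_SIZE = 8192; // assumed buffer size

    /** Reads until buf is full or EOF is reached; returns the chars stored (0 at EOF). */
    static int fill(Reader in, char[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == EOF) {
                break;
            }
            total += n;
        }
        return total;
    }

    static boolean contentEquals(Reader a, Reader b) throws IOException {
        char[] bufA = new char[BUFFER_SIZE];
        char[] bufB = new char[BUFFER_SIZE];
        while (true) {
            int nA = fill(a, bufA);
            int nB = fill(b, bufB);
            if (nA != nB) {
                return false; // one side hit EOF earlier, so lengths differ
            }
            if (nA == 0) {
                return true; // both at EOF with no mismatch seen
            }
            for (int i = 0; i < nA; i++) {
                if (bufA[i] != bufB[i]) {
                    return false;
                }
            }
        }
    }

    /** Hypothetical helper: a reader that returns at most max chars per read. */
    static Reader limited(Reader in, int max) {
        return new FilterReader(in) {
            @Override
            public int read(char[] buf, int off, int len) throws IOException {
                return super.read(buf, off, Math.min(len, max));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        String s = "abcdefghij";
        // Equal contents, delivered in different chunk sizes.
        System.out.println(contentEquals(limited(new StringReader(s), 3), new StringReader(s)));
    }
}
```

The trade-off is that fill blocks until a full buffer or EOF, which costs extra read calls when both Readers happen to return matching counts. A production version might compare directly when the counts match and fall back to this kind of loop only when they differ.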