Adrien Grand created LUCENE-5583:
------------------------------------
Summary: Should BufferedChecksumIndexInput have its own buffer?
Key: LUCENE-5583
URL: https://issues.apache.org/jira/browse/LUCENE-5583
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 4.8
Reporter: Adrien Grand
I was playing with on-the-fly checksum verification and stumbled upon an
issue with {{BufferedChecksumIndexInput}}.
I have some code that skips over a {{DataInput}} by reading bytes into a
throwaway buffer (effectively /dev/null), e.g.
{code}
private static final byte[] SKIP_BUFFER = new byte[1024];

private static void skipBytes(DataInput in, long numBytes) throws IOException {
  assert numBytes >= 0;
  for (long skipped = 0; skipped < numBytes; ) {
    final int toRead = (int) Math.min(numBytes - skipped, SKIP_BUFFER.length);
    in.readBytes(SKIP_BUFFER, 0, toRead);
    skipped += toRead;
  }
}
{code}
It is fine to read into this static buffer, even from multiple threads, since
the content that is read doesn't matter here. However, it breaks with
{{BufferedChecksumIndexInput}} because of the way that it updates the checksum:
{code}
@Override
public void readBytes(byte[] b, int offset, int len) throws IOException {
  main.readBytes(b, offset, len);
  digest.update(b, offset, len);
}
{code}
If you are unlucky and a concurrent call to {{skipBytes}} starts overwriting
the content of {{b}} after {{main.readBytes}} has returned but before
{{digest.update(b, offset, len)}} has finished, the digest is computed over
bytes that were not read from this input, and your checksum will be wrong.
I think we should make {{BufferedChecksumIndexInput}} read into a private
buffer first instead of relying on the user-provided buffer.
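To make the proposal concrete, here is a minimal sketch of what reading through a private buffer could look like. It is an illustration only, not Lucene's code: {{PrivateBufferChecksumInput}}, its {{scratch}} field and the 256-byte buffer size are hypothetical, and a plain {{java.io.DataInputStream}} with {{java.util.zip.CRC32}} stands in for the wrapped input and the digest so that the example compiles on its own.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical stand-in for a checksumming input; not Lucene's actual class. */
final class PrivateBufferChecksumInput {
  private final DataInputStream main;
  private final Checksum digest = new CRC32();
  // Private scratch buffer (size is an arbitrary assumption). Only this
  // instance writes to it, so its content cannot change between the read
  // and the checksum update, whatever happens to the caller's buffer.
  private final byte[] scratch = new byte[256];

  PrivateBufferChecksumInput(DataInputStream main) {
    this.main = main;
  }

  void readBytes(byte[] b, int offset, int len) throws IOException {
    while (len > 0) {
      final int chunk = Math.min(len, scratch.length);
      main.readFully(scratch, 0, chunk);               // read into the private buffer
      digest.update(scratch, 0, chunk);                // checksum the private copy
      System.arraycopy(scratch, 0, b, offset, chunk);  // then hand bytes to the caller
      offset += chunk;
      len -= chunk;
    }
  }

  long getChecksum() {
    return digest.getValue();
  }
}
```

The trade-off is one extra copy per read. In exchange, the checksum no longer depends on what other threads do to the caller's buffer. Like the original class, a single instance of this sketch is still not meant to be shared between threads.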