[ https://issues.apache.org/jira/browse/LUCENE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964082#comment-13964082 ]
Simon Willnauer commented on LUCENE-5583: ----------------------------------------- I personally think you shouldn't pass this shared buffer to readBytes() it can break all delegates. I wonder if we want to add a skipBytes method to DataInput that we can impl. efficienly on the lower levels and that just calls readByte() in a loop as a default impl? > Should BufferedChecksumIndexInput have its own buffer? > ------------------------------------------------------ > > Key: LUCENE-5583 > URL: https://issues.apache.org/jira/browse/LUCENE-5583 > Project: Lucene - Core > Issue Type: Bug > Affects Versions: 4.8 > Reporter: Adrien Grand > > I was playing with on-the-fly checksum verification and this made me stumble > upon an issue with {{BufferedChecksumIndexInput}}. > I have some code that skips over a {{DataInput}} by reading bytes into > /dev/null, eg. > {code} > private static final byte[] SKIP_BUFFER = new byte[1024]; > private static void skipBytes(DataInput in, long numBytes) throws > IOException { > assert numBytes >= 0; > for (long skipped = 0; skipped < numBytes; ) { > final int toRead = (int) Math.min(numBytes - skipped, > SKIP_BUFFER.length); > in.readBytes(SKIP_BUFFER, 0, toRead); > skipped += toRead; > } > } > {code} > It is fine to read into this static buffer, even from multiple threads, since > the content that is read doesn't matter here. However, it breaks with > {{BufferedChecksumIndexInput}} because of the way that it updates the > checksum: > {code} > @Override > public void readBytes(byte[] b, int offset, int len) > throws IOException { > main.readBytes(b, offset, len); > digest.update(b, offset, len); > } > {code} > If you are unlucky enough so that a concurrent call to {{skipBytes}} started > modifying the content of {{b}} before the call to {{digest.update(b, offset, > len)}} finished, then your checksum will be wrong. > I think we should make {{BufferedChecksumIndexInput}} read into a private > buffer first instead of relying on the user-provided buffer. -- This message was sent by Atlassian JIRA (v6.2#6252) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org