[ https://issues.apache.org/jira/browse/LUCENE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964343#comment-13964343 ]
Michael McCandless commented on LUCENE-5583: -------------------------------------------- +1, patch looks nice. Hmm, should ByteArrayDataInput.skipBytes should be marked @Override (and I guess change its param to long)? > Add DataInput.skipBytes > ----------------------- > > Key: LUCENE-5583 > URL: https://issues.apache.org/jira/browse/LUCENE-5583 > Project: Lucene - Core > Issue Type: Bug > Affects Versions: 4.8 > Reporter: Adrien Grand > Priority: Blocker > Fix For: 4.8, 5.0 > > Attachments: LUCENE-5583.patch, LUCENE-5583.patch > > > I was playing with on-the-fly checksum verification and this made me stumble > upon an issue with {{BufferedChecksumIndexInput}}. > I have some code that skips over a {{DataInput}} by reading bytes into > /dev/null, eg. > {code} > private static final byte[] SKIP_BUFFER = new byte[1024]; > private static void skipBytes(DataInput in, long numBytes) throws > IOException { > assert numBytes >= 0; > for (long skipped = 0; skipped < numBytes; ) { > final int toRead = (int) Math.min(numBytes - skipped, > SKIP_BUFFER.length); > in.readBytes(SKIP_BUFFER, 0, toRead); > skipped += toRead; > } > } > {code} > It is fine to read into this static buffer, even from multiple threads, since > the content that is read doesn't matter here. However, it breaks with > {{BufferedChecksumIndexInput}} because of the way that it updates the > checksum: > {code} > @Override > public void readBytes(byte[] b, int offset, int len) > throws IOException { > main.readBytes(b, offset, len); > digest.update(b, offset, len); > } > {code} > If you are unlucky enough so that a concurrent call to {{skipBytes}} started > modifying the content of {{b}} before the call to {{digest.update(b, offset, > len)}} finished, then your checksum will be wrong. > I think we should make {{BufferedChecksumIndexInput}} read into a private > buffer first instead of relying on the user-provided buffer. -- This message was sent by Atlassian JIRA (v6.2#6252) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org