[ 
https://issues.apache.org/jira/browse/LUCENE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965080#comment-13965080
 ] 

ASF subversion and git services commented on LUCENE-5583:
---------------------------------------------------------

Commit 1586231 from jpou...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1586231 ]

LUCENE-5583: Add DataInput.skipBytes, ChecksumIndexInput can now seek forward.

> Add DataInput.skipBytes
> -----------------------
>
>                 Key: LUCENE-5583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5583
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.8
>            Reporter: Adrien Grand
>            Priority: Blocker
>             Fix For: 4.8, 5.0
>
>         Attachments: LUCENE-5583.patch, LUCENE-5583.patch
>
>
> I was playing with on-the-fly checksum verification and this made me stumble 
> upon an issue with {{BufferedChecksumIndexInput}}.
> I have some code that skips over a {{DataInput}} by reading bytes into 
> /dev/null, eg.
> {code}
>   private static final byte[] SKIP_BUFFER = new byte[1024];
>   private static void skipBytes(DataInput in, long numBytes) throws 
> IOException {
>     assert numBytes >= 0;
>     for (long skipped = 0; skipped < numBytes; ) {
>       final int toRead = (int) Math.min(numBytes - skipped, 
> SKIP_BUFFER.length);
>       in.readBytes(SKIP_BUFFER, 0, toRead);
>       skipped += toRead;
>     }
>   }
> {code}
> It is fine to read into this static buffer, even from multiple threads, since 
> the content that is read doesn't matter here. However, it breaks with 
> {{BufferedChecksumIndexInput}} because of the way that it updates the 
> checksum:
> {code}
>   @Override
>   public void readBytes(byte[] b, int offset, int len)
>     throws IOException {
>     main.readBytes(b, offset, len);
>     digest.update(b, offset, len);
>   }
> {code}
> If you are unlucky enough so that a concurrent call to {{skipBytes}} started 
> modifying the content of {{b}} before the call to {{digest.update(b, offset, 
> len)}} finished, then your checksum will be wrong.
> I think we should make {{BufferedChecksumIndexInput}} read into a private 
> buffer first instead of relying on the user-provided buffer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to