[ https://issues.apache.org/jira/browse/LUCENE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794675#action_12794675 ]
Uwe Schindler edited comment on LUCENE-2084 at 12/27/09 10:01 AM: ------------------------------------------------------------------ bq. Therefore, limit - offset is a correct way to calculate length of the entire buffer, which is what this class has always used. This is exactly wrong: The lengthz of the "valid" area inside the buffer is always limit()-position() [see javadocs of wrap()]. If arrayOffset()!=0, position() may still be 0 (because the arrayoffset is only used when wrapping byte arrays. The arrayOffset() defines the postion in the underlying array that defines the buffer's position()==0 (see javadocs). This error is mostly no problem for dynamical allocated array buffers (as most users use), because arrayOffset() is always 0 (I have never seen a buffer with offset != 0) A problem also occurs if you wrap an array with wrap([], offset, length). the offset there is *not* the arrayOffset(), it is the position()! And length is after wrapping remaining(). The capacity() is the array length. Therefore it is really a bug. You deprecated it, that's good, but it should be fixed, it does not work correct. For verifying, read the source code of ByteBuffer and HeapByteBuffer from src.zip in your jdk distrib and of course the javadocs. So just to repeat, the correct code would be: - start postion in array=arrayOffset()+position() - length=remaining() This is how the IdentityEncode for Payloads works. So deprecate *and* fix it. If we really want to work with buffers, the correct way to implement this would be that the whole class implements CharsetEncoder. was (Author: thetaphi): bq. Therefore, limit - offset is a correct way to calculate length of the entire buffer, which is what this class has always used. This is exactly wrong: The lengthz of the "valid" area inside the buffer is always limit(). If arrayOffset()!=0, position() may still be 0 (because the arrayoffset is only used when wrapping byte arrays. The arrayoffset defines the postion in the underlying array that defines the buffer's position 0 (see javadocs). This error is mostly no problem for dynamical allocated array buffers (as most users use), because arrayOffset() is always 0 (i have never seen a buffer with offset != 0) A problem also occurs if you wrap an array with wrap([], offset, length). the offset there is *not* the arrayOffset(), it is the position()! And length is after wrapping remaining(). The capacity() is the array length. Therefore it is really a bug. You deprecated it, thats ood, but it should be fixed, it does not work correct. For verifying, read the source code of ByteBuffer and HeapByteBuffer from src.zip in your jdk distrib and of course the javadocs.. > remove Byte/CharBuffer wrapping for collation key generation > ------------------------------------------------------------ > > Key: LUCENE-2084 > URL: https://issues.apache.org/jira/browse/LUCENE-2084 > Project: Lucene - Java > Issue Type: Improvement > Components: contrib/* > Reporter: Robert Muir > Assignee: Robert Muir > Priority: Minor > Fix For: 3.1 > > Attachments: collation.benchmark.tar.bz2, LUCENE-2084.patch, > LUCENE-2084.patch, LUCENE-2084.patch, TopTFWikipediaWords.tar.bz2 > > > We can remove the overhead of ByteBuffer and CharBuffer wrapping in > CollationKeyFilter and ICUCollationKeyFilter. > this patch moves the logic in IndexableBinaryStringTools into char[],int,int > and byte[],int,int based methods, with the previous Byte/CharBuffer methods > delegating to these. > Previously, the Byte/CharBuffer methods required a backing array anyway. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org