[
https://issues.apache.org/jira/browse/LUCENE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555882#comment-17555882
]
tangdh commented on LUCENE-10619:
---------------------------------
Hi [~jpountz], I have raised a [PR|https://github.com/apache/lucene/pull/966] :D
> Optimize the writeBytes in TermsHashPerField
> --------------------------------------------
>
> Key: LUCENE-10619
> URL: https://issues.apache.org/jira/browse/LUCENE-10619
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 9.2
> Reporter: tangdh
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Because we don't know the length of a slice, writeBytes always writes bytes
> one at a time instead of writing a block of bytes.
> Maybe we could return both the offset and the length in ByteBlockPool#allocSlice?
> 1. BYTE_BLOCK_SIZE is 32768, so the offset fits in 15 bits.
> 2. The slice size is at most 200, so it fits in 8 bits.
> So we could pack them together into one int: offset | length
> This function is used in only two places, so the cost of changing it
> is relatively small.
> Once allocSlice returns the offset and length of the new slice, we could
> change writeBytes as below:
> {code:java}
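> // Write a block of bytes each time instead of one byte at a time.
> while (remaining > 0) {
>   // Packed result: the low 8 bits hold the slice length, the rest the offset.
>   int offsetAndLength = allocSlice(bytes, offset);
>   length = Math.min(remaining, (offsetAndLength & 0xff) - 1);
>   offset = offsetAndLength >> 8;
>   System.arraycopy(src, srcPos, bytePool.buffer, offset, length);
>   // Advance the source position so the next pass copies new bytes.
>   srcPos += length;
>   remaining -= length;
>   offset += (length + 1);
> }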
> {code}
> If this could work, I'd like to raise a PR.
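
The packing idea in the quoted description can be sketched in isolation. This is only an illustration, not the code from the PR: the class SlicePacking and its methods pack, offsetOf, lengthOf and copyInBlocks are hypothetical names, and the slices are supplied by the caller rather than allocated by ByteBlockPool. It assumes, as the description states, that the offset fits in 15 bits and the slice length in 8 bits.

```java
final class SlicePacking {
    // Assumptions taken from the issue text: BYTE_BLOCK_SIZE is 32768, so an
    // offset fits in 15 bits; a slice is at most 200 bytes, so its length fits in 8 bits.
    static final int LENGTH_BITS = 8;
    static final int LENGTH_MASK = 0xff;
    static final int MAX_OFFSET = (1 << 15) - 1;

    /** Packs offset and length into one int laid out as offset | length. */
    static int pack(int offset, int length) {
        if (offset < 0 || offset > MAX_OFFSET) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        if (length < 1 || length > LENGTH_MASK) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        return (offset << LENGTH_BITS) | length;
    }

    /** Extracts the offset stored above the low 8 bits. */
    static int offsetOf(int packed) {
        return packed >>> LENGTH_BITS;
    }

    /** Extracts the length stored in the low 8 bits. */
    static int lengthOf(int packed) {
        return packed & LENGTH_MASK;
    }

    /**
     * Copies src into dest one slice at a time. Each entry of packedSlices stands
     * in for a hypothetical allocSlice result. The last byte of every slice is
     * left untouched, mirroring the "- 1" in the loop from the description.
     * Returns the number of bytes copied.
     */
    static int copyInBlocks(byte[] src, byte[] dest, int[] packedSlices) {
        int srcPos = 0;
        for (int packed : packedSlices) {
            if (srcPos == src.length) {
                break; // everything has been written
            }
            int length = Math.min(src.length - srcPos, lengthOf(packed) - 1);
            System.arraycopy(src, srcPos, dest, offsetOf(packed), length);
            srcPos += length;
        }
        return srcPos;
    }

    public static void main(String[] args) {
        int packed = pack(32767, 200);
        // Prints "32767 200": both fields survive the round trip.
        System.out.println(offsetOf(packed) + " " + lengthOf(packed));
    }
}
```

Returning a single int avoids allocating a result object on a hot indexing path, at the cost of hard-coding the 8-bit limit on the slice length.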