[
https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990611#comment-12990611
]
Renaud Delbru commented on LUCENE-2886:
---------------------------------------
{quote}
The BulkVInt codec is VInt implemented as a FixedIntBlock codec.
{quote}
Yes, I saw the code; it is similar to the VInt implementation we used in our
experiments.
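
For reference, the variable-byte layout itself (seven payload bits per byte, low-order group first, high bit set when more bytes follow) can be sketched over a fixed-size block of ints as below. This is only an illustration of the encoding, not the BulkVInt codec from the patch: the class and method names are hypothetical and no Lucene API is used.

```java
// Hypothetical sketch: VInt-style encoding applied to a fixed-size block of ints.
final class VIntBlockSketch {

    // Encodes every value of the block into 'out' and returns the number of bytes written.
    // 'out' must hold at least 5 bytes per value (worst case for a 32-bit int).
    static int encodeBlock(int[] values, byte[] out) {
        int pos = 0;
        for (int v : values) {
            // Emit 7 bits at a time, setting the high bit while more bits remain.
            while ((v & ~0x7F) != 0) {
                out[pos++] = (byte) ((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out[pos++] = (byte) v; // last byte: high bit clear
        }
        return pos;
    }

    // Decodes exactly values.length integers from the start of 'in'.
    static void decodeBlock(byte[] in, int[] values) {
        int pos = 0;
        for (int i = 0; i < values.length; i++) {
            byte b = in[pos++];
            int v = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = in[pos++];
                v |= (b & 0x7F) << shift;
            }
            values[i] = v;
        }
    }
}
```

Because the block length is fixed, the decoder needs no per-value end marker beyond the continuation bit; it simply stops after `values.length` integers.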
{quote}
previously various codecs looked much faster than Vint but a lot of the reason
for this is due to the way Vint was implemented...
{quote}
This is odd, because we observed the contrary (on the lucene-1458 branch): the
standard codec was an order of magnitude faster than any other codec. We
discovered that this was due to the IntBlock interface implementation, which:
- copied the buffer byte array twice (once from the disk to the buffer, then
again from the buffer to the IntBlock codec);
- had to perform extra checks on each of the buffers (the IntBlock buffer and
the IndexInput buffer).
But this might have been improved since then. Michael told me he worked on a
new version of the IntBlock interface that performs better.
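
To make the double copy concrete, here is a minimal sketch of the two read paths in plain Java. It is not the IntBlock code from the lucene-1458 branch: the class name, method names, and the use of a byte array standing in for the disk are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch contrasting a two-copy and a one-copy block read.
final class BlockCopySketch {

    // Two copies: "disk" -> input buffer, then input buffer -> codec's own block buffer.
    static int[] decodeWithTwoCopies(byte[] disk, int numInts) {
        byte[] inputBuffer = new byte[numInts * 4];
        System.arraycopy(disk, 0, inputBuffer, 0, inputBuffer.length);      // copy 1
        byte[] blockBuffer = new byte[numInts * 4];
        System.arraycopy(inputBuffer, 0, blockBuffer, 0, blockBuffer.length); // copy 2
        int[] out = new int[numInts];
        ByteBuffer.wrap(blockBuffer).asIntBuffer().get(out);
        return out;
    }

    // One copy: the codec decodes directly from the input buffer.
    static int[] decodeWithOneCopy(byte[] disk, int numInts) {
        byte[] inputBuffer = new byte[numInts * 4];
        System.arraycopy(disk, 0, inputBuffer, 0, inputBuffer.length);      // only copy
        int[] out = new int[numInts];
        ByteBuffer.wrap(inputBuffer).asIntBuffer().get(out);
        return out;
    }
}
```

Both paths return the same integers; the second one skips a full pass over the block's bytes and one set of buffer bounds checks.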
{quote}
So, if we 'group' the long values so we are e.g. reading say N long values
at once in a single internal 'block', I think we might get more efficiency
via the I/O system, and also less overhead from the bulkpostings apis.
{quote}
If I understand correctly, this is similar to increasing the boundaries of the
variable block size. Indeed, performing a block read for each simple64 long
word (simple64 frame) incurs non-negligible overhead, so it might be better to
read more than one word per block read.
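
The grouping idea can be sketched as follows, assuming a plain `DataInputStream` as the source rather than Lucene's IndexInput. The class name `GroupedWordReader` and the parameter `wordsPerBlock` are hypothetical, and the sketch only fetches the 64-bit words; decoding each simple64 frame is left out.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical sketch: fetch N 64-bit words per block read instead of one.
final class GroupedWordReader {
    private final DataInputStream in;
    private final byte[] scratch; // raw bytes of one block
    private final long[] words;   // the same block viewed as 64-bit words (reused per call)

    GroupedWordReader(DataInputStream in, int wordsPerBlock) {
        this.in = in;
        this.scratch = new byte[wordsPerBlock * Long.BYTES];
        this.words = new long[wordsPerBlock];
    }

    // Reads one block with a single bulk call; the returned array is overwritten
    // by the next call, so callers must consume it before reading again.
    long[] readBlock() throws IOException {
        in.readFully(scratch);
        ByteBuffer.wrap(scratch).asLongBuffer().get(words);
        return words;
    }
}
```

With `wordsPerBlock` set to 1 this degenerates to one read call per word, which is the case described above as carrying the overhead.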
> Adaptive Frame Of Reference
> ----------------------------
>
> Key: LUCENE-2886
> URL: https://issues.apache.org/jira/browse/LUCENE-2886
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Codecs
> Reporter: Renaud Delbru
> Fix For: 4.0
>
> Attachments: LUCENE-2886_simple64.patch,
> LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on
> the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be
> done, as this implementation is working on the old lucene-1458 branch.
> I will attach a tarball containing a running version (with tests) of the AFOR
> implementation, as well as the implementations of PFOR and of Simple64
> (a Simple-family codec working on 64-bit words) that were used in the
> experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf