[ 
https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990611#comment-12990611
 ] 

Renaud Delbru commented on LUCENE-2886:
---------------------------------------

{quote}
The BulkVInt codec is VInt implemented as a FixedIntBlock codec.
{quote}

Yes, I saw the code; it is similar to the VInt implementation we used in our 
experiments.

{quote}
previously various codecs looked much faster than Vint but a lot of the reason
for this is due to the way Vint was implemented...
{quote}

This is odd, because we observed the opposite (on the lucene-1458 branch). The 
standard codec was an order of magnitude faster than any other codec. We 
discovered that this was due to the IntBlock interface implementation, which:
- copied the buffer byte array twice (once from the disk to the 
buffer, then again from the buffer to the IntBlock codec);
- had to do more work checking each of the buffers (the IntBlock buffer and 
the IndexInput buffer).
But this might have been improved since then. Michael told me he worked on a 
new version of the IntBlock interface that performs better.
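
To make the double-copy point concrete, here is a rough sketch in plain Java. It is not the Lucene IntBlock or IndexInput code: the class and method names are hypothetical, and plain byte arrays stand in for the disk, the input buffer and the codec buffer. It only shows that both paths decode the same integers while one of them moves every byte an extra time.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; none of these names come from Lucene.
public class BlockCopySketch {

    /** Two-copy path: source -> input buffer -> block buffer -> ints. */
    public static int[] decodeWithTwoCopies(byte[] source, int numInts) {
        int numBytes = numInts * Integer.BYTES;
        byte[] inputBuffer = new byte[numBytes];
        System.arraycopy(source, 0, inputBuffer, 0, numBytes);      // copy 1
        byte[] blockBuffer = new byte[numBytes];
        System.arraycopy(inputBuffer, 0, blockBuffer, 0, numBytes); // copy 2
        int[] out = new int[numInts];
        ByteBuffer.wrap(blockBuffer).asIntBuffer().get(out);
        return out;
    }

    /** One-copy path: source -> input buffer, then decode straight from it. */
    public static int[] decodeWithOneCopy(byte[] source, int numInts) {
        int numBytes = numInts * Integer.BYTES;
        byte[] inputBuffer = new byte[numBytes];
        System.arraycopy(source, 0, inputBuffer, 0, numBytes);      // only copy
        int[] out = new int[numInts];
        ByteBuffer.wrap(inputBuffer).asIntBuffer().get(out);
        return out;
    }
}
```

The second path also has a single buffer whose bounds need checking, which corresponds to the second point in the list above.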

{quote}
So, if we 'group' the long values so we are e.g. reading say N long values
at once in a single internal 'block', I think we might get more efficiency
via the I/O system, and also less overhead from the bulkpostings apis.
{quote}

If I understand correctly, this is similar to increasing the boundaries of 
the variable block size. Indeed, performing a block read for each simple64 
long word (simple64 frame) incurs non-negligible overhead, and it might be 
better to read more than one word per block read.
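
A minimal sketch of that grouping idea, assuming a plain java.io stream rather than the bulkpostings API (the class name, its methods and the longsPerBlock parameter are all hypothetical): each block read fetches up to N long words at once, so the per-read overhead is paid once per N words instead of once per word.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not a Lucene codec class.
public class GroupedLongReader {
    private final InputStream in;
    private final byte[] raw;    // bytes of one block
    private final long[] block;  // decoded long words of one block
    private int blockReads = 0;

    public GroupedLongReader(InputStream in, int longsPerBlock) {
        this.in = in;
        this.raw = new byte[longsPerBlock * Long.BYTES];
        this.block = new long[longsPerBlock];
    }

    /** Reads up to longsPerBlock words in one bulk read; returns 0 at end of input. */
    public int readBlock() throws IOException {
        int n = in.readNBytes(raw, 0, raw.length); // requires Java 9+
        if (n % Long.BYTES != 0) {
            throw new EOFException("truncated long word");
        }
        int count = n / Long.BYTES;
        ByteBuffer.wrap(raw, 0, n).asLongBuffer().get(block, 0, count);
        if (count > 0) {
            blockReads++;
        }
        return count;
    }

    /** Writes the words to memory, reads them back, and returns the number of block reads. */
    public static int blockReadsFor(long[] words, int longsPerBlock) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (long w : words) {
                out.writeLong(w); // big-endian, matching ByteBuffer's default order
            }
            out.flush();
            GroupedLongReader reader = new GroupedLongReader(
                    new ByteArrayInputStream(bytes.toByteArray()), longsPerBlock);
            int seen = 0;
            int count;
            while ((count = reader.readBlock()) > 0) {
                for (int i = 0; i < count; i++) {
                    if (reader.block[i] != words[seen++]) {
                        throw new IllegalStateException("round-trip mismatch");
                    }
                }
            }
            if (seen != words.length) {
                throw new IllegalStateException("missing words");
            }
            return reader.blockReads;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With eight words, blockReadsFor(words, 1) performs eight block reads while blockReadsFor(words, 4) performs two. Whether this pays off in practice would depend on how much the larger block hurts skipping and partial reads, which this sketch does not model.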

> Adaptive Frame Of Reference 
> ----------------------------
>
>                 Key: LUCENE-2886
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2886
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>
>         Attachments: LUCENE-2886_simple64.patch, 
> LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on 
> the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be 
> done, as this implementation is working on the old lucene-1458 branch. 
> I will attach a tarball containing a running version (with tests) of the AFOR 
> implementation, as well as the implementations of PFOR and of Simple64 
> (a simple-family codec working on 64-bit words) that were used in the 
> experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf


        
