[
https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688409#action_12688409
]
Paul Elschot commented on LUCENE-1410:
--------------------------------------
The encoding in the Google research slides is yet another variant.
It uses a 2-bit prefix per number to indicate how many bytes (1-4) the
encoded number occupies, and it groups 4 of those prefixes into a single
byte, followed by the non-prefixed bytes of the 4 encoded numbers.
This requires a 256-way switch (indexed jump) for every 4 encoded numbers, and
I would expect that jump to limit performance somewhat compared to PFOR,
which needs one 32-way switch per 32/64/128 encoded numbers.
But since the prefixes only indicate the number of bytes used for each encoded
number, no shifts and masks are needed, only byte moves.
So it could well be worthwhile to give this encoding a try, too, especially for
lists of numbers shorter than 16 or 32.
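
To make the layout concrete, here is a minimal Java sketch of the grouped
prefix encoding described above. It is not taken from the slides or from any
patch on this issue: the class name GroupVarintSketch, the order of the 2-bit
prefixes within the prefix byte (first number in the lowest bits), and the
little-endian order of the data bytes are all assumptions made for
illustration. For brevity it decodes with a loop instead of the 256-way
switch, and because Java assembles an int from bytes with shifts, it does not
show the "byte moves only" property that a native implementation could have.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class GroupVarintSketch {

    // Number of bytes (1-4) needed to hold v, treating v as unsigned 32 bits.
    static int byteLength(int v) {
        if ((v & 0xFFFFFF00) == 0) return 1;
        if ((v & 0xFFFF0000) == 0) return 2;
        if ((v & 0xFF000000) == 0) return 3;
        return 4;
    }

    // Encodes values in groups of 4: one prefix byte, then the data bytes.
    // Assumption: values.length is a multiple of 4 (no padding scheme here).
    static byte[] encode(int[] values) {
        if (values.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < values.length; i += 4) {
            // Pack four 2-bit (length - 1) fields into a single prefix byte.
            int prefix = 0;
            for (int j = 0; j < 4; j++) {
                prefix |= (byteLength(values[i + j]) - 1) << (2 * j);
            }
            out.write(prefix);
            // Emit the non-prefixed bytes of each number, low byte first.
            for (int j = 0; j < 4; j++) {
                int v = values[i + j];
                int n = byteLength(v);
                for (int k = 0; k < n; k++) {
                    out.write(v >>> (8 * k)); // write(int) keeps the low 8 bits
                }
            }
        }
        return out.toByteArray();
    }

    // Decodes count numbers (count must be a multiple of 4) from in.
    static int[] decode(byte[] in, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i += 4) {
            int prefix = in[pos++] & 0xFF;
            for (int j = 0; j < 4; j++) {
                int n = ((prefix >>> (2 * j)) & 3) + 1; // bytes used: 1-4
                int v = 0;
                for (int k = 0; k < n; k++) {
                    v |= (in[pos++] & 0xFF) << (8 * k);
                }
                values[i + j] = v;
            }
        }
        return values;
    }

    // Convenience check: encode followed by decode returns the input.
    static boolean roundTrips(int[] values) {
        return Arrays.equals(values, decode(encode(values), values.length));
    }
}
```

With this layout, the four numbers 1, 300, 70000 and 16777216 take 1, 2, 3
and 4 data bytes plus the shared prefix byte, 11 bytes in total. A
switch-based decoder would instead branch once on the prefix byte to one of
256 cases, each with the four lengths hard-coded.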
> PFOR implementation
> -------------------
>
> Key: LUCENE-1410
> URL: https://issues.apache.org/jira/browse/LUCENE-1410
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Other
> Reporter: Paul Elschot
> Priority: Minor
> Attachments: autogen.tgz, LUCENE-1410b.patch, LUCENE-1410c.patch,
> LUCENE-1410d.patch, LUCENE-1410e.patch, TermQueryTests.tgz, TestPFor2.java,
> TestPFor2.java, TestPFor2.java
>
> Original Estimate: 21840h
> Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.