[ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536386#comment-13536386
 ] 

Michael McCandless commented on LUCENE-4609:
--------------------------------------------

{quote}
I'm missing something.. if there are 2 bits per value, and the codec knows it's 
only 1 byte, there could be either 1, 2, 3 or 4 values in that single byte. How 
could the decoder know when to stop without knowing how many bits should not be 
encoded at the end?
{quote}

Ahh you're right!  OK.  So I think the custom header must include both bpv 
(bits per value) and the number of "wasted bits" at the end of the last byte.  
Hmm, but only if bpv is "small enough" to be ambiguous, right?

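To make the ambiguity concrete, here is a minimal Java sketch, not the attached 
patch. It assumes a hypothetical one-byte header that stores bpv - 1 in its 
upper five bits and the number of unused trailing bits in its lower three. The 
class name, method names and header layout are all made up for illustration. 
Note that once bpv is 8 or more, fewer than bpv bits can be left over in the 
last byte, so the wasted-bits field only matters for bpv of 7 or less.

```java
final class PackedOrdsSketch {

  /** Packs non-negative ints at a fixed bits-per-value behind a 1-byte header. */
  static byte[] encode(int[] values) {
    int max = 0;
    for (int v : values) {
      if (v < 0) throw new IllegalArgumentException("negative value: " + v);
      max = Math.max(max, v);
    }
    // Smallest bpv that can hold max; at least 1 so that bpv - 1 is never negative.
    int bpv = Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
    int totalBits = values.length * bpv;
    int numBytes = (totalBits + 7) / 8;
    int wasted = numBytes * 8 - totalBits; // 0..7 unused bits in the last byte
    byte[] out = new byte[1 + numBytes];
    // Assumed header layout: [bpv - 1 : 5 bits][wasted : 3 bits]
    out[0] = (byte) (((bpv - 1) << 3) | wasted);
    int bitPos = 8; // payload starts right after the header byte
    for (int v : values) {
      for (int b = bpv - 1; b >= 0; b--, bitPos++) {
        if (((v >>> b) & 1) != 0) {
          out[bitPos >>> 3] |= 0x80 >>> (bitPos & 7);
        }
      }
    }
    return out;
  }

  /** Reads the header, then decodes exactly the values that were written. */
  static int[] decode(byte[] in) {
    int header = in[0] & 0xFF;
    int bpv = (header >>> 3) + 1;
    int wasted = header & 7;
    // Subtracting the wasted bits is what tells the decoder when to stop.
    int payloadBits = (in.length - 1) * 8 - wasted;
    int[] values = new int[payloadBits / bpv];
    int bitPos = 8;
    for (int i = 0; i < values.length; i++) {
      int v = 0;
      for (int b = 0; b < bpv; b++, bitPos++) {
        int bit = (in[bitPos >>> 3] >>> (7 - (bitPos & 7))) & 1;
        v = (v << 1) | bit;
      }
      values[i] = v;
    }
    return values;
  }
}
```

With this layout, {1, 2, 3} and {1, 2, 3, 0} produce the same payload byte 
(0x6C) and differ only in the header (wasted = 2 versus wasted = 0), which is 
exactly the case from the quote above.
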
I guess another option is to leave those trailing bits as 0s, so that any extra 
decoded ord is 0, which is the "reserved" root ord, and so counting it is 
harmless maybe?  Tricky ...
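
A rough Java sketch of that zero-padding alternative, under the assumption that 
bpv is known to the decoder and that nobody cares about the count for ord 0. 
The class and method names are hypothetical, not from the patch. The decoder 
simply reads every whole bpv-wide slot, so padding shows up as extra 0 ords:

```java
final class ZeroPaddedOrdsSketch {

  /** Decodes every whole bpv-wide slot in the payload, zero padding included. */
  static int[] decodeAllSlots(byte[] payload, int bpv) {
    int[] ords = new int[(payload.length * 8) / bpv];
    int bitPos = 0;
    for (int i = 0; i < ords.length; i++) {
      int ord = 0;
      for (int b = 0; b < bpv; b++, bitPos++) {
        int bit = (payload[bitPos >>> 3] >>> (7 - (bitPos & 7))) & 1;
        ord = (ord << 1) | bit;
      }
      ords[i] = ord;
    }
    return ords;
  }

  /** Bumps the count of each decoded ordinal; padding only inflates counts[0]. */
  static void accumulate(int[] ords, int[] counts) {
    for (int ord : ords) {
      counts[ord]++;
    }
  }
}
```

For the single payload byte 0x6C at bpv = 2 this yields the ords 1, 2, 3, 0, so 
the three real ords are counted once each and the padding slot lands on ord 0. 
The catch is that a genuine trailing 0 cannot be told apart from padding, which 
is fine only if the root count is never used.
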
                
> Write a PackedIntsEncoder/Decoder for facets
> --------------------------------------------
>
>                 Key: LUCENE-4609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4609
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-4609.patch
>
>
> Today the facets API lets you write IntEncoder/Decoder to encode/decode the 
> category ordinals. We have several such encoders, including VInt (default), 
> and block encoders.
> It would be interesting to implement and benchmark a 
> PackedIntsEncoder/Decoder, with potentially two variants: (1) receives 
> bitsPerValue up front, when you e.g. know that you have a small taxonomy and 
> the max value you can see and (2) one that decides for each doc on the 
> optimal bitsPerValue, writes it as a header in the byte[] or something.
