[ 
https://issues.apache.org/jira/browse/LUCENE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575199#comment-13575199
 ] 

Shai Erera commented on LUCENE-4764:
------------------------------------

I think it would actually be interesting to test *only* VInt, without 
dgap. Because the ords seem to be arbitrary, I'm not even sure what dgap buys 
us. Mike, can you try that? Index with a Sorting(Unique(VInt8)) and modify 
FastCountingFacetsAggregator to not do dgap? It would be interesting to see the 
effects on compression as well as speed. Dgap is something you want to do if 
you suspect that a document will have, e.g., high ordinals that are close to 
each other, so that the gaps between them compress better than the absolute 
values ...
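
To make the trade-off concrete, here is a minimal, self-contained sketch that compares encoded sizes with and without dgap. It is not the facet module's encoder: the class and method names are hypothetical, and it assumes the generic 7-bits-per-byte VInt scheme (low-order bits first), which may differ in byte order from VInt8.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch, not Lucene code: plain VInt vs. dgap + VInt.
final class VIntDgapSketch {

    // Writes a non-negative int using 7 payload bits per byte;
    // the high bit is set on every byte except the last one.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes each ordinal as its absolute value (no dgap).
    static byte[] encodeVInt(int[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int ord : sortedOrds) {
            writeVInt(out, ord);
        }
        return out.toByteArray();
    }

    // Encodes each ordinal as the gap from the previous one (dgap).
    // Assumes the ordinals are sorted ascending, so gaps are non-negative.
    static byte[] encodeDgapVInt(int[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrds) {
            writeVInt(out, ord - prev);
            prev = ord;
        }
        return out.toByteArray();
    }

    // Decodes either form; with dgap the running sum restores absolute ordinals.
    static int[] decode(byte[] bytes, boolean dgap) {
        int[] ords = new int[bytes.length]; // upper bound: at least one byte per value
        int count = 0, prev = 0, pos = 0;
        while (pos < bytes.length) {
            int value = 0, shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            if (dgap) {
                value += prev;
                prev = value;
            }
            ords[count++] = value;
        }
        return Arrays.copyOf(ords, count);
    }

    public static void main(String[] args) {
        int[] clustered = {100000, 100003, 100010}; // high ordinals, close together
        int[] scattered = {5, 20000, 3000000};      // arbitrary ordinals
        System.out.println("clustered: vint=" + encodeVInt(clustered).length
                + " dgap=" + encodeDgapVInt(clustered).length);
        System.out.println("scattered: vint=" + encodeVInt(scattered).length
                + " dgap=" + encodeDgapVInt(scattered).length);
    }
}
```

In this toy input the clustered ordinals shrink under dgap, while the scattered ones take the same number of bytes either way, and decoding without dgap skips the running sum. Whether real facet ordinals behave like the second case is exactly what the benchmark would have to show.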

Robert, if I understand your proposal correctly, what you suggest is to encode:

int[] -- pairs of highest/lowest ordinal in a document + length (#additional 
ords)
byte[] -- a packed-int of deltas for all documents (but deltas are computed off 
the absolute ord in the int[])

Why would that be better than a single byte[] (packed-ints) + offsets?
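
For comparison, the single byte[] + offsets layout could look roughly like the sketch below. This is only an illustration under my own assumptions: the class name and fields are hypothetical, the per-document bytes stand in for whatever encoding is chosen, and the offsets are kept in a plain int[] rather than a packed structure.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch, not Lucene code: one shared byte[] plus per-document offsets.
final class OffsetsLayoutSketch {
    final byte[] data;   // every document's encoded ordinals, back to back
    final int[] offsets; // doc's slice is data[offsets[doc] .. offsets[doc + 1])

    OffsetsLayoutSketch(byte[][] perDocBytes) {
        offsets = new int[perDocBytes.length + 1];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int doc = 0; doc < perDocBytes.length; doc++) {
            offsets[doc] = out.size();
            out.write(perDocBytes[doc], 0, perDocBytes[doc].length);
        }
        offsets[perDocBytes.length] = out.size(); // sentinel marking the end of the last slice
        data = out.toByteArray();
    }

    // Returns a copy of the encoded bytes for one document (empty if it has no ordinals).
    byte[] bytesFor(int doc) {
        return Arrays.copyOfRange(data, offsets[doc], offsets[doc + 1]);
    }
}
```

Lookup here is two array reads per document, and the length comes for free from the next offset, which is why I wonder what the extra int[] of highest/lowest ordinals would add.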
                
> Faster but more RAM/Disk consuming DocValuesFormat for facets
> -------------------------------------------------------------
>
>                 Key: LUCENE-4764
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4764
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.2, 5.0
>
>         Attachments: LUCENE-4764.patch
>
>
> The new default DV format for binary fields has much more
> RAM-efficient encoding of the address for each document ... but it's
> also a bit slower at decode time, which affects facets because we
> decode for every collected docID.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
