[ https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563685#comment-13563685 ]
Michael McCandless commented on LUCENE-4609:
--------------------------------------------

I had to change PackedBytes.get to take a [reused] IntsRef in, else I was hitting thread-safety issues (AIOOBE)...

base = trunk, comp = patch, index = full 6.6M Wikipedia English with 9 facet dims counted:

{noformat}
Task        QPS base   StdDev   QPS comp   StdDev   Pct diff
HighTerm        7.30   (1.5%)       7.60   (1.6%)   4.1% ( 0% - 7%)
MedTerm        16.22   (0.7%)      17.25   (0.8%)   6.4% ( 4% - 7%)
LowTerm        37.87   (0.8%)      41.08   (0.4%)   8.5% ( 7% - 9%)
{noformat}

So we finally have something faster than dGap(vInt)! But PackedInts takes ~2X the storage (base is 151208 KB, by summing DV on disk; comp is 305889 KB by measuring RAM of the PackedInts structures).

> Write a PackedIntsEncoder/Decoder for facets
> --------------------------------------------
>
>                 Key: LUCENE-4609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4609
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-4609.patch, LUCENE-4609.patch, LUCENE-4609.patch,
>                      LUCENE-4609.patch, LUCENE-4609.patch
>
>
> Today the facets API lets you write IntEncoder/Decoder to encode/decode the
> category ordinals. We have several such encoders, including VInt (default),
> and block encoders.
> It would be interesting to implement and benchmark a
> PackedIntsEncoder/Decoder, with potentially two variants: (1) receives
> bitsPerValue up front, when you e.g. know that you have a small taxonomy and
> the max value you can see and (2) one that decides for each doc on the
> optimal bitsPerValue, writes it as a header in the byte[] or something.

--
This message is automatically generated by JIRA.
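For readers unfamiliar with the idea, variant (2) from the issue description (choose bitsPerValue per document and write it as a header in the byte[]) might look roughly like the sketch below. This is only an illustration in plain Java: it is not the code from the attached patches and does not use Lucene's PackedInts API. The class name PerDocPackedSketch, its methods, and the extra one-byte count header (used here so padding bits are not mistaken for values) are all assumptions made for the example.

```java
// Hypothetical illustration only; not the attached patch, not Lucene's PackedInts API.
public class PerDocPackedSketch {

    // Number of bits needed to represent maxValue (at least 1).
    static int bitsRequired(int maxValue) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(maxValue));
    }

    // Layout (assumed): [bitsPerValue: 1 byte][count: 1 byte][values packed MSB-first].
    static byte[] encode(int[] ordinals) {
        if (ordinals.length > 255) {
            throw new IllegalArgumentException("sketch supports at most 255 ordinals per doc");
        }
        int max = 0;
        for (int v : ordinals) {
            if (v < 0) {
                throw new IllegalArgumentException("ordinals must be non-negative");
            }
            max = Math.max(max, v);
        }
        int bpv = bitsRequired(max);
        byte[] out = new byte[2 + (ordinals.length * bpv + 7) / 8];
        out[0] = (byte) bpv;
        out[1] = (byte) ordinals.length;
        int bitPos = 0;
        for (int v : ordinals) {
            // Write the bpv low-order bits of v, most significant first.
            for (int b = bpv - 1; b >= 0; b--) {
                if (((v >>> b) & 1) != 0) {
                    out[2 + (bitPos >>> 3)] |= (byte) (0x80 >>> (bitPos & 7));
                }
                bitPos++;
            }
        }
        return out;
    }

    // Reads the header, then unpacks count values of bitsPerValue bits each.
    static int[] decode(byte[] buf) {
        int bpv = buf[0];
        int count = buf[1] & 0xFF;
        int[] out = new int[count];
        int bitPos = 0;
        for (int i = 0; i < count; i++) {
            int v = 0;
            for (int b = 0; b < bpv; b++) {
                int bit = (buf[2 + (bitPos >>> 3)] >>> (7 - (bitPos & 7))) & 1;
                v = (v << 1) | bit;
                bitPos++;
            }
            out[i] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ordinals = {3, 17, 42, 5};
        byte[] encoded = encode(ordinals);
        System.out.println("bitsPerValue=" + encoded[0] + ", bytes=" + encoded.length);
        System.out.println(java.util.Arrays.toString(decode(encoded)));
    }
}
```

Variant (1) would drop the first header byte and pass bitsPerValue to both encode and decode instead. The sketch packs bit by bit for clarity; a real implementation would presumably work on whole words, which is where any speedup over vInt decoding would have to come from.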