[ https://issues.apache.org/jira/browse/LUCENE-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316702#comment-14316702 ]
ASF subversion and git services commented on LUCENE-6030:
---------------------------------------------------------

Commit 1659025 from [~rcmuir] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1659025 ]

LUCENE-6030: remove fixed @Seed

> Add norms patched compression which uses table for most common values
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-6030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6030
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ryan Ernst
>            Assignee: Ryan Ernst
>             Fix For: 5.0, Trunk
>
>         Attachments: LUCENE-6030.patch
>
>
> We have added the PATCHED norms sub format in Lucene 5.0, which uses a bitset
> to mark documents that have the most common value (when >97% of the documents
> have that value). This works well for fields that have one predominant value
> length and a small number of docs with other, arbitrary values. But
> another common case is having a handful of very common value lengths, as
> with a title field.
> We can use a table (see TABLE_COMPRESSION) to store the most common values,
> and reserve an ordinal for the "other" case; a doc with that ordinal is then
> looked up in the secondary patch table.
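
For readers who have not seen the patch, the table-plus-patch idea in the issue description can be sketched roughly as follows. This is only an illustration and not the code from LUCENE-6030.patch: the class name PatchedTableNormsSketch, the encode/get methods, the byte-per-document ordinal array, and the HashMap patch table are all assumptions made for the example. A real norms format would presumably bit-pack the ordinals and write the patch entries to disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of table compression with a patch table; not Lucene's actual code. */
final class PatchedTableNormsSketch {
    final long[] table;             // most common values; array index is the ordinal
    final byte[] ordinals;          // one ordinal per document (unpacked here for clarity)
    final Map<Integer, Long> patch; // docID -> value, only for docs with the "other" ordinal

    private PatchedTableNormsSketch(long[] table, byte[] ordinals, Map<Integer, Long> patch) {
        this.table = table;
        this.ordinals = ordinals;
        this.patch = patch;
    }

    /** Encodes norms, keeping at most tableSize of the most frequent values in the table. */
    static PatchedTableNormsSketch encode(long[] norms, int tableSize) {
        if (tableSize < 1 || tableSize > 126) {
            throw new IllegalArgumentException("tableSize must be in [1, 126]");
        }
        // Count how often each distinct value occurs.
        Map<Long, Integer> freq = new HashMap<>();
        for (long v : norms) {
            freq.merge(v, 1, Integer::sum);
        }
        // Order distinct values by descending frequency; break ties by value for determinism.
        List<Long> distinct = new ArrayList<>(freq.keySet());
        distinct.sort((a, b) -> {
            int c = Integer.compare(freq.get(b), freq.get(a));
            return c != 0 ? c : Long.compare(a, b);
        });
        int n = Math.min(tableSize, distinct.size());
        long[] table = new long[n];
        Map<Long, Byte> ordOf = new HashMap<>();
        for (int i = 0; i < n; i++) {
            table[i] = distinct.get(i);
            ordOf.put(table[i], (byte) i);
        }
        byte other = (byte) n; // reserved ordinal meaning "look in the patch table"
        byte[] ords = new byte[norms.length];
        Map<Integer, Long> patch = new HashMap<>();
        for (int doc = 0; doc < norms.length; doc++) {
            Byte ord = ordOf.get(norms[doc]);
            if (ord != null) {
                ords[doc] = ord;
            } else {
                ords[doc] = other;
                patch.put(doc, norms[doc]);
            }
        }
        return new PatchedTableNormsSketch(table, ords, patch);
    }

    /** Returns the norm for a document: a table hit, or a patch lookup for "other". */
    long get(int doc) {
        int ord = ordinals[doc];
        return ord < table.length ? table[ord] : patch.get(doc);
    }

    public static void main(String[] args) {
        long[] norms = {3, 3, 5, 3, 5, 7, 3, 9, 5, 3};
        PatchedTableNormsSketch enc = encode(norms, 2);
        for (int doc = 0; doc < norms.length; doc++) {
            System.out.println("doc " + doc + " -> " + enc.get(doc));
        }
        System.out.println("patched docs: " + enc.patch.size());
    }
}
```

With a table of two values, each document needs only a 2-bit ordinal (two table slots plus the reserved "other" slot) once the ordinals are packed, and only the rare documents pay for a patch entry. That is the trade-off the description is after for fields such as titles.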