[ https://issues.apache.org/jira/browse/LUCENE-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-2872.
----------------------------------------

    Resolution: Fixed

> Terms dict should block-encode terms
> ------------------------------------
>
>                 Key: LUCENE-2872
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2872
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-2872.patch, LUCENE-2872.patch, LUCENE-2872.patch
>
>
> With PrefixCodedTermsReader/Writer we now encode each term standalone,
> i.e. its bytes, metadata, details for postings (frq/prox file pointers),
> etc.
> But this is costly when something wants to visit many terms but pull
> metadata for only a few (e.g. respelling, certain MTQs). This is
> particularly costly for the sep codec because it has more metadata to
> store per term.
> So instead I think we should block-encode all terms between indexed
> terms, so that the metadata is stored "column stride" instead. This
> makes it faster to enum just terms.
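
To illustrate the "column stride" idea described in the issue, here is a minimal sketch in Java. It is not the format used by the attached patches: the class name `TermBlockSketch`, the fixed 20-byte metadata record, and the 2-byte term length prefix are all assumptions made for illustration. It only shows why keeping term bytes and per-term metadata in separate columns lets a caller enumerate terms without decoding any metadata.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical block of terms with metadata stored column stride (illustration only). */
final class TermBlockSketch {
    // Assumed fixed-width metadata record: docFreq (4) + frq pointer (8) + prox pointer (8).
    static final int META_BYTES = 4 + 8 + 8;

    final int termCount;
    final byte[] termColumn; // length-prefixed term bytes only, back to back
    final byte[] metaColumn; // one fixed-width metadata record per term

    private TermBlockSketch(int termCount, byte[] termColumn, byte[] metaColumn) {
        this.termCount = termCount;
        this.termColumn = termColumn;
        this.metaColumn = metaColumn;
    }

    /** Encodes one block; all arrays must have the same length, terms under 65536 bytes. */
    static TermBlockSketch encode(String[] terms, int[] docFreqs,
                                  long[] freqPointers, long[] proxPointers) {
        int n = terms.length;
        byte[][] raw = new byte[n][];
        int termBytes = 0;
        for (int i = 0; i < n; i++) {
            raw[i] = terms[i].getBytes(StandardCharsets.UTF_8);
            termBytes += 2 + raw[i].length; // 2-byte length prefix + term bytes
        }
        ByteBuffer termBuf = ByteBuffer.allocate(termBytes);
        ByteBuffer metaBuf = ByteBuffer.allocate(n * META_BYTES);
        for (int i = 0; i < n; i++) {
            termBuf.putShort((short) raw[i].length).put(raw[i]);
            metaBuf.putInt(docFreqs[i]).putLong(freqPointers[i]).putLong(proxPointers[i]);
        }
        return new TermBlockSketch(n, termBuf.array(), metaBuf.array());
    }

    /** Enumerates terms by scanning only the term column; metadata is never read. */
    List<String> terms() {
        List<String> out = new ArrayList<>(termCount);
        ByteBuffer buf = ByteBuffer.wrap(termColumn);
        for (int i = 0; i < termCount; i++) {
            int len = buf.getShort() & 0xFFFF; // unsigned 2-byte length
            byte[] bytes = new byte[len];
            buf.get(bytes);
            out.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return out;
    }

    /** Decodes metadata for a single term on demand, by its ordinal within the block. */
    int docFreq(int ord) {
        return ByteBuffer.wrap(metaColumn).getInt(ord * META_BYTES);
    }

    long freqPointer(int ord) {
        return ByteBuffer.wrap(metaColumn).getLong(ord * META_BYTES + 4);
    }

    long proxPointer(int ord) {
        return ByteBuffer.wrap(metaColumn).getLong(ord * META_BYTES + 12);
    }
}
```

In this sketch, a caller that only needs the terms (for example a respeller) calls `terms()` and touches `termColumn` alone, while a caller that needs postings for one term calls `docFreq(ord)`, `freqPointer(ord)`, or `proxPointer(ord)` for just that ordinal. A real codec would likely use variable-length and delta encoding for both columns; fixed-width records are used here only to keep the lookup arithmetic obvious.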