roman created LUCENE-4242:
-----------------------------

Summary: UnInverted cache uses term freq to filter out terms (but deleted docs are included in the freq count)
Key: LUCENE-4242
URL: https://issues.apache.org/jira/browse/LUCENE-4242
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Affects Versions: 4.0
Reporter: roman
Priority: Minor

The TermsEnum.docFreq() count is used when computing the uninverted index (DocTermOrds.uninvert()). The code looks like this:

    final int df = te.docFreq();
    if (df <= maxTermDocFreq) {

docFreq() also counts documents that have been deleted but not yet merged away. So, if the index contains deleted documents and maxTermDocFreq is low, a term can be excluded even though its frequency among the live docs is within the limit. Most likely, the cache will then be incomplete.
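
To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use the Lucene API: the class DocFreqSketch, the int[] postings list, and the BitSet of live docs are all hypothetical stand-ins chosen for illustration. It only shows how a count that includes deleted documents can fail the df <= maxTermDocFreq check while the live-doc count would pass it.

```java
import java.util.BitSet;

public class DocFreqSketch {

    // Raw document frequency: counts every posting, deleted or not
    // (stand-in for what docFreq() reports; not Lucene code).
    public static int rawDocFreq(int[] postings) {
        return postings.length;
    }

    // Frequency restricted to live documents only.
    public static int liveDocFreq(int[] postings, BitSet liveDocs) {
        int count = 0;
        for (int doc : postings) {
            if (liveDocs.get(doc)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical term that occurs in docs 0..4.
        int[] postings = {0, 1, 2, 3, 4};

        // Only docs 0 and 3 are still live; 1, 2 and 4 were deleted.
        BitSet liveDocs = new BitSet(5);
        liveDocs.set(0);
        liveDocs.set(3);

        int maxTermDocFreq = 3; // assumed low threshold

        int raw = rawDocFreq(postings);
        int live = liveDocFreq(postings, liveDocs);

        // The raw count exceeds the threshold, so the term is filtered out,
        // although the live count is within the limit.
        System.out.println("raw df = " + raw + ", kept = " + (raw <= maxTermDocFreq));
        System.out.println("live df = " + live + ", kept = " + (live <= maxTermDocFreq));
    }
}
```

Counting live postings per term costs an extra pass over the postings, so whether a real fix should do this or simply document the behaviour is a separate trade-off.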