[ 
https://issues.apache.org/jira/browse/LUCENE-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610650#comment-13610650
 ] 

Commit Tag Bot commented on LUCENE-4498:
----------------------------------------

[branch_4x commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1401421

LUCENE-4498: pulse docFreq=1 in 4.1 codec

                
> pulse docfreq=1 DOCS_ONLY for 4.1 codec
> ---------------------------------------
>
>                 Key: LUCENE-4498
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4498
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Robert Muir
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4498_lazy.patch, LUCENE-4498.patch, 
> LUCENE-4498.patch, LUCENE-4498.patch, LUCENE-4498.patch
>
>
> We have a pulsing codec, but it currently has some downsides:
> * It's very general: it wraps an arbitrary PostingsFormat and pulses 
> everything in the postings for an arbitrary docFreq/totalTermFreq cutoff.
> * Reuse is hairy: because it specializes its enums based on these cutoffs, 
> when walking through terms (e.g. when merging) there is a lot of 
> sophisticated logic to avoid the worst cases, where we clone IndexInputs 
> for tons of terms.
> On the other hand, the way the 4.1 codec encodes "primary key" fields is 
> pretty silly: we write the docStartFP vlong in the term dictionary metadata, 
> which tells us where to seek in the .doc file to read our one lonely vint.
> I think it's worth investigating whether, in the DOCS_ONLY docfreq=1 case, 
> we can just write the lone doc delta where we would otherwise write 
> docStartFP. 
> We can avoid the hairy reuse problem too, by just supporting this in 
> refillDocs() in BlockDocsEnum instead of specializing.
> This would remove the additional seek for "primary key" fields without really 
> any of the downsides of pulsing today.
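
To make the proposal above concrete, here is a minimal, self-contained sketch of the idea, not the actual patch. It assumes a simplified per-term metadata layout of just (docFreq, one vlong): when docFreq is 1, the vlong slot holds the lone doc ID inline; otherwise it holds docStartFP. The class name SingletonDocSketch, the methods encodeTermMeta and decodeTermMeta, and the byte layout are all hypothetical and do not reflect the real Lucene 4.1 file format or API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not Lucene code and not the real on-disk format.
final class SingletonDocSketch {

    // Variable-length encoding: 7 payload bits per byte, high bit means "more bytes follow".
    static void writeVLong(ByteArrayOutputStream out, long v) {
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7FL) | 0x80L));
            v >>>= 7;
        }
        out.write((int) v);
    }

    static long readVLong(ByteArrayInputStream in) {
        long result = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new IllegalStateException("unexpected end of term metadata");
            }
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Assumed layout: docFreq, then ONE vlong that is either the lone doc ID
    // (docFreq == 1) or the file pointer into the .doc file (docFreq > 1).
    static byte[] encodeTermMeta(int docFreq, long docStartFP, int singletonDocID) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVLong(out, docFreq);
        if (docFreq == 1) {
            writeVLong(out, singletonDocID); // inlined: no seek needed at read time
        } else {
            writeVLong(out, docStartFP);     // normal case: reader seeks into .doc
        }
        return out.toByteArray();
    }

    // Returns {docFreq, docStartFP or -1, singletonDocID or -1}.
    static long[] decodeTermMeta(byte[] meta) {
        ByteArrayInputStream in = new ByteArrayInputStream(meta);
        long docFreq = readVLong(in);
        long slot = readVLong(in);
        if (docFreq == 1) {
            return new long[] {docFreq, -1L, slot};
        }
        return new long[] {docFreq, slot, -1L};
    }

    public static void main(String[] args) {
        long[] pk = decodeTermMeta(encodeTermMeta(1, 0L, 42));
        long[] common = decodeTermMeta(encodeTermMeta(1000, 123456L, -1));
        System.out.println("primary-key term: docFreq=" + pk[0] + " docID=" + pk[2]);
        System.out.println("common term: docFreq=" + common[0] + " docStartFP=" + common[1]);
    }
}
```

In this sketch the reader decides which interpretation applies purely from docFreq, so no extra flag is stored and a single enum implementation can serve both cases, which is the spirit of handling it in refillDocs() rather than specializing.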
