[ https://issues.apache.org/jira/browse/LUCENE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5123:
---------------------------------------

    Attachment: LUCENE-5123.patch

Here's a starting patch with lots of nocommits, but tests are passing ...

This patch adds two postings (Fields) impls: FreqProxFields, used during 
flush, which reads the postings from FreqProxTermsWriter's RAM buffers, and 
MappedMultiFields, used for merging.

It also adds a new PostingsFormat.write method to write the postings. The idea 
is that this will eventually replace PostingsFormat.fieldsConsumer. For now, so 
we can cut over existing postings formats iteratively, I made a default impl of 
write that takes a Fields and steps through Fields/Terms/Postings, feeding them 
to the FieldsConsumer API.
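
As a rough illustration of that default bridge (a sketch only, not code from 
the patch): the pull-side source is walked field by field, term by term, and 
posting by posting, and each item is pushed into a push-style sink. PullFields, 
PushConsumer and RecordingConsumer are hypothetical stand-ins invented for this 
example; they are not Lucene's Fields or FieldsConsumer classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PullToPushSketch {

    /** Hypothetical push-style sink; a stand-in for the FieldsConsumer family, not Lucene's API. */
    interface PushConsumer {
        void startField(String field);
        void startTerm(String term);
        void addDoc(int docID, int freq);
        void finishTerm();
        void finishField();
    }

    /** Hypothetical pull-style source: field -> term -> (docID -> freq), all kept sorted. */
    static final class PullFields {
        final SortedMap<String, SortedMap<String, SortedMap<Integer, Integer>>> fields = new TreeMap<>();

        void add(String field, String term, int docID, int freq) {
            fields.computeIfAbsent(field, f -> new TreeMap<>())
                  .computeIfAbsent(term, t -> new TreeMap<>())
                  .put(docID, freq);
        }
    }

    /** Default bridge: walk fields, terms and postings in order and push each one to the sink. */
    static void write(PullFields source, PushConsumer sink) {
        for (Map.Entry<String, SortedMap<String, SortedMap<Integer, Integer>>> field : source.fields.entrySet()) {
            sink.startField(field.getKey());
            for (Map.Entry<String, SortedMap<Integer, Integer>> term : field.getValue().entrySet()) {
                sink.startTerm(term.getKey());
                for (Map.Entry<Integer, Integer> posting : term.getValue().entrySet()) {
                    sink.addDoc(posting.getKey(), posting.getValue());
                }
                sink.finishTerm();
            }
            sink.finishField();
        }
    }

    /** Sink that records the calls it receives, so the push order can be inspected. */
    static final class RecordingConsumer implements PushConsumer {
        final List<String> events = new ArrayList<>();
        public void startField(String field) { events.add("field " + field); }
        public void startTerm(String term) { events.add("term " + term); }
        public void addDoc(int docID, int freq) { events.add("doc " + docID + " freq " + freq); }
        public void finishTerm() { events.add("end term"); }
        public void finishField() { events.add("end field"); }
    }

    public static void main(String[] args) {
        PullFields source = new PullFields();
        source.add("body", "lucene", 3, 2);
        source.add("body", "codec", 1, 1);
        RecordingConsumer sink = new RecordingConsumer();
        write(source, sink);
        sink.events.forEach(System.out::println);
    }
}
```

A format that overrides write directly would skip the bridge and read the 
source however it likes; the bridge only exists so that push-style formats keep 
working unchanged in the meantime.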
                
> invert the codec postings API
> -----------------------------
>
>                 Key: LUCENE-5123
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5123
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Robert Muir
>            Assignee: Michael McCandless
>         Attachments: LUCENE-5123.patch
>
>
> Currently FieldsConsumer/PostingsConsumer/etc. is a "push"-oriented API, e.g. 
> FreqProxTermsWriter streams the postings at flush, and the default merge() 
> takes the incoming codec API, filters out deleted docs, and "pushes" via the 
> same API (but that can be overridden).
> It could be cleaner if we allowed for a "pull" model instead (like 
> DocValues). For example, maybe FreqProxTermsWriter could expose a Terms of 
> itself and just pass this to the codec consumer.
> This would give the codec more flexibility to e.g. do multiple passes if it 
> wanted to do things like encode high-frequency terms more efficiently with a 
> bitset-like encoding or other things...
> A codec can try to do things like this to some extent today, but it's very 
> difficult (look at the buffering in Pulsing). We made this change with DV, and 
> it made a lot of interesting optimizations easy to implement...
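
The multi-pass idea in the description above might look roughly like the 
following sketch: because a pull-style source can be iterated more than once, a 
first pass can gather per-term statistics and a second pass can pick an 
encoding per term. Everything here is a hypothetical simplification (a sorted 
map of term to doc IDs as the source, the Encoded class, the denseRatio 
threshold); none of it is Lucene API or taken from the patch.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class TwoPassSketch {

    /** Encoded postings for one term: exactly one of the two fields is non-null. */
    static final class Encoded {
        final BitSet bits;   // used for high-frequency (dense) terms
        final int[] deltas;  // used for sparse terms: gaps between consecutive doc IDs
        Encoded(BitSet bits, int[] deltas) { this.bits = bits; this.deltas = deltas; }
    }

    /**
     * postings: hypothetical pull-style source, term -> ascending doc IDs.
     * A term is treated as dense when it appears in at least denseRatio * maxDoc docs.
     */
    static SortedMap<String, Encoded> encode(SortedMap<String, int[]> postings, int maxDoc, double denseRatio) {
        // Pass 1: collect the doc frequency of every term before encoding anything.
        Map<String, Integer> docFreq = new TreeMap<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            docFreq.put(e.getKey(), e.getValue().length);
        }

        // Pass 2: revisit the same postings and choose an encoding per term.
        SortedMap<String, Encoded> out = new TreeMap<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            int[] docs = e.getValue();
            if (docFreq.get(e.getKey()) >= denseRatio * maxDoc) {
                BitSet bits = new BitSet(maxDoc);
                for (int doc : docs) {
                    bits.set(doc);
                }
                out.put(e.getKey(), new Encoded(bits, null));
            } else {
                int[] deltas = new int[docs.length];
                int prev = 0;
                for (int i = 0; i < docs.length; i++) {
                    deltas[i] = docs[i] - prev;
                    prev = docs[i];
                }
                out.put(e.getKey(), new Encoded(null, deltas));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> postings = new TreeMap<>();
        postings.put("the", new int[] {0, 1, 2, 3});
        postings.put("rare", new int[] {2, 7});
        SortedMap<String, Encoded> encoded = encode(postings, 8, 0.5);
        System.out.println("the uses bitset: " + (encoded.get("the").bits != null));
        System.out.println("rare uses deltas: " + (encoded.get("rare").deltas != null));
    }
}
```

With a push-style consumer the postings stream past once, so a decision like 
this would have to be made from buffered data instead of a second pass.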

