[ 
https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009545#comment-14009545
 ] 

Michael McCandless commented on LUCENE-5675:
--------------------------------------------

Thanks Steve!

> "ID postings format"
> --------------------
>
>                 Key: LUCENE-5675
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5675
>             Project: Lucene - Core
>          Issue Type: New Feature
>    Affects Versions: 4.9, 5.0
>            Reporter: Robert Muir
>             Fix For: 4.9, 5.0
>
>         Attachments: LUCENE-5675.patch
>
>
> Today the primary key lookup in Lucene is not that great for systems like 
> Solr and Elasticsearch that have versioning in front of IndexWriter.
> To some extent BlockTree can "sometimes" help avoid seeks by telling you the 
> term does not exist for a segment. But this technique (based on FST prefix) 
> is fragile. The only other choice today is bloom filters, which use up huge 
> amounts of memory.
> I don't think we are using everything we know: particularly the version 
> semantics.
> Instead, if the FST for the terms index used an algebra that represents the 
> max version for any subtree, we might be able to answer that there is no term 
> T with version < V in that segment very efficiently.
> Also ID fields don't need postings lists, and they don't need stats like 
> docfreq/totaltermfreq, etc.; this stuff is all implicit. 
> As far as API, I think for users to provide "IDs with versions" to such a PF, 
> a start would be to set a payload or whatever on the term field to get it through 
> IndexWriter to the codec. And a "consumer" of the codec can just cast the 
> Terms to a subclass that exposes the FST to do this version check efficiently.
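
To make the "max version for any subtree" idea in the description concrete,
here is a minimal sketch. It is not the attached patch and uses no Lucene
classes: a plain in-memory trie stands in for the FST, and the names
MaxVersionTrie, put and containsAtLeast are made up for illustration. It also
assumes one reading of the version check: with a max stored per subtree, the
lookup that can stop early is "is this ID present with version >= minVersion?",
because a subtree whose max is below minVersion cannot contain a match.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a trie whose nodes record the max version in their subtree. */
class MaxVersionTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        long maxVersion = Long.MIN_VALUE; // max version of any ID at or below this node
        long version = Long.MIN_VALUE;    // version of the ID ending at this node, if any
        boolean isTerm = false;
    }

    private final Node root = new Node();

    /** Adds an ID with its version, raising the subtree max along the whole path. */
    void put(String id, long version) {
        Node node = root;
        node.maxVersion = Math.max(node.maxVersion, version);
        for (int i = 0; i < id.length(); i++) {
            node = node.children.computeIfAbsent(id.charAt(i), c -> new Node());
            node.maxVersion = Math.max(node.maxVersion, version);
        }
        node.isTerm = true;
        node.version = Math.max(node.version, version);
    }

    /** Returns true only if id is present with a version >= minVersion. */
    boolean containsAtLeast(String id, long minVersion) {
        Node node = root;
        for (int i = 0; i < id.length(); i++) {
            // Prune: nothing below this node reaches minVersion, so stop walking.
            if (node.maxVersion < minVersion) {
                return false;
            }
            node = node.children.get(id.charAt(i));
            if (node == null) {
                return false; // the ID is not in this "segment" at all
            }
        }
        return node.isTerm && node.version >= minVersion;
    }

    public static void main(String[] args) {
        MaxVersionTrie segment = new MaxVersionTrie();
        segment.put("doc1", 5L);
        segment.put("doc2", 9L);
        System.out.println(segment.containsAtLeast("doc1", 5L));
        System.out.println(segment.containsAtLeast("doc2", 10L));
    }
}
```

In this toy version a lookup with a minVersion above everything in the
"segment" is rejected at the root without following a single character, which
is the kind of cheap per-segment rejection the description is after. A real
FST shares suffixes as well as prefixes and would carry the max as an output
along its arcs, so treat this only as an illustration of the pruning rule.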



--
This message was sent by Atlassian JIRA
(v6.2#6252)

