[ https://issues.apache.org/jira/browse/LUCENE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated LUCENE-7863:
-------------------------------------
    Attachment: LUCENE-7863.patch

[^LUCENE-7863.patch] has significant fixes for codec registration.
- It looks like a large enough term dictionary hits a code path in {{IntersectingTermsEnum}} that is broken by the introduced index format changes.
- It is reproduced with {{derivative-terms-only.alg}}:
{code}
java.io.EOFException: seek past EOF: MMapIndexInput(path="...lucene-solr/lucene/benchmark/deriv/index/_0_Lucene50HijackInjector_0.doc")
        at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.seek(ByteBufferIndexInput.java:366)
        at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.reset(Lucene50PostingsReader.java:306)
        at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader.postings(Lucene50PostingsReader.java:210)
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings(SegmentTermsEnum.java:1006)
        at org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite(MultiTermQueryConstantScoreWrapper.java:166)
{code}
- Overall, the idea of just changing Vlong to Zlong through overriding turns out 
to be not a good one: it leads to many changes that remove encapsulation and 
{{final}} modifiers, which defeats the purpose of having them.
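
For readers who have not met the two encodings, here is a minimal sketch of the difference, assuming that Vlong and Zlong above refer to the variable-length and zig-zag variable-length long encodings (as in Lucene's {{DataOutput.writeVLong}} and {{DataOutput.writeZLong}}). The class and method names below are hypothetical and standalone, not code from the patch. It is also an assumption on my side that the switch is wanted because deltas may become negative once postings are shared between terms.

```java
public class ZigZagSketch {
    // Zig-zag maps signed longs to non-negative ones: 0->0, -1->1, 1->2, -2->3, ...
    static long zigZagEncode(long v) {
        return (v >> 63) ^ (v << 1);
    }

    // Inverse of zigZagEncode.
    static long zigZagDecode(long v) {
        return (v >>> 1) ^ -(v & 1);
    }

    // Bytes needed by a 7-bits-per-byte variable-length encoding,
    // treating v as an unsigned 64-bit value.
    static int varLenBytes(long v) {
        int n = 1;
        while ((v & ~0x7FL) != 0) {
            v >>>= 7;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long delta = -5; // hypothetical negative delta
        System.out.println("plain:   " + varLenBytes(delta) + " bytes");
        System.out.println("zig-zag: " + varLenBytes(zigZagEncode(delta)) + " bytes");
    }
}
```

A small negative value costs the maximum number of bytes when written as a plain variable-length long, while its zig-zag form stays as short as the positive value of similar magnitude.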

> Don't repeat postings (and perhaps positions) on ReverseWF, EdgeNGram, etc  
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-7863
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7863
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Mikhail Khludnev
>         Attachments: LUCENE-7863.hazard, LUCENE-7863.patch, 
> LUCENE-7863.patch, LUCENE-7863.patch, LUCENE-7863.patch, LUCENE-7863.patch
>
>
> h2. Context
> \*suffix and \*infix\* searches on large indexes. 
> h2. Problem
> Obviously, applying {{ReversedWildcardFilter}} doubles the index size, and I 
> shudder to think about EdgeNGrams...
> h2. Proposal 
> _DRY_


