[ 
https://issues.apache.org/jira/browse/LUCENE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044427#comment-16044427
 ] 

David Smiley commented on LUCENE-7500:
--------------------------------------

bq. WeightedSpanTermExtractor

It's kinda complicated.  So... See {{SrndTermQuery.visitMatchingTerms}} (part 
of the surround query parser). It calls 
{{MultiFields.getTerms(reader,fieldName)}} which works in terms of 
{{MultiFields.getFields(reader)}} -- at least it does today without the patch.  
And MultiFields.getFields today gets the Fields off the LeafReader.  With that 
gone, in the patch getFields needs to consult FieldInfos to see which fields 
have terms.  In the latest patch (2nd iteration), I improved 
MultiFields.getTerms to not go through getFields first, which is a nice 
optimization in its own right.  So it's no longer pertinent for the surround 
query parser whether the highlighter's phrase handling has a leaf reader that 
implements getFieldInfos or not (I could put that UnsupportedOperationException 
back in).  It's hard to say for sure what very advanced queries I've never seen 
before might require... but the highlighters throwing an exception here 
guarantees queries that use getFieldInfos won't work, whereas letting 
getFieldInfos through gives them a chance to work.  I think until such use 
cases present themselves, we should just let getFieldInfos delegate.  Perhaps 
the most correct option is to synthesize a FieldInfos that properly 
"looks" like what this filtered leaf reader is exposing.  It's doubtful whether 
we should bother writing this code, though.
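
To make the delegate-versus-synthesize trade-off concrete, here is a minimal 
sketch in plain Java.  It is a toy model under stated assumptions, not code 
from the patch: {{ToyReader}}, {{MapReader}}, {{SingleFieldReader}} and 
{{fieldsWithTerms}} are hypothetical stand-ins for a leaf reader and its 
FieldInfos, and none of them are Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy model only: stand-ins, not Lucene's LeafReader, FieldInfos or Terms. */
final class FieldInfosSketch {

    /** Hypothetical minimal reader. */
    interface ToyReader {
        /** Terms of one field, or null if the field has no terms. */
        SortedSet<String> terms(String field);

        /** Names of fields that have terms; plays the role of FieldInfos here. */
        List<String> fieldsWithTerms();
    }

    /** In-memory reader used as the delegate. */
    static final class MapReader implements ToyReader {
        private final Map<String, SortedSet<String>> termsByField = new HashMap<>();

        MapReader add(String field, String... terms) {
            termsByField.put(field, new TreeSet<>(Arrays.asList(terms)));
            return this;
        }

        @Override
        public SortedSet<String> terms(String field) {
            return termsByField.get(field);
        }

        @Override
        public List<String> fieldsWithTerms() {
            List<String> names = new ArrayList<>(termsByField.keySet());
            Collections.sort(names);
            return names;
        }
    }

    /** Filtered view that exposes a single field of the wrapped reader. */
    static final class SingleFieldReader implements ToyReader {
        private final ToyReader in;
        private final String field;
        private final boolean synthesize;

        SingleFieldReader(ToyReader in, String field, boolean synthesize) {
            this.in = in;
            this.field = field;
            this.synthesize = synthesize;
        }

        @Override
        public SortedSet<String> terms(String f) {
            // Direct per-field lookup; no enumeration of fields is needed.
            return field.equals(f) ? in.terms(f) : null;
        }

        @Override
        public List<String> fieldsWithTerms() {
            if (!synthesize) {
                // Plain delegation: simple, but may list fields this view hides.
                return in.fieldsWithTerms();
            }
            // Synthesized: report only what terms() will actually return.
            if (in.terms(field) == null) {
                return Collections.emptyList();
            }
            return Collections.singletonList(field);
        }
    }
}
```

With a delegate holding the fields "body" and "title", a delegating view over 
"body" still lists both names even though {{terms("title")}} returns null, 
while the synthesized view lists only "body".  The mismatch in the delegating 
case is the cost of the simpler option.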

> Remove Fields.java in lieu of LeafReader.getTerms(fieldName)
> ------------------------------------------------------------
>
>                 Key: LUCENE-7500
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7500
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: master (7.0)
>
>         Attachments: LUCENE_7500_avoid_leafReader_fields.patch, 
> LUCENE_7500_avoid_leafReader_fields.patch, 
> LUCENE_7500_Remove_LeafReader_fields.patch, 
> LUCENE_7500_Remove_LeafReader_fields.patch
>
>
> {{Fields}} seems like a pointless intermediary between the {{LeafReader}} and 
> {{Terms}}. Why not have {{LeafReader.getTerms(fieldName)}} instead? One loses 
> the ability to get the count and iterate over indexed fields, but it's not 
> clear what real use-cases are for that and such rare needs could figure that 
> out with FieldInfos.
> [~mikemccand] pointed out that we'd probably need to re-introduce a 
> {{TermVectors}} class since TV's are row-oriented not column-oriented.  IMO 
> they should be column-oriented but that'd be a separate issue.
> _(p.s. I'm lacking time to do this w/i the next couple months so if someone 
> else wants to tackle it then great)_


