[ https://issues.apache.org/jira/browse/LUCENE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044427#comment-16044427 ]
David Smiley commented on LUCENE-7500:
--------------------------------------

bq. WeightedSpanTermExtractor

It's kinda complicated. So... See {{SrndTermQuery.visitMatchingTerms}} (part of the surround query parser). It calls {{MultiFields.getTerms(reader,fieldName)}}, which works in terms of {{MultiFields.getFields(reader)}} -- at least it does today without the patch. And MultiFields.getFields today gets the Fields off the LeafReader. With that gone, in the patch getFields needs to consult FieldInfos to see which fields have terms.

In the latest patch (2nd iteration), I improved MultiFields.getTerms to not go through getFields first, which is a nice optimization in its own right. So it's no longer pertinent for the surround query parser whether the highlighter's phrase handling has a leaf reader that implements getFieldInfos or not (I could put that UnsupportedOperationException back in).

It's hard to say for sure what very advanced queries I've never seen before might require... but the highlighters throwing an exception here guarantees that queries using getFieldInfos won't work, whereas letting getFieldInfos through allows that they might work. I think until such use cases present themselves, we should just let getFieldInfos delegate.

Perhaps the most correct option is to synthesize a FieldInfos that properly "looks" like what this filtered leaf reader is exposing. It's dubious whether we should bother writing this code though.
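To make the trade-off above concrete, here is a minimal sketch of the two ideas: listing the fields that have terms by consulting per-field metadata, and a filtering reader whose field-info accessor simply delegates instead of throwing. Every type and method name below ({{FieldMeta}}, {{Leaf}}, {{MapLeaf}}, {{SingleFieldLeaf}}, {{fieldsWithTerms}}, {{demoLeaf}}) is a hypothetical stand-in invented for illustration. None of it is Lucene API or code from the patch; the {{indexed}} flag only loosely plays the role that index options play in FieldInfos.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only: these types model the shape of the idea
// and are NOT Lucene classes or the code from the patch.
final class FieldsFromFieldInfosSketch {

    /** Per-field metadata; 'indexed' stands in for "this field may have terms". */
    static final class FieldMeta {
        final String name;
        final boolean indexed;
        FieldMeta(String name, boolean indexed) {
            this.name = name;
            this.indexed = indexed;
        }
    }

    /** A reader that hands out terms per field, with no intermediary object. */
    interface Leaf {
        List<FieldMeta> fieldInfos();
        List<String> terms(String field); // null when the field has no terms
    }

    /** Simple in-memory reader backed by a map. */
    static final class MapLeaf implements Leaf {
        private final List<FieldMeta> infos;
        private final Map<String, List<String>> termsByField;
        MapLeaf(List<FieldMeta> infos, Map<String, List<String>> termsByField) {
            this.infos = infos;
            this.termsByField = termsByField;
        }
        public List<FieldMeta> fieldInfos() { return infos; }
        public List<String> terms(String field) { return termsByField.get(field); }
    }

    /** Filtering wrapper that exposes one field; fieldInfos() delegates rather than throwing. */
    static final class SingleFieldLeaf implements Leaf {
        private final Leaf in;
        private final String field;
        SingleFieldLeaf(Leaf in, String field) {
            this.in = in;
            this.field = field;
        }
        public List<FieldMeta> fieldInfos() { return in.fieldInfos(); }
        public List<String> terms(String f) { return field.equals(f) ? in.terms(f) : null; }
    }

    /** Lists fields that have terms by walking the metadata, then checking terms(). */
    static List<String> fieldsWithTerms(Leaf leaf) {
        List<String> result = new ArrayList<>();
        for (FieldMeta meta : leaf.fieldInfos()) {
            if (meta.indexed && leaf.terms(meta.name) != null) {
                result.add(meta.name);
            }
        }
        return result;
    }

    /** Small fixture: two indexed fields and one field without terms. */
    static Leaf demoLeaf() {
        Map<String, List<String>> terms = new LinkedHashMap<>();
        terms.put("title", Arrays.asList("lucene"));
        terms.put("body", Arrays.asList("fields", "terms"));
        List<FieldMeta> infos = Arrays.asList(
            new FieldMeta("title", true),
            new FieldMeta("body", true),
            new FieldMeta("stored_only", false));
        return new MapLeaf(infos, terms);
    }
}
```

In this sketch the delegated metadata over-reports what the filtered reader exposes, but a caller that also checks {{terms(...)}} for null still sees only the visible field. A synthesized, filtered metadata view would be stricter, at the cost of extra code.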
> Remove Fields.java in lieu of LeafReader.getTerms(fieldName)
> ------------------------------------------------------------
>
>                 Key: LUCENE-7500
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7500
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: master (7.0)
>
>         Attachments: LUCENE_7500_avoid_leafReader_fields.patch,
>                      LUCENE_7500_avoid_leafReader_fields.patch,
>                      LUCENE_7500_Remove_LeafReader_fields.patch,
>                      LUCENE_7500_Remove_LeafReader_fields.patch
>
>
> {{Fields}} seems like a pointless intermediary between the {{LeafReader}} and
> {{Terms}}. Why not have {{LeafReader.getTerms(fieldName)}} instead? One loses
> the ability to get the count and iterate over indexed fields, but it's not
> clear what real use-cases are for that and such rare needs could figure that
> out with FieldInfos.
> [~mikemccand] pointed out that we'd probably need to re-introduce a
> {{TermVectors}} class since TV's are row-oriented not column-oriented. IMO
> they should be column-oriented but that'd be a separate issue.
> _(p.s. I'm lacking time to do this w/i the next couple months so if someone
> else wants to tackle it then great)_

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)