[
https://issues.apache.org/jira/browse/LUCENE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044668#comment-16044668
]
David Smiley commented on LUCENE-7500:
--------------------------------------
bq. Maybe we can move most of the static methods from MultiFields to ReaderUtil
or maybe a new MultiReaderUtil, and then can we remove MultiFields entirely?
Who still uses it (instantiates it as a Fields instance)? (But we can do that
separately).
Yes I've had the same thought in terms of MultiFields being a bad name now
(particularly for the "Fields" part of the name). I didn't notice ReaderUtil
specifically; perhaps that makes sense. The problem with "MultiReaderUtil" is
that it suggests a relationship to MultiReader, which exists, yet there isn't a
relationship. MultiReader might have been a good/consistent name for what
SlowCompositeReaderWrapper actually is, but as you know that guy isn't in
Lucene anymore. I filed LUCENE-7875 to keep the scope of this current issue
more manageable.
bq. Can you add a comment here (PhraseHelper.java) explaining why the incoming
field is ignored?
The inner class {{SingleFieldFilterLeafReader}} has documentation and a decent
name; perhaps RedirectFieldLeafReader might be a better name? All methods to
LeafReader that accept a field name are deliberately ignored so that it can use
another configured name instead. (The terms method in particular is no
exception). I propose I add a sentence to this class's documentation to say
"This is needed to support the ability to highlight a query irrespective of the
field a query refers to (aka requireFieldMatch=false)." BTW This class is
nearly the same as WeightedSpanTermExtractor.DelegatingLeafReader and exists
for the same purpose.
bq. This change in ParallelReader seems a little dangerous
My change is not purely identical in behavior, yes. I believe it is
bugginess/edge cases in the original behavior that were not thought out well
originally. For example, notice that tvFieldToReader was only populated on the
*first* occurrence of the field name in a reader. Why limit it so... shouldn't
it be _whichever_ reader first has term vectors for that field? That might not
be the first one. Imagine the first reader has fieldA with only a terms index
and another reader for fieldA just has term vectors. The original code
wouldn't notice the term vectors because it wasn't the first. Ultimately the
tests passed; I didn't look at ParallelReader's tests in close detail to see
how well (or not) it's documented. If you think we need to be extremely risk
averse in this patch for ParallelReader (bug for bug compatible :-), then I can
try and redo my work on ParallelReader to make it as bare minimum in changes as
possible.
> Remove LeafReader.fields()
> --------------------------
>
> Key: LUCENE-7500
> URL: https://issues.apache.org/jira/browse/LUCENE-7500
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: master (7.0)
>
> Attachments: LUCENE_7500_avoid_leafReader_fields.patch,
> LUCENE_7500_avoid_leafReader_fields.patch,
> LUCENE_7500_Remove_LeafReader_fields.patch,
> LUCENE_7500_Remove_LeafReader_fields.patch
>
>
> {{Fields}} seems like a pointless intermediary between the {{LeafReader}} and
> {{Terms}}. Why not have {{LeafReader.getTerms(fieldName)}} instead? One loses
> the ability to get the count and iterate over indexed fields, but it's not
> clear what real use-cases are for that and such rare needs could figure that
> out with FieldInfos.
> [~mikemccand] pointed out that we'd probably need to re-introduce a
> {{TermVectors}} class since TV's are row-oriented not column-oriented. IMO
> they should be column-oriented but that'd be a separate issue.
> _(p.s. I'm lacking time to do this w/i the next couple months so if someone
> else wants to tackle it then great)_
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]