[ 
https://issues.apache.org/jira/browse/LUCENE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044668#comment-16044668
 ] 

David Smiley commented on LUCENE-7500:
--------------------------------------

bq. Maybe we can move most of the static methods from MultiFields to ReaderUtil 
or maybe a new MultiReaderUtil, and then can we remove MultiFields entirely? 
Who still uses it (instantiates it as a Fields instance)? (But we can do that 
separately).

Yes I've had the same thought in terms of MultiFields being a bad name now 
(particularly for the "Fields" part of the name).  I didn't notice ReaderUtil 
specifically; perhaps that makes sense.  The problem with "MultiReaderUtil" is 
that it suggests a relationship to MultiReader, which exists, yet there isn't a 
relationship.  MultiReader might have been a good/consistent name for what 
SlowCompositeReaderWrapper actually is, but as you know that guy isn't in 
Lucene anymore.  I filed LUCENE-7875 to keep the scope of this current issue 
more manageable.

bq. Can you add a comment here (PhraseHelper.java) explaining why the incoming 
field is ignored?

The inner class {{SingleFieldFilterLeafReader}} has documentation and a decent 
name; perhaps RedirectFieldLeafReader might be a better name?  All methods to 
LeafReader that accept a field name are deliberately ignored so that it can use 
another configured name instead. (The terms method in particular is no 
exception). I propose I add a sentence to this class's documentation to say 
"This is needed to support the ability to highlight a query irrespective of the 
field a query refers to (aka requireFieldMatch=false)."   BTW This class is 
nearly the same as WeightedSpanTermExtractor.DelegatingLeafReader and exists 
for the same purpose.

bq. This change in ParallelReader seems a little dangerous

My change is not purely identical in behavior, yes.  I believe it is 
bugginess/edge cases in the original behavior that were not thought out well 
originally.  For example, notice that tvFieldToReader was only populated on the 
*first* occurrence of the field name in a reader.  Why limit it so... shouldn't 
it be _whichever_ reader first has term vectors for that field?  That might not 
be the first one.  Imagine the first reader has fieldA with only a terms index 
and another reader for fieldA just has term vectors.  The original code 
wouldn't notice the term vectors because it wasn't the first.  Ultimately the 
tests passed; I didn't look at ParallelReader's tests in close detail to see 
how well (or not) it's documented.  If you think we need to be extremely risk 
averse in this patch for ParallelReader (bug for bug compatible :-), then I can 
try and redo my work on ParallelReader to make it as bare minimum in changes as 
possible.

> Remove LeafReader.fields()
> --------------------------
>
>                 Key: LUCENE-7500
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7500
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: master (7.0)
>
>         Attachments: LUCENE_7500_avoid_leafReader_fields.patch, 
> LUCENE_7500_avoid_leafReader_fields.patch, 
> LUCENE_7500_Remove_LeafReader_fields.patch, 
> LUCENE_7500_Remove_LeafReader_fields.patch
>
>
> {{Fields}} seems like a pointless intermediary between the {{LeafReader}} and 
> {{Terms}}. Why not have {{LeafReader.getTerms(fieldName)}} instead? One loses 
> the ability to get the count and iterate over indexed fields, but it's not 
> clear what real use-cases are for that and such rare needs could figure that 
> out with FieldInfos.
> [~mikemccand] pointed out that we'd probably need to re-introduce a 
> {{TermVectors}} class since TV's are row-oriented not column-oriented.  IMO 
> they should be column-oriented but that'd be a separate issue.
> _(p.s. I'm lacking time to do this w/i the next couple months so if someone 
> else wants to tackle it then great)_



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to