[ 
https://issues.apache.org/jira/browse/LUCENE-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353400#comment-15353400
 ] 

Steve Rowe commented on LUCENE-7354:
------------------------------------

{quote}
In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a 
Field object, we are incorrectly calling toString on the Field object, which 
puts the Field attributes (indexed, stored, et. al) into the String that is 
returned.
{quote}

I don't see this. When I run {{CloudMLTQParserTest}} without your patch and 
look at {{MoreLikeThis.retrieveTerms()}}, where {{String.valueOf(fieldValue)}} 
is called (I pulled the value of that expression out into a variable and set a 
breakpoint there in the debugger), I see only the actual field values, with no 
indexed, stored, et al.

Indexed, stored, et al. are Field*Type* attributes, not Field attributes, 
right?  
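
To make the distinction concrete, here is a minimal sketch using stand-in classes. {{SketchField}} and {{SketchFieldType}} are hypothetical, not Lucene's classes, and the {{toString()}} format only approximates what a real Field prints. It shows that type attributes can leak into the text only when the object passed to {{String.valueOf()}} is the field itself rather than its value:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a field type; it owns the indexed/stored flags.
class SketchFieldType {
  final boolean stored;
  final boolean indexed;

  SketchFieldType(boolean stored, boolean indexed) {
    this.stored = stored;
    this.indexed = indexed;
  }

  @Override
  public String toString() {
    List<String> flags = new ArrayList<>();
    if (stored) flags.add("stored");
    if (indexed) flags.add("indexed");
    return String.join(",", flags);
  }
}

// Hypothetical stand-in for a field: a type, a name and a value.
class SketchField {
  final SketchFieldType type;
  final String name;
  final Object value;

  SketchField(SketchFieldType type, String name, Object value) {
    this.type = type;
    this.name = name;
    this.value = value;
  }

  // Only the value, and only for String/Number payloads.
  String stringValue() {
    return (value instanceof String || value instanceof Number) ? value.toString() : null;
  }

  // Type attributes plus name and value (approximate format, an assumption).
  @Override
  public String toString() {
    return type + "<" + name + ":" + value + ">";
  }
}

class ToStringVsStringValue {
  public static void main(String[] args) {
    SketchField f = new SketchField(new SketchFieldType(true, true), "title", "hello world");
    System.out.println(String.valueOf(f));   // stored,indexed<title:hello world>
    System.out.println(f.stringValue());     // hello world
    System.out.println(String.valueOf((Object) "hello world")); // hello world
  }
}
```

So whether the attributes show up depends on whether {{fieldValue}} is a field object or an already-extracted value at that point.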

In {{CloudMLTQParser.parse()}}, where the filtered doc is composed, your patch 
has a nocommit (the only one I see in the patch). {{Field.stringValue()}} 
returns {{value.toString()}} only if the value is a String or a Number, and 
null otherwise, so it's definitely possible to have no string value for binary 
fields or geo fields. I guess the question is whether people want to use 
non-text/non-scalar fields for MLT:

{code:java}
    for (String field : fieldNames) {
      Collection<Object> fieldValues = doc.getFieldValues(field);
      if (fieldValues != null) {
        Collection<String> strings = new ArrayList<>(fieldValues.size());
        for (Object value : fieldValues) {
          if (value instanceof Field){
            String sv = ((Field) value).stringValue();
            if (sv != null) {
              strings.add(sv);
            } // TODO: nocommit: what to do when we don't have StringValue?
              // I don't think it is possible in this case, but need to check on this
          } else {
            strings.add(value.toString());
          }
        }
        filteredDocument.put(field, strings);
      }
    }
{code}
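
One possible way to resolve the nocommit, sketched under assumptions: treat a null string value as "not usable for MLT" and skip it. {{StubField}} below is a hypothetical stand-in that models only the {{stringValue()}} contract described above, and skipping is just one option (logging or failing would be others); the patch itself leaves this open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a field; models only the stringValue() contract:
// a string for String/Number payloads, null for anything else (e.g. binary).
class StubField {
  private final Object data;

  StubField(Object data) {
    this.data = data;
  }

  String stringValue() {
    return (data instanceof String || data instanceof Number) ? data.toString() : null;
  }
}

class MltValueFilter {
  // Collects the values that have a string form; stub fields whose
  // stringValue() is null are skipped (an assumption, not the patch's decision).
  static List<String> toStrings(Collection<Object> fieldValues) {
    List<String> strings = new ArrayList<>(fieldValues.size());
    for (Object value : fieldValues) {
      if (value instanceof StubField) {
        String sv = ((StubField) value).stringValue();
        if (sv != null) {
          strings.add(sv);
        }
        // else: binary or other non-scalar payload, nothing to add
      } else {
        strings.add(String.valueOf(value));
      }
    }
    return strings;
  }

  public static void main(String[] args) {
    Collection<Object> values = Arrays.asList(
        new StubField("lucene"), new StubField(42), new StubField(new byte[] {1, 2}), "plain");
    System.out.println(toStrings(values)); // prints [lucene, 42, plain]
  }
}
```

With this approach a document whose MLT fields are all binary would simply contribute no terms for those fields.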

> MoreLikeThis incorrectly does toString on Field object
> ------------------------------------------------------
>
>                 Key: LUCENE-7354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7354
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 6.0.1, 5.5.1, master (7.0)
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: LUCENE-7354-mlt-fix
>
>
> In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a 
> Field object, we are incorrectly calling toString on the Field object, which 
> puts the Field attributes (indexed, stored, et. al) into the String that is 
> returned.
> I'll put up a patch/fix shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
