[ https://issues.apache.org/jira/browse/LUCENE-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353400#comment-15353400 ]
Steve Rowe commented on LUCENE-7354:
------------------------------------

{quote}
In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a Field object, we are incorrectly calling toString on the Field object, which puts the Field attributes (indexed, stored, et. al) into the String that is returned.
{quote}

I don't see this - when I run {{CloudMLTQParserTest}} without your patch, and I look at {{MoreLikeThis.retrieveTerms()}} where {{String.valueOf(fieldValue)}} is called (by pulling the value of that expression out into a variable and breaking there in the debugger), I only see the actual field values - no indexed, stored, et al. Indexed, stored, et al. are Field*Type* attributes, not Field attributes, right?

In {{CloudMLTQParser.parse()}} where the filtered doc is composed, in your patch you have a nocommit (the only one I see in your patch) - Field.stringValue() returns {{value.toString()}}, but only if it's a String or a Number, and otherwise null, so it's definitely possible to not have a string value for binary fields or geo fields - I guess the question is whether people want to use non-text/non-scalar fields for MLT?:

{code:java}
for (String field : fieldNames) {
  Collection<Object> fieldValues = doc.getFieldValues(field);
  if (fieldValues != null) {
    Collection<String> strings = new ArrayList<>(fieldValues.size());
    for (Object value : fieldValues) {
      if (value instanceof Field){
        String sv = ((Field) value).stringValue();
        if (sv != null) {
          strings.add(sv);
        }//TODO: nocommit: what to do when we don't have StringValue? I don't think it is possible in this case, but need to check on this
      } else {
        strings.add(value.toString());
      }
    }
    filteredDocument.put(field, strings);
  }
}
{code}

> MoreLikeThis incorrectly does toString on Field object
> ------------------------------------------------------
>
>                 Key: LUCENE-7354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7354
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 6.0.1, 5.5.1, master (7.0)
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: LUCENE-7354-mlt-fix
>
>
> In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a
> Field object, we are incorrectly calling toString on the Field object, which
> puts the Field attributes (indexed, stored, et. al) into the String that is
> returned.
> I'll put up a patch/fix shortly.
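To make the nocommit question in the comment concrete, here is a minimal self-contained sketch of how the filtering loop behaves when a field has no string value. It does not use Lucene: {{FakeField}} is a hypothetical stand-in, its {{stringValue()}} only mirrors the String-or-Number rule described in the comment, and the {{toString()}} format is an assumption for illustration. The class name {{StringValueSketch}} and method {{toStrings}} are also made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class StringValueSketch {

    /** Hypothetical stand-in for a Field; NOT the real Lucene class. */
    static final class FakeField {
        private final String name;
        private final Object data;

        FakeField(String name, Object data) {
            this.name = name;
            this.data = data;
        }

        /** Mirrors the rule from the comment: String or Number only, otherwise null. */
        String stringValue() {
            return (data instanceof String || data instanceof Number) ? data.toString() : null;
        }

        /** Assumed format: type attributes mixed in with the name and value. */
        @Override
        public String toString() {
            return "stored,indexed<" + name + ":" + data + ">";
        }
    }

    /** Same shape as the loop in the patch: values without a string value are dropped. */
    static List<String> toStrings(Collection<Object> fieldValues) {
        List<String> strings = new ArrayList<>(fieldValues.size());
        for (Object value : fieldValues) {
            if (value instanceof FakeField) {
                String sv = ((FakeField) value).stringValue();
                if (sv != null) {
                    strings.add(sv);
                }
                // A binary (or other non-scalar) value is silently skipped here.
            } else {
                strings.add(value.toString());
            }
        }
        return strings;
    }

    public static void main(String[] args) {
        Collection<Object> values = Arrays.<Object>asList(
                "plain text",
                new FakeField("title", "lucene"),
                new FakeField("price", 42),
                new FakeField("blob", new byte[] {1, 2}));
        System.out.println(toStrings(values));
    }
}
```

In this sketch the binary value simply drops out of the result. Whether silently skipping such fields is acceptable for MLT is exactly the open question in the nocommit.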