On 9/11/2018 8:32 PM, John Blythe wrote:
we recently migrated to cloud. part of that migration jumped us from 6.1 to
7.4.

one example query between our old solr instance and our new cloud instance
produces 42 results and 19k results.

the analyzer is the same aside from WordDelimiterFilterFactory moving over
to the graph variation of it and the lucene parser moving from 6.1 to 7.4
obviously.

Did you completely reindex after changing your schema?  Skipping the reindex, especially if you are reusing the index built by the earlier version, can lead to problems.  Have you checked what happens if you use the non-graph version of WDF (and completely reindex), so you can see whether that changes anything?  That filter is deprecated and slated for eventual removal, but it's still there for all of 7.x.

Adding "debug=query" to your URL parameters is very useful in locating differences.  Maybe 6.1 and 7.4 are parsing the query differently.  There's a good chance that this will reveal something we can pursue.
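As a rough illustration of that comparison, here is a minimal Python sketch that builds the same request with debug=query for two instances and pulls out the "parsedquery" entry from the debug section of the JSON response.  The host names, core name, and helper function names are all made up for the example; substitute your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_debug_url(base_url, query, extra_params=None):
    """Build a /select URL that asks Solr for query debug output."""
    # rows=0 because we only care about how the query was parsed.
    params = {"q": query, "debug": "query", "rows": 0, "wt": "json"}
    if extra_params:
        params.update(extra_params)
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


def extract_parsed_query(response):
    """Return the parsed query string from a decoded Solr JSON response."""
    return response.get("debug", {}).get("parsedquery")


def fetch_parsed_query(base_url, query, extra_params=None):
    """Send the request and return the parsed query (needs a live Solr)."""
    with urlopen(build_debug_url(base_url, query, extra_params)) as resp:
        return extract_parsed_query(json.load(resp))


if __name__ == "__main__":
    # Hypothetical endpoints: the old 6.1 instance and the new 7.4 cloud.
    for base in ("http://old-solr:8983/solr/mycore",
                 "http://new-solr:8983/solr/mycore"):
        print(base, "->", fetch_parsed_query(base, "your query here"))
```

If the two parsed queries differ, that difference is usually the quickest pointer to which analysis or parser setting changed.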

i've used the analysis tool in solr admin to try to determine the
difference between the two. i'm seeing the same output between index and
query results yet when actually running the queries have that huge
divergence of results.

One of the big differences between 6.x and 7.x for query parsing is that the sow (split on whitespace) parameter defaults to true in 6.x (the parameter was only added in 6.5, so in 6.1 the parser always splits on whitespace).  In 7.x, that parameter defaults to false.  So the query parser in 7.x tends to behave *exactly* like what you see in the analysis tool, whereas in 6.x the input would be split on whitespace before ever reaching analysis, which can result in very subtle differences in how the input is analyzed.  Adding "sow=true" to your URL parameters is something you can try as a quick test.
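To make the effect of splitting on whitespace concrete, here is a toy Python sketch.  It is not Solr code: the "analyzer" is a stand-in that lowercases, tokenizes, and applies one made-up multi-word synonym rule, and the real parser also handles operators, fields, and so on.  It only shows why the same analysis chain can produce different terms depending on whether it sees the whole input or one whitespace-separated chunk at a time.

```python
# Made-up multi-word synonym rule, purely for illustration.
SYNONYMS = {("wi", "fi"): "wifi"}


def analyze(text):
    """Toy analysis chain: lowercase, tokenize, collapse known word pairs."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in SYNONYMS:
            out.append(SYNONYMS[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def parse(query, sow):
    """Mimic the sow switch: split before analysis, or analyze the whole input."""
    chunks = query.split() if sow else [query]
    return [term for chunk in chunks for term in analyze(chunk)]


print(parse("Wi Fi router", sow=True))   # ['wi', 'fi', 'router']
print(parse("Wi Fi router", sow=False))  # ['wifi', 'router']
```

With sow=true the analyzer never sees "wi" and "fi" together, so the multi-word rule cannot fire; with sow=false it sees the full input, which matches what the analysis tool shows.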

Thanks,
Shawn
