I believe the traversal ordering should match the index ordering. This was done on purpose because I think a similar bug existed in the past.
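To illustrate the ordering point, here is a minimal sketch, not the actual client code. It assumes the patterns live in an insertion-ordered map (a LinkedHashMap) and that ties go to the entry that appears first in the index. The class name, method names, and rank values are hypothetical and made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderedMatcherSketch {

    // Hypothetical pattern index: pattern text -> rank.
    // LinkedHashMap iterates in insertion order, so traversal order
    // equals index order on every JVM. The ranks are invented values.
    static Map<String, Integer> buildIndex() {
        Map<String, Integer> index = new LinkedHashMap<>();
        index.put("HTC One X", 10);
        index.put("HTC_One_X", 10); // same rank: the ambiguous case
        index.put("HTC One", 5);
        return index;
    }

    // Returns the highest-ranked pattern. The strict '>' means that,
    // among equal ranks, the entry seen first (earliest in the index) wins.
    static String bestMatch(Map<String, Integer> index) {
        String best = null;
        int bestRank = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> entry : index.entrySet()) {
            if (entry.getValue() > bestRank) {
                best = entry.getKey();
                bestRank = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(bestMatch(buildIndex())); // prints "HTC One X"
    }
}
```

With a plain HashMap the iteration order is unspecified and may differ between JDK versions, so the same tie could resolve differently from one platform to another.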
As for ranking, all matches are considered and the highest-ranked one is picked. If you look at the ranking function, it takes several inputs. So are you saying we should return multiple possible devices to the user? I'm going to say no; it's the project's job to remove this kind of ambiguity for the user. The example below is not an algorithm problem. It's a data problem. We just need to clean up the data and get rid of these incorrect patterns.

-------- Original message --------
From: Volkan YAZICI <[email protected]>
Date: 12/29/2014 6:46 AM (GMT-05:00)
To: [email protected]
Cc:
Subject: Deterministic Ngram Matcher Hits

Hi all,

If I am not mistaken, the ngram matcher in use can return different results depending on the traversal ordering provided by the underlying collections framework. This is also evident from the following issues:

- HTC One X+ matches to both HTC One X and HTC_One_X. <http://markmail.org/message/rzgioqbm22wtzt3p>
- DMAP-112: Java client test fails with JDK 1.8.0-25 <https://issues.apache.org/jira/browse/DMAP-112>

I have been thinking about this, and it occurred to me that instead of returning a single hit with the highest score (which varies with the collection traversal ordering), we can return the set of all feasible hits with the same score. I believe this will make it easier to unit test the matcher on different platforms.

Comments?

Best.
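The proposal in the quoted message, returning every hit that shares the top score, could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the scores are invented, and the matcher's real scoring function is not shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class TopHitsSketch {

    // Collects every candidate whose score equals the maximum score.
    // The result is a sorted set, so it does not depend on the
    // traversal order of the input map.
    static SortedSet<String> topHits(Map<String, Integer> scores) {
        SortedSet<String> hits = new TreeSet<>();
        int best = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            int score = entry.getValue();
            if (score > best) {
                // A strictly better score discards the previous ties.
                best = score;
                hits.clear();
            }
            if (score == best) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // HashMap on purpose: its iteration order is unspecified.
        Map<String, Integer> scores = new HashMap<>();
        scores.put("HTC_One_X", 10); // invented scores
        scores.put("HTC One", 5);
        scores.put("HTC One X", 10);
        System.out.println(topHits(scores)); // prints "[HTC One X, HTC_One_X]"
    }
}
```

A unit test can then compare the returned set against an expected set and get the same answer on any JDK, although the caller still has to decide which device to report when the set has more than one element.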
