RE: Deterministic Ngram Matcher Hits

Reza Naghibi Mon, 29 Dec 2014 07:15:13 -0800

I believe the traversal ordering should match the index ordering. This was done 
on purpose because I think a similar bug existed in the past.
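
As a minimal sketch of what "traversal ordering matches index ordering" can look like in Java (this is not the project's actual code; the class `OrderedIndex` and its methods are hypothetical names used only for illustration), an insertion-ordered map keeps iteration order identical to the order in which patterns were indexed, regardless of JDK version:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index, for illustration only.
final class OrderedIndex {
    // LinkedHashMap iterates in insertion order, unlike HashMap,
    // whose iteration order is unspecified and may differ between JDKs.
    private final Map<String, String> patternToDevice = new LinkedHashMap<>();

    // Registers a pattern for a device id (both are plain strings here).
    void add(String pattern, String deviceId) {
        patternToDevice.put(pattern, deviceId);
    }

    // Returns the patterns in the order they would be traversed.
    List<String> traversalOrder() {
        return new ArrayList<>(patternToDevice.keySet());
    }
}
```

With a plain `HashMap` in place of `LinkedHashMap`, the same sequence of `add` calls could be traversed in a different order on a different JDK.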


As for ranking, all matches are considered and the highest ranking is picked. 
If you look at the ranking function, it has several inputs.

So are you saying we return multiple possible devices to the user? I'm going to 
say no; it's the project's job to remove this kind of ambiguity for the user.

The example below is not an algorithm problem. It's a data problem. We just 
need to clean up the data and get rid of these incorrect patterns.



-------- Original message --------
From: Volkan YAZICI <[email protected]>
Date: 12/29/2014 6:46 AM (GMT-05:00)
To: [email protected]
Cc:
Subject: Deterministic Ngram Matcher Hits

Hi all,

If I am not mistaken, the ngram matcher in use has the potential to return
different results for different traversal orderings provided by the
underlying collections framework. This is also evident from the following
issues:

   - HTC One X+ matches to both HTC One X and HTC_One_X.
   <http://markmail.org/message/rzgioqbm22wtzt3p>
   - DMAP-112: Java client test fails with JDK 1.8.0-25
   <https://issues.apache.org/jira/browse/DMAP-112>

I have been thinking about this, and it occurred to me that instead of
returning a single hit with the highest score (which hit that is varies with
the collection traversal ordering in use), we could return the set of all
feasible hits that share that highest score. I believe this will make it
easier to unit test the matcher on different platforms. Comments?
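
The proposal above could be sketched roughly as follows. This is an illustration under assumptions, not the matcher's real code: the class `TopHits`, the method `topHits`, and the integer scores are hypothetical, and the ranking function is assumed to have already produced one score per candidate device id.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
final class TopHits {
    // Returns every candidate id that shares the maximum score.
    // The result is a sorted set, so it does not depend on the
    // iteration order of the input map.
    static SortedSet<String> topHits(Map<String, Integer> scores) {
        SortedSet<String> best = new TreeSet<>();
        int bestScore = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            int score = e.getValue();
            if (score > bestScore) {
                // A strictly better score discards all earlier ties.
                bestScore = score;
                best.clear();
            }
            if (score == bestScore) {
                best.add(e.getKey());
            }
        }
        return best;
    }
}
```

A unit test can then compare the returned set against an expected set, and the comparison holds whether the candidates arrive from a `HashMap`, a `LinkedHashMap`, or any other collection.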

Best.
