Minhash based clustering 
-------------------------

                 Key: MAHOUT-344
                 URL: https://issues.apache.org/jira/browse/MAHOUT-344
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
            Reporter: Ankur


Minhash clustering performs probabilistic dimension reduction of 
high-dimensional data. The essence of the technique is to hash each item using 
multiple independent hash functions such that similar items are more likely to 
collide than dissimilar ones. Multiple such hash tables can then be 
constructed to answer near-neighbor queries efficiently.
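
As a rough illustration of the idea described above, here is a minimal Python
sketch. It is not the Mahout implementation: the function names, the use of
seeded SHA-1 as a stand-in for independent hash functions, and the parameters
num_hashes and keys_per_band are all assumptions made for this example.

```python
import hashlib


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a seed (illustrative stand-in
    for one of several independent hash functions)."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=20):
    """Return one minimum hash value per seeded hash function.

    `items` must be a non-empty collection of features (e.g. words)."""
    return [min(_hash(seed, t) for t in items) for seed in range(num_hashes)]


def cluster_keys(signature, keys_per_band=2):
    """Group consecutive minimums into bucket keys, one key per hash table.

    Items that share any (band, key) pair land in the same bucket."""
    return [
        (band, tuple(signature[i:i + keys_per_band]))
        for band, i in enumerate(range(0, len(signature), keys_per_band))
    ]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of the Jaccard
    similarity of the two underlying sets."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    a = minhash_signature({"apache", "mahout", "clustering", "minhash"})
    b = minhash_signature({"apache", "mahout", "clustering", "kmeans"})
    print(estimated_similarity(a, b))
    print(set(cluster_keys(a)) & set(cluster_keys(b)))
```

The more features two items share, the more signature positions agree, and the
more likely they are to share at least one bucket key. Raising keys_per_band
makes buckets stricter; using more bands gives similar items more chances to
collide.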

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
