[ 
https://issues.apache.org/jira/browse/MAHOUT-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Schelter resolved MAHOUT-803.
---------------------------------------

    Resolution: Won't Fix
    
> Complete minsize constraints for similarity measures used in RowSimilarityJob
> -----------------------------------------------------------------------------
>
>                 Key: MAHOUT-803
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-803
>             Project: Mahout
>          Issue Type: Task
>          Components: Math
>    Affects Versions: 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Suneel Marthi
>             Fix For: Backlog
>
>
> The latest implementation of RowSimilarityJob allows specifying a threshold 
> for the minimum similarity value of the resulting row pairs.
> A measure can specify a min-size constraint via 
> VectorSimilarityMeasure.consider(...) to prune some candidate pairs very 
> early by looking at statistics computed for the individual rows.
> For example, if cooccurrence count is used as the similarity measure and a 
> threshold of 5 is set, then all row pairs where one of the vectors has fewer 
> than 5 non-zero components can be discarded.
> These min-size constraints are still missing for CityBlockSimilarity, 
> LoglikelihoodSimilarity and EuclideanDistanceSimilarity.
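
The cooccurrence example in the issue description can be sketched as follows. This is a minimal, standalone illustration and not Mahout's actual code: the class name, the method signatures and the dense double[] row representation are all assumptions made for the sketch. It relies on the fact that the cooccurrence count of two rows can never exceed the smaller of their non-zero counts, so a pair can be rejected from per-row statistics alone.

```java
// Hypothetical sketch of a min-size constraint for cooccurrence count.
// None of these names are taken from Mahout; they are illustrative only.
final class CooccurrencePruningSketch {

    // Returns true if the pair might still reach the threshold and has to be
    // evaluated, false if it can be discarded early.
    // Reasoning: cooccurrenceCount(a, b) <= min(nonZerosA, nonZerosB).
    static boolean consider(int nonZerosA, int nonZerosB, double threshold) {
        return Math.min(nonZerosA, nonZerosB) >= threshold;
    }

    // Per-row statistic: the number of non-zero components of a row.
    static int nonZeros(double[] row) {
        int count = 0;
        for (double value : row) {
            if (value != 0.0) {
                count++;
            }
        }
        return count;
    }

    // Exact cooccurrence count: positions that are non-zero in both rows.
    static int cooccurrenceCount(double[] a, double[] b) {
        int length = Math.min(a.length, b.length);
        int count = 0;
        for (int i = 0; i < length; i++) {
            if (a[i] != 0.0 && b[i] != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[] a = {1, 0, 2, 0, 3, 0, 0}; // 3 non-zero components
        double[] b = {1, 1, 1, 1, 1, 1, 1}; // 7 non-zero components
        double threshold = 5;

        // The cheap check runs first; the exact count is only computed
        // for pairs that survive it.
        if (consider(nonZeros(a), nonZeros(b), threshold)) {
            System.out.println("evaluate pair: " + cooccurrenceCount(a, b));
        } else {
            System.out.println("pair pruned without computing the similarity");
        }
    }
}
```

The check is conservative: it never discards a pair whose cooccurrence count would meet the threshold, it only skips pairs that provably cannot. Analogous bounds for the three measures named above would have to be derived separately and are not attempted here.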

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
