[ https://issues.apache.org/jira/browse/MAHOUT-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Schelter resolved MAHOUT-803.
---------------------------------------
Resolution: Won't Fix
> Complete minsize constraints for similarity measures used in RowSimilarityJob
> -----------------------------------------------------------------------------
>
> Key: MAHOUT-803
> URL: https://issues.apache.org/jira/browse/MAHOUT-803
> Project: Mahout
> Issue Type: Task
> Components: Math
> Affects Versions: 0.6
> Reporter: Sebastian Schelter
> Assignee: Suneel Marthi
> Fix For: Backlog
>
>
> The latest implementation of RowSimilarityJob allows specifying a threshold
> for the minimum similarity value of the resulting row pairs.
> A measure can specify a min-size constraint via
> VectorSimilarityMeasure.consider(...) to prune candidate pairs very
> early by looking at statistics computed for the individual rows.
> For example, if cooccurrence count is used as the similarity measure and a
> threshold of 5 is set, then all row pairs where one of the vectors has fewer
> than 5 non-zero components can be discarded.
> These min-size constraints are still missing for CityBlockSimilarity,
> LoglikelihoodSimilarity and EuclideanDistanceSimilarity.
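
The cooccurrence example in the quoted description can be illustrated with a small, self-contained sketch. It is not Mahout code: the `PruningRule` interface, the class name, and all method names are hypothetical, and the three-argument `consider` is a simplified stand-in that should not be read as the signature of `VectorSimilarityMeasure.consider(...)`. The sketch only shows why the constraint is safe: the cooccurrence count of two rows can never exceed the smaller of their non-zero counts, so a pair with a row below the threshold cannot reach it.

```java
import java.util.ArrayList;
import java.util.List;

public class MinSizePruningSketch {

    // Hypothetical pruning hook for illustration; NOT Mahout's actual interface.
    public interface PruningRule {
        boolean consider(int nonZerosA, int nonZerosB, double threshold);
    }

    // Min-size constraint for cooccurrence count: the count is bounded by the
    // smaller non-zero count, so both rows need at least `threshold` non-zeros.
    public static final PruningRule COOCCURRENCE =
        (nonZerosA, nonZerosB, threshold) -> nonZerosA >= threshold && nonZerosB >= threshold;

    // Per-row statistic that the pruning rule looks at.
    public static int countNonZeros(double[] row) {
        int n = 0;
        for (double v : row) {
            if (v != 0.0) {
                n++;
            }
        }
        return n;
    }

    // Number of positions where both rows have a non-zero component.
    public static int cooccurrenceCount(double[] a, double[] b) {
        int n = 0;
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            if (a[i] != 0.0 && b[i] != 0.0) {
                n++;
            }
        }
        return n;
    }

    // Returns index pairs {i, j} with i < j whose cooccurrence count meets the
    // threshold. Pairs rejected by the rule are skipped before any comparison.
    public static List<int[]> similarPairs(double[][] rows, PruningRule rule, double threshold) {
        int[] nonZeros = new int[rows.length];
        for (int i = 0; i < rows.length; i++) {
            nonZeros[i] = countNonZeros(rows[i]);
        }
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            for (int j = i + 1; j < rows.length; j++) {
                if (!rule.consider(nonZeros[i], nonZeros[j], threshold)) {
                    continue; // pruned early from row statistics alone
                }
                if (cooccurrenceCount(rows[i], rows[j]) >= threshold) {
                    result.add(new int[] {i, j});
                }
            }
        }
        return result;
    }
}
```

With a threshold of 5, a row that has only two non-zero components is dropped from every pair without its entries being compared. Analogous rules for CityBlockSimilarity, LoglikelihoodSimilarity and EuclideanDistanceSimilarity would each need their own bound and are not sketched here.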