[ 
https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836452#action_12836452
 ] 

Ted Dunning commented on MAHOUT-300:
------------------------------------

Huh.... some of those times are a little surprising.

For DotProduct and CosineDistanceMeasure, SequentialAccessSparseVector is 3x 
faster than RandomAccessSparseVector and 8x faster than DenseVector.  There the 
world is good.

But for SquaredEuclideanDistanceMeasure and TanimotoDistanceMeasure, there is 
little difference, while for ManhattanDistanceMeasure 
SequentialAccessSparseVector is actually slower than RandomAccessSparseVector.

Is it just that for these last 3 distances the sequentiality has not been taken 
into account?

{noformat}
DotProduct
                             Rate = 3877.9443 MB/s         Rate = 9846.534 MB/s          Rate = 31736.123 MB/s

org.apache.mahout.common.distance.CosineDistanceMeasure
                             Speed = 1690.1599 /sec        Speed = 3366.8774 /sec        Speed = 12309.282 /sec

org.apache.mahout.common.distance.EuclideanDistanceMeasure
                             Speed = 2913.8206 /sec        Speed = 5868.9404 /sec        Speed = 8209.688 /sec

org.apache.mahout.common.distance.ManhattanDistanceMeasure
                             Speed = 867.9127 /sec         Speed = 2435.4307 /sec        Speed = 1048.7443 /sec

org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
                             Speed = 3387.1472 /sec        Speed = 7091.4087 /sec        Speed = 8785.509 /sec

org.apache.mahout.common.distance.TanimotoDistanceMeasure
                             Speed = 1803.4031 /sec        Speed = 3873.8967 /sec        Speed = 6844.7017 /sec
{noformat}
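
To make the question about sequentiality concrete: when both vectors keep their non-zero entries sorted by index, a distance can be computed in a single merge pass with no per-element lookups. The following is only a rough sketch of that idea for Manhattan distance. It assumes each vector is given as parallel arrays of strictly increasing indices and values; the class name SequentialMerge and the method signature are made up for illustration and are not the code in the patch.

```java
// Hypothetical sketch; not Mahout's actual implementation.
public final class SequentialMerge {
  private SequentialMerge() {}

  /**
   * Manhattan distance between two sparse vectors, each given as parallel
   * arrays of strictly increasing indices and the values stored at them.
   */
  public static double manhattan(int[] idxA, double[] valA, int[] idxB, double[] valB) {
    double sum = 0.0;
    int i = 0;
    int j = 0;
    // Walk both index lists in step, always advancing the smaller index.
    while (i < idxA.length && j < idxB.length) {
      if (idxA[i] == idxB[j]) {
        // Both vectors have an entry here.
        sum += Math.abs(valA[i++] - valB[j++]);
      } else if (idxA[i] < idxB[j]) {
        // Only A has an entry; B is implicitly zero.
        sum += Math.abs(valA[i++]);
      } else {
        // Only B has an entry; A is implicitly zero.
        sum += Math.abs(valB[j++]);
      }
    }
    // Drain whichever vector still has entries left.
    while (i < idxA.length) {
      sum += Math.abs(valA[i++]);
    }
    while (j < idxB.length) {
      sum += Math.abs(valB[j++]);
    }
    return sum;
  }
}
```

If a distance measure instead iterates one vector and calls a random-access get on the other, a sequential-access vector would likely pay for a search on every lookup, which might explain numbers like the Manhattan ones above.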


> Solve performance issues with Vector Implementations
> ----------------------------------------------------
>
>                 Key: MAHOUT-300
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-300
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>             Fix For: 0.3
>
>         Attachments: MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, 
> MAHOUT-300.patch, MAHOUT-300.patch
>
>
> AbstractVector operations like times
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       int index = element.index();
>       result.setQuick(index, element.get() * x);
>     }
>     return result;
>   }
> should be implemented as follows
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = result.iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       element.set(element.get() * x);
>     }
>     return result;
>   }
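
The difference between the two versions quoted above is that the second iterates over the clone and writes through the element, so no lookup by index is needed for each update. A minimal standalone sketch of that pattern follows. ToySparseVector is a hypothetical toy class invented for this illustration; it is not Mahout's AbstractVector and only mirrors the method names used in the issue description.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical toy vector, for illustration only; not Mahout's AbstractVector.
public final class ToySparseVector {
  private final int[] indices;   // strictly increasing
  private final double[] values; // values[k] belongs to indices[k]

  public ToySparseVector(int[] indices, double[] values) {
    this.indices = indices.clone();
    this.values = values.clone();
  }

  /** View of one stored entry; set() writes straight into the backing array. */
  public final class Element {
    private int offset = -1;

    public int index() { return indices[offset]; }
    public double get() { return values[offset]; }
    public void set(double v) { values[offset] = v; }
  }

  /** Iterates the stored entries in index order, reusing one Element view. */
  public Iterator<Element> iterateNonZero() {
    return new Iterator<Element>() {
      private final Element element = new Element();

      @Override public boolean hasNext() {
        return element.offset + 1 < indices.length;
      }

      @Override public Element next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        element.offset++;
        return element;
      }
    };
  }

  /** Random-access read; needs a binary search, unlike Element.get(). */
  public double get(int index) {
    int k = Arrays.binarySearch(indices, index);
    return k >= 0 ? values[k] : 0.0;
  }

  /** Scales a copy in place through its own iterator, with no index lookups. */
  public ToySparseVector times(double x) {
    ToySparseVector result = new ToySparseVector(indices, values);
    Iterator<Element> iter = result.iterateNonZero();
    while (iter.hasNext()) {
      Element element = iter.next();
      element.set(element.get() * x);
    }
    return result;
  }
}
```

In this toy layout a set by index would cost a binary search per element, while element.set() is a plain array write, which is the kind of saving the proposed change is after.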

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
