[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836452#action_12836452 ]
Ted Dunning commented on MAHOUT-300:
------------------------------------

Huh.... some of those times are a little surprising.

For DotProduct and CosineDistanceMeasure, SequentialAccessSparseVector is 3x faster than RandomAccessSparseVector and 8x faster than DenseVector. There the world is good.

But for SquaredEuclideanDistanceMeasure and TanimotoDistanceMeasure, there is little difference, while for ManhattanDistanceMeasure, SequentialAccessSparseVector is slower than RandomAccessSparseVector.

Is it just that for these last 3 distances the sequentiality has not been taken into account?

{noformat}
DotProduct
Rate = 3877.9443 MB/s
Rate = 9846.534 MB/s
Rate = 31736.123 MB/s

org.apache.mahout.common.distance.CosineDistanceMeasure
Speed = 1690.1599 /sec
Speed = 3366.8774 /sec
Speed = 12309.282 /sec

org.apache.mahout.common.distance.EuclideanDistanceMeasure
Speed = 2913.8206 /sec
Speed = 5868.9404 /sec
Speed = 8209.688 /sec

org.apache.mahout.common.distance.ManhattanDistanceMeasure
Speed = 867.9127 /sec
Speed = 2435.4307 /sec
Speed = 1048.7443 /sec

org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
Speed = 3387.1472 /sec
Speed = 7091.4087 /sec
Speed = 8785.509 /sec

org.apache.mahout.common.distance.TanimotoDistanceMeasure
Speed = 1803.4031 /sec
Speed = 3873.8967 /sec
Speed = 6844.7017 /sec
{noformat}

> Solve performance issues with Vector Implementations
> ----------------------------------------------------
>
>                 Key: MAHOUT-300
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-300
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>             Fix For: 0.3
>
>         Attachments: MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch
>
>
> AbstractVector operations like times
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       int index = element.index();
>       result.setQuick(index, element.get() * x);
>     }
>     return result;
>   }
> should be implemented as follows
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = result.iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       element.set(element.get() * x);
>     }
>     return result;
>   }

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
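
For illustration only, here is a minimal, self-contained sketch of the two times() patterns quoted in the issue description. It does not use Mahout's classes: TinySparseVector, timesByIndex and timesInPlace are hypothetical names made up for this sketch, and the sorted-array storage is an assumption standing in for a sequential-access sparse layout. It only tries to show why writing back by index can cost a lookup per element, while updating the copy's own elements in place does not.

```java
import java.util.Arrays;

// Hypothetical stand-in for a sparse vector; NOT Mahout's actual classes.
final class TinySparseVector {
    private final int[] indices;   // sorted indices of the non-zero entries
    private final double[] values; // values[i] belongs to indices[i]

    TinySparseVector(int[] indices, double[] values) {
        this.indices = indices.clone();
        this.values = values.clone();
    }

    TinySparseVector copy() {
        return new TinySparseVector(indices, values);
    }

    // Read by index: binary search over the sorted indices.
    double get(int index) {
        int pos = Arrays.binarySearch(indices, index);
        return pos >= 0 ? values[pos] : 0.0;
    }

    // Write by index: repeats the search (this sketch only updates existing entries).
    void set(int index, double value) {
        int pos = Arrays.binarySearch(indices, index);
        if (pos < 0) {
            throw new IllegalArgumentException("sketch only updates existing entries");
        }
        values[pos] = value;
    }

    // First pattern from the issue: iterate this vector, write into the copy by index.
    TinySparseVector timesByIndex(double x) {
        TinySparseVector result = copy();
        for (int i = 0; i < indices.length; i++) {
            result.set(indices[i], values[i] * x); // one search per element
        }
        return result;
    }

    // Second pattern from the issue: iterate the copy's own storage and update in place.
    TinySparseVector timesInPlace(double x) {
        TinySparseVector result = copy();
        for (int i = 0; i < result.values.length; i++) {
            result.values[i] *= x; // no index lookup needed
        }
        return result;
    }

    public static void main(String[] args) {
        TinySparseVector v =
            new TinySparseVector(new int[] {1, 4, 9}, new double[] {2.0, -1.0, 0.5});
        System.out.println(v.timesByIndex(3.0).get(4));
        System.out.println(v.timesInPlace(3.0).get(4));
    }
}
```

Both methods return the same values; in this sketch the difference is only in how each element of the copy is reached.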