[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836249#action_12836249 ]
Ted Dunning commented on MAHOUT-300:
------------------------------------

{quote}
ted: It may be that someday we will need maxNonZero, but we can do that when it comes up.

Then no need of all these checks. Just need to iterateAll(). It will be slow. But that's the penalty you pay to use this function (should be documented) on a large sparse vector. If you just need the maxNonZero, that should use iterateNonZero. All implementations return -INF if nothing is found.
{quote}

Quite the contrary! We need sparse implementations that compute the same result as the iterateAll version, but much faster for sparse cases.

The difference between max and maxNonZero is definitional. Each of them should return the same value whether its input is stored in sparse or non-sparse form. Both should also take advantage of sparseness to be as fast as possible. And all of these cases should have semi-reasonable behavior on unreasonable inputs.

max should return the largest value x in the vector such that for all values y in the vector y <= x. This can be done by scanning all non-zero values explicitly and then handling all zero values in one comparison. If there are NO values in the vector, then the result is undefined and a domain exception is warranted.

maxNonZero should do the same thing, but should be applied only to non-zero values. If there are no non-zero values, then the same domain exception needs to be raised.
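A minimal sketch of the two definitions above, for illustration only. SparseVectorSketch is a hypothetical stand-in backed by a map of non-zero entries, not Mahout's Vector API, and the use of ArithmeticException as the "domain exception" is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a sparse vector; not Mahout's Vector API. */
final class SparseVectorSketch {
  private final int size;  // logical length, implicit zeros included
  private final Map<Integer, Double> nonZeros = new TreeMap<>();

  SparseVectorSketch(int size) {
    if (size < 0) {
      throw new IllegalArgumentException("size must be >= 0");
    }
    this.size = size;
  }

  void set(int index, double value) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    if (value == 0.0) {
      nonZeros.remove(index);  // zeros are never stored explicitly
    } else {
      nonZeros.put(index, value);
    }
  }

  /** Largest x such that every y in the vector satisfies y <= x. */
  double max() {
    if (size == 0) {
      // Assumption: ArithmeticException plays the role of the domain exception.
      throw new ArithmeticException("max of an empty vector is undefined");
    }
    double best = Double.NEGATIVE_INFINITY;
    for (double v : nonZeros.values()) {
      best = Math.max(best, v);
    }
    // All implicit zeros are handled with one comparison.
    if (nonZeros.size() < size) {
      best = Math.max(best, 0.0);
    }
    return best;
  }

  /** Same as max(), but restricted to the non-zero values. */
  double maxNonZero() {
    if (nonZeros.isEmpty()) {
      throw new ArithmeticException("maxNonZero needs at least one non-zero value");
    }
    double best = Double.NEGATIVE_INFINITY;
    for (double v : nonZeros.values()) {
      best = Math.max(best, v);
    }
    return best;
  }
}
```

Both methods touch only the stored entries, so the cost depends on the number of non-zeros rather than on the logical size. A dense implementation following the same definitions should return the same values.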

> Solve performance issues with Vector Implementations
> ----------------------------------------------------
>
>                 Key: MAHOUT-300
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-300
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>             Fix For: 0.3
>
>         Attachments: MAHOUT-300.patch, MAHOUT-300.patch
>
>
> AbstractVector operations like times
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       int index = element.index();
>       result.setQuick(index, element.get() * x);
>     }
>     return result;
>   }
> should be implemented as follows
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = result.iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       element.set(element.get() * x);
>     }
>     return result;
>   }