[ https://issues.apache.org/jira/browse/MAHOUT-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved MAHOUT-1582.
-----------------------------------
       Resolution: Won't Fix
    Fix Version/s: 0.13.0

Resolving this as 'Won't Fix', please feel free to create a new Jira

> Create simpler row and column aggregation API at local level
> ------------------------------------------------------------
>
>                 Key: MAHOUT-1582
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1582
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>            Assignee: Suneel Marthi
>              Labels: legacy, math, scala
>             Fix For: 0.13.0
>
>
> The issue is that the current row and column aggregation API makes it
> difficult to do anything but row by row aggregation using anonymous classes.
> There is no scope for being aware of locality, nor to use the well known
> function definitions in Functions. This makes lots of optimizations
> impossible and many of these are optimizations that we want to have. An
> example would be adding up absolute values of values. With the current API,
> it would be very hard to optimize for sparse matrices and the wrong direction
> of iteration but with a different API, this should be easy.
> What I suggest is an API of this form:
> {code}
> Vector aggregateRows(DoubleDoubleFunction combiner, DoubleFunction mapper)
> {code}
> This will produce a vector with one element per row in the original. The
> nice thing here is that if the matrix is row major, we can iterate over rows
> and accumulate a value for each row using sparsity as available. On the
> other hand, if the matrix is column major, we can keep a vector of
> accumulators and still use sparsity as appropriate.
> The use of sparsity comes in because the matrix code now has control over
> both of the loops involved and also has visibility into properties of the map
> and combine functions. For instance, ABS(0) == 0 so if we combine with PLUS,
> we can use a sparse iterator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
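
For readers following the proposal in the issue description, here is a minimal sketch of how an aggregateRows-style call might behave for the two storage orders it mentions. It is not Mahout code. The class and method names, the array-based sparse layout, and the explicit `identity` parameter are all assumptions made for illustration, and `java.util.function.DoubleBinaryOperator` and `DoubleUnaryOperator` stand in for Mahout's `DoubleDoubleFunction` and `DoubleFunction` so that the example compiles on its own. Zeros are skipped only when `mapper(0)` equals the combiner's identity, which is the ABS/PLUS situation described in the issue.

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of the proposed aggregation; not part of Mahout. */
final class RowAggregationSketch {
    private RowAggregationSketch() {}

    // Skipping stored zeros is only valid when mapping a zero yields the
    // combiner's identity, e.g. abs(0) == 0 combined with plus.
    private static void requireSkippableZeros(DoubleUnaryOperator mapper, double identity) {
        if (mapper.applyAsDouble(0.0) != identity) {
            throw new IllegalArgumentException("mapper(0) must equal the combiner's identity");
        }
    }

    /**
     * Row-major case: rowValues[r] holds only the non-zero values of row r,
     * so each row is folded independently over its stored entries.
     */
    static double[] aggregateRowsRowMajor(double[][] rowValues,
                                          DoubleBinaryOperator combiner,
                                          DoubleUnaryOperator mapper,
                                          double identity) {
        requireSkippableZeros(mapper, identity);
        double[] result = new double[rowValues.length];
        for (int r = 0; r < rowValues.length; r++) {
            double acc = identity;
            for (double v : rowValues[r]) {
                acc = combiner.applyAsDouble(acc, mapper.applyAsDouble(v));
            }
            result[r] = acc;
        }
        return result;
    }

    /**
     * Column-major case: for column c, rowIndices[c][k] is the row of the
     * k-th non-zero and colValues[c][k] its value. One accumulator per row
     * is kept while the columns are walked in storage order.
     */
    static double[] aggregateRowsColumnMajor(int numRows,
                                             int[][] rowIndices,
                                             double[][] colValues,
                                             DoubleBinaryOperator combiner,
                                             DoubleUnaryOperator mapper,
                                             double identity) {
        requireSkippableZeros(mapper, identity);
        double[] acc = new double[numRows];
        Arrays.fill(acc, identity);
        for (int c = 0; c < rowIndices.length; c++) {
            for (int k = 0; k < rowIndices[c].length; k++) {
                int r = rowIndices[c][k];
                acc[r] = combiner.applyAsDouble(acc[r], mapper.applyAsDouble(colValues[c][k]));
            }
        }
        return acc;
    }
}
```

Calling either method with `Double::sum` as the combiner, `Math::abs` as the mapper, and `0.0` as the identity gives the per-row sum of absolute values while touching only the stored non-zeros, regardless of storage order. A real implementation would presumably fall back to a dense pass when the zero-skipping precondition does not hold, rather than throwing as this sketch does.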