[ 
https://issues.apache.org/jira/browse/MAHOUT-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Lyubimov updated MAHOUT-1582:
-------------------------------------
    Assignee: Suneel Marthi

> Create simpler row and column aggregation API at local level
> ------------------------------------------------------------
>
>                 Key: MAHOUT-1582
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1582
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>            Assignee: Suneel Marthi
>              Labels: legacy, math, scala
>
> The current row and column aggregation API makes it difficult to do anything 
> but row-by-row aggregation using anonymous classes.  It leaves no room for 
> exploiting locality or for using the well-known function definitions in 
> Functions.  This rules out many optimizations that we want to have.  One 
> example is summing the absolute values of a matrix's elements.  With the 
> current API, this is very hard to optimize for sparse matrices or when 
> iterating against the storage order, but with a different API it should be 
> easy.
> What I suggest is an API of this form:
> {code}
>    Vector aggregateRows(DoubleDoubleFunction combiner, DoubleFunction mapper)
> {code}
> This will produce a vector with one element per row of the original matrix.  
> The nice thing here is that if the matrix is row major, we can iterate over 
> rows and accumulate a value for each row, exploiting sparsity where it is 
> available.  On the other hand, if the matrix is column major, we can keep a 
> vector of accumulators, one per row, and still exploit sparsity as 
> appropriate.  
> Sparsity can be exploited because the matrix code now controls both of the 
> loops involved and can also see properties of the map and combine functions.  
> For instance, ABS(0) == 0, so if we combine with PLUS, zero elements 
> contribute nothing to the result and we can use a sparse iterator.
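
The column-major case described above can be sketched in plain Java as
follows. This is only an illustration under stated assumptions, not Mahout
code: ColumnMajorSparse, PLUS and ABS are hypothetical stand-ins for a sparse
matrix class and for the function objects in Functions, and the JDK types
DoubleBinaryOperator and DoubleUnaryOperator stand in for DoubleDoubleFunction
and DoubleFunction. The sketch keeps one accumulator per row and skips zeros
only when the combiner is the known PLUS instance and the mapper maps 0 to 0.

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

final class AggregateRowsSketch {
    // Hypothetical stand-ins for well-known function objects such as those in Functions.
    static final DoubleBinaryOperator PLUS = (a, b) -> a + b;
    static final DoubleUnaryOperator ABS = Math::abs;

    /** Hypothetical column-major sparse matrix: each column stores only its non-zero entries. */
    static final class ColumnMajorSparse {
        final int numRows;
        final int[][] rowIndex; // rowIndex[c][k] is the row of the k-th stored entry of column c
        final double[][] value; // value[c][k] is the value of that entry

        ColumnMajorSparse(int numRows, int[][] rowIndex, double[][] value) {
            this.numRows = numRows;
            this.rowIndex = rowIndex;
            this.value = value;
        }

        int numCols() {
            return rowIndex.length;
        }

        double get(int row, int col) {
            for (int k = 0; k < rowIndex[col].length; k++) {
                if (rowIndex[col][k] == row) {
                    return value[col][k];
                }
            }
            return 0.0; // entries that are not stored are zero
        }

        /** Returns one aggregated value per row. */
        double[] aggregateRows(DoubleBinaryOperator combiner, DoubleUnaryOperator mapper) {
            double[] acc = new double[numRows]; // one accumulator per row, initially 0
            // Zeros may be skipped only if they cannot change the result:
            // the mapper sends 0 to 0 and the combiner is the known PLUS instance.
            boolean skipZeros = combiner == PLUS && mapper.applyAsDouble(0.0) == 0.0;
            if (skipZeros) {
                // Walk the stored entries in storage order (column by column).
                for (int c = 0; c < numCols(); c++) {
                    for (int k = 0; k < rowIndex[c].length; k++) {
                        int r = rowIndex[c][k];
                        acc[r] = combiner.applyAsDouble(acc[r], mapper.applyAsDouble(value[c][k]));
                    }
                }
                return acc;
            }
            // Dense fallback: visit every cell, seeding each accumulator with the first column.
            for (int r = 0; r < numRows; r++) {
                for (int c = 0; c < numCols(); c++) {
                    double mapped = mapper.applyAsDouble(get(r, c));
                    acc[r] = (c == 0) ? mapped : combiner.applyAsDouble(acc[r], mapped);
                }
            }
            return acc;
        }
    }

    public static void main(String[] args) {
        // The 3x3 matrix
        //   [  1  0 -2 ]
        //   [  0  0  0 ]
        //   [ -3  4  0 ]
        // stored column by column, non-zero entries only.
        ColumnMajorSparse m = new ColumnMajorSparse(
            3,
            new int[][] {{0, 2}, {2}, {0}},
            new double[][] {{1, -3}, {4}, {-2}});
        System.out.println(Arrays.toString(m.aggregateRows(PLUS, ABS)));
    }
}
```

Running main prints [3.0, 0.0, 7.0], the per-row sums of absolute values,
without ever visiting a zero cell. Testing the combiner by identity is a
deliberately conservative choice in this sketch: a function the matrix code
does not recognize simply takes the dense path.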



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
