[ https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024737#comment-13024737 ]

Jonathan Traupman commented on MAHOUT-672:
------------------------------------------

Got a bit of this done this evening, but ran into a roadblock: I had to 
separate out the new VirtualMatrix interface from VectorIterable. VirtualMatrix 
contains the times() method while VectorIterable keeps the iterator() and 
iterateAll() methods. I had to do this because there's no efficient way to 
iterate the rows of a squared A'A matrix without actually constructing the full 
product matrix.
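A rough sketch of the split I have in mind, with plain arrays standing in for Mahout's real Vector/Matrix types (the interface names come from this comment; the SquaredMatrix class and the exact signatures are hypothetical simplifications):

```java
import java.util.Iterator;

// Keeps only the iteration contract over the rows of the data.
interface VectorIterable {
    Iterator<double[]> iterateAll();
}

// Keeps only the implicit matrix-vector product.
interface VirtualMatrix {
    double[] times(double[] v);
}

// A'A as a virtual matrix: times() is computed as A'(Av) without ever
// materializing the product, which is exactly why efficient row iteration
// over A'A is not available.
class SquaredMatrix implements VirtualMatrix {
    private final double[][] a;
    SquaredMatrix(double[][] a) { this.a = a; }
    public double[] times(double[] v) {
        int m = a.length, n = a[0].length;
        double[] av = new double[m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) av[i] += a[i][j] * v[j];
        double[] atav = new double[n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) atav[j] += a[i][j] * av[i];
        return atav;
    }
}
```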

However, when I started looking at porting the Lanczos solver to use the new 
VirtualMatrix type, the core algorithm translates fine but the 
getInitialVector() routine, which relies on an iteration through the data 
matrix, presents difficulties.

The easiest path through this impasse that I can see would be to define the new 
DistributedSquaredMatrix (which implements A'A) to also implement 
VectorIterable, but have it iterate over its underlying source matrix A rather 
than the product matrix. This would preserve the current behavior of the 
Lanczos solver, albeit at the expense of an iterator on DistributedSquaredMatrix 
that doesn't make a great deal of sense in a more general context. This 
solution would probably not work for the diagonal offset case, because it's 
unclear how to transform the iterated rows using the offset. We could always 
define an "IterableVirtualMatrix" interface that extends both VectorIterable 
and VirtualMatrix for algorithms, like Lanczos, that require it, though I'm 
still bothered by the weird iterator semantics.
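The workaround, in sketch form (DelegatingSquaredMatrix is a hypothetical stand-in for the proposed DistributedSquaredMatrix, again with plain arrays in place of Mahout types). The oddity is visible right in the code: iterateAll() yields rows of A even though the matrix logically represents A'A:

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: the squared matrix satisfies the VectorIterable
// contract by delegating iteration to the rows of its source matrix A,
// not the rows of the A'A product it logically represents.
class DelegatingSquaredMatrix {
    private final double[][] source;   // A, not A'A
    DelegatingSquaredMatrix(double[][] source) { this.source = source; }

    // "VectorIterable"-style iteration over the source rows.
    Iterator<double[]> iterateAll() {
        return Arrays.asList(source).iterator();
    }
}
```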

The other possible solution I considered would be to add a 
times(VirtualMatrix m) method to the VirtualMatrix interface, then rewrite 
the getInitialVector() routine in terms of fundamental matrix operations: for 
the symmetric case, scaleFactor looks to be trace(A*A) and the initial vector 
looks to be A times a vector of all ones. The asymmetric case is mathematically 
different, but I don't know enough about Lanczos to fully understand why. 
Unfortunately, implementing matrix multiplication with virtual matrices may be 
hard, or at the very least computationally expensive.
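As a quick numeric sanity check of the identities above (my reading of them, not the actual getInitialVector() code): for symmetric A, trace(A*A) equals the sum of squared entries, and A times the all-ones vector is just the vector of row sums, both of which are computable from a row iteration:

```java
public class InitialVectorCheck {
    // trace(A*A) = sum_{i,j} a[i][j] * a[j][i]; when A is symmetric this
    // is the sum of squared entries of A.
    static double traceOfSquare(double[][] a) {
        double t = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a.length; j++) t += a[i][j] * a[j][i];
        return t;
    }

    // A times the all-ones vector, i.e. the vector of row sums.
    static double[] timesOnes(double[][] a) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++) r[i] += a[i][j];
        return r;
    }
}
```

For A = {{2, 1}, {1, 3}}, traceOfSquare gives 15 (= 4 + 1 + 1 + 9) and timesOnes gives the row sums {3, 4}.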

Thoughts?

> Implementation of Conjugate Gradient for solving large linear systems
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-672
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-672
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Jonathan Traupman
>            Priority: Minor
>         Attachments: 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, 
> 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, MAHOUT-672.patch, 
> MAHOUT-672.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This patch contains an implementation of conjugate gradient, an iterative 
> algorithm for solving large linear systems. In particular, it is well suited 
> for large sparse systems where a traditional QR or Cholesky decomposition is 
> infeasible. Conjugate gradient only works for matrices that are square, 
> symmetric, and positive definite (basically the same types where Cholesky 
> decomposition is applicable). Systems like these commonly occur in statistics 
> and machine learning problems (e.g. regression).
>
> Both a standard (in-memory) solver and a distributed Hadoop-based solver 
> (basically the standard solver run using a DistributedRowMatrix a la 
> DistributedLanczosSolver) are included.
>
> There is already a version of this algorithm in the taste package, but it 
> doesn't operate on standard Mahout matrix/vector objects, nor does it 
> implement a distributed version. I believe this implementation will be more 
> generically useful to the community than the specialized one in taste.
>
> This implementation solves the following types of systems:
>
> Ax = b, where A is square, symmetric, and positive definite.
>
> A'Ax = b, where A is arbitrary but A'A is positive definite. Directly solving 
> this system is more efficient than computing A'A explicitly and then solving.
>
> (A + lambda * I)x = b and (A'A + lambda * I)x = b, for systems where A or A'A 
> is singular and/or not full rank. This occurs commonly if A is large and 
> sparse. Solving a system of this form is used, for example, in ridge 
> regression.
>
> In addition to the normal conjugate gradient solver, this implementation also 
> handles preconditioning, and a sample Jacobi preconditioner is included as an 
> example. More work will be needed to build more advanced preconditioners if 
> desired.
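For reference, the core algorithm the quoted description is about can be sketched as a minimal unpreconditioned solver (plain arrays for illustration only; the patch itself works against Mahout's Vector/Matrix types and also covers the A'A and diagonal-offset variants):

```java
// Minimal unpreconditioned conjugate gradient for Ax = b, where A is
// symmetric positive definite. Starts from x = 0 and iterates until the
// residual norm falls below tol or maxIters is reached.
public class ConjugateGradient {
    static double dot(double[] x, double[] y) {
        double s = 0;
        for (int i = 0; i < x.length; i++) s += x[i] * y[i];
        return s;
    }

    static double[] mult(double[][] a, double[] v) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < v.length; j++) r[i] += a[i][j] * v[j];
        return r;
    }

    static double[] solve(double[][] a, double[] b, int maxIters, double tol) {
        int n = b.length;
        double[] x = new double[n];
        double[] r = b.clone();          // residual b - Ax (x = 0 initially)
        double[] p = r.clone();          // search direction
        double rsOld = dot(r, r);
        for (int k = 0; k < maxIters && Math.sqrt(rsOld) > tol; k++) {
            double[] ap = mult(a, p);
            double alpha = rsOld / dot(p, ap);   // step length along p
            for (int i = 0; i < n; i++) {
                x[i] += alpha * p[i];
                r[i] -= alpha * ap[i];
            }
            double rsNew = dot(r, r);
            double beta = rsNew / rsOld;         // conjugation coefficient
            for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
            rsOld = rsNew;
        }
        return x;
    }
}
```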

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
