[
https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134267#comment-13134267
]
Jonathan Traupman commented on MAHOUT-672:
------------------------------------------
Reply to Ted's and Jake's comments:
> Is it possible to multiply by many vectors at once to accelerate the
> convergence here by essentially exploring in multiple directions at once?
This might be possible, but I don't think it's just an easy tweak. At iteration
i, we compute the conjugate search direction, move along it to the minimum in
that direction, and repeat. We don't know which direction we're going to take
at iteration i+1 until after we've finished iteration i. To do what you suggest
would mean moving along a vector different from the conjugate direction, trying
to collapse multiple CG steps into one. I'd imagine there's some literature on
this (if not, it might be a fruitful avenue of research).
However, my gut reaction is that 1) this won't be a simple modification to the
algorithm, and 2) whatever technique is used to approximate the direction for
iteration i+1 while still on iteration i may involve operations that make a
distributed version much harder, if not impossible. For example, I've seen that
the LSMR solver often converges in fewer steps than the CG one, but it does
enough additional work with the data matrix that creating a map/reduce version
would be a lot more effort than just plugging in a DistributedRowMatrix.
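To make the dependency concrete, here is a bare-bones sketch of the standard CG
recurrence using Mahout's in-memory Vector/Matrix types. This is purely
illustrative and not code from the patch; the class and method names below are
made up for this comment:

{code:java}
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.Vector;

public final class CgSketch {
  // Plain (unpreconditioned) CG for Ax = b, starting from x = 0. The point is the
  // recurrence at the bottom of the loop: the search direction for iteration i+1
  // is built from the residual computed *after* the step at iteration i, so it
  // cannot be known before that iteration finishes.
  public static Vector solve(Matrix a, Vector b, int maxIterations, double tolerance) {
    Vector x = new DenseVector(b.size());
    Vector residual = b.minus(a.times(x));                        // r_0 = b - A x_0
    Vector direction = residual.clone();                          // d_0 = r_0
    double rsOld = residual.dot(residual);

    for (int i = 0; i < maxIterations && Math.sqrt(rsOld) > tolerance; i++) {
      Vector ad = a.times(direction);
      double alpha = rsOld / direction.dot(ad);                   // exact line search along d_i
      x = x.plus(direction.times(alpha));                         // x_{i+1} = x_i + alpha d_i
      residual = residual.minus(ad.times(alpha));                 // r_{i+1} = r_i - alpha A d_i
      double rsNew = residual.dot(residual);
      direction = residual.plus(direction.times(rsNew / rsOld));  // d_{i+1} depends on r_{i+1}
      rsOld = rsNew;
    }
    return x;
  }
}
{code}

Collapsing several of those steps into a single "multi-direction" update would
mean predicting the later residuals up front, which is exactly the hard part.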
> I gather you replaced a.addTo(b) with b.assign(a, Functions.PLUS)? If so,
> then all will be well.
Yes, that's all I did.
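For the record, the change really was a one-for-one swap. Here's the shape of
it, with a and b standing in for whatever Vectors the surrounding code uses:

{code:java}
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.function.Functions;

public final class AddToSwap {
  // Before (old API): a.addTo(b);
  // After: accumulate a into b element-wise; b ends up as b + a, exactly as before.
  public static void accumulate(Vector a, Vector b) {
    b.assign(a, Functions.PLUS);
  }
}
{code}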
Anyway, I'd really like to reach some closure on this issue. These two
algorithms aren't the be-all and end-all of linear system solvers, but I think
they're useful in their current form and can serve as a foundation for further
work.
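In case it helps reviewers, here's a rough usage sketch for the kinds of
systems the patch targets, reusing the stand-alone CgSketch above rather than
the patch's own classes (so none of the names below should be read as the
patch's actual API). The ridge-style system is written out explicitly here just
to show its form; the patch solves it without materializing A + lambda * I:

{code:java}
import org.apache.mahout.math.DenseMatrix;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.Vector;

public final class CgUsageSketch {
  public static void main(String[] args) {
    // A small symmetric positive definite system: A = [[4, 1], [1, 3]], b = [1, 2].
    Matrix a = new DenseMatrix(new double[][] {{4, 1}, {1, 3}});
    Vector b = new DenseVector(new double[] {1, 2});

    // Plain system Ax = b.
    Vector x = CgSketch.solve(a, b, 100, 1e-9);

    // Ridge-style system (A + lambda * I)x = b, built explicitly for illustration.
    double lambda = 0.1;
    Matrix ridge = new DenseMatrix(new double[][] {{4 + lambda, 1}, {1, 3 + lambda}});
    Vector xRidge = CgSketch.solve(ridge, b, 100, 1e-9);

    System.out.println("x = " + x);
    System.out.println("x (ridge) = " + xRidge);
  }
}
{code}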
> Implementation of Conjugate Gradient for solving large linear systems
> ---------------------------------------------------------------------
>
> Key: MAHOUT-672
> URL: https://issues.apache.org/jira/browse/MAHOUT-672
> Project: Mahout
> Issue Type: New Feature
> Components: Math
> Affects Versions: 0.5
> Reporter: Jonathan Traupman
> Priority: Minor
> Fix For: 0.6
>
> Attachments: 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch,
> 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, MAHOUT-672.patch,
> mahout-672-111023.patch, mahout-672.patch
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> This patch contains an implementation of conjugate gradient, an iterative
> algorithm for solving large linear systems. In particular, it is well suited
> for large sparse systems where a traditional QR or Cholesky decomposition is
> infeasible. Conjugate gradient only works for matrices that are square,
> symmetric, and positive definite (basically the same types where Cholesky
> decomposition is applicable). Systems like these commonly occur in statistics
> and machine learning problems (e.g. regression).
> Both a standard (in-memory) solver and a distributed Hadoop-based solver
> (essentially the standard solver run on a DistributedRowMatrix, à la
> DistributedLanczosSolver) are included.
> There is already a version of this algorithm in the taste package, but it
> doesn't operate on standard Mahout matrix/vector objects, nor does it offer a
> distributed version. I believe this implementation will be more generally
> useful to the community than the specialized one in taste.
> This implementation solves the following types of systems:
> * Ax = b, where A is square, symmetric, and positive definite
> * A'Ax = b, where A is arbitrary but A'A is positive definite. Directly
> solving this system is more efficient than computing A'A explicitly and then
> solving.
> * (A + lambda * I)x = b and (A'A + lambda * I)x = b, for systems where A or
> A'A is singular and/or not of full rank. This occurs commonly if A is large
> and sparse. Solving a system of this form is used, for example, in ridge
> regression.
> In addition to the basic conjugate gradient solver, this implementation also
> supports preconditioning and includes a sample Jacobi preconditioner as an
> example. Building more advanced preconditioners, if desired, will require
> additional work.
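A side note on the Jacobi preconditioner mentioned in the description above: it
is about as simple as preconditioners get, just a scaling by the inverse of A's
diagonal. A stand-alone illustration of the idea (again, not the patch's
class):

{code:java}
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.Vector;

public final class JacobiSketch {
  // Jacobi (diagonal) preconditioning: approximate A^-1 by diag(A)^-1, so applying
  // the preconditioner to a vector is an element-wise divide by A's diagonal.
  public static Vector precondition(Matrix a, Vector v) {
    Vector result = new DenseVector(v.size());
    for (int i = 0; i < v.size(); i++) {
      result.set(i, v.get(i) / a.get(i, i));  // diagonal is non-zero when A is positive definite
    }
    return result;
  }
}
{code}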
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira