Hi all,

Sorry for the delay. I'll try switching all of the least squares package
to the new/separable API and post the results to GitHub. From there we
can review, discuss, and then decide if it makes sense to switch the
rest of the optimizers.
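For concreteness, here is a minimal sketch of the "separable" shape I
have in mind; all of these names are placeholders, not a final API:

    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.linear.RealVector;

    // Placeholder sketch: the problem owns its data (weights already
    // folded in), and the optimizer only consumes the problem.
    interface LeastSquaresProblem {
        RealVector getStart();                  // initial guess
        RealVector residuals(RealVector point); // weighted residuals at 'point'
        RealMatrix jacobian(RealVector point);  // weighted Jacobian at 'point'
    }

    interface LeastSquaresOptimizer {
        // Convergence checkers, evaluation counts, etc. omitted here.
        RealVector optimize(LeastSquaresProblem problem);
    }

The "separable" part is that the problem (the data and its evaluation)
is kept apart from the optimizer (the algorithm), so each can change
independently.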
On 08/15/2013 09:12 AM, Konstantin Berlin wrote:
> Hi,
>
> As a quick first pass, I will comment on Evan's code only for now. Most
> of my comments have to do with numerical aspects and associated design,
> rather than threading.
>
> I like the idea of the optimize function taking in a Problem class. I
> think it can add a lot of flexibility going forward.
>
> First, to make things clear, I think LeastSquares should be renamed to
> IterativeLeastSquares, since for unconstrained linear least squares
> this whole hierarchy is not correct. For one thing, they don't need an
> initial guess; for another, linear least squares can factor the matrix
> before experimental data is supplied, so it can support rapid
> recomputation.

From my background the problem is more commonly called General Least
Squares, or Non-Linear Least Squares. I shortened it to avoid (yet
another) long name. :)

I don't plan to include the linear least squares problem in the new API.
It seems hard to be simpler than
new QRDecomposition(A).getSolver().solve(b). :)

> There are several issues that I have with the current example.
>
> 1) While the least-squares problem now encapsulates the weights, which
> I like, for some reason the weights are explicitly used in the actual
> GaussNewtonOptimizer and probably all other least-squares optimizers.
> I think that is bad for several reasons:
> i) The actual optimizer algorithm has nothing to do with weights, so
> it is logically inconsistent. It forces people who have no weights to
> still go through the actual numerical computation.
> ii) It makes it harder to maintain and update the optimizer code. Just
> think about how much cleaner it would be if the weights were actually
> included in the Jacobian.
> iii) The inclusion of weights can be really optimized inside the
> LeastSquaresProblem, which depending on the problem might be able to
> handle them more efficiently than the general approach.
>
> 2) All computations inside the optimizers should be done using the
> linear algebra library (or is the library so bad?). Not only does it
> make the code easier to follow, it can make it a lot faster. Imagine a
> not-so-hypothetical example where the Jacobian is actually sparse
> (I know you guys are removing sparse linear algebra, but let's say it
> gets put back at some point later). Forming the Hessian from the
> Jacobian is a very expensive operation (J^T*J), but if J is sparse the
> speedup could be automatically propagated through the algorithm.

I agree with 1 and 2. You seem to have more experience with optimizers
than me; would you like to update GaussNewton.doOptimize() to assume
the weights are already included?

> 3) Speaking of forming the Hessian, the actual computation of the
> descent step should be logically and explicitly separated into two
> separate functions. I think the first function could actually be moved
> to the LeastSquaresProblem (this needs more thought):
> i) The first function will compute the step p, where H*p = -g, H is
> the Hessian and g is the gradient. The reason for this is that there
> are several ways this problem can be solved, and the choice should be
> deferred to a specific implementation. For example, in a very large
> least-squares problem (or any large problem) you cannot actually form
> J^T*J (or H), and instead you solve the problem using the conjugate
> gradient iterative method (which you already have in the library),
> where you compute J^T*J*x as J^T*(J*x). Again, here having the
> Jacobian be a matrix would really help future-proof things. In the
> case of general non-linear optimizers you might want to support in the
> future the very important quasi-Newton algorithms BFGS or L-BFGS,
> where this step is done very differently.
>
> ii) Once the step is computed, a line search or trust region method is
> used to potentially modify it. Here bound constraints can also be
> enforced, so an optimizer that supports them can override this
> function. This part should also be exposed, at least in the abstract
> classes, from which the final least-squares optimizer should be built.

I'm not sure I follow. Are you suggesting that the LeastSquaresProblem
class solve the normal equations? Solving the normal equations and
computing the next step seems like the job of the "optimizer".

> Thanks,
> Konstantin
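To make 1) and the first half of 3) concrete, here is a rough sketch of
the step computation as I understand your proposal. foldWeights and
gaussNewtonStep are hypothetical names, and this is only the dense/QR
variant, not a final implementation:

    import org.apache.commons.math3.linear.*;

    final class GaussNewtonSketch {

        // Hypothetical helper: fold diagonal weights into the Jacobian
        // by scaling row i with sqrt(w_i), so the optimizer itself
        // never sees weights.
        static RealMatrix foldWeights(RealMatrix jacobian, double[] weights) {
            RealMatrix j = jacobian.copy();
            for (int i = 0; i < weights.length; i++) {
                j.setRowVector(i, j.getRowVector(i)
                                   .mapMultiply(Math.sqrt(weights[i])));
            }
            return j;
        }

        // The residuals get the same scaling: r_i <- sqrt(w_i) * r_i.
        static RealVector foldWeights(RealVector residuals, double[] weights) {
            RealVector r = residuals.copy();
            for (int i = 0; i < weights.length; i++) {
                r.setEntry(i, Math.sqrt(weights[i]) * r.getEntry(i));
            }
            return r;
        }

        // Compute the Gauss-Newton step p with (J^T J) p = -g, g = J^T r.
        // Solving J p = -r in the least-squares sense via QR yields the
        // same p without explicitly forming J^T J, and is better
        // conditioned than solving the normal equations directly.
        static RealVector gaussNewtonStep(RealMatrix weightedJacobian,
                                          RealVector weightedResiduals) {
            DecompositionSolver solver =
                new QRDecomposition(weightedJacobian).getSolver();
            return solver.solve(weightedResiduals.mapMultiply(-1.0));
        }
    }

For the very large problems you mention, the same gaussNewtonStep
signature could instead wrap J in a RealLinearOperator and use the
existing ConjugateGradient solver, computing J^T*(J*x) without ever
forming J^T*J.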
> Ajo Fod wrote:
> > I like the immutability idea. I wonder if there are any guidelines
> > on when the performance hit because of immutability is high enough
> > to want mutable objects and synchronization?
> >
> > -Ajo

For me the dividing line is copying large matrices and arrays. I try to
avoid copying the residuals and the Jacobian because they can be
megabytes in size.

Regards,
Evan
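P.S. Ajo, to make the copying point concrete, here is a rough sketch
(the class name is hypothetical) of the immutable-by-convention holder
I have in mind. No synchronization is needed because nothing mutates
after construction, but it does rely on callers treating the fields as
read-only, since RealVector and RealMatrix are themselves mutable types:

    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.linear.RealVector;

    // Hypothetical sketch: keep references to the already computed
    // residuals and Jacobian instead of defensively copying them;
    // they can be megabytes in size.
    final class Evaluation {
        private final RealVector residuals;
        private final RealMatrix jacobian;

        Evaluation(RealVector residuals, RealMatrix jacobian) {
            this.residuals = residuals; // no copy
            this.jacobian = jacobian;   // no copy
        }

        RealVector getResiduals() { return residuals; }
        RealMatrix getJacobian()  { return jacobian; }
    }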