Sorry about the late reply; I was on holiday for practically all of
April and am now occupied with final exams.

+1 - see below.  The only real question here is whether we need a
subpackage for matrix decompositions.  Since I think it is unlikely
that we will have more than a handful of these, I am OK with putting
them at the top level, i.e. in .linear.

That seems fine to me.

> I also had a look at Jama yesterday. There they defer the explicit
> generation of the Q part of the decomposition until the user calls
> getQ(), which I guess has a computational advantage over calculating
> the whole decomp if the user of the API only needs R. This of course
> implies that the algorithm has a state and it's most natural to
> implement it as a class of its own.

Again, I think this should be a separate (immutable) class with state,
with the decomp done in the constructor, which should take a
RealMatrix (not impl) as argument (using getData to copy if argument
is not a RealMatrixImpl).  I am not sure I understand what you mean
about the Q and R accessors in Jama.  It looks to me like they are
just doing transformations to provide Q and R separately.  I think it
makes sense to provide those accessors (as we should in the LU case
when we externalize that).

The algorithm used there produces the matrix R and an array of
Householder vectors. When the getQ() is called, the Householder
vectors are made into matrices that are multiplied together to yield
the Q matrix. This seems to be the best way to go about things.
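For concreteness, here is a minimal, self-contained sketch of that deferred getQ() idea: the constructor triangularizes the matrix with Householder reflections and keeps the reflection vectors; Q is only assembled (and then cached) on first access. Class and method names here are mine for illustration, not a proposal for the actual API:

```java
/**
 * Sketch of a QR decomposition that stores R and the Householder
 * vectors, deferring the more expensive construction of Q until
 * getQ() is first called.  Illustrative names, not the [math] API.
 */
class HouseholderQR {
    private final double[][] r;   // becomes the upper-triangular factor
    private final double[][] vs;  // one Householder vector per column
    private final int m, n;
    private double[][] q;         // cached Q, built lazily

    HouseholderQR(double[][] a) {
        m = a.length;
        n = a[0].length;
        r = new double[m][];
        for (int i = 0; i < m; i++) r[i] = a[i].clone();
        vs = new double[n][];
        for (int k = 0; k < n; k++) {
            // Build the Householder vector v that zeroes column k below
            // the diagonal.
            double norm = 0;
            for (int i = k; i < m; i++) norm += r[i][k] * r[i][k];
            norm = Math.sqrt(norm);
            double alpha = r[k][k] > 0 ? -norm : norm;
            double[] v = new double[m];
            for (int i = k; i < m; i++) v[i] = r[i][k];
            v[k] -= alpha;
            vs[k] = v;
            double vtv = 0;
            for (int i = k; i < m; i++) vtv += v[i] * v[i];
            if (vtv == 0) continue;  // column is already triangular
            // Apply H = I - 2 v v^T / (v^T v) to the trailing columns.
            for (int j = k; j < n; j++) {
                double dot = 0;
                for (int i = k; i < m; i++) dot += v[i] * r[i][j];
                double s = 2 * dot / vtv;
                for (int i = k; i < m; i++) r[i][j] -= s * v[i];
            }
        }
    }

    double[][] getR() { return r; }

    /** Assembles Q = H_0 H_1 ... H_{n-1} on first use, then caches it. */
    double[][] getQ() {
        if (q != null) return q;
        q = new double[m][m];
        for (int i = 0; i < m; i++) q[i][i] = 1;
        for (int k = n - 1; k >= 0; k--) {
            double[] v = vs[k];
            double vtv = 0;
            for (int i = k; i < m; i++) vtv += v[i] * v[i];
            if (vtv == 0) continue;
            for (int j = 0; j < m; j++) {
                double dot = 0;
                for (int i = k; i < m; i++) dot += v[i] * q[i][j];
                double s = 2 * dot / vtv;
                for (int i = k; i < m; i++) q[i][j] -= s * v[i];
            }
        }
        return q;
    }
}
```

A caller that only needs R never pays for assembling Q, which is the computational advantage mentioned above.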

>
> From the release plan I read that the QR-decomposition will be needed
> for linear regression. Does that mean that it will be used mainly for
> least-squares fitting? In that case both Q and R are needed most of
> the time, so having the algorithm in a separate class is not strictly
> necessary.

The immediate motivation is for solving the normal equations.  I don't
think we should include the solve() method that Jama has in this
class, though.  I think it is more natural to have that in the OLS
implementation.
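Just to illustrate that split: the decomposition class would expose Q and R, and the OLS code would do the solve itself, e.g. back-substitution on R x = Q^T b. A minimal sketch (LeastSquares and backSubstitute are hypothetical names, not proposed API):

```java
/**
 * Hypothetical sketch of the solve step living in the OLS code rather
 * than in the decomposition class.  Given the upper-triangular R and
 * the vector Q^T b, solves R x = Q^T b by back-substitution.
 */
class LeastSquares {
    static double[] backSubstitute(double[][] r, double[] qtb) {
        int n = qtb.length;
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = qtb[i];
            // Subtract the contributions of the already-solved unknowns.
            for (int j = i + 1; j < n; j++) s -= r[i][j] * x[j];
            x[i] = s / r[i][i];
        }
        return x;
    }
}
```

Keeping this in the OLS implementation keeps the decomposition class focused on producing the factors.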

Tests are a good start.

Returning to the overall API design, I think it makes sense to follow
the abstract factory pattern used elsewhere in [math] (e.g. the
distributions package) to provide for pluggable decomp implementations
with defaults provided.  So what we would end up with would be an
abstract DecompositionFactory class with a concrete
DecompositionFactoryImpl subclass providing default implementations.
Interfaces for decompositions would be abstracted.  User code would
look like this:

QRDecomposition qr =
    DecompositionFactory.newInstance().createQRDecomposition(matrix);

where QRDecomposition is the interface and
DecompositionFactory.newInstance() returns a DecompositionFactoryImpl
and createQRDecomposition(matrix) invokes the constructor for
QRDecompositionImpl, which is the default implementation.  This setup
is used in the distributions and analysis packages to provide
pluggable implementations.
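As a sketch of that wiring, using the names above (the matrix argument is simplified to double[][] here rather than RealMatrix, and the decomposition internals are elided):

```java
// Sketch of the proposed abstract-factory layout.  Only the plumbing
// is shown; QRDecompositionImpl's actual decomposition is omitted.
interface QRDecomposition {
    // getQ(), getR() accessors would be declared here.
}

class QRDecompositionImpl implements QRDecomposition {
    QRDecompositionImpl(double[][] matrix) {
        // Decomposition would be performed here, in the constructor.
    }
}

abstract class DecompositionFactory {
    /** Returns the default concrete factory. */
    public static DecompositionFactory newInstance() {
        return new DecompositionFactoryImpl();
    }

    public abstract QRDecomposition createQRDecomposition(double[][] matrix);
}

class DecompositionFactoryImpl extends DecompositionFactory {
    @Override
    public QRDecomposition createQRDecomposition(double[][] matrix) {
        return new QRDecompositionImpl(matrix);  // default implementation
    }
}
```

Users who want a different implementation would supply their own factory subclass; everyone else gets the default via newInstance().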

Seems good to me. What do you think is better as a method name,
"createQRDecomposition" or "newQRDecomposition"? Both styles seem to
be in use.

I suppose we won't have a base interface for matrix decompositions?

To get started, we can just define QRDecomposition and
QRDecompositionImpl.  If there are no objections / better ideas, we
can then add the factory impls and do the same for LU decomp (and
Cholesky, which I think we may also have lying around somewhere).

Alright, I'm on it.

Joni
