I just read all of these references, but they all seem confined to relatively naive implementations of multiply. Notably, they don't do the standard two levels of block decomposition that are typically necessary to get good register utilization and cache hit rates.
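To be concrete about what I mean, here is a rough sketch of a single level of cache blocking for C = A*B over a plain double[][] layout. The class name and block size are made up for illustration, and a serious kernel would add a second, register-level tiling inside the innermost loops:

// Rough sketch of one level of cache blocking for C = A * B on double[][]
// storage.  BLOCK = 48 is an arbitrary illustration; a real implementation
// would tune it and add a second, register-level tiling in the inner loops.
public final class BlockedMultiply {

    private static final int BLOCK = 48;

    public static double[][] multiply(double[][] a, double[][] b) {
        final int n = a.length;      // rows of A
        final int m = b[0].length;   // cols of B
        final int k = b.length;      // cols of A == rows of B
        final double[][] c = new double[n][m];
        for (int i0 = 0; i0 < n; i0 += BLOCK) {
            for (int p0 = 0; p0 < k; p0 += BLOCK) {
                for (int j0 = 0; j0 < m; j0 += BLOCK) {
                    final int iMax = Math.min(i0 + BLOCK, n);
                    final int pMax = Math.min(p0 + BLOCK, k);
                    final int jMax = Math.min(j0 + BLOCK, m);
                    // block kernel: the three sub-blocks fit in cache together
                    for (int i = i0; i < iMax; i++) {
                        final double[] ai = a[i];
                        final double[] ci = c[i];
                        for (int p = p0; p < pMax; p++) {
                            final double aip = ai[p];
                            final double[] bp = b[p];
                            for (int j = j0; j < jMax; j++) {
                                ci[j] += aip * bp[j];
                            }
                        }
                    }
                }
            }
        }
        return c;
    }
}

Until something like this is in place, comparing storage layouts at the larger dimensions mostly measures cache misses rather than the layouts themselves.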
That makes these demonstrations pretty unconvincing to me ... I wouldn't be surprised if the advantage evaporating at larger matrix sizes is just a reflection of the need for better blocking at those sizes. This is supported (slightly) by the fact that 10-20 is pretty close to the blocking factor used by many implementations.

On Fri, Nov 28, 2008 at 1:15 AM, <[EMAIL PROTECTED]> wrote:
>
> ----- "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
>
> > Luc,
> >
> > Last I looked, I think I saw that commons math used a double-indirect
> > storage format very similar to Jama.
> >
> > Is there any thought of going to a higher-performance layout such as
> > the one used by Colt?
>
> Yes.
> The linear algebra package is defined using the interfaces RealMatrix and
> RealVector and separate implementations RealMatrixImpl, RealVectorImpl ...
> The purpose is to abstract the storage method. For now, only one
> representation has been used: a simple double[][]. It is possible to provide
> other implementations too, for example a single array, with the index
> computation handled by some intermediate layer. I recently saw a discussion
> on the web where it was argued that a single array is more efficient only
> for low dimensions (see
> http://cio.nist.gov/esd/emaildir/lists/jama/msg01425.html). Also, the
> optimization of array access in the JVM has improved vastly over the last
> few years, so I think it would be worth checking what can be observed now.
> Providing several different implementations using different strategies
> would be a very interesting feature, allowing users to choose the
> implementation that best suits their needs.
>
> Another improvement I am thinking about is to provide specialized
> implementations for some matrix shapes. The first one would simply be band
> matrices. A single class could handle diagonal matrices (when configured
> with 0 subdiagonals and 0 superdiagonals), bidiagonal matrices (lower and
> upper), tridiagonal matrices, and even triangular matrices with the rather
> extreme configuration of n sub- or superdiagonals. I think this would best
> be implemented with a single double array stored row by row, for data
> locality and hence cache efficiency. Sparse matrices would be the next step.
>
> We need time to do this.
>
> Luc
>
> > On Thu, Nov 27, 2008 at 9:10 AM, <[EMAIL PROTECTED]> wrote:
> >
> > > Hello,
> > >
> > > This commit is the result of weeks of work. I hope it completes an
> > > important feature for [math]: computation of eigenvalues and
> > > eigenvectors for symmetric real matrices.
> > >
> > > The implementation is based on algorithms developed in the last 10
> > > years or so. It is based partly on two reference papers and partly on
> > > LAPACK. LAPACK is distributed under a modified BSD license, so this is
> > > acceptable for [math]. I have updated the NOTICE file and taken care
> > > of the proper attributions in the Javadoc.
> > >
> > > The current status is that we can solve eigenproblems much faster than
> > > Jama (see the speed gains in the commit message below). Furthermore,
> > > the eigenvectors are not always computed; they are computed only if
> > > needed. So applications that only need eigenvalues will see an even
> > > larger speed gain. This could be improved further by allowing
> > > computation of only some of the eigenvalues rather than all of them.
> > > This feature is available in the higher-level LAPACK routine, but I
> > > didn't include it yet. I'll do it only when required, as this has
> > > already been a very large amount of work.
> > >
> > > If someone could test this new decomposition algorithm further, I
> > > would be more than happy.
> > >
> > > My next goal is to implement Singular Value Decomposition. I will most
> > > probably use a method based on eigen decomposition, as this now seems
> > > to be the preferred way since the dqd/dqds and MRRR algorithms became
> > > available.
> > >
> > > Luc
> > >
> > > ----- [EMAIL PROTECTED] a écrit :
> > >
> > > > Author: luc
> > > > Date: Thu Nov 27 07:50:42 2008
> > > > New Revision: 721203
> > > >
> > > > URL: http://svn.apache.org/viewvc?rev=721203&view=rev
> > > > Log:
> > > > completed implementation of EigenDecompositionImpl.
> > > > The implementation is now based on the very fast and accurate
> > > > dqd/dqds algorithm.
> > > > It is faster than Jama for all dimensions, and the speed gain
> > > > increases with dimension.
> > > > The gain is about 30% below dimension 100, about 50% around
> > > > dimension 250, and about 65% for dimensions around 700.
> > > > It is also possible to compute only eigenvalues (and hence save the
> > > > computation of eigenvectors, increasing the speed gain even more).
> > > > JIRA: MATH-220

--
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)
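P.S. As a very rough illustration of the row-by-row, single-array band storage Luc describes above: the class and method names below are invented for the example (they are not the actual [math] API), but they show how one class can cover diagonal, bidiagonal, tridiagonal, and triangular matrices just by varying the band widths.

// Illustrative sketch only: a band matrix with kl subdiagonals and ku
// superdiagonals packed row by row in a single double[].  Row i stores its
// (at most) kl + ku + 1 band entries contiguously, so a row sweep touches
// consecutive memory.  kl = ku = 0 gives a diagonal matrix, kl = 0 and
// ku = 1 an upper bidiagonal one, kl = ku = 1 a tridiagonal one, and
// kl = n - 1, ku = 0 a lower triangular one.
public final class BandMatrixSketch {

    private final int kl;          // number of subdiagonals
    private final int ku;          // number of superdiagonals
    private final double[] data;   // row-by-row packed band entries

    public BandMatrixSketch(int n, int kl, int ku) {
        this.kl = kl;
        this.ku = ku;
        this.data = new double[n * (kl + ku + 1)];
    }

    /** Returns A(i, j), or 0 outside the band. */
    public double getEntry(int i, int j) {
        if (j < i - kl || j > i + ku) {
            return 0.0;  // structurally zero entry
        }
        return data[i * (kl + ku + 1) + (j - i + kl)];
    }

    /** Sets A(i, j); rejects entries outside the band. */
    public void setEntry(int i, int j, double value) {
        if (j < i - kl || j > i + ku) {
            throw new IllegalArgumentException("entry outside band");
        }
        data[i * (kl + ku + 1) + (j - i + kl)] = value;
    }
}

Whether a packed layout like this actually beats double[][] in practice is exactly the kind of question the blocking caveat above applies to.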