There are 3 copies involved. The JNI one is unavoidable, and I got around one of the other two. That sped things up by about 8%, so maybe not as game-changing as I thought. I think removing the final copy would mean rewriting a lot of code in terms of the JBLAS/BLAS-native 1D array representation of a 2D matrix, and that's probably too much to contemplate. It still seems to be a win for matrices larger than 100x100.
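
For anyone following along, here is a rough sketch of what I mean by the 1D representation. It assumes the column-major layout that BLAS and JBLAS use, where element (i, j) of a matrix with `rows` rows lives at index `i + j * rows`. The class and method names are made up for illustration and are not from JBLAS or our code. `flatten` is the kind of copy I'd like to avoid, and `add` shows what accumulating directly into the flat array could look like:

```java
import java.util.Arrays;

// Illustrative only: hypothetical helper, not part of JBLAS or the existing solver code.
final class ColumnMajorSketch {

  private ColumnMajorSketch() {}

  /** Copies a rectangular double[rows][cols] into a column-major 1D array. */
  static double[] flatten(double[][] m) {
    int rows = m.length;
    int cols = rows == 0 ? 0 : m[0].length;
    double[] data = new double[rows * cols];
    for (int j = 0; j < cols; j++) {
      for (int i = 0; i < rows; i++) {
        // Column-major: consecutive entries of one column are adjacent in memory.
        data[i + j * rows] = m[i][j];
      }
    }
    return data;
  }

  /** Adds value to element (row, col) of a column-major array holding a matrix with the given row count. */
  static void add(double[] data, int rows, int row, int col, double value) {
    data[row + col * rows] += value;
  }

  public static void main(String[] args) {
    double[][] m = {{1, 2, 3}, {4, 5, 6}};
    double[] flat = flatten(m);
    System.out.println(Arrays.toString(flat)); // [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
  }
}
```

If the matrix were built with something like `add` from the start, the `flatten` copy would go away, but every caller that currently indexes a double[][] would have to change.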

On Fri, Apr 19, 2013 at 6:53 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> Sean,
>
> What about accumulating the matrix to be solved directly into a special
> JBLAS based matrix?