Check out Mikio's presentation, which mentions JNI overhead, at http://mikiobraun.github.io/jblas/
I agree, and it's largely unavoidable. But copying O(N^2) data for an operation that costs more like O(N^3) seems fine in theory; N just needs to be large enough. And yes, that crossover point has been strangely high. Here I think the combination of a more suitable LAPACK routine and a largish but reasonable N (> 100) makes it a win.

On Fri, Apr 19, 2013 at 3:21 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> On Apr 19, 2013 3:49 AM, "Sean Owen" <sro...@gmail.com> wrote:
>>
>> Hey Mikio, I posted the stack trace on the jblas-users list FYI. It
>> happens on a random 100x100 matrix, for example -- but not every time.
> I suspect that is due to JNI calls being incredibly costly compared to FPU
> ops. I see very similar behavior in my calls from R to Java and back. And R
> calls LAPACK too for certain things.
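The copy-vs-compute argument above can be sketched with a back-of-the-envelope count. This is an illustration I'm adding, not code from the thread: the class name and the assumption that a JNI call copies both N x N inputs and the N x N result are mine. It shows that the per-element cost of the copy is amortized as N grows, since flops scale as N^3 while copied data scales as N^2.

```java
// Illustrative sketch: why an O(N^2) copy across the JNI boundary can
// be acceptable for an O(N^3) operation like dense matrix multiply.
// Assumes (hypothetically) that the JNI path copies two N x N inputs
// and one N x N output.
public class CopyVsCompute {

    // Total elements copied across the JNI boundary: A, B, and C.
    static long copiedElements(int n) {
        return 3L * n * n;
    }

    // Floating-point ops for a dense N x N multiply: N^2 dot products
    // of length N, each costing one multiply and one add per element.
    static long multiplyFlops(int n) {
        return 2L * n * n * n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            double flopsPerCopiedElement =
                (double) multiplyFlops(n) / copiedElements(n);
            System.out.println("N=" + n + ": "
                + flopsPerCopiedElement + " flops per copied element");
        }
    }
}
```

For N = 10 the ratio is under 7 flops per copied element, so fixed JNI and copy overhead can easily dominate; by N = 100 it is about 67, and it keeps growing linearly in N, which matches the suggestion that N > 100 is where the native routine wins.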