On Tue, Nov 15, 2016 at 6:37 PM, Jerry DeLisle <jvdeli...@charter.net> wrote:
> All comments incorporated. Standing by for approval.

Looks good, nice job! Ok for trunk.

I was thinking that for strided arrays, it is probably faster to copy them to dense arrays before doing the matrix multiplication. That would also enable using an optimized BLAS (-fexternal-blas) for strided arrays. But this is of course nothing that blocks this patch, just something that might be worth looking into in the future.

-- 
Janne Blomqvist
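
A minimal sketch of the pack-then-multiply idea suggested above, in C. It is not the code from the patch or from libgfortran. The names pack_dense, matmul_dense and matmul_strided are hypothetical, strides are counted in elements, and storage is assumed to be column-major. The plain triple loop stands in for whatever optimized dense kernel or external BLAS routine would actually be called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Copy an m x n matrix with element strides rs (between rows) and cs
   (between columns) into a freshly allocated dense column-major buffer.
   Returns NULL if the allocation fails.  Hypothetical helper.  */
static double *
pack_dense (const double *a, size_t m, size_t n, ptrdiff_t rs, ptrdiff_t cs)
{
  size_t count = m * n;
  double *d = malloc ((count ? count : 1) * sizeof *d);
  if (d == NULL)
    return NULL;
  for (size_t j = 0; j < n; j++)
    for (size_t i = 0; i < m; i++)
      d[i + j * m] = a[(ptrdiff_t) i * rs + (ptrdiff_t) j * cs];
  return d;
}

/* Dense column-major product C (m x n) = A (m x k) * B (k x n).
   Stand-in for an optimized kernel or an external BLAS call.  */
static void
matmul_dense (double *c, const double *a, const double *b,
              size_t m, size_t k, size_t n)
{
  for (size_t j = 0; j < n; j++)
    {
      for (size_t i = 0; i < m; i++)
        c[i + j * m] = 0.0;
      for (size_t l = 0; l < k; l++)
        for (size_t i = 0; i < m; i++)
          c[i + j * m] += a[i + l * m] * b[l + j * k];
    }
}

/* Multiply two possibly strided matrices by first packing each into a
   dense temporary.  C is dense column-major, m x n.  Returns 0 on
   success and -1 if a temporary could not be allocated.  */
static int
matmul_strided (double *c,
                const double *a, ptrdiff_t a_rs, ptrdiff_t a_cs,
                const double *b, ptrdiff_t b_rs, ptrdiff_t b_cs,
                size_t m, size_t k, size_t n)
{
  assert (c != NULL && a != NULL && b != NULL);
  double *da = pack_dense (a, m, k, a_rs, a_cs);
  double *db = pack_dense (b, k, n, b_rs, b_cs);
  int status = -1;
  if (da != NULL && db != NULL)
    {
      matmul_dense (c, da, db, m, k, n);
      status = 0;
    }
  free (da);
  free (db);
  return status;
}
```

The copies cost O(m*k + k*n) extra memory traffic against O(m*k*n) work in the product, so packing would mainly be expected to pay off for larger matrices. Where the break-even point lies is something to measure, not assume.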