Alex,
I think you should recheck your numbers. Both BIDMat and nvblas are
wrappers for cublas. The speeds are identical, except on machines that
have multiple GPUs, which nvblas exploits and cublas doesn't.
It would be a good idea to add a column with Gflop throughput. Your
numbers for BIDMat 10k
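
A minimal sketch of how a Gflop throughput column could be derived from the timings, assuming the benchmark is a dense matrix multiply and using the standard 2*m*n*k operation count. The function name and the timing below are hypothetical, for illustration only, and are not measured results.

```python
def gemm_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in Gflop/s for an (m x k) times (k x n) dense multiply."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Each of the m*n output entries needs k multiplies and k adds.
    flops = 2.0 * m * n * k
    return flops / seconds / 1e9


# Hypothetical timing: a 10k x 10k square multiply taking 2.0 seconds.
print(gemm_gflops(10_000, 10_000, 10_000, 2.0))  # prints 1000.0
```

Reporting throughput this way makes runs at different matrix sizes directly comparable, and makes it easy to see whether a number is plausible against the hardware's peak.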
I
> don't have any background in optimizations)
>
>
>
> On Thu, Mar 12, 2015 at 8:50 PM, jfcanny <[hidden email]> wrote:
>
> > If you're contemplating GPU acceleration in Spark, its important to look
> > beyond BLAS. Dense BLAS probably ac
If you're contemplating GPU acceleration in Spark, it's important to look
beyond BLAS. Dense BLAS probably accounts for only 10% of the cycles in the
datasets we've tested in BIDMach, and we've tried to make them
representative of industry machine learning workloads. Unless you're
crunching images or