On Sunday, 14 June 2015 at 21:50:02 UTC, Ola Fosheim Grøstad wrote:
On Sunday, 14 June 2015 at 21:31:53 UTC, anonymous wrote:
2. Then write similar code with hardware-optimized BLAS and benchmark where the overhead between pure C/LLVM and BLAS calls balances out.
Maybe there are more important / beneficial things to work on, assuming the total time of contributors is fixed and used for other D stuff :)

Sure, but that is what I'd do if I had the time. Get a baseline for what kind of NxN sizes D can reasonably be expected to deal with in a "naive brute force" manner.

Then consider pushing anything beyond that over to something more specialized.

*shrugs*

Sorry, I should read more carefully. I understood it as 'optimize the default implementation to the speed of a high-quality BLAS for _any_/large matrix size'. Great if it gets done, but IMO there is no real pressure to do it, and it would probably need a lot of time from experts.

Benchmarking when an existing BLAS is actually faster than 'naive brute force' sounds very good and reasonable.
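
A minimal sketch of the baseline step discussed above, written in C since the thread names pure C/LLVM as the comparison point. The function names (naive_matmul, time_naive, report_baseline) and the size range are made up for illustration and are not taken from any D or BLAS code. The BLAS side would be timed the same way around a call such as cblas_dgemm; it is left out here so the sketch builds without a BLAS library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Naive triple-loop multiply: c = a * b, all n x n, row-major. */
void naive_matmul(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* CPU seconds used by one naive n x n multiply, or -1.0 if allocation fails. */
double time_naive(size_t n)
{
    assert(n > 0);
    double *a = malloc(n * n * sizeof *a);
    double *b = malloc(n * n * sizeof *b);
    double *c = malloc(n * n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }
    /* Arbitrary but deterministic input values (hypothetical test data). */
    for (size_t i = 0; i < n * n; i++) {
        a[i] = (double)(i % 7);
        b[i] = (double)(i % 5);
    }
    clock_t start = clock();
    naive_matmul(n, a, b, c);
    clock_t stop = clock();
    /* Read one result so the compiler cannot discard the multiply. */
    volatile double sink = c[n * n - 1];
    (void)sink;
    free(a); free(b); free(c);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Print one timing line per size, doubling n from 16 up to max_n. */
void report_baseline(size_t max_n)
{
    for (size_t n = 16; n <= max_n; n *= 2)
        printf("n=%zu naive=%.6f s\n", n, time_naive(n));
}
```

Calling report_baseline(512) from main gives one timing line per size. Timing a BLAS multiply over the same sizes and putting the two columns side by side would show roughly where the library call starts to win.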
