While VML is generally much faster for big arrays, the overhead is considerable or even dominant for small ones.
I think the ideal way is to let people explicitly articulate their intention to use these, instead of changing the behavior silently. For example, I may want to use these over several large arrays, but not at every place with ``a + b``.

Thinking about it more, I think the best way to do this is for each package to define its own functions (not extending the functions in Base) with consistent naming. Then in the client code, we will be able to do

    VML.exp(a1)
    IPP.exp(a2)
    Yeppp.exp(a3)

I think all we have to do is to coordinate the naming across the packages, so that I can easily replace ``Yeppp.add`` with ``VML.add`` or the other way round.

- Dahua

On Friday, February 28, 2014 9:14:01 AM UTC-6, John Myles White wrote:
>
> If you get them all to export the same API, you could, in principle, just switch `using VML` to `using Yeppp`.
>
> My question: are we finally conceding that add! and co. is probably worth having?
>
> — John
>
> On Feb 28, 2014, at 7:10 AM, Dahua Lin <lind...@gmail.com> wrote:
>
> This is very nice.
>
> Now that we have several "back-ends" for vectorized computation (VML, Yeppp, Julia's builtin functions, as well as the @simd-ized versions), I am considering whether there is a way to switch back-ends without affecting the client code.
>
> - Dahua
>
> On Thursday, February 27, 2014 11:58:20 PM UTC-6, Stefan Karpinski wrote:
>>
>> We really need to stop using libm for those.
>>
>> On Fri, Feb 28, 2014 at 12:40 AM, Simon Kornblith <si...@simonster.com> wrote:
>>
>>> Some of the poorest performers here are trunc, ceil, floor, and round (as of LLVM 3.4). We currently call out to libm for these, but there are LLVM intrinsics that are optimized into a single instruction. It looks like the loop vectorizer may even vectorize these intrinsics automatically.
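Simon's point about the rounding intrinsics can be sketched in Julia. This is a hypothetical illustration, not how Base actually implements `floor`; it uses the intrinsic-`ccall` form available in more recent Julia versions:

```julia
# Hypothetical sketch: call the LLVM floor intrinsic directly rather than
# going through libm. On x86-64 with SSE4.1+ this should lower to a single
# roundsd instruction, and a plain loop over it is a candidate for the
# LLVM loop vectorizer.
llvm_floor(x::Float64) = ccall("llvm.floor.f64", llvmcall, Float64, (Float64,), x)

function floor_all!(dest::Vector{Float64}, src::Vector{Float64})
    @inbounds for i in eachindex(src)   # simple loop the vectorizer can target
        dest[i] = llvm_floor(src[i])
    end
    return dest
end
```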
>>> Simon
>>>
>>> On Thursday, February 27, 2014 9:04:18 PM UTC-5, Simon Kornblith wrote:
>>>>
>>>> I created a package that makes Julia use Intel's Vector Math Library for operations on arrays of Float32/Float64. VML provides vectorized versions of many of the functions in openlibm with equivalent (<1 ulp) accuracy (in VML_HA mode). The speedup is often quite stunning; see benchmarks at https://github.com/simonster/VML.jl.
>>>>
>>>> Simon
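Dahua's consistent-naming proposal from the top of the thread could look roughly like the sketch below. The module and function names are purely illustrative stand-ins (plain-Julia fallbacks rather than the actual VML/Yeppp bindings), just to show how coordinated naming makes back-ends interchangeable:

```julia
# Illustrative stand-ins for back-end packages that agree on naming
# without extending Base. A real package would wrap vdExp / yepMath_Exp etc.
module FakeVML
    exp!(dest, src) = (dest .= Base.exp.(src); dest)
    add!(dest, a, b) = (dest .= a .+ b; dest)
end

module FakeYeppp
    exp!(dest, src) = (dest .= Base.exp.(src); dest)
    add!(dest, a, b) = (dest .= a .+ b; dest)
end

# Because the names match, switching back-ends is one line in client code:
const Backend = FakeVML        # swap to FakeYeppp without touching the rest
x = rand(1000); y = similar(x)
Backend.exp!(y, x)
```

This is essentially John's point as well: with a shared API, `using VML` versus `using Yeppp` (or an explicit module prefix, as here) is the only thing that changes.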