Hi Steven,

I added your version (vander3) to my benchmark and updated the IJulia 
notebook:

http://nbviewer.ipython.org/gist/synapticarbors/26910166ab775c04c47b

As you mentioned, it's a lot faster than the other version I wrote, and on 
my machine it closes the gap with numpy for the larger arrays. The 
@inbounds macro makes a small but not dramatic difference. One thing I 
wonder is whether there would be interest in a way of turning off bounds 
checking globally, at the function or module/file level, similar to 
Cython's decorators and file-level compiler directives.
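For context, the pattern I'm benchmarking looks roughly like this (a toy 
Vandermonde fill for illustration, not the actual vander3 from the thread), 
with @inbounds applied at the loop since that's the finest granularity 
available:

```julia
# Toy Vandermonde fill: column j holds x[i]^(j-1), built as a running
# product from the previous column. Illustrative only; the real vander3
# in the notebook may differ.
function vander_demo(x::AbstractVector{T}, n::Integer) where {T}
    m = length(x)
    V = Matrix{T}(undef, m, n)
    @inbounds for j in 1:n, i in 1:m
        V[i, j] = j == 1 ? one(T) : V[i, j-1] * x[i]
    end
    return V
end
```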

Josh

On Thursday, January 8, 2015 at 4:36:52 PM UTC-5, Steven G. Johnson wrote:
>
> For comparison, the NumPy vander function
>
>      
> https://github.com/numpy/numpy/blob/f4be1039d6fe3e4fdc157a22e8c071ac10651997/numpy/lib/twodim_base.py#L490-L577
>
> does all its work in multiply.accumulate.   Here is the outer loop of 
> multiply.accumulate (written in C):
>
>    
> https://github.com/numpy/numpy/blob/3b22d87050ab63db0dcd2d763644d924a69c5254/numpy/core/src/umath/ufunc_object.c#L2936-L3264
>
> and the inner loops (I think) are generated from this source file for 
> various numeric types:
>
>   
> https://github.com/numpy/numpy/blob/3b22d87050ab63db0dcd2d763644d924a69c5254/numpy/core/src/umath/loops.c.src
>
> A quick glance at these will tell you the price in code complexity that 
> NumPy is paying for the performance they manage to get.
>
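For what it's worth, the multiply.accumulate strategy translates pretty 
directly to Julia: fill every column after the first with x, then take a 
running product across columns. A rough sketch (my own re-implementation 
of the idea, not NumPy's actual code, and assuming n >= 2):

```julia
# Vandermonde via a running product across columns, mirroring NumPy's
# multiply.accumulate approach: col 1 is ones, cols 2..n each hold x,
# and cumprod along dims=2 turns column j into x.^(j-1).
function vander_accum(x::AbstractVector, n::Integer)
    m = length(x)
    V = ones(eltype(x), m, n)
    V[:, 2:n] .= x
    return cumprod(V, dims=2)
end
```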
