Hi,

For the example with small NumPy arrays, the speed problem is probably
because NumPy has not been optimized for low per-call overhead. For
example, each C function should first check whether the input is
already a NumPy array and, only if it is not, jump to a function that
makes one. Currently, in the C function (PyArray_Multiply?) that gets
called by dot(), a separate C function call is made to check whether
the input is a NumPy array. This is extra overhead for the most
frequent case, where the input is already a NumPy array. I'm pretty
sure this happens in many places. In this particular function, there
are many other function calls before BLAS is called, even for the
simple cases of vector x vector, vector x matrix or matrix x matrix
dot products.
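
A rough way to see this per-call overhead from Python is to time dot()
on tiny and on larger vectors. This is only a sketch: the helper names
(per_call_seconds, compare) are mine, the pure-Python loop is just a
baseline for comparison, and the numbers will depend on the machine,
the NumPy version and the BLAS in use.

```python
import timeit

import numpy as np


def per_call_seconds(fn, number=1000, repeats=3):
    """Best observed time per call of fn(), in seconds."""
    return min(timeit.repeat(fn, repeat=repeats, number=number)) / number


def compare(n):
    """Time np.dot against a pure-Python dot product on length-n vectors."""
    a = np.ones(n)
    b = np.ones(n)
    x = [1.0] * n
    y = [1.0] * n

    numpy_time = per_call_seconds(lambda: np.dot(a, b))
    python_time = per_call_seconds(
        lambda: sum(p * q for p, q in zip(x, y))
    )
    return numpy_time, python_time


if __name__ == "__main__":
    for n in (2, 1000):
        numpy_time, python_time = compare(n)
        print(f"n={n:5d}  np.dot: {numpy_time * 1e6:8.2f} us/call  "
              f"pure Python: {python_time * 1e6:8.2f} us/call")
```

If the per-call time of np.dot barely changes between n=2 and n=1000,
most of the time for the small case is fixed overhead (argument
checking, dispatch) rather than the arithmetic itself.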

But this is probably for another thread if people want to discuss it
more. Also, I didn't verify how often the overhead could be lowered,
as we don't need it ourselves. So it could be that just a few
functions need this type of optimization.

For the comparison with the multiple types of arrays on the GPU, I
think the first reason is that people worked in isolation and only
implemented the subset of the NumPy ndarray that they needed. As
different projects/groups need different parts, reusing other people's
work was not trivial.

Otherwise, I see the problem, but I don't know what to say about it as
I haven't experienced it myself.

Fred
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion