On 8/11/22 19:45, Chris Sidebottom wrote:
Hi Matti,

Thanks for your questions :-)

> This seems like it would improve performance on aarch64. Would the routines
> also work on Apple silicon?

Yep, I can't see a reason why that wouldn't be the case.

> If these are new routines, it would be better to implement them in terms of the
> numpy universal intrinsics rather than adding a new submodule.

These would be the same kind of routines as seen in SVML (integrated here:
https://github.com/numpy/numpy/blob/main/numpy/core/src/umath/loops_umath_fp.dispatch.c.src#L67),
where the universal intrinsics are used before calling into the SVML library.
The actual surface area is minimal, so I'd propose we follow a similar path
with our existing routines and then aim to apply universal intrinsics in the
future if that's possible. Does that sound like a good approach?

Cheers,
Chris


Yes, if the routines already exist then it would seem an additional submodule of code would be the best path forward, as long as the license is compatible.

Matti

