[Numpy-discussion] Re: Introducing Arm Optimized Routines

Matti Picus Wed, 09 Nov 2022 08:20:36 -0800


On 8/11/22 19:45, Chris Sidebottom wrote:

Hi Matti,

Thanks for your questions :-)

This seems like it would improve performance on aarch64. Would the routines 
also work with the Apple silicon?

Yip, I can't see a reason why that wouldn't be the case.

If these are new routines, it would be better to implement them in terms of the 
numpy universal intrinsics rather than adding a new submodule.

These would be the same routines as seen in SVML (integrated here: 
https://github.com/numpy/numpy/blob/main/numpy/core/src/umath/loops_umath_fp.dispatch.c.src#L67),
which use the universal intrinsics before using the SVML library. The actual 
surface area is minimal, so I'd propose we follow a similar path with our 
existing routines and then aim to apply universal intrinsics if that's possible 
in the future. Does that sound like a good approach?

Cheers,
Chris

Yes, if the routines already exist then it would seem an additional submodule of code would be the best path forward, as long as the license is compatible.


Matti

