On 8/11/22 19:45, Chris Sidebottom wrote:
Hi Matti,
Thanks for your questions :-)
This seems like it would improve performance on aarch64. Would the routines
also work on Apple silicon?
Yip, I can't see a reason why that wouldn't be the case.
If these are new routines, it would be better to implement them in terms of the
numpy universal intrinsics rather than adding a new submodule.
These would be the same routines as seen in SVML (integrated here:
https://github.com/numpy/numpy/blob/main/numpy/core/src/umath/loops_umath_fp.dispatch.c.src#L67),
which use the universal intrinsics before using the SVML library. The actual
surface area is minimal, so I'd propose we follow a similar path with our
existing routines and then aim to apply universal intrinsics if that's possible
in the future. Does that sound like a good approach?
Cheers,
Chris
Yes, if the routines already exist then it would seem an additional
submodule of code would be the best path forward, as long as the license
is compatible.
Matti