Hello,

Here at Arm, we've been investigating how we can improve performance on 
AArch64. One way in which we can improve performance is by integrating some 
existing optimized routines 
(https://github.com/ARM-software/optimized-routines), similar to the SVML 
methods for AVX512 that are currently included as a git submodule. Our intent 
is to include the optimized routines repository as an additional submodule 
which we can then use to provide routines on AArch64 for ASIMD, SVE and beyond.

Currently, we're targeting a maximum error of 4 ULP, as this aligns with libmvec 
(https://sourceware.org/glibc/wiki/libmvec) and the SVML integration 
(https://github.com/numpy/numpy/pull/19478). Alongside this, we're adding 
sufficient error handling to pass the NumPy test suite, meeting the test 
requirements highlighted in the SVML integration 
(https://github.com/numpy/numpy/pull/19478#issuecomment-893001722).
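
To make the ULP target concrete, here is a minimal sketch of how the error of a 
float32 routine could be measured against a higher-precision reference. The 
helper names (ulp_error_f32, within_ulp) are hypothetical and are not part of 
NumPy or optimized-routines, and the sketch assumes a float64 evaluation is 
accurate enough to act as the reference for float32 results.

```python
import numpy as np


def ulp_error_f32(approx, reference):
    """Error of float32 `approx` against float64 `reference`, in float32 ULPs.

    Hypothetical helper for illustration only.
    """
    approx = np.asarray(approx, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float64)
    # Size of one float32 ULP at the magnitude of the reference value.
    ulp = np.spacing(np.abs(reference).astype(np.float32)).astype(np.float64)
    return np.abs(approx.astype(np.float64) - reference) / ulp


def within_ulp(approx, reference, max_ulp=4.0):
    """True if every element of `approx` is within `max_ulp` of `reference`."""
    return bool(np.all(ulp_error_f32(approx, reference) <= max_ulp))


if __name__ == "__main__":
    # Compare NumPy's float32 exp with a float64 evaluation of the same inputs.
    x = np.linspace(-10.0, 10.0, 1001, dtype=np.float32)
    errors = ulp_error_f32(np.exp(x), np.exp(x.astype(np.float64)))
    print("max error in ULP:", errors.max())
```

A real accuracy test would likely use a more precise reference and cover 
special values (infinities, NaNs, subnormals) separately; this only shows the 
basic measurement.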

We've already started curating the necessary functions; please let us know if 
you have any feedback.

Cheers,
Chris

_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org