[Numpy-discussion] How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread 腾刘
Hello everyone, I'm here again to ask a naive question about NumPy performance. As far as I know, NumPy's vectorized operations are very effective because they use SIMD instructions and multiple threads, compared to index-style programming (using a "for" loop and assigning each element with its in…
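A minimal sketch of the two styles being compared (illustrative only; array names and sizes are placeholders):

    import numpy as np

    rng = np.random.default_rng(0)
    a, b, c = (rng.random(1_000_000) for _ in range(3))

    # Vectorized: fast, but "a * b" allocates a full-size temporary
    # array before c is added
    d = a * b + c

    # Index-style: no temporaries, but the Python-level loop is
    # orders of magnitude slower
    d2 = np.empty_like(a)
    for i in range(a.size):
        d2[i] = a[i] * b[i] + c[i]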

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread Kevin Sheppard
You can use in-place operators where appropriate to avoid memory allocation: a *= b; c += a. Kevin
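A runnable sketch of the in-place approach, assuming a and c may be overwritten:

    import numpy as np

    rng = np.random.default_rng(0)
    a, b, c = (rng.random(1_000_000) for _ in range(3))

    # In-place variant of d = a*b + c: no temporary array is allocated,
    # but a and c are clobbered along the way
    a *= b   # a now holds a*b
    c += a   # c now holds a*b + c, i.e. the desired result

    # To keep a and c intact, write into a preallocated buffer instead:
    # d = np.empty_like(a); np.multiply(a, b, out=d); d += c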

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread 腾刘
Thanks a lot for answering this question, but I still have some uncertainties. I'm trying to improve the time efficiency as much as possible, so I'm not mainly worried about memory allocation, since in my opinion it won't cost much. Instead, memory access is my central concern because of the…

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread Kevin Sheppard
Have a look at numexpr (https://github.com/pydata/numexpr). It can achieve large speedups in ops like this at the cost of having to write expensive operations as strings, e.g., d = ne.evaluate('a * b + c'). You could also write a gufunc in numba that would be memory and access efficient. Kevin
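A minimal sketch of both suggestions; the numba decorator and signature string shown are one common form, not necessarily what was meant by "gufunc", and numexpr picks up a, b, c from the calling frame:

    import numpy as np
    import numexpr as ne
    from numba import vectorize

    rng = np.random.default_rng(0)
    a, b, c = (rng.random(1_000_000) for _ in range(3))

    # numexpr evaluates the whole expression in cache-sized blocks,
    # avoiding the full-size temporary for a*b
    d = ne.evaluate('a * b + c')

    # A compiled ufunc via numba: one pass over the data, no temporaries
    @vectorize(['float64(float64, float64, float64)'])
    def muladd(x, y, z):
        return x * y + z

    d2 = muladd(a, b, c)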

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread Francesc Alted
This is exactly what numexpr is meant for: https://numexpr.readthedocs.io/projects/NumExpr3/en/latest/ In particular, see these benchmarks (made around 10 years ago, but they should still apply): https://numexpr.readthedocs.io/projects/NumExpr3/en/latest/intro.html#expected-performance Cheers

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread 腾刘
Thanks a million!! I will check these thoroughly~

[Numpy-discussion] Re: How to avoid this memory copy overhead in d=a*b+c?

2022-09-16 Thread 腾刘
I'm still so naive in Python; there truly are lots of beautiful libraries at hand. Thanks a lot for the suggestions!!

[Numpy-discussion] Re: Enhancement for AArch64 SVE instruction set

2022-09-16 Thread kawakam...@fujitsu.com
Hi, It's been a long time since I first contacted this list, but I submitted my pull request yesterday for handling the Arm64 SVE architecture: https://github.com/numpy/numpy/pull/22265 Since there may be no public CI environment that runs the SVE instruction set, I tested my source code on an in-house…

[Numpy-discussion] Re: Enhancement for AArch64 SVE instruction set

2022-09-16 Thread matti picus
It seems Cirrus CI offers AWS EKS Graviton2 instances [0], and this is free for open-source projects. Do you know if that offering has SVE-enabled CPUs? Matti [0] https://cirrus-ci.org/guide/linux/

[Numpy-discussion] Ways to achieve faster np.nanpercentile() calculation?

2022-09-16 Thread Aron Gergely
Hi all, On my system, np.nanpercentile() is orders of magnitude (>100x) slower than np.percentile(). I am using numpy 1.23.1 and wondering if there is a way to speed it up. I came across this workaround for 3D arrays: https://krstn.eu/np.nanpercentile()-there-has-to-be-a-faster-way/ But I would need…
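One cheap partial workaround (an illustrative sketch only, not the sorting-based approach from the linked post; the function name is hypothetical) is to pay the NaN penalty only when NaNs are actually present:

    import numpy as np

    def percentile_maybe_nan(arr, q, axis=None):
        # Dispatch to the much faster np.percentile when the array
        # contains no NaNs; otherwise fall back to np.nanpercentile.
        if np.isnan(arr).any():
            return np.nanpercentile(arr, q, axis=axis)
        return np.percentile(arr, q, axis=axis)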