In the script below, the evaluation of the expression `z.real**2 + z.imag**2` is timed using the `timeit` module. `z` is a 1D array of random samples with dtype `np.complex128` and with length 250000.
The mystery is the change in performance of the calculation from the first array to which it is applied to the second. The output of the script is

```
numpy version 1.23.0.dev0+460.gc30876f64

 619.7096 microseconds
 625.3833 microseconds
 634.8389 microseconds

 137.0659 microseconds
 137.5231 microseconds
 137.5582 microseconds
```

Each batch of three timings corresponds to repeating the timeit operation three times on the same random array `z`; i.e. a new array `z` is generated for the second batch. The question is: why does it take so much longer to evaluate the expression the first time?

Some other details:

* If I change the expression to, say, `'z.real + z.imag'`, the huge disparity disappears.
* If I generate more random `z` arrays, the performance remains at the level of approximately 140 usec.
* I used the main branch of numpy for the above output, but the same thing happens with 1.20.3, so this is not the result of a recent change.
* So far, when I run the script, I always see output like that shown above: the time required for the first random array is typically four times that required for the second array. If I run similar commands in ipython, I have seen the slow case repeated several times (with newly generated random arrays), but eventually the time drops down to 140 usec (or so), and I don't see the slow case anymore.
* I'm using a 64-bit Linux computer:

```
$ uname -a
Linux pop-os 5.15.8-76051508-generic #202112141040~1639505278~21.10~0ede46a SMP Tue Dec 14 22:38:29 U x86_64 x86_64 x86_64 GNU/Linux
```

Any ideas?
Warren

Here's the script:

```
import timeit
import numpy as np


def generate_sample(n, rng):
    # Draw 2*n float64 normal samples and view each pair as one complex128.
    return rng.normal(scale=1000, size=2*n).view(np.complex128)


print(f'numpy version {np.__version__}')
print()

rng = np.random.default_rng()
n = 250000
timeit_reps = 10000

expr = 'z.real**2 + z.imag**2'

# First batch: three timings on the first random array.
z = generate_sample(n, rng)
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()

# Second batch: three timings on a newly generated array.
z = generate_sample(n, rng)
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()
```
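In case it helps anyone narrow this down, here is a rough sketch of a variant that times several expressions on a sequence of freshly generated arrays, so the first-array effect and the `'z.real + z.imag'` comparison can be seen side by side. The helper names `time_expr` and `compare`, the extra expression `'z.real**2'`, and the default of 1000 repetitions are my own choices for illustration and are not part of the script above.

```python
import timeit
import numpy as np


def generate_sample(n, rng):
    # Same construction as in the script: 2*n float64 samples viewed as
    # n complex128 values.
    return rng.normal(scale=1000, size=2*n).view(np.complex128)


def time_expr(expr, z, reps):
    # Hypothetical helper: mean time per evaluation of expr, in microseconds.
    # Only z is exposed to the timed statement.
    t = timeit.timeit(expr, globals={'z': z}, number=reps)
    return 1e6 * t / reps


def compare(n=250000, reps=1000, n_arrays=4, seed=None):
    # Time each expression on n_arrays freshly generated arrays, in order,
    # and return one dict of timings per array.
    rng = np.random.default_rng(seed)
    exprs = ['z.real**2 + z.imag**2', 'z.real + z.imag', 'z.real**2']
    results = []
    for k in range(n_arrays):
        z = generate_sample(n, rng)
        row = {expr: time_expr(expr, z, reps) for expr in exprs}
        results.append(row)
        for expr, usec in row.items():
            print(f"array {k}  {expr:24s} {usec:9.4f} microseconds")
        print()
    return results
```

Calling `compare()` prints one group of timings per array. Note that the order in which the expressions are timed within each array may itself affect the numbers, so it could be worth permuting `exprs` between runs.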