In the script below, the evaluation of the expression `z.real**2 +
z.imag**2` is timed using the `timeit` module. `z` is a 1D array of
random samples with dtype `np.complex128` and with length 250000.

The mystery is that the calculation is much slower on the first array
to which it is applied than on the second.  The output of the
script is

```
numpy version 1.23.0.dev0+460.gc30876f64

 619.7096 microseconds
 625.3833 microseconds
 634.8389 microseconds

 137.0659 microseconds
 137.5231 microseconds
 137.5582 microseconds
```

Each batch of three timings corresponds to repeating the timeit
operation three times on the same random array `z`; i.e. a new array
`z` is generated for the second batch.  The question is: why does it
take so much longer to evaluate the expression the first time?

Some other details:

* If I change the expression to, say, `'z.real + z.imag'`, the huge
disparity disappears.
* If I generate more random `z` arrays, the performance remains at the
level of approximately 140 usec.
* I used the main branch of numpy for the above output, but the same
thing happens with 1.20.3, so this is not the result of a recent
change.
* So far, when I run the script, I always see output like that shown
above: the time required for the first random array is typically four
times that required for the second array.  If I run similar commands
in ipython, I have seen the slow case repeated several times (with
newly generated random arrays), but eventually the time drops down to
140 usec (or so), and I don't see the slow case anymore.
* I'm using a 64 bit Linux computer:
  ```
  $ uname -a
  Linux pop-os 5.15.8-76051508-generic #202112141040~1639505278~21.10~0ede46a SMP Tue Dec 14 22:38:29 U x86_64 x86_64 x86_64 GNU/Linux
  ```
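
For anyone who wants to check the variations listed above in a single run, here is a minimal sketch. The helpers `time_expr` and `compare` are hypothetical names introduced only for this illustration, and `number=200` is an assumption chosen to keep the run short (the script uses 10000 repetitions). It also interleaves the two expressions on each array, which the script does not do, so the timings it prints may not reproduce the pattern shown above.

```python
import timeit
import numpy as np


def generate_sample(n, rng):
    # Same construction as in the script: 2*n normal samples viewed as
    # n complex128 values.
    return rng.normal(scale=1000, size=2*n).view(np.complex128)


def time_expr(expr, z, number):
    # Hypothetical helper: average time per evaluation of expr, in
    # microseconds, with z as the only name visible to the expression.
    t = timeit.timeit(expr, globals={'z': z}, number=number)
    return 1e6 * t / number


def compare(exprs, n=250000, n_arrays=4, number=200, seed=None):
    # Hypothetical helper: time every expression on a series of freshly
    # generated arrays and collect the results per expression.
    rng = np.random.default_rng(seed)
    results = {expr: [] for expr in exprs}
    for _ in range(n_arrays):
        z = generate_sample(n, rng)
        for expr in exprs:
            results[expr].append(time_expr(expr, z, number))
    return results


if __name__ == '__main__':
    exprs = ['z.real**2 + z.imag**2', 'z.real + z.imag']
    for expr, times in compare(exprs).items():
        print(expr)
        for t in times:
            print(f"{t:9.4f} microseconds")
        print()
```

Each expression gets one timing per generated array, so a slow first array followed by faster ones would show up as a drop after the first line of a group.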

Any ideas?

Warren

Here's the script:

```
import timeit
import numpy as np


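def generate_sample(n, rng):
    # Draw 2*n float64 samples and reinterpret each consecutive pair as
    # one complex128 value, giving an array of length n.
    return rng.normal(scale=1000, size=2*n).view(np.complex128)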


print(f'numpy version {np.__version__}')
print()

rng = np.random.default_rng()
n = 250000
timeit_reps = 10000

expr = 'z.real**2 + z.imag**2'

z = generate_sample(n, rng)
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()

z = generate_sample(n, rng)
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()
```