[Numpy-discussion] Re: array.T.max() vs array.max(axis=0) performance difference

Tiziano Zito via NumPy-Discussion Fri, 21 Mar 2025 15:24:15 -0700

Hi George,

what you see is due to the memory layout of numpy arrays. If you switch your 
array to F-order you'll see that the two functions have the same timings, i.e. 
both are fast (on my machine 25 times faster for the 1_000_000 points case).


Try:

vertices = np.array(np.random.random((n, 2)), order='F')

When your array doesn't fit in L1-cache anymore, either order 'C' or order 'F' 
becomes (much) more efficient depending on which dimension you are internally 
looping through.

You can read more about it here:
https://numpy.org/doc/stable/dev/internals.html#internal-organization-of-numpy-arrays
and
https://numpy.org/doc/stable/reference/arrays.ndarray.html#internal-memory-layout-of-an-ndarray

Hope that helps,
Tiziano


On Fri 21 Mar, 12:10 +0200, George Tsiamasiotis via NumPy-Discussion 
<[email protected]> wrote:

Hello NumPy community!

I was writing a function that calculates the bounding box of a polygon (the 
smallest rectangle that fully contains the polygon, and who's sides are 
parallel to the x and y axes). The input is a (N,2) array containing the 
vertices of the polygon, and the output is a 4-tuple containing the vertices of 
the 2 corners of the bounding box.

I found two ways to do that using the np.min() and np.max() methods, and I was 
surprised to see a significant speed difference, even though they seemingly do 
the same thing.

While for small N the speed is essentially the same, the difference becomes 
noticeable for larger N. >From my testing, the difference seems to plateau, 
with the one way being around 4-5 times faster than the other.

Is there an explanation for this?

Here is a small benchmark I wrote (must be executed with IPython):

import numpy as np
from IPython import get_ipython

vertices = np.random.random((1000, 2))

def calculate_bbox_normal(vertices: np.ndarray) -> tuple[np.float64]:
    xmin, ymin = vertices.min(axis=0)
    xmax, ymax = vertices.max(axis=0)
    return xmin, ymin, xmax, ymax

def calculate_bbox_transpose(vertices: np.ndarray) -> tuple[np.float64]:
    xmin = vertices.T[0].min()
    xmax = vertices.T[0].max()
    ymin = vertices.T[1].min()
    ymax = vertices.T[1].max()
    return xmin, ymin, xmax, ymax

bbox_normal = calculate_bbox_normal(vertices)
bbox_transpose = calculate_bbox_transpose(vertices)

print(f"Equality: {bbox_normal == bbox_transpose}")

for n in [10, 100, 1000, 10_000, 100_000, 1_000_000]:
    print(f"Number of points: {n}")
    vertices = np.random.random((n, 2))
    print("Normal:    ", end="")
    get_ipython().run_line_magic("timeit", "calculate_bbox_normal(vertices)")
    print("Transpose: ", end="")
    get_ipython().run_line_magic("timeit", "calculate_bbox_transpose(vertices)")
    print()

_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: [email protected]


_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: [email protected]

[Numpy-discussion] Re: array.T.max() vs array.max(axis=0) performance difference

Reply via email to