pitrou commented on issue #46063: URL: https://github.com/apache/arrow/issues/46063#issuecomment-3055807630
As a data point, NumPy has a function `nanmin` that ignores NaN when computing an array's minimum, but it still returns NaN if no other values are encountered (it also emits a warning):

```python
>>> import numpy as np
>>> np.nanmin(np.array([1.0, float('nan')]))
np.float64(1.0)
>>> np.nanmin(np.array([float('nan')]))
<ipython-input-17-324f73f42199>:1: RuntimeWarning: All-NaN slice encountered
  np.nanmin(np.array([float('nan')]))
np.float64(nan)
```

However, it also doesn't support an empty input array unless you pass an `initial` argument, and then that value is also what you get back for an all-NaN input:

```python
>>> np.nanmin(np.array([]))
Traceback (most recent call last):
  ...
ValueError: zero-size array to reduction operation fmin which has no identity
>>> np.nanmin(np.array([]), initial=5.0)
np.float64(5.0)
>>> np.nanmin(np.array([float('nan')]), initial=5.0)
np.float64(5.0)
```

Arrow, on the other hand, returns null if an empty array is given:

```python
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> pc.min(pa.array([], type=pa.float64()))
<pyarrow.DoubleScalar: None>
```

Thoughts, @jorisvandenbossche @bkietz @westonpace @WillAyd?
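To make the combination of behaviors concrete, here is a minimal pure-Python sketch of one possible set of semantics: skip NaN, return NaN when only NaN values are present (as `nanmin` does), and return `None` for empty input (mirroring Arrow's null). The name `nan_skipping_min` and its `empty_result` parameter are hypothetical and purely illustrative; this is neither a NumPy nor a PyArrow API, and it is not a proposal for what Arrow should do.

```python
import math


def nan_skipping_min(values, empty_result=None):
    """Hypothetical helper, for illustration only (not a NumPy/PyArrow API)."""
    values = list(values)
    if not values:
        # Arrow-like behavior: an empty input yields a "null" result.
        return empty_result
    non_nan = [v for v in values if not math.isnan(v)]
    if not non_nan:
        # nanmin-like behavior: only NaN values were seen, so return NaN.
        return math.nan
    return min(non_nan)


print(nan_skipping_min([1.0, math.nan]))  # NaN is ignored
print(nan_skipping_min([math.nan]))       # all-NaN input
print(nan_skipping_min([]))               # empty input
```

Unlike `np.nanmin` with `initial`, this sketch keeps the all-NaN case and the empty case distinguishable, which may or may not be desirable.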