[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

john . dawson Tue, 22 Aug 2023 07:38:44 -0700

Dom Grigonis wrote:
> 1. Dimension length stays constant, while cumusm0 extends length to n+1, then 
> np.diff, truncates it back. This adds extra complexity, while things are very 
> convenient to work with when dimension length stays constant throughout the 
> code.

For n values there are n-1 differences. Equivalently, for k differences there 
are k+1 values. Herefor, `diff` ought to reduce length by 1 and `cumsum` ought 
to increase it by 1. Returning arrays of the same length is a fencepost error. 
This is a problem in the current behaviour of `cumsum` and the proposed 
behaviour of `diff0`.

Dom Grigonis wrote:
> For now, I only see my point of view and I can list a number of cases from 
> data analysis and modelling, where I found np.diff0 to be a fairly optimal 
> choice to use and it made things smoother. While I haven’t seen any real-life 
> examples where np.cumsum0 would be useful so I am naturally biased. I would 
> appreciate If anyone provided some examples that justify np.cumsum0 - for now 
> I just can’t think of any case where this could actually be useful or why it 
> would be more convenient/sensible than np.diff0.

------------------------------------------------------------
EXAMPLE

Consider a path given by a list of points, say (101, 203), (102, 205), (107, 
204) and (109, 202). What are the positions at fractions, say 1/3 and 2/3, 
along the path (linearly interpolating)?

The problem is naturally solved with `diff` and `cumsum0`:

```
import numpy as np
from scipy import interpolate

positions = np.array([[101, 203], [102, 205], [107, 204], [109, 202]], 
dtype=float)
steps_2d = np.diff(positions, axis=0)
steps_1d = np.linalg.norm(steps_2d, axis=1)
distances = np.cumsum0(steps_1d)
fractions = distances / distances[-1]
interpolate_at = interpolate.make_interp_spline(fractions, positions, 1)
interpolate_at(1/3)
interpolate_at(2/3)
```

Please show how to solve the problem with `diff0` and `cumsum`.
------------------------------------------------------------

Both `diff0` and `cumsum` have a fencepost problem, but `diff0` has a second 
defect: it maps an array of positions to a heterogeneous array where one 
element is a position and the rest are displacements. The operations that make 
sense for displacements, like scaling, differ from those that make sense for 
positions.

------------------------------------------------------------
EXAMPLE

Money is invested on 2023-01-01. The annualized rate is 4% until 2023-02-04 and 
5% thence until 2023-04-02. By how much does the money multiply in this time?

The problem is naturally solved with `diff`:

```
import numpy as np

percents = np.array([4, 5], dtype=float)
times = np.array(["2023-01-01", "2023-02-04", "2023-04-02"], 
dtype=np.datetime64)
durations = np.diff(times)
YEAR = np.timedelta64(365, "D")
multipliers = (1 + percents / 100) ** (durations / YEAR)
multipliers.prod()
```

Please show how to solve the problem with `diff0`. It makes sense to divide 
`np.diff(times)` by `YEAR`, but it would not make sense to divide the output of 
`np.diff0(times)` by `YEAR` because of its incongruous initial value.
------------------------------------------------------------
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

Reply via email to