This has come up before; see https://github.com/numpy/numpy/issues/6044 for the first time it was raised. Several subsequent discussions are linked from that issue.
In the meantime, the data APIs consortium has been actively working on adding a `cumulative_sum` function to the array API standard; see https://github.com/data-apis/array-api/issues/597 and https://github.com/data-apis/array-api/pull/653.

The proposed `cumulative_sum` function includes an `include_initial` keyword argument that gets the OP's desired behavior.

I think we should probably eventually deprecate `cumsum` and `cumprod` in favor of the array API standard's `cumulative_sum` and `cumulative_product`, if only because of the embarrassing naming issue.

Once the array API standard has finalized the name for the keyword argument, I think it makes sense to add the keyword argument to `np.cumsum`, even if we don't deprecate it yet. I don't think it makes sense to add a new function just for this.

On Fri, Aug 11, 2023 at 6:34 AM <john.daw...@camlingroup.com> wrote:

> `cumsum` computes the sum of the first k summands for every k from 1.
> Judging by my experience, it is more often useful to compute the sum of the
> first k summands for every k from 0, as `cumsum`'s behaviour leads to
> fencepost-like problems.
> https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error
> For example, `cumsum` is not the inverse of `diff`. I propose adding a
> function to NumPy to compute cumulative sums beginning with 0, that is, an
> inverse of `diff`. It might be called `cumsum0`. The following code is
> probably not the best way to implement it, but it illustrates the desired
> behaviour.
>
> ```
> def cumsum0(a, axis=None, dtype=None, out=None):
>     """
>     Return the cumulative sum of the elements along a given axis,
>     beginning with 0.
>
>     cumsum0 does the same as cumsum except that cumsum computes the sum
>     of the first k summands for every k from 1 and cumsum, from 0.
>
>     Parameters
>     ----------
>     a : array_like
>         Input array.
>     axis : int, optional
>         Axis along which the cumulative sum is computed. The default
>         (None) is to compute the cumulative sum over the flattened
>         array.
>     dtype : dtype, optional
>         Type of the returned array and of the accumulator in which the
>         elements are summed. If `dtype` is not specified, it defaults to
>         the dtype of `a`, unless `a` has an integer dtype with a
>         precision less than that of the default platform integer. In
>         that case, the default platform integer is used.
>     out : ndarray, optional
>         Alternative output array in which to place the result. It must
>         have the same shape and buffer length as the expected output but
>         the type will be cast if necessary. See
>         :ref:`ufuncs-output-type` for more details.
>
>     Returns
>     -------
>     cumsum0_along_axis : ndarray.
>         A new array holding the result is returned unless `out` is
>         specified, in which case a reference to `out` is returned. If
>         `axis` is not None the result has the same shape as `a` except
>         along `axis`, where the dimension is smaller by 1.
>
>     See Also
>     --------
>     cumsum : Cumulatively sum array elements, beginning with the first.
>     sum : Sum array elements.
>     trapz : Integration of array values using the composite trapezoidal
>         rule.
>     diff : Calculate the n-th discrete difference along given axis.
>
>     Notes
>     -----
>     Arithmetic is modular when using integer types, and no error is
>     raised on overflow.
>
>     ``cumsum0(a)[-1]`` may not be equal to ``sum(a)`` for floating-point
>     values since ``sum`` may use a pairwise summation routine, reducing
>     the roundoff-error. See `sum` for more information.
>
>     Examples
>     --------
>     >>> a = np.array([[1, 2, 3], [4, 5, 6]])
>     >>> a
>     array([[1, 2, 3],
>            [4, 5, 6]])
>     >>> np.cumsum0(a)
>     array([ 0,  1,  3,  6, 10, 15, 21])
>     >>> np.cumsum0(a, dtype=float) # specifies type of output value(s)
>     array([ 0.,  1.,  3.,  6., 10., 15., 21.])
>
>     >>> np.cumsum0(a, axis=0) # sum over rows for each of the 3 columns
>     array([[0, 0, 0],
>            [1, 2, 3],
>            [5, 7, 9]])
>     >>> np.cumsum0(a, axis=1) # sum over columns for each of the 2 rows
>     array([[ 0,  1,  3,  6],
>            [ 0,  4,  9, 15]])
>
>     ``cumsum(b)[-1]`` may not be equal to ``sum(b)``
>
>     >>> b = np.array([1, 2e-9, 3e-9] * 1000000)
>     >>> np.cumsum0(b)[-1]
>     1000000.0050045159
>     >>> b.sum()
>     1000000.0050000029
>
>     """
>     empty = a.take([], axis=axis)
>     zero = empty.sum(axis, dtype=dtype, keepdims=True)
>     later_cumsum = a.cumsum(axis, dtype=dtype)
>     return concatenate([zero, later_cumsum], axis=axis, dtype=dtype,
>                        out=out)
> ```
> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: nathan12...@gmail.com
>
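
For reference, the behaviour that an `include_initial`-style keyword is meant to provide can be emulated with functions NumPy already has. The following is only a minimal sketch under stated assumptions, not the array API standard's implementation and not a proposed NumPy API: the helper name `cumsum_with_initial` is made up for illustration, and it leaves out the `out` argument.

```python
import numpy as np


def cumsum_with_initial(a, axis=None, dtype=None):
    """Cumulative sum with a leading zero along `axis`.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    a = np.asarray(a)
    if axis is None:
        # Match np.cumsum: with axis=None the input is flattened first.
        a = a.ravel()
        axis = 0
    sums = np.cumsum(a, axis=axis, dtype=dtype)
    # Build a block of zeros with length 1 along `axis` and prepend it,
    # so the result is one element longer than the input along that axis.
    zero_shape = list(sums.shape)
    zero_shape[axis] = 1
    zeros = np.zeros(zero_shape, dtype=sums.dtype)
    return np.concatenate([zeros, sums], axis=axis)


a = np.array([[1, 2, 3], [4, 5, 6]])
print(cumsum_with_initial(a))          # values 0, 1, 3, 6, 10, 15, 21
print(cumsum_with_initial(a, axis=1))  # rows (0, 1, 3, 6) and (0, 4, 9, 15)

# With the leading zero, np.diff undoes the cumulative sum.
x = np.array([3, 1, 4, 1, 5])
assert np.array_equal(np.diff(cumsum_with_initial(x)), x)
```

Because the zeros are created with the dtype of the `np.cumsum` result, the leading element follows the same dtype promotion as the rest of the output.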