Hi Sebastian,

Interestingly, I was recently having conversations with Iason Krommydas
about segmented reductions. He wants them to make reductions in
awkward-array more efficient, so this is also interesting from the
ragged-array perspective.
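For readers following along, here is what a segmented reduction over a ragged array looks like with plain NumPy today. This is only a sketch: the flat-content-plus-offsets layout is an assumption about how such arrays are stored, not awkward-array's actual API.

```python
import numpy as np

# A ragged array stored as flat content plus offsets:
# rows [[0, 1, 2], [3, 4], [5, 6, 7, 8]].
content = np.arange(9)
offsets = np.array([0, 3, 5, 9])

# Per-row sums via the existing reduceat, reducing between consecutive
# start offsets.  This works here, but only because no row is empty --
# empty segments are exactly the corner case discussed below.
row_sums = np.add.reduceat(content, offsets[:-1])
print(row_sums)  # [ 3  7 26]
```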
Nathan

On Fri, Feb 27, 2026 at 3:56 AM Sebastian Berg <[email protected]> wrote:
> On Sat, 2024-11-23 at 20:03 -0500, Marten van Kerkwijk wrote:
> > Hi All,
> >
> > This discussion about updating reduceat went silent, but recently I
> > came back to my PR to allow `indices` to be a 2-dimensional array of
> > start and stop values (or a tuple of separate start and stop
> > arrays).  I thought a bit more about it and think it is the easiest
> > way to extend the present definition.  So, I have added some tests
> > and documentation and would now like to open it for proper
> > discussion.  See
> >
> > https://github.com/numpy/numpy/pull/25476
>
> And another try much later :).
>
> I think it would be nice to revive this old PR. This discussion (and
> more, plus my old attempts to use it a long time ago) has convinced me
> that any move forward would be welcome.
>
> But my opinion on the precise direction has changed a bit :).
>
> I would prefer to introduce a new `ufunc.segmented_reduce` or
> `reduce_segmented`.  My reasoning is that it is more descriptive, and
> overloading `reduceat` with multiple ways of using it seems
> potentially confusing and, to me, awkward long term.  A new ufunc
> method seems cheap API-surface-wise.
>
> The term probably comes from the HPC world (e.g. CUDA CUB uses it).
> One caveat is that it got adopted as `segment_sum` into e.g. JAX via
> TensorFlow, but they at least have an API that doesn't quite look like
> a segmented reduce anymore (more a `ufunc.at`/map-reduce/reduce by
> key, although possibly with limitations that make it more an
> implementation detail).
>
> So if anyone has thoughts on this, I would be interested!  And
> otherwise it's a heads-up that I think we may push for this over the
> current overloading of `reduceat`.
> Cheers,
>
> Sebastian
>
> > From the examples there:
> >
> > ```
> > a = np.arange(12)
> > np.add.reduceat(a, ([1, 3, 5], [2, -1, 0]))
> > # array([ 1, 52,  0])
> > np.minimum.reduceat(a, ([1, 3, 5], [2, -1, 0]), initial=10)
> > # array([ 1,  3, 10])
> > np.minimum.reduceat(a, ([1, 3, 5], [2, -1, 0]))
> > # ValueError: empty slice encountered with reduceat operation for
> > # 'minimum', which does not have an identity. Specify 'initial'.
> > ```
> >
> > Let me know what you all think,
> >
> > Marten
> >
> > p.s. Rereading the thread, I see we discussed initial vs. default
> > values in some detail.  This is interesting, but somewhat orthogonal
> > to the PR, since it just copies behaviour already present for
> > reduce.
> >
> > _______________________________________________
> > NumPy-Discussion mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: [email protected]
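For context on why start/stop pairs are attractive: with today's `reduceat`, a one-dimensional `indices` array reduces over `a[indices[i]:indices[i+1]]`, and whenever `indices[i] >= indices[i+1]` the "reduction" silently degenerates to the single element `a[indices[i]]` rather than an empty segment. The proposed semantics can be emulated today with `ufunc.reduce`; the `segment_reduce` helper below is hypothetical, not NumPy API, and is only a sketch of the behaviour in the PR's examples.

```python
import numpy as np

a = np.arange(12)

# Current reduceat: indices [5, 3] do not raise.  The first "segment"
# a[5:3] degenerates to the single element a[5], and the last segment
# runs to the end of the array (a[3:], summing to 63).
print(np.add.reduceat(a, [5, 3]))  # [ 5 63]

def segment_reduce(ufunc, a, starts, stops, **kwargs):
    """Hypothetical start/stop reduction, emulated with ufunc.reduce.

    Negative stops follow slice semantics, as in the PR's examples;
    an empty segment yields the ufunc's identity (here, 0 for add).
    """
    return np.array([ufunc.reduce(a[start:stop], **kwargs)
                     for start, stop in zip(starts, stops)])

# Matches the quoted example: a[1:2], a[3:-1], and the empty a[5:0].
print(segment_reduce(np.add, a, [1, 3, 5], [2, -1, 0]))  # [ 1 52  0]
```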
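As a footnote on the reduce-by-key style mentioned above: TensorFlow/JAX-style `segment_sum` takes a per-element segment id rather than slice boundaries, which in NumPy terms is closer to `ufunc.at` than to `reduceat`. A minimal sketch of that shape of API, using plain NumPy:

```python
import numpy as np

# Reduce-by-key: every element carries a segment id, rather than
# segments being contiguous slices of the input.
data = np.array([10, 20, 30, 40, 50])
segment_ids = np.array([0, 0, 1, 2, 2])

out = np.zeros(3, dtype=data.dtype)
np.add.at(out, segment_ids, data)   # unbuffered scatter-add, i.e. ufunc.at
print(out)  # [30 30 90]
```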
