Re: [Numpy-discussion] [Suggestion] Labelled Array

Allan Haldane Sat, 13 Feb 2016 09:11:58 -0800

I've had a pretty similar idea for a new indexing function, 'split_classes', which would help in your case. It essentially does


    import numpy as np

    def split_classes(c, v):
        return [v[c == u] for u in np.unique(c)]


Your example could be coded as

    >>> [sum(c) for c in split_classes(label, data)]
    [9, 12, 15]
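For the specific case of a per-label sum, a loop-free variant can be sketched with np.bincount. This is only an illustration, not part of the proposed function; it assumes the labels are non-negative integers, and the names `data`, `label` and `sums` are chosen here to mirror the example in this thread.

```python
import numpy as np

# Same data and labels as the example in this thread.
data = np.arange(9).reshape(3, 3)
label = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 2]])

# bincount adds up the weights that fall into each integer label.
# Both inputs must be 1-D, hence ravel(); the result is floating
# point because weights are used.
sums = np.bincount(label.ravel(), weights=data.ravel())
print(sums)
```

The sums come out as 9, 12 and 15 (as floats). This only covers sums; other reductions would still need something like split_classes.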

I feel I've come across the need for such a function often enough that it might be generally useful to people as part of numpy. The implementation of split_classes above has pretty poor performance because it creates many temporary boolean arrays, so my plan for a PR was to have a speedy version of it that uses a single pass through v.

(I often wanted to use this function on large datasets).
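One way to avoid building a boolean mask per class can be sketched with a stable sort followed by np.split. This is not the single-pass version planned for the PR (sorting costs O(n log n)); it is just an assumption-laden illustration, and the name `split_classes_sorted` is hypothetical.

```python
import numpy as np

def split_classes_sorted(c, v):
    """Group the values of v by the labels in c without per-class masks.

    Hypothetical helper for illustration only. Returns one 1-D array per
    unique label, in the sorted order of the labels.
    """
    c = np.asarray(c).ravel()
    v = np.asarray(v).ravel()
    # A stable sort keeps the original (row-major) order inside each class,
    # matching what v[c == u] would return.
    order = np.argsort(c, kind="stable")
    # counts[i] is the number of elements carrying the i-th unique label.
    _, counts = np.unique(c, return_counts=True)
    # Cut the sorted values at the class boundaries.
    boundaries = np.cumsum(counts)[:-1]
    return np.split(v[order], boundaries)

# Usage with the data from this thread.
data = np.arange(9).reshape(3, 3)
label = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 2]])
groups = split_classes_sorted(label, data)
totals = [int(g.sum()) for g in groups]
```

With the example data, `totals` is [9, 12, 15], the same as the list comprehension over split_classes.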

If anyone has any comments on the idea (good idea, bad idea?), I'd love to hear them.

I have some further notes and examples here: https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21


Allan

On 02/12/2016 09:40 AM, Sérgio wrote:

Hello,

This is my first e-mail, I will try to make the idea simple.

Similar to a masked array, it would be interesting to use a label array to
guide operations.

Ex.:
 >>> x
labelled_array(data =
  [[0 1 2]
  [3 4 5]
  [6 7 8]],
                         label =
  [[0 1 2]
  [0 1 2]
  [0 1 2]])

 >>> sum(x)
array([9, 12, 15])

The operations would create a new axis for label indexing.

You could think of it as a collection of masks, one for each label.
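
The "collection of masks" view can be sketched directly with broadcasting. This is only an illustration of the idea, not an existing labelled_array type; the names `labels`, `masks` and `per_label_sum` are made up for the example.

```python
import numpy as np

data = np.arange(9).reshape(3, 3)
label = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 2]])

labels = np.unique(label)  # shape (k,)
# New leading axis for label indexing:
# masks[i] is True where label == labels[i]; shape (k, 3, 3).
masks = label[np.newaxis, ...] == labels[:, np.newaxis, np.newaxis]
# Zero out everything outside each mask, then reduce over the data axes.
per_label_sum = np.where(masks, data, 0).sum(axis=(1, 2))
```

Here `per_label_sum` holds 9, 12 and 15. The stacked masks take k times the memory of the data, so this form is mainly useful for small numbers of labels.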

I don't know a way to make something like this efficiently without a
loop. Just wondering...

Sérgio.


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion

