Re: [Numpy-discussion] numpy where function on different sized arrays

David Warde-Farley Sat, 24 Nov 2012 16:34:28 -0800

On Sat, Nov 24, 2012 at 7:08 PM, David Warde-Farley <
d.warde.far...@gmail.com> wrote:


> I think that would lose information as to which value in B was at each
> position. I think you want:
>
>
(premature send, stupid Gmail...)

idx = {}
for i, x in enumerate(A):
    for j, y in enumerate(x):
        if y in B:
            idx.setdefault(y, []).append((i, j))

On the problem size the OP specified, this is about 4x slower than the
NumPy version I posted above. However, with a small modification:

idx = {}
set_b = set(B)  # makes the 'if y in B' lookups much faster
for i, x in enumerate(A):
    for j, y in enumerate(x):
        if y in set_b:
            idx.setdefault(y, []).append((i, j))


It actually beats my NumPy solution. The inputs and timings were:

np.random.seed(0)
A = np.random.random_integers(40, 59, size=(40, 60))
B = np.arange(40, 60)

In [115]: timeit foo_py_orig(A, B)
100 loops, best of 3: 16.5 ms per loop

In [116]: timeit foo_py(A, B)
100 loops, best of 3: 2.5 ms per loop

In [117]: timeit foo_numpy(A, B)
100 loops, best of 3: 4.15 ms per loop
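
The three functions timed above are not defined in this message. The following is only a sketch of what they might look like on Python 3, assuming each one simply wraps the corresponding snippet from this thread. The function bodies, the stable sort, and the use of B[k] as the dictionary key are assumptions made for illustration, not the original benchmark code.

```python
from itertools import groupby
from operator import itemgetter

import numpy as np


def foo_py_orig(A, B):
    """Nested loops, testing membership directly against the array B."""
    idx = {}
    for i, x in enumerate(A):
        for j, y in enumerate(x):
            if y in B:  # linear scan of B on every element of A
                idx.setdefault(y, []).append((i, j))
    return idx


def foo_py(A, B):
    """Same loops, but membership is tested against a set built from B."""
    idx = {}
    set_b = set(B)  # hash lookup instead of scanning B
    for i, x in enumerate(A):
        for j, y in enumerate(x):
            if y in set_b:
                idx.setdefault(y, []).append((i, j))
    return idx


def foo_numpy(A, B):
    """Broadcast A against B, then group the matching positions by B index."""
    # id1, id2 are positions in A; id3 is the index into B that matched.
    id1, id2, id3 = np.where(A[..., np.newaxis] == B)
    # Assumption: a stable sort keeps each group's (i, j) pairs in row-major order.
    order = np.argsort(id3, kind="stable")
    triples = zip(id3[order], id1[order], id2[order])
    return {
        B[k]: [(int(i), int(j)) for _, i, j in group]
        for k, group in groupby(triples, itemgetter(0))
    }
```

All three should return the same mapping from each value of B found in A to its list of (i, j) positions, so they can be checked against each other before timing them.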

Depending on the specifics of the inputs, a collections.defaultdict could
also help things.
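
A minimal sketch of the defaultdict variant, for illustration only. The function name index_by_value is hypothetical, and converting the arrays with tolist() first (so the loop works on plain Python ints) is an extra assumption that was not part of the timings above.

```python
from collections import defaultdict

import numpy as np


def index_by_value(A, B):
    """Map each value of B that occurs in A to its list of (i, j) positions."""
    wanted = set(B.tolist())   # plain Python ints, hash lookups
    idx = defaultdict(list)    # a missing key starts out as an empty list
    for i, row in enumerate(A.tolist()):
        for j, y in enumerate(row):
            if y in wanted:
                idx[y].append((i, j))
    return dict(idx)           # hand back an ordinary dict
```

Whether this is actually faster than setdefault depends on the inputs, so it is worth timing on the real data.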


> On Sat, Nov 24, 2012 at 5:23 PM, Daπid <davidmen...@gmail.com> wrote:
>
>> A pure Python approach could be:
>>
>> idx = []
>> for i, x in enumerate(a):
>>         for j, y in enumerate(x):
>>                 if y in b:
>>                         idx.append((i,j))
>>
>> Of course, it is slow if the arrays are large, but it is very
>> readable, and probably very fast if cythonised.
>>
>>
>> David.
>>
>> On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley
>> <d.warde.far...@gmail.com> wrote:
>> > M = A[..., np.newaxis] == B
>> >
>> > will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
>> > boolean mask for all the occurrences of B[i] in A.
>> >
>> > If you wanted all the (i, j) pairs for each value in B, you could do
>> > something like
>> >
>> > import numpy as np
>> > from itertools import groupby
>> > from operator import itemgetter
>> >
>> > # id1, id2 are positions in A; id3 is the matching index into B
>> > id1, id2, id3 = np.where(A[..., np.newaxis] == B)
>> > order = np.argsort(id3)  # groupby needs equal keys to be adjacent
>> > triples_iter = zip(id3[order], id1[order], id2[order])  # izip on Python 2
>> > grouped = groupby(triples_iter, itemgetter(0))
>> > d = dict((B[k], [idx[1:] for idx in indices]) for k, indices in grouped)
>> >
>> > Then d[value] is a list of all the (i, j) pairs where A[i, j] == value,
>> > and the keys of d are every value in B.
>> >
>> >
>> >
>> > On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi
>> > <sgo...@staffmail.ed.ac.uk> wrote:
>> >>
>> >> Hi all
>> >>
>> >> This must have been answered in the past, but my Google search
>> >> capabilities are not the best.
>> >>
>> >> Given an array A, say of dimension 40x60, and another array/vector B
>> >> of dimension 20 (the values in B occur only once).
>> >>
>> >> What I would like to do is the following, which of course does not work
>> >> (by the way, it doesn't work in IDL either):
>> >>
>> >> indx=where(A == B)
>> >>
>> >> I understand A and B are of different dimensions. So my question:
>> >> what would be the fastest or proper way to accomplish this? (I found a
>> >> solution but think it is rather awkward and not very scipy/numpy-tonic,
>> >> though.)
>> >>
>> >> Thanks
>> >> --
>> >> The University of Edinburgh is a charitable body, registered in
>> >> Scotland, with registration number SC005336.
>> >>
>> >> _______________________________________________
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion@scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion

