Re: [Numpy-discussion] numpy where function on different sized arrays

2012-11-24 Thread David Warde-Farley
On Sat, Nov 24, 2012 at 7:08 PM, David Warde-Farley <
d.warde.far...@gmail.com> wrote:

> I think that would lose information as to which value in B was at each
> position. I think you want:
>
>
(premature send, stupid Gmail...)

idx = {}
for i, x in enumerate(A):
    for j, y in enumerate(x):
        if y in B:
            idx.setdefault(y, []).append((i, j))

On the problem size the OP specified, this is about 4x slower than the
NumPy version I posted above. However, with a small modification:

idx = {}
set_b = set(B)  # makes 'if y in B' lookups much faster
for i, x in enumerate(A):
    for j, y in enumerate(x):
        if y in set_b:
            idx.setdefault(y, []).append((i, j))


It actually beats my solution. With inputs:

np.random.seed(0)
A = np.random.random_integers(40, 59, size=(40, 60))
B = np.arange(40, 60)

In [115]: timeit foo_py_orig(A, B)
100 loops, best of 3: 16.5 ms per loop

In [116]: timeit foo_py(A, B)
100 loops, best of 3: 2.5 ms per loop

In [117]: timeit foo_numpy(A, B)
100 loops, best of 3: 4.15 ms per loop

Depending on the specifics of the inputs, a collections.defaultdict could
also help things.
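
A minimal sketch of the collections.defaultdict variant mentioned above. The function name index_by_value is made up for illustration, and the tolist() calls are my own assumption (they turn NumPy scalars into plain Python ints before hashing); no timings are claimed for it.

```python
from collections import defaultdict

import numpy as np


def index_by_value(a, b):
    """Map each value of b that occurs in a to its list of (i, j) positions."""
    wanted = set(b.tolist())   # set membership instead of scanning b each time
    idx = defaultdict(list)    # missing keys start out as empty lists
    for i, row in enumerate(a.tolist()):
        for j, y in enumerate(row):
            if y in wanted:
                idx[y].append((i, j))
    return dict(idx)


small_a = np.array([[1, 2], [2, 3]])
small_b = np.array([2, 3, 9])
result = index_by_value(small_a, small_b)
```

As with the setdefault version, values of b that never occur in a simply do not appear as keys.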


> On Sat, Nov 24, 2012 at 5:23 PM, Daπid  wrote:
>
>> A pure Python approach could be:
>>
>> idx = []
>> for i, x in enumerate(a):
>>     for j, y in enumerate(x):
>>         if y in b:
>>             idx.append((i, j))
>>
>> Of course, it is slow if the arrays are large, but it is very
>> readable, and probably very fast if cythonised.
>>
>>
>> David.
>>
>> On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley
>>  wrote:
>> > M = A[..., np.newaxis] == B
>> >
>> > will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
>> > boolean mask for all the occurrences of B[i] in A.
>> >
>> > If you wanted all the (i, j) pairs for each value in B, you could do
>> > something like
>> >
>> > import numpy as np
>> > from itertools import groupby
>> > from operator import itemgetter
>> >
>> > id1, id2, id3 = np.where(A[..., np.newaxis] == B)
>> > order = np.argsort(id3)
>> > # id3 indexes into B, so B[id3[order]] recovers the matched values
>> > triples_iter = zip(B[id3[order]], id1[order], id2[order])
>> > grouped = groupby(triples_iter, itemgetter(0))
>> > d = dict((b_value, [idx[1:] for idx in indices]) for b_value, indices in grouped)
>> >
>> > Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and
>> > the keys of d are the values in B that occur in A.
>> >
>> >
>> >
>> > On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi <
>> sgo...@staffmail.ed.ac.uk>
>> > wrote:
>> >>
>> >> Hi all
>> >>
>> >> This must have been answered in the past but my google search
>> capabilities
>> >> are not the best.
>> >>
>> >> Given an array A say of dimension 40x60 and given another array/vector
>> B
>> >> of dimension 20 (the values in B occur only once).
>> >>
>> >> What I would like to do is the following which of course does not work
>> (by
>> >> the way doesn't work in IDL either):
>> >>
>> >> indx=where(A == B)
>> >>
>> >> I understand A and B are both of different dimensions. So my question:
>> >> what would the fastest or proper way to accomplish this (I found a
>> solution
>> >> but think is rather awkward and not very scipy/numpy-tonic tough).
>> >>
>> >> Thanks
>> >> --
>> >> The University of Edinburgh is a charitable body, registered in
>> >> Scotland, with registration number SC005336.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy where function on different sized arrays

2012-11-24 Thread David Warde-Farley
I think that would lose information as to which value in B was at each
position. I think you want:



On Sat, Nov 24, 2012 at 5:23 PM, Daπid  wrote:

> A pure Python approach could be:
>
> idx = []
> for i, x in enumerate(a):
>     for j, y in enumerate(x):
>         if y in b:
>             idx.append((i, j))
>
> Of course, it is slow if the arrays are large, but it is very
> readable, and probably very fast if cythonised.
>
>
> David.
>
> On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley
>  wrote:
> > M = A[..., np.newaxis] == B
> >
> > will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
> > boolean mask for all the occurrences of B[i] in A.
> >
> > If you wanted all the (i, j) pairs for each value in B, you could do
> > something like
> >
> > import numpy as np
> > from itertools import groupby
> > from operator import itemgetter
> >
> > id1, id2, id3 = np.where(A[..., np.newaxis] == B)
> > order = np.argsort(id3)
> > # id3 indexes into B, so B[id3[order]] recovers the matched values
> > triples_iter = zip(B[id3[order]], id1[order], id2[order])
> > grouped = groupby(triples_iter, itemgetter(0))
> > d = dict((b_value, [idx[1:] for idx in indices]) for b_value, indices in grouped)
> >
> > Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and
> > the keys of d are the values in B that occur in A.
> >
> >
> >
> > On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi <
> sgo...@staffmail.ed.ac.uk>
> > wrote:
> >>
> >> Hi all
> >>
> >> This must have been answered in the past but my google search
> capabilities
> >> are not the best.
> >>
> >> Given an array A say of dimension 40x60 and given another array/vector B
> >> of dimension 20 (the values in B occur only once).
> >>
> >> What I would like to do is the following which of course does not work
> (by
> >> the way doesn't work in IDL either):
> >>
> >> indx=where(A == B)
> >>
> >> I understand A and B are both of different dimensions. So my question:
> >> what would the fastest or proper way to accomplish this (I found a
> solution
> >> but think is rather awkward and not very scipy/numpy-tonic tough).
> >>
> >> Thanks


Re: [Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Charles R Harris
On Sat, Nov 24, 2012 at 1:30 PM, Gamblin, Todd  wrote:

> So, just FYI, my usage of this is for Rubik, where it's a communication
> latency optimization for the code being mapped to the network.  I haven't
> tested it as an optimization for particular in-core algorithms.  However,
> there was some work on this at LLNL maybe a couple years ago -- I think it
> was for the solvers.  I'll ask around for an example and/or a paper, or
> maybe Travis has examples.
>
> Just from an ease-of-use point of view, though, if you make it simple to
> do zordering, you might see more people using it :).  That's why I wanted
> to get this into numpy.
>
> This brings up another point.  This is pure python, so it won't be
> super-fast.  What's the typical way things are integrated and optimized in
> numpy?  Do you contribute something like this in python first then convert
> to cython/C as necessary?  Or would you want it in C to begin with?
>
>
I think Python is a good start as it allows easy modification and can be
converted to Cython later on. OTOH, if the new function looks settled from
the get go, Cython is an option.



Chuck


Re: [Numpy-discussion] numpy where function on different sized arrays

2012-11-24 Thread Daπid
A pure Python approach could be:

idx = []
for i, x in enumerate(a):
    for j, y in enumerate(x):
        if y in b:
            idx.append((i, j))

Of course, it is slow if the arrays are large, but it is very
readable, and probably very fast if cythonised.


David.

On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley
 wrote:
> M = A[..., np.newaxis] == B
>
> will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
> boolean mask for all the occurrences of B[i] in A.
>
> If you wanted all the (i, j) pairs for each value in B, you could do
> something like
>
> import numpy as np
> from itertools import groupby
> from operator import itemgetter
>
> id1, id2, id3 = np.where(A[..., np.newaxis] == B)
> order = np.argsort(id3)
> # id3 indexes into B, so B[id3[order]] recovers the matched values
> triples_iter = zip(B[id3[order]], id1[order], id2[order])
> grouped = groupby(triples_iter, itemgetter(0))
> d = dict((b_value, [idx[1:] for idx in indices]) for b_value, indices in grouped)
>
> Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and
> the keys of d are the values in B that occur in A.
>
>
>
> On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi 
> wrote:
>>
>> Hi all
>>
>> This must have been answered in the past but my google search capabilities
>> are not the best.
>>
>> Given an array A say of dimension 40x60 and given another array/vector B
>> of dimension 20 (the values in B occur only once).
>>
>> What I would like to do is the following which of course does not work (by
>> the way doesn't work in IDL either):
>>
>> indx=where(A == B)
>>
>> I understand A and B are both of different dimensions. So my question:
>> what would the fastest or proper way to accomplish this (I found a solution
>> but think is rather awkward and not very scipy/numpy-tonic tough).
>>
>> Thanks


Re: [Numpy-discussion] numpy where function on different sized arrays

2012-11-24 Thread David Warde-Farley
M = A[..., np.newaxis] == B

will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
boolean mask for all the occurrences of B[i] in A.
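
A small sketch of what the broadcast comparison produces, under assumed inputs: the random A and the arange B below are stand-ins chosen to match the sizes in the question, not the OP's data.

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randint(40, 60, size=(40, 60))   # assumed sample data, values 40..59
B = np.arange(40, 60)                    # 20 distinct values

# A[..., np.newaxis] has shape (40, 60, 1); comparing against B (shape (20,))
# broadcasts to (40, 60, 20).
M = A[..., np.newaxis] == B

# The slice for one entry of B is the ordinary 2-D mask for that value.
mask_3 = M[..., 3]
rows, cols = np.nonzero(mask_3)          # (i, j) positions where A == B[3]
```

Here every element of A matches exactly one entry of B, so M holds as many True values as A has elements; with real data, values of A that are absent from B give all-False rows along the last axis.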

If you wanted all the (i, j) pairs for each value in B, you could do
something like

import numpy as np
from itertools import groupby
from operator import itemgetter

id1, id2, id3 = np.where(A[..., np.newaxis] == B)
order = np.argsort(id3)
# id3 indexes into B, so B[id3[order]] recovers the matched values
triples_iter = zip(B[id3[order]], id1[order], id2[order])
grouped = groupby(triples_iter, itemgetter(0))
d = dict((b_value, [idx[1:] for idx in indices]) for b_value, indices in grouped)

Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and
the keys of d are the values in B that occur in A.



On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi
wrote:

> Hi all
>
> This must have been answered in the past but my google search capabilities
> are not the best.
>
> Given an array A say of dimension 40x60 and given another array/vector B
> of dimension 20 (the values in B occur only once).
>
> What I would like to do is the following which of course does not work (by
> the way doesn't work in IDL either):
>
> indx=where(A == B)
>
> I understand A and B are both of different dimensions. So my question:
> what would the fastest or proper way to accomplish this (I found a solution
> but think is rather awkward and not very scipy/numpy-tonic tough).
>
> Thanks


Re: [Numpy-discussion] numpy where function on different size

2012-11-24 Thread Siegfried Gonzi
> Message: 6
> Date: Sat, 24 Nov 2012 20:36:45 +
> From: Siegfried Gonzi 
> Subject: [Numpy-discussion] numpy where function on different size
> Hi all
>
> This must have been answered in the past but my google search
> capabilities are not the best.
>
> Given an array A say of dimension 40x60 and given another array/vector B
> of dimension 20 (the values in B occur only once).
>
> What I would like to do is the following which of course does not work
> (by the way doesn't work in IDL either):
>
> indx=where(A == B)
>
> I understand A and B are both of different dimensions. So my question:
> what would the fastest or proper way to accomplish this (I found a
> solution but think is rather awkward and not very scipy/numpy-tonic
> tough).


I should clarify: where(A==B, C, A)

C has the same dimension as B. Basically: everywhere A equals e.g. B[0],
replace it by C[0]; or wherever A equals B[4], replace every occurrence in A by C[4].


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



[Numpy-discussion] numpy where function on different sized arrays

2012-11-24 Thread Siegfried Gonzi
Hi all
 
This must have been answered in the past but my google search capabilities are 
not the best.
 
Given an array A say of dimension 40x60 and given another array/vector B of 
dimension 20 (the values in B occur only once).
 
What I would like to do is the following which of course does not work (by the 
way doesn't work in IDL either):
 
indx=where(A == B)
 
I understand A and B are both of different dimensions. So my question: what 
would be the fastest or proper way to accomplish this (I found a solution but 
I think it is rather awkward and not very scipy/numpy-tonic, though).
 
Thanks
-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Gamblin, Todd
So, just FYI, my usage of this is for Rubik, where it's a communication latency 
optimization for the code being mapped to the network.  I haven't tested it as 
an optimization for particular in-core algorithms.  However, there was some 
work on this at LLNL maybe a couple years ago -- I think it was for the 
solvers.  I'll ask around for an example and/or a paper, or maybe Travis has 
examples.

Just from an ease-of-use point of view, though, if you make it simple to do 
zordering, you might see more people using it :).  That's why I wanted to get 
this into numpy.

This brings up another point.  This is pure python, so it won't be super-fast.  
What's the typical way things are integrated and optimized in numpy?  Do you 
contribute something like this in python first then convert to cython/C as 
necessary?  Or would you want it in C to begin with?

-Todd


On Nov 24, 2012, at 12:17 PM, Aron Ahmadia 
 wrote:

> Todd,
> 
> I am optimistic and I think it would be a good idea to put this in.  A couple 
> previous studies [1] haven't found any useful speedups from in-core 
> applications for Morton-order, and if you have results for real scientific 
> applications using numpy this would not only be great, but the resulting 
> paper would have quite a bit of impact.  I'm sure you're already connected to 
> the right people at LLNL, but I can think of a couple other projects which 
> might be interested in trying this sort of thing out.
> 
> [1] http://www.cs.utexas.edu/~pingali/CS395T/2012sp/papers/co.pdf
> 
> Cheers,
> Aron
> 
> 
> On Sat, Nov 24, 2012 at 8:10 PM, Travis Oliphant  wrote:
> This is pretty cool.  Something like this would be interesting to play 
> with.  There are some algorithms that are faster with z-order arrays.  The 
> code is simple enough and small enough that I could see putting it in NumPy.  
> What do others think?
> 
> -Travis
> 
> 
> 
> On Nov 24, 2012, at 1:03 PM, Gamblin, Todd wrote:
> 
> > Hi all,
> >
> > In the course of developing a network mapping tool I'm working on, I also 
> > developed some python code to do arbitrary-dimensional z-order (morton 
> > order) for ndarrays.  The code is here:
> >
> >   https://github.com/tgamblin/rubik/blob/master/rubik/zorder.py
> >
> > There is a function to put the elements of an array in Z order, and another 
> > one to enumerate an array's elements in Z order.  There is also a ZEncoder 
> > class that can generate Z-codes for arbitrary dimensions and bit widths.
> >
> > I figure this is something that would be generally useful.  Any interest in 
> > having this in numpy?  If so, what should the interface look like and can 
> > you point me to a good spot in the code to add it?
> >
> > I was thinking it might make sense to have a Z-order iterator for ndarrays, 
> > kind of like ndarray.flat.  i.e.:
> >
> >   arr = np.empty([4,4], dtype=int)
> >   arr.flat = range(arr.size)
> >   for elt in arr.zorder:
> >       print(elt, end=' ')
> >   0 4 1 5 8 12 9 13 2 6 3 7 10 14 11 15
> >
> > Or an equivalent to ndindex:
> >
> >   arr = np.empty([4,4], dtype=int)
> >   arr.flat = range(arr.size)
> >   for ix in np.zindex(arr.shape):
> >       print(ix, end=' ')
> >   (0, 0) (1, 0) (0, 1) (1, 1) (2, 0) (3, 0) (2, 1) (3, 1) (0, 2) (1, 2) (0, 3) (1, 3) (2, 2) (3, 2) (2, 3) (3, 3)
> >
> > Thoughts?
> >
> > -Todd
> > __
> > Todd Gamblin, tgamb...@llnl.gov, http://people.llnl.gov/gamblin2
> > CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA

__
Todd Gamblin, tgamb...@llnl.gov, http://people.llnl.gov/gamblin2
CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA



Re: [Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Aron Ahmadia
Todd,

I am optimistic and I think it would be a good idea to put this in.  A
couple previous studies [1] haven't found any useful speedups from in-core
applications for Morton-order, and if you have results for real scientific
applications using numpy this would not only be great, but the resulting
paper would have quite a bit of impact.  I'm sure you're already connected
to the right people at LLNL, but I can think of a couple other projects
which might be interested in trying this sort of thing out.

[1] http://www.cs.utexas.edu/~pingali/CS395T/2012sp/papers/co.pdf

Cheers,
Aron


On Sat, Nov 24, 2012 at 8:10 PM, Travis Oliphant wrote:

> This is pretty cool.  Something like this would be interesting to play
> with.  There are some algorithms that are faster with z-order arrays.
>  The code is simple enough and small enough that I could see putting it in
> NumPy.   What do others think?
>
> -Travis
>
>
>
> On Nov 24, 2012, at 1:03 PM, Gamblin, Todd wrote:
>
> > Hi all,
> >
> > In the course of developing a network mapping tool I'm working on, I
> also developed some python code to do arbitrary-dimensional z-order (morton
> order) for ndarrays.  The code is here:
> >
> >   https://github.com/tgamblin/rubik/blob/master/rubik/zorder.py
> >
> > There is a function to put the elements of an array in Z order, and
> another one to enumerate an array's elements in Z order.  There is also a
> ZEncoder class that can generate Z-codes for arbitrary dimensions and bit
> widths.
> >
> > I figure this is something that would be generally useful.  Any interest
> in having this in numpy?  If so, what should the interface look like and
> can you point me to a good spot in the code to add it?
> >
> > I was thinking it might make sense to have a Z-order iterator for
> ndarrays, kind of like ndarray.flat.  i.e.:
> >
> >   arr = np.empty([4,4], dtype=int)
> >   arr.flat = range(arr.size)
> >   for elt in arr.zorder:
> >       print(elt, end=' ')
> >   0 4 1 5 8 12 9 13 2 6 3 7 10 14 11 15
> >
> > Or an equivalent to ndindex:
> >
> >   arr = np.empty([4,4], dtype=int)
> >   arr.flat = range(arr.size)
> >   for ix in np.zindex(arr.shape):
> >       print(ix, end=' ')
> >   (0, 0) (1, 0) (0, 1) (1, 1) (2, 0) (3, 0) (2, 1) (3, 1) (0, 2) (1, 2) (0, 3) (1, 3) (2, 2) (3, 2) (2, 3) (3, 3)
> >
> > Thoughts?
> >
> > -Todd


Re: [Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Travis Oliphant
This is pretty cool.  Something like this would be interesting to play with.  
There are some algorithms that are faster with z-order arrays.  The code is 
simple enough and small enough that I could see putting it in NumPy.  What do 
others think?

-Travis



On Nov 24, 2012, at 1:03 PM, Gamblin, Todd wrote:

> Hi all,
> 
> In the course of developing a network mapping tool I'm working on, I also 
> developed some python code to do arbitrary-dimensional z-order (morton order) 
> for ndarrays.  The code is here:
> 
>   https://github.com/tgamblin/rubik/blob/master/rubik/zorder.py
> 
> There is a function to put the elements of an array in Z order, and another 
> one to enumerate an array's elements in Z order.  There is also a ZEncoder 
> class that can generate Z-codes for arbitrary dimensions and bit widths.
> 
> I figure this is something that would be generally useful.  Any interest in 
> having this in numpy?  If so, what should the interface look like and can you 
> point me to a good spot in the code to add it?
> 
> I was thinking it might make sense to have a Z-order iterator for ndarrays, 
> kind of like ndarray.flat.  i.e.:
> 
>   arr = np.empty([4,4], dtype=int)
>   arr.flat = range(arr.size)
>   for elt in arr.zorder:
>       print(elt, end=' ')
>   0 4 1 5 8 12 9 13 2 6 3 7 10 14 11 15
> 
> Or an equivalent to ndindex:
> 
>   arr = np.empty([4,4], dtype=int)
>   arr.flat = range(arr.size)
>   for ix in np.zindex(arr.shape):
>       print(ix, end=' ')
>   (0, 0) (1, 0) (0, 1) (1, 1) (2, 0) (3, 0) (2, 1) (3, 1) (0, 2) (1, 2) (0, 3) (1, 3) (2, 2) (3, 2) (2, 3) (3, 3)
> 
> Thoughts?
> 
> -Todd


[Numpy-discussion] Z-ordering (Morton ordering) for numpy

2012-11-24 Thread Gamblin, Todd
Hi all,

In the course of developing a network mapping tool I'm working on, I also 
developed some python code to do arbitrary-dimensional z-order (morton order) 
for ndarrays.  The code is here:

https://github.com/tgamblin/rubik/blob/master/rubik/zorder.py

There is a function to put the elements of an array in Z order, and another one 
to enumerate an array's elements in Z order.  There is also a ZEncoder class 
that can generate Z-codes for arbitrary dimensions and bit widths.

I figure this is something that would be generally useful.  Any interest in 
having this in numpy?  If so, what should the interface look like and can you 
point me to a good spot in the code to add it?

I was thinking it might make sense to have a Z-order iterator for ndarrays, 
kind of like ndarray.flat.  i.e.:

  arr = np.empty([4,4], dtype=int)
  arr.flat = range(arr.size)
  for elt in arr.zorder:
      print(elt, end=' ')
  0 4 1 5 8 12 9 13 2 6 3 7 10 14 11 15

Or an equivalent to ndindex:

  arr = np.empty([4,4], dtype=int)
  arr.flat = range(arr.size)
  for ix in np.zindex(arr.shape):
      print(ix, end=' ')
  (0, 0) (1, 0) (0, 1) (1, 1) (2, 0) (3, 0) (2, 1) (3, 1) (0, 2) (1, 2) (0, 3) (1, 3) (2, 2) (3, 2) (2, 3) (3, 3)
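
For readers unfamiliar with Z-ordering, here is a rough sketch of how such an enumeration can be produced by bit interleaving. It is not the code from rubik/zorder.py: the names morton_code, zindex and zorder are illustrative stand-ins, and the convention that the first axis supplies the lowest bit is an assumption chosen to reproduce the orderings shown above.

```python
import numpy as np


def morton_code(index, bits):
    """Interleave the low `bits` bits of each coordinate into one integer.

    Bit b of coordinate k lands at position b * ndim + k, so the first
    axis contributes the least significant bit of each group.
    """
    ndim = len(index)
    code = 0
    for b in range(bits):
        for k, coord in enumerate(index):
            code |= ((coord >> b) & 1) << (b * ndim + k)
    return code


def zindex(shape):
    """Return all index tuples of `shape`, sorted by Morton code."""
    bits = max((n - 1).bit_length() for n in shape)  # bits needed per axis
    return sorted(np.ndindex(*shape), key=lambda ix: morton_code(ix, bits))


def zorder(arr):
    """Return the elements of `arr` as a list in Z order."""
    return [arr[ix] for ix in zindex(arr.shape)]


demo = np.arange(16).reshape(4, 4)
z_elements = [int(v) for v in zorder(demo)]
z_indices = zindex(demo.shape)
```

Sorting every index by its code is simple but takes O(n log n) time and materialises the whole index list; an iterator of the kind proposed above would presumably generate the codes incrementally instead.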

Thoughts?

-Todd
__
Todd Gamblin, tgamb...@llnl.gov, http://people.llnl.gov/gamblin2
CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA



[Numpy-discussion] Hierarchical vs non-hierarchical ndarray.base and __array_interface__

2012-11-24 Thread Gamblin, Todd
Hi all,

I posted on the change in semantics of ndarray.base here:

https://github.com/numpy/numpy/commit/6c0ad59#commitcomment-2153047

And some folks asked me to post my question to the numpy mailing list.  I've 
implemented a tool for mapping processes in parallel applications to nodes in 
cartesian networks.  It uses hierarchies of numpy arrays to represent the 
domain decomposition of the application, as well as corresponding groups of 
processes on the network.  You can "map" an application to the network using 
assignment through views.  The tool is here if anyone is curious:  
https://github.com/tgamblin/rubik.  I used numpy to implement this because I 
wanted to be able to do mappings for arbitrary-dimensional networks.  Blue 
Gene/Q, for example, has a 5-D network.

The reason I bring this up is because I rely on the ndarray.base pointer and 
some of the semantics in __array_interface__ to translate indices within my 
hierarchy of views.  e.g., if a value is at (0,0) in a view I want to know that 
it's actually at (4,4) in its immediate parent array.
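
To make the index translation concrete, here is a minimal sketch, not the code used in rubik. The helper name parent_index is hypothetical, and it assumes the parent is C-contiguous and the view was made from it by basic slicing. It reads the data address from __array_interface__['data'] and uses the .strides attribute rather than the interface's 'strides' entry, which is None for C-contiguous arrays.

```python
import numpy as np


def parent_index(view, parent, index):
    """Translate `index` in `view` to the matching index in `parent`.

    Assumes `parent` is C-contiguous and `view` shares its memory.
    """
    view_addr = view.__array_interface__['data'][0]
    parent_addr = parent.__array_interface__['data'][0]
    # Byte offset of the requested element from the start of the parent.
    offset = view_addr - parent_addr
    offset += sum(i * s for i, s in zip(index, view.strides))
    # Peel off one parent axis at a time, largest stride first.
    result = []
    for stride in parent.strides:
        result.append(offset // stride)
        offset %= stride
    return tuple(result)


grid = np.arange(64).reshape(8, 8)
sub = grid[4:, 4:]
origin = parent_index(sub, grid, (0, 0))
other = parent_index(sub, grid, (1, 2))
```

Because it needs only addresses and strides, this variant does not depend on what .base points to, but it does rely on the address reported by __array_interface__.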

After looking over the commit I linked to above, I realized I'm actually 
relying on a lot of stuff that's not guaranteed by numpy.  I rely on .base 
pointing to its closest parent, and I rely on __array_interface__.data 
containing the address of the array's memory and its strides.  None of these is 
guaranteed by the API docs:

http://docs.scipy.org/doc/numpy/reference/arrays.interface.html

So I guess I have a few questions:

1. Is translating indices between base arrays and views something that would be 
useful to other people?

2. Is there some better way to do this than using ndarray.base and 
__array_interface__?

3. What's the numpy philosophy on this?  Should views know about their parents 
or not?  They obviously have to know a little bit about their memory, but 
whether or not they know how they were derived from their owning array is a 
different question.  There was some discussion on the vagueness of .base here:


http://thread.gmane.org/gmane.comp.python.numeric.general/51688/focus=51703

But it doesn't look like you're deprecating .base in 1.7, only changing its 
behavior, which I tend to agree is worse than deprecating it.

After thinking about all this, I'm not sure what I would like to happen.  I can 
see the value of not keeping extra references around within numpy, and my 
domain is pretty different from the ways that I imagine people use numpy.  I 
wouldn't have to change my code much to make it work without .base, but I do 
rely on __array_interface__.  If that doesn't include the address and strides, 
I think I'm screwed as far as translating indices goes.

Any suggestions?

Thanks!
-Todd



__
Todd Gamblin, tgamb...@llnl.gov, http://people.llnl.gov/gamblin2
CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA
