[Numpy-discussion] Loading bit strings

2010-03-05 Thread Dan Lenski
Is there a good way in NumPy to convert from a bit string to a boolean
array?

For example, if I have a 2-byte string s='\xfd\x32', I want to get a
16-length boolean array out of it.

Here's what I came up with:

A = fromstring(s, dtype=uint8)
out = empty(A.size * 8, dtype=bool)
for bit in range(0,8):
  out[bit::8] = A&(1

[Numpy-discussion] using reducing functions without eliminating dimensions?

2009-04-07 Thread Dan Lenski
Hi all,

I often want to use some kind of dimension-reducing function (like min(), 
max(), sum(), mean()) on an array without actually removing the last 
dimension, so that I can then do operations broadcasting the reduced 
array back to the size of the full array.  Full example:

  >> table.shape
  (47, 1814)

  >> table.min(axis=1).shape
  (47,)

  >> table - table.min(axis=1)
  ValueError: shape mismatch: objects cannot be broadcast to a single 
shape

  >> table - table.min(axis=1)[:, newaxis]

I have to resort to ugly code with lots of stuff like "... axis=1)[:, 
newaxis]".

Is there any way to get the reducing functions to leave a size-1 dummy 
dimension in place, to make this easier?

Thanks!

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] using reducing functions without eliminating dimensions?

2009-04-09 Thread Dan Lenski
On Thu, 09 Apr 2009 01:31:33 -0500, Robert Kern wrote:

> On Thu, Apr 9, 2009 at 01:29, Anne Archibald 
> wrote:
> 
>> What's wrong with np.amin(a,axis=-1)[...,np.newaxis]?
> 
> It's cumbersome, particularly when you have axis=arbitrary_axis.

Quite right.  It would nice to be able to say:

np.amin(a, axiskeep=-1)
or
a.min(axiskeep=3)

... or something along those lines.

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Dan Lenski
Hi all,
I'm using NumPy to read and process data from ASCII UCD files.  This is a 
file format for describing unstructured finite-element meshes.

Most of the file consists of rectangular, numerical text matrices, easily 
and efficiently read with loadtxt().  But there is one particularly nasty 
section that consists of matrices with variable numbers of columns, like 
this:

# index property type nodes
1   1tet  620 583 1578 1792
2   1tet  656 551 553 566
3   1tet  1565 766 1600 1646
4   1tet  1545 631 1566 1665
5   1hex  1531 1512 1559 1647 1648 1732
6   1hex  777 1536 1556 1599 1601 1701
7   1quad 296 1568 1535 1604
8   1quad 54 711 285 666

As you might guess, the "type" label in the third column does indicate 
the number of following columns.

Some of my files contain sections like this of *more than 1 million 
lines*, so I need to be able to read them fast.  I have not yet come up
with a good way to do this.  What I do right now is I split them up into 
separate arrays based on the "type" label:

lines = [f.next() for i in range(n)]
lines = [l.split(None, 3) for l in lines]
id, prop, types, nodes = apply(zip, lines) # THIS TAKES /FOREVER/

id = array(id, dtype=uint)
prop = array(id, dtype=uint)
types = array(types)

cells = {}
for t in N.unique(types):
 these = N.nonzero(types==t)
 # THIS NEXT LINE TAKES FOREVER
 these_nodes = array([nodes[ii].split() for ii in these], dtype=uint).T
 cells[t] = N.row_stack(( id[these], prop[these], these_nodes ))

This is really pretty slow and sub-optimal.  Has anyone developed a more 
efficient way to read arrays with variable numbers of columns???

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-14 Thread Dan Lenski
On Thu, 14 Aug 2008 04:40:16 +, Daniel Lenski wrote:
> I assume that list-of-arrays is more memory-efficient since array
> elements don't have the overhead of full-blown Python objects.  But
> list- of-lists is probably more time-efficient since I think it's faster
> to convert the whole array at once than do it row-by-row.
> 
> Dan

Just a follow-up...

Well, I tried the simple, straightforward list-of-lists approach and it's 
the fastest.  About 20 seconds for 1.5 million cells on my machine:

def _read_cells(self, f, n, debug=False):
cells = dict()
for i in xrange(n):
cell = f.readline().split()
celltype = cell.pop(2)
if celltype not in cells: cells[celltype]=[]
cells[celltype].append(cell)
for k in cells:
cells[k] = N.array(cells[k], dtype=int).T
return cells

List-of-arrays uses about 20% less memory, but is about 4-5 times slower 
(presumably due to the overhead of array creation?).

And my preallocation approach is 2-3 times slower than list-of-lists.  
Again, I *think* this is due to array creation/conversion overhead, when 
assigning to a slice of the array:

def _read_cells2(self, f, n, debug=False):
cells = dict()
count = dict()
curtype = None

for i in xrange(n):
cell = f.readline().split()
celltype = cell[2]

if celltype!=curtype:
curtype = celltype
if curtype not in cells:
cells[curtype] = N.empty((n-i, len(cell)-1), type=int)
count[curtype] = 0
block = cells[curtype]
block[count[curtype]] = cell[:2]+cell[3:] ### THIS LINE HERE
count[curtype] += 1

for k in cells:
cells[k] = cells[k][:count[k]].T

return cells

So my conclusion is... you guys are right.  List-of-lists is the fastest 
way to build up an array.  Then do the string-to-numeric and list-to-
array conversion ALL AT ONCE with numpy.array(list_of_lists, dtype=int).

Thanks for all the insight!

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] weights parameter of np.average() doesn't work (BUG?)

2008-08-24 Thread Dan Lenski
Hi all,
Is there a good reason why the weights parameter of np.average() doesn't 
broadcast properly?  This is with the Ubuntu Hardy x86_64 numpy package, 
version 1.0.4.


In [293]: a=arange(100).reshape(10,10)

# Things work fine when weights have the exact same shape as a

In [297]: average(a, axis=1, weights=ones((10,10)))
Out[297]: array([  4.5,  14.5,  24.5,  34.5,  44.5,  54.5,  64.5,  74.5,  
84.5,  94.5])

# Bizarre and incorrect result with length-10 weight array

In [298]: average(a, axis=1, weights=ones(10))
Out[298]: 
array([  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.],
  [ 10.,  11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.],
  [ 20.,  21.,  22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.],
  [ 30.,  31.,  32.,  33.,  34.,  35.,  36.,  37.,  38.,  39.],
  [ 40.,  41.,  42.,  43.,  44.,  45.,  46.,  47.,  48.,  49.],
  [ 50.,  51.,  52.,  53.,  54.,  55.,  56.,  57.,  58.,  59.],
  [ 60.,  61.,  62.,  63.,  64.,  65.,  66.,  67.,  68.,  69.],
  [ 70.,  71.,  72.,  73.,  74.,  75.,  76.,  77.,  78.,  79.],
  [ 80.,  81.,  82.,  83.,  84.,  85.,  86.,  87.,  88.,  89.],
  [ 90.,  91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,  99.]
   )

Doing the weighted-sum explicitly works fine for me:

In [311]: sum(a*ones(10), axis=-1)/sum(ones(10))
Out[311]: array([  4.5,  14.5,  24.5,  34.5,  44.5,  54.5,  64.5,  74.5,  
84.5,  94.5])

This seems like a bug, especially since average.__doc__ states that:
If weights are given, result is:
sum(a * weights,axis) / sum(weights,axis),
where the weights must have a's shape or be 1D with length the
size of a in the given axis. Integer weights are converted to
Float.  Not specifying weights is equivalent to specifying
weights that are all 1.

Frankly, I don't even see why weights is constrained to be 1D or the same 
shape as a... why not anything that's broadcastable to the same shape as a?

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] doing zillions of 3x3 determinants fast

2008-08-25 Thread Dan Lenski
On Sun, 24 Aug 2008 23:49:45 -0600, Charles R Harris wrote:

> On Sun, Aug 24, 2008 at 9:48 PM, Daniel Lenski <[EMAIL PROTECTED]>
> wrote:
> 
>> Hi all,
>> I need to take the determinants of a large number of 3x3 matrices, in
>> order to determine for each of N points, in which of M tetrahedral
>> cells they lie.  I arrange the matrices in an ndarray of shape
>> (N,M,5,3,3).
>>
>>
> If you step back a bit and describe the actual problem you have, rather
> than your current solution, it might be that there are algorithms in
> scipy to solve it.
>

Hi Charles,
I have an irregular/unstructured mesh of tetrahedral cells, and I need to 
interpolate some data over this mesh, at arbitrary points within the 
complete volume.
 
So this is basically 3D interpolation.  I've done some looking into this 
already, and it seems that the only facility for 3D interpolation in 
Scipy is map_coordinates, which only works on a regular grid.

So I have to roll my own interpolation!  I start by trying to figure out 
in which of the tetrahedral cells each interpolation point lies.  The 
standard way to do this is to check that for every three vertices of each 
tetrahedron (v1,v2,v3), the point in question lies on the same side as 
the fourth point (v4).  This can be checked with:

| v1x-xv1y-y   v1z-z | | v1x-v4xv1y-v4y   v1z-v4z |
| v2x-xv2y-y   v2z-z |  =  | v2x-v4xv2y-v4y   v2z-v4z |
| v3x-xv3y-y   v3z-z | | v3x-v4xv3y-v4y   v3z-v4z |

(Here's a description of a nearly identical algorithm: http://
steve.hollasch.net/cgindex/geometry/ptintet.html)

Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] argsort in descending order

2008-09-05 Thread Dan Lenski
On Fri, 05 Sep 2008 07:02:38 -0700, SimonPalmer wrote:

> another newb question I suspect, but is there a way to instruct argsort
> to sort in descending order or should I just sort and reverse?

Just sort and subtract to get the reverse order.  I can think of two 
reasonable ways to do it with no copying:

from numpy import *
a = rand(100)

# this does NOT make a copy of a, simply indexes it backward, and should 
# be very fast
reverseorder = argsort(a[::-1])

# this way doesn't make a copy either, and probably is a little slower
# since it has to do many subtractions
reverseorder = len(a)-1-argsort(a)

HTH,
Dan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion