Re: [Numpy-discussion] recommended way to run numpy on snow leopard

2009-10-21 Thread Zachary Pincus
> Wow.  Once again, Apple makes using python unnecessarily difficult.
> Someone needs a whack with a clue bat.

Well, some tools from the operating system use numpy and other python  
modules. And upgrading one of these modules might conceivably break  
that dependency, leading to breakage in the OS. So Apple's design goal  
is to keep their tools working right, and absent any clear standard  
for package management in python that allows for side-by-side  
installation of different versions of the same module, this is  
probably the best way to go from their perspective.

Personally, I just install a hand-built python in /Frameworks. This is  
very easy, and it is also where the python.org python goes, so 3rd- 
party installers with hard-coded paths (boo) still work.

Zach
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Zachary Pincus
Unless I read your request or the documentation wrong, h5py already  
supports pulling specific fields out of "compound data types":

http://h5py.alfven.org/docs-1.1/guide/hl.html#id3

> For compound data, you can specify multiple field names alongside  
> the numeric slices:
> >>> dset["FieldA"]
> >>> dset[0,:,4:5, "FieldA", "FieldB"]
> >>> dset[0, ..., "FieldC"]

Is this latter style of access what you were asking for? (Or is the  
problem that it's not fast enough in hdf5, even with the shuffle  
filter, etc?)

So then the issue is that there's a dependency on hdf5 and h5py? (or  
if you want to access LZF-compressed files without h5py, a dependency  
on hdf5 and the C LZF compressor?). This is pretty lightweight,  
especially if you're proposing writing new code which itself would be  
a dependency. So your new code couldn't depend on *anything* else if  
you wanted it to be a fewer-dependencies option than hdf5+h5py, right?

Zach
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ctypes and numpy

2009-11-07 Thread Zachary Pincus
Check out this thread:
http://www.mail-archive.com/numpy-discuss...@lists.sourceforge.net/msg01154.html

In short, it can be done, but it can be tricky to make sure you don't  
leak memory. A better option if possible is to pre-allocate the array  
with numpy and pass that buffer into the C code -- but that's not  
always possible...
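
For what it's worth, here is a minimal sketch of the pre-allocation approach. The library name and the fill_buffer signature below are invented purely for illustration -- substitute your own driver's calls:

import ctypes
import numpy

# Hypothetical driver library and function, for illustration only.
lib = ctypes.CDLL('libcamera.so')
lib.fill_buffer.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

# Pre-allocate with numpy and hand the raw buffer to C: no copy is made,
# and numpy (not the C code) owns and eventually frees the memory.
arr = numpy.empty((512, 512), dtype=numpy.uint16)
lib.fill_buffer(arr.ctypes.data_as(ctypes.c_void_p), arr.nbytes)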

Zach


On Nov 6, 2009, at 10:18 PM, Chris Colbert wrote:

> Are you bound to using ctypes as the only option?
>
> This could be done quite easily in Cython...
>
> On Fri, Nov 6, 2009 at 1:57 PM, Trevor Clarke   
> wrote:
>> I've found some info on ctypes and numpy but it mostly tells me how  
>> to pass
>> numpy objects to C funcs. I've got a C func which returns a  
>> c_void_p to a
>> buffer and I'd like to access the data using numpy without creating  
>> a copy
>> of the data. Could someone point me in the right direction?
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] neighborhood iterator

2009-11-23 Thread Zachary Pincus
Hi,

Having just written some cython code to iterate a neighborhood across  
an array, I have some ideas about features that would be useful for a  
general frame. Specifically, being able to pass in a "footprint"  
boolean array to define the neighborhood is really useful in many  
contexts. Also useful is the ability to query the offset of the  
current pixel from the center of the neighborhood. (These two  
features, plus very efficient handling of boundary conditions by  
breaking the image into regions where the conditions are and are not  
required, make the image iterators in ITK really nice to use.)
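
(For comparison, scipy.ndimage.generic_filter already takes such a footprint argument; a rough sketch of the sort of usage I mean, here a median over a plus-shaped neighborhood:

import numpy
from scipy import ndimage

img = numpy.random.rand(64, 64)
# Boolean footprint selecting which neighbors participate (a plus shape).
footprint = numpy.array([[0, 1, 0],
                         [1, 1, 1],
                         [0, 1, 0]], dtype=bool)
# The filter function receives the footprint-selected neighborhood values
# as a 1-D array for each pixel; boundary handling is chosen via 'mode'.
result = ndimage.generic_filter(img, numpy.median, footprint=footprint,
                                mode='nearest')

The point would be to get the same conveniences without generic_filter's per-pixel Python-call overhead.)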

Zach


On Nov 23, 2009, at 9:12 AM, Nadav Horesh wrote:

>
> Thank you, this is a start. It seems that there are more issues to  
> resolve. I am trying to make a general frame that would enable one  
> to write filters using this iterator.
>
>
>   Nadav
>
> -Original Message-
> From: numpy-discussion-boun...@scipy.org on behalf of David Warde- 
> Farley
> Sent: Mon 23-Nov-09 03:21
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] neighborhood iterator
>
>
> On 22-Nov-09, at 12:50 PM, Nadav Horesh wrote:
>
>>
>> I wonder if the neighbourhood iterator can be used as a more
>> efficient replacement for ndimage.generic_filter. Is there a way to
>> use it from cython?
>
> Yes, using the NumPy C API, called like any other C function is from
> Cython. Something like:
>
> ##
>
> import numpy as np
> cimport numpy as np
>
> cdef extern from "numpy/arrayobject.h":
>     object PyArray_NeighborhoodIterNew(object iter, np.npy_intp bounds,
>                                        int mode, object, np.ndarray fill_value)
>     int PyArrayNeighborhoodIter_Next(object iter)
>     int PyArrayNeighborhoodIter_Reset(object iter)
>
> ##
>
> should do the trick.
>
> Note that you'll need to call np.import_array() before using any of
> these functions to initialize the C API, I think.
>
> David
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] convert strides/shape/offset into nd index?

2009-11-29 Thread Zachary Pincus
Hi all,

I'm curious as to what the most straightforward way is to convert an  
offset into a memory buffer representing an arbitrarily strided array  
into the nd index into that array. (Let's assume for simplicity that  
each element is one byte...)

Does sorting the strides from largest to smallest and then using  
integer division and mod (in the obvious way) always work? (I can't  
seem to find a counterexample, but I am not used to thinking too  
deeply about bizarrely-strided configurations.) Is there a method that  
doesn't involve sorting?
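
To make the question concrete, here is a sketch of the divmod approach I mean (offsets counted in elements rather than bytes; as noted, I'm not sure it is valid for arbitrary strides):

import numpy

def offset_to_index(offset, shape, strides):
    # Peel off the largest strides first: integer-divide by each stride
    # and carry the remainder to the next axis.
    order = numpy.argsort(strides)[::-1]
    index = [0] * len(shape)
    for axis in order:
        index[axis], offset = divmod(offset, strides[axis])
    return tuple(index)

a = numpy.arange(24).reshape(2, 3, 4)
element_strides = [s // a.itemsize for s in a.strides]
print(offset_to_index(17, a.shape, element_strides))   # -> (1, 1, 1)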

Thanks,
Zach
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] convert strides/shape/offset into nd index?

2009-12-01 Thread Zachary Pincus
> Not to be a downer, but this problem is technically NP-complete. The
> so-called "knapsack problem" is to find a subset of a collection of
> numbers that adds up to the specified number, and it is NP-complete.
> Unfortunately, it is exactly what you need to do to find the indices
> to a particular memory location in an array of shape (2,2,...,2).

Ha ha, right -- that is the knapsack problem isn't it.
Oh well... I'll just require fortran- or C-style strided arrays, for  
which case it is easy to unravel offsets into indices.
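
(If I remember right, numpy.unravel_index does exactly this for C-ordered arrays, e.g.:

import numpy
print(numpy.unravel_index(17, (2, 3, 4)))   # element offset 17 -> (1, 1, 1)
)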

Thanks everyone!

Zach
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed?

2008-03-24 Thread Zachary Pincus
Hi all,

> I looked at line 21902  of dlapack_lite.c, it is,
>
>for (niter = iter; niter <= 20; ++niter) {
>
> Indeed the upper limit for iterations in the
> linalg.svd code is set for 20.  For now I will go with
> my method (on earlier post) of squaring the matrix and
> then doing svd when the original try on the original
> matrix throws the linalg.linalg.LinAlgError.  I do not
> claim that this is a cure-all.  But it seems to work
> fast and avoids the original code from thrashing
> around in a long iteration.
>
> I would suggest this be made explicit in the NumPy
> documentation and then the user be given the option to
> reset the limit on the number of iterations.
>
> Well, it certainly shouldn't be hardwired in as 20. At minimum it  
> should be a #define, and ideally it should be passed in with the  
> function call, but I don't know if the interface allows that.

I just wanted to mention that this particular issue has bitten me in  
the past too. It would be nice to be able to have a bit more control  
over the SVD iteration limit either at compile-time, or run-time.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
Hi folks,

I'm working on a live-video display for some microscope control tools  
I'm building. For this, I need a  fast histogram function to work on  
large-ish images (1000x2000 or so) at video rate, with cycles left  
over for more interesting calculations (like autofocus).

Now, numpy.histogram is a bit slower than I'd like, probably because  
it's pretty general (and of course cf. the recent discussion about its  
speed). I just need even bins within a set range. This is easy enough  
to do with a C-extension, or perhaps even cython, but before I go  
there, I was wondering if there's a numpy function that can help.

Here's what I have in mind:

def histogram(arr, bins, range):
    min, max = range
    indices = numpy.clip(((arr.astype(float) - min) * bins /
                          (max - min)).astype(int), 0, bins-1)
    histogram = numpy.zeros(bins, numpy.uint32)
    for i in indices.flat:
        histogram[i] += 1
    return histogram

Now, clearly, the last loop is what needs speeding up. Are there any  
numpy functions that can do this kind of operation? Also perhaps  
unnecessarily slow is the conversion of 'arr' to a float -- I do this  
to avoid overflow issues with integer arrays.

Any advice? Should I go ahead and write this up in C (easy enough), or  
can I do this in numpy? Probably the indices-computation line I'll  
speed up with numexpr, if I use a pure-numpy approach.

Thanks,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
Hi,

> How about a combination of sort, followed by searchsorted right/left  
> using the bin boundaries as keys? The difference of the two  
> resulting vectors is the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print lind[1:] - lind[:-1]
> [10 40 20 30]
>
> This won't be as fast as a c implementation, but at least avoids the  
> loop.

This is, more or less, what the current numpy.histogram does, no? I  
was hoping to avoid the O(n log n) sorting, because the image arrays  
are pretty big, and numpy.histogram doesn't get close to video rate  
for me...

Perhaps, though, some of the slow-down from numpy.histogram is from  
other overhead, and not the sorting. I'll try this, but I think I'll  
probably just have to write the c loop...

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
Hi, and thanks for the suggestion!

> How many bits per pixel does your camera actually generate !?
> If its for example a 12 bit camera, you could just fill in directly
> into 4096 preallocated bins.
> You would not need any sorting !!
> That's what I did for a 16 bit camera -- but I wrote it in C and I had
> 4 cameras at 30 Hz.

That approach avoids the bin-index calculation line:
indices = numpy.clip(((array.astype(float) - min) * bins /
                      (max - min)).astype(int), 0, bins-1)

But even if indices = array, one still needs to do something like:
for index in indices: histogram[index] += 1

Which is slow in python and fast in C.

I'm guessing that there's no utility function in numpy that does a  
loop like this? If so, that would be handy, but if not, I think I need  
to dig out the numpy book and write a little extension...

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
>> But even if indices = array, one still needs to do something like:
>> for index in indices: histogram[index] += 1
>>
>> Which is slow in python and fast in C.
>>
>>
> I thought of a broadcasting approach... what are the chances that a  
> simple
>
> bins[:] = 0
> bins[ img.flat ] += 1

That doesn't work cumulatively, it seems. Some bins get a value of  
'1', but it never gets incremented past that.

It was worth a try, though... in some cases in-place updating seems to  
work this way (which is usually a source of confusion on this list,  
not the desired result!)
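
For the record, a quick demonstration of the behavior:

import numpy
bins = numpy.zeros(4, dtype=numpy.uint32)
idx = numpy.array([1, 1, 1, 2])
bins[idx] += 1
print(bins)   # -> [0 1 1 0]; the repeated index 1 is only incremented once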

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
Hello,

>> But even if indices = array, one still needs to do something like:
>> for index in indices: histogram[index] += 1
> numpy.bincount?

That is indeed what I was looking for! I knew I'd seen such a function.

However, the speed is a bit disappointing. I guess the sorting isn't  
too much of a penalty:

def histogram(array, bins, range):
    min, max = range
    indices = numpy.clip(((array.astype(float) - min) * bins /
                          (max - min)).astype(int), 0, bins-1).flat
    return numpy.bincount(indices)

import numexpr
def histogram_numexpr(array, bins, range):
    min, max = range
    min = float(min)
    max = float(max)
    indices = numexpr.evaluate('(array - min) * bins / (max - min)')
    indices = numpy.clip(indices.astype(int), 0, bins-1).flat
    return numpy.bincount(indices)

 >>> arr.shape
(1300, 1030)

 >>> timeit numpy.histogram(arr, 12, [0, 5000])
10 loops, best of 3: 99.9 ms per loop

 >>>  timeit histogram(arr, 12, [0, 5000])
10 loops, best of 3: 127 ms per loop

 >>> timeit histogram_numexpr(arr, 12, [0, 5000])
10 loops, best of 3: 109 ms per loop

 >>>  timeit numpy.histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 111 ms per loop

 >>>  timeit histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 127 ms per loop

 >>> timeit histogram_numexpr(arr, 5000, [0, 5000])
10 loops, best of 3: 108 ms per loop

So, they're all quite close, and it seems that numpy.histogram is the  
definite winner. Huh. I guess I will have to go to C or maybe weave to  
get up to video-rate, unless folks can suggest some further  
optimizations...

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast histogram

2008-04-17 Thread Zachary Pincus
Combining Sebastian and Jae-Joon's suggestions, I have something that  
might work:

 >>> timeit numpy.bincount(array.flat)
10 loops, best of 3: 28.2 ms per loop

This is close enough to video-rate... And I can then combine bins as  
needed to get a particular bin count/range after the fact.
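
For reference, here is a sketch of the regrouping step, assuming non-negative integer pixel values and a range that starts at zero:

import numpy

def coarse_histogram(arr, bins, max_value):
    counts = numpy.bincount(arr.flat)            # one bin per integer value
    if len(counts) < max_value:                  # pad out to the full range
        pad = numpy.zeros(max_value - len(counts), dtype=counts.dtype)
        counts = numpy.concatenate([counts, pad])
    edges = numpy.linspace(0, max_value, bins + 1).astype(int)[:-1]
    return numpy.add.reduceat(counts[:max_value], edges)

# e.g. coarse_histogram(arr, 12, 5000)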

Thanks, everyone,

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] "aligned" matrix / ctypes

2008-04-23 Thread Zachary Pincus
Hello all,

I need to allocate a numpy array that I will then pass to a camera  
driver (via ctypes) so that the driver can fill the array with pixels.  
The catch is that the driver requires that rows of pixels start at 4- 
byte boundaries.

The sample C++ code given for allocating memory for this is (pixels  
are unsigned shorts):

// Two bytes for each pixel, then round
// up to the next multiple of four.
long width_bytes = ( ( 2 * width_pixels ) + 3 ) & -4;
long allocated_size = width_bytes * height;
unsigned char* image_data = new unsigned char[allocated_size];

I could make an empty uint8 numpy array of the required shape  
(allocated_size,) and adjust its dtype, shape, and strides to get the  
desired result. I'd then feed the array's ctypes data attribute to the  
driver to get filled in. Alternately I could allocate the data buffer  
via ctypes and then construct an array around it.

Is either option better? How does one construct a numpy array around a  
ctypes memory object? Can the array "take over" the memory management  
for the buffer?

Thanks for any suggestions,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "aligned" matrix / ctypes

2008-04-23 Thread Zachary Pincus
Hi,

Thanks a ton for the advice, Robert! Taking an array slice (instead of  
trying to set up the strides, etc. myself) is a slick way of getting  
this result indeed.

>> I need to allocate a numpy array that I will then pass to a camera
>> driver (via ctypes) so that the driver can fill the array with  
>> pixels.
>> The catch is that the driver requires that rows of pixels start at 4-
>> byte boundaries.
>>
>> The sample C++ code given for allocating memory for this is (pixels
>> are unsigned shorts):
>>
>> // Two bytes for each pixel, then round
>> // up to the next multiple of four.
>> long width_bytes = ( ( 2 * width_pixels ) + 3 ) & -4;
>> long allocated_size = width_bytes * height;
>> unsigned char* image_data = new unsigned char[allocated_size];

> Note that the approach above doesn't ensure that the first row is
> correctly aligned. It just assumes that the allocator will always
> start a new block aligned at 4 bytes (which may be reasonable for the
> platforms you are targetting).

Hmm, good point. The SDK/example code is pretty flakey, so I bet they  
just make this assumption.

> To solve the initial alignment, you overallocate a 1D array by 3 bytes
> and find the offset from the allocated initial address which is
> correctly aligned.

Sorry -- I haven't had to ever concern myself with alignment before  
this, so I'm not sure how to go about this step. Do I just look at the  
raw pointer address and see what offset I need to give it to get it to  
be divisible by four?

> Slice out [:allocated_size] portion of this,
> .view() it as uint16, reshape it to
> (height,width_pixels+width_pixels%2), then slice out
> [:,:width_pixels].

Everything else makes good sense. Thanks again!
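
For my own notes, here is how I read the complete recipe as a sketch (the width and height are arbitrary):

import numpy

width_pixels, height = 31, 20
width_bytes = ((2 * width_pixels) + 3) & -4      # round each row up to 4 bytes
allocated_size = width_bytes * height

# Overallocate by 3 bytes, then find the offset of the first 4-byte boundary
# by looking at the raw buffer address.
raw = numpy.empty(allocated_size + 3, dtype=numpy.uint8)
offset = (4 - raw.ctypes.data % 4) % 4
aligned = raw[offset:offset + allocated_size]

# Reinterpret as uint16, reshape to padded rows, slice off the padding column.
img = aligned.view(numpy.uint16).reshape(height, width_bytes // 2)[:, :width_pixels]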

Zach

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] problem with view() and strided arrays?

2008-04-24 Thread Zachary Pincus
Hello all,

I'm writing up a general function to allocate aligned numpy arrays  
(I'll post it shortly, as Anne suggested that such a function would be  
useful).

However, I've run into trouble with using ndarray.view() in odd corner- 
cases:
In : numpy.__version__
Out: '1.1.0.dev5077'

In : a = numpy.ones((3,8),dtype=numpy.uint8)
In : a.view(numpy.uint16)
Out:
array([[257, 257, 257, 257],
       [257, 257, 257, 257],
       [257, 257, 257, 257]], dtype=uint16)

In : a = numpy.ones((3,9),dtype=numpy.uint8)
In : a[:,1:].view(numpy.uint16)
ValueError: new type not compatible with array.

In : a[:,:-1].view(numpy.uint16)
ValueError: new type not compatible with array.

In : numpy.array(a[:,:-1]).view(numpy.uint16)
Out:
array([[257, 257, 257, 257],
       [257, 257, 257, 257],
       [257, 257, 257, 257]], dtype=uint16)

It seems like 'view' won't create a view where the stride length is  
not a whole multiple of the dtype's itemsize. However, since strides  
are measured in bytes (right?), this shouldn't be a problem.

Is this a minor bug? Or am I woefully misunderstanding the issue  
involved here?

Thanks,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] problem with view() and strided arrays?

2008-04-24 Thread Zachary Pincus
> It seems like 'view' won't create a view where the stride length is
> not a whole multiple of the dtype's itemsize. However, since strides
> are measured in bytes (right?), this shouldn't be a problem.


Actually -- it seems like view() doesn't work with strided arrays at  
all. (?)

In : a = numpy.ones((4,32), dtype=numpy.uint8)

In : a.view(numpy.uint16).shape
Out: (4, 16)

In : a[:,:16].view(numpy.uint16)
ValueError: new type not compatible with array.

I think this might be a recent regression, because before I updated my  
numpy installation to the latest SVN version (to check if the bug was  
fixed!), I'm pretty sure this kind of operation worked.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "aligned" matrix / ctypes

2008-04-24 Thread Zachary Pincus

Hi all,


It's worth mentioning that there was some discussion of adding an
allocator for aligned arrays. It got sidetracked into a discussion of
SSE support for numpy, but there are all sorts of reasons to want
aligned arrays in numpy, as this post demonstrates. Seeing as it's
just a few lines worth of pure python, is anyone interested in writing
an aligned_empty() for numpy?


Here's my attempt at aligned_empty(). It only accepts alignments that  
are an integer multiple (or divisor) of the dtype's itemsize, because  
of limitations in ndarray.view().


I've also attached aligned_empty_arbitrary(), which is similar, but  
allows any bizarro alignment requested. Or at least it would, if  
ndarray.view() were able to handle strided arrays.


It could probably be tightened a bit -- I tried to make the code err  
on the "demonstrative" side rather than "terse".


Here are some basic tests:

import aligned_allocator as aa

a = aa.aligned_empty((9,7), dtype=numpy.uint8, byte_alignment=64, order='C')
for i in range(a.shape[0]):
    assert(a[i,:].ctypes.data % 64 == 0)

a = aa.aligned_empty((9,7), dtype=numpy.float32, byte_alignment=16, order='C')
for i in range(a.shape[0]):
    assert(a[i,:].ctypes.data % 16 == 0)

a = aa.aligned_empty((9,7), dtype=numpy.uint16, byte_alignment=32, order='F')
for i in range(a.shape[1]):
    assert(a[:,i].ctypes.data % 32 == 0)

Best,
Zach



 
import numpy

def aligned_empty(shape, dtype, byte_alignment, order='C'):
  dtype = numpy.dtype(dtype)
  if byte_alignment % dtype.itemsize != 0 and dtype.itemsize % byte_alignment != 0:
    raise ValueError('Invalid alignment.')
  order = order.upper()
  if order == 'F':
    shape = shape[::-1]
  elif order != 'C':
    raise ValueError("Order must be 'C' or 'F'.")
  fastest_shape = shape[-1]
  other_shape = shape[:-1]
  element_alignment = byte_alignment / dtype.itemsize
  if element_alignment == 0:
    aligned_width = fastest_shape
  else:
    aligned_width = fastest_shape + (element_alignment - fastest_shape % element_alignment) % element_alignment
  other_dims = numpy.multiply.reduce(other_shape)
  total_bytes = aligned_width * other_dims * dtype.itemsize
  aligned_total_bytes = total_bytes + (byte_alignment - 1)
  buffer_array = numpy.empty(aligned_total_bytes, dtype=numpy.uint8)
  buffer_address = buffer_array.__array_interface__['data'][0]
  buffer_offset = (byte_alignment - buffer_address % byte_alignment) % byte_alignment
  aligned_buffer = buffer_array[buffer_offset:(total_bytes + buffer_offset)].view(dtype)
  aligned_shape = tuple(other_shape) + (aligned_width,)
  aligned_view = aligned_buffer.reshape(aligned_shape, order='C')
  requested_view = aligned_view[..., :fastest_shape]
  if order == 'F':
    return requested_view.T
  else:
    return requested_view


def aligned_empty_arbitrary(shape, dtype, byte_alignment, order='C'):
  dtype = numpy.dtype(dtype)
  order = order.upper()
  if order == 'F':
    shape = shape[::-1]
  elif order != 'C':
    raise ValueError("Order must be 'C' or 'F'.")
  fastest_shape = shape[-1]
  other_shape = shape[:-1]
  byte_width = fastest_shape * dtype.itemsize
  aligned_byte_width = byte_width + (byte_alignment - byte_width % byte_alignment) % byte_alignment
  other_dims = numpy.multiply.reduce(other_shape)
  total_bytes = aligned_byte_width * other_dims
  aligned_total_bytes = total_bytes + (byte_alignment - 1)
  buffer_array = numpy.empty(aligned_total_bytes, dtype=numpy.uint8)
  buffer_address = buffer_array.__array_interface__['data'][0]
  buffer_offset = (byte_alignment - buffer_address % byte_alignment) % byte_alignment
  aligned_buffer = buffer_array[buffer_offset:(total_bytes + buffer_offset)]
  byte_shape = tuple(other_shape) + (aligned_byte_width,)
  aligned_view = aligned_buffer.reshape(byte_shape, order='C')
  requested_view = aligned_view[..., :byte_width].view(dtype)
  if order == 'F':
    return requested_view.T
  else:
    return requested_view

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using normal()

2008-04-24 Thread Zachary Pincus
>   The syntax is normal(loc=0.0, scale=1.0, size=None), but I've not  
> seen
> what those represent, nor how to properly invoke this function. A  
> clue will
> be much appreciated.
>
>   I want to call normal() passing at least the width of the curve(at  
> the end
> points where y=0.0), and the center (where y=1.0). Being able to  
> specify the
> y value of the inflection point (by default 0.5) would also be  
> helpful.

The 'normal dstribution' is the Gaussian function:
N(x) = 1/(std*sqrt(2*pi))*exp(-(x-mean)**2/(2*std**2)), where N(x) is  
the probability density at position x, given a normal distribution  
characterized by 'mean' and 'std' (standard deviation).

http://en.wikipedia.org/wiki/Normal_distribution

Now, numpy.random.normal gives random samples distributed according to  
the above probability density function. The only remaining mystery is  
how 'loc' and 'scale' -- the parameters of numpy.random.normal -- map  
to 'mean' and 'standard deviation', which is how a normal distribution  
is usually parameterized. Fortunately, the documentation reveals this:

 >>> print numpy.random.normal.__doc__
Normal distribution (mean=loc, stdev=scale).

 normal(loc=0.0, scale=1.0, size=None) -> random values


If you need an alternate parameterization of the normal (e.g. in terms  
of the y value of the inflection point), just solve that out  
analytically from the definition of the normal in terms of mean and std.

However, it looks like you're trying to plot the normal function, not  
get random samples. Just evaluate the function (as above) at the x  
positions:

mean, std = (0, 1)
x = numpy.linspace(-10, 10, 200) # 200 evenly-spaced points from -10  
to 10
y = 1/(std*numpy.sqrt(2*numpy.pi))*numpy.exp(-(x-mean)**2/(2*std**2))

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using normal()

2008-04-24 Thread Zachary Pincus
>> norm = 1 / (scale * sqrt(2 * pi))
>> y = norm * exp(-power((x - loc), 2) / (2 * scale**2))
>
>   Hmm-m-m. I don't understand the source of the error:
>
>   y = norm * exp(-pow((x - loc), 2) / (2 * scale**2))
> TypeError: only length-1 arrays can be converted to Python scalars

Python's built in pow() and exp() functions can't handle numpy arrays,  
and thus try (and fail) to convert arrays to scalar values. You want  
to use numpy.exp and numpy.power (or just the ** operator), to do  
these operations to numpy arrays elementwise.
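
For example:

import numpy
x = numpy.linspace(-3, 3, 7)
y = numpy.exp(-numpy.power(x - 0.0, 2) / (2 * 1.0**2))   # elementwise on the array
# equivalently: numpy.exp(-(x - loc)**2 / (2 * scale**2)) with loc=0.0, scale=1.0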
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using normal()

2008-04-24 Thread Zachary Pincus
> It works for me:
>
>>> x = arange(0,10)
>>> scale=1
>>> loc=1
>>> norm = 1 / (scale * sqrt(2 * pi))
>>> y = norm * exp(-power((x - loc), 2) / (2 * scale**2))
>>> y
>
> array([  1.46762663e-01,   3.98942280e-01,   1.46762663e-01,
> 5.39909665e-02,   2.68805194e-03,   1.33830226e-04,
> 9.01740968e-07,   6.07588285e-09,   5.54048800e-12,
> 5.05227108e-15])

I assume you 'from numpy import *'? This is why it works -- because  
that import causes the python built-in exp() to be replaced (in the  
current namespace) by numpy.exp().

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "aligned" matrix / ctypes

2008-04-24 Thread Zachary Pincus
Zach:
>> Here's my attempt at aligned_empty(). It only accepts alignments  
>> that are
>> an integer multiple (or divisor) of the dtype's itemsize, because of
>> limitations in ndarray.view().
Robert:
> aligned_empty() works for me (after removing the if: suite that guards
> this case) because it correctly .view()s before it slices.
> aligned_empty_arbitrary() doesn't because it slices before it
> .view()s.

As I noted, aligned_empty_arbitrary() does not currently work because  
ndarray.view() cannot deal with strided arrays, which (as per another  
email thread) I think may be a bug. If it could, then  
aligned_empty_arbitrary() would work, and not raise an exception.

However, the truth is that aligned_empty does not work with arbitrary  
alignments! If you remove the 'if' guard, various assumptions break,  
due to the floor division here:
   element_alignment = byte_alignment / dtype.itemsize

which only works right when byte_alignment is an integral multiple of  
itemsize.

Otherwise you get:
In [1]: import aligned_allocator
In [2]: a = aligned_allocator.aligned_empty((5,5), float, 3)
In [3]: a[0,:].ctypes.data % 3
Out[3]: 0
In [4]: a[1,:].ctypes.data % 3
Out[4]: 1
In [5]: a[2,:].ctypes.data % 3
Out[5]: 2

Which is to say, the data isn't aligned right.

The reason that one must slice before .view()ing to allow arbitrary  
alignment is as follows. Imagine that we want rows of four 2-byte  
shorts aligned to 3-byte boundaries. (Assume that we already have a  
buffer that starts on a 3-byte boundary.) So we need an array that's 9  
bytes wide by however many rows, and then we just want to use the  
first eight bytes of row. If we slice first, we can get a strided  
array that is eight bytes wide, and thus something that we can  
interpret as four shorts. (That is, if .view() could handle strided  
arrays.)

On the other hand, there's absolutely no way that we can .view()  
before slicing, because our underlying array is 9 bytes wide, and you  
can't look at 9 bytes as any integral number of 2-byte shorts.  
So .view() should properly fail, and thus we can't get to the slicing.

Fortunately this is all an obtuse corner case, since who ever needs to  
align to values that aren't either an integral multiple or divisor of  
the itemsize? Nevertheless, anyone who wants to use the  
aligned_empty() code should keep the 'if' guard there.

In the mean time, is there any way to (from within python) construct  
an array from another array by specifying the dtype and strides? That  
is, some other way to do the view/slice business directly?
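
(One candidate I've found is the ndarray constructor itself, which accepts an explicit buffer, offset, and strides. A sketch of the example above -- four 2-byte shorts per row inside a buffer whose rows are 9 bytes wide:

import numpy

buf = numpy.zeros((5, 9), dtype=numpy.uint8)       # rows are 9 bytes wide
shorts = numpy.ndarray(shape=(5, 4), dtype=numpy.uint16, buffer=buf,
                       offset=0, strides=(9, 2))   # step 9 bytes between rows

I'm not sure whether this is the intended public route, though.)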

Thanks,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] problem with view() and strided arrays?

2008-04-25 Thread Zachary Pincus
Hi all,

> Actually -- it seems like view() doesn't work with strided arrays at
> all. (?)
>
> In : a = numpy.ones((4,32), dtype=numpy.uint8)
>
> In : a.view(numpy.uint16).shape
> Out: (4, 16)
>
> In : a[:,:16].view(numpy.uint16)
> ValueError: new type not compatible with array.
>
> I think this might be a recent regression, because before I updated my
> numpy installation to the latest SVN version (to check if the bug was
> fixed!), I'm pretty sure this kind of operation worked.

The problem starts out in arrayobject.c:6392, in array_descr_set(),  
where the following test is performed:
 if ((newtype->elsize != self->descr->elsize) && \
 (self->nd == 0 || !PyArray_ISONESEGMENT(self) || \
  newtype->subarray)) goto fail;

I *think* I could fix it by relaxing the restrictions to only require  
that the array have at least one dimension where self->strides[dim] ==  
self->descr->elsize, and then adjust the size of that dimension.

Here's my perhaps-fix. The old code is:
 if ((newtype->elsize != self->descr->elsize) && \
 (self->nd == 0 || !PyArray_ISONESEGMENT(self) || \
  newtype->subarray)) goto fail;

 if (PyArray_ISCONTIGUOUS(self)) index = self->nd - 1;
 else index = 0;


My suggested fix is:
 if ((newtype->elsize != self->descr->elsize) && \
     (self->nd == 0 || newtype->subarray)) goto fail;

 if (PyArray_ISCONTIGUOUS(self)) index = self->nd - 1;
 else if (PyArray_ISFORTRAN(self)) index = 0;
 else {
   int index_found = FALSE;
   for (index = 0; index < self->nd; index++) {
     if (self->strides[index] == self->descr->elsize) {
       index_found = TRUE;
       break;
     }
   }
   if (!index_found) goto fail;
 }

Could someone look this over? If it looks basically right, I'll make  
this a proper patch and post it to trac.

Thanks,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "aligned" matrix / ctypes

2008-04-25 Thread Zachary Pincus

Hello all,

Attached is code (plus tests) for allocating aligned arrays -- I think  
this addresses all the requests in this thread, with regard to  
allowing for different kinds of alignment. Thanks Robert and Anne for  
your help and suggestions. Hopefully this will be useful.


The core is a function for allocating arrays with totally arbitrary  
alignment along each dimension (e.g. you could allocate an 10x20 array  
of uint16's where each uint16 is aligned to 4-byte boundaries and each  
row of 20 uint16's is aligned to 32-byte boundaries, and the entire  
buffer is aligned to a 128-byte boundary.) I've also included helper  
functions for two common cases: when you want everything aligned to a  
particular multiple (every element, row, etc. as well as the whole  
buffer), and when you want an array where the rows (second-fastest  
moving index) are so aligned (this was my original use case, for fast  
image-blitting).


Zach


def aligned_empty(shape, dtype, dim_alignments, array_alignment):
  '''Allocate an empty array with the given shape and dtype, where the array
  buffer starts at a memory address evenly-divisible by array_alignment, and
  where items along each dimension are offset from the first item on that
  dimension by a byte offset that is an integer multiple of the corresponding
  value in the dim_alignments tuple.

  Example: To allocate a 20x30 array of float32s, where individual data
  elements are aligned to 16-byte boundaries, each row is aligned to a 64-byte
  boundary, and the array's buffer starts on a 128-byte boundary, call:
    aligned_empty((20,30), numpy.float32, (64, 16), 128)
  '''

def aligned_rows_empty(shape, dtype, alignment, order='C'):
  '''Return an array where the rows (second-fastest-moving index) are aligned
  to byte boundaries evenly-divisible by 'alignment'. If 'order' is 'C', then
  the indexing is such that the fastest-moving index is the last one; if the
  order is 'F', then the fastest-moving index is the first.'''

def aligned_elements_empty(shape, dtype, alignment, order='C'):
  '''Return an array where each element is aligned to byte boundaries evenly-
  divisible by 'alignment'.'''


import numpy

def aligned_empty(shape, dtype, dim_alignments, array_alignment):
  '''Allocate an empty array with the given shape and dtype, where the array
  buffer starts at a memory address evenly-divisible by array_alignment, and
  where items along each dimension are offset from the first item on that
  dimension by a byte offset that is an integer multiple of the corresponding
  value in the dim_alignments tuple.
  
  Example: To allocate a 20x30 array of float32s, where individual data
  elements are aligned to 16-byte boundaries, each row is aligned to a 64-byte
  boundary, and the array's buffer starts on a 128-byte boundary, call:
aligned_empty((20,30), numpy.float32, (64, 16), 128)
  '''
  if len(shape) != len(dim_alignments):
    raise ValueError('Alignments must be provided for each dimension.')
  dtype = numpy.dtype(dtype)
  strides = []
  current_size = dtype.itemsize
  for width, alignment in zip(shape[::-1], dim_alignments[::-1]):
    # build up new strides array in reverse, so that the fastest-moving index
    # is the last (C-ish indexing, but not necessarily contiguous)
    current_size += (alignment - current_size % alignment) % alignment
    strides.append(current_size)
    current_size *= width
  strides = strides[::-1]
  total_bytes = current_size + (array_alignment - 1)
  buffer = numpy.empty(total_bytes, dtype=numpy.uint8)
  address = buffer.ctypes.data
  offset = (array_alignment - address % array_alignment) % array_alignment
  return numpy.ndarray(shape=shape, dtype=dtype, buffer=buffer,
                       strides=strides, offset=offset)

def aligned_rows_empty(shape, dtype, alignment, order='C'):
  '''Return an array where the rows (second-fastest-moving index) are aligned
  to byte boundaries evenly-divisible by 'alignment'. If 'order' is 'C', then
  the indexing is such that the fastest-moving index is the last one; if the
  order is 'F', then the fastest-moving index is the first.'''
  if len(shape) < 2:
    raise ValueError('Need at least a 2D array to align rows.')
  order = order.upper()
  if order not in ('C', 'F'):
    raise ValueError("Order must be 'C' or 'F'.")
  dim_alignments = [1 for dim in shape]
  dim_alignments[-2] = alignment
  if order == 'F':
    shape = shape[::-1]
    return aligned_empty(shape, dtype, dim_alignments, alignment).T
  else:
    return aligned_empty(shape, dtype, dim_alignments, alignment)

def aligned_elements_empty(shape, dtype, alignment, order='C'):
  '''Return an array where each element is aligned to byte boundaries evenly-
  divisible by 'alignment'.'''
  order = order.upper()
  if order not in ('C', 'F'):
    raise ValueError("Order must be 'C' or 'F'.")
  dim_alignments = [alignment for dim in shape]
  if order == 'F':
    shape = shape[::-1]
    return aligned_empty(shape, dtype, dim_alignments, alignment).T
  else:
    return aligned_empty(shape, dtype, dim_alignments, alignment)

Re: [Numpy-discussion] "aligned" matrix / ctypes

2008-04-25 Thread Zachary Pincus
> There is one more important case, which is aligning just the beginning
> of the array.

Indeed -- thanks! The following should take care of that case (it's  
all fluff around calling aligned_empty() with a dim_alignments tuple  
of all 1's, and the array_alignment parameter as required):

def aligned_start_empty(shape, dtype, alignment, order='C'):
  '''Return an array with the first element aligned to a byte that is
  evenly-divisible by the specified alignment.'''
  order = order.upper()
  if order not in ('C', 'F'):
    raise ValueError("Order must be 'C' or 'F'.")
  dim_alignments = [1 for dim in shape]
  if order == 'F':
    shape = shape[::-1]
    return aligned_empty(shape, dtype, dim_alignments, alignment).T
  else:
    return aligned_empty(shape, dtype, dim_alignments, alignment)

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] svd in numpy

2008-05-18 Thread Zachary Pincus
On May 17, 2008, at 9:34 AM, David Cournapeau wrote:

> Nripun Sredar wrote:
>> I have a sparse matrix 416x52. I tried to factorize this matrix using
>> svd from numpy. But it didn't produce a result and looked like it is
>> in an infinite loop.
>> I tried a similar operation using random numbers in the matrix. Even
>> this is in an infinite loop.
>> Did anyone else face a similar problem?
>> Can anyone please give some suggestions?
>
> Are you on windows ? What is the CPU on your machine ? I suspect  
> this is
> caused by windows binaries which shipped blas/lapack without support  
> for
> "old" CPU.

I have seen this issue as well, on Windows XP running on a Core2 Duo.  
(But... it was a virtualized environment with VirtualBox, so I don't  
know if that disables the SSE features.)

Anyhow, this was with the latest windows binaries, I think  
(numpy-1.0.4.win32-py2.5.msi), and had the same issue: infinite loop  
with 100% processor doing a SVD. (Non-sparse array, though.)

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] build outside C code with scons within numpy distutils

2008-05-25 Thread Zachary Pincus
Hello all,

I've been following David's work making numpy build with scons with  
some interest. I have a quick question about how one might use scons/ 
numpy distutils from an outside project.

Specifically, I have a package that uses numpy and numpy.distutils to  
build itself. Unfortunately, there are some pure-C libraries that I  
call using ctypes, and as these libraries are not python  
extensions, it is hard to get distutils to build them happily on all  
platforms. From what I understand, one of the benefits of having scons  
available was that building and packaging these sort of libraries  
would become easier.

Is that the case? And if so, what would be the best way to add the  
scons stuff to a simple setup.py file written for numpy.distutils?

Thanks,
Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] build outside C code with scons within numpy distutils

2008-05-25 Thread Zachary Pincus
Thanks for the tips! This is very helpful.


>> Specifically, I have a package that uses numpy and numpy.distutils to
>> build itself. Unfortunately, there are some pure-C libraries that I
>> call using ctypes, and as these libraries are not python
>> extensions, it is hard to get distutils to build them happily on all
>> platforms.
>
> You can take a look at the examples in numscons sources
> (sources/tests/ctypesext).
>
> In the setup.py, you call config.add_sconscript('SConstruct')
>
> And in the scons script, you do:
>
> from numscons import GetNumpyEnvironment
> env = GetNumpyEnvironment(ARGUMENTS)
>
> env.NumpyCtypes('foo', source = ['foo.c'])
>
> Note that although this has not changed since it existed, I do not
> guarantee API stability at this point.
>
> cheers,
>
> David
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy, py2exe, and SSE

2008-06-04 Thread Zachary Pincus
Hello all,

I've been toying around with bundling up a numpy-using python program  
for windows by using py2exe. All in all, it works great, except for  
one thing: the numpy superpack installer for windows has (correctly)  
selected SSE3 binary libraries to install on my machine. This causes  
(of course) crashes when the binary py2exe bundle, which includes the  
numpy libraries installed on my machine, is run on a non-SSE3 machine.

So, two questions:
  1) What's the best target for "safe" binaries on i386 machines? No  
SSE at all? SSE? SSE2?
  2) What's the best way to get the libraries desired installed for  
numpy? E.g. compile it myself with some --no-sse option? (And if so,  
any clue as to the required options.) Or is there some way to get the  
numpy windows installer to install a specific kind of binary?

Thanks,
Zach Pincus
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Inplace shift

2008-06-06 Thread Zachary Pincus
> I'd like to shift the columns of a 2d array one column to the right.
> Is there a way to do that without making a copy?


I think what you want is numpy.roll?

Definition: numpy.roll(a, shift, axis=None)
Docstring:
 Roll the elements in the array by 'shift' positions along
 the given axis.

 >>> from numpy import roll, arange
 >>> arange(10)
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 >>> roll(arange(10), 2)
 array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Getting specific Win32 binaries? (was: numpy, py2exe, and SSE)

2008-06-07 Thread Zachary Pincus
Hello all,

I'd like to be able to grab specific Win32 binaries from the  
distribution -- e.g. the SSE, SSE2. or SSE3-specific files (so that I  
can make sure to distribute the lowest-common-denominator ones via  
py2exe). Is this possible with the current binary releases? Or ought I  
to just build my own as required?

Thanks,
Zach



On Jun 4, 2008, at 1:37 PM, Zachary Pincus wrote:

> Hello all,
>
> I've been toying around with bundling up a numpy-using python program
> for windows by using py2exe. All in all, it works great, except for
> one thing: the numpy superpack installer for windows has (correctly)
> selected SSE3 binary libraries to install on my machine. This causes
> (of course) crashes when the binary py2exe bundle, which includes the
> numpy libraries installed on my machine, is run on a non-SSE3 machine.
>
> So, two questions:
>  1) What's the best target for "safe" binaries on i386 machines? No
> SSE at all? SSE? SSE2?
>  2) What's the best way to get the libraries desired installed for
> numpy? E.g. compile it myself with some --no-sse option? (And if so,
> any clue as to the required options.) Or is there some way to get the
> numpy windows installer to install a specific kind of binary?
>
> Thanks,
> Zach Pincus
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.distutils problem with py2app/py2exe?

2008-06-28 Thread Zachary Pincus
Hello all,

I've noticed that py2app and py2exe work strangely on my project,  
which (having fortran code) is driven by the numpy distutils. Now,  
these two distutils commands need to peek into the build/lib. 
[whatever] directories to grab the files to package up. Indeed, the  
docs for py2exe and py2app say that the build directories are  
automatically added to the path.

However, I found that for my project, I need to explicitly add lines  
to the setup.py file like:

if have_py2exe:
    sys.path.insert(0, './build/lib.win32-%d.%d' % (sys.version_info[0],
                                                    sys.version_info[1]))


I got around to digging a little deeper, and at least in the py2app  
case, py2app distutils Command subclass gets the build dir information  
as follows:

    def run(self):
        build = self.reinitialize_command('build')
        build.run()
        [... snip ...]
        extra_paths.extend([build.build_platlib, build.build_lib])

Now, when I run py2app, the build.build_platlib and build.build_lib  
values are None. So py2app never finds the library directory, and thus  
packages the files incorrectly. Is this likely to be (a) a problem in  
numpy.distutils, (b) a problem with my local setup, or (c) a problem  
with py2app?

Any thoughts?

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Infinity definitions

2008-07-16 Thread Zachary Pincus
>> Hi,
>> Following Travis's suggestion below, I would like to suggest that the
>> following definitions be depreciated or removed in this forthcoming
>> release:
>>
>> numpy.Inf
>> numpy.Infinity
>> numpy.infty
>> numpy.PINF
>> numpy.NAN
>> numpy.NaN
> ...
>
> While this is being discussed, what about the "representation" of
> nan's an infinities produced by repr in various forms?  It is
> particularly annoying when the repr of simple numeric types cannot be
> evaluated.  These include:
>
> 'infj'== repr(complex(0,inf))
> 'nanj'== repr(complex(0,nan))
> 'infnanj' == repr(complex(inf,nan))
> 'nannanj' == repr(complex(nan,nan))
>
> It seems that perhaps infj and nanj should be defined symbols.  I am
> not sure why a + does not get inserted before nan.
>
> In addition, the default infstr and nanstr printing options should
> also be changed to 'inf' and 'nan'.

Hello all,

If I recall correctly, one reason for the plethora of infinity  
definitions (which had been mentioned previously on the list) was that  
the repr for some or all float/complex types was generated by code in  
the host OS, and not in numpy. As such, these reprs were different for  
different platforms. As there was a desire to ensure that reprs could  
always be evaluated, the various ways that inf and nan could be spit  
out by the host libs were all included.

Has this been fixed now, so that repr(inf), (etc.) looks identical on  
all platforms?

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy 1.1.0 (+ PIL 1.1.6) crashes on large datasets

2008-07-29 Thread Zachary Pincus
> I've managed to crash numpy+PIL when feeding it rather large images.
> Please see the URL for a test image, script, and gdb stack trace. This
> crashes on my box (Windows XP SP3) as well as on a linux box (the gdb
> trace I've been provided with) and a Mac. Windows reports the crash to
> happen in "multiarray.pyd"; the stack trace mentions the equivalent
> file.
> Unfortunately, I don't know how to fix this. Can I help somehow?
>
> -- Chris
>
> [1] http://cracki.ath.cx:10081/pub/numpy-pil-crash/

Hmm... I've opened this file with my homebrew image-IO tools, and I  
cannot provoke a segfault. (These tools are derived from PIL, but with  
many bugs related to the array interface fixed. I had submitted  
patches to the PIL mailing list, which, as usual, languished.)

I wonder if the issue is with how the PIL is providing the buffer  
interface to numpy? Can you get the crash if you get the array into  
numpy through the image's tostring (or whatever) method, and then use  
numpy.fromstring?
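
Something along these lines, say (assuming the PIL of that era and a single-band 8-bit image; the filename is just a placeholder):

import numpy
import Image   # the PIL

im = Image.open('test.png')
a = numpy.fromstring(im.tostring(), dtype=numpy.uint8)
a = a.reshape((im.size[1], im.size[0]))   # PIL's size is (width, height)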

Zach

PS. This is with a recent SVN version of numpy, on OS X 10.5.4.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bilateral filter

2008-08-02 Thread Zachary Pincus
> Is there any efficient implementation of bilateral filter that works  
> smoothly with numpy?

Not that I know of...

Of course, if you were to write one, I'm sure there would be some  
interest in it! I would recommend looking into the tools in  
scipy.ndimage for basic image-processing support, but there's no  
bilateral filtering there.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bilateral filter

2008-08-05 Thread Zachary Pincus
> Attached here my cython implementation of the bilateral filter,  
> which is my first cython program. I would ask for the following:

Thanks for the code! I know it will be of use to me. (Do you place any  
particular license on it?)

Zach


On Aug 5, 2008, at 9:38 AM, Nadav Horesh wrote:

> Attached here my cython implementation of the bilateral filter,  
> which is my first cython program. I would ask for the following:
>
>   • Is there any way to speed up the code just by "cosmetic"  
> modifications?
>   • In particular I use the unportable gcc function __builtin_abs: Is  
> there any way to access this in a portable way?
>   • I would like to donate this code to scipy or any other suitable  
> project. What can/should I do to realise it?
>
> Remarks:
>
> The code contains 3 end-user routines:
>   • A pure python implementation: Easy to read and modify --- it can  
> be cut out into a python source code.
>   • A straight forward cython implementation: About 4 times as fast  
> as the python implementation.
>   • An optimised cython implementation earning another factor of 2 in  
> speed (depends on the parameters used).
> I see this code as a "research grade" that would evolve in the near  
> future, as I am working on a related project, and hopefully  
> following your comments.
>
>   Nadav
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] cubic spline function in numarray or numpy

2008-08-06 Thread Zachary Pincus
Hi Shawn,

> I am trying to find 1-D cubic spline function. But I am not able to  
> call it. Error message can’t find “ndimage”
>


Numpy provides tools for creating and dealing with multidimensional  
arrays and some basic FFT and linear algebra functions. You want to  
look at scipy, a set of packages built on numpy that provide more  
sophisticated domain-specific functionality. Specifically, ndimage is  
part of *scipy*, not numpy.
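
For instance, scipy.interpolate provides 1-D cubic splines; a small sketch:

import numpy
from scipy import interpolate

x = numpy.linspace(0, 10, 11)
y = numpy.sin(x)
tck = interpolate.splrep(x, y, k=3)       # fit a cubic (k=3) spline
xnew = numpy.linspace(0, 10, 101)
ynew = interpolate.splev(xnew, tck)       # evaluate the spline at new points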

Zach



On Aug 6, 2008, at 11:46 AM, Gong, Shawn (Contractor) wrote:

> hi list,
>
> I am trying to find 1-D cubic spline function. But I am not able to  
> call it. Error message can’t find “ndimage”
>
> Would someone point me to 1-D cubic spline functions either in  
> numarray or numpy? and their help pages?
>
>
> I use python 2.3.5 and numpy 1.0.1. Are they too old to use cubic  
> spline?
>
>
> thanks,
>
> Shawn
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy / ipython heisenbug crasher help?

2008-08-07 Thread Zachary Pincus
Hello all,

I have a bizarre bug that causes numpy to segfault, but:
  - only when run under ipython
  - only when numpy is imported *after* another library (that does  
not import numpy)

Here's what the code looks like.
Crashes (only in ipython):
import celltool.utility.pil_lite.Image as Image, numpy
i = Image.open('untreated_2_logical.png')
a = numpy.array(i)

Does not crash:
import numpy, celltool.utility.pil_lite.Image as Image
i = Image.open('untreated_2_logical.png')
a = numpy.array(i)

The 'i' object exposes an __array_interface__, and is produced from my  
personal fork of the PIL.

The crash is thusly reported:
Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0251f000

0   libSystem.B.dylib 0x0a13 __memcpy + 627
1   multiarray.so 0x022f7af8 _array_copy_into + 2840  
(arrayobject.c:1113)
2   multiarray.so 0x022f88af PyArray_FromArray + 495  
(arrayobject.c:8334)
3   multiarray.so 0x022ee314 PyArray_FromAny + 308 (arrayobject.c: 
8814)
4   multiarray.so 0x022fb01c PyArray_CheckFromAny + 92  
(arrayobject.c:8995)
5   multiarray.so 0x022fb2e6 _array_fromobject + 310  
(multiarraymodule.c:5787)
[And then the python stack frames]

Can anyone suggest what a good next step in reducing this patently  
insane bug down to a more minimal test case is? I had this bug with  
numpy 1.1 as well as the current SVN head.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy / ipython heisenbug crasher help?

2008-08-07 Thread Zachary Pincus
Hmm, I may have identified this by other means. See my next email...

Zach


On Aug 7, 2008, at 3:22 PM, Zachary Pincus wrote:

> Hello all,
>
> I have a bizarre bug that causes numpy to segfault, but:
>  - only when run under ipython
>  - only when numpy is imported *after* an other library (that does
> not import numpy)
>
> Here's what the code looks like.
> Crashes (only in ipython):
> import celltool.utility.pil_lite.Image as Image, numpy
> i = Image.open('untreated_2_logical.png')
> a = numpy.array(i)
>
> Does not crash:
> import numpy, celltool.utility.pil_lite.Image as Image
> i = Image.open('untreated_2_logical.png')
> a = numpy.array(i)
>
> The 'i' object exposes an __array_interface__, and is produced from my
> personal fork of the PIL.
>
> The crash is thusly reported:
> Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes: KERN_INVALID_ADDRESS at 0x0251f000
>
> 0   libSystem.B.dylib 0x0a13 __memcpy + 627
> 1   multiarray.so 0x022f7af8 _array_copy_into + 2840
> (arrayobject.c:1113)
> 2   multiarray.so 0x022f88af PyArray_FromArray + 495
> (arrayobject.c:8334)
> 3   multiarray.so 0x022ee314 PyArray_FromAny + 308 (arrayobject.c:
> 8814)
> 4   multiarray.so 0x022fb01c PyArray_CheckFromAny + 92
> (arrayobject.c:8995)
> 5   multiarray.so 0x022fb2e6 _array_fromobject + 310
> (multiarraymodule.c:5787)
> [And then the python stack frames]
>
> Can anyone suggest what a good next step in reducing this patently
> insane bug down to a more minimal test case is? I had this bug with
> numpy 1.1 as well as the current SVN head.
>
> Zach
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug? Numpy crash with bad __array_interface__

2008-08-07 Thread Zachary Pincus
Hello all,

As per my previous email, I encountered a strange sometimes-segfault  
when using 'numpy.array(thing)' to convert 'thing' (which provides an  
__array_interface__) to a numpy array.

The offending __array_interface__ has a 'data' item that is a python  
string (not, as specified in the protocol, a pointer to a memory area)  
which is too small for the provided array size. Which, in some cases,  
causes a segfault.

Is this a bug? That is, should numpy check the size of the  
'data' (when 'data' is in fact a sized python object and not just a  
bare pointer) against the size of the array and provided typestr? Or  
is this too much of a corner case to deal with...
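For concreteness, a hypothetical minimal object of the shape described above (the class name, sizes and contents here are invented for illustration -- this is not the actual PIL code):

import numpy

class BadInterface(object):
    # 'data' is a python string that is far too small for the declared shape
    __array_interface__ = {
        'version': 3,
        'shape': (100, 100),
        'typestr': '|u1',     # 100*100 = 10000 bytes needed...
        'data': 'x' * 10,     # ...but only 10 bytes provided
    }

a = numpy.array(BadInterface())   # this is the step that sometimes segfaults here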

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] packbits / unpackbits bugs or misfeatures?

2008-08-07 Thread Zachary Pincus
Hello all,

numpy.unpackbits has a docstring that states that it returns a boolean  
array, but the function instead returns a uint8 array. Should I enter  
this in trac as a documentation bug or a functionality bug?

Also, numpy.packbits will not accept a bool-typed array as input (only  
scalar-typed arrays) -- should it have this ability, for symmetry with  
unpackbits? (Assuming that unpackbits should indeed unpack into bool  
arrays...)
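In the meantime the obvious workaround is an explicit cast on both sides; a small sketch of the current behavior as I see it:

import numpy

bits = numpy.unpackbits(numpy.array([7], dtype=numpy.uint8))
# bits == array([0, 0, 0, 0, 0, 1, 1, 1], dtype=uint8) -- uint8, not bool

bools = bits.astype(bool)                             # workaround on the unpack side
packed = numpy.packbits(bools.astype(numpy.uint8))    # ...and on the pack side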

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.fromstring() : error handling ?

2008-08-07 Thread Zachary Pincus
> the following code drives python into an endless loop :
>
> >>> import numpy
> >>> numpy.fromstring("abcd", dtype = float, sep = ' ')

It works on OS X 10.5.4 with a today's SVN head of numpy:

In [1]: import numpy

In [2]: numpy.fromstring("abcd", dtype = float, sep = ' ')
Out[2]: array([ 0.])

In [3]: numpy.version.version
Out[3]: '1.2.0.dev5619'

Perhaps this was fixed recently, or the bug is platform-specific?

Zach


On Aug 7, 2008, at 4:44 PM, oc-spam66 wrote:

> Hello,
>
> the following code drives python into an endless loop :
>
> >>> import numpy
> >>> numpy.fromstring("abcd", dtype = float, sep = ' ')
>
> I think the function numpy.fromstring is lacking an adequate error  
> handling for that case.
>
> Is it a bug ?
>
> Regards,
>
> -- 
> O.C.
> Python 2.5.2
> Debian Lenny
>
>
> Créez votre adresse électronique [EMAIL PROTECTED]
> 1 Go d'espace de stockage, anti-spam et anti-virus intégrés.
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] packbits / unpackbits bugs or misfeatures?

2008-08-09 Thread Zachary Pincus
>> Hello all,
>>
>> numpy.unpackbits has a docstring that states that it returns a  
>> boolean
>> array, but the function instead returns a uint8 array. Should I enter
>> this in trac as a documentation bug or a functionality bug?
>
> Would you mind fixing up the docs on the documentation editor?

Except that it would kind of make more sense, at least to me, for  
unpackbits to return a boolean array, and packbits to take boolean  
arrays. I could try to look into fixing the code instead of the docs,  
if others agree.

Otherwise I'll fix the docs. Maybe that's better from an API-breakage  
standpoint anyway, regardless of what "makes sense to me"...

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Zachary Pincus
Hi Dan,

Your approach generates numerous large temporary arrays and lists. If  
the files are large, the slowdown could be because all that memory  
allocation is causing some VM thrashing. I've run into that at times  
parsing large text files.

Perhaps better would be to iterate through the file, building up your  
cells dictionary  incrementally. Finally, once the file is read in  
fully, you could convert what you can to arrays...

import numpy

f = open('big_file')
header = f.readline()
cells = {'tet': [], 'hex': [], 'quad': []}
for line in f:
    vals = line.split()
    index_property = vals[:2]     # index, property
    type = vals[2]                # 'tet', 'hex' or 'quad'
    nodes = vals[3:]
    cells[type].append(index_property + nodes)
for type, vals in cells.items():
    cells[type] = numpy.array(vals, dtype=int)

I'm not sure if this is exactly what you want, but you get the idea...  
Anyhow, the above only uses about twice as much RAM as the size of the  
file. Your approach looks like it uses several times more than that.

Also you could see if:
   cells[type].append(numpy.array([index, property]+nodes, dtype=int))

is faster than what's above... it's worth testing.

If even that's too slow, maybe you'll need to do this in C? That  
shouldn't be too hard, really.

Zach





On Aug 13, 2008, at 3:56 PM, Dan Lenski wrote:

> Hi all,
> I'm using NumPy to read and process data from ASCII UCD files.  This  
> is a
> file format for describing unstructured finite-element meshes.
>
> Most of the file consists of rectangular, numerical text matrices,  
> easily
> and efficiently read with loadtxt().  But there is one particularly  
> nasty
> section that consists of matrices with variable numbers of columns,  
> like
> this:
>
> # index property type nodes
> 1   1tet  620 583 1578 1792
> 2   1tet  656 551 553 566
> 3   1tet  1565 766 1600 1646
> 4   1tet  1545 631 1566 1665
> 5   1hex  1531 1512 1559 1647 1648 1732
> 6   1hex  777 1536 1556 1599 1601 1701
> 7   1quad 296 1568 1535 1604
> 8   1quad 54 711 285 666
>
> As you might guess, the "type" label in the third column does indicate
> the number of following columns.
>
> Some of my files contain sections like this of *more than 1 million
> lines*, so I need to be able to read them fast.  I have not yet come  
> up
> with a good way to do this.  What I do right now is I split them up  
> into
> separate arrays based on the "type" label:
>
> lines = [f.next() for i in range(n)]
> lines = [l.split(None, 3) for l in lines]
> id, prop, types, nodes = apply(zip, lines) # THIS TAKES /FOREVER/
>
> id = array(id, dtype=uint)
> prop = array(id, dtype=uint)
> types = array(types)
>
> cells = {}
> for t in N.unique(types):
> these = N.nonzero(types==t)
> # THIS NEXT LINE TAKES FOREVER
> these_nodes = array([nodes[ii].split() for ii in these], dtype=uint).T
> cells[t] = N.row_stack(( id[these], prop[these], these_nodes ))
>
> This is really pretty slow and sub-optimal.  Has anyone developed a  
> more
> efficient way to read arrays with variable numbers of columns???
>
> Dan
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Zachary Pincus
> This is similar to what I tried originally!  Unfortunately, repeatedly
> appending to a list seems to be very slow... I guess Python keeps
> reallocating and copying the list as it grows.  (It would be nice to  
> be
> able to tune the increments by which the list size increases.)

Robert's right, as ever -- repeated appending to a list is an  
*extremely* common operation, which you see often in idiomatic python.  
The implementation of list.append should be very fast, and smart about  
pre-allocating as needed.

Try profiling the code just to make sure that it is the list append  
that's slow, and not something else happening on that line, e.g..

> I hope this recipe may prove useful to others.  It would be nice if  
> NumPy
> had a built-in facility for arrays that intelligently expend their
> allocation as they grow.

It appears to be the general consensus on this mailing list that the  
best solution when an expandable array is required is to append to a  
python list, and then once you've built it up completely, convert it  
to an array. So I'm at least surprised that this is turning out to be  
so slow for you... But if the profiler says that's where the trouble  
is, then so it is...


>> Also you could see if:
>>   cells[type].append(numpy.array([index, property]+nodes, dtype=int))
>>
>> is faster than what's above... it's worth testing.
>
> Repeatedly concatenating arrays with numpy.append or  
> numpy.concatenate is
> also quite slow, unfortunately. :-(

Actually, my suggestion was to compare building up a list-of-lists and  
then converting that to a 2d array versus building up a list-of- 
arrays, and then converting that to a 2d array... one might wind up  
being faster or more memory-efficient than the other...
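Something along these lines, say -- a rough sketch of the comparison I mean (the row/column counts are arbitrary):

import time
import numpy

rows, cols = 100000, 6

t0 = time.time()
as_lists = []
for i in xrange(rows):
    as_lists.append(range(cols))                # append plain python lists
a = numpy.array(as_lists, dtype=int)
lists_elapsed = time.time() - t0

t0 = time.time()
as_arrays = []
for i in xrange(rows):
    as_arrays.append(numpy.arange(cols))        # append small arrays instead
b = numpy.array(as_arrays, dtype=int)
arrays_elapsed = time.time() - t0

print 'list-of-lists: %.3fs   list-of-arrays: %.3fs' % (lists_elapsed, arrays_elapsed)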

Zach




___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy 1.2.0b2 released

2008-08-14 Thread Zachary Pincus
>> is it really necessary to label these dmg's for 10.5 only?
> No.  This is done automatically by the tool used to build the mpkg.
> I'll look at changing this to 10.4, thanks for the reminder.

If the dmg name is generated from the distribution name that the  
python distutils makes (e.g. macosx-10.5-i386-2.5), then the following  
may be of note:
It appears that the MACOSX_DEPLOYMENT_TARGET environment variable  
controls (among other things) the distutils name. I generally set mine  
to 10.4, or even 10.3, depending on whether anything that I'm building  
requires later features (I'm pretty sure that numpy builds don't.)

Zach


On Aug 14, 2008, at 11:41 PM, Christopher Burns wrote:

> On Thu, Aug 14, 2008 at 6:45 PM, Les Schaffer  
> <[EMAIL PROTECTED]> wrote:
>> is it really necessary to label these dmg's for 10.5 only?
>
> No.  This is done automatically by the tool used to build the mpkg.
> I'll look at changing this to 10.4, thanks for the reminder.
>
>> will this dmg install on 10.4 if  py2.5 is available?
>
> It should.  Let us know otherwise.
>
> -- 
> Christopher Burns
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] first post, simple questions

2008-08-21 Thread Zachary Pincus
> import numpy
> linsp = numpy.linspace
> red = linsp(0, 255, 50)
> green = linsp(125, 150, 50)
> blue = linsp(175, 255, 50)
>
> array's elements are float. How do I convert them into integer?
>
> I need to build a new array from red, green, blue. like this:
>
> [[ red[0], green[0], blue[0] ],
>  [ red[1], green[1], blue[1] ],
>  [ red[2], green[3], blue[3] ],
>  
>  
>  [ red[49], green[49], blue[49] ],
> ]
>
> how do I create such an array?

We can convert them to integer and build the array at the same time:

colors = numpy.empty((50, 3), dtype=numpy.uint8)
colors[:,0] = numpy.linspace(0, 255, 50)
colors[:,1] = numpy.linspace(125, 150, 50)
colors[:,2] = numpy.linspace(175, 255, 50)

Or we could use a fancy-dtype record array:

colors = numpy.empty(50, dtype=[('r', numpy.uint8), ('g',  
numpy.uint8), ('b', numpy.uint8)])
colors['r'] = numpy.linspace(0, 255, 50)
colors['g'] = numpy.linspace(125, 150, 50)
colors['b'] = numpy.linspace(175, 255, 50)

To see the array look like the old non-fancy type, use:
colors.view((numpy.uint8, 3))

Zach



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] subsampling array without loops

2008-08-22 Thread Zachary Pincus
> I'm looking for a way to acccomplish the following task without lots
> of loops involved, which are really slowing down my code.
>
> I have a 128x512 array which I want to break down into 2x2 squares.
> Then, for each 2x2 square I want to do some simple calculations
> such as finding the maximum value, which I then store in a 64x256
> array.

Perhaps you could take the relevant slices of the arrays with  
appropriate striding:

a = numpy.arange(128*512).reshape((128, 512))
top_left = a[::2, ::2]
top_right = a[::2, 1::2]
bottom_left = a[1::2, ::2]
bottom_right = a[1::2, 1::2]

tiles = numpy.array([top_left, top_right, bottom_left, bottom_right])
maxes = tiles.max(axis=0)

Similarly if you want overlapping tiles, you could leave out the  
final :2 in the slice specifications above.
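Another option, if you don't need the overlapping variant, is to reshape so that each 2x2 block gets its own pair of axes and then reduce over those -- a quick sketch, reducing one axis at a time:

import numpy

a = numpy.arange(128*512).reshape((128, 512))

blocks = a.reshape((64, 2, 256, 2))          # each 2x2 square gets its own (2, 2) sub-axes
maxes = blocks.max(axis=3).max(axis=1)       # shape (64, 256): the max of each 2x2 square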

Zach


> Here is the actual code involved.  It's only slightly more complex
> than what I described above, since it also involves doing a bit of
> masking on the 2x2 sub-arrays.
>
> Any hints are much appreciated!  An inordinate amount of time is
> being spent in this function and another one like it.
>
> Catherine
>
> def calc_sdcm_at_rlra(self,iblock):
>
> npixl = self.nlineht/self.nlinerl
> npixs = self.nsmpht/self.nsmprl
>
> sdcm_out = numpy.array([Constants.CLOUDMASK_NR]
> *self.nlinerl*self.nsmprl,'int8') \
> .reshape(self.nlinerl,self.nsmprl)
>
> for ilinerl in range(0,self.nlinerl):
> for ismprl in range(0,self.nsmprl):
>
> height = self.data[iblock].height[ilinerl*2:ilinerl*2
> +2, ismprl*2:ismprl*2+2]
> sdcm   = self.data[iblock].sdcm[ilinerl*2:ilinerl*2
> +2, ismprl*2:ismprl*2+2]
> source = self.data[iblock].heightsrc
> [ilinerl*2:ilinerl*2+2, ismprl*2:ismprl*2+2]
>
> mask1 = (source == Constants.HEIGHT_STEREO)
> mask2 = ( (source == Constants.HEIGHT_SURFACE) | \
>   (source == Constants.HEIGHT_DEFAULT) )
>
> if (mask1.any()):
> loc = height[mask1].argmax()
> sdcm_out[ilinerl,ismprl] = sdcm[mask1].ravel() 
> [loc]
> elif (mask2.any()):
> loc = height[mask2].argmax()
> sdcm_out[ilinerl,ismprl] = sdcm[mask2].ravel() 
> [loc]
>
> return sdcm_out
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] subsampling array without loops

2008-08-25 Thread Zachary Pincus
> This almost works.  Is there a way to do some masking on tiles, for
> instance taking the maximum height of each 2x2 square that is an
> odd number?   I've tried playing around with masking and where, but
> they don't return an array of the original size and shape of "tiles"
> below.

Could you provide an example (maybe not code, just descriptive) of  
what you want to do?

I'm not sure what specifically you mean by "height" and "odd" above...

Zach


> Catherine
>
>> Perhaps you could take the relevant slices of the arrays with
>> appropriate striding:
>>
>> a = numpy.arange(128*512).reshape((128, 512))
>> top_left = a[::2, ::2]
>> top_right = a[::2, 1::2]
>> bottom_left = a[1::2, ::2]
>> bottom_right = a[1::2, 1::2]
>>
>> tiles = numpy.array([top_left, top_right, bottom_left, bottom_right])
>> maxes = tiles.max(axis=0)
>>
>> Similarly if you want overlapping tiles, you could leave out the
>> final :2 in the slice specifications above.
>>
>> Zach
>>
>>
>>
>>> I'm looking for a way to acccomplish the following task without lots
>>> of loops involved, which are really slowing down my code.
>>>
>>> I have a 128x512 array which I want to break down into 2x2 squares.
>>> Then, for each 2x2 square I want to do some simple calculations
>>> such as finding the maximum value, which I then store in a 64x256
>>> array.
>>>
>>> Here is the actual code involved.  It's only slightly more complex
>>> than what I described above, since it also involves doing a bit of
>>> masking on the 2x2 sub-arrays.
>>>
>>> Any hints are much appreciated!  An inordinate amount of time is
>>> being spent in this function and another one like it.
>>>
>>> Catherine
>>>
>>>def calc_sdcm_at_rlra(self,iblock):
>>>
>>>npixl = self.nlineht/self.nlinerl
>>>npixs = self.nsmpht/self.nsmprl
>>>
>>>sdcm_out = numpy.array([Constants.CLOUDMASK_NR] \
>>>   *self.nlinerl*self.nsmprl,'int8') \
>>>.reshape(self.nlinerl,self.nsmprl)
>>>
>>>for ilinerl in range(0,self.nlinerl):
>>>for ismprl in range(0,self.nsmprl):
>>>
>>>height = self.data[iblock].height
>>> [ilinerl*2:ilinerl*2+2, \
>>> ismprl*2:ismprl*2+2]
>>>sdcm   = self.data[iblock].sdcm[ilinerl*2:ilinerl*2
>>> +2, \
>>>ismprl*2:ismprl*2+2]
>>>source = self.data[iblock].heightsrc
>>> [ilinerl*2:ilinerl*2+2, \
>>>
>>> ismprl*2:ismprl*2+2]
>>>
>>>mask1 = (source == Constants.HEIGHT_STEREO)
>>>mask2 = ( (source == Constants.HEIGHT_SURFACE) | \
>>>  (source == Constants.HEIGHT_DEFAULT) )
>>>
>>>if (mask1.any()):
>>>loc = height[mask1].argmax()
>>>sdcm_out[ilinerl,ismprl] = sdcm[mask1].ravel()
>>> [loc]
>>>elif (mask2.any()):
>>>loc = height[mask2].argmax()
>>>sdcm_out[ilinerl,ismprl] = sdcm[mask2].ravel()
>>> [loc]
>>>
>>>return sdcm_out
>>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] at my wits end over an error message...

2008-08-30 Thread Zachary Pincus
Hi Alan,

> Traceback (most recent call last):
>  File "/usr/local/lib/python2.5/site-packages/enthought.traits-2.0.4- 
> py2.5-linux-i686.egg/enthought/traits/trait_notifiers.py", line 325,  
> in call_1
>self.handler( object )
>  File "TrimMapl_1.py", line 98, in _Run_fired
>outdata = np.array(outdata, dtype=dtypes)
> TypeError: expected a readable buffer object

This would make it appear that the problem is not with numpy per se,  
but with the traits library, or how you're using it... I'm not too  
familiar with traits, so I can't really provide any advice there.

Can you disable whatever it is you're doing with traits for the time  
being, to see if that solves it? Maybe not have whatever notifier is  
presumably listening to outdata?

Also, for the record, your inner loop could be rendered in more  
idiomatic python as:

TYPICALLY_UINT_COLUMNS = set(['Track', 'Bin', 'code', 'horizon'])
...
dtypes = []
for var in self.var_list:
    if var in TYPICALLY_UINT_COLUMNS:
        dtypes.append((var, '<u4'))
    else:
        dtypes.append((var, '<f8'))

> TYPICALLY_UINT_COLUMNS = ['Track', 'Bin', 'code', 'horizon']
> ...
>     dtypes = [ ]
>     for i in range(0, len(self.var_list)) :
>         if TYPICALLY_UINT_COLUMNS.count(self.var_list[i]) > 0:
>             dtypes.append((self.var_list[i], '<u4'))
>         else :
>             dtypes.append((self.var_list[i], '<f8'))
>     print "dtypes = ", dtypes
>     print outdata[0]
>     outdata = np.array(outdata, dtype=dtypes)


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] sum of positive values in an array

2008-09-05 Thread Zachary Pincus
>
> Hi, probably a basic question, but I'm looking for a neat way to sum
> all the positive values in an array of floats.  I'm currently doing it
> the hard way, but am hoping there is some cunning and elegant syntax I
> can use instead

Fancy indexing's my favorite cunning and elegant syntax:

a = numpy.random.randn(100)
a.sum()
a > 0 # gives a boolean array that's True where the elements of a are > 0.
a[a > 0] # retrieve from a only the elements with True in the boolean array.
a[a > 0].sum() # your desired result.

Note that you can also fancy-index with a list/array of indices, as  
well as a boolean "mask" array:
a[[1, 40, 55]] # pulls just the values at index 1, 40, and 55 out of a  
into a shape (3,) array.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] 2D (or n-d) fancy indexing?

2008-10-08 Thread Zachary Pincus
Hello all,

I'm doing something silly with images and am unable to figure out the  
right way to express this with "fancy indexing" -- or anything other  
than a brute for-loop for that matter.

The basic gist is that I have an array representing n images, of shape  
(n, x, y). I also have a "map" of shape (x, y), which contains indices  
in the range [0, n-1]. I then want to construct the "composite" image,  
of shape (x, y), with pixels selected from the n source images as per  
the indices in the map, i.e.:

composite[x, y] = images[map[x, y], x, y]
for all (x, y).

Now, I can't figure out if there's an easy way to express this in  
numpy. For that matter, I can't even figure out a simple way to do the  
1D version of the same:

composite[i] = images[map[i], i]
where composite and map have shape (m,), and images has shape (n, m).

Can anyone assist? Surely there's something simple that I'm just not  
seeing.

Thanks,

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 2D (or n-d) fancy indexing?

2008-10-08 Thread Zachary Pincus
> You need to give an array for each axis. Each of these arrays will be
> broadcast against each other to form three arrays of the desired shape
> of composite. This is discussed in the manual here:
>
>  
> http://mentat.za.net/numpy/refguide/indexing.xhtml#indexing-multi-dimensional-arrays
>
> Conceptually, you need arrays A, B, and C such that
>
>  composite[x,y] == images[A[x,y], B[x,y], C[x,y]]
>  for all x,y

Aha -- thanks especially for the clear illustration of what B and C  
need to be. That really helps.

> In [32]: composite = images[map, arange(Nx)[:,newaxis], arange(Ny)]
>
> When arange(Nx)[:,newaxis] and arange(Ny) get broadcasted with map,
> you get (480,640) arrays like you would get with mgrid[0:Nx,0:Ny].

That's very handy indeed. Thanks for your help!
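For the archives, a small self-contained version of the recipe ('index_map' here stands in for the 'map' array above, renamed to avoid shadowing the builtin):

import numpy

n, nx, ny = 3, 4, 5
images = numpy.random.rand(n, nx, ny)
index_map = numpy.random.randint(0, n, size=(nx, ny))

composite = images[index_map, numpy.arange(nx)[:, numpy.newaxis], numpy.arange(ny)]

# the same thing, spelled out with explicit loops
for x in range(nx):
    for y in range(ny):
        assert composite[x, y] == images[index_map[x, y], x, y]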

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 2D (or n-d) fancy indexing?

2008-10-09 Thread Zachary Pincus
> http://mentat.za.net/numpy/numpy_advanced_slides/

Those slides are really useful! Thanks a ton.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] import 16-bit tiff - byte-order problem?

2008-11-07 Thread Zachary Pincus
Hi,

The PIL has some fundamental architectural problems that prevent it  
from dealing easily with 16-bit TIFFs, which are exacerbated on little- 
endian platforms. Add to this a thin sheen of various byte-order bugs  
and other problems in the __array_interface__, and it's really hard to  
get consistent loading of 16-bit tiffs across platforms. (I and many  
others have submitted patches for these issues to no avail.)

A while ago, I tried to see if I could graft the image file format  
parsers from the PIL onto a byte-loading backend that used numpy.  
Unfortunately, I really couldn't -- the parsers assume too much about  
the problematic architecture of the PIL.

I do have a private fork of the PIL that I made which remedies the  
bugs and above-mentioned architectural issues, and works cross- 
platform, and with any endian system. (It's very restricted compared  
to the regular PIL -- basically it just does image IO and then  
converts to numpy arrays.) I haven't released this because I don't  
really want to make trouble -- and we're promised that a major  
revision of the PIL is in the offing which will fix these troubles --  
but I'm happy to send the code out to those who actually need reliable  
16-bit image IO.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Compiler options for mingw?

2008-11-28 Thread Zachary Pincus
Hi all,

I'm curious about how to control compiler options for mingw builds of  
numpy on windows... Specifically, I want to build binaries without SSE  
support, so that they can run on older hardware.

Setting a CFLAGS variable on the command-line doesn't appear to do  
anything, but perhaps appearances are deceiving.

Thanks for any suggestions -- I've googled fruitlessly for a while.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiler options for mingw?

2008-11-28 Thread Zachary Pincus
>> I'm curious about how to control compiler options for mingw builds of
>> numpy on windows... Specifically, I want to build binaries without  
>> SSE
>> support, so that they can run on older hardware.
>
> The windows binaries of numpy can run on machines without SSE support.
> If for some reason you want to build it by yourself, you just need a
> BLAS/LAPACK without SSE - assuming you want BLAS/LAPACK. Note that
> depending on your intent, all the scripts to generate the full binary
> (which install the most optimized binary depending on the detected
> arch) are in svn too, if that's something you want to do.
>
> I have recently remove the arch specific optimization in
> numpy.distutils, so this should not be a problem either.

Thanks for the information, David!

Regarding the windows binary installer, it appears that it selects the  
optimized/unoptimized binary at *install time*, which causes  
complications for further bundling of numpy with e.g. py2exe for use  
on other machines. (E.g. my dev machine has SSE3, and it appears that  
SSE3-optimized binaries get installed on it, which then causes  
crashing when I bundle up a numpy script with py2exe and run it on an  
older box.) Is this in fact the case? If so, is there any easy way to  
force the installer to  just use the basic unoptimized configuration?  
That would be the best...

On the other hand, if I'm using an SSE-free BLAS/LAPACK or none at all,  
there'll be no SSE optimization done? I understand that gcc4, and thus  
mingw derived from that version, will automatically try to use sse  
instructions where possible if not specifically disabled, which is  
what induced my original question. So, to be certain that gcc isn't  
introducing sse instructions under the covers, I would still like to  
know if there's a way to pass compiler flags to the build stage of  
numpy. On UNIX, CFLAGS seems to do the trick, but on windows with  
mingw the flags don't seem to be recognized... (E.g. setting CFLAGS to  
'-mfoo' causes an invalid option error on OS X, but not with windows.)

Thanks again,
Zach


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.loadtxt : yet a new implementation...

2008-12-02 Thread Zachary Pincus
Hi Pierre,

I've tested the new loadtxt briefly. Looks good, except that there's a  
minor bug when trying to use a specific white-space delimiter (e.g.  
\t) while still allowing other white-space to be allowed in fields  
(e.g. spaces).

Specifically, on line 115 in LineSplitter, we have:
 self.delimiter = delimiter.strip() or None
so if I pass in, say, '\t' as the delimiter, self.delimiter gets set  
to None, which then causes the default behavior of any-whitespace-is- 
delimiter to be used. This makes lines like  
"Gene Name\tPubMed ID\tStarting Position" get split wrong, even when I  
explicitly pass in '\t' as the delimiter!

Similarly, I believe that some of the tests are formulated wrong:
 def test_nodelimiter(self):
 "Test LineSplitter w/o delimiter"
 strg = " 1 2 3 4  5 # test"
 test = LineSplitter(' ')(strg)
 assert_equal(test, ['1', '2', '3', '4', '5'])

I think that treating an explicitly-passed-in ' ' delimiter as  
identical to 'no delimiter' is a bad idea. If I say that ' ' is the  
delimiter, or '\t' is the delimiter, this should be treated *just*  
like ',' being the delimiter, where the expected output is:
['1', '2', '3', '4', '', '5']

At least, that's what I would expect. Treating contiguous blocks of  
whitespace as single delimiters is perfectly reasonable when None is  
provided as the delimiter, but when an explicit delimiter has been  
provided, it strikes me that the code shouldn't try to further- 
interpret it...
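To make the behavior I'm arguing for concrete, here's a tiny standalone sketch (just the splitting rule I'd expect, not the proposed LineSplitter itself):

def split_line(line, delimiter=None, comment='#'):
    # an explicit delimiter -- even a tab or a single space -- is honored
    # literally; only delimiter=None collapses arbitrary runs of whitespace
    line = line.split(comment)[0].strip()
    if delimiter is None:
        return line.split()
    return line.split(delimiter)

split_line("Gene Name\tPubMed ID\tStarting Position", delimiter='\t')
# -> ['Gene Name', 'PubMed ID', 'Starting Position']
split_line(" 1 2 3 4  5 # test", delimiter=' ')
# -> ['1', '2', '3', '4', '', '5']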

Does anyone else have any opinion here?

Zach


On Dec 1, 2008, at 1:21 PM, Pierre GM wrote:

> Well, looks like the attachment is too big, so here's the  
> implementation. The tests will come in another message.
>
> 
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiler options for mingw?

2008-12-04 Thread Zachary Pincus
> I needed it to help me fixing a couple of bugs for old CPU, so it
> ended up being implemented in the nsis script for scipy now (I will
> add it to numpy installers too). So from now, any newly releases of
> both numpy and scipy installers could be overriden:
>
> installer-name.exe /arch native -> default behavior
> installer-name.exe /arch nosse -> Force installation wo sse, even if
> SSE-cpu is detected.
>
> It does not check that the option is valid, so you can end up
> requesting SSE3 installer on a SSE2 CPU. But well...

Cool! Thanks! This will be really useful...

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Thoughts on persistence/object tracking in scientific code

2008-12-29 Thread Zachary Pincus
This looks really cool -- thanks Luis.

Definitely keep us posted as this progresses, too.

Zach


On Dec 29, 2008, at 4:41 PM, Luis Pedro Coelho wrote:

> On Monday 29 December 2008 14:51:48 Luis Pedro Coelho wrote:
>> I will make the git repository publicly available once I figure out  
>> how to
>> do that.
>
> You can get my code with:
>
> git clone http://coupland.cbi.cmu.edu/jug
>
> As I said, I consider this alpha code and am only making it publicly  
> available
> at this stage because it came up. The license is LGPL.
>
> bye,
> Luis
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] array copy-to-self and views

2007-02-01 Thread Zachary Pincus
Hello folks,

I recently was trying to write code to modify an array in-place (so  
as not to invalidate any references to that array) via the standard  
python idiom for lists, e.g.:

a[:] = numpy.flipud(a)

Now, flipud returns a view on 'a', so assigning that to 'a[:]'  
provides pretty strange results as the buffer that is being read (the  
view) is simultaneously modified. Here is an example:

In [2]: a = numpy.arange(10).reshape((5,2))
In [3]: a
Out[3]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
In [4]: numpy.flipud(a)
Out[4]:
array([[8, 9],
[6, 7],
[4, 5],
[2, 3],
[0, 1]])
In [5]: a[:] = numpy.flipud(a)
In [6]: a
Out[6]:
array([[8, 9],
[6, 7],
[4, 5],
[6, 7],
[8, 9]])

A question, then: Does this represent a bug? Or perhaps there is a  
better idiom for modifying an array in-place than 'a[:] = ...'? Or is  
incumbent on the user to ensure that any time an array is directly  
modified, that the modifying array is not a view of the original array?

Thanks for any thoughts,

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] array copy-to-self and views

2007-02-01 Thread Zachary Pincus
> Zachary Pincus wrote:
>> Hello folks,
>>
>> I recently was trying to write code to modify an array in-place (so
>> as not to invalidate any references to that array)
>
> I'm not sure what this means exactly.

Say one wants to keep two different variables referencing a single in- 
memory list, as so:
a = [1,2,3]
b = a
Now, if 'b' and 'a' go to live in different places (different class  
instances or whatever) but we want 'b' and 'a' to always refer to the  
same in-memory object, so that 'id(a) == id(b)', we need to make sure  
to not assign a brand new list to either one.

That is, if we do something like 'a = [i + 1 for i in a]' then 'id 
(a) != id(b)'. However, we can do 'a[:] = [i + 1 for i in a]' to  
modify a in-place. This is not super-common, but it's also not an  
uncommon python idiom.

I was in my email simply pointing out that naïvely translating that  
idiom to the numpy case can cause unexpected behavior in the case of  
views.

I think that this is is unquestionably a bug -- isn't the point of  
views that the user shouldn't need to care if a particular array  
object is a view or not? Given the lack of methods to query whether  
an array is a view, or what it might be a view on, this seems like a  
reasonable perspective... I mean, if certain operations produce  
completely different results when one of the operands is a view, that  
*seems* like a bug. It might not be worth fixing, but I can't see how  
that behavior would be considered a feature.

However, I do think there's a legitimate question about whether it  
would be worth fixing -- there could be a lot of complicated checks  
to catch these kind of corner cases.

>> via the standard
>> python idiom for lists, e.g.:
>>
>> a[:] = numpy.flipud(a)
>>
>> Now, flipud returns a view on 'a', so assigning that to 'a[:]'
>> provides pretty strange results as the buffer that is being read (the
>> view) is simultaneously modified.
>
> yes, weird. So why not just:
>
> a = numpy.flipud(a)
>
> Since flipud returns a view, the new "a" will still be using the same
> data array. Does this satisfy your need above?

Nope -- though 'a' and 'numpy.flipud(a)' share the same data, the  
actual ndarray instances are different. This means that any other  
references to the 'a' array (made via 'b = a' or whatever) now refer  
to the old 'a', not the flipped one.

The only other option for sharing arrays is to encapsulate them as  
attributes of *another* object, which itself won't change. That seems  
a bit clumsy.

> It's too bad that to do this you need to know that flipud created a
> view, rather than a copy of the data, as if it were a copy, you would
> need to do the a[:] trick to make sure a kept the same data, but  
> that's
> the price we pay for the flexibility and power of numpy -- the
> alternative is to have EVERYTHING create a copy, but there were be a
> substantial performance hit for that.

Well, Anne's email suggests another alternative -- each time a view  
is created, keep track of the original array from whence it came, and  
then only make a copy when collisions like the above would take place.
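In the meantime, the user-side fix is simply to force the copy by hand -- a minimal sketch:

import numpy

a = numpy.arange(10).reshape((5, 2))
b = a                              # a second reference to the same array object

a[:] = numpy.flipud(a).copy()      # copy first, so the source buffer isn't also the write target

# b is still the very same object as a, and both now hold the flipped data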

And actually, I suspect that views already need to keep a reference  
to their original array in order to keep that array from being  
deleted before the view is. But I don't know the guts of numpy well  
enough to say for sure.

> NOTE: the docstring doesn't make it clear that a view is created:
>
>>>> help(numpy.flipud)
> Help on function flipud in module numpy.lib.twodim_base:
>
> flipud(m)
>  returns an array with the columns preserved and rows flipped in
>  the up/down direction.  Works on the first dimension of m.
>
> NOTE2: Maybe these kinds of functions should have an optional flag  
> that
> specified whether you want a view or a copy -- I'd have expected a  
> copy
> in this case!

Well, it seems like in most cases one does not need to care whether  
one is looking at a view or an array. The only time that comes to  
mind is when you're attempting to modify the array in-place, e.g.
a[<indices>] = <new values>

Even if the maybe-bug above were easily fixable (again, not sure  
about that), you might *still* want to be able to figure out if a  
were a view before such a modification. Whether this needs a runtime  
'is_view' method, or just consistent documentation about what returns  
a view, isn't clear to me. Certainly the latter couldn't hurt.

> QUESTION:
> How do you tell if two arrays are views on the same data: is  
> checking if
> they have the same .base reliable?
>
>>>> a = numpy.array((1,2,3,4))

Re: [Numpy-discussion] array copy-to-self and views

2007-02-01 Thread Zachary Pincus
>> A question, then: Does this represent a bug? Or perhaps there is a
>> better idiom for modifying an array in-place than 'a[:] = ...'? Or is
>> incumbent on the user to ensure that any time an array is directly
>> modified, that the modifying array is not a view of the original  
>> array?
>>
>>
> Yes, it is and has always been incumbent on the user to ensure that  
> any
> time an array is directly modified in-place that the modifying  
> array is
> not a "view" of the original array.

Fair enough. Now, how does a user ensure this -- say someone like me,  
who has been using numpy (et alia) for a couple of years, but clearly  
not long enough to have an 'intuitive' feel for every time something  
might be a view (a feeling that must seem quite natural to long-time  
numpy users, who may have forgotten precisely how long it takes to  
develop that level of intuition)?

Documentation of what returns views helps, for sure. Would any other  
'training' mechanisms help? Say a function that (despite Tim's pretty  
reasonable 'don't do that' warning) will return true when two arrays  
have overlapping memory? Or an 'inplace_modify' function that takes  
the time to make that check?
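Something like the following, say -- a rough sketch that only chases .base up to a common owner (so it errs on the side of reporting overlap, rather than checking actual byte ranges):

import numpy

def possibly_same_memory(a, b):
    def owner(x):
        while isinstance(x, numpy.ndarray) and x.base is not None:
            x = x.base
        return x
    return owner(a) is owner(b)

a = numpy.arange(10)
v = a[2:5]
# possibly_same_memory(a, v)         -> True
# possibly_same_memory(a, a.copy())  -> False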

Perhaps I'm the first to have views bite me in this precise way.  
However, if there are common failure-modes with views, I hope it's  
not too unreasonable to ask about ways that those common problems  
might be addressed. (Other than just saying "train for ten years, and  
you too will have numpy-fu, my son.") Giving newbies tools to deal  
with common problems with admittedly "dangerous" constructs might be  
useful.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] linalg.eigh orders eigenvalues/eigenvectors differently than linalg.eig

2007-02-19 Thread Zachary Pincus
Hello all,

It seems that the 'eigh' routine from numpy.linalg does not follow  
the same convention as numpy.linalg.eig in terms of the order of the  
returned eigenvalues. (And thus eigenvectors as well...)

Specifically, eig returns eigenvalues in order from largest to  
smallest, while eigh returns them from smallest to largest.

Example:
 >>> a = numpy.array([[21, 28, 35],[28, 38, 48],[35, 48, 61]])
 >>> numpy.linalg.eigh(a)
(array([ -1.02825542e-14,   7.04131679e-01,   1.19295868e+02]),
array([[ 0.40824829, -0.81314396, -0.41488581],
[-0.81649658, -0.12200588, -0.56431188],
[ 0.40824829,  0.56913221, -0.71373795]]))

 >>> numpy.linalg.eig(a)
(array([  1.19295868e+02,   7.04131679e-01,   4.62814557e-15]),
array([[-0.41488581, -0.81314396,  0.40824829],
[-0.56431188, -0.12200588, -0.81649658],
[-0.71373795,  0.56913221,  0.40824829]]))

Is this a bug? If it is, though, fixing it now might break code that  
depends on this 'wrong' order. (This is also present in  
scipy.linalg.) If not a bug, or not-fixable-now, then at least some  
documentation as to the convention regarding ordering of eigenvalues  
is probably worthwhile...
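(In the meantime, callers can of course impose an order by hand -- a small sketch using the example above:)

import numpy

a = numpy.array([[21, 28, 35], [28, 38, 48], [35, 48, 61]])
w, v = numpy.linalg.eigh(a)

order = w.argsort()[::-1]    # largest eigenvalue first, matching what eig happened to return
w = w[order]
v = v[:, order]              # keep the eigenvector columns matched to their eigenvalues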

Any thoughts?

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Distributing prebuilt numpy and other extensions

2007-02-20 Thread Zachary Pincus
Hello folks,

I've developed some command-line tools for biologists using python/ 
numpy and some custom C and Fortran extensions, and I'm trying to  
figure out how to easily distribute them...

For people using linux, I figure a source distribution is no problem  
at all. (Right?)
On the other hand, for Mac users (whose computers by default don't  
have the dev tools, and even then would need to get a fortran  
compiler elsewhere) I'd like to figure out something a bit easier.

I'd like to somehow provide an installer (double-clickable or python  
script) that does a version check and then installs an appropriate  
version of prebuilt binaries for numpy and my C and Fortran  
extensions. Is this possible within the bounds of the python or numpy  
distutils? Would setuptools be a better way to go? Preferably it  
would be a dead easy, one-step thing...

Or is this whole idea problematic, and better to stick with source  
distribution in all cases?

Thanks for any advice,

Zach Pincus


Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Greek Letters

2007-02-20 Thread Zachary Pincus
I have found that the python 'unicode name' escape sequence, combined  
with the canonical list of unicode names ( http://unicode.org/Public/ 
UNIDATA/NamesList.txt ), is a good way of getting the symbols you  
want and still keeping the python code legible.

 From the above list, we see that the symbol name we want is GREEK  
SMALL LETTER CHI, so:
chi = u'\N{GREEK SMALL LETTER CHI}'
will do the trick. For chi^2, use:
chi2 = u'\N{GREEK SMALL LETTER CHI}\N{SUPERSCRIPT TWO}'

Note that to print these characters, we usually need to encode them  
somehow. My terminal supports UTF-8, so the following works for me:
import codecs
print codecs.encode(chi2, 'utf8')

giving (if your mail reader supports utf8 and mine encodes it  
properly...):
χ²

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine


On Feb 20, 2007, at 3:56 PM, Mark Janikas wrote:

> Hello all,
>
>
>
> I was wondering how I could print the chi-squared symbol in  
> python.  I have been looking at the Unicode docs, but I figured I  
> would ask for assistance here while I delve into it.  Thanks for  
> any help in advance.
>
>
>
> Mark Janikas
>
> Product Engineer
>
> ESRI, Geoprocessing
>
> 380 New York St.
>
> Redlands, CA 92373
>
> 909-793-2853 (2563)
>
> [EMAIL PROTECTED]
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] what goes wrong with cos(), sin()

2007-02-21 Thread Zachary Pincus
Your results are indeed around zero.

 >>> numpy.allclose(0, 1.22460635382e-016)
True

It's not exactly zero because floating point math is in general not  
exact. You'll need to check out a reference about doing floating  
point operations numerically for more details, but in general you  
should not expect exact results due to the limited precision of any  
fixed-width digital representation of floats.

A corollary: in general, do not compare two floating-point values for equality  
-- use something like numpy.allclose. (Exception -- equality is  
expected if the exact sequence of operations used to generate the two numbers  
was identical.)

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

On Feb 21, 2007, at 10:11 AM, WolfgangZillig wrote:

> Hi,
>
> I'm quite new to numpy/scipy so please excuse if my problem is too  
> obvious.
>
> example code:
>
> import numpy as n
> print n.sin(n.pi)
> print n.cos(n.pi/2.0)
>
> results in:
> 1.22460635382e-016
> 6.12303176911e-017
>
> I've expected something around 0. Can anybody explain what I am doing
> wrong here?
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] what goes wrong with cos(), sin()

2007-02-21 Thread Zachary Pincus
It's true -- blindly using allclose isn't a lot better than blindly  
using equality testing. (Though given the choice between blindly  
using one and blindly using the other, I'd still probably vote for  
allclose... it won't get you quantum mechanics, but it'll do fine for  
a lot of other things.)

On the other hand, *properly* using allclose (e.g. setting the  
absolute and expected relative error tolerances) is better than  
properly using equality testing because in many cases there is no  
proper use for equality testing.
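For instance -- the tolerances below are arbitrary, and the whole point is that you have to choose them from what you know about your problem's scale:

import numpy

h = 6.626e-34   # something tiny but decidedly nonzero (Anne's Planck's-constant example)

numpy.allclose(h, 0)                     # True: the default atol=1e-08 swallows it
numpy.allclose(h, 0, atol=1e-40)         # False: with an absolute tolerance suited to the problem
numpy.allclose(numpy.sin(numpy.pi), 0)   # True, and perfectly reasonable here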

Zach

On Feb 21, 2007, at 10:29 AM, Anne Archibald wrote:

> On 21/02/07, Zachary Pincus <[EMAIL PROTECTED]> wrote:
>
>> A corollary: in general, do not compare two floating-point values for equality
>> -- use something like numpy.allclose. (Exception -- equality is
>> expected if the exact sequence of operations used to generate the two numbers
>> was identical.)
>
> I really can't agree that blindly using allclose() is a good idea. For
> example, allclose(plancks_constant,0) and the difference leads to
> quantum mechanics... you really have to know how much difference you
> expect and how big your numbers are going to be.
>
> Anne
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] the neighbourhood of each element of an array

2007-02-23 Thread Zachary Pincus
Scipy's ndimage module has a function that takes a generic callback  
and calls it with the values of each neighborhood (of a given size,  
and optionally with a particular "mask" footprint) centered on each  
array element. That function handles boundary conditions, etc nicely.

Unfortunately, I'm not sure if it works with masked arrays, and I  
think it hands a ravel'd set of pixels back to the callback function.  
You could probably hack masking in there by passing it the mask  
concatenated to the array, and then deal with the mask explicitly.
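The function in question is scipy.ndimage.generic_filter; a minimal sketch of the plain (unmasked) case:

import numpy
from scipy import ndimage

x = numpy.arange(25, dtype=float).reshape((5, 5))

def neighbourhood_mean(values):
    # 'values' is the ravel'd 3x3 neighbourhood around each element
    return values.mean()

y = ndimage.generic_filter(x, neighbourhood_mean, size=3, mode='reflect')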

Zach


On Feb 23, 2007, at 11:39 AM, Bryan Cole wrote:

> On Fri, 2007-02-23 at 17:38 +0100, [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> Given a (possibly masked) 2d array x, is there a fast(er) way in  
>> Numpy to obtain
>> the same result as the following few lines?
>>
>> d = 1  # neighbourhood 'radius'
>> Nrow = x.shape[0]
>> Ncol = x.shape[1]
>> y = array([[x[i-d:i+d+1,j-d:j+d+1].ravel() for j in range(d,Ncol- 
>> d)]  \
>>for i in range(d,Nrow-d)])
>>
>
> how about something like
>
> er = Nrow - d
> ec = Ncol - d
> y = array([x[i:er+i, j:ec+j] for j in arange(-d,d)
>   for i in arange(-d,d)])
>
> now you're looping over a small array and combining slices of the big
> array (as opposed to looping over the big array and combining slices
> from a small one). This should be faster for large Nrow, Ncol.
>
> BC
>
>
>> What you get is an array containing all the elements in a  
>> neighbourhood for each
>> element, disregarding the edges to avoid out-of-range problems.  
>> The code above
>> becomes quite slow for e.g. a 2000x2000 array. Does anyone know a  
>> better
>> approach?
>>
>> Ciao,
>> Joris
>>
>>
>>
>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Questions about cross-compiling extensions for mac-ppc and mac-intel

2007-02-24 Thread Zachary Pincus
Hi folks,

I've been doing a lot of web-reading on the subject, but have not  
been completely able to synthesize all of the disparate bits of  
advice about building python extensions as Mac-PPC and Mac-Intel fat  
binaries, so I'm turning to the wisdom of this list for a few questions.

My general goal is to make a double-clickable Mac installer of a set  
of tools built around numpy, numpy's distutils, a very hacked-up  
version of PIL, and some fortran code too. To this end, I need to  
figure out how to get the numpy distutils to cross-compile,  
generating PPC code and Intel code in separate builds -- and/or  
generating a universal binary all in one go. (I'd like to distribute  
a universal version of numpy, but I think that my own code needs to  
be built/distributed separately for each architecture due to endian- 
ness issues.)

Is there explicit support in distutils for this, or is it a matter of  
setting the proper environment variables to entice gcc and gfortran  
to generate code for a specific architecture?

One problem is that PIL is a tricky beast, even in the neutered form  
that I'm using it. It does a compile-time check for the endian-ness  
of the system, and a compile-time search for the zlib to use, both of  
which are problematic.

To address the former, I'd like to be able to (say) include something  
like 'config_endian --big' on the 'python setup.py' command-line, and  
have that information trickle down to the PIL config script (a few  
subpackages deep). Is this easy or possible?

To address the latter, I think I need to have the PIL extensions  
dynamically link against '/Developer/SDKs/MacOSX10.4u.sdk/usr/lib/ 
libz.dylib' which is the fat-binary version of the library, using the  
headers from '/Developer/SDKs/MacOSX10.4u.sdk/usr/include/zlib.h
'. Right now, PIL is using system_info from numpy.distutils to find  
the valid library paths on which libz and its headers might live.  
This is nice and more or less platform-neutral, which I like. How  
best should I convince/configure numpy.distutils.system_info to put '/ 
Developer/SDKs/MacOSX10.4u.sdk/usr/{lib,include}' on the output to  
get_include_dirs() and get_lib_dirs()?

Thanks for any advice or counsel,

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] import numpy segmentation fault

2007-03-14 Thread Zachary Pincus
If I recall correctly, there's a bug in numpy 1.0.1 on Linux-x86-64  
that causes this segfault. This is fixed in the latest SVN version of  
numpy, so if you can grab that, it should work.

I can't find the trac ticket, but I ran into this some weeks ago.

Zach



On Mar 14, 2007, at 1:36 PM, Cory Davis wrote:

> Hi there,
>
> I have just installed numpy-1.0.1 from source, which seemed to go
> fine.  However when I try to "import numpy" I get a segmentation
> fault.
>
> I have a 64 bit machine running RedHat Enterprise Linux and Python  
> 2.34
>
> Any clues greatly appreciated.
>
> Cheers,
> Cory.
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-22 Thread Zachary Pincus
Hi James,

Would it be possible to ask Peter if he knows anything that could  
help us resolve scipy ticket 213 ( http://projects.scipy.org/scipy/ 
scipy/ticket/213 )?

The basic issue is that ndimage.spline_filter seems to introduce  
nasty ringing artifacts, which make all of the geometric transforms  
(affine_transform included) perform very badly. Examples of these  
artifacts can be seen in the attachments to this ticket:

http://projects.scipy.org/scipy/scipy/attachment/ticket/213/ 
rotate_artifacts.jpg
http://projects.scipy.org/scipy/scipy/attachment/ticket/213/spline% 
20error.png

If Peter has any suggestions at all as to what's happened, that would  
be very helpful. Hopefully this won't be too much of an imposition,  
but these problems with the spline prefiltering are really hampering  
ndimage, so hopefully it will be worth it.

Thanks,

Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine


On Mar 22, 2007, at 11:34 AM, James Turner wrote:

> The people at STScI put me in touch with Peter Verveer, the author of
> nd_image. Unfortunately Peter is currently unable to maintain the code
> (either in numarray or scipy), but he did send me some comments on the
> problem discussed in this thread. Here's what he said:
>
> James.
>
> -
>
> Hi James,
>
> Yes, it could be that you see some mirroring. Let me first explain  
> what the mode
> options do:
>
> If  you try to interpolate a value that falls outside of the  
> boundaries, then
> that is done by either setting it constant (mode='constant') or by  
> mapping to a
> position inside the boundaries, i.e. by mirroring, and then  
> interpolating. So
> the mode argument really deals with extrapolating.
>
> Problem is when  you are mapping a value that is inside the  
> boundaries, but very
> close. Interpolation is done by splines which require that a window  
> is placed
> around the point you are interpolating, and the result is  
> calculated from the
> values inside that window. Thus, if you are close to the boundary,  
> part of that
> window will fall outside the boundaries, and the algorithm must  
> choose how to
> set the values outside the boundary. Ideally that would be done in  
> the same way
> as above, i.e. you should have the choice if that is done by a  
> constant value,
> or by  mirroring etc.
>
> Unfortunately, the  spline algorithms that I use (references for  
> the algorithm
> are in the manual) have an intrinsic mirroring assumption build in.  
> Thus for
> interpolating values inside the boundaries, a mirror boundary is  
> the only
> option. I did not figure out a way to modify the interpolation  
> algorithms.
> Therefore, if you choose a mode different from mirroring, then you  
> end up with
> small artifacts at the boundaries. Presumably these artifacts will  
> get bigger if
> the order of spline interpolation you select is larger.
>
> So, it's not really a bug, it's an undesired feature...
>
> Cheers, Peter
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-22 Thread Zachary Pincus
Hello all,

> By the way, ringing at sharp edges is an intrinsic feature of higher-
> order spline interpolation, right? I believe this kind of interpolant
> is really intended for smooth (band-limited) data. I'm not sure why
> the pre-filtering makes a difference though; I don't yet understand
> well enough what the pre-filter actually does.
>
> I'm not sure what people normally do in computer graphics, since I'm
> working more with natural band-limited images, but in the case of
> Stefan's "rotate_artifacts" example, wouldn't it be appropriate to
> use the 1st- or 0th-order spline instead of the 2nd order? If your
> real-life data are smooth enough, however, then in theory the
> ringing with higher orders should go away.

James is indeed correct. This whole thing is my mistake based on a  
mis-interpretation of terminology. I've more carefully re-read the  
paper on which the ndimage spline code is based ('Splines: A Perfect  
Fit for Signal/Image Processing' by Michael Unser; it's on citeseer).  
I now (hopefully) understand that the spline "pre-filter", while  
explicitly analogous to a traditional anti-aliasing pre-filter, is  
actually computing the spline coefficients via a filtering-type  
operation. While a traditional anti-aliasing filter should blur an  
image (as a band-limiting step), the fact that the spline pre- 
filter does not is of no concern since the filtered output isn't a  
band-limited set of pixels, but a set of coefficients for a band- 
limited spline.

The actual transform operators then use these coefficients to  
(properly) compute pixel values at different locations. I just  
assumed that the "pre-filtering" was broken (even on natural images  
with smooth variations) because images pre-filtered with the spline  
filter didn't look like traditional pre-filtered images ought.
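In other words, the "pre-filter" output is a set of spline coefficients rather than an image, which is why (if I now understand correctly) the two paths below should agree -- a small sketch on a random array:

import numpy
from scipy import ndimage

a = numpy.random.rand(64, 64)
matrix = numpy.array([[0.7, -0.2], [0.2, 0.7]])

# default path: affine_transform spline-filters the input itself
b1 = ndimage.affine_transform(a, matrix, order=3, prefilter=True)

# explicit path: compute the spline coefficients first, then transform without prefiltering
coeffs = ndimage.spline_filter(a, order=3)
b2 = ndimage.affine_transform(coeffs, matrix, order=3, prefilter=False)

numpy.allclose(b1, b2)   # True -- the "pre-filter" really is computing coefficients, not blurring pixels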

IMO, ticket 213 should be closed as PEBCAK (I'll do that forthwith);  
further I'm sorry to have caused Peter to be further bothered about  
this non-issue.

Zach






On Mar 22, 2007, at 8:13 PM, James Turner wrote:

> By the way, ringing at sharp edges is an intrinsic feature of higher-
> order spline interpolation, right? I believe this kind of interpolant
> is really intended for smooth (band-limited) data. I'm not sure why
> the pre-filtering makes a difference though; I don't yet understand
> well enough what the pre-filter actually does.
>
> I'm not sure what people normally do in computer graphics, since I'm
> working more with natural band-limited images, but in the case of
> Stefan's "rotate_artifacts" example, wouldn't it be appropriate to
> use the 1st- or 0th-order spline instead of the 2nd order? If your
> real-life data are smooth enough, however, then in theory the
> ringing with higher orders should go away.
>
> Sorry if I'm stating the obvious and missing the real point! I just
> wanted to make sure this hasn't been overlooked... Likewise, sorry I
> didn't mention this before if it does turn out to be relevant. Let
> me know if you want references to explain what I said above.
>
> James.
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-23 Thread Zachary Pincus
Hello all,

On Mar 23, 2007, at 3:04 AM, Stefan van der Walt wrote:

> On Thu, Mar 22, 2007 at 11:20:37PM -0700, Zachary Pincus wrote:
>> The actual transform operators then use these coefficients to
>> (properly) compute pixel values at different locations. I just
>> assumed that the "pre-filtering" was broken (even on natural images
>> with smooth variations) because images pre-filtered with the spline
>> filter didn't look like traditional pre-filtered images ought.
>>
>> IMO, ticket 213 should be closed as PEBCAK (I'll do that forthwith);
>> further I'm sorry to have caused Peter to be further bothered about
>> this non-issue.
>
> The ticket is more concerned with the usability of ndimage.
> Currently, the principle of least surprise is violated.  If you use
> the default arguments of ndimage.rotate (except for axes, which should
> still be fixed to be (1,0) by default) and rotate Lena by 30 degrees,
> you get something with green splotches on (attached to the ticket).
>
> IMO, we should either change the default parameters or update the
> documentation before closing the ticket.

Hmm, this is worrisome. There really shouldn't be ringing on  
continuous-tone images like Lena  -- right? (And at no step in an  
image like that should gaussian filtering be necessary if you're  
doing spline interpolation -- also right?)

I assumed the problem was a non-issue after some testing with a few  
natural images didn't introduce any artifacts like those, but now  
with the Lena example I'm not sure. This is frustrating, to be sure,  
mostly because of my limited knowledge on the subject -- just enough  
to be dangerous.

Zach



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-23 Thread Zachary Pincus
Hello again,

On Mar 23, 2007, at 7:56 AM, Travis Oliphant wrote:
>> I'm not sure what people normally do in computer graphics, since I'm
>> working more with natural band-limited images, but in the case of
>> Stefan's "rotate_artifacts" example, wouldn't it be appropriate to
>> use the 1st- or 0th-order spline instead of the 2nd order? If your
>> real-life data are smooth enough, however, then in theory the
>> ringing with higher orders should go away.
>>
>>
> Yes, that is true.  But, you really shouldn't see much ringing with a
> 3rd order spline, though.

There are a couple of different examples of ringing that we've seen  
(specifically, the first and third attachments to scipy ticket 213:
http://projects.scipy.org/scipy/scipy/ticket/213).

The first was on Stefan's artificial data which had sharp edges, and  
got very nasty ringing artifacts even with 3rd order splines. From  
your recollection, is this expected behavior based on splines and the  
nature of Stefan's image, or more likely to be a bug?

The second example is a rotated version of Lena -- an image with  
natural and much smoother edges -- where 3rd order spline  
interpolation still introduced strong ringing (or something nasty).  
This looks more like a bug, but (as per this discussion) my insight  
is clearly limited in this case.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-24 Thread Zachary Pincus
Hello folks,

>> Hmm, this is worrisome. There really shouldn't be ringing on
>> continuous-tone images like Lena  -- right? (And at no step in an
>> image like that should gaussian filtering be necessary if you're
>> doing spline interpolation -- also right?)
>
> That's hard to say. Just because it's mainly a continuous-tone image
> doesn't necessarily mean it is well sampled everywhere. This depends
> both on the subject and the camera optics. Unlike the data I usually
> work with, I think everyday digital photographs (probably a photo
> scan in the case of Lena) do not generally have the detector sampling
> frequency matched to the optical resolution of the image. If that's
> true, the presence of aliasing in interpolated images depends on the
> structure of the subject and whether the scene has edges or high-
> frequency patterns in it.

Based on my reading of the two excellent Unser papers (both the one  
that ndimage's spline code is based on, and the one that Travis  
linked to), it seems like a major point of using splines for  
interpolation is *better* behavior in the case of non-band-limited  
data than the traditional 'anti-aliasing [e.g. lowpass] filter' +  
'resampling' + 'reconstruction [e.g. lowpass] filter' procedure.

That is, based on my (very limited) understanding (I understand the  
basics of the sampling theorem and can follow the Unser paper's  
update thereof, but not much more than that), in the spline case the  
spline fitting step *replaces* the anti-aliasing filter in the above  
procedure as the method for dealing with non-band-limited data. And  
the claim is that it in fact works better in many cases.

So it seems that something is wrong if the spline interpolation tools  
in ndimage only work properly in the band-limited case, or require  
lowpass filtering.


> Stefan's rotated Lena example is indeed a bit bizarre on zooming in!
> However, the artefacts are clearly localized to distinct edges, so I
> suspect this is indeed some kind of aliasing. Moreover, it looks like
> Lena has been decimated (reduced in size) prior to the rotation. That
> is definitely a good way to get artefacts, unless an anti-aliasing
> filter is applied before shrinking the image. My impression is that
> this image is probably somewhat undersampled (to understand exactly
> what that means, read up on the Sampling Theorem).

Agreed, it looks like aliasing. Nevertheless, any resampling  
procedure is supposed to deal with this internally, right? Either by  
lowpass filtering (traditional case), or by spline fitting (spline  
case as described by Unser and understood by me) -- it shouldn't be  
letting aliasing bubble through, correct?

>> The first was on Stefan's artificial data which had sharp edges, and
>> got very nasty ringing artifacts even with 3rd order splines. From
>> your recollection, is this expected behavior based on splines and the
>> nature of Stefan's image, or more likely to be a bug?
>
> Your question was aimed at Travis, so I don't want to discourage him
> from answering it :-), but looking at this in more detail, I do think
> the amplitude of the artefacts here is greater than I might expect due
> to ringing with a quadratic b-spline kernel, which I think has minima
> with amplitudes <10% of the central peak. There has to be SOME
> oscillation, but in Stefan's "rotate_artifacts" example it seems to be
> at the level of ~100%. Also, it is not present on one of the inner
> edges for some reason. So I do wonder if the algorithm in nd_image is
> making this worse than it needs to be.

Good observation! Here are cubic spline fits to a step and a delta  
function, which both have quite small amounts of ringing compared to  
what's observed -- at least on the color images. Maybe 10% ringing in  
each color channel adds up perceptually more than it does  
mathematically?

import numpy, Gnuplot, scipy.interpolate

# step
x = numpy.arange(-10, 10)
y = numpy.where(x > 0, 1, 0)
tck = scipy.interpolate.splrep(x, y, s=0, k=3)
nx = numpy.linspace(-10, 9, 200, True)
ny = scipy.interpolate.splev(nx, tck)
d = Gnuplot.Data(numpy.transpose([x, y]), with='points')
nd = Gnuplot.Data(numpy.transpose([nx,ny]), with='lines')
g = Gnuplot.Gnuplot()
g.plot(d, nd)
ny.min()
# -0.107490563074 -- largest ringing is ~10% of step size

# delta
x = numpy.arange(-10, 10)
y = numpy.where(x == 0, 1, 0)
tck = scipy.interpolate.splrep(x, y, s=0, k=3)
nx = numpy.linspace(-10, 9, 200, True)
ny = scipy.interpolate.splev(nx, tck)
d = Gnuplot.Data(numpy.transpose([x, y]), with='points')
nd = Gnuplot.Data(numpy.transpose([nx,ny]), with='lines')
g = Gnuplot.Gnuplot()
g.plot(d, nd)
ny.min()
# -0.136449610107 -- largest ringing is ~13% of impulse size

Now let's use ndimage to rotate a step function by 45 degrees, or zoom in on it:

import numpy, scipy.ndimage
o = numpy.ones((100, 50), dtype=float)
z = numpy.zeros((100, 50), dtype=float)
a = numpy.concatenate([o, z], axis=1)
b = scipy.ndimage.rotate(a, 45, order=3)
b.max()
# 1.088323

Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-24 Thread Zachary Pincus
Hello all,

Mystery solved, I think.

I downloaded Stéfan's Lena images and tried rotating them myself. As  
far as I can tell, the artifacts are caused by numerical overflow, as  
Lena is an unsigned 8-bit image.

If Lena is converted to floating-point before the rotation is  
applied, and then the intensity range is clipped to [0,255] and  
converted back to uint8 before saving, everything looks fine.
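
In code, the workaround looks roughly like this (an illustrative sketch; 'lena_uint8' just stands in for the 8-bit image loaded from Stéfan's file):

import numpy, scipy.ndimage
lena_float = lena_uint8.astype(numpy.float64)
rotated = scipy.ndimage.rotate(lena_float, 30, order=3)
rotated_uint8 = numpy.clip(rotated, 0, 255).astype(numpy.uint8)  # clip before converting back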

So, the "problem" was indeed the ringing that spline interpolation  
introduces. Despite the fact that this ringing was on the order of a  
few percent (as shown earlier), that few percent made a big  
difference when it caused intensity values to ring over the top  
or under the bottom of the numerical type's range. Thus the bizarre  
wrap-around artifacts only in certain edge regions.

When using floating-point images throughout, the spline interpolation  
looks great even on the step-function images, up to order 4 and 5  
where the ringing gets (slightly) noticeable.

So, is this a bug? Well, I still think so. Given that small ringing  
is going to happen on all but the very smoothest images, and given  
that ndimage is going to be used on non-floating-point types, it  
would be good if there were some explicit internal clipping to the  
data type's range. Otherwise, the ndimage resampling tools are unfit  
for use on non-floating-point data that resides near the edges of the  
range of the data type.

Though I'm not quite sure how one would structure the calculations so  
that it would be possible to tell when over/underflow happened... it  
might not be possible. In which case, either the tools should use  
floating-point math at some of the steps internally (as few as  
possible) before clipping and converting to the required data type,  
or explicit warnings should be added to the documentation.

Zach




On Mar 24, 2007, at 1:58 PM, Stefan van der Walt wrote:

> On Sat, Mar 24, 2007 at 01:41:21AM -0400, James Turner wrote:
>> That's hard to say. Just because it's mainly a continuous-tone image
>> doesn't necessarily mean it is well sampled everywhere. This depends
>> both on the subject and the camera optics. Unlike the data I usually
>> work with, I think everyday digital photographs (probably a photo
>> scan in the case of Lena) do not generally have the detector sampling
>> frequency matched to the optical resolution of the image. If that's
>> true, the presence of aliasing in interpolated images depends on the
>> structure of the subject and whether the scene has edges or high-
>> frequency patterns in it.
>
> Agreed, but the aliasing effects isn't not the problem here, as it
> should be visible in the input image as well.  I'd expect a
> third-order spline interpolation to be more smooth than a first-order
> interpolant, but in the resulting images this isn't the case.
>
> See
>
> http://mentat.za.net/results/lena_small.png
> http://mentat.za.net/results/img_rot_30_1.png (1st order spline)
> http://mentat.za.net/results/img_rot_30_3.png (3rd order spline)
>
>> Lena has been decimated (reduced in size) prior to the rotation. That
>> is definitely a good way to get artefacts, unless an anti-aliasing
>> filter is applied before shrinking the image. My impression is that
>> this image is probably somewhat undersampled (to understand exactly
>> what that means, read up on the Sampling Theorem).
>
> The artefacts arn't visible in the source image (url above).  The
> image definately is a scaled down version of the original Lena -- very
> interesting, btw, see
>
> http://www.cs.cmu.edu/~chuck/lennapg/lenna.shtml
>
>> investigating it further. One experiment might be to blur the  
>> original
>> Lena with a Gaussian whose sigma is 1 pixel of the shrunken image
>> before actually shrinking her, then do the rotation.
>
> A rotation should take place without significant shifts in colour.
> This almost looks like a value overflow problem.
>
>> So I do wonder if the algorithm in nd_image is making this worse
>> than it needs to be.
>
> That is my suspicion, too.
>
>> compare the results? I just tried doing a similar rotation in  
>> PyRAF on
>> a monochrome image with a bicubic spline, and see considerably  
>> smaller
>> artefacts (just a compact overshoot of probably a few % at the
>> edge).
>
> Could you apply the PyRAF rotation on the Lena given above and post
> the result?
>
> I always thought we could simply revert to using bilinear and bicubic
> polygon interpolation (instead of spline interpolation), but now I
> read on wikipedia:
>
> """
> In the mathematical subfield of numerical analysis, spline
> interpolation is a form of interpolation where the interpolant is a
> special type of piecewise polynomial called a spline. Spline
> interpolation is preferred over polynomial interpolation because the
> interpolation error can be made small even when using low degree
> polynomials for the spline. Thus, spline interpolation avoids the
> problem of Runge's phenomenon which occurs when using high deg

Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-26 Thread Zachary Pincus
Thanks for the information and the paper link, James. I certainly  
appreciate the perspective, and now see why the anti-aliasing and  
reconstruction filtering might best be left to clients of a  
resampling procedure.

Hopefully at least some of the kinks in the spline interpolation (to  
date: obligate mirror boundary conditions, extra edge values, integer  
overflow) can be smoothed out.
I can't guarantee that I'll have the time or expertise to deal with  
ni_interpolation.c, but I'll try to give it a shot some time here.

Zach


On Mar 26, 2007, at 1:16 AM, James Turner wrote:

> PS... (sorry for all the posts, for anyone who isn't interested...)
>
>> Agreed, it looks like aliasing. Nevertheless, any resampling
>> procedure is supposed to deal with this internally, right? Either by
>> lowpass filtering (traditional case), or by spline fitting (spline
>> case as described by Unser and understood by me) -- it shouldn't be
>> letting aliasing bubble through, correct?
>
> In the general case, I don't think it is appropriate for the  
> resampling
> procedure to use low-pass filtering internally to avoid artefacts,
> except perhaps when downsampling. It probably makes sense for computer
> graphics work, but there are cases where the input data are band
> limited to begin with and any degradation in resolution is  
> unacceptable.
> Where needed, I think low-pass filtering should either be the
> responsibility of the main program or an option. It's not even  
> possible
> for the resampling procedure to prevent artefacts in every case, since
> the aliasing in a badly undersampled image cannot be removed post
> factum (this is for undersampled photos rather than artificial  
> graphics,
> which I think are fundamentally different because everything is  
> defined
> on the grid, although I haven't sat down and proved it  
> mathematically).
> I'm also not sure how the procedure could decide on the level of
> smoothing needed for a given dataset without external information.
>
> Of course intermediate-order splines will probably keep everyone  
> happy,
> being reasonably robust against ringing effects without causing much
> smoothing or interpolation error :-).
>
> By the way, I think you and Stefan might be interested in a medical
> imaging paper by Lehmann et al. (1999), which gives a very nice  
> overview
> of the properties of different interpolation kernels:
>
> http://ieeexplore.ieee.org/Xplore/login.jsp?url=/ 
> iel5/42/17698/00816070.pdf?arnumber=816070
>
> For what it's worth, I'd agree with both of you that the numeric
> overflow should be documented if not fixed. It sounds like Stefan has
> figured out a solution for it though. If you make sense of the code in
> "ni_interpolation.c", Stefan, I'd be very interested in how to make it
> calculate one less value at the edges :-).
>
> Cheers,
>
> James.
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New Operators in Python

2007-03-26 Thread Zachary Pincus
Hi folks,

Sorry to rain on this parade, but unicode variable names and/or other  
syntactic elements have already been rejected for Python 3:

http://www.python.org/dev/peps/pep-3099/
> Python 3000 source code won't use non-ASCII Unicode characters for  
> anything except string literals or comments.
To tell the truth, in my code (for an n=1 example), I use elementwise  
(or broadcasted) multiplication about an order of magnitude more than  
matrix multiplication. Now, there's plenty of linear algebra in my  
work, but it's usually boxed off enough from the rest that converting  
everything to matrices, or using the 'dot' function (etc.), is just  
fine.

Personally, I even prefer the current way that numpy does things --  
I've always *really* disliked matlab's special syntax for elementwise  
multiplication. Now, clearly there are cases where this is handy, but  
at least in looking over my code, I find that those cases are quite  
rare, really.

So, in large part, I really don't feel a strong need for new  
operators in numpy. (And in the rest of python too! The @ decorator  
pie-syntax is quite enough for line-noise characters, in my personal  
opinion. And I think that python-dev pretty well agrees too, based on  
the raging flame wars about adding even that.)

Now, this is of course just my single opinion, but folks should  
recall that even C++, which rarely met a feature that it didn't like,  
drew the line at adding extra syntactic operators to the language  
that existed only to be overridden/implemented by users. (Which is  
precisely what's being proposed here.)

Anyhow, feel free to disagree with me -- I'm no expert here. I'm only  
mentioning this as a public service to make it clear that most of  
what's being proposed in this thread is, for better or worse, 100%  
dead-in-the-water for Python 3, and the rest will have a fairly  
significant uphill battle.

Zach




On Mar 26, 2007, at 2:42 AM, dmitrey wrote:

> The unicode keyboards sailing everywhere is just a matter of time
> And python 2-symbol operators soon will look obsolete, this will
> increase migrating from python to Sun fortress etc. I took a look at
> their unicode syntax for math formulas
> http://research.sun.com/projects/plrg/faq/NAS-CG.pdf
> it looks (vs Python) (or even MATLAB) like Pentium 4 vs Intel 386
> processors.
> It just allows copy-paste from articles of many formulas, including
> "ro", 'beta' and other non-ascii symbols
> Also, implementing unicode will allow separate operators for (for
> example) MATLAB cross() equivalent (vector multiplication of vectors).
> WBR, D.
>
> René Bastian wrote:
>> Hello,
>>
>> I am interest both in numarray type multiplication and matrix type
>> multiplication.
>>
>> But I am not shure that I can buy an Unicode keyboard.
>>
>> May be it would be possible to implement a class with
>> user "definissable" (?) signs.
>>
>> My choice :
>>
>> a * b -> numarray type multi
>> a !* b -> matrix
>>
>>
>>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-26 Thread Zachary Pincus

Hello folks,


Hmm, this is worrisome. There really shouldn't be ringing on
continuous-tone images like Lena  -- right? (And at no step in an
image like that should gaussian filtering be necessary if you're
doing spline interpolation -- also right?)


That's hard to say. Just because it's mainly a continuous-tone image
doesn't necessarily mean it is well sampled everywhere. This depends
both on the subject and the camera optics. Unlike the data I usually
work with, I think everyday digital photographs (probably a photo
scan in the case of Lena) do not generally have the detector sampling
frequency matched to the optical resolution of the image. If that's
true, the presence of aliasing in interpolated images depends on the
structure of the subject and whether the scene has edges or high-
frequency patterns in it.


Based on my reading of the two excellent Unser papers (both the one  
that ndimage's spline code is based on, and the one that Travis  
linked to), it seems like a major point of using splines for  
interpolation is *better* behavior in the case of non-band-limited  
data than the traditional 'anti-aliasing [e.g. lowpass] filter' +  
'resampling' + 'reconstruction [e.g. lowpass] filter' procedure.


That is, based on my (very limited) understanding (I understand the  
basics of the sampling theorem and can follow the Unser paper's  
update thereof, but not much more than that), in the spline case the  
spline fitting step *replaces* the anti-aliasing filter in the above  
procedure as the method for dealing with non-band-limited data. And  
the claim is that it in fact works better in many cases.


So it seems that something is wrong if the spline interpolation tools  
in ndimage only work properly in the band-limited case, or require  
lowpass filtering.




Stefan's rotated Lena example is indeed a bit bizarre on zooming in!
However, the artefacts are clearly localized to distinct edges, so I
suspect this is indeed some kind of aliasing. Moreover, it looks like
Lena has been decimated (reduced in size) prior to the rotation. That
is definitely a good way to get artefacts, unless an anti-aliasing
filter is applied before shrinking the image. My impression is that
this image is probably somewhat undersampled (to understand exactly
what that means, read up on the Sampling Theorem).


Agreed, it looks like aliasing. Nevertheless, any resampling  
procedure is supposed to deal with this internally, right? Either by  
lowpass filtering (traditional case), or by spline fitting (spline  
case as described by Unser and understood by me) -- it shouldn't be  
letting aliasing bubble through, correct?



The first was on Stefan's artificial data which had sharp edges, and
got very nasty ringing artifacts even with 3rd order splines. From
your recollection, is this expected behavior based on splines and the
nature of Stefan's image, or more likely to be a bug?


Your question was aimed at Travis, so I don't want to discourage him
from answering it :-), but looking at this in more detail, I do think
the amplitude of the artefacts here is greater than I might expect due
to ringing with a quadratic b-spline kernel, which I think has minima
with amplitudes <10% of the central peak. There has to be SOME
oscillation, but in Stefan's "rotate_artifacts" example it seems to be
at the level of ~100%. Also, it is not present on one of the inner
edges for some reason. So I do wonder if the algorithm in nd_image is
making this worse than it needs to be.


Good observation! Here are cubic spline fits to a step and a delta  
function, which both have quite small amounts of ringing compared to  
what's observed -- at least on the color images. Maybe 10% ringing in  
each color channel adds up perceptually more than it does  
mathematically?


import numpy, Gnuplot, scipy.interpolate

# step
x = numpy.arange(-10, 10)
y = numpy.where(x > 0, 1, 0)
tck = scipy.interpolate.splrep(x, y, s=0, k=3)
nx = numpy.linspace(-10, 9, 200, True)
ny = scipy.interpolate.splev(nx, tck)
d = Gnuplot.Data(numpy.transpose([x, y]), with='points')
nd = Gnuplot.Data(numpy.transpose([nx,ny]), with='lines')
g = Gnuplot.Gnuplot()
g.plot(d, nd)
ny.min()
# -0.107490563074 -- largest ringing is ~10% of step size

# delta
x = numpy.arange(-10, 10)
y = numpy.where(x == 0, 1, 0)
tck = scipy.interpolate.splrep(x, y, s=0, k=3)
nx = numpy.linspace(-10, 9, 200, True)
ny = scipy.interpolate.splev(nx, tck)
d = Gnuplot.Data(numpy.transpose([x, y]), with='points')
nd = Gnuplot.Data(numpy.transpose([nx,ny]), with='lines')
g = Gnuplot.Gnuplot()
g.plot(d, nd)
ny.min()
# -0.136449610107 -- largest ringing is ~13% of impulse size

(sample plots are attached for reference).

Now let's use ndimage to rotate a step function by 45 degrees, or zoom in on it:


import numpy, scipy.ndimage
o = numpy.ones((100, 50), dtype=float)
z = numpy.zeros((100, 50), dtype=float)
a = numpy.concatenate([o, z], axis=1)
b = scipy.ndimage.rotate(a, 45, order=3)
b.max()
# 1.08832325055
b.min()
# -0.

Re: [Numpy-discussion] matrix indexing question

2007-03-26 Thread Zachary Pincus
> Since matrices are an iterable Python object,
> we *expect* to iterate over the contained objects.
> (Arrays.)  I am not sure why this is not evident to all,
> but it is surely the sticking point in this discussion.
>
> A matrix is not a container of matrices.
> That it acts like one is surprsing.
> Surprises are bad unless they have a clear justification.
> Perhaps a clear justification exists,
> but it has not been offered in this discussion.

I think that a clear justification has been offered several times on  
the list recently, though maybe not all in this thread.

Matrices in numpy seem to exist as a construct to facilitate linear  
algebra. One such property is that row- and column-vector slices of  
matrices are (1xN) or (Nx1) matrices, because otherwise common linear  
algebra things -- like how you multiply a row-vector or a column  
vector by a matrix, and whether and when it needs to be transposed --  
do not translate properly from "linear algebra notation" like in  
textbooks and papers.

Once the matrix class is committed to providing row- and column- 
vector slices as other matrices, it makes no sense to have iteration  
over the matrix provide a different data type than slicing.
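
A quick, hypothetical illustration of that consistency:

import numpy
M = numpy.matrix([[1., 2., 3.], [4., 5., 6.]])
M[0, :].shape                  # (1, 3): a row slice is still a matrix
M[0].shape                     # (1, 3): same thing
[row.shape for row in M]       # [(1, 3), (1, 3)]: iteration matches slicing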

Basically, my rule of thumb has been to *only* use matrices when I'm  
copying linear algebra operations out of textbooks and papers, and I  
want the notations to align. Doing other, non-linear-algebra  
operations with matrices -- like iterating over their elements -- is  
not really worth it.

There's a meta question, of course: should things be changed to make  
it "worth it" to do "pythonic" tasks with matrices? Should there be  
special elementwise vs. matrix-operation operators? Should numpy work  
a lot more like matlab? That discussion is on-going in another  
thread, I think.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
> Exactly: that was one other thing I found artificial.
> Surely the points will then be wanted as arrays.
>
> So my view is that we still do not have a use case
> for wanting matrices yielded when iterating across
> rows of a matrix.

It's pretty clear from my perspective: 1-D slices of matrices *must*  
be matrices. I gave an intuitive make-it-fit-with-known-notation  
explanation, and Charles gave a theoretically-grounded explanation.  
There was no objection to this from any quarter.

Given that 1-D slices of matrices must be matrices for deep  
mathematical reasons as well as notational compatibility with the  
rest of linear algebra -- e.g. that m[0] yields a matrix if m is a  
matrix-- it almost certainly would violate the principle of least  
surprise for iteration over m (intuitively understood to be choosing m[0] then m[1] and so forth) to yield anything other than a matrix.  
This can't possibly be what you're asking for, right? You aren't  
suggesting that m[0] and list(iter(m))[0] should be different types?

There are many clear and definite use cases for m[0] being a matrix,  
by the way, in addition to the theoretical arguments -- these aren't  
hard to come by. Whether or nor there are actual good use-cases for  
iterating over a matrix, if you believe that m[0] should be a matrix,  
it's madness to suggest that list(iter(m))[0] should be otherwise.

My opinion? Don't iterate over matrices. Use matrices for linear  
algebra. There's no "iteration" construct in linear algebra. The  
behavior you find so puzzling is a necessary by-product of the  
fundamental behavior of the matrix class -- which has been explained  
and which you offered no resistance to.

Zach

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
Hello all,

I suspect my previous email did not contain the full chain of my  
reasoning, because I thought that some parts were basically obvious.  
My point was merely that given some pretty basic fundamental tenants  
of numpy, Alan's suggestions quickly lead to *serious issues* far  
worse than the original problem. Thanks Chris for making this more  
explicit; here I will do so further.

Let A be an array of arbitrary shape, and M be m-by-n matrix.

(1) A[i] equals by definition A[i,...]

This is a pretty fundamental part of numpy, right? That property (and  
its relatives) along with the broadcasting behavior, are in large  
part what make numpy so very expressive compared to matlab, etc.

I can write code in numpy to do multidimensional operations over  
arbitrary-shaped and arbitrary-dimensioned data that look *exactly*  
like the corresponding 1D operations. It's amazing, really, how  
numpy's "implicit indexing" (this property), broadcasting (sort of  
the inverse of this, really, where shape tuples are extended as  
necessary with implicit 1's, instead of slices being extended  
with :'s as necessary), and fancy indexing, allow complex operations  
to be written compactly, clearly, and in a fully-general manner.

I thought that point (1) was obvious, as it's a pretty fundamental  
part of numpy (and Numeric and Numarray before that). It's in  
absolutely no way a "trivial convenience". Of course, broadcasting  
and ufuncs, etc, contribute a lot more to expressiveness than this  
property, but it's part of the family.
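
A tiny example of property (1), with a made-up 3-d array:

import numpy
A = numpy.arange(24).reshape(2, 3, 4)
(A[1] == A[1, ...]).all()      # True: A[i] is exactly A[i, ...]
A[1].shape                     # (3, 4)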


(2) Thus, M[i] equals by definition M[i,:]. By the mathematical  
definition of matrices and for pragmatic reasons, M[i,:] is another  
matrix.

Here the trouble starts. Should matrices, not being arbitrary in  
their dimension, even have the ability to be indexed as M[i]? Or  
should they be completely different beasts than arrays? Which  
violates the "principle of least surprise" more?

Now, I think we all more or less agree that M[i,:] has to be a  
matrix, because otherwise the matrix class is pretty worthless for  
linear algebra, and doesn't operate according to the mathematical  
definition of a matrix. Indeed, this is one of the basic properties  
of the numpy matrix -- that, and that the multiplication operator is  
defined as matrix-multiplication. Remove that and there's almost no  
point in *having* a separate matrix class.

Want a use-case? Do the singular value decomposition, and get the  
product of the second-smallest singular vectors as Vt[:,-2] * U[-2,:]. (Things vaguely akin to this come up a lot.) You can copy the  
operation from any linear algebra text if the row- and column-vectors  
are treated as above. Otherwise you have to wonder whether that's  
supposed to be an outer or inner product, and use the corresponding  
operator -- and at that point, why was I using matrix classes anyway?
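
Sketched out (with plain svd output wrapped as matrices; the shapes here are just for illustration):

import numpy
A = numpy.random.rand(4, 3)
U, s, Vt = numpy.linalg.svd(A)
U, Vt = numpy.matrix(U), numpy.matrix(Vt)
prod = Vt[:, -2] * U[-2, :]    # (3,1) * (1,4) -> a (3,4) matrix, no transposes to puzzle over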

(3) If M[i] = M[i,:], and M[i,:] is a matrix, then the iteration  
behavior of matrices *has* to yield other matrices, unless Alan is  
willing to suggest that list(iter(m))[i] should have a different type  
than M[i] -- a clear absurdity. This is the point that I was trying  
to make previously, having mistakenly assumed that point (1) was  
clear to all, and that point (2) had been clearly established.

So, if we trace the reasoning of Alan's suggestion, coupled with  
these above properties, we get just that:
(a) Iteration over M should yield arrays.
(b) Ergo M[i] should be an array.
(c) Ergo M[i,:] should be an array.
(d) Ergo the matrix class should be identical to the array class  
except for overloaded multiplication, and the only way to get a 'row'  
or 'column' vector from a matrix that operates as a proper row or  
column vector is M[i:i+1,:]. Which is a lot of typing to get  
the 'i-th' vector from the matrix.

I don't think these changes are worth it -- it basically discards the  
utility of the matrix class for linear algebra in order to make the  
matrix class more useful for general purpose data storage (a role  
already filled by the array type).

Here is another set of changes which would make matrices fully  
consistent with their linear-algebra roots:
(a) M[i,:] should be a matrix.
(b) M[i] is an error.
(c) Iteration over M is an error.

That's kind of lame though, right? Because then matrices are  
completely inconsistent with their numpy roots.

So we are left with the current behavior:
(a) M[i,:] is a matrix.
(b) M[i] is a matrix.
(c) Iteration over M yields matrices.


My view is that, fundamentally, they are linear algebra constructs.  
However, I also think it would be far more harmful to remove basic  
behavior common to the rest of numpy (e.g. that M[i] = M[i,:]), and  
behavior that comes along with that (iterability). Hence my  
suggestion: treat matrices like linear algebra constructs -- don't  
iterate over them (unless you know what you're doing). Don't index  
them like M[i] (unless you know what you're doing).

Maybe it's "just a 

Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
> The use case I requested was for iteration over a
> matrix where it is desirable that matrices are yielded.
> That is not what you offered.

No, I offered a thorough discussion of why the design of numpy, as I  
imperfectly understand it, makes the trade-offs to achieve your  
desired goal unpalatable.

> The context for this request is my own experience:
> whenever I have needed to iterate over matrices,
> I have always wanted the arrays.  So I am simply
> interested in seeing an example of the opposite desire.

I'm glad that Robert provided such a nice example.

Nevertheless, I think that even absent any example it should be  
pretty clear at this point that if you want iteration over matrices  
to return arrays, either:
(1) you need to be willing to accept that list(iter(M))[i] != M[i]
or
(2) M[i], which equals M[i,:], should be an array.
(or 3, that M[i] and iteration over M is not defined)

Clearly, (1) is fundamentally ridiculous. Clearly use-cases (as well  
as the mathematical definition of matrices) abound to illustrate why  
(2) is no good. Option (3) helps nobody. So which is it? You have to  
choose one!

It is my assertion that if you are iterating over matrices and hoping  
to get arrays, you are misusing the matrix class. Almost by  
definition if you expect M[i,:] or M[i] (or equivalently, the output  
of an iterator over M) to be an array, then M is being used outside  
of a linear algebra context -- and thus, in my view, misused. In  
those cases, the matrix should be cast to an array, which has the  
desired behavior. Robert nicely illustrated a linear-algebraic  
context for matrix iteration.

Now, Bill offers up a different suggestion: indexing M yields neither  
a matrix nor an array, but a class that operates more or less like an  
array, except insofar as it interacts with other matrix objects, or  
other objects of similar classes. I'm interested in hearing more  
about this, what trade-offs or other compromises it might involve.

Zach


>
> Cheers,
> Alan Isaac
>
>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Zachary Pincus
Hi,

I have a specific question and then a general question, and some  
minor issues for clarification.

Specifically, regarding the arguments to getbufferproc:
> 166 format
> 167    address of a format-string (following extended struct
> 168    syntax) indicating what is in each element of
> 169    of memory.  The number of elements is len / itemsize,
> 170    where itemsize is the number of bytes implied by the format.
> 171    NULL if not needed in which case format is "B" for
> 172    unsigned bytes.  The memory for this string must not
> 173    be freed by the consumer --- it is managed by the exporter.

Is this saying that either NULL or a pointer to "B" can be supplied  
by getbufferproc to indicate to the caller that the array is unsigned  
bytes? If so, is there a specific reason to put the (minor)  
complexity of handling this case in the caller's hands, instead of  
dealing with it internally to getbufferproc? In either case, the  
wording is a bit unclear, I think.

The general question is that there are several other instances where  
getbufferproc is allowed to return ambiguous information which must  
be handled on the client side. For example, C-contiguous data can be  
indicated either by a NULL strides pointer or a pointer to a properly- 
constructed strides array. Clients that can't handle C-contiguous  
data (contrived example, I know there is a function to deal with  
that) would then need to check both for NULL *and* inside the strides  
array if not null, before properly deciding that the data isn't  
usable by them. Similarly, the suboffsets can be either all negative or  
NULL to indicate the same thing.

Might it be more appropriate to specify only one canonical behavior  
in these cases? Otherwise clients which don't do all the checks on  
the data might not properly interoperate with providers which format  
these values in the alternate manner.


Also, some typos, and places additional clarification could help:

> 253 PYBUF_STRIDES (strides and isptr)
Should 'isptr' be 'suboffsets'?

> 75 of a larger array can be described without copying the data.   T
Dangling 'T'.

> 279 Get the buffer and optional information variables about the  
> buffer.
> 280 Return an object-specific view object (which may be simply a
> 281 borrowed reference to the object itself).
This phrasing (and similar phrasing elsewhere) is somewhat opaque to  
me. What's an "object-specific view object"?

> 287 Call this function to tell obj that you are done with your "view"
Similarly, the 'view' concept and terminology should be defined more  
clearly in the preamble.

> 333 The struct string-syntax is missing some characters to fully
> 334 implement data-format descriptions already available elsewhere (in
> 335 ctypes and NumPy for example).  Here are the proposed additions:
Is the following table just the additions? If so, it might be good to  
show the full spec, and flag the specific additions. If not, then the  
additions should be flagged.

> 341 't'   bit (number before states how many bits)
vs.
> 372 According to the struct-module, a number can preceed a character
> 373 code to specify how many of that type there are.  The
I'm confused -- could this be phrased more clearly? Does '5t' refer  
to a field 5-bits wide, or 5-one bit fields? Is 't' allowed? If  
so, is it equivalent to or different from '5t'?

> 378 Functions should be added to ctypes to create a ctypes object from
> 379 a struct description, and add long-double, and ucs-2 to ctypes.
Very cool.

In general, the logic of the 'locking mechanism' should be described  
at a high level at some point. It's described in nitty-gritty  
details, but at least I would have appreciated a bit more of a  
discussion about the general how and why -- this would be helpful to  
clients trying to use the locking mechanism properly.

Thanks to Travis and everyone else involved for your work on this  
cogent and sorely-needed PEP.

Zach



On Mar 27, 2007, at 12:42 PM, Travis Oliphant wrote:

>
> Hi all,
>
> I'm having a good discussion with Carl Banks and Greg Ewing over on
> python-dev about the buffer interface. We are converging on a pretty
> good design, IMHO.   If anybody wants to participate in the  
> discussion,
> please read the PEP (there are a few things that still need to change)
> and add your two cents over on python-dev (you can read it through the
> gmane gateway and participate without signing up).
>
> The PEP is stored in numpy/doc/pep_buffer.txt on the SVN tree for  
> NumPy
>
> Best regards,
>
> -Travis
>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] A unique function...

2007-03-27 Thread Zachary Pincus
Hello,

There's unique and unique1d, but these don't output the number of  
occurrences.
There's also bincount, which outputs the number of each element, but  
includes zeros for non-present elements and so could be problematic  
for certain data.
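
For what it's worth, one way to combine them (just a sketch):

import numpy
a = numpy.array([3, 1, 3, 3, 7, 1])
vals = numpy.unique(a)
counts = numpy.array([(a == v).sum() for v in vals])
# vals -> [1 3 7], counts -> [2 3 1]
# for small non-negative integers, numpy.bincount(a) gives counts for every value 0..a.max()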

Zach



On Mar 27, 2007, at 2:28 PM, Pierre GM wrote:

> All,
> I could swear that I ran once into a numpy (or scipy) function that  
> output the
> unique values of a 1d ndarray along with the number of occurences  
> of each
> element. If I'm not completely mistaken, it 's a private function  
> initially
> in Fortran.
> Does this ring a bell to anyone ? Where could I find this function ?

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Zachary Pincus
>> Is this saying that either NULL or a pointer to "B" can be supplied
>> by getbufferproc to indicate to the caller that the array is unsigned
>> bytes? If so, is there a specific reason to put the (minor)
>> complexity of handling this case in the caller's hands, instead of
>> dealing with it internally to getbufferproc? In either case, the
>> wording is a bit unclear, I think.
>>
>
> Yes, the wording could be more clear.   I'm trying to make it easy for
> exporters to change
> to the new buffer interface.
>
> The main idea I really want to see is that if the caller just passes
> NULL instead of an address then it means they are assuming the data  
> will
> be "unsigned bytes"   It is up to the exporter to either allow this or
> raise an error.
>
> The exporter should always be explicit if an argument for returning  
> the
> format is provided (I may have thought differently a few days ago).

Understood -- I'm for the exporters being as explicit as possible if  
the argument is provided.

>> The general question is that there are several other instances where
>> getbufferproc is allowed to return ambiguous information which must
>> be handled on the client side. For example, C-contiguous data can be
>> indicated either by a NULL strides pointer or a pointer to a  
>> properly-
>> constructed strides array.
>
> Here.  I'm trying to be easy on the exporter and the consumer.  If the
> data is contiguous, then neither the exporter nor the consumer will likely care  
> about
> the strides.  Allowing this to be NULL is like the current array
> protocol convention which allows this to be None.

See below. My comments here aren't suggesting that NULL should be  
disallowed. I'm basically wondering whether it is a good idea to  
allow NULL and something else to represent the same information.  
(E.g. as above, an exporter could choose to show C-contiguous data  
with a NULL returned to the client, or with a trivial strides array).

Otherwise two different exporters exporting identical data could  
provide different representations, which the clients would need to be  
able to handle. I'm not sure that this is a recipe for perfect  
interoperability.

>> Clients that can't handle C-contiguous
>> data (contrived example, I know there is a function to deal with
>> that) would then need to check both for NULL *and* inside the strides
>> array if not null, before properly deciding that the data isn't
>> usable them.
> Not really.  A client that cannot deal with strides will simply not  
> pass
> an address to a stride array to the buffer protocol (that argument  
> will
> be NULL).  If the exporter cannot provide memory without stride
> information, then an error will be raised.

This doesn't really address my question, which I obscured with a  
poorly-chosen example. The PEP says (or at least that's how I read  
it) that if the client *does* provide an address for the stride  
array, then for un-strided arrays, the exporter may either choose to  
fill on NULL at that address, or provide a strides array.

Might it be easier for clients if the PEP required that NULL be  
returned if the array is C-contiguous? Or at least strongly suggested  
that? (I understand that there might be cases where an naive exporter  
"thinks" it is dealing with a strided array but it really is  
contiguous, and the exporter shouldn't be required to do that  
detection.)

The use-case isn't too strong here, but I think it's clear in the  
suboffsets case (see below).

>> Similarly, the suboffsets can be either all negative or
>> NULL to indicate the same thing.
> I think it's much easier to check if suboffsets is NULL rather than
> checking all the entries to see if they are -1 for the very common  
> case
> (i.e. the NumPy case) of no dereferencing.Also, if you can't deal
> with suboffsets you would just not provide an address for them.

My point exactly! As written, the PEP allows an exporter to either  
return NULL, or an array of all negative numbers (in the case that  
the client requested that information), forcing a fully-conforming  
client to make *both* checks in order to decide what to do.

Especially in this case, it would make sense to require a NULL be  
returned in the case of no suboffsets. This makes things easier for  
both clients that can deal with both suboffsets or non-offsets (e.g.  
they can branch on NULL, not on NULL or all-are-negative), and also  
for clients that can *only* deal with suboffsets.

Now, in these two cases, the use-case is pretty narrow, I agree.  
Basically it makes things easier for savvy clients that can deal with  
different data types, by not forcing them to make two checks (strides  
== NULL or strides array is trivial; suboffsets == NULL or suboffsets  
are all negative) when one would do. Again, this PEP allows the same  
information to be passed in two very different ways, when it really  
doesn't seem like that ambiguity makes life that much easier for  
exporters.

Maybe I'm wrong about this las

Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-28 Thread Zachary Pincus
Very cool -- Thanks for digging into the code and making these fixes,  
Stéfan! The ndimage C code is non-trivial for sure.

I'll test things out in the next couple of days.

Thanks again,

Zach



On Mar 28, 2007, at 10:25 AM, Stefan van der Walt wrote:

> On Wed, Mar 28, 2007 at 05:14:59PM +0200, Stefan van der Walt wrote:
>> As for the values at the edges, I'm still working on it.
>
> OK, that was a one-line patch.  Please test to see if there are any
> subtle conditions on the border that I may have missed.  I know of one
> already, but I'd be glad if you can find any others :)
>
> Cheers
> Stéfan

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question (final post)

2007-03-28 Thread Zachary Pincus
It's because the property that A[i] == A[i,...] is much more  
important to most numpy users than the results of a particular (mis)use of the matrix class.

This has been explained in many different contexts over many  
different email messages by many different people. You're not looking  
at the big picture, which has been fairly explicitly spelled out by  
myself and others. I'm not even sure why I'm replying at this point.

Zach


On Mar 28, 2007, at 3:25 PM, Alan Isaac wrote:

> On Wed, 28 Mar 2007, Robert Kern wrote:
>> People have been giving you reasons, over and over again.
>> You are simply refusing to listen to them.
>
> Exploring whether the reasoning is adequate is not the same
> as refusing to listen.  I do not presume my view is correct.
>
>> You have a use case for arrays being the iterates.  You
>> are presuming that the only argument that can beat that is
>> another use case for matrix objects being the iterates.
>> This is not true; there are other principles at work.
>
> Put slightly differently:
> given the surprising passion of the attacks at the
> suggestion that perhaps iteration over a matrix might more
> consistently yield arrays, I presumed there must be *many*
> instances in which it was obviously desirable that such
> iteration should yield matrices.  So I asked to see some.
>
> In the context of this discussion, I found the (lack of)
> responses very interesting.  Even in your thoughtful
> response it proved irrelevant rather than important for
> iteration over matrices to yield matrices.
>
> I understand that some people claim that a general principle
> of consistency is involved.  I have not been able to
> understand this particular design decision as a matter of
> consistency, and I have tried to say why.  However I am just
> a user (and supporter) of numpy, and as indicated in other
> posts, I make no pretense of deep insight into the design
> decisions.  In this case, I simply wanted to open
> a discussion of a design decision, not win an "argument".
>
> Anyway, I understand that I am being perceived as
> bull-headed here, so I'll let this go.  Thanks for your
> attempt to help me see the virtues of the current design.
>
> Cheers,
> Alan Isaac
>
>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question

2007-03-29 Thread Zachary Pincus
Looks promising!

> On 3/29/07, Bill Baxter <[EMAIL PROTECTED]> wrote: On 3/30/07,  
> Timothy Hochberg <[EMAIL PROTECTED]> wrote:
> > Note, however that you can't (for instance) multiply column  
> vector with
> > a row vector:
> >
> > >>> (c)(r)
> > Traceback (most recent call last):
> >   ...
> > TypeError: Cannot matrix multiply columns with anything
> >
>
> That should be allowed.  (N,1)*(1,M) is just an (N,M) matrix with
> entries C[i,j] = A[i,0]*B[0,j]
>
> I thought about that a little, and while I agree that it could be  
> allowed, I'm not sure that it should be allowed. It's a trade off  
> between a bit of what I would guess is little used functionality  
> with some enhanced error checking (I would guess that usually  
> row*column signals a mistake). However, I don't care much one way  
> or the other; it's not hard to allow.

I'm also for allowing it; from the perspective of one writing code  
that "looks like" typical (if potentially technically incorrect)  
notation, being able to write direct analogs to a'b (to get a scalar)  
or ab' (to get an MxN matrix) and have everything "work" would be  
quite useful. Especially in various reconstruction tasks (e.g. using  
svd to approximate a given matrix with a lower-rank one), the "outer  
product" of a row and a column vector comes up often enough that I  
would be loath to have it raise an error.
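
For what it's worth, the existing numpy.matrix class (as opposed to the prototype being discussed) already behaves that way, e.g.:

import numpy
c = numpy.matrix([[1.], [2.], [3.]])   # (3,1) column
r = numpy.matrix([[4., 5.]])           # (1,2) row
(c * r).shape                          # (3, 2): the outer product
(c.T * c).shape                        # (1, 1): the inner product c'c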

> I kind of like the idea of using call for multiply, though.  If it
> doesn't turn out to have any major down sides it could be a good way
> to give ndarray a concise syntax for "dot".
>
> We'll see how it goes down this time. I've proposed using call  
> before, since I've thought the matrix situation was kind of silly  
> for what seems like ages now, but it always sinks without a ripple.

The call syntax is nice, if a bit opaque to someone looking at the  
code for the first time. On the other hand, it's explicit in a way  
that overloaded multiplication just isn't. I like it (for what little  
that's worth).


Zach


>
>
> -- 
>
> //=][=\\
>
> [EMAIL PROTECTED]

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] argsort and take along arbitrary axes

2007-05-14 Thread Zachary Pincus
Hello all,

I've got a few questions that came up as I tried to calculate various  
statistics about an image time-series. For example, I have an array  
of shape (t,x,y) representing t frames of a time-lapse of resolution  
(x,y).

Now, say I want to both argsort and sort this time-series, pixel- 
wise. (For example.)

In 1-d it's easy:
indices = a.argsort()
sorted = a[indices]

I would have thought that doing this on my 3-d array would work  
similarly:
indices = a.argsort(axis=0)
sorted = a.take(indices, axis=0)

Unfortunately, this gives a ValueError of "dimensions too large."  
Now, I know that 'a.sort(axis=0)' works fine for the given example,  
but I'm curious about how to do this sort of indexing operation in the  
general case.

Thanks for any insight,

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] argsort and take along arbitrary axes

2007-05-14 Thread Zachary Pincus
>> I've got a few questions that came up as I tried to calculate various
>> statistics about an image time-series. For example, I have an array
>> of shape (t,x,y) representing t frames of a time-lapse of resolution
>> (x,y).
>>
>> Now, say I want to both argsort and sort this time-series, pixel-
>> wise. (For example.)
>>
>> In 1-d it's easy:
>> indices = a.argsort()
>> sorted = a[indices]
>>
>> I would have thought that doing this on my 3-d array would work
>> similarly:
>> indices = a.argsort(axis=0)
>> sorted = a.take(indices, axis=0)
>>
>> Unfortunately, this gives a ValueError of "dimensions too large."
>> Now, I know that 'a.sort(axis=0)' works fine for the given example,
>> but I'm curious about how to this sort of indexing operation in the
>> general case.
>
> Unfortunately, argsort doesn't work transparently with take or  
> fancy indexing for multidimensional arrays. I am thinking of adding  
> a function argtake for this, and also for the results returned by  
> argmax and argmin, but at the moment you have to fill in the   
> values of the other indices and use fancy indexing. For now, it is  
> probably simpler, prettier, and faster to just sort the array.

Thanks Charles. Unfortunately, the argsort/sort business was, as I  
mentioned, just an example of the kind of 'take' operation that I am  
trying to figure out how to do. There are other operations that will  
have similarly-formatted 'indices' arrays (as above) that aren't  
generated from argsort...

As such, how do I "fill in the values of the other indices and use  
fancy indexing"? Even after reading the numpy book about that, and  
reading the docstring for numpy.take, I'm still vague on this. Would  
I use numpy.indices to get a list of index arrays, and then swap in  
(at the correct position in this list) the result of argsort (or the  
other operations), and use that for fancy indexing? Is there an  
easier/faster way?
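
That is, I'm imagining something like the following (an untested sketch,
using the argsort example again):

import numpy

a = numpy.random.rand(4, 5, 6)        # e.g. a (t, x, y) time-series
order = a.argsort(axis=0)             # same shape as a

# fill in plain index arrays for every axis, then swap in 'order' along
# the axis that was argsorted
index_arrays = list(numpy.indices(a.shape))
index_arrays[0] = order
sorted_a = a[tuple(index_arrays)]

assert (sorted_a == numpy.sort(a, axis=0)).all()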

Thanks again,

Zach





___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] very large matrices.

2007-05-14 Thread Zachary Pincus
Hello Dave,

I don't know if this will be useful to your research, but it may be  
worth pointing out in general. As you know PCA (and perhaps some  
other spectral algorithms?) use eigenvalues of matrices that can be  
factored out as A'A (where ' means transpose). For example, in the  
PCA case, if A is a centered data matrix (i.e. the mean value of each  
data point has been subtracted off), and if the data elements are row- 
vectors, then A'A is the covariance matrix. PCA then examines the  
(nonzero) eigenvectors of this covariance matrix A'A.

Now, it's possible to determine the eigenvalues/eigenvectors for A'A  
from the matrix AA', which in many useful cases is much smaller than  
A'A. For example, imagine that you have 100 data points, each in  
10,000 dimensions. (This is common in imaging applications.) A'A is  
10,000x10,000, but AA' is only 100x100. We can get the eigenvalues/ 
vectors of A'A from those of AA', as described below. (This works in  
part because in such cases, the larger matrix is rank-deficient, so  
there will only be e.g. 100 nonzero eigenvalues anyway.)

If you're lucky enough to have this kind of structure in your  
problem, feel free to use the following code which exploits that (and  
explains in a bit more detail how it works).


Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine



import numpy

def _symm_eig(a):
    """Return the eigenvalues and eigenvectors of the symmetric matrix a'*a.

    If a has more columns than rows, then that matrix will be rank-deficient,
    and the non-zero eigenvalues and eigenvectors can be more easily extracted
    from the matrix a*a', from the properties of the SVD:
      if a of shape (m,n) has SVD u*s*v', then:
        a'*a = v*s'*s*v'
        a*a' = u*s*s'*u'
    That is, v contains the eigenvectors of a'*a, with s'*s the eigenvalues,
    according to the eigen-decomposition theorem.
    Now, let s_hat, an array of shape (m,n), be such that s * s_hat = I(m,m)
    and s_hat * s = I(n,n). With that, we can solve for u or v in terms of
    the other:
      v = a'*u*s_hat'
      u = a*v*s_hat
    """
    m, n = a.shape
    if m >= n:
        # just return the eigenvalues and eigenvectors of a'a
        vals, vecs = _eigh(numpy.dot(a.transpose(), a))
        vals = numpy.where(vals < 0, 0, vals)
        return vals, vecs
    else:
        # figure out the eigenvalues and vectors based on aa', which is smaller
        sst_diag, u = _eigh(numpy.dot(a, a.transpose()))
        # in case due to numerical instabilities we have sst_diag < 0 anywhere,
        # peg them to zero
        sst_diag = numpy.where(sst_diag < 0, 0, sst_diag)
        # now get the inverse square root of the diagonal, which will form
        # the main diagonal of s_hat
        err = numpy.seterr(divide='ignore', invalid='ignore')
        s_hat_diag = 1/numpy.sqrt(sst_diag)
        numpy.seterr(**err)
        s_hat_diag = numpy.where(numpy.isfinite(s_hat_diag), s_hat_diag, 0)
        # s_hat_diag is a list of length m, a'u is (n,m), so we can just use
        # numpy's broadcasting instead of matrix multiplication, and only
        # create the upper mxm block of a'u, since that's all we'll use anyway
        v = numpy.dot(a.transpose(), u[:,:m]) * s_hat_diag
        return sst_diag, v

def _eigh(m):
    """Return eigenvalues and corresponding eigenvectors of the hermitian
    array m, ordered by eigenvalue in decreasing order.
    Note that numpy.linalg.eigh makes no order guarantees."""
    values, vectors = numpy.linalg.eigh(m)
    order = numpy.flipud(values.argsort())
    return values[order], vectors[:,order]
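
A usage sketch for the PCA case described above (illustrative only --
random data, and the sizes are just examples):

import numpy

# 100 data points, each in 10,000 dimensions
data = numpy.random.rand(100, 10000)
centered = data - data.mean(axis=0)
# eigen-decomposition of centered'*centered, computed via the much
# smaller 100x100 matrix centered*centered'
evals, evecs = _symm_eig(centered)
# evecs[:,i] is the i-th principal axis (length 10,000); evals are
# proportional to the variances along those axes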



On May 13, 2007, at 8:35 PM, Dave P. Novakovic wrote:

> There are definitely elements of spectral graph theory in my research
> too. I'll summarise
>
> We are interested in seeing the each eigenvector from svd can
> represent in a semantic space
> In addition to this we'll be testing it against some algorithms like
> concept indexing (uses a bipartitional k-meansish method for dim
> reduction)
> also testing against Orthogonal Locality Preserving indexing, which
> uses the laplacian of a similarity matrix to calculate projections of
> a document (or term) into a manifold.
>
> These methods have been implemented and tested for document
> classification, I'm interested in seeing their applicability to
> modelling semantics with a system known as Hyperspace to analog
> language.
>
> I was hoping to do svd to my HAL built out of reuters, but that was
> way too big. instead i'm trying with the traces idea i mentioned
> before (ie contextually grepping a keyword out of the docs to build a
> space around it.)
>
> Cheers
>
> Dave
>
> On 5/14/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
>>
>>
>> On 5/13/07, Dave P. Novakovic <[EMAIL PROTECTED]> wrote:
 Are you trying some sort of principal components analysis?
>>>
>>> PCA is indeed one part of the research I'm doing.
>>
>> I had the impression you were trying to build a linear space in  
>> which to
>> e

[Numpy-discussion] Change to get_printoptions?

2007-07-25 Thread Zachary Pincus
Hello all,

I just recently updated to the SVN version of numpy to test my code  
against it, and found that a small change made to  
numpy.get_printoptions (it now returns a dictionary instead of a  
list) breaks my code.

Here's the changeset:
http://projects.scipy.org/scipy/numpy/changeset/3877

I'm not really looking forward to needing to detect numpy versions  
just so I can do the right thing with get_printoptions, but I do  
agree that the new version of the function is more sensible. My  
question is if there's any particular policy about backwards- 
incompatible python api changes, or if I need to be aware of their  
possibility at every point release. (Either is fine -- I'm happy for  
numpy to be better at the cost of incompatibility, but I'd like to  
know if changes like these are the rule or exception.)

Also, in terms of compatibility checking, has anyone written a little  
function to check if numpy is within a particular version range?  
Specifically, one that handles numpy's built from SVN as well as from  
release tarballs.
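
For what it's worth, the kind of helper I have in mind is something like
this (an untested sketch; that SVN version strings look like
'1.0.4.dev3877' is an assumption on my part):

import numpy

def numpy_version_at_least(*required):
    # compare only the leading numeric components of the version string,
    # so that dev/SVN builds are handled along with release tarballs
    parts = []
    for piece in numpy.__version__.split('.'):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts) >= required

# e.g. numpy_version_at_least(1, 0, 4)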


Thanks,

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] getting numPy happening for sciPy

2007-07-27 Thread Zachary Pincus
On Jul 27, 2007, at 2:42 AM, Nils Wagner wrote:

> I cannot reproduce the problem concerning #401. It is Mac specific
> problem. Am I missing something ?

I can't reproduce this problem either. I just yesterday built scipy  
from SVN on two different OS X 10.4.10 boxes, one using the fortran  
compiler from hpc.sourceforge.net (not the latest 2007 release, but  
the december 2006 one), and the other using the compiler from  
r.research.att.com/tools. Everything else was similar, and everything  
worked fine with regard to ticket 401.

On the other hand, when I tried to compile scipy using the latest  
(2007-05) gfortran from hpc.sourceforge.net, I got bizarre link  
errors about MACOSX_DEPLOYMENT_TARGET being set incorrectly. (See  
previous email here http://projects.scipy.org/pipermail/scipy-user/ 
2007-June/012542.html ). Interestingly, with the earlier version of  
gfortran from hpc.sourceforge.net, and with the r.research.att.com/ 
tools version, this problem  does not arise.

Anyhow, my point is that there are still odd linker errors (as in  
ticket 401) lurking that may or may not have anything to do with  
scipy per se, but might have to do with odd and perhaps buggy builds  
of gfortran. Feh -- I wish Apple would just start including a fortran  
compiler with the rest of their dev tools. The situation otherwise is  
not good.

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Vector magnitude?

2007-09-05 Thread Zachary Pincus
Hello,

'len' is a (pretty basic) python builtin function for getting the  
length of anything with a list-like interface. (Or more generally,  
getting the size of anything that is sized, e.g. a set or dictionary.)

Numpy arrays offer a list-like interface allowing you to iterate  
along their first dimension, etc. (*) Thus, len(numpy_array) is  
equivalent to numpy_array.shape[0], which is the number of elements  
along the first dimension of the array.
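
(As an aside: if what's actually wanted is the Euclidean magnitude of a
vector, rather than its element count, that would be something like the
following sketch:)

import numpy

v = numpy.array([3.0, 4.0])
magnitude = numpy.sqrt(numpy.dot(v, v))    # or numpy.linalg.norm(v); here 5.0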

Zach


(*) For example, this is useful if you've got numerous data vectors  
packed into an array along the first dimension, and want to iterate  
across the different vectors.

a = numpy.ones((number_of_data_elements, size_of_data_element))
for element in a:
    # element is a 1-D array with a length of 'size_of_data_element'
    assert element.shape == (size_of_data_element,)

Note further that this works even if your data elements are multi- 
dimensional; i.e. the above works the same if:
element_shape = (x, y, z)
a = numpy.ones((number_of_data_elements,) + element_shape)
for element in a:
    # element is a 3-D array with a shape of (x,y,z)
    assert element.shape == element_shape





On Sep 5, 2007, at 2:40 PM, Robert Dailey wrote:

> Thanks for your response.
>
> I was not able to find len() in the numpy documentation at the  
> following link:
> http://www.scipy.org/doc/numpy_api_docs/namespace_index.html
>
> Perhaps I'm looking in the wrong location?
>
> On 9/5/07, Matthieu Brucher <[EMAIL PROTECTED] > wrote:
>
> 2007/9/5, Robert Dailey < [EMAIL PROTECTED]>: Hi,
>
> I have two questions:
>
> 1) Is there any way in numpy to represent vectors? Currently I'm  
> using 'array' for vectors.
>
>
> A vector is an array with one dimension, it's OK. You could use a  
> matrix of dimension 1xn or nx1 as well.
>
>
> 2) Is there a way to calculate the magnitude (length) of a vector  
> in numpy?
>
> Yes, len(a)
>
> Matthieu
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.rot90 bug with >2D arrays

2007-09-05 Thread Zachary Pincus
Hello,

numpy.rot90 (from twodim_base.py) says it works only on the first two  
axes of an array, but due to its use of the transpose method (which  
reverses the shape tuple), it can affect other axes.

For example:
a = numpy.ones((50,40,3))
b = numpy.rot90(a)
assert(b.shape == (3,40,50))
# desired result is b.shape == (40,50,3)

I believe that the fix is to replace the two calls to 'transpose()'  
in the function with 'swapaxes(0,1)'. This would allow rotating an  
array along just the first two axes.
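
To illustrate, here's a sketch of what that change amounts to (not the
actual numpy source, just the idea, and only exercising the shape
question):

import numpy

def rot90_first_two_axes(m, k=1):
    # rotate only within the plane of the first two axes,
    # leaving any further axes alone
    k = k % 4
    if k == 0:
        return m
    elif k == 1:
        return numpy.fliplr(m).swapaxes(0, 1)
    elif k == 2:
        return numpy.fliplr(numpy.flipud(m))
    else:  # k == 3
        return numpy.fliplr(m.swapaxes(0, 1))

a = numpy.ones((50, 40, 3))
assert rot90_first_two_axes(a).shape == (40, 50, 3)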

Does this fix look right? Should I file a bug or can someone just  
check that in?

Zach

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Find first occurrence?

2007-09-19 Thread Zachary Pincus
Hello all,

On several occasions, I've had the need to find only the first  
occurrence of a value in an unsorted numpy array. I usually use  
numpy.where(arr==val)[0] or similar, and don't worry about the fact  
that I'm iterating across the entire array.

However, sometimes the arrays are pretty big and the find operation  
is in an inner loop, so I was wondering if there's already a C  
extension function somewhere in numpy or scipy that does a fast find  
first operation, or anything similar. (Easy enough to write my own,  
and maybe given the issues inherent in comparing float equality,  
etc., it doesn't belong in the core numpy anyway...)
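
For illustration, a chunked pure-numpy workaround might look like this
(a sketch; it only helps when the value occurs early in the array):

import numpy

def find_first(arr, val, chunk_size=4096):
    # scan the flattened array one chunk at a time, stopping at the
    # first chunk that contains a match
    flat = arr.ravel()
    for start in range(0, flat.size, chunk_size):
        hits = numpy.where(flat[start:start + chunk_size] == val)[0]
        if hits.size:
            return start + hits[0]
    return None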

Thanks,

Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] display numpy array as image

2007-11-29 Thread Zachary Pincus
Hello all,

I'm curious if people have experience with / preferences for how to  
display a numpy array onscreen as an image.

Pyglet looks relatively easy -- you can feed an image buffer object  
with a string or a ctypes pointer. I presume getting a string from an  
array is plenty fast, but the ctypes pointer option is intriguing as  
it allows for dealing with simple strided arrays (the image objects  
allow for an arbitrary number of bytes between rows). Is it possible  
to get a ctypes pointer to the beginning of the array buffer from  
numpy without too much ugliness?
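
(The least-ugly route I've found so far is the array's ctypes attribute,
roughly as in this sketch -- though I'm not sure whether that's the
sanctioned approach, hence the question:)

import ctypes
import numpy

a = numpy.zeros((480, 640, 3), dtype=numpy.uint8)
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_ubyte))   # pointer to the buffer
pitch = a.strides[0]    # bytes from one row to the next, for strided images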

wxPython looks pretty easy too, as there are facilities for getting  
pixels from a buffer. Does anyone have any experience with these? Are  
there ways of allowing a numpy array and a wxPython image to point to  
the same memory?

Anyhow, these are specific questions, but I'd also appreciate any  
general thoughts about good approaches for getting pixels from numpy  
arrays onscreen.

Thanks,

Zach
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] display numpy array as image

2007-11-29 Thread Zachary Pincus
Thanks for the suggestions, everyone! All very informative and most  
helpful.

For what it's worth, here's my application: I'm building a tool for  
image processing which needs some manual input in a few places (e.g.  
user draws a few lines). The images are greyscale images with 12-14  
bits of dynamic range (from a microscope), so I need to have some  
basic brightness/contrast/gamma controls, as well as allowing basic  
drawing on the image to get the needed user input. It looks like GL  
or wx will be best suited here, I think? (I presume that python/numpy/ 
[GL|wx] can keep up with things like dragging a slider to change  
brightness/contrast/other LUT changes, as long as I code reasonably.)
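
(For example, the per-LUT-change mapping I need is roughly the following
-- a sketch, assuming 12-14 bit data stored in a uint16 array:)

import numpy

def to_display(img, low, high, gamma=1.0):
    # window the raw intensities to [low, high], apply a gamma adjustment,
    # and rescale to 8 bits for display (assumes high > low)
    scaled = (img.astype(numpy.float32) - low) / float(high - low)
    scaled = numpy.clip(scaled, 0.0, 1.0) ** gamma
    return (scaled * 255).astype(numpy.uint8)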

Anyhow, thanks for all the input,

Zach


On Nov 29, 2007, at 9:03 PM, Joe Harrington wrote:

> If you want to explore the array interactively, blink images, mess  
> with
> colormaps using the mouse, rescale the image values, mark regions, add
> labels, look at dynamic plots of rows and columns, etc., get the ds9
> image viewer and the xpa programs that come with it that allow it to
> communicate with other programs:
>
> ftp://sao-ftp.harvard.edu/pub/rd/ds9
> http://hea-www.harvard.edu/RD/ds9/index.html
>
> Then get the Python numdisplay package, which uses xpa.  You have  
> to get
> numdisplay from inside the stsci_python package:
>
> http://www.stsci.edu/resources/software_hardware/pyraf/stsci_python/ 
> current/download
>
> Just grab the numdisplay directory from within that.  Older  
> versions of
> numdisplay are standalone but don't work perfectly.  Beware, there are
> outdated web sites about numdisplay on the stsci site.  Don't google!
>
> Run ds9 before you load numdisplay.  Then you can send your python
> arrays to a real interactive data viewer at will.  There are even
> mechanisms to define physical coordinates mapped from the image
> coordinates.
>
> --jh--
>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

