Re: [Numpy-discussion] arrayfns in numpy?

2007-03-14 Thread Matthieu Brucher

On the scipy-user list, this exact question came up last month, so you can
check the answers in the archive :)

Matthieu

2007/3/2, Stephen Kelly [EMAIL PROTECTED]:


Hi,

I'm working on a project that requires interpolation, and I found this
post (http://mail.python.org/pipermail/python-list/2000-August/050462.html
) which works fine on my linux box (which seems to have Numeric
installed), but does not work on my windows platform because the
arrayfns module does not seem to be available on the latest numpy.

Is there some other way I should be doing interpolation using numpy,
or is the omission of arrayfns an oversight?

Kind regards,

Stephen.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] efficient norm of a vector

2007-03-14 Thread lorenzo bolla

Thanks, I hadn't seen it.

Anyway, from very rough benchmarks I did, the quickest and easiest way of
computing the Euclidean norm of a 1D array is:
n = sqrt(dot(x,x.conj()))
which is much faster than:
n = sqrt(sum(abs(x)**2))
and much, much faster than:
n = scipy.linalg.norm(x)
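
For reference, here is a minimal timeit sketch of the comparison above (not
lorenzo's actual benchmark; the array size of 100000 elements is an
assumption):

import timeit

setup = """
from numpy import sqrt, dot, abs, random
import scipy.linalg
x = random.randn(100000)
"""
for stmt in ("sqrt(dot(x, x.conj()))",
             "sqrt((abs(x)**2).sum())",
             "scipy.linalg.norm(x)"):
    # time 100 evaluations of each candidate norm implementation
    print(stmt, timeit.Timer(stmt, setup).timeit(number=100))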

regards,
lorenzo.


On 3/14/07, Bill Baxter [EMAIL PROTECTED] wrote:


There is numpy.linalg.norm.

Here's what it does:

def norm(x, ord=None):
    x = asarray(x)
    nd = len(x.shape)
    if ord is None: # check the default case first and handle it immediately
        return sqrt(add.reduce((x.conj() * x).ravel().real))
    if nd == 1:
        if ord == Inf:
            return abs(x).max()
        elif ord == -Inf:
            return abs(x).min()
        elif ord == 1:
            return abs(x).sum() # special case for speedup
        elif ord == 2:
            return sqrt(((x.conj()*x).real).sum()) # special case for speedup
        else:
            return ((abs(x)**ord).sum())**(1.0/ord)
    elif nd == 2:
        if ord == 2:
            return svd(x, compute_uv=0).max()
        elif ord == -2:
            return svd(x, compute_uv=0).min()
        elif ord == 1:
            return abs(x).sum(axis=0).max()
        elif ord == Inf:
            return abs(x).sum(axis=1).max()
        elif ord == -1:
            return abs(x).sum(axis=0).min()
        elif ord == -Inf:
            return abs(x).sum(axis=1).min()
        elif ord in ['fro','f']:
            return sqrt(add.reduce((x.conj() * x).real.ravel()))
        else:
            raise ValueError, "Invalid norm order for matrices."
    else:
        raise ValueError, "Improper number of dimensions to norm."



--bb



On 3/14/07, lorenzo bolla [EMAIL PROTECTED] wrote:
 Hi all,
 just a quick (and easy?) question.
 what is the best (fastest) way to implement the euclidean norm of a
vector,
 i.e. the function:

 import scipy as S
 def norm(x):
     """normalize a vector."""
     return S.sqrt(S.sum(S.absolute(x)**2))

 ?

 thanks in advance,
 Lorenzo.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nd_image.affine_transform edge effects

2007-03-14 Thread Stefan van der Walt
Hi James

On Fri, Mar 09, 2007 at 08:44:34PM -0300, James Turner wrote:
 Last year I wrote a program that uses the affine_transform()
 function in numarray to resample and co-add datacubes with WCS
 offsets in 3D. This function makes it relatively easy to handle
 N-D offsets and rotations with a smooth interpolant, which is
 exactly what I wanted. However, I am finding that the results
 contain edge effects which translate into artefacts in the
 middle of the final mosaic.

Is this related to

http://projects.scipy.org/scipy/scipy/ticket/213

in any way?

Code snippets to illustrate the problem would be welcome.

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Which dtype are supported by numexpr ?

2007-03-14 Thread Francesc Altet
A Divendres 09 Març 2007 18:56, Francesc Altet escrigué:
 A Divendres 09 Març 2007 18:40, Sebastian Haase escrigué:
  Which dtypes are supported by numexpr ?

 Well, numexpr supports any homogeneous dtype except 'uint64'.
 This is because internally all the unsigned types are upcast to the
 smallest *signed* integer type that can hold their values. As 'uint64'
 has no larger standard signed type that can hold its full range,
 it is unsupported.

 Besides, there is the limitation that Win32 doesn't have such a 'uint64'
 type, and computations in numpy or Python are normally done by converting
 the values to Python long integers (correct me if I'm wrong here), whose
 arithmetic is very slow compared with the types supported natively by the
 compiler.  So the best approach is to avoid 'uint64' types in general.

The info above is somewhat inexact. I was talking about the enhanced numexpr
version included in PyTables 2.0 (see [1]). The original version of numexpr
(see [2]) doesn't support int64 on 32-bit platforms, nor does it support
strings. Sorry for the confusion.

[1] http://www.pytables.org/trac/browser/trunk/tables/numexpr
[2] http://projects.scipy.org/scipy/scipy/browser/trunk/Lib/sandbox/numexpr

  We are very interested in numexpr !
  Where is the latest / most-up-to-date documentation ?

 At the moment, I think documentation is reduced at some docstrings. If you
 want more, you will have to look at the sources.

Oops, I spoke too soon. David Cooke kindly added this page:

http://www.scipy.org/SciPyPackages/NumExpr

Actually, numexpr doesn't need much more documentation than this, as it is
already pretty straightforward to use.

Cheers,

-- 
0,0   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 -
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Vectorized ufuncs with OS X frameworks

2007-03-14 Thread Francesc Altet
Hi Eric,

A Divendres 09 Març 2007 15:32, Eric Brown escrigué:
 Hi All,

   I have a set of large arrays to which I have to do exp functions
 over and over again. I am wondering if there is any benefit to
 teaching numpy how to use the vForce frameworks to do this function.

   Does anyone have experience hacking the source to get numpy to use
 system libraries to do vectorized versions of sqrt, exp, log,
 addition, etc.?  I am also interested in doing this for other
 operating systems.

Maybe it is not what you want, but it may be worth your while to have a look
at numexpr [1], which has a kernel for vectorizing the sort of computations
you are interested in (sqrt, exp, log, pow, trigonometric functions,
addition...). It does the vectorization by extracting blocks of the numpy
arrays and passing them to the computing kernel (which is made of C macros
for achieving maximum performance).

Also, it should be possible to adapt the kernel of numexpr to vForce,
although I'm not sure how vForce would achieve more efficiency than numexpr,
as IMO the cost of these kinds of computations is mostly dominated by memory
fetches/stores rather than by the actual computations (although I might be
wrong).

[1] http://projects.scipy.org/scipy/scipy/browser/trunk/Lib/sandbox/numexpr/
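
For the curious, a minimal sketch of what driving numexpr looks like (the
evaluate() function is numexpr's real entry point; the expression and array
sizes here are only illustrative):

import numpy
import numexpr

a = numpy.random.rand(1000000)
b = numpy.random.rand(1000000)

# numexpr compiles the expression string once, then evaluates it over
# cache-sized blocks of a and b, avoiding large numpy temporaries
c = numexpr.evaluate("sqrt(a**2 + b**2) + 2*a*b")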

Cheers,

-- 
0,0   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 -
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 3-10x speedup in bincount()

2007-03-14 Thread David Huard

Hi Stephen,

I'd de glad to test your bincount function for histogramming purposes.

David

2007/3/14, Stephen Simmons [EMAIL PROTECTED]:


Well, there were no responses to my earlier email proposing changes to
numpy.bincount() to make it faster and more flexible. Does this mean
no one is using bincount? :-)

Anyway, I've now made the proposed modifications and got substantial
speedups of 3-10x. Details are in this extract from my C code. For the
time being I will test this as a standalone extension module. When it is
stable and tests are written, I'll submit it as a patch to numpy's
_compiled_base.c.

Cheers,

Stephen


/*
* Faster versions of bincount and casting strings to integers
*
* Author:  Stephen Simmons, [EMAIL PROTECTED]
* Date:11 March 2007
*
* This module contains C code for functions I am using to accelerate
* SQL-like aggregate functions for a column-oriented database, based on numpy.
*
* subtotal's bincount is typically 3-10 times faster than numpy's bincount,
* and more flexible
*  - takes bins as they are, without promoting their type to int32
*  - takes weights as they are, without promoting their type to double
*  - ignores negative bins, rather than reporting an error
*  - allows optional int32/double output array to be passed in
*  - allows specifying the maximum bin number to use; larger bin numbers are ignored
*  - only scans for the max bin number if neither max_bin nor an out array is specified
*
* atoi is 30-60 times faster than casting a string array to an integer type,
* and may optionally use a translation table. The translation table is
* a convenient way to map sparse integer strings to adjacent bins before
* using bincount.
*
* Typical usage
* -

# Set up arrays 5,000,000 elts long. s1 has strings filled with '1' and '2'.
# w are weights.
 s1 = numpy.ndarray(shape=5000000, dtype='S1'); s1[:]='2'; s1[1::2]='1'
 w = numpy.arange(5000000, dtype='f4')

# Using standard numpy functions, string conversion is slow
 i = s1.astype('i1')                -> 5.95 sec (!)
 numpy.bincount(i)                  -> 113 ms
 numpy.bincount(i, w)               -> 272 ms
 numpy.bincount(s1.astype('i1'), w) -> 6.12 sec

# Using the faster functions here:
 i = subtotal.atoi(s1)              -> 90 ms (60x faster)
 subtotal.bincount(i, max_bin=2)    -> 31 ms (3.6x faster)
 subtotal.bincount(i, w, max_bin=2) -> 51 ms (5.3x faster)

# In both cases, the output is
array([2, 1, 2, ..., 1, 2, 1], dtype=int8)
array([      0, 2500000, 2500000])
array([  0.00000000e+00,   6.25000000e+12,   6.24999750e+12])

# As an example of using a translate table, run bincount from atoi, applying
# a translate table for counting odd vs even strings. This does the whole lot
# in 158 ms, a speedup of 38x from 6.12 s above.
 t = numpy.arange(256, dtype='i1') % 2
 subtotal.bincount(subtotal.atoi(s1, t), w, max_bin=t.max()) -> 158 ms
array([  6.24999750e+12,   6.25000000e+12])

*
* These timings are based on numpy-1.0.1 Windows binary distribution
* for Python 2.5, compared with building this code using standard
distutils
* without any particular compiler optimisations:
*     C:\mnt\dev\subtotal> python setup.py build -cmingw32
* Processor is a 1.83GHz Pentium-M
*/

<... rest of C code omitted ...>


Stephen Simmons wrote:
 Hi,

 I'd like to propose some minor modifications to the function
 bincount(arr, weights=None), and would like some feedback from other
 users of bincount() before I write this up as a proper patch.

 Background:
 bincount() has two forms:
 - bincount(x) returns an integer array ians of length max(x)+1  where
 ians[n] is the number of times n appears in x.
 - bincount(x, weights) returns a double array dans of length max(x)+1,
 where dans[n] is the sum of the weights weights[i] over all positions
 where x[i]==n.
 In both cases, all elements of x must be non-negative.

 Proposed changes:
 (1) Remove the restriction that elements of x must be non-negative.
 Currently bincount() starts by finding max(x) and min(x). If the min
 value is negative, an exception is raised.  This change proposes
 dropping the initial search for min(x), and instead testing for
 non-negativity while summing values in the return arrays ians or dans.
 Any indexes where x is negative will be silently ignored. This
 will allow selective bincounts where values to ignore are flagged with
 a negative bin number.

 (2) Allow an optional argument for maximum bin number.
 Currently bincount(x) returns an array whose length is dependent on
 max(x). It is sometimes preferable to specify the exact size of the
 returned array, so this change would add a new optional argument,
 max_bin, which is one less than the size of the returned array. Under
 this change, bincount() starts by finding max(x) only if max_bin is
 not specified. Then the returned array ians or dans is created with
 length max_bin+1, and any indexes that would overflow the output array
 are silently ignored.
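
To make the existing semantics concrete, a tiny illustration of the current
numpy.bincount behaviour described in the background above (the array values
are made up for the example):

import numpy

x = numpy.array([0, 1, 1, 3])
w = numpy.array([0.5, 0.25, 0.25, 1.0])

print(numpy.bincount(x))     # [1 2 0 1]  -- ians[n] counts occurrences of n
print(numpy.bincount(x, w))  # [ 0.5  0.5  0.   1. ]  -- dans[n] sums weights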

Re: [Numpy-discussion] Matlab - NumPy translation and indexing

2007-03-14 Thread Robert Cimrman
David Koch wrote:
 Hi,
 
 so one thing I came across now is the following, very simple:
 
 Matlab:
 A = []
 while 
A = [A some_scalar_value]
 end
 
 
 In Python, I tried:
 
 A = empty((0,0))
 while 
A = concatenate((A, array([someScalarValue])), 1)
 end
 
 which returns an error since the shape of the empty A does not match the
 vector I want to concatenate with. Any way to get around this without
 having to use some obscure if exist A clause? I am still stuck in my
 Matlab ways - hoping to get rid of them though.

a = empty( (0,) )
...

I would bet that using a regular Python list would be faster in this
case than growing the numpy array.

a = []
while ...
 a.append( scalar )
a = array( a )

r.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Matlab - NumPy translation and indexing

2007-03-14 Thread David Koch

On 3/14/07, Sven Schreiber [EMAIL PROTECTED] wrote:




If you want a 1d-array in the end you could try empty(0) to start with,
and then do hstack((A, your_scalar)) or something like that.




Yeah, that works - odd, I thought concatenate((a,b),0) == hstack((a,b))

Thanks

/David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Matlab - NumPy translation and indexing

2007-03-14 Thread Timothy Hochberg

On 3/14/07, David Koch [EMAIL PROTECTED] wrote:


On 3/14/07, Sven Schreiber [EMAIL PROTECTED] wrote:



 If you want a 1d-array in the end you could try empty(0) to start with,
 and then do hstack((A, your_scalar)) or something like that.




Depending on what your generating routine looks like, you might also try a
= fromiter(...) where you package up your scalar source as an iterator.
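
For instance, a minimal sketch of the fromiter approach (the generator here
is a made-up stand-in for the real scalar source):

import numpy

def scalar_source():
    # made-up generator standing in for the real scalar-producing loop
    for i in range(10):
        yield i * 0.5

a = numpy.fromiter(scalar_source(), dtype=float)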




--

//=][=\\

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Matlab - NumPy translation and indexing

2007-03-14 Thread Sturla Molden
On 3/14/2007 2:46 PM, Robert Cimrman wrote:

 a = []
 while ...
  a.append( scalar )
 a = array( a )


While it may help, growing Python lists is also an O(N) process.

One can reduce the amount of allocations by preallocating an ndarray of 
a certain size (e.g. 1024 scalars), filling it up, and storing it in a 
linked list. Finally, the stored arrays are retrieved as a single 
contiguous array. Example code below (cf. class scalar_list).


Sturla Molden




import numpy

class ndarray_list:

    """ a singly linked list of numpy ndarrays. """

    class node:
        def __init__(self, data):
            self.next = None
            self.data = data

    def __init__(self):
        self.head = None
        self.tail = None
        self.len = 0

    def append(self, data):
        tmp = self.node(data)
        if self.tail == None:
            self.tail = tmp
            self.head = tmp
            self.len = len(data)
        else:
            self.tail.next = tmp
            self.tail = tmp
            self.len += len(data)

    def length(self): return self.len

    def flatten(self, dtype=float):
        # copy the stored chunks back-to-back into one contiguous array
        tmp = numpy.empty(self.len, dtype=dtype)
        cur = self.head
        idx0 = 0
        while cur:
            tmp[idx0:idx0+len(cur.data)] = cur.data
            idx0 += len(cur.data)
            cur = cur.next
        return tmp


class scalar_list:

    """ a singly linked list of numpy scalars. """

    def __init__(self, size=1024, dtype=float):
        self.cur = 0
        self.size = size
        self.dtype = dtype
        self.arr = numpy.empty(size, dtype)
        self.arrlist = ndarray_list()

    def append(self, scalar):
        cur = self.cur
        self.arr[cur] = scalar
        self.cur += 1
        if self.cur == self.size:
            # buffer full: stash it and start a fresh one
            self.arrlist.append(self.arr)
            self.arr = numpy.empty(self.size, self.dtype)
            self.cur = 0

    def array(self):
        if self.cur: self.arrlist.append(self.arr[:self.cur])
        retval = self.arrlist.flatten(self.dtype)
        # reset, keeping the flattened data as the first chunk so that
        # further appends continue from the current contents
        self.cur = 0
        self.arr = numpy.empty(self.size, self.dtype)
        self.arrlist = ndarray_list()
        self.arrlist.append(retval.copy())
        return retval


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] cannot pickle large numpy objects when memory resources are already stressed

2007-03-14 Thread Glen W. Mabey
On Wed, Mar 14, 2007 at 09:46:46AM -0700, Travis Oliphant wrote:
 Perhaps when the new bytes type is added to Python we will have a way to 
 view a memory area as a bytes object and be able to make a pickle 
 without creating that extra copy in memory.

Perhaps this is an aspect that could be mentioned in the PEP as
motivation?  

It seems kind of minor though; there certainly are more compelling
justifications for the PEP.

Thanks,
Glen
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Migrating from numeric to numpy

2007-03-14 Thread vinjvinj
So far my migration seems to be going well. I have one problem:

I've been using the scipy_base.insert and scipy_base.extract functions
and the behavior in numpy is not the same.

a = [0, 0, 0, 0]
mask = [0, 0, 0, 1]
c = [10]

numpy.insert(a, mask, c)

would change a so that

a = [0, 0, 0, 10]

This is not what I'm seeing in numpy.

Thanks,

VJ

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Questions regarding migrating from Numeric to numpy

2007-03-14 Thread vinjvinj
I'm in the process of migrating from Numeric to numpy. In some of my
code I have the following:

a = zeros(num_elements, PyObject)
b = zeros(num_elements, PyObject)

a is an array of python string objects and b is an array holding
mx.DateTime objects. What do I have to do to migrate this over to
numpy?

PS: I tried to register with the scipy mailing list but got no email
confirmation. I tried with two different email addresses.

Thanks,

VJ

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Putting some random back into the top-level?

2007-03-14 Thread Sebastian Haase
Hi,
Please remind me what's wrong with pylab's rand and randn!
I just learned about their existence recently and thought
they seem quite handy and should go directly into (the top level of) numpy.
Functions that have the same name and do the same thing don't conflict
either ;-)

-Sebastian



On 3/12/07, Rob Hetland [EMAIL PROTECTED] wrote:

 I, for one, would also like this.  Perhaps it should not be called
 'rand', however, as that conflicts with the pylab rand.  (numpy load
 and pylab load also conflict -- probably the only reason I ever use
 import pylab as pl in my scripts).  'random' is already taken by the
 whole package... What does this leave that is still sensible?

 -r

 On Mar 9, 2007, at 2:01 PM, Bill Baxter wrote:

  Has enough time passed with no top level random function that we can
  now put one back in?
  If I recall, the former top level rand() was basically removed because
  it didn't adhere to the 'shapes are always tuples' convention.
 
  Has enough time passed now that we can put something like it back in
  the top level, in tuple-taking form?
 
  I think this is a function people use pretty frequently when writing
  quick tests.
  And numpy.random.random_sample seems a rather long and not so obvious
  name for something that is used relatively frequently.
 
  --bb
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Migrating from numeric to numpy

2007-03-14 Thread Travis Oliphant
vinjvinj wrote:

So far my migration seems to be going well. I have one problem:

I've been using the scipy_base.insert and scipy_base.extract functions
and the behavior in numpy is not the same.

a = [0, 0, 0, 0]
mask = [0, 0, 0, 1]
c = [10]

numpy.insert(a, mask, c)

would change a so that

a = [0, 0, 0, 10]

This is not what I'm seeing in numpy.
  

scipy_base.insert --> numpy.place

-Travis
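
For the example given above, numpy.place does the in-place replacement:

import numpy

a = numpy.array([0, 0, 0, 0])
mask = numpy.array([0, 0, 0, 1])
numpy.place(a, mask, [10])   # fills values where mask is true, in place
print(a)                     # [ 0  0  0 10]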

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Questions regarding migrating from Numeric to numpy

2007-03-14 Thread Travis Oliphant
vinjvinj wrote:

I'm in the process of migrating from Numeric to numpy. In some of my
code I have the following:

a = zeros(num_elements, PyObject)
b = zeros(num_elements, PyObject)
  


PyObject --> object

-Travis
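
A minimal sketch of the numpy spelling (the stored values are illustrative):

import numpy

num_elements = 4
a = numpy.zeros(num_elements, dtype=object)  # Numeric's PyObject dtype
a[0] = "a python string"   # arbitrary Python objects, e.g. strings
a[1] = 3.14                # or mx.DateTime instances, can be stored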

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Matlab - NumPy translation and indexing

2007-03-14 Thread Timothy Hochberg

On 3/14/07, Sturla Molden [EMAIL PROTECTED] wrote:


On 3/14/2007 2:46 PM, Robert Cimrman wrote:

 a = []
 while ...
  a.append( scalar )
 a = array( a )


While it may help, growing Python lists is also an O(N) process.



This may just be a terminology problem, but just to be clear, appending to a
Python list is amortized (on average) constant time (O(1)), not O(N).


One can reduce the amount of allocations by preallocating an ndarray of

a certain size (e.g. 1024 scalars), filling it up, and storing it in a
linked list. Finally, the stored arrays are retrieved as a single
contiguous array. Example code below (cf. class scalar_list).


Sturla Molden




<... rest of quoted message omitted ...>





--

//=][=\\

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] cannot pickle large numpy objects when memory resources are already stressed

2007-03-14 Thread Francesc Altet
El dc 14 de 03 del 2007 a les 09:46 -0700, en/na Travis Oliphant va
escriure:
 Glen W. Mabey wrote:
 
 Hello,
 
 After running a simulation that took 6 days to complete, my script
 proceeded to attempt to write the results out to a file, pickled.
 
 The operation failed even though there was 1G of RAM free (4G machine).  
 I've since reconsidered using the pickle format for storing data sets 
 that include large numpy arrays.  However, somehow I assumed that one
 would be able to pickle anything that you already had in memory, but I
 see now that this was a rash assumption.
 
 Ought there to be a way to do this, or should I forget about being able
 to bundle large numpy arrays and other objects in a single pickle?

If you can afford using another package for doing I/O, perhaps PyTables
can save your day. It is optimized for saving and retrieving very large
amounts of data with ease. In particular, it can save your in-memory
arrays without making another copy in memory (provided the array
is contiguous).  It also allows compressing the data transparently,
without using additional memory.

Furthermore, a recent optimization introduced in the 2.0 branch a week
ago also allows *updating* an array on disk without making any copies
either.

HTH,

-- 
Francesc Altet|  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Nonblocking Plots with Matplotlib

2007-03-14 Thread Sebastian Haase
Hey Bill,
what are you using to communicate with the server?
May I recommend looking at Pyro!
(Python Remote Objects)
It would allow you to get your proxy objects.
It also handles exceptions super cleanly and easily.
I have used it for many years! It's very stable!

(If you run into problems, take a look at Pyro's one-way calls to
ensure that calls which would block don't wait for the function
to return.)

Just a thought --
-Sebastian Haase


On 3/13/07, Bill Baxter [EMAIL PROTECTED] wrote:
 Howdy Folks,

 I was missing the good ole days of using Matlab back at the Uni when I
 could debug my code, stop at breakpoints and plot various data without
 fear of blocking the interpreter process.

 Using ipython -pylab is what has been suggested to me in the past,
 but the problem is I don't do my debugging from ipython.  I have a
 very nice IDE that works very well, and it has a lovely interactive
 debugging prompt that I can use to probe my code when stopped at a
 breakpoint.  It's great except I can't really use matplotlib for
 debugging there because it causes things to freeze up.

 So I've come up with a decent (though not perfect) solution for
 quickie interactive plots which is to run matplotlib in a separate
 process.  I call the result 'ezplot'.  The first alpha version of
 this is now available at the Cheeseshop.  (I made an egg too, so if
 you have setuptools you can do easy_install ezplot.)

 The basic usage is like so:

  In [1]: import ezplot
  In [2]: p = ezplot.Plotter()
  In [3]: p.plot([1,2,3],[1,4,9],marker='o')
  Connecting to server... waiting...
  connected to plotserver 0.1.0a1 on http://localhost:8397
  Out[3]: True
  In [4]: from numpy import *
  In [5]: x = linspace(-5,5,20)
  In [13]: p.clf()
  Out[13]: True
  In [14]: p.plot(x, x*x*log(x*x+0.01))

 (Imagine lovely plots popping up on your screen as these commands are typed.)

 The only return values you get back are True (success...probably) or
 False (failure...for sure).  So no fancy plot object manipulation is
 possible.  But you can do basic plots no problem.

 The nice part is that this (unlike ipython's built-in -pylab threading
 mojo) should work just as well from wherever you're using python.
 Whether it's ipython (no -pylab) or Idle, or a plain MS-DOS console,
 or WingIDE's debug probe, or SPE, or a PyCrust shell or whatever.  It
 doesn't matter because all the client is doing is packing up data and
 shipping over a socket.  All the GUI plotting mojo happens in a
 completely separate process.

 There are plenty of ways this could be made better, but for me, for
 now, this probably does pretty much all I need, so it's back to Real
 Work.  But if anyone is interested in making improvements to this, let
 me know.

 Here's a short list of things that could be improved:
 * Right now I assume use of the wxAGG backend for matplotlib.  Don't
 know how much work it would be to support other back ends (or how to
 go about it, really).   wxAGG is what I always use.
 * Returning more error/exception info from the server would be nice
 * Returning full fledged proxy plot objects would be nice too, but I
 suspect that's a huge effort
 * SOAP may be better for this than xmlrpclib but I just couldn't get
 it to work (SOAPpy + Twisted).
 * A little more safety would be nice.  Anyone know how to make a
 Twisted xmlrpc server not accept connections from anywhere except
 localhost?
 * There's a little glitch in that the spawned plot server dies with
 the parent that created it.  Maybe there's a flag to subprocess.Popen
 to fix that?
 * Sometimes when you click on Exit Server, if there are plot windows
 open it hangs while shutting down.


 Only tested on Win32 but there's nothing much platform specific in there.

 Give it a try and let me know what you think!

 --bb

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Putting some random back into the top-level?

2007-03-14 Thread Timothy Hochberg

On 3/14/07, Sebastian Haase [EMAIL PROTECTED] wrote:


Hi,
Please remind me what's wrong with pylab's
rand and randn !
I just learned about their existence recently and thought
they seem quite handy and should  go directly into (the top-level of)
numpy.
Functions that have the same name and do the same thing don't conflict
either ;-)



I don't know what the problem, if any, is with rand and randn, but I can
tell you what the problem with stuffing stuff in the main namespace is: it's
already much too crowded, which makes it difficult to find functions when
you need them. Have you tried dir(numpy) recently?

If all of the random number functions live in one package, it's easy to scan
them, see what's there, and decide what is appropriate; that's much harder if
they are smeared out across multiple packages. You might say that we should
put these functions both in the top-level package and in the relevant
subpackage, but then it's difficult to see what packages exist, since the top
level package is too crowded (a situation that already exists today to a
certain extent).

Note that it's pretty easy for a downstream user to create a customized
package that pulls in all of the functions they feel are deserving: a
file with just "from numpy import *; from numpy.random import randn" comes
to mind. It's considerably more difficult for such a user to trim the main
namespace in a sensible way. Thus, it's my opinion that we should be moving
stuff out of the main package, not putting stuff in.
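
A sketch of that downstream approach ("mynumpy" is a hypothetical module
name, not an existing package):

# mynumpy.py -- pull exactly the names you want into one namespace
from numpy import *
from numpy.random import rand, randn

# elsewhere: import mynumpy as np; np.randn(3)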

Regards,


-Sebastian




On 3/12/07, Rob Hetland [EMAIL PROTECTED] wrote:

 I, for one, would also like this.  Perhaps it should not be called
 'rand', however, as that conflicts with the pylab rand.  (numpy load
 and pylab load also conflict -- probably the only reason I ever use
 import pylab as pl in my scripts).  'random' is already taken by the
 whole package... What does this leave that is still sensible?

 -r

 On Mar 9, 2007, at 2:01 PM, Bill Baxter wrote:

  Has enough time passed with no top level random function that we can
  now put one back in?
  If I recall, the former top level rand() was basically removed because
  it didn't adhere to the 'shapes are always tuples' convention.
 
  Has enough time passed now that we can put something like it back in
  the top level, in tuple-taking form?
 
  I think this is a function people use pretty frequently when writing
  quick tests.
  And numpy.random.random_sample seems a rather long and not so obvious
  name for something that is used relatively frequently.
 
  --bb





--

//=][=\\

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Multiplying Each Column of a Matrix by a Vector (Element-Wise)

2007-03-14 Thread Alexander Michael
Is there a clever way to multiply each column of a matrix by a vector
*element-wise*? That is, the equivalent of (from some 1D v and 2D m):

r = numpy.empty_like(m)
for i in range(m.shape[-1]):
r[...,i] = v*m[...,i]

Thanks,
Alex
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] arrayrange

2007-03-14 Thread Robert Kern
Miguel Oliveira, Jr. wrote:
 Hello,
 
 I've got a few codes that use arrayrange within numpy. It happens  
 that the new version of numpy apparently doesn't recognize  
 arrayrange... I've tried to change it to arange, but it doesn't  
 work... So, for example, the code below used to create a sine wave  
 file, but it's not working anymore... Does anyone have any clue of  
 what may be a solution for this?

arange() is the appropriate replacement for arrayrange(); arrayrange() was just
an alias to arange(). What did you expect it to do that it isn't doing?

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiplying Each Column of a Matrix by a Vector (Element-Wise)

2007-03-14 Thread Robert Kern
Alexander Michael wrote:
 Is there a clever way to multiply each column of a matrix by a vector
 *element-wise*? That is, the equivalent of (from some 1D v and 2D m):
 
 r = numpy.empty_like(m)
 for i in range(m.shape[-1]):
 r[...,i] = v*m[...,i]

  numpy.multiply(m, v[:,numpy.newaxis])

If m is actually just a shape-(n,m) array rather than a numpy.matrix object,
then you can spell that like so:

  m * v[:,numpy.newaxis]
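
A quick check of the broadcasting shapes (illustrative arrays, not from the
original question):

import numpy

v = numpy.arange(3.0)            # shape (3,)
m = numpy.ones((3, 4))           # shape (3, 4)
r = m * v[:, numpy.newaxis]      # v[:, newaxis] has shape (3, 1), broadcasts
assert r.shape == m.shape        # each column r[:, i] equals v * m[:, i]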

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Putting some random back into the top-level?

2007-03-14 Thread Sebastian Haase
On 3/14/07, Timothy Hochberg [EMAIL PROTECTED] wrote:


 On 3/14/07, Sebastian Haase [EMAIL PROTECTED] wrote:
  Hi,
  Please remind me what's wrong with pylab's
  rand and randn !
  I just learned about their existence recently and thought
  they seem quite handy and should  go directly into (the top-level of)
 numpy.
  Functions that have the same name and do the same thing don't conflict
  either ;-)

 I don't know what the problem, if any, is with rand and randn, but I can
 tell you what the problem with stuffing stuff in the main namespace is: it's
 allready much too crowded, which makes it difficult to find functions when
 you need them. Have you tried dir(numpy) recently?

Hey Tim,
yes, I have done this many times -- just to scare myself ;-)
As I see it, most of them are historical problems -- and we will
likely be stuck with them forever -- since the 1.0 commitment
apparently doesn't even allow making numpy.resize and array.resize
fill in the same way [[ one adds 0s, the other repeats the array ]].
(Especially I'm thinking of hanning and hamming and other things I
understand even less ...)

The only argument here was that one or two [ :-) ] random functions
[ how about rand() and randn()? ]
would be nice to have as shortcuts.

Yes, I have some modules myself, containing a bunch of home-made
things, that I call useful.
I understand the argument here was to get the best possible common ground.

I don't have (very) strong feelings about this.
-Sebastian
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] arrayrange

2007-03-14 Thread Eike Welk
I would use something like this:
t = linspace(0, durSecs, durSecs*SRate)

Do you know the 'Numpy Example List'?
http://www.scipy.org/Numpy_Example_List

Regards Eike.

PS: Ah, you did subscribe.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] import numpy segmentation fault

2007-03-14 Thread Cory Davis
Hi there,

I have just installed numpy-1.0.1 from source, which seemed to go
fine.  However when I try to import numpy I get a segmentation
fault.

I have a 64-bit machine running RedHat Enterprise Linux and Python 2.3.4.

Any clues greatly appreciated.

Cheers,
Cory.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] import numpy segmentation fault

2007-03-14 Thread Zachary Pincus
If I recall correctly, there's a bug in numpy 1.0.1 on Linux-x86-64  
that causes this segfault. This is fixed in the latest SVN version of  
numpy, so if you can grab that, it should work.

I can't find the trac ticket, but I ran into this some weeks ago.

Zach



On Mar 14, 2007, at 1:36 PM, Cory Davis wrote:

 Hi there,

 I have just installed numpy-1.0.1 from source, which seemed to go
 fine.  However when I try to import numpy I get a segmentation
 fault.

 I have a 64-bit machine running RedHat Enterprise Linux and Python 2.3.4.

 Any clues greatly appreciated.

 Cheers,
 Cory.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Which dtype are supported by numexpr ?

2007-03-14 Thread David M. Cooke
On Wed, 14 Mar 2007 13:02:10 +0100
Francesc Altet [EMAIL PROTECTED] wrote:

 The info above is somewhat inexact. I was talking about the enhanced
 numexpr version included in PyTables 2.0 (see [1]). The original version of
 numexpr (see [2]) doesn't have support for int64 on 32-bit platforms and
 also neither does for strings. Sorry for the confusion.

What are you doing with strings in numexpr? Ivan Vilata i Balaguer submitted
a patch to add them, but I rejected it, as I didn't have any use cases for
strings that needed the speed of numexpr and justified the added complexity.

-- 
||\/|
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numexpr efficency depends on the size of the computing kernel

2007-03-14 Thread Francesc Altet
Hi,

Now that I have my old AMD Duron machine at hand again, I've run some
benchmarks to prove that numexpr's performance is not influenced by
the size of the CPU cache, but I failed miserably (and Tim was right:
numexpr's efficiency does depend on the CPU cache size).

Given that the PyTables instance of the numexpr computing kernel is
quite a bit larger (it supports more datatypes) than the original,
comparing the performance of both versions is a good way to check
the influence of the CPU cache on computing efficiency.

The attached benchmark is a small modification of the timing.py that
comes with the numexpr package (the modification was needed to allow the
numexpr version of PyTables to run all the cases). Basically, the
expressions tested are operations on arrays of 1 million elements, with
a mix of contiguous and strided arrays (no unaligned arrays are present
here). See the benchmark code for the details.

The speed-ups of numexpr over plain numpy on an AMD Duron machine (64 +
64 KB L1 cache, 64 KB L2 cache) are:

For the original numexpr package:

2.14, 2.21, 2.21  (these represent averages for 3 complete runs)

For the modified pytables version (enlarged computing kernel):

1.32, 1.34, 1.37

So, with a CPU with a very small cache, the original numexpr kernel is
1.6x faster than the pytables one.

However, using an AMD Opteron, which has a much bigger L2 cache (64 + 64
KB L1 cache, 1 MB L2 cache), the speed-ups are quite similar:

For the original numexpr package:

3.10, 3.35, 3.35

For the modified pytables version (enlarged computing kernel):

3.37, 3.50, 3.45

So, there is effectively a dependency on the CPU cache size. It would be
nice to run the benchmark on other CPUs with an L2 cache in the range
between 64 KB and 1 MB so as to find the point where the performance
starts to be similar (this should give a good guess of the size of the
computing kernel).

Meanwhile, the lesson learned is that Tim's worries were correct: one
should be very careful about adding more opcodes (at least while CPUs with
a very small L2 cache are in use).  Given this, perhaps we will have to
reduce the opcodes in the numexpr version for PyTables to a bare
minimum :-/

Cheers,

-- 
Francesc Altet|  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth
Expression: b*c+d*e
numpy: 0.284756803513
Skipping weave timing
numexpr: 0.267185997963
Speed-up of numexpr over numpy: 1.06576244894

Expression: 2*a+3*b
numpy: 0.228031897545
Skipping weave timing
numexpr: 0.190967607498
Speed-up of numexpr over numpy: 1.19408679059

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.875679397583
Skipping weave timing
numexpr: 0.729962491989
Speed-up of numexpr over numpy: 1.19962245621

Expression: 2*a + arctan2(a, b)
numpy: 0.530754685402
Skipping weave timing
numexpr: 0.440991616249
Speed-up of numexpr over numpy: 1.20354824411

Expression: a**2 + (b+1)**-2.5
numpy: 0.830808615685
Skipping weave timing
numexpr: 0.408902907372
Speed-up of numexpr over numpy: 2.03179923817

Expression: (a+1)**50
numpy: 0.486846494675
Skipping weave timing
numexpr: 0.394672584534
Speed-up of numexpr over numpy: 1.23354525689

Expression: sqrt(a**2 + b**2)
numpy: 0.387914180756
Skipping weave timing
numexpr: 0.292760682106
Speed-up of numexpr over numpy: 1.3250214406

Average = 1.32191226793
Expression: b*c+d*e
numpy: 0.279518294334
Skipping weave timing
numexpr: 0.225658392906
Speed-up of numexpr over numpy: 1.23867891965

Expression: 2*a+3*b
numpy: 0.227924203873
Skipping weave timing
numexpr: 0.190263104439
Speed-up of numexpr over numpy: 1.19794221031

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.865833806992
Skipping weave timing
numexpr: 0.736699199677
Speed-up of numexpr over numpy: 1.17528810588

Expression: 2*a + arctan2(a, b)
numpy: 0.536459088326
Skipping weave timing
numexpr: 0.465694189072
Speed-up of numexpr over numpy: 1.15195572742

Expression: a**2 + (b+1)**-2.5
numpy: 0.803207492828
Skipping weave timing
numexpr: 0.402952003479
Speed-up of numexpr over numpy: 1.99330810095

Expression: (a+1)**50
numpy: 0.506087398529
Skipping weave timing
numexpr: 0.390724515915
Speed-up of numexpr over numpy: 1.29525376043

Expression: sqrt(a**2 + b**2)
numpy: 0.390014004707
Skipping weave timing
numexpr: 0.292934322357
Speed-up of numexpr over numpy: 1.33140426007

Average = 1.34054729781
Expression: b*c+d*e
numpy: 0.282696795464
Skipping weave timing
numexpr: 0.227395987511
Speed-up of numexpr over numpy: 1.2431916612

Expression: 2*a+3*b
numpy: 0.247914505005
Skipping weave timing
numexpr: 0.206929206848
Speed-up of numexpr over numpy: 1.19806434665

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.87483150959
Skipping weave timing
numexpr: 0.722416090965
Speed-up of numexpr over numpy: 1.21098009932

Expression: 2*a + arctan2(a, b)
numpy: 0.546046590805
Skipping weave timing
numexpr: 0.440475416183
Speed-up of numexpr over 

[Numpy-discussion] zoom FFT with numpy?

2007-03-14 Thread Ray S
We'd like to do what most call a zoom FFT; we are only interested
in the frequencies of, say, 6kHz to 9kHz with a given N, so the
computations from DC to 6kHz are wasted CPU time.
Can this be done without additional numpy pre-filtering computations?

If explicit filtering is needed to baseband the data, is it worth
it? It sounds like we need to generate cosine data once for each band
and N, then multiply it with the data for each FFT.

Has anyone coded this up before? I couldn't find any references other 
than http://www.dsprelated.com/showmessage/38917/1.php
and http://www.numerix-dsp.com/siglib/examples/test_zft.c (which uses 
Numerix).

Ray

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Nonblocking Plots with Matplotlib

2007-03-14 Thread Bill Baxter
Thanks, Sebastian.  I'll take a look at Pyro.  Hadn't heard of it.
I'm using just xmlrpclib with pickle right now.

--bb

On 3/15/07, Sebastian Haase [EMAIL PROTECTED] wrote:
 Hey Bill,
 what are you using to communicate with the server?
 May I recommend looking at Pyro!
 (Python Remote Objects)
 It would allow you to get your proxy objects.
 It also handles exceptions super cleanly and easily.
 I have used it for many years! It's very stable!

 (If you run into problems, take a look at Pyro's one-way calls to
 ensure that calls which would block don't wait for the function
 to return.)

 Just a thought --
 -Sebastian Haase


 On 3/13/07, Bill Baxter [EMAIL PROTECTED] wrote:
  <... quoted message omitted ...>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] in place random generation

2007-03-14 Thread Daniel Mahler
On 3/12/07, Travis Oliphant [EMAIL PROTECTED] wrote:

 I'm not convinced that the broadcasting is causing the slow-downs.
 Currently, the code has two path-ways.  One gets called when the inputs
 are scalars which is equivalent to the old code and the second gets
 called when broadcasting is desired.  Their should be only a very small
 speed hit because of the check.   So, it should be possible to do both
 without great differences in speed.

 Perhaps the speed hit is due to something else (like creating 0-d arrays
 from scalars, or something similar).So, in other words, I think the
 problem is fixable, but we need someone who can track down where the
 slowness is coming from.

 -Travis

The problem is very easy to reproduce.
What would it take to do the tracking?
I'd like to see this fixed.
Presumably the python profiler is no use since the problem is inside C code,
which is not my expertise.

D
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Nonblocking Plots with Matplotlib

2007-03-14 Thread Bill Baxter
On 3/15/07, Bill Baxter [EMAIL PROTECTED] wrote:
 Thanks, Sebastian.  I'll take a look at Pyro.  Hadn't heard of it.
 I'm using just xmlrpclib with pickle right now.

I took a look at Pyro -- it looks nice.
The only thing I couldn't find, though, is how to decouple the wx GUI on
the server side from the Pyro remote call handler.  Both wx and Pyro
want to run a main loop.

With the XML-RPC, I could use twisted and its wxreactor class.  That
does all the necessary magic under the hood to run both loops.
Basically all you have to do to make it work is:

    class MainApp(wx.App, twisted.web.xmlrpc.XMLRPC):
        ...

    twisted.internet.wxreactor.install()
    app = MainApp()
    twisted.internet.reactor.registerWxApp(app)
    twisted.internet.reactor.run()

And then you're good to go.  reactor.run() takes care of both main
loops somehow.

Do you know of any good examples showing how to do that sort of thing
with Pyro?  It must be possible.  I mean it's the exact same sort of
thing you'd need if you're writing a simple GUI internet chat program.
 My googling has turned up nothing, though.

--bb
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] in place random generation

2007-03-14 Thread Charles R Harris

On 3/14/07, Daniel Mahler [EMAIL PROTECTED] wrote:


On 3/12/07, Travis Oliphant [EMAIL PROTECTED] wrote:

 I'm not convinced that the broadcasting is causing the slow-downs.
 Currently, the code has two path-ways.  One gets called when the inputs
 are scalars which is equivalent to the old code and the second gets
 called when broadcasting is desired.  There should be only a very small
 speed hit because of the check.   So, it should be possible to do both
 without great differences in speed.

 Perhaps the speed hit is due to something else (like creating 0-d arrays
 from scalars, or something similar).So, in other words, I think the
 problem is fixable, but we need someone who can track down where the
 slowness is coming from.

 -Travis

The problem is very easy to reproduce.
What would it take to do the tracking?
I'd like to see this fixed.
Presumably the python profiler is no use since the problem is inside C
code,
which is not my expertise.



Well, it is not creation of scalars per se that is slow.

In [6]: def normal1() :
   ...:     a = normal(0,1,1000)
   ...:     for i in xrange(1000) :
   ...:         b = a[i]

In [7]: def normal2() :
   ...:     for i in xrange(1000) :
   ...:         b = normal(0,1)

In [18]: t1=Timer("normal1()", "from __main__ import normal1")

In [19]: t2=Timer("normal2()", "from __main__ import normal2")

In [22]: t1.timeit(100)
Out[22]: 0.086430072784423828

In [23]: t2.timeit(100)
Out[23]: 4.276231050491333

Looks to me like the call overhead dominates and that the fast path is not
nearly as fast as it could be.  Either it takes a long time to decide which
path to take, or something is getting reinitialized that needn't be. Could be
that normals should be special-cased, but I can't tell till I look at the
code.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] zoom FFT with numpy?

2007-03-14 Thread Charles R Harris

On 3/14/07, Ray S [EMAIL PROTECTED] wrote:


We'd like to do what most call a zoom FFT; we are only interested
in the frequencies of, say, 6kHz to 9kHz with a given N, so the
computations from DC to 6kHz are wasted CPU time.
Can this be done without additional numpy pre-filtering computations?

If explicit filtering is needed to baseband the data, is it worth
it? It sounds like we need to generate cosine data once for each band
and N, then multiply it with the data for each FFT.

Has anyone coded this up before? I couldn't find any references other
than http://www.dsprelated.com/showmessage/38917/1.php
and http://www.numerix-dsp.com/siglib/examples/test_zft.c (which uses
Numerix).



Sounds like you want to save CPU cycles. How much you can save will depend
on the ratio of the bandwidth to the Nyquist frequency. Generally, it is
easiest to just FFT, but if the band is very narrow and time matters, you
might try:

1) explicit multipy by the transform matrix but only use a few rows.
2) frequency shift, {low pass}, and downsample
3) just downsample and fix up the aliasing

Much depends on the length of the data run and what the signal to noise is.
Google indicates that the zoom FFT is (2) without the low pass, which is fine
if you don't need optimum s/n in the result. It should be easy to cook up
your own routine.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] zoom FFT with numpy?

2007-03-14 Thread Anne Archibald
On 15/03/07, Ray Schumacher [EMAIL PROTECTED] wrote:

 The desired band is rather narrow, as the goal is to determine the f of a
 peak that always occurs in a narrow band of about 1kHz around 7kHz

  2) frequency shift, {low pass}, and downsample

By this I would take it to mean, multiply by a complex sinusoid at 7
kHz, optionally low-pass filter, dispose of every high frequency in
the result by downsampling (boxcar averaging will be an ultra-simple
way to include a crummy low-pass filter) then FFT. (Or some equivalent
purely-real operation, doesn't much matter.) Basically implement a
heterodyne receiver in software.
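
A minimal numpy sketch of that heterodyne recipe (the sample rate, test
tone, and decimation factor below are made-up numbers, not from the thread):

import numpy as np

fs = 48000.0                                  # assumed sample rate (Hz)
n = 2048
t = np.arange(n) / fs
x = np.cos(2 * np.pi * 7200.0 * t)            # test signal near 7 kHz

lo = np.exp(-2j * np.pi * 7000.0 * t)         # local oscillator at 7 kHz
mixed = x * lo                                # band of interest moved near DC

dec = 8                                       # boxcar-average and downsample
baseband = mixed[:n - n % dec].reshape(-1, dec).mean(axis=1)

spec = np.fft.fft(baseband)
freqs = np.fft.fftfreq(baseband.size, d=dec / fs) + 7000.0  # undo the shift
peak = freqs[np.argmax(np.abs(spec))]         # ~7200 Hz for this test tone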

   Much depends on the length of the data run and what the signal to noise is.

 data is stable for 2-4k samps, and s/n is low, ~5-10

It seems like this would benefit from some filtering... you could try
numpy.convolve to apply a FIR filter, but scipy.signal has a whole
heap of tools for efficiently designing and applying FIR and IIR
filters.

If you've got a good filter design arrangement, I suppose you could
just band-pass filter the data then downsample and FFT; the aliasing
would drop the frequency you want into the band-pass in a predictable
way. Saves generating and multiplying the  complex sinusoids.

 What I had been doing is a 2048 N full real_FFT with a Hann window, and
 further analyzing the side lobe/bin energy (via linear interp) to try to
 more precisely determine the f within the peak's bin. (How legitimately
 valuable that is I'm not sure... IANAM)

This can be a good idea; if what you're observing really is an
isolated sinusoidal signal, you can get frequency accuracy limited
only by your signal-to-noise ratio. Off the top of my head I don't
remember how you do the magic, but Ransom et al. 2002 (Fourier
Techniques for Very Long Astrophysical Time Series Analysis,
published in ApJ but more accessible at
http://lanl.arxiv.org/abs/astro-ph/0204349 ; hi Scott!) describe the
process without undue pain. They also discuss the issue of computing
only part of an FFT and claim that it's rarely worth the trouble.

 It's awfully hard to beat the FFTPACK library's full FFT by hand-rolling
 anything else, I'd guess. I was hoping for some all-numpy procedure that I
 could use weave with; that might stand a chance of being faster.

Well, mixing it down then FFTing should be pretty fast. Particularly
if the problem is that you have to do it a zillion times, rather than
that your FFT is huge, so that you can keep the LO array around. And
scipy.signal should help.

Anne M. Archibald
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] zoom FFT with numpy?

2007-03-14 Thread Charles R Harris

On 3/14/07, Ray Schumacher [EMAIL PROTECTED] wrote:


On 3/14/07, Charles R Harris wrote:

 Sounds like you want to save cpu cycles.
 How much you can save will depend
 on the ratio of the bandwidth to the nyquist.

The desired band is rather narrow, as the goal is to determine the f of a
peak that always occurs in a narrow band of about 1kHz around 7kHz

2) frequency shift, {low pass}, and downsample

 Much depends on the length of the data run and what the signal to noise
is.

data is stable for 2-4k samps, and s/n is low, ~5-10

 Google indicates that the zoom FFT is (2) without the low pass, which is
fine if you don't need optimum s/n in the result.

What I had been doing is a 2048 N full real_FFT with a Hann window, and
further analyzing the side lobe/bin energy (via linear interp) to try to
more precisely determine the f within the peak's bin. (How legitimately
valuable that is I'm not sure... IANAM)



That's usually fine. You might want to zero-fill to get more samples through
the band. It would help if you gave the sample frequency in Hz too. Anyway,
unless time is important, I would just zero-fill by a factor of 4-8 and
transform. You can get the same effect with a chirp-z transform, but again
this depends on how much programming work you want to do. If you just
have a couple of lines in the band that you want to locate, you could also
try maximum entropy methods.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion