[Numpy-discussion] numerical noise for simple calculations

2008-02-10 Thread dmitrey
hi all,
I need a good estimate of the numerical noise for simple calculations.

I.e. when I calculate something like sin(15)+cos(80), I get a result
with a precision of, for example, 1e-11.

I guess the precision depends on the system architecture, doesn't it?

So what's the best way to estimate the value?

I guess it should be something like 10*numpy.machine_precision,
shouldn't it?
Regards, D.


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numerical noise for simple calculations

2008-02-10 Thread dmitrey
I just need a single number, on average.
I have committed some changes to the NLP/NSP ralg solver in
scikits.openopt; for non-noisy funcs it now works better, but for noisy
funcs it's the other way around, so my examples/nssolveVSfsolve.py no
longer works as it should. Hence I need to implement a noise parameter
and assign a default value to it.

So, the question is: what should the default value be? I was thinking
of either 0 or something like K*numpy.machine_precision, where K is
something like 1...10...100.
Regards, D.

Timothy Hochberg wrote:


 On Sun, Feb 10, 2008 at 4:23 AM, dmitrey [EMAIL PROTECTED] wrote:

 hi all,
 I need a good estimate of the numerical noise for simple calculations.

 I.e. when I calculate something like sin(15)+cos(80), I get a result
 with a precision of, for example, 1e-11.

 I guess the precision depends on the system architecture, doesn't it?

 So what's the best way to estimate the value?

 I guess it should be something like 10*numpy.machine_precision,
 shouldn't it?


 This is a complicated subject, which I'm really not qualified to 
 comment on, but I'm not going to let that stop me. I believe that you 
 want to know how accurate something like the above is given exact 
 inputs. That is a somewhat artificial problem, but I'll answer it to 
 the best of my ability.

 Functions like sin, cos, +, etc. can in theory compute their results
 to within one ULP, or maybe half a ULP (I can't recall exactly). A ULP
 is a Unit in the Last Place. To explain a ULP, let's pretend that we
 were using decimal floating point with 3 digits of precision and look
 at a couple of numbers:

 1.03e-03  -- 1 ULP = 1e-5
 3.05e+02 -- 1 ULP = 1

 We're obviously not using decimal floating point, we're using binary 
 floating point, but the basic idea is the same. The result is that the 
 accuracy is going to totally depend on the magnitude of the result. If 
 the result is small, in general the result will be more accurate in an 
 absolute sense, although not generally in a relative sense.
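In binary floating point the same idea can be checked with np.spacing, which returns the size of one ULP at a given value (a quick sketch):

```python
import numpy as np

# np.spacing(x) is the distance from x to the next representable
# double, i.e. the size of one ULP at x.
print(np.spacing(1.0))    # equals machine epsilon
print(np.spacing(1e6))    # a ULP is larger at large magnitudes
print(np.spacing(1e-6))   # and smaller at small magnitudes

# The absolute ULP grows with magnitude, matching the decimal
# examples above, while the relative size stays roughly constant.
assert np.spacing(1e6) > np.spacing(1.0) > np.spacing(1e-6)
```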

 In practice, this is drastically oversimplified since the inputs are 
 generally of finite accuracy. Different functions will either magnify 
 or shrink the input error depending on both the function and the value 
 of the input. If you can find an easy to read introduction to 
 numerical analysis, it would probably help. Unfortunately, I don't 
 know of a good one to recommend; the text I have is a pretty hard slog.

 To complicate this further, functions don't always compute their
 results to maximum theoretical accuracy, presumably in the interest of
 reasonable performance.

 So, in the end the answer is: it depends. In practice, the only
 useful, simple advice I've seen for getting a handle on accuracy is to
 compute results using at least two different precisions and verify
 that things are converging sensibly. And compare to known results
 wherever possible.






[Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Matthew Brett
Hi,

I am sorry if I have missed something obvious, but is there any way in
Python of doing this:

import numpy as np
a = np.arange(10)
b = np.arange(10)+1
a.data = b.data # raises error, but I hope you see what I mean

?

Thanks a lot for any pointers.

Matthew


Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Robert Kern
On Feb 10, 2008 5:15 PM, Matthew Brett [EMAIL PROTECTED] wrote:
 Hi,

 I am sorry if I have missed something obvious, but is there any way in
 Python of doing this:

 import numpy as np
 a = np.arange(10)
 b = np.arange(10)+1
 a.data = b.data # raises error, but I hope you see what I mean

 ?

Not really, no. Can you describe your use case in more detail?

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Robert Kern
On Feb 10, 2008 6:48 PM, Matthew Brett [EMAIL PROTECTED] wrote:
   import numpy as np
   a = np.arange(10)
   b = np.arange(10)+1
   a.data = b.data # raises error, but I hope you see what I mean
  
   ?
 
  Not really, no. Can you describe your use case in more detail?

 Yes - I am just writing the new median implementation.   To allow
 future optimization, I would like to have the same signature as
 mean():

 def median(a, axis=0, dtype=None, out=None)

 (axis=0 to change to axis=None default at some point).

 To do this, I need to copy the results of the median calculation in
 the routine into the array object given by 'out' - when passed.

Ah, I see. You definitely do not want to reassign the .data buffer in
this case. An out= parameter does not reassign the memory location
that the array object points to; it should use the allocated memory
that was already there. It shouldn't copy anything at all; otherwise,
median(x, out=out) is no better than out[:] = median(x). Personally, I
don't think that a function should expose an out= parameter unless it
can make good on that promise of memory efficiency. Can you show us
the current implementation that you have?

-- 
Robert Kern



Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Matthew Brett
  import numpy as np
  a = np.arange(10)
  b = np.arange(10)+1
  a.data = b.data # raises error, but I hope you see what I mean
 
  ?

 Not really, no. Can you describe your use case in more detail?

Yes - I am just writing the new median implementation.   To allow
future optimization, I would like to have the same signature as
mean():

def median(a, axis=0, dtype=None, out=None)

(axis=0 to change to axis=None default at some point).

To do this, I need to copy the results of the median calculation in
the routine into the array object given by 'out' - when passed.

Matthew


Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Matthew Brett
 Ah, I see. You definitely do not want to reassign the .data buffer in
 this case. An out= parameter does not reassign the memory location
 that the array object points to. It should use the allocated memory
 that was already there. It shouldn't copy anything at all;
 otherwise, median(x, out=out) is no better than out[:] =
 median(x). Personally, I don't think that a function should expose an
 out= parameter unless if it can make good on that promise of memory
 efficency.

I agree - but there are more efficient median algorithms out there
which can make use of the provided memory efficiently.  I wanted to
establish the call signature to allow that.  I don't feel strongly
about it though.

 Can you show us the current implementation that you have?

It is attached; comments welcome...

Matthew
import numpy as np

def median(a, axis=0, dtype=None, out=None):
    """Compute the median along the specified axis.

    Returns the median of the array elements.  The median is taken
    over the first dimension of the array by default, otherwise over
    the specified axis.

    Parameters
    ----------
    axis : {None, int}, optional
        Axis along which the medians are computed. The default is to
        compute the median along the first dimension.  axis=None
        returns the median of the flattened array.

    dtype : type, optional
        Type to use in returning the medians. For arrays of integer
        type the default is float32; for arrays of float types it is
        the same as the array type. Integer arrays may return float
        medians because, when the chosen axis has length N and N is
        even, the median is given by the mean of the two central
        values (see notes).

    out : ndarray, optional
        Alternative output array in which to place the result. It must
        have the same shape as the expected output, but the type will
        be cast if necessary.

    Returns
    -------
    median : The return type varies, see above.
        A new array holding the result is returned unless out is
        specified, in which case a reference to out is returned.

    See Also
    --------
    mean

    Notes
    -----
    Given a vector V of length N, the median of V is the middle value
    of a sorted copy of V (Vs) - i.e. Vs[(N-1)/2], when N is odd. It
    is the mean of the two middle values of Vs, when N is even.
    """
    sorted = np.sort(a, axis)
    if dtype is None:
        if a.dtype in np.sctypes['int']:
            dtype = np.float32
        else:
            dtype = a.dtype
    if axis is None:
        axis = 0
    indexer = [slice(None)] * sorted.ndim
    index = int(sorted.shape[axis] / 2)
    if sorted.shape[axis] % 2 == 1:
        indexer[axis] = index
        ret = sorted[indexer]  # was sorted(indexer): an index, not a call
    else:
        indexer[axis] = slice(index - 1, index + 1)
        ret = np.sum(sorted[indexer], axis=axis) / 2.0
    if dtype in np.sctypes['int']:
        ret = ret.round()
    if ret.dtype != dtype:
        ret = ret.astype(dtype)
    if out is not None:
        if not (out.shape == ret.shape and
                out.nbytes == ret.nbytes):
            raise ValueError, 'wrong shape for output'
        # This doesn't work - out.data = ret.data
        raise ValueError, 'out parameter not working yet'
    return ret


[Numpy-discussion] non-contiguous array error

2008-02-10 Thread Brad Malone
Hi, I am receiving an "AttributeError: incompatible shape for a
non-contiguous array" error.  A quick illustration of the type of code
that gives me the error is shown below:

from numpy import *
list = [i for i in range(0, 27)]
c = array(list)
c.shape = (3, 3, 3)
d = fft.fftn(c)
d.shape = (27)   # <-- error here


I suppose this has something to do with the fact that the fourier
transform of c has imaginary parts, which affects the way the
information is stored in memory, and this messed up the call to
.shape?

Is there another way to do this, or will I need to rewrite this
section of my code?

Thanks for all you do,
Brad


Re: [Numpy-discussion] non-contiguous array error

2008-02-10 Thread Robert Kern
On Feb 10, 2008 7:43 PM, Brad Malone [EMAIL PROTECTED] wrote:
 Hi, I am receiving an "AttributeError: incompatible shape for a
 non-contiguous array" error.  A quick illustration of the type of code
 that gives me the error is shown below:

 from numpy import *
 list = [i for i in range(0, 27)]
 c = array(list)
 c.shape = (3, 3, 3)
 d = fft.fftn(c)
 d.shape = (27)   # <-- error here

 I suppose this has something to do with the fact that the fourier
 transform of c has imaginary parts, which affects the way the
 information is stored in memory, and this messed up the call to
 .shape?

No. The problem is that (27) is not a tuple. Parentheses are also
used for grouping expressions in Python, so a single-element tuple
needs a comma to disambiguate. You want d.shape = (27,).

-- 
Robert Kern



Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Robert Kern
On Feb 10, 2008 7:17 PM, Matthew Brett [EMAIL PROTECTED] wrote:
  Ah, I see. You definitely do not want to reassign the .data buffer in
  this case. An out= parameter does not reassign the memory location
  that the array object points to. It should use the allocated memory
  that was already there. It shouldn't copy anything at all;
  otherwise, median(x, out=out) is no better than out[:] =
  median(x). Personally, I don't think that a function should expose an
  out= parameter unless if it can make good on that promise of memory
  efficency.

 I agree - but there are more efficient median algorithms out there
 which can make use of the memory efficiently.  I wanted to establish
 the call signature to allow that.  I don't feel strongly about it
 though.

I say add the out= parameter when you use such an algorithm. But if
you like, just use slice assignment for now.
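The slice-assignment fallback might look like this sketch (using numpy's own median as a stand-in for the computation, and out[...] = ret for the in-place copy; median_with_out is a hypothetical name for illustration):

```python
import numpy as np

def median_with_out(a, axis=0, out=None):
    # Sketch only: compute the result, then copy it into 'out' in
    # place via slice assignment, so 'out' keeps its own buffer.
    ret = np.median(a, axis=axis)
    if out is None:
        return ret
    if out.shape != ret.shape:
        raise ValueError('wrong shape for output')
    out[...] = ret
    return out

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
buf = np.empty(3)
r = median_with_out(x, axis=0, out=buf)
print(r)   # buf now holds the column medians; r is buf itself
```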

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] Setting contents of buffer for array object

2008-02-10 Thread Damian R. Eads
Matthew Brett wrote:
 import numpy as np
 a = np.arange(10)
 b = np.arange(10)+1
 a.data = b.data # raises error, but I hope you see what I mean

 ?
 Not really, no. Can you describe your use case in more detail?

 Yes - I am just writing the new median implementation.   To allow
 future optimization, I would like to have the same signature as
 mean():

 def median(a, axis=0, dtype=None, out=None)

 (axis=0 to change to axis=None default at some point).

 To do this, I need to copy the results of the median calculation in
 the routine into the array object given by 'out' - when passed.

My understanding of numerical routines that accept an out parameter is
that this is a convention for in-place algorithms.  When None is passed
as the out parameter, it's the caller's way of indicating that in-place
operation is not needed, and a new array is allocated to store the
result; otherwise, the result is stored in the 'out' array. Either way,
the result is returned. One can break from this convention by
allocating more memory than the out array provides, but that's a
performance issue that may or may not be avoidable.

Remember that A[:] = expr sets the value of the elements in A to the
values of array elements in the expression expr, and this copying is done
in-place. To copy an array C, and make the copy contiguous, use the
.copy() method on C.
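A quick sketch of the difference between in-place slice assignment and rebinding, which is what makes reassigning .data hazardous for views:

```python
import numpy as np

a = np.arange(10)
b = np.arange(10) + 1
view = a[:5]        # a view sharing a's data buffer

# In-place copy: a keeps its buffer, so existing views see the change.
a[:] = b
print(view[0])      # reflects the new contents of a

# Rebinding the name gives 'a' a fresh buffer; 'view' still refers
# to the old one, which is the kind of split described above.
a = b.copy()
```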

Assigning the .data buffer is not something I have seen before outside
constructor (or pseudo-constructor, like from_buffer) code. I think it
might even be dangerous if you don't do it right: if one does not
properly recalculate the strides of A, slicing operations on A may not
behave as expected.

If this is library code, reassigning the .data buffer can confuse the
user, since it messes up array view semantics. Suppose I'm an ignorant
user and I write the following code:

   A=numpy.random.rand(10,20)
   dummy_input=numpy.random.rand(10,20)
   B=A.T
   C=B[0::-1,:]

then I use a library function foo (suppose foo accepts an input array inp
and an output array out, and assigns out.data to something else)

   foo(inp=dummy_input, out=B)

Now, A and B point to two different .data buffers, B's base points to A,
and C's base points to B, but A and C share the same .data buffer. As a
user, I may expect B and C to be views of A (certainly B isn't), and C
to be a view of B (which is verified by checking 'C.base is B'), but
changing C's values changes A's but not B's. That's confusing. Also,
suppose B's new data buffer has fewer elements than its original data
buffer. I may be clever and set B's size and strides attributes
accordingly, but changing C's values might then manipulate undefined
memory.

Damian


Re: [Numpy-discussion] String sort

2008-02-10 Thread Charles R Harris
On Feb 8, 2008 5:29 AM, Francesc Altet [EMAIL PROTECTED] wrote:

 Hi,

 I'm a bit confused that the sort method of a string array doesn't
 allow a mergesort:

  s = numpy.empty(10, "S10")
  s.sort(kind="merge")
 TypeError: desired sort not supported for this type

 However, by looking at the numpy sources, it seems that the only
 implemented method for sorting array strings is merge (I presume
 because it is stable).  So, perhaps the message above should be fixed.


The only available method is the default quicksort. The mergesort is for
argsort and was put in for lexsort to use.
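A small illustration of the stability that makes mergesort suitable for argsort and lexsort (equal keys keep their original relative order):

```python
import numpy as np

s = np.array(['b', 'a', 'b', 'a'], dtype='S1')

# mergesort is the stable kind; equal elements preserve input order.
idx = s.argsort(kind='mergesort')
print(idx)

# The two 'a' entries sit at positions 1 and 3 and stay in that order,
# followed by the two 'b' entries at positions 0 and 2.
assert list(idx) == [1, 3, 0, 2]
```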



 Also, in the context of my work on indexing, and because of the
 slowness of the current implementation in NumPy, I've ended up with an
 implementation of the quicksort method for 1-D string arrays.  For
 moderately large arrays, it is about 2.5x-3x faster than the
 (supposedly) mergesort version in NumPy, not only due to the
 quicksort, but also because I've implemented a couple of macros for
 efficient string swapping and copying.  If this is of interest to
 NumPy developers, tell me and I will provide the code.


I've now got a string/ucs4-specific argsort(kind='q'), the string
version of which is about 40% faster than the old default and about 10%
faster than the mergesort version, but the string/ucs4-specific
versions of sort aren't yet faring as well. I'm timing things with

In [1]: import timeit

In [2]: t = timeit.Timer("np.fromstring(np.empty(1).tostring(),dtype='|S8').sort(kind='q')",
"import numpy as np")

In [3]: t.repeat(3,100)
Out[3]: [0.22127485275268555, 0.21282196044921875, 0.21273088455200195]

That's with the current sort(kind='q') in svn, which uses the new string
compare function but is otherwise the old default quicksort. The new string
specific version of quicksort I'm testing is actually a bit slower than
that. Both versions correctly sort the array. So I'm going to continue to
experiment a bit until I see what is going on.

Chuck