[Numpy-discussion] numerical noise for simple calcululations
hi all, I need a good estimation of noise value for simple calculations. I.e. when I calculate something like sin(15)+cos(80) I get a solution with precision, for example, 1e-11. I guess the precision depends on system arch, isn't it? So what's the best way to estimate the value? I guess here should be something like 10*numpy.machine_precision, isn't it? Regards, D. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numerical noise for simple calcululations
I need just a single number in avarage. I have committed some changes to NLP/NSP ralg solver from scikits.openopt, for non-noisy funcs it works better, but for noisy funcs vise versa, hence now my examples/nssolveVSfsolve.py doesn't work as it should be, so I need to implement noise parameter and assing a default value to the one. So, the question is: what default value should be here? I was thinking of either 0 or something like K*numpy.machine_precesion, where K is something like 1...10...100. Regards, D. Timothy Hochberg wrote: On Sun, Feb 10, 2008 at 4:23 AM, dmitrey [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: hi all, I need a good estimation of noise value for simple calculations. I.e. when I calculate something like sin(15)+cos(80) I get a solution with precision, for example, 1e-11. I guess the precision depends on system arch, isn't it? So what's the best way to estimate the value? I guess here should be something like 10*numpy.machine_precision, isn't it? This is a complicated subject, which I'm really not qualified to comment on, but I'm not going to let that stop me. I believe that you want to know how accurate something like the above is given exact inputs. That is a somewhat artificial problem, but I'll answer it to the best of my ability. Functions like sin, cos, +, etc can in theory compute there result to within on ULP, or maybe half an ULP (I can't recall exactly). An ULP is a Unit in the Last Place. To explain an ULP, let's pretend that we were using decimal floating point with 3 digits of precision and look at a couple of numbers: 1.03e-03 -- 1 ULP = 1e-5 3.05e+02 -- 1 ULP = 1 We're obviously not using decimal floating point, we're using binary floating point, but the basic idea is the same. The result is that the accuracy is going to totally depend on the magnitude of the result. If the result is small, in general the result will be more accurate in an absolute sense, although not generally in a relative sense. In practice, this is drastically oversimplified since the inputs are generally of finite accuracy. Different functions will either magnify or shrink the input error depending on both the function and the value of the input. If you can find an easy to read introduction to numerical analysis, it would probably help. Unfortunately, I don't know of a good one to recommend; the text I have is a pretty hard slog. To complicate this further, functions don't always compute there results to maximum theoretical accuracy; presumably in the interest of reasonable performance. So, in the end the answer is; it depends. In practice the only useful, simple advice I've seen to get a handle on accuracy is to compute results using at least two different precisions and verify that things are converging sensibly. And compare to known results wherever possible. -- . __ . |-\ . . [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Setting contents of buffer for array object
Hi, I am sorry if I have missed something obvious, but is there any way in python of doing this: import numpy as np a = np.arange(10) b = np.arange(10)+1 a.data = b.data # raises error, but I hope you see what I mean ? Thanks a lot for any pointers. Matthew ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
On Feb 10, 2008 5:15 PM, Matthew Brett [EMAIL PROTECTED] wrote: Hi, I am sorry if I have missed something obvious, but is there any way in python of doing this: import numpy as np a = np.arange(10) b = np.arange(10)+1 a.data = b.data # raises error, but I hope you see what I mean ? Not really, no. Can you describe your use case in more detail? -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
On Feb 10, 2008 6:48 PM, Matthew Brett [EMAIL PROTECTED] wrote: import numpy as np a = np.arange(10) b = np.arange(10)+1 a.data = b.data # raises error, but I hope you see what I mean ? Not really, no. Can you describe your use case in more detail? Yes - I am just writing the new median implementation. To allow future optimization, I would like to have the same signature as mean(): def median(a, axis=0, dtype=None, out=None) (axis=0 to change to axis=None default at some point). To do this, I need to copy the results of the median calculation in the routine into the array object given by 'out' - when passed. Ah, I see. You definitely do not want to reassign the .data buffer in this case. An out= parameter does not reassign the memory location that the array object points to. It should use the allocated memory that was already there. It shouldn't copy anything at all; otherwise, median(x, out=out) is no better than out[:] = median(x). Personally, I don't think that a function should expose an out= parameter unless if it can make good on that promise of memory efficency. Can you show us the current implementation that you have? -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
import numpy as np a = np.arange(10) b = np.arange(10)+1 a.data = b.data # raises error, but I hope you see what I mean ? Not really, no. Can you describe your use case in more detail? Yes - I am just writing the new median implementation. To allow future optimization, I would like to have the same signature as mean(): def median(a, axis=0, dtype=None, out=None) (axis=0 to change to axis=None default at some point). To do this, I need to copy the results of the median calculation in the routine into the array object given by 'out' - when passed. Matthew ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
Ah, I see. You definitely do not want to reassign the .data buffer in this case. An out= parameter does not reassign the memory location that the array object points to. It should use the allocated memory that was already there. It shouldn't copy anything at all; otherwise, median(x, out=out) is no better than out[:] = median(x). Personally, I don't think that a function should expose an out= parameter unless if it can make good on that promise of memory efficency. I agree - but there are more efficient median algorithms out there which can make use of the memory efficiently. I wanted to establish the call signature to allow that. I don't feel strongly about it though. Can you show us the current implementation that you have? is attached, comments welcome... Matthew import numpy as np def median(a, axis=0, dtype=None, out=None): Compute the median along the specified axis. Returns the median of the array elements. The median is taken over the first dimension of the array by default, otherwise over the specified axis. Parameters -- axis : {None, int}, optional Axis along which the medians are computed. The default is to compute the median along the first dimension. axis=None returns the median of the flattened array dtype : type, optional Type to use in returning the medians. For arrays of integer type the default is float32, for arrays of float types it is the same as the array type. Integer arrays may return float medians because, given the chosen axis has length N, and N is even, the median is given by the mean of the two central values (see notes) out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output but the type will be cast if necessary. Returns --- median : The return type varies, see above. A new array holding the result is returned unless out is specified, in which case a reference to out is returned. SeeAlso --- mean Notes - Given a vector V length N, the median of V is the middle value of a sorted copy of V (Vs) - i.e. Vs[(N-1)/2], when N is odd. It is the mean of the two middle values of Vs, when N is even. sorted = np.sort(a, axis) if dtype is None: if a.dtype in np.sctypes['int']: dtype = np.float32 else: dtype = a.dtype if axis is None: axis = 0 indexer = [slice(None)] * sorted.ndim index = int(sorted.shape[axis]/2) if sorted.shape[axis] % 2 == 1: indexer[axis] = index ret = sorted(indexer) else: indexer[axis] = slice(index-1, index+1) ret = np.sum(sorted[indexer], axis=axis)/2.0 if dtype in np.sctypes['int']: ret = ret.round() if ret.dtype != dtype: ret = ret.astype(dtype) if not out is None: if not (out.shape == ret.shape and out.nbytes == ret.nbytes): raise ValueError, 'wrong shape for output' # This doesn't work - out.data = ret.data raise ValueError, 'out parameter not working yet' return ret ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] non-contiguous array error
Hi, I am receiving a AttributeError: incompatible shape for a non-contiguous array error. A quick illustration of the type of code that gives me the error is shown below: from numpy import * list=[i for i in range(0,27)] c=array(list) c.shape=(3,3,3) d=fft.fftn(c) d.shape=(27) error here I suppose this has something to do with the fact that the fourier transform of c has imaginary parts, which affects the way the information is stored in memory, and this messed up the call to .shape? Is there another way to do this, or will I need to rewrite this section of my code? Thanks for all you do, Brad ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] non-contiguous array error
On Feb 10, 2008 7:43 PM, Brad Malone [EMAIL PROTECTED] wrote: Hi, I am receiving a AttributeError: incompatible shape for a non-contiguous array error. A quick illustration of the type of code that gives me the error is shown below: from numpy import * list=[i for i in range(0,27)] c=array(list) c.shape=(3,3,3) d=fft.fftn(c) d.shape=(27) error here I suppose this has something to do with the fact that the fourier transform of c has imaginary parts, which affects the way the information is stored in memory, and this messed up the call to .shape? No. The problem is that (27) is not a tuple. Parentheses are also used for grouping expressions in Python, so a single-element tuple needs a comma to disambiguate. You want d.shape = (27,). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
On Feb 10, 2008 7:17 PM, Matthew Brett [EMAIL PROTECTED] wrote: Ah, I see. You definitely do not want to reassign the .data buffer in this case. An out= parameter does not reassign the memory location that the array object points to. It should use the allocated memory that was already there. It shouldn't copy anything at all; otherwise, median(x, out=out) is no better than out[:] = median(x). Personally, I don't think that a function should expose an out= parameter unless if it can make good on that promise of memory efficency. I agree - but there are more efficient median algorithms out there which can make use of the memory efficiently. I wanted to establish the call signature to allow that. I don't feel strongly about it though. I say add the out= parameter when you use such an algorithm. But if you like, just use slice assignment for now. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Setting contents of buffer for array object
Matthew Brett wrote: import numpy as np a = np.arange(10) b = np.arange(10)+1 a.data = b.data # raises error, but I hope you see what I mean ? Not really, no. Can you describe your use case in more detail? Yes - I am just writing the new median implementation. To allow future optimization, I would like to have the same signature as mean(): def median(a, axis=0, dtype=None, out=None) (axis=0 to change to axis=None default at some point). To do this, I need to copy the results of the median calculation in the routine into the array object given by 'out' - when passed. My understanding of numerical routines that accept an out parameter is that this is a convention for in-place algorithms. When None is passed in the out parameter, it's the caller's way of indicating that in-place is not needed, and a new array is allocated to store the result; otherwise, the result is stored in the 'out' array. Either way, the result is returned. One can break from this convention by allocating more memory than provided by the out array but that's a performance issue that may or may not be unavoidable. Remember that A[:] = expr sets the value of the elements in A to the values of array elements in the expression expr, and this copying is done in-place. To copy an array C, and make the copy contiguous, use the .copy() method on C. Assigning the .data buffers is not something I have seen before in non-constructor (or npn=pseudo-constructor like from_buffer) code. I think it might even be dangerous if you don't do it right. If one does not properly recalculate the strides of A, slicing operations on A may not behave as expected. If this is library code, reassigning the .data buffer can confuse the user, since it messes up array view semantics. Suppose I'm an ignorant user and I write the following code: A=numpy.random.rand(10,20) dummy_input=numpy.random.rand(10,20) B=A.T C=B[0::-1,:] then I use a library function foo (suppose foo accepts an input array inp and an output array out, and assigns out.data to something else) foo(in=dummy_input, out=B) Now, A and B point to two different .data buffers, B's base points to A, and C's base points to B but A and C share the same .data buffer. As a user, I may expect B and C to be a view of A (certainly B isn't), and C to be a view of B (which is verified by checking 'C.base is B') but changing C's values changes A's but not B's. That's confusing. Also, suppose B's new data buffer has less elements than its original data buffer. I may be clever and set B's size and strides attributes accordingly but changing C's values might cause the manipulation of undefined memory. Damian ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] String sort
On Feb 8, 2008 5:29 AM, Francesc Altet [EMAIL PROTECTED] wrote: Hi, I'm a bit confused that the sort method of a string character doesn't allow a mergesort: s = numpy.empty(10, S10) s.sort(kind=merge) TypeError: desired sort not supported for this type However, by looking at the numpy sources, it seems that the only implemented method for sorting array strings is merge (I presume because it is stable). So, perhaps the message above should be fixed. The only available method is the default quicksort. The mergesort is for argsort and was put in for lexsort to use. Also, in the context of my work in indexing, and because of the slowness of the current implementation in NumPy, I've ended with an implementation of the quicksort method for 1-D array strings. For moderately large arrays, it is about 2.5x-3x faster than the (supposedly) mergesort version in NumPy, not only due to the quicksort, but also because I've implemented a couple of macros for efficient string swapping and copy. If this is of interest for NumPy developers, tell me and I will provide the code. I've now got a string/ucs4 specific argsort(kind='q'), the string version of which is about 40% faster than the old default and about 10% faster than the mergesort version, but the string/ucs4 specific versions of sort aren't yet fairing as well. I'm timing things with In [1]: import timeit In [2]: t = timeit.Timer(np.fromstring(np.empty(1).tostring(),dtype='|S8').sort(kind='q'),import numpy as np) In [3]: t.repeat(3,100) Out[3]: [0.22127485275268555, 0.21282196044921875, 0.21273088455200195] That's with the current sort(kind='q') in svn, which uses the new string compare function but is otherwise the old default quicksort. The new string specific version of quicksort I'm testing is actually a bit slower than that. Both versions correctly sort the array. So I'm going to continue to experiment a bit until I see what is going on. Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion