Re: [Numpy-discussion] np.histogram on arrays.
Hi. Sorry for not having been clearer. I'll explain a little bit.

I have 4k x 4k images that I want to analyse. I turn them into numpy arrays, so I have a 4k x 4k np.array. My analysis starts with determining the bias level. To do that, I compute a histogram for each line, and then for each row, so I compute 8000 histograms. Here is the code I've used so far:

for i in range(self.data.shape[0]):
    # Compute a histogram along the columns; gets counts and bounds.
    self.countsC[i], self.boundsC[i] = np.histogram(data[i], bins=self.bins)
for i in range(self.data.shape[1]):
    # Do the same, along the rows.
    self.countsR[i], self.boundsR[i] = np.histogram(data[:,i], bins=self.bins)

And data.shape is (4000, 4000). If histogram had an axis parameter, I could avoid the loop, and I guess it would be faster.

Éric.

> So it seems that you give your array directly to histogramdd (asking a 4000D histogram!). Surely that's not what you are trying to achieve. Can you elaborate more on your objectives? Perhaps some code (slow but working) to demonstrate the point.
>
> Regards, eat

Un clavier azerty en vaut deux
--
Éric Depagne e...@depagne.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] np.histogram on arrays.
How about something like this:

# numpy 1.6
def rowhist(A, bins=100):
    assert bins > 0
    assert isinstance(bins, int)
    rownum = np.arange(A.shape[0]).reshape((-1, 1)).astype(int) * bins
    intA = (bins * (A - A.min()) / float(A.max() - A.min())).astype(int)
    intA[intA == bins] = bins - 1
    return np.bincount((intA + rownum).flatten(),
                       minlength=A.shape[0] * bins).reshape((A.shape[0], bins))

# numpy 1.5
def rowhist(A, bins=100):
    assert bins > 0
    assert isinstance(bins, int)
    rownum = np.arange(A.shape[0]).reshape((-1, 1)).astype(int) * bins
    intA = (bins * (A - A.min()) / float(A.max() - A.min())).astype(int)
    intA[intA == bins] = bins - 1
    counts = np.zeros(A.shape[0] * bins)
    bc = np.bincount((intA + rownum).flatten())
    counts[:len(bc)] = bc
    return counts.reshape((A.shape[0], bins))

On Wed, Mar 30, 2011 at 09:04, Éric Depagne e...@depagne.org wrote:
> Hi. Sorry for not having been clearer. I'll explain a little bit.
> I have 4k x 4k images that I want to analyse. I turn them into numpy arrays, so I have a 4k x 4k np.array. My analysis starts with determining the bias level. To do that, I compute a histogram for each line, and then for each row, so I compute 8000 histograms. Here is the code I've used so far:
>
> for i in range(self.data.shape[0]):
>     # Compute a histogram along the columns; gets counts and bounds.
>     self.countsC[i], self.boundsC[i] = np.histogram(data[i], bins=self.bins)
> for i in range(self.data.shape[1]):
>     # Do the same, along the rows.
>     self.countsR[i], self.boundsR[i] = np.histogram(data[:,i], bins=self.bins)
>
> And data.shape is (4000, 4000). If histogram had an axis parameter, I could avoid the loop, and I guess it would be faster.
> Éric.
>
>> So it seems that you give your array directly to histogramdd (asking a 4000D histogram!). Surely that's not what you are trying to achieve. Can you elaborate more on your objectives? Perhaps some code (slow but working) to demonstrate the point.
>> Regards, eat
>
> Un clavier azerty en vaut deux
> --
> Éric Depagne e...@depagne.org
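For completeness, here is a small self-check of the corrected numpy-1.6 variant against a plain per-row np.histogram loop using the same global bin edges. This is illustrative only, with made-up random data, and assumes the repaired `minlength=A.shape[0] * bins` reading of the garbled line above:

```python
import numpy as np

def rowhist(A, bins=100):
    # Vectorized per-row histogram: map each value to a global bin
    # index, offset it by row * bins, and count everything at once.
    assert bins > 0
    assert isinstance(bins, int)
    rownum = np.arange(A.shape[0]).reshape((-1, 1)).astype(int) * bins
    intA = (bins * (A - A.min()) / float(A.max() - A.min())).astype(int)
    intA[intA == bins] = bins - 1  # fold the global maximum into the last bin
    return np.bincount((intA + rownum).flatten(),
                       minlength=A.shape[0] * bins).reshape((A.shape[0], bins))

rng = np.random.RandomState(0)
data = rng.rand(50, 40)

fast = rowhist(data, bins=10)

# Reference: one np.histogram call per row with identical global edges.
edges = np.linspace(data.min(), data.max(), 11)
slow = np.array([np.histogram(row, bins=edges)[0] for row in data])
```

Note that, unlike the loop in Éric's code, this uses one set of bin edges for the whole array (the global min/max), which is what makes the single bincount call possible.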
Re: [Numpy-discussion] np.histogram on arrays.
Hi,

On Wed, Mar 30, 2011 at 10:04 AM, Éric Depagne e...@depagne.org wrote:
> Hi. Sorry for not having been clearer. I'll explain a little bit.
> I have 4k x 4k images that I want to analyse. I turn them into numpy arrays, so I have a 4k x 4k np.array. My analysis starts with determining the bias level. To do that, I compute a histogram for each line, and then for each row, so I compute 8000 histograms. Here is the code I've used so far:
>
> for i in range(self.data.shape[0]):
>     # Compute a histogram along the columns; gets counts and bounds.
>     self.countsC[i], self.boundsC[i] = np.histogram(data[i], bins=self.bins)
> for i in range(self.data.shape[1]):
>     # Do the same, along the rows.
>     self.countsR[i], self.boundsR[i] = np.histogram(data[:,i], bins=self.bins)
>
> And data.shape is (4000, 4000). If histogram had an axis parameter, I could avoid the loop and I guess it would be faster.

Well, I guess that for a slight performance improvement you could create your own streamlined histogrammer. But, in order to better grasp your situation, it would be beneficial to know how the counts and bounds are used later on. Just wondering if this kind of massive histogramming could be somehow avoided totally.

Regards,
eat

> Éric.
>
>> So it seems that you give your array directly to histogramdd (asking a 4000D histogram!). Surely that's not what you are trying to achieve. Can you elaborate more on your objectives? Perhaps some code (slow but working) to demonstrate the point.
>> Regards, eat
>
> Un clavier azerty en vaut deux
> --
> Éric Depagne e...@depagne.org
[Numpy-discussion] Question regarding concatenate/vstack.
Dear List,

I have a quick question regarding vstack and concatenate. In the docs for vstack it says that

    np.concatenate(tup, axis=0)

should be equivalent to

    np.vstack(tup)

However, I tried this out and it doesn't seem to be the case, i.e.:

>>> np.vstack((np.arange(5.), np.arange(5.)))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 0.,  1.,  2.,  3.,  4.]])
>>> np.concatenate((np.arange(5.), np.arange(5.)), axis=0)
array([ 0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.])

These aren't the same. Maybe I'm missing something?

regards,
Andrew.
Re: [Numpy-discussion] np.histogram on arrays.
> Well, I guess that for a slight performance improvement you could create your own streamlined histogrammer. But, in order to better grasp your situation, it would be beneficial to know how the counts and bounds are used later on. Just wondering if this kind of massive histogramming could be somehow avoided totally.

Indeed. Here's what I do. My images come from a CCD, and as such, the zero level in the image is not the true zero level, but the true zero plus the background noise of each pixel. By doing the histogram, I plan on detecting the most common value per row. Once I have the most common value, I can derive the interval where most of the values lie (the index of the largest occurrence is easily obtained by sorting the counts, and I take a slice [index_max_count, index_max_count+1] in the second array given by the histogram). Then I take the mean value of this interval and assume it is the value of the bias for my row. I do this procedure both on the rows and the columns as a sanity check. And I know this procedure will not work if on any row/column there is a lot of signal and very little bias. I'll fix that afterwards ;-)

Éric.

> Regards, eat

Un clavier azerty en vaut deux
--
Éric Depagne e...@depagne.org
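The per-row bias estimate Éric describes can be sketched roughly as follows. This is a minimal illustration with made-up names and synthetic data, not Éric's actual code; it takes the midpoint of the most populated bin as a crude mode estimate:

```python
import numpy as np

def row_bias(data, bins=100):
    """Estimate the bias of each row as the midpoint of the most
    populated histogram bin (a rough per-row mode estimate)."""
    bias = np.empty(data.shape[0])
    for i, row in enumerate(data):
        counts, bounds = np.histogram(row, bins=bins)
        k = counts.argmax()                          # most common bin
        bias[i] = 0.5 * (bounds[k] + bounds[k + 1])  # midpoint of that bin
    return bias

# Synthetic "CCD" rows: a bias level of 200 plus a little read noise.
rng = np.random.RandomState(42)
data = 200.0 + rng.normal(0.0, 1.0, size=(100, 4000))
est = row_bias(data)
```

As Éric notes, this only works when the bias dominates the row; a row full of signal would pull the histogram peak away from the bias level.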
Re: [Numpy-discussion] Question regarding concatenate/vstack.
You're right, they are not equivalent. vstack will happily create an array of higher rank than the parts it is stacking, whereas concatenate requires the arrays it is working with to already be at least 2d, so the equivalent is

np.concatenate((np.arange(5.)[np.newaxis], np.arange(5.)[np.newaxis]), axis=0)

or

np.concatenate((np.atleast_2d(np.arange(5.)), np.atleast_2d(np.arange(5.))), axis=0)

Gary R.

On Wed, Mar 30, 2011 at 9:30 PM, andrew nelson andyf...@gmail.com wrote:
> Dear List,
> I have a quick question regarding vstack and concatenate. In the docs for vstack it says that
>     np.concatenate(tup, axis=0)
> should be equivalent to
>     np.vstack(tup)
> However, I tried this out and it doesn't seem to be the case, i.e.:
>
> >>> np.vstack((np.arange(5.), np.arange(5.)))
> array([[ 0.,  1.,  2.,  3.,  4.],
>        [ 0.,  1.,  2.,  3.,  4.]])
> >>> np.concatenate((np.arange(5.), np.arange(5.)), axis=0)
> array([ 0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.])
>
> These aren't the same. Maybe I'm missing something?
> regards, Andrew.
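A quick check of the two equivalents Gary gives (illustrative only; both insert the missing leading axis before concatenating):

```python
import numpy as np

a = np.arange(5.)

stacked = np.vstack((a, a))

# Equivalent 1: add a leading axis explicitly before concatenating.
via_newaxis = np.concatenate((a[np.newaxis], a[np.newaxis]), axis=0)

# Equivalent 2: let atleast_2d promote the 1d arrays to shape (1, 5).
via_atleast2d = np.concatenate((np.atleast_2d(a), np.atleast_2d(a)), axis=0)
```

All three produce the same (2, 5) array, which is what vstack does internally for 1d inputs.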
Re: [Numpy-discussion] Question regarding concatenate/vstack.
On Wed, Mar 30, 2011 at 1:42 PM, gary ruben gru...@bigpond.net.au wrote:
> You're right, they are not equivalent. vstack will happily create an array of higher rank than the parts it is stacking, whereas concatenate requires the arrays it is working with to already be at least 2d, so the equivalent is
>
> np.concatenate((np.arange(5.)[np.newaxis], np.arange(5.)[np.newaxis]), axis=0)
>
> or
>
> np.concatenate((np.atleast_2d(np.arange(5.)), np.atleast_2d(np.arange(5.))), axis=0)

This is fixed in the docstring now.

Ralf
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
On Wed, Mar 30, 2011 at 3:39 AM, Matthew Brett matthew.br...@gmail.com wrote:
> Hi,
>
> On Mon, Mar 28, 2011 at 11:29 PM, josef.p...@gmail.com wrote:
>> numpy/lib/test_io.py only uses StringIO in the tests, no actual csv file.
>> If I give the filename then I get a TypeError: Can't convert 'bytes' object to str implicitly.
>> From the statsmodels mailing list example:
>>
>> data = recfromtxt(open('./star98.csv', 'U'), delimiter=',', skip_header=1, dtype=float)
>> Traceback (most recent call last):
>>   File "<pyshell#30>", line 1, in <module>
>>     data = recfromtxt(open('./star98.csv', 'U'), delimiter=',', skip_header=1, dtype=float)
>>   File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1633, in recfromtxt
>>     output = genfromtxt(fname, **kwargs)
>>   File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1181, in genfromtxt
>>     first_values = split_line(first_line)
>>   File "C:\Programs\Python32\lib\site-packages\numpy\lib\_iotools.py", line 206, in _delimited_splitter
>>     line = line.split(self.comments)[0].strip(asbytes(" \r\n"))
>> TypeError: Can't convert 'bytes' object to str implicitly
>
> Is the right fix for this to open a 'filename' passed to genfromtxt as 'binary' (bytes)? If so I will submit a pull request with a fix and a test.

Seems to work and is what was intended I think, see Pauli's changes/notes in commit 0f2e7db0. This is ticket #1607 by the way.

Cheers,
Ralf
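On Python 3, the workaround that falls out of this fix is to hand genfromtxt bytes rather than text, e.g. a file opened in 'rb' mode. A small sketch (the CSV content here is made up, not the star98.csv data from the thread):

```python
import io
import numpy as np

# Stand-in for open('star98.csv', 'rb'): a bytes buffer works the same way.
raw = io.BytesIO(b"1.0,2.0\n3.0,4.0\n")
data = np.genfromtxt(raw, delimiter=",")
```

Opening with mode 'U' (text) is what triggers the bytes/str TypeError in the traceback above; a binary handle avoids the implicit decode entirely.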
[Numpy-discussion] Warning: invalid value encountered in true_divide?
Hi,

After a numpy upgrade, I started to get "Warning: invalid value encountered in true_divide" when I run code which did not show any warning previously. What does it mean, and where should I look to fix this? It does not stop my debugger, so I could not identify where the message was from.

Thank you,
Joon
Re: [Numpy-discussion] Warning: invalid value encountered in true_divide?
On Wed, Mar 30, 2011 at 12:12, Joon Ro joonp...@gmail.com wrote:
> Hi,
> After a numpy upgrade, I started to get "Warning: invalid value encountered in true_divide" when I run code which did not show any warning previously. What does it mean and where should I look to fix this?

It means that a NaN popped up in a division somewhere. It always was there, but some previous versions of numpy had the warnings unintentionally silenced.

> It does not stop my debugger so I could not identify where the message was from.

You can use np.seterr() to change how these warnings are printed. In particular, you can cause an exception to be raised so that you can use a debugger to locate the source.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
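Robert's suggestion can be used like this; np.errstate is the context-manager form of np.seterr, which scopes the change to a block instead of the whole process (a small sketch):

```python
import numpy as np

a = np.array([0.0])

# By default 0.0/0.0 just produces nan (plus a warning on stderr).
with np.errstate(invalid='ignore'):
    r = np.true_divide(a, a)

# With 'raise', the same operation triggers FloatingPointError,
# which a debugger can stop on to reveal the offending line.
caught = False
try:
    with np.errstate(invalid='raise'):
        np.true_divide(a, a)
except FloatingPointError:
    caught = True
```

Calling np.seterr(invalid='raise') directly has the same effect globally, which is what you want when hunting the warning down in a larger program.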
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi,

On Wed, Mar 30, 2011 at 10:02 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
> On Wed, Mar 30, 2011 at 3:39 AM, Matthew Brett matthew.br...@gmail.com wrote:
>> [...]
>> Is the right fix for this to open a 'filename' passed to genfromtxt as 'binary' (bytes)? If so I will submit a pull request with a fix and a test.
>
> Seems to work and is what was intended I think, see Pauli's changes/notes in commit 0f2e7db0. This is ticket #1607 by the way.

Thanks for making a ticket. I've submitted a pull request for the fix and linked to it from the ticket.

The reason I asked whether this was the correct fix was: imagine I'm working with a non-latin default encoding, and I've opened a file:

fobj = open('my_nonlatin.txt', 'rt')

in python 3.2. That might contain numbers and non-latin text. I can't pass that into 'genfromtxt' because it will give me the error above. I can pass it in as binary, but then I'll get garbled text. Should those functions also allow unicode-providing files (perhaps with binary as default for speed)?

See you,
Matthew
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
On Wed, Mar 30, 2011 at 7:37 PM, Matthew Brett matthew.br...@gmail.com wrote:
> [...]
> The reason I asked whether this was the correct fix was: imagine I'm working with a non-latin default encoding, and I've opened a file:
>
> fobj = open('my_nonlatin.txt', 'rt')
>
> in python 3.2. That might contain numbers and non-latin text. I can't pass that into 'genfromtxt' because it will give me the error above. I can pass it in as binary, but then I'll get garbled text.

I admit the string/bytes thing is still a little confusing to me, but isn't that always going to be a problem (even with python 2.x)? There's no way for genfromtxt to know what the encoding of an arbitrary file is. So your choices are garbled text or an error. Garbled text is better. It may help to explicitly say in the docstring that this is an ASCII routine (as it does in the source code).

Ralf

> Should those functions also allow unicode-providing files (perhaps with binary as default for speed)?
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
On Wed, 30 Mar 2011 10:37:45 -0700, Matthew Brett wrote:
[clip]
> imagine I'm working with a non-latin default encoding, and I've opened a file:
>
> fobj = open('my_nonlatin.txt', 'rt')
>
> in python 3.2. That might contain numbers and non-latin text. I can't pass that into 'genfromtxt' because it will give me the error above. I can pass it in as binary, but then I'll get garbled text.

That's the way it also works on Python 2. The text is not garbled -- it's just in some binary representation that you can later on decode to unicode:

>>> np.array(['asd']).view(np.chararray).decode('utf-8')
array([u'asd'], dtype='U3')

Granted, utf-16 and the like might be problematic.

> Should those functions also allow unicode-providing files (perhaps with binary as default for speed)?

Nobody has yet asked for this feature as far as I know, so I guess the need for it is pretty low. Personally, I don't think going unicode makes much sense here. First, it would be a Py3-only feature. Second, there is a real need for it only when dealing with multibyte encodings, which are seldom used these days, with utf-8 rightfully dominating.

--
Pauli Virtanen
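On Python 3, the decode step Pauli shows can be written with np.char.decode, which applies the decoding elementwise to a byte-string array (an illustrative sketch with made-up data):

```python
import numpy as np

raw = np.array([b'asd', b'qwe'])     # bytes, e.g. as read from a binary file
text = np.char.decode(raw, 'utf-8')  # elementwise decode to a unicode array
```

This is the "decode later" workflow: read the file as bytes, load with genfromtxt, then decode only the string columns you actually need.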
[Numpy-discussion] Old tickets
Hi,

This is a followup on tickets that I had previously indicated, so I want to thank Mark, Ralf and the other people for going over those! For those that I followed, I generally agreed with the outcome.

Ticket 301: 'Make power and divide return floats from int inputs (like true_divide)'
http://projects.scipy.org/numpy/ticket/301
Invalid because the output dtype is the same as the input dtype unless you override it using the dtype argument:
np.power(3, 1, dtype=np.float128).dtype
dtype('float128')
Alternatively, return a float and indicate in the docstring that the output dtype can be changed.

Ticket 354: 'Possible inconsistency in 0-dim and scalar empty array types'
http://projects.scipy.org/numpy/ticket/354
Invalid because an empty array is not the same as an empty string.

Ticket 1071: 'loadtxt fails if the last column contains empty value'
http://projects.scipy.org/numpy/ticket/1071
Invalid mainly because loadtxt states that 'Each row in the text file must have the same number of values.' So of course loadtxt must fail when there are missing values.

Ticket 1374: 'Ticket 628 not fixed for Solaris (polyfit uses 100% CPU and does not stop)'
http://projects.scipy.org/numpy/ticket/1374
Unless this can be verified it should be set as needs_info.

Bruce
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi,

On Wed, Mar 30, 2011 at 11:32 AM, Pauli Virtanen p...@iki.fi wrote:
> That's the way it also works on Python 2. The text is not garbled -- it's just in some binary representation that you can later on decode to unicode:
>
> >>> np.array(['asd']).view(np.chararray).decode('utf-8')
> array([u'asd'], dtype='U3')
>
> Granted, utf-16 and the like might be problematic.
>
> Nobody has yet asked for this feature as far as I know, so I guess the need for it is pretty low. Personally, I don't think going unicode makes much sense here. First, it would be a Py3-only feature. Second, there is a real need for it only when dealing with multibyte encodings, which are seldom used these days, with utf-8 rightfully dominating.

It's not a feature I need, but then, I'm afraid all the languages I've been taught are latin-1. Oh, except I learnt a tiny bit of Greek. But I don't use it for work :)

I suppose the annoyances would be:

1) Probably temporary surprise that genfromtxt(open('my_file.txt', 'rt')) generates this error
2) Having to go back over returned arrays decoding stuff for utf-8
3) Wrong results for other encodings

Maybe the best way is a graceful warning on entry to the routine?

Best,
Matthew
Re: [Numpy-discussion] should get rid of the annoying numpy STDERR output
On Thu, Mar 24, 2011 at 5:25 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
> On Thu, Mar 24, 2011 at 5:11 PM, Robert Kern robert.k...@gmail.com wrote:
>> 2011/3/24 Dmitrey tm...@ukr.net:
>>> >>> from numpy import inf, array
>>> >>> inf * 0
>>> nan
>>> (ok)
>>> >>> array(inf) * 0.0
>>> StdErr: Warning: invalid value encountered in multiply
>>> nan
>>>
>>> My cycled calculations yield this thousands of times, slowing computations and making the text output completely non-readable.
>>>
>>> >>> from numpy import __version__
>>> >>> __version__
>>> '2.0.0.dev-1fe8136'
>>
>> We really should change the default to 'warn' for numpy 2.0. Maybe even for numpy 1.6. We've talked about it before, and I think most people were in favor. We just never pulled the trigger. Old thread on this topic: http://thread.gmane.org/gmane.comp.python.numeric.general/35664
>> Devs, what say you?
>
> Works for me, also for 1.6.

Hi, just pinging this issue. If this is to happen for 1.6 it should go in the next beta (probably this weekend, only waiting for the genfromtxt issue to be resolved). Some more input would be good. As would a patch.

Thanks,
Ralf
Re: [Numpy-discussion] Old tickets
On Wed, Mar 30, 2011 at 2:37 PM, Bruce Southey bsout...@gmail.com wrote:
> Ticket 301: 'Make power and divide return floats from int inputs (like true_divide)'
> http://projects.scipy.org/numpy/ticket/301
> Invalid because the output dtype is the same as the input dtype unless you override it using the dtype argument:
> np.power(3, 1, dtype=np.float128).dtype
> dtype('float128')
> Alternatively, return a float and indicate in the docstring that the output dtype can be changed.

FWIW, just thought I'd note (on a python 2.6 system):

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4])
>>> a.dtype
dtype('int32')
>>> 2 / a
array([2, 1, 0, 0])
>>> from __future__ import division
>>> 2 / a
array([ 2.        ,  1.        ,  0.66666667,  0.5       ])

So numpy already does this when true division is imported (and is therefore consistent with whatever the python environment does), and python currently also returns integers for exponentials when both inputs are integers.

Ben Root
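On Python 3 (and with from __future__ import division on Python 2), the same distinction Ben shows is spelled explicitly by np.true_divide and np.floor_divide, regardless of the interpreter's division mode:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

true = np.true_divide(2, a)    # always produces float results
floor = np.floor_divide(2, a)  # stays integer, like old-style int division
```

The `/` operator maps to true_divide and `//` to floor_divide, so integer inputs only keep an integer result under floor division.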
Re: [Numpy-discussion] Old tickets
On 30 Mar 2011, at 23:26, Benjamin Root wrote:
> So numpy already does this when true division is imported (and is therefore consistent with whatever the python environment does), and python currently also returns integers for exponentials when both inputs are integers.

I'd agree, and in my view power(3, -1) is well defined as 1/3 - also, in future (or Python 3):

>>> a / 2
array([ 0.5,  1. ,  1.5,  2. ])
>>> a // 2
array([0, 1, 1, 2], dtype=int32)

so I think at least a**n should follow integer math rules; it depends on whether we want np.power to behave differently from ** (if they are internally handled separately at all)...

Not sure if I understand the overload suggestion in the ticket, but maybe a solution could be using the output argument (if an explicit optional dtype is not an option):

>>> b = np.zeros(2, dtype=np.int32)
>>> np.power(np.arange(1, 3), -2, b)
array([1, 0])
>>> b = np.zeros(2)
>>> np.power(np.arange(1, 3), -2, b)
array([ 1.,  0.])

^^ this could be changed to array([ 1.  ,  0.25])

Cheers,
Derek
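For reference on how this debate was eventually settled: in current numpy, integer arrays raised to negative integer powers raise a ValueError outright, so the float path is the supported way to get reciprocal powers (a small sketch, not the behaviour of the 2011-era numpy discussed above):

```python
import numpy as np

# Float inputs give the mathematically expected reciprocal squares.
f = np.power(np.arange(1, 3, dtype=float), -2)

# Integer base with a negative integer exponent is rejected in current numpy.
rejected = False
try:
    np.power(np.arange(1, 3), -2)
except ValueError:
    rejected = True
```

That is, rather than silently truncating to array([1, 0]) as in Derek's session, numpy now refuses the integer case entirely.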
Re: [Numpy-discussion] Old tickets
Hi,

On 30 Mar 2011, at 21:37, Bruce Southey wrote:
> Ticket 1071: 'loadtxt fails if the last column contains empty value'
> http://projects.scipy.org/numpy/ticket/1071
> Invalid mainly because loadtxt states that 'Each row in the text file must have the same number of values.' So of course loadtxt must fail when there are missing values.

I don't follow the line of argument - see my comment on the ticket. This covers cases where missing values could always have been caught by the user: "Converters can also be used to provide a default value for missing data: ``converters = {3: lambda s: float(s or 0)}``". The ticket simply addresses the issue that delimiter='\t' is treated differently from other delimiters if (and only if) the missing value is the last item in the row.

Cheers,
Derek
Re: [Numpy-discussion] should get rid of the annoying numpy STDERR output
On Wed, Mar 30, 2011 at 16:03, Ralf Gommers ralf.gomm...@googlemail.com wrote:
> Hi, just pinging this issue. If this is to happen for 1.6 it should go in the next beta (probably this weekend, only waiting for the genfromtxt issue to be resolved). Some more input would be good. As would a patch.

Patch: https://github.com/numpy/numpy/pull/65

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: [Numpy-discussion] 1.6.0b1 half float buffer bug?
On Fri, Mar 25, 2011 at 10:00 AM, Eli Stevens (Gmail) wickedg...@gmail.com wrote:
> Can anyone please give me some suggestions on how to go about writing a unit test for this? Or should I just submit a pull request?

I've gotten a bit of positive feedback on adding the 'e' type to the struct module on the python-ideas list (per my understanding, not before python 3.3, but I don't think that should hinder adoption in other libraries), so I'd like to ask again about unit testing a change like this. Can anyone offer some advice on where to start?

Also, what kind of timeframe / cutoff am I looking at to get this into 1.6.0 or 1.6.x?

Thanks,
Eli
Re: [Numpy-discussion] loadtxt/savetxt tickets
On Sun, Mar 27, 2011 at 4:09 AM, Paul Anton Letnes paul.anton.let...@gmail.com wrote:

On 26. mars 2011, at 21.44, Derek Homeier wrote:

Hi Paul,

having had a look at the other tickets you dug up,

My opinions are my own, and in detail, they are:

1752: I attach a possible patch. FWIW, I agree with the request. The patch is written to be compatible with the fix in ticket #1562, but I did not test that yet.

Tested, see also my comments on Trac.

Great!

1731: This seems like a rather trivial feature enhancement. I attach a possible patch.

Agreed. Haven't tested it though.

Great!

1616: The suggested patch seems reasonable to me, but I do not have a full list of what objects loadtxt supports today as opposed to what this patch will support.

Looks like you got this one. Just remember to make it compatible with #1752. Should be easy.

1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code, however. See comments on Trac as well.

Your patch is better, but there is one thing I disagree with:

808    if X.ndim < ndmin:
809        if ndmin == 1:
810            X.shape = (X.size, )
811        elif ndmin == 2:
812            X.shape = (X.size, 1)

The last line should be:

812            X.shape = (1, X.size)

If someone wants a 2D array out, they would most likely expect a one-row file to come out as a one-row array, not the other way around. IMHO.

1458: The fix suggested in the ticket seems reasonable, but I have never used record arrays, so I am not sure of this.

There were some issues with Python3, and I also had some general reservations as noted on Trac - basically, it makes 'unpack' equivalent to transposing for 2D-arrays, but to splitting into fields for 1D-recarrays. My question was, what's going to happen when you get to 2D-recarrays? Currently this is not an issue since loadtxt can only read 2D regular or 1D structured arrays. But this might change if the data block functionality (see below) were to be implemented - data could then be returned as 3D arrays or 2D structured arrays... Still, it would probably make most sense (or at least give the widest functionality) to have 'unpack=True' always return a list or iterator over columns.

OK, I don't know recarrays, as I said.

1445: Adding this functionality could break old code, as some old data files may have empty lines which are now simply ignored. I do not think the feature is a good idea. It could rather be implemented as a separate function.

1107: I do not see the need for this enhancement. In my eyes, the usecols kwarg does this and more. Perhaps I am misunderstanding something here.

Agree about #1445, and the bit about 'usecols' - 'numcols' would just provide a shorter call to e.g. read the first 20 columns of a file (well, not even that much over 'usecols=range(20)'...), so I don't think that justifies an extra argument. But the 'datablocks' feature provides something new that a number of people seem to miss from e.g. gnuplot (including me, actually ;-). And it would also satisfy the request from #1445 without breaking backwards compatibility. I've been wondering if one could instead specify the separator lines through a parameter, e.g. blocksep=['None', 'blank', 'invalid'], not sure if that would make it more useful...

What about writing a separate function, e.g. loadblocktxt, and have it separate the chunks and call loadtxt for each chunk? Just a thought. Another possibility would be to write a function that would let you load a set of text files in a directory, and return a dict of datasets, one per file. One could write a similar save function, too. They would just need to call loadtxt/savetxt on a per-file basis.

1071: It is not clear to me whether loadtxt is supposed to support missing values in the fashion indicated in the ticket.

In principle it should at least allow you to, by the use of converters as described there. The problem is, the default delimiter is described as 'any whitespace', which in the present implementation obviously includes any number of blanks or tabs. These are therefore treated differently from delimiters like ',' or ''. I'd reckon there are too many people actually relying on this behaviour to silently change it (e.g. I know plenty of tables with columns separated by either one or several tabs depending on the length of the previous entry). But the tab is apparently also treated differently if explicitly specified with delimiter='\t' - and in that case using a converter à la {2: lambda s: float(s or 'NaN')} is working for fields in the middle of the line, but not at the end - this clearly warrants improvement. I've prepared a patch working for Python3
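The converter workaround mentioned for ticket #1071 can be sketched like this. It is illustrative only, with made-up data, and uses a comma delimiter to sidestep the tab edge case under discussion:

```python
import io
import numpy as np

# A comma-delimited table with an empty value in the last column of row one.
text = io.StringIO("1,2,\n4,5,6\n")

# Map empty fields in column 2 to NaN instead of failing outright.
data = np.loadtxt(text, delimiter=',',
                  converters={2: lambda s: float(s or 'NaN')})
```

The thread's point is that the same converter worked for interior fields but not for a trailing empty field when delimiter='\t' was given explicitly, which is what Derek's patch addresses.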