[Numpy-discussion] bug in genfromtxt for python 3.2
numpy/lib/test_io.py only uses StringIO in the test, no actual csv file.

If I give the filename then I get a TypeError: Can't convert 'bytes' object to str implicitly

From the statsmodels mailing list example:

    >>> data = recfromtxt(open('./star98.csv', "U"), delimiter=",", skip_header=1, dtype=float)
    Traceback (most recent call last):
      File "<pyshell#30>", line 1, in <module>
        data = recfromtxt(open('./star98.csv', "U"), delimiter=",", skip_header=1, dtype=float)
      File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1633, in recfromtxt
        output = genfromtxt(fname, **kwargs)
      File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1181, in genfromtxt
        first_values = split_line(first_line)
      File "C:\Programs\Python32\lib\site-packages\numpy\lib\_iotools.py", line 206, in _delimited_splitter
        line = line.split(self.comments)[0].strip(asbytes(" \r\n"))
    TypeError: Can't convert 'bytes' object to str implicitly

Line 1184 in npyio (py32 source file)

    if isinstance(fname, str):
        fhd = np.lib._datasource.open(fname, 'U')

seems to be the culprit in my case. Changing to binary solved this problem for me:

    fhd = np.lib._datasource.open(fname, 'Ub')

(I still have other errors but don't know yet where they are coming from.)

Almost all problems with porting statsmodels to python 3.2 so far are input related, mainly reading csv files, which are heavily used in the tests. All the real code seems to work fine with numpy and scipy (and matplotlib so far) for python 3.2.

Josef
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
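A test that reads from a real file on disk rather than from StringIO could look roughly like the sketch below. It assumes a current NumPy on Python 3; the helper name `read_csv_by_filename` and the CSV contents are made up for illustration and are not the star98.csv data from the report.

```python
import os
import tempfile

import numpy as np


def read_csv_by_filename(text):
    """Write `text` to a temporary .csv file and read it back by file name."""
    # Hypothetical helper: the point is to hand genfromtxt a *path*,
    # not an in-memory StringIO object.
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(text)
        return np.genfromtxt(path, delimiter=",", skip_header=1, dtype=float)
    finally:
        os.remove(path)


data = read_csv_by_filename("a,b,c\n1,2,3\n4,5,6\n")
print(data.shape)
```

Going through the file name exercises the code path that opens the file inside genfromtxt, which is where the text-versus-bytes mismatch above showed up.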
Re: [Numpy-discussion] Array views
On Tue, Mar 29, 2011 at 8:13 AM, Pearu Peterson pearu.peter...@gmail.com wrote:
> On Mon, Mar 28, 2011 at 10:44 PM, Sturla Molden stu...@molden.no wrote:
>> On 28.03.2011 19:12, Pearu Peterson wrote:
>>> FYI, f2py in numpy 1.6.x supports also assumed shape arrays.
>>
>> How did you do that? Chasm-interop, C bindings from F03, or
>> marshalling through explicit-shape?
>
> The latter.
> Basically, if you have
>
>     subroutine foo(a)
>       real a(:)
>     end
>
> then f2py automatically creates a wrapper subroutine
>
>     subroutine wrapfoo(a, n)
>       real a(n)
>       integer n
>       !f2py intent(in) :: a
>       !f2py intent(hide) :: n = shape(a,0)
>       interface
>         subroutine foo(a)
>           real a(:)
>         end
>       end interface
>       call foo(a)
>     end
>
> that can be wrapped with f2py in the ordinary way.
>
>> Can f2py pass strided memory from NumPy to Fortran?
>
> No. I haven't thought about it.

Now, after a little bit of thinking and testing, I think supporting strided arrays in f2py is easily doable. For the example above, f2py just has to generate the following wrapper subroutine:

    subroutine wrapfoo(a, stride, n)
      real a(n)
      integer n, stride
      !f2py intent(in) :: a
      !f2py intent(hide) :: n = shape(a,0)
      !f2py intent(hide) :: stride = getstrideof(a)
      interface
        subroutine foo(a)
          real a(:)
        end
      end interface
      call foo(a(1:stride:n))
    end

Now the question is, how important would this feature be? How high should I put it on my todo list? If there is interest, the corresponding numpy ticket should be assigned to me.

Best regards,
Pearu
Re: [Numpy-discussion] Array views
On 03/29/2011 09:35 AM, Pearu Peterson wrote:
> Now, after a little bit of thinking and testing, I think supporting
> strided arrays in f2py is easily doable. For the example above, f2py
> just has to generate the following wrapper subroutine:
>
>     subroutine wrapfoo(a, stride, n)
>       real a(n)
>       integer n, stride
>       !f2py intent(in) :: a
>       !f2py intent(hide) :: n = shape(a,0)
>       !f2py intent(hide) :: stride = getstrideof(a)
>       interface
>         subroutine foo(a)
>           real a(:)
>         end
>       end interface
>       call foo(a(1:stride:n))
>     end

I think it should be a(1:n*stride:stride) or something.

Dag Sverre
Re: [Numpy-discussion] Array views
On Tue, Mar 29, 2011 at 11:03 AM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote:
> I think it should be a(1:n*stride:stride) or something.

Yes, it was my typo and I assumed that n is the length of the original array.

Pearu
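For readers more at home on the NumPy side, the corrected section a(1:n*stride:stride) can be mirrored with an ordinary slice. This is only an illustration of the index arithmetic; the values of `n` and `stride` are arbitrary and nothing here reflects how f2py itself would implement the feature.

```python
import numpy as np

n, stride = 4, 3

# Underlying contiguous buffer of length n*stride
buf = np.arange(n * stride, dtype=np.float32)

# Fortran a(1:n*stride:stride) is 1-based with an inclusive upper bound;
# the 0-based NumPy slice below selects the same n elements.
view = buf[0:n * stride:stride]

print(view)          # the n strided elements
print(view.strides)  # byte step between consecutive elements of the view
```

The slice is a view on `buf`, so no data is copied; the stride information lives in `view.strides`.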
[Numpy-discussion] npz load on numpy 3
More python 3.2 fun: an npz file saved with python 2.6 (I guess) that I try to read with python 3.2. I have no clue, since I never use .npz files.

    >>> arr = np.load(r"..\scikits\statsmodels\tsa\vector_ar\tests\results\vars_results.npz")
    >>> arr
    <numpy.lib.npyio.NpzFile object at 0x03874AC8>
    >>> dir(arr)
    ['__class__', '__contains__', '__del__', '__delattr__', '__dict__',
     '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
     '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
     '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
     '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
     '__subclasshook__', '__weakref__', '_files', 'close', 'f', 'fid',
     'files', 'items', 'iteritems', 'iterkeys', 'keys', 'zip']
    >>> arr.keys()
    ['causality', 'orthirf', 'detomega', 'nirfs', 'loglike', 'stderr',
     'crit', 'phis', 'nahead', 'totobs', 'type', 'obs', 'irf', 'coefs']
    >>> arr['irf']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 222, in __getitem__
        return format.read_array(value)
      File "C:\Programs\Python32\lib\site-packages\numpy\lib\format.py", line 449, in read_array
        array = pickle.load(fp)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 6: ordinal not in range(128)

Any ideas?

Josef
Re: [Numpy-discussion] npz load on numpy 3
Tue, 29 Mar 2011 04:16:00 -0400, josef.pktd wrote:
>     File "C:\Programs\Python32\lib\site-packages\numpy\lib\format.py", line 449, in read_array
>       array = pickle.load(fp)
>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 6: ordinal not in range(128)
[clip]
> Any ideas?

There's some open() call that opens the file in text mode rather than in binary mode, I guess.

Pauli
Re: [Numpy-discussion] npz load on numpy 3
Tue, 29 Mar 2011 08:27:52 +, Pauli Virtanen wrote:
> Tue, 29 Mar 2011 04:16:00 -0400, josef.pktd wrote:
>>     array = pickle.load(fp)
>>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 6: ordinal not in range(128)
> [clip]
>> Any ideas?
>
> There's some open() call that opens the file in text mode rather than
> in binary mode, I guess.

Ah, that's not it. The problem is that pickled Numpy arrays are not backward compatible between Python 2 and 3 because of the string vs. unicode change: the pickle.load() call should specify an encoding, e.g. pickle.load(fp, encoding='latin1').

This needs to be wrapped in a try-except block so that it tries to load the data with encoding='latin1' if the first attempt fails.

--
Pauli Virtanen
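The try-except fallback described above might be sketched as follows on Python 3. The function name `load_legacy_pickle` is hypothetical, and the byte string is a hand-written protocol-0 pickle standing in for data written by Python 2; this is not the code that went into numpy/lib/format.py.

```python
import io
import pickle


def load_legacy_pickle(fp):
    """Unpickle from a binary file object, retrying with latin1 on failure."""
    start = fp.tell()
    try:
        return pickle.load(fp)
    except UnicodeDecodeError:
        # Python 2 byte strings containing non-ASCII bytes end up here,
        # because Python 3 decodes them as ASCII by default.
        fp.seek(start)
        return pickle.load(fp, encoding="latin1")


# Hand-written protocol-0 pickle of the Python 2 str '\xf0'
py2_pickle = b"S'\\xf0'\n."
value = load_legacy_pickle(io.BytesIO(py2_pickle))
print(repr(value))
```

latin1 maps every byte value to a code point, so the retry cannot fail with a decoding error; whether the decoded text is meaningful depends on what the bytes originally were. Current NumPy releases also expose an encoding argument on np.load for this situation.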
[Numpy-discussion] random number genration
If I want to generate a string of random bits with equal probability I run

    random.randint(0,2,size)

What if I want a specific proportion of bits? In other words, each bit is 1 with probability p1/2 and 0 with probability q=1-p?

thanks
Re: [Numpy-discussion] random number genration
Hi,

On Tue, Mar 29, 2011 at 12:00 PM, Alex Ter-Sarkissov ater1...@gmail.com wrote:
> If I want to generate a string of random bits with equal probability I run
>
>     random.randint(0,2,size)
>
> What if I want a specific proportion of bits?

Would it be sufficient to:

    In []: bs= ones(int(1e6), dtype= int)
    In []: bs[randint(0, int(1e6), int(1e5))]= 0
    In []: bs.sum()/ 1e6
    Out[]: 0.904706

Regards,
eat
Re: [Numpy-discussion] random number genration
On Tue, Mar 29, 2011 at 1:29 PM, eat e.antero.ta...@gmail.com wrote:
> Would it be sufficient to:
>
>     In []: bs= ones(int(1e6), dtype= int)
>     In []: bs[randint(0, int(1e6), int(1e5))]= 0
>     In []: bs.sum()/ 1e6
>     Out[]: 0.904706

Or:

    In []: bs= ones(int(1e6), dtype= int)
    In []: bs[rand(int(1e6)) < 1./ 10]= 0
    In []: bs.sum()/ 1e6
    Out[]: 0.89983

Regards,
eat
Re: [Numpy-discussion] random number genration
On 29.03.2011 11:00, Alex Ter-Sarkissov wrote:
> If I want to generate a string of random bits with equal probability I run
>
>     random.randint(0,2,size)
>
> What if I want a specific proportion of bits?

Does this work for you?

    import numpy as np

    def randombits(n, p):
        b = (np.random.rand(n*8).reshape((n,8)) < p).astype(int)
        return (b << np.arange(8)).sum(axis=1).astype(np.uint8)

Sturla
Re: [Numpy-discussion] random number genration
On 29.03.2011 14:56, Sturla Molden wrote:
>     import numpy as np
>
>     def randombits(n, p):
>         b = (np.random.rand(n*8).reshape((n,8)) < p).astype(int)
>         return (b << np.arange(8)).sum(axis=1).astype(np.uint8)

n is the number of bytes. If you prefer to count in bits:

    def randombits(n, p):
        assert(n%8 == 0)
        b = (np.random.rand(n).reshape((n>>3,8)) < p).astype(int)
        return (b << np.arange(8)).sum(axis=1).astype(np.uint8)

Sturla
[Numpy-discussion] np.histogram on arrays.
Hi all.

Sorry if this question has already been asked. I've searched the archive, but could not find anything related, so here is my question.

I'm using np.histogram on a 4000x4000 array, each with 200 bins. I do that on both dimensions, meaning I compute 8000 histograms. It takes around 5 seconds (which is of course quite fast).

I was wondering why np.histogram does not accept an axis parameter so that it could work directly on the array without me having to write a loop. Or maybe I missed some parameter of np.histogram.

Thanks.

Éric.

--
An AZERTY keyboard is worth two
Éric Depagne e...@depagne.org
Re: [Numpy-discussion] np.histogram on arrays.
Hi,

On Tue, Mar 29, 2011 at 4:29 PM, Éric Depagne e...@depagne.org wrote:
> I was wondering why np.histogram does not accept an axis parameter so
> that it could work directly on the array without me having to write a
> loop. Or maybe I missed some parameter of np.histogram.

FWIW, have you considered using
http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogramdd.html#numpy.histogramdd

Regards,
eat
Re: [Numpy-discussion] random number genration
On Tue, Mar 29, 2011 at 5:00 AM, Alex Ter-Sarkissov ater1...@gmail.com wrote:
> If I want to generate a string of random bits with equal probability I run
>
>     random.randint(0,2,size)
>
> What if I want a specific proportion of bits?

    x = (np.random.random(size) < p)

Setting p = .5 should produce the same results as np.random.randint(0,2,size). Note that this gives you an array of bools, not ints; use x.astype(int) if integers are important (or x.astype(np.uint8) if memory is an issue).

HTH,
Dan Lepage
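The comparison trick above can be wrapped in a small function. The sketch below uses the newer np.random.default_rng interface, which did not exist when this thread was written; the function name `bernoulli_bits` and the seed are arbitrary choices for the example.

```python
import numpy as np


def bernoulli_bits(size, p, rng=None):
    """Return `size` independent 0/1 values, each equal to 1 with probability p."""
    rng = np.random.default_rng() if rng is None else rng
    # rng.random() is uniform on [0, 1), so the comparison is True with probability p
    return (rng.random(size) < p).astype(np.uint8)


bits = bernoulli_bits(100_000, 0.1, rng=np.random.default_rng(0))
print(bits.dtype, bits.mean())
```

The result holds one value per byte (an array of 0s and 1s), not bits packed eight to a byte.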
Re: [Numpy-discussion] np.histogram on arrays.
> FWIW, have you considered using
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogramdd.html#numpy.histogramdd
>
> Regards,
> eat

I tried, but I get a

    /usr/lib/pymodules/python2.6/numpy/lib/function_base.pyc in histogramdd(sample, bins, range, normed, weights)
        370     # Reshape is used so that overlarge arrays
        371     # will raise an error.
    --> 372     hist = zeros(nbin, float).reshape(-1)
        373
        374     # Compute the sample indices in the flattened histogram matrix.

    ValueError: sequence too large; must be smaller than 32

so I suspect my array is too big for histogramdd.

Éric.
Re: [Numpy-discussion] random number genration
On 29.03.2011 15:46, Daniel Lepage wrote:
>     x = (np.random.random(size) < p)

This will not work. A boolean array is not compactly stored, but an array of bytes. Only the first bit 0 is 1 with probability p, bits 1 to 7 bits are 1 with probability 0. We thus have to do this 8 times for each byte, shift left by range(8), and combine them with binary or.

Also, the main use of random bits is crypto, which requires the use of /dev/urandom or CryptGenRandom instead of Mersenne Twister (np.random.rand).

Sturla
Re: [Numpy-discussion] random number genration
On 29.03.2011 16:49, Sturla Molden wrote:
> Only the first bit 0 is 1 with probability p, bits 1 to 7 bits are 1
> with probability 0.

That should read:

"Only bit 0 is 1 with probability p, bits 1 to 7 are 1 with probability 0."

Sorry :)

Sturla
Re: [Numpy-discussion] random number genration
On Tue, Mar 29, 2011 at 09:49, Sturla Molden stu...@molden.no wrote:
> On 29.03.2011 15:46, Daniel Lepage wrote:
>>     x = (np.random.random(size) < p)
>
> This will not work. A boolean array is not compactly stored, but an
> array of bytes. Only the first bit 0 is 1 with probability p, bits 1 to
> 7 bits are 1 with probability 0. We thus have to do this 8 times for
> each byte, shift left by range(8), and combine them with binary or.

It's not clear that the OP really meant bits rather than just bools. Judging by the example that he tried first, it's likely that he just wants bools (or even just 0s and 1s) and not a real string of bits compacted into bytes.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
Re: [Numpy-discussion] np.histogram on arrays.
Hi,

On Tue, Mar 29, 2011 at 5:13 PM, Éric Depagne e...@depagne.org wrote:
> I tried, but I get a
[clip]
>     ValueError: sequence too large; must be smaller than 32
>
> so I suspect my array is too big for histogramdd.

So it seems that you give your array directly to histogramdd (asking for a 4000D histogram!). Surely that's not what you are trying to achieve. Can you elaborate more on your objectives? Perhaps some code (slow but working) to demonstrate the point.

Regards,
eat
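The "slow but working" loop asked for above could look something like this sketch, which runs np.histogram over every 1-D slice along one axis via np.apply_along_axis. The helper name `histogram_along_axis` is hypothetical, and a shared `value_range` is assumed so that all slices use the same bin edges. This is not an axis argument on np.histogram and still loops at the Python level, so it is a convenience rather than a speed-up.

```python
import numpy as np


def histogram_along_axis(arr, bins, value_range, axis=-1):
    """Apply np.histogram to every 1-D slice of `arr` taken along `axis`."""
    def counts(slice_1d):
        # Keep only the counts; the bin edges are identical for every slice
        return np.histogram(slice_1d, bins=bins, range=value_range)[0]
    return np.apply_along_axis(counts, axis, arr)


data = np.arange(12).reshape(3, 4)
rows = histogram_along_axis(data, bins=2, value_range=(0, 12), axis=1)
cols = histogram_along_axis(data, bins=2, value_range=(0, 12), axis=0)
print(rows.shape, cols.shape)
```

For the 4000x4000 case in this thread the same call with bins=200 would be made once per axis, giving the 8000 histograms in two arrays.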
Re: [Numpy-discussion] random number genration
On 29.03.2011 16:49, Sturla Molden wrote:
> This will not work. A boolean array is not compactly stored, but an
> array of bytes. Only the first bit 0 is 1 with probability p, bits 1 to
> 7 bits are 1 with probability 0. We thus have to do this 8 times for
> each byte, shift left by range(8), and combine them with binary or.
>
> Also, the main use of random bits is crypto, which requires the use of
> /dev/urandom or CryptGenRandom instead of Mersenne Twister
> (np.random.rand).

Here's a cleaned-up one for those who might be interested :-)

Sturla

    import os

    import numpy as np


    def randombits(n, p, intention='numerical'):
        """
        Returns an array with packed bits drawn from n Bernoulli
        trials with success rate p.
        """
        assert intention in ('numerical', 'crypto')
        # number of bytes
        m = (n >> 3) + 1 if n % 8 else n >> 3
        if intention == 'numerical':
            # Mersenne Twister
            rflt = np.random.rand(m*8)
        else:
            # /dev/urandom on Linux, Apple, et al.,
            # CryptGenRandom on Windows
            rflt = np.frombuffer(os.urandom(m*64), dtype=np.uint64)
            rflt = rflt / float(2**64)
        b = (rflt.reshape((m, 8)) < p)
        # pack the bits
        b = (b << np.arange(8)).sum(axis=1).astype(np.uint8)
        # zero the trailing m*8 - n bits
        b[-1] &= (0xFF >> (m*8 - n))
        return b


    def bitcount(a, bytewise=False):
        """Count the number of set bits in an array of np.uint8."""
        assert a.dtype == np.uint8
        c = a[:, np.newaxis].repeat(8, axis=1) >> np.arange(8) & 0x01
        return c.sum(axis=1) if bytewise else c.sum()


    if __name__ == '__main__':
        b = randombits(int(1e6), .1, intention='numerical')
        print(bitcount(b))  # should be close to 100000
        b = randombits(int(1e6), .1, intention='crypto')
        print(bitcount(b))  # should be close to 100000
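As an alternative to the shift-and-sum packing above, NumPy's own np.packbits and np.unpackbits can do the packing and counting. This is a different technique from the one in the message, shown only as a sketch: the function names are made up, np.packbits puts the first trial in the most significant bit of each byte (the reverse of the shift-based layout), and the generator used here is not suitable for cryptographic use.

```python
import numpy as np


def random_packed_bits(n, p, rng=None):
    """Pack n Bernoulli(p) trials into ceil(n/8) bytes; unused trailing bits are 0."""
    rng = np.random.default_rng() if rng is None else rng
    trials = rng.random(n) < p
    # np.packbits pads the final byte with zero bits when n is not a multiple of 8
    return np.packbits(trials)


def count_set_bits(packed):
    """Count the 1 bits in an array of np.uint8."""
    return int(np.unpackbits(packed).sum())


# With p = 1.0 every trial succeeds, so exactly n bits are set
packed = random_packed_bits(10, 1.0)
print(packed.size, count_set_bits(packed))
```

Because the padding bits are zero, counting set bits over the whole packed array never over-counts, which is the same guarantee the masking step in the original code provides.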
Re: [Numpy-discussion] random number genration
On Tue, Mar 29, 2011 at 11:59 AM, Sturla Molden stu...@molden.no wrote:
> Here's a cleaned-up one for those who might be interested :-)
[clip]

How about adding it to http://www.scipy.org/Cookbook?

Warren
Re: [Numpy-discussion] stable sort on a recarray ?
    sortind = np.argsort(x['name'], kind='mergesort'); x[sortind]

The indirect sorting method that was suggested works for doing a stable sort on recarrays / structured arrays based on a single column.

    # It is necessary to specify kind='mergesort' because argsort is not stable:
    >>> np.argsort(np.ones(100))
    array([ 0, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62, 61, 60, 59, 58, 57,
           56, 55, 54, 53, 52, 73, 51, 74, 76, 97, 96, 95, 94, 93, 92, 91, 90,
           89, 88, 87, 86, 85, 84, 83, 82, 81, 80, 79, 78, 77, 75, 50, 49, 48,
           21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10,  9,  8,  7,  6,  5,
            4,  3,  2,  1, 22, 23, 24, 25, 47, 46, 45, 44, 43, 42, 41, 40, 39,
           38, 98, 37, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 36, 99])

Any suggestions on how to achieve a stable sort based on multiple columns with numpy?

    >>> a = array([('a', 1, 1), ('a', 0, 1), ('a', 0, 0), ('b', 0, 2)],
    ...           dtype=[('name', '|S10'), ('x', 'i4'), ('y', 'i4')])

    name  x  y
    a     1  1
    a     0  1
    a     0  0
    b     0  2

    # perform sort on primary column 'name' first, then if required on secondary column 'x'
    >>> argsort(a, order=('name', 'x'))
    array([2, 1, 0, 3])

We get a correctly ordered result, but the sort is not stable: rows 1 and 2 tie on ('name', 'x') and come out swapped. The desired result was:

    array([1, 2, 0, 3])
Re: [Numpy-discussion] stable sort on a recarray ?
On Tue, Mar 29, 2011 at 13:33, butt...@gmail.com wrote:
> Any suggestions on how to achieve a stable sort based on multiple
> columns with numpy?

http://docs.scipy.org/doc/numpy/reference/generated/numpy.lexsort.html#numpy.lexsort

It uses mergesort for stability.

--
Robert Kern
Re: [Numpy-discussion] stable sort on a recarray ?
np.lexsort does the job for both the single- and multi-column stable sort cases, thanks.

    >>> a = np.array([('a', 1, 1), ('a', 0, 1), ('a', 0, 0), ('b', 0, 2)],
    ...              dtype=[('name', '|S10'), ('x', 'i4'), ('y', 'i4')])
    >>> sortind = np.lexsort([a['x'], a['name']])
    >>> sortind
    array([1, 2, 0, 3], dtype=int64)
    >>> a[sortind]
    array([('a', 0, 1), ('a', 0, 0), ('a', 1, 1), ('b', 0, 2)],
          dtype=[('name', '|S10'), ('x', 'i4'), ('y', 'i4')])

The documentation could perhaps benefit from some clarification:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.lexsort.html#numpy.lexsort

. It is not mentioned on that page that lexsort is a stable sort.
. No structured array / recarray example is given.
. It also states that "Structured arrays are sorted lexically by argsort", but fails to mention that the resulting sort is not stable.

--
thanks,
peter butterworth
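The contrast between the two approaches in this thread can be reproduced in a few lines. The sketch below uses a unicode field ('U10') instead of the '|S10' byte-string field from the messages so that it runs unchanged on Python 3; otherwise the data are the same.

```python
import numpy as np

a = np.array([('a', 1, 1), ('a', 0, 1), ('a', 0, 0), ('b', 0, 2)],
             dtype=[('name', 'U10'), ('x', 'i4'), ('y', 'i4')])

# lexsort keys are listed from least to most significant:
# 'name' is the primary key, 'x' the secondary key.
stable_idx = np.lexsort([a['x'], a['name']])

# argsort with order= uses the unlisted field 'y' to break ties,
# so rows 1 and 2 come out swapped relative to their original order.
order_idx = np.argsort(a, order=('name', 'x'))

print(stable_idx.tolist(), order_idx.tolist())
```

Rows 1 and 2 tie on ('name', 'x'); lexsort leaves them in their original relative order, while argsort with order= orders them by 'y'.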
Re: [Numpy-discussion] histogram normed/density keyword (#1628)
On Sun, Mar 27, 2011 at 11:56 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
> Hi all,
>
> For the 1.6 release #1628 needs to be resolved. A while ago there was a
> discussion about the normed keyword in histogram, which ATM has changed
> behavior compared to numpy 1.5.1. The preferred fix as I read the
> discussion
> (http://thread.gmane.org/gmane.comp.python.numeric.general/39746/focus=40089)
> was to leave normed alone and add a new keyword density.
>
> It would be helpful if someone can volunteer to fix this.

I've done the above; can someone review this:
https://github.com/rgommers/numpy/tree/histogram-density-kw

Thanks,
Ralf
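For context, this is what the density keyword does in current NumPy releases: the returned values are scaled so that the histogram integrates to 1 over the binned range, also when the bins have unequal widths. The sample data and bin edges below are arbitrary, and the snippet illustrates today's np.histogram rather than the branch under review.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

# Unequal bin widths make the difference between counts and density visible
edges = np.array([-5.0, -1.0, 0.0, 0.5, 5.0])
density, _ = np.histogram(samples, bins=edges, density=True)

# A density integrates to 1 over the binned range: sum(density * bin width)
area = float(np.sum(density * np.diff(edges)))
print(area)
```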
Re: [Numpy-discussion] loadtxt/savetxt tickets
On Sun, Mar 27, 2011 at 12:09 PM, Paul Anton Letnes paul.anton.let...@gmail.com wrote:
> I am sure someone has been using this functionality to convert floats
> to ints. Changing will break their code. I am not sure how big a deal
> that would be. Also, I am of the opinion that one should _first_ write
> a program that works _correctly_, and only afterwards worry about
> performance.

While I'd agree in most cases, keep in mind that np.loadtxt is supposed to be a fast but simpler alternative to np.genfromtxt. If np.loadtxt becomes much slower, there's not much need to keep these separate any longer.

Regards
Stéfan
Re: [Numpy-discussion] ANN: Numpy 1.6.0 beta 1
In article AANLkTi=eeg8kl7639imrtl-ihg1ncqyolddsid5tf...@mail.gmail.com, Ralf Gommers ralf.gomm...@googlemail.com wrote:
> Hi,
>
> I am pleased to announce the availability of the first beta of NumPy
> 1.6.0. Due to the extensive changes in the Numpy core for this
> release, the beta testing phase will last at least one month. Please
> test this beta and report any problems on the Numpy mailing list.
>
> Sources and binaries can be found at:
> http://sourceforge.net/projects/numpy/files/NumPy/1.6.0b1/
> For (preliminary) release notes see below.

Great!

FYI: it works for me on MacOS X 10.5.8 with python.org python 2.6.6:

    python setup.py build --fcompiler=gnu95
    python setup.py install
    cd ..
    python -Wd -c 'import numpy; numpy.test()'

    NumPy version 1.6.0b1
    NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy
    Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 (Apple Inc. build 5493)]
    nose version 0.11.4
    ...
    Ran 3399 tests in 25.474s

    OK (KNOWNFAIL=3, SKIP=1)

-- Russell
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi,

On Mon, Mar 28, 2011 at 11:29 PM, josef.p...@gmail.com wrote:
> numpy/lib/test_io.py only uses StringIO in the test, no actual csv file.
>
> If I give the filename then I get a TypeError: Can't convert 'bytes'
> object to str implicitly
[clip]
>     File "C:\Programs\Python32\lib\site-packages\numpy\lib\_iotools.py", line 206, in _delimited_splitter
>       line = line.split(self.comments)[0].strip(asbytes(" \r\n"))
>   TypeError: Can't convert 'bytes' object to str implicitly

Is the right fix for this to open a 'filename' passed to genfromtxt as 'binary' (bytes)?

If so I will submit a pull request with a fix and a test.

Best,
Matthew