Re: [Numpy-discussion] something wrong with docs?
On Tue, Sep 22, 2009 at 9:29 PM, Fernando Perez fperez@gmail.com wrote:

> On Tue, Sep 22, 2009 at 7:31 PM, David Goldsmith wrote:
>> is there a standard for these a la the docstring standard, or some
>> other extant way to promulgate and strengthen your suggestion (after
>> proper community vetting, of course)?
>
> I'm not sure what you mean here, sorry. I simply don't understand what
> you are looking to strengthen or what standard there could be: this is
> regular code that goes into reST blocks. Sorry if I missed your
> point... It would be nice if we could move gradually towards docs
> whose examples (at least those marked as such) were always run via
> Sphinx.

That's a suggestion, but given your point, it seems like you'd advocate it being more than that, no?

DG

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] something wrong with docs?
Tue, 22 Sep 2009 23:15:56 -0700, David Goldsmith wrote:
[clip]
> It would be nice if we could move gradually towards docs whose
> examples (at least those marked as such) were always run via Sphinx.

Also, the examples are doctestable, via numpy.test(doctests=True), or by enabling Sphinx's doctest extension and its support for those. What Fernando said about them being more clumsy to write and copy than separate code directives is of course true. I wonder if there's a technical fix that could be made in Sphinx, at least for HTML, to correct this...

-- Pauli Virtanen
[Numpy-discussion] Deserialized arrays with base mutate strings
Numpy arrays with the base property are deserialized as arrays pointing to storage contained within a Python string. This is a problem since such arrays are mutable and can mutate existing strings. Here is how to create one:

>>> import numpy, cPickle as p
>>> a = numpy.array([1, 2, 3])       # create an array
>>> b = a[::-1]                      # create a view
>>> b
array([3, 2, 1])
>>> b.base                           # view's base is the original array
array([1, 2, 3])
>>> c = p.loads(p.dumps(b, -1))      # roundtrip the view through pickle
>>> c
array([3, 2, 1])
>>> c.base                           # base is now a simple string
'\x03\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00'
>>> s = c.base
>>> type(s)
<type 'str'>
>>> c[0] = 4                         # when the array is mutated...
>>> s                                # ...the string changes value!
'\x04\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00'

This is somewhat disconcerting, as Python strings are supposed to be immutable. In this case the string was created by numpy and is probably not shared by anyone, so it doesn't present a problem in practice. But in corner cases it can lead to serious bugs. Python has a cache of one-letter strings, which cannot be turned off. This means that one-byte array views can change existing Python strings used elsewhere in the code. For example:

>>> a = numpy.array([65], 'int8')
>>> b = a[::-1]
>>> c = p.loads(p.dumps(b, -1))
>>> c
array([65], dtype=int8)
>>> c.base
'A'
>>> c[0] = 66
>>> c.base
'B'
>>> 'A'
'B'

Note how changing a numpy array permanently changed the contents of all 'A' strings in this Python instance, rendering Python unusable. The fix should be straightforward: use a string subclass (which will skip the one-letter cache), or an entirely separate type, for storage of base memory referenced by deserialized arrays.
Re: [Numpy-discussion] Deserialized arrays with base mutate strings
Wed, 23 Sep 2009 09:15:44 +0200, Hrvoje Niksic wrote:
[clip]
> Numpy arrays with the base property are deserialized as arrays
> pointing to storage contained within a Python string. This is a
> problem since such arrays are mutable and can mutate existing strings.
> Here is how to create one:

Please file a bug ticket in the Trac, thanks! Here is a simpler way, although one more difficult to trigger accidentally:

>>> a = numpy.frombuffer("A", dtype='S1')
>>> a.flags.writeable = True
>>> b = "A"
>>> a[0] = "B"
>>> b
'B'

-- Pauli Virtanen
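For reference, this loophole was later closed. A sketch of the same experiment on Python 3 with a recent NumPy (assuming NumPy >= 1.17, where the buffer is an immutable bytes object): the array comes back read-only, and flipping the flag is refused.

```python
import numpy as np

# frombuffer over an immutable bytes object yields a read-only array.
a = np.frombuffer(b"A", dtype="S1")
assert not a.flags.writeable

# Re-enabling the writeable flag is refused on recent NumPy, because
# the underlying buffer does not permit writes.
try:
    a.flags.writeable = True
    flipped = True
except ValueError:
    flipped = False

print(flipped)
```

This is exactly the check Hrvoje suggests below: verify that the buffer is writable before honoring the flag.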
Re: [Numpy-discussion] Deserialized arrays with base mutate strings
Pauli Virtanen wrote:
> Wed, 23 Sep 2009 09:15:44 +0200, Hrvoje Niksic wrote:
> [clip]
>> Numpy arrays with the base property are deserialized as arrays
>> pointing to storage contained within a Python string. This is a
>> problem since such arrays are mutable and can mutate existing
>> strings. Here is how to create one:
>
> Please file a bug ticket in the Trac, thanks!

Done - ticket #1233.

> Here is a simpler way, although one more difficult to trigger
> accidentally:
>
> >>> a = numpy.frombuffer("A", dtype='S1')
> >>> a.flags.writeable = True
> >>> b = "A"
> >>> a[0] = "B"
> >>> b
> 'B'

I guess this one could be prevented by verifying that the buffer is writable when setting the writeable flag. When deserializing arrays, I don't see a reason for the base property to even exist - sharing of the buffer between different views is not preserved anyway, as reported in my other thread.
[Numpy-discussion] current status of numpy - Excel
FYI: Here is a summary of how one can
1) write numpy arrays to Excel
2) interact with numpy/scipy/... from Excel
http://groups.google.com/group/python-excel/msg/3881b7e7ae210cc7

Best regards,
Timmie
Re: [Numpy-discussion] numpy and cython in pure python mode
Robert Kern wrote:
> On Tue, Sep 22, 2009 at 01:33, Sebastian Haase seb.ha...@gmail.com wrote:
>> Hi, I'm not subscribed to the cython list - hoping enough people
>> would care to justify my post here:

The post might be justified, but it is a question of available knowledge as well. I nearly missed this post here. The Cython user list is on: http://groups.google.com/group/cython-users

>> I know that cython's numpy support is still getting better and better
>> over time, but is it already possible today to have numpy support
>> when using Cython in pure Python mode? I like the idea of being able
>> to develop and debug code the Python way -- and then just switching
>> on the cython-overdrive mode. (Otherwise I have very good experience
>> using C/C++ with appropriate typemaps, and I don't mind the C
>> syntax.) I only recently learned about the pure Python mode on the
>> sympy list (and at the EuroScipy 2009 workshop). My understanding is
>> that Cython's pure Python mode can be played in two ways:
>> a) either not having a .pyx file at all and putting everything into a
>> .py file (using the "import cython" stuff), or
>> b) putting only Cython-specific declarations into a .pyx file having
>> the same basename as the .py file next to it.

That should be a .pxd file with the same basename. And I think that mode should work -- b), that is. Sturla's note on the memory view syntax doesn't apply, as that's not in a released version of Cython yet, and won't be until 0.12.1 or 0.13. But that could be made to support pure Python mode a). Finally, there's been some recent discussion on cython-dev about a tool which can take a .pyx file as input and output pure Python.

>> One more: there is no way to reload cython modules (yet), right?
>
> Correct. There is no way to reload any extension module.

This can be worked around (in most situations that arise in practice) by compiling the module with a new name each time and importing things from it, though.
Sage already kind of supports this (for the %attach feature only), and there are patches around for pyximport in Cython that are just lacking testing and review. Since pyximport lacks a test suite altogether, nobody ever seems to get around to that.

Dag Sverre
[Numpy-discussion] Numpy 2D array from a list error
Hi, I've got a fairly large (but not huge, 58 MB) tab-separated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's the snippet of my code that creates a numpy matrix from the data file:

data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines())
data = array(data)   ### <- this line

It causes the following error:

ValueError: setting an array element with a sequence

If I take the first 40,000 lines of the file, it works fine. If I take the last 40,000 lines of the file, it also works fine, so it isn't a problem with the file. I've found a few other posts complaining of the same problem, but none of their fixes work. It seems like a memory problem to me. This was reinforced when I tried to break the dataset into 3 chunks and stack the resulting arrays - I got an error message saying "memory error". I don't really understand why reading in this 58 MB text file is taking up ~2 GB of RAM. Any advice? Thanks in advance

Dave
[Numpy-discussion] simple indexing question
I have an array:

In [12]: a
Out[12]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

And a selection array:

In [13]: b
Out[13]: array([1, 1, 1, 1, 1])

I want a 1-dimensional output, where the array b selects an element from each column of a: if b[i] = 0, select the element from the 0th row of a, and if b[i] = k, select the element from the kth row of a. Easy way to do this? (Not a[b], that gives a 5x5 array output.)
Re: [Numpy-discussion] simple indexing question
Neal Becker wrote:
> I want a 1-dimensional output, where the array b selects an element
> from each column of a: if b[i] = 0, select the element from the 0th
> row of a, and if b[i] = k, select the element from the kth row of a.
> Easy way to do this? (Not a[b], that gives a 5x5 array output.)

It might be stupid, but it works...

In [51]: a
Out[51]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [52]: b = [0,1,0,1,0]

In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b]
Out[53]: array([0, 6, 2, 8, 4])

cheers,
r.
[Numpy-discussion] Create numpy array from a list error
Hi all, I've got a fairly large (but not huge, 58 MB) tab-separated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's the snippet of my code that creates a numpy matrix from the data file:

data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines())
data = array(data)   ### <- this line

It causes the following error:

ValueError: setting an array element with a sequence

If I take the first 40,000 lines of the file, it works fine. If I take the last 40,000 lines of the file, it also works fine, so it isn't a problem with the file. I've found a few other posts complaining of the same problem, but none of their fixes work. It seems like a memory problem to me. This was reinforced when I tried to break the dataset into 3 chunks and stack the resulting arrays - I got an error message saying "memory error". Also, I don't really understand why reading in this 58 MB text file is taking up ~2 GB of RAM. Any advice? Thanks in advance

Dave
Re: [Numpy-discussion] Numpy 2D array from a list error
On 09/23/2009 08:42 AM, davew wrote:
> [same question as above: building an array from a 58 MB tab-separated
> file of numbers and strings fails with "ValueError: setting an array
> element with a sequence", and memory use climbs to ~2 GB]

If the text file has 'numbers and strings', how is numpy meant to know what dtype to use? Please try genfromtxt, especially if columns contain both numbers and strings. What happens if you read a file instead of using stdin? It is possible that one or more rows have multiple sequential delimiters. Please check the row lengths of your 'data' variable after doing:

data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines())

Really, without the input or system it is hard to say anything. If you really know your data, I would suggest preallocating the array and updating it one line at a time, to avoid the large multiple intermediate objects.

Bruce
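Bruce's preallocate-and-fill suggestion can be sketched like this (a minimal sketch with made-up data and sizes; the real script reads ~56k rows from stdin and would need the row/column counts known or counted up front):

```python
import io
import numpy as np

# Stand-in for the tab-separated input; the real script reads sys.stdin.
text = io.StringIO("gene1\tsample1\t1.0\t2.0\ngene2\tsample2\t3.0\t4.0\n")
nrows, ncols = 2, 4

# Preallocate one fixed-width string array, then fill it row by row.
# This avoids the huge intermediate list-of-lists that map/readlines
# builds before array() ever sees the data.
data = np.empty((nrows, ncols), dtype="U32")
for i, line in enumerate(text):
    data[i] = line.rstrip("\n").split("\t")

print(data[1, 2])  # "3.0"
```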
[Numpy-discussion] cummax
Hi, I want a cummax function where, given an array inp, it returns this:

numpy.array([inp[:i].max() for i in xrange(1, len(inp)+1)])

Various Python versions equivalent to the above are quite slow (though a single Python loop is much faster than a Python loop with a nested numpy C loop, as shown above). I have the numpy 1.3.0 source. It looks to me like I could add a cummax function by simply adding PyArray_CumMax to multiarraymodule.c, which would be the same as PyArray_Max except it would call PyArray_GenericAccumulateFunction instead of PyArray_GenericReduceFunction, and also adding array_cummax to arraymethods.c. Is there interest in adding this function to numpy? If so, I will check out the latest code and try to check in these changes. If not, how can I write my own Python module in C that adds this ufunc and still gets to reuse the code in PyArray_GenericReduceFunction? Thanks,

-Nissim
Re: [Numpy-discussion] Numpy 2D array from a list error
> If the text file has 'numbers and strings', how is numpy meant to
> know what dtype to use? Please try genfromtxt, especially if columns
> contain both numbers and strings.

Well, I suppose they are all considered to be strings here. I haven't tried to convert the numbers to floats yet.

> What happens if you read a file instead of using stdin?

Same problem.

> It is possible that one or more rows have multiple sequential
> delimiters. Please check the row lengths of your 'data' variable.

Already done; they all have the same number of columns. The fact that the script works with the first 40k lines, and also with the last 40k lines, suggests to me that there is no problem with the file. (I calculate column means and standard deviations later in the script - it's only the first two columns which can't be cast to floating point numbers.)

> Really, without the input or system it is hard to say anything. If
> you really know your data, I would suggest preallocating the array
> and updating it one line at a time.

I'm running on Linux. My machine is Red Hat with 2 GB RAM, but when memory became an issue I tried running on other Linux machines with much greater RAM capacities. I don't know what distros. I just tried preallocating the array and updating it one line at a time, and that works fine. Thanks very much for the suggestion. :) This doesn't seem like the expected behaviour though, and the error message seems wrong. Many thanks,

Dave
Re: [Numpy-discussion] cummax
On Wed, Sep 23, 2009 at 8:34 AM, Nissim Karpenstein niss...@gmail.com wrote:
> Hi, I want a cummax function where, given an array inp, it returns
> this: numpy.array([inp[:i].max() for i in xrange(1, len(inp)+1)]).
> [...] Is there interest in adding this function to numpy?

It's already available:

In [5]: a = arange(10)

In [6]: a[5:] = 0

In [7]: maximum.accumulate(a)
Out[7]: array([0, 1, 2, 3, 4, 4, 4, 4, 4, 4])

PyArray_Max is there because it is an ndarray method.

Chuck
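To connect this back to the original request: np.maximum.accumulate computes exactly the running maximum from Nissim's list comprehension, but in a single C loop. A quick check (with made-up input data):

```python
import numpy as np

inp = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# Slow version from the original post: a Python loop over prefixes,
# each calling the numpy reduction on a growing slice.
naive = np.array([inp[:i].max() for i in range(1, len(inp) + 1)])

# ufunc accumulate does the same thing in one pass over the array.
fast = np.maximum.accumulate(inp)

assert (naive == fast).all()
print(fast)  # [3 3 4 4 5 9 9 9]
```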
Re: [Numpy-discussion] simple indexing question
Robert Cimrman wrote:
> Neal Becker wrote:
>> I want a 1-dimensional output, where the array b selects an element
>> from each column of a. [...] Easy way to do this? (Not a[b], that
>> gives a 5x5 array output.)
>
> It might be stupid, but it works...
>
> In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b]
> Out[53]: array([0, 6, 2, 8, 4])

Thanks. Is there really no more elegant solution?
Re: [Numpy-discussion] simple indexing question
On Wed, Sep 23, 2009 at 11:12 AM, Neal Becker ndbeck...@gmail.com wrote:
> Robert Cimrman wrote:
>> In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b]
>> Out[53]: array([0, 6, 2, 8, 4])
>
> Thanks. Is there really no more elegant solution?

How about this?

>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> b
array([0, 1, 0, 1, 0])
>>> a[b, np.arange(a.shape[1])]
array([0, 6, 2, 8, 4])

Josef
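Josef's one-liner is plain integer fancy indexing: the row-index array b and the column-index array arange(ncols) are paired elementwise, selecting a[b[i], i] for each column i. A runnable version of the thread's example:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
b = np.array([0, 1, 0, 1, 0])     # which row to take in each column

# Two index arrays of the same shape are paired elementwise, so this
# yields a 1-D result, unlike a[b] which broadcasts b against whole rows
# and produces a 5x5 array.
result = a[b, np.arange(a.shape[1])]
print(result)  # [0 6 2 8 4]
```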
Re: [Numpy-discussion] Numpy 2D array from a list error
On 09/23/2009 10:00 AM, Dave Wood wrote:
> [replies to the earlier suggestions: the values are all strings so
> far; reading from a file instead of stdin gives the same problem; the
> rows all have the same length; preallocating the array and updating
> it one line at a time works fine, but the error message seems wrong]

Glad you got a solution. While far from an expert: with 2 GB of RAM you do not have that much free RAM outside the OS and other overheads. With your code, the OS has to read all the data in at least once, as well as allocate the storage for the result and any intermediate objects, so it is easy to exhaust memory. I agree that the error message is too vague, so you could file a ticket. Use PyTables if memory is a problem for you. For example, see the recent 'np.memmap and memory usage' thread on numpy-discussion: http://www.mail-archive.com/numpy-discussion@scipy.org/msg18863.html Especially the post by Francesc Alted: http://www.mail-archive.com/numpy-discussion@scipy.org/msg18868.html

Bruce
Re: [Numpy-discussion] Numpy 2D array from a list error
On Wed, Sep 23, 2009 at 9:42 AM, davew davejw...@gmail.com wrote:
> [same question: array(data) on a 58 MB tab-separated file fails with
> "ValueError: setting an array element with a sequence" and uses
> ~2 GB of RAM]

Without knowing more, I wouldn't think that there's really a memory error trying to load a 58 MB file, or stacking it split into 3. Try using genfromtxt or loadtxt. It should work without a problem unless there is something funny about your file.

Skipper
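A minimal sketch of the genfromtxt route for mixed columns (made-up data; with dtype=None, each column gets its own inferred type in a structured array, which is what Bruce and Skipper are pointing at):

```python
import io
import numpy as np

# Hypothetical tab-separated input: one string column, two float columns.
src = io.StringIO("gene1\t1.5\t2.5\ngene2\t3.5\t4.5\n")

# dtype=None asks genfromtxt to infer a per-column dtype, producing a
# structured array instead of forcing every value into a single type.
data = np.genfromtxt(src, delimiter="\t", dtype=None, encoding="utf-8")

print(data["f1"])  # second column as floats: [1.5 3.5]
```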
[Numpy-discussion] Numpy depends on OpenSSL ???
I have discovered the hard way that numpy depends on OpenSSL. I am building a 64-bit Python environment for the Macintosh. I currently do not have a 64-bit OpenSSL library installed, so the Python interpreter does not have hashlib. (hashlib gets its md5 function from the OpenSSL library.) The problem is in numpy/core/code_generators/genapi.py, where it appears to be trying to make an MD5 hash of the declarations of some of the C functions. What is this hash used for? Is there a particular reason that it needs to be cryptographically strong?

Mark S.
Re: [Numpy-discussion] simple indexing question
josef.p...@gmail.com wrote:
> How about this?
>
> >>> a[b, np.arange(a.shape[1])]
> array([0, 6, 2, 8, 4])

So it was stupid :) Well, time to go home,

r.
Re: [Numpy-discussion] Numpy depends on OpenSSL ???
On Wed, Sep 23, 2009 at 10:52, Mark Sienkiewicz sienk...@stsci.edu wrote:
> I have discovered the hard way that numpy depends on OpenSSL. I am
> building a 64-bit Python environment for the Macintosh. I currently
> do not have a 64-bit OpenSSL library installed, so the Python
> interpreter does not have hashlib. (hashlib gets its md5 function
> from the OpenSSL library.)

There are builtin implementations that do not depend on OpenSSL. hashlib should be using them for MD5 and the standard SHA variants when OpenSSL is not available. Try "import _md5". But basically, we expect you to have a reasonably complete standard library.

> The problem is in numpy/core/code_generators/genapi.py, where it
> appears to be trying to make an MD5 hash of the declarations of some
> of the C functions. What is this hash used for? Is there a particular
> reason that it needs to be cryptographically strong?

It is used for checking for changes in the API. While this use case does not require all of the properties that would make a hash cryptographically strong, it needs some of them.

-- Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] Numpy depends on OpenSSL ???
On Wed, Sep 23, 2009 at 9:52 AM, Mark Sienkiewicz sienk...@stsci.edu wrote:
> [same question: genapi.py makes an MD5 hash of the C function
> declarations - what is it used for, and does it need to be
> cryptographically strong?]

The hash is used as a way to check for any API changes. It doesn't have to be cryptographically strong, it just needs to scatter the hashed values effectively, and we could probably use something simpler. I tend to regard this problem as a Python bug, because the standard Python modules should be available on all platforms. In any case, we should find a fix. Please open a ticket.

Chuck
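A toy sketch of the idea (the declarations below are hypothetical stand-ins; the real code lives in numpy/core/code_generators/genapi.py): hash the concatenated C API declarations, so any change to any signature changes the checksum.

```python
import hashlib

# Hypothetical C API declarations standing in for what genapi.py reads.
decls = [
    "PyObject *PyArray_Max(PyArrayObject *, int)",
    "PyObject *PyArray_Min(PyArrayObject *, int)",
]

def api_checksum(declarations):
    # Any edit to any declaration changes the digest, which is all the
    # API-change check needs; cryptographic strength is incidental.
    return hashlib.md5("".join(declarations).encode("utf-8")).hexdigest()

old = api_checksum(decls)
new = api_checksum(decls + ["PyObject *PyArray_CumMax(PyArrayObject *, int)"])
assert old != new  # adding a declaration is detected
```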
Re: [Numpy-discussion] Create numpy array from a list error
On Wed, Sep 23, 2009 at 9:06 AM, Dave Wood davejw...@gmail.com wrote:
> [same question: array(data) on a 58 MB tab-separated file of numbers
> and strings fails with "ValueError: setting an array element with a
> sequence" and uses ~2 GB of RAM]

Here I use loadtxt to read a ~89 MB txt file. Can you use loadtxt and share your results?

I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', skiprows=83).T

I[15]: len(data)
O[15]: 66

I[16]: len(data[0])
O[16]: 117040

I[17]: whos
Variable   Type      Data/Info
data       ndarray   66x117040: 7724640 elems, type `float64`, 61797120 bytes (58 Mb)

[gse...@ccn various]$ python sysinfo.py
Platform   : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
Python     : ('CPython', 'tags/r26', '66714')
IPython    : 0.10
NumPy      : 1.4.0.dev
Matplotlib : 1.0.svn

-- Gökhan
Re: [Numpy-discussion] Numpy depends on OpenSSL ???
On 23-Sep-09, at 11:52 AM, Mark Sienkiewicz wrote:
> I am building a 64-bit Python environment for the Macintosh. I
> currently do not have a 64-bit OpenSSL library installed, so the
> Python interpreter does not have hashlib. (hashlib gets its md5
> function from the OpenSSL library.)

If you're interested in remedying this with your Python build, have a look at Mac/BuildScript; there is a bunch of logic there that downloads various optional dependencies and builds them with the selected architectures. It should not be difficult to modify it to also grab and build OpenSSL.

David
Re: [Numpy-discussion] Numpy depends on OpenSSL ???
On Thu, Sep 24, 2009 at 1:20 AM, Charles R Harris charlesr.har...@gmail.com wrote:
> In any case, we should find a fix.

I don't think we do - we require a standard Python install, and a Python without hashlib is crippled. If you can't build Python without OpenSSL, I would consider this a Python bug.

cheers, David
Re: [Numpy-discussion] Numpy 2D array from a list error
Dave Wood wrote:
> Well, I suppose they are all considered to be strings here. I haven't
> tried to convert the numbers to floats yet.

This could be an issue. For strings, numpy creates an array of strings, all of the same length, so each element is as big as the largest one:

In [13]: l
Out[13]: ['5', '34', 'this is a much longer string']

In [14]: np.array(l)
Out[14]:
array(['5', '34', 'this is a much longer string'],
      dtype='|S28')

Note that each element is 28 bytes (that's what the S28 means). This means that your array would be much larger than the text file if you have even one long string in it. Also, as mentioned in this thread, in order to figure out how big to make each string element, the array() constructor has to scan through your entire list first, and I don't know how much intermediate memory it may use in that process. This really isn't how numpy is meant to be used -- why would you want a big ol' array of mixed numbers and strings, all stored as strings? Structured arrays were meant for this, and np.loadtxt() is the easiest way to get one.

> I just tried preallocating the array and updating it one line at a
> time, and that works fine.

What dtype do you end up with?

> This doesn't seem like the expected behaviour though and the error
> message seems wrong.

Yes, not a good error message at all -- it's hard to make sure good errors get triggered every time!

HTH,
-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959 voice
7600 Sand Point Way NE  (206) 526-6329 fax
Seattle, WA 98115       (206) 526-6317 main reception
chris.bar...@noaa.gov
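Chris's point about per-element width is easy to reproduce today (on Python 3 the inferred dtype shows up as '<U28' rather than '|S28', but the principle is the same):

```python
import numpy as np

l = ["5", "34", "this is a much longer string"]
a = np.array(l)

# Every element gets a slot wide enough for the longest string
# (28 characters here), so one long string inflates the whole array.
print(a.dtype)           # <U28
print(a.dtype.itemsize)  # 112 bytes per element (4 bytes per character)
```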
Re: [Numpy-discussion] Create numpy array from a list error
On Wed, Sep 23, 2009 at 9:06 AM, Dave Wood davejw...@gmail.com wrote: Hi all, I've got a fairly large (but not huge, 58mb) tab-separated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's a snippet of my code to create a numpy matrix from the data file... data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) data = array(data) ### It causes the following error: data = array(data) ValueError: setting an array element with a sequence If I take the first 40,000 lines of the file, it works fine. If I take the last 40,000 lines of the file, it also works fine, so it isn't a problem with the file. I've found a few other posts complaining of the same problem, but none of their fixes work. It seems like a memory problem to me. This was reinforced when I tried to break the dataset into 3 chunks and stack the resulting arrays - I got an error message saying memory error. Also, I don't really understand why reading in this 57mb txt file is taking up ~2gb's of RAM. Any advice? Thanks in advance Dave One more reply: you are trying to read mixed data (strings and numbers) into a single array, and that might be causing the problem. In my example, after skipping the meta-header all I have is numbers. Additionally, when you are reading a chunk of data, if one of the column elements is truncated or overflows its section, NumPy complains with that ValueError. -- Gökhan
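For mixed string/number columns, a structured array is the usual fix. A minimal sketch with np.genfromtxt (the two-row file contents here are hypothetical stand-ins for Dave's data):

```python
import io
import numpy as np

# A name column followed by numeric columns, mimicking a tiny slice
# of the tab-separated file (contents are made up for illustration).
text = "gene1\t1.5\t2.5\ngene2\t3.5\t4.5\n"

# dtype=None lets genfromtxt infer a per-column dtype, producing a
# structured array instead of one giant all-strings array.
data = np.genfromtxt(io.StringIO(text), delimiter='\t',
                     dtype=None, encoding='utf-8')
print(data.dtype)   # e.g. [('f0', '<U5'), ('f1', '<f8'), ('f2', '<f8')]
print(data['f1'])   # first numeric column as float64
```

Each column then gets its natural dtype, so the string column no longer forces every number to be stored as text.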
Re: [Numpy-discussion] Numpy 2D array from a list error
Apologies for the multiple posts, people. My posting to the forum was pending for a long time, so I deleted it and tried emailing directly. I didn't think they'd all be sent out. Gökhan, thanks for the reply, I hope you get this one. Here I use loadtxt to read a ~89 MB txt file. Can you use loadtxt and share your results? I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', skiprows=83).T I[15]: len(data) O[15]: 66 I[16]: len(data[0]) O[16]: 117040 I[17]: whos Variable Type Data/Info data ndarray 66x117040: 7724640 elems, type `float64`, 61797120 bytes (58 Mb) [gse...@ccn various]$ python sysinfo.py Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas Python : ('CPython', 'tags/r26', '66714') IPython : 0.10 NumPy : 1.4.0.dev Matplotlib : 1.0.svn -- Gökhan I tried using loadtxt and got the same error as before (with a little more information). Traceback (most recent call last): File /home/dwood/workspace/GeneralScripts/src/test_clab2R.py, line 140, in module main() File /home/dwood/workspace/GeneralScripts/src/test_clab2R.py, line 45, in main data = loadtxt(inputfile.txt, dtype='string') File /apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py, line 505, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence @Christopher Barker Thanks for the information. To fix my problem, I tried taking out the row names (leaving only numerical information) and converting the 2D list to floats. I still had the same problem. On 9/23/09, Christopher Barker chris.bar...@noaa.gov wrote: Dave Wood wrote: Well, I suppose they are all considered to be strings here. I haven't tried to convert the numbers to floats yet. This could be an issue. 
[remainder of Christopher Barker's message, quoted in full earlier in the thread, clipped]
Re: [Numpy-discussion] Numpy 2D array from a list error
Ignore that last mail, I hit send instead of save by mistake. Between you, you both seem to be right: it's a problem with loading the array of strings. There must be some large strings in the first 'rowname' column. If this column is left out, it works fine (even as strings). Many thanks, sorry for all the emails. Dave On 9/23/09, Dave Wood davejw...@gmail.com wrote: [earlier message, quoted in full above, clipped]
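Since dropping the row-name column fixed it, loadtxt's usecols argument can do that without editing the file. A sketch (the column layout and contents here are hypothetical):

```python
import io
import numpy as np

# A long name in column 0 would force an oversized string dtype;
# usecols skips it so only the numeric columns are parsed.
text = "some_very_long_row_name\t1.5\t2.5\nshort\t3.5\t4.5\n"
data = np.loadtxt(io.StringIO(text), delimiter='\t', usecols=(1, 2))
print(data)   # [[1.5 2.5]
              #  [3.5 4.5]]
```

The row names can be read separately (e.g. as a plain Python list) if they are still needed for labeling.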
On 9/23/09, Christopher Barker chris.bar...@noaa.gov wrote: [Christopher Barker's message, quoted in full earlier in the thread, clipped]
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
On Tue, Sep 22, 2009 at 17:14, Citi, Luca lc...@essex.ac.uk wrote: My vote (if I am entitled to one) goes to changing the code. Whether or not the addressee of .base is an array, it should be the object that has to be kept alive so that the data does not get deallocated, rather than one object which will keep alive another object, which will keep alive another object, ..., which will keep alive the object with the data. On creation of a new view B of object A, if A has OWNDATA true then B.base = A, else B.base = A.base. When working on http://projects.scipy.org/numpy/ticket/1085 I had to walk the chain of bases to establish whether any of the inputs and the outputs were views of the same data. If base were the ultimate base, one would only need to check whether any of the inputs have the same base as any of the outputs. This is not reliable. You need to check memory addresses and extents for overlap (unfortunately, slices complicate this; numpy.may_share_memory() is a good heuristic, though). When interfacing with other systems using __array_interface__ or similar APIs, the other system may have multiple objects that point to the same data. If you create ndarrays from each of these objects, their .base attributes would all be different although they all point to the same memory. I tried to modify the code to change the behaviour. I have opened a ticket for this http://projects.scipy.org/numpy/ticket/1232 and attached a patch but I am not 100% sure. I changed PyArray_View in convert.c and a few places in mapping.c and sequence.c. But if there is any reason why the current behaviour should be kept, just ignore the ticket. Lacking a robust use case, I would prefer to keep the current behavior. It is likely that nothing would break if we changed it, but without a use case, I would prefer to be conservative. 
-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] simple indexing question
josef.p...@gmail.com wrote: On Wed, Sep 23, 2009 at 11:12 AM, Neal Becker ndbeck...@gmail.com wrote: Robert Cimrman wrote: Neal Becker wrote: I have an array: In [12]: a Out[12]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) And a selection array: In [13]: b Out[13]: array([1, 1, 1, 1, 1]) I want a 1-dimensional output, where b selects an element from each column of a: if b[i] = 0, select the element from the 0th row of a, and if b[i] = k, select the element from the kth row of a. Easy way to do this? (Not a[b], that gives 5x5 array output.) It might be stupid, but it works... In [51]: a Out[51]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [52]: b = [0,1,0,1,0] In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] Out[53]: array([0, 6, 2, 8, 4]) cheers, r. Thanks. Is there really no more elegant solution? How about this? a array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) b array([0, 1, 0, 1, 0]) a[b, np.arange(a.shape[1])] array([0, 6, 2, 8, 4]) Josef Thanks, that's not bad. I'm a little surprised that, given the fancy indexing capabilities of numpy, there isn't a more direct way to do this. I'm still trying to wrap my mind around the fancy indexing stuff.
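For the record, Josef's pattern pairs each column index with the row picked for that column; np.choose performs the same per-position selection and can read more directly:

```python
import numpy as np

a = np.array([[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9]])
b = np.array([0, 1, 0, 1, 0])

# Fancy indexing: pair each column index with its chosen row.
out1 = a[b, np.arange(a.shape[1])]

# np.choose treats b as a per-position selector over a's rows.
out2 = np.choose(b, a)

print(out1)   # [0 6 2 8 4]
print(out2)   # [0 6 2 8 4]
```

Both broadcast the two index arrays element-wise, so each output position i gets a[b[i], i].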
Re: [Numpy-discussion] Create numpy array from a list error
On 23-Sep-09, at 10:06 AM, Dave Wood wrote: Hi all, I've got a fairly large (but not huge, 58mb) tab-separated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's a snippet of my code to create a numpy matrix from the data file... data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) data = array(data) In general I have found that the pattern you're using is a bad one, because it first reads the entire file into memory and then makes a complete copy of it when you call map. I would instead use data = [x.strip().split('\t') for x in sys.stdin] or even defer the loop until array() is called, with a generator: data = (x.strip().split('\t') for x in sys.stdin) This difference still shouldn't be resulting in a memory error with only 57 MB of data, but it'll make things go faster at least. David
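One caveat worth flagging with the generator variant: np.array() does not consume a generator element by element (it wraps it in a 0-d object array, at least in later NumPy versions), so the deferred form needs a list comprehension. A small sketch of the list-comprehension route:

```python
import io
import numpy as np

# Stand-in for sys.stdin; contents are made up for illustration.
f = io.StringIO("1\t2\t3\n4\t5\t6\n")

# The comprehension still reads line by line (no readlines() copy),
# and hands np.array a sequence it can actually size up.
data = np.array([line.strip().split('\t') for line in f], dtype=float)
print(data.shape)   # (2, 3)
```

For 1-D numeric data, np.fromiter() is the tool that genuinely consumes an iterator without an intermediate list.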
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
On Wed, Sep 23, 2009 at 13:30, Citi, Luca lc...@essex.ac.uk wrote: http://projects.scipy.org/numpy/ticket/1085 But I think in that case it was still an improvement w.r.t. the current implementation, which is buggy. At least it shields 95% of users from unexpected results. Using memory addresses and extents might be overkill (and expensive) in that case. numpy.may_share_memory() should be pretty cheap. It's just arithmetic. -- Robert Kern
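The heuristic nature Robert mentions is easy to see: may_share_memory only compares address bounds, so interleaved slices look shared even though no element overlaps. A sketch (np.shares_memory, which does the exact check, was only added much later, in NumPy 1.11):

```python
import numpy as np

a = np.arange(10)
even = a[::2]    # elements 0, 2, 4, ...
odd = a[1::2]    # elements 1, 3, 5, ...

# The address ranges of `even` and `odd` overlap, so the cheap
# bounds check conservatively answers True...
print(np.may_share_memory(even, odd))   # True

# ...even though no actual element is shared between them.
print(np.shares_memory(even, odd))      # False
```

That asymmetry is the trade-off: bounds arithmetic is O(1), while the exact overlap problem for strided slices is much more expensive.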
Re: [Numpy-discussion] Deserialized arrays with base mutate strings
On Wed, 2009-09-23 at 10:01 +0200, Hrvoje Niksic wrote: [clip] I guess this one could be prevented by verifying that the buffer is writable when setting the writable flag. When deserializing arrays, I don't see a reason for the base property to even exist - sharing of the buffer between different views is not preserved anyway, as reported in my other thread. IIRC, it avoids one copy: ndarray.__reduce__ pickles the raw data as a string, and so ndarray.__setstate__ receives a Python string back. I don't remember if it's possible in the end to emit a raw byte stream to a pickle somehow, without going through strings. If not, then a copy can't be avoided. -- Pauli Virtanen
Re: [Numpy-discussion] Deserialized arrays with base mutate strings
On Wed, Sep 23, 2009 at 13:59, Pauli Virtanen p...@iki.fi wrote: On Wed, 2009-09-23 at 10:01 +0200, Hrvoje Niksic wrote: [clip] I guess this one could be prevented by verifying that the buffer is writable when setting the writable flag. When deserializing arrays, I don't see a reason for the base property to even exist - sharing of the buffer between different views is not preserved anyway, as reported in my other thread. IIRC, it avoids one copy: ndarray.__reduce__ pickles the raw data as a string, and so ndarray.__setstate__ receives a Python string back. Correct, that was the goal. I don't remember if it's possible in the end to emit a raw byte stream to a pickle somehow, without going through strings. If not, then a copy can't be avoided. No, I don't think you can. -- Robert Kern
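The point that view-sharing is not preserved across pickling is easy to observe: the roundtripped array compares equal but is backed by fresh memory, so mutating it can no longer touch the original (checked here with the modern np.shares_memory):

```python
import pickle
import numpy as np

a = np.array([1, 2, 3])
b = a[::-1]                        # a view on a's buffer
c = pickle.loads(pickle.dumps(b, protocol=2))

print((c == b).all())              # True: the values survive the roundtrip
print(np.shares_memory(a, c))      # False: the data was copied
c[0] = 99                          # mutating c leaves a and b untouched
print(b[0])                        # still 3
```

So the only question in this subthread is how many intermediate copies the pickle machinery makes on the way, not whether the result ends up independent.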
[Numpy-discussion] Coercing object arrays to string (or unicode) arrays
As I'm looking into fixing a number of bugs in chararray, I'm running into some surprising behavior. One of the things chararray needs to do occasionally is build up an object array of string objects, and then convert that back to a fixed-length string array. This length is sometimes predetermined by a recarray data structure. Unfortunately, I'm not getting what I would expect when coercing or assigning an object array to a string array. Is this a bug, or am I just going about this the wrong way? If a bug, I'm happy to look into it as part of my fixing chararray task, but I just wanted to confirm that it is a bug before proceeding. In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O') # Without specifying the length, it seems to default to sizeof(int)... ??? In [15]: np.array(x, 'S') Out[15]: array(['abcd', 'ijkl'], dtype='|S4') In [21]: np.array(x, np.string_) Out[21]: array(['abcd', 'ijkl'], dtype='|S4') # Specifying a length gives strange results In [16]: np.array(x, 'S8') Out[16]: array(['abcdijkl', 'mnop\xe0\x01\x85\x08'], dtype='|S8') # This is what I expected to happen above, but the cast to a list seems like it should be unnecessary In [17]: np.array(list(x)) Out[17]: array(['abcdefgh', 'ijklmnop'], dtype='|S8') # Assignment also seems broken In [18]: y = np.empty(x.shape, dtype='S8') In [19]: y[:] = x[:] In [20]: y Out[20]: array(['abcdijkl', 'mnop\xc05\xf9\xb7'], dtype='|S8') Cheers, Mike ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
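For reference, the list() detour Mike already found is the reliable route, and astype() does the same element-wise conversion; later NumPy releases fixed the direct object-to-string coercion, so the outputs below reflect the behavior he expected (ASCII data assumed):

```python
import numpy as np

x = np.array(['abcdefgh', 'ijklmnop'], dtype=object)

# Going through a plain list lets np.array size and fill the string
# dtype from the elements themselves.
y = np.array(list(x), dtype='S8')
print(y)   # [b'abcdefgh' b'ijklmnop']

# astype performs the same per-element conversion on the object array.
z = x.astype('S8')
print(z)
```

Either route avoids the raw-buffer reinterpretation that produced the garbage bytes in the transcript above.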
Re: [Numpy-discussion] Numpy depends on OpenSSL ???
Robert Kern wrote: On Wed, Sep 23, 2009 at 10:52, Mark Sienkiewicz sienk...@stsci.edu wrote: I have discovered the hard way that numpy depends on openssl. I am building a 64 bit python environment for the macintosh. I currently do not have a 64 bit openssl library installed, so the python interpreter does not have hashlib. (hashlib gets its md5 function from the openssl library.) There are builtin implementations that do not depend on OpenSSL. hashlib should be using them for MD5 and the standard SHA variants when OpenSSL is not available. This is the clue that I needed. Here is where it led: setup.py tries to detect the presence of openssl by looking for the library and the include files. It detects the library that Apple provided in /usr/lib/libssl.dylib and tries to build the openssl version of hashlib. But when it actually builds the module, the link fails because that library file is not for the correct architecture. I am building for x86_64, but the library contains only ppc and i386. The result is that hashlib cannot be imported, so the python installer decides not to install it at all. That certainly appears to indicate that the python developers consider hashlib to be optional, but it _should_ work in most any python installation. So, the problem is really about the python install automatically detecting libraries. If I hack the setup.py that builds all the C modules so that it can't find the openssl library, then it uses the fallbacks that are distributed with python. That gets me as far as EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable, which is a big improvement. (The math library is not missing, but that is a different problem entirely.) Thanks, and sorry for the false alarm. Mark S.
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
numpy.may_share_memory() should be pretty cheap. It's just arithmetic. True, but it is in python. Not something that should go in construct_arrays of ufunc_object.c, I suppose. But the same approach can probably be translated to C. I can try if we decide http://projects.scipy.org/numpy/ticket/1085 is worth fixing. Let me know.
Re: [Numpy-discussion] something wrong with docs?
On Tue, Sep 22, 2009 at 11:15 PM, David Goldsmith d.l.goldsm...@gmail.com wrote: It would be nice if we could move gradually towards docs whose examples (at least those marked as such) were always run via sphinx. That's a suggestion, but given your point, it seems like you'd advocate it being more than that, no? I was simply thinking that if this markup were to be used in the docs for all examples where it makes sense, then one could simply use the sphinx target make doctest to also validate the documentation. Even if users don't run these by default, developers and buildbots would, which helps raise the reliability of the docs and reduces the chance of code bitrot in the examples from the main docs (that problem is taken care of for the docstrings by np.test(doctests=True)). Cheers, f
[Numpy-discussion] dtype '|S0' not understood
Howdy, It seems it's possible, using e.g. In [25]: dtype([('foo', str)]) Out[25]: dtype([('foo', '|S0')]) to get yourself a zero-length string field. However, dtype('|S0') results in a TypeError: data type not understood. I understand the stupidity of creating a 0-length string field, but it's conceivable that one gets created accidentally. For example, it could lead to a situation where you've created that field, are missing all the data you had meant to put in it, serialize with np.save, and upon np.load aren't able to get _any_ of your data back because the dtype descriptor is considered bogus (can you guess why I thought of this scenario?). It seems that either dtype(str) should do something more sensible than a zero-length string, or it should be possible to create it with dtype('|S0'). Which should it be? David
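Until the constructor behavior is settled, a defensive check before np.save can catch the accidental case David describes. The helper below is a hypothetical sketch, not part of NumPy: it flags any string field of a structured dtype whose size was never set (itemsize 0):

```python
import numpy as np

def check_no_empty_string_fields(dt):
    """Raise if any string field of a structured dtype has zero length.

    Hypothetical helper, not a NumPy API.
    """
    bad = [name for name in (dt.names or ())
           if dt[name].kind in 'SU' and dt[name].itemsize == 0]
    if bad:
        raise ValueError("zero-length string fields: %r" % bad)

# A properly sized dtype passes silently...
check_no_empty_string_fields(np.dtype([('foo', 'S8'), ('bar', float)]))

# ...while the accidental dtype([('foo', str)]) case is caught.
try:
    check_no_empty_string_fields(np.dtype([('foo', str)]))
except ValueError as err:
    print(err)
```

Running this before serializing would turn a silent data-loss scenario into an immediate, explainable error.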