Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Hi,

On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com wrote:

...

Perhaps we should make a wiki page someplace summarizing pros and cons of the various implementation approaches?

But - we should do this if it really is an open question which one we go for. If not, then we're just slowing Mark down in getting to the implementation.

Assuming the question is still open, here's a starter for the pros and cons:

array.mask
1) It's easier / neater to implement
2) It can generalize across dtypes
3) You can still get the masked data underneath the mask (allowing you to unmask etc)

nafloat64:
1) No memory overhead
2) Battle-tested implementation already done in R

I guess we'd have to test directly whether the non-contiguous memory of the mask and data would cause enough cache-miss problems to outweigh the potential cycle-savings from single byte comparisons in array.mask.

I guess that one and only one of these will get written. I guess that one of these choices may be a lot more satisfying to the current and future masked array itch than the other.

I'm personally worried that the memory overhead of array.masks will make many of us tend to avoid them. I work with images that can easily get large enough that I would not want an array-items size byte array added to my storage.

The reason I'm asking for more details about the implementation is because that is most of the argument for array.mask at the moment (1 and 2 above).

See you,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
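To put a rough number on the memory worry above, here is a minimal sketch comparing the two storage layouts. It assumes a float64 image and a mask that costs one byte per item; the shape and the names `data`, `mask` and `na_data` are made up for illustration, and NaN merely stands in for the NA value (the real nafloat64 design might reserve a specific bit pattern instead).

```python
import numpy as np

# Hypothetical image size; any shape would do.
shape = (256, 256, 32)
data = np.zeros(shape, dtype=np.float64)

# array.mask style: a separate boolean array, one byte per item.
mask = np.zeros(shape, dtype=np.bool_)
mask_overhead = mask.nbytes / data.nbytes

# nafloat64 style, approximated here with NaN as the sentinel:
# the missing-value marker lives inside the data, so no extra array.
na_data = data.copy()
na_data[0, 0, 0] = np.nan

print("data bytes:       ", data.nbytes)
print("mask bytes:       ", mask.nbytes)
print("relative overhead:", mask_overhead)
```

For float64 the mask adds one byte to every eight bytes of data; for a uint8 image the same mask would double the storage, which is closer to the case that worries people working with large images.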
Re: [Numpy-discussion] Numpy steering group?
Hi,

On Wed, May 4, 2011 at 9:24 AM, Robert Kern robert.k...@gmail.com wrote:
On Wed, May 4, 2011 at 11:14, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Tue, May 3, 2011 at 7:58 PM, Robert Kern robert.k...@gmail.com wrote:

I can't speak for the rest of the group, but as for myself, if you would like to draft such a letter, I'm sure I will agree with its contents.

Thank you - sadly I am not confident in deserving your confidence, but I will do my best to say something sensible. Any objections to a public google doc?

Even better!

I've put up a draft here:

numpy-whaley-support - https://docs.google.com/document/d/1gPhUUjWqNpRatw90kCqL1WPWvn1yicf2VAowWSyHlno/edit?hl=en_US&authkey=CPv49_cK

I didn't know who to put as signatories. Maybe an extended steering group like (from http://scipy.org/Developer_Zone):

Jarrod Millman
Eric Jones
Robert Kern
Travis Oliphant
Stefan van der Walt

plus:

Pauli
Ralf
Chuck

or something like that? Anyone else care to sign / edit? Mark W for example? Sorry, I haven't been following the numpy commits very carefully of late.

Best,

Matthew
Re: [Numpy-discussion] Numpy steering group?
Hi,

On Thu, May 26, 2011 at 9:32 PM, Pearu Peterson pearu.peter...@gmail.com wrote:

Hi,

Would it be possible to set up a signing system where anyone who would like to support Clint could sign, and to advertise the system on relevant mailing lists? This would provide a larger body of supporters for this letter and would perhaps have a greater impact on those to whom the letter will be addressed. Personally, I would be happy to sign such a letter.

On the letter: the letter should also mention the scipy community, as they benefit most from the ATLAS speed.

Maybe it would be best phrased then as 'numpy and scipy developers' instead of the steering group?

I'm not sure how this kind of thing works for tenure letters. I would guess that if there are a very large number of signatures it might be difficult to see who is being represented... I'm open to suggestions. I can also ask Clint.

I've added you as an editor - would you consider adding your name at the end, and maybe something about scipy? You know the scipy blas / lapack stuff much better than I do.

Cheers,

Matthew
Re: [Numpy-discussion] Numpy steering group?
Oh sorry and:

On Thu, May 26, 2011 at 2:03 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Wed, May 4, 2011 at 9:24 AM, Robert Kern robert.k...@gmail.com wrote:
On Wed, May 4, 2011 at 11:14, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Tue, May 3, 2011 at 7:58 PM, Robert Kern robert.k...@gmail.com wrote:

I can't speak for the rest of the group, but as for myself, if you would like to draft such a letter, I'm sure I will agree with its contents.

Thank you - sadly I am not confident in deserving your confidence, but I will do my best to say something sensible. Any objections to a public google doc?

Even better!

I've put up a draft here:

numpy-whaley-support - https://docs.google.com/document/d/1gPhUUjWqNpRatw90kCqL1WPWvn1yicf2VAowWSyHlno/edit?hl=en_US&authkey=CPv49_cK

I didn't know who to put as signatories. Maybe an extended steering group like (from http://scipy.org/Developer_Zone):

Jarrod Millman
Eric Jones
Robert Kern
Travis Oliphant
Stefan van der Walt

plus:

Pauli
Ralf
Chuck

David C...

Sorry - I was up very late last night.

Cheers,

Matthew
Re: [Numpy-discussion] Numpy steering group?
Hi,

On Tue, May 3, 2011 at 7:58 PM, Robert Kern robert.k...@gmail.com wrote:
On Tue, May 3, 2011 at 12:07, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Sat, Apr 30, 2011 at 5:21 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
On Wed, Apr 27, 2011 at 8:52 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

This is just to follow up on a dead thread of mine a little while back.

I was asking about letters for Clint Whaley's tenure case, from numpy, but I realized that I don't know who 'numpy' is :)

Is there in fact a numpy steering group? Who is best to write letters representing the 'numpy community'?

At http://scipy.org/Developer_Zone there's a list of people under a big header "steering committee". It seems to me that writing such a letter representing the community is one of the purposes that committee could serve.

Ah - yes - thanks for the reply.

In the interests of general transparency - and given that no-one from that group has replied to this email - how should the group best be addressed? By personal email? That seems to break the open-source mantra of everything on-list:

http://producingoss.com/en/setting-tone.html#avoid-private-discussions

Having project-relevant *discussions* on-list doesn't preclude getting someone's *attention* off-list.

Yes, that's true. My worry was that, having put the question on the list, and not had an answer, it might send a bad signal if it was obvious that I had only got a reply because I'd asked for one off-list.

I can't speak for the rest of the group, but as for myself, if you would like to draft such a letter, I'm sure I will agree with its contents.

Thank you - sadly I am not confident in deserving your confidence, but I will do my best to say something sensible. Any objections to a public google doc?

See you,

Matthew
[Numpy-discussion] numpy easy_install fails for python 3.2
Hi,

I can imagine that this is low-priority, but I have just been enjoying pytox for automated virtualenv testing:

http://codespeak.net/tox/index.html

which revealed that numpy download-build-install via easy_install (distribute) fails with the appended traceback ending in "ValueError: 'build/py3k/numpy' is not a directory".

easy_install for pythons 2.5 - 2.7 works fine.

Best,

Matthew

RefactoringTool: /tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/compat/py3k.py
Running from numpy source directory.
Converting to Python3 via 2to3...
Traceback (most recent call last):
  File "../bin/easy_install", line 9, in <module>
    load_entry_point('distribute==0.6.14', 'console_scripts', 'easy_install')()
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 1855, in main
    with_ei_usage(lambda:
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 1836, in with_ei_usage
    return f()
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 1859, in <lambda>
    distclass=DistributionWithoutHelpCommands, **kw
  File "/usr/lib/python3.2/distutils/core.py", line 149, in setup
    dist.run_commands()
  File "/usr/lib/python3.2/distutils/dist.py", line 919, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.2/distutils/dist.py", line 938, in run_command
    cmd_obj.run()
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 342, in run
    self.easy_install(spec, not self.no_deps)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 582, in easy_install
    return self.install_item(spec, dist.location, tmpdir, deps)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 612, in install_item
    dists = self.install_eggs(spec, download, tmpdir)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 802, in install_eggs
    return self.build_and_install(setup_script, setup_base)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 1079, in build_and_install
    self.run_setup(setup_script, setup_base, args)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/command/easy_install.py", line 1068, in run_setup
    run_setup(setup_script, args)
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/sandbox.py", line 30, in run_setup
    lambda: exec(compile(open(
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/sandbox.py", line 71, in run
    return func()
  File "/home/mb312/dev_trees/nibabel/.tox/py32/lib/python3.2/site-packages/distribute-0.6.14-py3.2.egg/setuptools/sandbox.py", line 33, in <lambda>
    {'__file__':setup_script, '__name__':'__main__'})
  File "setup.py", line 211, in <module>
  File "setup.py", line 204, in setup_package
  File "/tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/distutils/core.py", line 152, in setup
  File "setup.py", line 151, in configuration
  File "/tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/distutils/misc_util.py", line 972, in add_subpackage
  File "/tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/distutils/misc_util.py", line 941, in get_subpackage
  File "/tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py
  File "numpy/setup.py", line 5, in configuration
  File "/tmp/easy_install-xr2px3/numpy-1.5.1/build/py3k/numpy/distutils/misc_util.py", line 713, in __init__
ValueError: 'build/py3k/numpy' is not a directory
Re: [Numpy-discussion] numpy easy_install fails for python 3.2
Hi,

On Wed, May 4, 2011 at 1:23 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
On Wed, May 4, 2011 at 6:53 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

I can imagine that this is low-priority, but I have just been enjoying pytox for automated virtualenv testing:

http://codespeak.net/tox/index.html

which revealed that numpy download-build-install via easy_install (distribute) fails with the appended traceback ending in "ValueError: 'build/py3k/numpy' is not a directory".

I think it would be good to just say wontfix immediately, rather than just leaving a ticket open and not do anything (like we did with http://projects.scipy.org/numpy/ticket/860).

Ouch - yes - I see what you mean.

It seems tox can also use pip (which works with py3k now), does that work for you?

I think current tox 0.9 uses virtualenv5 for python3.2 and has to use distribute, I believe.

Current tip of pytox appears to use virtualenv 1.6.1 for python 3.2, and does use pip, but generates the same error in the end.

I've appended the result of a fresh python3.2 virtualenv and a pip install numpy.

Sorry - I know these are not fun problems,

See you,

Matthew

RefactoringTool: /home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/core/defchararray.py
Running from numpy source directory.
Traceback (most recent call last):
  File "<string>", line 14, in <module>
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/setup.py", line 211, in <module>
    setup_package()
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/setup.py", line 204, in setup_package
    configuration=configuration )
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/distutils/core.py", line 152, in setup
    config = configuration()
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/setup.py", line 151, in configuration
    config.add_subpackage('numpy')
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/distutils/misc_util.py", line 972, in add_subpackage
    caller_level = 2)
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/distutils/misc_util.py", line 941, in get_subpackage
    caller_level = caller_level + 1)
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "numpy/setup.py", line 5, in configuration
    config = Configuration('numpy',parent_package,top_path)
  File "/home/mb312/.virtualenvs/bare-32/build/numpy/build/py3k/numpy/distutils/misc_util.py", line 713, in __init__
    raise ValueError("%r is not a directory" % (package_path,))
ValueError: 'build/py3k/numpy' is not a directory
Re: [Numpy-discussion] Numpy steering group?
Hi,

On Sat, Apr 30, 2011 at 5:21 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
On Wed, Apr 27, 2011 at 8:52 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

This is just to follow up on a dead thread of mine a little while back.

I was asking about letters for Clint Whaley's tenure case, from numpy, but I realized that I don't know who 'numpy' is :)

Is there in fact a numpy steering group? Who is best to write letters representing the 'numpy community'?

At http://scipy.org/Developer_Zone there's a list of people under a big header "steering committee". It seems to me that writing such a letter representing the community is one of the purposes that committee could serve.

Ah - yes - thanks for the reply.

In the interests of general transparency - and given that no-one from that group has replied to this email - how should the group best be addressed? By personal email? That seems to break the open-source mantra of everything on-list:

http://producingoss.com/en/setting-tone.html#avoid-private-discussions

Thanks again,

Matthew
[Numpy-discussion] Numpy steering group?
Hi,

This is just to follow up on a dead thread of mine a little while back.

I was asking about letters for Clint Whaley's tenure case, from numpy, but I realized that I don't know who 'numpy' is :)

Is there in fact a numpy steering group? Who is best to write letters representing the 'numpy community'?

Thanks a lot,

Matthew
[Numpy-discussion] ATLAS - support letter
Hi,

I'm on the ATLAS mailing list, maybe some of y'all are too.

Clint Whaley, the author of ATLAS, was asking for letters to support his tenure case. That is, letters saying that lots of us benefit greatly from his work - which is obviously true.

Can we the numpy community produce such a letter? Who would it best come from?

Cheers,

Matthew
Re: [Numpy-discussion] ATLAS - support letter
Hi,

On Wed, Apr 20, 2011 at 10:45 PM, Gael Varoquaux gael.varoqu...@normalesup.org wrote:
On Thu, Apr 21, 2011 at 07:25:18AM +0530, pratik wrote:

If the place where he is seeking tenure does not know his name (i.e. hasn't heard of ATLAS) then it is not a good place to seek tenure in :) .

Scholars undervalue code and don't realise the difficulty and the amount of work it takes to produce. More than once I have had colleagues tell me that they valued a paper more than a piece of software. It is important to show that such software gives a large benefit to the scientific community.

Matthew, I didn't feel that I could do much to answer your call, but if you feel differently, please let me know.

Well - thanks for the offer - Clint was asking for individual letters too, you could email and ask him? Are you on the math-atlas list? If not I'll forward you his request...

See you,

Matthew
Re: [Numpy-discussion] ATLAS - support letter
Hi,

On Wed, Apr 20, 2011 at 6:55 PM, pratik pratik.mal...@gmail.com wrote:
On Wednesday 20 April 2011 10:57 PM, Matthew Brett wrote:

Hi,

I'm on the ATLAS mailing list, maybe some of y'all are too.

Clint Whaley, the author of ATLAS, was asking for letters to support his tenure case. That is, letters saying that lots of us benefit greatly from his work - which is obviously true.

Can we the numpy community produce such a letter? Who would it best come from?

Cheers,

Matthew

If the place where he is seeking tenure does not know his name (i.e. hasn't heard of ATLAS) then it is not a good place to seek tenure in :) .

It seems to me that we are so used to depending on ATLAS we forget just how much we rely on Clint to support and improve it. This seems like one of those rare times when it's fairly easy to give something back...

Best,

Matthew
Re: [Numpy-discussion] ANN: Numpy 1.6.0 beta 2
Hi,

On Tue, Apr 5, 2011 at 10:56 AM, Charles R Harris charlesr.har...@gmail.com wrote:
On Tue, Apr 5, 2011 at 11:45 AM, josef.p...@gmail.com wrote:
On Tue, Apr 5, 2011 at 1:20 PM, Charles R Harris charlesr.har...@gmail.com wrote:
On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker chris.bar...@noaa.gov wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:

IIUC, Ub is undefined -- U means universal newlines, which makes no sense when used with b for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.

I'd expect that genfromtxt, being txt, and line oriented, should use 'rU'. But if it wants the raw line endings (why would it?) then rb should be fine.

U has been kept around for backwards compatibility; the python documentation recommends that it not be used for new code.

That is for 3.* -- the 2.7.* docs say:

In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.

Python enforces that the mode, after stripping 'U', begins with 'r', 'w' or 'a'.

which does, in fact, indicate that 'Ub' is NOT allowed. We should be using 'Ur', I think. Maybe the "python enforces" is what we saw the error from -- it didn't used to enforce anything.

'rbU' works and I put that in as a quick fix.

On 4/5/11 7:12 AM, Charles R Harris wrote:

The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in python, as it works just fine on python 2.7.

Ub never made any sense anywhere -- U means universal newline text file. b means binary -- combining them makes no sense. On older pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is supposed to raise an error.

Does 'Ur' work with \r line endings on Python 3?

Yes.

According to my read of the docs, 'U' does nothing -- universal newline support is supposed to be the default:

On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller.

It may indeed be desirable to read the files as text, but that would require more work on both loadtxt and genfromtxt.

Why can't we just open the file with mode 'Ur'? Text is text, messing with line endings shouldn't hurt anything, and it might help.

Well, text in the files then gets the numpy 'U' type instead of 'S', and there are places where byte streams are assumed for stripping and such. Which is to say that changing to text mode requires some work. Another possibility is to use a generator:

    def usetext(fname):
        f = open(fname, 'rt')
        for l in f:
            yield asbytes(l)

I think genfromtxt could use a refactoring and cleanup, but probably not for 1.6.

I think it should also be possible to read rb and strip any \r, \r\n in _iotools.py, that's where the bytes are used, from my reading and the initial error message.

Doesn't work for \r, you get the whole file at once instead of line by line.

Thanks for trying to sort out this ugliness. I've added another pull request:

https://github.com/numpy/numpy/pull/71

- tests for \n, \r\n and \r files, raising skiptest for currently failing 3.2 \r mode.

Matthew
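The generator idea quoted above can be sketched in a self-contained form as follows. This is only an illustration, not the code that went into numpy: `iter_lines_as_bytes` and its `encoding` default are hypothetical, and plain `str.encode` stands in for numpy's `asbytes` helper. Opening in text mode with `newline=None` (the Python 3 default) requests universal-newline handling, so all three line-ending conventions arrive as '\n'.

```python
import os
import tempfile


def iter_lines_as_bytes(fname, encoding="latin-1"):
    """Yield each line of a text file as bytes, with newlines normalised.

    Hypothetical helper: the file is opened in text mode so that Python
    splits on CR, CRLF and LF alike, and each line is then re-encoded
    for byte-oriented parsing code.
    """
    with open(fname, "rt", encoding=encoding, newline=None) as f:
        for line in f:
            yield line.encode(encoding)


# Write a small file that uses old Macintosh CR-only line endings.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"1,2\r3,4\r5,6\r")

lines = list(iter_lines_as_bytes(path))
os.remove(path)
print(lines)
```

The CR-only file comes back as three separate lines rather than as one block, which is the case that reading in 'rb' mode and stripping afterwards cannot handle.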
Re: [Numpy-discussion] ANN: Numpy 1.6.0 beta 2
Hi,

On Tue, Apr 5, 2011 at 9:46 AM, Christopher Barker chris.bar...@noaa.gov wrote:
On 4/4/11 10:35 PM, Charles R Harris wrote:

IIUC, Ub is undefined -- U means universal newlines, which makes no sense when used with b for binary. I looked at the code a ways back, and I can't remember the resolution order, but there isn't any checking for incompatible flags.

I disagree that U makes no sense for binary file reading.

In python 3:

'b' means, return byte objects
't' means return decoded strings

'U' means two things:

1) When iterating by line, split lines at any of '\r', '\r\n', '\n'
2) When returning lines split this way, convert '\r' and '\r\n' to '\n'

If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.

Best,

Matthew
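As a quick check of how Python 3 splits lines in the two modes, here is a minimal sketch using in-memory streams. It reflects current Python 3 behaviour, which may differ in detail from the 3.2 release under discussion: a binary stream splits only on b'\n' and translates nothing, while a text wrapper with `newline=None` applies both halves of the 'U' behaviour listed above.

```python
import io

raw = b"a\rb\r\nc\n"

# Binary mode: lines are split on b'\n' only, and nothing is translated.
binary_lines = list(io.BytesIO(raw))

# Text mode with newline=None: universal newlines, translated to '\n'.
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
text_lines = list(text_stream)

print(binary_lines)
print(text_lines)
```

So a binary file does have a notion of a line, but a lone '\r' never ends one; getting 'U'-style splitting without decoding is exactly the combination that is not on offer.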
Re: [Numpy-discussion] ANN: Numpy 1.6.0 beta 2
Hi,

On Tue, Apr 5, 2011 at 4:12 PM, Christopher Barker chris.bar...@noaa.gov wrote:
On 4/5/11 3:36 PM, josef.p...@gmail.com wrote:

I disagree that U makes no sense for binary file reading.

I wasn't saying that it made no sense to have a U mode for binary file reading, what I meant is that by the python2 definition, it made no sense. In Python 2, the ONLY difference between binary and text mode is line-feed translation.

I think it's right to say that the difference between a text and a binary file in python 2 is - none for unix, and '\r\n' -> '\n' translation in windows.

The difference between 'rt' and 'U' is (this is for my own benefit):

For 'rt', a '\r' does not cause a line break - with 'U' - it does.
For 'rt' _not_ on Windows, '\r\n' stays the same - it is stripped to '\n' with 'U'.

As for Python 3:

In python 3:

'b' means, return byte objects
't' means return decoded strings

'U' means two things:

1) When iterating by line, split lines at any of '\r', '\r\n', '\n'
2) When returning lines split this way, convert '\r' and '\r\n' to '\n'

a) 'U' is default -- it's essentially the same as 't' (in PY3), so 't' means return decoded and line-feed translated unicode objects

Right - my argument is that the behavior implied by 'U' and 't' is conceptually separable. 'U' is for how to do line-breaks, and line-termination translations, 't' is for whether to decode the text or not. In python 3.

b) I think the line-feed conversion is done regardless of if you are iterating by lines, i.e. with a full-on .read(). At least that's how it works in py2 -- not running py3 here to test.

Yes, that looks right.

If you support returning lines from a binary file (which python 3 does), then I think 'U' is a sensible thing to allow - as in this case.

but what is a binary file?

In python 3 a binary file is a file which is not decoded, and returns bytes. It still has a concept of a 'line', as defined by line terminators - you can iterate over one, or do .readlines().

In python 2, as you say, a binary file is essentially the same as a text file, with the single exception of the windows \r\n -> \n translation.

I THINK what you are proposing is that we'd want to be able to have both linefeed translation and no decoding done. But I think that's impossible -- aren't the linefeeds themselves encoded differently with different encodings?

Right - so obviously if you open a utf-16 file as binary, terrible things may happen - this was what Pauli was pointing out before. His point was that utf-8 is the standard, and that we probably would not hit many other encodings. I agree with you if you are saying that it would be good to be able to deal with them if we can - presumably by allowing 'rt' file objects, producing python 3 strings.

U looks appropriate in this case, better than the workarounds. However, to me the python 3.2 docs seem to say that U only works for text mode

Agreed -- but I don't see the problem -- your files are either encoded in something that might treat newlines differently (UCS32, maybe?), in which case you'd want it decoded, or you are working with ascii or ansi or utf-8, in which case you can specify the encoding anyway.

I don't understand why we'd want a binary blob for text parsing -- the parsing code is going to have to know something about the encoding to work -- it might as well get passed in to the file open call, and work with unicode.

I suppose if we still want to assume ascii for parsing, then we could use 't' and then re-encode to ascii to work with it. Which I agree does seem heavy handed just for fixing newlines.

Also, one problem I've often had with encodings is what happens if I think I have ascii, but really have a couple characters above 127 -- then the default is to get an error in decoding. I'd like to be able to pass in a flag that either skips the un-decodable characters or replaces them with something, but it doesn't look like you can do that with the file open function in py3.

The line terminator is always b'\n' for binary files;

Once you really make the distinction between text and binary, the concept of a line terminator doesn't really make sense anyway.

Well - I was arguing that, given we can iterate over lines in binary files, then there must be the concept of what a line is, in a binary file, and that means that we need the concept of a line terminator.

I realize this is a discussion that would have to happen on the python-dev list...

See you,

Matthew
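On the wish for a flag that skips or replaces undecodable characters: Python 3's built-in `open` does accept an `errors` argument for text-mode files. That is a note on the standard library rather than something established in the thread, and the file contents below are made up for the sketch.

```python
import os
import tempfile

# Mostly-ASCII data with one byte above 127 (0xE9, e-acute in latin-1).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"caf\xe9,1.5\n")

# Strict decoding (the default) raises on the stray byte.
try:
    with open(path, "rt", encoding="ascii") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD; errors="ignore" drops the byte.
with open(path, "rt", encoding="ascii", errors="replace") as f:
    replaced = f.read()
with open(path, "rt", encoding="ascii", errors="ignore") as f:
    ignored = f.read()

os.remove(path)
print(strict_failed, repr(replaced), repr(ignored))
```

Either handler lets an "ascii, mostly" file through text mode without an exception, at the cost of silently altering the offending characters.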
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi,

On Wed, Mar 30, 2011 at 10:02 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
On Wed, Mar 30, 2011 at 3:39 AM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Mon, Mar 28, 2011 at 11:29 PM, josef.p...@gmail.com wrote:

numpy/lib/test_io.py only uses StringIO in the test, no actual csv file

If I give the filename then I get a TypeError: Can't convert 'bytes' object to str implicitly

from the statsmodels mailing list example

data = recfromtxt(open('./star98.csv', "U"), delimiter=",", skip_header=1, dtype=float)
Traceback (most recent call last):
  File "<pyshell#30>", line 1, in <module>
    data = recfromtxt(open('./star98.csv', "U"), delimiter=",", skip_header=1, dtype=float)
  File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1633, in recfromtxt
    output = genfromtxt(fname, **kwargs)
  File "C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py", line 1181, in genfromtxt
    first_values = split_line(first_line)
  File "C:\Programs\Python32\lib\site-packages\numpy\lib\_iotools.py", line 206, in _delimited_splitter
    line = line.split(self.comments)[0].strip(asbytes(" \r\n"))
TypeError: Can't convert 'bytes' object to str implicitly

Is the right fix for this to open a 'filename' passed to genfromtxt, as 'binary' (bytes)?

If so I will submit a pull request with a fix and a test,

Seems to work and is what was intended I think, see Pauli's changes/notes in commit 0f2e7db0. This is ticket #1607 by the way.

Thanks for making a ticket. I've submitted a pull request for the fix and linked to it from the ticket.

The reason I asked whether this was the correct fix was: imagine I'm working with a non-latin default encoding, and I've opened a file:

fobj = open('my_nonlatin.txt', 'rt')

in python 3.2. That might contain numbers and non-latin text. I can't pass that into 'genfromtxt' because it will give me the error above. I can pass it in as binary but then I'll get garbled text.

Should those functions also allow unicode-providing files (perhaps with binary as default for speed)?

See you,

Matthew
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi,

On Wed, Mar 30, 2011 at 11:32 AM, Pauli Virtanen p...@iki.fi wrote:
On Wed, 30 Mar 2011 10:37:45 -0700, Matthew Brett wrote:

[clip]

imagine I'm working with a non-latin default encoding, and I've opened a file:

fobj = open('my_nonlatin.txt', 'rt')

in python 3.2. That might contain numbers and non-latin text. I can't pass that into 'genfromtxt' because it will give me the error above. I can pass it in as binary but then I'll get garbled text.

That's the way it also works on Python 2. The text is not garbled -- it's just in some binary representation that you can later on decode to unicode:

np.array(['asd']).view(np.chararray).decode('utf-8')
array([u'asd'], dtype='U3')

Granted, utf-16 and the ilk might be problematic.

Should those functions also allow unicode-providing files (perhaps with binary as default for speed)?

Nobody has yet asked for this feature as far as I know, so I guess the need for it is pretty low.

Personally, I don't think going unicode makes much sense here. First, it would be a Py3-only feature. Second, there is a real need for it only when dealing with multibyte encodings, which are seldom used these days with utf-8 rightfully dominating.

It's not a feature I need, but then, I'm afraid all the languages I've been taught are latin-1. Oh, except I learnt a tiny bit of Greek. But I don't use it for work :)

I suppose the annoyances would be:

1) Probably temporary surprise that genfromtxt(open('my_file.txt', 'rt')) generates this error
2) Having to go back over returned arrays decoding stuff for utf-8
3) Wrong results for other encodings

Maybe the best way is a graceful warning on entry to the routine?

Best,

Matthew
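Annoyance (2) above, decoding a returned array after the fact, might look like the following on a current numpy. This is a sketch under assumptions: the `raw` array is built by hand to stand in for bytes fields coming back from a bytes-based read, and `np.char.decode` is used in place of the `chararray` view quoted earlier.

```python
import numpy as np

# Byte strings as they might come back from a bytes-based text reader;
# the first entry is a short Greek string encoded as utf-8.
raw = np.array(["\u03b1\u03b2\u03b3".encode("utf-8"), b"asd"])

# Decode every element; the result is a unicode ('U') array.
decoded = np.char.decode(raw, "utf-8")

print(raw.dtype, decoded.dtype)
print(decoded)
```

This works for utf-8 because the delimiters and line endings are single ASCII bytes; for an encoding such as utf-16 the bytes would already have been split in the wrong places before any decoding step could run, which is annoyance (3).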
Re: [Numpy-discussion] bug in genfromtxt for python 3.2
Hi, On Mon, Mar 28, 2011 at 11:29 PM, josef.p...@gmail.com wrote: numpy/lib/test_io.py only uses StringIO in the test, no actual csv file If I give the filename than I get a TypeError: Can't convert 'bytes' object to str implicitly from the statsmodels mailing list example data = recfromtxt(open('./star98.csv', U), delimiter=,, skip_header=1, dtype=float) Traceback (most recent call last): File pyshell#30, line 1, in module data = recfromtxt(open('./star98.csv', U), delimiter=,, skip_header=1, dtype=float) File C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py, line 1633, in recfromtxt output = genfromtxt(fname, **kwargs) File C:\Programs\Python32\lib\site-packages\numpy\lib\npyio.py, line 1181, in genfromtxt first_values = split_line(first_line) File C:\Programs\Python32\lib\site-packages\numpy\lib\_iotools.py, line 206, in _delimited_splitter line = line.split(self.comments)[0].strip(asbytes( \r\n)) TypeError: Can't convert 'bytes' object to str implicitly Is the right fix for this to open a 'filename' passed to genfromtxt, as 'binary' (bytes)? If so I will submit a pull request with a fix and a test, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
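As a rough illustration of the binary route proposed above, here is a sketch that feeds genfromtxt a bytes stream. The in-memory data stands in for a file opened with mode 'rb'; the column names and values are invented for the example and are not from star98.csv.

```python
import io
import numpy as np

# Stand-in for a CSV file opened in binary mode, e.g. open(path, 'rb').
csv_bytes = io.BytesIO(b"x,y\n1.0,2.0\n3.0,4.0\n")

# Skip the header row and read the rest as floats.
data = np.genfromtxt(csv_bytes, delimiter=",", skip_header=1, dtype=float)

print(data.shape)
print(data)
```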
[Numpy-discussion] hashing dtypes, new variation, old theme
Hi,

Running the test suite for one of our libraries, there seems to have been a recent breakage of the behavior of dtype hashing. This script:

    import numpy as np
    data0 = np.arange(10)
    data1 = data0 - 10
    dt0 = data0.dtype
    dt1 = data1.dtype
    assert dt0 == dt1 # always passes
    assert hash(dt0) == hash(dt1) # fails on latest

fails on the current latest-ish - aada93306 - and passes on a stock 1.5.0. Is this expected?

See you,
Matthew
Re: [Numpy-discussion] hashing dtypes, new variation, old theme
Hi, On Wed, Mar 16, 2011 at 9:21 AM, Robert Kern robert.k...@gmail.com wrote: On Wed, Mar 16, 2011 at 10:27, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 16, 2011 at 8:56 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 16, 2011 at 8:46 AM, Robert Kern robert.k...@gmail.com wrote: On Wed, Mar 16, 2011 at 01:18, Matthew Brett matthew.br...@gmail.com wrote: Hi, Running the test suite for one of our libraries, there seems to have been a recent breakage of the behavior of dtype hashing. This script: import numpy as np data0 = np.arange(10) data1 = data0 - 10 dt0 = data0.dtype dt1 = data1.dtype assert dt0 == dt1 # always passes assert hash(dt0) == hash(dt1) # fails on latest fails on the current latest-ish - aada93306 and passes on a stock 1.5.0. Is this expected? According to git log hashdescr.c, nothing has changed in the implementation of the hash function since Oct 31, before numpy 1.5.1 which also passes the second test. I'm not sure what would be causing the difference in HEAD. The 1.5.1 branch was based on 1.5.x, not master. David's change isn't in 1.5.x, so apparently it wasn't backported. Hmm. It works just before and just after that change, so the problem is somewhere else. I can git-bisect it later in the day, will do so unless it's become clear in the meantime. Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] loadmat output (was Re: Accessing elements of an object array)
Hi,

On Wed, Mar 16, 2011 at 1:56 PM, lists_r...@lavabit.com wrote:

On Wed, Mar 16, 2011 at 15:18, lists_r...@lavabit.com wrote:

    In [10]: x
    Out[10]:
    array(array((7.399500875785845e-10, 7.721153414752673e-10, -0.984375),
          dtype=[('cl', '|O8'), ('tl', '|O8'), ('dagc', '|O8')]), dtype=object)

    In [11]: x.shape, x.size
    Out[11]: ((), 1)

It's not that it's an object array. It's that it is a ()-shape array. You index it with an empty tuple: x[()]

Why does loadmat return such arrays? Is there a way to make it produce arrays that are not object arrays?

Did you find the struct_as_record option to loadmat? http://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html

Best,
Matthew
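The empty-tuple indexing mentioned above can be tried on a small stand-in. This is only a sketch: the dict below is an invented substitute for the struct that loadmat returned, and nothing here calls loadmat itself.

```python
import numpy as np

# A 0-dimensional object array, similar in shape to the one shown above.
x = np.empty((), dtype=object)
x[()] = {"cl": 7.4e-10, "dagc": -0.984375}

print(x.shape, x.size)

# An empty tuple is the index that selects the single element of a
# ()-shape array.
inner = x[()]
print(inner["dagc"])
```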
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi, On Sun, Mar 13, 2011 at 12:07 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sun, Mar 13, 2011 at 11:51 AM, Christoph Gohlke cgoh...@uci.edu wrote: On 3/13/2011 11:29 AM, Matthew Brett wrote: Hi On Sun, Mar 13, 2011 at 9:54 AM, Christoph Gohlkecgoh...@uci.edu wrote: On 3/13/2011 1:57 AM, Matthew Brett wrote: Hi, I have this on my OSX 10.6 system and numpy 1.5.1 and current numpy head (30ee1d352): $ python3.2 Python 3.2 (r32:88452, Feb 20 2011, 11:12:31) [GCC 4.2.1 (Apple Inc. build 5664)] on darwin Type help, copyright, credits or license for more information. import numpy as np a = np.zeros((1,), dtype=[('f1', 'f')]) a['f1'] = 1 a['f2'] = 1 Segmentation fault All tests pass with np.test() Expected behavior with same code on python2.6: a['f2'] = 1 Traceback (most recent call last): File stdin, line 1, inmodule ValueError: field named f2 not found. Cheers, Matthew Confirmed on Windows. The crash is in line 816 of mapping.c: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(index)); This works with Python 3.x: PyErr_Format(PyExc_ValueError, field named %S not found., index); Sure enough, in python3.2: a[b'f2'] = 1 Traceback (most recent call last): File stdin, line 1, inmodule ValueError: field named f2 not found. That is - it is specifically passing a python 3 string that causes the segmentation fault. Is this a release blocker? I'm afraid I don't know the code well enough to be confident of the right fix. Is there something I can do to help? Please open a ticket at http://projects.scipy.org/numpy and refer to this discussion and proposed fix. 
diff --git a/numpy/core/src/multiarray/mapping.c b/numpy/core/src/multiarray/mapping.c index 8db85bf..3a72811 100644 --- a/numpy/core/src/multiarray/mapping.c +++ b/numpy/core/src/multiarray/mapping.c @@ -812,10 +812,16 @@ array_ass_sub(PyArrayObject *self, PyObject *index, PyObject *op) } } } - +#if defined(NPY_PY3K) + PyErr_Format(PyExc_ValueError, + field named %S not found., + index); +#else PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(index)); +#endif + return -1; } http://projects.scipy.org/numpy/ticket/1770 Sorry to ask, and I ask partly because I'm in the middle of a py3k port, but is this the right fix to this problem? I was confused by the presence of the old PyString_AsString function. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi,

On Tue, Mar 15, 2011 at 10:12 AM, Pauli Virtanen p...@iki.fi wrote:

Tue, 15 Mar 2011 10:06:09 -0700, Matthew Brett wrote:

Sorry to ask, and I ask partly because I'm in the middle of a py3k port, but is this the right fix to this problem? I was confused by the presence of the old PyString_AsString function.

It's not a correct fix. The original code seems also wrong (index can either be Unicode or Bytes/String object), and will probably bomb when indexing with Unicode strings on Python 2. The right thing to do is to make it show the repr of the index object.

OK - I realize I'm being very lazy here but, do you mean:

    PyErr_Format(PyExc_ValueError,
                 "field named %s not found.",
                 PyString_AsString(PyObject_Repr(index)));

The PyString_AsString is present, as it's mapped on Py3 to PyBytes_AsString by npy_3kcompat.h.

Oh - dear - I think I felt a blood vessel pop somewhere in my brain :)

See you,
Matthew
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi,

On Tue, Mar 15, 2011 at 11:07 AM, Pauli Virtanen p...@iki.fi wrote:

Tue, 15 Mar 2011 10:23:35 -0700, Matthew Brett wrote: [clip]

OK - I realize I'm being very lazy here but, do you mean:

    PyErr_Format(PyExc_ValueError,
                 "field named %s not found.",
                 PyString_AsString(PyObject_Repr(index)));

The PyString_AsString is present, as it's mapped on Py3 to PyBytes_AsString by npy_3kcompat.h.

Oh - dear - I think I felt a blood vessel pop somewhere in my brain :)

This was an answer to your question as I understood it: PyString_AsString is no longer a part of the API on Python 3.x. So how come this code can work on Python 3 if it appears here?

Oh - dear - again. Yes it was a helpful answer and direct to my question - implication otherwise entirely accidental. The blood vessel was only because my brain was already twisted from trying to unlearn the correspondence between string and byte in python 3...

I fear that my suggestion as to what you meant by repr doesn't make sense though.

See you,
Matthew
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi, On Tue, Mar 15, 2011 at 10:23 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Tue, Mar 15, 2011 at 10:12 AM, Pauli Virtanen p...@iki.fi wrote: Tue, 15 Mar 2011 10:06:09 -0700, Matthew Brett wrote: Sorry to ask, and I ask partly because I'm in the middle of a py3k port, but is this the right fix to this problem? I was confused by the presence of the old PyString_AsString function. It's not a correct fix. The original code seems also wrong (index can either be Unicode or Bytes/String object), and will probably bomb when indexing with Unicode strings on Python 2. The right thing to do is to make it show the repr of the index object. OK - I realize I'm being very lazy here but, do you mean: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(PyObject_Repr(index))); Being less lazy, and having read the cpython source, and read Christoph's mail more carefully, I believe Christoph's patch is correct... Unicode indexing of structured array fields doesn't raise an error on python 2.x; I assume because PyString_AsString is returning a char* using the Unicode default encoding, as per the docs. Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] structured array indexing oddity
Hi,

I just wrote a short test for indexing into structured arrays with strings and found this:

    In [4]: a = np.zeros((1,), dtype=[('f1', 'i4')])

    In [5]: a['f1']
    Out[5]: array([0])

    In [6]: a['f2'] # not present - error
    ---
    ValueError Traceback (most recent call last)
    /Users/mb312/<ipython console> in <module>()
    ValueError: field named f2 not found.

    In [7]: a[0]['f1'] # OK
    Out[7]: 0

    In [8]: a[0]['f2']
    ---
    IndexError Traceback (most recent call last)
    /Users/mb312/<ipython console> in <module>()
    IndexError: invalid index

It seems odd to raise an 'IndexError' without a message about the field name in the second case, and a ValueError with a good message in the first. Do y'all agree?

Matthew
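Until the two code paths agree, a caller can get one consistent error by checking the field name up front. The helper below is hypothetical (it is not part of numpy) and is only a sketch of that idea; it relies on `dtype.names`, which both the array and a single record expose.

```python
import numpy as np

def get_field(obj, name):
    """Return obj[name], or raise a KeyError that names the missing field.

    Hypothetical helper for illustration; works for a structured array
    and for a single record taken from it.
    """
    names = obj.dtype.names or ()
    if name not in names:
        raise KeyError("field named %s not found; available: %s"
                       % (name, list(names)))
    return obj[name]

a = np.zeros((1,), dtype=[('f1', 'i4')])
print(get_field(a, 'f1'))      # whole column
print(get_field(a[0], 'f1'))   # single record
```

Both `get_field(a, 'f2')` and `get_field(a[0], 'f2')` then fail the same way, whatever the underlying numpy version raises.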
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi, On Tue, Mar 15, 2011 at 5:30 PM, Christoph Gohlke cgoh...@uci.edu wrote: On 3/15/2011 5:13 PM, Matthew Brett wrote: Hi, On Tue, Mar 15, 2011 at 10:23 AM, Matthew Brettmatthew.br...@gmail.com wrote: Hi, On Tue, Mar 15, 2011 at 10:12 AM, Pauli Virtanenp...@iki.fi wrote: Tue, 15 Mar 2011 10:06:09 -0700, Matthew Brett wrote: Sorry to ask, and I ask partly because I'm in the middle of a py3k port, but is this the right fix to this problem? I was confused by the presence of the old PyString_AsString function. It's not a correct fix. The original code seems also wrong (index can either be Unicode or Bytes/String object), and will probably bomb when indexing with Unicode strings on Python 2. The right thing to do is to make it show the repr of the index object. OK - I realize I'm being very lazy here but, do you mean: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(PyObject_Repr(index))); Being less lazy, and having read the cpython source, and read Christoph's mail more carefully, I believe Christoph's patch is correct... Unicode indexing of structured array fields doesn't raise an error on python 2.x; I assume because PyString_AsString is returning a char* using the Unicode default encoding, as per the docs. I think the patch is correct for Python 3 but, as Pauli pointed out, the original code can crash also under Python 2.x when indexing with an unicode string that contains non-ascii7 characters, which seems much less likely and apparently has been undetected for quite a while. For example, this crashes for me on Python 2.7: import numpy as np a = np.zeros((1,), dtype=[('f1', 'f')]) a[u's'] = 1 # works a[u'µ'] = 1 # crash So, the proposed patch works for Python 3, but there could be a better patch fixing also the corner case on Python 2. Thanks. 
How about something like the check further up the file:

    if (PyUnicode_Check(op)) {
        temp = PyUnicode_AsUnicodeEscapeString(op);
    }
    PyErr_Format(PyExc_ValueError,
                 "field named %s not found.",
                 PyBytes_AsString(temp));

? I'm happy to submit a pull request with tests if that kind of thing looks sensible to y'all,

Matthew
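For readers less familiar with the C API, the idea behind the escape step can be sketched at the Python level. This is an analogue only, under the assumption that the unicode_escape codec mirrors what PyUnicode_AsUnicodeEscapeString produces; it is not the numpy code path.

```python
field = "\u00b5"   # the micro sign, a non-ASCII field name

# Encoding to ASCII fails for a non-ASCII name.
try:
    field.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# unicode_escape always yields plain ASCII bytes, safe to format with %s.
escaped = field.encode("unicode_escape")

# repr() is another way to get a printable form of any index object,
# including a bytes index.
shown = repr(b"f2")

print(ascii_ok, escaped, shown)
```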
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi, On Tue, Mar 15, 2011 at 5:55 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Tue, Mar 15, 2011 at 5:30 PM, Christoph Gohlke cgoh...@uci.edu wrote: On 3/15/2011 5:13 PM, Matthew Brett wrote: Hi, On Tue, Mar 15, 2011 at 10:23 AM, Matthew Brettmatthew.br...@gmail.com wrote: Hi, On Tue, Mar 15, 2011 at 10:12 AM, Pauli Virtanenp...@iki.fi wrote: Tue, 15 Mar 2011 10:06:09 -0700, Matthew Brett wrote: Sorry to ask, and I ask partly because I'm in the middle of a py3k port, but is this the right fix to this problem? I was confused by the presence of the old PyString_AsString function. It's not a correct fix. The original code seems also wrong (index can either be Unicode or Bytes/String object), and will probably bomb when indexing with Unicode strings on Python 2. The right thing to do is to make it show the repr of the index object. OK - I realize I'm being very lazy here but, do you mean: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(PyObject_Repr(index))); Being less lazy, and having read the cpython source, and read Christoph's mail more carefully, I believe Christoph's patch is correct... Unicode indexing of structured array fields doesn't raise an error on python 2.x; I assume because PyString_AsString is returning a char* using the Unicode default encoding, as per the docs. I think the patch is correct for Python 3 but, as Pauli pointed out, the original code can crash also under Python 2.x when indexing with an unicode string that contains non-ascii7 characters, which seems much less likely and apparently has been undetected for quite a while. For example, this crashes for me on Python 2.7: import numpy as np a = np.zeros((1,), dtype=[('f1', 'f')]) a[u's'] = 1 # works a[u'µ'] = 1 # crash So, the proposed patch works for Python 3, but there could be a better patch fixing also the corner case on Python 2. Thanks. 
How about something like the check further up the file:

    if (PyUnicode_Check(op)) {
        temp = PyUnicode_AsUnicodeEscapeString(op);
    }
    PyErr_Format(PyExc_ValueError,
                 "field named %s not found.",
                 PyBytes_AsString(temp));

? I'm happy to submit a pull request with tests if that kind of thing looks sensible to y'all,

That fix works for 2.6 but crashes 3.2 when it is rejecting (by design I think) field indexing with byte strings.

I've put a test function in this branch which exercises the routes I could think of. It will crash current 2.x because of the unicode error that Pauli pointed out, but I think it corresponds to the expected behavior:

https://github.com/matthew-brett/numpy/compare/numpy:master...matthew-brett:struct-arr-fields

Do these tests look correct? I'd be happy to hear why the code above does not work, it wasn't obvious to me.

See you,
Matthew
[Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi,

I have this on my OSX 10.6 system and numpy 1.5.1 and current numpy head (30ee1d352):

    $ python3.2
    Python 3.2 (r32:88452, Feb 20 2011, 11:12:31)
    [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> a = np.zeros((1,), dtype=[('f1', 'f')])
    >>> a['f1'] = 1
    >>> a['f2'] = 1
    Segmentation fault

All tests pass with np.test()

Expected behavior with same code on python2.6:

    >>> a['f2'] = 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: field named f2 not found.

Cheers,
Matthew
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi On Sun, Mar 13, 2011 at 9:54 AM, Christoph Gohlke cgoh...@uci.edu wrote: On 3/13/2011 1:57 AM, Matthew Brett wrote: Hi, I have this on my OSX 10.6 system and numpy 1.5.1 and current numpy head (30ee1d352): $ python3.2 Python 3.2 (r32:88452, Feb 20 2011, 11:12:31) [GCC 4.2.1 (Apple Inc. build 5664)] on darwin Type help, copyright, credits or license for more information. import numpy as np a = np.zeros((1,), dtype=[('f1', 'f')]) a['f1'] = 1 a['f2'] = 1 Segmentation fault All tests pass with np.test() Expected behavior with same code on python2.6: a['f2'] = 1 Traceback (most recent call last): File stdin, line 1, inmodule ValueError: field named f2 not found. Cheers, Matthew Confirmed on Windows. The crash is in line 816 of mapping.c: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(index)); This works with Python 3.x: PyErr_Format(PyExc_ValueError, field named %S not found., index); Sure enough, in python3.2: a[b'f2'] = 1 Traceback (most recent call last): File stdin, line 1, in module ValueError: field named f2 not found. That is - it is specifically passing a python 3 string that causes the segmentation fault. Is this a release blocker? I'm afraid I don't know the code well enough to be confident of the right fix. Is there something I can do to help? Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Segfault with python 3.2 structured array non-existent field
Hi, On Sun, Mar 13, 2011 at 11:51 AM, Christoph Gohlke cgoh...@uci.edu wrote: On 3/13/2011 11:29 AM, Matthew Brett wrote: Hi On Sun, Mar 13, 2011 at 9:54 AM, Christoph Gohlkecgoh...@uci.edu wrote: On 3/13/2011 1:57 AM, Matthew Brett wrote: Hi, I have this on my OSX 10.6 system and numpy 1.5.1 and current numpy head (30ee1d352): $ python3.2 Python 3.2 (r32:88452, Feb 20 2011, 11:12:31) [GCC 4.2.1 (Apple Inc. build 5664)] on darwin Type help, copyright, credits or license for more information. import numpy as np a = np.zeros((1,), dtype=[('f1', 'f')]) a['f1'] = 1 a['f2'] = 1 Segmentation fault All tests pass with np.test() Expected behavior with same code on python2.6: a['f2'] = 1 Traceback (most recent call last): File stdin, line 1, inmodule ValueError: field named f2 not found. Cheers, Matthew Confirmed on Windows. The crash is in line 816 of mapping.c: PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(index)); This works with Python 3.x: PyErr_Format(PyExc_ValueError, field named %S not found., index); Sure enough, in python3.2: a[b'f2'] = 1 Traceback (most recent call last): File stdin, line 1, inmodule ValueError: field named f2 not found. That is - it is specifically passing a python 3 string that causes the segmentation fault. Is this a release blocker? I'm afraid I don't know the code well enough to be confident of the right fix. Is there something I can do to help? Please open a ticket at http://projects.scipy.org/numpy and refer to this discussion and proposed fix. 
diff --git a/numpy/core/src/multiarray/mapping.c b/numpy/core/src/multiarray/mapping.c index 8db85bf..3a72811 100644 --- a/numpy/core/src/multiarray/mapping.c +++ b/numpy/core/src/multiarray/mapping.c @@ -812,10 +812,16 @@ array_ass_sub(PyArrayObject *self, PyObject *index, PyObject *op) } } } - +#if defined(NPY_PY3K) + PyErr_Format(PyExc_ValueError, + field named %S not found., + index); +#else PyErr_Format(PyExc_ValueError, field named %s not found., PyString_AsString(index)); +#endif + return -1; } http://projects.scipy.org/numpy/ticket/1770 Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()
Hi, On Wed, Dec 29, 2010 at 5:37 PM, Robert Bradshaw rober...@math.washington.edu wrote: On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman kwgood...@gmail.com wrote: On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw rober...@math.washington.edu wrote: On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier jsalv...@u.washington.edu wrote: Wouldn't that be a cast? You do casts in Cython with double(expression) and that should be the equivalent of float64 I think. Or even numpy.float64_t (expression) if you've cimported numpy (though as mentioned this is the same as double on every platform I know of). Even easier is just to use the expression in a the right context and it will convert it for you. That will give me a float object but it will not have dtype, shape, ndim, etc methods. m = np.mean([1,2,3]) m 2.0 m.dtype dtype('float64') m.ndim 0 using np.float64_t gives: AttributeError: 'float' object has no attribute 'dtype' Forgive me if I haven't understood your question, but can you use PyArray_DescrFromType with e.g NPY_FLOAT64 ? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()
Forgive me if I haven't understood your question, but can you use PyArray_DescrFromType with e.g. NPY_FLOAT64?

I'm pretty hopeless here. I don't know how to put all that together in a function.

That might be because I'm not understanding you very well, but I was thinking that:

    cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64)

would give you the float64 dtype that I thought you wanted? I'm shooting from the hip here, in between nieces competing for the computer and my attention.

See you,
Matthew
Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()
Hi,

That might be because I'm not understanding you very well, but I was thinking that:

    cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64)

would give you the float64 dtype that I thought you wanted? I'm shooting from the hip here, in between nieces competing for the computer and my attention.

I think I need a function. One that does this:

    >>> n = 10.0
    >>> hasattr(n, 'ndim')
    False
    >>> m = np.float64(n)
    >>> hasattr(m, 'ndim')
    True

Now the nieces have gone, I see that I did completely misunderstand. I think you want the C-API calls to be able to create a 0-dim ndarray object from a python float.

There was a thread on C-API array creation on the cython list a little while ago:

http://www.mail-archive.com/cython-dev@codespeak.net/msg07703.html

Code in scipy here:

https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/mio5_utils.pyx

See around line 36 there, and 432, and the header file I copied from Dag Sverre:

https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/numpy_rephrasing.h

As you can see, it's a little horrible, in that you have to take care to get the references right to the dtype and to the data. I actually did not investigate in detail whether this lower-level array creation was speeding my code up much.

I hope that's more useful...

Matthew
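To pin down the target behaviour before reaching for the C API, here is a Python-level sketch of the two objects a C or Cython routine might be expected to return. It makes no C-API calls and says nothing about their speed; it only shows what "has ndim, dtype, shape" looks like for a numpy scalar and for a 0-dim ndarray.

```python
import numpy as np

n = 10.0               # plain Python float
m = np.float64(n)      # numpy scalar
z = np.asarray(n)      # 0-dimensional ndarray

print(hasattr(n, 'ndim'))         # the Python float has no ndim
print(m.ndim, m.dtype)            # the numpy scalar does
print(z.ndim, z.shape, z.dtype)   # and so does the 0-dim array
```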
Re: [Numpy-discussion] A faster median (Wirth's method)
Hi,

On Tue, Nov 30, 2010 at 11:35 AM, Keith Goodman kwgood...@gmail.com wrote:

On Tue, Nov 30, 2010 at 11:25 AM, John Salvatier jsalv...@u.washington.edu wrote:

I am very interested in this result. I have wanted to know how to do an

My first thought was to write the reducing function like this

    cdef np.float64_t namean(np.ndarray[np.float64_t, ndim=1] a):

but cython doesn't allow np.ndarray in a cdef.

Sorry for the ill-considered hasty reply, but do you mean that this:

    import numpy as np
    cimport numpy as cnp

    cdef cnp.float64_t namean(cnp.ndarray[cnp.float64_t, ndim=1] a):
        return np.nanmean(a) # just a placeholder

is not allowed? It works for me. Is it a cython version thing? (I've got 0.13),

See you,
Matthew
Re: [Numpy-discussion] ANN: NumPy 1.5.1
Hi,

On Mon, Nov 22, 2010 at 11:35 AM, Christopher Barker chris.bar...@noaa.gov wrote:

On 11/20/10 11:04 PM, Ralf Gommers wrote:

I am pleased to announce the availability of NumPy 1.5.1. Binaries, sources and release notes can be found at https://sourceforge.net/projects/numpy/files/. Thank you to everyone who contributed to this release.

Yes, thanks so much -- in particular thanks to the team that built the OS-X binaries -- looks like a complete set!

Many thanks from me too - particularly for clearing up that annoying numpy-distutils scipy build problem.

Cheers,
Matthew
Re: [Numpy-discussion] Where did the github numpy repository go?
On Sun, Nov 14, 2010 at 12:48 PM, Robin Kraft rkra...@gmail.com wrote: Git is having some kind of major outage: http://status.github.com/ The site and git access is unavailable due to a database failure. We're researching the issue. A good excuse for a long lazy Sunday... Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Developmental version numbering with git
Hi,

On Tue, Nov 9, 2010 at 7:48 AM, Charles R Harris charlesr.har...@gmail.com wrote:

On Tue, Nov 9, 2010 at 8:20 AM, Scott Sinclair scott.sinclair...@gmail.com wrote:

On 8 November 2010 23:17, Matthew Brett matthew.br...@gmail.com wrote:

Since the change to git the numpy version in setup.py is '2.0.0.dev' regardless because the prior numbering was determined by svn. Is there a plan to add some numbering system to numpy developmental version? Regardless of the answer, the 'numpy/numpy/version.py' will need to be changed because of the reference to the svn naming.

In case it's useful, we (nipy) went for a scheme where the version number stays as '2.0.0.dev', but we keep a record of what git commit hash we are on - described here: http://web.archiveorange.com/archive/v/AW2a1CzoOZtfBfNav9hd I can post more details of the implementation if it's of any interest,

In the meantime there's a patch in that direction here: https://github.com/numpy/numpy/pull/12

Tiny patch for py3k attached. Should the generated numpy/version.py be in .gitignore? Is there a better name in order to signal the generated nature of the file?

Best,
Matthew

Attachment: 0001-BF-py3k-fix-for-git-version-string.patch
Re: [Numpy-discussion] Developmental version numbering with git
Hi,

On Thu, Nov 11, 2010 at 11:44 AM, Charles R Harris charlesr.har...@gmail.com wrote:

On Thu, Nov 11, 2010 at 12:32 PM, Matthew Brett matthew.br...@gmail.com wrote:

Tiny patch for py3k attached. Should the generated numpy/version.py be in .gitignore? Is there a better name in order to signal the generated nature of the file?

I thought it already was, but if not, yes, I think it should be added. I suppose we could add a 'generated' suffix to the name to mark it as such, but really it seems the file should go into the build directory somewhere, although that might make it difficult to access if needed in other parts of the build. Having the generated file in the main tree was something that bothered me when I committed the patch, but not enough to try to fix it.

I never really understood what numpy/__config__.py was for, or how it came about, but I had the impression that was the place that build-time stuff got written - is that correct? Would it be a sensible place to write version information?

See you,
Matthew
Re: [Numpy-discussion] LapackError:non-native byte order
Hi,

On Mon, Nov 8, 2010 at 10:34 AM, Pauli Virtanen p...@iki.fi wrote:

Mon, 08 Nov 2010 19:31:31 +0100, Pauli Virtanen wrote:

Mon, 2010-11-08 at 18:56 +0100, LittleBigBrain wrote:

In my system '<' is the native byte-order, but unless I change the byte-order label to '=', it won't work in linalg sub-module, but in others works OK. I am not sure whether this is an expected behavior or a bug?

    >>> import sys
    >>> sys.byteorder
    'little'
    >>> a.dtype.byteorder
    '<'
    >>> b.dtype.byteorder
    '<'

The error is here: it's not possible to create such dtypes via any Numpy methods -- the '<' (or '>') is always normalized to '='. Numpy and several other modules consequently assume this normalization. Where do `a` and `b` come from?

Ok, `x.newbyteorder('<')` seems to do this. Now I'm unsure how things are supposed to work.

Yes - it is puzzling that ``x.newbyteorder('<')`` makes arrays that are confusing to numpy. If numpy generally normalizes the system endianness to '=', then should that not also be true of ``newbyteorder``?

See you,
Matthew
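The normalization Pauli describes can be checked directly through the dtype constructor. This is a small sketch, assuming only that `sys.byteorder` reports the machine's order; it deliberately does not assert anything about what `newbyteorder` returns, since that is the behaviour under discussion.

```python
import sys
import numpy as np

# Explicit byte-order character matching this machine, and its opposite.
native = '<' if sys.byteorder == 'little' else '>'
swapped = '>' if native == '<' else '<'

dt_native = np.dtype(native + 'f4')
dt_swapped = np.dtype(swapped + 'f4')

print(dt_native.byteorder)    # native order is reported as '='
print(dt_swapped.byteorder)   # the non-native character is kept
print(dt_native.isnative, dt_swapped.isnative)
```

Code that needs to know whether a dtype is native can use `isnative` rather than comparing the `byteorder` character.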
Re: [Numpy-discussion] Developmental version numbering with git
Hi,

Since the change to git the numpy version in setup.py is '2.0.0.dev' regardless because the prior numbering was determined by svn. Is there a plan to add some numbering system to numpy developmental version? Regardless of the answer, the 'numpy/numpy/version.py' will need to be changed because of the reference to the svn naming.

In case it's useful, we (nipy) went for a scheme where the version number stays as '2.0.0.dev', but we keep a record of what git commit hash we are on - described here:

http://web.archiveorange.com/archive/v/AW2a1CzoOZtfBfNav9hd

I can post more details of the implementation if it's of any interest,

Best,
Matthew
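A scheme of that general shape might look like the sketch below. It is not the nipy implementation: the function names and the output format are invented for illustration, and the only external assumption is that a `git` executable may or may not be on the path.

```python
import subprocess

def git_revision(cwd=None):
    """Return the short hash of HEAD, or 'Unknown' if it cannot be found.

    Hypothetical helper: falls back when git is missing or when cwd is
    not inside a git working tree.
    """
    try:
        out = subprocess.check_output(
            ['git', 'rev-parse', '--short', 'HEAD'],
            cwd=cwd, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return 'Unknown'
    return out.decode('ascii').strip() or 'Unknown'

def describe_version(base_version, revision):
    """Keep the base version fixed and report the commit alongside it."""
    return '%s (git %s)' % (base_version, revision)

print(describe_version('2.0.0.dev', git_revision()))
```

Keeping the hash out of the version string itself means the version stays '2.0.0.dev' for comparisons, while the commit is still available for bug reports.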
[Numpy-discussion] dtype comparison and hashing - bug?
Hi,

I have just run into this oddness:

    In [28]: dt1 = np.dtype('f4')

    In [29]: dt1.str
    Out[29]: '<f4'

    In [30]: dt2 = dt1.newbyteorder('<')

    In [31]: dt2.str
    Out[31]: '<f4'

    In [32]: dt1 == dt2
    Out[32]: True

    In [33]: hash(dt1) == hash(dt2)
    Out[33]: False

This is the same as: http://www.mail-archive.com/numpy-discussion@scipy.org/msg13299.html

My question was - does the team still agree this is a bug? Can anyone offer a pointer as to how it should be fixed?

Best,
Matthew
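The practical cost of equal dtypes hashing differently shows up in dict and set lookups. The sketch below repeats the session in script form and adds a lookup; it picks the native byte-order character from `sys.byteorder` so that it runs on either kind of machine. On a build with the problem described above it would be expected to print True, False, False; on a build where the hash agrees with equality, True, True, True.

```python
import sys
import numpy as np

native = '<' if sys.byteorder == 'little' else '>'

dt1 = np.dtype('f4')
dt2 = dt1.newbyteorder(native)   # same layout, spelled explicitly

same = dt1 == dt2
same_hash = hash(dt1) == hash(dt2)

# Dict and set lookups rely on the hash agreeing with equality.
table = {dt1: 'float32'}
found = dt2 in table

print(same, same_hash, found)
```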
Re: [Numpy-discussion] dtype comparison and hashing - bug?
Hi, It already has a ticket :) http://projects.scipy.org/numpy/ticket/1637 Oops - sorry - thanks for pointing that out. Cheers, Matthew
Re: [Numpy-discussion] Commit rights on github
Hi, Now might be a good time to discuss how we'd like the history to look in a year from now. If we follow the above approach, I guess we may end up with one merge message for each small little bug-fix? (Unless --rebase is used) How do we ensure that fast-forward merges occur whenever possible? The only solution that I know of is to have a pull-like workflow, but I thought this was rejected as too complicated? Am I the only person to find it strange that we have an active and skilled development community, many of whom have been using git routinely for a long time, and yet none of us seems to know what the agreed workflow is? I risk yet another deafening silence in asking - is there anyone who thinks that the current state of play is good? If the answer is no - then isn't there anyone who will step up to the plate and suggest something? I will certainly do it if given any sign that that is welcome. See y'all, Matthew
Re: [Numpy-discussion] Development workflow
Hi, I think there are two issues here: (A) How to be sensible and presentable (B) How and when your stuff gets into master A very useful distinction - thanks for making it. For (A) I'm following the same workflow I had with the git mirror: 1. For *every* change, create a separate topic branch. 2. Work on it until the feature/bugfix is ready. 3. Push it to my own github clone for review/backup purposes if necessary. 4. If necessary, rebase (not merge!) on master when developing to keep in stride. 5. When ready, (i) rebase on master, (ii) check that the result is sensible, and (iii) push from the topic branch as new master. In this case, since all recent changes are just unrelated stand-alone bugfixes, this produces something that looks very much like SVN log :) I think of the above, 1-4 are okay in all cases. 5 is then perhaps not so absolute, as one could also do a merge if there are several commits. I 100% endorse Fernando's recommendations: http://mail.scipy.org/pipermail/ipython-dev/2010-October/006746.html This really sounds like best-practice to me, and it's even empirically tested! OK - so it seems to me that you agree with Fernando's recommendations, and that's basically the same as what Stefan was proposing (give or take a rebase), and David agreed with Stefan. So - really - everyone agrees on the following - work on topic branches - don't merge from trunk - rebase on trunk if necessary. I think _insisting_ on rebase on trunk before merge with trunk is a little extreme (see follow-up to ipython thread) - but it's not a big deal. Then there's the second question (B) on when core devs should push changes. When ready, when reviewed, or only before release? I would be open even for the radical never-push-your-own-changes solution. I think we could even try it this way for the 1.5.1 release. If it seems that unhandled pull requests start to accumulate (which I don't think will happen), we could just reverse the policy. 
OK - right - that's the second big issue and obviously that's at the heart of the thing. I think that splits into two in fact: i) How often to merge into trunk ii) Who should merge into trunk At the extreme, you have the SVN model where the answers are: i) a merge for almost every commit ii) by the person who wrote the code and I thought that we'd decided that we didn't want that because trunk started getting unpredictable and painful to maintain. At the other end is the more standard DVCS-type workflow: i) merges by branches (which might have only a few commits) ii) by a small team of people who are responsible for overseeing trunk, and rarely by the person who wrote the code. So - is that a reasonable summary? Does anyone disagree with Pauli's never-push-your-own-changes suggestion? Best, Matthew
Re: [Numpy-discussion] Development workflow
Hi, All my sincere apologies for the mess I caused... The changes I wanted to commit were quite minimal (just a few lines in a test), but I obviously included some stuff I didn't want to... Ah - no - so sorry that the discussion got attached to your original post and problem. I would like you to know that I think git is wonderful - and that I have made mistakes with git which are far far worse than the minor glitch you hit. What does not kill us makes us stronger, as they say. I think the only possible lesson that might be drawn is that it probably would have helped you, as it has certainly helped me, to have someone scan the set of changes and comment - as part of the workflow. Then we all learn from each other, and build up a common understanding of the code and how it's changing. See you, Matthew
Re: [Numpy-discussion] Development workflow
Hi, Does anyone disagree with Pauli's never-push-your-own-changes suggestion? I think it is a little too extreme for trivial changes like one-liners and the like, but I think it is a good default rule (that is, if you are not sure, don't push your own changes). Well - but a) if it's a one-liner - I bet you can get someone to sign it off within an hour. b) I've done one-liners that caused a mess. Maybe it's just me ;) c) The only thing you lose from waiting for a sign-off is a few hours - the code is still there if someone needs to use it. That was the problem with SVN - if it wasn't in trunk, it was rotting - but we don't have the problem now. See you, Matthew
[Numpy-discussion] Development workflow
Hi guys, Am I right in thinking that for the moment at least, the git workflow is basically the same as the svn workflow (everyone committing to trunk)? I realize that this is not going to cheer anyone up, but is this the best workflow now? Who would decide? Best, Matthew
Re: [Numpy-discussion] Development workflow
Hi, In my opinion, I've seen a lot of people coming from SVN try to apply SVN-style workflow to git (and presumably other dvcs's), but git and the like (and Github!) allow for much more fine-tuned workflows in my opinion, and I think it's a mistake to ignore that. I'm just some guy, though, so I'm not sure my opinion has much weight. Ah - yes - I know exactly what you mean ;) Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Development workflow
Hi, I find having the branch displayed on the command line helpful in avoiding mishaps, so I have the following in my .bashrc export PS1='\[\033[1;31m\]\$\[\033[0m\...@\h \W$(__git_ps1 (%s))\\$ ' The \W$(__git_ps1 (%s)) bit is the important part. Yes, that one's a lifesaver. It's part of the truly excellent git-completion bash utilities too: http://blog.strug.de/2010/08/how-to-add-git-completion-to-your-terminal/ See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Development workflow
Hi, In my opinion, I've seen a lot of people coming from SVN try to apply SVN-style workflow to git (and presumably other dvcs's), but git and the like (and Github!) allow for much more fine-tuned workflows in my opinion, and I think it's a mistake to ignore that. I'm just some guy, though, so I'm not sure my opinion has much weight. Ah - yes - I know exactly what you mean ;) Sorry - just re-reading that - it's a bit cryptic. I mean, yes, I agree wholeheartedly that it's a shame to stick to svn workflows when using git. I'm just some guy as well of course. I suppose we just-some-guys should start a just-some-guys interest group or something ;) Best, Matthew
Re: [Numpy-discussion] Assigning complex value to real array
Hi, 'Talk is cheap, show me the code' . Yes, let's. First we want to initialize memory mapped arrays which will be used for the variables. This python script does that: Ah - no - sorry - I was suggesting you write an implementation of the object you want, in numpy. I am sure that was not your intention, but so far it is possible to read your emails as suggesting that someone else does that work for you, and I don't think there's any chance that's going to happen. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Assigning complex value to real array
Hi, On Thu, Oct 7, 2010 at 3:47 PM, Andrew P. Mullhaupt d...@zen-pharaohs.com wrote: On 10/7/2010 3:48 PM, Anne Archibald wrote: Years ago MATLAB did just this - store real and complex parts of arrays separately (maybe it still does, I haven't used it in a long time). It caused us terrible performance headaches, Most machines now and in the future are not going to choke on these issues (for a variety of reasons). 'Talk is cheap, show me the code' . http://en.wikiquote.org/wiki/Linus_Torvalds Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] real and imag functions should produce errors for object arrays
Hi, On Tue, Sep 21, 2010 at 2:44 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Sep 21, 2010 at 4:31 PM, Michael Gilbert michael.s.gilb...@gmail.com wrote: Hi, The following example demonstrates a rather unexpected result: import numpy x = numpy.array( complex( 1.0 , 1.0 ) , numpy.object ) print x.real (1+1j) print x.imag 0 Shouldn't real and imag return an error in such a situation? It looks like there was a decision to let 'real' and 'imag' pass quietly for non-numerical types: In [2]: a = np.array('hello', dtype=object) In [3]: a.real Out[3]: array('hello', dtype=object) In [4]: a.imag Out[4]: array(0, dtype=object) and In [6]: a = np.array('hello', dtype='S5') In [7]: a.real Out[7]: array('hello', dtype='|S5') In [8]: a.imag Out[8]: array('', dtype='|S5') I can see that that could be confusing. I suppose the alternative would be to raise an error for real and imag for non-numerical types at least. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
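The alternative suggested at the end of this message (raising an error for non-numerical types) can be sketched as a small wrapper. This is only an illustration of the proposal, not NumPy behavior; `strict_real` is a hypothetical name, and the choice of `TypeError` is an assumption.

```python
import numpy as np


def strict_real(arr):
    """Return arr.real, but refuse dtypes that are not numeric.

    Object and string arrays are rejected instead of passing quietly.
    """
    arr = np.asarray(arr)
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError(f"real part is undefined for dtype {arr.dtype}")
    return arr.real


print(strict_real(np.array([1 + 2j, 3 - 4j])))

try:
    strict_real(np.array("hello", dtype=object))
except TypeError as err:
    print("rejected:", err)
```

Checking the dtype rather than the contents keeps the fast path intact, at the cost of rejecting object arrays even when every element happens to be a number.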
Re: [Numpy-discussion] real and imag functions should produce errors for object arrays
Hi, On Tue, Sep 21, 2010 at 3:28 PM, Robert Kern robert.k...@gmail.com wrote: On Tue, Sep 21, 2010 at 17:17, Pauli Virtanen p...@iki.fi wrote: Tue, 21 Sep 2010 21:50:08 +, Pauli Virtanen wrote: Tue, 21 Sep 2010 17:31:55 -0400, Michael Gilbert wrote: The following example demonstrates a rather unexpected result: import numpy x = numpy.array( complex( 1.0 , 1.0 ) , numpy.object ) print x.real (1+1j) print x.imag 0 Shouldn't real and imag return an error in such a situation? It probably shouldn't do *that* at the least. *that* == return a complex number from .real What is the alternative? I'm personally happy with saying that many of the operations we define on numpy arrays can be done because we know the types and that object arrays subvert this. numpy can't, without excessive amounts of magic, always know a sensible thing to do with object arrays, so we implement the fast thing to do. I agree that special-casing object array .real to detect complex contents seems a bit messy, but it does make sense I think to raise an error from .real and .imag for non-numerical types. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] real and imag functions should produce errors for object arrays
Hi, I see that I have interpreted this thread as Doctor, it hurts when I do this... Well, don't do that! Sorry for the noise. It's all good - a reply is almost always more friendly and helpful than no reply ;) See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] str == int puzzlement
Hi, Yeah, it's just that numpy knows that it cannot compare pears with apples: a = numpy.asarray(['a', 'b']) a.__eq__(1) NotImplemented Thank you - that's very helpful and clear. Maybe it would be better to raise a ValueError, which is not caught by the evaluation mechanism, to prevent such stuff. Sorry that this is not yet clear to me, but, is it true then that: The only situation where array.__eq__ sensibly falls back to python __eq__ is for the individual elements of object arrays? Thanks again, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
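The fallback being discussed is plain Python behavior and can be shown without numpy. In the sketch below, `Pear` and `StrictPear` are invented classes: the first returns `NotImplemented` for foreign types, the second raises instead, as the quoted suggestion proposes.

```python
class Pear:
    """Declines comparison with non-Pear objects by returning NotImplemented."""

    def __eq__(self, other):
        if not isinstance(other, Pear):
            return NotImplemented
        return True


class StrictPear:
    """Raises instead, so Python's fallback never gets a chance to run."""

    def __eq__(self, other):
        if not isinstance(other, StrictPear):
            raise ValueError("cannot compare pears with apples")
        return True


# Calling the method directly exposes the sentinel.
print(Pear().__eq__(1) is NotImplemented)

# With ==, Python also tries the reflected int.__eq__; when both sides
# decline, it falls back to an identity check, which gives a plain bool.
print(Pear() == 1)

try:
    StrictPear() == 1
except ValueError as err:
    print("raised:", err)
```

This is why a single `False` can come back from `==` even though the left operand never really answered the question.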
Re: [Numpy-discussion] str == int puzzlement
Hi, On Wed, Jul 28, 2010 at 6:49 PM, John Salvatier jsalv...@u.washington.edu wrote: I think this is just Python behavior; comparing python ints and strs also gives False: In [45]: 8 == 'L' Out[45]: False Just to be clear, from: a = np.array(['a','b']) a == 1 I was expecting: array([ False, False], dtype=bool) For: In [22]: a = np.array(['a','b']) In [23]: a + 'c' etc - it makes sense to me that I can't add to numpy strings. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] str == int puzzlement
Hi, Please forgive me if this is obvious, but this surprised me: In [15]: x = np.array(['a', 'b']) In [16]: x == 'a' # this was what I expected Out[16]: array([ True, False], dtype=bool) In [17]: x == 1 # this was strange to me Out[17]: False Is it easy to explain why this is? Thanks a lot, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpydoc broken with sphinx 1.0
On Mon, Jul 26, 2010 at 1:56 PM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Mon, Jul 26, 2010 at 01:52:06PM -0700, Matthew Brett wrote: http://old.nabble.com/numpydoc-broken-by-latest-sphinx-td28896476.html http://projects.scipy.org/numpy/ticket/1489 Darn, I should have googled :( Better an extra email than no report at all - all good ;) Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpydoc broken with sphinx 1.0
Hi, I just wanted to mention that numpydoc is broken with the latest sphinx release. I had a quick look, but couldn't find how to solve the problem, so I am just pointing it out here: For reference in case anyone is searching... http://old.nabble.com/numpydoc-broken-by-latest-sphinx-td28896476.html http://projects.scipy.org/numpy/ticket/1489 See you, Matthew
Re: [Numpy-discussion] [ANN] Bento (ex-toydist) 0.0.3
Hi, Can you copyright a word? I thought this was the trademark part of the law. For example, linux is a trademark owned by Linus Torvalds. Also, well known packages use words which are at least as common as bento in English (sphinx, twisted, etc...), and as likely to be trademarked. I got ripely panned for doing this before, but... If you have a look at - to reduce controversy - : http://cyber.law.harvard.edu/metaschool/fisher/domain/tm.htm#7 you'll see a summary of the criteria used. I read this stuff as meaning that, if you're doing something that has a low 'likelihood of confusion' with the other guy / gal doing 'Bento', and the other 'Bento' trademark is not 'famous', you're probably, but not certainly, safe from successful prosecution. See you, Matthew
Re: [Numpy-discussion] cython and f2py
Hi, Can numpy.distutils be directed to process *.pyx with Cython rather than Pyrex? Yes, but at the moment I believe you have to monkey-patch numpy distutils : see the top of http://github.com/matthew-brett/nipy/blob/master/setup.py and generate_a_pyrex_source around line 289 of: http://github.com/matthew-brett/nipy/blob/master/build_helpers.py for how we've done it - there may be a better way - please post if you find it! Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] numpydoc broken by latest sphinx
Hi, Sorry if y'all had already seen this, but a friend just picked up the latest sphinx with easy_install -U sphinx - got version 1.0b2 - and then hit this problem with numpydoc: http://projects.scipy.org/numpy/ticket/1489 I just point it out in the hope that someone with more sphinx knowledge might be able to have a look, Thanks a lot, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Hi, Actually - isn't it better if people do give you their github username / email combo - assuming that's the easiest combo to work with later? I think the Github user name is not really needed here, as what goes into the history is the Git ID: name + email address. Yes, sorry, of course you're right, the 'name' is nothing to do with the github username. I guess we do need to send the Git ID we like to use though (in my case Matthew Brett matthew.br...@gmail.com). See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Hi, I think it should be opt-in. How would opt-out work? Would someone create new accounts for all the contributors and then give them access? Just to be clear, this has nothing to do with accounts on github, or any registered thing. This is *only* about username/email as recognized by git itself (as recorded in the commit objects). Actually - isn't it better if people do give you their github username / email combo - assuming that's the easiest combo to work with later? Mine is 'matthew-brett' with this (my usual) email address, See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Hi, Maybe you could put up a list of those people whose emails you need to scrape (obviously without the email addresses) and ask for opt-out? Or opt-in if you think that's better? So here is the list of authors: Do y'all think opt-in? Or opt-out? If it's opt-in I guess you'll catch most of the current committers, and most of the others you'll lose, but maybe that's good enough. See you, Matthew
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Hi, I don't think correcting the email addresses in the SVN history is very useful. Best probably just use some dummy form, maybe That's what svn2git already does, so that would be less work for me :) It may not matter much, but I think there is at least one argument for having real emails: to avoid having duplicate committers (i.e. pvirtanen is the same committer before and after the git transition). But this is only significant for current committers. It seems right to try and keep the commit author the same for pre and post SVN commits if possible. Maybe you could put up a list of those people whose emails you need to scrape (obviously without the email addresses) and ask for opt-out? Or opt-in if you think that's better? Thanks for looking into this. Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, How does that differ from what we do now? Review? I develop in my own branches as is. Right - so - then do you always ask for a review from someone before merging into trunk? If so, then git is just a much more fluid, reliable and faster tool to do what you are doing now. True, but what happens when there is no review? I might point out that there are currently tickets with patches for review going back two years and reviewing a patch isn't *that* much harder than visiting github. Using git makes merging changes much easier, but it doesn't solve the review problem. Well - that's true and not true. The joy of git branches and the ease of merging is that you quickly get into the habit of making feature branches for each piece of work. This makes it extremely easy for someone else to review the changes that you have made. So, it greatly lowers the work needed for someone to review your code, and therefore makes it more likely. Having said that - it will of course happen that you ask for review and no-one responds. That's not a very big problem, because git merges are so easy that you can - as Anne said earlier - just keep on developing without worrying that your changes will go out of date. But if there's a long wait - or it's urgent - then what I do is just email with 'If I don't hear anything I'll merge these changes in a few days'. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, Having said that - it will of course happen that you ask for review and no-one responds. That's not a very big problem, because git merges are so easy that you can - as Anne said earlier - just keep on developing without worrying that your changes will go out of date. But if there's a long wait - or it's urgent - then what I do is just email with 'If I don't hear anything I'll merge these changes in a few days'. Exactly. I had a private bet with myself that that would be the case. See, it isn't so much different after all. The tools change, but the problems and solutions remain much the same. Given that there are only three people doing reviews, and really only two really looking at the c code, I expect that a lot of stuff will be merged without much in the way of review. Well - I do honestly think that a decentralized git workflow is the best tool to improve that. Now if git leads to more developers that might change. Here's hoping. I hope so too. I accidentally ran across this a few days ago: http://www.erlang.org/ - This [Erlang/OTP R13B04] is the first release after the introduction of the official Git repository at Github and it is amazing to notice that the number of contributions from the community has increased significantly. As many as 32 contributors have provided 1 or more patches each until now, resulting in 51 integrated patches from the open source community in this service release. Here's hoping... See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, Maybe most importantly, distributed revision control places any possible contributor on equal footing with those with commit access; this is one important step in making contributors feel valued. I think this is a very important point, but subtle. I realize that's a dangerous combination, but I'm going to have a go at exposition. I think it is true that the distributed model _tends_ to make contributors feel more welcome, but it's not to do with permissions, it's to do with the process. The process is much more important than the permissions. If we want new contributors to feel welcome, we need a clear, explicit process, that everyone agrees to, and follows. I don't mean something enforced by permissions, but something followed, by convention, and with care, by all the developers. That provides a clear and healthy basis for people to join. In that situation, and in that situation only, new developers do not worry about whether they are clever or important or well-known enough to contribute code. That does tend to follow from the distributed model, because it is fundamentally built on the 'show me the code' model of development. Not surprisingly. I completely agree with Anne that we will work it out when we switch, and the details of process should not delay us. But, this is just a vote for some careful thought - and discussion - and agreement - on what sort of atmosphere we want to convey as a community. That atmosphere comes directly from our development model - or rather - the development model is the clearest indicator of what kind of colleagues we are. Are we careful? Are we serious? Are we thoughtful? Are we open? Are we clear? Do we value learning and teaching? Are we coding for the long-term? See you, Matthew
Re: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
Hi, I think the main problem has been windows compatibility. Git is best from the command line whereas the windows command line is an afterthought. Another box that needs a check-mark is the buildbot. If svn clients are supported then it may be that neither of those are going to be a problem. However, It needs user testing. For windows - I think honestly this is now not a serious barrier to using git. I've installed msysgit on 4 or 5 machines recently, and it has been very smooth - as well as providing a nice bash shell. I think it would be a huge reduction in the barrier to contributing to numpy if we could change to git. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
Hi, there is no such thing as a nice bash shell for a windows user. I have no idea how to use one. It is a nice bash shell. You may not want a nice bash shell ;) I can't imagine you'd object to one though. It's just a useful place to type git commands, with file / directory path autocompletion, git branch autocompletion and so on. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
Hi, Any shell on windows is a pain, if only because of the spaces in the filenames. When I'm using git, or bzr or svn I use the windows shell which I'm very familiar with ,and allows standard copy-paste and has quotes. But since, I think, there are no numpy developers on Windows, and I'm the only one for scipy, and occasional commits I can do with anything, I won't argue this time. I've been testing quite a bit on windows recently, and I used to use windows all the time. I've found msysgit to be pretty good. I personally have always hated the windows shell, but you can use msysgit from the windows shell if you prefer... If you're OK with bzr or svn from the windows command line, I am sure git will pose no major problems. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, It seems to me that git's flexibility in how people collaborate means we can do a certain amount of figuring out after the switch. This is very well said and true to our recent experience with nipy and ipython: http://github.com/ipython/ipython http://github.com/nipy/nipy My experience with a small project has been that anyone who wants to make major changes just clones the repository on github and makes the changes; then we email the main author to ask him to pull particular branches into the main repo. It works well enough. That's the model we've gone for in nipy and ipython too. We wrote it up in a workflow doc project. Here are the example docs giving the git workflow for ipython: https://cirl.berkeley.edu/mb312/gitwash/ and in particular: https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, Linux has Linus, ipython has Fernando, nipy has... well, I'm sure it is somebody. Numpy and Scipy no longer have a central figure and I like it that way. There is no reason that DVCS has to inevitably lead to a central authority. I think I was trying to say that the way it looks as if it will be - before you try it - is very different from the way it actually is when you get there. Anne put the idea very well - but I still think it is very hard to understand, without trying it, just how liberating the workflow is from anxieties about central authorities and so on. You can just get on with what you want to do, talk with or merge from whoever you want, and the whole development process becomes much more fluid and productive. And I know that sounds chaotic but - it just works. Really really well. See you, Matthew
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, No, at this point we don't have a release manager, we haven't since 1.2. We have people who do the builds and put them up on sourceforge, but they aren't release managers, they don't decide what is in the release or organise the effort. We haven't had a central figure since Travis got a real job ;) And now David has a real job too. I'm just pointing out that projects like Linux and IPython have central figures because the originators are still active in the development. Let me put it this way, right now, who would you choose to pull the changes and release the official version? OK - for nipy - we have - I think - 5 people who can commit into the main repository. Any one of those 5 people can review someone's work, and commit into the main repository. My guess is - with numpy - there would be some number of people with the same permissions - I imagine you among them. But the rule is - No-one commits into the main repo without someone reviewing and agreeing the work. Any trusted person can review. But the point is: No development in the main repo. Merges only. Why? Let's flip your question the other way round. You are saying - I want to continue (as for SVN) to develop in the main repo. But the main repo is where everyone merges from. That means that a) It makes it much harder for anyone to review your changes because they are mixed up in a lot of other changes and b) You force everyone following numpy to adopt your changes. In practice - that means that you make it harder for others by making them follow your line of development when they may not want to - until it's ready. I guess you'd agree that code review is essential to good code quality - both for improving code - and for teaching. It encourages new developers because they know their work will be checked. It helps developers learn the coding guidelines and to share good practice. It helps the developers have a broad knowledge of the code base. 
With SVN / central repo development - that's really hard - because all the development lines get mixed up as people work in different places. With git / DVCS - it suddenly becomes absolutely natural. I think that's why people like Joel Spolsky say stuff like 'This is possibly the biggest advance in software development technology in the ten years I’ve been writing articles here.' : http://www.joelonsoftware.com/items/2010/03/17.html Please - try it - see - I am absolutely sure you'll love it after a very short time... Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] curious about how people would feel about moving to github
Hi, No, I am saying we need at least five people who can commit to the main repo. That is the central repository model. Excellent - yes - that's reasonable. Then if you also agree to this: No development in the main repo. Merges only. then we're all in full agreement. Review is fine, and it would be nice if more people were reviewing code. At the moment I think it is just Pauli, Stefan, and myself. Right - and that is partly because it is so much harder to do review with the model that we have at the moment, and partly because we don't yet have the tradition in numpy of review. I think - honestly - if we're going to be able to encourage and train new developers - we'll have to get on that as soon as we can... See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding an ndarray.dot method
Hi, I kind of like this idea. Simple, obvious, and leads to clear code: a.dot(b).dot(c) or in another multiplication order, a.dot(b.dot(c)) And here's an implementation: http://github.com/pv/numpy-work/commit/414429ce0bb0c4b7e780c4078c5ff71c113050b6 I think I'm going to apply this, unless someone complains, as I don't see any downsides (except maybe adding one more to the huge list of methods ndarray already has). Excellent excellent excellent. Once again, I owe you a beverage of your choice. Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
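For anyone skimming the thread, here is a small sketch of what the chained form looks like next to nested np.dot calls. The 2x2 matrices are made up purely for illustration, and this assumes a numpy that already has the ndarray.dot method discussed above.

```python
import numpy as np

# Illustrative matrices only; any conformable shapes would do.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
c = np.array([[2.0, 0.0], [0.0, 2.0]])

# Chained method calls read left to right: (a . b) . c
left = a.dot(b).dot(c)

# The other multiplication order from the message: a . (b . c)
right = a.dot(b.dot(c))

# The equivalent spelling with the function, which nests inside out.
nested = np.dot(np.dot(a, b), c)
```

Matrix multiplication is associative, so left and right agree here; with real data the two orders can differ in cost and in floating-point rounding, which is presumably why it is handy to be able to write either one clearly.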
[Numpy-discussion] Recommended way to add Cython extension using numpy.distutils?
Hi, We (neuroimaging.scipy.org) are using numpy.distutils, and we have .pyx files that we build with Cython. I wanted to add these in our current setup.py scripts, with something like:

    def configuration(parent_package='',top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration('statistics', parent_package, top_path)
        config.add_extension('intvol', ['intvol.pyx'], include_dirs = [np.get_include()])
        return config

but of course numpy only knows about Pyrex, and returns:

    error: Pyrex required for compiling 'nipy/algorithms/statistics/intvol.pyx' but notavailable

Is there a recommended way to plumb Cython into the numpy build machinery? Should I try and patch numpy distutils to use Cython if present? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Recommended way to add Cython extension using numpy.distutils?
Hi, Thanks a lot for the suggestion - I appreciate it. Is there a recommended way to plumb Cython into the numpy build machinery? Should I try and patch numpy distutils to use Cython if present? Here is the monkey-patch I'm using in my project:

    def evil_numpy_monkey_patch():
        from numpy.distutils.command import build_src
        import Cython
        import Cython.Compiler.Main
        build_src.Pyrex = Cython
        build_src.have_pyrex = True

I think this patch does not work for current numpy trunk; I've put a minimal test case here: http://github.com/matthew-brett/du-cy-numpy If you run the setup.py there (python setup.py build) then all works fine for - say - numpy 1.1. For current trunk you get an error ending in:

    File /Users/mb312/usr/local/lib/python2.6/site-packages/numpy/distutils/command/build_src.py, line 466, in generate_a_pyrex_source
        if self.inplace or not have_pyrex():
    TypeError: 'bool' object is not callable

which is easily fixable of course ('build_src.have_pyrex = lambda : True') - leading to:

    File /Users/mb312/usr/local/lib/python2.6/site-packages/numpy/distutils/command/build_src.py, line 474, in generate_a_pyrex_source
        import Pyrex.Compiler.Main
    ImportError: No module named Pyrex.Compiler.Main

I'm afraid I did a rather crude monkey-patch to replace the 'generate_a_pyrex_source' function. It seems to work for numpy 1.1 and current trunk. The patching process is here: http://github.com/matthew-brett/du-cy-numpy/blob/master/matthew_monkey.py Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Somewhat goofy warning in 'isfinite'?
Hi, I just noticed this:

    In [2]: np.isfinite(np.inf)
    Warning: invalid value encountered in isfinite
    Out[2]: False

Maybe it would be worth not raising the warning, in the interests of tidiness? Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
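In the meantime, a possible local workaround is sketched below. It assumes the message is routed through numpy's floating-point error state, which I have not checked for the build that printed it, so treat it as a guess rather than a fix. np.errstate is the standard context manager for changing that state temporarily.

```python
import numpy as np

# Illustrative input covering the finite and non-finite cases.
values = np.array([1.0, np.inf, -np.inf, np.nan])

# Silence 'invalid value' floating-point reports only inside this block;
# the previous error state is restored on exit.
with np.errstate(invalid="ignore"):
    finite_mask = np.isfinite(values)
```

The result itself is unaffected either way: only the first element is finite. The point is just to keep the output tidy without changing the global error settings.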
Re: [Numpy-discussion] numpy 2.0, what else to do?
Hi, On Sat, Feb 13, 2010 at 11:53 AM, Xavier Gnata xavier.gn...@gmail.com wrote: IMHO 2.0 should support python3. That would be a major step and a good reason to call it 2.0. I agree with Travis, I think we should try not to attach too much importance to the big number change, release 2.0 just taking care of the ABI compatibility with the usual feature-freeze for an upcoming release, and then we can release 3.0 with any major additions in due course, as the work gets done. Basically, the '2.0' label does not mean that there's open-season for feature changes at this point - that has to wait, if the release is going to be stable. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy 2.0, what else to do?
Hi, Sounds to me like you don't fully agree w/ Travis - he said This is exactly what I was worried about with calling the next release 2.0. Seems that Travis understands that the larger community, whether we want them to or not, _does_ attach...much importance to [a] big number change and wants to avoid calling the next release 2.0 precisely because he recognizes that the changes we do think we can make in three weeks don't warrant that magnitude of a number change. But then, perhaps I shouldn't speak for Travis, sorry Travis. ;-) I think the wider community will be OK, as long as we stay calm about not getting overwhelmed with the number change, and just doing an ordinary release. I can't see us losing many users if they pick up 2.0 and don't see lots of new features, at least, that's never worried me in other people's releases. In any case, I think we're committed to the 2.0 version number at this point. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Fwd: [atlas-devel] ATLAS support letters
Hi, I don't know if y'all are subscribed to the ATLAS mailing list, but, it would be good if we could find a way of supporting Clint as strongly as we can. Best, Matthew -- Forwarded message -- From: Clint Whaley wha...@cs.utsa.edu Date: Fri, Feb 12, 2010 at 9:15 AM Subject: [atlas-devel] ATLAS support letters To: math-atlas-de...@lists.sourceforge.net Guys, I go up for tenure this year. The tenure committee has asked me to get letters of support from ATLAS users so that they can assess the service impact of my support of ATLAS (I *tell* them it is widely used, but can I show it other than downloads?). The letter would discuss a little of what you do, and how you use ATLAS, and the importance of having ATLAS in furthering your project goals. So, if you are part of an organization/business/open source project/research project that uses ATLAS, please contact me if you or a colleague is willing to help with such a letter. If you know someone at such a place that uses ATLAS, forward this on. I will be contacting some groups that I know use ATLAS, but I don't know about the majority of people/groups who do, and I often don't have records and so forget even the ones I knew used it . . . With Goto taking a position at MS, it is all the more important that I can show my colleagues that ATLAS support and development is a service to the community, and having it at UTSA helps the university and department . . . Thanks, Clint ** ** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley ** ** ___ Math-atlas-devel mailing list math-atlas-de...@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/math-atlas-devel ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, On Fri, Feb 12, 2010 at 7:02 AM, Fernando Perez fperez@gmail.com wrote: On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern robert.k...@gmail.com wrote: Here's the problem that I don't think many people appreciate: logical arguments suck just as much as personal experience in answering these questions. You can make perfectly structured arguments until you are blue in the face, but without real data to premise them on, they are no better than the gut feelings. They can often be significantly worse if the strength of the logic gets confused with the strength of the premise. I need to frame this (or make a sig to put it in, the internet equivalent of a wooden frame :). Thank you, Robert. Yes, except that, at its most extreme, it renders reasonable argument pointless, and leads to resolving disputes by authority rather than discussion. Of course we don't work or think in a realm that can be cleared of bias or error, but it would be difficult to be a scientist and fail to notice that - agreeing - things that really should be true, aren't true and - disagreeing - despite all the threatening brackets, reasoned argument, and careful return to data, do work in increasing our understanding. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, Just a comment: I would like to point out that there is (necessarily) some arbitrary threshold to who is being recognized as people who are actively writing the code. Over the last year, I have posted fixes for multiple bugs and extended the ufunc wrapping mechanisms (__array_prepare__) which were included in numpy-1.4.0, and have also been developing the quantities package, which is intimately tied up with numpy's development. I don't think that makes me a major contributor like you or Chuck etc., but I am heavily invested in numpy's development and an active contributor. Yes - I think that's a valid point - that there is a spectrum in our contributions to numpy, and it is not possible to divide us very clearly into those whose opinions count and those whose don't. There's code contribution, but there is also commitment and investment. These should also have their weight, in a healthy community. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, I don't want to go the route of marking things experimental which David's pro-1.5 vote seemed to advocate. From what I gathered, Pauli, David, and I were 1.5 with various degrees of opinion and Charles, and Robert are 2.0. Others that I know about: Stephan is 1.5, Jarrod is 2.0, Matthew and Darren seem to be for 2.0. Yes - I'm still rather strongly for 2.0, on the basis that the downsides (not as many new features as people might expect, a feeling that we might support a 1.x series) are considerably less damaging than unexpected ABI breakage. I could see my way through to supporting a NumPy 2.0 release. I would ask for the following: 1) I would like the release to come out in about 3-4 weeks 2) I would like the release to contain all the ABI changes we think we will need until NumPy 3.0 when something like David's ideas are implemented which would need to be no sooner than 1 year from now. 3) The following changes to the ABI (no promise that I might not ask for more before the release date): * change the ABI indicator * put the DATETIME dtypes back in their original place in the list * move the *cast functions to the end of the ArrFuncs structure * place 2-3 place-holders in that ArrFuncs structure * fix the hasobject data-type Any other simple ABI changes that should be made? That all seems good to me. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, NumPy decisions in the past have been made by me and other people who are writing the code. I think we have tried pretty hard to listen to all points of view before doing anything. I think there are many examples of this. I hope this previous history alleviates some concern that something else is going to be done here. I think it's notable in general how collegial numpy discussions have been, and for that, thank you to you in particular. I was going to say earlier, but didn't, that your list of steerers seemed very sensible. Only a small point, but, while I completely agree that the version number is a bike-shed, I don't think that's true of the ABI breakage, but I'm sure that's not what you meant. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 2:07 PM, Robert Kern robert.k...@gmail.com wrote: On Mon, Feb 8, 2010 at 16:05, Darren Dale dsdal...@gmail.com wrote: On Mon, Feb 8, 2010 at 5:05 PM, Jarrod Millman mill...@berkeley.edu wrote: On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris charlesr.har...@gmail.com wrote: Should the release containing the datetime/hasobject changes be called a) 1.5.0 b) 2.0.0 My vote goes to b. You don't matter. Nor do I. Jarrod is on the steering committee. You seem to be pointing out that Darren's vote doesn't count but Jarrod's does. Really, that's a view of the steering committee idea that seems to me a bit miserable. Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, On Mon, Feb 8, 2010 at 2:05 PM, Jarrod Millman mill...@berkeley.edu wrote: On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris charlesr.har...@gmail.com wrote: Should the release containing the datetime/hasobject changes be called a) 1.5.0 b) 2.0.0 My vote goes to b. I guess Travis' point is that 2.0 implies rather large feature difference from - say 1.0.0 - and this isn't the case. On the other hand, I don't see what substantial difference that makes in the long run - we can always go to 3.0 for a big rewrite and I don't think we'll lose any users as a result. On the other hand we might lose users from an ABI change not easily predicted from the version numbering. I guess what I'm saying is we have lots of integers left, and they are cheap, and I'd also vote for using one up to get round this little hurdle. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Trust me, the steering committee would much prefer not to decide anything by any means. I do trust you ;) Looking at the emails, it seems to me there's quite a strong consensus. You don't mean that the steering committee is needed when people on the steering committee don't agree with the consensus, I'm sure. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
No, there isn't. Consensus means everyone, not just a strong majority. http://producingoss.com/en/consensus-democracy.html I stand corrected. I meant then, that there's a strong majority agreement on what to do. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, That is correct. And having failed to find a consensus solution and with several of the people doing the actual work disagreeing (which is neither you, nor I, nor Darren, nor most readers on this list who have weighed in on the discussion phase and may feel miffed about not getting a final vote), we move on to a vote from the steering committee to formalize that majority. I'm continuing only because, the discussion has generated some heat, and I think part of that heat comes from the perception that the excellent community spirit of the project is somewhat undermined by the feeling that reasonable arguments are not being fully heard. More generally I completely agree that the decisions have to be made by the people doing the work, and that I'm not one of them. But, the emphasis of the work on numpy has shifted from development to maintenance, and I'm still not sure that the discussion thus far has fully reflected that fact. I'm really not disagreeing with the decisions made (and if I did, you could rightly and politely ignore me), but I think the atmosphere of how the decisions are made is also important. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, I'm continuing only because, the discussion has generated some heat, and I think part of that heat comes from the perception that the excellent community spirit of the project is somewhat undermined by the feeling that reasonable arguments are not being fully heard. How does one get that feeling? Is that a real question? More generally I completely agree that the decisions have to be made by the people doing the work, and that I'm not one of them. But, the emphasis of the work on numpy has shifted from development to maintenance, and I'm still not sure that the discussion thus far has fully reflected that fact. Unfortunately, it's getting too late to address deficiencies in the breadth and depth of the already-too-extensive discussion. You should have spoken up sooner. We need to make a decision now. I'm not asking for influence in the decision, nor am I trying to delay the decision. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, Is that a real question? Absolutely. What leads you to believe that the reasonable arguments aren't being heard? If one were to start a thread giving an idea and no one responds while vigorous discussion is happening in other threads, that would certainly be visible evidence of that idea not being fully heard. I'm somewhat at a loss to guess how you would ascertain from a thread that has now gone past a hundred messages (most of which favor the side I presume you think the unheard arguments are coming from) that some of the arguments are not being fully heard. Of course we were always discussing judgement calls, and these are always going to be subjective, but I don't think that means that we can't hope to come to a reasoned agreement. I only wrote because I felt that we were beginning to drift towards a formal committee-style judgement in a situation where it has been pretty clear what the majority view was, and that we have to be careful about that, because it can reduce our feeling of shared ownership and responsibility - a feeling that numpy has been remarkably good at maintaining. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, Majorities don't make numpy development decisions normally. Never have. Not of the mailing list membership nor of the steering committee. Implementors do. When implementors disagree strongly and do not reach a consensus, then we fall back to majorities. But as I said before, majority voting requires conscientious control over the voting membership or it isn't majority voting. The process that you identified as being remarkably good at maintaining shared ownership and responsibility isn't majority rule, but consensus among implementors. We just don't have that right now, but we need to get stuff done anyways. I think that's right, in general, but in this case, the primary disagreement was between David C+Chuck, and Travis, and there has been a large weight of the contributions to the list in favor of David's view. Now, you might say, I don't care about the weight of contributions because the people mailing don't implement, but that obviously has a social cost. All important arguments are resolved now, we've withdrawn the binary, agreed to a next ABI breaking release, and David's happy with 1.5 as a number, so I don't think we have to worry that discussion will delay getting stuff done at this point. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, Getting rid of FILE* pointers and file descriptor would also help quite a bit on windows. I know that at some point, there were some discussions to make the python C API safe to multiple C runtimes, but I cannot find any recent discussion on that fact. I should just ask on python-dev, I guess. This would be a great relief if we don't have to care about those issues anymore. Just to say that when Guido visited Berkeley a while back he was encouraging us strongly to contact the python-dev list for any help we needed to port to Py3k - so I'd imagine you'd get a good reception... See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
Hi, If that's the case, and particularly if it's going to be a while before 1.4.1 is ready, I suggest that the 1.4.0 release be pulled from current release status on the download sites. +1. If the decision is as you say, I agree with you. That seems reasonable to me too... Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Conversion of matlab import containing objects into 3d array
Hi, I'm trying to import data from a matlab file using scipy.io.loadmat. One of the variables in the file imports as an array of shape (51,) of dtype object, with each element being an array of shape (23,100) of dtype float. How do I convert this array into a single array of dtype float with shape (51,23,100)? objarr.astype(float), which I thought might work (from [1]), gives me the error ValueError: setting an array element with a sequence. I guess that your array started life as a matlab cell array of shape (51,1). As far as I know you'd have to convert long-hand: np.concatenate(list(a), axis=0).reshape((51,23,100)) sort of thing... Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
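To make the long-hand conversion concrete, here is a minimal sketch. The object array is built by hand as a stand-in for what loadmat returned (I don't have the original .mat file, so the contents are invented), and the names objarr, stacked and longhand are only for illustration. np.stack is a later addition to numpy than this thread, so it is shown as an alternative on the assumption that a newer numpy is available.

```python
import numpy as np

# Stand-in for the loadmat result: shape (51,), dtype object,
# each element a float array of shape (23, 100).
objarr = np.empty(51, dtype=object)
for i in range(51):
    objarr[i] = np.full((23, 100), float(i))

# Long-hand conversion from the message: join along axis 0, giving
# shape (51 * 23, 100), then split the first axis back out.
longhand = np.concatenate(list(objarr), axis=0).reshape((51, 23, 100))

# Alternative on newer numpy: stack the elements along a new leading axis.
stacked = np.stack(list(objarr))
```

Both routes copy the data into one contiguous float array of shape (51, 23, 100), with element i of the object array ending up at index i along the first axis. They only work if every element has the same shape; a ragged cell array would need handling case by case.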