[Numpy-discussion] Please Unsubscribe
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal
On 02-Apr-15 4:35 PM, Eric Firing wrote: On 2015/04/02 10:22 AM, josef.p...@gmail.com wrote: Swapping the axis when slices are mixed with fancy indexing was a design mistake, IMO. But not fancy indexing itself. I'm not saying there should be no fancy indexing capability; I am saying that it should be available through a function or method, rather than via the square brackets. Square brackets should do things that people expect them to do--the most common and easy-to-understand style of indexing. Eric +1
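For readers unfamiliar with the distinction being argued here, a minimal sketch may help. The array shapes and index lists are illustrative assumptions, not taken from the thread; np.ix_ is the existing NumPy helper for outer ("orthogonal") indexing.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
rows, cols = [0, 2], [1, 3]

# "Fancy" indexing pairs the index lists element by element:
# it picks a[0, 1] and a[2, 3].
fancy = a[rows, cols]

# "Orthogonal" (outer) indexing takes every row/column combination;
# np.ix_ builds index arrays that express this with square brackets.
outer = a[np.ix_(rows, cols)]

# The axis swap: an integer and an index list separated by a slice.
x = np.zeros((2, 3, 4, 5))
swapped = x[0, :, [0, 1]]     # the indexed dimension moves to the front
stepwise = x[0][:, [0, 1]]    # the same selection done in two steps

print(fancy.shape, outer.shape)        # (2,) (2, 2)
print(swapped.shape, stepwise.shape)   # (2, 3, 5) (3, 2, 5)
```

The last two shapes differ even though both expressions select the same elements, which is the behaviour described above as a design mistake.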
Re: [Numpy-discussion] Matrix Class
Thanks Ryan. There are a number of good thoughts in your message. I'll try to keep track of them. Another respondent reported different results than mine. I'm in the process of re-installing to check. Colin W. On 11 February 2015 at 16:18, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks that use this for real research problems, not just as a pedagogical tool. They have a Matrix/vector/column_matrix class that do what you were expecting from your problems posted above. Indeed below is a (truncated) cut and past from a Sage Worksheet. 
(See http://www.sagemath.org/doc/tutorial/tour_linalg.html) ## In : Matrix([1,'2',3]) Error in lines 1-1 Traceback (most recent call last): TypeError: unable to find a common ring for all elements In : Matrix([[1,2,3],[4,5]]) ValueError: List of rows is not valid (rows are wrong types or lengths) In : vector([1,2,3]) (1, 2, 3) In : column_matrix([1,2,3]) [1] [2] [3] ## Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could strip out the Matrix stuff from here to make a separate project with just the Matrix stuff, if you don't want to go through the Sage interface. On Wed, Feb 11, 2015 at 11:54 AM, cjw c...@ncf.ca wrote: On 11-Feb-15 10:21 AM, Ryan Nelson wrote: So: In [2]: np.mat([4,'5',6]) Out[2]: matrix([['4', '5', '6']], dtype='U11') In [3]: np.mat([4,'5',6], dtype=int) Out[3]: matrix([[4, 5, 6]]) Thanks Ryan, We are not singing from the same hymn book. Using PyScripter, I get: *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. *** import numpy as np print('Numpy version: ', np.__version__) ('Numpy version: ', '1.9.0') Could you say which version you are using please? Colin W On Tue, Feb 10, 2015 at 5:07 PM, cjw c...@ncf.ca c...@ncf.ca wrote: It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below. I'll try to put some suggestions over the coming weeks and would appreciate comments. Colin W. 
Test Script:

    from numpy import mat  # import not shown in the original message

    if __name__ == '__main__':
        a= mat([4, 5, 6]) # Good
        print('a: ', a)
        b= mat([4, '5', 6]) # Not the expected result
        print('b: ', b)
        c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular
        print('c: ', c)
        d= mat([[1, 2, 3]])
        try:
            d[0, 1]= 'b' # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]

-- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html Sent from the Numpy-discussion mailing list archive at Nabble.com.
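The two checks asked for in the test script (reject non-numeric elements, reject ragged rows) can be sketched in plain Python. The helper name checked_rows is hypothetical and is not part of NumPy; this is only one possible way to do the validation before handing the data to a matrix or array constructor.

```python
from numbers import Number


def checked_rows(data):
    """Return data as a list of equal-length rows of numbers, or raise."""
    rows = [list(r) for r in data]
    lengths = [len(r) for r in rows]
    if len(set(lengths)) > 1:
        raise ValueError("rows have different lengths: %s" % lengths)
    for r in rows:
        for item in r:
            # Strings such as '5' are rejected rather than silently
            # turning the whole matrix into a string matrix.
            if not isinstance(item, Number):
                raise TypeError("non-numeric element: %r" % (item,))
    return rows


print(checked_rows([[4, 5, 6]]))
```

With this sketch, [[4, '5', 6]] raises TypeError and [[4, 5, 6], [7, 8]] raises ValueError, which are the outcomes the test script above expected from mat().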
Re: [Numpy-discussion] Characteristic of a Matrix.
On 06/01/2015 8:38 PM, Alexander Belopolsky wrote: On Tue, Jan 6, 2015 at 8:20 PM, Nathaniel Smith n...@pobox.com wrote: Since matrices are now part of some high school curricula, I urge that they be treated appropriately in Numpy. Further, I suggest that consideration be given to establishing V and VT sub-classes, to cover vectors and transposed vectors. The numpy devs don't really have the interest or the skills to create a great library for pedagogical use in high schools. If you're interested in an interface like this, then I'd suggest creating a new package focused specifically on that (which might use numpy internally). There's really no advantage in glomming this into numpy proper. Sorry for taking this further off-topic, but I recently discovered an excellent SAGE package, http://www.sagemath.org/. While its targeted audience includes math graduate students and research mathematicians, parts of it are accessible to schoolchildren. SAGE is written in Python and integrates a number of packages including numpy. My remark about high school was intended to emphasise that matrix algebra is an essential part of linear algebra. Numpy has not fully developed this part. I feel that Guido may not have fully understood the availability of the Matrix class when he approved the reliance on dot(). I would highly recommend to anyone interested in using Python for education to take a look at SAGE. Thanks Alexander, I'll do that. It looks excellent, but it seems that the University of Washington has funding problems and does not appear to have the crew of volunteers that Python has. Regards, Colin W.
Re: [Numpy-discussion] Characteristic of a Matrix.
On 08/01/2015 1:19 PM, Ryan Nelson wrote: Colin, I'll second the endorsement of Sage; however, for teaching purposes, I would suggest Sage Math Cloud. It is a free, web-based version of Sage, and it does not require you or the students to install any software (besides a new-ish web browser). It also make sharing/collaborative work quite easy as well. I've used this a bit for demos, and it's great. The author William Stein is good at correcting bugs/issues very quickly. Sage implements it's own Matrix and Vector classes, and the Vector class has a column method that returns a column vector (transpose). http://www.sagemath.org/doc/tutorial/tour_linalg.html For what it's worth, I agree with others about the benefits of avoiding a Matrix class in Numpy. In my experience, it certainly makes things cleaner in larger projects when I always use NDArray and just call the appropriate linear algebra functions (e.g. np.dot, etc) when that is context I need. Anyway, just my two cents. Ryan Ryan, Thanks. I agree that Sage Math Cloud seems the better way to go for students. However your preference for the dot() world may be because the Numpy Matrix Class is inadequately developed. I'm not suggesting that development, at this time, but proposing that the errors I referenced be considered as bugs. Colin W. On Wed, Jan 7, 2015 at 2:44 PM, cjw c...@ncf.ca mailto:c...@ncf.ca wrote: Thanks Alexander, I'll look at Sage. Colin W. On 06-Jan-15 8:38 PM, Alexander Belopolsky wrote: On Tue, Jan 6, 2015 at 8:20 PM, Nathaniel Smithn...@pobox.com mailto:n...@pobox.com wrote: Since matrices are now part of some high school curricula, I urge that they be treated appropriately in Numpy. Further, I suggest that consideration be given to establishing V and VT sub-classes, to cover vectors and transposed vectors. The numpy devs don't really have the interest or the skills to create a great library for pedagogical use in high schools. 
If you're interested in an interface like this, then I'd suggest creating a new package focused specifically on that (which might use numpy internally). There's really no advantage in glomming this into numpy proper. Sorry for taking this further off-topic, but I recently discovered an excellent SAGE package, http://www.sagemath.org/. While its targeted audience includes math graduate students and research mathematicians, parts of it are accessible to schoolchildren. SAGE is written in Python and integrates a number of packages including numpy. I would highly recommend to anyone interested in using Python for education to take a look at SAGE.
[Numpy-discussion] Characteristic of a Matrix.
One of the essential characteristics of a matrix is that it be rectangular. This is neither spelt out nor checked currently. The Doc description refers to a class:

class numpy.matrix [source] http://github.com/numpy/numpy/blob/v1.9.1/numpy/matrixlib/defmatrix.py#L206
Returns a matrix from an array-like object, or from a string of data. A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special operators, such as * (matrix multiplication) and ** (matrix power).

This illustrates a failure, which is reported later in the calculation:

    A2= np.matrix([[1, 2, -2], [-3, -1, 4], [4, 2 -6]])

Here 2 -6 is treated as an expression.

Wikipedia (http://en.wikipedia.org/wiki/Matrix_%28mathematics%29) offers: In mathematics, a *matrix* (plural *matrices*) is a rectangular *array* of numbers, symbols, or expressions, arranged in *rows* and *columns*. The individual items in a matrix are called its *elements* or *entries*. An example of a matrix with 2 rows and 3 columns is \begin{bmatrix}1 & 9 & -13 \\ 20 & 5 & -6 \end{bmatrix}.

In the Numpy context, the symbols or expressions need to be evaluable. Colin W.
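The missing-comma case can be reproduced without NumPy at all, since the expression is evaluated by Python before any matrix constructor sees it. The helper names row_lengths and is_rectangular below are hypothetical, introduced only for this illustration.

```python
# The last row is written "4, 2 -6": Python reads 2 -6 as the
# subtraction 2 - 6, so that row has two elements, not three.
rows = [[1, 2, -2], [-3, -1, 4], [4, 2 -6]]


def row_lengths(rows):
    """Length of each row, in order."""
    return [len(r) for r in rows]


def is_rectangular(rows):
    """True when every row has the same number of elements."""
    return len(set(row_lengths(rows))) <= 1


print(rows[2])               # [4, -4]
print(row_lengths(rows))     # [3, 3, 2]
print(is_rectangular(rows))  # False
```

A check of this kind at construction time would report the problem where it originates rather than later in the calculation.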
[Numpy-discussion] Compiling Numpy-1.8.1
This version of Numpy does not appear to be available as an installable binary. In any event, the LAPACK and other packages do not seem to be available with the installable versions. I understand that Visual Studio 2008 is normally used for Windows compiling. Unfortunately, this is no longer available from Microsoft. The link is replaced by a PowerPoint presentation. Can anyone suggest an alternative compiler/linker? Colin W.
Re: [Numpy-discussion] Compiling Numpy-1.8.1
Olivier, Thanks. I've installed Visual Studio 2008 Express. I'll read your Building on Windows document. Colin W. On 29 July 2014 08:50, Olivier Grisel olivier.gri...@ensta.org wrote: 2014-07-29 14:24 GMT+02:00 Colin J. Williams c...@ncf.ca: This version of Numpy does not appear to be available as an installable binary. In any event, the LAPACK and other packages do not seem to be available with the installable versions. I understand that Visual Studio 2008 is normally used for Windows compiling. Unfortunately, this is no longer available from Microsoft. The link is replaced by a PowerPoint presentation. Can anyone suggest an alternative compiler/linker? The web installer for MSVC Express 2008 is still online at: http://go.microsoft.com/?linkid=7729279 FYI I recently updated the scikit-learn documentation for building under Windows, both for Python 2 and Python 3 as well as 32 bit and 64 bit architectures: http://scikit-learn.org/stable/install.html#building-on-windows The same build environment should work for numpy (I think). -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel
Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 83
On 25-Mar-2014 1:00 PM, numpy-discussion-requ...@scipy.org wrote: Message: 3 Date: Mon, 24 Mar 2014 17:58:57 -0600 From: Charles R Harris charlesr.har...@gmail.com Subject: Re: [Numpy-discussion] Resolving the associativity/precedence debate for @ To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: cab6mnxlyjna5bhgoho+u8+p3umvxdjgg+zuqfwi+vjfhfos...@mail.gmail.com Content-Type: text/plain; charset=iso-8859-1 On Mon, Mar 24, 2014 at 5:56 PM, Nathaniel Smith n...@pobox.com wrote: On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith n...@pobox.com wrote: After 88 emails we don't have a conclusion in the other thread (see [1] for background). But we have to come to some conclusion or another if we want @ to exist :-). So I'll summarize where the discussion stands and let's see if we can find some way to resolve this. Response in this thread so far seems (AFAICT) to have pretty much converged on same-left. If you think that this would be terrible and there is some compelling argument against it, then please speak up! Otherwise, if no-one objects, then I'll go ahead in the next few days and put same-left into the PEP. I think we should take a close look at broadcasting before deciding on the precedence. Chuck Perhaps a closer look at np.matrix is needed too. There has been no close exploration of the weaknesses perceived by Nathan in the Matrix class. Are any of these of substance? If so, what corrections would be needed? Would implementation of those changes be done readily? I would like to see a Vector class, as a specialization of Matrix. These would avoid the use of an additional operator which would only be used with numpy. Colin W.
Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 56
Julian, I can see the need to recognize both column and row vectors, but why not with np.matrix? I can see no need for a new operator and hope to be able to comment more fully on PEP 465 in a few days. Colin W. On 17-Mar-2014 7:19 PM, numpy-discussion-requ...@scipy.org wrote: Send NumPy-Discussion mailing list submissions to numpy-discussion@scipy.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.scipy.org/mailman/listinfo/numpy-discussion or, via email, send a message with subject or body 'help' to numpy-discussion-requ...@scipy.org You can reach the person managing the list at numpy-discussion-ow...@scipy.org When replying, please edit your Subject line so it is more specific than Re: Contents of NumPy-Discussion digest... Today's Topics: 1. Re: [help needed] associativity and precedence of '@' (Nathaniel Smith) 2. Re: GSoC project: draft of proposal (Julian Taylor) 3. Re: [help needed] associativity and precedence of '@' (Christophe Bal) 4. Re: [help needed] associativity and precedence of '@' (Alexander Belopolsky) 5. Re: [help needed] associativity and precedence of '@' (Bago) 6. Re: [help needed] associativity and precedence of '@' (Christophe Bal) 7. Re: [help needed] associativity and precedence of '@' (Christophe Bal) 8. Re: [help needed] associativity and precedence of '@' (Nathaniel Smith) -- Message: 1 Date: Mon, 17 Mar 2014 22:02:33 + From: Nathaniel Smith n...@pobox.com Subject: Re: [Numpy-discussion] [help needed] associativity and precedence of '@' To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: CAPJVwB=zBazN+fiYWJeiWOL=4a9bf2xgxjgott8gftt-kdu...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 On Mon, Mar 17, 2014 at 9:38 PM, Christophe Bal projet...@gmail.com wrote: Here is the translation. ;-) Hello, and what about something like that ? 
    a @ b @ c -> (a @ b) @ c
    a * b @ c -> (a * b) @ c
    a @ b * c -> a @ (b * c)

Easy to remember: the *-product has priority regarding to the @-product, and we just do @-product from left to right. In the terminology we've been using in this thread, this is weak-left. An advantage of this is that most parsers do analyze from left to right. So I really think that it is a better choice than the weak-right one. We've mostly ignored this option because of assuming that if we want left-associativity, we should go with same-left instead of weak-left. Same-left is:

    a @ b @ c -> (a @ b) @ c
    a * b @ c -> (a * b) @ c
    a @ b * c -> (a @ b) * c

i.e., even more left-to-right than weak-left :-) Do you think weak-left is better than same-left?
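The grouping a given Python interpreter applies can be observed directly. The sketch below assumes Python 3.5 or later, where the @ operator exists with the same precedence as * and left associativity (the same-left option). The Sym class is a hypothetical helper that only records how operands were grouped.

```python
class Sym:
    """Tiny symbolic operand that records how Python groups an expression."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return Sym("(%s * %s)" % (self.name, other.name))

    def __matmul__(self, other):
        return Sym("(%s @ %s)" % (self.name, other.name))

    def __pow__(self, other):
        return Sym("(%s ** %s)" % (self.name, other.name))


a, b, c = Sym("a"), Sym("b"), Sym("c")
print((a @ b @ c).name)    # ((a @ b) @ c)
print((a * b @ c).name)    # ((a * b) @ c)
print((a @ b * c).name)    # ((a @ b) * c)
print((a ** b ** c).name)  # (a ** (b ** c)), the one right-associative case
```

Under weak-left the third line would instead have grouped as (a @ (b * c)).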
Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 45
I would like to see the case made for @. Yes, I know that Guido has accepted the idea, but he has changed his mind before. The PEP seems neutral to retaining both np.matrix and @. Nearly ten years ago, Tim Peters http://legacy.python.org/dev/peps/pep-0020/ gave us: There should be one-- and preferably only one --obvious way to do it. We now have: C= A * B. C becomes an instance of the Matrix class of shape (m, p) when A and B are matrices of shape (m, n) and (n, p) respectively. Actually, the rules are a little more general than the above. The PEP proposes that C= A @ B where the types or classes of A, B and C are not clear. We also have A.I for the inverse (for a square matrix) or A.T for the transpose of a matrix. One way is recommended in the Zen of Python; of the two, which is the obvious way? Colin W. On 15-Mar-2014 9:25 PM, numpy-discussion-requ...@scipy.org wrote: Today's Topics: 1. Re: [help needed] associativity and precedence of '@' (josef.p...@gmail.com) 2. Re: [RFC] should we argue for a matrix power operator, @@? 
(josef.p...@gmail.com) -- Message: 1 Date: Sat, 15 Mar 2014 21:20:40 -0400 From: josef.p...@gmail.com Subject: Re: [Numpy-discussion] [help needed] associativity and precedence of '@' To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: cammtp+ahag9fn3xpts4udrthbknvxzudc0g8ttj7g3w3dwb...@mail.gmail.com Content-Type: text/plain; charset=iso-8859-1 On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith n...@pobox.com wrote: Hi all, Here's the main blocker for adding a matrix multiply operator '@' to Python: we need to decide what we think its precedence and associativity should be. I'll explain what that means so we're on the same page, and what the choices are, and then we can all argue about it. But even better would be if we could get some data to guide our decision, and this would be a lot easier if some of you all can help; I'll suggest some ways you might be able to do that. So! Precedence and left- versus right-associativity. If you already know what these are you can skim down until you see CAPITAL LETTERS. We all know what precedence is. Code like this: a + b * c gets evaluated as: a + (b * c) because * has higher precedence than +. It binds more tightly, as they say. Python's complete precedence able is here: http://docs.python.org/3/reference/expressions.html#operator-precedence Associativity, in the parsing sense, is less well known, though it's just as important. It's about deciding how to evaluate code like this: a * b * c Do we use a * (b * c)# * is right associative or (a * b) * c# * is left associative ? Here all the operators have the same precedence (because, uh... they're the same operator), so precedence doesn't help. And mostly we can ignore this in day-to-day life, because both versions give the same answer, so who cares. But a programming language has to pick one (consider what happens if one of those objects has a non-default __mul__ implementation). 
And of course it matters a lot for non-associative operations like a - b - c or a / b / c So when figuring out order of evaluations, what you do first is check the precedence, and then if you have multiple operators next to each other with the same precedence, you check their associativity. Notice that this means that if you have different operators that share the same precedence level (like + and -, or * and /), then they have to all have the same associativity. All else being equal, it's generally considered nice to have fewer precedence levels, because these have to be memorized by users. Right now in Python, every precedence level is left-associative, except for '**'. If you write these formulas without any parentheses, then what the interpreter will actually execute is: (a * b) * c (a - b) - c (a / b) / c but a ** (b ** c) Okay, that's the background. Here's the question. We need to decide on precedence and associativity for '@'. In particular, there are three different options that are interesting: OPTION 1 FOR @: Precedence: same as * Associativity: left My shorthand name for it: same-left (yes, very creative) This means that if you don't use parentheses, you get: a @ b @ c - (a @ b) @ c a * b @ c - (a * b) @ c a @ b * c -
[Numpy-discussion] Matrix peculiarities
Ralf, Could you please elaborate on the matrix weaknesses? Is there any work planned to eliminate the peculiarities? Regards, Colin W. Subject: Re: [Numpy-discussion] Relative speed To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: CABL7CQjq6wZdfFmBgMhF5kFGpgQxqCY-Nv20=zbmtlwpxdd...@mail.gmail.com Content-Type: text/plain; charset="iso-8859-1" On Thu, Aug 29, 2013 at 3:41 PM, Jonathan T. Niehof jnie...@lanl.gov wrote: On 08/29/2013 09:33 AM, Anubhab Baksi wrote: Hi, I need to know about the relative speed (i.e., which one is faster) of the following: 1. list and numpy array, tuples and numpy array 2. list of tuples and numpy matrix (first one is rectangular) 3. random.randint() and numpy.random.random_integers() Hi Anubhab, if you have a reasonably large amount of data (say O(100)), always try to use numpy arrays and not lists or tuples - it'll be faster. I'd recommend not to use numpy.matrix; its speed will be similar to numpy arrays but it has some peculiarities that you'd rather not deal with. For the random numbers I'm not sure without checking; just timing it in ipython with %timeit is indeed the way to go. Cheers, Ralf
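Outside IPython, the same kind of timing comparison can be sketched with the standard timeit module. The data size, the sum-of-squares workload and the function names below are illustrative assumptions; actual timings depend on the machine and the NumPy build, so none are quoted.

```python
import random
import timeit

import numpy as np

n = 100_000
values = [random.random() for _ in range(n)]   # plain Python list
array = np.array(values)                       # same data as an ndarray


def sum_squares_list():
    """Sum of squares computed element by element in Python."""
    return sum(v * v for v in values)


def sum_squares_array():
    """Sum of squares computed by NumPy in compiled code."""
    return float(np.dot(array, array))


t_list = timeit.timeit(sum_squares_list, number=20)
t_array = timeit.timeit(sum_squares_array, number=20)
print("list: %.4f s, array: %.4f s" % (t_list, t_array))
```

Both functions compute the same quantity, so the comparison measures only the container and loop overhead.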
[Numpy-discussion] Python PEP 450
This is to respond to Alan's message: Message: 7 Date: Fri, 16 Aug 2013 11:20:32 -0400 From: Alan G Isaac alan.is...@gmail.com Subject: [Numpy-discussion] PEP 450 (stats module for standard library) To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: 520e4340.5010...@gmail.com Content-Type: text/plain; charset=UTF-8; format=flowed http://www.python.org/dev/peps/pep-0450/ https://groups.google.com/forum/#!topic/comp.lang.python/IV-3mobU7L0 Alan Isaac I suggest that the objectives should clearly identify the target group. It appears to be secondary school students. The overall aim should be compatibility with the numpy calls. Intentional deviations should be justified. Mean, median, variance, standard deviation, correlation and simple regression are desirable. The Poisson distribution should be included; I liked the story about horses in the Prussian army. lsr.py provides the start of a regression package for numpy: https://pypi.python.org/pypi/lsr.SID/0.3 This is based on matrices, which are probably not widely taught to secondary students. # See: http://www.stat.purdue.edu/~jennings/stat514/stat512notes/topic3.pdf # or: http://www.stat.purdue.edu/~zhangdb/stat525/notes/ch5.pdf - better Colin W.
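For orientation, PEP 450 was later accepted and shipped as the standard-library statistics module (Python 3.4 and later). The sketch below uses only that module; the sample data are made up for illustration. One point relevant to the compatibility concern above: statistics.variance divides by n - 1 (sample variance), whereas numpy.var divides by n unless ddof=1 is passed, so statistics.pvariance is the closer match to NumPy's default.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative sample, not from the thread

mean = statistics.mean(data)
median = statistics.median(data)
pop_var = statistics.pvariance(data)    # divides by n
sample_var = statistics.variance(data)  # divides by n - 1
pop_std = statistics.pstdev(data)

print(mean, median, pop_var, sample_var, pop_std)
```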
[Numpy-discussion] Treatment of the Matrix by Numpy
To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: CAN06oV9E2Xsf=tgbyqgxpnt4lhan6twtbuyi8gagt-vg2qa...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 On Wed, Jul 24, 2013 at 8:53 AM, Stéfan van der Walt ste...@sun.ac.za wrote: On Wed, Jul 24, 2013 at 2:15 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Tue, Jul 23, 2013 at 6:09 AM, Pauli Virtanen p...@iki.fi wrote: The .H property has been implemented in Numpy matrices and Scipy's sparse matrices for many years. Then we're done. Numpy is an array package, NOT a matrix package, and while you can implement matrix math with arrays (and we do), having quick and easy mnemonics for common matrix math operations (but uncommon general purpose array operations) is not the job of numpy. That's what the matrix object is for. I would argue that the ship sailed when we added .T already. Most users see no difference between the addition of .T and .H. I agree. During the Numarray period, I developed a Matrix sub-class which provided:

    # Properties
    A= property(fget= toArray, doc= 'Deliver the data as an array.')
    Adj= property(fget= getAdjoint, doc= 'Deliver the adjoint matrix.')
    Conj= property(fget= getConjugate, doc= 'Deliver the conjugate of the matrix.')
    Diag= property(fget= getDiagonal, doc= 'Extract the diagonal as a row vector.')
    lTri= property(fget= getLTri, doc= 'Extract the lower triangular matrix, ' + 'ie. elements on and below the diagonal.')
    Cond= property(fget= getCond, doc= 'Deliver the 2-norm Condition number.')
    Det= property(fget=getDeterminant, doc= 'Deliver the determinant.')
    EValues= property(fget= getEigenvalues, doc= 'Deliver the eigenvalues.')
    EVectors= property(fget= getEigenvectors, doc= 'Deliver the eigenvectors.')
    I= property(fget= getInverse, doc= 'Deliver the inverse.')
    Imag= property(fget= getImag, doc= 'Return the imaginary part of the matrix.')
    Norm= property(fget= getNorm, doc= 'To calculate the 2-norm of the matrix.')
    Real= property(fget= getReal, doc= 'Return the real part of the matrix.')
    Sqr= property(fget= getSqr, doc= 'Return the square of each element.')
    SVD= property(fget= getSVD, doc= 'Return S, V, D (Singular Value Decomposition).')
    T= property(fget= getTranspose, doc= 'Deliver the transpose.')
    uTri= property(fget= getUTri, doc= 'Extract the upper triangular matrix, ' + 'ie. elements on and above the diagonal.')

I think H was in there too. All of this was lost when Travis came along with numpy. No thought was given to sparse matrices at that time. The matrix class should probably be deprecated and removed from NumPy in the long run--being a second class citizen not used by the developers themselves is not sustainable. And, now that we have "dot" as a method, there's very little advantage to it. I would argue that, in some sense, it should be promoted. Perhaps it's better as a separate module. Stéfan Colin W.
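To make the property-based interface concrete, here is a minimal sketch of how a few of those properties could be layered on an ndarray subclass today. The class name PropMatrix and its design are assumptions for illustration only; this is not the Numarray-era code referred to above, and it covers just A, Det, I and Diag.

```python
import numpy as np


class PropMatrix(np.ndarray):
    """Hypothetical 2-D array subclass exposing a few read-only properties."""

    def __new__(cls, data):
        # dtype=float makes ragged or non-numeric input fail early.
        arr = np.asarray(data, dtype=float)
        if arr.ndim != 2:
            raise ValueError("PropMatrix requires 2-D, rectangular input")
        return arr.view(cls)

    @property
    def A(self):
        """Deliver the data as a plain ndarray."""
        return np.asarray(self)

    @property
    def Det(self):
        """Deliver the determinant."""
        return float(np.linalg.det(self.A))

    @property
    def I(self):
        """Deliver the inverse, as a PropMatrix."""
        return np.linalg.inv(self.A).view(type(self))

    @property
    def Diag(self):
        """Extract the diagonal as a 1-D array."""
        return np.diag(self.A)


m = PropMatrix([[2.0, 0.0], [0.0, 4.0]])
print(m.Det)
print(m.I)
print(m.T)   # .T is inherited from ndarray
```

Keeping such a class in a separate package, as suggested at the end of the message, would need nothing from NumPy beyond the public ndarray subclassing hooks used here.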
Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 82, Issue 34
Thanks Rob, I agree. Your suggestion is the better way. Colin W. On 19/07/2013 11:09 AM, numpy-discussion-requ...@scipy.org wrote: 1. Today's Topics: 1. Re: User Guide (Rob Clewley) [snip] Message: 1 Date: Thu, 18 Jul 2013 18:21:24 -0400 From: Rob Clewley rob.clew...@gmail.com Subject: Re: [Numpy-discussion] User Guide To: Discussion of Numerical Python numpy-discussion@scipy.org Message-ID: ca+7tcysr-cf7ynmpbnj+bl69q1+nxsjqrp2w6xhyoobg57c...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 Hi, I see the desire for stylistic improvement by removing the awkward parens but your correction has incorrect grammar. One cannot have "arrays of Python," nor are Numpy objects a subset of "Python" (because Python is not a set) -- both of which are what your sentence technically states. I.e., the commas are in the wrong place. You could say "The exception: one can have arrays of python objects (including those from numpy) thereby allowing for arrays of different sized elements." but I think it is even clear to just unpack this a bit more with "The exception: one can have arrays of python objects, including numpy objects, which allows arrays to contain different sized elements." In my experience, attempting to be extremely concise in technical writing is a common cause of awkward grammar problems like this. I do it all the time :) -Rob On Thu, Jul 18, 2013 at 9:18 AM, Colin J. Williams cjwilliam...@gmail.com wrote: Returning to numpy after a while away, I'm impressed with the style and content of the User Guide and the Reference. This is to offer a Guide correction - I couldn't figure out how to offer the correction on-line. What is Numpy? Suggest: "The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements." to: The exception: one can have arrays of Python, including NumPy objects, thereby allowing for arrays of different sized elements. Colin W. 
[Numpy-discussion] User Guide
Returning to numpy after a while away, I'm impressed with the style and content of the User Guide and the Reference. This is to offer a Guide correction - I couldn't figure out how to offer the correction on-line. In the section "What is Numpy?", I suggest changing: "The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements." to: "The exception: one can have arrays of Python, including NumPy objects, thereby allowing for arrays of different sized elements." Colin W.
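The exception that sentence describes can be shown in a few lines. This is only a sketch; the element values are arbitrary. An array with dtype=object stores references to Python objects, so its elements need not have the same size or type.

```python
import numpy as np

items = np.empty(3, dtype=object)   # three slots holding Python references
items[0] = 1
items[1] = "a longer string"
items[2] = [3.0, 4.0, 5.0]          # a list stored as a single element

print(items.dtype, [type(x).__name__ for x in items])
```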
Re: [Numpy-discussion] Time Zones and datetime64
On 12/04/2013 3:57 PM, Chris Barker - NOAA Federal wrote: On Fri, Apr 12, 2013 at 9:52 AM, Riccardo De Maria riccardodema...@gmail.com wrote: Not related to leap seconds and physically accurate time deltas, I have just noticed that SQLite has a nice API: http://www.sqlite.org/lang_datefunc.html that one can take inspiration from. The source contains a date.c which looks reasonably clear. Well, I don't see any timezone support in there at all. It appears they use UTC, though I'm not entirely sure from the docs what now() would return. So I think it's pretty much like my "use UTC" proposal. -Chris It's not clear whether the Julian day is an integer or contains a fractional part. Colin W.
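On the Julian day question, a quick check with Python's standard sqlite3 module may help. This is only a sketch: the helper name julian_day is made up for illustration, and it relies on SQLite's julianday() function as described on the page linked above.

```python
import sqlite3


def julian_day(timestamp):
    """Ask SQLite for the Julian day number of a 'YYYY-MM-DD HH:MM:SS' string."""
    conn = sqlite3.connect(":memory:")
    try:
        (value,) = conn.execute("SELECT julianday(?)", (timestamp,)).fetchone()
    finally:
        conn.close()
    return value


noon = julian_day("2013-04-12 12:00:00")
evening = julian_day("2013-04-12 18:00:00")

# The value comes back as a Python float; six hours later is a quarter of a day more.
print(type(noon).__name__, noon, evening - noon)
```

So the Julian day carries a fractional part: the whole number changes at noon and the fraction encodes the time of day.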
Re: [Numpy-discussion] Please stop bottom posting!!
On 11/04/2013 7:20 PM, Paul Hobson wrote: On Wed, Apr 3, 2013 at 4:28 PM, Doug Coleman doug.cole...@gmail.com wrote: Also, gmail "bottom-posts" by default. It's transparent to gmail users. I'd imagine they are some of the biggest offenders. Interesting. Mine go to the top by default and I always have to expand the quoted text, trim down as necessary, and then reply below the relevant bits. A quick gander at gmail's settings doesn't offer anything obvious. I'll dig deeper later. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion Bottom posting seems to be the accepted Usenet standard. I don't care which; can't someone make a decision, so that we all do the same thing? Please develop a rationale or toss a coin and let us know. Numpy needs a BDFL (or a shorter term, if you wish). Colin W. PS My last posting used the word Anaconda. It was squelched.
Re: [Numpy-discussion] Time Zones and datetime64
On 09/04/2013 5:46 PM, Mark Wiebe wrote: On Mon, Apr 8, 2013 at 12:24 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Recent discussion has made it clear that the timezone handling in the current (numpy 1.7) version of datetime64 is broken. Below is a discussion of some possible solutions, hopefully including most of the comments made on the recent thread on this list. http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066038.html The intent is that with a bit more discussion (focused, in this thread at least, on the time zone issues rather than other datetime64 issues), we can start a new datetime64 NEP. This looks great, thanks for putting it together! I've put some comments inline. Background: === The current version (numpy 1.7) of datetime64 appears to handle timezones in the following ways: datetime64s are assumed to be in UTC internally. Time zone translation is done on I/O, i.e. when creating a new datetime64 and when outputting to text format or as a datetime.datetime object. It might be better to say "defined" instead of "assumed", because that was an explicit choice. When creating a datetime64 from an ISO string, the timezone info in the string is respected. If there is no timezone info in the string, the system time zone (locale setting) is used. On output (i.e. converting to text: __str__ and __repr__) the system locale is used to set the timezone.
In [9]: np.datetime64('2013-04-08T12:00:00Z') Out[9]: numpy.datetime64('2013-04-08T05:00:00-0700') In [10]: np.datetime64('2013-04-08T12:00:00') Out[10]: numpy.datetime64('2013-04-08T12:00:00-0700') However, if a datetime.datetime is used without a tzinfo object (the common case, as no tzinfo objects are provided with the python stdlib), the timezone is assumed to be UTC: In [13]: dt Out[13]: datetime.datetime(2013, 4, 8, 12, 0) In [14]: np.datetime64(dt) Out[14]: numpy.datetime64('2013-04-08T05:00:00.00-0700') which can give some odd results, as it's different if you convert the datetime object to an ISO string first: In [15]: np.datetime64(dt.isoformat()) Out[15]: numpy.datetime64('2013-04-08T12:00:00-0700') Converting from a datetime64 to a datetime object uses the UTC time (the internal representation with no offset). Issues with the current configuration: === Using the locale time zone is a long-standing tradition, and is used by the C standard library time functions. However, it is almost always NOT what one wants in a typical numpy application. When working with scientific (and financial) datasets, the time zone of the data at hand is likely to have nothing to do with the timezone of the computer the code is running on. Also, with cloud computing and web applications, the time zone of the machine on which the code is running is irrelevant to the user. A number of early adopters of datetime64 have found that they have needed to wrap the creation and use of datetime64 arrays to override the timezone behavior. The current implementation may be natural for some interactive use, but that's often not the case, and it is particularly problematic when datetime.datetime.now() gives local time, but with no time zone info, so numpy actually appears to shift it. In [19]: datetime.datetime.now().isoformat() Out[19]: '2013-04-08T12:05:26.157475'
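The kind of wrapper early adopters describe can be sketched with the standard library alone. This is an illustration, not the proposal itself: the helper name to_utc_naive and the choice to treat naive datetimes as UTC are assumptions made for the example, and the fixed -07:00 offset simply mirrors the sessions above.

```python
from datetime import datetime, timedelta, timezone


def to_utc_naive(dt, assume_tz=timezone.utc):
    """Normalise a datetime to UTC and drop tzinfo.

    Naive inputs are interpreted in `assume_tz` (an assumption of this sketch),
    so the result never depends on the machine's locale setting.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=assume_tz)
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# A fixed -07:00 offset, standing in for the local zone in the sessions above.
pacific = timezone(timedelta(hours=-7))

aware = datetime(2013, 4, 8, 12, 0, tzinfo=pacific)
naive = datetime(2013, 4, 8, 12, 0)

print(to_utc_naive(aware).isoformat())  # noon at -07:00 expressed in UTC
print(to_utc_naive(naive).isoformat())  # naive input taken as UTC, unchanged
```

Timestamps normalised this way can be handed on to array code without the result depending on where the code happens to run.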
Re: [Numpy-discussion] ANN: NumPy 1.7.1rc1 release
On 07/04/2013 1:03 AM, Ondřej Čertík wrote: On Tue, Mar 26, 2013 at 6:32 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: [...] Yes. I created an issue here for them to test it: https://github.com/scikit-learn/scikit-learn/issues/1809 Just to make sure. There don't seem to be any more problems, so I am releasing 1.7.1 now. Ondrej ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I would appreciate guidance as to how to install this on win32 with any optimizations that are available. Colin W
Re: [Numpy-discussion] Sources more confusing in Python
On 07/04/2013 10:32 AM, Happyman wrote: Hello, I started using python 4-5 months ago. At that time I didn't realize there are incredibly many resources, like modules and additional (ready-made) programs, in python. The problem is knowing which one gives me all I want "properly". I mean, where (in what exact place) can I download standard modules without following other links? For example, an Excel python module, an image processing module, some other module... Every time I get modules from different links. Is there one exact (stable) place to get them, rather than jumping from one site to another? Any answer is appreciated ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion PyPi is a good starting place, see. There is also PySpread.py, which may be currently limited to Linux. Colin W.
Re: [Numpy-discussion] try to solve issue #2649 and revisit #473
On 03/04/2013 7:11 PM, huangkan...@gmail.com wrote: Agree with the row-vector and column-vector thing. I notice that in ndarray multiplication, the 1-d array is treated as a column-vector. But in matrix multiplication, a 1-d array is converted to a row-vector. So if the 1-d array is simply matched to a column-vector, the behavior of ndarray and matrix will be consistent. On Wed, Apr 3, 2013 at 6:59 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 1:03 PM, Alan G Isaac alan.is...@gmail.com wrote: On 4/3/2013 3:18 PM, huangkan...@gmail.com wrote: In my view, the result should be a 1d array, the same as I.A.dot(x). But the maintainers wanted operations with matrices to return matrices whenever possible. So instead of returning x it returns np.matrix(x). The matrix object is a fine idea, but the key problem is that it provides a 2-d matrix, but no concept of a 1-d vector. I think it would all be cleaner if there were a row-vector and column-vector object to accompany matrix -- then the things that naturally return a vector could do so. You can't use a regular 1-d array because there is no way to distinguish between a row or column version. But as Alan said, this was all hashed out a few years back -- a bunch of great ideas, but no one to implement them. The truth is that matrix has little value outside of teaching, so no one with the skills to push it forward uses it themselves. -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Kan Huang Department of Applied math Statistics Stony Brook University 917-767-8018 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I wonder about the value of matrices stated above. Perhaps the thing to do is to wean people away from all those darned dots. Surely, it would be straightforward to subclass matrix as a vector (Vec, a column of values); there could be a second subclass (TVec) for a transposed vector (a row of values). Some sort of convention would be needed to treat the one-dimensional array as a Vec or a TVec. I like the simpler algebra of matrices/vectors. Colin W.
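The row/column distinction discussed here can be illustrated without the matrix class at all. The sketch below is not the Vec/TVec subclass idea itself: it swaps in plain 2-d ndarrays, and the helper names as_column and as_row are invented for the example.

```python
import numpy as np


def as_column(values):
    """Return the values as an explicit column vector of shape (n, 1)."""
    return np.asarray(values).reshape(-1, 1)


def as_row(values):
    """Return the values as an explicit row vector of shape (1, n)."""
    return np.asarray(values).reshape(1, -1)


a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
x = [1, 2, 3]

col = a @ as_column(x)            # matrix times column gives a column, shape (2, 1)
row = as_row([1, 2]) @ a          # row times matrix gives a row, shape (1, 3)
inner = as_row(x) @ as_column(x)  # row times column gives a 1x1 result
outer = as_column(x) @ as_row(x)  # column times row gives a 3x3 result

print(col.shape, row.shape, inner.shape, outer.shape)
```

Once the orientation is carried in the shape, the products follow the usual algebra, at the cost of the reshaping convention that the Vec/TVec proposal would hide.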
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 23/03/2013 7:21 AM, Ralf Gommers wrote: On Fri, Mar 22, 2013 at 10:39 PM, Colin J. Williams cjwilliam...@gmail.com wrote: On 20/03/2013 11:12 AM, Frédéric Bastien wrote: On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote: On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 does not mean it is a 32-bit Windows. sys.platform always returns win32 on 32-bit and 64-bit Windows, even for a 64-bit python. But that is a good question: is your python 32 or 64 bits? 32 bits. That explains why you have a memory problem but other people with a 64-bit version do not. So if you want to work with bigger input, change to a 64-bit python. Fred Thanks to the people who responded to my report that numpy with Python 3.2 was significantly slower than with Python 2.7. I have updated to numpy 1.7.0 for each of the Pythons 2.7.3, 3.2.3 and 3.3.0. The Pythons came from python.org and the Numpys from PyPi. The SciPy site still points to Source Forge; I gathered from the responses that Source Forge is no longer recommended for downloads. That's not the case. The official binaries for NumPy and SciPy are on SourceForge. The Windows installers on PyPI are there to make easy_install work, but they're likely slower than the SF installers (no SSE2/SSE3 instructions). Ralf Thanks, I'll read over Robert Kern's comments. PyPi is the simpler process, but if the result is unoptimized code, then easy_install is not the way to go. The code is available here (http://web.ncf.ca/cjw/testFPSpeed.py) and the most recent test results are here (http://web.ncf.ca/cjw/FP%2023-Mar-13%20Test%20Summary.txt). These are using PyPi; I'll look into SourceForge. Colin W.
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 23/03/2013 12:05 AM, Chris Barker - NOAA Federal wrote: On Fri, Mar 22, 2013 at 2:39 PM, Colin J. Williams cjwilliam...@gmail.com wrote: I have updated to numpy 1.7.0 for each of the Pythons 2.7.3, 3.2.3 and 3.3.0. ... The tests, which are available here (http://web.ncf.ca/cjw/FP%20Summary%20over%20273-323-330.txt), show that 3.2 is slower, but not to the same degree reported before. Have you posted your test code anywhere? Anyway, depending on how you did your timings, that looks to me like 3.* is a bit faster with small data, and pretty much within measurement error for the large datasets. And if the large ones are doing things with really big arrays (I'm assuming pretty big, as you're getting close to 32 bit memory limits...), then it's really hard to imagine how python version could make a noticeable difference -- the real work would be in the numpy code, and that's exactly the same on all python versions. If you are using BLAS or LAPACK stuff, then there might be some differences with the different builds, though I wouldn't expect so if you are getting them from the same source. -Chris I used the versions from PyPi; this choice has been questioned. I'll compare with the SourceForge versions. Also, I shall be incorporating the random SEED. I expect to report the results in the next week or so. The test code used is available here: http://web.ncf.ca/cjw/testFPSpeed.py Colin W.
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 11:12 AM, Frédéric Bastien wrote: On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote: On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 does not mean it is a 32-bit Windows. sys.platform always returns win32 on 32-bit and 64-bit Windows, even for a 64-bit python. But that is a good question: is your python 32 or 64 bits? 32 bits. That explains why you have a memory problem but other people with a 64-bit version do not. So if you want to work with bigger input, change to a 64-bit python. Fred Thanks to the people who responded to my report that numpy with Python 3.2 was significantly slower than with Python 2.7. I have updated to numpy 1.7.0 for each of the Pythons 2.7.3, 3.2.3 and 3.3.0. The Pythons came from python.org and the Numpys from PyPi. The SciPy site still points to Source Forge; I gathered from the responses that Source Forge is no longer recommended for downloads. The tests, which are available here (http://web.ncf.ca/cjw/FP%20Summary%20over%20273-323-330.txt), show that 3.2 is slower, but not to the same degree reported before. Colin W. PS There also seems to be a Python problem with the treatment of sys.argv in Python 3.3
[Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail, when the memory capacity is exceeded. In both Python 2.7 and 3.2, matrices of order 3,071 are handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower than Python 2.7. The profiler indicates a problem in the solver. Done on a Pentium, with 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are shown below. Colin W.

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]

 order   measure of imprecision   time elapsed (seconds)
     2                    0.097                 0.004143
     5                    2.207                 0.001514
    11                    2.372                 0.001455
    23                    3.318                 0.001608
    47                    4.257                 0.002339
    95                    4.986                 0.005747
   191                    5.788                 0.029974
   383                    6.765                 0.145339
   767                    7.909                 0.841142
  1535                    8.532                 5.793630
  3071                    9.774                39.559540
  6143   Process terminated by a MemoryError

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]

 order   measure of imprecision   time elapsed (seconds)
     2                    0.000                 0.113930
     5                    1.807                 0.001373
    11                    2.395                 0.001468
    23                    3.073                 0.001609
    47                    5.642                 0.002687
    95                    5.745                 0.013510
   191                    5.866                 0.061560
   383                    7.129                 0.418490
   767                    8.240                 3.815713
  1535                    8.735                27.877270
  3071                    9.996               212.545610
  6143   Process terminated by a MemoryError
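For readers who want to try something similar, here is a rough sketch of a test of this kind. It is not the testFPSpeed.py program: the function names are invented, the error measure (largest absolute entry of A·inv(A) - I) is an assumption and is not the "measure of imprecision" reported above, and the seeded generator follows the later remark about incorporating a random seed.

```python
import time

import numpy as np


def matrix_orders(start=2, count=5):
    """Orders following the pattern in the results above: 2, 5, 11, 23, ... (n -> 2n + 1)."""
    order = start
    for _ in range(count):
        yield order
        order = 2 * order + 1


def inversion_error(order, rng):
    """Largest absolute deviation of A @ inv(A) from the identity (assumed measure)."""
    mat = rng.random((order, order))
    inv = np.linalg.inv(mat)
    return float(np.max(np.abs(mat @ inv - np.eye(order))))


def run(count=5, seed=0):
    """Time the build/invert/check cycle for each order; returns (order, error, seconds)."""
    rng = np.random.default_rng(seed)  # fixed seed so that runs are comparable
    results = []
    for order in matrix_orders(count=count):
        started = time.perf_counter()
        error = inversion_error(order, rng)
        results.append((order, error, time.perf_counter() - started))
    return results


results = run()
for order, error, seconds in results:
    print("order=%5d  max error=%.3e  time elapsed (seconds)=%.6f" % (order, error, seconds))
```

Raising count extends the run toward the larger orders; with a fixed seed the same matrices are generated in every interpreter, which makes the timings easier to compare.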
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:14 AM, Daπid wrote: Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. Given this, maybe the difference is in ATLAS itself. How have you installed it? I know nothing about what goes on behind the scenes. I am using the win32 binary package. Colin W. When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary chances are that your version is optimised for a machine quite different from yours. So, two different installations could have been compiled in different machines and so one is more suited for your machine. If you want to be sure, I would try to compile ATLAS (this may be difficult) or check the same on a very different machine (like an AMD processor, different architecture...). Just for reference, on Linux Python 2.7 64 bits can deal with these matrices easily. %timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff)) 2.41799631031e-05 1.13955868701e-05 3.64338191541e-05 1.13484781021e-05 1 loops, best of 3: 156 s per loop Intel i5, 4 GB of RAM and SSD. ATLAS installed from Fedora repository (I don't run heavy stuff on this computer). On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote: I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail, when the memory capacity is exceeded. In both Python 2.7 and 3.2 matrices of order 3,071 area handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower than Python 2.7. The profiler indicates a problem in the solver. Done on a Pentium, with 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are show below. 
Colin W. _ 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] order=2 measure ofimprecision= 0.097 Time elapsed (seconds)= 0.004143 order=5 measure ofimprecision= 2.207 Time elapsed (seconds)= 0.001514 order= 11 measure ofimprecision= 2.372 Time elapsed (seconds)= 0.001455 order= 23 measure ofimprecision= 3.318 Time elapsed (seconds)= 0.001608 order= 47 measure ofimprecision= 4.257 Time elapsed (seconds)= 0.002339 order= 95 measure ofimprecision= 4.986 Time elapsed (seconds)= 0.005747 order= 191 measure ofimprecision= 5.788 Time elapsed (seconds)= 0.029974 order= 383 measure ofimprecision= 6.765 Time elapsed (seconds)= 0.145339 order= 767 measure ofimprecision= 7.909 Time elapsed (seconds)= 0.841142 order= 1535 measure ofimprecision= 8.532 Time elapsed (seconds)= 5.793630 order= 3071 measure ofimprecision= 9.774 Time elapsed (seconds)= 39.559540 order= 6143 Process terminated by a MemoryError Above: 2.7.3 Below: Python 3.2.3 bbb_bbb 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)] order=2 measure ofimprecision= 0.000 Time elapsed (seconds)= 0.113930 order=5 measure ofimprecision= 1.807 Time elapsed (seconds)= 0.001373 order= 11 measure ofimprecision= 2.395 Time elapsed (seconds)= 0.001468 order= 23 measure ofimprecision= 3.073 Time elapsed (seconds)= 0.001609 order= 47 measure ofimprecision= 5.642 Time elapsed (seconds)= 0.002687 order= 95 measure ofimprecision= 5.745 Time elapsed (seconds)= 0.013510 order= 191 measure ofimprecision= 5.866 Time elapsed (seconds)= 0.061560 order= 383 measure ofimprecision= 7.129 Time elapsed (seconds)= 0.418490 order= 767 measure ofimprecision= 8.240 Time elapsed (seconds)= 3.815713 order= 1535 measure ofimprecision= 8.735 Time elapsed (seconds)= 27.877270 order= 3071 measure ofimprecision= 9.996 Time elapsed (seconds)=212.545610 order= 6143 Process terminated by a MemoryError ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ 
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:29 AM, Jens Nielsen wrote: Hi, Could also be that they are linked to different libs such as atlas and standard BLAS. What is the output of numpy.show_config() in the two different python versions? Jens Thanks for this pointer. The result for Py2.7:

numpy.show_config()
atlas_threads_info:
    NOT AVAILABLE
blas_opt_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = c
atlas_blas_threads_info:
    NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = f77
atlas_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = f77
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
atlas_blas_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = c
mkl_info:
    NOT AVAILABLE

The result for 3.2:

import numpy
numpy.show_config()
lapack_info:
    NOT AVAILABLE
lapack_opt_info:
    NOT AVAILABLE
blas_info:
    NOT AVAILABLE
atlas_threads_info:
    NOT AVAILABLE
blas_src_info:
    NOT AVAILABLE
atlas_blas_info:
    NOT AVAILABLE
lapack_src_info:
    NOT AVAILABLE
atlas_blas_threads_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
blas_opt_info:
    NOT AVAILABLE
atlas_info:
    NOT AVAILABLE
lapack_mkl_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE

I hope that this helps. Colin W. On Wed, Mar 20, 2013 at 2:14 PM, Daπid davidmen...@gmail.com wrote: Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. Given this, maybe the difference is in ATLAS itself. How have you installed it?
When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary chances are that your version is optimised for a machine quite different from yours. So, two different installations could have been compiled in different machines and so one is more suited for your machine. If you want to be sure, I would try to compile ATLAS (this may be difficult) or check the same on a very different machine (like an AMD processor, different architecture...). Just for reference, on Linux Python 2.7 64 bits can deal with these matrices easily. %timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff)) 2.41799631031e-05 1.13955868701e-05 3.64338191541e-05 1.13484781021e-05 1 loops, best of 3: 156 s per loop Intel i5, 4 GB of RAM and SSD. ATLAS installed from Fedora repository (I don't run heavy stuff on this computer). On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote: I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail, when the memory capacity is exceeded. In both Python 2.7 and 3.2 matrices of order 3,071 area handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 do not mean it is a 32 bits windows. sys.platform always return win32 on 32bits and 64 bits windows even for python 64 bits. But that is a good question, is your python 32 or 64 bits? 32 bits. Colin W. Fred On Wed, Mar 20, 2013 at 10:14 AM, Daπid davidmen...@gmail.com wrote: Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. Given this, maybe the difference is in ATLAS itself. How have you installed it? When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary chances are that your version is optimised for a machine quite different from yours. So, two different installations could have been compiled in different machines and so one is more suited for your machine. If you want to be sure, I would try to compile ATLAS (this may be difficult) or check the same on a very different machine (like an AMD processor, different architecture...). Just for reference, on Linux Python 2.7 64 bits can deal with these matrices easily. %timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff)) 2.41799631031e-05 1.13955868701e-05 3.64338191541e-05 1.13484781021e-05 1 loops, best of 3: 156 s per loop Intel i5, 4 GB of RAM and SSD. ATLAS installed from Fedora repository (I don't run heavy stuff on this computer). On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote: I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail, when the memory capacity is exceeded. In both Python 2.7 and 3.2 matrices of order 3,071 area handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower than Python 2.7. 
The profiler indicates a problem in the solver. Done on a Pentium, with 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are show below. Colin W. _ 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] order=2 measure ofimprecision= 0.097 Time elapsed (seconds)= 0.004143 order=5 measure ofimprecision= 2.207 Time elapsed (seconds)= 0.001514 order= 11 measure ofimprecision= 2.372 Time elapsed (seconds)= 0.001455 order= 23 measure ofimprecision= 3.318 Time elapsed (seconds)= 0.001608 order= 47 measure ofimprecision= 4.257 Time elapsed (seconds)= 0.002339 order= 95 measure ofimprecision= 4.986 Time elapsed (seconds)= 0.005747 order= 191 measure ofimprecision= 5.788 Time elapsed (seconds)= 0.029974 order= 383 measure ofimprecision= 6.765 Time elapsed (seconds)= 0.145339 order= 767 measure ofimprecision= 7.909 Time elapsed (seconds)= 0.841142 order= 1535 measure ofimprecision= 8.532 Time elapsed (seconds)= 5.793630 order= 3071 measure ofimprecision= 9.774 Time elapsed (seconds)= 39.559540 order= 6143 Process terminated by a MemoryError Above: 2.7.3 Below: Python 3.2.3 bbb_bbb 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)] order=2 measure ofimprecision= 0.000 Time elapsed (seconds)= 0.113930 order=5 measure ofimprecision= 1.807 Time elapsed (seconds)= 0.001373 order= 11 measure ofimprecision= 2.395 Time elapsed (seconds)= 0.001468 order= 23 measure ofimprecision= 3.073 Time elapsed (seconds)= 0.001609 order= 47 measure ofimprecision= 5.642 Time elapsed (seconds)= 0.002687 order= 95 measure ofimprecision= 5.745 Time elapsed (seconds)= 0.013510 order= 191 measure ofimprecision= 5.866 Time elapsed (seconds)= 0.061560 order= 383 measure ofimprecision= 7.129 Time elapsed (seconds)= 0.418490 order= 767 measure ofimprecision= 8.240 Time elapsed (seconds)= 3.815713 order= 1535 measure ofimprecision= 8.735 Time elapsed (seconds)= 27.877270 order= 3071 measure 
ofimprecision= 9.996 Time elapsed (seconds)=212.545610 order= 6143 Process terminated by a MemoryError
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 11:06 AM, Jens Nielsen wrote: The python3 version is compiled without any optimised library and is falling back on a slow version. Where did you get this installation from? Jens From the SciPy site. Colin W. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 11:12 AM, Frédéric Bastien wrote: On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote: On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 does not mean it is a 32-bit Windows. sys.platform always returns win32 on 32-bit and 64-bit Windows, even for a 64-bit python. But that is a good question: is your python 32 or 64 bits? 32 bits. That explains why you have a memory problem but other people with a 64-bit version do not. So if you want to work with bigger input, change to a 64-bit python. Fred But my machine is only 32 bit. Colin W.
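Since sys.platform does not answer the 32/64-bit question, here is a small standard-library sketch that does. The helper names are invented for illustration, and the memory figure is just the raw size of one float64 matrix, not a full accounting of what an inversion needs.

```python
import struct
import sys


def interpreter_bits():
    """Pointer width of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def matrix_bytes(order, itemsize=8):
    """Bytes needed to hold one dense order x order matrix of float64 values."""
    return order * order * itemsize


print("sys.platform:", sys.platform)            # reports 'win32' on any Windows build
print("interpreter bits:", interpreter_bits())
print("sys.maxsize > 2**32:", sys.maxsize > 2**32)
print("one 6143 x 6143 float64 matrix: %d bytes" % matrix_bytes(6143))
```

One matrix of order 6,143 already takes roughly 300 MB, and the test holds several of them at once (the matrix, its inverse, the product and the identity), so a 32-bit process running out of address space at that order is plausible.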
[Numpy-discussion] Synonym standards
It seems that these standards have been adopted, which is good: The following import conventions are used throughout the NumPy source and documentation:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt Is there some similar standard for PyLab? Thanks, Colin W.
Re: [Numpy-discussion] Synonym standards
*From:* Benjamin Root ben.r...@ou.edu *To:* Discussion of Numerical Python numpy-discussion@scipy.org *Sent:* 26 July 2012 16:57 *Subject:* Re: [Numpy-discussion] Synonym standards On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams fn...@ncf.ca wrote: It seems that these standards have been adopted, which is good: The following import conventions are used throughout the NumPy source and documentation: import numpy as np import matplotlib as mpl import matplotlib.pyplot as plt Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt Is there some similar standard for PyLab? Thanks, Colin W. Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab module is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root
Re: [Numpy-discussion] Synonym standards
On 26/07/2012 4:57 PM, Benjamin Root wrote: On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams fn...@ncf.ca wrote: It seems that these standards have been adopted, which is good: The following import conventions are used throughout the NumPy source and documentation: import numpy as np import matplotlib as mpl import matplotlib.pyplot as plt Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt Is there some similar standard for PyLab? Thanks, Colin W. Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab "module" is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root Thanks Ben, I would prefer not to use from xxx import *, because of the name pollution. The naming convention that I copied above helps to avoid that pollution. In the same spirit, I've used: import pylab as plb I had suspected, but hadn't checked, that pylab contains the total namespace of numpy and matplotlib; thanks for confirming this. Colin W.
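To illustrate the name pollution Colin mentions, here is a small sketch that uses only NumPy (matplotlib is left out so the example stays self-contained). The star import is run inside a scratch dictionary, a device chosen here purely for illustration, so that the effect can be inspected without touching the current namespace.

```python
import numpy as np

# Run the star import in a scratch namespace so this module stays clean.
scratch = {}
exec("from numpy import *", scratch)

# Several NumPy functions share their names with Python built-ins; after a
# star import the plain names refer to the NumPy versions instead.
candidates = ("sum", "min", "max", "abs", "round", "all", "any")
shadowed = [
    name
    for name in candidates
    if name in scratch and scratch[name] is getattr(np, name, None)
]
print(shadowed)

# With the namespaced convention both versions stay available side by side.
builtin_total = sum([1, 2, 3])        # Python's built-in sum
numpy_total = np.sum([1, 2, 3])       # NumPy's sum, spelled explicitly
```

The exact list printed depends on the NumPy version installed, which is why no expected output is shown.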
Re: [Numpy-discussion] Rounding to next lowest float
If you are using integers, why not use Python's Long? Colin W.

On 11/10/2011 2:00 PM, Matthew Brett wrote: Hi, Can anyone think of a clever way to round an integer to the next lowest integer represented in a particular floating point format? For example:

    In [247]: a = 2**25+3

This is out of range of the continuous integers representable by float32, hence:

    In [248]: print a, int(np.float32(a))
    33554435 33554436

But I want to round down (floor) the integer in float32. That is, in this case I want:

    floor_exact(a, np.float32)
    33554432

I can break the float into its parts to do it: https://github.com/matthew-brett/nibabel/blob/f687bfc88d1676a09fc76c968a346bc81e4d0d04/nibabel/floating.py but that's obviously rather ugly... Is there a simpler way? I'm sure there is and I haven't thought of it... Best, Matthew
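One possible answer to Matthew's question, offered as a sketch rather than as the approach nibabel actually uses: convert to the float type, and if rounding went upwards, step back to the adjacent lower float with np.nextafter. It relies on Python integers being exact (Colin's point), and it assumes the value lies within the finite range of the float type.

```python
import numpy as np


def floor_exact(val, flt_type):
    """Largest integer <= val that flt_type can represent exactly.

    Sketch only: assumes val is a Python int within the finite range of
    flt_type (no overflow to infinity).
    """
    f = flt_type(val)            # round-to-nearest conversion
    if int(f) > val:             # exact comparison using Python integers
        # Rounded up, so move to the next representable value below f.
        f = np.nextafter(f, flt_type(-np.inf))
    return int(f)


a = 2**25 + 3
print(a, int(np.float32(a)), floor_exact(a, np.float32))
# prints: 33554435 33554436 33554432
```

Above 2**25 the float32 values are spaced 4 apart, so 33554435 rounds to 33554436 and the value below it is 33554432, matching the result requested in the message.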
Re: [Numpy-discussion] non-standard standard deviation
On 04-Dec-09 10:54 AM, Bruce Southey wrote: On 12/04/2009 06:18 AM, yogesh karpate wrote: @ Pauli and @ Colin: Sorry for the late reply. I was busy with some other assignments. # As far as normalization by (n) is concerned, the common assumption is that the population is normally distributed and the population size is large enough to fit the normal distribution. But this standard deviation, when applied to a small population, tends to be too low and is therefore called biased. # The correction known as Bessel's correction is there for the standard deviation of small samples, i.e. normalization by (n-1). # In Electrical and Electronic Measurements and Instrumentation by A.K. Sawhney, the first chapter, Fundamentals of Measurements, shows that for N=16 the standard deviation normalization was (n-1)=15. # While I was learning statistics, my course instructor would advise taking n=20 for normalization by (n-1). # Probability and Statistics in the Schaum's series is good reading. Regards ~ymk Hi, Basically, all that I see with these arbitrary values is that you are relying on the 'central limit theorem' (http://en.wikipedia.org/wiki/Central_limit_theorem). Really the issue in using these values is how much statistical bias you will tolerate, especially in its impact on the use of that estimate, because uses of the variance (such as in statistical tests) tend to be more influenced by bias than the estimate of the variance itself. (Of course, many features rely on asymptotic properties, so bias concerns are less apparent in large sample sizes.) Obviously the default relies on the developer's background and requirements.
There are multiple valid variance estimators in statistics with different denominators, like N (maximum likelihood estimator), N-1 (restricted maximum likelihood estimator and certain Bayesian estimators) and Stein's (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the current default behavior is valid and documented. Consequently you cannot just have one option or different functions (like certain programs), and Numpy's implementation actually allows you to do all these in a single function. So I also see no reason to change, even if I have to add the ddof=1 argument; after all 'Explicit is better than implicit' :-). Bruce

Bruce, I suggest that the Central Limit Theorem is tied in with the Law of Large Numbers. When one has a smallish sample size, what gives the best estimate of the variance? The Bessel Correction provides a rationale, based on expectations: (http://en.wikipedia.org/wiki/Bessel%27s_correction). It is difficult to understand the proof of Stein: http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example The symbols used are not clearly stated. He seems interested in a decision rule for the calculation of the mean of a sample and claims that his approach is better than the traditional Least Squares approach. In most cases, the interest is likely to be in the variance, with a view to establishing a confidence interval. In the widely used Analysis of Variance (ANOVA), the degrees of freedom are reduced for each mean estimated, see: http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below:

Analysis of Variance Table

    Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio   p
    Between Groups        25.20            2                    12.60         5.178     .05
    Within Groups         29.20            12                   2.43
    Total                 54.40            14

There is a sample of 15 observations, which is divided into three groups, depending on the number of hours of therapy. Thus, the Total degrees of freedom are 15-1 = 14, the Between Groups 3-1 = 2 and the Residual is 14 - 2 = 12. Colin W.
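The arithmetic behind the ANOVA table quoted above can be reproduced in a few lines. This is only a worked check of the figures in the message (the sums of squares, 15 observations and 3 groups are taken from it); the variable names are made up for illustration.

```python
# Figures taken from the table in the message above.
n_obs, n_groups = 15, 3
ss_between, ss_within = 25.20, 29.20

# One degree of freedom is lost for each mean that is estimated.
df_total = n_obs - 1                  # 14
df_between = n_groups - 1             # 2
df_within = df_total - df_between     # 12

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F ratio compares between-group and within-group mean squares.
f_ratio = ms_between / ms_within

print(df_total, df_between, df_within)
print(round(ms_between, 2), round(ms_within, 2), round(f_ratio, 3))
# prints: 14 2 12
#         12.6 2.43 5.178
```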
Re: [Numpy-discussion] non-standard standard deviation
On 04-Dec-09 05:21 AM, Pauli Virtanen wrote: Fri, 2009-12-04 at 11:19 +0100, Chris Colbert wrote: Why can't the divisor constant just be made an optional kwarg that defaults to zero? It already is an optional kwarg that defaults to zero. Cheers, I suggested that 1 (one) would be a better default but Robert Kern told us that it won't happen. Colin W.
Re: [Numpy-discussion] non-standard standard deviation
On 04-Dec-09 07:18 AM, yogesh karpate wrote: @ Pauli and @ Colin: Sorry for the late reply. I was busy with some other assignments. # As far as normalization by (n) is concerned, the common assumption is that the population is normally distributed and the population size is large enough to fit the normal distribution. But this standard deviation, when applied to a small population, tends to be too low and is therefore called biased. # The correction known as Bessel's correction is there for the standard deviation of small samples, i.e. normalization by (n-1). # In Electrical and Electronic Measurements and Instrumentation by A.K. Sawhney, the first chapter, Fundamentals of Measurements, shows that for N=16 the standard deviation normalization was (n-1)=15. # While I was learning statistics, my course instructor would advise taking n=20 for normalization by (n-1). # Probability and Statistics in the Schaum's series is good reading. Regards ~ymk Yogesh, Thanks for the Bessel name, I hadn't come across that before. The Wikipedia reference for the Bessel Correction uses a divisor of n-1: http://en.wikipedia.org/wiki/Bessel%27s_correction Perhaps the simplification for larger n comes from the fact that, for large n, 1/n is approximately equal to 1/(n-1). I would suggest C. E. Weatherburn - Mathematical Statistics, but I doubt whether it is still widely available. Colin W.
Re: [Numpy-discussion] non-standard standard deviation
Yogesh, Could you explain the rationale for this choice please? Colin W. On 03-Dec-09 00:35 AM, yogesh karpate wrote: The thing is that the normalization by (n-1) is done for a number of samples of 20 or 23 (I am not sure about this number, but I am sure that it isn't greater than 25), and below that we use normalization by n. Regards ~ymk
Re: [Numpy-discussion] non-standard standard deviation
On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote: All of the statistical packages that I am currently using and have used in the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the sqrt(1/(n-1)) normalization, which gives a result that is unbiased when sampling from a normally-distributed population. NumPy uses the sqrt(1/n) normalization. I'm currently using the following code to calculate standard deviations, but would much prefer if this could be fixed in NumPy itself:

    def mystd(x=numpy.array([]), axis=None):
        """This function calculates the standard deviation of the input using
        the definition of standard deviation that gives an unbiased result for
        samples from a normally-distributed population."""
        xd= x - x.mean(axis=axis)
        return sqrt( (xd*xd).sum(axis=axis) / (numpy.size(x,axis=axis)-1.0) )

Anne Archibald has suggested a work-around. Perhaps ddof could be set, by default, to 1, as other values are rarely required. Where the distribution of a variate is not known a priori, then I believe that it can be shown that the n-1 divisor provides the best estimate of the variance. Colin W.
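For reference, the ddof argument discussed in this thread already gives the n-1 normalization without a custom function. The sketch below uses a small made-up data set and compares np.std with ddof=0 (the default, divisor n), ddof=1 (divisor n-1) and a hand-written n-1 calculation.

```python
import numpy as np

# Made-up sample: mean 5, sum of squared deviations 32.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

std_n = np.std(x)             # default ddof=0: divisor n
std_n1 = np.std(x, ddof=1)    # ddof=1: divisor n - 1 (Bessel's correction)

# The same n-1 result written out by hand.
manual = np.sqrt(((x - x.mean()) ** 2).sum() / (n - 1))

# The two normalizations differ by a factor sqrt(n / (n - 1)), which tends
# to 1 as n grows.
ratio = std_n1 / std_n

print(std_n, std_n1, ratio)
```

Here std_n is sqrt(32/8) = 2.0 and std_n1 is sqrt(32/7), so the corrected value is the larger of the two, as the thread describes for small samples.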
[Numpy-discussion] Resize method
Access by the interpreter prevents array resizing. Yes, one can use the function in place of the method, but this appears to require copying the whole array. If one sets b= a, then that reference can be deleted with del b. Is there any similar technique for the interpreter? Colin W.

    Python 2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from numpy import *
    >>> a= array(7*[3])
    >>> a.resize((3,7))
    >>> a
    array([[3, 3, 3, 3, 3, 3, 3],
           [0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0]])
    >>> a.resize((4,7))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: cannot resize an array that has been referenced or is referencing another array in this way. Use the resize function
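A sketch of two ways around the error shown above. In the interactive session, displaying a leaves an extra reference to the array behind (the interpreter's record of the last result), which is what trips the check. The ndarray.resize method accepts refcheck=False to skip that check; this is only safe when no other array or view shares the memory, which is an assumption the caller has to guarantee. The np.resize function always returns a new array, and it fills by repeating the data rather than padding with zeros.

```python
import numpy as np

a = np.array(7 * [3])

# In-place resize; new elements are zero-filled. refcheck=False skips the
# reference-count check, so it must only be used when nothing else refers
# to this array's memory.
a.resize((3, 7), refcheck=False)
a.resize((4, 7), refcheck=False)

# The function form copies, and repeats the input to fill the new shape.
b = np.resize(np.array(7 * [3]), (2, 7))

print(a)
print(b)
```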
Re: [Numpy-discussion] Resize method
Christopher Barker wrote: Colin J. Williams wrote: Access by the interpreter prevents array resizing. yup -- resize is really fragile for that reason. It really should be used quite sparingly. Personally, I think it should probably only be used when wrapped with a higher level layer. I've been working on an extendable array class, I call an accumulator (bad name...). The idea is that you can use it to accumulate values when you don't know how big it's going to end up, rather than using a list for this, which is the standard idiom. In [2]: import accumulator In [3]: a = accumulator.accumulator((1,2,3,4,)) In [4]: a Out[4]: accumulator([1, 2, 3, 4]) In [5]: a.append(5) In [6]: a Out[6]: accumulator([1, 2, 3, 4, 5]) In [8]: a.extend((6,7,8,9)) In [9]: a Out[9]: accumulator([1, 2, 3, 4, 5, 6, 7, 8, 9]) At the moment, it only support 1-d arrays, though I'd like to extend it to n-d, probably only allowing growing on the first axis. This has been discussed on this list a fair bit, with mixed reviews as to whether there is any point. It's slower than lists in common usage, but has other advantages -- I'd like to see a C version, but don't know if I'll ever have the time for that. I've enclosed to code for your viewing pleasure -Chris Thanks for this. My aim is to extract a row of data from a line in a file and append it to an array. The number of columns is fixed but, at the start, the number of rows is unknown. I think that I have sorted out the resize approach but I need more tests before I share it. Your accumulator idea is interesting. Back in 2004, I worked on MyMatrix, based on numarray - abandoned when numpy came onto the scene. One of the capabilities there was an /append/ method, intended to add a conforming matrix to the right or below the given matrix. 
It was probably not efficient but it provided a means of joining together block matrices. The append signature, from a January 2005 backup, is here:

    def append(self, other, toRight= False):
        '''
        Return self, with other appended, to the Right or Below,
        default: Below.
        other - a matrix, a list of matrices, or objects which can be
        converted into matrices.
        '''
        assert self.iscontiguous()
        assert self.rank == 2
        if isinstance(other, _n.NumArray):
            ...

Colin W.
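For the use case Colin describes (a fixed number of columns, an unknown number of rows read line by line), the list idiom Chris mentions can be sketched as follows. The input lines here are made-up stand-ins for the lines of a file, and this is not the accumulator class from the message.

```python
import numpy as np

# Hypothetical file contents: whitespace-separated numbers, 3 per line.
lines = ["1 2 3", "4 5 6", "7 8 9"]

rows = []
for line in lines:
    # Appending to a Python list is cheap; no array is resized here.
    rows.append([float(token) for token in line.split()])

# Convert once, at the end, when the number of rows is known.
data = np.array(rows)
print(data.shape)
# prints: (3, 3)
```

Building the array once at the end avoids the repeated copying (or fragile in-place resizing) that growing an array row by row would involve.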
Re: [Numpy-discussion] subclassing matrix
Basilisk96 wrote: On Jan 12, 1:36 am, Timothy Hochberg [EMAIL PROTECTED] wrote: I believe that you need to look at __array_finalize__ and __array_priority__ (and there may be one other thing as well, I can't remember; it's late). Search for __array_finalize__ and that will probably help get you started. Well sonovagun! I removed the hack. Then just by setting __array_priority__ = 20.0 in the class body, things are magically working almost as I expect. I say almost because of this custom method: def cross(self, other): Cross product of this vector and another vector return _N.cross(self, other, axis=0) That call to numpy.cross returns a numpy.ndarray. Unless I do return Vector(_N.cross(self, other, axis=0)), I get problems downstream. When is __array_finalize__ called? By adding some print traces, I can see it's called every time an array is modified in any way i.e., reshaped, transposed, etc., and also during operations like u+v, u-v, A*u. But it's not called during the call to numpy.cross. Why? Cheers, -Basilisk96 This may help. It is based on your initial script. The Vectors are considered as columns but presented as rows. This adds a complication which is not resolved. Colin W. #-- vector.py import numpy as _N import math as _M #default tolerance for equality tests TOL_EQ = 1e-6 #default format for pretty-printing Vector instances FMT_VECTOR_DEFAULT = %+.5f class Vector(_N.matrix): 2D/3D vector class that supports numpy matrix operations and more. Examples: u = Vector([1,2,3]) v = Vector('3 4 5') w = Vector([1, 2]) def __new__(cls, data=0. 0. 0., dtype=_N.float64): Subclass instance constructor. If data is not specified, a zero Vector is constructed. The constructor always returns a Vector instance. The instance gets a customizable Format attribute, which controls the printing precision. data= [1, 2, 3] ret= _N.matrix(data, dtype) ##ret = super(Vector, cls).__new__(cls, data, dtype=dtype) ###promote the instance to cls type. 
##ret.__class__ = cls assert ret.size in (2, 3), 'Vector must have either two or three components' if ret.shape[0] == 1: ret = ret.T assert ret.shape == (ret.shape[0], 1), 'could not express Vector as a Mx1 matrix' if ret.shape[0] == 2: ret = _N.vstack((ret, 0.)) ret.Format = FMT_VECTOR_DEFAULT ret= _N.ndarray.__new__(cls, ret.shape, dtype, buffer=ret.data) return ret def __str__(self): fmt = getattr(self, Format, FMT_VECTOR_DEFAULT) fmt = ', '.join([fmt]*3) return ''.join([(, fmt, )]) % tuple(self.T.tolist()[0]) def __repr__(self): fmt = ', '.join(['%s']*3) return ''.join([%s([, fmt, ])]) % tuple([self.__class__.__name__] + self.T.tolist()[0]) def __mul__(self, mult): ''' self * multiplicand ''' if isinstance(mult, _N.matrix): return _N.dot(self, mult) else: raise DataError, 'multiplicand must be a Vector or a matrix' def __rmul__(self, mult): ''' multiplier * self.__mul__ ''' if isinstance(mult, _N.matrix): return Vector(_N.dot(mult, self)) else: raise DataError, 'multiplier must be a Vector or a matrix' the remaining methods are Vector-specific math operations, including the X,Y,Z properties... if __name__ == '__main__': u = Vector('1 2 3') print str(u) print repr(u) A = _N.matrix('2 0 0; 0 2 0; 0 0 2') print A p = A * u print p print p.__class__ q= u.T * A try: print q except: print we don't allow for the display of row vectors print q.A, q.T print q.__class__ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
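As a companion to the discussion of __array_finalize__ and __array_priority__ above, here is a minimal sketch of a subclass that keeps its type and its custom attribute through arithmetic, and that re-wraps the result of numpy.cross explicitly. It is not the Vector class from the thread: it subclasses ndarray rather than matrix, stores plain 1-d data, and the attribute name fmt is made up for illustration.

```python
import numpy as np

DEFAULT_FMT = "%+.5f"


class Vector(np.ndarray):
    """Sketch of an ndarray subclass carrying a print-format attribute."""

    # Ask NumPy to prefer this class when mixing with plain arrays.
    __array_priority__ = 20.0

    def __new__(cls, data, fmt=DEFAULT_FMT):
        # View-casting avoids reassigning __class__ by hand.
        obj = np.asarray(data, dtype=np.float64).view(cls)
        obj.fmt = fmt
        return obj

    def __array_finalize__(self, obj):
        # Called for explicit construction, view-casting and new-from-template
        # (slices, ufunc results), so attributes are propagated here.
        if obj is None:
            return
        self.fmt = getattr(obj, "fmt", DEFAULT_FMT)

    def cross(self, other):
        # Some NumPy functions may return a base ndarray; viewing the result
        # as our class makes the return type explicit.
        return np.cross(self, other).view(type(self))


u = Vector([1, 0, 0], fmt="%.2f")
v = Vector([0, 1, 0])
w = u.cross(v)      # a Vector holding [0, 0, 1]
s = u + v           # ufunc results keep the subclass and the attribute
```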
Re: [Numpy-discussion] subclassing matrix
Basilisk96 wrote: Hello folks, In the course of a project that involved heavy use of geometry and linear algebra, I found it useful to create a Vector subclass of numpy.matrix (represented as a column vector in my case). Why not consider a matrix with a shape of (1, n) as a row vector and one with (n, 1) as a column vector? Then you can simply write A * u or u.T * A. Does this not meet the need? You could add methods isRowVector and isColumnVector to the Matrix class. Colin W. I'd like to hear comments about my use of this class promotion statement in __new__: ret.__class__ = cls It seems to me that it is hackish to just change an instance's class on the fly, so perhaps someone could clue me in on a better practice. Here is my reason for doing this: Many applications of this code involve operations between instances of numpy.matrix and instances of Vector, such as applying a linear- operator matrix on a vector. If I omit that class promotion statement, then the results of such operations cannot be instantiated as Vector types: from vector import Vector import numpy u = Vector('1 2 3') A = numpy.matrix('2 0 0; 0 2 0; 0 0 2') p = Vector(A * u) p.__class__ class 'numpy.core.defmatrix.matrix' This is undesirable because the calculation result loses the custom Vector methods and attributes that I want to use. However, if I use that class promotion statement, the p.__class__ lookup returns what I want: p.__class__ class 'vector.Vector' Is there a better way to achieve that? Here is the partial subclass code: #-- vector.py import numpy as _N import math as _M #default tolerance for equality tests TOL_EQ = 1e-6 #default format for pretty-printing Vector instances FMT_VECTOR_DEFAULT = %+.5f class Vector(_N.matrix): 2D/3D vector class that supports numpy matrix operations and more. Examples: u = Vector([1,2,3]) v = Vector('3 4 5') w = Vector([1, 2]) def __new__(cls, data=0. 0. 0., dtype=_N.float64): Subclass instance constructor. 
If data is not specified, a zero Vector is constructed. The constructor always returns a Vector instance. The instance gets a customizable Format attribute, which controls the printing precision. ret = super(Vector, cls).__new__(cls, data, dtype=dtype) #promote the instance to cls type. ret.__class__ = cls assert ret.size in (2, 3), 'Vector must have either two or three components' if ret.shape[0] == 1: ret = ret.T assert ret.shape == (ret.shape[0], 1), 'could not express Vector as a Mx1 matrix' if ret.shape[0] == 2: ret = _N.vstack((ret, 0.)) ret.Format = FMT_VECTOR_DEFAULT return ret def __str__(self): fmt = getattr(self, Format, FMT_VECTOR_DEFAULT) fmt = ', '.join([fmt]*3) return ''.join([(, fmt, )]) % (self.X, self.Y, self.Z) def __repr__(self): fmt = ', '.join(['%s']*3) return ''.join([%s([, fmt, ])]) % (self.__class__.__name__, self.X, self.Y, self.Z) the remaining methods are Vector-specific math operations, including the X,Y,Z properties... Cheers, -Basilisk96 ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] defmatrix.py
Charles R Harris wrote: On 3/26/07, *Travis Oliphant* [EMAIL PROTECTED] wrote: I think that might be the simplest thing, dot overrides subtypes. BTW, here is another ambiguity

    In [6]: dot(array([[1]]),ones(2))
    ---
    exceptions.ValueError    Traceback (most recent call last)
    /home/charris/<ipython console>
    ValueError: matrices are not aligned

Note that in this case dot acts like the rhs is always a column vector although it returns a 1-d vector. I don't know that this is a bad thing, but perhaps we should extend this behaviour to matrices, which would be different from the now current "1-d is always a *row* vector", i.e. The rule "1-d is always a *row* vector" only applies when converting to a matrix. In this case, the dot operator does not convert to a matrix but uses rules for operating with mixed 2-d and 1-d arrays inherited from Numeric. I'm very hesitant to change those rules. I wasn't suggesting that, just noticing that the rule was "1-d vector on right is treated as a column vector by dot", which is why an exception was raised in the posted case. If it is traditional for matrix routines to always treat it as a row vector, so be it. My recollection is that text books treat the column vector, represented by a lower case letter, bold or underlined, as the default. If b (dressed as described before) is a column vector, then b' represents a row vector. For numpy, it makes sense to consider b as a row vector, since the underlying array uses the C convention where each row is stored contiguously. Colin W. Chuck
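The behaviour of dot with mixed 2-d and 1-d arguments, as described in the exchange above, can be seen in a short sketch. The array values are made up; only the shapes matter.

```python
import numpy as np

m = np.ones((2, 3))

# A 1-d array on the right is contracted against the last axis of m,
# as if it were a column, but the result comes back 1-d.
right = np.dot(m, np.ones(3))      # shape (2,)

# A 1-d array on the left is contracted against the first axis of m,
# as if it were a row; again the result is 1-d.
left = np.dot(np.ones(2), m)       # shape (3,)

# The posted case: a (1, 1) array times a length-2 vector cannot be aligned.
try:
    np.dot(np.array([[1]]), np.ones(2))
    raised = False
except ValueError:
    raised = True

print(right.shape, left.shape, raised)
# prints: (2,) (3,) True
```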
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: On Mon, 26 Mar 2007, Colin J. Williams apparently wrote: One would expect the iteration over A to return row vectors, represented by (1, n) matrices. This is again simple assertion. **Why** would one expect this? Some people clearly do not. One person commented that this unexpected behavior was a source of error in their code. Another person commented that they did not even guess that such a thing would be possible. Experience with Python should lead to the ability to anticipate the outcome. Apparently this is not the case. That suggests a design problem. What about **Python** would lead us to expect this behavior?? In *contrast*, everyone agrees that for a matrix M, we should get a matrix from M[0,:]. This is expected and desirable.

Perhaps our difference lies in two things:

1. The text books typically take the column vector as the default. For a Python version, based on C, it makes more sense to treat the rows as vectors, as data is stored contiguously by row.

2. The convention has been proposed that the vector is more conveniently implemented as a matrix, where one dimension is one. The vector could be treated as a subclass of the matrix but this adds complexity with little clear benefit.

PyMatrix has matrix methods isVector, isCVector and isRVector. I can see some merit in conforming to text book usage and would be glad to consider changes when I complete the port to numpy, in a few months. Colin W.

Cheers, Alan Isaac
Re: [Numpy-discussion] matrix indexing question
Bill Baxter wrote: On 3/26/07, Colin J. Williams [EMAIL PROTECTED] wrote: Bill Baxter wrote: This may sound silly, but I really think seeing all those brackets is what makes it feel wrong. Matlab's output doesn't put it in your face that your 4 is really a matrix([[4]]), even though that's what it is to Matlab. But I don't see a good way to change that behavior. The other thing I find problematic about matrices is the inability to go higher than 2d. To me that means that it's impossible to go pure matrix in my code because I'll have to switch back to arrays any time I want more than 2d (or use a mixed solution like a list of matrices). Matlab allows allows 2D. --bb pure matrix seems to me an area of exploration, does it have any application in numerical computation at this time? I'm not sure what you thought I meant, but all I meant by going pure matrix was having my Numpy code use the 'matrix' type exclusively instead of some mix of 'matrix' and the base 'ndarray' type. It was a term I had not come across before but I assumed that you were referring to something like this link - beyond my comprehension. http://72.14.203.104/search?q=cache:Yu9gbUQEfWkJ:math.ca/Events/winter05/abs/pdf/ma-df.pdf+pure+matrixhl=enct=clnkcd=4gl=calr=lang_en Things become messy when you mix and match them because you don't know any more if an expression like A[1] is going to give you a 1-D thing or a 2-D thing, and you can't be sure what A * B will do without always coercing A and B. Yes, to my mind it's best to consider the multi-dimensional array and the matrix to be two distinct data types. In most cases, it's best that conversions between the two should be explicit. A list of matrices seems to be a logical structure. Yes, and it's the only option if you want to make a list of matrices of different shapes, but I frequently have a need for things like a list of per-point transformation matrices. Each column from each of those matrices can be thought of as a vector. 
Sometimes it's convenient to consider all the X basis vectors together, for instance, which is a simple and efficient M[:,:,0] slice if I have all the data in a 3-D array, but it's a slow list comprehension plus matrix constructor if I have the matrices in a list -- something like matrix([m[:,0] for m in M]) but that line is probably incorrect. Logically, this makes sense, where M is a list of matrices. My guess is that it would be a little faster to build one larger matrix and then slice it as needed. PyMatrix deals with lists in building a larger matrix from sub-matrices. Suppose that we have matrices A (3, 4), B (3, 6), C (4, 2) and D (4, 8). Then E= M([[A, B], [C, D]]) gives E (7, 10). Numpy generally tries to treat all lists and tuples as array literals. That's not likely to change. That need not be a problem if there is clarity of thinking about the essential difference between the matrix data type (even if it is built as a sub-type of the array) and the multi-dimensional array. --bb Colin W.
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: On 3/26/07, Alan G Isaac [EMAIL PROTECTED] wrote: finds itself in basic conflict with the idea that I ought to be able to iterate over the objects in an iterable container. I mean really, does this not feel wrong? :: for item in x: print item.__repr__() ... matrix([[1, 2]]) matrix([[3, 4]]) On Mon, 26 Mar 2007, Bill Baxter apparently wrote: So you're saying this is what you'd find more pythonic? X[1] matrix([2,3]) X[:,1] matrix([[3, 4]]) Just trying to make it clear what you're proposing. No; that is not possible, since a matrix is inherently 2d. I just want to get the constituent arrays when I iterate over the matrix object or use regular Python indexing, but a matrix when I use matrix/array indexing. That is :: X[1] array([2,3]) X[1,:] matrix([[3, 4]]) That behavior seems completely natural and unsurprising. Perhaps things would be clearer if we thought of the constituent groups of data in a matrix as being themselves matrices. X[1] could represent the second row of a matrix. A row of a matrix is a row vector, a special case of a matrix. To get an array, I suggest that an explicit conversion X[1].A is a clearer way to handle things. Similarly, X[2, 3] is best returned as a value which is of a Python type. Colin W. Probably about half the bugs I get from mixing and matching matrix and array are things like row = A[i] ... z = row[2] Which works for an array but not for a matrix. Exactly! That is the evidence of a bad surprise in the current behavior. Iterating over a Python iterable should provide access to the contained objects. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: Alan G Isaac wrote: So this :: x[1] matrix([[1, 0]]) feels wrong. (Similarly when iterating across rows.) Of course I realize that I can just :: x.A[1] array([1, 0]) On Sun, 25 Mar 2007, Colin J. Williams apparently wrote: An array and a matrix are different animals. Conversion from one to the other should be spelled out. But you are just begging the question here. The question is: when I iterate across matrix rows, why am I iterating across matrices and not arrays. This seems quite out of whack with general Python practice. You cannot just say conversion should be explicit because that assumes (incorrectly actually) that the rows are matrices. The "conversion should be explicit" argument actually cuts in the opposite direction of what you appear to believe. Alan, Yes, this is where we appear to differ. I believe that vectors are best represented as matrices, with a shape of (1, n) or (m, 1). The choice between these determines whether we have a row or a column vector. Thus any (m, n) matrix can be considered as either a collection of column vectors or a collection of row vectors. If the end result is required as an array or a list, this can be done explicitly with X[1].A or X[1].tolist(). Here, A is a property of the M (matrix) class. Cheers, Alan Isaac A long time ago, you proposed that PyMatrix should provide for matrix division in two ways, as is done in MatLab. This was implemented, but PyMatrix has not yet been ported to numpy - perhaps this summer. Regards, Colin W.
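The behaviour under discussion can be reproduced with a short sketch. It uses numpy.matrix as in the thread; note that this class is no longer recommended for new code and recent NumPy versions may emit a pending-deprecation warning when it is used.

```python
import numpy as np

x = np.matrix([[1, 0], [0, 1]])

# Indexing a matrix with a single integer gives a (1, n) matrix, not a 1-d row.
row_as_matrix = x[1]

# The explicit conversion discussed above: index the underlying array.
row_as_array = x.A[1]

# Iterating over a matrix likewise yields (1, n) matrices.
row_shapes = [row.shape for row in x]

print(row_as_matrix.shape, row_as_array.shape, row_shapes)
# prints: (1, 2) (2,) [(1, 2), (1, 2)]
```

This is why code such as row = A[i]; z = row[2], mentioned elsewhere in the thread, works for an array but not for a matrix: for a matrix, row is still 2-d, and row[2] asks for a third row that does not exist.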
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: On Mon, 26 Mar 2007, Colin J. Williams apparently wrote: Perhaps things would be clearer if we thought of the constituent groups of data in a matrix as being themselves matrices. This thinking of is what you have suggested before. You need to explain why it is not begging the question. Cheers, Alan Isaac Perhaps it would now help if you redefined the question. In an earlier posting, you appeared anxious that the matrix and the array behave in the same way. Since they are different animals, I see sameness of behaviour as being lower on the list of desirables than fitting the standard ideas of matrix algebra. Suppose that a is a row vector, b a column vector and A a conforming matrix then: a * A A * b and b.T * A are all acceptable operations. One would expect the iteration over A to return row vectors, represented by (1, n) matrices. Colin W. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Detect subclass of ndarray
Alan G Isaac wrote: On Sat, 24 Mar 2007, Charles R Harris apparently wrote: Yes, that is what I am thinking. Given that there are only the two possibilities, row or column, choose the only one that is compatible with the multiplying matrix. The result will not always be a column vector, for instance, mat([[1]])*ones(3) will be a 1x3 row vector. Ack! The simple rule `post multiply means its a column vector` would be horrible enough: A*ones(n)*B becomes utterly obscure. Now even that simple rule is to be violated?? It depends whether ones delivers an instance of the Matrix/vector class or a simple array. I assume that, in the above A and B represent matrices. Colin W. Down this path lies madness. Please, just raise an exception. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Detect subclass of ndarray
Colin J. Williams wrote: Alan G Isaac wrote: On Sat, 24 Mar 2007, Charles R Harris apparently wrote: Yes, that is what I am thinking. Given that there are only the two possibilities, row or column, choose the only one that is compatible with the multiplying matrix. The result will not always be a column vector, for instance, mat([[1]])*ones(3) will be a 1x3 row vector. Ack! The simple rule `post multiply means its a column vector` would be horrible enough: A*ones(n)*B becomes utterly obscure. Now even that simple rule is to be violated?? It depends whether ones delivers an instance of the Matrix/vector class or a simple array. I assume that, in the above A and B represent matrices. Colin W. Postscript: I hadn't read the later postings when I posted the above. PyMatrix used the convention mentioned in an earlier posting. Simply a vector is considered as a single row matrix or a single column matrix. This same approach can largely be used with numpy's mat: *** Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32. *** import numpy as _n _n.ones(3) array([ 1., 1., 1.]) a= _n.ones(3) a.T array([ 1., 1., 1.]) _n.mat(a) matrix([[ 1., 1., 1.]]) _n.mat(a).T matrix([[ 1.], [ 1.], [ 1.]]) b= _n.mat(a).T a * b matrix([[ 3.]]) # Something has gone wrong here - it looks as though there is normalization under the counter. In any event, the problem posed by Alan Isaac can be handled with this approach: A * mat(ones(3)).t * B can produce the desired result. I haven't tested it. Colin W. Down this path lies madness. Please, just raise an exception. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
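The matrix([[ 3.]]) result in the session above is most likely not normalization: a row of three ones times a column of three ones is the inner product 1 + 1 + 1. The sketch below makes both operands explicit matrices (an assumption made here to keep the shapes unambiguous; the original session multiplied a plain array by a matrix) and again assumes numpy.matrix is available.

```python
import numpy as np

a = np.ones(3)            # 1-D array; a.T is still 1-D
row = np.asmatrix(a)      # shape (1, 3)
col = row.T               # shape (3, 1)

# (1, 3) * (3, 1) -> (1, 1): the inner product of the two vectors.
product = row * col

print(a.T.shape, row.shape, col.shape, product)
```

Transposing the 1-D array changes nothing, which is why the conversion to a matrix has to happen before the .T.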
Re: [Numpy-discussion] Simple multi-arg wrapper for dot()
Bill Baxter wrote: On 3/25/07, Robert Kern [EMAIL PROTECTED] wrote: Bill Baxter wrote: I don't know. Given our previous history with convenience functions with different calling semantics (anyone remember rand()?), I think it probably will confuse some people. I'd really like to see it on a cookbook page, though. I'd use it. Done. http://www.scipy.org/Cookbook/MultiDot --bb I wasn't able to connect to this link but I gather that the proposal was to use dot(A, B, C) to represent the product of the 3 arrays. If A, B and C were matrices then this could more clearly be written as A * B * C Colin W. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
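A multi-argument dot can be sketched as a left-to-right fold over np.dot. The helper name mdot is hypothetical and this is not the cookbook code, which is not reproduced in the thread; later NumPy releases also provide numpy.linalg.multi_dot, which takes a list of arrays.

```python
from functools import reduce

import numpy as np


def mdot(*arrays):
    """Hypothetical helper: chain np.dot from left to right."""
    return reduce(np.dot, arrays)


# Illustrative operands with conforming shapes.
A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = np.arange(8.0).reshape(4, 2)

chained = mdot(A, B, C)                     # same as dot(dot(A, B), C)
library = np.linalg.multi_dot([A, B, C])    # picks its own evaluation order
print(chained.shape)
```

For plain arrays this gives the same product that A * B * C would give for matrices, without changing what * means.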
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: One thing keeps bugging me when I use numpy.matrix. All this is fine:: x=N.mat('1 1;1 0') x matrix([[1, 1], [1, 0]]) x[1,:] matrix([[1, 0]]) But it seems to me that I should be able to extract a matrix row as an array. This can easily be done: *** Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32. *** import numpy as _n A= _n.mat([[1, 2], [3, 4]]) A[1] matrix([[3, 4]]) A[1].getA1() array([3, 4]) An array and a matrix are different animals. Conversion from one to the other should be spelled out. As you have done below. Colin W. So this :: x[1] matrix([[1, 0]]) feels wrong. (Similarly when iterating across rows.) Of course I realize that I can just :: x.A[1] array([1, 0]) but since the above keeps feeling wrong I felt I should raise this as a possible design issue, better discussed early than later. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
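The two explicit conversions mentioned in this exchange can be put side by side. A small sketch, assuming numpy.matrix is still available in the installed NumPy:

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])

row_matrix = A[1]         # indexing a matrix keeps it 2-D: shape (1, 2)
row_flat = A[1].A1        # flatten the row to a 1-D ndarray (same as getA1())
row_from_array = A.A[1]   # convert to ndarray first, then index

print(row_matrix.shape, row_flat, row_from_array)
```

Both routes end in a plain array([3, 4]); they differ only in whether the conversion happens before or after the indexing step.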
Re: [Numpy-discussion] matrix indexing question
Alan G Isaac wrote: Em Dom, 2007-03-25 às 13:07 -0400, Alan G Isaac escreveu: x[1] matrix([[1, 0]]) feels wrong. (Similarly when iterating across rows.) On Sun, 25 Mar 2007, Paulo Jose da Silva e Silva apparently wrote: I think the point here is that if you are using matrices, then all you should want are matrices, just like in MATLAB: b = A(1, :) b = 1 2 Yes, that is the idea behind this, which I am also accustomed to from GAUSS. But note again that the Matlab equivalent :: x=N.mat('1 2;3 4') x[0,:] matrix([[1, 2]]) does provide this behavior. The question I am raising is a design question and is I think really not addressed by the rule of thumb you offer. Specifically, that rule of thumb if it is indeed the justification of :: x[1] matrix([[3, 4]]) finds itself in basic conflict with the idea that I ought to be able to iterate over the objects in an iterable container. I mean really, does this not feel wrong? :: for item in x: print item.__repr__() ... matrix([[1, 2]]) matrix([[3, 4]]) Cheers, Alan Isaac Perhaps this would be clearer with: for rowVector in x: print rowVector.__repr__() ... matrix([[1, 2]]) matrix([[3, 4]]) Colin W. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
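The disagreement is about what iteration yields. The sketch below (assuming numpy.matrix is available) compares iterating over a matrix with iterating over the same data viewed as a plain array:

```python
import numpy as np

x = np.matrix('1 2; 3 4')

# Iterating over the matrix yields (1, n) matrices, one per row.
matrix_rows = [item for item in x]

# Iterating over an ndarray view of the same data yields 1-D arrays.
array_rows = [item for item in np.asarray(x)]

for m_row, a_row in zip(matrix_rows, array_rows):
    print(repr(m_row), repr(a_row))
```

The first form is the behaviour Alan finds surprising and Colin finds natural; the second is the explicit conversion both agree is available.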
Re: [Numpy-discussion] matrix indexing question
Bill Baxter wrote: On 3/26/07, Alan G Isaac [EMAIL PROTECTED] wrote: Em Dom, 2007-03-25 às 13:07 -0400, Alan G Isaac escreveu: x[1] matrix([[1, 0]]) feels wrong. (Similarly when iterating across rows.) On Sun, 25 Mar 2007, Paulo Jose da Silva e Silva apparently wrote: I think the point here is that if you are using matrices, then all you should want are matrices, just like in MATLAB: b = A(1, :) b = 1 2 Yes, that is the idea behind this, which I am also accustomed to from GAUSS. But note again that the Matlab equivalent :: x=N.mat('1 2;3 4') x[0,:] matrix([[1, 2]]) does provide this behavior. The question I am raising is a design question and is I think really not addressed by the rule of thumb you offer. Specifically, that rule of thumb if it is indeed the justification of :: x[1] matrix([[3, 4]]) finds itself in basic conflict with the idea that I ought to be able to iterate over the objects in an iterable container. I mean really, does this not feel wrong? :: for item in x: print item.__repr__() ... matrix([[1, 2]]) matrix([[3, 4]]) This may sound silly, but I really think seeing all those brackets is what makes it feel wrong. Matlab's output doesn't put it in your face that your 4 is really a matrix([[4]]), even though that's what it is to Matlab. But I don't see a good way to change that behavior. The other thing I find problematic about matrices is the inability to go higher than 2d. To me that means that it's impossible to go pure matrix in my code because I'll have to switch back to arrays any time I want more than 2d (or use a mixed solution like a list of matrices). Matlab allows more than 2D. --bb pure matrix seems to me an area of exploration, does it have any application in numerical computation at this time? A list of matrices seems to be a logical structure. PyMatrix deals with lists in building a larger matrix from sub-matrices. Suppose that we have matrices A (3, 4), B (3, 6), C (4, 2) and D (4, 8). Then E= M([[A, B], [C, D]]) gives E (7, 10).
Colin W. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
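The block construction E = M([[A, B], [C, D]]) has a close analogue in NumPy itself. This sketch uses numpy.block on plain arrays rather than PyMatrix's M class (which is not shown in the thread); the fill values are arbitrary and only the shapes come from the message.

```python
import numpy as np

# Shapes from the message; the contents are placeholder values.
A = np.zeros((3, 4))
B = np.ones((3, 6))
C = np.full((4, 2), 2.0)
D = np.full((4, 8), 3.0)

# Each inner list is joined side by side, then the rows are stacked.
E = np.block([[A, B], [C, D]])
print(E.shape)
```

The blocks only need matching heights within a block row (3 and 3, 4 and 4) and matching total widths across block rows (4 + 6 and 2 + 8).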
Re: [Numpy-discussion] Detect subclass of ndarray
Charles R Harris wrote: On 3/24/07, Alan G Isaac [EMAIL PROTECTED] wrote: On Fri, 23 Mar 2007, Charles R Harris apparently wrote: the following gives the wrong result: In [15]: I = matrix(eye(2)) In [16]: I*ones(2) Out[16]: matrix([[ 1., 1.]]) where the output should be a column vector. Why should this output a column? I would prefer an exception. Add the axis if you want it: I*ones(2)[:,None] works fine. Because it is mathematically correct. You can't multiply a vector by a 2x2 matrix and get a 1x2 matrix as the result. Sure, there are workarounds, but if matrix multiplication is going to work when mixed with arrays, it should work correctly. Chuck It depends on the convention you use when working with matrices. Suppose you adopt the notion that, for matrices, a vector is always represented by a matrix. Thus a row vector would have the shape (1, n) and the column vector would have (n, 1). If A were a (3, 4) matrix and b were a 4 element column vector, then the product of A by b, using matrix arithmetic, would give a 3 element column vector. Colin W. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
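Colin's (3, 4) times (4, 1) example and Alan's "add the axis if you want it" can both be illustrated with plain arrays. The sketch deliberately uses ndarray and the @ operator instead of numpy.matrix, since how matrix * 1-D array behaves has depended on the NumPy version; the values are illustrative.

```python
import numpy as np

A = np.arange(12.0).reshape(3, 4)   # a (3, 4) array
b_col = np.ones((4, 1))             # explicit column vector, shape (4, 1)
b_flat = np.ones(4)                 # 1-D array with no row/column notion

col_result = A @ b_col              # (3, 4) @ (4, 1) -> (3, 1)
flat_result = A @ b_flat            # (3, 4) @ (4,)   -> (3,)
lifted = A @ b_flat[:, None]        # add the axis explicitly -> (3, 1)

print(col_result.shape, flat_result.shape, lifted.shape)
```

Spelling out the second axis removes the guesswork about whether a 1-D operand is meant as a row or a column.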
Re: [Numpy-discussion] Latest Array-Interface PEP
Travis Oliphant wrote: I'm attaching my latest extended buffer-protocol PEP that is trying to get the array interface into Python. Basically, it is a translation of the numpy header files into something as simple as possible that can still be used to describe a complicated block of memory to another user. My purpose is to get feedback and criticisms from this community before display before the larger Python community. -Travis It would help me to understand the proposal if it could be explained in terms of the methods of the existing buffer class/type: ['__add__', '__class__', '__cmp__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__getattribute__', '__getitem__', '__getslice__', '__hash__', '__init__', '__len__', '__mul__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__'] Numpy extends numarray's type/dtype object. This proposal appears to revert to the old letter codes. I have had very limited experience with C. Colin W. PEP: unassigned Title: Extending the buffer protocol to include the array interface Version: $Revision: $ Last-Modified: $Date: $ Author: Travis Oliphant [EMAIL PROTECTED] Status: Draft Type: Standards Track Created: 28-Aug-2006 Python-Version: 2.6 Abstract This PEP proposes extending the tp_as_buffer structure to include function pointers that incorporate information about the intended shape and data-format of the provided buffer. In essence this will place an array interface directly into Python. Rationale Several extensions to Python utilize the buffer protocol to share the location of a data-buffer that is really an N-dimensional array. However, there is no standard way to exchange the additional N-dimensional array information so that the data-buffer is interpreted correctly. The NumPy project introduced an array interface (http://numpy.scipy.org/array_interface.shtml) through a set of attributes on the object itself. 
While this approach works, it requires attribute lookups which can be expensive when sharing many small arrays. One of the key reasons that users often request to place something like NumPy into the standard library is so that it can be used as standard for other packages that deal with arrays. This PEP provides a mechanism for extending the buffer protocol (which already allows data sharing) to add the additional information needed to understand the data. This should be of benefit to all third-party modules that want to share memory through the buffer protocol such as GUI toolkits, PIL, PyGame, CVXOPT, PyVoxel, PyMedia, audio libraries, video libraries etc. Proposal Add bf_getarrview and bf_relarrview function pointers to the buffer protocol to allow objects to share a view on a memory pointer including information about accessing it as an N-dimensional array. Add the TP_HAS_ARRAY_BUFFER flag to types that define this extended buffer protocol. Also a few additional C-API calls should perhaps be added to Python to facilitate creating new PyArrViewObjects. Specification: static PyObject* bf_getarrayview (PyObject *obj) This function must return a new reference to a PyArrViewObject which contains the details of the array information exposed by the object. If failure occurs, then NULL is returned and an exception set. static int bf_relarrayview(PyObject *obj) If not NULL then this will be called when the object returned by bf_getarrview is destroyed so that the underlying object can be aware when acquired views are released. The object that defines bf_getarrview should not re-allocate memory (re-size itself) while views are extant. A 0 is returned on success and a -1 and an error condition set on failure. 
The PyArrayViewObject has the structure typedef struct { PyObject_HEAD void *data; /* pointer to the beginning of data */ int nd; /* the number of dimensions */ Py_ssize_t *shape; /* c-array of size nd giving shape */ Py_ssize_t *strides;/* SEE BELOW */ PyObject *base; /* the object this is a view of */ PyObject *format; /* SEE BELOW */ int flags; /* SEE BELOW */ } PyArrayViewObject; strides -- a c-array of size nd providing the striding information which is the number of bytes to skip to get to the next element in that dimension. format -- a Python data-format object (PyDataFormatObject) which contains information about how each item in the array should be interpreted.
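The meaning of the strides field can be illustrated from Python. This is a sketch using NumPy's own strides attribute and Python 3's memoryview, which expose similar shape and stride information; it is not the C API proposed in the draft, and the helper byte_offset is a hypothetical name introduced here.

```python
import numpy as np

a = np.arange(12, dtype=np.int32).reshape(3, 4)   # C-contiguous 3x4 block
view = memoryview(a)                              # buffer-level view of a


def byte_offset(index, strides):
    """Hypothetical helper: bytes from the start of data to element `index`."""
    return sum(i * s for i, s in zip(index, strides))


# One step along axis 0 skips a whole row (4 items * 4 bytes);
# one step along axis 1 skips a single 4-byte item.
offset = byte_offset((2, 1), a.strides)

# Read the element back from the raw bytes at that offset.
raw = a.tobytes()
value = np.frombuffer(raw, dtype=np.int32, count=1, offset=offset)[0]

print(a.strides, view.shape, offset, value)
```

A consumer that knows only the data pointer, nd, shape, strides and the item format can locate any element this way, which is the information the structure above is meant to carry.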
Re: [Numpy-discussion] sum of two arrays with different shape?
zhang yunfeng wrote: Hi, I'm a newbie to NumPy. When reading the tutorial at http://www.scipy.org/Tentative_NumPy_Tutorial, I found a snippet about the addition of two arrays with different shapes. Does it make sense? If the array shapes are not the same, why doesn't it throw an error? I'm not sure what the rules are but this example throws an error, which it should. [Dbg] x= N.array([1, 2, 3]) [Dbg] y= x+[1, 2, 3, 4] Traceback (most recent call last): File "<interactive input>", line 1, in <module> ValueError: shape mismatch: objects cannot be broadcast to a single shape [Dbg] see the code below (taken from the above webpage) array a.shape is (4,) and y.shape is (3,4) and a+y ? --- y = arange(12) y array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) y.shape = 3,4 # does not modify the total number of elements y array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) It is possible to operate with arrays of different dimensions as long as they fit well. 3*a# multiply each element of a by 3 array([ 30, 60, 90, 120]) a+y# sum a to each row of y array([[10, 21, 32, 43], [14, 25, 36, 47], [18, 29, 40, 51]]) This seems a reasonable operation. Colin W. -- http://my.opera.com/zhangyunfeng ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
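Both cases in this message follow from NumPy's broadcasting rule: shapes are compared from the last axis backwards, and each pair of sizes must be equal or one of them must be 1. A short sketch, with a taken to be array([10, 20, 30, 40]) as the tutorial output implies:

```python
import numpy as np

a = np.array([10, 20, 30, 40])     # shape (4,)
y = np.arange(12).reshape(3, 4)    # shape (3, 4)

# (4,) against (3, 4): trailing sizes match, so a is added to every row.
total = a + y

# (3,) against (4,): trailing sizes differ and neither is 1, so this fails.
x = np.array([1, 2, 3])
try:
    x + np.array([1, 2, 3, 4])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(total)
print(mismatch_raised)
```

So a + y is well defined because a lines up with the last axis of y, while the (3,) and (4,) pair has no compatible alignment.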