Linear regression in NumPy
I'm a little bit stuck with NumPy here, and neither the docs nor trial & error seems to lead me anywhere: I've got a set of data points (x/y-coordinates) and want to fit a straight line through them, using LMSE linear regression. Simple enough. I thought instead of looking up the formulas I'd just see if there isn't a NumPy function that does exactly this. What I found was "linear_least_squares", but I can't figure out what kind of parameters it expects: I tried passing it my array of X-coordinates and the array of Y-coordinates, but it complains that the first parameter should be two-dimensional. But well, my data is 1d. I guess I could pack the X/Y coordinates into one 2d array, but then, what do I do with the second parameter?

More generally: Is there any kind of documentation that tells me what the functions in NumPy do, what parameters they expect, how to call them, etc.? All I found was:

"This function returns the least-squares solution of an overdetermined system of linear equations. An optional third argument indicates the cutoff for the range of singular values (defaults to 10^-10). There are four return values: the least-squares solution itself, the sum of the squared residuals (i.e. the quantity minimized by the solution), the rank of the matrix a, and the singular values of a in descending order."

It doesn't even mention what the parameters "a" and "b" are for...

--
http://mail.python.org/mailman/listinfo/python-list
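[Editor's note: for readers who just want the arithmetic the poster is describing, the least-squares line y = m*x + b has a well-known closed-form solution. The sketch below is plain Python, independent of any NumPy version; `fit_line` is a hypothetical helper name, not a library function.]

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / float(n * sxx - sx * sx)  # slope
    b = (sy - m * sx) / float(n)                        # intercept
    return m, b

print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```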
Linear regression in NumPy
Hello, Guys,

I have a question about linear_least_squares in Numpy: my linear_least_squares cannot give me the results. I use Numpy 1.0, the newest version. So I checked online and got some examples from you guys. I did like this:

[EMAIL PROTECTED] 77] ~ >> py
Python 2.4.3 (#1, May 18 2006, 07:40:45)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Numeric import *
>>> from LinearAlgebra import linear_least_squares
>>> from Matrix import *
>>> y = Matrix([[1], [2], [4]])
>>> x = Matrix([[1, 1], [1, 2], [1, 3]])
>>> print y
Matrix([[1],
        [2],
        [4]])
>>> x
Matrix([[1, 1],
        [1, 2],
        [1, 3]])
>>> print linear_least_squares(x, y)

Here my Numpy stops and never gives me a result. Is this a problem with Numpy, or is something wrong on my end? Can you guys give me a LinearAlgebra.py so that I can have a try?

Thanks,

John
Re: Linear regression in NumPy
nikie wrote:
> I'm a little bit stuck with NumPy here, and neither the docs nor
> trial & error seems to lead me anywhere: I've got a set of data points
> (x/y-coordinates) and want to fit a straight line through them, using
> LMSE linear regression. Simple enough. I thought instead of looking up
> the formulas I'd just see if there isn't a NumPy function that does
> exactly this. What I found was "linear_least_squares", but I can't
> figure out what kind of parameters it expects: I tried passing it my
> array of X-coordinates and the array of Y-coordinates, but it complains
> that the first parameter should be two-dimensional. But well, my data
> is 1d. I guess I could pack the X/Y coordinates into one 2d array, but
> then, what do I do with the second parameter?
>
> More generally: Is there any kind of documentation that tells me what
> the functions in NumPy do, and what parameters they expect, how to
> call them, etc.?
[...]
> It doesn't even mention what the parameters "a" and "b" are for...

Look at the docstring. (Note: I am using the current version of numpy from SVN; you may be using an older version of Numeric. http://numeric.scipy.org/)

In [171]: numpy.linalg.lstsq?
Type:        function
Base Class:
String Form:
Namespace:   Interactive
File:        /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-0.9.6.2148-py2.4-macosx-10.4-ppc.egg/numpy/linalg/linalg.py
Definition:  numpy.linalg.lstsq(a, b, rcond=1e-10)
Docstring:
    returns x,resids,rank,s
    where x minimizes 2-norm(|b - Ax|)
          resids is the sum square residuals
          rank is the rank of A
          s is the singular values of A in descending order

    If b is a matrix then x is also a matrix with corresponding columns.
    If the rank of A is less than the number of columns of A or greater
    than the number of rows, then residuals will be returned as an empty
    array; otherwise resids = sum((b-dot(A,x))**2).
    Singular values less than s[0]*rcond are treated as zero.

--
Robert Kern
[EMAIL PROTECTED]

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
Re: Linear regression in NumPy
nikie wrote:
> I'm a little bit stuck with NumPy here, and neither the docs nor
> trial & error seems to lead me anywhere: I've got a set of data points
> (x/y-coordinates) and want to fit a straight line through them, using
> LMSE linear regression.
[...]
> But well, my data is 1d. I guess I could pack the X/Y coordinates into
> one 2d array, but then, what do I do with the second parameter?

Well, it works for me:

x = Matrix([[1, 1], [1, 2], [1, 3]])
y = Matrix([[1], [2], [4]])
print linear_least_squares(x, y)

Make sure the dimensions are right: X should be n*k, and Y should (unless you know what you are doing) be n*1, so the first dimensions must be equal. If you wanted to write:

y = Matrix([1, 2, 4])

it won't work, because that has dimensions 1*3. You would have to transpose it:

y = transpose(Matrix([1, 2, 4]))

Hope this helps.
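[Editor's note: to make the shape requirement above concrete, here is the same example against current numpy, a sketch assuming a modern install where `numpy.linalg.lstsq` replaces `linear_least_squares` and plain arrays replace the old `Matrix` class.]

```python
import numpy as np

# Design matrix: n rows (observations) by k columns (one per coefficient).
x = np.array([[1., 1.],
              [1., 2.],
              [1., 3.]])    # shape (3, 2): column of ones, then the data
y = np.array([1., 2., 4.])  # shape (3,): a 1-D right-hand side is accepted

coef, resids, rank, sv = np.linalg.lstsq(x, y, rcond=None)
print(coef)  # intercept then slope: approximately [-0.667, 1.5]
```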
Re: Linear regression in NumPy
I still don't get it... My data looks like this:

x = [0, 1, 2, 3]
y = [1, 3, 5, 7]

The expected output would be something like (2, 1), as y[i] = x[i]*2 + 1.

(An image sometimes says more than 1000 words, so to make myself clear, this is what I want to do: http://www.statistics4u.info/fundstat_eng/cc_regression.html)

So, how am I to fill these matrices? (As a matter of fact, I already wrote the whole thing in Python in about 9 lines of code, but I'm pretty sure this should have been possible using NumPy.)
Re: Linear regression in NumPy
nikie wrote:
> I still don't get it... My data looks like this:
>   x = [0, 1, 2, 3]
>   y = [1, 3, 5, 7]
> The expected output would be something like (2, 1), as y[i] = x[i]*2+1
[...]
> So, how am I to fill these matrices?

As the docstring says, the problem it solves is min ||A*x - b||_2. In order to get it to solve your problem, you need to cast it into this matrix form. This is out of scope for the docstring, but most introductory statistics or linear algebra texts will cover this.

In [201]: x = array([0., 1, 2, 3])

In [202]: y = array([1., 3, 5, 7])

In [203]: A = ones((len(y), 2), dtype=float)

In [204]: A[:,0] = x

In [205]: from numpy import linalg

In [206]: linalg.lstsq(A, y)
Out[206]:
(array([ 2.,  1.]),
 array([  1.64987674e-30]),
 2,
 array([ 4.10003045,  1.09075677]))

--
Robert Kern
[EMAIL PROTECTED]
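[Editor's note: the session above, condensed into a self-contained script for current numpy; note that recent releases want an explicit `rcond` argument.]

```python
import numpy as np

x = np.array([0., 1., 2., 3.])
y = np.array([1., 3., 5., 7.])

# Build the design matrix [x, 1] so that A @ [m, b] approximates y.
A = np.ones((len(y), 2))
A[:, 0] = x

(m, b), resids, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(m, b)  # 2.0 1.0, up to floating-point noise
```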
Re: Linear regression in NumPy
Robert Kern wrote:
> nikie wrote:
>> I still don't get it... My data looks like this:
>>   x = [0, 1, 2, 3]
>>   y = [1, 3, 5, 7]
>> The expected output would be something like (2, 1), as y[i] = x[i]*2+1
[...]
> As the docstring says, the problem it solves is min ||A*x - b||_2. In
> order to get it to solve your problem, you need to cast it into this
> matrix form.
[...]
> In [206]: linalg.lstsq(A, y)
> Out[206]:
> (array([ 2.,  1.]),
>  array([  1.64987674e-30]),
>  2,
>  array([ 4.10003045,  1.09075677]))

I'm new to numpy myself. The above posters are correct to say that the problem must be cast into matrix form. However, as this is such a common technique, don't most math/stats packages do it behind the scenes? For example, in Matlab or Octave I could type:

polyfit(x, y, 1)

and I'd get the answer with shorter, more readable code. A one-liner! Is there a 'canned' routine to do it in numpy?

btw, I am not advocating that one should not understand the concepts behind a 'canned' routine. If you do not understand this concept, you should take the advice above and dive into a linear algebra book. It's not very difficult, and it is essential that a scientific programmer understand it.

-Matt
Re: Linear regression in NumPy
Matt Crema wrote:
> Robert Kern wrote:
>> nikie wrote:
>>> I still don't get it... My data looks like this:
>>>   x = [0, 1, 2, 3]
>>>   y = [1, 3, 5, 7]
[...]
> For example, in Matlab or Octave I could type:
>   polyfit(x, y, 1)
> and I'd get the answer with shorter, more readable code. A one-liner!
> Is there a 'canned' routine to do it in numpy?
[...]
> -Matt

Hi again,

I guess I should have looked first ;)

m, b = numpy.polyfit(x, y, 1)

-Matt
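[Editor's note: Matt's one-liner, spelled out as a runnable snippet. `polyfit` returns coefficients from highest degree down, so a degree-1 fit yields the slope first, then the intercept.]

```python
import numpy as np

x = [0, 1, 2, 3]
y = [1, 3, 5, 7]

m, b = np.polyfit(x, y, 1)  # degree-1 (straight-line) fit
print(m, b)  # approximately 2.0 and 1.0
```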
Re: Linear regression in NumPy
Thank you! THAT's what I've been looking for from the start!
Re: Linear regression in NumPy
Hello,

I'm glad that helped, but let's not terminate this discussion just yet. I am also interested in answers to your second question:

nikie wrote:
> More generally: Is there any kind of documentation that tells me what
> the functions in NumPy do, and what parameters they expect, how to
> call them, etc.?

As I said, I'm also new to numpy (only been using it for a week), but my first impression is that the built-in documentation is seriously lacking. For example, the Mathworks docs absolutely crush numpy's. I mean this constructively, and not as a shot at numpy.

The earlier reply gave an excellent answer, but I differ with the one point that the docstring shown by "numpy.linalg.lstsq?" contains an obvious answer to the question. Good documentation should be written in much simpler terms, and examples of the function's use should be included.

I wonder if anyone can impart some strategies for quickly solving problems like "How do I do a linear fit in numpy?" if, for example, I don't know which command to use. In Matlab, I would have typed:

lookfor fit

It would have returned 'polyval'. Then:

help polyval

and this problem would have been solved in under 5 minutes.

To sum up a wordy post: "What do experienced users find is the most efficient way to navigate the numpy docs (assuming one has already read the FAQs and tutorials)?"

Thanks.

-Matt
Re: Linear regression in NumPy
Matt Crema wrote:
> To sum up a wordy post: "What do experienced users find is the most
> efficient way to navigate the numpy docs (assuming one has already
> read the FAQs and tutorials)?"

You're not likely to get much of an answer here, but if you ask on [EMAIL PROTECTED], you'll get plenty of discussion.

--
Robert Kern
[EMAIL PROTECTED]
Re: Linear regression in NumPy
Matt Crema wrote:
>> More generally: Is there any kind of documentation that tells me what
>> the functions in NumPy do, and what parameters they expect, how to
>> call them, etc.?

This is a good start too: http://www.tramy.us/guidetoscipy.html

Yes, you have to pay for it, but the money goes to the guy who has done a MASSIVE amount of work to get the new numpy out. I would like to see a "Mastering Numpy" book, much like the excellent "Mastering Matlab", but someone needs to write it!

-Chris
Re: Linear regression in NumPy
Although I think it's worth reading, it only covers the fundamental structure of NumPy (what arrays are, what ufuncs are, etc.). Neither of the functions discussed in this thread (polyfit/linear_least_squares) is mentioned in the file.
Re: Linear regression in NumPy
nikie wrote:
> Although I think it's worth reading, it only covers the fundamental
> structure of NumPy (what arrays are, what ufuncs are, etc.). Neither of
> the functions discussed in this thread (polyfit/linear_least_squares)
> is mentioned in the file.

Both functions are described in the full book. Were you just looking at the sample chapter?

--
Robert Kern
[EMAIL PROTECTED]
Re: Linear regression in NumPy
Robert Kern wrote:
> Both functions are described in the full book. Were you just looking at
> the sample chapter?

No, I got the full PDF by mail a few days ago: "numpybook.pdf", 261 pages (I hope we're talking about the same thing). I entered "linear_least_squares" and "polyfit" in Acrobat's "find text" box, but neither one could be found.
Re: Linear regression in NumPy
nikie wrote:
> No, I got the full PDF by mail a few days ago: "numpybook.pdf", 261
> pages (I hope we're talking about the same thing). I entered
> "linear_least_squares" and "polyfit" in Acrobat's "find text" box, but
> neither one could be found.

The version I have in front of me is a bit shorter, 252 pages, but describes polyfit in section 5.3 on page 91 along with the other polynomial functions. lstsq (linear_least_squares is a backwards-compatibility alias that was recently moved to numpy.linalg.old) is described in section 10.1 on page 149.

--
Robert Kern
[EMAIL PROTECTED]
Re: Linear regression in NumPy
> The version I have in front of me is a bit shorter, 252 pages, but
> describes polyfit in section 5.3 on page 91 along with the other
> polynomial functions. lstsq (linear_least_squares is a
> backwards-compatibility alias that was recently moved to
> numpy.linalg.old) is described in section 10.1 on page 149.

Oops, sorry, I shouldn't have posted before reading the whole document... You are right, of course; both functions are explained. I wonder why Acrobat's search function doesn't work, though.
Re: Linear regression in NumPy
> I have a question about the linear_least_squares in Numpy.

Not quite sure what is going on; it looks like there could be some confusion, as linear_least_squares expects Numeric arrays as arguments, and what you are supplying (a Matrix) is perhaps not close enough to being the same thing.

Up-to-date versions of all this are in "numpy" nowadays, and the numpy mailing list is perhaps a better place to ask: http://projects.scipy.org/mailman/listinfo/numpy-discussion

Anyway, I've pasted in an example using Numeric and LinearAlgebra, which you seem to have on your system. They still work fine. Not sure what the "Matrix" package is that you are using?

HTH,

Jon

Example:

C:\>python
Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from Numeric import *
>>> from LinearAlgebra import linear_least_squares
>>> a = array( [[1, 1], [1, 2], [1, 3]] )
>>> y = array( [1, 2, 4] )  # note 1 D
>>> a
array([[1, 1],
       [1, 2],
       [1, 3]])
>>> y
array([1, 2, 4])
>>> linear_least_squares(a, y)
(array([-0.6667,  1.5   ]), array([ 0.1667]), 2, array([ 4.07914333,  0.60049122]))
>>> # Is this what you expect as output???

Jianzhong Liu wrote:
> Hello, Guys,
>
> I have a question about linear_least_squares in Numpy. My
> linear_least_squares cannot give me the results. I use Numpy 1.0, the
> newest version. So I checked online and got some examples from you
> guys.
[...]
> >>> print linear_least_squares(x, y)
>
> Here my Numpy stops and never gives me a result. Is this a problem
> with Numpy, or something wrong on my end?
>
> Can you guys give me a LinearAlgebra.py so that I can have a try?
>
> Thanks,
>
> John
Re: Linear regression in NumPy
Jianzhong Liu wrote:
> Hello, Guys,
>
> I have a question about linear_least_squares in Numpy. My
> linear_least_squares cannot give me the results.
>
> I use Numpy 1.0, the newest version. So I checked online and got some
> examples from you guys.

The package name for numpy 1.0 is "numpy", not "Numeric". See this page for an explanation of the names and the history of the two packages: http://www.scipy.org/History_of_SciPy

The examples you found are for Numeric, not numpy. You can get up-to-date examples from here: http://www.scipy.org/Documentation

And please join us on numpy-discussion if you have any more questions. http://www.scipy.org/Mailing_Lists

--
Robert Kern