Re: Wow, Python much faster than MatLab
Gerry,

I have a similar background to yours: many years of using SAS and R. Right now I am trying to pick up Python. From your point of view, is there anything that can be done easily with Python but not with SAS/R?

Thanks for your insight.

wensui

On 1/1/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> We're not so far apart.
>
> I've used SAS for 25 years, and R/S-PLUS for 10.
>
> I think you've said it better than I did, though: R requires more attention
> (which is often needed).
>
> I certainly didn't mean that R crashed - just an indictment of how much I
> thought I was holding in my head.
>
> Gerry

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
--
http://mail.python.org/mailman/listinfo/python-list
Re: Wow, Python much faster than MatLab
We're not so far apart.

I've used SAS for 25 years, and R/S-PLUS for 10.

I think you've said it better than I did, though: R requires more attention (which is often needed).

I certainly didn't mean that R crashed - just an indictment of how much I thought I was holding in my head.

Gerry
Re: Wow, Python much faster than MatLab
Klaas wrote:

> C/C++ do not allocate extra arrays. What you posted _might_ bear a
> small resemblance to what numpy might produce (if using vectorized
> code, not explicit loop code). This is entirely unrelated to the
> reasons why fortran can be faster than c.

Array libraries in C++ that use operator overloading produce intermediate arrays for the same reason as NumPy. There is a C++ library that is sometimes able to avoid intermediates (Blitz++), but it can only do so for small arrays for which bounds are known at compile time. Operator overloading is sometimes portrayed as required for scientific computing (e.g. in Java vs. C# flame wars), but the cure can be worse than the disease.

C does not have operator overloading and is an entirely different case. You can of course avoid intermediates in C++ if you use C++ as C. You can do that in Python as well.
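To make the point about operator overloading concrete, here is a minimal pure-Python sketch. The `Vec` class and the `fused` function are hypothetical names invented for this illustration; they are not NumPy or Blitz++ code. Each overloaded operator only sees its two operands, so it has to allocate a new array, while the explicit loop writes straight into the first operand.

```python
class Vec:
    """Toy array type (hypothetical): every operator allocates a new Vec."""

    allocations = 0  # counts arrays created by the overloaded operators

    def __init__(self, values):
        self.values = list(values)

    def _binary(self, other, op):
        # The operator cannot know it is part of a larger expression,
        # so it always builds a fresh result array.
        Vec.allocations += 1
        return Vec(op(a, b) for a, b in zip(self.values, other.values))

    def __add__(self, other):
        return self._binary(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._binary(other, lambda a, b: a * b)


def fused(a1, a2, a3, a4):
    """One pass over the data, result stored in a1, no new arrays."""
    for i in range(len(a1.values)):
        a1.values[i] = (a1.values[i] + a2.values[i]) * (a3.values[i] + a4.values[i])
    return a1


a1, a2 = Vec([1.0, 2.0]), Vec([3.0, 4.0])
a3, a4 = Vec([5.0, 6.0]), Vec([7.0, 8.0])

overloaded = (a1 + a2) * (a3 + a4)   # two sums and one product: three new arrays
temporaries = Vec.allocations

fused_result = fused(a1, a2, a3, a4)  # same values, allocation count unchanged
```

The overloaded expression allocates three arrays for a two-element input, and the fused loop allocates none, which is the "use C++ as C" style applied to Python. In pure Python the explicit loop is of course slow; the sketch only shows where the temporaries come from.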
Re: Wow, Python much faster than MatLab
sturlamolden wrote:

> as well as looping over the data only once. This is one of the main
> reasons why Fortran is better than C++ for scientific computing. I.e.
> instead of
>
> for (i=0; i<n; i++)
>    array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);
>
> one actually gets something like three intermediates and four loops:
>
> tmp1 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++)
>    tmp1[i] = array1[i] + array2[i];
> tmp2 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++)
>    tmp2[i] = array3[i] + array4[i];
> tmp3 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++)
>    tmp3[i] = tmp1[i] * tmp2[i];
> free(tmp1);
> free(tmp2);
> for (i=0; i<n; i++)
>    array1[i] = tmp3[i];
> free(tmp3);

C/C++ do not allocate extra arrays. What you posted _might_ bear a small resemblance to what numpy might produce (if using vectorized code, not explicit loop code). This is entirely unrelated to the reasons why Fortran can be faster than C.

-Mike
Re: Wow, Python much faster than MatLab
sturlamolden wrote:

> array3[:] = array1[:] + array2[:]

OT, but why are you slicing array1 and array2? All that does is create new array objects pointing to the same data.

> Now for my question: operator overloading is (as shown) not the
> solution to efficient scientific computing. It creates serious bloat
> where it is undesired. Can NumPy's performance be improved by adding
> the array types to the Python language itself? Or is the dynamic
> nature of Python preventing this?

Pretty much. Making the array types builtin rather than from a third party module doesn't really change anything. However, if type inferencing tools like psyco are taught about numpy arrays like they are already taught about ints, then one could make it avoid temporaries.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
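The remark about slicing creating "new array objects pointing to the same data" can be illustrated without NumPy. The sketch below uses the standard library's `memoryview` over an `array.array` as a stand-in; that substitution is an assumption made for this example, chosen because a sliced `memoryview` behaves like a view in the same sense, while a sliced plain list is a copy.

```python
from array import array

data = array("d", [1.0, 2.0, 3.0, 4.0])

whole = memoryview(data)
sliced = whole[:]        # a new view object, but over the same buffer
sliced[0] = 99.0         # the write goes through to `data`

plain = [1.0, 2.0, 3.0, 4.0]
copied = plain[:]        # slicing a list makes an independent copy
copied[0] = -1.0         # `plain` is not affected
```

So `x[:]` on the right-hand side costs a small wrapper object in the view case and a full copy in the list case. On the left-hand side, as in `array3[:] = ...`, the slice means "write into the existing storage" rather than "rebind the name".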
Re: Wow, Python much faster than MatLab
Wensui Liu wrote:

> doing. However, that is not the fault of excel/spss itself but of
> people who is using it.

Yes and no. I think SPSS makes it too tempting. Like children playing with fire, they may not even know it's dangerous. You can do a GLM in SPSS by just filling out a form - but how many social scientists or MDs know anything about general linear models? The command line interface of MySQL, SAS, Matlab and R makes an excellent deterrent. All statistical tools can be misused. But the difference is between accidental and deliberate misuse. Anyone can navigate a GUI, but you need to know you want to do an ANOVA before you can think of typing "anova" on the command line.

You mentioned use of Excel as a database. That is another example, although it has more to do with data security and integrity, and sometimes protection of privacy. Many companies have banned the use of Microsoft Access, as employees were building their own mock-up databases, with the result that these Access databases migrated to an even worse solution (Excel).

Sturla Molden
Re: Wow, Python much faster than MatLab
Stef Mientki wrote:

> MatLab: 14 msec
> Python: 2 msec

I have the same experience. NumPy is usually faster than Matlab. But it very much depends on how the code is structured.

I wonder if it is possible to improve the performance of NumPy by having its fundamental types in the language, instead of depending on operator overloading. For example, in NumPy, a statement like

   array3[:] = array1[:] + array2[:]

allocates an intermediate array that is not needed. This is because the operator overloading cannot know if it's evaluating a part of a larger statement like

   array1[:] = (array1[:] + array2[:]) * (array3[:] + array4[:])

If arrays had been a part of the language, as they are in Matlab and Fortran 95, the compiler could see this and avoid intermediate storage, as well as looping over the data only once. This is one of the main reasons why Fortran is better than C++ for scientific computing. I.e. instead of

   for (i=0; i [...]
Re: Wow, Python much faster than MatLab
Sturla,

I work in healthcare and see people who love to use Excel/SPSS as a database or statistical tool without knowing what they are doing. However, that is not the fault of Excel/SPSS itself but of the people who are using it. Anything, even SAS/R, would look stupid when it is misused.

In the hospitals, people don't pray to God. They pray to the MD. :-)

On 30 Dec 2006 19:09:59 -0800, sturlamolden <[EMAIL PROTECTED]> wrote:
>
> Stef Mientki wrote:
>
> > I always thought that SPSS or SAS were thé standards.
> > Stef
>
> As far as SPSS is a standard, it is in the field of "religious use of
> statistical procedures I don't understand (as I'm a math retard), but
> hey p<0.05 is always significant (and any other value is proof of the
> opposite ... I think)."
>
> SPSS is often used by scientists that don't understand maths at all,
> often within the fields of social sciences, but regrettably also within
> biology and medicine. I know of few programs that have done so much harm
> as SPSS. It's like handing an armed weapon to a child. Generally one
> should stay away from the things that one doesn't understand,
> particularly within medicine where a wrong result can have dramatic
> consequences. SPSS encourages the opposite. Copy and paste from Excel
> to SPSS is regrettably becoming the de-facto standard in applied
> statistics. The problem is not the quality of Excel or SPSS, but rather
> the (in)competence of those conducting the data analysis. This can and
> does regrettably lead to serious misinterpretation of the data, in
> either direction. When a paper is submitted, these errors are usually
> not caught in the peer review process, as peer review is, well, exactly
> what it says: *peer* review.
>
> Thus, SPSS makes it easy to shoot yourself in the foot. In my
> experience students in social sciences and medicine are currently
> taught to do exactly that, in universities and colleges all around the
> world. And it is particularly dangerous within medical sciences, as
> people's lives and health may be affected by it. I pray God something is
> done to prohibit or limit the use of these statistical toys.
>
>
> Sturla Molden
> PhD

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
Re: Wow, Python much faster than MatLab
Stef Mientki wrote:

> I always thought that SPSS or SAS were thé standards.
> Stef

As far as SPSS is a standard, it is in the field of "religious use of statistical procedures I don't understand (as I'm a math retard), but hey p<0.05 is always significant (and any other value is proof of the opposite ... I think)."

SPSS is often used by scientists that don't understand maths at all, often within the fields of social sciences, but regrettably also within biology and medicine. I know of few programs that have done so much harm as SPSS. It's like handing an armed weapon to a child. Generally one should stay away from the things that one doesn't understand, particularly within medicine where a wrong result can have dramatic consequences. SPSS encourages the opposite. Copy and paste from Excel to SPSS is regrettably becoming the de-facto standard in applied statistics. The problem is not the quality of Excel or SPSS, but rather the (in)competence of those conducting the data analysis. This can and does regrettably lead to serious misinterpretation of the data, in either direction. When a paper is submitted, these errors are usually not caught in the peer review process, as peer review is, well, exactly what it says: *peer* review.

Thus, SPSS makes it easy to shoot yourself in the foot. In my experience students in social sciences and medicine are currently taught to do exactly that, in universities and colleges all around the world. And it is particularly dangerous within medical sciences, as people's lives and health may be affected by it. I pray God something is done to prohibit or limit the use of these statistical toys.

Sturla Molden
PhD
Re: Wow, Python much faster than MatLab
On 12/31/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> R is the free version of the S language. S-PLUS is a commercial version.
> Both are targeted at statisticians per se. Their strengths are in
> exploratory data analysis (in my opinion).
>
> SAS has many statistical features, and is phenomenally well-documented and
> supported. One of its great strengths is the robustness of its data model
> -- very well suited to large sizes, repetitive inputs, industrial-strength
> data processing with a statistics slant. Well over 200 SAS books, for
> example.
>
> I think of SAS and R as being like airliners and helicopters -- airliners get
> the job done, and well, as long as it's well-defined and nearly the same job
> all the time. Helicopters can go anywhere, do anything, but a moment's
> inattention leads to a crash.

Inattention leading to a crash? I don't get it. I used SAS for about 3 or 4 years, and have used S-Plus and then R for 10 years (R for 8 years now). I've never noticed inattention leading to a crash. I've noticed I cannot get away in R without a careful definition of what I want (which is good), and the immediate interactivity of R is very helpful with mistakes. And of course, programming in R is, well, programming in a reasonable language. Programming in SAS is ... well, programming in SAS (which is about as fun as programming in SPSS).

(Another email somehow suggested that the stability/instability analogy of airplanes vs. helicopters does apply to SAS vs. R. Again, I don't really get it. Sure, SAS is very stable. But so is R -- one common complaint is getting seg faults because package whatever has memory leaks, but that is not R's fault, but rather the package's fault.)

But then, this might start looking a lot like a flame war, which is actually rather off-topic for this list.

Anyway, for a Python programmer, picking up R should be fairly easy. And rpy is really a great way of getting R and Python to talk to each other. We do this sort of thing quite a bit in our applications.

And yes, R is definitely available for both Linux and Windows (and Mac), has excellent support from several editors on those platforms (e.g., emacs + ess, tinn-R, etc.), seems to be becoming a de facto standard at least in statistical research, and is extremely popular in bioinformatics and among statisticians who do bioinformatics (look at bioconductor.org).

Ramon

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
Re: Wow, Python much faster than MatLab
> I think of SAS and R as being like airliners and helicopters --

I like that comparison ...
  Airplanes are inherently stable,
  Helicopters are inherently unstable ;-)

cheers,
Stef
Re: Wow, Python much faster than MatLab
R is the free version of the S language. S-PLUS is a commercial version. Both are targeted at statisticians per se. Their strengths are in exploratory data analysis (in my opinion).

SAS has many statistical features, and is phenomenally well-documented and supported. One of its great strengths is the robustness of its data model -- very well suited to large sizes, repetitive inputs, industrial-strength data processing with a statistics slant. Well over 200 SAS books, for example.

I think of SAS and R as being like airliners and helicopters -- airliners get the job done, and well, as long as it's well-defined and nearly the same job all the time. Helicopters can go anywhere, do anything, but a moment's inattention leads to a crash.
Re: Wow, Python much faster than MatLab
Stef Mientki <[EMAIL PROTECTED]> writes:

> Doran, Harold wrote:
> > R is the open-source implementation of the S language developed at Bell
> > laboratories. It is a statistical programming language that is becoming
> > the de facto standard among statisticians.
> Thanks for the information
> I always thought that SPSS or SAS were thé standards.
> Stef

The 'SS' in SPSS stands for Social Science, IIRC. Looking at the lack of mention of that on their website, though, and the prominent use of the "E word" there, they have obviously grown out of (or want to grow out of) their original niche.

Googling, SAS's market seems to be mostly in the business / financial worlds.

No doubt R's community differs from those, though I don't know exactly how. From the long list of free software available for it, it sure seems popular with some people:

http://www.stats.bris.ac.uk/R/

John
Re: Wow, Python much faster than MatLab
Stef Mientki <[EMAIL PROTECTED]> writes:

> Mathias Panzenboeck wrote:
> > Another great thing: With rpy you have R bindings for python.
>
> forgive my ignorance, what's R, rpy ?
> Or is it only relevant for Linux users ?
[...]

R is a language / environment for statistical programming. RPy is a Python interface to let you use R from Python. I think they both run on both Windows and Linux.

http://www.r-project.org/
http://rpy.sourceforge.net/

John
Re: Wow, Python much faster than MatLab
Doran, Harold wrote:

> R is the open-source implementation of the S language developed at Bell
> laboratories. It is a statistical programming language that is becoming
> the de facto standard among statisticians.

Thanks for the information.
I always thought that SPSS or SAS were thé standards.

Stef
RE: Wow, Python much faster than MatLab
R is the open-source implementation of the S language developed at Bell Laboratories. It is a statistical programming language that is becoming the de facto standard among statisticians. Rpy provides an interface between Python and the R language.

Harold

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On
> Behalf Of Stef Mientki
> Sent: Saturday, December 30, 2006 9:24 AM
> To: python-list@python.org
> Subject: Re: Wow, Python much faster than MatLab
>
> Mathias Panzenboeck wrote:
> > Another great thing: With rpy you have R bindings for python.
>
> forgive my ignorance, what's R, rpy ?
> Or is it only relevant for Linux users ?
>
> cheers
> Stef
>
> > So you have the power of R and the easy syntax and big
> > standard lib of python! :)
Re: Wow, Python much faster than MatLab
Mathias Panzenboeck wrote:

> Another great thing: With rpy you have R bindings for python.

Forgive my ignorance, what are R and rpy?
Or is it only relevant for Linux users?

cheers
Stef

> So you have the power of R and the easy syntax and big standard lib of
> python! :)
Re: Wow, Python much faster than MatLab
Another great thing: with rpy you have R bindings for Python. So you have the power of R and the easy syntax and big standard lib of Python! :)
Re: Wow, Python much faster than MatLab
> I'm not sure about SciPy,

Yes, SciPy allows it too!

> but lists in standard Python allow this:
>
> >>> array = [1, 2, 3, 4]
> >>> array[2:5]
> [3, 4]
>
> That's generally a good thing.

You're not, perhaps, an analog engineer by origin? ;-)

cheers,
Stef Mientki
Re: Wow, Python much faster than MatLab
>> MatLab: 14 msec
>> Python: 2 msec
>
> For times this small, I wonder if timing comparisons are valid. I do
> NOT think SciPy is in general an order of magnitude faster than Matlab
> for the task typically performed with Matlab.

The algorithm is meant for real-time analysis, where these kinds of differences count a lot. I'm also a typical "surface programmer" (I don't need or want to know what's going on inside); I just want to get my analysis done. The fact that Python has many more functions available means I have to write far fewer explicit or implicit for loops, and thus I expect it to always "look" faster to me.

>> After taking the first difficult steps into Python,
>> all kind of small problems as you already know,
>> it nows seems a piece of cake to convert from MatLab to Python.
>> (the final programs of MatLab and Python can almost only be
>> distinguished by the comment character ;-)
>>
>> Especially I like:
>> - more relaxed behavior of exceeded the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

Well, I have to admit that wasn't a very tactful remark; "noise" is still an unwanted issue in software. But in the meantime I've been reading further, and I should replace that with some other great things:
- the very efficient way comments are turned into help information
- the (at first sight) very easy, yet quite powerful OOP implementation.

cheers,
Stef Mientki
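The "comment is turned into help information" remark presumably refers to docstrings. A minimal sketch follows; the function `moving_average` and its parameters are made up for this illustration and do not come from the thread.

```python
def moving_average(samples, window):
    """Return the running mean of `samples` over `window` points."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


# The first string literal in the function body is stored on the function
# object; help(moving_average) at the interactive prompt displays it.
doc = moving_average.__doc__
smoothed = moving_average([1, 2, 3, 4], 2)
```

This is roughly the counterpart of Matlab's convention that the leading comment block of an M-file is what `help` shows, except that here the text is a real attribute of the function and can be inspected by tools.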
Re: Wow, Python much faster than MatLab
On Fri, 29 Dec 2006 19:35:22 -0800, Beliavsky wrote:

>> Especially I like:
>> - more relaxed behavior of exceeded the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

I'm not sure about SciPy, but lists in standard Python allow this:

>>> array = [1, 2, 3, 4]
>>> array[2:5]
[3, 4]

That's generally a good thing.

--
Steven.
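It may help to spell out exactly where Python is "relaxed" and where it is not. The sketch below uses only built-in lists; the variable names are chosen for the example.

```python
values = [1, 2, 3, 4]

clipped = values[2:5]    # slice end past the list is clipped to the length
empty = values[10:20]    # a slice entirely out of range is just empty

try:
    values[5]            # a single out-of-range index is still an error
    index_error_raised = False
except IndexError:
    index_error_raised = True
```

So only slices are forgiving. A plain index past the end raises `IndexError`, which is the strictness Beliavsky was asking for.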
Re: Wow, Python much faster than MatLab
Stef Mientki wrote:

> hi All,
>
> instead of questions,
> my first success story:
>
> I converted my first MatLab algorithm into Python (using SciPy),
> and it not only works perfectly,
> but also runs much faster:
>
> MatLab: 14 msec
> Python: 2 msec

For times this small, I wonder if timing comparisons are valid. I do NOT think SciPy is in general an order of magnitude faster than Matlab for the tasks typically performed with Matlab.

> After taking the first difficult steps into Python,
> all kind of small problems as you already know,
> it nows seems a piece of cake to convert from MatLab to Python.
> (the final programs of MatLab and Python can almost only be
> distinguished by the comment character ;-)
>
> Especially I like:
> - more relaxed behavior of exceeded the upper limit of a (1-dimensional)
> array

Could you explain what this means? In general, I don't want a programming language to be "relaxed" about exceeding array bounds.
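On the question of whether millisecond-scale timings are trustworthy: one common approach is to run the code many times and keep the best of several repeats, which the standard `timeit` module supports. The sketch below is only an illustration; `kernel` is a hypothetical stand-in workload, not Stef's algorithm, and the repeat counts are arbitrary.

```python
import timeit


def kernel():
    # Hypothetical stand-in for the algorithm being measured.
    return sum(i * i for i in range(1000))


CALLS_PER_RUN = 200

# Five independent runs of 200 calls each; the minimum is the run least
# disturbed by other activity on the machine.
runs = timeit.repeat(kernel, repeat=5, number=CALLS_PER_RUN)
best_per_call = min(runs) / CALLS_PER_RUN
```

A single 2 ms versus 14 ms measurement could easily be dominated by start-up or caching effects, so comparing the best of several repeated runs on both sides would give a fairer picture.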