[Rd] Julia
My purpose in mentioning the Julia language (julialang.org) here is not to start a flame war. I find it to be a very interesting development and others who read this list may want to read about it too.

It is still very much early days for this language - about the same stage as R was in 1995 or 1996 when only a few people knew about it - but Julia holds much potential. There is a thread about "R and statistical programming" on groups.google.com/group/julia-dev. As always happens, there is a certain amount of grumbling of the "R IS SO SLOW" flavor but there is also some good discussion regarding features of R (well, S actually) that are central to the language. (Disclaimer: I am one of the participants discussing the importance of data frames and formulas in R.)

If you want to know why Julia has attracted a lot of interest very recently (like in the last 10 days): as a language it uses multiple dispatch (like S4 methods) with methods being compiled on the fly using the LLVM (http://llvm.org) infrastructure. In some ways it achieves the Holy Grail of languages like R, Matlab, NumPy, ... in that it combines the speed of compiled languages with the flexibility of a high-level interpreted language.

One of the developers, Jeff Bezanson, gave a seminar about the design of the language at Stanford yesterday, and the video is archived at http://www.stanford.edu/class/ee380/. You don't see John Chambers on camera but I am reasonably certain that a couple of the questions and comments came from him.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
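For readers unfamiliar with the term, the following is a minimal sketch of what "multiple dispatch" means: the implementation is selected from the runtime types of all arguments, not only the first. It is written in Python purely for illustration; the names `METHODS`, `register`, `dispatch` and `combine` are hypothetical, the lookup matches exact types only, and nothing here is compiled, so it says nothing about how Julia or S4 actually implement dispatch.

```python
# Toy multiple dispatch: the implementation is chosen from the runtime
# types of ALL arguments. Every name below is hypothetical.

METHODS = {}  # maps (function name, tuple of argument types) -> implementation


def register(name, *types):
    """Decorator that files an implementation under a type signature."""
    def wrap(func):
        METHODS[(name, types)] = func
        return func
    return wrap


def dispatch(name, *args):
    """Call the implementation registered for the exact types of all arguments."""
    signature = tuple(type(a) for a in args)
    try:
        method = METHODS[(name, signature)]
    except KeyError:
        raise TypeError(f"no method {name} for {signature}") from None
    return method(*args)


@register("combine", int, int)
def _combine_ints(a, b):
    return a + b  # two integers: add them


@register("combine", str, int)
def _combine_str_int(a, b):
    return a * b  # string and integer: repeat the string b times


@register("combine", int, str)
def _combine_int_str(a, b):
    return str(a) + b  # integer and string: concatenate as text
```

Calling `dispatch("combine", "ab", 2)` and `dispatch("combine", 2, "ab")` reaches two different implementations even though the same two types are involved, which is the property that distinguishes multiple dispatch from dispatch on the first argument alone.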
Re: [Rd] Julia
Doug,

Agreed on the interesting point - looks like it has some real promise. I think the spike in interest could be attributable to Mike Loukides's tweet on Feb 20. (editor at O'Reilly)

https://twitter.com/#!/mikeloukides/status/171773229407551488

That is exactly the moment I stumbled upon it.

Jeff

On Thu, Mar 1, 2012 at 11:06 AM, Douglas Bates wrote:
> [...]

--
Jeffrey Ryan
jeffrey.r...@lemnica.com
www.lemnica.com
www.esotericR.com

R/Finance 2012: Applied Finance with R
www.RinFinance.com
See you in Chicago
Re: [Rd] Julia
On Thu, Mar 1, 2012 at 11:20 AM, Jeffrey Ryan wrote:
> Doug,
>
> Agreed on the interesting point - looks like it has some real promise.
> I think the spike in interest could be attributable to Mike
> Loukides's tweet on Feb 20. (editor at O'Reilly)
>
> https://twitter.com/#!/mikeloukides/status/171773229407551488
>
> That is exactly the moment I stumbled upon it.

I think Jeff Bezanson attributes the interest to a blog posting by Viral Shah, another member of the development team, that hit Reddit. He said that, with Viral now in India, it all happened overnight for those in North America and he awoke the next day to find a firestorm of interest. I ran across Julia in the Release Notes of LLVM and mentioned it to Dirk Eddelbuettel who posted about it on Google+ in January. (Dirk, being much younger than I, knows about these new-fangled social media things and I don't.)
Re: [Rd] Julia
Can somebody post a link to the video? I can't find it; searching for "Julia" on the Stanford YouTube channel gives nothing.

Kjetil

On Thu, Mar 1, 2012 at 11:37 AM, Douglas Bates wrote:
> [...]
Re: [Rd] Julia
http://julialang.org/blog

Then click on "Stanford Talk Video".
Then click on "available here".

Ted.

On 01-Mar-2012 Kjetil Halvorsen wrote:
> Can somebody post a link to the video? I can't find it; searching
> "Julia" on the Stanford YouTube channel gives nothing.
>
> Kjetil
> [...]

E-Mail: (Ted Harding)
Date: 01-Mar-2012 Time: 20:47:42
This message was sent by XFMail
Re: [Rd] Julia
On Thu, Mar 01, 2012 at 11:06:51AM -0600, Douglas Bates wrote:
> My purpose in mentioning the Julia language (julialang.org) here is
> not to start a flame war. I find it to be a very interesting
> development and others who read this list may want to read about it
> too.
[...]

Very interesting language. Thank you for mentioning it here.

Compiling from the GitHub sources was easy. I will explore it during the next few days.

It seems not to be very specific to statistics, but good for math in general.

I am not sure whether it might make sense to combine R and Julia in the long run (I mean: combining by providing interfaces between them, calling the one via the other, merging code, or using libraries from one side in the other).

Ciao,
   Oliver
Re: [Rd] Julia
I haven't used Julia yet, but from my quick reading of the docs it looks like arguments to functions are passed by reference and not by value, so functions can change their arguments. My recollection from when I first started using S (in the course of a job helping profs and grad students do statistical programming, c. 1983) is that not having to worry about in-place algorithms changing your data gave S a big advantage over Fortran or C. While this feature could slow things down and increase memory use, I felt that it made it easier to write correct code and to use functions that others had written.

Does Julia have a const declaration or other means of controlling or documenting that a given function will or will not change the data passed into it?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of oliver
> Sent: Friday, March 02, 2012 5:14 PM
> To: Douglas Bates
> Cc: R-devel
> Subject: Re: [Rd] Julia
>
> [...]
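The trade-off described in this message can be sketched as follows. The example uses Python, not Julia or S, only because Python also lets a function modify a mutable argument in place; whether Julia arrays behave the same way in every detail is an assumption taken from the message, not something verified here. The function names `center_in_place` and `center_copy` are hypothetical.

```python
def center_in_place(values):
    """Subtract the mean, overwriting the caller's list (no extra memory)."""
    mean = sum(values) / len(values)
    for i in range(len(values)):
        values[i] -= mean
    return values


def center_copy(values):
    """Subtract the mean on a private copy; the caller's list is untouched."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


data = [1.0, 2.0, 3.0]
centered = center_copy(data)
print(data)            # the caller's data is still intact
center_in_place(data)
print(data)            # the caller's data has been overwritten
```

The copying version costs an extra allocation but lets the caller use `data` afterwards without reading the function's source, which is the safety property attributed to S above.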
Re: [Rd] Julia
On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> I haven't used Julia yet, but from my quick reading
> of the docs it looks like arguments to functions are
> passed by reference and not by value, so functions
> can change their arguments. My recollection from when
> I first started using S (in the course of a job helping
> profs and grad students do statistical programming, c. 1983)
> is that not having to worry about in-place algorithms changing
> your data gave S a big advantage over Fortran or C.
[...]

C also uses Call-by-Value. Fortran I don't know in detail.

> While this feature could slow things down and increase
> memory use, I felt that it made it easier to write correct
> code and to use functions that others had written.

Yes, I also think that call-by-value decreases errors in code.

From what I have read about Julia, it is like MATLAB plus more features for programming. Does MATLAB also only use call-by-reference?

> Does Julia have a const declaration or other
> means of controlling or documenting that a given function
> will or will not change the data passed into it?

I have not explored it in detail so far. Maybe the original poster already did this in more depth?

Ciao,
   Oliver
Re: [Rd] Julia
Hi Oliver,

On 03/05/2012 09:08 AM, oliver wrote:
> On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
>> I haven't used Julia yet, but from my quick reading
>> of the docs it looks like arguments to functions are
>> passed by reference and not by value, so functions
>> can change their arguments. My recollection from when
>> I first started using S (in the course of a job helping
>> profs and grad students do statistical programming, c. 1983)
>> is that not having to worry about in-place algorithms changing
>> your data gave S a big advantage over Fortran or C.
> [...]
>
> C also uses Call-by-Value.

C *only* uses Call-by-Value.

Cheers,
H.

> Fortran I don't know in detail.
> [...]

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Re: [Rd] Julia
On Mon, Mar 05, 2012 at 03:58:59PM -0800, Hervé Pagès wrote:
> Hi Oliver,
>
> On 03/05/2012 09:08 AM, oliver wrote:
> >On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> >>I haven't used Julia yet, but from my quick reading
> >>of the docs it looks like arguments to functions are
> >>passed by reference and not by value, so functions
> >>can change their arguments. My recollection from when
> >>I first started using S (in the course of a job helping
> >>profs and grad students do statistical programming, c. 1983)
> >>is that not having to worry about in-place algorithms changing
> >>your data gave S a big advantage over Fortran or C.
> >[...]
> >
> >C also uses Call-by-Value.
>
> C *only* uses Call-by-Value.
[...]

Yes, that's what I meant. By "also" I meant that it uses call-by-value, as some other languages also do.

Ciao,
   Oliver
Re: [Rd] Julia
On 12-03-05 6:58 PM, Hervé Pagès wrote:
> Hi Oliver,
>
> On 03/05/2012 09:08 AM, oliver wrote:
>> On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
>>> I haven't used Julia yet, but from my quick reading
>>> of the docs it looks like arguments to functions are
>>> passed by reference and not by value, so functions
>>> can change their arguments. My recollection from when
>>> I first started using S (in the course of a job helping
>>> profs and grad students do statistical programming, c. 1983)
>>> is that not having to worry about in-place algorithms changing
>>> your data gave S a big advantage over Fortran or C.
>> [...]
>>
>> C also uses Call-by-Value.
>
> C *only* uses Call-by-Value.

While literally true, the fact that you can't send an array by value, and must send the value of a pointer to it, kind of supports Bill's point: in C, you mostly end up sending arrays by reference.

Duncan Murdoch
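The distinction drawn here, between passing a reference by value and true pass-by-reference, can be illustrated with a rough analogy. The sketch below is Python rather than C (an assumption made for brevity; Python has no raw pointers), and the function names `rebind` and `write_through` are hypothetical. Rebinding the parameter is comparable to overwriting a pointer variable that was copied into the callee, while an indexed write is comparable to writing through that pointer.

```python
def rebind(arr):
    # Assigning to the parameter changes only the callee's local name,
    # much like overwriting a pointer variable that was passed by value.
    arr = [0, 0, 0]
    return arr


def write_through(arr):
    # Writing through the reference changes the object the caller also
    # holds, much like writing through a pointer to the caller's array.
    arr[0] = 0
    return arr


x = [1, 2, 3]
rebind(x)
print(x)         # unchanged: the caller's binding was never touched
write_through(x)
print(x)         # first element is now 0
```

In both calls the argument itself was copied (the reference), yet only the second call is visible to the caller, which is why "C only uses call-by-value" and "arrays are effectively passed by reference" are both defensible statements.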
Re: [Rd] Julia
Yes, C does use call by value, always. However, data arrays are almost always passed via pointers to malloc'ed space, so, effectively, data arrays are passed by reference. (One can put a 'const type*' in the prototype of a function to declare that the data pointed to will not be changed, but it is up to documentation or coding standards to let someone know that data pointed to will likely be changed.)

I find R's (& S+'s & S's) copy-on-write-if-not-copying-would-be-discoverable-by-the-user mechanism for giving the illusion of pass-by-value a good way to structure the contract between the function writer and the function user. Does Julia have the tools to let a function writer or user decide whether he really needs to copy its arguments or not?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Hervé Pagès
> Sent: Monday, March 05, 2012 3:59 PM
> To: oliver
> Cc: R-devel
> Subject: Re: [Rd] Julia
>
> [...]
Re: [Rd] Julia
On Mon, Mar 05, 2012 at 04:54:05PM -0800, Nicholas Crookston wrote:
> There are many experts on this topic. I'll keep this short.
>
> Newer Fortran languages allow for call by value, but call by reference
> is the typical and, historically, the only approach (there was a time
> when you could change the value of 1 to 2!).

Oh, strange.

> C "only" calls by value except that the value can be a pointer! So,
> havoc is just a * away.
[...]

For me there was no "havoc" at this point, but for others maybe.

There are also other languages that only use call-by-value; functional languages are that way in principle.

Nevertheless, internally they may use pointers heavily: even if your values are large arrays, for example, internally they just pass a pointer to that data structure. That is why functional languages are not necessarily slow when you act on large data, even though the language has no references. (It is a common misunderstanding that functional languages must be slow because they have no references.) The pointer stuff is just hidden.

Even where (non-purely) functional languages do have references, their concept of references is different. (See OCaml for example.) There you can use references to change values in place, but the reference itself is a functional value, and you never have direct access to the pointer stuff. Hence no problems with pointer arithmetic, dangling pointers, or null pointers.

[...]
> I like R and will continue to use it. However, I also think that
> strict "call by value" can get you into trouble, just trouble of a
> different kind.

Can you elaborate more on this? What problems do you have in mind? And what kind of references do you have in mind: C-like pointers, or something like OCaml's refs?

> I'm not sure we will ever yearn for "Julia ouR-Julia",
> but it is sure fun to think about what might be possible with this
> language. And having fun is one key objective.

I have fun if things work, and if the tools do what I want to achieve... and the fun is better if they do it elegantly.

Do you ask for references in R? And what kind of references do you have in mind, and why does it hurt you not to have them?

Can you give examples, so that it's easier to see where you miss something?

Ciao,
   Oliver

P.S.: The speed issue of R has come up more than once; it was mentioned in some blog posts. Would it make sense to start a separate thread about it? One of the blog articles I read complained about how NA / missing values are handled, and suggested that NA should maybe be thrown out just to get higher speed. I would not like that. Handling NA as a special case is IMHO a very good approach. I don't remember whether the article I have in mind argued only about HOW NA is handled, or whether it should be thrown out completely. Making the handling better and more performant is a good idea, I think; ignoring NA is IMHO a bad idea.

But maybe that really would be worth a separate thread?
Re: [Rd] Julia
On Mon, Mar 05, 2012 at 07:33:10PM -0500, Duncan Murdoch wrote:
> On 12-03-05 6:58 PM, Hervé Pagès wrote:
> >Hi Oliver,
> >
> >On 03/05/2012 09:08 AM, oliver wrote:
> >>On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> >>>I haven't used Julia yet, but from my quick reading
> >>>of the docs it looks like arguments to functions are
> >>>passed by reference and not by value, so functions
> >>>can change their arguments. My recollection from when
> >>>I first started using S (in the course of a job helping
> >>>profs and grad students do statistical programming, c. 1983)
> >>>is that not having to worry about in-place algorithms changing
> >>>your data gave S a big advantage over Fortran or C.
> >>[...]
> >>
> >>C also uses Call-by-Value.
> >
> >C *only* uses Call-by-Value.
>
> While literally true, the fact that you can't send an array by
> value, and must send the value of a pointer to it, kind of supports
> Bill's point: in C, you mostly end up sending arrays by reference.
[...]

It's a problem of how the term "reference" is used.

If you want to limit the possible confusion, better say: passing the pointer by value. Or: passing the address of the array/struct/... by value.

Saying that you pass the array by reference is a shorthand, which may create confusion. Just avoiding the word "reference" here would make it clearer.

AFAIK, in C++ references are different from pointers. (Others who know C++ in detail might explain this.)

So, using the same term for many different concepts can create a mess in understanding.

Ciao,
   Oliver
Re: [Rd] Julia
On Tue, Mar 06, 2012 at 12:35:32AM +, William Dunlap wrote:
[...]
> I find R's (& S+'s & S's) copy-on-write-if-not-copying-would-be-discoverable-by-the-user
> mechanism for giving the illusion of pass-by-value a good way
> to structure the contract between the function writer and the function user.
[...]

Can you elaborate more on this, especially on the ...-...-...-if-not-copying-would-be-discoverable-by-the-user stuff?

What do you mean by discoverability of not-copying?

Ciao,
   Oliver
Re: [Rd] Julia
S (and its derivatives and successors) promises that functions will not change their arguments, so in an expression like
    val <- func(arg)
you know that arg will not be changed. You can do that by having func copy arg before doing anything, but that uses space and time that you want to conserve. If arg is not a named item in any environment then it should be fine to write over the original because there is no way the caller can detect that shortcut. E.g., in
    cx <- cos(runif(n))
the cos function does not need to allocate new space for its output, it can just write over its input because, without a name attached to it, the caller has no way of looking at what runif(n) returned. If you did
    x <- runif(n)
    cx <- cos(x)
then cos would have to allocate new space for its output because overwriting its input would affect a subsequent
    sum(x)
I suppose that end-users and function-writers could learn to live with having to decide when to copy, but not having to make that decision makes S more pleasant (and safer) to use. I think that is a major reason that people are able to share S code so easily.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: oliver [mailto:oli...@first.in-berlin.de]
> Sent: Tuesday, March 06, 2012 1:12 AM
> To: William Dunlap
> Cc: Hervé Pagès; R-devel
> Subject: Re: [Rd] Julia
>
> [...]
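The copy-only-when-someone-could-notice idea described in this message can be sketched with a toy copy-on-write container. This is an illustration in Python under stated assumptions, not a description of R's or S's internals: the class `CowVector`, its `share` method and the explicit share counter are all hypothetical, and handles that are simply discarded are not tracked.

```python
class CowVector:
    """Toy copy-on-write vector (hypothetical; not how R is implemented)."""

    def __init__(self, data):
        self._buf = list(data)
        # One-element list so that every handle sharing the buffer
        # sees and updates the same counter.
        self._share_count = [1]

    def share(self):
        """Return a second handle to the same buffer without copying it."""
        other = CowVector.__new__(CowVector)
        other._buf = self._buf
        other._share_count = self._share_count
        self._share_count[0] += 1
        return other

    def __setitem__(self, index, value):
        if self._share_count[0] > 1:
            # Another handle could observe the write, so detach first.
            self._share_count[0] -= 1
            self._buf = list(self._buf)
            self._share_count = [1]
        # Sole owner at this point: writing in place is undetectable.
        self._buf[index] = value

    def __getitem__(self, index):
        return self._buf[index]

    def to_list(self):
        return list(self._buf)


x = CowVector([1, 2, 3])
y = x.share()     # no copy yet; both handles use one buffer
y[0] = 99         # the write triggers the copy, so x is unaffected
print(x.to_list(), y.to_list())
```

A value with a single owner, like the unnamed result of `runif(n)` in the message, is overwritten in place, while a value that has a second handle is copied at the moment of the first write.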
Re: [Rd] Julia
On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap wrote:
> S (and its derivatives and successors) promises that functions
> will not change their arguments, so in an expression like
>    val <- func(arg)
> you know that arg will not be changed. You can
> do that by having func copy arg before doing anything,
> but that uses space and time that you want to conserve.
> If arg is not a named item in any environment then it
> should be fine to write over the original because there
> is no way the caller can detect that shortcut. E.g., in
>    cx <- cos(runif(n))
> the cos function does not need to allocate new space for
> its output, it can just write over its input because, without
> a name attached to it, the caller has no way of looking
> at what runif(n) returned. If you did
>    x <- runif(n)
>    cx <- cos(x)
> then cos would have to allocate new space for its output
> because overwriting its input would affect a subsequent
>    sum(x)
> I suppose that end-users and function-writers could learn
> to live with having to decide when to copy, but not having
> to make that decision makes S more pleasant (and safer) to use.
> I think that is a major reason that people are able to
> share S code so easily.

But don't forget the "Holy Grail" that Doug mentioned at the start of this thread: finding a flexible language that is also fast. Currently many R packages employ C/C++ components to compensate for the fact that the R interpreter can be slow, and the pass-by-value semantics of S provides no protection here.

In 2008 Ross Ihaka and Duncan Temple Lang published the paper "Back to the Future: Lisp as a base for a statistical computing system" where they propose Common Lisp as a new foundation for R. They suggest that this could be done while maintaining the same familiar R syntax.

A key requirement of any strategy is to maintain easy access to the huge universe of existing C/C++/Fortran numerical and graphics libraries, as these libraries are not likely to be rewritten.

Thus there will always be a need for a foreign function interface, and the problem is to provide a flexible and type-safe language that does not force developers to use another unfamiliar, less flexible, and error-prone language to optimize the hot spots.

Dominick

> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> [...]
Re: [Rd] Julia
There are many experts on this topic. I'll keep this short.

Newer Fortran languages allow for call by value, but call by reference is the typical and, historically, the only approach (there was a time when you could change the value of 1 to 2!).

C "only" calls by value except that the value can be a pointer! So, havoc is just a * away.

I'm very pleased to be on this list and read the discussion. Thank you Douglas Bates for sending the first message.

I like R and will continue to use it. However, I also think that strict "call by value" can get you into trouble, just trouble of a different kind. I'm not sure we will ever yearn for "Julia ouR-Julia", but it is sure fun to think about what might be possible with this language. And having fun is one key objective.

Nick Crookston

2012/3/5 oliver :
> [...]
Re: [Rd] Julia
On Tue, Mar 6, 2012 at 3:56 AM, oliver wrote:
> On Mon, Mar 05, 2012 at 04:54:05PM -0800, Nicholas Crookston wrote:
>> There are many experts on this topic. I'll keep this short.
>>
>> Newer Fortran Languages allow for call by value, but call by reference
>> is the typical and historically, the only approach (there was a time
>> when you could change the value of 1 to 2!).
>
> Oh, strange.
>
>>
>> C "only" calls by value except that the value can be a pointer! So,
>> havoc is just a * away.
> [...]
>
> For me there was no "havoc" at this point, but for others maybe.
>
> There are also other languages that only use call-by-value...
> ...functional languages are that way in principle.
>
> Nevertheless, internally they may heavily use pointers, and
> even if you have values that are large arrays, for example,
> they internally just pass a pointer to that data structure.
> (That's why functional languages are not necessarily slow
> just because you act on large data and have no references
> in that language. It is a common misunderstanding that functional
> languages must be slow because they have no references.)
> The pointer stuff is just hidden.
>
> Even if they ((non-purely) functional languages) have references,
> their concept of references is different. (See OCaml for example.)
> There you can use references to change values in place, but the
> reference itself is a functional value, and you will never have
> access to the pointer stuff directly. Hence no problems with
> memory arithmetic, dangling pointers or null pointers.
>
> [...]
>> I like R and will continue to use it. However, I also think that
>> strict "call by value" can get you into trouble, just trouble of a
>> different kind.
>
> Can you elaborate more on this?
> What problems do you have in mind?
> And what kind of references do you have in mind?
> The C-like pointers or something like OCaml's refs?

OCaml refs are an "escape hatch" from the pure
functional programming paradigm where nothing can
be changed once given a value, an extreme form of
pass-by-value. Similarly, most languages that are
advertised as pass-by-value include some kind of
escape hatch that permits you to work with pointers
(or mutable vectors) for improved runtime performance.

The speed issues arise for two main reasons: interpreting
code is much slower than running machine code, and
copying large data structures can be expensive. Pass-by-value
semantics forces such copies in many situations where
the compiler/interpreter cannot safely optimize them away.

Based on the video, Julia manages the speed issue by
viewing everything like a template, thus generating
new methods based on type inference. This means there
isn't a lot of runtime type checking for dispatch, because
customized methods were already generated, but this
can lead to another problem: code bloat. There are no
free lunches.

>> I'm not sure we will ever yearn for "Julia ouR-Julia",
>> but it is sure fun to think about what might be possible with this
>> language. And having fun is one key objective.
>
> I have fun if things work.
> And if the tools do what I want to achieve...
> ...and the fun is better if they do it elegantly.
>
> Do you ask for references in R?
> And what kind of references do you have in mind,
> and why does it hurt you not to have them?
>
> Can you give examples, so that it's easier to see
> where you miss something?
>
>
> Ciao,
>    Oliver
>
> P.S.: The speed issue of R came up more than once;
>       it was mentioned in some blog posts. Would it make
>       sense to start a separate thread about it?
>       One of the blog articles I read complained about
>       how NA / missing values were handled, and said that NA
>       should maybe be thrown out, just to get higher speed.
>       I would not like to have that. Handling NA as a special
>       case IMHO is a very good way. I don't remember if the
>       article I have in mind just argued about HOW this was
>       handled, or if it should be thrown out completely.
>       Making the handling of it better and more performant I
>       think is a good idea; ignoring NA IMHO is a bad idea.
>
>       But maybe that really would be worth a separate thread?
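As a rough R-side illustration of dispatch on argument types, here is a minimal sketch using S4 generics, which the first message of this thread compares to Julia's multiple dispatch. The generic `describe_pair` and its methods are hypothetical names made up for this example. It only shows that the method is selected from the classes of both arguments at call time; it does not model how Julia generates and compiles specialized methods.

```r
library(methods)  # S4 support; attached by default in most R sessions

# Hypothetical generic that dispatches on the classes of BOTH arguments.
setGeneric("describe_pair", function(a, b) standardGeneric("describe_pair"))

# One method per combination of argument classes.
setMethod("describe_pair", signature("numeric", "numeric"),
          function(a, b) a + b)
setMethod("describe_pair", signature("character", "character"),
          function(a, b) paste(a, b))
setMethod("describe_pair", signature("numeric", "character"),
          function(a, b) paste(a, b))

num_result <- describe_pair(1, 2)      # uses the numeric/numeric method
chr_result <- describe_pair("a", "b")  # uses the character/character method
mix_result <- describe_pair(1, "b")    # uses the numeric/character method
```

In R the method lookup happens when the call is evaluated, so this sketch illustrates the dispatch model only, not the speed argument made above.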
Re: [Rd] Julia
On Wed, Mar 07, 2012 at 10:31:14AM -0500, Dominick Samperi wrote:
> On Tue, Mar 6, 2012 at 3:56 AM, oliver wrote:
> > [...]
> > And what kind of references do you have in mind?
> > The C-like pointers or something like OCaml's refs?
>
> OCaml refs are an "escape hatch" from the pure
> functional programming paradigm where nothing can
> be changed once given a value, an extreme form of
> pass-by-value.

OCaml is not a purely functional language and does not claim
to be one; hence refs are not an "escape hatch" (a term that
seems to have a negative touch to me).

Arrays and strings in OCaml are also imperative.
And with the "mutable" attribute in records, you can also
create imperative record entries.

So it's just a different design / approach than Haskell, for example.
OCaml comes from the ML family of languages.

Being purely functional is beautiful, and therefore nice, on the one
hand; but on the other hand it is also dogmatic.

> Similarly, most languages that are
> advertised as pass-by-value include some kind of
> escape hatch that permits you to work with pointers
> (or mutable vectors) for improved runtime performance.

References in OCaml are NOT pointers.
You do have access in an imperative / in-place way,
but you have NO POINTER STUFF in that language.

  # let a = ref 5;;
  val a : int ref = {contents = 5}
  # a := 7;;
  - : unit = ()
  # a;;
  - : int ref = {contents = 7}
  #

This is in-place modification of the contents of the ref,
without any pointer arithmetic.
"a" is a functional value which hosts an imperative one on the inside.

>
> The speed issues arise for two main reasons: interpreting
> code is much slower than running machine code, and
> copying large data structures can be expensive.

The functional approach often saves time and space.
This is just not well known.

And the distinction of imperative vs. functional has nothing to do
with interpreted vs. directly executed.
  # let mylist_1 = [ 3;5;323 ];;
  val mylist_1 : int list = [3; 5; 323]
  # let mylist_2 = 12 :: mylist_1;;
  val mylist_2 : int list = [12; 3; 5; 323]
  # mylist_1;;
  - : int list = [3; 5; 323]
  # mylist_2;;
  - : int list = [12; 3; 5; 323]
  #

Both lists share the common elements here. No copy is made.
In this case the functional approach is very nice.
This is just a counter-example to "functional is eating up space".

When thinking about the questions here, I think the design of OCaml
addressed all this, and that this was the design decision behind
allowing arrays to be changed imperatively.

  # let my_array = [| 1; 3; 54; 99 |];;
  val my_array : int array = [|1; 3; 54; 99|]
  # my_array;;
  - : int array = [|1; 3; 54; 99|]
  # my_array.(2) <- 9;;
  - : unit = ()
  # my_array;;
  - : int array = [|1; 3; 9; 99|]
  #

If R is rather purely functional here, then the problem addressed
here is that a purely functional approach without any "escape hatc
Re: [Rd] Julia
On Tue, Mar 06, 2012 at 12:49:32PM -0500, Dominick Samperi wrote:
> On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap wrote:
> > S (and its derivatives and successors) promises that functions
> > will not change their arguments, so in an expression like
> >    val <- func(arg)
> > you know that arg will not be changed.  You can
> > do that by having func copy arg before doing anything,
> > but that uses space and time that you want to conserve.
> > If arg is not a named item in any environment then it
> > should be fine to write over the original because there
> > is no way the caller can detect that shortcut.  E.g., in
> >     cx <- cos(runif(n))
> > the cos function does not need to allocate new space for
> > its output, it can just write over its input because, without
> > a name attached to it, the caller has no way of looking
> > at what runif(n) returned.  If you did
> >     x <- runif(n)
> >     cx <- cos(x)

You have two names here, x and cx, hence your example does not fit
what you want to explain.

A better example would be:
  x <- runif(n)
  x <- cos(x)

> > then cos would have to allocate new space for its output
> > because overwriting its input would affect a subsequent
> >     sum(x)
> > I suppose that end-users and function-writers could learn
> > to live with having to decide when to copy, but not having
> > to make that decision makes S more pleasant (and safer) to use.
> > I think that is a major reason that people are able to
> > share S code so easily.
>
> But don't forget the "Holy Grail" that Doug mentioned at the
> start of this thread: finding a flexible language that is also
> fast. Currently many R packages employ C/C++ components
> to compensate for the fact that the R interpreter can be slow,
> and the pass-by-value semantics of S provides no protection
> here.
[...]

The distinction imperative vs. functional has nothing to do with
the distinction interpreted vs. directly executed.
Thinking again about the problem that was mentioned here,
I think it might be circumvented.

Looking again at R's properties, and looking again into U. Ligges'
"Programmieren in R", I saw it mentioned that in R anything (?!)
is an object... so then it's OOP; but it was also mentioned that
R is a functional language. This does not mean that it's purely
functional or has no imperative data structures.

As R relies heavily on vectors, here we have an imperative
data structure.

So it rather looks to me as if "<-" works in-place on the vectors,
even though "<-" itself is a function (which does not matter for
the problem).

If that's true (I assume here that it is; correct me if it's wrong),
then I think assigning with "<<-" and assign() would also do an
imperative (in-place) change of the contents.

Then the copying-of-big-objects-when-passed-as-args problem can be
circumvented by working either on a variable in the GlobalEnv (and
using "<<-"), or by using a certain environment for the big data and
passing its name (and the variable) as a value to the function, which
then uses assign() and get() to work on that data.
Then in-place modification should be possible.

>
> In 2008 Ross Ihaka and Duncan Temple Lang published the
> paper "Back to the Future: Lisp as a base for a statistical
> computing system" where they propose Common
> Lisp as a new foundation for R. They suggest that
> this could be done while maintaining the same
> familiar R syntax.
>
> A key requirement of any strategy is to maintain
> easy access to the huge universe of existing
> C/C++/Fortran numerical and graphics libraries,
> as these libraries are not likely to be rewritten.
>
> Thus there will always be a need for a foreign
> function interface, and the problem is to provide
> a flexible and type-safe language that does not
> force developers to use another unfamiliar,
> less flexible, and error-prone language to
> optimize the hot spots.

If I hear "type safe" I would rather think about OCaml or maybe Ada,
but not LISP.
Also, LISP has so many "("s and ")"s that it drives people crazy ;-)

Ciao,
   Oliver
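The environment-based workaround described above can be sketched as follows. This is a minimal illustration, not a statement about R's internals: the helper names `scale_in_env` and `scale_copy` are hypothetical, and the sketch only relies on the fact that environments are passed by reference while vectors behave as values. Whether R avoids copying the vector itself inside `assign()` is not something this code demonstrates.

```r
# Hypothetical helper: rescales the vector stored under `name` in `env`.
# The environment is shared with the caller, so the change is visible outside.
scale_in_env <- function(env, name, factor) {
  assign(name, get(name, envir = env) * factor, envir = env)
  invisible(NULL)
}

big <- new.env()
assign("x", c(1, 2, 3), envir = big)
scale_in_env(big, "x", 10)
result_env <- get("x", envir = big)  # the caller sees the modified data

# For contrast: an ordinary vector argument behaves like a value.
scale_copy <- function(v, factor) {
  v <- v * factor   # changes only the local v
  v
}
y <- c(1, 2, 3)
scaled <- scale_copy(y, 10)          # y itself is left untouched
```

The design trade-off is the one discussed in this thread: the environment version gives up the guarantee that a function does not change its caller's data.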
Re: [Rd] Julia
No, my examples are what I meant.  My point was that a function, say cos(),
can act like it does call-by-value but conserve memory when it can, if it can
distinguish between the case
    cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value
and the case
    x <- runif(n)
    cx <- cos(x=x) # return value cannot reuse the argument's memory, so allocate space for return value
    sum(x)         # Otherwise sum(x) would return sum(cx)
The function needs to know if a memory block is referred to by a name in
any environment in order to do that.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: oliver [mailto:oli...@first.in-berlin.de]
> Sent: Wednesday, March 07, 2012 10:22 AM
> To: Dominick Samperi
> Cc: William Dunlap; R-devel
> Subject: Re: [Rd] Julia
>
> [...]
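The user-visible half of this argument can be checked from R code. The sketch below assumes nothing beyond base R; the function `zero_out` is a hypothetical name used only for illustration. It shows that a named argument survives a call unchanged. Whether the evaluator actually reuses the memory of an unnamed argument such as `runif(n)` is an implementation detail that cannot be observed this way.

```r
set.seed(1)            # only to make the sketch reproducible
n <- 5
x <- runif(n)
x_before <- x          # keep a reference value for comparison

cx <- cos(x)           # x is named, so the caller must still see it intact
unchanged <- identical(x, x_before)

# Hypothetical function that overwrites every element of its argument.
zero_out <- function(v) {
  v[] <- 0             # modifies the function's own copy only
  v
}
z <- zero_out(x)
still_unchanged <- identical(x, x_before)

total <- sum(x)        # still the sum of the original random numbers
```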
Re: [Rd] Julia
Hi,

ok, thank you for clarifying what you meant.
You only referred to the reuse of the args,
not of an already existing vector.
So I overgeneralized your example.

But when looking at your example,
and how I would implement the cos(),
I doubt I would copy the args
before calculating the result.

Just allocate a result vector, and then place the cos()
of the input vector into the result vector.

I haven't looked at how it is done in R,
but I would guess it's like that.

In pseudo-code something like this:
  cos_val[idx] = cos( input_val[idx] );

But R also handles complex data with cos(),
so it will look a bit more laborious.

What I have seen so far from implementing C extensions
for R is rather C-ish, and so you have control
over many details. Copying the input just to read it
would not make sense here.

I doubt that R internally is doing that.
Or did you find that in the R code?

The other problem someone mentioned was *changing* the contents
of a matrix... and that this is NOT done in-place when using
a function for it.
But the namespace-name / variable-name as "references" to the matrix
might solve that problem.

Ciao,
   Oliver

On Wed, Mar 07, 2012 at 07:10:43PM +, William Dunlap wrote:
> [...]
Re: [Rd] Julia
Ah, and you mean that if it's an anonymous array it could be reused
directly from the args.

OK, now I see why you insist on the anonymous data thing.
I didn't grasp it even in my last mail.

But that somehow also relates to what I wrote about reusing
an already existing, named vector.

Just the moment of in-place modification is different.

From
  x <- runif(n)
  cx <- cos(x)

instead of
> >     cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value

to something like

  cx <- runif(n)
  cos( cx, inplace=TRUE)

or

  cos( runif(n), inplace=TRUE)

This way it would be possible to specify the reuse of the input
*explicitly* (without implicit rules like anonymous vs. named values).

In pseudo-code something like this:

   if (in_place == TRUE )
   {
     input_val[idx] = cos( input_val[idx] );
     return input_val;
   }
   else
   {
     result_val = alloc_vec( LENGTH(input_val), ... );
     result_val[idx] = cos( input_val[idx] );
     return result_val;
   }

Does this match what you were looking for?

Ciao,
   Oliver

On Thu, Mar 08, 2012 at 02:56:24PM +0100, oliver wrote:
> [...]
Re: [Rd] Julia
So you propose an inplace=TRUE/FALSE entry for each
argument to each function which may want to avoid
allocating memory?  The major problem is that the function
writer has no idea what the value of inplace should be,
as it depends on how the function gets called.  This makes
writing reusable functions (hence packages) difficult.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: oliver [mailto:oli...@first.in-berlin.de]
> Sent: Thursday, March 08, 2012 7:40 AM
> To: William Dunlap
> Cc: R-devel
> Subject: Re: [Rd] Julia
>
> [...]
Re: [Rd] Julia
I don't think that using in-place modification as a general property
would make sense.

In-place modification brings in side effects, and that would mean
that the order of evaluation can change the result.

To get reliable results, the order of evaluation should not be
a reason for different results, and that's the reason why
the functional approach is much better for reliable programs.

So, in general, I would say this feature is a no-no.
In general I would rather discourage in-place modification.

For certain cases it might help...
but for such cases either such a boolean flag
or programming a separate module in C would make sense.

There could also be a global in-place flag that might be used
(via options maybe), but if such a thing were implemented,
the default value should be FALSE.

Ciao,
   Oliver

On Thu, Mar 08, 2012 at 04:21:42PM +, William Dunlap wrote:
> [...]
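The point about side effects and evaluation order can be made concrete with a small R sketch. Everything here is hypothetical illustration: `state`, `bump` and `double_it` are invented names, and an environment is used only because it is the simplest way to get shared mutable state in R.

```r
# Shared mutable state, held in an environment so functions can change it.
state <- new.env()

# Both functions modify the shared counter as a side effect.
bump <- function() {
  state$count <- state$count + 1
  state$count
}
double_it <- function() {
  state$count <- state$count * 2
  state$count
}

# Same two calls, different order, different results.
state$count <- 1
u1 <- bump()
v1 <- double_it()

state$count <- 1
v2 <- double_it()
u2 <- bump()

# Pure counterparts: each result depends only on the argument,
# so the order in which the two calls are made does not matter.
bump_pure   <- function(x) x + 1
double_pure <- function(x) x * 2
p1 <- bump_pure(1)
q1 <- double_pure(1)
```

With the stateful versions, `u1` and `u2` differ although both come from a call to `bump()`; the pure versions always return the same value for the same input.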
Re: [Rd] Julia
I guess my point is not getting across. The user should see the functional programming style, but under the hood the evaluator should be able to use whatever memory- and time-saving tricks it can. Julia seems to want to be a nonfunctional language, which I think makes it harder to write the sort of easily reusable functions that S allows.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: oliver [mailto:oli...@first.in-berlin.de]
> Sent: Thursday, March 08, 2012 2:23 PM
> To: William Dunlap
> Cc: R-devel
> Subject: Re: [Rd] Julia
>
> I don't think that using in-place modification as a general property would make sense.
>
> In-place modification brings in side-effects and that would mean that the order of evaluation can change the result.
>
> To get reliable results, the order of evaluation should not be the reason for different results, and that's the reason why the functional approach is much better for reliable programs.
>
> So, in general I would say, this feature is a no-no.
> In general I would rather discourage in-place modification.
>
> For some certain cases it might help...
> but for such certain cases either such a boolean flag or programming a separate module in C would make sense.
>
> There could also be a global in-place-flag that might be used (via options maybe) but if such a thing would be implemented, the default value should be FALSE.
>
> Ciao,
>    Oliver
>
> On Thu, Mar 08, 2012 at 04:21:42PM +, William Dunlap wrote:
> > So you propose an inplace=TRUE/FALSE entry for each argument to each function which may want to avoid allocating memory? The major problem is that the function writer has no idea what the value of inplace should be, as it depends on how the function gets called. This makes writing reusable functions (hence packages) difficult.
Re: [Rd] Julia
Aha, ok. So you are not looking at that one feature in particular (the anonymous-evaluation trick), but are asking for better internal optimization in general.

About your example of anonymous (unnamed) values passed to a function, I would ask: do you want to write whole programs without using names/variables? I think that would be much harder than just adding a boolean flag such as inplace=TRUE.

So to your objection that the flag proposal hurts usability, I have to reply: it is even worse to write code without variable names, putting everything into anonymous data structures that are created inside function applications, with more unnamed calculations nested inside each of the arguments. You will end up not only with a mess, but also with slower calculations, because unnamed resources must be computed more than once if they are used more than once.

So I think that you are just asking for more internal optimizations. Fine. But I think an internal intermediate code (that can be optimized) would be better than that one "enhancement" of reusing anonymous data for the output.

Ciao,
   Oliver

On Thu, Mar 08, 2012 at 10:27:22PM +, William Dunlap wrote:
> I guess my point is not getting across. The user should see the functional programming style but under the hood the evaluator should be able to use whatever memory and time saving tricks it can. Julia seems to want to be a nonfunctional language, which I think makes it harder to write the sort of easily reusable functions that S allows.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> > -----Original Message-----
> > From: oliver [mailto:oli...@first.in-berlin.de]
> > Sent: Thursday, March 08, 2012 2:23 PM
> > To: William Dunlap
> > Cc: R-devel
> > Subject: Re: [Rd] Julia
> >
> > I don't think that using in-place modification as a general property would make sense.