[Rd] Julia

2012-03-01 Thread Douglas Bates
My purpose in mentioning the Julia language (julialang.org) here is
not to start a flame war.  I find it to be a very interesting
development and others who read this list may want to read about it
too.

It is still very much early days for this language - about the same
stage as R was in 1995 or 1996 when only a few people knew about it -
but Julia holds much potential.  There is a thread about "R and
statistical programming" on groups.google.com/group/julia-dev.  As
always happens, there is a certain amount of grumbling of the "R IS
SO SLOW" flavor but there is also some good discussion regarding
features of R (well, S actually) that are central to the language.
(Disclaimer: I am one of the participants discussing the importance of
data frames and formulas in R.)

If you want to know why Julia has attracted a lot of interest very
recently (like in the last 10 days): as a language it uses multiple
dispatch (like S4 methods), with methods being compiled on the fly
using the LLVM (http://llvm.org) infrastructure.  In some ways it
achieves the Holy Grail of languages like R, Matlab, NumPy, ... in
that it combines the speed of compiled languages with the flexibility
of a high-level interpreted language.
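
To make the multiple-dispatch idea concrete, here is a minimal sketch in
Python (not Julia, and not how Julia or S4 implement it). The registry
`_methods`, the decorator `register`, and the generic function `combine`
are hypothetical names invented for this illustration. The sketch only
shows a method being selected from the runtime types of all arguments;
it does not attempt the on-the-fly compilation part.

```python
# Hypothetical illustration of multiple dispatch: the implementation is
# chosen from the types of *both* arguments, not just the first one.
_methods = {}


def register(*types):
    """Register an implementation for an exact tuple of argument types."""
    def decorator(func):
        _methods[types] = func
        return func
    return decorator


def combine(a, b):
    """Generic function: look up the method matching (type(a), type(b))."""
    impl = _methods.get((type(a), type(b)))
    if impl is None:
        raise TypeError(
            f"no method for ({type(a).__name__}, {type(b).__name__})"
        )
    return impl(a, b)


@register(int, int)
def _(a, b):
    # Two integers: add them.
    return a + b


@register(str, str)
def _(a, b):
    # Two strings: join them with a space.
    return a + " " + b


@register(int, str)
def _(a, b):
    # An integer and a string: repeat the string.
    return b * a
```

This sketch matches exact types only; a real system would also walk the
type hierarchy to find the most specific applicable method.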

One of the developers, Jeff Bezanson, gave a seminar about the design
of the language at Stanford yesterday, and the video is archived at
http://www.stanford.edu/class/ee380/.  You don't see John Chambers on
camera but I am reasonably certain that a couple of the questions and
comments came from him.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Julia

2012-03-01 Thread Jeffrey Ryan
Doug,

Agreed on the interesting point - looks like it has some real promise.
I think the spike in interest could be attributable to a Feb 20 tweet
by Mike Loukides (an editor at O'Reilly):

https://twitter.com/#!/mikeloukides/status/171773229407551488

That is exactly the moment I stumbled upon it.

Jeff




-- 
Jeffrey Ryan
jeffrey.r...@lemnica.com

www.lemnica.com
www.esotericR.com

R/Finance 2012: Applied Finance with R
www.RinFinance.com

See you in Chicago



Re: [Rd] Julia

2012-03-01 Thread Douglas Bates
On Thu, Mar 1, 2012 at 11:20 AM, Jeffrey Ryan  wrote:
> Doug,
>
> Agreed on the interesting point - looks like it has some real promise.
>  I think the spike in interest could be attributable to Mike
> Loukides's tweet on Feb 20. (editor at O'Reilly)
>
> https://twitter.com/#!/mikeloukides/status/171773229407551488
>
> That is exactly the moment I stumbled upon it.

I think Jeff Bezanson attributes the interest to a blog posting by
Viral Shah, another member of the development team, that hit Reddit.
He said that, with Viral now in India, it all happened overnight for
those in North America and he awoke the next day to find a firestorm
of interest.  I ran across Julia in the Release Notes of LLVM and
mentioned it to Dirk Eddelbuettel who posted about it on Google+ in
January.  (Dirk, being much younger than I, knows about these
new-fangled social media things and I don't.)



Re: [Rd] Julia

2012-03-01 Thread Kjetil Halvorsen
Can somebody post a link to the video? I can't find it; searching for
"Julia" on the Stanford YouTube channel gives nothing.

Kjetil



Re: [Rd] Julia

2012-03-01 Thread Ted Harding
http://julialang.org/blog

Then click on "Stanford Talk Video".
Then click on "available here".

Ted.


-
E-Mail: (Ted Harding) 
Date: 01-Mar-2012  Time: 20:47:42
This message was sent by XFMail



Re: [Rd] Julia

2012-03-02 Thread oliver
On Thu, Mar 01, 2012 at 11:06:51AM -0600, Douglas Bates wrote:
> My purpose in mentioning the Julia language (julialang.org) here is
> not to start a flame war.  I find it to be a very interesting
> development and others who read this list may want to read about it
> too.
[...]


Very interesting language.
Thank you for mentioning it here.

Compiling from the github-sources was easy.

Will explore it over the next few days.

Seems not to be very specific to statistics,
but good for math in general.

Not sure whether it might make sense to combine
R and Julia in the long run (I mean: combining by
providing interfaces between them, calling one from the
other, merging code, or using libraries from one
in the other).

Ciao,
   Oliver



Re: [Rd] Julia

2012-03-05 Thread William Dunlap
I haven't used Julia yet, but from my quick reading
of the docs it looks like arguments to functions are
passed by reference and not by value, so functions
can change their arguments.  My recollection from when
I first started using S (in the course of a job helping
profs and grad students do statistical programming, c. 1983)
is that not having to worry about in-place algorithms changing
your data gave S a big advantage over Fortran or C.
While this feature could slow things down and increase
memory use, I felt that it made it easier to write correct
code and to use functions that others had written.
Does Julia have a const declaration or other
means of controlling or documenting that a given function
will or will not change the data passed into it?
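
The concern can be illustrated in any language that hands functions a
reference to a mutable object. A minimal sketch in Python, chosen for
illustration only (it says nothing about Julia's actual rules); the
function names `scale_in_place` and `scale_copy` are hypothetical:

```python
def scale_in_place(values, factor):
    """Overwrite the caller's list; the caller sees the change."""
    for i in range(len(values)):
        values[i] *= factor
    return values


def scale_copy(values, factor):
    """Leave the caller's list alone and return a new list."""
    return [v * factor for v in values]


data = [1.0, 2.0, 3.0]
scaled = scale_copy(data, 10)   # data is still the original values
scale_in_place(data, 10)        # data itself has now been overwritten
```

Nothing in the call `scale_in_place(data, 10)` tells the reader that
`data` will be modified; only the function's name or documentation does.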

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 



Re: [Rd] Julia

2012-03-05 Thread oliver
On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> I haven't used Julia yet, but from my quick reading
> of the docs it looks like arguments to functions are
> passed by reference and not by value, so functions
> can change their arguments.  My recollection from when
> I first started using S (in the course of a job helping
> profs and grad students do statistical programming, c. 1983)
> is that not having to worry about in-place algorithms changing
> your data gave S a big advantage over Fortran or C.
[...]


C also uses Call-by-Value.
Fortran I don't know in detail.


> While this feature could slow things down and increase
> memory code, I felt that it made it easier to write correct
> code and to use functions that others had written.

Yes, I also think that call-by-value reduces
errors in code.

From what I have read, Julia is like MATLAB plus more features for programming.
Does MATLAB also use only call-by-reference?


> Does Julia have a const declaration or other
> means of controlling or documenting that a given function
> will or will not change the data passed into it?

I have not explored it in detail so far.
Maybe the original poster has already looked at this in more depth?


Ciao,
   Oliver



Re: [Rd] Julia

2012-03-05 Thread Hervé Pagès

Hi Oliver,

On 03/05/2012 09:08 AM, oliver wrote:
> On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
>> I haven't used Julia yet, but from my quick reading
>> of the docs it looks like arguments to functions are
>> passed by reference and not by value, so functions
>> can change their arguments.  My recollection from when
>> I first started using S (in the course of a job helping
>> profs and grad students do statistical programming, c. 1983)
>> is that not having to worry about in-place algorithms changing
>> your data gave S a big advantage over Fortran or C.
> [...]
>
>
> C also uses Call-by-Value.

C *only* uses Call-by-Value.

Cheers,
H.





--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:(206) 667-1319



Re: [Rd] Julia

2012-03-05 Thread oliver
On Mon, Mar 05, 2012 at 03:58:59PM -0800, Hervé Pagès wrote:
> Hi Oliver,
> 
> On 03/05/2012 09:08 AM, oliver wrote:
> >On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> >>I haven't used Julia yet, but from my quick reading
> >>of the docs it looks like arguments to functions are
> >>passed by reference and not by value, so functions
> >>can change their arguments.  My recollection from when
> >>I first started using S (in the course of a job helping
> >>profs and grad students do statistical programming, c. 1983)
> >>is that not having to worry about in-place algorithms changing
> >>your data gave S a big advantage over Fortran or C.
> >[...]
> >
> >
> >C also uses Call-by-Value.
> 
> C *only* uses Call-by-Value.
[...]


Yes, that's what I meant.

By "also" I meant that it uses call-by-value, as some
other languages do too.


Ciao,
   Oliver



Re: [Rd] Julia

2012-03-05 Thread Duncan Murdoch

On 12-03-05 6:58 PM, Hervé Pagès wrote:

> Hi Oliver,
>
> On 03/05/2012 09:08 AM, oliver wrote:
>> On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
>>> I haven't used Julia yet, but from my quick reading
>>> of the docs it looks like arguments to functions are
>>> passed by reference and not by value, so functions
>>> can change their arguments.  My recollection from when
>>> I first started using S (in the course of a job helping
>>> profs and grad students do statistical programming, c. 1983)
>>> is that not having to worry about in-place algorithms changing
>>> your data gave S a big advantage over Fortran or C.
>> [...]
>>
>>
>> C also uses Call-by-Value.
>
> C *only* uses Call-by-Value.


While literally true, the fact that you can't send an array by value, 
and must send the value of a pointer to it, kind of supports Bill's 
point:  in C, you mostly end up sending arrays by reference.


Duncan Murdoch



Re: [Rd] Julia

2012-03-05 Thread William Dunlap
Yes, C does use call by value, always.  However, data arrays
are almost always passed via pointers to malloc'ed space,
so, effectively, data arrays are passed by reference.
(One can put a 'const type*' in the prototype of a function to declare
that the data pointed to will not be changed, but it is
up to documentation or coding standards to let someone know that
data pointed to will likely be changed.)

I find R's (& S+'s & S's) copy-on-write-if-not-copying-would-be-discoverable-
by-the-user mechanism for giving the illusion of pass-by-value a good way
to structure the contract between the function writer and the function user.
Does Julia have the tools to let a function writer or user decide whether
a function really needs to copy its arguments or not?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 



Re: [Rd] Julia

2012-03-06 Thread oliver
On Mon, Mar 05, 2012 at 04:54:05PM -0800, Nicholas Crookston wrote:
> There are many experts on this topic.  I'll keep this short.
> 
> Newer Fortran Languages allow for call by value, but call by reference
> is the typical and historically, the only approach (there was a time
> when you could change the value of 1 to 2!).

Oh, strange.


> 
> C "only" calls by value except that the value can be a pointer! So,
> havoc is just a * away.
[...]

For me there was no "havoc" at this point, but for others maybe.

There are also other languages that only use call-by-value;
functional languages are that way in principle.

  Nevertheless, internally they may use pointers heavily:
  even if your values are large arrays, for example,
  they internally just pass a pointer to that data structure.
  That is why functional languages are not necessarily slow
  just because you act on large data and have no references
  in the language. (It is a common misunderstanding that functional
  languages must be slow because they have no references.)
  The pointer stuff is just hidden.

Even where (non-pure) functional languages do have references,
their concept of references is different. (See OCaml, for example.)
There you can use references to change values in place, but the
reference itself is a functional value, and you never have
access to the pointer stuff directly. Hence no problems with
pointer arithmetic, dangling pointers, or null pointers.



[...]
> I like R and will continue to use it. However, I also think that
> strict "call by value" can get you into trouble, just trouble of a
> different kind.

Can you elaborate more on this?
What problems do you have in mind?
And what kind of references do you have in mind?
The C-like pointers or something like OCaml's refs?


> I'm not sure we will ever yearn for "Julia ouR-Julia",
> but it is sure fun to think about what might be possible with this
> language. And having fun is one key objective.

I have fun if things work.
And if the tools do what I want to achieve...
...and the fun is better if they do it elegantly.

Are you asking for references in R?
And what kind of references do you have in mind,
and why does it hurt you not to have them?

Can you give examples, so that it's easier to see
where you miss something?


Ciao,
   Oliver

P.S.: The speed issue of R has come up more than once;
  it was mentioned in some blog posts. Would it make
  sense to start a separate thread about it?
  One of the blog articles I read complained about
  how NA / missing values are handled, and suggested that NA
  should perhaps be thrown out, just to get higher speed.
  I would not like that. Handling NA as a special
  case is IMHO a very good approach. I don't remember whether the
  article I have in mind just argued about HOW NA is
  handled, or whether it should be thrown out completely.
  Making the handling better and more performant is, I
  think, a good idea; ignoring NA is IMHO a bad idea.

  But maybe that really would be worth a separate thread?



Re: [Rd] Julia

2012-03-06 Thread oliver
On Mon, Mar 05, 2012 at 07:33:10PM -0500, Duncan Murdoch wrote:
> On 12-03-05 6:58 PM, Hervé Pagès wrote:
> >Hi Oliver,
> >
> >On 03/05/2012 09:08 AM, oliver wrote:
> >>On Mon, Mar 05, 2012 at 03:53:28PM +, William Dunlap wrote:
> >>>I haven't used Julia yet, but from my quick reading
> >>>of the docs it looks like arguments to functions are
> >>>passed by reference and not by value, so functions
> >>>can change their arguments.  My recollection from when
> >>>I first started using S (in the course of a job helping
> >>>profs and grad students do statistical programming, c. 1983)
> >>>is that not having to worry about in-place algorithms changing
> >>>your data gave S a big advantage over Fortran or C.
> >>[...]
> >>
> >>
> >>C also uses Call-by-Value.
> >
> >C *only* uses Call-by-Value.
> 
> While literally true, the fact that you can't send an array by
> value, and must send the value of a pointer to it, kind of supports
> Bill's point:  in C, you mostly end up sending arrays by reference.
[...]

It's a problem of how the term "reference" is used.
If you want to limit the possible confusion, better say:
passing the pointer by value.

Or: passing the address of the array/struct/...
by value.

To say that you pass the array by reference is a shorthand,
which may create confusion.

Just avoiding the word "reference" here would make it clearer.
AFAIK in C++ references are different from pointers. (Others
who know C++ well might explain this in detail.)

So, using the same term for many different concepts can create
a mess in understanding.
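
The difference between "passing a reference" and "passing by reference"
can be sketched in Python, where, much as with a C pointer argument, the
reference itself is handed over by value. This is an illustration only;
the function names `rebind` and `mutate` are hypothetical:

```python
def rebind(arr):
    # Assigning to the parameter only changes the local name;
    # the caller's variable still refers to the original list.
    arr = [0, 0, 0]
    return arr


def mutate(arr):
    # Writing through the reference changes the object that the
    # caller's variable refers to as well.
    arr[0] = 99


x = [1, 2, 3]
rebind(x)   # x still refers to the original, unchanged list
mutate(x)   # the first element of x has now been overwritten
```

With true pass-by-reference, the assignment inside `rebind` would have
replaced the caller's `x`; here it cannot.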


Ciao,
   Oliver



Re: [Rd] Julia

2012-03-06 Thread oliver
On Tue, Mar 06, 2012 at 12:35:32AM +, William Dunlap wrote:
[...]
> I find R's (& S+'s & S's) copy-on-write-if-not-copying-would-be-discoverable-
> by-the-user mechanism for giving the illusion of pass-by-value a good way
> to structure the contract between the function writer and the function user.
[...]


Can you elaborate more on this,
especially on the ...-...-...-if-not-copying-would-be-discoverable-by-the-user
stuff?

What do you mean by discoverability of not-copying?

Ciao,
   Oliver



Re: [Rd] Julia

2012-03-06 Thread William Dunlap
S (and its derivatives and successors) promises that functions
will not change their arguments, so in an expression like
   val <- func(arg)
you know that arg will not be changed.  You can
do that by having func copy arg before doing anything,
but that uses space and time that you want to conserve.
If arg is not a named item in any environment then it
should be fine to write over the original because there
is no way the caller can detect that shortcut.  E.g., in
   cx <- cos(runif(n))
the cos function does not need to allocate new space for
its output, it can just write over its input because, without
a name attached to it, the caller has no way of looking
at what runif(n) returned.  If you did
   x <- runif(n)
   cx <- cos(x)
then cos would have to allocate new space for its output
because overwriting its input would affect a subsequent
   sum(x)
I suppose that end-users and function-writers could learn
to live with having to decide when to copy, but not having
to make that decision makes S more pleasant (and safer) to use.
I think that is a major reason that people are able to
share S code so easily.
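
A rough sketch of the copy-on-write idea described above, written in
Python rather than S or R, and not a description of how any S or R
implementation actually works. The class `CowVector` and its `share`
method are hypothetical; a shared counter stands in for the question
"could anyone else observe a write to this buffer?".

```python
class CowVector:
    """Hypothetical copy-on-write vector (illustration only)."""

    def __init__(self, data):
        self._buf = list(data)
        self._owners = [1]   # shared counter: vectors using this buffer

    def share(self):
        """Cheap 'copy': a new vector pointing at the same buffer."""
        other = CowVector.__new__(CowVector)
        other._buf = self._buf
        other._owners = self._owners
        self._owners[0] += 1
        return other

    def __getitem__(self, i):
        return self._buf[i]

    def __setitem__(self, i, value):
        if self._owners[0] > 1:
            # Another vector could observe the write, so detach first.
            self._owners[0] -= 1
            self._buf = list(self._buf)
            self._owners = [1]
        # Sole owner at this point: writing in place is undetectable.
        self._buf[i] = value

    def to_list(self):
        return list(self._buf)


x = CowVector([1, 2, 3])
y = x.share()    # no data copied yet
y[0] = 10        # the write triggers the copy; x keeps its values
```

A vector with a single owner, like the unnamed result of runif(n) in the
example above, is overwritten in place; only a buffer that is visible
under a second name pays for a copy.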

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 



Re: [Rd] Julia

2012-03-06 Thread Dominick Samperi
On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap  wrote:
> S (and its derivatives and successors) promises that functions
> will not change their arguments, so in an expression like
>   val <- func(arg)
> you know that arg will not be changed.  You can
> do that by having func copy arg before doing anything,
> but that uses space and time that you want to conserve.
> If arg is not a named item in any environment then it
> should be fine to write over the original because there
> is no way the caller can detect that shortcut.  E.g., in
>    cx <- cos(runif(n))
> the cos function does not need to allocate new space for
> its output, it can just write over its input because, without
> a name attached to it, the caller has no way of looking
> at what runif(n) returned.  If you did
>    x <- runif(n)
>    cx <- cos(x)
> then cos would have to allocate new space for its output
> because overwriting its input would affect a subsequent
>    sum(x)
> I suppose that end-users and function-writers could learn
> to live with having to decide when to copy, but not having
> to make that decision makes S more pleasant (and safer) to use.
> I think that is a major reason that people are able to
> share S code so easily.

But don't forget the "Holy Grail" that Doug mentioned at the
start of this thread: finding a flexible language that is also
fast. Currently many R packages employ C/C++ components
to compensate for the fact that the R interpreter can be slow,
and the pass-by-value semantics of S provides no protection
here.

In 2008 Ross Ihaka and Duncan Temple Lang published the
paper "Back to the Future: Lisp as a base for a statistical
computing system" where they propose Common
Lisp as a new foundation for R. They suggest that
this could be done while maintaining the same
familiar R syntax.

A key requirement of any strategy is to maintain
easy access to the huge universe of existing
C/C++/Fortran numerical and graphics libraries,
as these libraries are not likely to be rewritten.

Thus there will always be a need for a foreign
function interface, and the problem is to provide
a flexible and type-safe language that does not
force developers to use another unfamiliar,
less flexible, and error-prone language to
optimize the hot spots.

Dominick



Re: [Rd] Julia

2012-03-07 Thread Nicholas Crookston
There are many experts on this topic.  I'll keep this short.

Newer versions of Fortran allow call by value, but call by reference
is the typical approach and, historically, was the only one (there was
a time when you could change the value of 1 to 2!).

C "only" calls by value except that the value can be a pointer! So,
havoc is just a * away.

I'm very pleased to be on this list and read the discussion. Thank you
Douglas Bates for sending the first message.

I like R and will continue to use it. However, I also think that
strict "call by value" can get you into trouble, just trouble of a
different kind. I'm not sure we will ever yearn for "Julia ouR-Julia",
but it is sure fun to think about what might be possible with this
language. And having fun is one key objective.

Nick Crookston



Re: [Rd] Julia

2012-03-07 Thread Dominick Samperi
On Tue, Mar 6, 2012 at 3:56 AM, oliver  wrote:
> On Mon, Mar 05, 2012 at 04:54:05PM -0800, Nicholas Crookston wrote:
>> There are many experts on this topic.  I'll keep this short.
>>
>> Newer Fortran Languages allow for call by value, but call by reference
>> is the typical and historically, the only approach (there was a time
>> when you could change the value of 1 to 2!).
>
> Oh, strange.
>
>
>>
>> C "only" calls by value except that the value can be a pointer! So,
>> havoc is just a * away.
> [...]
>
> For me there was no "havoc" at this point, but for others maybe.
>
> There are also other languages that only use call-by-value...
> ...functional languages are that way in principle.
>
>  Nevertheless internally they may heavily use pointers and
>  even if you have values that are large arrays for example,
>  they internally just give a pointer to that data structure.
>  (That's why functional languages are not necessarily slow
>  just because you act on large data and have no references
>  in that language. It is a common misunderstanding that functional
>  languages must be slow because they have no references.
>  The pointer stuff is just hidden.)
>
> Even they ((non-purely) functional languages) may have references,
> their concept of references is different. (See OCaml for example.)
> There you can use references to change values in place, but the
> reference itself is a functional value, and you will never have
> access to the pointer stuff directly. Hence no problems with
> pointer arithmetic, dangling pointers or null pointers.
>
>
>
> [...]
>> I like R and will continue to use it. However, I also think that
>> strict "call by value" can get you into trouble, just trouble of a
>> different kind.
>
> Can you elaborate more on this?
> What problems do you have in mind?
> And what kind of references do you have in mind?
> The C-like pointers or something like OCaml's ref's?

OCaml refs are an "escape hatch" from the pure
functional programming paradigm where nothing can
be changed once given a value, an extreme form of
pass-by-value. Similarly, most languages that are
advertised as pass-by-value include some kind of
escape hatch that permits you to work with pointers
(or mutable vectors) for improved runtime performance.

The speed issues arise for two main reasons: interpreting
code is much slower than running machine code, and
copying large data structures can be expensive.
Pass-by-value semantics forces this to happen in
many situations where the compiler/interpreter cannot
safely optimize it away.

Based on the video, Julia manages the speed issue by
viewing everything like a template, thus generating new
methods based on type inference. This means there isn't
a lot of runtime type checking for dispatch, because
customized methods were already generated, but this
can lead to another problem: code bloat. There are
no free lunches.

>> I'm not sure we will ever yearn for "Julia ouR-Julia",
>> but it is sure fun to think about what might be possible with this
>> language. And having fun is one key objective.
>
> I have fun if things work.
> And if the tools do, what I want to achieve...
> ...and the fun is better, if they do it elegantly.
>
> Do you ask for references in R?
> And what kind of references do you have in mind,
> and why does it hurt you not to have them?
>
> Can you give examples, so that it's easier to see
> where you miss something?
>
>
> Ciao,
>   Oliver
>
> P.S.: The speed issue of R has come up more than once;
>      it was mentioned in some blog posts. Would it make
>      sense to start a separate thread about it?
>      One of the blog articles I read complained about
>      how NA / missing values are handled, and suggested that NA
>      should maybe be thrown out, just to get higher speed.
>      I would not like that. Handling NA as a special
>      case is IMHO a very good approach. I don't remember whether the
>      article I have in mind only argued about HOW this is
>      handled, or whether it should be thrown out completely.
>      Making the handling of it better and more performant is,
>      I think, a good idea; ignoring NA is IMHO a bad idea.
>
>      But maybe that really would be worth a separate thread?
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] Julia

2012-03-07 Thread oliver
On Wed, Mar 07, 2012 at 10:31:14AM -0500, Dominick Samperi wrote:
> On Tue, Mar 6, 2012 at 3:56 AM, oliver  wrote:
> > On Mon, Mar 05, 2012 at 04:54:05PM -0800, Nicholas Crookston wrote:
> >> There are many experts on this topic.  I'll keep this short.
> >>
> >> Newer Fortran Languages allow for call by value, but call by reference
> >> is the typical and historically, the only approach (there was a time
> >> when you could change the value of 1 to 2!).
> >
> > Oh, strange.
> >
> >
> >>
> >> C "only" calls by value except that the value can be a pointer! So,
> >> havoc is just a * away.
> > [...]
> >
> > For me there was no "havoc" at this point, but for others maybe.
> >
> > There are also other languages that only use call-by-value...
> > ...functional languages are that way in principle.
> >
> >  Nevertheless internally they may heavily use pointers and
> >  even if you have values that are large arrays for example,
> >  they internally just give a pointer to that data structure.
> >  (That's why functional languages are not necessarily slow
> >  just because you act on large data and have no references
> >  in that language. It is a common misunderstanding that functional
> >  languages must be slow because they have no references.
> >  The pointer stuff is just hidden.)
> >
> > Even they ((non-purely) functional languages) may have references,
> > their concept of references is different. (See OCaml for example.)
> > There you can use references to change values in place, but the
> > reference itself is a functional value, and you will never have
> > access to the pointer stuff directly. Hence no problems with
> > pointer arithmetic, dangling pointers or null pointers.
> >
> >
> >
> > [...]
> >> I like R and will continue to use it. However, I also think that
> >> strict "call by value" can get you into trouble, just trouble of a
> >> different kind.
> >
> > Can you elaborate more on this?
> > What problems do you have in mind?
> > And what kind of references do you have in mind?
> > The C-like pointers or something like OCaml's ref's?
> 
> OCaml refs are an "escape hatch" from the pure
> functional programming paradigm where nothing can
> be changed once given a value, an extreme form of
> pass-by-value.

OCaml is not a purely functional language and does not
claim to be one; hence refs are not an "escape hatch"
(a term that has a negative ring to me).

Arrays and strings in OCaml are also imperative.
And with the "mutable" attribute in records, you can also
create imperative record fields.

So, it's just a different design / approach than
Haskell for example. OCaml is coming from ML-languages.

Purely Functional on the one hand is beautiful, and 
therefore nice; but it also is dogmatic on the other hand.

> Similarly, most languages that are
> advertised as pass-by-value include some kind of
> escape hatch that permits you to work with pointers
> (or mutable vectors) for improved runtime performance.

References in OCaml are NOT pointers.
You do have access in an imperative / in-place way, but you
have NO POINTER STUFF in that language.


# let a = ref 5;;
val a : int ref = {contents = 5}
# a := 7;;
- : unit = ()
# a;;
- : int ref = {contents = 7}
# 


This is in-place modification of the contents of the ref,
without any pointer arithmetic.
"a" is a functional value which hosts an imperative one
on the inside.


> 
> The speed issues arise for two main reasons: interpreting
> code is much slower than running machine code, and
> copying large data structures can be expensive.

The functional approach often saves time and space.
This is just not well known.
And the distinction of imperative vs. functional has
nothing to do with interpreted vs. directly executed.



# let mylist_1 = [ 3;5;323 ];;
val mylist_1 : int list = [3; 5; 323]
# let mylist_2 = 12 :: mylist_1;;
val mylist_2 : int list = [12; 3; 5; 323]
# mylist_1;;
- : int list = [3; 5; 323]
# mylist_2;;
- : int list = [12; 3; 5; 323]
# 


Both lists share the common elements here.
No copy is done.
In this case the functional approach is very nice.

Just a counter-example to "functional is eating up space".
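
What the OCaml toplevel hides can be sketched in C, under the assumption
that a list is a chain of immutable cells. The names `cell`, `cons` and
`list_length` are chosen here for illustration only. Prepending allocates
one new cell and points it at the old list; nothing is copied. (The
sketch never frees anything; OCaml leaves that to its garbage collector.)

```c
#include <stdlib.h>

/* One immutable list cell: a value plus a pointer to the rest. */
typedef struct cell {
    int head;
    const struct cell *tail;
} cell;

/* Prepend `head` to `tail`. Allocates exactly one cell; the cells of
   `tail` are shared with the new list, not copied.
   Returns NULL if allocation fails. */
const cell *cons(int head, const cell *tail)
{
    cell *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    c->head = head;
    c->tail = tail;
    return c;
}

/* Number of cells reachable from `l`. */
size_t list_length(const cell *l)
{
    size_t n = 0;
    for (; l != NULL; l = l->tail) {
        n++;
    }
    return n;
}
```

Building `l1` as 3, 5, 323 and then `l2 = cons(12, l1)` gives a list of
length 4 whose tail pointer is `l1` itself. Because the cells are never
modified, the sharing cannot be observed by code that only reads the lists.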

When thinking about the questions here, I think
the design of OCaml addressed all this, and that this is
the reason why arrays can be changed
imperatively.


# let my_array = [| 1; 3; 54; 99 |];;
val my_array : int array = [|1; 3; 54; 99|]
# my_array;;
- : int array = [|1; 3; 54; 99|]
# my_array.(2) <- 9;;
- : unit = ()
# my_array;;
- : int array = [|1; 3; 9; 99|]
# 


If R is rather purely functional here,
then the problem addressed here is, that
a purely functional approach without any "escape hatc

Re: [Rd] Julia

2012-03-07 Thread oliver
On Tue, Mar 06, 2012 at 12:49:32PM -0500, Dominick Samperi wrote:
> On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap  wrote:
> > S (and its derivatives and successors) promises that functions
> > will not change their arguments, so in an expression like
> >   val <- func(arg)
> > you know that arg will not be changed.  You can
> > do that by having func copy arg before doing anything,
> > but that uses space and time that you want to conserve.
> > If arg is not a named item in any environment then it
> > should be fine to write over the original because there
> > is no way the caller can detect that shortcut.  E.g., in
> >    cx <- cos(runif(n))
> > the cos function does not need to allocate new space for
> > its output, it can just write over its input because, without
> > a name attached to it, the caller has no way of looking
> > at what runif(n) returned.  If you did
> >    x <- runif(n)
> >    cx <- cos(x)

You have two names here, x and cx, hence
your example does not fit into what you want to explain.

A better example would be:
x <- runif(n)
x <- cos(x)



> > then cos would have to allocate new space for its output
> > because overwriting its input would affect a subsequent
> >    sum(x)
> > I suppose that end-users and function-writers could learn
> > to live with having to decide when to copy, but not having
> > to make that decision makes S more pleasant (and safer) to use.
> > I think that is a major reason that people are able to
> > share S code so easily.
> 
> But don't forget the "Holy Grail" that Doug mentioned at the
> start of this thread: finding a flexible language that is also
> fast. Currently many R packages employ C/C++ components
> to compensate for the fact that the R interpreter can be slow,
> and the pass-by-value semantics of S provides no protection
> here.
[...]

The distinction imperative vs. functional has nothing to do
with the distinction interpreted vs. directly executed.




Thinking again on the problem that was mentioned here,
I think it might be circumvented.

Looking again at R's properties, and again into U. Ligges' "Programmieren in
R", I saw it mentioned that in R anything (?!) is an object... so then it's
OOP; but it was also mentioned that R is a functional language. But this does
not mean it's purely functional or has no imperative data structures.

As R relies heavily on vectors, here we have an imperative datastructure.

So, it rather looks to me that "<-" works in-place
on the vectors, even though "<-" itself is a function (which does not matter
for the problem).

If that's true (I assume here that it is; correct me if it's wrong),
then I think assigning with "<<-" and assign() would also do an imperative
(in-place) change of the contents.

Then the copying-of-big-objects-when-passed-as-args problem can be circumvented
by working either on a variable in the GlobalEnv (using "<<-"), or by using a
certain environment for the big data and passing its name (and the variable)
as a value to the function, which then uses assign() and get() to work on that
data.
Then in-place modification should be possible.





> 
> In 2008 Ross Ihaka and Duncan Temple Lang published the
> paper "Back to the Future: Lisp as a base for a statistical
> computing system" where they propose Common
> Lisp as a new foundation for R. They suggest that
> this could be done while maintaining the same
> familiar R syntax.
> 
> A key requirement of any strategy is to maintain
> easy access to the huge universe of existing
> C/C++/Fortran numerical and graphics libraries,
> as these libraries are not likely to be rewritten.
> 
> Thus there will always be a need for a foreign
> function interface, and the problem is to provide
> a flexible and type-safe language that does not
> force developers to use another unfamiliar,
> less flexible, and error-prone language to
> optimize the hot spots.

If I hear "type safe", I would rather think of OCaml
or maybe Ada, but not LISP.

Also, LISP has so many "("s and ")"s
that it drives people crazy ;-)

Ciao,
   Oliver



Re: [Rd] Julia

2012-03-07 Thread William Dunlap
No, my examples are what I meant.  My point was that a function, say cos(),
can act like it does call-by-value but conserve memory when it can, if it can
distinguish between the case
   # no allocation needed, use the input space for the return value
   cx <- cos(x=runif(n))
and the case
   x <- runif(n)
   # return value cannot reuse the argument's memory, so allocate space for it
   cx <- cos(x=x)
   sum(x)  # Otherwise sum(x) would return sum(cx)
The function needs to know whether a memory block is referred to by a name in
any environment in order to do that.
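
A rough sketch in C of that decision, under stated assumptions: the `vec`
type, its `named` counter and the functions below are invented for
illustration and are not R's actual internals. To keep the block free of
libm linking concerns, the elementwise function is passed in as a pointer
and `square` stands in for `cos`; passing `cos` from `<math.h>` would work
the same way.

```c
#include <stdlib.h>

/* Hypothetical vector: `named` counts how many variable names
   currently refer to this block of memory. */
typedef struct {
    size_t len;
    int named;
    double *data;
} vec;

/* Allocate an anonymous (named == 0), zero-filled vector.
   Returns NULL if allocation fails. */
vec *vec_alloc(size_t len)
{
    vec *v = malloc(sizeof *v);
    if (v == NULL) {
        return NULL;
    }
    v->data = calloc(len > 0 ? len : 1, sizeof *v->data);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    v->len = len;
    v->named = 0;
    return v;
}

/* Stand-in for cos(), so the sketch needs no math library. */
double square(double x)
{
    return x * x;
}

/* Apply f elementwise. Looks like call-by-value to the caller: a
   named input is never modified, an anonymous one is overwritten
   and returned, which saves one allocation. */
vec *vec_map(vec *x, double (*f)(double))
{
    vec *out = (x->named == 0) ? x : vec_alloc(x->len);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < x->len; i++) {
        out->data[i] = f(x->data[i]);
    }
    return out;
}
```

The caller never passes a flag. The function decides from `named` alone,
which is why the shortcut cannot be detected from outside.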

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


Re: [Rd] Julia

2012-03-08 Thread oliver
Hi,

ok, thank you for clarifying what you meant.
You only referred to the reuse of the args,
not of an already existing vector.
So I overgeneralized your example.

But when looking at your example,
and at how I would implement cos(),
I doubt I would copy the args
before calculating the result.

Just allocate a result vector, and then place the cos()
of the input vector into the result vector.

I didn't look at how it is done in R,
but I would guess it's like that.


  In pseudo-Code something like that:
cos_val[idx] = cos( input_val[idx] );

But R also handles complex data with cos()
so it will look a bit more laborious.

What I have seen so far from implementing C-extensions
for R is rather C-ish, and so you have the control
on many details. Copying the input just to read it
would not make sense here.

I doubt that R internally is doing that.
Or did you find that in the R code?

The other problem someone mentioned was *changing* the contents
of a matrix... and that this is NOT done in-place when using
a function for it.
But the namespace-name / variable-name as "references" to the matrix
might solve that problem.


Ciao,
  Oliver




Re: [Rd] Julia

2012-03-08 Thread oliver
Ah, and you mean if it's an anonymous array
it could be reused directly from the args.

OK, now I see why you insist on the anonymous data thing.
I didn't grasp it even in my last mail.



But that somehow also relates to what I wrote about reusing an already
existing, named vector.

Just the moment of in-place-modification is different.

From
  x  <- runif(n)
  cx <- cos(x)

instead of
> > cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value

to something like

  cx  <- runif(n)
  cos( cx, inplace=TRUE)

or

  cos( runif(n), inplace=TRUE)




This way it would be possible to specify the reuse
of the input *explicitly* (without implicit rules
like anonymous vs. named values).



In Pseudo-Code something like that:

   if (in_place == TRUE)
   {
      input_val[idx] = cos( input_val[idx] );
      return input_val;
   }
   else
   {
      result_val = alloc_vec( LENGTH(input_val), ... );
      result_val[idx] = cos( input_val[idx] );
      return result_val;
   }
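
The pseudo-code above could be made concrete roughly as follows. This is
a sketch under assumptions, not R code: `map_vec` and `twice` are names
invented here, plain C arrays stand in for R vectors, and the elementwise
function is passed as a pointer so that `twice` can stand in for `cos`
without linking the math library.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for cos(); any double -> double function fits. */
double twice(double x)
{
    return 2.0 * x;
}

/* Apply f to each of the len elements of input.
   in_place == true : overwrite input and return it.
   in_place == false: leave input untouched and return a freshly
                      allocated result that the caller must free().
   Returns NULL if allocation fails. */
double *map_vec(double *input, size_t len, double (*f)(double), bool in_place)
{
    double *out = input;
    if (!in_place) {
        out = malloc((len > 0 ? len : 1) * sizeof *out);
        if (out == NULL) {
            return NULL;
        }
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = f(input[i]);
    }
    return out;
}
```

Here the caller, not the function, carries the responsibility for knowing
whether anyone else still needs the old contents of `input`.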



Does this match what you were looking for?


Ciao,
   Oliver



Re: [Rd] Julia

2012-03-08 Thread William Dunlap
So you propose an inplace=TRUE/FALSE entry for each
argument to each function which may want to avoid
allocating memory?  The major problem is that the function
writer has no idea what the value of inplace should be,
as it depends on how the function gets called.  This makes
writing reusable functions (hence packages) difficult.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


Re: [Rd] Julia

2012-03-08 Thread oliver
I don't think that using in-place modification as a general property would make
sense.

In-place modification brings in side-effects and that would mean that
the order of evaluation can change the result.

To get reliable results, the order of evaluation should not
change the outcome, and that's why
the functional approach is much better for reliable programs.
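
A small C sketch of the order-of-evaluation concern, with function names
invented for illustration. The same two operations, one pure and one with
a side effect, give different answers depending on which runs first.

```c
#include <stddef.h>

/* Pure: reads the vector and changes nothing. */
double vec_sum(const double *v, size_t len)
{
    double s = 0.0;
    for (size_t i = 0; i < len; i++) {
        s += v[i];
    }
    return s;
}

/* Side effect: doubles every element in place. */
void double_in_place(double *v, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        v[i] *= 2.0;
    }
}

/* Sum first, then modify: returns the sum of the original values. */
double sum_then_double(double *v, size_t len)
{
    double s = vec_sum(v, len);
    double_in_place(v, len);
    return s;
}

/* Modify first, then sum: returns the sum of the doubled values. */
double double_then_sum(double *v, size_t len)
{
    double_in_place(v, len);
    return vec_sum(v, len);
}
```

For the vector 1, 2, 3 the first ordering returns 6 and the second
returns 12, although both leave the vector as 2, 4, 6. If
`double_in_place` returned a new vector instead, the order would not
matter for the sum of the original.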

So, in general I would say, this feature is a no-no.
In general I would rather discourage in-place modification.

For certain cases it might help...
but for such cases either such a boolean flag
or programming a separate module in C would make sense.

There could also be a global in-place-flag that might be used (via options
maybe) but if such a thing would be implemented, the default value should be
FALSE.



Ciao,
   Oliver


On Thu, Mar 08, 2012 at 04:21:42PM +, William Dunlap wrote:
> So you propose an inplace=TRUE/FALSE entry for each
> argument to each function which may may want to avoid
> allocating memory?  The major problem is that the function
> writer has no idea what the value of inplace should be,
> as it depends on how the function gets called.  This makes
> writing reusable functions (hence packages) difficult.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> > -Original Message-
> > From: oliver [mailto:oli...@first.in-berlin.de]
> > Sent: Thursday, March 08, 2012 7:40 AM
> > To: William Dunlap
> > Cc: R-devel
> > Subject: Re: [Rd] Julia
> > 
> > Ah, and you mean if it's an anonymous array it could be reused directly
> > from the args.
> > 
> > OK, now I see why you insist on the anonymous data thing.
> > I didn't grasp it even in my last mail.
> > 
> > 
> > 
> > But that somehow also relates to what I wrote about reusing an already
> > existing, named vector.
> > 
> > Just the moment of in-place-modification is different.
> > 
> > From
> >   x  <- runif(n)
> >   cx <- cos(x)
> > 
> > instead of
> > > > cx <- cos(x=runif(n)) # no allocation needed, use the input
> > > > space for the return value
> > 
> > to something like
> > 
> >   cx  <- runif(n)
> >   cos( cx, inplace=TRUE)
> > 
> > or
> > 
> >   cos( runif(n), inplace=TRUE)
> > 
> > 
> > 
> > 
> > This way it would be possible to specify the reuse of the input
> > *explicitly* (without implicit rules like anonymous vs. named values).
> > 
> > 
> > 
> > In pseudo-code, something like this:
> > 
> >    if (in_place == TRUE)
> >    {
> >      input_val[idx] = cos( input_val[idx] );
> >      return input_val;
> >    }
> >    else
> >    {
> >      result_val = alloc_vec( LENGTH(input_val), ... );
> >      result_val[idx] = cos( input_val[idx] );
> >      return result_val;
> >    }
> > 
> > 
> > 
> > Does this match what you were looking for?
> > 
> > 
> > Ciao,
> >Oliver
> > 
> > 
> > On Thu, Mar 08, 2012 at 02:56:24PM +0100, oliver wrote:
> > > Hi,
> > >
> > > OK, thank you for clarifying what you meant.
> > > You only referred to the reuse of the args, not of an already
> > > existing vector.
> > > So I overgeneralized your example.
> > >
> > > But when looking at your example,
> > > and how I would implement the cos()
> > > I doubt I would use copying the args
> > > before calculating the result.
> > >
> > > Just allocate a result-vector, and then place the cos() of the
> > > input-vector into the result vector.
> > >
> > > I haven't looked at how it is done in R, but I would guess it's like
> > > that.
> > >
> > >
> > >   In pseudo-Code something like that:
> > > cos_val[idx] = cos( input_val[idx] );
> > >
> > > But R also handles complex data with cos() so it will look a bit more
> > > laborious.
> > >
> > > What I have seen so far from implementing C-extensions for R is rather
> > > C-ish, and so you have the control on many details. Copying the input
> > > just to read it would not make sense here.
> > >
> > > I doubt that R internally is doing that.
> > > Or did you find that in the R code?
> > >
> > > The other problem, someone mentioned, was *changing* the contents of a
> > > matrix... and that this is NOT done in-place, when using a function
> > > for 

Re: [Rd] Julia

2012-03-08 Thread William Dunlap
I guess my point is not getting across.  The user should see
the functional programming style but under the hood the
evaluator should be able to use whatever memory and time
saving tricks it can.  Julia seems to want to be a nonfunctional
language, which I think makes it harder to write the sort of
easily reusable functions that S allows.
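
One way to picture this: the decision to recycle memory sits with the
evaluator, based on whether any name still refers to the argument. Below is a
minimal C sketch under that assumption. The Vec type, its names counter,
vec_new, vec_free and vec_cos are all hypothetical and are not R's actual
internals; cos_standin stands in for cos() to avoid a math-library
dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical vector value: `names` counts how many variables refer to it.
   0 means an anonymous temporary, e.g. the result of runif(n) passed
   straight into another call. */
typedef struct {
    int     names;
    size_t  n;
    double *data;
} Vec;

/* Stand-in for cos(): first two terms of its Taylor series. */
double cos_standin(double v) {
    return 1.0 - v * v / 2.0;
}

Vec *vec_new(size_t n) {
    Vec *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->data = malloc(n * sizeof *v->data);
    if (v->data == NULL) { free(v); return NULL; }
    v->names = 0;
    v->n = n;
    return v;
}

void vec_free(Vec *v) {
    if (v != NULL) { free(v->data); free(v); }
}

/* The caller always gets "a vector holding cos(x)". Whether the storage is
   fresh or recycled is decided here, not by the caller. */
Vec *vec_cos(Vec *x) {
    assert(x != NULL);
    Vec *out = (x->names == 0) ? x : vec_new(x->n);
    if (out == NULL) return NULL;
    for (size_t i = 0; i < x->n; i++)
        out->data[i] = cos_standin(x->data[i]);
    return out;
}
```

The function signature carries no in-place flag: a named input is left
untouched and a temporary is overwritten, so code written in the functional
style behaves the same either way.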

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: oliver [mailto:oli...@first.in-berlin.de]
> Sent: Thursday, March 08, 2012 2:23 PM
> To: William Dunlap
> Cc: R-devel
> Subject: Re: [Rd] Julia
> 
> I don't think that using in-place modification as a general property would
> make sense.
> 
> In-place modification brings in side effects, and that would mean that the
> order of evaluation can change the result.
> 
> To get reliable results, the order of evaluation should not be the reason for
> different results, and that's why the functional approach is much
> better for reliable programs.
> 
> So, in general, I would say this feature is a no-no.
> I would rather discourage in-place modification.
> 
> In certain cases it might help...
> but for those cases either such a boolean flag or programming a separate
> module in C would make sense.
> 
> There could also be a global in-place flag that might be used (via options,
> maybe), but if such a thing were implemented, the default value should be
> FALSE.
> 
> 
> 
> Ciao,
>Oliver

Re: [Rd] Julia

2012-03-08 Thread oliver
Aha, ok.

So you are not looking especially at that one feature (the anonymous
evaluation trick), but are asking in general for better internal optimization.

Regarding your example of anonymous (unnamed) values given to
a function, I would ask: do you want to write whole programs without
using names/variables?
I think this would be much harder than just adding a boolean flag
with inplace=TRUE.
So to your objection that the flag proposal has bad usability,
I have to reply: it's even worse to write code without
variable names and to put everything into anonymous data structures
that are created inside function applications, where each of the arguments
contains yet more unnamed calculations.
You will end up not only with a mess, but also with slower calculations,
because unnamed resources must be calculated more than once if they are
used more than once.
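
A small C sketch of the recomputation cost, assuming a hypothetical function
expensive() whose calls are counted. The names with_name and without_name are
illustrative only.

```c
/* Counts how often the (hypothetically expensive) computation runs. */
int expensive_calls = 0;

double expensive(double v) {
    expensive_calls++;
    return 1.0 - v * v / 2.0;   /* stand-in for real work */
}

/* Named intermediate: computed once, used three times. */
double with_name(double v) {
    double t = expensive(v);
    return t + t * t;
}

/* Anonymous intermediates: the expression is evaluated once per use. */
double without_name(double v) {
    return expensive(v) + expensive(v) * expensive(v);
}
```

Both functions return the same value, but with_name triggers one call to
expensive() and without_name triggers three.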

So I think that you are just asking for more internal optimizations.
Fine.

But I think internal intermediate code (that can be optimized)
would be better than that one "enhancement" of reusing anonymous
data for the output.


Ciao,
   Oliver


On Thu, Mar 08, 2012 at 10:27:22PM +, William Dunlap wrote:
> I guess my point is not getting across.  The user should see
> the functional programming style but under the hood the
> evaluator should be able to use whatever memory and time
> saving tricks it can.  Julia seems to want to be a nonfunctional
> language, which I think makes it harder to write the sort of
> easily reusable functions that S allows.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com