Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Avi,

Please do not mistake my posting as being a BASHING of R. I greatly admire R 
and the progress it has made from its roots in S. I thank the many people who 
contribute to the development and growth of R. 

Just because a language allows a given syntax does not mean (1) that the 
language is bad or (2) that the syntax should be used (except on rare 
occasions). There may well be a few occasions when using a global variable in a 
function makes sense, and in that instance the global variable should be used 
and the usage should be documented in the comments that are part of the source 
code. Please note that I stated "A general recommendation is to AVOID use of a 
global variable in a function"; I used the words "recommendation is to AVOID". 
I did not, and would not, forbid the use of a global variable in a function. 

English has the word "or", which is not clearly defined; is it an exclusive or 
or an inclusive or? I don't bash English because of this semantic ambiguity. 
What I try to do, when it is essential to understand whether the "or" I use in 
a sentence is exclusive or inclusive, is make certain my meaning is clear, 
e.g. You can use bleach or ammonia to clean the stain, but NEVER use ammonia 
and bleach together as the combination produces a deadly gas. I try to follow 
this philosophy when I program. I don't use global variables in a function 
unless there is an overwhelming reason to do so. When I do, I indicate in my 
comments that a global variable has been used. In the same vein, I rarely use 
call by reference in my programs (when this is allowed by a programming 
language); I try to use call by value whenever possible as call by reference 
can be fraught. On the other hand, when working with extremely large data 
objects (especially in the old days when I had perhaps 20k of memory rather 
than 100 gig), I have used call by reference to save storage. Despite this, I 
know that call by reference is not recommended, just as using a global 
variable in a function is not recommended, but it can, and should, be used 
when needed.

John 

Re: [R] return value of {....}

2023-01-15 Thread avi.e.gross
Again, John, we are comparing different designs in languages that are often
decades old and partially retrofitted selectively over the years.

Is it poor form to use global variables? Many think so. Discussions have
been had on how to use variables hidden in various ways that are not global,
such as within a package.

But note R still has global assignment operators like <<- and its partner
->> that explicitly can even create a global variable that did not exist
before the function began and that persists for that session. This is
perhaps a special case of the assign() function which can do the same for
any designated environment.
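
A minimal sketch of the point about <<- and assign(), for anyone who wants to
try it. The names bump, counter_total and scratch are made up for
illustration, and the sketch assumes a fresh session in which no variable
called counter_total exists yet.

```r
# Hypothetical names; assumes no "counter_total" exists before this runs.
bump <- function() {
  # There is no "counter_total" in any enclosing environment, so <<- walks
  # up the chain and ends up creating it in the global environment.
  counter_total <<- 1
  invisible(NULL)
}
bump()
exists("counter_total", envir = globalenv(), inherits = FALSE)  # TRUE

# assign() does the same kind of thing, but you name the target environment.
scratch <- new.env()
assign("counter_total", 99, envir = scratch)
get("counter_total", envir = scratch)       # 99
get("counter_total", envir = globalenv())   # still 1
```

Note that <<- only lands in the global environment when no enclosing
environment already has a binding of that name; otherwise it changes the
nearest enclosing one.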

Although it may sometimes be better form to avoid things like this, it can
also be worse form when you want to do something decentralized with no
control over passing additional arguments to all kinds of functions.  

Some languages try to finesse their way past this by creating concepts like
closures that hold values and can even change the values without them being
globally visible. Some may use singleton objects or variables that are part
of a class rather than a single object (which is again a singleton.)

So is what R allows really a bad thing, especially if rarely used?

All I know is MANY languages use scoping including functions that declare a
variable then create an inner function or many and return the inner
function(s) to the outside where the one getting it can later use that
function and access the variable and even use it as a way to communicate
with the multiple functions it got that were created in that incubator.
Nifty stuff but arguably not always as easy to comprehend!
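
The "incubator" pattern described above can be sketched in R roughly like
this. The names make_counter, increment and current are invented for the
example; it is one way to do it, not the only one.

```r
# Hypothetical example: two inner functions share one hidden variable.
make_counter <- function() {
  count <- 0   # lives in the environment created by this call
  list(
    increment = function() {
      count <<- count + 1   # updates the enclosing "count", not a global
      invisible(count)
    },
    current = function() count
  )
}

counter <- make_counter()
counter$increment()
counter$increment()
counter$current()   # 2
```

Both inner functions see the same count, and nothing outside can reach it
except through them.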

This forum is not intended for BASHING any language, certainly not R. There
are many other languages to choose from and every one of them will have some
things others consider flaws. How many opted out of say a ++ operator as in
y = x++ for fairly good reasons and later added something like the Walrus
operator so you can now write y = (x := x + 1) as a way to do the same thing
and other things besides?

But to address your point, about a variable outside a function as defined in
a set of environments to search that includes a global context, I want to
note that it is just a set of maskings and your variable "x" can appear in
EVERY environment above you and you can get in real trouble if the order the
environments are put in place changes in some way. The arguably safer way
to get a specific value of x would be not to ask for it directly but to use
get("x", envir=...) and specify the specific environment, which ideally
is in existence at that time. Other arguments to get(), and relatives such
as get0(), let you specify a few more things such as whether to search other
places or supply a default.
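
A small sketch of asking for x from a named environment rather than relying
on the search order. The environments e1 and e2 and the name no_such_name are
made up for the example.

```r
# Two hypothetical environments, each with its own "x".
e1 <- new.env()
e2 <- new.env()
assign("x", 1, envir = e1)
assign("x", 2, envir = e2)

# inherits = FALSE means: look in this environment only, not its parents.
from_e1 <- get("x", envir = e1, inherits = FALSE)   # 1
from_e2 <- get("x", envir = e2, inherits = FALSE)   # 2

# get0() returns a fallback value instead of raising an error.
fallback <- get0("no_such_name", envir = e1, inherits = FALSE,
                 ifnotfound = NA)                    # NA
```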

Is it then poor technique to re-use the same name "x" in the same code for
some independent use? Probably, albeit if the new value plays a similar
role, just in another stretch of code, maybe not. I would comment it
carefully, and spell that out.

S first came out in the mid to late '70s, before I even decided to become a
Computer Scientist, and evolved multiple times. I first noticed it at
Bell Labs in the '80s. To a certain extent, R started as heavily influenced
by S and many programs could run on both. It too has changed over about
three decades. What kind of perfection can anyone expect compared with more
recent languages carefully produced with little concern about backward
compatibility?

And it remains very useful, not necessarily because of the original language
or even the evolving core, but because of changes that maintained
compatibility as well as so many packages that met needs. Making changes,
even if they are "improvements", is likely to break all kinds of code unless
it is something like the new pipe operator "|>", which simply uses syntax
that nobody would have used before.
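
For anyone who has not tried the native pipe yet, a tiny illustration. It
needs R 4.1.0 or later, and the numbers are arbitrary.

```r
# x |> f() is parsed as f(x), so old code that never used "|>" is unaffected.
total <- c(1, 4, 9) |> sqrt() |> sum()
total                                      # 6
identical(total, sum(sqrt(c(1, 4, 9))))    # TRUE
```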

The answer to too many questions about R remains BECAUSE that is how it was
done and whether you like it or not, may not change much any time soon. That
is why so many people like packages such as in the tidyverse because they
manage to make some changes, for better and often for verse.



Re: [R] return value of {....}

2023-01-15 Thread avi.e.gross
Richard,

I appreciate your observations. As regularly noted, there are many possible
forks in the road to designing a language and it seems someone is determined
to try every possible fork.

Yes, some languages that are compiled, or like JavaScript, read the entire
function before executing it and promote some parts written further down to
the top so all variable declarations, as one example, are done before
executing any remaining code. If you look at R, if you decide to use a
library, you can ask for it near the top or just before you want to use it.
I think they get executed as needed so if you only load it in one branch of
an IF statement, ...
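
A quick way to see this for yourself, as a sketch. It uses the tools package
only because it ships with R and is normally not attached at startup; the
flag use_tools is made up.

```r
# library() is an ordinary call: it runs only if execution reaches it.
use_tools <- TRUE
if (use_tools) {
  library(tools)   # attached only because this branch was taken
}
tools_attached <- "package:tools" %in% search()
tools_attached     # TRUE here
```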

We can debate good and bad subjective choices in language design BUT for
practical purposes, what matters is that a feature IS a certain way and you
use it consistently with what is documented, not what you wish it was like.
R has an interpreter that arguably may be simple and keeps reading small
sections of code and executing them and then getting more. Much of the code
is not even evaluated till much later or never and thus it may not be
trivial to do any kind of look-ahead and adjustment.

Many languages now look a bit silly after some changes are made or things
added that make the earlier way look clumsy. Some features may even be
deprecated and eventually removed, or the language forks again and people
argue that everyone should upgrade to the new version 12.x and so on.

Do note R lets you use rm(x) and also supports multiple environments,
including new ones created dynamically, so your example might have a region
where an x is used in quite a few ways, such as asking for some function call
to be evaluated in a specific environment. This flexibility would be harder
to offer if you asked the interpreter to behave like some other languages
that support far less of it. However, lots of languages with scoping rules
will indeed allow you to use variables in outer scopes or hidden scopes and
so on. To each their own.
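
A rough sketch of what I mean by evaluating in a specific environment and by
rm(). The names env_a and expr are invented for the example.

```r
x <- 1
env_a <- new.env()
assign("x", 10, envir = env_a)

# The same expression gives different answers depending on where it runs.
expr <- quote(x * 2)
in_a    <- eval(expr, envir = env_a)   # uses env_a's x, giving 20
in_here <- eval(expr)                  # uses the x visible here, giving 2

rm(x)   # removes the binding in the current environment only
still_in_a <- exists("x", envir = env_a, inherits = FALSE)   # TRUE
```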

-Original Message-
From: R-help  On Behalf Of Richard O'Keefe
Sent: Sunday, January 15, 2023 6:39 PM
To: Valentin Petzel 
Cc: R help Mailing list 
Subject: Re: [R] return value of {}

I wonder if the real confusion is not R's scope rules?
(begin ...) is not Lisp, it's Scheme (a major Lisp dialect), and in Scheme,
(begin (define x ...) (define y ...) ...) declares variables x and y that
are local to the (begin ...) form, just like Algol 68.  That's weirdness 1.
Javascript had a similar weirdness, which the ECMAscript process eventually
addressed.  But the real weirdness in R is not just that the existence of
variables is indifferent to the presence of curly braces, it's that it's
*dynamic*.  In
f <- function (...) {
   ... use x ...
   x <- ...
   ... use x ...
}
the two occurrences of "use x" refer to DIFFERENT variables.
The first occurrence refers to the x that exists outside the function.  It
has to: the local variable does not exist yet.
The assignment *creates* the variable, so the second occurrence of "use x"
refers to the inner variable.
Here's an actual example.
> x <- 137
> f <- function () {
+ a <- x
+ x <- 42
+ b <- x
+ list(a=a, b=b)
+ }
> f()
$a
[1] 137
$b
[1] 42

Many years ago I set out to write a compiler for R, and this was the issue
that finally sank my attempt.  It's not whether the occurrence of "use x" is
*lexically* before the creation of x.
It's when the assignment is *executed* that makes the difference.
Different paths of execution through a function may result in it arriving at
its return point with different sets of local variables.
R is the only language I routinely use that does this.

So rule 1: whether an identifier in an R function refers to an outer
variable or a local variable depends on whether an assignment creating that
local variable has been executed yet.
And rule 2: the scope of a local variable is the whole function.

If the following transcript not only makes sense to you, but is exactly what
you expect, congratulations, you understand local variables in R.

> x <- 0
> g <- function () {
+ n <- 10
+ r <- numeric(n)
+ for (i in 1:n) {
+ if (i == 6) x <- 100
+ r[i] <- x + i
+ }
+ r
+ }
> g()
 [1]   1   2   3   4   5 106 107 108 109 110
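
A further small sketch of rule 1, showing that the set of local variables
depends on which assignments were actually executed. The function h and its
argument flag are made up for illustration.

```r
h <- function(flag) {
  if (flag) y <- 1   # y comes into existence only if this line runs
  ls()               # names of the local variables that exist right now
}
h(TRUE)    # "flag" "y"
h(FALSE)   # "flag"
```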


Re: [R] return value of {....}

2023-01-15 Thread Bert Gunter
Sorry, John. If I understand you correctly, R has no "Global
Variables" in the sense that you seem to indicate. It does have a
"Global environment", but variables referred to in a function but not
found in the function environment are *not* necessarily searched for
in the "Global Environment" -- they are searched for in the function's
enclosing (defining) environment, then in that environment's enclosing
environment if not found there, etc.

Consider:
f <- function(){
   x <- 3
   g <- function() x  ## x not bound to a value in g's environment
   g ## f returns the function g
}
environment()  ## at top level: <environment: R_GlobalEnv>

x <- 5 ## defined in the Global environment
g <- f()
## Now what do you think g() returns?

If I have misconstrued what you said, my apologies.
Even if I have not, none of the above is revelatory -- it's all
standard, documented R behavior.
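
For anyone who wants to check the answer rather than guess, a runnable
version of the example above follows. The last line is an extra added for
illustration; it simply inspects g's enclosing environment.

```r
f <- function() {
  x <- 3
  g <- function() x   # x is looked up where g was defined, i.e. inside f
  g
}
x <- 5
g <- f()
g()                                # 3, not 5
get("x", envir = environment(g))   # 3: the x that g actually sees
```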

Cheers,
Bert




while something like d$something <- ... may seem like you're directly
modifying the data, it does not actually do so. Most R objects try to
be immutable, that is, the object may not change after creation. This
guarantees that if you have a binding for some object, the object won't
change sneakily.

There is one data structure that is in fact mutable: environments.
For example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$
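
The comparison above can be run as follows. This sketch assumes the
truncated last line was meant to inspect the element a of E; that reading is
a guess from the parallel L$a line.

```r
L <- list()
local({ L$a <- 3 })   # modifies a copy of L that exists only inside local()
L$a                   # NULL: the outer list is unchanged

E <- new.env()
local({ E$a <- 3 })   # E is an environment, so it is modified in place
E$a                   # 3 (assumed completion of the truncated line)
```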

Valentin Petzel valen...@petzel.at via r-project.org, to avi.e.gross, R-help

Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Richard,
I sent my prior email too quickly:

A slight addition to your code shows an important aspect of R, local vs. global 
variables:

x <- 137
f <- function () {
   a <- x
   x <- 42
   b <- x
   list(a=a, b=b)
}
f()
print(x)

When run the program produces the following:

> x <- 137
> f <- function () {
+    a <- x
+    x <- 42
+    b <- x
+    list(a=a, b=b)
+ }
> f()
$a
[1] 137

$b
[1] 42

> print(x)
[1] 137

The first x, a <- x, invokes an x variable that is GLOBAL. It is known both 
inside and outside the function.
The second x, x <- 42, defines an x that is LOCAL to the function; it is not 
known to the program that called the function. The LOCAL value of x is used in 
the expression b <- x. As can be seen by the print(x) statement, the LOCAL 
value of x is NOT known by the program that calls the function. The scope of a 
variable (i.e. local vs. global) can be a source of subtle 
programming errors. A general recommendation is to AVOID use of a global 
variable in a function, i.e. don't use a variable in a function that is not 
passed as a parameter to the function (as was done in the function above in the 
statement a <- x). If you need to use a variable in a function that is known by 
the program that calls the function, pass the variable as an argument to the 
function, e.g. 

Use this code:

# Set values needed by function
y <- 2
b <- 30

myfunction <- function(a,b){
  cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# Call the function and pass all needed values to the function
myfunction(y,b)
 
Don't use the following code that depends on a global value that is known to 
the function, but not passed as a parameter to the function:

y <- 2
myNGfunction <- function(a){
  cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# b is a global variable and will be known to the function, 
# but should be passed as a parameter as in example above.
b <- 100
myNGfunction(y)
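
One possible way to catch this kind of dependence automatically, offered as a
sketch: the codetools package (a recommended package normally installed with
R) has findGlobals(), which lists names a function uses but does not define
locally. The function body below is a trimmed, hypothetical variant of
myNGfunction, not the exact code above.

```r
library(codetools)   # assumed to be installed; it normally ships with R

# Trimmed variant of the function above: b is neither an argument nor
# assigned inside the function, so it must come from outside.
myNGfunction <- function(a) {
  y <- a
  y2 <- y + b
  c(y = y, y2 = y2)
}

globals <- findGlobals(myNGfunction, merge = FALSE)
globals$variables    # the non-function names found outside the function
```

If b shows up in that list, the function is relying on a variable that was
not passed in.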

John


Re: [R] return value of {....}

2023-01-15 Thread Bert Gunter
Well, weirdness is in the eyes of the beholder, I think.

In any case, R's scoping procedures are described in ?environment and
?assign and ?function, among other places; and in detail in the R
Language Definition. So no matter the behavior, as long as it is
clearly documented -- and consistent of course -- it's OK by me: it's
my job to learn it. Perhaps this just reflects my ignorance of what
might be considered "better" alternatives.

And of course, someone *did* write a compiler for R.

Cheers,
Bert

Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Richard,
A slight addition to your code shows an important aspect of R, local vs. global 
variables:

x <- 137
f <- function () {
   a <- x
   x <- 42
   b <- x
   list(a=a, b=b)
}
f()
print(x)


On Fri, 13 Jan 2023 at 23:28, Valentin Petzel  wrote:

> Hello Akshay,
>
> R is quite inspired by LISP, where this is a common thing. It is not in
> fact that {...} returned something, rather any expression evaluates to
> some value, and for a compound statement that is the last evaluated
> expression.
>
> {...} might be seen as similar to LISP's (begin ...).
>
> Now this is a very different thing compared to {...} in something like C,
> even if it looks or behaves similarly. But in R {...} is in fact an
> expression and thus has to evaluate to some value. This also comes with some
> nice benefits.
>
> You do not need to use {...} for anything that is a single statement. But
> you can in each possible place use {...} to turn multiple statements into
> one.
>
> Now think about a statement like this
>
> f <- function(n) {
> x <- runif(n)
> x**2
> }
>
> Then we can do
>
> y <- f(10)
>
> Now, your suggested way would look like this:
>
> f <- function(n) {
> x <- runif(n)
> y <- x**2
> }
>
> And we'd need to do something like:
>
> f(10)
> y <- somehow_get_last_env_of_f$y
>
> So having a compound statement evaluate to a value clearly has a benefit.
>
> Best Regards,
> Valentin
>
> 09.01.2023 18:05:58 akshay kulkarni :
>
> > Dear Valentin,
> >   But why should {} "return" a value? It
> could just as well evaluate all the expressions and store the resulting
> objects in whatever environment the interpreter chooses, and then it would
> be left to the user to manipulate any object he chooses. Don't you think
> returning the last, or any value, is redundant? We are living in the
> 21st century world, and the R-core team might, I suppose, have a definite
> reason for "returning" the last value. Any comments?
> >
> > Thanking you,
> > Yours sincerely,
> > AKSHAY M KULKARNI
> >
> > 
> > *From:* Valentin Petzel 
> > *Sent:* Monday, January 9, 2023 9:18 PM
> > *To:* akshay kulkarni 
> > *Cc:* R help Mailing list 
> > *Subject:* Re: [R] return value of {}
> >
> > Hello Akshai,
> >
> > I think you are confusing {...} with local({...}). This one will
> evaluate the expression in a separate environment, returning the last
> expression.
> >
> > {...} simply evaluates multiple expressions as one and returns the
> result of the last line, but it still evaluates each expression.
> >
> > Assignment returns the assigned value, so we can chain assignments like
> this
> >
> > a <- 1 + (b <- 2)
> >
> > 

Re: [R] return value of {....}

2023-01-15 Thread Richard O'Keefe
I wonder if the real confusion is not R's scope rules?
(begin .) is not Lisp, it's Scheme (a major Lisp dialect),
and in Scheme, (begin (define x ...) (define y ...) ...)
declares variables x and y that are local to the (begin ...)
form, just like Algol 68.  That's weirdness 1.  Javascript
had a similar weirdness, which the ECMAScript process eventually
addressed.  But the real weirdness in R is not just that the
existence of variables is indifferent to the presence of curly
braces, it's that it's *dynamic*.  In
f <- function (...) {
   ... use x ...
   x <- ...
   ... use x ...
}
the two occurrences of "use x" refer to DIFFERENT variables.
The first occurrence refers to the x that exists outside the
function.  It has to: the local variable does not exist yet.
The assignment *creates* the variable, so the second
occurrence of "use x" refers to the inner variable.
Here's an actual example.
> x <- 137
> f <- function () {
+ a <- x
+ x <- 42
+ b <- x
+ list(a=a, b=b)
+ }
> f()
$a
[1] 137
$b
[1] 42

Many years ago I set out to write a compiler for R, and this was
the issue that finally sank my attempt.  It's not whether the
occurrence of "use x" is *lexically* before the creation of x.
It's when the assignment is *executed* that makes the difference.
Different paths of execution through a function may result in it
arriving at its return point with different sets of local variables.
R is the only language I routinely use that does this.

So rule 1: whether an identifier in an R function refers to an
outer variable or a local variable depends on whether an assignment
creating that local variable has been executed yet.
And rule 2: the scope of a local variable is the whole function.
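
The two rules can be checked directly. What follows is a minimal sketch, not part of the original transcript: the names h and flag are made up for illustration, and exists() with inherits = FALSE is used to ask whether a local x has been created on the path actually taken.

```r
x <- "outer"

# h and flag are hypothetical names used only for this illustration
h <- function(flag) {
  if (flag) x <- "local"   # the local x is created only when this assignment runs
  # inherits = FALSE restricts the lookup to h's own environment
  has_local <- exists("x", envir = environment(), inherits = FALSE)
  list(value = x, has_local = has_local)
}

res_true  <- h(TRUE)   # assignment executed: x refers to the local variable
res_false <- h(FALSE)  # assignment skipped: x still refers to the outer variable
```

The same function body therefore reaches its return point with or without a local x, depending only on the path taken at run time.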

If the following transcript not only makes sense to you, but is
exactly what you expect, congratulations, you understand local
variables in R.

> x <- 0
> g <- function () {
+ n <- 10
+ r <- numeric(n)
+ for (i in 1:n) {
+ if (i == 6) x <- 100
+ r[i] <- x + i
+ }
+ r
+ }
> g()
 [1]   1   2   3   4   5 106 107 108 109 110


On Fri, 13 Jan 2023 at 23:28, Valentin Petzel  wrote:

> Hello Akshay,
>
> R is quite inspired by LISP, where this is a common thing. It is not in
> fact that {...} returned something, rather any expression evaluates to
> some value, and for a compound statement that is the last evaluated
> expression.
>
> {...} might be seen as similar to LISP's (begin ...).
>
> Now this is a very different thing compared to {...} in something like C,
> even if it looks or behaves similarly. But in R {...} is in fact an
> expression and thus has to evaluate to some value. This also comes with some
> nice benefits.
>
> You do not need to use {...} for anything that is a single statement. But
> you can in each possible place use {...} to turn multiple statements into
> one.
>
> Now think about a statement like this
>
> f <- function(n) {
> x <- runif(n)
> x**2
> }
>
> Then we can do
>
> y <- f(10)
>
> Now, your suggested way would look like this:
>
> f <- function(n) {
> x <- runif(n)
> y <- x**2
> }
>
> And we'd need to do something like:
>
> f(10)
> y <- somehow_get_last_env_of_f$y
>
> So having a compound statement evaluate to a value clearly has a benefit.
>
> Best Regards,
> Valentin
>
> 09.01.2023 18:05:58 akshay kulkarni :
>
> > Dear Valentin,
> >   But why should {} "return" a value? It
> could just as well evaluate all the expressions and store the resulting
> objects in whatever environment the interpreter chooses, and then it would
> be left to the user to manipulate any object he chooses. Don't you think
> returning the last, or any value, is redundant? We are living in the
> 21st century world, and the R-core team might, I suppose, have a definite
> reason for "returning" the last value. Any comments?
> >
> > Thanking you,
> > Yours sincerely,
> > AKSHAY M KULKARNI
> >
> > 
> > *From:* Valentin Petzel 
> > *Sent:* Monday, January 9, 2023 9:18 PM
> > *To:* akshay kulkarni 
> > *Cc:* R help Mailing list 
> > *Subject:* Re: [R] return value of {}
> >
> > Hello Akshai,
> >
> > I think you are confusing {...} with local({...}). This one will
> evaluate the expression in a separate environment, returning the last
> expression.
> >
> > {...} simply evaluates multiple expressions as one and returns the
> result of the last line, but it still evaluates each expression.
> >
> > Assignment returns the assigned value, so we can chain assignments like
> this
> >
> > a <- 1 + (b <- 2)
> >
> > conveniently.
> >
> > So when is {...} useful? Well, anyplace where you want to execute
> complex stuff in a function argument. E.g. you might do:
> >
> > data %>% group_by(x) %>% summarise(y = {if(x[1] > 10) sum(y) else
> mean(y)})
> >
> > Regards,
> > Valentin Petzel
> >
> > 09.01.2023 15:47:53 akshay kulkarni :
> >
> >> Dear members,
> >>  I have the following code:
> >>
> >>> TB <- {x <- 3;y <- 5}

Re: [R] Removing variables from data frame with a wile card

2023-01-15 Thread Rui Barradas

Às 16:54 de 15/01/2023, Sorkin, John escreveu:

I am new to this thread. At the risk of presenting something that has been shown before, 
below I demonstrate how a column in a data frame can be dropped using a wild card, i.e. a 
column whose name starts with "th" using nothing more than base r functions and 
base R syntax. While additions to R such as tidyverse can be very helpful, many things 
that they do can be accomplished simply using base R.

# Create data frame with three columns
one <- rep(1,10)
one
two <- rep(2,10)
two
three <- rep(3,10)
three
mydata <- data.frame(one=one, two=two, three=three)
cat("Data frame with three columns\n")
mydata

# Drop the column whose name starts with "th", i.e. column three
# Find the location of the column; "^" anchors the match to the start of the name
ColumnToDelete <- grep("^th", colnames(mydata))
cat("The column to be dropped is the column called three, which is column",
    ColumnToDelete, "\n")
ColumnToDelete

# Drop the column whose name starts with "th"
# (drop = FALSE keeps the result a data frame even if only one column remains)
newdata2 <- mydata[, -ColumnToDelete, drop = FALSE]
cat("Data frame after dropping column whose name is three\n")
newdata2

I hope this helps.
John



From: R-help  on behalf of Valentin Petzel 

Sent: Saturday, January 14, 2023 1:21 PM
To: avi.e.gr...@gmail.com
Cc: 'R-help Mailing List'
Subject: Re: [R] Removing variables from data frame with a wile card

Hello Avi,

while something like d$something <- ... may seem like you're directly modifying 
the data it does not actually do so. Most R objects try to be immutable, that is, 
the object may not change after creation. This guarantees that if you have a 
binding for the same object, the object won't change sneakily.

There is one data structure that is in fact mutable: environments. For 
example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter will in fact work, as the same Environment is modified, while in the 
first one a modified copy of the list is made.

Under the hood we have a parser trick: If R sees something like

f(a) <- ...

it will look for a function `f<-` and call

a <- `f<-`(a, ...)

(this also happens for example when you do names(x) <- ...)
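
As a small sketch of that rewriting (the replacement function `second<-` below is hypothetical, defined only for this illustration), the sugared form and the explicit call give the same result:

```r
# `second<-` is a hypothetical replacement function, defined only for this sketch
`second<-` <- function(x, value) {
  x[2] <- value   # modifies the function's own copy of x
  x               # the modified copy is returned and rebound by the caller
}

v <- c(1, 2, 3)
second(v) <- 99                  # sugar for: v <- `second<-`(v, value = 99)

w <- c(1, 2, 3)
w <- `second<-`(w, value = 99)   # the same call written out by hand

same <- identical(v, w)
```

The last argument of a replacement function has to be called value, because that is the name R uses when it passes the right-hand side.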

So in fact in our case this is equivalent to creating a copy with removed 
columns and rebinding the symbol in the current environment to the result.

The data.table package breaks with this convention and uses C based routines 
that allow changing of data without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gr...@gmail.com:


Steven,

Just want to add a few things to what people wrote.

In base R, the methods mentioned will let you make a copy of your original DF 
that is missing the items you are selecting that match your pattern.

That is fine.

For some purposes, you want to keep the original data.frame and remove a column 
within it. You can do that in several ways but the simplest is something where 
you set the column to NULL as in:

mydata$NAME <- NULL

Using the mydata["NAME"] notation, you can do that in a loop or with a 
functional programming method that does it for all the matches from your grep.
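
A minimal sketch of the loop variant, assuming a toy data frame (the columns and the name doomed are invented for illustration, not taken from Steven's data):

```r
# Toy data; the column names are made up for this sketch
mydata <- data.frame(year = 1:3, yr3 = 0, yr4 = 1, weight = c(2.5, 3, 3.5))

# value = TRUE returns the matching names rather than their positions
doomed <- grep("^yr", names(mydata), value = TRUE)

# Setting a column to NULL removes it from the data frame
for (nm in doomed) mydata[[nm]] <- NULL

remaining <- names(mydata)
```

Note that "^yr" leaves "year" alone, since the second letter must be followed by "r".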

R does have optimizations that make this less useful as a partial copy of a 
data.frame retains common parts till things change.

For those who like to use the tidyverse, it comes with lots of tools that let 
you select columns that start with or end with or contain some pattern and I 
find that way easier.



-Original Message-
From: R-help  On Behalf Of Steven Yen
Sent: Saturday, January 14, 2023 7:49 AM
To: Andrew Simmons 
Cc: R-help Mailing List 
Subject: Re: [R] Removing variables from data frame with a wile card

Thanks to all. Very helpful.

Steven from iPhone


On Jan 14, 2023, at 3:08 PM, Andrew Simmons  wrote:

You'll want to use grep() or grepl(). By default, grep() uses
extended regular expressions to find matches, but you can also use
perl regular expressions and globbing (after converting to a regular 
expression).
For example:

grepl("^yr", colnames(mydata))

will tell you which 'colnames' start with "yr". If you'd rather you
use globbing:

grepl(glob2rx("yr*"), colnames(mydata))

Then you might write something like this to remove the columns starting with yr:

mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
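
One caveat is worth a small sketch. The data frame and the column yrs_total below are hypothetical, chosen only to show that "^yr" removes every name beginning with "yr", while a stricter pattern such as "^yr[0-9]+$" removes only "yr" followed by digits:

```r
# Toy data; yrs_total is an invented column that merely starts with "yr"
mydata <- data.frame(year = 2001:2003, yr3 = 0, yr28 = 1, yrs_total = 5)

# Loose pattern: drops yr3, yr28 and also yrs_total
loose <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]

# Strict pattern: drops only names of the form yr<digits>
strict <- mydata[, !grepl("^yr[0-9]+$", colnames(mydata)), drop = FALSE]

loose_names  <- colnames(loose)
strict_names <- colnames(strict)
```

Which pattern is right depends on the other column names in the data, so it may be worth printing the matches before dropping anything.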


On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen  wrote:

I have a data frame containing variables "yr3",...,"yr28".

How do I remove them with a wild card, something similar to "del yr*"
in Windows/DOS? Thank you.


colnames(mydata)

   [1] "year"   "weight" "confeduc"   "confothr" "college"
   [6] ...
[41] "yr3"    "yr4"    "yr5"    "yr6" "yr7"
[46] "yr8"    "yr9"    "yr10"   "yr11" "yr12"
[51] "yr13"   "yr14"   "yr15"   "yr16" "yr17"
[56] "yr18"   "yr19"   "yr20"   "yr21" "yr22"
[61] "yr23"   "yr24"   "yr25"   "yr26" "yr27"
[66] "yr28"...


Re: [R] Removing variables from data frame with a wile card

2023-01-15 Thread avi.e.gross
John,

As you said, you are new to the discussion so let me catch you up.

The original question was about removing many columns that shared a similar 
feature in the naming convention while leaving other columns in-place. Quite a 
few replies were given on how to do that including how to use a regular 
expression to gather the column names you want to remove.

It was only afterwards that the topic changed a bit to mention that some people 
used additional ways both in base R and also using packages like dplyr in the 
tidyverse.

As a general rule, most packages out there provide functionality that can be 
done in base R if you wish, and some are written purely in R while some augment 
that with parts re-done in C or something. If a package is well built and 
frequently used, it may well make your life as a programmer easier as the code 
need not be re-invented and debugged. Of course some packages are of poorer 
quality.

So we fully agree that unless asked for, the base R answers should be the focus 
HERE. Then again, languages are not static and sometimes we see things like 
pipes moved in a modified version into the main language.

Avi

-Original Message-
From: Sorkin, John  
Sent: Sunday, January 15, 2023 11:55 AM
To: Valentin Petzel ; avi.e.gr...@gmail.com
Cc: 'R-help Mailing List' 
Subject: Re: [R] Removing variables from data frame with a wile card

I am new to this thread. At the risk of presenting something that has been 
shown before, below I demonstrate how a column in a data frame can be dropped 
using a wild card, i.e. a column whose name starts with "th" using nothing more 
than base r functions and base R syntax. While additions to R such as tidyverse 
can be very helpful, many things that they do can be accomplished simply using 
base R.  

# Create data frame with three columns
one <- rep(1,10)
one
two <- rep(2,10)
two
three <- rep(3,10)
three
mydata <- data.frame(one=one, two=two, three=three)
cat("Data frame with three columns\n")
mydata

# Drop the column whose name starts with "th", i.e. column three
# Find the location of the column; "^" anchors the match to the start of the name
ColumnToDelete <- grep("^th", colnames(mydata))
cat("The column to be dropped is the column called three, which is column",
    ColumnToDelete, "\n")
ColumnToDelete

# Drop the column whose name starts with "th"
# (drop = FALSE keeps the result a data frame even if only one column remains)
newdata2 <- mydata[, -ColumnToDelete, drop = FALSE]
cat("Data frame after dropping column whose name is three\n")
newdata2

I hope this helps.
John



From: R-help  on behalf of Valentin Petzel 

Sent: Saturday, January 14, 2023 1:21 PM
To: avi.e.gr...@gmail.com
Cc: 'R-help Mailing List'
Subject: Re: [R] Removing variables from data frame with a wile card

Hello Avi,

while something like d$something <- ... may seem like you're directly modifying 
the data it does not actually do so. Most R objects try to be immutable, that 
is, the object may not change after creation. This guarantees that if you have 
a binding for the same object, the object won't change sneakily.

There is one data structure that is in fact mutable: environments. For 
example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter will in fact work, as the same Environment is modified, while in the 
first one a modified copy of the list is made.

Under the hood we have a parser trick: If R sees something like

f(a) <- ...

it will look for a function `f<-` and call

a <- `f<-`(a, ...)

(this also happens for example when you do names(x) <- ...)

So in fact in our case this is equivalent to creating a copy with removed 
columns and rebinding the symbol in the current environment to the result.

The data.table package breaks with this convention and uses C based routines 
that allow changing of data without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gr...@gmail.com:

> Steven,
>
> Just want to add a few things to what people wrote.
>
> In base R, the methods mentioned will let you make a copy of your original DF 
> that is missing the items you are selecting that match your pattern.
>
> That is fine.
>
> For some purposes, you want to keep the original data.frame and remove a 
> column within it. You can do that in several ways but the simplest is 
> something where you set the column to NULL as in:
>
> mydata$NAME <- NULL
>
> Using the mydata["NAME"] notation, you can do that in a loop or with a 
> functional programming method that does it for all the matches from your grep.
>
> R does have optimizations that make this less useful as a partial copy of a 
> data.frame retains common parts till things change.
>
> For those who like to use the tidyverse, it comes with lots of tools that let 
> you select columns that start with or end with or contain some pattern and I 
> find that way easier.
>
>
>
> -Original Message-
> From: R-help  On Behalf Of Steven Yen
> Sent: Saturday, January 14, 2023 7:49 AM
> To: Andrew 

Re: [R] Removing variables from data frame with a wile card

2023-01-15 Thread Sorkin, John
I am new to this thread. At the risk of presenting something that has been 
shown before, below I demonstrate how a column in a data frame can be dropped 
using a wild card, i.e. a column whose name starts with "th" using nothing more 
than base r functions and base R syntax. While additions to R such as tidyverse 
can be very helpful, many things that they do can be accomplished simply using 
base R.  

# Create data frame with three columns
one <- rep(1,10)
one
two <- rep(2,10)
two
three <- rep(3,10)
three
mydata <- data.frame(one=one, two=two, three=three)
cat("Data frame with three columns\n")
mydata

# Drop the column whose name starts with "th", i.e. column three
# Find the location of the column; "^" anchors the match to the start of the name
ColumnToDelete <- grep("^th", colnames(mydata))
cat("The column to be dropped is the column called three, which is column",
    ColumnToDelete, "\n")
ColumnToDelete

# Drop the column whose name starts with "th"
# (drop = FALSE keeps the result a data frame even if only one column remains)
newdata2 <- mydata[, -ColumnToDelete, drop = FALSE]
cat("Data frame after dropping column whose name is three\n")
newdata2

I hope this helps.
John



From: R-help  on behalf of Valentin Petzel 

Sent: Saturday, January 14, 2023 1:21 PM
To: avi.e.gr...@gmail.com
Cc: 'R-help Mailing List'
Subject: Re: [R] Removing variables from data frame with a wile card

Hello Avi,

while something like d$something <- ... may seem like you're directly modifying 
the data it does not actually do so. Most R objects try to be immutable, that 
is, the object may not change after creation. This guarantees that if you have 
a binding for the same object, the object won't change sneakily.

There is one data structure that is in fact mutable: environments. For 
example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter will in fact work, as the same Environment is modified, while in the 
first one a modified copy of the list is made.

Under the hood we have a parser trick: If R sees something like

f(a) <- ...

it will look for a function `f<-` and call

a <- `f<-`(a, ...)

(this also happens for example when you do names(x) <- ...)

So in fact in our case this is equivalent to creating a copy with removed 
columns and rebinding the symbol in the current environment to the result.

The data.table package breaks with this convention and uses C based routines 
that allow changing of data without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gr...@gmail.com:

> Steven,
>
> Just want to add a few things to what people wrote.
>
> In base R, the methods mentioned will let you make a copy of your original DF 
> that is missing the items you are selecting that match your pattern.
>
> That is fine.
>
> For some purposes, you want to keep the original data.frame and remove a 
> column within it. You can do that in several ways but the simplest is 
> something where you set the column to NULL as in:
>
> mydata$NAME <- NULL
>
> Using the mydata["NAME"] notation, you can do that in a loop or with a 
> functional programming method that does it for all the matches from your grep.
>
> R does have optimizations that make this less useful as a partial copy of a 
> data.frame retains common parts till things change.
>
> For those who like to use the tidyverse, it comes with lots of tools that let 
> you select columns that start with or end with or contain some pattern and I 
> find that way easier.
>
>
>
> -Original Message-
> From: R-help  On Behalf Of Steven Yen
> Sent: Saturday, January 14, 2023 7:49 AM
> To: Andrew Simmons 
> Cc: R-help Mailing List 
> Subject: Re: [R] Removing variables from data frame with a wile card
>
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
>> On Jan 14, 2023, at 3:08 PM, Andrew Simmons  wrote:
>>
>> You'll want to use grep() or grepl(). By default, grep() uses
>> extended regular expressions to find matches, but you can also use
>> perl regular expressions and globbing (after converting to a regular 
>> expression).
>> For example:
>>
>> grepl("^yr", colnames(mydata))
>>
>> will tell you which 'colnames' start with "yr". If you'd rather you
>> use globbing:
>>
>> grepl(glob2rx("yr*"), colnames(mydata))
>>
>> Then you might write something like this to remove the columns starting with 
>> yr:
>>
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>>
>>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen  wrote:
>>>
>>> I have a data frame containing variables "yr3",...,"yr28".
>>>
>>> How do I remove them with a wild card, something similar to "del yr*"
>>> in Windows/DOS? Thank you.
>>>
 colnames(mydata)
>>>   [1] "year"   "weight" "confeduc"   "confothr" "college"
>>>   [6] ...
>>> [41] "yr3"    "yr4"    "yr5"    "yr6" "yr7"
>>> [46] "yr8"    "yr9"    "yr10"   "yr11" "yr12"
>>> [51] "yr13"   "yr14"   "yr15"   "yr16" "yr17"
>>> [56] "yr18"   "yr19"   "yr20"   "yr21" 

Re: [R] return value of {....}

2023-01-15 Thread akshay kulkarni
Dear Valentin,
  Thanks for the comprehensive background.

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Valentin Petzel 
Sent: Friday, January 13, 2023 4:48 AM
To: akshay kulkarni 
Cc: R help Mailing list 
Subject: Re: [R] return value of {}

Hello Akshay,

R is quite inspired by LISP, where this is a common thing. It is not in fact 
that {...} returned something, rather any expression evaluates to some value, 
and for a compound statement that is the last evaluated expression.

{...} might be seen as similar to LISP's (begin ...).

Now this is a very different thing compared to {...} in something like C, even 
if it looks or behaves similarly. But in R {...} is in fact an expression and 
thus has to evaluate to some value. This also comes with some nice benefits.

You do not need to use {...} for anything that is a single statement. But you 
can in each possible place use {...} to turn multiple statements into one.

Now think about a statement like this

f <- function(n) {
x <- runif(n)
x**2
}

Then we can do

y <- f(10)

Now, your suggested way would look like this:

f <- function(n) {
x <- runif(n)
y <- x**2
}

And we'd need to do something like:

f(10)
y <- somehow_get_last_env_of_f$y

So having a compound statement evaluate to a value clearly has a benefit.

Best Regards,
Valentin


09.01.2023 18:05:58 akshay kulkarni :

Dear Valentin,
  But why should {} "return" a value? It could just 
as well evaluate all the expressions and store the resulting objects in 
whatever environment the interpreter chooses, and then it would be left to the 
user to manipulate any object he chooses. Don't you think returning the last, 
or any value, is redundant? We are living in the 21st century world, and the 
R-core team might, I suppose, have a definite reason for "returning" the last 
value. Any comments?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI


From: Valentin Petzel 
Sent: Monday, January 9, 2023 9:18 PM
To: akshay kulkarni 
Cc: R help Mailing list 
Subject: Re: [R] return value of {}

Hello Akshai,

I think you are confusing {...} with local({...}). This one will evaluate the 
expression in a separate environment, returning the last expression.

{...} simply evaluates multiple expressions as one and returns the result of 
the last line, but it still evaluates each expression.
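
A minimal sketch of the difference, using made-up variable names: both forms evaluate to the value of their last expression, but only plain braces leave the intermediate variables behind in the current environment.

```r
# Plain braces: brace_a and brace_b are created in the current environment
r1 <- { brace_a <- 1; brace_b <- 2; brace_a + brace_b }
brace_visible <- exists("brace_a", envir = environment(), inherits = FALSE)

# local(): local_a and local_b live in a temporary environment that is discarded
r2 <- local({ local_a <- 10; local_b <- 20; local_a + local_b })
local_visible <- exists("local_a", envir = environment(), inherits = FALSE)
```

Here r1 is 3 and r2 is 30; brace_visible is TRUE while local_visible is FALSE.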

Assignment returns the assigned value, so we can chain assignments like this

a <- 1 + (b <- 2)

conveniently.

So when is {...} useful? Well, anyplace where you want to execute complex stuff 
in a function argument. E.g. you might do:

data %>% group_by(x) %>% summarise(y = {if(x[1] > 10) sum(y) else mean(y)})

Regards,
Valentin Petzel

09.01.2023 15:47:53 akshay kulkarni :

> Dear members,
>  I have the following code:
>
>> TB <- {x <- 3;y <- 5}
>> TB
> [1] 5
>
> It is consistent with the documentation: For {, the result of the last 
> expression evaluated. This has the visibility of the last evaluation.
>
> But both x AND y are created, but the "return value" is y. How can this be 
> advantageous for solving practical problems? Specifically, consider the 
> following code:
>
> F <- function(X) {  expr; expr2; { expr5; expr7}; expr8;expr10}
>
> Both expr5 and expr7 are created, and are accessible by the code outside of 
> the nested braces right? But the "return value" of the nested braces is 
> expr7. So doesn't this mean that only expr7 should be accessible? Please help 
> me untangle this (of course the return value of F is expr10, and all the 
> other objects created by the preceding expressions are deleted. But expr5 is 
> not, after the control passes outside of the nested braces!)
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] return value of {....}

2023-01-15 Thread akshay kulkarni
Dear Bill,
 Thanks for your reply.

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Bill Dunlap 
Sent: Friday, January 13, 2023 10:48 PM
To: Valentin Petzel 
Cc: akshay kulkarni ; R help Mailing list 

Subject: Re: [R] return value of {}

R's
   { expr1; expr2; expr3}
acts much like C's
   ( expr1, expr2, expr3)

E.g.,

$ cat a.c
#include <stdio.h>

int main(int argc, char* argv[])
{
double y = 10 ;
double x = (printf("Starting... "), y = y + 100, y * 20);
printf("Done: x=%g, y=%g\n", x, y);
return 0;
}
$ gcc -Wall a.c
$ ./a.out
Starting... Done: x=2200, y=110

I don't like that syntax (e.g., commas between expressions instead of the usual 
semicolons after expressions).  Perhaps John Chambers et al. didn't either.

-Bill

On Fri, Jan 13, 2023 at 2:28 AM Valentin Petzel wrote:
Hello Akshay,

R is quite inspired by LISP, where this is a common thing. It is not in fact 
that {...} returned something, rather any expression evaluates to some value, 
and for a compound statement that is the last evaluated expression.

{...} might be seen as similar to LISP's (begin ...).

Now this is a very different thing compared to {...} in something like C, even 
if it looks or behaves similarly. But in R {...} is in fact an expression and 
thus has to evaluate to some value. This also comes with some nice benefits.

You do not need to use {...} for anything that is a single statement. But you 
can in each possible place use {...} to turn multiple statements into one.

Now think about a statement like this

f <- function(n) {
x <- runif(n)
x**2
}

Then we can do

y <- f(10)

Now, your suggested way would look like this:

f <- function(n) {
x <- runif(n)
y <- x**2
}

And we'd need to do something like:

f(10)
y <- somehow_get_last_env_of_f$y

So having a compound statement evaluate to a value clearly has a benefit.

Best Regards,
Valentin

09.01.2023 18:05:58 akshay kulkarni :

> Dear Valentin,
>   But why should {} "return" a value? It could 
> just as well evaluate all the expressions and store the resulting objects in 
> whatever environment the interpreter chooses, and then it would be left to 
> the user to manipulate any object he chooses. Don't you think returning the 
> last, or any value, is redundant? We are living in the 21st century world, 
> and the R-core team might, I suppose, have a definite reason for "returning" 
> the last value. Any comments?
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
>
> 
> *From:* Valentin Petzel
> *Sent:* Monday, January 9, 2023 9:18 PM
> *To:* akshay kulkarni
> *Cc:* R help Mailing list
> *Subject:* Re: [R] return value of {}
>
> Hello Akshai,
>
> I think you are confusing {...} with local({...}). This one will evaluate the 
> expression in a separate environment, returning the last expression.
>
> {...} simply evaluates multiple expressions as one and returns the result of 
> the last line, but it still evaluates each expression.
>
> Assignment returns the assigned value, so we can chain assignments like this
>
> a <- 1 + (b <- 2)
>
> conveniently.
>
> So when is {...} useful? Well, anyplace where you want to execute complex 
> stuff in a function argument. E.g. you might do:
>
> data %>% group_by(x) %>% summarise(y = {if(x[1] > 10) sum(y) else mean(y)})
>
> Regards,
> Valentin Petzel
>
> 09.01.2023 15:47:53 akshay kulkarni :
>
>> Dear members,
>>  I have the following code:
>>
>>> TB <- {x <- 3;y <- 5}
>>> TB
>> [1] 5
>>
>> It is consistent with the documentation: For {, the result of the last 
>> expression evaluated. This has the visibility of the last evaluation.
>>
>> But both x AND y are created, but the "return value" is y. How can this be 
>> advantageous for solving practical problems? Specifically, consider the 
>> following code:
>>
>> F <- function(X) {  expr; expr2; { expr5; expr7}; expr8;expr10}
>>
>> Both expr5 and expr7 are created, and are accessible by the code outside of 
>> the nested braces right? But the "return value" of the nested braces is 
>> expr7. So doesn't this mean that only expr7 should be accessible? Please 
>> help me untangle this (of course the return value of F is expr10, and all 
>> the other objects created by the preceding expressions are deleted. But 
>> expr5 is not, after the control passes outside of the nested braces!)
>>
>> Thanking you,
>> Yours sincerely,
>> AKSHAY M KULKARNI
>>

Re: [R] return value of {....}

2023-01-15 Thread akshay kulkarni
Dear Heinz,
 Thanks for your reply... reason is as old as the Sun..!

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: R-help  on behalf of Heinz Tuechler 

Sent: Friday, January 13, 2023 4:30 PM
To: r-help@r-project.org 
Subject: Re: [R] return value of {}

> 09.01.2023 18:05:58 akshay kulkarni :
>
> We are living in the 21st century world, and the R-core team might, I suppose, 
> have a definite reason ...
>

Maybe compatibility reasons with S and R-versions from the 20th century?
But maybe, you would have expected some reason even then.

best regards,

Heinz

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] return value of {....}

2023-01-15 Thread akshay kulkarni
Dear Leonard,
 I think Avi's response was best... it's just a 
design... as he said, there are other possibilities in other programming 
languages which augment this design... thanks anyway for your reply...

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Leonard Mada 
Sent: Friday, January 13, 2023 1:44 AM
To: Akshay Kulkarni 
Cc: R-help Mailing List 
Subject: Re: [R] return value of {}

Dear Akshay,

The best response was given by Andrew. "{...}" is not a closure.

This is unusual for someone used to C-type languages. But I will try to
explain some of the rationale.

If "{...}" were a closure (a block with its own scope), then external variables would
need to be explicitly declared before the closure (in order to reuse
those values):
intermediate = c()
{
 intermediate = ...;
 result = someFUN(intermediate);
}

1.) Interactive Sessions
This is cumbersome in interactive sessions. For example: you often
compute the mean or the variance as intermediary results, and will need
them later on as well. They could have been computed outside the
"closure", but writing code in interactive sessions may not always be
straightforward.
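
A minimal sketch of the interactive-session point above, assuming a small made-up vector x: values assigned inside "{...}" stay available afterwards, whereas local() evaluates the same kind of block in a fresh environment and discards the intermediates. All variable names here are illustrative only.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data for illustration

# Plain braces: m and s are created in the calling environment.
z <- {
  m <- mean(x)
  s <- sd(x)
  (x - m) / s                    # value of the last expression is returned
}
m_still_there <- exists("m", inherits = FALSE) && exists("s", inherits = FALSE)

# local(): the intermediate centre_tmp lives only inside the block.
centred <- local({
  centre_tmp <- mean(x)
  x - centre_tmp
})
tmp_visible <- exists("centre_tmp", inherits = FALSE)

print(m_still_there)   # TRUE
print(tmp_visible)     # FALSE
```

So the intermediate mean and standard deviation can be reused later in the session without having been declared beforehand.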

2.) Missing arguments
f = function(x, y) {
  if(missing(y)) {
    # assuming x = matrix
    y = x[,2]; x = x[,1];
  }
}
It would be much more cumbersome to define/use a temporary tempY.
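
A runnable variant of the missing-argument pattern above, as a sketch only: the function name split_xy and the returned list are assumptions added for illustration. Because the inner "{...}" of the if() does not open a new scope, the reassigned x and y are visible in the rest of the function body.

```r
# Hypothetical helper: accepts either two vectors or a two-column matrix.
split_xy <- function(x, y) {
  if (missing(y)) {
    # assuming x is a matrix with at least two columns
    y <- x[, 2]
    x <- x[, 1]
  }
  # x and y assigned inside the if-block are still visible here
  list(x = x, y = y)
}

mat <- cbind(1:3, 4:6)
res_matrix  <- split_xy(mat)        # y taken from the second column
res_vectors <- split_xy(1:3, 4:6)   # y supplied explicitly
identical(res_matrix, res_vectors)  # TRUE
```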

I hope this gives a better perspective why this is indeed a useful
feature - even if it is counterintuitive.

Sincerely,

Leonard





Re: [R] Removing variables from data frame with a wile card

2023-01-15 Thread Valentin Petzel
Hello Avi,

While something like d$something <- ... may seem like you're directly modifying 
the data, it does not actually do so. Most R objects try to be immutable, that 
is, the object may not change after creation. This guarantees that if you have 
another binding for the same object, the object won't change sneakily.

There is one data structure that is in fact mutable: environments. For 
example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter will in fact work (E$a is 3), as the same environment is modified, 
while in the first one a modified copy of the list is made inside local() and 
the outer L is left unchanged (L$a is NULL).

Under the hood we have a parser trick: If R sees something like

f(a) <- ...

it will look for a function f<- and call

a <- `f<-`(a, ...)

(this also happens for example when you do names(x) <- ...)
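
A small sketch of that rewriting, using a hypothetical replacement function `second<-` (it is not part of base R and is defined here only for illustration). The last argument of a replacement function must be named value.

```r
# Hypothetical replacement function: sets the second element of a vector.
`second<-` <- function(x, value) {
  x[2] <- value
  x            # a replacement function must return the modified object
}

v <- c(10, 20, 30)
second(v) <- 99                   # the assignment form

w <- c(10, 20, 30)
w <- `second<-`(w, value = 99)    # what R effectively evaluates

identical(v, w)                   # TRUE
```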

So in fact in our case this is equivalent to creating a copy with the columns 
removed and rebinding the symbol in the current environment to the result.
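
To illustrate with base R (a sketch on a made-up data frame; only the column names follow the thread): a second binding to the same data frame is unaffected when columns are removed through the first one.

```r
mydata   <- data.frame(year = 2001:2003, yr3 = 1:3, yr4 = 4:6)  # made-up data
original <- mydata               # second binding to the same object

mydata$yr3 <- NULL               # looks like in-place modification ...
mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]

colnames(mydata)                 # "year"
colnames(original)               # "year" "yr3" "yr4": the old object is intact
```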

The data.table package breaks with this convention and uses C-based routines 
that allow changing data without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.
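
A hedged sketch of the difference, assuming the data.table package is installed (the code is skipped otherwise) and using a made-up table: here a second binding does see the change, because no copy is made.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  dt <- data.table(year = 2001:2003, yr3 = 1:3, yr4 = 4:6)  # made-up data
  dt_alias <- dt                     # second binding, no copy is made

  cols_to_remove <- grep("^yr", names(dt), value = TRUE)
  dt[, (cols_to_remove) := NULL]     # removes the columns by reference

  print(names(dt))                   # "year"
  print(names(dt_alias))             # "year" as well: same object was changed
}
```

Use data.table::copy() when an independent copy is wanted before modifying by reference.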

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gr...@gmail.com:

> Steven,
> 
> Just want to add a few things to what people wrote.
> 
> In base R, the methods mentioned will let you make a copy of your original DF 
> that is missing the items you are selecting that match your pattern.
> 
> That is fine.
> 
> For some purposes, you want to keep the original data.frame and remove a 
> column within it. You can do that in several ways but the simplest is 
> something where you set the column to NULL as in:
> 
> mydata$NAME <- NULL
> 
> Using the mydata["NAME"] notation can do that for you by using a loop or 
> functional programming method that does that with all components of your grep.
> 
> R does have optimizations that make this less useful as a partial copy of a 
> data.frame retains common parts till things change.
> 
> For those who like to use the tidyverse, it comes with lots of tools that let 
> you select columns that start with or end with or contain some pattern and I 
> find that way easier.
> 
> 
> 
> -Original Message-
> From: R-help  On Behalf Of Steven Yen
> Sent: Saturday, January 14, 2023 7:49 AM
> To: Andrew Simmons 
> Cc: R-help Mailing List 
> Subject: Re: [R] Removing variables from data frame with a wile card
> 
> Thanks to all. Very helpful.
> 
> Steven from iPhone
> 
>> On Jan 14, 2023, at 3:08 PM, Andrew Simmons  wrote:
>> 
>> You'll want to use grep() or grepl(). By default, grep() uses
>> extended regular expressions to find matches, but you can also use
>> perl regular expressions and globbing (after converting to a regular 
>> expression).
>> For example:
>> 
>> grepl("^yr", colnames(mydata))
>> 
>> will tell you which 'colnames' start with "yr". If you'd rather
>> use globbing:
>> 
>> grepl(glob2rx("yr*"), colnames(mydata))
>> 
>> Then you might write something like this to remove the columns starting with 
>> yr:
>> 
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>> 
>>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen  wrote:
>>> 
>>> I have a data frame containing variables "yr3",...,"yr28".
>>> 
>>> How do I remove them with a wild card... something similar to "del yr*"
>>> in Windows/DOS? Thank you.
>>> 
>>> > colnames(mydata)
>>>   [1] "year"   "weight" "confeduc"   "confothr" "college"
>>>   [6] ...
>>> [41] "yr3"    "yr4"    "yr5"    "yr6" "yr7"
>>> [46] "yr8"    "yr9"    "yr10"   "yr11" "yr12"
>>> [51] "yr13"   "yr14"   "yr15"   "yr16" "yr17"
>>> [56] "yr18"   "yr19"   "yr20"   "yr21" "yr22"
>>> [61] "yr23"   "yr24"   "yr25"   "yr26" "yr27"
>>> [66] "yr28"...
>>> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the