Re: [Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Duncan Murdoch

On 19/03/2023 2:43 p.m., Gabriel Becker wrote:
> I have to say <<- is a core debugging tool when assigning into the
> global environment. I suppose I could use assign but that would be
> somewhat annoying.
>
> That said, I'm still for this change. In the overwhelming majority of
> cases where <<- appears in my package code (already rare, but it does
> happen), it would absolutely be a bug (most likely a typo) for it to
> reach the global environment and assign into it. Assigning into the
> global environment from package code is a serious anti-pattern anyway.
>
> To be honest, from the developer perspective, what I'd personally
> want is an assigner that was willing to go up exactly one frame from the
> current one to find its binding. That is essentially how I always use
> <<- myself.


This sounds like a linter would be appropriate:  any time you make an 
assignment that goes more than one level up, it warns you about it.


Other linter rules could limit the destination in other ways, e.g. 
assigning to globalenv() or things in the search list could be disallowed.


Another error I've made a few times is to use "<-" by mistake when "<<-" 
was intended.  A linter could detect this by seeing both `x <- value1` 
and `x <<- value2` in the same context.  That's legal, but (for me at 
least) it usually indicates that one of them is a typo.
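
A rough sketch of that last check in base R, assuming "the same context" means one quoted expression (for example a function body, nested functions included). The helper names collect_assignments() and mixed_assignments() are made up for illustration and are not part of any existing linter. Only plain-symbol targets such as `x <- ...` are considered, so `x[i] <- ...` is ignored.

```r
# Collect names assigned with `<-`/`=` and with `<<-` inside an expression.
# Hypothetical helper for illustration only.
collect_assignments <- function(expr) {
  local_names <- character()
  super_names <- character()
  walk <- function(e) {
    if (!is.call(e)) return(invisible(NULL))
    head <- e[[1]]
    if (is.symbol(head) && length(e) == 3L && is.symbol(e[[2]])) {
      op <- as.character(head)
      if (op %in% c("<-", "=")) {
        local_names <<- c(local_names, as.character(e[[2]]))
      } else if (op == "<<-") {
        super_names <<- c(super_names, as.character(e[[2]]))
      }
    }
    for (i in seq_along(e)) {
      # Skip empty arguments such as the second index in x[1, ]
      if (!identical(e[[i]], quote(expr = ))) walk(e[[i]])
    }
    invisible(NULL)
  }
  walk(expr)
  list(local = unique(local_names), super = unique(super_names))
}

# Names that are the target of both kinds of assignment
mixed_assignments <- function(expr) {
  found <- collect_assignments(expr)
  intersect(found$local, found$super)
}

body_expr <- quote({
  total <- 0
  add <- function(x) {
    total <- total + x    # possibly a typo for `<<-`
    total <<- total + x
  }
})
mixed_assignments(body_expr)
```

Here `total` is reported because it is assigned with both operators. A real linter would need a finer notion of context, for instance to avoid flagging the declaration `total <- 0` in the enclosing function.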


Duncan Murdoch




Re: [Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Gabriel Becker
I have to say <<- is a core debugging tool when assigning into the global
environment. I suppose I could use assign but that would be somewhat
annoying.

That said, I'm still for this change. In the overwhelming majority of cases
where <<- appears in my package code (already rare, but it does happen), it
would absolutely be a bug (most likely a typo) for it to reach the global
environment and assign into it. Assigning into the global environment from
package code is a serious anti-pattern anyway.

To be honest, from the developer perspective, what I'd personally want is an
assigner that was willing to go up exactly one frame from the current one to
find its binding. That is essentially how I always use <<- myself.

~G
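
For illustration, such a one-level-up assigner could be sketched in base R as follows. The name assign_one_up() is hypothetical, and the sketch assumes that "one frame up" means the environment enclosing the calling function (where <<- starts looking), not the calling frame on the stack.

```r
# Hypothetical helper: assign `value` to `name` only if `name` is already
# bound in the environment that encloses the caller's frame.
assign_one_up <- function(name, value) {
  caller <- parent.frame()       # frame of the function that called us
  target <- parent.env(caller)   # its enclosing environment, one level up
  if (!exists(name, envir = target, inherits = FALSE)) {
    stop("no binding for '", name, "' exactly one level up")
  }
  assign(name, value, envir = target)
  invisible(value)
}

make_counter <- function() {
  count <- 0
  function() {
    assign_one_up("count", count + 1)
    count
  }
}

counter <- make_counter()
counter()
counter()

# A misspelled name is an error instead of a silent global assignment
typo <- function() assign_one_up("cuont", 1)
result <- tryCatch(typo(), error = function(e) conditionMessage(e))
result
```

Unlike <<-, this never searches further up and never falls back to the global environment.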


Re: [Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Duncan Murdoch

On 19/03/2023 2:15 p.m., Bill Dunlap wrote:
> Why should it make an exception for cases where the
> about-to-be-assigned-to name is present in the global environment?  I
> think it should warn or give an error if the altered variable is in any
> environment on the search list.


I'd say code like this should work:

  x <- NULL
  f <- function() x <<- 123
  f()

and then x should be changed to 123 unless the binding to x is locked. 
I don't see why it should matter if this code is in local() or in a 
function, or if it is run at the top level.


For most things on the search list, the binding would be locked, but we 
do allow people to attach environments, and then they'd be on the search 
list, so this code should work too:


  g <- function() {
    attach(environment())
    x <- NULL
    f <- function() x <<- 123
    f()
  }

What shouldn't work would be something like

  mean <<- 3

but it already doesn't work (contrary to the documentation), giving

  Error: cannot change value of locked binding for 'mean'

(which makes sense; what if I locked a binding in the global 
environment?  Then we'd go to the fallback, but the fallback can't work, 
because the binding is already there but locked...)
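
The locked-binding behaviour can be reproduced without touching the global environment. A minimal sketch, where outer_env, inner_env and demo_value are names invented for the example:

```r
outer_env <- new.env()
inner_env <- new.env(parent = outer_env)

# "Declare" the variable one level above where `<<-` is evaluated
assign("demo_value", NULL, envir = outer_env)
evalq(demo_value <<- 123, inner_env)   # found in outer_env and redefined
get("demo_value", envir = outer_env)

# Once the binding is locked, `<<-` signals an error rather than
# skipping it or falling back to the global environment
lockBinding("demo_value", outer_env)
msg <- tryCatch(
  evalq(demo_value <<- 456, inner_env),
  error = function(e) conditionMessage(e)
)
msg
exists("demo_value", envir = globalenv(), inherits = FALSE)
```

The search stops at the first environment that has a binding for the name, locked or not, which is consistent with the `mean <<- 3` error above.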


Duncan Murdoch




Re: [Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Bill Dunlap
Why should it make an exception for cases where the about-to-be-assigned-to
name is present in the global environment?  I think it should warn or give
an error if the altered variable is in any environment on the search list.

-Bill


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Duncan Murdoch
I think that should be the default behaviour. It's pretty late to get 
that into R 4.3.0, but I think your proposal (with check.superassignment 
= FALSE being the default) could make it in, and 4.4.0 could change the 
default to TRUE.


Duncan



On 19/03/2023 12:08 p.m., Henrik Bengtsson wrote:

I'd like to be able to prevent the <<- assignment operator ("super
assignment") from assigning to the global environment unless the
variable already exists and is not locked.  If it does not exist or is
locked, I'd like an error to be produced.  This would allow me to
evaluate expressions with this temporarily set to protect against
mistakes.

For example, I'd like to do something like:

$ R --vanilla
> exists("a")
[1] FALSE

> options(check.superassignment = TRUE)
> local({ a <<- 1 })
Error: object 'a' not found

> a <- 0
> local({ a <<- 1 })
> a
[1] 1

> rm("a")
> options(check.superassignment = FALSE)
> local({ a <<- 1 })
> exists("a")
[1] TRUE


BACKGROUND:

 From help("<<-") we have:

"The operators <<- and ->> are normally only used in functions, and
cause a search to be made through parent environments for an existing
definition of the variable being assigned. If such a variable is found
(and its binding is not locked) then its value is redefined, otherwise
assignment takes place in the global environment."

I argue that it's unfortunate that <<- falls back to assigning to
the global environment if the variable does not already exist.
Unfortunately, it has become a "go to" solution for many to use it
that way.  Sometimes it is intended, sometimes it's a mistake.  We
find it also in R packages on CRAN, even if 'R CMD check' tries to
detect when it happens (but it can only do so from run-time
examples and tests).

It's probably too widely used for us to change to a stricter
behavior permanently.  The proposed R option allows me, as a developer,
to evaluate an R expression with the strict behavior, especially if I
don't trust the code.

With 'check.superassignment = TRUE' set, a developer would have to
first declare the variable in the global environment for <<- to assign
there.  This would remove the fallback "If such a variable is found
(and its binding is not locked) then its value is redefined, otherwise
assignment takes place in the global environment" in the current
design.  Those who truly intend to assign to the global environment
could use assign(var, value, envir = globalenv()) or
globalenv()[[var]] <- value.
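
The strict behavior can be approximated today in user code. A minimal sketch, assuming a hypothetical helper strict_superassign() (not part of R, and not the proposed implementation) that walks the enclosing environments the way <<- does but signals an error instead of falling back to the global environment:

```r
# Hypothetical emulation of the proposed strict mode.
strict_superassign <- function(name, value, envir = parent.frame()) {
  env <- parent.env(envir)            # `<<-` starts one level above
  repeat {
    if (identical(env, emptyenv())) {
      stop("object '", name, "' not found")
    }
    if (exists(name, envir = env, inherits = FALSE)) break
    env <- parent.env(env)
  }
  assign(name, value, envir = env)    # still an error if the binding is locked
  invisible(value)
}

# Not declared anywhere: an error, and nothing is created
f <- function() strict_superassign("undeclared_total", 1)
f_result <- tryCatch(f(), error = function(e) conditionMessage(e))
f_result

# Declared first: the existing binding is updated
declared_total <- 0
g <- function() strict_superassign("declared_total", declared_total + 1)
g()
declared_total
```

The error message mirrors the one in the example session above; the actual wording under the proposed option is of course up to the implementation.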

'R CMD check' could temporarily set 'check.superassignment = TRUE'
during checks.  If we let environment variable
'R_CHECK_SUPERASSIGNMENT' set the default value of option
'check.superassignment' on R startup, it would be possible to check
packages optionally this way, but also to run any "non-trusted" R
script in the "strict" mode.


TEASER:

Here's an example of why using <<- for assigning to the global
environment is a bad idea:

This works:

$ R --vanilla
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
> keep
[1] 3


This doesn't work:

$ R --vanilla
> library(purrr)
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
Error in keep <<- x : cannot change value of locked binding for 'keep'


But, if we "declare" the variable first, it works:

$ R --vanilla
> library(purrr)
> keep <- 0
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
> keep
[1] 3
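
To see where <<- would land, one can walk the same chain of environments by hand. A minimal sketch, with where_bound() being a hypothetical helper; 'mean' is used so that the example runs without purrr:

```r
# Hypothetical helper: return the first environment, starting at `env` and
# walking up through its parents, in which `name` is bound (NULL if none).
where_bound <- function(name, env = globalenv()) {
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) return(env)
    env <- parent.env(env)
  }
  NULL
}

# Starting from the global environment, the search continues along the
# search path until a binding is found
hit <- where_bound("mean")
environmentName(hit)
bindingIsLocked("mean", hit)

# A name bound nowhere gives NULL; this is the case where `<<-`
# currently falls back to creating a global variable
where_bound("surely_not_bound_anywhere_123")
```

With purrr attached and no 'keep' in the global environment, the same walk would be expected to stop at the package:purrr entry on the search path, where the binding is locked. After 'keep <- 0' it stops at the global environment instead.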


/Henrik

PS. Does the <<- operator have an official name? Hadley calls it
"super assignment" in 'Advanced R'
(https://adv-r.hadley.nz/environments.html), which is where I got it
from.
