Re: [R] Why does debugging print() change output of function?

2014-09-10 Thread William Dunlap
Another nice thing about using ~formula is that it stores the
environment in which the formula was made along with the formula.
Thus you know which envrionment should be used with evaluating it (and
don't have to guess that parent.frame() may be the right environmnet).
  E.g.,
  evalRHS - function(formula) {
 RHS - formula[[length(formula)]]
 eval(RHS, envir=environment(formula))
  }
  p1 - 1:3
  p2 - 11:13
  f1 - function(formula) {
  p1 - 1000
  evalRHS(formula)
  }
  f1(~p1+p2) # does not use the p1 defined in f1
  # [1] 12 14 16
If you do want to override some of the variables in the formula you
can do that by replacing the environment of the formula with a child
of that environment.  Do this in a function so it only replaces the
environment in a copy of the formula, not the original one.
  f2 - function(formula) {
  environment(formula) - list2env(list(p1=1000),
new.env(parent=environment(formula)))
  evalRHS(formula)
  }
  f2(~p1+p2)
  # [1] 1011 1012 1013
  f1(~p1+p2)
  # [1] 12 14 16

(These examples are not the best since the global environment is
searched in any case.  Put the entire example inside a function to see
the strength of the formula approach.)


Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Sep 9, 2014 at 10:13 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Sep 9, 2014, at 4:07 PM, peter dalgaard wrote:


 On 07 Sep 2014, at 00:31 , David Winsemius dwinsem...@comcast.net wrote:

 The goal:
 to create a function modeled after `subset` (notorious for its
 non-standard evaluation) that will take a series of logical tests as
 unquoted expressions to be evaluated in the framework of a dataframe
 environment and return a dataframe of logicals:

 ...

 A belated peep from the author of subset(): Don't!

 I think we learned the hard way by now that it is much easier to pass
 unevaluated expressions in the shape of formula objects or maybe expression
 objects. Lots of pain can be avoided by slipping in a simple ~.


 It's taken me several years to understand why you are probably correct in
 this regard. I needed to learn that `~` is actually a function that creates
 a language object.

 is.function(`~`)
 [1] TRUE
 is.language( ~ x  5  x  10)
 [1] TRUE

 ... and that it's rather easy to extract the object somewhat like but not
 really an expression embedded in such an object:

 is.expression( ~ x  5  x  10)
 [1] FALSE

 is.call( (~ x  5  x  10)[2] )
 [1] TRUE

 The task of learning the various types of language objects is not an easy
 one.

 --

 David Winsemius, MD
 Alameda, CA, USA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why does debugging print() change output of function?

2014-09-09 Thread peter dalgaard

On 07 Sep 2014, at 00:31 , David Winsemius dwinsem...@comcast.net wrote:

 The goal: 
   to create a function modeled after `subset` (notorious for its non-standard 
 evaluation) that will take a series of logical tests as unqiuoted expressions 
 to be evaluated in the framework of a dataframe environment and return a 
 dataframe of logicals:
...

A belated peep from the author of subset(): Don't!

I think we learned the hard way by now that it is much easier to pass 
unevaluated expressions in the shape of formula objects or maybe expression 
objects. Lots of pain can be avoided by slipping in a simple ~.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why does debugging print() change output of function?

2014-09-09 Thread David Winsemius


On Sep 9, 2014, at 4:07 PM, peter dalgaard wrote:



On 07 Sep 2014, at 00:31 , David Winsemius dwinsem...@comcast.net  
wrote:



The goal:
to create a function modeled after `subset` (notorious for its non- 
standard evaluation) that will take a series of logical tests as  
unquoted expressions to be evaluated in the framework of a  
dataframe environment and return a dataframe of logicals:

...

A belated peep from the author of subset(): Don't!

I think we learned the hard way by now that it is much easier to  
pass unevaluated expressions in the shape of formula objects or  
maybe expression objects. Lots of pain can be avoided by slipping in  
a simple ~.


It's taken me several years to understand why you are probably correct  
in this regard. I needed to learn that `~` is actually a function that  
creates a language object.


 is.function(`~`)
[1] TRUE
 is.language( ~ x  5  x  10)
[1] TRUE

... and that it's rather easy to extract the object somewhat like but  
not really an expression embedded in such an object:


 is.expression( ~ x  5  x  10)
[1] FALSE

 is.call( (~ x  5  x  10)[2] )
[1] TRUE

The task of learning the various types of language objects is not an  
easy one.


--

David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Why does debugging print() change output of function?

2014-09-06 Thread David Winsemius
The goal: 
   to create a function modeled after `subset` (notorious for its non-standard 
evaluation) that will take a series of logical tests as unqiuoted expressions 
to be evaluated in the framework of a dataframe environment and return a 
dataframe of logicals:


 mtest.data.frame - 
 function (x, ..., drop=FALSE) 
 { tests - list(...); print(tests)
 r - if (length(tests)==0) 
 stop(no 'tests')
 else { cbind.data.frame(
 lapply( tests, function(t){
  e - substitute(t)
  r - eval(e, x, parent.frame() )
  if ( !is.logical(r) ) {
  stop('tests' must be logical) }
 r  !is.na(r) } ) )
 }
 }
#--

testdata - structure(list(group1 = structure(1:7, .Label = c(Group A, 
Group B, Group C, Group D, Group E, Group F, Group G
), class = factor), group2 = structure(c(3L, 3L, 2L, 1L, 1L, 
2L, 3L), .Label = c(LS, SS, UNC), class = factor), valid1 = 
structure(c(2L, 
1L, NA, 1L, 2L, 2L, 1L), .Label = c(N, Y), class = factor), 
valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c(N, 
Y), class = factor), valid3 = structure(c(4L, 3L, NA, 
2L, 1L, NA, 5L), .Label = c(0.3, 0.7, 1.2, 1.4, 1.7
), class = factor), valid4 = structure(c(2L, 1L, 3L, 4L, 
1L, 1L, 5L), .Label = c(0.3, 0.4, 0.53, 0.66, 0.71
), class = factor), valid5 = structure(c(4L, 1L, NA, NA, 
3L, NA, 2L), .Label = c(11.2, 11.7, 8.3, 8.5), class = factor)), 
.Names = c(group1, 
group2, valid1, valid2, valid3, valid4, valid5), row.names = c(NA, 
-7L), class = data.frame)

###


 mtest.data.frame(testdata, valid2==N, valid3  1)
[[1]]
[1] tests are

[[2]]
[1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE

[[3]]
[1]  TRUE  TRUENA FALSE FALSENA  TRUE

This actually seemed to be somewhat successful, but when ... 

Now if I take out the `print()` call for 'tests', I get an different answer:

 mtest.data.frame - 
+  function (x, ..., drop=FALSE) 
+  { tests - list(...)
+  r - if (length(tests)==0) 
+  stop(no 'tests')
+  else { cbind.data.frame(
+  lapply( tests, function(t){
+   e - substitute(t)
+   r - eval(e, x, parent.frame() )
+   if ( !is.logical(r) ) {
+   stop('tests' must be logical) }
+  r  !is.na(r) } ) )
+  }
+  }
 mtest.data.frame(testdata, valid2==N, valid3  1)
   # i.e. no answer

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why does debugging print() change output of function?

2014-09-06 Thread jwd
On Sat, 6 Sep 2014 15:31:49 -0700
David Winsemius dwinsem...@comcast.net wrote:

The only other difference I see is a missing semicolon in the second
example, which, in the first precedes your print() instruction.

JWDougherty

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why does debugging print() change output of function?

2014-09-06 Thread William Dunlap
In your first example I get an error:
   mtest.data.frame(testdata, valid2==N, valid3  1)
  Error in mtest.data.frame(testdata, valid2 == N, valid3  1) :
object 'valid2' not found
I expect the error because list(...) ought to evaluate the ... arguments.

Use substitute() to get the unevaluated ... arguments up front and
don't use substitute() in the loop over the elements of test.

There are several ways to get the unevaluated ... arguments.  E.g.,
  f0 - function(x, ..., drop=FALSE) match.call(expand.dots=FALSE)$...
  f1 - function(x, ..., drop=FALSE) substitute(...())
  f2 - function(x, ..., drop=FALSE) as.list(substitute(list(...)))[-1]

Your function could be the following, where I also fixed a problem
with parent.frame() being
called in the wrong scope and improved, IMO, the names on the output data.frame.

m2 - function (x, ..., drop = FALSE, verbose = FALSE)
{
tests - substitute(...())
nms - names(tests) # fix up names, since data.frame makes ugly ones
if (is.null(nms)) {
names(tests) - paste0(T, seq_along(tests))
}
else if (any(nms == )) {
names(tests)[nms == ] - paste0(T, which(nms == ))
}
if (verbose) {
print(tests)
}
r - if (length(tests) == 0) {
stop(no 'tests')
}
else {
enclos - parent.frame() # evaluate parent.frame() outside of FUN()
data.frame(lapply(tests, FUN=function(e) {
r - eval(e, x, enclos)
if (!is.logical(r)) {
stop('tests' must be logical)
}
r  !is.na(r)
}))
}
r
}

used as:
 m2(testdata, group2==UNC, Eleven.Two=valid5==11.2)
 T1 Eleven.Two
1  TRUE  FALSE
2  TRUE   TRUE
3 FALSE  FALSE
4 FALSE  FALSE
5 FALSE  FALSE
6 FALSE  FALSE
7  TRUE  FALSE
Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Sat, Sep 6, 2014 at 3:31 PM, David Winsemius dwinsem...@comcast.net wrote:
 The goal:
to create a function modeled after `subset` (notorious for its 
 non-standard evaluation) that will take a series of logical tests as 
 unqiuoted expressions to be evaluated in the framework of a dataframe 
 environment and return a dataframe of logicals:


  mtest.data.frame -
  function (x, ..., drop=FALSE)
  { tests - list(...); print(tests)
  r - if (length(tests)==0)
  stop(no 'tests')
  else { cbind.data.frame(
  lapply( tests, function(t){
   e - substitute(t)
   r - eval(e, x, parent.frame() )
   if ( !is.logical(r) ) {
   stop('tests' must be logical) }
  r  !is.na(r) } ) )
  }
  }
 #--

 testdata - structure(list(group1 = structure(1:7, .Label = c(Group A,
 Group B, Group C, Group D, Group E, Group F, Group G
 ), class = factor), group2 = structure(c(3L, 3L, 2L, 1L, 1L,
 2L, 3L), .Label = c(LS, SS, UNC), class = factor), valid1 = 
 structure(c(2L,
 1L, NA, 1L, 2L, 2L, 1L), .Label = c(N, Y), class = factor),
 valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c(N,
 Y), class = factor), valid3 = structure(c(4L, 3L, NA,
 2L, 1L, NA, 5L), .Label = c(0.3, 0.7, 1.2, 1.4, 1.7
 ), class = factor), valid4 = structure(c(2L, 1L, 3L, 4L,
 1L, 1L, 5L), .Label = c(0.3, 0.4, 0.53, 0.66, 0.71
 ), class = factor), valid5 = structure(c(4L, 1L, NA, NA,
 3L, NA, 2L), .Label = c(11.2, 11.7, 8.3, 8.5), class = 
 factor)), .Names = c(group1,
 group2, valid1, valid2, valid3, valid4, valid5), row.names = c(NA,
 -7L), class = data.frame)

 ###


 mtest.data.frame(testdata, valid2==N, valid3  1)
 [[1]]
 [1] tests are

 [[2]]
 [1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE

 [[3]]
 [1]  TRUE  TRUENA FALSE FALSENA  TRUE

 This actually seemed to be somewhat successful, but when ...

 Now if I take out the `print()` call for 'tests', I get an different answer:

 mtest.data.frame -
 +  function (x, ..., drop=FALSE)
 +  { tests - list(...)
 +  r - if (length(tests)==0)
 +  stop(no 'tests')
 +  else { cbind.data.frame(
 +  lapply( tests, function(t){
 +   e - substitute(t)
 +   r - eval(e, x, parent.frame() )
 +   if ( !is.logical(r) ) {
 +   stop('tests' must be logical) }
 +  r  !is.na(r) } ) )
 +  }
 +  }
 mtest.data.frame(testdata, valid2==N, valid3  1)
   # i.e. no answer

 --

 David Winsemius
 Alameda, CA, USA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide 

Re: [R] Why does debugging print() change output of function?

2014-09-06 Thread David Winsemius

On Sep 6, 2014, at 4:24 PM, William Dunlap wrote:

 In your first example I get an error:
 mtest.data.frame(testdata, valid2==N, valid3  1)
  Error in mtest.data.frame(testdata, valid2 == N, valid3  1) :
object 'valid2' not found
 I expect the error because list(...) ought to evaluate the ... arguments.

Thank you (and JWDougherty) for looking at this. I see that the difference lies 
in the fact that I have vectors in my workspace that were used  as 
preliminaries in constructing my test case that are being accessed by my 
logical expressions.

 group1 - paste(Group, rep(LETTERS[1:7], sep=''))
  group2 - c(UNC, UNC, SS, LS, LS, SS, UNC)
  valid1 - c(Y, N, NA, N, Y, Y, N)
 valid2 - c(N, N, Y, N, N, Y, N)
  valid3 - c(1.4, 1.2, NA, 0.7, 0.3, NA, 1.7)
  valid4 - c(0.4, 0.3, 0.53, 0.66, 0.3, 0.3, 0.71)
 valid5 - c(8.5, 11.2,NA, NA, 8.3, NA, 11.7)

I should have executed rm(list=ls()) and repeated my testing before posting, 
but you 
 Use substitute() to get the unevaluated ... arguments up front and
 don't use substitute() in the loop over the elements of test.
 
 There are several ways to get the unevaluated ... arguments.  E.g.,
  f0 - function(x, ..., drop=FALSE) match.call(expand.dots=FALSE)$...

  f1 - function(x, ..., drop=FALSE) substitute(...())
  f2 - function(x, ..., drop=FALSE) as.list(substitute(list(...)))[-1]

These three version are somewhat confusing,  the second one in particular makes 
it appear that the ellipsis is a function, while the other ones make it appear 
that they are an expression pointing to a list.

 
 Your function could be the following, where I also fixed a problem
 with parent.frame() being
 called in the wrong scope and improved,

Yes, I was worried about that.

 IMO, the names on the output data.frame.
 
 m2 - function (x, ..., drop = FALSE, verbose = FALSE)
 {
tests - substitute(...())
nms - names(tests) # fix up names, since data.frame makes ugly ones
if (is.null(nms)) {
names(tests) - paste0(T, seq_along(tests))
}
else if (any(nms == )) {
names(tests)[nms == ] - paste0(T, which(nms == ))
}
if (verbose) {
print(tests)
}
r - if (length(tests) == 0) {
stop(no 'tests')
}
else {
enclos - parent.frame() # evaluate parent.frame() outside of FUN()
data.frame(lapply(tests, FUN=function(e) {
r - eval(e, x, enclos)
if (!is.logical(r)) {
stop('tests' must be logical)
}
r  !is.na(r)
}))
}
r
 }
 
 used as:
 m2(testdata, group2==UNC, Eleven.Two=valid5==11.2)
 T1 Eleven.Two
 1  TRUE  FALSE
 2  TRUE   TRUE
 3 FALSE  FALSE
 4 FALSE  FALSE
 5 FALSE  FALSE
 6 FALSE  FALSE
 7  TRUE  FALSE
 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com

Thank you again, Bill. 

-- 
David.

 
 
 On Sat, Sep 6, 2014 at 3:31 PM, David Winsemius dwinsem...@comcast.net 
 wrote:
 The goal:
   to create a function modeled after `subset` (notorious for its 
 non-standard evaluation) that will take a series of logical tests as 
 unqiuoted expressions to be evaluated in the framework of a dataframe 
 environment and return a dataframe of logicals:
 
 
 mtest.data.frame -
 function (x, ..., drop=FALSE)
 { tests - list(...); print(tests)
 r - if (length(tests)==0)
 stop(no 'tests')
 else { cbind.data.frame(
 lapply( tests, function(t){
  e - substitute(t)
  r - eval(e, x, parent.frame() )
  if ( !is.logical(r) ) {
  stop('tests' must be logical) }
 r  !is.na(r) } ) )
 }
 }
 #--
 
 testdata - structure(list(group1 = structure(1:7, .Label = c(Group A,
 Group B, Group C, Group D, Group E, Group F, Group G
 ), class = factor), group2 = structure(c(3L, 3L, 2L, 1L, 1L,
 2L, 3L), .Label = c(LS, SS, UNC), class = factor), valid1 = 
 structure(c(2L,
 1L, NA, 1L, 2L, 2L, 1L), .Label = c(N, Y), class = factor),
valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c(N,
Y), class = factor), valid3 = structure(c(4L, 3L, NA,
2L, 1L, NA, 5L), .Label = c(0.3, 0.7, 1.2, 1.4, 1.7
), class = factor), valid4 = structure(c(2L, 1L, 3L, 4L,
1L, 1L, 5L), .Label = c(0.3, 0.4, 0.53, 0.66, 0.71
), class = factor), valid5 = structure(c(4L, 1L, NA, NA,
3L, NA, 2L), .Label = c(11.2, 11.7, 8.3, 8.5), class = 
 factor)), .Names = c(group1,
 group2, valid1, valid2, valid3, valid4, valid5), row.names = 
 c(NA,
 -7L), class = data.frame)
 
 ###
 
 
 mtest.data.frame(testdata, valid2==N, valid3  1)
 [[1]]
 [1] tests are
 
 [[2]]
 [1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
 
 [[3]]
 [1]  TRUE  TRUENA FALSE FALSENA  TRUE
 
 This actually seemed to be somewhat successful, but when ...
 
 Now if I take out the `print()` call for 'tests', I get an different answer:
 
 mtest.data.frame -
 +  function (x, ..., drop=FALSE)
 +  { tests - list(...)
 +  r - if