Re: [R] assumptions about how things are done
Colour me confused.

if (...) { ... } else { ... } is a control structure. It requires the test to evaluate to a single logical value, then it evaluates one choice completely and the other not at all. It is special syntax.

ifelse(..., ..., ...) is not a control structure. It is not special syntax. It is a normal function call, and it evaluates its arguments and expands them to a common length just like "+" or, more to the point, just like "&".

So why do we have people expecting a normal function call to do special control structure magic?

Leaving aside the extending-to-a-common-length part, it's

    ifelse <- function (test, true.part, false.part) {
        false.part[test] <- true.part[test]
        false.part
    }

Why is it so hard to understand that there is nothing special to understand here?

On Sun, 10 Oct 2021 at 08:36, Avi Gross via R-help wrote:
> [...]
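To illustrate the point made in the reply above, here is a minimal sketch contrasting if/else with ifelse(). The vector x and the name simple_ifelse are made up for illustration; simple_ifelse is only the simplified definition from the message (all three arguments assumed to have the same length), not the real base R source.

```r
x <- c(4, -1, 9)

# if/else is special syntax: the branch not taken is never evaluated,
# so the stop() call below is never reached.
picked <- if (TRUE) "yes" else stop("never evaluated")

# ifelse() is an ordinary function: sqrt(x) is computed on the whole
# vector, including the negative entry, so a NaN warning is raised.
warned <- FALSE
res <- withCallingHandlers(
  ifelse(x >= 0, sqrt(x), NA),
  warning = function(w) {
    warned <<- TRUE                 # record that a warning occurred
    invokeRestart("muffleWarning")  # and keep it off the console
  }
)

# Simplified definition from the message (hypothetical name).
simple_ifelse <- function(test, true.part, false.part) {
  false.part[test] <- true.part[test]
  false.part
}
simple <- simple_ifelse(c(TRUE, FALSE, TRUE), c(1, 2, 3), c(10, 20, 30))

res     # 2 NA 3
warned  # TRUE
simple  # 1 20 3
```

The result is still correct at every position; the warning comes only from values that are computed and then discarded.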
[R] 2022 John M. Chambers Software Award
Dear R-help listers,

The Statistical Computing Section of the American Statistical Association announces the competition for the John M. Chambers Statistical Software Award. In 1998 the Association for Computing Machinery (ACM) presented the ACM Software System Award to John Chambers for the design and development of S. Dr. Chambers generously donated his award to the Statistical Computing Section to endow an annual prize for statistical software written by, or in collaboration with, an undergraduate or graduate student.

Please visit http://asa.stat.uconn.edu for more information.

Best regards,

Raymond Wong
Awards Chair
ASA Section on Statistical Computing
ASA Section on Statistical Graphics

Associate Professor
Department of Statistics
Texas A&M University

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] assumptions about how things are done
Hi Avi,

Definitely a learning moment. I may consider writing an ifElse() for my own use and sharing it if anyone wants it.

Jim

On Sun, Oct 10, 2021 at 6:36 AM Avi Gross via R-help wrote:
> [...]
Re: [R] How to use ifelse without invoking warnings
Dear Ravi,

I have uploaded to GitHub a version which also handles constant values in place of functions.

Regarding named arguments: these are actually handled automatically as well:

    eval.by.formula((x > 5 & x %% 2) ~ (x <= 5) ~ ., FUN, y=2, x)
    # [1] 1 4 9 16 25 6 14 8 18 10
    eval.by.formula((x > 5 & x %% 2) ~ (x <= 5) ~ ., FUN, x=2, x)
    # [1] 4 4 4 4 4 2 14 2 18 2
    eval.by.formula((x > 5 & x %% 2) ~ (x <= 5) ~ ., list(FUN[[1]], 0, 1), y=2, x)
    # [1] 0 0 0 0 0 1 14 1 18 1

But it still needs proper testing and maybe optimization: it is possible to run sapply only on the filtered sequence (but I did not want to break anything now).

Sincerely,

Leonard

On 10/9/2021 9:26 PM, Leonard Mada wrote:
[...]
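A possible reason why named arguments work without extra code: assuming the names given in `...` survive into the list that is eventually handed to do.call(), R's ordinary argument matching does the rest. The sketch below only shows that matching rule with a hypothetical function g(); it does not run eval.by.formula() itself.

```r
# g() is a hypothetical two-argument function used only for illustration.
g <- function(x, y) x^y

# Unnamed list elements are matched by position: x = 2, y = 3.
by_position <- do.call(g, list(2, 3))

# A named element is matched by name first; the unnamed element then
# fills the remaining formal, so this call is g(x = 3, y = 2).
by_name <- do.call(g, list(y = 2, 3))

by_position  # 8
by_name      # 9
```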
[R] assumptions about how things are done
This is supposed to be a forum for help, so general and philosophical discussions belong elsewhere, or nowhere.

Having said that, I want to make a brief point. Both new and experienced people make implicit assumptions about the code they use. Often nobody looks at how the sausage is made. The recent discussion of ifelse() made me take a look and I was not thrilled.

My NAÏVE view was that ifelse() was implemented as a sort of loop construct. I mean if I have a vector of length N and perhaps a few other vectors of the same length, I might say:

    result <- ifelse(condition-on-vector-A, result-if-true-using-vectors, result-if-false-using-vectors)

So say I want to take a vector A of integers from 1 to N and produce a second vector that holds either a prime number or NA. If I have a function called is.prime() that checks a single number and returns TRUE/FALSE, it might look like this:

    primed <- ifelse(is.prime(A), A, NA)

So A[2] will be mapped to 2 and A[3] to 3, but A[4], being composite, becomes NA, and so on (as does A[1], since 1 is not prime).

If you wrote the above using loops, you would range over the indices from 1 to N and apply the above to each element. There are many complications, as R allows vectors to be longer or to be recycled as needed.

What I found ifelse() as implemented to do is sort of like this:

Make a vector of the right length for the results, initially empty.

Make a vector evaluating the condition so it is effectively a Boolean result.

Calculate which indices are TRUE. Secondarily, calculate another set of indices that are FALSE.

Calculate ALL the THEN values and ditto all the ELSE values.

Now copy into the result all the THEN values indexed by the TRUE above and then all the ELSE values indicated by the FALSE above.

In plain English, make a result from two other results by picking, at each position, either one from menu A or one from menu B.

That is not a bad algorithm and, in a vectorized language like R, maybe even quite effective and efficient.
It does do a lot of extra work, though: both sets of values are computed in full, and at each position one of the two is thrown away.

I suspect the implementation could be made much faster by doing some of it internally in a language like C.

But now that I know what this implementation does, I might have some qualms about using it in some situations. The original complaint led to other observations and needs, and blindly using a supplied function like ifelse() may not be a decent solution for some of them.

I note how I had to reorient my work elsewhere, using a group of packages called the tidyverse, when they added a function to allow rowwise manipulation of the data as compared to an ifelse-like method using all columns at once. There is room for many approaches, and if a function is not doing quite what you want, something else may better meet your needs OR you may want to see if you can copy the existing function and modify it for your own personal needs.

In the case we mentioned, the goal was to avoid printing selected warnings. Since the function is readable, it can easily be modified in a copy to find what is causing the warnings, and you can then either rewrite a bit to avoid them or start over with your own function that tests before doing things and avoids tripping the condition (generating a NaN) entirely.

Like many languages, R is a bit too rich. You can piggyback on the work of others, but with some caution, as they did not necessarily have you in mind with what they created.
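The steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual base R source: the input x and all variable names are made up, and sqrt() stands in for any THEN expression that can produce NaN. The second half shows the "test before doing things" idea, computing the THEN values only where the condition holds.

```r
x <- c(4, -1, 9, -16)

# Steps as described: result vector, condition, index sets,
# both full-length candidate vectors, then copy by index.
result <- rep(NA_real_, length(x))
test   <- x >= 0
ypos   <- which(test)
npos   <- which(!test)
yes    <- suppressWarnings(sqrt(x))   # computed for ALL elements (NaN for negatives)
no     <- rep(NA_real_, length(x))
result[ypos] <- yes[ypos]
result[npos] <- no[npos]

# Alternative: evaluate only on the subset where the test holds,
# so sqrt() never sees a negative number and no warning arises.
result2 <- rep(NA_real_, length(x))
result2[test] <- sqrt(x[test])

result   # 2 NA 3 NA
result2  # 2 NA 3 NA
```

Both versions give the same answer; the second simply never computes the values that would have been discarded.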
Re: [R] How to use ifelse without invoking warnings
Dear Ravi,

I wrote a small replacement for ifelse() which avoids such unnecessary evaluations (it bothered me a few times as well, so I decided to try a small replacement).

### Example:

    x = 1:10
    FUN = list();
    FUN[[1]] = function(x, y) x*y;
    FUN[[2]] = function(x, y) x^2;
    FUN[[3]] = function(x, y) x;
    # let's run multiple conditions
    # eval.by.formula(conditions, FUN.list, ... (arguments for FUN) );
    eval.by.formula((x > 5 & x %% 2) ~ (x <= 5) ~ ., FUN, x, x-1)
    # Example 2
    eval.by.formula((x > 5 & x %% 2) ~ (x <= 5) ~ ., FUN, 2, x)

### Disclaimer:
- NOT properly tested;

The code for the function is below. Maybe someone can experiment with the code and improve it further. There are a few issues / open questions, like:

1.) Best Name: eval.by.formula, ifelse.formula, ...?
2.) Named arguments: not yet;
3.) Fixed values inside FUN.list
4.) Format of expression for conditions: expression(cond1, cond2, cond3) vs cond1 ~ cond2 ~ cond3 ???
5.) Code efficiency - some tests on large data sets & optimizations are warranted;

Sincerely,

Leonard

===

The latest code is on Github:
https://github.com/discoleo/R/blob/master/Stat/Tools.Formulas.R

    eval.by.formula = function(e, FUN.list, ..., default=NA) {
        tok = split.formula(e);
        if(length(tok) == 0) return();
        FUN = FUN.list;
        # Argument List
        clst = substitute(as.list(...))[-1];
        len = length(clst);
        clst.all = lapply(clst, eval);
        # evaluates FUN[[idCond]] element by element, only where isEval is TRUE
        eval.f = function(idCond) {
            sapply(seq(length(isEval)), function(id) {
                if(isEval[[id]] == FALSE) return(default);
                args.l = lapply(clst.all, function(a) if(length(a) == 1) a else a[[id]]);
                do.call(FUN[[idCond]], args.l);
            });
        }
        # eval 1st condition:
        isEval = eval(tok[[1]]);
        rez = eval.f(1);
        if(length(tok) == 1) return(rez);
        # eval remaining conditions
        isEvalAll = isEval;
        for(id in seq(2, length(tok))) {
            if(tok[[id]] == ".") {
                # Remaining conditions: tok == ".";
                # makes sense only on the last position
                if(id < length(tok)) warning("\".\" is not last!");
                isEval = ! isEvalAll;
                rez[isEval] = eval.f(id)[isEval];
                next;
            }
            isEval = rep(FALSE, length(isEval));
            isEval[ ! isEvalAll] = eval(tok[[id]])[ ! isEvalAll];
            isEvalAll[isEval] = isEval[isEval];
            rez[isEval] = eval.f(id)[isEval];
        }
        return(rez);
    }

    # current code uses the formula format:
    # cond1 ~ cond2 ~ cond3
    # tokenizes a formula in its parts delimited by "~"
    # Note:
    # - tokenization is automatic for ",";
    # - but call MUST then use FUN(expression(_conditions_), other_args, ...);
    split.formula = function(e) {
        tok = list();
        while(length(e) > 0) {
            if(e[[1]] == "~") {
                if(length(e) == 2) {
                    tok = c(NA, e[[2]], tok);
                    break;
                }
                tok = c(e[[3]], tok);
                e = e[[2]];
            } else {
                tok = c(e, tok);
                break;
            }
        }
        return(tok);
    }
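The tokenizer above walks the formula from right to left, which matches how R parses a chain of "~" operators. As a small sketch of that structure (the formula is quoted here so it can be inspected as a plain call; the variable names are made up for illustration):

```r
# Quote the formula so that it is inspected as a call and not evaluated.
e <- quote((x > 5) ~ (x <= 5) ~ .)

# "~" groups left to right, so the outer call is `~`(<cond1 ~ cond2>, .)
outer_op  <- e[[1]]   # the symbol `~`
rest      <- e[[2]]   # the call (x > 5) ~ (x <= 5)
last_cond <- e[[3]]   # the symbol `.`

# Peeling the left operand once more reaches the first two conditions.
first_cond  <- rest[[2]]
second_cond <- rest[[3]]

deparse(first_cond)   # "(x > 5)"
deparse(second_cond)  # "(x <= 5)"
deparse(last_cond)    # "."
```

This is why each pass of the loop can take element 3 as the last remaining condition and continue with element 2.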