Re: [R] assumptions about how things are done

2021-10-11 Thread Avi Gross via R-help
I appreciate the feedback from several people. As noted, I do not want a deep 
philosophical discussion here, and my main point remains: do not expect software 
you "borrow" to do what you WANT; accommodate what it was designed to do.

Most of the time, I would say that I want a vectorized function to do things 
exactly the way ifelse() does it. I mean that a vectorized A+B adds 
corresponding entries in any order, perhaps even using multiple cores to do it 
concurrently for larger vectors. But we normally don't care about the details, 
just the result. We want ALL the conditions evaluated, along with all the then 
and else parts, and the appropriate ones combined into a result. Other than some 
storage and maybe efficiency considerations, it matters little whether it is 
done gradually in a loop or some other way.

My point is that some people are used to loops, and if they assume that not 
everything will be evaluated, they may run into trouble when it is. Your 
example about division by zero is a case where you might change your code to 
avoid the problem. One way is to compute the then or else vector before calling 
ifelse(), doing that calculation carefully so that it tests for dividing by 
zero and does something appropriate to avoid it or trap the error. Another is 
the wrapper method I mentioned. And if you really need some things not to be 
evaluated, such as to avoid side effects, ifelse() may not be the right tool.
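
A minimal sketch of the first option, assuming the troublesome calculation is 
sqrt() of possibly negative numbers rather than a division (the vector x and 
the other names are made up for illustration). The idea is to compute the 
result only at the positions where it is safe, instead of letting ifelse() 
evaluate the whole vector:

```r
x <- c(4, -1, 9, -16)  # hypothetical data

# ifelse() evaluates sqrt(x) on the whole vector, so the negative entries
# raise a warning even though their results are discarded.
warned <- tryCatch({
  ifelse(x >= 0, sqrt(x), NA_real_)
  FALSE
}, warning = function(w) TRUE)

# Alternative: allocate the result, then compute only where it is safe.
ok <- x >= 0
safe <- rep(NA_real_, length(x))
safe[ok] <- sqrt(x[ok])
```

Here warned ends up TRUE, while the subset assignment never calls sqrt() on a 
negative number, so nothing needs to be trapped.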

I was wondering whether I depend so much on the times R does non-standard 
evaluation that I think it can be done anytime. Obviously not. I have often seen 
anomalous results when I forgot that changing something multiple times, when it 
only gets evaluated ONCE later, does not work, even if my intent was, say, to 
make some lines dashed, then others dotted, and so on.
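
One common way this kind of surprise arises can be sketched with closures 
rather than plotting code (that substitution is an assumption, and line_types, 
getters and make_getter are made-up names). A function body is not evaluated 
when the function is created, so every closure built in a loop sees the final 
value of the loop variable unless the value is forced at creation time:

```r
line_types <- c("dashed", "dotted", "solid")  # hypothetical values

# Each closure looks up `lty` only when it is called, after the loop has ended.
getters <- list()
for (lty in line_types) {
  getters[[lty]] <- function() lty
}
late <- vapply(getters, function(f) f(), character(1))

# A factory that forces its argument captures the value at creation time.
make_getter <- function(value) {
  force(value)
  function() value
}
forced <- vapply(lapply(line_types, make_getter), function(f) f(), character(1))
```

Every element of late is "solid", while forced reproduces the three line types 
in order.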

Many other languages I use do not have this gimmick, and it can be annoying but 
realistic to have to pass some things explicitly, such as a text version of the 
formula used so it can be displayed in the result. Most functions only see the 
result of arguments passed after evaluation.
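
For readers who have not met the gimmick, here is a small sketch of how an R 
function can see both the text of an argument and its value. The function name 
describe is made up; substitute() and deparse() are base R:

```r
describe <- function(expr) {
  # substitute() returns the unevaluated expression the caller typed;
  # deparse() turns it into text. Using `expr` itself evaluates it as usual.
  list(text = deparse(substitute(expr)), value = expr)
}

a <- 3
out <- describe(a^2 + 1)
```

out$text holds the string "a^2 + 1" and out$value holds 10, so the caller did 
not have to pass the formula a second time as text.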

So a strength of R can also be ...

Re: [R] assumptions about how things are done

2021-10-11 Thread Jorgen Harmse via R-help
As noted by Richard O'Keefe, what usually happens in an R function is that any 
argument is evaluated either in its entirety or not at all. A few functions use 
substitute or similar trickery, but then expectations should be documented. I 
can understand that you want something like ifelse(y>x,x/y,z) to run without 
warning about division by zero, but how would that be implemented in general? 
Even a subexpression as simple as f(a,b) presents a problem: you want 
f(a,b)[cond], but you don't know how the function f works. It might be just a 
vector operation (and then perhaps f(a[cond],b[cond]) is what we want), or it 
might return a+rev(b). Avi Gross correctly notes that the implementation is not 
what he wants, but I think that what he wants is possible only in special cases.
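
The difficulty can be seen in a small sketch (the two functions and the data 
are made up for illustration). Subsetting the arguments first gives the same 
answer as subsetting the result only when f works element by element:

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)
cond <- c(TRUE, FALSE, TRUE, TRUE)

elementwise <- function(a, b) a + b       # plain vector operation
reversing   <- function(a, b) a + rev(b)  # depends on the whole of b

# f(a, b)[cond] versus f(a[cond], b[cond])
same_elementwise <- identical(elementwise(a, b)[cond],
                              elementwise(a[cond], b[cond]))
same_reversing   <- identical(reversing(a, b)[cond],
                              reversing(a[cond], b[cond]))
```

same_elementwise is TRUE and same_reversing is FALSE, so a general ifelse() 
cannot safely push the condition inside an arbitrary subexpression.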

Regards,
Jorgen Harmse. 



Re: [R] assumptions about how things are done

2021-10-10 Thread Rolf Turner


On Sun, 10 Oct 2021 19:27:27 +1300
"Richard O'Keefe"  wrote:



> Why is it so hard to understand that there is nothing special to
> understand here?



Fortune nomination.

cheers,

Rolf

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assumptions about how things are done

2021-10-10 Thread Greg Minshall
hi, Richard,

> ifelse(..., ..., ...) is not a control structure.  It is not special
> syntax.  It is a normal function call, and it evaluates its arguments
> and expands them to a common length just like "+" or, more to the
> point, just like "&".
>
> So why do we have people expecting a normal function call to do
> special control structure magic?
>
> Leaving aside the extending-to-a-common-length part, it's
> ifelse <- function (test, true.part, false.part) {
>     false.part[test] <- true.part[test]
>     false.part
> }
>
> Why is it so hard to understand that there is nothing special to
> understand here?

i wonder if possibly because features like non-standard evaluation lead
many of us to conclude there may/should/could be magic at all levels?

cheers, Greg



Re: [R] assumptions about how things are done

2021-10-10 Thread Richard O'Keefe
Colour me confused.
if (...) { ... } else { ... }
is a control structure.  It requires the test to evaluate to a single
logical value, then it evaluates one choice completely and the other not
at all.  It is special syntax.

ifelse(..., ..., ...) is not a control structure.  It is not special
syntax.  It is a normal function call, and it evaluates its arguments and
expands them to a common length just like "+" or, more to the point, just
like "&".
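
The difference can be observed directly with a helper that records when it is 
called (note and evaluated are made-up names for this sketch):

```r
evaluated <- character()
note <- function(label, value) {
  evaluated <<- c(evaluated, label)  # record that this argument was evaluated
  value
}

# if/else: only the chosen branch runs.
a <- if (TRUE) note("then", 1) else note("else", 2)
seen_if <- evaluated

# ifelse(): with a mixed test, both the yes and the no arguments are
# evaluated in full before the elements are picked.
evaluated <- character()
b <- ifelse(c(TRUE, FALSE), note("yes", c(1, 1)), note("no", c(2, 2)))
seen_ifelse <- evaluated
```

seen_if contains only "then", whereas seen_ifelse contains both "yes" and "no". 
Base ifelse() does skip an argument when no element of the test selects it, but 
that is all or nothing, never element by element.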

So why do we have people expecting a normal function call to do special
control structure magic?

Leaving aside the extending-to-a-common-length part, it's
ifelse <- function (test, true.part, false.part) {
    false.part[test] <- true.part[test]
    false.part
}
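
A quick check of the simplified definition above against base ifelse(). It is 
renamed simple_ifelse here (a made-up name) so that it does not mask the 
built-in, and the sketch assumes arguments of equal length and a test without 
NA values:

```r
simple_ifelse <- function(test, true.part, false.part) {
  # Start from the "else" values and overwrite the positions where test is TRUE.
  false.part[test] <- true.part[test]
  false.part
}

x <- c(5, -2, 7, -4)  # hypothetical data
test <- x > 0
agrees <- identical(simple_ifelse(test, x, -x), ifelse(test, x, -x))
```

Both calls return the absolute values of x, and agrees is TRUE.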

Why is it so hard to understand that there is nothing special to
understand here?

Re: [R] assumptions about how things are done

2021-10-09 Thread Jim Lemon
Hi Avi,
Definitely a learning moment. I may consider writing an ifElse() for
my own use and sharing it if anyone wants it.

Jim


[R] assumptions about how things are done

2021-10-09 Thread Avi Gross via R-help
This is supposed to be a forum for help so general and philosophical
discussions belong elsewhere, or nowhere.

 

Having said that, I want to make a brief point. Both new and experienced
people make implicit assumptions about the code they use. Often nobody looks
at how the sausage is made. The recent discussion of ifelse() made me take a
look and I was not thrilled.

 

My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
I mean if I have a vector of length N and perhaps a few other vectors of the
same length, I might say:

 

result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
result-if-false-using-vectors)

 

So say I want to take a vector of integers from 1 to N and produce a second
vector holding either the prime number or NA. If I have a function called
is.prime() that checks a single number and returns TRUE/FALSE, it might look
like this:

 

primed <- ifelse(is.prime(A), A, NA)

 

So A[1] will be mapped to 1 and A[2] to 2 and A[3] to 3, but A[4] being
composite becomes NA and so on.
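
A runnable sketch of this example. The is.prime() below is a hypothetical 
trial-division helper, not a base R function, and because it checks one number 
at a time it is applied with vapply() to build the logical vector that ifelse() 
needs. Under the usual definition 1 is not prime, so with this helper A[1] also 
becomes NA:

```r
# Hypothetical scalar primality test by trial division.
is.prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)           # 2 and 3
  if (n %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

A <- 1:10
# The test must be a logical vector as long as A, so apply is.prime elementwise.
primed <- ifelse(vapply(A, is.prime, logical(1)), A, NA)
```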

 

If you wrote the above using loops, you would range over the indices 1 to N
and apply the above to each. There are many complications, as R allows the
vectors to differ in length, with shorter ones repeated as needed.
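
A sketch of that loop view, using sqrt() on made-up data and ignoring the 
recycling complications. Each element is tested on its own, and only the chosen 
branch is computed for it:

```r
x <- c(4, -1, 9, -16)  # hypothetical data

result <- numeric(length(x))
for (i in seq_along(x)) {
  # if/else picks one branch per element; sqrt() never sees a negative number.
  result[i] <- if (x[i] >= 0) sqrt(x[i]) else NA_real_
}
```

result is 2, NA, 3, NA, and no warning is raised along the way.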

 

What I found ifelse() as implemented to do, is sort of like this:

 

Make a vector of the right length for the results, initially empty.

 

Make a vector evaluating the condition so it is effectively a Boolean
result.

Calculate which indices are TRUE. Secondarily, calculate another set of
indices that are false.

 

Calculate ALL the THEN values and ditto all the ELSE values.

 

Now copy into the result all the THEN values indexed by the TRUE above and
then all the ELSE values indicated by the FALSE above.

 

In plain English, make a result from two other results based on picking
either one from menu A or one from menu B.
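
The steps above can be paraphrased as the following sketch. It is a simplified 
illustration with made-up names, not the actual base R source, and it leaves 
out attribute handling:

```r
sketch_ifelse <- function(test, yes, no) {
  test <- as.logical(test)          # the condition as a Boolean vector
  n <- length(test)
  result <- rep(NA, n)              # result of the right length, initially NA

  true_idx  <- which(test)          # indices where the condition is TRUE
  false_idx <- which(!test)         # indices where it is FALSE (NA is in neither)

  yes_all <- rep_len(yes, n)        # ALL the THEN values, recycled to length n
  no_all  <- rep_len(no, n)         # ALL the ELSE values, recycled to length n

  result[true_idx]  <- yes_all[true_idx]
  result[false_idx] <- no_all[false_idx]
  result
}

test <- c(TRUE, FALSE, NA, TRUE)
picked <- sketch_ifelse(test, 1:4, c(10, 20, 30, 40))
```

For this input picked is 1, 20, NA, 4, the same as ifelse(test, 1:4, 
c(10, 20, 30, 40)) returns.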

 

That is not a bad algorithm and in a vectorized language like R, maybe even
quite effective and efficient. It does lots of extra work as by definition
it throws at least half away.

 

I suspect the implementation could be made much faster by making some of it
done internally using a language like C.

 

But now that I know what this implementation did, I might have some qualms
at using it in some situations. The original complaint led to other
observations and needs and perhaps blindly using a supplied function like
ifelse() may not be a decent solution for some needs.

 

I note how I had to reorient my work elsewhere using a group of packages
called the tidyverse when they added a function to allow rowwise
manipulation of the data as compared to an ifelse-like method using all
columns at once. There is room for many approaches and if a function may not
be doing quite what you want, something else may better meet your needs OR
you may want to see if you can copy the existing function and modify it for
your own personal needs.

 

In the case we mentioned, the goal was to avoid printing selected warnings.
Since the function is readable, it can easily be modified in a copy to find
what is causing the warnings and either rewrite a bit to avoid them or start
over with perhaps your own function that tests before doing things and
avoids tripping the condition (generating a NaN) entirely.
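
If the aim is only to silence selected warnings rather than to avoid the 
calculation, one possible wrapper is sketched below. muffle_nan is a made-up 
name, and matching on the message text assumes an English locale in which the 
warning reads "NaNs produced"; withCallingHandlers() and the "muffleWarning" 
restart are base R:

```r
muffle_nan <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      # Assumption: English messages. Other warnings pass through untouched.
      if (grepl("NaNs produced", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

x <- c(4, -1, 9, -16)  # hypothetical data
res <- muffle_nan(ifelse(x >= 0, sqrt(x), NA_real_))
```

The result is the same as a plain ifelse() call; only the selected warning is 
suppressed, and sqrt() is still evaluated on every element.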

 

Like many languages, R is a bit too rich. You can piggyback on the work of
others but with some caution as they did not necessarily have you in mind
with what they created.

 

 

