Re: [R] assumptions about how things are done

Jorgen Harmse via R-help Mon, 11 Oct 2021 08:08:40 -0700

As noted by Richard O'Keefe, what usually happens in an R function is that any 
argument is evaluated either in its entirety or not at all. A few functions use 
substitute or similar trickery, but then expectations should be documented. I 
can understand that you want something like ifelse(y>x,x/y,z) to run without 
warning about division by zero, but how would that be implemented in general? 
Even a subexpression as simple as f(a,b) presents a problem: you want 
f(a,b)[cond], but you don't know how the function f works. It might be just a 
vector operation (and then perhaps f(a[cond],b[cond]) is what we want), or it 
might return a+rev(b). Avi Gross correctly notes that the implementation is not 
what he wants, but I think that what he wants is possible only in special cases.
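
To make the f(a,b) point concrete, here is a small sketch in R. The data and the names f_elementwise and f_mixing are made up for illustration; they do not come from the thread. It only shows that subsetting before the call and subsetting after it agree for an elementwise function and disagree for something like a+rev(b).

```r
# Hypothetical data, for illustration only.
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)
cond <- c(TRUE, FALSE, TRUE, TRUE)

# An elementwise function: subsetting before or after the call agrees.
f_elementwise <- function(a, b) a + b
same <- identical(f_elementwise(a, b)[cond],
                  f_elementwise(a[cond], b[cond]))

# A function that mixes positions: the two orders disagree.
f_mixing <- function(a, b) a + rev(b)
after  <- f_mixing(a, b)[cond]         # compute everything, then subset
before <- f_mixing(a[cond], b[cond])   # subset first, then compute

print(same)    # TRUE
print(after)   # 41 23 14
print(before)  # 41 33 14
```

Since ifelse() cannot know which kind of function it has been handed, pushing the condition inside the call is only safe in the elementwise case.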


Regards,
Jorgen Harmse. 



    Message: 2
    Date: Sat, 9 Oct 2021 15:35:55 -0400
    From: "Avi Gross" <avigr...@verizon.net>
    To: <r-help@r-project.org>
    Subject: [R] assumptions about how things are done
    Message-ID: <029401d7bd44$e10843c0$a318cb40$@verizon.net>
    Content-Type: text/plain; charset="utf-8"

    This is supposed to be a forum for help so general and philosophical
    discussions belong elsewhere, or nowhere.



    Having said that, I want to make a brief point. Both new and experienced
    people make implicit assumptions about the code they use. Often nobody looks
    at how the sausage is made. The recent discussion of ifelse() made me take a
    look and I was not thrilled.



    My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
    I mean if I have a vector of length N and perhaps a few other vectors of the
    same length, I might say:



    result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
    result-if-false-using-vectors)



    So say I want to take a vector of integers from 1 to N and make an output a
    second vector where you have either a prime number or NA. If I have a
    function called is.prime() that checks a single number and returns
    TRUE/FALSE, it might look like this:



    primed <- ifelse(is.prime(A), A, NA)



    So A[1] will be mapped to 1 and A[2] to 2 and A[3] to 3, but A[4] being
    composite becomes NA and so on.



    If you wrote the above using loops, it would be to range from index 1 to N
    and apply the above. There are many complications as R allows vectors to be
    longer or to be repeated as needed.



    What I found ifelse() as implemented to do, is sort of like this:



    Make a vector of the right length for the results, initially empty.



    Make a vector evaluating the condition so it is effectively a Boolean
    result.

    Calculate which indices are TRUE. Secondarily, calculate another set of
    indices that are false.



    Calculate ALL the THEN conditions and ditto all the ELSE conditions.



    Now copy into the result all the THEN values indexed by the TRUE above and
    then all the ELSE values indicated by the FALSE above.



    In plain English, make a result from two other results based on picking
    either one from menu A or one from menu B.
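
The steps listed above can be sketched roughly as follows. This is a simplified illustration and not the source of base R's ifelse(): simple_ifelse is a hypothetical name, and the sketch ignores attributes and some special handling that the real function has.

```r
# simple_ifelse is a hypothetical name; a simplified sketch of the steps
# described above, not a copy of base R's ifelse().
simple_ifelse <- function(test, yes, no) {
  test <- as.logical(test)            # condition as a logical vector
  n <- length(test)
  result <- rep(NA, n)                # result of the right length, all NA

  true_idx  <- which(test)            # indices where the condition is TRUE
  false_idx <- which(!test)           # indices where it is FALSE

  # Both alternatives are computed in full (and recycled to length n)
  # before anything is selected from them.
  yes <- rep(yes, length.out = n)
  no  <- rep(no,  length.out = n)

  result[true_idx]  <- yes[true_idx]  # copy the THEN values
  result[false_idx] <- no[false_idx]  # copy the ELSE values
  result
}

A <- 1:6
out <- simple_ifelse(A %% 2 == 0, A * 10, -A)
print(out)  # -1 20 -3 40 -5 60
```

Note that A * 10 and -A are each evaluated for all six elements even though only three values are kept from each.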



    That is not a bad algorithm and in a vectorized language like R, maybe even
    quite effective and efficient. It does lots of extra work as by definition
    it throws at least half away.



    I suspect the implementation could be made much faster by making some of it
    done internally using a language like C.



    But now that I know what this implementation did, I might have some qualms
    at using it in some situations. The original complaint led to other
    observations and needs and perhaps blindly using a supplied function like
    ifelse() may not be a decent solution for some needs.



    I note how I had to reorient my work elsewhere using a group of packages
    called the tidyverse when they added a function to allow rowwise
    manipulation of the data as compared to an ifelse-like method using all
    columns at once. There is room for many approaches and if a function may not
    be doing quite what you want, something else may better meet your needs OR
    you may want to see if you can copy the existing function and modify it for
    your own personal needs.



    In the case we mentioned, the goal was to avoid printing selected warnings.
    Since the function is readable, it can easily be modified in a copy to find
    what is causing the warnings and either rewrite a bit to avoid them or start
    over with perhaps your own function that tests before doing things and
    avoids tripping the condition (generating a NaN) entirely.
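
One way to read "tests before doing things" is to subset first and compute only on the elements that are safe. The sketch below uses sqrt() as an assumed stand-in for whatever operation was producing the NaN warnings; safe_sqrt is a hypothetical helper name, not something from base R or from the earlier thread.

```r
# safe_sqrt is a hypothetical helper; sqrt() stands in for any operation
# that warns (and returns NaN) on some inputs.
safe_sqrt <- function(x) {
  out <- rep(NA_real_, length(x))   # default for elements we skip
  ok <- !is.na(x) & x >= 0          # test first ...
  out[ok] <- sqrt(x[ok])            # ... then compute only where it is safe
  out
}

x <- c(4, -1, 9, NA)
res <- safe_sqrt(x)
print(res)  # 2 NA 3 NA

# For comparison, ifelse(x >= 0, sqrt(x), NA) gives the same values for the
# non-negative elements but evaluates sqrt(x) on the whole vector, so sqrt()
# warns about NaNs being produced.
```

The trade-off is that such a helper has to be written per operation, which fits the earlier point that a general solution inside ifelse() is hard.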



    Like many languages, R is a bit too rich. You can piggyback on the work of
    others but with some caution as they did not necessarily have you in mind
    with what they created.











    ------------------------------

    Message: 4
    Date: Sun, 10 Oct 2021 08:34:52 +1100
    From: Jim Lemon <drjimle...@gmail.com>
    To: Avi Gross <avigr...@verizon.net>
    Cc: r-help mailing list <r-help@r-project.org>
    Subject: Re: [R] assumptions about how things are done
    Message-ID:
        <CA+8X3fUQvXUx0=cVvNVTv144PQrRHgHgp1iBiXk23R8V+9=7...@mail.gmail.com>
    Content-Type: text/plain; charset="utf-8"

    Hi Avi,
    Definitely a learning moment. I may consider writing an ifElse() for
    my own use and sharing it if anyone wants it.

    Jim





______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
