[Rd] RFC: ifelse::ifelse1 as analogue of base::ifelse

Mikael Jagan Tue, 22 Jul 2025 12:04:38 -0700

[
    Partly continued from the thread earlier this month
        https://stat.ethz.ch/pipermail/r-devel/2025-July/084096.html
    in turn continued from discussions in 2016
        https://stat.ethz.ch/pipermail/r-devel/2016-August/072970.html
        https://stat.ethz.ch/pipermail/r-devel/2016-November/073385.html
    ...
]


As I seem to have converged on a (tested, timed, ...) 'ifelse' analogue, I'd
like to point again to the repository housing it:

    https://github.com/jaganmn/ifelse/

The home page has a "matrix" comparing the behaviours of all of the 'ifelse'
analogues that I know about, including mine but at the moment excluding ones
collected by Martin in 2016 ...

    https://gist.github.com/mmaechler/9cfc3219c4b89649313bfe6853d87894

... which I'd not seen until recently.

Besides handling type, class, length, attributes, etc. in a predictable way,
a nice feature of ifelse::ifelse1 is that it is "generic" in the following
sense: as long as there are suitable methods for generic functions '[', '[<-',
'c', and 'length', ifelse::ifelse1 will work for abstract classes of vectors,
not limited to the "basic" classes raw, logical, integer, etc.
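
To make "generic" concrete, here is a minimal sketch in base R of how an
'ifelse' analogue can be assembled from nothing but '[', '[<-', 'c', and
'length'.  It is an illustration only and is NOT the source of ifelse::ifelse1;
the name ifelse_sketch is hypothetical, and details such as the 'na' argument
and attribute handling are left out.

```r
## Hypothetical illustration, not the implementation in package 'ifelse'.
ifelse_sketch <- function(test, yes, no) {
    test <- as.logical(test)
    n <- length(test)
    ## Zero-length prototype: method dispatch in c() decides type and class
    ans <- c(yes[0L], no[0L])
    ## Grow the prototype to length n, all elements missing, via '['
    ans <- ans[rep_len(NA_integer_, n)]
    ## Fill from 'yes' and 'no', recycling by index arithmetic so that
    ## only '[' and '[<-' methods are needed
    i <- which(test)
    if (length(i))
        ans[i] <- yes[(i - 1L) %% length(yes) + 1L]
    j <- which(!test)
    if (length(j))
        ans[j] <- no[(j - 1L) %% length(no) + 1L]
    ans
}

d <- as.Date("2025-07-22") + 0:2
r <- ifelse_sketch(c(TRUE, FALSE, NA), d, as.Date("2000-01-01"))
class(r)   # "Date", whereas base::ifelse would return a bare numeric vector
```

Because every step goes through a generic function, the same body works for
any class that supplies those four methods; elements where 'test' is NA are
simply left missing.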

Indeed, generic behaviour was a main point of discussion in 2016, and at the
time Martin even encapsulated some very nice tests in a function chkIfelse();
see his GH gist above.  After I tweak the body of chkIfelse() (to reflect
changes since 2016 in 'base' and 'Matrix'), chkIfelse(ifelse::ifelse1) passes
with a few exceptions related either to a bug in R itself

    https://bugs.r-project.org/show_bug.cgi?id=18919

or to a case where I think that a test *should* fail [*].  See the
output of

    $ diff -u tests/chkIfelse.R.orig tests/chkIfelse.R

for my changes to chkIfelse() and

    $ R --vanilla -f tests/chkIfelse.R

for an indication of which tests fail.  (You will want to have installed the
suggested packages Matrix, Rmpfr, and zoo.)

ifelse::ifelse1 is also fast, comparable to single-threaded data.table::fifelse
in the most common usage where none of the arguments has a class attribute and
there is no need for method dispatch.  This common usage is handled by a lower
level function, ifelse::.ifelse1 (with dot), also exported.  Of course, we lose
this speed if we remove

    if (!(is.object(yes) || is.object(no) || is.object(na)))
        return(.Call(R_ifelse_ifelse1, ltest, yes, no, na))

from body(ifelse::ifelse1), which we might do because, e.g., it may not be
desirable for R-core to maintain additional C code.  My tests/timings.Rout
suggests that the function without .Call would still be nontrivially (though
not orders of magnitude) faster than base::ifelse in the most common usage.
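
The guard quoted above can be tried in isolation.  The following is a small
sketch in base R of which inputs would qualify for the fast path; the helper
name fast_path_ok and the NULL default for 'na' are assumptions made for
illustration, not part of the package.

```r
## Hypothetical helper mirroring the is.object() guard quoted above.
fast_path_ok <- function(yes, no, na = NULL)
    !(is.object(yes) || is.object(no) || is.object(na))

fast_path_ok(1:3, 0L)            # no class attribute on either argument
fast_path_ok(Sys.Date(), 0)      # 'yes' has class "Date"
fast_path_ok(factor("a"), "b")   # 'yes' has class "factor"
```

Only the first call would reach the compiled code; the other two would need
method dispatch because one argument has a class attribute.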

This RFC asks:

    1. Are there behaviour or API changes that I should consider before possibly
       submitting the package to CRAN?

    2. Are there comments from R-core about the suitability for 'base'?  I am
       personally a bit agnostic about it, but I could file a "wish" at Bugzilla
       on behalf of people who feel strongly.

Grateful for feedback or further testing, comparison, ...

Mikael


[*] chkIfelse(FUN) asserts that identical(x, FUN(x != 0, x, x)) is TRUE for 'x'
    inheriting from virtual class Matrix (from package 'Matrix').

    It is not TRUE when FUN=ifelse::ifelse1, because the class of the return
    value of ifelse::ifelse1(x != 0, x, x) is the class of c(x[0], x[0]), and
    x[0] is a traditional vector if 'x' is a Matrix.

    Martin's proposal (the function named 'ifelse2' in his GH gist) adds logic
    to handle this special case whereas ifelse::ifelse1 does not, on purpose.
    The logic involves a test of identical(class(yes), class(no)), but that is
    a bit unsatisfactory to me: in particular, should the behaviour of 'FUN'
    really differ if class(no) is a simple *subclass* of (hence compatible but
    not identical to) class(yes)?

    > loadNamespace("Matrix")
    > x <- new("dsyMatrix", Dim = c(1L, 1L), x = 1)
    > y <- new("dpoMatrix", Dim = c(1L, 1L), x = 1)
    > identical(class(x), class(y))
    [1] FALSE
    > extends(class(x), class(y))
    [1] FALSE
    > extends(class(y), class(x))
    [1] TRUE
    > MMgist::ifelse2(x != 0, x, x) # MMgist::<name> is pseudocode
    1 x 1 Matrix of class "dgeMatrix"
         [,1]
    [1,]    1
    > MMgist::ifelse2(x != 0, x, y)
    [1] 1
    > ifelse::ifelse1(x != 0, x, x)
         [,1]
    [1,]    1
    > ifelse::ifelse1(x != 0, x, y)
         [,1]
    [1,]    1

    I claim that the consistent behaviour of ifelse::ifelse1 is nicer even if
    the return value does not preserve inheritance from virtual class Matrix.
    After all, the *user* can arrange to preserve inheritance *if desired* with:

    > z <- x
    > z[] <- ifelse::ifelse1(x != 0, x, x)
    > z
    1 x 1 Matrix of class "dgeMatrix"
         [,1]
    [1,]    1

    For an S3 example, consider time series objects:

    > x <- ts(1)
    > y <- structure(x, class = c("zzz", class(x)))
    > identical(class(x), class(y))
    [1] FALSE
    > isa(x, class(y))
    [1] TRUE
    > isa(y, class(x))
    [1] FALSE
    > MMgist::ifelse2(x != 0, x, x)
    Time Series:
    Start = 1
    End = 1
    Frequency = 1
    [1] 1
    > MMgist::ifelse2(x != 0, x, y)
    [1] 1
    > ifelse::ifelse1(x != 0, x, x)
    [1] 1
    > ifelse::ifelse1(x != 0, x, y)
    [1] 1

    Then:

    > z <- x
    > z[] <- ifelse::ifelse1(x != 0, x, x)
    > z
    Time Series:
    Start = 1
    End = 1
    Frequency = 1
    [1] 1

    Of course, new logic involving 'extends' and 'isa' could be incorporated.
    But it seems cleaner to me to leave out such special case logic, which is
    liable to introduce unintended "discontinuities" in behaviour and make the
    source code less transparent to non-experts.  Indeed, my preference is to
    clearly document examples like the above (which are of no concern to 99%
    of users) and advertise work-arounds like
        { z <- yes OR no OR na; z[] <- ifelse::ifelse1(test, yes, no, na); z },
    which give the remaining 1% a clue as well as a bit more control ...

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
