I would be glad to add this to one of my R packages, probably `gdata`.. -G
Gregory R. Warnes, Ph.D. g...@warnes.net Eternity is a long time, take a friend! > On May 26, 2021, at 1:09 PM, Adrian Dușa <dusa.adr...@gmail.com> wrote: > > Yes, that is even better. > Best, > Adrian > > On Wed, May 26, 2021 at 7:05 PM Duncan Murdoch <murdoch.dun...@gmail.com > <mailto:murdoch.dun...@gmail.com>> > wrote: > >> After 5 minutes more thought: >> >> - code non-missing as missingKind = NA, not 0, so that missingKind could >> be a character vector, or missingKind = 0 could be supported. >> >> - print methods should return the main argument, so mine should be >> >> print.MultiMissing <- function(x, ...) { >> vals <- as.character(x) >> if (!is.character(x) || inherits(x, "noquote")) >> print(noquote(vals)) >> else >> print(vals) >> invisible(x) >> } >> >> This still needs a lot of improvement to be a good print method, but >> I'll leave that to you. >> >> Duncan Murdoch >> >> On 26/05/2021 11:43 a.m., Duncan Murdoch wrote: >>> On 26/05/2021 10:22 a.m., Adrian Dușa wrote: >>>> Dear Duncan, >>>> >>>> On Wed, May 26, 2021 at 2:27 AM Duncan Murdoch < >> murdoch.dun...@gmail.com >>>> <mailto:murdoch.dun...@gmail.com>> wrote: >>>> >>>> You've already been told how to solve this: just add attributes >> to the >>>> objects. Use the standard NA to indicate that there is some kind of >>>> missingness, and the attribute to describe exactly what it is. >> Stick a >>>> class on those objects and define methods so that subsetting and >>>> arithmetic preserves the extra info you've added. If you do some >>>> operation that turns those NAs into NaNs, big deal: the attribute >> will >>>> still be there, and is.na <http://is.na>(NaN) still returns TRUE. >>>> >>>> >>>> I've already tried the attributes way, it is not so easy. >>> >>> If you have specific operations that are needed but that you can't get >>> to work, post the issue here. >>> >>>> In the best case scenario, it unnecessarily triples the size of the >>>> data, but perhaps this is the only way forward. >>> >>> I don't see how it could triple the size. Surely an integer has enough >>> values to cover all possible kinds of missingness. So on integer or >>> factor data you'd double the size, on real or character data you'd >>> increase it by 50%. (This is assuming you're on a 64 bit platform with >>> 32 bit integers and 64 bit reals and pointers.) >>> >>> Here's a tiny implementation to show what I'm talking about: >>> >>> asMultiMissing <- function(x) { >>> if (isMultiMissing(x)) >>> return(x) >>> missingKind <- ifelse(is.na(x), 1, 0) >>> structure(x, >>> missingKind = missingKind, >>> class = c("MultiMissing", class(x))) >>> } >>> >>> isMultiMissing <- function(x) >>> inherits(x, "MultiMissing") >>> >>> missingKind <- function(x) { >>> if (isMultiMissing(x)) >>> attr(x, "missingKind") >>> else >>> ifelse(is.na(x), 1, 0) >>> } >>> >>> `missingKind<-` <- function(x, value) { >>> class(x) <- setdiff(class(x), "MultiMissing") >>> x[value != 0] <- NA >>> x <- asMultiMissing(x) >>> attr(x, "missingKind") <- value >>> x >>> } >>> >>> `[.MultiMissing` <- function(x, i, ...) { >>> missings <- missingKind(x) >>> x <- NextMethod() >>> missings <- missings[i] >>> missingKind(x) <- missings >>> x >>> } >>> >>> print.MultiMissing <- function(x, ...) { >>> vals <- as.character(x) >>> if (!is.character(x) || inherits(x, "noquote")) >>> print(noquote(vals)) >>> else >>> print(vals) >>> } >>> >>> `[<-.MultiMissing` <- function(x, i, value, ...) { >>> missings <- missingKind(x) >>> class(x) <- setdiff(class(x), "MultiMissing") >>> x[i] <- value >>> missings[i] <- missingKind(value) >>> missingKind(x) <- missings >>> x >>> } >>> >>> as.character.MultiMissing <- function(x, ...) { >>> missings <- missingKind(x) >>> result <- NextMethod() >>> ifelse(missings != 0, >>> paste0("NA.", missings), result) >>> >>> } >>> >>> This is incomplete. It doesn't do printing very well, and it doesn't >>> handle the case of assigning a MultiMissing value to a regular vector at >>> all. (I think you'd need an S4 implementation if you want to support >>> that.) But it does the basics: >>> >>>> x <- 1:10 >>>> missingKind(x)[4] <- 23 >>>> x >>> [1] 1 2 3 NA.23 5 6 7 8 9 >>> [10] 10 >>>> is.na(x) >>> [1] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE >>> [10] FALSE >>>> missingKind(x) >>> [1] 0 0 0 23 0 0 0 0 0 0 >>>> >>> >>> Duncan Murdoch >>> >>>> >>>> Base R doesn't need anything else. >>>> >>>> You complained that users shouldn't need to know about attributes, >> and >>>> they won't: you, as the author of the package that does this, will >>>> handle all those details. Working in your subject area you know >> all >>>> the >>>> different kinds of NAs that people care about, and how they code >>>> them in >>>> input data, so you can make it all totally transparent. If you do >> it >>>> well, someone in some other subject area with a completely >> different >>>> set >>>> of kinds of missingness will be able to adapt your code to their >> use. >>>> >>>> >>>> But that is the whole point: the package author does not define possible >>>> NAs (the possibilities are infinite), users do that. >>>> The package should only provide a simple method to achieve that. >>>> >>>> >>>> I imagine this has all been done in one of the thousands of >> packages on >>>> CRAN, but if it hasn't been done well enough for you, do it better. >>>> >>>> >>>> If it were, I would have found it by now... >>>> >>>> Best wishes, >>>> Adrian >>> >> >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel@r-project.org <mailto:R-devel@r-project.org> mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > <https://stat.ethz.ch/mailman/listinfo/r-devel> [[alternative HTML version deleted]] ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel