Yes, that is even better.

Best,
Adrian

On Wed, May 26, 2021 at 7:05 PM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
> After 5 minutes more thought:
>
> - code non-missing as missingKind = NA, not 0, so that missingKind could
>   be a character vector, or missingKind = 0 could be supported.
>
> - print methods should return the main argument, so mine should be
>
>   print.MultiMissing <- function(x, ...) {
>     vals <- as.character(x)
>     if (!is.character(x) || inherits(x, "noquote"))
>       print(noquote(vals))
>     else
>       print(vals)
>     invisible(x)
>   }
>
> This still needs a lot of improvement to be a good print method, but
> I'll leave that to you.
>
> Duncan Murdoch
>
> On 26/05/2021 11:43 a.m., Duncan Murdoch wrote:
> > On 26/05/2021 10:22 a.m., Adrian Dușa wrote:
> >> Dear Duncan,
> >>
> >> On Wed, May 26, 2021 at 2:27 AM Duncan Murdoch
> >> <murdoch.dun...@gmail.com> wrote:
> >>
> >>     You've already been told how to solve this: just add attributes to the
> >>     objects. Use the standard NA to indicate that there is some kind of
> >>     missingness, and the attribute to describe exactly what it is. Stick a
> >>     class on those objects and define methods so that subsetting and
> >>     arithmetic preserves the extra info you've added. If you do some
> >>     operation that turns those NAs into NaNs, big deal: the attribute will
> >>     still be there, and is.na(NaN) still returns TRUE.
> >>
> >>
> >> I've already tried the attributes way, it is not so easy.
> >
> > If you have specific operations that are needed but that you can't get
> > to work, post the issue here.
> >
> >> In the best case scenario, it unnecessarily triples the size of the
> >> data, but perhaps this is the only way forward.
> >
> > I don't see how it could triple the size. Surely an integer has enough
> > values to cover all possible kinds of missingness. So on integer or
> > factor data you'd double the size, on real or character data you'd
> > increase it by 50%. (This is assuming you're on a 64 bit platform with
> > 32 bit integers and 64 bit reals and pointers.)
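The size estimate above can be checked roughly with object.size(). What follows is a minimal sketch and not part of the quoted implementation: the vector length n and the helper size() are assumptions made for illustration, and exact byte counts depend on the platform. It also shows that a double-valued attribute, which is what ifelse(is.na(x), 1, 0) returns, costs more than an integer one when attached to integer data.

```r
# Hypothetical check of the memory overhead of a per-element "missingKind"
# attribute. n and size() are illustrative names, not from the thread.
n <- 1e5
ints  <- integer(n)   # 4 bytes per element
reals <- numeric(n)   # 8 bytes per element

# object.size() counts attributes as well as the data itself
size <- function(x) as.numeric(object.size(x))

# Integer kinds on integer data: roughly double the size
ratio_int <- size(structure(ints, missingKind = integer(n))) / size(ints)

# Integer kinds on double data: roughly a 50% increase
ratio_real <- size(structure(reals, missingKind = integer(n))) / size(reals)

# Double kinds on integer data: roughly triple the size, because
# ifelse(test, 1, 0) yields a double vector (1L and 0L would yield integers)
kinds_dbl <- ifelse(is.na(ints), 1, 0)
ratio_dbl <- size(structure(ints, missingKind = kinds_dbl)) / size(ints)

round(c(integer = ratio_int, real = ratio_real, double_kinds = ratio_dbl), 2)
```

If the integer estimate is the goal, coercing the attribute with as.integer() before attaching it would be one way to get there.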
> >
> > Here's a tiny implementation to show what I'm talking about:
> >
> > asMultiMissing <- function(x) {
> >    if (isMultiMissing(x))
> >      return(x)
> >    missingKind <- ifelse(is.na(x), 1, 0)
> >    structure(x,
> >              missingKind = missingKind,
> >              class = c("MultiMissing", class(x)))
> > }
> >
> > isMultiMissing <- function(x)
> >    inherits(x, "MultiMissing")
> >
> > missingKind <- function(x) {
> >    if (isMultiMissing(x))
> >      attr(x, "missingKind")
> >    else
> >      ifelse(is.na(x), 1, 0)
> > }
> >
> > `missingKind<-` <- function(x, value) {
> >    class(x) <- setdiff(class(x), "MultiMissing")
> >    x[value != 0] <- NA
> >    x <- asMultiMissing(x)
> >    attr(x, "missingKind") <- value
> >    x
> > }
> >
> > `[.MultiMissing` <- function(x, i, ...) {
> >    missings <- missingKind(x)
> >    x <- NextMethod()
> >    missings <- missings[i]
> >    missingKind(x) <- missings
> >    x
> > }
> >
> > print.MultiMissing <- function(x, ...) {
> >    vals <- as.character(x)
> >    if (!is.character(x) || inherits(x, "noquote"))
> >      print(noquote(vals))
> >    else
> >      print(vals)
> > }
> >
> > `[<-.MultiMissing` <- function(x, i, value, ...) {
> >    missings <- missingKind(x)
> >    class(x) <- setdiff(class(x), "MultiMissing")
> >    x[i] <- value
> >    missings[i] <- missingKind(value)
> >    missingKind(x) <- missings
> >    x
> > }
> >
> > as.character.MultiMissing <- function(x, ...) {
> >    missings <- missingKind(x)
> >    result <- NextMethod()
> >    ifelse(missings != 0,
> >           paste0("NA.", missings), result)
> > }
> >
> > This is incomplete. It doesn't do printing very well, and it doesn't
> > handle the case of assigning a MultiMissing value to a regular vector at
> > all. (I think you'd need an S4 implementation if you want to support
> > that.)
> > But it does the basics:
> >
> >   > x <- 1:10
> >   > missingKind(x)[4] <- 23
> >   > x
> >    [1] 1 2 3 NA.23 5 6 7 8 9
> >   [10] 10
> >   > is.na(x)
> >    [1] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
> >   [10] FALSE
> >   > missingKind(x)
> >    [1] 0 0 0 23 0 0 0 0 0 0
> >
> > Duncan Murdoch
> >
> >>
> >>     Base R doesn't need anything else.
> >>
> >>     You complained that users shouldn't need to know about attributes, and
> >>     they won't: you, as the author of the package that does this, will
> >>     handle all those details. Working in your subject area you know all the
> >>     different kinds of NAs that people care about, and how they code them in
> >>     input data, so you can make it all totally transparent. If you do it
> >>     well, someone in some other subject area with a completely different set
> >>     of kinds of missingness will be able to adapt your code to their use.
> >>
> >>
> >> But that is the whole point: the package author does not define possible
> >> NAs (the possibilities are infinite), users do that.
> >> The package should only provide a simple method to achieve that.
> >>
> >>
> >>     I imagine this has all been done in one of the thousands of packages on
> >>     CRAN, but if it hasn't been done well enough for you, do it better.
> >>
> >>
> >> If it were, I would have found it by now...
> >>
> >> Best wishes,
> >> Adrian
> >

        [[alternative HTML version deleted]]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
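The two suggestions at the top of the thread (code non-missing as NA so that kinds can be character, and have the print method return its argument) could be combined along the following lines. This is a sketch under stated assumptions, not Duncan's code: tagMissing() and kindOf() are hypothetical names, only plain atomic vectors are handled, and assignment, arithmetic and combining are left out.

```r
# Hypothetical constructor: 'kind' is NA for ordinary values and a
# label (e.g. "refused") for each element that should be missing.
tagMissing <- function(x, kind = rep(NA_character_, length(x))) {
  stopifnot(length(kind) == length(x))
  vals <- unclass(x)
  attr(vals, "missingKind") <- NULL
  vals[!is.na(kind)] <- NA            # labelled elements become standard NA
  structure(vals,
            missingKind = as.character(kind),
            class = c("MultiMissing", class(vals)))
}

# Hypothetical accessor: NA means "no declared kind"
kindOf <- function(x) {
  k <- attr(x, "missingKind")
  if (is.null(k)) rep(NA_character_, length(x)) else k
}

# Subsetting keeps values and kinds aligned
`[.MultiMissing` <- function(x, i, ...) {
  vals <- unclass(x)
  attr(vals, "missingKind") <- NULL
  tagMissing(vals[i], kindOf(x)[i])
}

as.character.MultiMissing <- function(x, ...) {
  kinds <- kindOf(x)
  vals <- as.character(unclass(x))    # as.character() drops attributes
  ifelse(is.na(kinds), vals, paste0("NA.", kinds))
}

# Print method returns its main argument invisibly, as suggested above
print.MultiMissing <- function(x, ...) {
  print(noquote(as.character(x)), ...)
  invisible(x)
}

x <- tagMissing(1:5, c(NA, NA, NA, "refused", NA))
print(x)
is.na(x)          # the labelled element is still an ordinary NA
kindOf(x[3:4])
```

Because a labelled element is stored as a standard NA, is.na() and anything built on it keep working without knowing about the extra attribute.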