> Dear R-core team,
>
> I think I found a small inconsistency in the boxplot function. I don't want
> to post it as a bug since I'm not sure this might be considered as one
> according to the FAQ --- and this is not a major problem. Don't hesitate to
> tell me if I'm wrong.
>
> If you try to do a boxplot on a matrix and set the "at" argument to some
> vector different from 1:n, n is the number of columns of your matrix, then
> some boxplots will be hidden since the default "xlim" value will be set to
> c(0.5, n + 0.5) during the call of the bxp function.
>
> Currently you can easily bypass this problem by setting "xlim" appropriately
> when calling the boxplot function.

Yes. And the help page for bxp even has the following note:

  \note{
    if \code{add = FALSE}, the default is \code{xlim = c(0.5, n +0.5)}.
    It will usually be a good idea to specify the latter if the "x" axis
    has a log scale or \code{at} is specified or \code{width} is far from
    uniform.
  }

which clearly documents the current behavior.
(and one could say also ``excuses'' the current behavior)

In this sense, there's really no bug ... ;-)
and you were very wise (or at least cautious :-) *not* to post it as bug ..

> I think it will be better if all boxplots were always shown unless the "xlim"
> argument is specified. (I realized this behavior when I tried to do boxplots
> on conditional simulations of a stochastic process ; in which case the
> suggested behavior might be useful.)

I do agree that such a change would be more ``logical'' i.e.,
according to "The Rule of Least Surprise" (a good software design
principle of providing a default behavior of "least surprise" to the user).

> Here's an example
>
> par(mfrow = c(1, 3))
> data <- matrix(rnorm(10 * 50), 50)
> colnames(data) <- letters[1:10]
> x.pos <- seq(-10, 10, length = 10)
> boxplot(data, at = x.pos) ## only the last 5 boxplots will appear
> boxplot(data, at = 1:10) ## all boxplots will appear
> boxplot(data, at = x.pos, xlim = range(x.pos) + c(-0.5, 0.5)) ## all boxplots will be shown
>
> I tried to do a patch if you want to change the current behavior --- note
> this is my first patch ever so maybe I'm doing it wrong.

it looks good.
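To make the effect concrete without plotting anything, here is a minimal sketch that counts how many box positions fall inside each set of limits for the example above. Note that `visible()` is a hypothetical helper written only for this illustration (it is not part of the graphics package), and it only checks box centres, ignoring box widths.

```r
# Hypothetical helper (not in 'graphics'): is each 'at' position inside xlim?
# Only box centres are checked; box widths are ignored.
visible <- function(at, xlim) at >= xlim[1] & at <= xlim[2]

n     <- 10
x.pos <- seq(-10, 10, length.out = n)   # same positions as in the example

xlim_current <- c(0.5, n + 0.5)               # default documented for bxp()
xlim_from_at <- range(x.pos) + c(-0.5, 0.5)   # limits derived from 'at'

n_current <- sum(visible(x.pos, xlim_current))  # centres inside the default
n_from_at <- sum(visible(x.pos, xlim_from_at))  # centres inside 'at'-based limits

print(c(current = n_current, from_at = n_from_at))
```

With these positions, only the boxes centred at positive values land inside the default limits, which matches the "only the last 5 boxplots will appear" comment in the example.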
In the end, I would use

    xlim <- range(at, finite=TRUE) + c(-0.5, 0.5)

There's one ***BIG*** question though:
How probable is it that it breaks someone else's code?

Note that boxplot() and bxp() are *REALLY* old traditional S functions
(and for all the young guys: Boxplots were invented/proposed by
 the famous John W Tukey, co-inventor of the FFT, the word "bit";
 "exploratory data analysis", etc etc.
 Then (partly) at Bell Labs, who via John Chambers and co-workers also
 "donated" the S language and hence R to the world !)
and therefore you can expect many many uses of boxplot() in other
code... and hence, it could well be that some code has (probably
implicitly) *relied* on the current "more surprising" behavior.

I'd still advocate changing the default here, but we really have
to discuss this, as a change may also have adverse consequences.

Martin Maechler, ETH Zurich (and R Core)

> *** Downloads/R-2.14.0/src/library/graphics/R/boxplot.R Mon Oct 3 00:02:21 2011
> --- boxplot.R Thu Nov 17 23:02:45 2011
> ***************
> *** 203,209 ****
>       }
>       if(is.null(pars$xlim))
> !         xlim <- c(0.5, n + 0.5)
>       else {
>           xlim <- pars$xlim
>           pars$xlim <- NULL
> --- 203,209 ----
>       }
>       if(is.null(pars$xlim))
> !         xlim <- c(min(at) - 0.5, max(at) + 0.5)
>       else {
>           xlim <- pars$xlim
>           pars$xlim <- NULL

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel