Dear R-core team,
I think I found a small inconsistency in the boxplot function. I don't want
to post it as a bug since I'm not sure it would be considered one
according to the FAQ --- and it is not a major problem. Don't hesitate to
tell me if I'm wrong.
If you try to do a boxplot on a matrix and set the at argument to some
vector different from 1:n, where n is the number of columns of your matrix,
then some boxplots will be hidden, since the default xlim value is set to
c(0.5, n + 0.5) during the call to the bxp function.
Currently you can easily bypass this problem by setting xlim appropriately
when calling the boxplot function.
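To make the effect concrete, here is a minimal sketch that counts which box positions fall inside the default x range. It draws nothing; the variable names (at, default_xlim, visible) are only illustrative, and it looks at the box centres only, ignoring box width.

```r
# Sketch: which box centres lie inside the default xlim = c(0.5, n + 0.5)?
n  <- 10
at <- seq(-10, 10, length.out = n)   # positions other than 1:n

default_xlim <- c(0.5, n + 0.5)      # default documented in ?bxp

# TRUE where the box centre falls inside the default plotting range
visible <- at >= default_xlim[1] & at <= default_xlim[2]
sum(visible)                         # number of boxes whose centre is in range
```

With these positions only the boxes centred at positive values are in range, which is why part of the plot seems to be missing.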
Yes. And the help page for bxp even has the following note:
\note{
if \code{add = FALSE}, the default is \code{xlim = c(0.5, n +0.5)}.
It will usually be a good idea to specify the latter if the x axis
has a log scale or \code{at} is specified or \code{width} is far from
uniform.
}
which clearly documents the current behavior.
(and one could say also ``excuses'' the current behavior)
In this sense, there's really no bug ... ;-) and you were
very wise (or at least cautious :-) *not* to post it as a bug ..
I think it would be better if all boxplots were always shown unless the xlim
argument is specified. (I noticed this behavior when I tried to draw boxplots
of conditional simulations of a stochastic process, a case in which the
suggested behavior would be useful.)
I do agree that such a change would be more ``logical'' i.e.,
according to The Rule of Least Surprise
(a good software design principle of providing a default behavior
of least surprise to the user).
Here's an example
par(mfrow = c(1, 3))
data <- matrix(rnorm(10 * 50), 50)
colnames(data) <- letters[1:10]
x.pos <- seq(-10, 10, length = 10)
boxplot(data, at = x.pos) ## only the last 5 boxplots will appear
boxplot(data, at = 1:10) ## all boxplots will appear
## all boxplots will be shown
boxplot(data, at = x.pos, xlim = range(x.pos) + c(-0.5, 0.5))
I have attempted a patch in case you want to change the current behavior ---
note this is my first patch ever, so maybe I'm doing it wrong.
It looks good.
In the end, I would use
xlim <- range(at, finite=TRUE) + c(-0.5, 0.5)
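As a rough illustration of how the two variants differ, the sketch below compares the min()/max() form from the patch with the range(at, finite=TRUE) form when at contains a non-finite value. The helper names xlim_minmax and xlim_finite are hypothetical and not part of the graphics package; that robustness to NA/Inf is the motivation for finite=TRUE is my assumption.

```r
# Hypothetical helpers, for illustration only (not in the graphics package)
xlim_minmax <- function(at) c(min(at) - 0.5, max(at) + 0.5)
xlim_finite <- function(at) range(at, finite = TRUE) + c(-0.5, 0.5)

at_clean <- seq(-10, 10, length.out = 10)
at_dirty <- c(at_clean, NA, Inf)     # positions with non-finite entries

xlim_minmax(at_clean)                # both forms agree on clean input
xlim_finite(at_clean)

xlim_minmax(at_dirty)                # NA propagates through min() and max()
xlim_finite(at_dirty)                # NA and Inf are dropped before taking the range
```

On clean input the two forms give the same limits; only the second still yields usable limits when at has missing or infinite entries.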
There's one ***BIG*** question though:
How probable is it that it breaks someone else's code?
Note that boxplot() and bxp() are *REALLY* old traditional S
functions
(and for all the young guys: boxplots were invented/proposed
by the famous John W. Tukey, co-inventor of the FFT, of the word
``bit'', of exploratory data analysis, etc. etc.,
then (partly) at Bell Labs, which via John Chambers and
co-workers also donated the S language and hence R to the world!)
and therefore you can expect many many uses of boxplot() in
other code...
and hence, it could well be that some code has (probably
implicitly) *relied* on the current more surprising behavior.
I'd still advocate changing the default here,
but we really have to discuss this, as a change may also have
adverse consequences.
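One way to narrow down which code could be affected is to compare the old and the proposed default directly. The sketch below uses two hypothetical helpers, old_xlim and new_xlim (neither exists in the graphics package), and assumes the proposed default is range(at, finite=TRUE) + c(-0.5, 0.5).

```r
# Hypothetical helpers mirroring the current and the proposed default
old_xlim <- function(n)  c(0.5, n + 0.5)
new_xlim <- function(at) range(at, finite = TRUE) + c(-0.5, 0.5)

n <- 10

# Default case, at = 1:n: both rules give the same limits
same_default <- isTRUE(all.equal(old_xlim(n), new_xlim(seq_len(n))))

# Custom positions without xlim: the limits differ, so plots would change
at_custom  <- c(2, 4, 6)
old_custom <- old_xlim(length(at_custom))
new_custom <- new_xlim(at_custom)
```

Under this assumption, calls that leave at at its default are unaffected; only calls that pass at without xlim would see a different x range.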
Martin Maechler, ETH Zurich (and R Core)
*** Downloads/R-2.14.0/src/library/graphics/R/boxplot.R Mon Oct 3 00:02:21 2011
--- boxplot.R Thu Nov 17 23:02:45 2011
***************
*** 203,209 ****
  }
  if(is.null(pars$xlim))
! xlim <- c(0.5, n + 0.5)
  else {
  xlim <- pars$xlim
  pars$xlim <- NULL
--- 203,209 ----
  }
  if(is.null(pars$xlim))
! xlim <- c(min(at) - 0.5, max(at) + 0.5)
  else {
  xlim <- pars$xlim
  pars$xlim <- NULL
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel