Re: [Rd] Small inconsistency with boxplot

2011-11-18 Thread Martin Maechler


 Dear R-core team,
 I think I found a small inconsistency in the boxplot function. I don't want 
 to post it as a bug since I'm not sure this might be considered as one 
 according to the FAQ --- and this is not a major problem. Don't hesitate to 
 tell me if I'm wrong.

 If you try to do a boxplot on a matrix and set the at argument to some 
 vector different from 1:n, where n is the number of columns of your matrix, 
 then some boxplots will be hidden since the default xlim value will be set 
 to c(0.5, n + 0.5) during the call of the bxp function.

 Currently you can easily bypass this problem by setting xlim appropriately 
 when calling the boxplot function.

Yes.  And the help page for  bxp  even has the following note:

 \note{
   if \code{add = FALSE}, the default is \code{xlim = c(0.5, n +0.5)}.
   It will usually be a good idea to specify the latter if the x axis
   has a log scale or \code{at} is specified or \code{width} is far from
   uniform.
 }

which clearly documents the current behavior.
(and one could say also ``excuses'' the current behavior)

In this sense, there's really no bug ... ;-) and you were 
very wise (or at least cautious :-) *not* to post it as a bug  .. 

 I think it would be better if all boxplots were always shown unless the xlim 
 argument is specified. (I realized this behavior when I tried to do boxplots 
 on conditional simulations of a stochastic process ; in which case the 
 suggested behavior might be useful.)

I do agree that such a change would be more ``logical'' i.e.,
according to  The Rule of Least Surprise
(a good software design principle of providing a default behavior
 of least surprise to the user).

 Here's an example

 par(mfrow = c(1, 3))
 data <- matrix(rnorm(10 * 50), 50)
 colnames(data) <- letters[1:10]
 x.pos <- seq(-10, 10, length = 10)
 boxplot(data, at = x.pos) ## only the last 5 boxplots will appear
 boxplot(data, at = 1:10) ## all boxplots will appear
 boxplot(data, at = x.pos, xlim = range(x.pos) + c(-0.5, 0.5)) ## all boxplots will be shown


 I tried to do a patch if you want to change the current behavior --- note 
 this is my first patch ever so maybe I'm doing it wrong.

it looks good.
In the end, I would use

xlim <- range(at, finite=TRUE) + c(-0.5, 0.5)
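
To illustrate why range(at, finite=TRUE) may be preferable to min(at)/max(at), here is a minimal sketch. The helper name default_xlim is hypothetical and not part of R, and the vector at_bad is only an assumed example of positions containing a non-finite value.

```r
# Hypothetical helper (not part of R): the default x-limits suggested above.
default_xlim <- function(at) range(at, finite = TRUE) + c(-0.5, 0.5)

x.pos <- seq(-10, 10, length.out = 10)
default_xlim(x.pos)   # limits now cover every position in x.pos
default_xlim(1:10)    # identical to the current default c(0.5, n + 0.5)

# Assumed example: a position vector with a missing value.
at_bad <- c(1, 2, NA, 4)
c(min(at_bad) - 0.5, max(at_bad) + 0.5)  # both limits are NA
default_xlim(at_bad)                     # non-finite entries are ignored
```

For at = 1:n both variants give the same limits, so the difference only shows up for non-default or non-finite positions.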

There's one ***BIG*** question though:  

How probable is it that it breaks someone else's code?
Note that boxplot() and bxp() are  *REALLY*  old traditional S
functions
(and for all the young guys:  Boxplots were invented/proposed
 by the famous  John W Tukey, co-inventor of the FFT, of the word
 "bit", of exploratory data analysis, etc etc.,
 then (partly) at Bell Labs, who via John Chambers and
 co-workers also donated the S language and hence R to the world !)

and therefore you can expect many many uses of boxplot() in
other code...
and hence, it could well be that some code has (probably
implicitly) *relied* on the current more surprising behavior.

I'd still advocate changing the default here,
but we really have to discuss this, as a change also may have
adverse consequences.
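
As a rough way to judge whether a given call would be affected, one could compare the two defaults directly. This is only a sketch: old_xlim, new_xlim and xlim_changes are hypothetical names used for illustration, and it only covers calls that leave xlim unset.

```r
# Hypothetical helpers, for illustration only (not part of R).
old_xlim <- function(n)  c(0.5, n + 0.5)                        # current default
new_xlim <- function(at) range(at, finite = TRUE) + c(-0.5, 0.5) # proposed default

# TRUE when a call that leaves xlim unset would get different limits.
xlim_changes <- function(at) {
  !isTRUE(all.equal(old_xlim(length(at)), new_xlim(at)))
}

xlim_changes(1:10)                           # default positions: unchanged
xlim_changes(seq(-10, 10, length.out = 10))  # positions from the example: changed
xlim_changes(c(1, 2, 4, 5))                  # a gap in the positions: changed
```

Code that never passes at, or passes at = 1:n, would see no difference; only calls with other positions and no explicit xlim would plot differently.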

Martin Maechler, ETH Zurich (and R Core)

 *** Downloads/R-2.14.0/src/library/graphics/R/boxplot.R   Mon Oct  3 
 00:02:21 2011
 --- boxplot.R Thu Nov 17 23:02:45 2011
 ***
 *** 203,209 
   }
  
   if(is.null(pars$xlim))
 ! xlim <- c(0.5, n + 0.5)
   else {
   xlim <- pars$xlim
   pars$xlim <- NULL
 --- 203,209 
   }
  
   if(is.null(pars$xlim))
 ! xlim <- c(min(at) - 0.5, max(at) + 0.5)
   else {
   xlim <- pars$xlim
   pars$xlim <- NULL

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Small inconsistency with boxplot

2011-11-17 Thread Mathieu Ribatet

Dear R-core team,

I think I found a small inconsistency in the boxplot function. I don't want to 
post it as a bug since I'm not sure this might be considered as one according 
to the FAQ --- and this is not a major problem. Don't hesitate to tell me if 
I'm wrong.

If you try to do a boxplot on a matrix and set the at argument to some vector 
different from 1:n, where n is the number of columns of your matrix, then some 
boxplots will be hidden since the default xlim value will be set to c(0.5, n 
+ 0.5) during the call of the bxp function.

Currently you can easily bypass this problem by setting xlim appropriately 
when calling the boxplot function.

I think it would be better if all boxplots were always shown unless the xlim 
argument is specified. (I realized this behavior when I tried to do boxplots on 
conditional simulations of a stochastic process ; in which case the suggested 
behavior might be useful.)

Here's an example

par(mfrow = c(1, 3))
data <- matrix(rnorm(10 * 50), 50)
colnames(data) <- letters[1:10]
x.pos <- seq(-10, 10, length = 10)
boxplot(data, at = x.pos) ## only the last 5 boxplots will appear
boxplot(data, at = 1:10) ## all boxplots will appear
boxplot(data, at = x.pos, xlim = range(x.pos) + c(-0.5, 0.5)) ## all boxplots will be shown

I tried to do a patch if you want to change the current behavior --- note this 
is my first patch ever so maybe I'm doing it wrong.

*** Downloads/R-2.14.0/src/library/graphics/R/boxplot.R Mon Oct  3 00:02:21 2011
--- boxplot.R   Thu Nov 17 23:02:45 2011
***
*** 203,209 
  }
  
  if(is.null(pars$xlim))
! xlim <- c(0.5, n + 0.5)
  else {
      xlim <- pars$xlim
      pars$xlim <- NULL
--- 203,209 
  }
  
  if(is.null(pars$xlim))
! xlim <- c(min(at) - 0.5, max(at) + 0.5)
  else {
      xlim <- pars$xlim
      pars$xlim <- NULL


- 
I3M, UMR CNRS 5149
Universite Montpellier II,
4 place Eugene Bataillon
34095 Montpellier cedex 5   France
http://www.math.univ-montp2.fr/~ribatet
Tel: + 33 (0)4 67 14 41 98

