Petr PIKAL wrote:

So do you think I should file a bug report? I would rather wait to see whether there is some reaction from others. Maybe there is some reason behind this behaviour. These simple statistics tend to behave differently when operating on data frames, so median is not such a huge surprise.


sd(df1), var(df1), mean(df1), max(df1), min(df1), range(df1)

The results produced are usually clearly documented; however, for a novice it is rather mysterious why using these functions on a vector produces easily understandable results while using them on a data.frame (the most common data structure) is far from consistent and intuitive.

But I agree with you that mean and median should, ideally, return results with a similar structure.
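
One way to get the same result structure from every statistic is to apply it column by column. The following is only a sketch, using the 4x4 data frame from the original post; the variable names `col_means`, `col_medians` and `col_sds` are made up for illustration.

```r
# The small data frame from the original post.
mat <- matrix(1:16, 4, 4)
df1 <- data.frame(mat)

# sapply() calls the function once per column, so each result is a
# named numeric vector with one entry per column (X1..X4).
col_means   <- sapply(df1, mean)
col_medians <- sapply(df1, median)
col_sds     <- sapply(df1, sd)

print(col_means)
print(col_medians)
print(col_sds)
```

Because every statistic goes through the same per-column call, the three results can be compared or bound together without special cases.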


Well, I don't think that it's a bug since the documentation
for median() does not indicate that median should work for
dataframes, whereas for mean() it clearly says that a method
exists. methods('mean') and methods('median') as well as
mean.default(df1) are informative.
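
For anyone who wants to check this on their own installation, a quick sketch follows. Which methods are listed depends on the R version and on the packages loaded, so no particular output is shown; `has_median_df` and `has_median_default` are illustrative names.

```r
# List the S3 methods known for each generic in this session.
print(methods("mean"))
print(methods("median"))

# getS3method() with optional = TRUE returns NULL instead of failing
# when no method is found for the given class.
has_median_df      <- !is.null(getS3method("median", "data.frame", optional = TRUE))
has_median_default <- !is.null(getS3method("median", "default", optional = TRUE))
print(c(data.frame = has_median_df, default = has_median_default))
```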

It seems to me to be a simple fix so I wonder what I'm
missing. Paraphrasing <- function(x, ...) sapply(x, median, ...)

I think that it would be desirable to have similar behaviour
for both functions or at least a warning if median.default
is incorrectly applied to a data.frame object.
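
A sketch of that one-liner in use follows. The method name `median.data.frame` is an assumption made here, by analogy with the data frame method for mean(); only the function body comes from the paraphrase above. Defining it at top level should be enough for median() to dispatch to it within the same session.

```r
# Hypothetical S3 method: the name is assumed, the body is the
# one-liner quoted above. It returns one median per column.
median.data.frame <- function(x, ...) sapply(x, median, ...)

df1 <- data.frame(matrix(1:32, 4, 8))

# median() is an S3 generic, so a data frame argument now reaches
# the method defined above instead of median.default().
col_medians <- median(df1)
print(col_medians)
```

With this in place the result has the same shape as the column means: a named numeric vector with one entry per column.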

 -Peter Ehlers

wrote on 04.02.2010 10:28:16:

Well, I get the same as Petr with  R version 2.10.0 (2009-10-26)
on Linux.

To me, this suggests that median is broken! Any user would,
a priori, expect that median() should operate in exactly
the same way as mean(). To extend Petr's example:

  mat <- matrix(1:32, 4,8)
  df1 <- data.frame(mat)
  mean(df1)
  #   X1   X2   X3   X4   X5   X6   X7   X8
  #  2.5  6.5 10.5 14.5 18.5 22.5 26.5 30.5
  median(df1)
  # [1] 14.5 18.5

so (as in Petr's original example, but more clearly) median()
returns the medians of the two "central" columns X4 and X5 of df1.
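
The index arithmetic behind this can be reproduced by hand. The sketch below assumes, as the results above suggest, that the default method picks its central element(s) by position using length(x), which for a data frame is the number of columns. The names `n`, `half`, `central` and `central_means` are illustrative and not a copy of the R sources.

```r
df1 <- data.frame(matrix(1:32, 4, 8))

n <- length(df1)            # number of columns, not number of values
half <- (n + 1L) %/% 2L     # lower central position
central <- half + 0L:1L     # the two central positions for even n

print(names(df1)[central])

# Column means of just those two central columns.
central_means <- sapply(df1[central], mean)
print(central_means)
```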

But that is with an even number of columns. Now look at what
happens with an odd number:

  mat <- matrix(1:28, 4,7)
  df1 <- data.frame(mat)
  mean(df1)
  #   X1   X2   X3   X4   X5   X6   X7
  #  2.5  6.5 10.5 14.5 18.5 22.5 26.5
  median(df1)
  #   structure(c("13", "14", "15", "16"), class = "AsIs")
  # 1                                                   13
  # 2                                                   14
  # 3                                                   15
  # 4                                                   16


This does suggest a tie-in with Petr's observation about "As.Is",
and there is no doubt at all that the above result is rubbish.
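
A positional reading of the default method, which is only an assumption here, would account for the values in the odd case: with seven columns the single central position is 4, and column X4 holds exactly 13 to 16. The names `n`, `half` and `central_column` are illustrative.

```r
df1 <- data.frame(matrix(1:28, 4, 7))

n <- length(df1)            # 7 columns
half <- (n + 1L) %/% 2L     # single central position for odd n

# Pull out the column sitting at the central position.
central_column <- df1[[half]]
print(names(df1)[half])
print(central_column)
```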
It is certainly not what a user would expect, and in the context
of Petr's intention to present R lessons to a class, I could
foresee students turning their backs on R if they came up with
such a result in their early encounters!


On 04-Feb-10 08:59:59, Mario Valle wrote:
Linux 2.9.0 gives:

[1] 34

Ever stranger...

Petr PIKAL wrote:
While experimenting in preparation for R lessons I encountered this behaviour, which I cannot fully explain:

mat <- matrix(1:16, 4,4)
df1 <- data.frame(mat)

mean(df1)
  X1   X2   X3   X4
 2.5  6.5 10.5 14.5
Expected, documented

median(df1)
[1]  6.5 10.5

Rather weird. AFAIK there should not be an issue with data frames; at least I did not find any mentioned in the help page. I tracked it down, probably, to an As.Is operation on the object and subsequent sorting in median.default.

I know other (*apply) ways to compute the median for data frames, so I would just like to hear an opinion about this behaviour from more experienced people.

Thank you
Best regards


______________________________________________ mailing list
PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.
Ing. Mario Valle
Data Analysis and Visualization Group            |
Swiss National Supercomputing Centre (CSCS)      | Tel:  +41 (91)
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91)

E-Mail: (Ted Harding) <>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Feb-10                                       Time: 09:28:13
------------------------------ XFMail ------------------------------


Peter Ehlers
University of Calgary
