Petr PIKAL wrote:
Hi

So do you think I should file a bug report? I think I would rather wait to see if there is some reaction from others. Maybe there is some reason behind this behaviour. These simple statistics tend to behave differently when operating on data.frames, so median is not such a huge surprise.

see

sd(df1), var(df1), mean(df1), max(df1), min(df1), range(df1)

The results are usually clearly documented; however, for a novice it is rather mysterious why using these functions on a vector produces easily understandable results, while using them on a data.frame (the most common data structure) is far from consistent or intuitive.
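A small sketch of the inconsistency described here, using the 4x4 example from this thread. It only uses calls whose behaviour on a numeric data.frame should be stable across R versions. mean(), median() and sd() on a data.frame are deliberately left out, because their behaviour has differed between R releases; check your own version.

```r
mat <- matrix(1:16, 4, 4)
df1 <- data.frame(mat)

# On a vector each function returns what a novice would expect
x <- df1$X1
max(x)      # largest value in the vector
range(x)    # smallest and largest value

# On a data.frame, max() and range() summarise ALL cells at once,
# not one result per column
max(df1)
range(df1)

# var() instead returns a column-by-column covariance matrix
dim(var(df1))

# An explicit column-wise call gives one value per column
col_max <- sapply(df1, max)
col_max
```

So the same family of "simple statistics" can return a single number, a pair, or a matrix, depending on the function.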

But I agree with you that mean and median should, ideally, return results with a similar structure.

Regards
Petr

Well, I don't think that it's a bug since the documentation
for median() does not indicate that median should work for
dataframes, whereas for mean() it clearly says that a method
exists. methods('mean') and methods('median') as well as
mean.default(df1) are informative.

It seems to me to be a simple fix so I wonder what I'm
missing. Paraphrasing mean.data.frame:

median.data.frame <- function(x, ...) sapply(x, median, ...)

I think that it would be desirable to have similar behaviour
for both functions or at least a warning if median.default
is incorrectly applied to a data.frame object.
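A minimal sketch of how the one-liner above could be tried out in a session. The method defined here is hypothetical and user-defined, not something base R provides; the `na.rm` argument is added on the assumption that it should be passed through to median() for each column. It assumes all columns are numeric.

```r
# Hypothetical user-defined method, following the one-liner quoted above;
# it is not part of base R
median.data.frame <- function(x, na.rm = FALSE, ...) {
  # apply median() to each column and collect the results in a named vector
  sapply(x, median, na.rm = na.rm, ...)
}

df1 <- data.frame(matrix(1:28, 4, 7))   # the odd-column example from this thread
col_medians <- median(df1)              # dispatches to median.data.frame
col_medians
```

With this in place, median(df1) returns one value per column, named X1 to X7, in the same shape as the mean(df1) output shown in this thread. A data.frame with non-numeric columns would still need separate handling.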

 -Peter Ehlers


r-help-boun...@r-project.org wrote on 04.02.2010 10:28:16:

Well, I get the same as Petr with  R version 2.10.0 (2009-10-26)
on Linux.

To me, this suggests that median is broken! Any user would,
a priori, expect that median() should operate in exactly
the same way as mean(). To extend Petr's example:

  mat <- matrix(1:32, 4,8)
  df1 <- data.frame(mat)
  mean(df1)
  #   X1   X2   X3   X4   X5   X6   X7   X8
  #  2.5  6.5 10.5 14.5 18.5 22.5 26.5 30.5
  median(df1)
  # [1] 14.5 18.5

so (as in Petr's original example, but more clearly) median()
returns the medians of the two "central" columns X4 and X5 of df1.

But that is with an even number of columns. Now look at what
happens with an odd number:

  mat <- matrix(1:28, 4,7)
  df1 <- data.frame(mat)
  mean(df1)
  #   X1   X2   X3   X4   X5   X6   X7
  #  2.5  6.5 10.5 14.5 18.5 22.5 26.5
  median(df1)
  #   structure(c("13", "14", "15", "16"), class = "AsIs")
  # 1                                                   13
  # 2                                                   14
  # 3                                                   15
  # 4                                                   16

Wow!!!!!!!!!!

This does suggest a tie-in with Petr's observation about "As.Is",
and there is no doubt at all that the above result is rubbish.
It is certainly not what a user would expect, and in the context
of Petr's intention to present R lessons to a class, I could
foresee students turning their backs on R if they came up with
such a result in their early encounters!

Ted.

On 04-Feb-10 08:59:59, Mario Valle wrote:
R 2.9.0 on Linux gives:

median(df1)
[1] 34

Ever stranger...
              mario

Petr PIKAL wrote:
During some experimentation in preparing R lessons I encountered this behaviour, which I cannot fully explain:

mat <- matrix(1:16, 4,4)
df1 <- data.frame(mat)

mean(df1)
  X1   X2   X3   X4
 2.5  6.5 10.5 14.5
Expected, documented

median(df1)
[1]  6.5 10.5

Rather weird. AFAIK there should not be an issue with data frames; at least I did not find any mentioned in the help page. I tracked it down, probably, to an As.Is operation on the object and subsequent sorting in median.default.

I know other (*apply) ways to compute the median for data frames, so I would just like to hear an opinion about this behaviour from more experienced people.
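For readers who have not met the *apply route mentioned here, a short sketch on the same 4x4 example. The variable names are only illustrative, and numeric columns are assumed.

```r
mat <- matrix(1:16, 4, 4)
df1 <- data.frame(mat)

# sapply() treats the data.frame as a list of columns
med_sapply <- sapply(df1, median)

# vapply() does the same but checks that each result is a single number
med_vapply <- vapply(df1, median, numeric(1))

# on the underlying matrix, apply() over margin 2 (columns) is equivalent
med_apply <- apply(mat, 2, median)

med_sapply
#   X1   X2   X3   X4
#  2.5  6.5 10.5 14.5
```

For this particular data the column medians coincide with the column means reported by mean(df1) above.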

Thank you
Best regards

Petr

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Ing. Mario Valle
Data Analysis and Visualization Group            | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)      | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Feb-10                                       Time: 09:28:13
------------------------------ XFMail ------------------------------

--
Peter Ehlers
University of Calgary
