Suppose I have something like the following dataframe:

samp1 <- c(60,50,20,90)
samp2 <- c(60,60,90,58)
samp3 <- c(25,65,65,90)

test <- data.frame(samp1,samp2,samp3)

I want to calculate column means. Easy enough, if I want to use all the data within each column:


print(colMeans(test, na.rm = TRUE))


However, I'm danged if I can figure out how to do the same thing after dropping the minimum value for each column. For example, column 1 in the dataframe test consists of 60, 50, 20, 90. I want to calculate the mean over (60, 50, 90), dropping the minimum value (20). Figuring out what the minimum value is in a single column is easy, but I can't figure out how to arm-twist colMeans into 'applying itself' to the elements of a column greater than the minimum, for each column in turn. I've tried permutations of select, subset etc., to no avail. The only thing I can think of is to (i) find the minimum in a column, (ii) change it to NA, and then (iii) tell colMeans to use na.rm = TRUE:

test2 <- test

# set each column's minimum (first occurrence only) to NA
for (i in seq_len(ncol(test))) { test2[which.min(test[, i]), i] <- NA }

print(test2)

print(colMeans(test2, na.rm = TRUE))
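
As a sanity check on what I'm after for each column, here is a minimal sketch that drops the minimum directly, one column at a time. The name drop_min_mean is just a hypothetical helper I made up for illustration, and it assumes that when the minimum is tied only its first occurrence should be dropped. For the data above it should give roughly 66.67, 70 and 73.33:

```r
samp1 <- c(60, 50, 20, 90)
samp2 <- c(60, 60, 90, 58)
samp3 <- c(25, 65, 65, 90)
test <- data.frame(samp1, samp2, samp3)

# drop_min_mean is a hypothetical helper name, not a base R function.
# It removes the first occurrence of the minimum, then averages the rest.
drop_min_mean <- function(x) mean(x[-which.min(x)])

# apply the helper to each column of the data frame in turn
check <- sapply(test, drop_min_mean)
print(check)
```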


While this works, it seems awfully 'clunky' -- is there a better way?

Thanks in advance...

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.