Barry Rowlingson wrote:
2009/12/17 Joel Fürstenberg-Hägg <joel_furstenberg_h...@hotmail.com>:
Hi all,
I have a matrix (X) with observations as rows and parameters as columns. I'm
trying to replace all missing values in a column with the column mean using the
code below, but so far, nothing happens with the NAs... Can anyone see where
the problem is?
N<-nrow(X) # Calculate number of rows = 108
p<-ncol(X) # Calculate number of columns = 88
# Replace by columnwise mean
for (i in colnames(X)) # Do for all columns in the matrix
{
  for (j in rownames(X)) # Go through all rows
  {
    if(is.na(X[j,i])) # Search for missing value in the given position
    {
      X[j,i]=mean(X[1:p, i]) # Change missing value to the mean of the column
    }
  }
}
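mean(anything with an NA in it) is NA, so the assignment just writes NA back.
You want mean(X[,i],na.rm=TRUE); note that X[1:p,i] only covers rows 1 to p
(the number of columns, 88), not all N rows.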
> mean(c(1,2,3,NA,4))
[1] NA
> mean(c(1,2,3,NA,4),na.rm=TRUE)
[1] 2.5
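For reference, here is a minimal sketch of the original loop with that fix applied. The small matrix and its dimnames are made up for illustration, and the sketch assumes, as the original loop does, that X has both row and column names (if colnames(X) is NULL the outer loop never runs at all).

```r
# Hypothetical 3 x 2 example matrix with row and column names
X <- matrix(c(1, NA, 3,
              4, 5, NA),
            nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b")))

for (i in colnames(X)) {
  # Mean of the whole column, ignoring missing values
  col_mean <- mean(X[, i], na.rm = TRUE)
  for (j in rownames(X)) {
    if (is.na(X[j, i])) {
      X[j, i] <- col_mean  # replace the missing value with the column mean
    }
  }
}
X
```

Computing the mean once per column, outside the inner loop, also avoids recalculating it for every missing cell.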
I'll leave it to someone else to show you how to speed this code up by
removing the loops...
Barry
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi,
To replace all the NAs in each column with the column mean, do:
# Make example set
X = matrix(runif(25), 5, 5)
# Add some NA's
X[X>0.6] = NA
# Use apply(), which is shorthand for a loop;
# with MARGIN = 2 it loops over the columns
X2 = apply(X,2,function(column) {
  column[is.na(column)] = mean(column, na.rm = TRUE)
  return(column)
})
X
X2
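A fully vectorised variant is also possible. The following is only a sketch under the same assumptions as the example above (a numeric matrix with NAs); the names col_means, idx and X3 are made up here, and set.seed() is only there to make the example repeatable.

```r
set.seed(1)                      # only so the example is repeatable
X <- matrix(runif(25), 5, 5)     # same kind of example set as above
X[X > 0.6] <- NA

col_means <- colMeans(X, na.rm = TRUE)   # one mean per column, NAs ignored
idx <- which(is.na(X), arr.ind = TRUE)   # (row, col) positions of the NAs
X3 <- X
X3[idx] <- col_means[idx[, "col"]]       # fill each NA with its column's mean
X3
```

Note that with either approach a column consisting only of NAs has no mean to fall back on, so its cells end up as NaN.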
Is this OK, Barry? :)
cheers,
Paul
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul