Dear All,

Replacing missing values with means is generally not a good idea:

"Perhaps the easiest way to impute is to replace each missing
value with the mean of the observed values for that variable.
Unfortunately, this strategy can severely distort the distribution
for this variable, leading to complications with summary measures
including, notably, underestimates of the standard deviation.
Moreover, mean imputation distorts relationships between variables
by “pulling” estimates of the correlation toward zero."

That's from Gelman and Hill -- more here:
http://www.stat.columbia.edu/~gelman/arm/missing.pdf
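
To make the point about the standard deviation concrete, here is a
minimal sketch using the sample data from the question below. It
performs the rounded mean replacement that was asked for and then
compares the standard deviations before and after. The names
impute_mean, DF2 and vars are hypothetical and chosen only for
illustration; this is one possible way to write it, not a
recommendation to impute this way.

```r
# Sample data from the original question
DF1 <- read.table(header = TRUE, text = "ID1 x y z
1  25  122    352
2  30  135    376
3  40   NA    350
4  26  157    NA
5  60  195    360")

# Hypothetical helper: replace NAs in a numeric vector with the
# rounded mean of the observed values
impute_mean <- function(v) {
  v[is.na(v)] <- round(mean(v, na.rm = TRUE))
  v
}

vars <- c("x", "y", "z")

# Work on a copy so the original data stay available for comparison
DF2 <- DF1
DF2[vars] <- lapply(DF1[vars], impute_mean)

# Standard deviations of the observed values vs. after imputation
sd_before <- sapply(DF1[vars], sd, na.rm = TRUE)
sd_after  <- sapply(DF2[vars], sd)

print(DF2)
print(rbind(before = sd_before, after = sd_after))
```

For y and z the "after" standard deviation comes out smaller than
the "before" one, which is the underestimate the quote warns about;
x has no missing values and is unchanged.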


best, Fraser

________________________________________
From: Val [valkr...@gmail.com]
Sent: Wednesday, April 26, 2017 8:45 PM
To: r-help@R-project.org (r-help@r-project.org)
Subject: [R] missing and replace

Hi all,

I have a data frame with three variables. Some of the variables
have missing values (represented by NA), and I want to replace them
with the mean value of that variable. In this sample data, variables
y and z have missing values. The mean values of y and z are 152.25
and 359.5, respectively. I want to replace those missing values with
the respective mean value (rounded to the nearest whole number).

DF1 <- read.table(header=TRUE, text='ID1 x y z
1  25  122    352
2  30  135    376
3  40   NA    350
4  26  157    NA
5  60  195    360')
mean x = 36.2
mean y = 152.25
mean z = 359.5

Desired output:
ID1  x  y  z
1   25 122   352
2   30 135   376
3   40 152   350
4   26 157   360
5   60 195   360


Thank you in advance

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

