On Sep 1, 2010, at 9:20 AM, Chris Howden wrote:
Hi everyone,
I’m looking for a clever bit of code to replace NA’s with a specific score
depending on an indicator variable.
I can see how to do it using lots of if statements but I’m sure there must
be a neater, better way of doing it.
Any ideas at all will be much appreciated, I’m dreading coding up all those
if statements!!!!!
My problem is as follows:
I have a data set with lots of missing data:
EG Raw Data Set
Category variable1 variable2 variable3
1        5         NA        NA
1        NA        3         4
2        NA        7         NA
This does not do its work by category (since I got tired of fixing
mangled htmlized datasets) but it seems to me that a tapply "wrap"
could do either of these operations within categories:
> egraw
Category variable1 variable2 variable3
1 1 5 NA NA
2 1 NA 3 4
3 2 NA 7 NA
> lapply(egraw, function(x) {
      mnx <- mean(x, na.rm=TRUE)
      sapply(x, function(z) if (is.na(z)) {mnx} else {z})
  })
$Category
[1] 1 1 2
$variable1
[1] 5 5 5
$variable2
[1] 5 3 7
$variable3
[1] 4 4 4
> sapply(egraw, function(x) {
      mnx <- mean(x, na.rm=TRUE)
      sapply(x, function(z) if (is.na(z)) {mnx} else {z})
  })
Category variable1 variable2 variable3
[1,] 1 5 5 4
[2,] 1 5 3 4
[3,] 2 5 7 4
etc
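One way the per-category version might look is with ave() rather than an explicit tapply wrapper, since ave() applies a function within groups and returns a vector of the same length as its input. This is only a sketch: egraw is rebuilt here from the printout above, and impute_mean is a made-up helper name, not something from the thread.

```r
# Rebuild the example data from the printout (columns assumed to be numeric)
egraw <- data.frame(Category  = c(1, 1, 2),
                    variable1 = c(5, NA, NA),
                    variable2 = c(NA, 3, 7),
                    variable3 = c(NA, 4, NA))

# Hypothetical helper: replace the NAs in a vector with the mean of its
# observed values
impute_mean <- function(z) {
  z[is.na(z)] <- mean(z, na.rm = TRUE)
  z
}

# ave() runs impute_mean separately within each Category and puts the
# results back in the original row order
vars <- c("variable1", "variable2", "variable3")
imputed <- egraw
imputed[vars] <- lapply(egraw[vars], function(x) {
  ave(x, egraw$Category, FUN = impute_mean)
})
imputed
```

With only these three rows, Category 2 has no observed values for variable1 or variable3, so those cells come back as NaN rather than a number; on a fuller data set each NA would pick up its own category's mean.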
Now I want to replace the NA’s with the average for each category, so if
these averages were:
EG Averages
Category variable1 variable2 variable3
1        4.5       3.2       2.5
2        3.5       7.4       5.9
So I’d like my data set to look like the following once I’ve replaced the
NA’s with the appropriate category average:
EG Imputed Data Set
Category variable1 variable2 variable3
1        5         3.2       2.5
1        4.5       3         4
2        3.5       7         5.9
etc
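If the category averages are already sitting in their own table, as in the example above, one possible approach is to look up each row's category with match() and fill only the NA cells from that table. A minimal sketch, assuming the averages are held in a data frame called avgs (a hypothetical name) with the same column names as the raw data:

```r
# Raw data and category averages, typed in from the example tables
egraw <- data.frame(Category  = c(1, 1, 2),
                    variable1 = c(5, NA, NA),
                    variable2 = c(NA, 3, 7),
                    variable3 = c(NA, 4, NA))
avgs <- data.frame(Category  = c(1, 2),
                   variable1 = c(4.5, 3.5),
                   variable2 = c(3.2, 7.4),
                   variable3 = c(2.5, 5.9))

# For each row of egraw, the row of avgs that holds its category
idx <- match(egraw$Category, avgs$Category)

# Fill only the missing cells, column by column, from the matching row
imputed <- egraw
for (v in c("variable1", "variable2", "variable3")) {
  miss <- is.na(egraw[[v]])
  imputed[[v]][miss] <- avgs[[v]][idx[miss]]
}
imputed
```

For the three example rows this should reproduce the imputed data set shown above, and no if statements are needed. Observed values are left untouched because only the positions flagged by is.na() are assigned.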
Any ideas would be very much appreciated!!!!!
You might add reading the Posting Guide and setting up your reader to
post in plain text to your TODO list.
Thank you
Chris Howden
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.