Hi there, I am looking for some help replacing missing values in R with the row mean. This is survey data, and I am trying to impute the missing values in each set of questions separately, using the mean of the scores for the other questions within that set.
I have a dataset that looks like this:

ID A1 A2 A3 B1 B2 B3 C1 C2 C3 C4
b   4  5 NA  2 NA  4  5  1  3 NA
c   4  5  1 NA  3  4  5  1  3  2
d  NA  5  1  1 NA  4  5  1  3  2
e   4  5  4  5 NA  4  5  1  3  2

I want to replace any NAs in columns A1:A3 with the row mean for those columns only. So for ID=b, I want the NA in A3[ID=b] to be (4+5)/2, which is the average of the values in A1 and A2 for that row.

The same goes for columns B1:B3: I want the NA in B2[ID=b] to be the mean of the values of B1 and B3 in row ID=b, so that B2[ID=b] becomes 3, which is (2+4)/2.

And the same in C1:C4: I want C4[ID=b] to become (5+1+3)/3, which is the mean of C1:C3. Then I want to go to row ID=c and do the same thing, and so on.

Can anybody help me do this? I have tried using rowMeans and subsetting but can't figure out the right code to do it.

Thanks so much.
Zahra

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
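One possible way to do the per-set imputation described above is sketched here in base R, using rowMeans with na.rm = TRUE on each block of columns. The helper name impute_row_mean and the list sets are hypothetical names introduced for illustration, and the grouping of columns into sets A, B and C is assumed from the column names in the example. Treat it as a minimal sketch under those assumptions rather than a definitive solution.

```r
# Rebuild the example data from the post; read.table reads the "NA" strings as NA.
dat <- read.table(header = TRUE, text = "
ID A1 A2 A3 B1 B2 B3 C1 C2 C3 C4
b   4  5 NA  2 NA  4  5  1  3 NA
c   4  5  1 NA  3  4  5  1  3  2
d  NA  5  1  1 NA  4  5  1  3  2
e   4  5  4  5 NA  4  5  1  3  2
")

# Hypothetical helper: within the columns named in `cols`, replace each NA
# with the mean of the non-missing values in the same row and same columns.
impute_row_mean <- function(df, cols) {
  block <- as.matrix(df[cols])
  # Row means over this block only, ignoring NAs.
  row_means <- rowMeans(block, na.rm = TRUE)
  # (row, col) positions of the missing cells.
  missing <- which(is.na(block), arr.ind = TRUE)
  # Fill each missing cell with the mean of its own row.
  block[missing] <- row_means[missing[, "row"]]
  df[cols] <- as.data.frame(block)
  df
}

# Assumed question sets, taken from the column names in the example.
sets <- list(
  A = c("A1", "A2", "A3"),
  B = c("B1", "B2", "B3"),
  C = c("C1", "C2", "C3", "C4")
)

# Impute each set separately so that means never mix across sets.
for (cols in sets) {
  dat <- impute_row_mean(dat, cols)
}

print(dat)
```

Note that if every question in a set is missing for a respondent, rowMeans returns NaN for that row, so those cells end up as NaN instead of being imputed; such rows would need separate handling.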