[R] better way of recoding factors in data frame?

mohinder_datta Thu, 09 Apr 2009 06:57:12 -0700

Hi all,

I apologize in advance for the length of this post, but I wanted to make sure I 
was clear.


I am trying to merge two data frames that share a number of rows (though some 
rows are unique to each data frame). Each row represents a subject in a study. 
The problem is that sex is coded differently in the two, including the way 
missing values are represented.

Here is an example of the merged data frame:

> myFrame2
   SubjCode SubjSex          Sex
1      sub1       M         <NA>
2      sub2       F         <NA>
3      sub3       M         Male
4      sub4       M         <NA>
5      sub5       F         <NA>
6      sub6       F       Female
7      sub7                 <NA>
8      sub8                 <NA>
9      sub9         Not Recorded
10    sub10         Not Recorded

I then apply the following:

> myFrame2$SubjSex <- factor(myFrame2$SubjSex, levels = c('M','F'))
> myFrame2$SubjSex <- factor(myFrame2$SubjSex, labels = c('Male','Female'))
> myFrame2 <- transform(myFrame2, newSex = ifelse(is.na(SubjSex), Sex, SubjSex))

...and get this:
> myFrame2
   SubjCode SubjSex          Sex newSex
1      sub1    Male         <NA>      1
2      sub2  Female         <NA>      2
3      sub3    Male         Male      1
4      sub4    Male         <NA>      1
5      sub5  Female         <NA>      2
6      sub6  Female       Female      2
7      sub7    <NA>         <NA>     NA
8      sub8    <NA>         <NA>     NA
9      sub9    <NA> Not Recorded      3
10    sub10    <NA> Not Recorded      3
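
A note on where the 1/2/3 come from: ifelse() drops the factor attributes and returns the underlying integer codes, and those codes come from two different factors. The sketch below is only an illustration on toy vectors (not the real data), and it assumes Sex is a factor with the levels Female, Male, Not Recorded in that order. In the rows shown above, no subject has SubjSex missing while Sex is Male or Female, so the mismatch would not be visible there. Converting to character before combining is one possible way around it.

```r
# Toy vectors standing in for the two columns (hypothetical values)
SubjSex <- factor(c("M", "F", "", ""),
                  levels = c("M", "F"), labels = c("Male", "Female"))
Sex <- factor(c(NA, NA, "Male", "Not Recorded"),
              levels = c("Female", "Male", "Not Recorded"))  # assumed level order

# ifelse() returns integer codes, each taken from its own factor:
# the third element is "Male" in Sex, whose code there is 2,
# which reads as "Female" in the SubjSex coding
codes <- ifelse(is.na(SubjSex), Sex, SubjSex)

# Combining the labels as character strings keeps them comparable
combined <- ifelse(is.na(SubjSex), as.character(Sex), as.character(SubjSex))
```

Here codes mixes the two codings, while combined holds the labels themselves and can be turned into a single factor afterwards.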

I need that last column to have just 1 (Male), 2 (Female) or 0 (Missing), and 
the only way I've come up with seems very kludgy:

> myFrame2$newSex[is.na(myFrame2$newSex)] <- 0
> myFrame2$newSex <- ifelse(myFrame2$newSex == 3, 0, myFrame2$newSex)

That gives me the right values for "newSex", but I'd like to positively select 
for the values I want to keep, rather than negatively selecting the ones to 
change - I tried this:

> myFrame2$newSex <- ifelse(myFrame2$newSex == 1 || myFrame2$newSex == 2, myFrame2$newSex, 0)

But I just get 1 for every row in newSex. Does anyone know of a way to do this 
by positively selecting the values 1 and 2?
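
For what it's worth, here is a minimal sketch of a positive selection, assuming newSex holds the values 1, 2, 3 and NA as in the output above. The || operator is not element-wise: it looks only at the first element of each side (and recent versions of R raise an error when given longer vectors), which would explain a single value being recycled to every row. %in% tests each element separately and returns FALSE for NA. The vector name below is a stand-in for myFrame2$newSex.

```r
# Toy vector standing in for myFrame2$newSex after the transform() step
newSex <- c(1, 2, 1, 1, 2, 2, NA, NA, 3, 3)

# %in% is evaluated per element and is FALSE for NA,
# so everything that is not 1 or 2 (including NA) becomes 0
recoded <- ifelse(newSex %in% c(1, 2), newSex, 0)
recoded
```

The same expression could be applied directly to the data frame column; the element-wise | operator would also work in place of ||, but it yields NA for the missing rows, so they would still need separate handling.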


Thanks,
Mohinder






______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
