Dear R-List, 

I would like to recode categorical variables into binary data, so that all 
values above the median are coded 1 and all values below it are coded 0, 
splitting each variable into two equally large groups (e.g. good performers = 0 
vs. bad performers = 1).

So far I have not found a nice way to do this in R. I thought there might be 
a better way than ordering each column and recoding the first 50% as 0 and the 
second 50% as 1. If I use ifelse, I have a problem with tied cases that all 
fall exactly on the median. 

e.g.
df <- data.frame(snr    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 k1     = c(1, 1, 4, 2, 3, 2, 2, 5, 2, 2),
                 k2     = c(1, 2, 3, 2, 1, 2, 1, 3, 3, 2),
                 result = c(4, 3, 5, 4, 2, 6, 4, 4, 2, 3))

Now I want to recode k1 and k2 so that half of the values are recoded 0 and 
half are recoded 1, split around the median. The median of k1 is 2, which 
would lead to unequal group sizes if I used 2 as the cutoff, so the cases with 
k1 = 2 should be assigned 1 or 0 at random until both categories have the same 
size.

something like

df.rec <- data.frame(snr    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                     k1     = c(0, 0, 1, 0, 1, 1, 0, 1, 0, 1),
                     k2     = c(0, 1, 1, 0, 0, 1, 0, 1, 1, 0),
                     result = c(4, 3, 5, 4, 2, 6, 4, 4, 2, 3))
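
A rough sketch of the kind of thing I am after, in case it helps to make the 
question concrete. It assumes that breaking ties at random with 
rank(ties.method = "random") is acceptable and that there are no missing 
values; the helper name median_split is made up for illustration, and I am not 
sure this is the cleanest way to do it.

```r
# Hypothetical helper (name is an assumption, not an existing function):
# split a vector into a lower and an upper half of equal size,
# breaking ties at random. Assumes x has no NA values.
median_split <- function(x) {
  r <- rank(x, ties.method = "random")  # ranks 1..n, tied values ordered randomly
  as.integer(r > length(x) / 2)         # upper half -> 1, lower half -> 0
}

# Same example data as above, repeated so the snippet runs on its own
df <- data.frame(snr    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 k1     = c(1, 1, 4, 2, 3, 2, 2, 5, 2, 2),
                 k2     = c(1, 2, 3, 2, 1, 2, 1, 3, 3, 2),
                 result = c(4, 3, 5, 4, 2, 6, 4, 4, 2, 3))

set.seed(1)  # only to make the random tie-breaking reproducible
df.rec <- df
df.rec[c("k1", "k2")] <- lapply(df[c("k1", "k2")], median_split)

table(df.rec$k1)  # counts of 0s and 1s in the recoded k1
table(df.rec$k2)  # counts of 0s and 1s in the recoded k2
```

With an even number of rows this should give two groups of the same size; with 
an odd number of rows the group coded 1 would end up one case larger. Which of 
the tied cases land in which group depends on the seed.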

Can anyone help?

Thank you in advance.

Best wishes.
Alain