[R] recode data according to quantile breaks
Dear R-List,

I would like to recode my data according to quantile breaks, i.e. all data within the range of 0%-25% should get a 1, 25%-50% a 2, etc. Is there a nice way to do this with all columns in a data frame?

e.g.

    df <- data.frame(id=c("x01","x02","x03","x04","x05","x06"),
                     a=c(1,2,3,4,5,6),
                     b=c(2,4,6,8,10,12),
                     c=c(1,3,9,12,15,18))
    df
       id a  b  c
    1 x01 1  2  1
    2 x02 2  4  3
    3 x03 3  6  9
    4 x04 4  8 12
    5 x05 5 10 15
    6 x06 6 12 18

    # I can do it in a very complicated way
    apply(df[-1], 2, quantile)
           a    b    c
    0%   1.0  2.0  1.0
    25%  2.2  4.5  4.5
    50%  3.5  7.0 10.5
    75%  4.8  9.5 14.2
    100% 6.0 12.0 18.0

    # then
    df$a[df$a <= 2.2] <- 1
    ...

    # result should be
    df.breaks
    id  a b c
    x01 1 1 1
    x02 1 1 1
    x03 2 2 2
    x04 3 3 3
    x05 4 4 4
    x06 4 4 4

But there must be a way to do it more elegantly, something like

    df.breaks <- apply(df[-1], 2, recode.by.quantile)

Can anyone help me with this?

Best wishes
Alain

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] recode data according to quantile breaks
Hi Alain,

The following should get you started:

    apply(df[,-1], 2, function(x) cut(x, breaks = quantile(x), include.lowest = TRUE, labels = 1:4))

Check ?cut and ?apply for more information.

HTH,
Jorge.-

On Tue, Feb 19, 2013 at 9:01 PM, D. Alain wrote:

> Dear R-List, I would like to recode my data according to quantile breaks,
> i.e. all data within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
> [...]
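A minimal sketch of how the cut() suggestion above could be wrapped up so that the id column is kept and the codes come back as integers. The helper name recode.by.quantile is simply the name wished for in the original post, not an existing function, and labels = FALSE is used here (instead of labels = 1:4) so that cut() returns integer codes rather than a factor. This is one possible arrangement, not the only one.

```r
# Example data from the original post
df <- data.frame(
  id = c("x01", "x02", "x03", "x04", "x05", "x06"),
  a  = c(1, 2, 3, 4, 5, 6),
  b  = c(2, 4, 6, 8, 10, 12),
  c  = c(1, 3, 9, 12, 15, 18),
  stringsAsFactors = FALSE
)

# Hypothetical helper (name taken from the question):
# bins x at its own quartiles; include.lowest = TRUE puts min(x) in bin 1,
# labels = FALSE makes cut() return integer codes 1..4
recode.by.quantile <- function(x) {
  cut(x, breaks = quantile(x), include.lowest = TRUE, labels = FALSE)
}

# Recode every column except the first (id), keeping the data frame shape
df.breaks <- df
df.breaks[-1] <- lapply(df[-1], recode.by.quantile)
df.breaks
```

Note that cut() stops with an error when the breaks are not unique, which can happen when a column has many tied values, so this sketch assumes the quartiles of each column are distinct.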
Re: [R] recode data according to quantile breaks
Hi Alain,

Try this:

    df.breaks <- data.frame(id = df[,1],
                            sapply(df[,-1], function(x) findInterval(x, quantile(x), rightmost.closed = TRUE)),
                            stringsAsFactors = FALSE)
    df.breaks
    #   id a b c
    #1 x01 1 1 1
    #2 x02 1 1 1
    #3 x03 2 2 2
    #4 x04 3 3 3
    #5 x05 4 4 4
    #6 x06 4 4 4

A.K.

----- Original Message -----
From: D. Alain dialva...@yahoo.de
To: Mailinglist R-Project r-help@r-project.org
Cc:
Sent: Tuesday, February 19, 2013 5:01 AM
Subject: [R] recode data according to quantile breaks

> Dear R-List, I would like to recode my data according to quantile breaks,
> i.e. all data within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
> [...]
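One point that may matter when choosing between the two suggestions in this thread: cut() treats its intervals as closed on the right by default, while findInterval() treats them as closed on the left. For the example data no value coincides with an interior quartile, so both give the same codes. The sketch below uses a made-up vector x (not from the thread) whose values fall exactly on the quartiles, to show where the two can differ.

```r
# Hypothetical vector whose values coincide with its own quartiles
x <- c(1, 2, 3, 4, 5)
q <- quantile(x)   # 0%, 25%, 50%, 75%, 100% -> 1, 2, 3, 4, 5

# cut(): bins are [1,2], (2,3], (3,4], (4,5]
by.cut <- cut(x, breaks = q, include.lowest = TRUE, labels = FALSE)

# findInterval(): bins are [1,2), [2,3), [3,4), [4,5]
by.findInterval <- findInterval(x, q, rightmost.closed = TRUE)

by.cut            # 1 1 2 3 4
by.findInterval   # 1 2 3 4 4
```

Which convention is wanted depends on whether a value sitting exactly on a quartile should count as belonging to the lower or the upper group.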