[R] recode data according to quantile breaks

2013-02-19 Thread D. Alain
Dear R-List, 

I would like to recode my data according to quantile breaks, i.e. all data 
within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
Is there a nice way to do this for all columns in a data frame?

e.g.

df <- data.frame(id=c("x01","x02","x03","x04","x05","x06"),a=c(1,2,3,4,5,6),b=c(2,4,6,8,10,12),c=c(1,3,9,12,15,18))

df
   id a  b  c
1 x01 1  2  1
2 x02 2  4  3
3 x03 3  6  9
4 x04 4  8 12
5 x05 5 10 15
6 x06 6 12 18

#I can do it in a very complicated way


apply(df[-1],2,quantile)
       a    b    c
0%   1.0  2.0  1.0
25%  2.2  4.5  4.5
50%  3.5  7.0 10.5
75%  4.8  9.5 14.2
100% 6.0 12.0 18.0

#then 

df$a[df$a<=2.2]<-1
...

#result should be


df.breaks

 id a b c
x01 1 1 1
x02 1 1 1
x03 2 2 2
x04 3 3 3
x05 4 4 4
x06 4 4 4



But there must be a way to do it more elegantly, something like


df.breaks <- apply(df[-1],2,recode.by.quantile)

Can anyone help me with this?


Best wishes 


Alain      

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode data according to quantile breaks

2013-02-19 Thread Jorge I Velez
Hi Alain,

The following should get you started:

apply(df[,-1], 2, function(x) cut(x, breaks = quantile(x), include.lowest =
TRUE, labels = 1:4))

Check ?cut and ?apply for more information.

HTH,
Jorge.-
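
As a rough sketch of how the cut() suggestion above could be wrapped into the helper the question asks for: the name recode.by.quantile is taken from the question and is hypothetical, not an existing R function. The sketch assumes labels = FALSE (so cut() returns integer codes rather than a factor) and uses lapply() instead of apply() so the result can go straight back into a data frame.

```r
# Example data from the original post.
df <- data.frame(id = c("x01", "x02", "x03", "x04", "x05", "x06"),
                 a = c(1, 2, 3, 4, 5, 6),
                 b = c(2, 4, 6, 8, 10, 12),
                 c = c(1, 3, 9, 12, 15, 18))

# Hypothetical helper (name borrowed from the question):
# returns the quartile group 1..4 of each value as an integer code.
recode.by.quantile <- function(x) {
  cut(x, breaks = quantile(x), include.lowest = TRUE, labels = FALSE)
}

# lapply() processes one column at a time and returns a list of vectors,
# which data.frame() expands into the columns a, b and c.
df.breaks <- data.frame(id = df$id, lapply(df[-1], recode.by.quantile))
df.breaks
```

The reason for avoiding apply() here is that it coerces factor results to a character matrix, so the version with labels = 1:4 yields "1", "2", ... as strings. Note also that cut() stops with an error when quantile() returns duplicated breaks, which can happen for columns with many tied values.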





Re: [R] recode data according to quantile breaks

2013-02-19 Thread arun
Hi Alain,

Try this:
df.breaks <- data.frame(id=df[,1],sapply(df[,-1],function(x)
findInterval(x,quantile(x),rightmost.closed=TRUE)),stringsAsFactors=FALSE)
df.breaks
#   id a b c
#1 x01 1 1 1
#2 x02 1 1 1
#3 x03 2 2 2
#4 x04 3 3 3
#5 x05 4 4 4
#6 x06 4 4 4
A.K.
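
The cut() and findInterval() approaches in this thread agree on the example data, but they may differ when a value falls exactly on an interior quantile break: cut() by default closes its intervals on the right, findInterval() on the left. A small sketch of that difference, using a toy vector that is not from the thread (the variable names are only illustrative):

```r
# Toy vector (an assumption, not thread data) in which every value
# coincides with one of the quantile breaks.
x <- c(1, 2, 3, 4, 5)
breaks <- quantile(x)   # 0%, 25%, 50%, 75%, 100%

# cut(): intervals are (lower, upper]; include.lowest = TRUE
# additionally puts the minimum into the first interval.
by.cut <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# findInterval(): intervals are [lower, upper); rightmost.closed = TRUE
# additionally puts the maximum into the last interval.
by.findInterval <- findInterval(x, breaks, rightmost.closed = TRUE)

# Compare the two codings side by side.
rbind(by.cut, by.findInterval)
```

In the thread's data frame no value equals an interior break, which is why both answers reproduce the requested result there. Which convention is wanted for boundary values is a choice the data owner has to make.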


