Re: [R] Wide to long form conversion
On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Dennis, Yes, I was wrong! The variables should be: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F/J: within-subject factor (2 levels: F / J) 5) Time: within-subject factor (2 levels: 1 / 2) 6) Value So your mData3 is what I was looking for. Thanks a lot! Gang On Thu, Oct 6, 2011 at 9:09 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) I don't see how you can get all of 3-5 given the way your data is structured. The problem is that each column contains two of the three variables you want, but not all three. I can see a way to get Subj Group Ref Time F J S1 s Me 1 4 5 S1 s Me 2 3 6 S1 s She 1 6 10 S1 s She 2 6 9 or an 8 line version with Ref (4 Me, 4 She), Factor (F1, J1, F2, J2) repeated twice and the appropriate response vector, but not a way where you have three columns for Ref, F and J. For example, what is the 'J' for MeF1 or the F for SheJ2? With that, here are a few stabs using the reshape2 package. The first step is to do a little renaming of your data frame so that one can use the colsplit() function to generate a new set of variables. names(myData) - c(Group, Me_F_1, Me_J_1, She_F_1, She_J_1, Me_F_2, Me_J_2, She_F_2, She_J_2, Subj) library('plyr') library('reshape2') # collapses the eight columns to be reshaped into a factor named # variable with a corresponding variable named value mData - melt(myData, id = c('Subj', 'Group')) head(mData) # Split the original variables into three new columns, named # Ref, Var and Time, respectively: newvars - colsplit(mData$variable, '_', c('Ref', 'Var', 'Time')) # Append these to the melted data frame and remove 'variable' mData2 - cbind(mData, newvars)[, -3] # This comes closest to your original intent: mData3 - arrange(mData2, Subj, Ref, Var, Time) head(mData3, 8) Subj Group value Ref Var Time 1 S1 s 4 Me F 1 2 S1 s 3 Me F 2 3 S1 s 5 Me J 1 4 S1 s 6 Me J 2 5 S1 s 6 She F 1 6 S1 s 6 She F 2 7 S1 s 10 She J 1 8 S1 s 9 She J 2 # Some rearrangements to consider: mData4 - cast(mData3, Subj + Group + Ref + Time ~ Var, value_var = 'value') head(mData4, 4) Subj Group Ref Time F J 1 S1 s Me 1 4 5 2 S1 s Me 2 3 6 3 S1 s She 1 6 10 4 S1 s She 2 6 9 mData5 - cast(mData3, Subj + Group + Ref + Var ~ Time, value_var = 'value') head(mData5, 4) Subj Group Ref Var 1 2 1 S1 s Me F 4 3 2 S1 s Me J 5 6 3 S1 s She F 6 6 4 S1 s She J 10 9 If you like this one, it's probably a good idea to rename the last two columns 'Time1' and 'Time2' or something similar. HTH, Dennis On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen gangch...@gmail.com wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s 4 5 6 10 3 6 6 9 S1 2 s 6 5 5 6 4 3 5 6 S2 3 s 7 4 6 5 7 4 5 3 S3 4 s 8 5 8 7 7 1 8 6 S4 5 s 10 6 4 7 9 6 4 6 S5 6 s 5 2 4 7 4 1 4 2 S6 7 s 13 2 10 4 11 2 4 3 S7 8 s 8 1 3 11 6 0 3 10 S8 9 s 6 9 5 8 6 8 5 6 S9 10 s 14 5 6 10 13 5 5 10 S10 11 s 15 2 18 2 14 1 18 2 S11 12 s 6 9 4 9 5 11 3 8 S12 13 s 5 5 0 12 4 3 0 8 S13 14 s 5 6 4 9 4 6 2 6 S14 15 s 14 5 12 3 12 3 11 3 S15 16 s 7 2 11 3 5 2 10 2 S16 17 s 1 7 4 5 1 6 3 5 S17 18 s 6 2 7 4 6 2 7
Re: [R] Wide to long form conversion
Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1S1 s 4 Me F1 2S1 s 3 Me F2 3S1 s 5 Me J1 4S1 s 6 Me J2 5S1 s 6 She F1 6S1 s 6 She F2 7S1 s10 She J1 8S1 s 9 She J2 On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
On Oct 7, 2011, at 7:40 AM, Gang Chen wrote: Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1S1 s 4 Me F1 2S1 s 3 Me F2 3S1 s 5 Me J1 4S1 s 6 Me J2 5S1 s 6 She F1 6S1 s 6 She F2 7S1 s10 She J1 8S1 s 9 She J2 I was not able to construct a one step solution with `reshape` that will contains all the columns. You can do it in about 4 steps by first making the data long and then adding annotation columns. Using just rows 1 and 26 you might get: reshape(myData[c(1,26), ], idvar=c(Group,Subj), direction=long, varying=2:9, v.names=c(value) ) Group Subj time value s.S1.1 s S11 4 w.S26.1 w S261 5 s.S1.2 s S12 5 w.S26.2 w S262 9 s.S1.3 s S13 6 w.S26.3 w S263 4 s.S1.4 s S14 10 w.S26.4 w S264 7 s.S1.5 s S15 3 w.S26.5 w S265 3 s.S1.6 s S16 6 w.S26.6 w S266 7 s.S1.7 s S17 6 w.S26.7 w S267 3 s.S1.8 s S18 9 w.S26.8 w S268 5 The 'time' variable is not really what you wanted but refers to the sequence along the original wide column names You can add the desired Ref, Var and Time columms with these constructions: str(times-rep(c(1,2), length=nrow(myData)*8 ) ) num [1:408] 1 2 1 2 1 2 1 2 1 2 ... str(times-rep(c(F,J), each=2, length=nrow(myData)*8 ) ) chr [1:408] F F J J F F J J F F ... str(times-rep(c(Me,She), each=4, length=nrow(myData)*8 ) ) chr [1:408] Me Me Me Me She She She She ... longData - reshape(myData, idvar=c(Group,Subj), ction=long, varying=2:9, v.names=c(value) ) longData$Time - rep(c(1,2), length=nrow(myData)*8 ) longData$Var - rep(c(F,J), each=2, length=nrow(myData)*8 ) longData$Ref - rep(c(Me,She), each=4, length=nrow(myData)*8 ) longData - longData[order(longData$Subj), ] Looking at Jim Lemon's response, I think he just misinterpreted the structure of your data but gave you a perfectly usable response. You could have done much the same thing with a minor modification: str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE))) 'data.frame': 408 obs. of 4 variables: $ Group : Factor w/ 2 levels s,w: 1 1 1 1 1 1 1 1 1 1 ... $ Subj : Factor w/ 51 levels S1,S10,S11,..: 1 12 23 34 45 48 49 50 51 2 ... $ group1: Factor w/ 8 levels Me.F.1,Me.F.2,..: 1 1 1 1 1 1 1 1 1 1 ... $ value1: int 4 6 7 8 10 5 13 8 6 14 ... Now you can just split apart the 'group1' column with sub() to make the three specified columns. -- David. On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
On Oct 7, 2011, at 1:30 PM, David Winsemius wrote: On Oct 7, 2011, at 7:40 AM, Gang Chen wrote: Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1S1 s 4 Me F1 2S1 s 3 Me F2 3S1 s 5 Me J1 4S1 s 6 Me J2 5S1 s 6 She F1 6S1 s 6 She F2 7S1 s10 She J1 8S1 s 9 She J2 I was not able to construct a one step solution with `reshape` that will contains all the columns. You can do it in about 4 steps by first making the data long and then adding annotation columns. Using just rows 1 and 26 you might get: reshape(myData[c(1,26), ], idvar=c(Group,Subj), direction=long, varying=2:9, v.names=c(value) ) Group Subj time value s.S1.1 s S11 4 w.S26.1 w S261 5 s.S1.2 s S12 5 w.S26.2 w S262 9 s.S1.3 s S13 6 w.S26.3 w S263 4 s.S1.4 s S14 10 w.S26.4 w S264 7 s.S1.5 s S15 3 w.S26.5 w S265 3 s.S1.6 s S16 6 w.S26.6 w S266 7 s.S1.7 s S17 6 w.S26.7 w S267 3 s.S1.8 s S18 9 w.S26.8 w S268 5 The 'time' variable is not really what you wanted but refers to the sequence along the original wide column names You can add the desired Ref, Var and Time columms with these constructions: str(times-rep(c(1,2), length=nrow(myData)*8 ) ) num [1:408] 1 2 1 2 1 2 1 2 1 2 ... str(times-rep(c(F,J), each=2, length=nrow(myData)*8 ) ) chr [1:408] F F J J F F J J F F ... str(times-rep(c(Me,She), each=4, length=nrow(myData)*8 ) ) chr [1:408] Me Me Me Me She She She She ... It occured to me that the ordering operation probably should have preceded teh ancillary column creation so this method is tested: longData - reshape(myData, idvar=c(Group,Subj), direction=long,#fixed the direction argument varying=2:9, v.names=c(value) ) longData - longData[order(longData$Subj), ] longData$Time - rep(c(1,2), length=nrow(myData)*8 ) longData$Var - rep(c(F,J), each=2, length=nrow(myData)*8 ) longData$Ref - rep(c(Me,She), each=4, length=nrow(myData)*8 ) Group Subj time value Time Var Ref s.S1.1 s S11 41 F Me s.S1.2 s S12 52 F Me s.S1.3 s S13 61 J Me s.S1.4 s S14102 J Me s.S1.5 s S15 31 F She s.S1.6 s S16 62 F She s.S1.7 s S17 61 J She s.S1.8 s S18 92 J She Looking at Jim Lemon's response, I think he just misinterpreted the structure of your data but gave you a perfectly usable response. You could have done much the same thing with a minor modification: str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE))) 'data.frame': 408 obs. of 4 variables: $ Group : Factor w/ 2 levels s,w: 1 1 1 1 1 1 1 1 1 1 ... $ Subj : Factor w/ 51 levels S1,S10,S11,..: 1 12 23 34 45 48 49 50 51 2 ... $ group1: Factor w/ 8 levels Me.F.1,Me.F.2,..: 1 1 1 1 1 1 1 1 1 1 ... $ value1: int 4 6 7 8 10 5 13 8 6 14 ... Now you can just split apart the 'group1' column with sub() to make the three specified columns. Lemon's method has the advantage that it properly carries along the column information -- David. On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
David, thanks a lot for the code! I've learned quite a bit from all the generous help... Gang On Fri, Oct 7, 2011 at 1:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Oct 7, 2011, at 1:30 PM, David Winsemius wrote: On Oct 7, 2011, at 7:40 AM, Gang Chen wrote: Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1 S1 s 4 Me F 1 2 S1 s 3 Me F 2 3 S1 s 5 Me J 1 4 S1 s 6 Me J 2 5 S1 s 6 She F 1 6 S1 s 6 She F 2 7 S1 s 10 She J 1 8 S1 s 9 She J 2 I was not able to construct a one step solution with `reshape` that will contains all the columns. You can do it in about 4 steps by first making the data long and then adding annotation columns. Using just rows 1 and 26 you might get: reshape(myData[c(1,26), ], idvar=c(Group,Subj), direction=long, varying=2:9, v.names=c(value) ) Group Subj time value s.S1.1 s S1 1 4 w.S26.1 w S26 1 5 s.S1.2 s S1 2 5 w.S26.2 w S26 2 9 s.S1.3 s S1 3 6 w.S26.3 w S26 3 4 s.S1.4 s S1 4 10 w.S26.4 w S26 4 7 s.S1.5 s S1 5 3 w.S26.5 w S26 5 3 s.S1.6 s S1 6 6 w.S26.6 w S26 6 7 s.S1.7 s S1 7 6 w.S26.7 w S26 7 3 s.S1.8 s S1 8 9 w.S26.8 w S26 8 5 The 'time' variable is not really what you wanted but refers to the sequence along the original wide column names You can add the desired Ref, Var and Time columms with these constructions: str(times-rep(c(1,2), length=nrow(myData)*8 ) ) num [1:408] 1 2 1 2 1 2 1 2 1 2 ... str(times-rep(c(F,J), each=2, length=nrow(myData)*8 ) ) chr [1:408] F F J J F F J J F F ... str(times-rep(c(Me,She), each=4, length=nrow(myData)*8 ) ) chr [1:408] Me Me Me Me She She She She ... It occured to me that the ordering operation probably should have preceded teh ancillary column creation so this method is tested: longData - reshape(myData, idvar=c(Group,Subj), direction=long, #fixed the direction argument varying=2:9, v.names=c(value) ) longData - longData[order(longData$Subj), ] longData$Time - rep(c(1,2), length=nrow(myData)*8 ) longData$Var - rep(c(F,J), each=2, length=nrow(myData)*8 ) longData$Ref - rep(c(Me,She), each=4, length=nrow(myData)*8 ) Group Subj time value Time Var Ref s.S1.1 s S1 1 4 1 F Me s.S1.2 s S1 2 5 2 F Me s.S1.3 s S1 3 6 1 J Me s.S1.4 s S1 4 10 2 J Me s.S1.5 s S1 5 3 1 F She s.S1.6 s S1 6 6 2 F She s.S1.7 s S1 7 6 1 J She s.S1.8 s S1 8 9 2 J She Looking at Jim Lemon's response, I think he just misinterpreted the structure of your data but gave you a perfectly usable response. You could have done much the same thing with a minor modification: str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE))) 'data.frame': 408 obs. of 4 variables: $ Group : Factor w/ 2 levels s,w: 1 1 1 1 1 1 1 1 1 1 ... $ Subj : Factor w/ 51 levels S1,S10,S11,..: 1 12 23 34 45 48 49 50 51 2 ... $ group1: Factor w/ 8 levels Me.F.1,Me.F.2,..: 1 1 1 1 1 1 1 1 1 1 ... $ value1: int 4 6 7 8 10 5 13 8 6 14 ... Now you can just split apart the 'group1' column with sub() to make the three specified columns. Lemon's method has the advantage that it properly carries along the column information -- David. On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Take a look here. http://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format Andrew Miles Department of Sociology Duke University On Oct 6, 2011, at 4:28 PM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s45 61036 6 9 S1 2 s65 5 643 5 6 S2 3 s74 6 574 5 3 S3 4 s85 8 771 8 6 S4 5 s 106 4 796 4 6 S5 6 s52 4 741 4 2 S6 7 s 13210 4 112 4 3 S7 8 s81 31160 310 S8 9 s69 5 868 5 6 S9 10 s 145 610 135 510 S10 11 s 15218 2 14118 2 S11 12 s69 4 95 11 3 8 S12 13 s55 01243 0 8 S13 14 s56 4 946 2 6 S14 15 s 14512 3 12311 3 S15 16 s7211 35210 2 S16 17 s17 4 516 3 5 S17 18 s62 7 462 7 4 S18 19 s94 8 5 104 6 3 S19 20 s82 6 592 6 4 S20 21 s65 5 766 5 5 S21 22 s88 3 767 5 3 S22 23 s 114 6 711 6 4 S23 24 s63 2 464 2 2 S24 25 s44 6 623 4 6 S25 26 w59 4 737 3 5 S26 27 w76 3 541 0 4 S27 28 w 10414 28410 2 S28 29 w97 5 684 5 3 S29 30 w92 7 562 6 5 S30 31 w67 6 765 5 8 S31 32 w7612 76310 7 S32 33 w 123 8 9 113 4 7 S33 34 w 12210 592 6 3 S34 35 w6310 453 5 3 S35 36 w93 9 963 7 8 S36 37 w5 11 7 74 11 3 4 S37 38 w74 4 673 1 5 S38 39 w65 1 833 0 8 S39 40 w 10310 273 7 2 S40 41 w1 11 7 518 4 3 S41 42 w 105 610 104 3 9 S42 43 w63 9 242 6 0 S43 44 w9511 454 7 3 S44 45 w85 6 384 2 3 S45 46 w84 8 741 2 6 S46 47 w 122 6 2 101 5 2 S47 48 w 106 9 875 7 8 S48 49 w 13615 1 12414 0 S49 50 w78 11247 111 S50 51 w 123 9 491 7 4 S51 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Thanks for the pointer! I still couldn't figure out how to convert my data because the example at stackoverflow seems to have only one variable (Year) while there are three within-subject variables plus one between-subjects variable. Any further help? Gang On Thu, Oct 6, 2011 at 4:34 PM, Andrew Miles rstuff.mi...@gmail.com wrote: Take a look here. http://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format Andrew Miles Department of Sociology Duke University On Oct 6, 2011, at 4:28 PM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s 4 5 6 10 3 6 6 9 S1 2 s 6 5 5 6 4 3 5 6 S2 3 s 7 4 6 5 7 4 5 3 S3 4 s 8 5 8 7 7 1 8 6 S4 5 s 10 6 4 7 9 6 4 6 S5 6 s 5 2 4 7 4 1 4 2 S6 7 s 13 2 10 4 11 2 4 3 S7 8 s 8 1 3 11 6 0 3 10 S8 9 s 6 9 5 8 6 8 5 6 S9 10 s 14 5 6 10 13 5 5 10 S10 11 s 15 2 18 2 14 1 18 2 S11 12 s 6 9 4 9 5 11 3 8 S12 13 s 5 5 0 12 4 3 0 8 S13 14 s 5 6 4 9 4 6 2 6 S14 15 s 14 5 12 3 12 3 11 3 S15 16 s 7 2 11 3 5 2 10 2 S16 17 s 1 7 4 5 1 6 3 5 S17 18 s 6 2 7 4 6 2 7 4 S18 19 s 9 4 8 5 10 4 6 3 S19 20 s 8 2 6 5 9 2 6 4 S20 21 s 6 5 5 7 6 6 5 5 S21 22 s 8 8 3 7 6 7 5 3 S22 23 s 11 4 6 7 1 1 6 4 S23 24 s 6 3 2 4 6 4 2 2 S24 25 s 4 4 6 6 2 3 4 6 S25 26 w 5 9 4 7 3 7 3 5 S26 27 w 7 6 3 5 4 1 0 4 S27 28 w 10 4 14 2 8 4 10 2 S28 29 w 9 7 5 6 8 4 5 3 S29 30 w 9 2 7 5 6 2 6 5 S30 31 w 6 7 6 7 6 5 5 8 S31 32 w 7 6 12 7 6 3 10 7 S32 33 w 12 3 8 9 11 3 4 7 S33 34 w 12 2 10 5 9 2 6 3 S34 35 w 6 3 10 4 5 3 5 3 S35 36 w 9 3 9 9 6 3 7 8 S36 37 w 5 11 7 7 4 11 3 4 S37 38 w 7 4 4 6 7 3 1 5 S38 39 w 6 5 1 8 3 3 0 8 S39 40 w 10 3 10 2 7 3 7 2 S40 41 w 1 11 7 5 1 8 4 3 S41 42 w 10 5 6 10 10 4 3 9 S42 43 w 6 3 9 2 4 2 6 0 S43 44 w 9 5 11 4 5 4 7 3 S44 45 w 8 5 6 3 8 4 2 3 S45 46 w 8 4 8 7 4 1 2 6 S46 47 w 12 2 6 2 10 1 5 2 S47 48 w 10 6 9 8 7 5 7 8 S48 49 w 13 6 15 1 12 4 14 0 S49 50 w 7 8 1 12 4 7 1 11 S50 51 w 12 3 9 4 9 1 7 4 S51 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Hi: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) I don't see how you can get all of 3-5 given the way your data is structured. The problem is that each column contains two of the three variables you want, but not all three. I can see a way to get Subj Group Ref Time F J S1 s Me 1 4 5 S1 s Me 2 3 6 S1 s She1 6 10 S1 s She2 6 9 or an 8 line version with Ref (4 Me, 4 She), Factor (F1, J1, F2, J2) repeated twice and the appropriate response vector, but not a way where you have three columns for Ref, F and J. For example, what is the 'J' for MeF1 or the F for SheJ2? With that, here are a few stabs using the reshape2 package. The first step is to do a little renaming of your data frame so that one can use the colsplit() function to generate a new set of variables. names(myData) - c(Group, Me_F_1, Me_J_1, She_F_1, She_J_1, Me_F_2, Me_J_2, She_F_2, She_J_2, Subj) library('plyr') library('reshape2') # collapses the eight columns to be reshaped into a factor named # variable with a corresponding variable named value mData - melt(myData, id = c('Subj', 'Group')) head(mData) # Split the original variables into three new columns, named # Ref, Var and Time, respectively: newvars - colsplit(mData$variable, '_', c('Ref', 'Var', 'Time')) # Append these to the melted data frame and remove 'variable' mData2 - cbind(mData, newvars)[, -3] # This comes closest to your original intent: mData3 - arrange(mData2, Subj, Ref, Var, Time) head(mData3, 8) Subj Group value Ref Var Time 1S1 s 4 Me F1 2S1 s 3 Me F2 3S1 s 5 Me J1 4S1 s 6 Me J2 5S1 s 6 She F1 6S1 s 6 She F2 7S1 s10 She J1 8S1 s 9 She J2 # Some rearrangements to consider: mData4 - cast(mData3, Subj + Group + Ref + Time ~ Var, value_var = 'value') head(mData4, 4) Subj Group Ref Time F J 1 S1 s Me1 4 5 2 S1 s Me2 3 6 3 S1 s She1 6 10 4 S1 s She2 6 9 mData5 - cast(mData3, Subj + Group + Ref + Var ~ Time, value_var = 'value') head(mData5, 4) Subj Group Ref Var 1 2 1 S1 s Me F 4 3 2 S1 s Me J 5 6 3 S1 s She F 6 6 4 S1 s She J 10 9 If you like this one, it's probably a good idea to rename the last two columns 'Time1' and 'Time2' or something similar. HTH, Dennis On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen gangch...@gmail.com wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s 4 5 6 10 3 6 6 9 S1 2 s 6 5 5 6 4 3 5 6 S2 3 s 7 4 6 5 7 4 5 3 S3 4 s 8 5 8 7 7 1 8 6 S4 5 s 10 6 4 7 9 6 4 6 S5 6 s 5 2 4 7 4 1 4 2 S6 7 s 13 2 10 4 11 2 4 3 S7 8 s 8 1 3 11 6 0 3 10 S8 9 s 6 9 5 8 6 8 5 6 S9 10 s 14 5 6 10 13 5 5 10 S10 11 s 15 2 18 2 14 1 18 2 S11 12 s 6 9 4 9 5 11 3 8 S12 13 s 5 5 0 12 4 3 0 8 S13 14 s 5 6 4 9 4 6 2 6 S14 15 s 14 5 12 3 12 3 11 3 S15 16 s 7 2 11 3 5 2 10 2 S16 17 s 1 7 4 5 1 6 3 5 S17 18 s 6 2 7 4 6 2 7 4 S18 19 s 9 4 8 5 10 4 6 3 S19 20 s 8 2 6 5 9 2 6 4 S20 21 s 6 5 5 7 6 6 5 5 S21 22 s 8 8 3 7 6 7 5 3 S22 23 s 11 4 6 7 1 1 6 4 S23 24 s 6 3 2 4 6 4 2 2 S24 25 s 4 4 6 6 2 3 4 6 S25 26 w 5 9 4 7 3 7 3