Re: [R] ReShape to create Time from Observations?
On Tue, Jul 7, 2009 at 4:22 PM, jim holtman <jholt...@gmail.com> wrote:

Does something like this work for you? It uses the reshape package:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11)
Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
                Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

f.melt <- function(df)
{
    # get the starting column number of Ob1, then extend for rest of columns
    require(reshape)
    melt(df, measure=seq(match("Ob1", names(df)), ncol(df)))
}

x.m <- f.melt(X)
y.m <- f.melt(Y)
z.m <- f.melt(Z)

# sample data
head(x.m, 20)
    A B C variable value
1   1 0 1      Ob1     1
2   2 0 1      Ob1     2
3   3 0 1      Ob1     3
4   4 0 1      Ob1     4
5   5 0 1      Ob1     5
6   6 0 1      Ob1     6
7   7 0 1      Ob1     7
8   8 0 1      Ob1     8
9   9 0 1      Ob1     9
10 10 0 1      Ob1    10
11  1 0 1      Ob2     2
12  2 0 1      Ob2     3
13  3 0 1      Ob2     4
14  4 0 1      Ob2     5
15  5 0 1      Ob2     6
16  6 0 1      Ob2     7
17  7 0 1      Ob2     8
18  8 0 1      Ob2     9
19  9 0 1      Ob2    10
20 10 0 1      Ob2    11

SNIP

Jim,

It wasn't exactly what I was looking for, but I think the ideas, plus a bit of off-list help from another member, got me much closer. The idea of using match is very helpful in my case: because everything to the right of Ob1 in my data files is also an observation, I can easily build a list of column names running to the end of the row.

Try the following:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11)
BrkPnt <- match("Ob1", names(X))
Ob_Group <- list(names(X)[BrkPnt:ncol(X)])

# Give to reshape to turn ObX into time
answerX1 <- reshape(X, varying=Ob_Group, direction='long')

and at this point I can subset based on id or some other variable:

subset(answerX1, A==1)
    A B C time Ob1 id
1.1 1 0 1    1   1  1
1.2 1 0 1    2   2  1
1.3 1 0 1    3   3  1
1.4 1 0 1    4   4  1
1.5 1 0 1    5   3  1
1.6 1 0 1    6   2  1

I *think* this is data that I can send to matplot/qplot to get the charts that I'm interested in.
I'll work on that today to verify, but it looks about right to me using this simple case:

PlotData <- subset(answerX1, A==1)
matplot(PlotData$time, PlotData$Ob1)

I really like the match idea. The first observation should generally be within about the first 20 columns of my files, which can potentially be thousands of columns wide. There's no reason in my case to match every other column to the right, as I already know they will match. I can get all the observations with BrkPnt:ncol(X), or all the independent variables with 1:(BrkPnt-1). I could also, if I chose, extract a specific group of observations by matching Ob20 and Ob40, for example to find observations taken in a certain time period every day.

Nice! I'll put it back in a function, as you did, for use in my actual code.

Cheers,
Mark

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
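For reference, the approach described in this message (locate Ob1 with match, then hand everything from there to the end of the row to reshape) might be wrapped in a function along the following lines. This is only a sketch: the name f.long, its `first` argument and the column name "value" are made up for illustration and do not appear in the thread, and it assumes the data frame has no existing columns called "time", "value" or "id".

```r
# Hypothetical helper (not from the thread): wide -> long with a numeric time.
f.long <- function(df, first = "Ob1") {
  start <- match(first, names(df))       # column number of the first observation
  if (is.na(start)) stop("no column named ", first)
  obs <- names(df)[start:ncol(df)]       # every column from there to the end
  long <- reshape(df, direction = "long",
                  varying = list(obs),   # one group of columns -> one variable
                  v.names = "value",     # name given to the stacked measurements
                  timevar = "time",
                  times   = seq_along(obs),
                  idvar   = "id")
  rownames(long) <- NULL                 # drop the "1.1", "1.2", ... row labels
  long
}

X <- data.frame(A = 1:10, B = 0, C = 1,
                Ob1 = 1:10, Ob2 = 2:11, Ob3 = 3:12,
                Ob4 = 4:13, Ob5 = 3:12, Ob6 = 2:11)

x.long <- f.long(X)
subset(x.long, A == 1)                   # six rows, time running 1 to 6
```

Nothing in the function refers to a fixed column number or a fixed count of observations, so the same call should work on frames with a different number of leading columns or of ObN columns, as long as the observations are the last columns of the frame.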
[R] ReShape to create Time from Observations?
Here are a couple of very simple data.frames:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11)
Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
                Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

Each row in the data.frame is a unique experiment. The fields Ob1:Ob6 (in the case of the first data.frame) represent observations taken at fixed intervals for that specific experiment (Observation 1, Observation 2, etc.).

IMPORTANT - Different data files have different numbers of both experiments and observations, as well as different observation rates. Some data.frames might have 50 observations/experiment at 10 minute intervals (a work day) while others might have 2000 observations/experiment at daily intervals, representing years of data.

The number of columns preceding Ob1 varies from file to file, but once I get to Ob1 I have set it up so that the names to the right are consecutive to the end of the row, so 2000 observations will have names Ob1:Ob2000.

How could I use ReShape to create a generic new data.frame where all of the ObX columns become 'time' for the experiments in that data.frame? I.e., Ob1:ObX become a single variable called time, incrementing from 1:X.

The generic answer cannot use any numbers like 1:3 or 4:12 because every file is different. I think I need to discover the dimensions of the data.frames and the location of Ob1, as well as the name of the last column, etc., to construct the right fields.

We could (if it's legal in R) say things like Ob1:Ob11, but it doesn't seem legal. I do see I can say things like names(X[4]) to discover Ob1, and cute things like names(X[dim(X)[2]]) to get the last name, etc., but I cannot work out how to use this to drive ReShape into making all the observations into a single variable called time.

I hope this is clear.
Thanks,
Mark
Re: [R] ReShape to create Time from Observations?
Does something like this work for you? It uses the reshape package:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11)
Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
                Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
                Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

f.melt <- function(df)
{
    # get the starting column number of Ob1, then extend for rest of columns
    require(reshape)
    melt(df, measure=seq(match("Ob1", names(df)), ncol(df)))
}

x.m <- f.melt(X)
y.m <- f.melt(Y)
z.m <- f.melt(Z)

# sample data
head(x.m, 20)
    A B C variable value
1   1 0 1      Ob1     1
2   2 0 1      Ob1     2
3   3 0 1      Ob1     3
4   4 0 1      Ob1     4
5   5 0 1      Ob1     5
6   6 0 1      Ob1     6
7   7 0 1      Ob1     7
8   8 0 1      Ob1     8
9   9 0 1      Ob1     9
10 10 0 1      Ob1    10
11  1 0 1      Ob2     2
12  2 0 1      Ob2     3
13  3 0 1      Ob2     4
14  4 0 1      Ob2     5
15  5 0 1      Ob2     6
16  6 0 1      Ob2     7
17  7 0 1      Ob2     8
18  8 0 1      Ob2     9
19  9 0 1      Ob2    10
20 10 0 1      Ob2    11

On Tue, Jul 7, 2009 at 5:37 PM, Mark Knecht <markkne...@gmail.com> wrote:

SNIP

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
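The melted frame above labels each observation in `variable` ("Ob1", "Ob2", ...), whereas the original question asked for a time that increments from 1 to X. One possible way to derive it, assuming the names always follow the Ob<number> pattern, is to strip the prefix and convert the remainder. The `variable` vector below is a hand-built stand-in for the melt() output, so the sketch runs without the reshape package installed.

```r
# Stand-in for the 'variable' column that melt() returns (an assumption:
# a factor whose labels are the original ObN column names).
variable <- factor(c("Ob1", "Ob2", "Ob10", "Ob2"),
                   levels = c("Ob1", "Ob2", "Ob10"))

# Strip the "Ob" prefix and convert what is left to an integer.
time <- as.integer(sub("^Ob", "", as.character(variable)))
time                   # 1 2 10 2

# as.integer() on the factor itself returns level codes, which equal the
# observation number only when the levels run Ob1, Ob2, ... without gaps.
as.integer(variable)   # 1 2 3 2
```

With the real output this would be something like x.m$time <- as.integer(sub("^Ob", "", as.character(x.m$variable))).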