Re: [R] ReShape to create Time from Observations?

2009-07-08 Thread Mark Knecht
On Tue, Jul 7, 2009 at 4:22 PM, jim holtman <jholt...@gmail.com> wrote:
 Does something like this work for you? It uses the reshape package:

 > X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 + Ob4=4:13, Ob5=3:12, Ob6=2:11)
 > Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 + Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
 > Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
 + Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

 > f.melt <-
 + function(df)
 + {
 +     # get the starting column number of Ob1, then extend for rest of columns
 +     require(reshape)
 +     melt(df, measure=seq(match("Ob1", names(df)), ncol(df)))
 + }
 > x.m <- f.melt(X)
 > y.m <- f.melt(Y)
 > z.m <- f.melt(Z)

 > # sample data
 > head(x.m, 20)
    A B C variable value
 1   1 0 1      Ob1     1
 2   2 0 1      Ob1     2
 3   3 0 1      Ob1     3
 4   4 0 1      Ob1     4
 5   5 0 1      Ob1     5
 6   6 0 1      Ob1     6
 7   7 0 1      Ob1     7
 8   8 0 1      Ob1     8
 9   9 0 1      Ob1     9
 10 10 0 1      Ob1    10
 11  1 0 1      Ob2     2
 12  2 0 1      Ob2     3
 13  3 0 1      Ob2     4
 14  4 0 1      Ob2     5
 15  5 0 1      Ob2     6
 16  6 0 1      Ob2     7
 17  7 0 1      Ob2     8
 18  8 0 1      Ob2     9
 19  9 0 1      Ob2    10
 20 10 0 1      Ob2    11

SNIP

Jim,
   It wasn't exactly what I was looking for but I think the ideas plus
a bit of off-list help from another member helped me get much closer.
The idea of using match is very helpful in my case because I'm able to
leverage the fact that in my data files everything to the right is
also an observation to easily create a list to the end of the row. Try
the following:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12, Ob4=4:13,
                Ob5=3:12, Ob6=2:11)

BrkPnt <- match("Ob1", names(X))
Ob_Group <- list(names(X)[BrkPnt:ncol(X)])

# Give to reshape to turn ObX into time
answerX1 <- reshape(X, varying=Ob_Group, direction='long')

and at this point I can subset based on id or some other variable:

subset(answerX1, A==1)
    A B C time Ob1 id
1.1 1 0 1    1   1  1
1.2 1 0 1    2   2  1
1.3 1 0 1    3   3  1
1.4 1 0 1    4   4  1
1.5 1 0 1    5   3  1
1.6 1 0 1    6   2  1

   I *think* this is data that I can send to matplot/qplot to get the
charts I'm interested in. I'll work on that today to verify, but
it looks about right to me using this simple case:

PlotData <- subset(answerX1, A==1)
matplot(PlotData$time, PlotData$Ob1)

   I really like the match idea. The first observation should
generally be in about the first 20 columns of my files which can
potentially be thousands of columns wide. There's no reason in my case
to match every other column to the right as I already know they will
match. I can get a list of all the observations with BrkPnt:ncol(X) or
all the independent variables using 1:(BrkPnt-1). I could also, if I
chose, extract a specific group of observations by matching Ob20 and
Ob40 to potentially find observations taken in a certain time period
every day, etc. Nice!
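
A small sketch of the column bookkeeping described above, using the toy
frame X from this thread. Since X has no Ob20 or Ob40, Ob2 and Ob4 stand
in as the window endpoints; the variable names id_cols, obs_cols and
window_cols are only illustrative.

```r
# Toy frame from the thread.
X <- data.frame(A = 1:10, B = 0, C = 1, Ob1 = 1:10, Ob2 = 2:11, Ob3 = 3:12,
                Ob4 = 4:13, Ob5 = 3:12, Ob6 = 2:11)

# Position of the first observation column.
BrkPnt <- match("Ob1", names(X))

# Everything from Ob1 to the end of the row is an observation.
obs_cols <- names(X)[BrkPnt:ncol(X)]

# Everything before Ob1 is an independent variable.
# seq_len() avoids the precedence trap: 1:BrkPnt-1 parses as (1:BrkPnt) - 1.
id_cols <- names(X)[seq_len(BrkPnt - 1)]

# A window of observations located by name (Ob2 and Ob4 are stand-ins
# for endpoints such as Ob20 and Ob40 in a wider file).
window_cols <- names(X)[match("Ob2", names(X)):match("Ob4", names(X))]

print(id_cols)
print(obs_cols)
print(window_cols)
```

Only two calls to match() are needed for the window, however wide the
file is, because the names between the endpoints are consecutive.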

   I'll put it back in a function as you did for use in my actual code.
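
One way such a function might look, as a sketch only. obs_to_long() and
its "first" argument are hypothetical names, and it assumes the layout
described in this thread (every column from Ob1 to the end of the row is
an observation). It uses base reshape() rather than the reshape package.

```r
# Hypothetical helper: turn Ob1..ObN into a single 'time' variable.
obs_to_long <- function(df, first = "Ob1") {
  brk <- match(first, names(df))
  if (is.na(brk)) stop("column '", first, "' not found")
  obs <- names(df)[brk:ncol(df)]
  # One group of varying columns -> one value column; time runs 1..N.
  reshape(df, varying = list(obs), v.names = "value",
          timevar = "time", times = seq_along(obs),
          idvar = "id", direction = "long")
}

X <- data.frame(A = 1:10, B = 0, C = 1, Ob1 = 1:10, Ob2 = 2:11, Ob3 = 3:12,
                Ob4 = 4:13, Ob5 = 3:12, Ob6 = 2:11)

long <- obs_to_long(X)

# All observations for the first experiment, in time order.
first_exp <- subset(long, A == 1)
first_exp <- first_exp[order(first_exp$time), ]
print(first_exp)
```

Naming the value column "value" instead of letting it default to "Ob1"
is a choice made here for readability, not something the thread requires.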

Cheers,
Mark




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ReShape to create Time from Observations?

2009-07-07 Thread Mark Knecht
Here are a couple of very simple data.frames:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
Ob4=4:13, Ob5=3:12, Ob6=2:11)
Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

Each row in the data.frame is a unique experiment. The fields Ob1:Ob6
(in the case of the first data.frame) represent observations taken at
fixed intervals for that specific experiment. (Observation 1,
Observation 2, etc.) IMPORTANT - Different data files have different
numbers of both experiments and observations as well as different
observation rates. Some data.frames might have 50
observations/experiment at 10 minute intervals (a work day) while
others might have 2000 observations/experiment at daily intervals
representing years of data. The number of columns preceding Ob1 varies
from file to file but once I get to Ob1 I have set it up so that the
names to the right are consecutive to the end of the row, so 2000
observations will have names Ob1:Ob2000.

How could I use ReShape to create a generic new data.frame where all
of the ObX columns become 'time' for the experiments in that
data.frame? I.e. - Ob1:ObX become a single variable called time
incrementing from 1:X.

The generic answer cannot use any numbers like 1:3 or 4:12 because
every file is different. I think I need to discover the dimensions of
the data.frames and locations of Ob1 as well as the name of the last
column, etc., to construct the right fields. We could (if it's legal
in R) say things like Ob1:Ob11 but it doesn't seem legal. I do see I
can say things like names(X[4]) to discover Ob1, and cute things like
names(X[dim(X)[2]]) to get the last name, etc., but I cannot work out
how to use this to drive ReShape into making all the
Observations into a single variable called time.
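
For illustration, here are three ways the observation columns might be
located by name rather than by hard-coded numbers. This is a sketch on
the toy frame X only; the variable names are made up, and the pattern
"^Ob[0-9]+$" assumes the ObN naming scheme described above.

```r
X <- data.frame(A = 1:10, B = 0, C = 1, Ob1 = 1:10, Ob2 = 2:11, Ob3 = 3:12,
                Ob4 = 4:13, Ob5 = 3:12, Ob6 = 2:11)

# 1. By position: find Ob1, then run to the last column.
first <- match("Ob1", names(X))
obs_by_pos <- names(X)[first:ncol(X)]

# 2. By pattern: any column named Ob followed by digits.
obs_by_pattern <- grep("^Ob[0-9]+$", names(X), value = TRUE)

# 3. By name range: subset() accepts a range of column names in 'select',
#    although the end name has to be typed in.
obs_by_range <- names(subset(X, select = Ob1:Ob6))

print(obs_by_pos)
```

The first form needs no knowledge of the last column's name, which is
why it generalises to files with different numbers of observations.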

I hope this is clear.

Thanks,
Mark



Re: [R] ReShape to create Time from Observations?

2009-07-07 Thread jim holtman
Does something like this work for you? It uses the reshape package:

> X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
+ Ob4=4:13, Ob5=3:12, Ob6=2:11)
> Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
+ Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
> Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
+ Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

> f.melt <-
+ function(df)
+ {
+     # get the starting column number of Ob1, then extend for rest of columns
+     require(reshape)
+     melt(df, measure=seq(match("Ob1", names(df)), ncol(df)))
+ }
> x.m <- f.melt(X)
> y.m <- f.melt(Y)
> z.m <- f.melt(Z)

> # sample data
> head(x.m, 20)
    A B C variable value
1   1 0 1      Ob1     1
2   2 0 1      Ob1     2
3   3 0 1      Ob1     3
4   4 0 1      Ob1     4
5   5 0 1      Ob1     5
6   6 0 1      Ob1     6
7   7 0 1      Ob1     7
8   8 0 1      Ob1     8
9   9 0 1      Ob1     9
10 10 0 1      Ob1    10
11  1 0 1      Ob2     2
12  2 0 1      Ob2     3
13  3 0 1      Ob2     4
14  4 0 1      Ob2     5
15  5 0 1      Ob2     6
16  6 0 1      Ob2     7
17  7 0 1      Ob2     8
18  8 0 1      Ob2     9
19  9 0 1      Ob2    10
20 10 0 1      Ob2    11
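
If a numeric time is wanted rather than the Ob labels, the 'variable'
column could be converted along these lines. This is a sketch: the small
frame below is typed in by hand to mimic a few rows of the melt() output,
so it runs without the reshape package, and the new 'time' column is an
addition for illustration.

```r
# Four rows shaped like the melted output (two experiments, two observations).
x.m <- data.frame(A = c(1, 2, 1, 2), B = 0, C = 1,
                  variable = factor(c("Ob1", "Ob1", "Ob2", "Ob2")),
                  value = c(1, 2, 2, 3))

# Strip the "Ob" prefix and convert what is left to an integer.
# as.integer() applied to the factor itself would return level codes,
# not the number embedded in the name.
x.m$time <- as.integer(sub("^Ob", "", as.character(x.m$variable)))

# Sort so each experiment's observations are together, in time order.
x.m <- x.m[order(x.m$A, x.m$time), ]
print(x.m)
```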



On Tue, Jul 7, 2009 at 5:37 PM, Mark Knecht <markkne...@gmail.com> wrote:
 Here are a couple of very simple data.frames:

 X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 Ob4=4:13, Ob5=3:12, Ob6=2:11)
 Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
 Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
 Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

 Each row in the data.frame is a unique experiment. The fields Ob1:Ob6
 (in the case of the first data.frame) represent observations taken at
 fixed intervals for that specific experiment. (Observation 1,
 Observation 2, etc.) IMPORTANT - Different data files have different
 numbers of both experiments and observations as well as different
 observation rates. Some data.frames might have 50
 observations/experiment at 10 minute intervals (a work day) while
 others might have 2000 observations/experiment at daily intervals
 representing years of data. The number of columns preceding Ob1 varies
 from file to file but once I get to Ob1 I have set it up so that the
 names to the right are consecutive to the end of the row, so 2000
 observations will have names Ob1:Ob2000.

 How could I use ReShape to create a generic new data.frame where all
 of the ObX columns become 'time' for the experiments in that
 data.frame? I.e. - Ob1:ObX become a single variable called time
 incrementing from 1:X.

 The generic answer cannot use any numbers like 1:3 or 4:12 because
 every file is different. I think I need to discover the dimensions of
 the data.frames and locations of Ob1 as well as the name of the last
 column, etc., to construct the right fields. We could (if it's legal
 in R) say things like Ob1:Ob11 but it doesn't seem legal. I do see I
 can say things like names(X[4]) to discover Ob1, and cute things like
 names(X[dim(X)[2]]) to get the last name, etc., but I cannot put it
 together how to use this to drive ReShape into making all the
 Observations into a single variable called time.

 I hope this is clear.

 Thanks,
 Mark





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
