At 10:42 AM -0700 7/5/09, Mark Knecht wrote:
2009/7/5 Uwe Ligges <lig...@statistik.tu-dortmund.de>:
<- a lot of other conversation omitted, to focus on the following>
Currently my data is one experiment per row, but that's wasting space as most experiments only take 20% of the row and 80% of the row is filled with 0's. I might want to make the array more narrow and have a flag somewhere in the 1st 10 columns that says this row is a continuation row from the previous row. That way I could pack the array better, use less memory and when I do finally test for 0 I have a short line to traverse? Just an idea. Anyway, I suspect either of these will suit my short term needs. On to the next step.

Cheers,
Mark
This suggests the use of a "list" rather than a data frame. With a list object, each element in the list would represent one experiment, and each would have the appropriate number of elements (values) for that experiment.
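As a rough sketch of that idea (not Mark's actual data): the small matrix `padded` below is made up, and the helper `trim_zeros` is a hypothetical name. It assumes that a genuine reading is never exactly 0, so everything after the last non-zero value in a row is padding.

```r
# Made-up stand-in for the zero-padded readings (columns C1, C2, ...).
padded <- rbind(
  c(5.0, 5.2, 4.9, 0,   0,   0),
  c(3.0, 3.1, 2.8, 2.7, 3.3, 0),
  c(7.0, 6.5, 0,   0,   0,   0)
)

# Hypothetical helper: keep everything up to the last non-zero value
# in a row and drop the zero padding that follows it.
trim_zeros <- function(x) {
  nz <- which(x != 0)
  if (length(nz) == 0) return(numeric(0))
  x[seq_len(max(nz))]
}

# One list element per experiment, each with its own length.
experiments <- lapply(seq_len(nrow(padded)),
                      function(i) trim_zeros(padded[i, ]))

# Number of readings actually stored for each experiment.
sapply(experiments, length)
```

Each element can then be handled with lapply() or sapply() without ever testing for the terminating 0 again.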
Indeed, the original description,

At 5:02 PM -0700 7/4/09, Mark Knecht wrote:
OK, I guess I'm getting better at the data part of R. I wrote a program outside of R this morning to dump a bunch of experimental data. It's a sort of ragged array - about 700 rows and 400 columns, but the amount of data in each column varies based on the length of the experiment. The real data ends with a 0 following some non-zero value. It might be as short as 5 to 10 columns or as many as 390. The first 9 columns contain some data about when the experiment was run and a few other things I thought I might be interested in later. All the data starts in column 10 and has headers saying C1, C2, C3, C4, etc., up to C390. The first value for every experiment is some value I will normalize and then the values following are above and below the original tracing out the path that the experiment took, ending somewhere to the right but not a fixed number of readings.
is also suggestive of using a list(). For example, the metadata, i.e., the "... data about when the experiment was run and a few other things ...", could be held separately instead of being embedded in the same array, from which it always has to be excluded in order to do an analysis.
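Something along these lines might work, as a sketch only. The data frame `raw`, its column names `run_date` and `operator`, and the count `n_meta` are all invented for illustration (the real file has nine metadata columns), and "normalize" is assumed here to mean dividing each run by its first value.

```r
# Hypothetical layout: two metadata columns (the real file has nine),
# followed by zero-padded readings C1..C5.
raw <- data.frame(
  run_date = c("2009-07-01", "2009-07-02"),
  operator = c("A", "B"),
  C1 = c(10, 20), C2 = c(11, 19), C3 = c(0, 22),
  C4 = c(0, 21),  C5 = c(0, 0),
  stringsAsFactors = FALSE
)
n_meta <- 2  # would be 9 for the data described above

# Metadata kept on its own, one row per experiment.
meta <- raw[, seq_len(n_meta)]

# Readings only, as a numeric matrix.
readings <- as.matrix(raw[, -seq_len(n_meta)])

# One list element per experiment, with the trailing zeros dropped
# (assumes a genuine reading is never exactly 0).
runs <- lapply(seq_len(nrow(readings)), function(i) {
  x <- readings[i, ]
  nz <- which(x != 0)
  unname(x[seq_len(if (length(nz)) max(nz) else 0)])
})

# Analyses now touch only the readings, e.g. scaling each run by its
# first value (an assumed meaning of "normalize").
scaled <- lapply(runs, function(x) x / x[1])
```

Row i of `meta` and element i of `runs` describe the same experiment, so the two can still be matched up by position when needed.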
But I haven't followed the thread all that closely, so confess that my thoughts might be off the mark.
-Don

--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov