At 10:42 AM -0700 7/5/09, Mark Knecht wrote:
2009/7/5 Uwe Ligges <lig...@statistik.tu-dortmund.de>:



<- a lot of other conversation omitted, to focus on the following>

Currently my data is one experiment per row, but that's wasting space
as most experiments only take 20% of the row and 80% of the row is
filled with 0's. I might want to make the array more narrow and have a
flag somewhere in the 1st 10 columns that says this row is a
continuation row from the previous row. That way I could pack the
array better, use less memory and when I do finally test for 0 I have
a short line to traverse?

Just an idea.

Anyway, I suspect either of these will suit my short term needs. On to
the next step.

Cheers,
Mark


This suggests the use of a "list" rather than a data frame. With a list object, each element in the list would represent one experiment, and each would have the appropriate number of elements (values) for that experiment.
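
As a rough illustration of that idea, here is a minimal sketch that turns a zero-padded matrix into a list with one element per experiment. The toy matrix `readings` and the helper `trim_trailing_zeros` are made up for this example and are not from the original data. The sketch assumes that everything after the last non-zero value in a row is padding.

```r
# Hypothetical toy data: 3 experiments, zero-padded to 6 reading columns.
readings <- rbind(
  c(10, 12,  9,  0,  0, 0),
  c( 5,  6,  0,  0,  0, 0),
  c( 8,  7,  9, 11, 10, 0)
)
colnames(readings) <- paste0("C", 1:6)

# Hypothetical helper: keep values up to the last non-zero entry.
# Assumes anything after the last non-zero value is padding.
trim_trailing_zeros <- function(x) {
  nz <- which(x != 0)
  if (length(nz) == 0) return(x[0])   # row holds no real data
  unname(x[seq_len(max(nz))])
}

# One list element per experiment, each with its own length.
experiments <- lapply(seq_len(nrow(readings)),
                      function(i) trim_trailing_zeros(readings[i, ]))

# Number of readings actually stored for each experiment.
sapply(experiments, length)
```

Each element then holds only the readings for its experiment, so no row has to be scanned for the terminating 0 later on.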

Indeed, the original description,

At 5:02 PM -0700 7/4/09, Mark Knecht wrote:
OK, I guess I'm getting better at the data part of R. I wrote a
program outside of R this morning to dump a bunch of experimental
data. It's a sort of ragged array - about 700 rows and 400 columns,
but the amount of data in each column varies based on the length of
the experiment. The real data ends with a 0 following some non-zero
value. It might be as short as 5 to 10 columns or as many as 390. The
first 9 columns contain some data about when the experiment was run
and a few other things I thought I might be interested in later. All
the data starts in column 10 and has headers saying C1, C2, C3, C4,
etc., up to C390. The first value for every experiment is some value I
will normalize and then the values following are above and below the
original tracing out the path that the experiment took, ending
somewhere to the right but not a fixed number of readings.

is also suggestive of using a list(). For example, the metadata, i.e., the "... data about when the experiment was run and a few other things ..." could be held separately instead of being embedded in the same array, from which it always has to be excluded in order to do an analysis.
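
One possible layout, sketched under assumptions: the metadata goes in a small data frame with one row per experiment, and the readings go in a list named by the same identifier. The column names `id` and `run_date` and all values are invented for illustration. Dividing by the first value is only a guess at what "normalize" means in the original description.

```r
# Hypothetical metadata, one row per experiment (column names are made up).
meta <- data.frame(
  id       = c("e1", "e2", "e3"),
  run_date = as.Date(c("2009-07-01", "2009-07-02", "2009-07-03")),
  stringsAsFactors = FALSE
)

# Hypothetical readings, one list element per experiment, ragged lengths.
experiments <- list(c(10, 12, 9), c(5, 6), c(8, 7, 9, 11, 10))
names(experiments) <- meta$id

# Assumed normalization: divide each experiment by its first value.
normalized <- lapply(experiments, function(x) x / x[1])

# Analyses can now run on the readings without excluding metadata columns.
n_readings <- sapply(experiments, length)

# Metadata for a given experiment is looked up by id when needed.
meta[meta$id == "e2", ]
```

With this split, functions such as lapply() and sapply() work on the readings directly, and the metadata can be joined back by `id` only where it is wanted.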

But I haven't followed the thread all that closely, so confess that my thoughts might be off the mark.

-Don

--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov
