HI Tom, On Dec 20, 2007, at 9:06 AM, Tom Sgouros wrote: > > Hello: > > I have been give a spreadsheet to work with formed as one big table. > What it consists of is a 10-row-by-40-column table for each of > about 70 > different locations. In other words, the table row names are repeated > 70 times, once for each of the locations (whose names also appear > in the > same column, where it's talking about the totals for that > location), e.g.: > A B C > Location1 15 73 123 <- this row is the sum of the following 3 > Under 10 6 42 23 > 10 - 25 4 15 23 > Over 25 5 16 77 > Location2 18 75 113 <- same here > Under 10 7 45 13 > 10 - 25 5 18 44 > Over 25 6 12 56 > > I want to get this into R as a collection of data frames, one for each > of my locations. My questions: > > 1. There is a way to handle a collection of data frames, isn't > there? > No doubt there are plenty, but what's the easiest way, so that I > can address them collectively, allowing me to ask such > questions as > what's the max of the over 25's in column C?
A list is the best way for that. Then you can use things like lapply and sapply, as I do towards the end of the script that follows. > 2. What's the easiest way to read such a data array from a text > file? > I can do some editing of a csv file produced from the > spreadsheet, > but don't really know what to aim for. Here is the code I used to read your example, which I saved as a comma-separated file, with the only addition that I added the name "Names" to the first column. You will probably need to adjust filename, nlocations and rows.per.location. filename <- "~/Desktop/rows.txt" nlocations <- 2 rows.per.location <- 3 data <- read.csv(filename) data$Names <- gsub("\\s","", data$Names, perl=TRUE) # Trim off whitespace from first column totals <- data[4*seq_len(nlocations)-3,] # Pick up the rows with the totals actual.data <- data[-(4*seq_len(nlocations)-3),] # Pick up the rest location.names <- totals[,1] # The location names are now the first column of totals data.by.location <- split(actual.data, rep(location.names, each=rows.per.location)) # this is the "workhorse" data.by.location <- lapply(data.by.location, function(x) { data.frame(x[,-1], row.names=x[,1]) # Converting each list item to a better form }) totals2 <- sapply(data.by.location, function(x) sapply(x,sum)) all(totals2 == t(totals[,-1])) # Should return true if the totals add up > 3. Is there some shortcut that would allow me to read this directly > from a spreadsheet? Have a look at the xlsReadWrite package. > Many thanks, > > -tom > Haris Skiadas Department of Mathematics and Computer Science Hanover College ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.