Sander Oom wrote:

>Dear all,
>
>We have a large data set with temperature data for weather stations
>across the globe (15000 stations).
>
>For each station, we need to calculate the number of days a certain
>temperature is exceeded.
>
>So far we used the following S code, where mat88 is a matrix containing
>rows of 365 daily temperatures for each of 15000 weather stations:
>
>  m <- 37
>  n <- 2
>  outmat88 <- matrix(0, ncol = 4, nrow = nrow(mat88))
>  for(i in 1:nrow(mat88)) {
>    # i <- 3
>    row1 <- as.data.frame(df88[i, ])
>    temprow37 <- select.rows(row1, row1 > m)
>    temprow39 <- select.rows(row1, row1 > m + n)
>    temprow41 <- select.rows(row1, row1 > m + 2 * n)
>    outmat88[i, 1] <- max(row1, na.rm = T)
>    outmat88[i, 2] <- count.rows(temprow37)
>    outmat88[i, 3] <- count.rows(temprow39)
>    outmat88[i, 4] <- count.rows(temprow41)
>  }
>  outmat88
>
>

What you need is not tapply but apply. Something like

    apply(mat88, 1, function(x) sum(x > 30))
where your threshold should replace 30 and the `1' refers to rows.
For multiple thresholds:

    apply(mat88, 1, function(x) c(sum(x > 20), sum(x > 25), sum(x > 30)))

Kjetil

>We have transferred the data to a more potent Linux box running R, but
>still hope to speed up the code.
>
>I know a for loop should be avoided when looking for speed. I also know
>the answer is in something like tapply, but my understanding of these
>commands is still too limited to see the solution. Could someone show me
>the way!?
>
>Thanks in advance,
>
>Sander.
>
>

-- 
Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
    -- Mahdi Elmandjra

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
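
For reference, the apply() suggestion above could be put together roughly as follows. This is only a sketch under some assumptions: the small simulated matrix stands in for the real mat88, the thresholds 37, 39 and 41 are taken from m and n in the original post, and rowSums() on a logical matrix is swapped in for apply() because it does the same counting in one vectorised step. The na.rm = TRUE handling is also an assumption, based on the na.rm = T in the original max() call.

```r
# Simulated stand-in for mat88: 5 stations x 365 days (made-up data).
set.seed(1)
mat88 <- matrix(rnorm(5 * 365, mean = 30, sd = 6), nrow = 5)
mat88[2, 10] <- NA  # one missing reading, to show the na.rm handling

thresholds <- c(37, 39, 41)  # m, m + n, m + 2 * n from the original post

# One column of exceedance counts per threshold. Comparisons with NA
# give NA, which na.rm = TRUE leaves out of the count.
counts <- sapply(thresholds, function(th) rowSums(mat88 > th, na.rm = TRUE))

# Row maxima go in the first column, as in the original outmat88.
outmat88 <- cbind(apply(mat88, 1, max, na.rm = TRUE), counts)
colnames(outmat88) <- c("max", paste0("gt", thresholds))
outmat88
```

One thing to watch if you stay with apply() and a function that returns several values: the result comes back with one column per station, so it needs t() to get stations in rows. Also, sum(x > 30) returns NA for any station with a missing day unless na.rm = TRUE is added.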