Dear all,

We have a large data set with temperature data for weather stations 
across the globe (15000 stations).

For each station, we need to calculate the number of days on which a 
certain temperature is exceeded.

So far we used the following S code, where mat88 is a matrix containing 
rows of 365 daily temperatures for each of 15000 weather stations:

        m <- 37
        n <- 2
        outmat88 <- matrix(0, ncol = 4, nrow = nrow(mat88))
        for(i in 1:nrow(mat88)) {
                # i <- 3
                row1 <- as.data.frame(df88[i,  ])
                temprow37 <- select.rows(row1, row1 > m)
                temprow39 <- select.rows(row1, row1 > m + n)
                temprow41 <- select.rows(row1, row1 > m + 2 * n)
                outmat88[i, 1] <- max(row1, na.rm = T)
                outmat88[i, 2] <- count.rows(temprow37)
                outmat88[i, 3] <- count.rows(temprow39)
                outmat88[i, 4] <- count.rows(temprow41)
        }
        outmat88

We have transferred the data to a more potent Linux box running R, but 
still hope to speed up the code.

I know a for loop should be avoided when looking for speed. I also know 
the answer is in something like tapply, but my understanding of these 
commands is still too limited to see the solution. Could someone show me 
the way!?
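
For reference, one loop-free direction might look like the sketch below. 
It assumes mat88 is a plain numeric matrix with one row per station and 
one column per day, and it swaps the select.rows/count.rows approach for 
base R's rowSums() on a logical comparison rather than tapply. The 
simulated data and the column names are made up for illustration, and 
dropping NA values from the counts is an assumption about the intended 
behaviour.

```r
# Simulated stand-in for mat88 (assumption): 5 stations x 365 days
set.seed(1)
mat88 <- matrix(rnorm(5 * 365, mean = 30, sd = 6), nrow = 5, ncol = 365)
mat88[2, 10] <- NA   # one missing reading, to exercise na.rm

m <- 37
n <- 2

# mat88 > m is a logical matrix; rowSums() counts the TRUE values per row,
# i.e. the number of days above the threshold for each station.
outmat88 <- cbind(
  max = apply(mat88, 1, max, na.rm = TRUE),          # hottest day per station
  gt1 = rowSums(mat88 > m,         na.rm = TRUE),    # days above m
  gt2 = rowSums(mat88 > m + n,     na.rm = TRUE),    # days above m + n
  gt3 = rowSums(mat88 > m + 2 * n, na.rm = TRUE)     # days above m + 2n
)
outmat88
```

The comparison and the row sums each run once over the whole matrix, so 
no data frame is built per station. A station with no valid readings at 
all would get -Inf (with a warning) in the max column.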

Thanks in advance,

Sander.
-- 
--------------------------------------------
Dr Sander P. Oom
Animal, Plant and Environmental Sciences,
University of the Witwatersrand
Private Bag 3, Wits 2050, South Africa
Tel (work)      +27 (0)11 717 64 04
Tel (home)      +27 (0)18 297 44 51
Fax             +27 (0)18 299 24 64
Email   [EMAIL PROTECTED]
Web     www.oomvanlieshout.net/sander
