Hello, I am writing to ask for advice as to whether I am following the most efficient/fastest workflow for a task using the raster package. This involves 3 datasets:
1. modtest: A ~927 m resolution satellite-derived image: nrow : 141 ncol : 189 ncell : 26649 min value : ? max value : ? projection : +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs +towgs84=0,0,0 xmin : 415851.2 xmax : 590983.4 ymin : -2670093 ymax : -2539439 xres : 926.6254 yres : 926.6254 2. lt.q.test: a ~92 m resolution grid representing identifiers for unique climate-soil combinations: nrow : 1403 ncol : 1878 ncell : 2634834 min value : ? max value : ? projection : +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs +towgs84=0,0,0 xmin : 415926.7 xmax : 590091.6 ymin : -2670093 ymax : -2539979 xres : 92.73961 yres : 92.73961 3. sum.stats: A dataframe within a list with several variables relating to outputs from a crop simulations, and one variable containing values corresponding to the identifiers in yld.test. My task is to 1. assign crop yield values from sum.stats to lt.q.test 2. resample the result of 1 to the resolution of modtest, so that the images are lined up, and so that the yield values in yld.test are averaged/aggregated to modtest's resolution. The solution I have come up with so far is: library(raster) lt.q.test <- raster(paste(path.in, "lt_quin_test1.img", sep = "")) # Assign values from sum.stats (in y.mn column) to yld.test, outputting values as integers. setwd(path.out) subs(lt.q.test, sum.stats[[1]], by = "VALUE", which = "y.mn", subsWithNA = TRUE, filename = "yld.rst4.img", format = 'HFA', datatype = "INT2S") # Resample yld.test to resolution of modtest yld.test <- raster("yld.rst4.img") modtest <- raster("MOD_SA_NDVI_amp_albers_test.img") resample(yld.test, modtest, method = "bilinear", filename = "yld.resamp.test.img", datatype = "INT2S", format = "HFA") # This seems to work The above is performed on fairly small test datasets, and I must now apply this approach multiple times to the full size versions of the input datasets, where the dimensions of lt.q.test are: nrow : 15170 ncol : 17364 ncell : 263411880 and modtest's dimensions are: nrow : 1651 ncol : 1937 ncell : 3197987 I was thus wondering if there is a more efficient approach that I should adopt, or if there are any other concerns of which I should be aware? Thanks in advance for any advice. Cheers, Lyndon _______________________________________________ R-sig-Geo mailing list R-sig-Geo@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo