Dear Asantos,

Instead of
exr<-raster::extract(r, polys)
you can use:

beginCluster(8)
exr<-raster::clusterR(x = r, fun = function (raster) {raster::extract(x = raster, y = polys)}, export = "polys")
endCluster()
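
The random forest loop in your message can be parallelised in a similar spirit. Below is a minimal sketch, assuming doMC is registered with 8 workers (to match the 8 CPUs of your VM) and using synthetic data in place of your df2. The data, the seed and the result column names are only my illustration, not your actual values:

```r
library(randomForest)
library(doMC)   # also attaches foreach

# Assumption: 8 workers, matching the 8 CPUs of the VM in the question
registerDoMC(cores = 8)

# Synthetic stand-in for df2 (hypothetical data, not the worldclim extraction)
set.seed(1)
var1 <- runif(2000, 0, 100)
df2 <- data.frame(res1 = factor(ifelse(var1 < 10, "A", "B")),
                  var1 = var1,
                  var2 = sqrt(var1))

# Fit 9 forests in parallel; each iteration returns one row:
# the run id and the mean OOB error rate in percent
res <- foreach(w = 1:9, .combine = rbind) %dopar% {
  mod_RF <- randomForest(x = df2[, c("var1", "var2")], y = df2$res1,
                         ntree = 100, mtry = 2)
  c(w = w, oob_error = mean(mod_RF$err.rate[, 1]) * 100)
}

getDoParWorkers()  # number of workers foreach will use
res
```

Here the iterator is named (w = 1:9) and .combine = rbind collects one row per run, so no separate loop is needed to build res. Checking getDoParWorkers() is a quick way to see whether all the CPUs are actually registered.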

HTH,
Ákos Bede-Fazekas
Hungarian Academy of Sciences

On 2019.05.28 at 3:47, ASANTOS via R-sig-Geo wrote:
Dear R-Sig-Geo Members,

I created a virtual machine (VM) in Google Cloud running Ubuntu 18.04
with 8 CPUs, 30 GB of RAM and R 3.6.0, but my attempts to speed up my
spatial analysis have had little success. When I use the snow and doMC
packages with all 8 CPUs, the operation uses only 12.54% of the
available capacity. The goal is to speed up extract() from raster and
random forest fitting with randomForest(). The gain of 18 seconds does
not seem very good to me, so my question is: is there any way to
improve on that? In my test, I do the following:

# Get the number of processors from the Ubuntu terminal
foresteyebrazil@superforettech1:~$ cat /proc/cpuinfo | grep process | wc -l
8
#Packages
library(raster)
library(snow)
library(doMC)
library(randomForest)
registerDoMC()
# Get a raster from WorldClim
r<-getData('worldclim', var='alt', res=5)
# 1) Use extract()/ randomForest() in Virtual Machine ----------------------------
start_time<-Sys.time()
# SpatialPolygons
cds1<-rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2<-rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys<-spPolygons(cds1, cds2)
# Extract
exr<-raster::extract(r, polys)
tr<-ifelse(exr[[2]]<10,c("A"),c("B"))
df<-cbind(tr,exr[[2]], sqrt(exr[[2]]))
df2<-data.frame(as.factor(df[,1]),as.numeric(as.character(df[,2])),as.numeric(as.character(df[,3])))
df2<-df2[complete.cases(df2),]
colnames(df2)<-c("res1","var1","var2")
res<-NULL
for(w in 1:9){
mod_RF<-randomForest(x=cbind(df2$var1,df2$var2), y=df2$res1, ntree=100,
mtry=2)
res=rbind(res,cbind(w,mean(mod_RF$err.rate[,1])*100))
}
end_time<-Sys.time()
end_time-start_time
#
#Time difference of 38.72528 secs
# 2) Use extract() with snow and doMC packages in Virtual Machine ----------------------------
start_time<-Sys.time()
# SpatialPolygons
cds1<-rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2<-rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys<-spPolygons(cds1, cds2)
# Extract
beginCluster(n=8)
exr<-raster::extract(r, polys)
tr<-ifelse(exr[[2]]<10,c("A"),c("B"))
df<-cbind(tr,exr[[2]], sqrt(exr[[2]]))
df2<-data.frame(as.factor(df[,1]),as.numeric(as.character(df[,2])),as.numeric(as.character(df[,3])))
df2<-df2[complete.cases(df2),]
colnames(df2)<-c("res1","var1","var2")
endCluster()
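# Fit the 9 forests in parallel; foreach returns a list of models
mod_RF2<-foreach(w=1:9) %dopar%{
randomForest(x=cbind(df2$var1,df2$var2), y=df2$res1, ntree=100, mtry=2)
}
# Collect the mean OOB error rate (%) of each model
res<-NULL
for(w in 1:9){
res=rbind(res,cbind(w,mean(mod_RF2[[w]]$err.rate[,1])*100))
}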
end_time<-Sys.time()
end_time-start_time
#
#Time difference of 20.57027 secs

Thanks in advance,


_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
