Many thanks, Jonathan, Alex and Forrest.

This is very helpful information. I'll test whether calc() or
rasterEngine() works best for my case.

Sincerely,
Yan

Yan Boulanger, Chercheur scientifique / Research scientist
Ressources Naturelles Canada, Canadian Forest Service
Centre de Foresterie des Laurentides
1055, rue du P.E.P.S.
C.P. 10380, succ. Sainte-Foy
Québec (Québec) Canada
G1V 4C7
Tel. : +1 418 649-6859

-----Original Message-----
From: jgrn...@gmail.com [mailto:jgrn...@gmail.com] On Behalf Of Jonathan Greenberg
Sent: 13 March 2014 12:18
To: Alex Zvoleff
Cc: Boulanger, Yan; r-sig-geo@r-project.org
Subject: Re: [R-sig-Geo] loops in rasterEngine

Yan:

Looks like you are getting great help with this. I want to echo Alex's
note that rasterEngine is not a catch-all: for REALLY simple processes
you'll get better performance using calc() or using FEWER workers (which
may seem counterintuitive). I'm submitting a paper this week showing
that a function that just multiplies a raster by 10 ran faster than
calc() only when using 4 workers (sfQuickInit(cpus=4)) (vs. calc's 1),
and was slower than calc with fewer or more workers. As a rule,
rasterEngine, at present, is slower than calc when operating in
sequential mode.

Now, as an important note, if you grab the latest spatial.tools from
r-forge, I have added a feature that will return multiple rasters at
once, which seems like what you want to do. You'll want to return a
list-of-arrays (each component will be written to its own raster) and
make sure you specify the output filenames (the components will be
matched against the output filenames). This may result in a significant
speedup because you are only reading each raster once and returning all
the outputs (vs. the loop example in this thread, which reads and writes
the rasters for every i).

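To make the list-of-arrays idea concrete, here is a minimal sketch in
plain R, assuming the processing function receives each chunk of the
raster as an array. The names fun_zones_all and zone_ids are
hypothetical, chosen only for illustration. The call to rasterEngine
itself and the way output filenames are passed are left out on purpose,
since their exact form in the latest spatial.tools is not spelled out in
this thread. The sketch also writes 1/NA, as Yan asked, rather than the
TRUE/FALSE that zones == i returns.

```r
## Hypothetical helper (not part of spatial.tools): build one 1/NA mask
## per zone from a single read of the input chunk.
## Assumption: `zones` arrives as a numeric array of zone codes.
fun_zones_all <- function(zones, zone_ids = 1:5, ...) {
  masks <- lapply(zone_ids, function(i) {
    out <- array(NA_real_, dim = dim(zones))  # NA everywhere by default
    out[which(zones == i)] <- 1               # which() skips NA cells
    out
  })
  names(masks) <- paste0("zone_", zone_ids)   # illustrative component names
  masks                                       # a list of arrays, one per zone
}

## Small in-memory check on a 2 x 3 x 1 array (rows x columns x bands)
zones <- array(c(1, 2, 3, 4, 5, NA), dim = c(2, 3, 1))
masks <- fun_zones_all(zones)
length(masks)  # 5
```

Each list component has the same dimensions as the input chunk, so the
input only needs to be read once to produce all five masks.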
--j

On Thu, Mar 13, 2014 at 9:06 AM, Alex Zvoleff <azvol...@conservation.org> wrote:
> On Wed, Mar 12, 2014 at 11:29 PM, Boulanger, Yan
> <yan.boulan...@rncan-nrcan.gc.ca> wrote:
>> Actually, I have several rasters of more than 440 000 000 pixels
>> (MODIS covering all of Canada) and I have a 32-core machine so I would
>> like to take advantage of it! ;-)
>>
>> Time is money (really?!!)
>
> As mentioned earlier, I would be careful about using rasterEngine for
> this kind of task. It may actually slow you down. I would recommend
> testing on smaller subsets to determine your gains (or losses) from
> doing this type of calculation in parallel versus sequentially. While
> I have seen great speed increases for CPU intensive calculations from
> using rasterEngine, it sounds like your processing is heavily IO
> intensive. I am not sure 32 cores will help you unless you have a very
> fast disk or RAID array.
>
> Alex
>
>> Thanks again!
>> yan
>>
>> From: Forrest Stevens [mailto:forr...@ufl.edu]
>> Sent: 12 March 2014 22:25
>> To: Boulanger, Yan
>> Cc: r-sig-geo@r-project.org
>> Subject: Re: [R-sig-Geo] loops in rasterEngine
>>
>> Hi Yan, I would be surprised if rasterEngine() were worth the overhead
>> for such a simple process, though admittedly Jonathan Greenberg might
>> have more information on the topic. Without rasterEngine(), this is the
>> approach I would take:
>>
>>
>> ## Input is the original raster, Safranyik_zones_1961_1990
>> for (i in 1:5) {
>>   assign(paste("Safranyik_zones_1961_1990b_", i, sep=""),
>>          Safranyik_zones_1961_1990 == i)
>> }
>>
>>
>> To do it using rasterEngine() this is the function definition that I
>> would use.
>> This of course requires that you've already created a cluster using one
>> of the various supported parallel backends; otherwise you'll gain
>> nothing from the parallel processing.
>>
>>
>> require("spatial.tools")
>> require("parallel")    ## makeCluster(), stopCluster()
>> require("doParallel")  ## registerDoParallel()
>>
>> ## Begin a parallel cluster and register it with foreach:
>> ## The number of nodes/cores to use in the cluster
>> cpus <- 2
>> cl <- makeCluster(spec = cpus, type = "PSOCK", methods = FALSE)
>> ## Register the cluster with foreach:
>> registerDoParallel(cl)
>>
>> ## Or use the following, quick and dirty way:
>> #sfQuickInit(cpus=2)
>>
>> fun_zone <- function(zones, i, ...) {
>>   return(zones == i)
>> }
>>
>> ## Extra arguments such as i are passed through args, not by name.
>> ## Input is the original raster, Safranyik_zones_1961_1990.
>> for (j in 1:5) {
>>   assign(paste("Safranyik_zones_1961_1990b_", j, sep=""),
>>          rasterEngine(zones=Safranyik_zones_1961_1990,
>>                       args=list("i"=j), fun=fun_zone))
>> }
>>
>> stopCluster(cl)
>> #sfQuickStop()
>>
>>
>> Hope this helps,
>> Forrest
>>
>> --
>> Forrest R. Stevens
>> Ph.D. Candidate, QSE3 IGERT Fellow
>> Department of Geography
>> Land Use and Environmental Change Institute
>> University of Florida
>> http://www.clas.ufl.edu/users/forrest
>>
>> On Wed, Mar 12, 2014 at 8:51 PM, Boulanger, Yan
>> <yan.boulan...@rncan-nrcan.gc.ca> wrote:
>> Hi folks,
>>
>> I guess I have a lot to learn about writing functions, but I'm stuck
>> when using rasterEngine. It seems that it should be very easy to do but
>> I'm missing something, apparently... I have a raster,
>> Safranyik_zones_1961_1990, with integer values from 1 to 5. I would
>> like to create five rasters whose value is 1 where
>> Safranyik_zones_1961_1990 equals "i", and NA otherwise. I would like to
>> run everything in a loop. Here's what I thought would work.
>>
>> fun_zone <- function(Safranyik_zones,i,...)
>> {
>>   Safranyik_zonesb <- Safranyik_zones
>>   Safranyik_zonesb[] <- NA
>>   Safranyik_zonesb[Safranyik_zones == i] <- 1
>>   return(Safranyik_zonesb)
>> }
>>
>> for (i in 1:5){
>>   Safranyik_zones_1961_1990b <-
>>     rasterEngine(Safranyik_zones=Safranyik_zones_1961_1990,i=i,
>>                  fun=fun_zone)
>>   assign(paste("Safranyik_zones_1961_1990b_",i, sep=""),
>>          Safranyik_zones_1961_1990b[[1]])
>> }
>>
>> Of course, it says that "i" is missing:
>>
>> > Error in Safranyik_zones == i : 'i' is missing
>>
>> Any help?
>>
>> Thanks in advance,
>>
>> Yan
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
> --
> Alex Zvoleff
> Postdoctoral Associate
> Tropical Ecology Assessment and Monitoring (TEAM) Network
> Conservation International
> 2011 Crystal Dr. Suite 500, Arlington, Virginia 22202, USA
> Tel: +1-703-341-2749, Fax: +1-703-979-0953, Skype: azvoleff
> http://www.teamnetwork.org | http://www.conservation.org

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
259 Computing Applications Building, MC-150
605 East Springfield Avenue
Champaign, IL 61820-6371
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007