Thanks again!! I'm working with an area-based flow accumulation, so the 0.5 threshold means 0.5 km2, which is roughly 60 cells of 90 m * 90 m. My intention is to prune back the streams later on with a machine learning procedure. I will be careful not to exceed the limit of 2,147,483,647 detected stream segments.
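As a rough check of that arithmetic, here is a minimal Python sketch. It assumes square 90 m cells and an accumulation raster expressed in km2 of upstream area; the function names are only illustrative and are not part of GRASS. It also compares the raster size quoted further down in this thread with the 32-bit limit of 2,147,483,647.

```python
# Illustrative arithmetic only: assumes square 90 m cells and a flow
# accumulation raster whose values are upstream area in km2.
CELL_SIZE_M = 90.0
THRESHOLD_KM2 = 0.5
INT32_MAX = 2_147_483_647  # stream segment limit quoted in this thread


def cell_area_km2(cell_size_m: float) -> float:
    """Area of one square cell in km2."""
    return (cell_size_m * cell_size_m) / 1_000_000.0


def threshold_in_cells(threshold_km2: float, cell_size_m: float) -> float:
    """Number of cells whose combined area equals the threshold."""
    return threshold_km2 / cell_area_km2(cell_size_m)


def within_segment_limit(n_segments: int) -> bool:
    """True if a segment count fits in a signed 32-bit integer."""
    return n_segments <= INT32_MAX


# Raster dimensions taken from the original report in this thread.
TOTAL_CELLS = 142690 * 80490

if __name__ == "__main__":
    print("cell area (km2):", cell_area_km2(CELL_SIZE_M))
    print("threshold in cells:", threshold_in_cells(THRESHOLD_KM2, CELL_SIZE_M))
    print("total cells:", TOTAL_CELLS)
    print("cell count fits in 32 bits:", within_segment_limit(TOTAL_CELLS))
```

With these assumptions the threshold works out to a little under 62 cells, and the cell count of the raster is several times larger than a signed 32-bit integer can hold.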
To reduce I/O as much as possible, I save the *.tif files in /dev/shm on each node, read them with r.external, and build the location on the fly in each node's /tmp, so it is quite fast. I will try to increase the RAM a bit and will post later how it goes.

Best
Giuseppe

On 1 November 2017 at 17:12, Markus Metz wrote:
>
> On Wed, Nov 1, 2017 at 7:15 PM, Giuseppe Amatulli wrote:
> >
> > Thanks Markus!!
> > I will test and I will let you know how it works.
> > Your feedback is very helpful!
> >
> > I have a few more questions:
> > 1) What is now the upper limit on the number of cells that
> > r.stream.extract can handle?
>
> About 1.15e+18 cells.
>
> Another limitation is the number of detected stream segments. This must
> not be larger than 2,147,483,647 streams, therefore you need to figure out
> a reasonable threshold with a smaller test region. A threshold of 0.5 is
> definitely too small, no matter how large or small the input is.
> Threshold should typically be larger than 1000, but is somewhat dependent
> on the resolution of the input. As a rule of thumb, with a coarser
> resolution, a smaller threshold might be suitable, with a higher
> resolution, the threshold should be larger. Testing different threshold
> values in a small subset of the full region can save a lot of time.
>
> > 2) Is the r.stream.basins add-on subject to the same limitation? If so,
> > would it be possible to update r.stream.basins as well?
>
> The limitation in r.watershed and r.stream.extract comes from the search
> for drainage directions and flow accumulation. The other r.stream.* modules
> should support large input data, as long as the number of stream segments
> does not exceed 2,147,483,647.
>
> > 3) Does r.stream.extract support multi-threading through OpenMP?
> > Would it be difficult to implement?
>
> In your case, only less than 13% of temporary data are kept in memory.
> Parallelization with OpenMP or similar will not help here, your CPU will
> run only at less than 20% with one thread anyway. The limit is disk I/O.
> You can make it faster by using more memory and/or using a faster disk
> storage device.
>
> Markus M
>
> > Best
> > Giuseppe
> >
> > On 31 October 2017 at 15:54, Markus Metz wrote:
> >>
> >> On Mon, Oct 30, 2017 at 1:42 PM, Giuseppe Amatulli wrote:
> >> >
> >> > Hi,
> >> > I'm using the r.stream.extract grass command
> >> >
> >> > r.stream.extract elevation=elv accumulation=upa threshold=0.5 depression=dep direction=dir stream_raster=stream memory=35000 --o --verbose
> >> >
> >> > where elv is a raster of 142690 * 80490 = 11,485,118,100 cells
> >> >
> >> > and I get this error
> >> >
> >> > 12.97% of data are kept in memory
> >> > Will need up to 293.52 GB (300563 MB) of disk space
> >> > Creating temporary files...
> >> > Loading input raster maps...
> >> > 0..3..6..9..12..15..18..21..24..27..30..33..36..39..42..45..48..51..54..57..60..63..66..69..72..75..78..81..84..87..90..93..96..99..100
> >> > ERROR: Unable to load input raster map(s)
> >>
> >> This error is caused by integer overflow because not all variables
> >> necessary to support such large maps were 64 bit integer.
> >>
> >> Fixed in trunk and relbr72 with r71620,1, and tested with a DEM with
> >> 172800 * 67200 = 11,612,160,000 cells: r.stream.extract finished
> >> successfully in 18 hours (not a HPC, a standard desktop machine with
> >> 32 GB of RAM and a 750 GB SSD).
> >>
> >> > According to the help manual, memory=35000 should be set according
> >> > to the overall memory available. I set the HPC upper memory limit
> >> > to 40G.
> >> >
> >> > I tried several combinations of these parameters but I still get
> >> > the same error.
> >> > If r.stream.extract is based on r.watershed, then the segmentation
> >> > library should be able to handle a huge raster.
> >>
> >> r.stream.extract is based on a version of r.watershed that did not
> >> yet support such huge raster maps, therefore support for such huge
> >> raster maps needed to be added to r.stream.extract separately.
> >>
> >> > Does anyone know how to overcome this limitation/error?
> >>
> >> Please use the latest GRASS 7.2 or GRASS 7.3 version from svn.
> >>
> >> Markus M
> >>
> >> > Thank you
> >> > Best
> >> > --
> >> > Giuseppe Amatulli, Ph.D.
> >> >
> >> > Research scientist at
> >> > Yale School of Forestry & Environmental Studies
> >> > Yale Center for Research Computing
> >> > Center for Science and Social Science Information
> >> > New Haven, 06511
> >> > Teaching: http://spatial-ecology.org
> >> > Work: https://environment.yale.edu/profile/giuseppe-amatulli/
> >> >
> >> > _______________________________________________
> >> > grass-user mailing list
> >> > https://lists.osgeo.org/mailman/listinfo/grass-user
