#2764: corrupt data written to FCELL and DCELL rasters, hard to re-produce
---------------------+-------------------------
  Reporter:  dylan   |      Owner:  grass-dev@…
      Type:  defect  |     Status:  new
  Priority:  normal  |  Milestone:  7.2.3
 Component:  Raster  |    Version:  unspecified
Resolution:          |   Keywords:
       CPU:  x86-64  |   Platform:  Linux
---------------------+-------------------------
Comment (by dylan):

 Replying to [comment:30 mmetz]:
 > Replying to [comment:29 dylan]:
 > >
 > > [...] Note that I don't have any issues with any other GRASS commands, or (as far as I can tell) general usage on this machine. I only see these errors when working with GRASS commands that:
 > >
 > >  * take a long time to run: `r.sun` or `t.rast.mapcalc` ([http://osgeo-org.1560.x6.nabble.com/Error-reading-raster-data-for-row-xxx-only-when-using-r-series-and-t-rast-series-td5229569.html e.g. a couple of years ago])
 > >  * operate on moderately large, floating-point maps
 > >  * are done in parallel, either via GNU `parallel` or as implemented in the temporal suite of modules
 > >
 > > ...hence the extreme difficulty in recreating the errors or further debugging.
 >
 > Unfortunately, I was not able to recreate these errors with the provided test data and scripts.
 >
 > I still think this is some obscure disk IO error. You could try to use `nice`, e.g. `nice r.sun ...` and `nice r.mapcalc ...` in `daily-rad.sh`. At least this helps when running many GRASS modules in parallel on HPC systems where results are written out to one single storage device.

 Well thank you very much for all of your patience, patches, and testing. I'll try the `nice` option. For now, I think that I can tolerate the much lower frequency of errors after switching to LZ4 compression. Perhaps the faster speed of LZ4 lowers the probability of concurrent write operations.

 It is still quite puzzling that this kind of error has come up on several different machines while tracking GRASS trunk over a 10 year period. Maybe this is a subtle hint that it is time to build a new workstation...

 I know this is a lot to ask, but did you try testing using ZLIB compression and running it multiple times? It took a couple of tiles before I noticed the error.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2764#comment:31>
GRASS GIS <https://grass.osgeo.org>
_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev