I've been working in the gdal-dev-env (version 3.1.0, installed around mid-December) on OSGeo4W, mostly because it's faster than making COGs using the GTiff driver. I'm working on large orthophoto rasters (e.g. 102600x91100), generating VRTs, TIFFs and COGs.
While I can do LZW, DEFLATE, and uncompressed just fine (2 minutes with all cores to make an LZW COG from a VRT), I'm struggling to make JPEG COGs. If I run a loop, I can't get through more than one image without gdal_translate hanging at the finish, sometimes for tens of hours. If I kill the process (CTRL-C doesn't always work, but Task Manager does), the resulting COG is fine (same size as if I wait n hours for the process to finish).

Over the last few years I've had this issue (gdal_translate hanging at "100 - done.") on many large rasters, even when building as TIFF. Also maybe worth noting: even on smaller rasters I often see GDAL hang for minutes to tens of minutes at the end of a raster build. In the past I was only building single rasters, so it wasn't that big of a deal; I could just kill the process. Not any more. I frequently build several at a time and hope to scale up.

I'm running on a Threadripper 3960X with 256GB RAM that I built. All processing is on an NVMe drive. The LZW-compressed TIFFs (COGs) are around 1.5 - 3GB (8-bit RGB with mask band).

If I build with CPL_DEBUG=ON, depending on cachemax size, I see "potential thrashing on band one of ." at around 10-20% (even with GDAL_CACHEMAX at 80%), and if it's not set high enough I'm stuck at 20% for hours and hours. Then GDAL hangs at "100 - done." for anywhere from 2 to 12+ hours unless I kill it. As noted above, if I kill the process the final raster builds out and appears to work fine, and is the same as if I wait X hours for it to exit.

For a test with debug on that I just finished, after 2.5 hours hung at "done" I got this line:

GDAL: GDALClose(<outfile.tif.ovr.tmp, this=000001FDC5531C50)

Another 45 minutes later the input and output TIFFs closed and the shared library unloaded, after the RAM slowly emptied from ~30 gig over that time.
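For a sense of scale, the uncompressed size of the example raster can be estimated from the figures in the post (102600x91100 pixels, 8-bit RGB plus a mask band). This is only a rough sketch: it assumes the mask is held as a full 8-bit band in memory, which may not match how GDAL actually stores or caches it, and the helper name `uncompressed_bytes` is made up for illustration. Under those assumptions the three colour bands alone come to about 26 GiB, and about 35 GiB with the mask.

```python
# Rough uncompressed-size estimate for the example raster in the post.
# Dimensions and bit depth come from the post; treating the mask as a
# full 8-bit band is an assumption made for this estimate.
WIDTH, HEIGHT = 102600, 91100


def uncompressed_bytes(width, height, bands, bytes_per_sample=1):
    """Bytes needed to hold every sample of the raster with no compression."""
    return width * height * bands * bytes_per_sample


pixels = WIDTH * HEIGHT
rgb_bytes = uncompressed_bytes(WIDTH, HEIGHT, bands=3)
rgbm_bytes = uncompressed_bytes(WIDTH, HEIGHT, bands=4)

print(f"{pixels:,} pixels")
print(f"RGB only:   {rgb_bytes / 2**30:.1f} GiB")
print(f"RGB + mask: {rgbm_bytes / 2**30:.1f} GiB")
```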
My overall command at the moment is:

gdal_translate .\<infile.tif> <outfile.tif> -of COG -co COMPRESS=JPEG -co QUALITY=90 --config GDAL_CACHEMAX "80%" --config GDAL_SWATH_SIZE "80%" --config GDAL_FORCE_CACHING YES --config GDAL_MAX_DATASET_POOL_SIZE 2048

With lower values (and possibly without the GDAL_FORCE_CACHING YES setting, which I only just added) I get the same "hang" at 100%, lasting even longer. Again, the same COG builds in 2 minutes with LZW, but with JPEG and all the cachemax settings ramped up, it takes maybe 6 hours.
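For the batch loop mentioned earlier, the command above could be assembled and run from Python roughly as follows. This is a sketch, not a fix for the hang: `build_cog_command` and `run_with_timeout` are hypothetical helper names, the config options are written with GDAL's documented `--config` spelling, and the timeout simply automates the manual kill described in the post. Whether a killed run always leaves a complete COG has only been observed here and is not guaranteed.

```python
import subprocess


def build_cog_command(infile, outfile, quality=90, cachemax="80%"):
    """Return the gdal_translate argument list used in the post."""
    return [
        "gdal_translate", infile, outfile,
        "-of", "COG",
        "-co", "COMPRESS=JPEG",
        "-co", f"QUALITY={quality}",
        "--config", "GDAL_CACHEMAX", cachemax,
        "--config", "GDAL_SWATH_SIZE", "80%",
        "--config", "GDAL_FORCE_CACHING", "YES",
        "--config", "GDAL_MAX_DATASET_POOL_SIZE", "2048",
    ]


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return True if it exited on its own, False if it timed out.

    subprocess.run kills the child process when the timeout expires, which
    mirrors ending a hung gdal_translate from Task Manager.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        return False


# Building the argument list does not require GDAL to be installed.
cmd = build_cog_command("in.tif", "out.tif")
print(" ".join(cmd))
```

A loop over input files would call `run_with_timeout(build_cog_command(src, dst), timeout_s=...)` for each pair and log any file that hit the timeout so its output can be checked afterwards.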