Excellent, thanks Even! Do you recall what the runtime was before these changes on your test system?
From: Even Rouault <even.roua...@spatialys.com>
Date: Tuesday, July 23, 2024 at 3:00 PM
To: Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] <jesse.r.me...@nasa.gov>, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev <gdal-dev@lists.osgeo.org>
Subject: [EXTERNAL] Re: [gdal-dev] Expected runtime of polygonize (GDAL 3.9.0) for few very large features.

Hi,

I got a chance to look at your test dataset. In https://github.com/OSGeo/gdal/pull/10477, I've reduced the runtime to 8 minutes (with GeoParquet output, without spatial sorting) by optimizing some implementation details. I believe this could be reduced further, as most of the time is still spent in malloc/free of temporary objects (the output is 90 million polygons!) and some objects could be reused, but that would require more extensive changes.

Even

On 01/07/2024 at 18:40, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev wrote:

Hi,

We’ve encountered a few images with what seem like pathological performance problems with Polygonize. The details below are a report from another developer that I haven’t yet independently verified.

We threshold a raster image to a binary mask in a memory dataset, then use that band as its own mask to exclude the background:

    gdal.Polygonize(nn_mem_band, nn_mem_band, ogr_mem_lyr, -1)

We have a number of 32k x 32k raster images that feature a number of very large same-valued regions (some as large as 80% of the entire raster). We’re seeing ~10 hours on a modern workstation to complete the line of code above. OpenCV can apparently construct a connected components list in mere seconds, on the same workstation and image, so we’re considering constructing the OGR geometries directly from those as a temporary workaround.
Is this situation a known pitfall with the current algorithm / data structures behind Polygonize? I’m able to share the problematic tile(s) if of interest.

Best,
Jesse

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com
My software is free, but my time generally not.
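For context on the connected-components workaround mentioned in the original report, here is a minimal, standard-library-only sketch of what such a labelling pass does: threshold a raster to a binary mask, then group foreground pixels into regions with a breadth-first flood fill. It is not GDAL's Polygonize algorithm and not OpenCV's implementation. The function name `label_components`, the toy `raster`, and the `THRESHOLD` value are all hypothetical, and the choice of 4-connectivity is an assumption made for the illustration. Each pixel is visited a bounded number of times, so this sketch runs in time linear in the number of pixels.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of truthy cells in a 2D list.

    Returns (labels, sizes): labels has the same shape as mask, with 0 for
    background and 1..N for regions; sizes maps each label to its pixel count.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already belong to a region.
            if not mask[r][c] or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                # Visit the four edge-sharing neighbours (assumed 4-connectivity).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes


# Hypothetical toy raster and threshold, standing in for a 32k x 32k image.
raster = [
    [200, 210, 10, 0],
    [190, 5, 0, 220],
    [0, 0, 230, 240],
]
THRESHOLD = 128
mask = [[1 if value >= THRESHOLD else 0 for value in row] for row in raster]
labels, sizes = label_components(mask)
```

Turning each labelled region into an OGR geometry, as the report proposes, would still require tracing the region boundaries; this sketch covers only the labelling step.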