Excellent, thanks Even!  Do you recall what the runtime was before these 
changes on your test system?

From: Even Rouault <even.roua...@spatialys.com>
Date: Tuesday, July 23, 2024 at 3:00 PM
To: Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] 
<jesse.r.me...@nasa.gov>, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND 
APPLICATIONS INC] via gdal-dev <gdal-dev@lists.osgeo.org>
Subject: [EXTERNAL] Re: [gdal-dev] Expected runtime of polygonize (GDAL 3.9.0) 
for few very large features.

Hi,

I've had a chance to look at your test dataset. In 
https://github.com/OSGeo/gdal/pull/10477, I've reduced the runtime to 8 minutes 
(with GeoParquet output, without spatial sorting) by optimizing some 
implementation details. I believe this could be reduced further, as most of the 
time is still spent in malloc/free of temporary objects (the output is 90 
million polygons!) and some objects could be reused, but that would require more 
extensive changes.

Even
On 01/07/2024 at 18:40, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND 
APPLICATIONS INC] via gdal-dev wrote:
Hi,

We’ve encountered a few images that trigger what looks like pathological 
performance in Polygonize.  The details below are a report from another 
developer that I haven’t yet independently verified.

We threshold a raster image to a binary mask in a memory dataset, then use that 
band as its own mask to exclude the background:
gdal.Polygonize(nn_mem_band, nn_mem_band, ogr_mem_lyr, -1)

We have several 32k x 32k raster images that contain a number of very large 
same-valued regions (some as large as 80% of the entire raster).  We’re seeing 
~10hrs on a modern workstation to complete the line of code above.  OpenCV can 
apparently construct a connected components list in mere seconds, on the same 
workstation and image, so we’re considering constructing the OGR geometries 
directly from those as a temporary workaround.
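
For readers unfamiliar with the connected-components step mentioned above, here is a minimal pure-Python sketch of what such a labeling pass computes on a tiny binary mask. It is only an illustration: the function name `label_components` and the sample mask are hypothetical, it assumes 4-connectivity, and it is not the algorithm used by OpenCV or by GDAL's Polygonize.

```python
from collections import deque

def label_components(mask):
    """Label 4-connected regions of truthy cells.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each connected region.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already belong to a region.
            if not mask[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill: each foreground cell is visited once.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

# Hypothetical 3x4 mask with two separate foreground regions.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_components(mask)
```

Each cell is touched a bounded number of times, so the pass is linear in the number of pixels. Labeling alone does not produce polygon boundaries; tracing the outline of each labeled region into OGR geometries would be a separate step.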

Is this situation a known pitfall with the current algorithm / data structures 
behind Polygonize?

I’m able to share the problematic tile(s) if of interest,
Best
Jesse



_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--

http://www.spatialys.com

My software is free, but my time generally not.