On 23/07/2024 at 21:08, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND
APPLICATIONS INC] wrote:
Excellent, thanks Even! Do you recall what the runtime was before
these changes on your test system?
I killed the process after about half an hour. I don't recall the progress
it had reached, maybe 40%-50%.
From: Even Rouault <even.roua...@spatialys.com>
Date: Tuesday, July 23, 2024 at 3:00 PM
To: Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS
INC] <jesse.r.me...@nasa.gov>, Meyer, Jesse R. (GSFC-618.0)[SCIENCE
SYSTEMS AND APPLICATIONS INC] via gdal-dev <gdal-dev@lists.osgeo.org>
Subject: [EXTERNAL] Re: [gdal-dev] Expected runtime of polygonize
(GDAL 3.9.0) for few very large features.
Hi,
I've had a chance to have a look at your test dataset. In
https://github.com/OSGeo/gdal/pull/10477, I've reduced the runtime to
8 minutes (with GeoParquet output, without spatial sorting), by
optimizing some implementation details. I believe this could be
further reduced as most of the time is still spent in malloc/free of
temporary objects (the output is 90 million polygons!) and some
objects could be reused, but that would require more extensive changes.
Even
On 01/07/2024 at 18:40, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS
AND APPLICATIONS INC] via gdal-dev wrote:
Hi,
We’ve encountered a few images with what seems like pathological
performance problems with Polygonize. The details below are a
report from another developer that I haven’t yet independently
verified.
We threshold a raster image to a binary mask in a memory dataset,
then use that band as its own mask to mask out the background:
gdal.Polygonize(nn_mem_band, nn_mem_band, ogr_mem_lyr, -1)
We have a number of 32k x 32k raster images that feature a number of
very large same-valued regions (some as large as 80% of the entire
raster). We’re seeing ~10hrs on a modern workstation to complete
the line of code above. OpenCV can apparently construct a
connected components list in mere seconds, on the same workstation
and image, so we’re considering constructing the OGR geometries
directly from those as a temporary workaround.
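
For illustration, connected-component labelling of a binary mask can be
sketched in pure Python as below. This is only a minimal sketch of the idea,
not OpenCV's implementation and not the algorithm behind Polygonize; the
function name `label_components` and the toy mask are hypothetical, and
4-connectivity is an assumption.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of truthy cells in a 2D list.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background cells and 1..count identifying each connected region.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already carry a label.
            if not mask[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill from the seed cell.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical toy mask standing in for the thresholded raster.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_components(mask)
```

Each cell is visited a bounded number of times, so the cost grows linearly
with the number of pixels. Turning each labelled region into an OGR geometry
would still require tracing its boundary, which this sketch does not attempt.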
Is this situation a known pitfall with the current algorithm /
data structures behind Polygonize?
I’m able to share the problematic tile(s) if of interest,
Best
Jesse
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.