Hi,

You send new mails faster than I can read and understand the previous ones, but here come some quick comments.
* The gdalbuildvrt command looks good, but it would be kind to tell which file is international and which one is UK; it is not obvious to me from the paths /5/ and /3/. You want to list the international file first in the VRT and the UK file after it to suit your workflow.
* This does not make sense: gdal_translate /data/coastal-2020.vrt /data/3/coastal-2020.tif /data/5/coastal-2020.tif. gdal_translate takes exactly one input and one output, so: gdal_translate /data/coastal-2020.vrt output.tif
* I think that gdal_translate does not have an option -n (perhaps you were thinking of gdal_merge -n; gdal_translate uses -a_nodata).
* NUM_THREADS=ALL_CPUS is not a creation option (-co) but an open option (-oo), or it can be used as a configuration option.
* It may be that -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 can really be used without -co TILED=YES; I have never tried. It is still essential that the output is tiled.
* Without testing I am not sure whether the issues above are what made the output rubbish, but it is possible.
* Your second and third trials are doomed to do something other than what you want: gdalwarp -r near -overwrite. The -overwrite switch means that the existing output file will be deleted and a new one with the same name will be created. You wanted to just update the target file, so don't use -overwrite. And when you use an existing file as a target, the creation options have no effect; the target file has already been created with some options.
* Now you understand why the output is 18376, 17086 and not 450000, 225000. (If not, here is a hint: what file did -overwrite delete?)

I suppose you are starting to get frustrated, because testing all kinds of random commands with huge images takes a long time ("The AWS Instance with over 60 VCPU ran for over 8 hours"). However, you could have made all your mistakes much faster with much smaller files (1000 by 1000 pixels, for example). Make the commands work with small images, and when you are satisfied, test them with bigger images.
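To illustrate the "rehearse with small images" advice, here is a minimal sketch of the corrected VRT workflow on tiny synthetic rasters. The file names, sizes, and burn values are invented for the demo, and which real file is "international" and which is "UK" remains an assumption the poster should confirm; gdal_create needs GDAL >= 3.2.

```shell
# Skip the demo politely if the GDAL CLI tools are not installed.
command -v gdal_create >/dev/null 2>&1 || { echo "GDAL not found, skipping"; exit 0; }

# Two tiny stand-in rasters with the same global extent.
gdal_create -outsize 100 100 -bands 1 -ot Float32 -burn 1 \
    -a_srs EPSG:4326 -a_ullr -180 90 180 -90 demo_international.tif
gdal_create -outsize 100 100 -bands 1 -ot Float32 -burn 2 \
    -a_srs EPSG:4326 -a_ullr -180 90 180 -90 demo_uk.tif

# International first, UK second: sources listed later are drawn on top in the VRT.
gdalbuildvrt demo_merged.vrt demo_international.tif demo_uk.tif

# gdal_translate: exactly one input and one output. Tiling is requested
# explicitly, and the thread count is passed as a configuration option, not -co.
gdal_translate demo_merged.vrt demo_merged.tif \
    -co TILED=YES -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 \
    -co COMPRESS=LZW -co PREDICTOR=3 -co BIGTIFF=YES \
    --config GDAL_NUM_THREADS ALL_CPUS
```

Once this works on the 100 x 100 rehearsal, the same pattern can be pointed at the real /vsis3/ paths.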
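The "update the target in place, without -overwrite" point can likewise be rehearsed on tiny files. Everything here is invented for the demo (names, sizes, the roughly-UK bounds); gdal_create needs GDAL >= 3.2.

```shell
# Skip the demo politely if the GDAL CLI tools are not installed.
command -v gdal_create >/dev/null 2>&1 || { echo "GDAL not found, skipping"; exit 0; }

# A tiny "international" target and a small "UK-sized" patch.
gdal_create -outsize 100 100 -bands 1 -ot Float32 -burn 1 -a_nodata -9999 \
    -a_srs EPSG:4326 -a_ullr -180 90 180 -90 demo_target.tif
gdal_create -outsize 10 10 -bands 1 -ot Float32 -burn 2 -a_nodata -9999 \
    -a_srs EPSG:4326 -a_ullr -11 61 2 49 demo_patch.tif

# Warp the patch INTO the existing target: no -overwrite (that would delete
# the target first) and no -co options (the target already exists, so
# creation options would be ignored anyway).
gdalwarp -r near demo_patch.tif demo_target.tif

# The target keeps its own dimensions; only the patch area was updated.
gdalinfo demo_target.tif | grep "Size is"
```

If the real UK file is a full-size raster that is mostly NoData, cropping it first to its valid-data window (e.g. with gdal_translate -projwin) is a cheap thing to try before the warp.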
That said, I cannot promise that you will get good performance when updating a 450000 x 225000 LZW-compressed GeoTIFF. I have never tried anything like that with such a big image myself.

-Jukka Rahkonen-

From: gdal-dev <gdal-dev-boun...@lists.osgeo.org> On behalf of Clive Swan
Sent: Wednesday, 14 December 2022 18.34
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] gdalwarp running very slow

I want to APPEND the UK data into the international.tif. The updated international size should also be: 450000, 225000.

I first tried:

gdalbuildvrt -o /data/coastal-2020.vrt /vsis3/summer/3/coastal-2020.tif /vsis3/summer/5/coastal-2020.tif

gdal_translate /data/coastal-2020.vrt /data/3/coastal-2020.tif /data/5/coastal-2020.tif -n -9999 -co BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES --config

The output was rubbish.
The UK image size is: 18376, 17086
The international size is: 450000, 225000

I tried:
/data/3/coastal-2020-test.tif = 7GB
/data/5/coastal-2020.tif = 700MB

gdalwarp -r near -overwrite /data/3/coastal-2020.tif /data/3/coastal-2020-test1.tif -co BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS -co PREDICTOR=3 --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES & disown -h

The AWS Instance with over 60 VCPU ran for over 8 hours.

I tried:
/data/5/coastal-2020.tif = 700MB
/data/3/coastal-2020-test.tif = 7GB

gdalwarp -r near -overwrite /data/5/coastal-2020.tif /data/3/coastal-2020-test.tif -co BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS -co PREDICTOR=3 --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES

The output is: 18376, 17086, not 450000, 225000.

Any assistance appreciated.

Thanks,
Clive

On Wed, 14 Dec 2022 at 09:23, Rahkonen Jukka <jukka.rahko...@maanmittauslaitos.fi> wrote:

Hi,

I don't mean that you should try this and
that blindly, but rather that you describe what data you have in your hands and what you are planning to do with it, so that other GDAL users can consider what reasonable alternatives you might have. I have never done anything even close to your use case, but based on other experience I can see potential issues in a few places:

* You try to update image A, which has a size of 450000 by 225000 pixels, with image B that has the same size. The result would be A updated into a full copy of B if all pixels in B are valid.
* However, image B probably contains a lot of NoData (we do not know, because you have not told us), and if GDAL deals with the NoData correctly, the result would be A updated with the valid pixels from B, which is probably what is desired.
* However, we do not know how effectively GDAL skips the NoData pixels of B. It may be fast or it may not. If we know that most of the world is NoData, it might be good to crop image B to cover just the area where there is data; that is probably the UK in your case. If skipping the NoData is fast, cropping won't give a speedup, but it is cheap to test.
* You have compressed images. The LZW algorithm compresses some data more effectively than other data. If you expect that you can replace a chunk of LZW-compressed data inside a TIFF file in place with another chunk of LZW-compressed data, you are wrong: the new chunk may be larger, and then it simply cannot fit into the same space. The assumption that updating a 6 GB image with 600 MB of new data yields a 6 GB image is not correct with compressed data.
* I can imagine that there could be other technical reasons to write the replacement data at the end of the existing TIFF and update the image directories. If the file size is critical, this may require re-writing the updated TIFF into a new TIFF file. A complete re-write can be done in the most optimal way.
See this wiki page: https://trac.osgeo.org/gdal/wiki/UserDocs/GdalWarp#GeoTIFFoutput-coCOMPRESSisbroken

* If the images are in AWS, it is possible that the process should be somewhat different than with local images. I have no experience with AWS yet.
* A 450000 by 225000 image is rather big. It is possible that it would be faster to split the image into smaller parts, update the parts that need updating, and combine the parts back into a big image. Or keep the parts and combine them virtually with gdalbuildvrt into a VRT.

Your use case is not so usual and it is rather heavy, but there are certainly several ways to do what you want. What should be avoided is selecting an inefficient method and then trying to optimize it.

Good luck with your experiments,

-Jukka-

From: Clive Swan <clives...@gmail.com>
Sent: Wednesday, 14 December 2022 10.29
To: Rahkonen Jukka <jukka.rahko...@maanmittauslaitos.fi>
Subject: Re: [gdal-dev] gdalwarp running very slow

Hi Jukka,

Thanks for that, was really stressed. I will export the UK extent and rerun the script.
Thanks,
Clive

Sent from Outlook for Android

________________________________
From: Rahkonen Jukka <jukka.rahko...@maanmittauslaitos.fi>
Sent: Wednesday, December 14, 2022 7:18:50 AM
To: Clive Swan <clives...@gmail.com>; gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] gdalwarp running very slow

Hi,

Thank you for the information about the source files. I do not yet understand what you are trying to do and why. Both images have the same size, 450000 by 225000, and they cover the same area. Is the image 5_UK_coastal-2020.tif just NoData with pixel value -9999 everywhere outside the UK? The name of the image makes me think so.

-Jukka Rahkonen-

From: Clive Swan <clives...@gmail.com>
Sent: Tuesday, 13 December 2022 19.22
To: gdal-dev@lists.osgeo.org
Cc: Rahkonen Jukka <jukka.rahko...@maanmittauslaitos.fi>
Subject: [gdal-dev] gdalwarp running very slow

Greetings,

I am using the same files; I copied them from an AWS bucket to a local AWS instance.
I tried gdal_merge: it tries to create a 300 GB file.
I tried gdal_translate: it ran but created a 2.5 GB file, not a 6.9 GB file.
Now trying gdalwarp.
The gdalinfo is the same in both datasets:

coastal-2020.tif (6.9GB)

Driver: GTiff/GeoTIFF
Size is 450000, 225000
Coordinate System is:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.000800000000000,-0.000800000000000)
Metadata:
  AREA_OR_POINT=Area
  datetime_created=2022-11-14 18:05:14.053301
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=BAND
  PREDICTOR=3
Corner Coordinates:
Upper Left  (-180.0000000,  90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left  (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000,  90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center      (   0.0000000,   0.0000000) (  0d 0' 0.01"E,  0d 0' 0.01"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
  Description = score
  NoData Value=-9999
Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_value
  NoData Value=-9999
Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_min
  NoData Value=-9999
Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_max
  NoData Value=-9999
Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = likelihood
  NoData Value=-9999
Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = return_time
  NoData Value=-9999
Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = likelihood_confidence
  NoData Value=-9999
Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = climate_reliability
  NoData Value=-9999
Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = hazard_reliability
  NoData Value=-9999

5_UK_coastal-2020.tif (600MB)

Driver: GTiff/GeoTIFF
Size is 450000, 225000
Coordinate System is:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.000800000000000,-0.000800000000000)
Metadata:
  AREA_OR_POINT=Area
  datetime_created=2022-11-14 18:05:14.053301
  hostname=posix.uname_result(sysname='Linux', nodename='ip-172-31-12-125', release='5.15.0-1022-aws', version='#26~20.04.1-Ubuntu SMP Sat Oct 15 03:22:07 UTC 2022', machine='x86_64')
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=BAND
  PREDICTOR=3
Corner Coordinates:
Upper Left  (-180.0000000,  90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left  (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000,  90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center      (   0.0000000,   0.0000000) (  0d 0' 0.01"E,  0d 0' 0.01"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
  Description = score
  NoData Value=-9999
Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_value
  NoData Value=-9999
Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_min
  NoData Value=-9999
Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = severity_max
  NoData Value=-9999
Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = likelihood
  NoData Value=-9999
Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = return_time
  NoData Value=-9999
Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = likelihood_confidence
  NoData Value=-9999
Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = climate_reliability
  NoData Value=-9999
Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
  Description = hazard_reliability
  NoData Value=-9999

--
Regards,
Clive Swan

--

Hi,

If you are still struggling with the same old problem, could you please finally send the gdalinfo reports of your two input files, which this time are:

coastal-2020.tif
5_UK_coastal-2020.tif

-Jukka Rahkonen-

From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> On behalf of Clive Swan
Sent: Tuesday, 13 December 2022 17.23
To: gdal-dev at lists.osgeo.org
Subject: [gdal-dev] gdalwarp running very slow

Greetings,

I am running gdalwarp on a 6 GB (output) and 600 MB (input) tif image; the AWS instance has approx 60 VCPU. It has taken over 6 hours so far - still running. Is it possible to optimise this and speed it up?
gdalwarp -r near -overwrite coastal-2020.tif 5_UK_coastal-2020.tif -co BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co NUM_THREADS=ALL_CPUS --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES

--
Regards,
Clive Swan
--
M: +44 7766 452665
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev