Re: [gdal-dev] slow netCDF read times

2016-11-22 Thread Pablo Rozas Larraondo
Thank you Julien,

Your comment is very helpful, and I think the issue you're pointing out is
very much related to my problem. I experimented with different chunking
patterns and saw that chunking per line pretty much solved the performance
issue, but I didn't understand why.

Also, ticket 5291 talks about BottomUp NetCDF causing grief to GDAL. I've
checked and my file is BottomUp. I've flipped it by doing:
gdal_translate -of netcdf -co "WRITE_BOTTOMUP=NO" chirps-v2.0.1981.dekads.classic.nc out.nc

And transformed it again into NetCDF4 with its original chunking pattern
and deflate level:
nccopy -7 -c time/4,lat/250,lon/900 out.nc out2.nc

In this case GDAL reads the file almost as fast as the native netCDF
library does. From now on I'll make sure I don't produce netCDF files that
are bottom-up. Does anyone know where this bottom-up convention in netCDF
files comes from, or why it exists?
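
As a rough check on the chunk shape used in the nccopy command above, the
sketch below converts it to bytes and compares it with a chunk cache size.
Two values are assumptions that do not come from this thread: 4-byte (32-bit)
data values and a 4 MiB per-variable cache. The helper names are hypothetical.

```python
from math import prod


def chunk_bytes(chunk_shape, itemsize):
    """Uncompressed size in bytes of one chunk with the given shape."""
    return prod(chunk_shape) * itemsize


def chunks_that_fit(cache_bytes, chunk_shape, itemsize):
    """Number of whole uncompressed chunks that fit in the cache."""
    return cache_bytes // chunk_bytes(chunk_shape, itemsize)


CHUNK_SHAPE = (4, 250, 900)    # time, lat, lon, as passed to nccopy -c
ITEMSIZE = 4                   # assumption: 32-bit values
CACHE_BYTES = 4 * 1024 * 1024  # assumption: 4 MiB chunk cache

size = chunk_bytes(CHUNK_SHAPE, ITEMSIZE)
fit = chunks_that_fit(CACHE_BYTES, CHUNK_SHAPE, ITEMSIZE)
print(f"one chunk: {size} bytes, chunks that fit in cache: {fit}")
```

Under those assumptions a single chunk takes most of the cache, so any read
pattern that alternates between chunks would have to decompress them again.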

Cheers,
Pablo




___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: [gdal-dev] slow netCDF read times

2016-11-22 Thread Julien Demaria
Hi,

Maybe the problem is that your NetCDF internal chunk cache is too small; it
looks related to this ticket: https://trac.osgeo.org/gdal/ticket/5291
You can change this per-variable cache using this C function:
http://www.unidata.ucar.edu/software/netcdf/docs/group__variables.html#ga2788cbfc6880ec70c304292af2bc7546
Otherwise, a workaround may be to rechunk your data using nccopy so that the
chunks are the same size as your reading window.
Another solution is to recompile your NetCDF library with a larger chunk cache.
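
To illustrate why a small chunk cache can hurt so much, here is a toy model
in Python. It is not GDAL or netCDF code. It assumes that the reader fetches
full-width scanlines one at a time, that the cache evicts the least recently
used chunk, and that the grid is 7200 cells wide; none of these is confirmed
in this thread, and the time dimension is ignored. The function names are
hypothetical.

```python
from collections import OrderedDict


def count_chunk_loads(requests, cache_slots):
    """Count chunk decompressions for a request sequence under an LRU cache."""
    cache = OrderedDict()
    loads = 0
    for chunk in requests:
        if chunk in cache:
            cache.move_to_end(chunk)   # hit: mark as most recently used
            continue
        loads += 1                     # miss: the chunk is inflated again
        cache[chunk] = True
        if len(cache) > cache_slots:
            cache.popitem(last=False)  # evict the least recently used chunk
    return loads


def scanline_requests(n_rows, width, chunk_rows, chunk_cols):
    """Yield (chunk_row, chunk_col) for each chunk a full-width scanline touches."""
    for y in range(n_rows):
        for x in range(0, width, chunk_cols):
            yield (y // chunk_rows, x // chunk_cols)


def cells_inflated(n_rows, width, chunk_rows, chunk_cols, cache_slots):
    """Total cells decompressed when reading n_rows scanlines."""
    loads = count_chunk_loads(
        scanline_requests(n_rows, width, chunk_rows, chunk_cols), cache_slots
    )
    return loads * chunk_rows * chunk_cols


WIDTH = 7200  # assumed grid width, not stated in the thread

big_small_cache = cells_inflated(500, WIDTH, 250, 900, cache_slots=1)
big_large_cache = cells_inflated(500, WIDTH, 250, 900, cache_slots=8)
per_line = cells_inflated(500, WIDTH, 1, 900, cache_slots=1)

print(big_small_cache, big_large_cache, per_line)
```

In this model, 250x900 chunks with room for only one chunk in the cache
inflate 250 times more cells than either a cache that holds a full row of
chunks or a per-line chunking, which is consistent with both workarounds
suggested above.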

Regards,

Julien


[gdal-dev] slow netCDF read times

2016-11-21 Thread Pablo Rozas Larraondo
Hello,

I've come across some NetCDF4 files that GDAL takes a surprisingly long
time to read. For example, here is a public file containing precipitation
data:

ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/global_dekad/netcdf/chirps-v2.0.2015.dekads.nc

If I use GDAL to read a small top-left block (500x500) from one of its time
bands, it takes approximately 1 minute on my computer. Source code is
available here:

https://gist.github.com/monkeybutter/769a24bcf87682171eb87ac05c9347c5

The equivalent operation is completed in less than a second using the
NetCDF library and even reading the whole file takes around 6 seconds with
the same library.

I've tried to profile the GDAL program to get more insight into what's
causing the overhead, without much success. All I know is that the deflate
function is using 96% of the resources. I also guess that the way this file
is chunked has something to do with its performance. Can anyone suggest how
to better understand what's happening here?

Thank you for your help,
Pablo