Re: [gdal-dev] discussion on xarray and the missing abstract-grid specification

2024-04-03 Thread Michael Sumner via gdal-dev
ah thanks Even, I understand that and it was the first example that Ryan
acknowledged was a problem a while back - he'd seen the Float32 problem in
GHRSST. Excellent to get your take and the details.

I've written up a small description of how this affects GHRSST, and my
still-undecided next step to fix it (as COGs).

https://gist.github.com/mdsumner/66aac8e54b54faa24e14b63f396dca32

(this was going to be part of my pitch to xarray and now I don't need so
much extra background to make the point).

I'm keen to collate examples if people have good ones, I have plenty though
🙏

Cheers, Mike

On Thu, 4 Apr 2024, 08:26 Even Rouault,  wrote:

> Hi,
>
> Not that I've a strong opinion on what GeoZarr/XArray should do, but yes
> 1D coordinate arrays are a bit of a pain to deal with, when they actually
> just encode a regularly spaced grid. This is something I've hit in the
> bridge between the GDAL multidimensional API (roughly netCDF based) and the
> GDAL "classic" 2D raster API, when you want to expose a view of a multi-dim
> dataset as a classic one. There's this bit of logic in
> https://github.com/OSGeo/gdal/blob/master/gcore/gdalmultidim.cpp#L7334 to
> guess if the 1D coordinate array is regularly spaced or not. The netCDF
> driver has a similar heuristics too. This is quite error prone, especially
> if Float32 coordinates are used. There it can be difficult to detect if it
> is a regularly spaced array but Float32 hasn't sufficient precision to
> fully encode it, or if it there are genuine irregularities in the spacing.
> For Float64, the precision issues are less a practical error, because
> comparing the expected value (assuming a regular spacing) with the one of
> the 1D coordinate array with a very small relative epsilon, like 1e-8, is
> sufficient.
>
> Even
>
>
> Le 03/04/2024 à 22:56, Michael Sumner via gdal-dev a écrit :
>
> Here's an (ahem) extremely important discussion on the prospects for
> xarray to extend from the coordinates-only model (like that of netcdf) for
> geo reference:
>
>
> https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/
>
> I'm heartened to see this recognized at this level. I think GDAL per se
> should feature in this discussion prominently, and not just the downstream
> Python packages that are mentioned.
>
> For my input, I'll be highlighting examples of
>
> -  degenerate rectilinear coordinates, and the problem of whether to
> assign a regular grid there (with a trivial lightweight Translate a la
> -a_ullr), or to define a new regular grid and push it through the Warp api,
> as a dataset that picks up or can be pointed to the 1D coordinate arrays.
> This has been the crux of the issue for me, I get to decisions where it's
> not clear what was intended or what should be done going forward.
>
> - actual curvilinear coordinates, and the very very general power of that
> to resolve to a regular grid with very simple specification (specify some
> or nothing of target/extent/resolution/crs/dimension). Key problems here
> are when the coordinate arrays are not auto-detected (rare) or when the
> longitudes are unwrapped (less rare).
>
> - that extent+dimension is complementary to the affine transform, and way
> more user-friendly way to work with rasters for the assign and/or target
> specification step (this is huge in R since the {raster} package ca. 2009
> and very well supported by Warp and Translate). It's true that
> extent+dimension can't do shear params but it's a perfectly valid way to
> work and flipping from transform to extent is trivial numerically.
>
> I hope some more voices from this community pitch in as well, very happy
> to discuss here or offline about any of this.  I'm not really a technical
> expert but I have a good overview of the landscape that GDAL sits in, and
> I'm making similar pitches for involvement in the R communities.
>
> Cheers, Mike
>
> --
> Michael Sumner
> Software and Database Engineer
> Australian Antarctic Division
> Hobart, Australia
> e-mail: mdsum...@gmail.com
>
> ___
> gdal-dev mailing 
> listgdal-dev@lists.osgeo.orghttps://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> -- http://www.spatialys.com
> My software is free, but my time generally not.
>
>
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] discussion on xarray and the missing abstract-grid specification

2024-04-03 Thread Even Rouault via gdal-dev

Hi,

Not that I've a strong opinion on what GeoZarr/XArray should do, but yes 
1D coordinate arrays are a bit of a pain to deal with, when they 
actually just encode a regularly spaced grid. This is something I've hit 
in the bridge between the GDAL multidimensional API (roughly netCDF 
based) and the GDAL "classic" 2D raster API, when you want to expose a 
view of a multi-dim dataset as a classic one. There's this bit of logic 
in 
https://github.com/OSGeo/gdal/blob/master/gcore/gdalmultidim.cpp#L7334 
to guess if the 1D coordinate array is regularly spaced or not. The 
netCDF driver has a similar heuristics too. This is quite error prone, 
especially if Float32 coordinates are used. There it can be difficult to 
detect if it is a regularly spaced array but Float32 hasn't sufficient 
precision to fully encode it, or if it there are genuine irregularities 
in the spacing. For Float64, the precision issues are less a practical 
error, because comparing the expected value (assuming a regular spacing) 
with the one of the 1D coordinate array with a very small relative 
epsilon, like 1e-8, is sufficient.


Even


Le 03/04/2024 à 22:56, Michael Sumner via gdal-dev a écrit :
Here's an (ahem) extremely important discussion on the prospects for 
xarray to extend from the coordinates-only model (like that of netcdf) 
for geo reference:


https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/

I'm heartened to see this recognized at this level. I think GDAL per 
se should feature in this discussion prominently, and not just the 
downstream Python packages that are mentioned.


For my input, I'll be highlighting examples of

-  degenerate rectilinear coordinates, and the problem of whether to 
assign a regular grid there (with a trivial lightweight Translate a la 
-a_ullr), or to define a new regular grid and push it through the Warp 
api, as a dataset that picks up or can be pointed to the 1D coordinate 
arrays. This has been the crux of the issue for me, I get to decisions 
where it's not clear what was intended or what should be done going 
forward.


- actual curvilinear coordinates, and the very very general power of 
that to resolve to a regular grid with very simple specification 
(specify some or nothing of target/extent/resolution/crs/dimension). 
Key problems here are when the coordinate arrays are not auto-detected 
(rare) or when the longitudes are unwrapped (less rare).


- that extent+dimension is complementary to the affine transform, and 
way more user-friendly way to work with rasters for the assign and/or 
target specification step (this is huge in R since the {raster} 
package ca. 2009 and very well supported by Warp and Translate). It's 
true that extent+dimension can't do shear params but it's a perfectly 
valid way to work and flipping from transform to extent is trivial 
numerically.


I hope some more voices from this community pitch in as well, very 
happy to discuss here or offline about any of this.  I'm not really a 
technical expert but I have a good overview of the landscape that GDAL 
sits in, and I'm making similar pitches for involvement in the R 
communities.


Cheers, Mike

--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsum...@gmail.com

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


--
http://www.spatialys.com
My software is free, but my time generally not.
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


[gdal-dev] discussion on xarray and the missing abstract-grid specification

2024-04-03 Thread Michael Sumner via gdal-dev
Here's an (ahem) extremely important discussion on the prospects for xarray
to extend from the coordinates-only model (like that of netcdf) for geo
reference:

https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/

I'm heartened to see this recognized at this level. I think GDAL per se
should feature in this discussion prominently, and not just the downstream
Python packages that are mentioned.

For my input, I'll be highlighting examples of

-  degenerate rectilinear coordinates, and the problem of whether to assign
a regular grid there (with a trivial lightweight Translate a la -a_ullr),
or to define a new regular grid and push it through the Warp api, as a
dataset that picks up or can be pointed to the 1D coordinate arrays. This
has been the crux of the issue for me, I get to decisions where it's not
clear what was intended or what should be done going forward.

- actual curvilinear coordinates, and the very very general power of that
to resolve to a regular grid with very simple specification (specify some
or nothing of target/extent/resolution/crs/dimension). Key problems here
are when the coordinate arrays are not auto-detected (rare) or when the
longitudes are unwrapped (less rare).

- that extent+dimension is complementary to the affine transform, and way
more user-friendly way to work with rasters for the assign and/or target
specification step (this is huge in R since the {raster} package ca. 2009
and very well supported by Warp and Translate). It's true that
extent+dimension can't do shear params but it's a perfectly valid way to
work and flipping from transform to extent is trivial numerically.

I hope some more voices from this community pitch in as well, very happy to
discuss here or offline about any of this.  I'm not really a technical
expert but I have a good overview of the landscape that GDAL sits in, and
I'm making similar pitches for involvement in the R communities.

Cheers, Mike

-- 
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsum...@gmail.com
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev