Hi,

At least in this case, an INITIAL_REQUEST_PAGE_SIZE=number option would 
resolve the practical problem. We have collections with millions of features, 
so a large page size is essential when downloading a whole collection. On the 
other hand, we have complicated geometries such as lake polygons, and reading 
10000 large geometries just to resolve the schema is pretty heavy. We also 
have 126 collections in the service, which means 126 x 10000 features are read 
for resolving the schemas if the aim is to clip a small area from all 
collections into a GeoPackage.
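
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. The figures 126, 10000 and 10 come from this thread; the function name is only illustrative, and the result is an upper bound because a collection may hold fewer features than the requested limit.

```python
def schema_sampling_cost(collections: int, first_request_limit: int) -> int:
    """Upper bound on the features fetched only to guess the schemas."""
    return collections * first_request_limit


# 126 collections, first request sized by PAGE_SIZE=10000
full_page = schema_sampling_cost(126, 10000)

# Same service if the first request asked for only 10 features
small_page = schema_sampling_cost(126, 10)

print(full_page, small_page)  # 1260000 1260
```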

-Jukka Rahkonen-

From: Even Rouault <even.roua...@spatialys.com>
Sent: Monday, 27 September 2021 16:47
To: Rahkonen Jukka (MML) <jukka.rahko...@maanmittauslaitos.fi>; 
'gdal-dev@lists.osgeo.org' <gdal-dev@lists.osgeo.org>
Subject: Re: [gdal-dev] Does OAPIF paging work as supposed?


Jukka,

Your analysis is completely correct. Whether this is expected or not probably 
depends on the situation. Should we have an INITIAL_REQUEST_PAGE_SIZE=number 
open option to override the number of features to retrieve specifically in the 
first request?
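
A minimal sketch of what such an option could mean at the request level, assuming it only changes the `limit` parameter of the first (schema-sampling) request. The function names, the `initial_page_size` parameter and the example.com base URL are hypothetical and do not describe the driver's actual code.

```python
from urllib.parse import urlencode


def items_url(base: str, collection: str, limit: int) -> str:
    """Build an OGC API Features items request with the given limit."""
    query = urlencode({"f": "json", "limit": limit})
    return f"{base}/collections/{collection}/items?{query}"


def plan_requests(base, collection, page_size, initial_page_size=None):
    """Return (schema_request, data_request) URLs.

    initial_page_size models the proposed INITIAL_REQUEST_PAGE_SIZE.
    When it is None, the first request falls back to page_size, which is
    the behaviour reported in this thread.
    """
    first_limit = page_size if initial_page_size is None else initial_page_size
    return (
        items_url(base, collection, first_limit),
        items_url(base, collection, page_size),
    )


base = "https://example.com/features/v1"  # placeholder service root
schema_req, data_req = plan_requests(base, "osoitepiste", 10000, initial_page_size=10)
print(schema_req)
print(data_req)
```

With these inputs the schema request carries `limit=10` while the data request keeps `limit=10000`.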

Regarding the spatial filter, it is passed through the OGR API generally after 
having queried the schema, and for most OGR datasources it wouldn't influence 
the schema, so there isn't much that can be done here, except maybe adding a 
BBOX=west,south,east,north open option.
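
As an illustration only, the sketch below shows how a bounding box known up front could be attached to the very first request. The west, south, east, north order follows the option syntax suggested above; the helper names and the example.com URL are hypothetical, and the `.18g` number formatting is an assumption chosen because it reproduces the coordinates printed in the debug log quoted further down.

```python
from urllib.parse import urlencode


def bbox_param(west: float, south: float, east: float, north: float) -> str:
    """Format a bounding box as a comma-separated bbox query value."""
    return ",".join(format(float(v), ".18g") for v in (west, south, east, north))


def items_url_with_bbox(base, collection, limit, bbox):
    """Build an items request that already carries the spatial filter."""
    params = {"f": "json", "limit": limit, "bbox": bbox_param(*bbox)}
    # Keep commas literal so the URL has the same shape as in the log.
    query = urlencode(params, safe=",")
    return f"{base}/collections/{collection}/items?{query}"


url = items_url_with_bbox(
    "https://example.com/features/v1",  # placeholder service root
    "osoitepiste",
    10,
    (25, 65, 25.1, 65.1),
)
print(url)
```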

One option to avoid both issues would be for the service to publish DescribedBy 
links at the collection level that would point to an XML schema (using a GML 
Simple Feature schema profile, such as the one understood by the GML driver) or 
a JSON schema (not too complicated either). Both are handled by the driver.
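
A rough sketch of how a client could look for such links in a collection document, assuming the usual link-object shape with `rel`, `type` and `href` members. The sample document, its URLs and its media types are made up for illustration, and the case-insensitive match on `rel` is a choice of this sketch, not a statement about the driver.

```python
def find_schema_links(collection: dict) -> list:
    """Return the collection-level links whose rel is describedBy."""
    return [
        link
        for link in collection.get("links", [])
        if link.get("rel", "").lower() == "describedby"
    ]


# Made-up collection document; URLs and media types are placeholders.
collection = {
    "id": "osoitepiste",
    "links": [
        {
            "rel": "items",
            "type": "application/geo+json",
            "href": "https://example.com/collections/osoitepiste/items",
        },
        {
            "rel": "describedBy",
            "type": "application/schema+json",
            "href": "https://example.com/collections/osoitepiste/schema.json",
        },
    ],
}

schema_links = find_schema_links(collection)
for link in schema_links:
    print(link["type"], link["href"])
```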

Even
On 27/09/2021 at 15:21, Rahkonen Jukka (MML) wrote:
Hi,

I tried to read a relatively small BBOX from an OAPIF server but the process 
feels rather slow and I do not quite understand what I am seeing in the log.

ogr2ogr -f GPKG test.gpkg \
  "OAPIF:https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/?api-key=xxxx" \
  -spat 25 65 25.1 65.1 -oo PAGE_SIZE=10000 --debug on \
  --config cpl_curl_verbose yes

Excerpts from the log:

HTTP: Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key=xxxx&f=json&limit=10000)
HTTP: These HTTP headers were set: Accept: application/geo+json, application/json
...
> GET /maastotiedot/features/v1/collections/osoitepiste/items?api-key=xxxx&f=json&limit=10000 HTTP/1.1
Host: avoin-paikkatieto.maanmittauslaitos.fi
Accept-Encoding: gzip
Accept: application/geo+json, application/json

* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
...
GeoJSON: First pass: 56.54 %
GeoJSON: First pass: 100.00 %
HTTP: Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key=xxxx&f=json&limit=10000&bbox=25,65,25.1000000000000014,65.0999999999999943)
...
> GET /maastotiedot/features/v1/collections/osoitepiste/items?api-key=xxxx&f=json&limit=10000&bbox=25,65,25.1000000000000014,65.0999999999999943 HTTP/1.1
...
< Content-Length: 309
GDALVectorTranslate: 0 features written in layer 'osoitepiste'

Am I reading this right: GDAL first reads one page, in this case 10000 
features without the BBOX, perhaps to resolve the schema, and then makes a new 
query with the BBOX? In this case the BBOX query finds nothing. Reading 10000 
features on the first round and then discarding everything feels too expensive. 
Could it be enough to read, for example, 10 features (the default page size) 
on the first round instead of the full page?

-Jukka Rahkonen-



_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--

http://www.spatialys.com

My software is free, but my time generally not.