Re: [gdal-dev] Does OAPIF paging work as supposed?

2021-09-28 Thread Even Rouault


On 28/09/2021 at 09:04, Rahkonen Jukka (MML) wrote:


Hi,

Even Rouault wrote:

> One option to avoid both issues would be for the service to publish
> DescribedBy links at the collection level that would point to an XML
> schema (using a GML Simple Feature schema profile, such as the one
> understood by the GML driver) or a JSON schema (not "too" complicated either).
> Both are handled by the driver.

We added JSON schema and ogrinfo seems to read it, but it still wants 
to read one page of data:


Ah, you're right. Yes, looking at the code, it will always read one page. 
The schema might be missing information, in particular the layer 
geometry type (not always filled in the XML schema, and most likely 
never in the JSON schema, or at least the logic for that isn't implemented). 
It also takes the opportunity to look at other things, such as the 
numberMatched information needed to respond to GetFeatureCount(), 
although that could potentially be skipped and deferred until it is invoked.
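
To illustrate the "deferred until it is invoked" idea, here is a minimal Python sketch, not the driver's actual code. The class name `LazyCountLayer` and the callback `fetch_number_matched` are hypothetical; the callback stands in for whatever request would return numberMatched.

```python
from typing import Callable, Optional


class LazyCountLayer:
    """Toy layer that only asks for numberMatched when a count is requested."""

    def __init__(self, fetch_number_matched: Callable[[], int]) -> None:
        # fetch_number_matched is a stand-in for an HTTP request (assumption).
        self._fetch = fetch_number_matched
        self._count: Optional[int] = None
        self.requests_made = 0

    def get_feature_count(self) -> int:
        # The first call triggers the (simulated) request; later calls reuse it.
        if self._count is None:
            self.requests_made += 1
            self._count = self._fetch()
        return self._count


# 12345 is a made-up numberMatched value used only for this demonstration.
layer = LazyCountLayer(lambda: 12345)
```

With this shape, opening the layer costs no count request at all, and repeated calls to `get_feature_count()` cost a single one.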


ogrinfo 
OAPIF:http://some.internal.service/features/collections/PalstanSijaintitiedot/ 
-al -so -oo PAGE_SIZE=1 -nocount -noextent -nogeomtype --debug on


HTTP: Fetch(http://some.internal.service/features/collections/PalstanSijaintitiedot/schema)
HTTP: These HTTP headers were set: Accept: application/schema+json
OAPIF: Using JSON schema
HTTP: Fetch(http://some.internal.service/features/collections/PalstanSijaintitiedot/items?f=json=1)


I wonder if there is still something in ogrinfo that I cannot turn off 
and that triggers the need to read some features, perhaps the 
coordinate system.


-Jukka Rahkonen-


--
http://www.spatialys.com
My software is free, but my time generally not.

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Does OAPIF paging work as supposed?

2021-09-28 Thread Rahkonen Jukka (MML)
Hi,

Even Rouault wrote:

> One option to avoid both issues would be for the service to publish
> DescribedBy links at the collection level that would point to an XML
> schema (using a GML Simple Feature schema profile, such as the one
> understood by the GML driver) or a JSON schema (not "too" complicated either).
> Both are handled by the driver.

We added JSON schema and ogrinfo seems to read it, but it still wants to read 
one page of data:

ogrinfo 
OAPIF:http://some.internal.service/features/collections/PalstanSijaintitiedot/ 
-al -so -oo PAGE_SIZE=1 -nocount -noextent -nogeomtype --debug on

HTTP: Fetch(http://some.internal.service/features/collections/PalstanSijaintitiedot/schema)
HTTP: These HTTP headers were set: Accept: application/schema+json
OAPIF: Using JSON schema
HTTP: Fetch(http://some.internal.service/features/collections/PalstanSijaintitiedot/items?f=json=1)

I wonder if there is still something in ogrinfo that I cannot turn off and that 
triggers the need to read some features, perhaps the coordinate system.


-Jukka Rahkonen-



Re: [gdal-dev] Does OAPIF paging work as supposed?

2021-09-27 Thread Rahkonen Jukka (MML)
Hi,

At least in this case, having an INITIAL_REQUEST_PAGE_SIZE=number option would 
resolve the practical problem. We have collections with millions of features, 
and therefore a large page size is essential when downloading a whole 
collection. On the other hand, we have complicated geometries like lake polygons, 
and reading 1 large geometries for resolving the schema is pretty heavy. 
And we have 126 collections in the service, which makes 126 x 1 features read 
for resolving the schemas if the aim is to clip a small area from all 
collections into a GeoPackage.
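
The cost argument above is simple multiplication, sketched here in Python. The count of 126 collections comes from this message and 10 is the default page size mentioned elsewhere in the thread; the value 5000 is purely illustrative and is not the page size actually used.

```python
def features_read_for_schemas(collections: int, first_page_size: int) -> int:
    """Features downloaded only to resolve schemas, one page per collection."""
    return collections * first_page_size


COLLECTIONS = 126  # number of collections mentioned in the message

# 10 is the default page size quoted elsewhere in the thread.
small = features_read_for_schemas(COLLECTIONS, 10)
# 5000 is an illustrative "large" page size, not a value from the thread.
large = features_read_for_schemas(COLLECTIONS, 5000)
print(small, large)
```

Under these assumed numbers, the schema probing alone differs by a factor of 500 between the two settings.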

-Jukka Rahkonen-

From: Even Rouault 
Sent: Monday, 27 September 2021 16:47
To: Rahkonen Jukka (MML); 'gdal-dev@lists.osgeo.org' 
Subject: Re: [gdal-dev] Does OAPIF paging work as supposed?


Jukka,

your analysis is completely correct. Whether this is expected or not probably 
depends on the situation. Should we have an INITIAL_REQUEST_PAGE_SIZE=number open 
option to override the number of features to retrieve specifically in the first 
request?

Regarding the spatial filter, it is generally passed through the OGR API after 
the schema has been queried, and for most OGR datasources it wouldn't influence 
the schema, so there isn't much that can be done here, except maybe adding a 
BBOX=west,south,east,north open option.

One option to avoid both issues would be for the service to publish DescribedBy 
links at the collection level that would point to an XML schema (using a GML 
Simple Feature schema profile, such as the one understood by the GML driver) or 
a JSON schema (not "too" complicated either). Both are handled by the driver.

Even
On 27/09/2021 at 15:21, Rahkonen Jukka (MML) wrote:
Hi,

I tried to read a relatively small BBOX from an OAPIF server but the process 
feels rather slow and I do not quite understand what I am seeing in the log.

ogr2ogr -f GPKG test.gpkg 
OAPIF:https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/?api-key=
 -spat 25 65 25.1 65.1 -oo PAGE_SIZE=1 --debug on --config cpl_curl_verbose 
yes

Excerpts from the log:

HTTP: 
Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1)
HTTP: These HTTP headers were set: Accept: application/geo+json, 
application/json
...
> GET 
> /maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1
>  HTTP/1.1
Host: avoin-paikkatieto.maanmittauslaitos.fi
Accept-Encoding: gzip
Accept: application/geo+json, application/json

* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
...
GeoJSON: First pass: 56.54 %
GeoJSON: First pass: 100.00 %
HTTP: 
Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1=25,65,25.1014,65.0943)
...
> GET 
> /maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1=25,65,25.1014,65.0943
>  HTTP/1.1
...
< Content-Length: 309
GDALVectorTranslate: 0 features written in layer 'osoitepiste'

Do I read this right: GDAL first reads one page, this time 1 
features without BBOX, perhaps for resolving the schema, and then makes a new 
query with BBOX? In this case the BBOX query finds nothing. Reading 1 
features on the first round and then discarding everything feels too expensive. 
Could it be enough to read, for example, 10 features (the default page 
size) on the first round instead of the full page?

-Jukka Rahkonen-





Re: [gdal-dev] Does OAPIF paging work as supposed?

2021-09-27 Thread Even Rouault

Jukka,

your analysis is completely correct. Whether this is expected or not 
probably depends on the situation. Should we have an 
INITIAL_REQUEST_PAGE_SIZE=number open option to override the number of 
features to retrieve specifically in the first request?


Regarding the spatial filter, it is generally passed through the OGR API 
after the schema has been queried, and for most OGR datasources it 
wouldn't influence the schema, so there isn't much that can be done 
here, except maybe adding a BBOX=west,south,east,north open option.
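
For reference, a request that carries the filter from the start would simply add the standard OGC API Features `bbox` and `limit` query parameters. The Python sketch below only builds such a URL; the helper name `items_url` and the example.com host are hypothetical, and this is not how the driver itself assembles requests.

```python
from urllib.parse import urlencode


def items_url(collection_url: str, limit: int, bbox=None) -> str:
    """Build an items request URL with a limit and an optional bbox."""
    params = {"limit": limit}
    if bbox is not None:
        west, south, east, north = bbox
        params["bbox"] = f"{west},{south},{east},{north}"
    # safe="," keeps the bbox readable instead of percent-encoding the commas.
    return f"{collection_url.rstrip('/')}/items?{urlencode(params, safe=',')}"


# Placeholder host; the bbox matches the -spat values used in this thread.
url = items_url("https://example.com/collections/osoitepiste", 10, (25, 65, 25.1, 65.1))
print(url)
```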


One option to avoid both issues would be for the service to publish 
DescribedBy links at the collection level that would point to an XML 
schema (using a GML Simple Feature schema profile, such as the one 
understood by the GML driver) or a JSON schema (not "too" complicated 
either). Both are handled by the driver.
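
As a rough illustration of what such a link looks like from the client side, the Python sketch below picks a describedby link out of collection metadata. The function `find_schema_link`, the sample metadata, and the example.com URLs are hypothetical, and the matching rule (case-insensitive `rel`, exact media type) is an assumption rather than the driver's actual logic.

```python
def find_schema_link(collection: dict, media_type: str = "application/schema+json"):
    """Return the href of the first describedby link with the given media type."""
    for link in collection.get("links", []):
        rel_matches = link.get("rel", "").lower() == "describedby"
        if rel_matches and link.get("type") == media_type:
            return link.get("href")
    return None  # no schema advertised, so a client must sample features


# Hypothetical collection metadata; the URLs are placeholders.
collection = {
    "id": "PalstanSijaintitiedot",
    "links": [
        {"rel": "items", "type": "application/geo+json",
         "href": "https://example.com/collections/PalstanSijaintitiedot/items"},
        {"rel": "describedBy", "type": "application/schema+json",
         "href": "https://example.com/collections/PalstanSijaintitiedot/schema"},
    ],
}
schema_href = find_schema_link(collection)
```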


Even

On 27/09/2021 at 15:21, Rahkonen Jukka (MML) wrote:


Hi,

I tried to read a relatively small BBOX from an OAPIF server but the 
process feels rather slow and I do not quite understand what I am 
seeing in the log.


ogr2ogr -f GPKG test.gpkg 
OAPIF:https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/?api-key= 
-spat 25 65 25.1 65.1 -oo PAGE_SIZE=1 --debug on --config 
cpl_curl_verbose yes


Excerpts from the log:

HTTP: Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1)
HTTP: These HTTP headers were set: Accept: application/geo+json, application/json
…
> GET /maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1 HTTP/1.1
Host: avoin-paikkatieto.maanmittauslaitos.fi
Accept-Encoding: gzip
Accept: application/geo+json, application/json

* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
…
GeoJSON: First pass: 56.54 %
GeoJSON: First pass: 100.00 %
HTTP: Fetch(https://avoin-paikkatieto.maanmittauslaitos.fi/maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1=25,65,25.1014,65.0943)
…
> GET /maastotiedot/features/v1/collections/osoitepiste/items?api-key==json=1=25,65,25.1014,65.0943 HTTP/1.1
…
< Content-Length: 309
GDALVectorTranslate: 0 features written in layer 'osoitepiste'

Do I read this right: GDAL first reads one page, this time 
1 features without BBOX, perhaps for resolving the schema, and 
then makes a new query with BBOX? In this case the BBOX query finds 
nothing. Reading 1 features on the first round and then discarding 
everything feels too expensive. Could it be enough to read, for example, 
10 features (the default page size) on the first round instead 
of the full page?
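
The request sequence I think I am seeing, and the variant I am asking about, can be sketched in Python as follows. This is only a model of the log, not GDAL code: `plan_requests` and its `initial_page_size` parameter are hypothetical names, and 5000 is an illustrative page size.

```python
def plan_requests(page_size, bbox, initial_page_size=None):
    """List the query parameters of the schema request and the first data request."""
    # Without a separate initial size, the schema request uses the full page size.
    first_limit = page_size if initial_page_size is None else initial_page_size
    schema_request = {"limit": first_limit}  # no bbox: the filter is not set yet
    data_request = {"limit": page_size, "bbox": bbox}
    return [schema_request, data_request]


BBOX = "25,65,25.1,65.1"  # the -spat values from the command above

observed = plan_requests(page_size=5000, bbox=BBOX)
proposed = plan_requests(page_size=5000, bbox=BBOX, initial_page_size=10)
```

In both plans the data request is identical; only the size of the unfiltered first request changes.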


-Jukka Rahkonen-

