Hello everyone,
I'd like to enhance WFS responses in JSON/JSONP format. In some of our
Web GIS Applications, one may download up to several thousand Simple
Features through WFS. Some Layers may have between 400 and 500 columns.
Although JSON produces much smaller responses than XML/GML, such
responses quickly reach 60 to 120 MB (uncompressed).
The idea is to remove some redundancy from the GeoJSON format, which
provides a feature's attributes in a key/value map (JSON Object):
{
  "type":"FeatureCollection",
  "features":[
    {
      "type":"Feature",
      "id":"bugsites.3",
      "geometry":{
        "type":"Point",
        "coordinates":[
          590529,
          4914625
        ]
      },
      "geometry_name":"the_geom",
      "properties":{
        "cat":3,
        "str1":"Beetle site"
      }
    },
    {
      "type":"Feature",
      "id":"bugsites.4",
      "geometry":{
        "type":"Point",
        "coordinates":[
          590546,
          4915353
        ]
      },
      "geometry_name":"the_geom",
      "properties":{
        "cat":4,
        "str1":"Beetle site"
      }
    }
  ],
  "totalFeatures":2,
  "numberMatched":2,
  "numberReturned":2,
  "timeStamp":"2022-09-13T08:44:45.118Z",
  "crs":{
    "type":"name",
    "properties":{
      "name":"urn:ogc:def:crs:EPSG::26713"
    }
  }
}
In this form, every property name is repeated for each feature. Also,
the "geometry_name" property is repeated for every feature returned.
With lots of features and columns (the latter do not necessarily have
short names), this repeated schema information can quickly become the
dominant factor in the size of the response. (Of course, that also
depends on the type and complexity of the geometry.)
Likely the repeated "type":"Feature" could be omitted as well. It's just
there in order to satisfy the GeoJSON specs. Maybe the "level of
compaction" could be specified for a request.
By including schema information only once in the FeatureCollection
object, a more compact form of the JSON response may look like this:
{
  "type":"FeatureCollection",
  "features":[
    {
      "type":"Feature",
      "id":"bugsites.3",
      "geometry":{
        "type":"Point",
        "coordinates":[
          590529,
          4914625
        ]
      },
      "properties":[
        3,
        "Beetle site"
      ]
    },
    {
      "type":"Feature",
      "id":"bugsites.4",
      "geometry":{
        "type":"Point",
        "coordinates":[
          590546,
          4915353
        ]
      },
      "properties":[
        4,
        "Beetle site"
      ]
    }
  ],
  "totalFeatures":2,
  "numberMatched":2,
  "numberReturned":2,
  "timeStamp":"2022-09-13T08:44:45.118Z",
  "crs":{
    "type":"name",
    "properties":{
      "name":"urn:ogc:def:crs:EPSG::26713"
    }
  },
  "schema":{
    "geometry_name":"the_geom",
    "properties":[
      "cat",
      "str1"
    ]
  }
}
Here, a new "schema" object in the root "FeatureCollection" object
contains the name of the geometry field as well as the names of the
other properties as an array. The "properties" objects in the "Feature"
objects have become arrays as well, containing the property values only.
Both arrays are "parallel", that is:
Key: "schema"."properties"[0] => cat
Val: features[N]."properties"[0] => 3 or 4 (depending on N)
Key: "schema"."properties"[1] => str1
Val: features[N]."properties"[1] => Beetle site
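To make the parallel-array mapping concrete, here is a minimal Python
sketch of both directions. It is only an illustration, not GeoServer
code: the function names to_compact and from_compact are made up, and
it assumes that all features share the schema of the first feature.

```python
def to_compact(fc):
    """Return a compact copy of a GeoJSON-like FeatureCollection dict."""
    features = fc.get("features", [])
    # Assumption: every feature has the same schema as the first one.
    names = list(features[0]["properties"]) if features else []
    geometry_name = features[0].get("geometry_name") if features else None
    compact_features = []
    for f in features:
        # Copy everything except the parts that move into "schema".
        cf = {k: v for k, v in f.items()
              if k not in ("properties", "geometry_name")}
        # Values only, in schema order; missing values become None (null).
        cf["properties"] = [f["properties"].get(n) for n in names]
        compact_features.append(cf)
    out = {k: v for k, v in fc.items() if k != "features"}
    out["features"] = compact_features
    out["schema"] = {"geometry_name": geometry_name, "properties": names}
    return out


def from_compact(doc):
    """Rebuild the key/value form from a compact document (client side)."""
    names = doc["schema"]["properties"]
    geometry_name = doc["schema"].get("geometry_name")
    features = []
    for cf in doc["features"]:
        f = {k: v for k, v in cf.items() if k != "properties"}
        if geometry_name is not None:
            f["geometry_name"] = geometry_name
        # The i-th name in the schema belongs to the i-th value.
        f["properties"] = dict(zip(names, cf["properties"]))
        features.append(f)
    out = {k: v for k, v in doc.items() if k not in ("features", "schema")}
    out["features"] = features
    return out
```

A client could call from_compact right after parsing the response and
then work with the features as before.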
In the above example, with only two short-named fields, savings are
almost zero. However, for requests returning several thousand features,
each having 300+ fields, savings may be quite significant.
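As a rough back-of-the-envelope estimate (a sketch only; the helper
names are made up, and it assumes ASCII field names and ignores the
small one-time cost of the "schema" object), the bytes spent on
repeated keys can be computed like this:

```python
import json


def key_bytes_per_feature(names):
    """Bytes one feature spends on property keys in the key/value form."""
    # Each key is written as a quoted JSON string followed by a colon.
    return sum(len(json.dumps(name)) + 1 for name in names)


def estimated_saving(names, n_features):
    """Rough saving when keys are written once instead of per feature."""
    return key_bytes_per_feature(names) * max(n_features - 1, 0)


if __name__ == "__main__":
    # Hypothetical layer: 300 fields with 13-character names.
    names = [f"attribute_{i:03d}" for i in range(300)]
    print(key_bytes_per_feature(names), "bytes of keys per feature")
    print(estimated_saving(names, 5000), "bytes saved for 5000 features")
```

With these assumed numbers, each feature carries 4,800 bytes of key
text, so 5,000 features repeat roughly 24 MB of it.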
The new compact format is not GeoJSON, of course. However, what
GeoServer currently returns is already not really compatible with
GeoJSON specs, which, for example, only permit EPSG:4326 coordinates. In
any case, that format is likely much smaller for the requests described
above.
On the wire, these responses are typically compressed (deflate, brotli,
etc.). However, compressing less data typically results in smaller
compressed chunks. Also, it is not only about transferring the data; the
data must be created and compressed by GeoServer and must be
decompressed and parsed in the client. All in all, smaller chunks of
data should consume fewer resources than larger ones.
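One way to check this for a given layer is to serialize both forms and
compare raw and deflated sizes. The following Python sketch does that
with synthetic data: field names and values are invented, there is no
geometry, and random integers are not representative of real
attributes, so actual ratios will differ.

```python
import json
import random
import zlib


def build_docs(n_features, n_fields, seed=0):
    """Build the same synthetic data in key/value and compact form."""
    rng = random.Random(seed)
    names = [f"attribute_{i:03d}" for i in range(n_fields)]
    rows = [[rng.randint(0, 99999) for _ in names]
            for _ in range(n_features)]
    verbose = {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "properties": dict(zip(names, row))}
                     for row in rows],
    }
    compact = {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "properties": row} for row in rows],
        "schema": {"properties": names},
    }
    return verbose, compact


def sizes(doc):
    """Return (raw bytes, deflated bytes) for a document."""
    raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return len(raw), len(zlib.compress(raw, 6))


if __name__ == "__main__":
    verbose, compact = build_docs(n_features=200, n_fields=50)
    for label, doc in (("key/value", verbose), ("compact", compact)):
        raw, deflated = sizes(doc)
        print(f"{label}: {raw} bytes raw, {deflated} bytes deflated")
```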
I'd like to add this new compact format directly into GeoServer's core
WFS JSON code:
gs-wfs: org.geoserver.wfs.json.GeoJSONGetFeatureResponse
More or less, only method encodeSimpleFeatures must be modified in order
to implement this new compact JSON format.
However, that method iterates over a list of FeatureCollection objects
(argument List<FeatureCollection> resultsList). Could there be more than
one FeatureCollection if the request returns simple features only?
Shouldn't all simple features requested through WFS be of the same type?
In other words, must I expect to deal with several distinct "schema"
objects when requesting simple features? (don't think so)
Since using this format, of course, is optional, I need a mechanism to
tell the server whether to return compact JSON or not (and maybe the
level of compaction it should use).
One quite obvious option is to use vendor parameter "format_options".
Here, a parameter like "json:compact" could trigger responding with
compact JSON. This seems quite simple to implement.
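For illustration, a request could then carry something like
format_options=callback:parseResponse;json:compact. The Python sketch
below shows how such a value splits into key/value pairs; it is not
GeoServer's own parser, and the "json" key with the value "compact" is
only the hypothetical option proposed here.

```python
def parse_format_options(value):
    """Split 'key1:value1;key2:value2' into a dict."""
    options = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, val = part.partition(":")
        options[key.strip()] = val.strip() if sep else ""
    return options


def wants_compact(options):
    """True if the hypothetical 'json:compact' option is present."""
    return options.get("json", "").lower() == "compact"
```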
Another idea is to use different outputFormat MIME types or MIME types
with additional parameters:
application/json; type=compact; omitTypes=true
text/javascript; type=compact
Although distinguishing the format via the MIME type may take somewhat
more work, I prefer this approach over the "format_options" way.
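A minimal Python sketch of how such an outputFormat value could be
split into its base type and parameters follows. The parameter names
"type" and "omitTypes" are just the hypothetical ones from the examples
above, and the parsing is deliberately naive (it does not handle quoted
values containing semicolons).

```python
def parse_media_type(value):
    """Return (base type, parameters) for a media type string."""
    head, *rest = value.split(";")
    params = {}
    for item in rest:
        name, sep, val = item.partition("=")
        if sep:
            # Parameter names are compared case-insensitively.
            params[name.strip().lower()] = val.strip().strip('"')
    return head.strip().lower(), params


def is_compact_json(value):
    """True for JSON/JSONP media types that ask for the compact form."""
    base, params = parse_media_type(value)
    return (base in ("application/json", "text/javascript")
            and params.get("type", "").lower() == "compact")
```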
What else do I have to consider?
Many thanks in advance for your ideas on this :)
Regards, Carsten
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel