On Thu, Sep 22, 2022 at 10:08 AM Carsten Klein <c.kl...@datagis.com> wrote:
Hello there,
the new WFS output format CompactJSON, implemented as a community
extension, is ready and works as expected. For the WFS responses we
mentioned (1,000+ features, 450 columns each), savings compared to
standard GeoJSON are between ~30% and ~70% uncompressed and between
~20% and ~50% with gzip content encoding (see 'Statistics' and
'Conclusion' below).
What are the next steps in order to contribute/publish the new community
extension? Shall I commit everything to my forked 'geoserver'
repository and open a pull request including all changes required in
existing core code (that is, the existing JSON producer)?
Yes, make a Pull Request and send a CLA to OSGeo. When opening the PR
you have a checklist that needs to be satisfied, with links to the
relevant docs.
For reference, here it is:
https://github.com/geoserver/geoserver/blob/main/.github/PULL_REQUEST_TEMPLATE.md
One last question: some of the WFS format extensions have one or more
GeoServerApplication.properties files, which seem to provide
human-readable names for the actual MIME types or formats. Should I
provide these, too? Where are these texts used? What are the rules?
It would be nice if you could add them, yes. They are used in the format
dropdown on the layer preview page. We don't have a list of rules;
checking what the other formats do and mimicking it is the de facto
approach when adding a new output format.
Statistics
==========
Given are two layers, A and B, each with ~450 columns. Layer A has
polygon geometries (~300 vertices each). Layer B has point geometries
(created as a view of layer A with ST_Centroid(the_geom) AS the_geom).
Both layers were queried twice: once for only ~1,000 features/rows and
once for all ~10,000 features/rows:
Layer A (heavy polygons)
------------------------
~1,000 rows          bytes raw             bytes gzip
GeoJSON:        19,098,924 (100%)    5,361,536 (100%)
CompactJSON:    12,586,758 ( 66%)    4,234,995 ( 79%)
FlatGeobuf:     10,093,224 ( 53%)    7,006,365 (130%)
Layer A (heavy polygons)
------------------------
~10,000 rows          bytes raw              bytes gzip
GeoJSON:        174,686,637 (100%)    45,414,569 (100%)
CompactJSON:    109,700,539 ( 63%)    35,677,456 ( 78%)
FlatGeobuf:      86,031,048 ( 49%)    58,857,835 (130%)
Layer B (lightweight points)
----------------------------
~1,000 rows         bytes raw             bytes gzip
GeoJSON:        9,443,980 (100%)    2,236,172 (100%)
CompactJSON:    2,931,814 ( 31%)    1,174,942 ( 53%)
FlatGeobuf:     3,710,902 ( 39%)    2,044,614 ( 91%)
Layer B (lightweight points)
----------------------------
~10,000 rows         bytes raw              bytes gzip
GeoJSON:        92,394,763 (100%)    18,923,302 (100%)
CompactJSON:    27,408,665 ( 30%)     9,720,941 ( 51%)
FlatGeobuf:     32,588,230 ( 35%)    16,824,845 ( 89%)
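For anyone who wants to re-derive the percentages, here is a minimal Python sketch of the arithmetic. It uses the byte counts from the "Layer B, ~1,000 rows" table; the names SIZES, relative_size and savings are made up for this illustration and are not part of the extension.

```python
# Byte counts copied from the "Layer B (lightweight points), ~1,000 rows" table.
SIZES = {
    "GeoJSON":     {"raw": 9_443_980, "gzip": 2_236_172},
    "CompactJSON": {"raw": 2_931_814, "gzip": 1_174_942},
    "FlatGeobuf":  {"raw": 3_710_902, "gzip": 2_044_614},
}


def relative_size(fmt, kind, baseline="GeoJSON"):
    """Size of `fmt` as a rounded percentage of the baseline format."""
    return round(100 * SIZES[fmt][kind] / SIZES[baseline][kind])


def savings(fmt, kind, baseline="GeoJSON"):
    """Percentage saved compared to the baseline format."""
    return 100 - relative_size(fmt, kind, baseline)


if __name__ == "__main__":
    for fmt in SIZES:
        raw = relative_size(fmt, "raw")
        gz = relative_size(fmt, "gzip")
        print(f"{fmt:<12} raw {raw:>3}%   gzip {gz:>3}%")
```

Savings are simply 100% minus the relative size, so a relative size of 31% corresponds to the "~70%" upper bound quoted for uncompressed responses.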
Conclusion
==========
Savings mainly depend on the ratio between schema information and data.
That is, savings are small if features carry much more data than schema
information. Geometries in particular can get quite large, so savings
depend on the size/complexity of the features' geometries (e.g. the
tests were done using complex polygons vs. simple point objects).
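The effect can be reproduced with a toy model. The Python sketch below is only an illustration under loud assumptions: the "compact" layout used here (column names written once, each row as a plain array) is a hypothetical stand-in and NOT the actual CompactJSON format, and the column names and values are synthetic.

```python
import json


def make_features(n_rows, n_cols, n_vertices):
    """Build synthetic GeoJSON-like features (made-up columns and coordinates)."""
    cols = [f"column_{i:03d}" for i in range(n_cols)]
    feats = []
    for r in range(n_rows):
        coords = [[float(r + v), float(v)] for v in range(n_vertices)]
        feats.append({
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": coords},
            "properties": {c: r for c in cols},
        })
    return cols, feats


def verbose_size(feats):
    """Characters needed when every feature repeats all property names."""
    return len(json.dumps({"type": "FeatureCollection", "features": feats}))


def compact_size(cols, feats):
    """Characters needed when column names appear once (hypothetical layout)."""
    rows = [
        [f["geometry"]["coordinates"]] + [f["properties"][c] for c in cols]
        for f in feats
    ]
    doc = {"geometry_type": "LineString", "columns": cols, "rows": rows}
    return len(json.dumps(doc))


def toy_savings(n_vertices, n_rows=50, n_cols=100):
    """Fraction of characters saved by the compact layout."""
    cols, feats = make_features(n_rows, n_cols, n_vertices)
    return 1 - compact_size(cols, feats) / verbose_size(feats)


if __name__ == "__main__":
    for n in (2, 300):
        print(f"{n:>3} vertices per geometry: {toy_savings(n):.0%} saved")
```

With many columns and tiny geometries, the repeated property names dominate and the compact layout saves a lot; with 300 vertices per geometry the coordinates dominate and the saving shrinks, which matches the direction of the measured results.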
Also remarkable is that CompactJSON responses are not much bigger (that
is, not much worse) than FlatGeobuf responses. However, FlatGeobuf
responses do not compress very well: compressed, they are bigger than
compressed CompactJSON in all cases, and even bigger than compressed
GeoJSON when geometries are heavy.
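To repeat this kind of raw vs. gzip comparison on your own responses, a small helper like the following Python sketch may be enough. The function name size_report is made up here, and the two sample payloads are synthetic extremes (highly repetitive text vs. random bytes); they only show how much compressibility can differ and are not meant to represent any of the formats above.

```python
import gzip
import os


def size_report(payload: bytes) -> dict:
    """Return raw size, gzip size and their ratio for a response body."""
    if not payload:
        raise ValueError("payload must not be empty")
    compressed = gzip.compress(payload)
    return {
        "raw": len(payload),
        "gzip": len(compressed),
        "ratio": len(compressed) / len(payload),
    }


if __name__ == "__main__":
    # Synthetic stand-ins, not real WFS responses.
    repetitive_text = b'{"column_001": "value"}' * 1000
    random_bytes = os.urandom(len(repetitive_text))
    for name, body in (("repetitive text", repetitive_text),
                       ("random bytes", random_bytes)):
        r = size_report(body)
        print(f"{name:<16} raw {r['raw']:>6}  gzip {r['gzip']:>6}  "
              f"ratio {r['ratio']:.2f}")
```

In practice you would pass the bytes of a saved WFS response for each output format instead of the synthetic payloads.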
Interesting finding indeed.
Cheers
Andrea
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions Group