On Friday 21 November 2014 15:35:43, Rahkonen Jukka (Tike) wrote:
> Hi,
>
> I have no use for this feature myself, but by reading various mailing
> lists and forums I have learned that many people consider it always a
> good idea to read data, for example from WFS services, as GeoJSON
> instead of GML.

Because it consumes less bandwidth?

For the record, if you try the following, it will use the GML schema for
the user-exposed layer and will do an on-the-fly transform from the hidden
GeoJSON layer schema to the GML schema, similar to what you could do with
a CAST or a VRT.

$ ogrinfo "WFS:http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=json" -ro -al -where "STATE_NAME = 'California'"
Layer name: topp:states
Geometry: Multi Polygon
Feature Count: 1
Extent: (-124.391472, 32.535725) - (-114.124451, 42.002346)
Layer SRS WKT:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]
gml_id: String (0.0)
STATE_NAME: String (0.0)
STATE_FIPS: String (0.0)
SUB_REGION: String (0.0)
STATE_ABBR: String (0.0)
LAND_KM: Real (0.0)
WATER_KM: Real (0.0)
PERSONS: Real (0.0)
FAMILIES: Real (0.0)
HOUSHOLD: Real (0.0)
MALE: Real (0.0)
FEMALE: Real (0.0)
WORKERS: Real (0.0)
DRVALONE: Real (0.0)
CARPOOL: Real (0.0)
PUBTRANS: Real (0.0)
EMPLOYED: Real (0.0)
UNEMPLOY: Real (0.0)
SERVICE: Real (0.0)
MANUAL: Real (0.0)
P_MALE: Real (0.0)
P_FEMALE: Real (0.0)
SAMP_POP: Real (0.0)
OGRFeature(topp:states):0
  gml_id (String) = (null)
  STATE_NAME (String) = California
  STATE_FIPS (String) = 06
  SUB_REGION (String) = Pacific
  STATE_ABBR (String) = CA
  LAND_KM (Real) = 403970.143
  WATER_KM (Real) = 20023.368
  PERSONS (Real) = 29760021
  FAMILIES (Real) = 7139394
  HOUSHOLD (Real) = 10381206
  MALE (Real) = 14897627
  FEMALE (Real) = 14862394
  WORKERS (Real) = 11306576
  DRVALONE (Real) = 9982242
  CARPOOL (Real) = 2036025
  PUBTRANS (Real) = 685797
  EMPLOYED (Real) = 13996309
  UNEMPLOY (Real) = 996502
  SERVICE (Real) = 3664771
  MANUAL (Real) = 1798201
  P_MALE (Real) = 0.501
  P_FEMALE (Real) = 0.499
  SAMP_POP (Real) = 3792553
  MULTIPOLYGON (((....)))

> I can
> easily imagine that there will be troubles with the guess-by-data method
> if they are making subsequent requests from the service. For example,
> strings which are all numbers but which may contain leading zeroes are
> saved either to integers or strings, if leading zeroes are interpreted
> right at all.

In JSON, "00123" and 00123 are different values. So a string with leading
zeros should be serialized as "00123" and not 00123. If it is serialized
as "00123", the GeoJSON driver will interpret it as a string.

> Or floats which do not always contain decimals, or list
> attributes which sometimes have only zero or one member.

Yes, those cases could cause issues.

>
> Embedded schema feels optimal because then it would always travel
> together with the data, and we all have probably lost .tfw or .prj files
> sometimes.
>
> -Jukka-
>
> Even Rouault wrote:
> > Jukka,
> >
> > Data type guessing implemented in the OGR GeoJSON driver is quite
> > natural, hopefully.
> > A whole scan of the GeoJSON file is made and the following rules are
> > applied:
> > - if an attribute has integer-only content --> Integer
> > - if an attribute has an array of integer-only content --> IntegerList
> > - if an attribute has integer or floating point content --> Real
> > - if an attribute has an array of integer or floating point content
> >   --> RealList
> > - if an attribute has an array of anything else content --> StringList
> > - otherwise --> String
> >
> > With RFC 50 and other pending improvements in the driver:
> > - if an attribute has boolean-only content --> Integer(Boolean)
> > - if an attribute has an array of boolean-only content -->
> >   IntegerList(Boolean)
> > - if an attribute has date-only content --> Date
> > - if an attribute has time-only content --> Time
> > - if an attribute has datetime or date content --> DateTime
> >
> > I'm not sure we want to invent a .jsont format, but if you download
> > http://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/ogr2vrt.py
> >
> > and run:
> >
> > python ogr2vrt.py "http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=json" test.vrt
> >
> > This will create a VRT for you with the default schema, which you can
> > easily edit. Note: as with OGR SQL CAST, this is post-processing. So if
> > the guess done by the GeoJSON driver leads to a loss of information, you
> > cannot recover it. Hopefully the implemented rules will not lead to
> > information loss.
> >
> > A better approach would be to have the schema embedded in a JSON way in
> > the GeoJSON file itself.
> > That could be an evolution of the format, but I'm not sure this would be
> > really popular, given JSON/GeoJSON is heavily used by NoSQL
> > approaches...
> >
> > Hmm, doing a quick search, I just found http://json-schema.org/ that
> > appears to be an IETF draft.
> > It doesn't look like the schema is embedded in the data file itself.
> >
> > There's also GeoJSON-LD that might be a bit related:
> > https://github.com/geojson/geojson-ld
> >
> > CC'ing Sean in case he has thoughts on this.
> >
> > Even
> >
> > > Hi,
> > >
> > > I wonder if GDAL could have some simple and relatively user-friendly
> > > way of defining a schema for GeoJSON data. The GeoJSON driver seems
> > > to guess the data types of attributes in some undocumented way, but
> > > users could have better knowledge about the desired schema.
> > >
> > > I know I can control the data type by using OGR SQL and CAST as in
> > > ogrinfo -sql "select cast(EMPLOYED as float) from OGRGeojson" states.json -so
> > >
> > > However, perhaps GeoJSON is popular enough to deserve an easier way
> > > of writing a schema. First I thought that it would be enough to copy
> > > the "csvt" text file mechanism from the GDAL CSV driver
> > > http://www.gdal.org/drv_csv.html.
> > > However, the csvt file is a plain list of types which will be applied
> > > to the attributes in the same order as they appear in the text file
> > > "Integer(5)","Real(10.7)","String(15)"
> > >
> > > For GeoJSON it would feel more user-friendly to include the attribute
> > > names in the list, somehow like
> > > "population;Integer(5)","area;Real(10.7)","name;String(15)".
> > >
> > > This would make it easier for users to write a valid "jsont" file. A
> > > list with attribute names could perhaps also help GDAL, because the
> > > features in a GeoJSON file do not necessarily have the same attributes.
> > >
> > > As an example, this is the right schema for a WFS feature type, which
> > > is captured from
> > > http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=describefeaturetype&typename=topp:states
> > >
> > >
> > > name="the_geom" type="gml:MultiPolygonPropertyType"/>
> > > name="STATE_NAME" type="xsd:string"/>
> > > name="STATE_FIPS" type="xsd:string"/>
> > > name="SUB_REGION" type="xsd:string"/>
> > > name="STATE_ABBR" type="xsd:string"/>
> > > name="LAND_KM" type="xsd:double"/>
> > > name="WATER_KM" type="xsd:double"/>
> > > name="PERSONS" type="xsd:double"/>
> > > name="FAMILIES" type="xsd:double"/>
> > > name="HOUSHOLD" type="xsd:double"/>
> > > name="MALE" type="xsd:double"/>
> > > name="FEMALE" type="xsd:double"/>
> > > name="WORKERS" type="xsd:double"/>
> > > name="DRVALONE" type="xsd:double"/>
> > > name="CARPOOL" type="xsd:double"/>
> > > name="PUBTRANS" type="xsd:double"/>
> > > name="EMPLOYED" type="xsd:double"/>
> > > name="UNEMPLOY" type="xsd:double"/>
> > > name="SERVICE" type="xsd:double"/>
> > > name="MANUAL" type="xsd:double"/>
> > > name="P_MALE" type="xsd:double"/>
> > > name="P_FEMALE" type="xsd:double"/>
> > > name="SAMP_POP" type="xsd:double"/>
> > >
> > >
> > > This is what GDAL is guessing:
> > > STATE_NAME: String (0.0)
> > > STATE_FIPS: String (0.0)
> > > SUB_REGION: String (0.0)
> > > STATE_ABBR: String (0.0)
> > > LAND_KM: Real (0.0)
> > > WATER_KM: Real (0.0)
> > > PERSONS: Real (0.0)
> > > FAMILIES: Integer (0.0)
> > > HOUSHOLD: Real (0.0)
> > > MALE: Real (0.0)
> > > FEMALE: Real (0.0)
> > > WORKERS: Real (0.0)
> > > DRVALONE: Integer (0.0)
> > > CARPOOL: Integer (0.0)
> > > PUBTRANS: Integer (0.0)
> > > EMPLOYED: Real (0.0)
> > > UNEMPLOY: Integer (0.0)
> > > SERVICE: Integer (0.0)
> > > MANUAL: Integer (0.0)
> > > P_MALE: Real (0.0)
> > > P_FEMALE: Real (0.0)
> > > SAMP_POP: Integer (0.0)
> > > bbox: RealList (0.0)
> > >
> > > -Jukka Rahkonen-
> > >
> > > _______________________________________________
> > > gdal-dev mailing list
> > > gdal-dev@lists.osgeo.org
> > > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> >
> > --
> > Spatialys - Geospatial professional services
> > http://www.spatialys.com
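
For illustration, the guessing rules quoted above (the ones that predate
RFC 50) can be sketched in Python roughly as follows. This is only a sketch
under stated assumptions, not the driver's actual code: the names
guess_field_type and guess_schema are made up, null values are assumed to
be ignored, booleans are assumed to count as non-numeric, and an attribute
that mixes scalars and arrays is assumed to fall into the "otherwise" case.
The sample feature reuses three California values from the ogrinfo output
earlier in this mail.

```python
import json


def _scalar_kind(value):
    """Classify one JSON scalar as 'int', 'real' or 'other'."""
    # bool is a subclass of int in Python, so test it first.
    # Assumption: booleans count as non-numeric under the pre-RFC 50 rules.
    if isinstance(value, bool):
        return "other"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "real"
    return "other"


def guess_field_type(values):
    """Guess a type name from every value seen for one attribute."""
    saw_scalar = saw_list = False
    kinds = set()
    for value in values:
        if value is None:
            continue  # assumption: nulls do not influence the guess
        if isinstance(value, list):
            saw_list = True
            kinds.update(_scalar_kind(item) for item in value)
        else:
            saw_scalar = True
            kinds.add(_scalar_kind(value))

    if saw_scalar and saw_list:
        return "String"  # assumption: mixed content is the "otherwise" case
    suffix = "List" if saw_list else ""
    if kinds == {"int"}:
        return "Integer" + suffix
    if kinds and kinds <= {"int", "real"}:
        return "Real" + suffix
    return "String" + suffix


def guess_schema(feature_collection):
    """Scan the whole FeatureCollection and guess a type per attribute."""
    columns = {}
    for feature in feature_collection.get("features", []):
        for name, value in (feature.get("properties") or {}).items():
            columns.setdefault(name, []).append(value)
    return {name: guess_field_type(vals) for name, vals in columns.items()}


# "06" is quoted in the JSON text, so it stays a string and keeps its
# leading zero.
sample = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "geometry": null,
   "properties": {"STATE_FIPS": "06", "FAMILIES": 7139394,
                  "LAND_KM": 403970.143}}
]}
""")
print(guess_schema(sample))
```

Because the guess depends only on the values present in the response, two
requests against the same feature type can produce different schemas, which
is the concern raised earlier in the thread.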