Hello all
File formats have not been a concern for Apache SIS before because we just
followed existing standards. But I now have a case that involves
defining a file format. I would like to be able to define easily
(without interacting with a database) a few CRS definitions for codes
other than EPSG. For example, ESRI also provides its own codes [1] and
the International Astronomical Union (IAU) is running some experiments.
The easiest way is to provide CRS definitions in Well-Known Text (WKT)
format, which is an OGC/ISO standard. But the WKT format provides a
standard way to define a single CRS, while I would like to define many
CRS associated with codes. I'm not aware of a standard for such a registry
(if there is one, please let me know!). Other projects like GDAL and
GeoTools use key-value pairs with one entry per line. Example with ESRI
codes [2]:
9248=GEOGCS["Tapi_Aike",DATUM["Tapi_Aike",SPHEROID["International_1924",6378388.0,297.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433],AUTHORITY["Esri",9248]]
9251=GEOGCS["MMN",DATUM["Ministerio_de_Marina_Norte",SPHEROID["International_1924",6378388.0,297.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433],AUTHORITY["Esri",9251]]
(Note: this example uses legacy WKT 1 because the ESRI file at [2]
defines CRS that way, but this proposal does not depend on the WKT version.)
However, the above format does not say which authority the 9248 and 9251
keys belong to (ESRI in the above example). In order to know the authority,
we have to look at the end of the WKT strings for the AUTHORITY["Esri",…]
elements. But if we have to look for AUTHORITY[…] (WKT 1) or ID[…] (WKT
2) elements anyway, then we do not need the keys at all because they are
redundant with the AUTHORITY/ID[…] elements. Dropping the keys simplifies
the file format and removes one possible source of inconsistencies.
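To illustrate the redundancy, here is a minimal Python sketch (not SIS code) that reads the authority and code from the AUTHORITY[…] or ID[…] element and compares them with the key. The names authority_code and check_entry are hypothetical, and the regular expression is a simplification that assumes the identifier of the root element is the last one in the string.

```python
import re

# AUTHORITY["name", code] (WKT 1) or ID["name", code] (WKT 2).
# The code may be quoted or not; both forms are accepted here.
_IDENTIFIER = re.compile(
    r'\b(?:AUTHORITY|ID)\s*\[\s*"([^"]+)"\s*,\s*"?([^"\]\s,]+)"?')


def authority_code(wkt):
    """Return (authority, code) from the last AUTHORITY/ID element.

    Assumption of this sketch: the identifier of the root element is the
    last one in the string, since nested elements are closed before it.
    """
    matches = _IDENTIFIER.findall(wkt)
    if not matches:
        raise ValueError("no AUTHORITY[...] or ID[...] element found")
    return matches[-1]


def check_entry(line):
    """Split a "<key>=<WKT>" line and verify the key against the WKT."""
    key, wkt = line.split("=", 1)
    authority, code = authority_code(wkt)
    if key.strip() != code:
        raise ValueError(f"key {key} does not match {authority}:{code}")
    return authority, code
```

Since the key can only confirm or contradict what the WKT already states, a reader of the key-value format needs a consistency check like check_entry, while a file without keys needs only authority_code.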
Another issue is that the above format puts each WKT definition on a single
line, which is difficult to read. To make the file more
human-readable, I propose to allow WKT definitions to span many lines
provided that each additional line is indented relative to the first
line. With those two proposed changes, the above example would become:
GEOGCS["Tapi_Aike",
  DATUM["Tapi_Aike",
    SPHEROID["International_1924",6378388.0,297.0]],
  PRIMEM["Greenwich",0.0],
  UNIT["Degree",0.0174532925199433],
  AUTHORITY["Esri",9248]]
GEOGCS["MMN",
  DATUM["Ministerio_de_Marina_Norte",
    SPHEROID["International_1924",6378388.0,297.0]],
  PRIMEM["Greenwich",0.0],
  UNIT["Degree",0.0174532925199433],
  AUTHORITY["Esri",9251]]
i.e. we just put standard WKT definitions one after the other (without
the "<key>=" prefix), indented in a human-readable way. One drawback is
that the AUTHORITY[…] or ID[…] elements, which are optional in the WKT
standard, become mandatory in this format.
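The indentation rule could be read back with something like the following Python sketch. It is only an illustration under assumptions that the proposal does not state: blank lines are skipped, any leading whitespace marks a continuation line, and quoted strings never span lines. The function name split_definitions is hypothetical.

```python
def split_definitions(text):
    """Group physical lines into WKT definitions.

    A line starting in the first column begins a new definition;
    an indented line continues the current one.
    """
    definitions = []
    for line in text.splitlines():
        if not line.strip():
            continue  # Blank lines are ignored (an assumption of this sketch).
        if line[0].isspace():
            if not definitions:
                raise ValueError("continuation line before any definition")
            definitions[-1] += line.strip()
        else:
            definitions.append(line.strip())
    return definitions


SAMPLE = '''\
GEOGCS["Tapi_Aike",
  DATUM["Tapi_Aike",
    SPHEROID["International_1924",6378388.0,297.0]],
  PRIMEM["Greenwich",0.0],
  UNIT["Degree",0.0174532925199433],
  AUTHORITY["Esri",9248]]
GEOGCS["MMN",
  DATUM["Ministerio_de_Marina_Norte",
    SPHEROID["International_1924",6378388.0,297.0]],
  PRIMEM["Greenwich",0.0],
  UNIT["Degree",0.0174532925199433],
  AUTHORITY["Esri",9251]]
'''

for wkt in split_definitions(SAMPLE):
    print(wkt)
```

Each returned string is a complete single-line WKT that can be handed to any WKT parser, which then supplies the code through its AUTHORITY[…] or ID[…] element.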
Aliases for WKT fragments
Files with more than one WKT definition tend to repeat the same WKT
fragments many times, e.g. the same BaseGeogCRS[…] element may be
repeated in every ProjectedCRS definition. I propose to allow redundant
fragments to be replaced by aliases, making the file more compact,
easier to read, faster to parse and with a smaller memory footprint. The
syntax would be the same as for environment variables in a Unix shell: each
line starting with "SET <identifier>=<WKT>" defines an alias for a
fragment of a WKT string. The WKT can span many lines as described above.
Aliases can be expanded in other WKT strings by "$<identifier>". So the
above example could become (if the user wishes):
SET Int1924 = SPHEROID["International_1924",6378388.0,297.0]
SET Greenwich = PRIMEM["Greenwich",0.0]
SET Degree = UNIT["Degree",0.0174532925199433]
GEOGCS["Tapi_Aike",
  DATUM["Tapi_Aike", $Int1924],
  $Greenwich, $Degree,
  AUTHORITY["Esri",9248]]
GEOGCS["MMN",
  DATUM["Ministerio_de_Marina_Norte", $Int1924],
  $Greenwich, $Degree,
  AUTHORITY["Esri",9251]]
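As a rough illustration of the alias mechanism, the Python sketch below expands "$<identifier>" references by plain text substitution. It is not how the SIS WKT parser handles fragments. It assumes that multi-line entries have already been joined into one logical line each, that an alias is defined before it is used, and that "$" does not occur inside quoted strings. The name expand_aliases is hypothetical.

```python
import re

_SET = re.compile(r'^SET\s+(\w+)\s*=\s*(.*)$')  # SET <identifier> = <WKT>
_REF = re.compile(r'\$(\w+)')                   # $<identifier>


def expand_aliases(entries):
    """Return the WKT definitions with every alias reference expanded.

    `entries` is a list of logical lines, each either a SET statement
    or a WKT definition. An undefined alias raises KeyError.
    """
    aliases = {}
    definitions = []

    def expand(wkt):
        return _REF.sub(lambda m: aliases[m.group(1)], wkt)

    for entry in entries:
        match = _SET.match(entry)
        if match:
            # Aliases may themselves refer to earlier aliases.
            aliases[match.group(1)] = expand(match.group(2))
        else:
            definitions.append(expand(entry))
    return definitions


entries = [
    'SET Int1924 = SPHEROID["International_1924",6378388.0,297.0]',
    'SET Greenwich = PRIMEM["Greenwich",0.0]',
    'SET Degree = UNIT["Degree",0.0174532925199433]',
    'GEOGCS["Tapi_Aike",DATUM["Tapi_Aike", $Int1924],'
    '$Greenwich, $Degree,AUTHORITY["Esri",9248]]',
]
for wkt in expand_aliases(entries):
    print(wkt)
```

Text substitution keeps the sketch short but gives none of the parsing-speed or memory benefits; those would require the parser to reuse the already-parsed fragment instead of re-reading its text.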
The possibility to define fragments is already a feature of our WKT
parser [3]. Are there any comments?
Martin
[1]
https://raw.githubusercontent.com/Esri/projection-engine-db-doc/master/text/pe_list_projcs.txt
[2]
https://raw.githubusercontent.com/Esri/projection-engine-db-doc/master/gdal/esri_extra.wkt
[3]
http://sis.apache.org/apidocs/org/apache/sis/io/wkt/WKTFormat.html#addFragment(java.lang.String,java.lang.String)