+1 to this discussion and this makes me believe, Martin, that SIS could actually become a Tika parser, for WKTs.
I could imagine us bringing the GeoTK detection code into SIS, then wrapping that portion of SIS in Tika.

Thoughts?

Cheers,
Chris

On 1/18/13 11:41 AM, "Martin Desruisseaux" <[email protected]> wrote:

>I forgot to mention a point about the CRS WKT 2.0 meeting at OGC. There
>was a discussion about how a parser shall behave when an unknown
>keyword is found while parsing a WKT. The GDAL behaviour is to ignore
>unknown keywords. The Geotk behaviour is to throw an exception. Both
>approaches have pros and cons, so this seems to be an example of a
>situation where business and science have different goals.
>
>My rationale for throwing an exception is that the parser has no idea if
>an unknown keyword is important or not. Maybe the unknown keyword
>contains only metadata information that can be safely ignored. But maybe
>the keyword contains information that affects the numerical values of
>projected coordinates. For example, the WKT 2.0 format is considering
>adding a new UNIT keyword (in addition to the existing one). Ignoring
>such a keyword, then parsing the remainder of the WKT with the wrong
>unit, will obviously produce a wrong map projection.
>
>Businesses want their products to be tolerant of unexpected situations
>and still produce some results, even if inaccurate. But scientists want
>to trust their data and be aware of anything that may affect their
>confidence. For a scientist, the worst thing that could happen is often
>a program that seems to work but produces wrong results because of
>unnoticed errors - it is better to stop the calculation than to let
>unnoticed errors happen. But for some businesses, the program shall not
>stop even if there are problems.
>
>My personal point of view is to adopt the scientific approach as the
>default one: be strict and throw an exception if the data are at risk of
>being erroneous. But I admit that we will need to allow SIS to run in a
>lenient mode if the user wants to. In Geotk there were some boolean
>flags here and there for this purpose. For SIS, it may be worth doing
>some consolidation about "what is strict, what is lenient" in some
>central place...
>
>    Martin
>
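
To make the strict versus lenient trade-off concrete, here is a rough sketch of a keyword check with a single switch in one central place. Everything in it is hypothetical: the class name WktKeywordCheck, the Mode enum, the keyword list (an illustrative subset of WKT 1 keywords) and the made-up NEWUNIT keyword in the usage note are assumptions for illustration, not the Geotk, GDAL or SIS API. It only scans for identifiers immediately followed by a bracket and does not parse the WKT itself.

```java
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch only; not the Geotk or SIS parser. */
final class WktKeywordCheck {
    /** One central switch for "what is strict, what is lenient". */
    public enum Mode { STRICT, LENIENT }

    // Illustrative subset of WKT 1 keywords, not a complete list.
    private static final Set<String> KNOWN = Set.of(
            "PROJCS", "GEOGCS", "DATUM", "SPHEROID", "PRIMEM", "UNIT",
            "PROJECTION", "PARAMETER", "AXIS", "AUTHORITY", "TOWGS84");

    /** Strict by default, following the "scientific" approach. */
    public static List<String> check(String wkt) throws ParseException {
        return check(wkt, Mode.STRICT);
    }

    /**
     * Returns the unknown keywords that were ignored (LENIENT),
     * or throws at the first unknown keyword (STRICT).
     */
    public static List<String> check(String wkt, Mode mode) throws ParseException {
        List<String> ignored = new ArrayList<>();
        boolean inQuotes = false;
        int wordStart = -1;                       // start of current identifier, or -1
        for (int i = 0; i < wkt.length(); i++) {
            char c = wkt.charAt(i);
            if (c == '"') {                       // skip quoted names such as "WGS 84"
                inQuotes = !inQuotes;
                wordStart = -1;
            } else if (inQuotes) {
                // inside a quoted string: nothing to check
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                if (wordStart < 0) wordStart = i;
            } else {
                // An identifier directly followed by '[' or '(' is treated as a keyword.
                if (wordStart >= 0 && (c == '[' || c == '(')) {
                    String keyword = wkt.substring(wordStart, i).toUpperCase(Locale.ROOT);
                    if (!KNOWN.contains(keyword)) {
                        if (mode == Mode.STRICT) {
                            throw new ParseException("Unknown keyword: " + keyword, wordStart);
                        }
                        ignored.add(keyword);     // lenient: remember it so callers can warn
                    }
                }
                wordStart = -1;
            }
        }
        return ignored;
    }
}
```

With an input such as GEOGCS["WGS 84", UNIT["degree", 0.0174532925199433], NEWUNIT["x", 1]], the strict mode throws a ParseException at NEWUNIT, while the lenient mode returns a list containing NEWUNIT. Returning the ignored keywords rather than dropping them silently is one possible compromise: the program keeps running, but the caller can still log or display what was skipped.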
