Hello Adam
On 25/08/13 21:34, Adam Estrada wrote:
> It is true that the Shapefile is very widely used but it has lots and lots
> of limitations. The main one that I can think of is that it can't handle
> UTF-encoded characters in the attribute table. Can I suggest maybe working
> towards something like an "interchange" module where all the file formats
> live?
I agree with all of the above. In the current state of SIS, the
"interchange" module is actually the "storage" group of modules. This
group of modules currently contains:
* sis-storage: provides the basis common to all formats.
* sis-netcdf: for the NetCDF format.
My concern is about whether we should put the Shapefile code in its own
"sis-shapefile" module (which would depend on "sis-storage"), or put it
straight in "sis-storage".
One extreme view is to adopt a "one format == one module" policy. But in
Geotoolkit.org, this policy resulted in more than 120 modules, some of
them with very few classes. In security-constrained environments, where
every JAR file requires its own SecurityManager policy, this is very
tedious.
Consequently, I would like to group some formats in the same JAR file
in order to keep the number of modules reasonable. The question is then
which granularity to choose. My proposal is not to put every format in
its own module, but to give a format its own module if it meets some of
the following conditions:
* The format is not widely used.
* The format is complex, so it requires a large number of classes or
resources.
* The format depends on an external library or on native code.
The NetCDF format is proposed in its own module because it is complex
(the classes currently in "sis-netcdf" are just scratching the surface)
and may have a dependency on a large library (although I would like to
keep that dependency optional). Shapefile, by contrast, is relatively
simple and needs no external dependency.
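As a rough illustration of the placement rule described above, here is a
minimal Java sketch. The class and method names are hypothetical and
nothing here exists in SIS. It also assumes that meeting any single
condition is enough for a format to get its own module, and that such a
module is named "sis-" followed by the format name.

```java
import java.util.Locale;

/** Hypothetical helper, not part of SIS: encodes the proposed placement rule. */
public final class ModulePlacement {
    private ModulePlacement() {
    }

    /**
     * Returns true if the format should get its own module.
     * Assumption: meeting any single condition is enough.
     */
    public static boolean needsOwnModule(boolean widelyUsed,
                                         boolean complex,
                                         boolean externalDependency) {
        return !widelyUsed || complex || externalDependency;
    }

    /** Returns the name of the module that would host the format under this rule. */
    public static String moduleFor(String format, boolean widelyUsed,
                                   boolean complex, boolean externalDependency) {
        return needsOwnModule(widelyUsed, complex, externalDependency)
                ? "sis-" + format.toLowerCase(Locale.ROOT)   // assumed naming scheme
                : "sis-storage";
    }

    public static void main(String[] args) {
        // Shapefile: widely used, relatively simple, no external dependency.
        System.out.println(moduleFor("Shapefile", true, false, false)); // sis-storage
        // NetCDF: complex, possibly depending on a large library; with those
        // two flags set, the "widely used" flag does not change the outcome.
        System.out.println(moduleFor("NetCDF", true, true, true));      // sis-netcdf
    }
}
```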
Given that "sis-storage" would be the basis of all formats in SIS, my
proposal is to also put in "sis-storage" some formats considered
"fundamental", meaning formats so widespread that any user is very
likely to encounter them. They would not be the only or "main" SIS
formats; they would rather be the "minimal requirements". Other modules
like "sis-netcdf" would provide more elaborate formats.
> For vector data, there are quite a few of them out there. OGR
> references many of them [1] but that opens the debate on whether or not to
> just use GDAL. I suppose we could just have GDAL support as a module which
> would require some sort of JNI bindings to work in a pure Java library like
> SIS. What are your thoughts on this?
Yes, this is also the plan :-). We have already used GDAL through JNI on
our side, and that code is also part of the proposed migration to SIS.
The approach that I would recommend is to use pure Java code for many
formats (Shapefile, ASCII grid, GeoTIFF, NetCDF, PNG), and to fall back
on GDAL as a complement for other formats.
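The "pure Java first, GDAL as a complement" approach could look roughly
like the sketch below. All types and names are hypothetical and none of
them exist in SIS. Detecting a format from the file extension is a
simplification made only for this example, and the GDAL entry is a plain
stand-in rather than a real JNI binding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch, not SIS code: try pure Java readers, then fall back on GDAL. */
public final class ReaderLookup {
    /** Minimal stand-in for a format reader. */
    public interface FormatReader {
        String name();
        boolean canRead(String fileName);
    }

    /** Recognizes files by extension (a simplification for this example). */
    public static final class ExtensionReader implements FormatReader {
        private final String name;
        private final List<String> extensions;

        public ExtensionReader(String name, String... extensions) {
            this.name = name;
            this.extensions = Arrays.asList(extensions);
        }

        @Override public String name() {
            return name;
        }

        @Override public boolean canRead(String fileName) {
            String lower = fileName.toLowerCase(Locale.ROOT);
            for (String ext : extensions) {
                if (lower.endsWith(ext)) {
                    return true;
                }
            }
            return false;
        }
    }

    private final List<FormatReader> pureJava;
    private final FormatReader fallback;

    public ReaderLookup(List<FormatReader> pureJava, FormatReader fallback) {
        this.pureJava = pureJava;
        this.fallback = fallback;
    }

    /** Returns the first pure Java reader accepting the file, else the fallback. */
    public FormatReader find(String fileName) {
        for (FormatReader reader : pureJava) {
            if (reader.canRead(fileName)) {
                return reader;
            }
        }
        return fallback;
    }

    /** Builds a lookup with a few of the formats named above. */
    public static ReaderLookup demo() {
        List<FormatReader> pureJava = Arrays.<FormatReader>asList(
                new ExtensionReader("Shapefile", ".shp"),
                new ExtensionReader("GeoTIFF", ".tif", ".tiff"),
                new ExtensionReader("NetCDF", ".nc"),
                new ExtensionReader("PNG", ".png"));
        // Stand-in for a JNI-backed GDAL reader: the empty suffix matches any
        // file name. A real implementation would ask GDAL itself.
        FormatReader gdal = new ExtensionReader("GDAL", "");
        return new ReaderLookup(pureJava, gdal);
    }

    public static void main(String[] args) {
        ReaderLookup lookup = demo();
        System.out.println(lookup.find("roads.shp").name());  // Shapefile
        System.out.println(lookup.find("scene.xyz").name());  // GDAL
    }
}
```

Keeping the fallback separate from the list of pure Java readers makes
the GDAL dependency easy to leave out when the native library is not
installed.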
A similar argument applies to Coordinate Transformation Services. We have
pure Java code (its port to SIS started last week, beginning with
WKT), but we plan to support Proj.4 through JNI even for map projections
available in pure Java, because in some situations a user may need a
guarantee of getting exactly the same results as PostGIS or MapServer,
for instance (those products are built on top of Proj.4).
Martin