Hi Adam and Martin,

Would it be OK to leave it as is, since there are only a small number of data
storage modules at the moment? I think of storage as something that holds the
common pieces shared across all the different storage formats, like a
Feature. Eventually it will get to the point where you will not want to
have a multitude of JAR files. I see sis-shapefile as a fairly distinct
file driver because of the complex format of a shapefile (not necessarily
good complexity).

Adding GDAL bindings for common formats would be very useful. This would
make it easier to do large bulk processing of geospatial data with Hadoop,
as in the presentation in the following video:

https://www.youtube.com/watch?v=_JCPf89s-NI


Thanks,
Travis

On Sun, Aug 25, 2013 at 8:06 PM, Adam Estrada <[email protected]> wrote:

> Hey Martin,
>
> Regarding where to put all the file format modules, I am just concerned
> that it might be difficult to keep things straight if there is a mixture of
> "complex" formats and everything else. I think we all trust your opinion on
> where to put things but we really just need to keep the end user and other
> potential committers in mind when moving forward in the development
> process. For example, I take a look at the directory structure in SVN [1]
> and I automatically think that each format should be in its own module like
> sis-netcdf because of the way it's organized.
>
> Just my 2 cents at this point and feedback from other folks is certainly
> welcome :)
>
> Adam
>
> [1] https://svn.apache.org/repos/asf/sis/trunk/storage/
>
>
> On Sun, Aug 25, 2013 at 4:39 PM, Martin Desruisseaux <
> [email protected]> wrote:
>
> > Hello Adam
> >
> > On 25/08/13 21:34, Adam Estrada wrote:
> >
> >> It is true that the Shapefile is very widely used but it has lots and lots
> >> of limitations. The main one that I can think of is that it can't handle
> >> UTF-encoded characters in the attribute table. Can I suggest maybe working
> >> towards something like an "interchange" module where all the file formats
> >> live?
> >>
> >
> > I agree with all the above, and in the current SIS state the "interchange"
> > module is actually the "storage" group of modules. This group of modules
> > currently contains:
> >
> >  * sis-storage: provides the basis common to all formats.
> >  * sis-netcdf: for the NetCDF format.
> >
> >
> > My concern is about whether we should put the Shapefile code in its own
> > "sis-shapefile" module (which would depend on "sis-storage"), or put it
> > straight in "sis-storage".
> >
> > One extreme view is to adopt a "one format == one module" policy. But in
> > Geotoolkit.org, this policy resulted in more than 120 modules, some of them
> > with very few classes. In security-constrained environments, where every
> > JAR file requires its own SecurityManager policy, this is very tedious.
> >
> > Consequently, I would like to group some formats in the same JAR files in
> > order to keep the number of modules reasonable. Then, the
> > question would be which granularity to choose. My proposal is to not put
> > every format in its own module, but put a format in its own module if it
> > meets some of the following conditions:
> >
> >  * The format is not widely used.
> >  * The format is complex, so it requires a large number of classes or
> >    resources.
> >  * The format depends on an external library or on native code.
> >
> >
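A minimal sketch of the rule above, assuming "some of the following conditions" means "at least one of them". The `FormatTraits` class and its fields are hypothetical and are not part of SIS; only the module names "sis-storage" and "sis-netcdf" come from this thread.

```java
import java.util.Locale;

// Hypothetical helper, not part of SIS: encodes the packaging rule proposed above.
final class FormatTraits {
    final String name;
    final boolean widelyUsed;
    final boolean complex;
    final boolean needsExternalOrNativeCode;

    FormatTraits(String name, boolean widelyUsed, boolean complex,
                 boolean needsExternalOrNativeCode) {
        this.name = name;
        this.widelyUsed = widelyUsed;
        this.complex = complex;
        this.needsExternalOrNativeCode = needsExternalOrNativeCode;
    }

    // Assumption: meeting any single condition is enough to justify a separate module.
    boolean needsOwnModule() {
        return !widelyUsed || complex || needsExternalOrNativeCode;
    }

    // Formats that meet none of the conditions are bundled into "sis-storage".
    String targetModule() {
        return needsOwnModule()
                ? "sis-" + name.toLowerCase(Locale.ROOT)
                : "sis-storage";
    }
}
```

Under these assumptions, a format described as complex with a possible large dependency (NetCDF) maps to its own module, while a widely used, simple format with no external dependency (Shapefile) maps to "sis-storage".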
> > The NetCDF format is proposed in its own module because it is complex (the
> > classes currently in "sis-netcdf" are just scratching the surface) and may
> > have a dependency on a large library (while I would like to keep that
> > dependency optional). Shapefile on the contrary is relatively simple and
> > needs no external dependency.
> >
> > Given that "sis-storage" would be the basis of all formats in SIS, my
> > proposal is to also put in "sis-storage" some formats considered
> > "fundamental", meaning formats so widespread that any user is very
> > likely to encounter them. They would not be the only or "main" SIS formats;
> > they would rather be the "minimal requirements". Other modules like
> > "sis-netcdf" would provide more elaborate formats.
> >
> >
> >
> >> For vector data, there are quite a few of them out there. OGR
> >> references many of them [1] but that opens the debate on whether or not to
> >> just use GDAL. I suppose we could just have GDAL support as a module which
> >> would require some sort of JNI bindings to work in a pure Java library
> >> like SIS. What are your thoughts on this?
> >>
> >
> > Yes, this is also the plan :-). We already used GDAL through JNI on our
> > side, and that code is also part of the proposed migration to SIS. The
> > approach that I would recommend is to use pure Java code for many formats
> > (Shapefile, ASCII grid, GeoTIFF, NetCDF, PNG), and fall back on GDAL as a
> > complement for other formats.
> >
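The "pure Java first, GDAL as a complement" approach could be sketched roughly as below. Every name here (`FormatReader`, `SuffixReader`, `ReaderChain`) is hypothetical; this is not the SIS storage API, and the fallback reader is a plain stand-in that makes no real GDAL or JNI calls. Selecting a reader by file suffix is also only an assumption made to keep the example short.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical reader abstraction; not the SIS storage API.
interface FormatReader {
    boolean canRead(String fileName);
    String describe();
}

// Stand-in reader that recognizes files by suffix only (an assumption for brevity).
final class SuffixReader implements FormatReader {
    private final String label;
    private final List<String> suffixes;

    SuffixReader(String label, List<String> suffixes) {
        this.label = label;
        this.suffixes = suffixes;
    }

    @Override
    public boolean canRead(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : suffixes) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public String describe() {
        return label;
    }
}

// Tries the pure Java readers in order, then the native-backed fallback.
final class ReaderChain {
    private final List<FormatReader> pureJavaReaders;
    private final FormatReader nativeFallback;

    ReaderChain(List<FormatReader> pureJavaReaders, FormatReader nativeFallback) {
        this.pureJavaReaders = pureJavaReaders;
        this.nativeFallback = nativeFallback;
    }

    Optional<FormatReader> select(String fileName) {
        for (FormatReader reader : pureJavaReaders) {
            if (reader.canRead(fileName)) {
                return Optional.of(reader);
            }
        }
        // Only reached when no pure Java reader accepts the file.
        return nativeFallback.canRead(fileName)
                ? Optional.of(nativeFallback)
                : Optional.empty();
    }
}
```

With this ordering, a format that both sides can handle is always served by the pure Java reader, and the native-backed reader is consulted only for the remaining formats.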
> > A similar argument applies to Coordinate Transformation Services. We have
> > pure Java code (its port to SIS started last week, beginning with WKT),
> > but we plan to support Proj.4 through JNI even for map projections
> > available in pure Java, because in some situations a user may need a
> > guarantee of getting exactly the same results as PostGIS or MapServer, for
> > instance (those products are built on top of Proj.4).
> >
> >     Martin
> >
> >
>
