[Geotools-devel] GridCoverages in Geotools

Adrian Custer Thu, 28 Aug 2008 05:54:49 -0700

Hey Andrea,
        
        Like everyone else, I'm getting exhausted by this endless
        discussion. Here's an overview as best I understand things.

                        GRID COVERAGES IN GEOTOOLS
                        --------------------------

Geotools, among other things, handles "GridCoverages" which are, in
approximate language, massive matrices of numeric values. 

There are several separate issues with handling "GridCoverages":
      * Accessing the information from its original source,
      * Instantiating the objects and all the associated structures,
      * Using the objects to do useful work.

The Design of GridCoverage Objects:
-----------------------------------

Skipping the first of these until later, we can discuss the object
structure. 'GridCoverages' are built with classes in the 'Coverage'
package. 
        Note this package is misnamed, since true coverages are a more
        abstract notion, unifying vector and raster approaches, the
        nirvana of GIS. ISO 'Coverages' are based on the ISO 'Feature' 
        object structure which is the heart of GeoTools 'main'---here 
        we are still at a lower level.
The Coverage package depends on the classes in the base layer of
Geotools:

          Coverage (package)
             \/
        "Geotools Base:"
          Referencing/Parameter/Metadata (packages)

This class hierarchy was initially created for a separate project and
its arrival into GeoTools is what gave rise to 'Geotools-2'. 

Martin designed all this code with 2 core interests in mind: (1)
implementing well defined specifications designed by experts and (2)
following the Java design and philosophy. 
    As a consequence of (1), GeoTools coverage was designed for an older
spec of the Open Geospatial Consortium (OGC), which has since been updated
by a very closely related spec of the International Organization for
Standardization (ISO), so there are some pending updates to the package.
However, following these specifications guarantees a broad level of
generality for the resulting code. 
    The consequence of (2) is that the whole stack is designed, nay
optimized, for Java. In particular, the code was designed to integrate
deeply with the Java Advanced Imaging (JAI) system. 
    As a consequence of the dependency on lower levels, GridCoverage is
designed to work with the projection system in the Referencing package.
Since both Referencing and JAI use affine transforms for much of their
work, anyone wanting to work effectively with this code had better
really understand the power and value of matrix transforms; most
computer graphics textbooks lay this out.
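
To make the affine-transform point concrete, here is a minimal sketch
using only java.awt.geom.AffineTransform from the standard library. It
assumes a north-up image with no rotation or shear. The grid size, the
envelope, and the names GridToWorldSketch, gridToWorld and worldToGrid
are all invented for illustration; this is not code from the Referencing
package.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;
import java.awt.geom.Point2D;

/** Illustration only; the class and method names are invented for this sketch. */
final class GridToWorldSketch {

    /**
     * Builds the affine transform taking pixel coordinates (column, row), with
     * row 0 at the top, to world coordinates (x, y), with y increasing upwards.
     * Assumes a north-up image without rotation or shear.
     */
    static AffineTransform gridToWorld(double minX, double maxY,
                                       double pixelWidth, double pixelHeight) {
        // Matrix layout: [ m00 m01 m02 ]   [ pixelWidth   0            minX ]
        //                [ m10 m11 m12 ] = [ 0           -pixelHeight  maxY ]
        // Constructor argument order is (m00, m10, m01, m11, m02, m12).
        return new AffineTransform(pixelWidth, 0.0, 0.0, -pixelHeight, minX, maxY);
    }

    /** Maps a world position back to fractional pixel coordinates. */
    static Point2D worldToGrid(AffineTransform gridToWorld, double x, double y) {
        try {
            return gridToWorld.inverseTransform(new Point2D.Double(x, y), null);
        } catch (NoninvertibleTransformException e) {
            throw new IllegalArgumentException("transform cannot be inverted", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical 20 x 10 pixel image covering x in [10, 20] and y in [40, 45].
        AffineTransform g2w = gridToWorld(10.0, 45.0, 0.5, 0.5);
        System.out.println(g2w.transform(new Point2D.Double(0, 0), null));
        System.out.println(worldToGrid(g2w, 15.0, 42.5));
    }
}
```

The negative y scale is what flips image rows (counted downwards) into
world coordinates (counted upwards); a rotated or sheared image would
simply fill in the two zero entries of the matrix.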

Creating a GridCoverage involves creating lots of related classes which
define the geo-referencing and data content aspects of the 'numeric
matrix'. Since GridCoverage was built to leverage JAI, we can follow the
basic layout of JAI which has:

[JAI]          ImageReader        <---------->      RenderedImage

both of which are sub-classed for different file and image formats. The
ImageReader classes provide access to the data while the RenderedImage
classes allow us to work on the data with, among other functions, a way
to get an Image out of the structure. The Geotools GridCoverage class is
a wrapper around the RenderedImage construct which adds georeferencing
and data content information along with extra methods to leverage that
richer information. So we have:

[Geotools]     (missing)          <----------->      GridCoverage

with the first part, discussed in greater depth below, being some
implementation of the GridCoverageReader interface and the second the
GridCoverage implementation.
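
The wrapping idea can be sketched with nothing but the standard library.
GeoreferencedImage below is a hypothetical stand-in, not the GeoTools
GridCoverage class: it pairs a RenderedImage with an assumed north-up
grid-to-world transform and adds one method that only makes sense with
the extra information, namely sampling by world position.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.NoninvertibleTransformException;
import java.awt.geom.Point2D;
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;

/** Hypothetical stand-in for the wrapping idea; not the GeoTools GridCoverage API. */
final class GeoreferencedImage {
    private final RenderedImage image;
    private final AffineTransform gridToWorld;

    GeoreferencedImage(RenderedImage image, AffineTransform gridToWorld) {
        this.image = image;
        this.gridToWorld = new AffineTransform(gridToWorld); // defensive copy
    }

    /** The plain image is still available for ordinary image processing. */
    RenderedImage getRenderedImage() {
        return image;
    }

    /**
     * Returns the band-0 sample of the pixel containing world position (x, y).
     * Assumes the image origin is at pixel (0, 0).
     */
    int sampleAt(double x, double y) {
        Point2D grid;
        try {
            grid = gridToWorld.inverseTransform(new Point2D.Double(x, y), null);
        } catch (NoninvertibleTransformException e) {
            throw new IllegalStateException("grid-to-world transform is not invertible", e);
        }
        int col = (int) Math.floor(grid.getX());
        int row = (int) Math.floor(grid.getY());
        if (col < 0 || row < 0 || col >= image.getWidth() || row >= image.getHeight()) {
            throw new IllegalArgumentException("position outside the image: " + x + ", " + y);
        }
        // getData() copies the whole image: fine for a sketch, wasteful for real data.
        return image.getData().getSample(col, row, 0);
    }

    /** Builds a tiny made-up example: 4 x 2 pixels with one non-zero value. */
    static GeoreferencedImage demo() {
        BufferedImage img = new BufferedImage(4, 2, BufferedImage.TYPE_BYTE_GRAY);
        img.getRaster().setSample(3, 1, 0, 200);
        // Pixels of 1 x 1 world units, upper-left corner at world (100, 52).
        AffineTransform g2w = new AffineTransform(1.0, 0.0, 0.0, -1.0, 100.0, 52.0);
        return new GeoreferencedImage(img, g2w);
    }

    public static void main(String[] args) {
        System.out.println(demo().sampleAt(103.5, 50.5));
    }
}
```

The real classes carry far more than one transform (coordinate
reference system, envelope, sample dimensions), but the shape is the
same: image in the middle, meaning wrapped around it.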

Working with GridCoverages:
---------------------------

Because a GridCoverage wraps a JAI RenderedImage with georeferencing and
data content information, once we have created a GridCoverage, we can
work on it as if it were a RenderedImage, but we can also work on it with
georeferencing notions and with notions of the meaning of its data
contents. This is possible because of the slew of definitional objects
which were required to create the GridCoverage. So, for example, we can
readily run convolution filters on an image (JAI) and can also convert
images between reference systems (GeoTools).
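
As a small illustration of the image side of that statement, the sketch
below smooths an image with java.awt.image.ConvolveOp. Note the swap:
this is plain Java 2D from the standard library, used as a stand-in so
the example has no dependencies; it is neither JAI nor GeoTools, and the
class name SmoothingSketch is invented.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.util.Arrays;

/** Java 2D stand-in for a convolution filter; not JAI and not GeoTools. */
final class SmoothingSketch {

    /** Applies a 3 x 3 mean filter; edge pixels are copied through unchanged. */
    static BufferedImage smooth(BufferedImage source) {
        float[] weights = new float[9];
        Arrays.fill(weights, 1f / 9f);
        ConvolveOp op = new ConvolveOp(new Kernel(3, 3, weights), ConvolveOp.EDGE_NO_OP, null);
        return op.filter(source, null); // null: let the op allocate the destination
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(5, 5, BufferedImage.TYPE_BYTE_GRAY);
        img.getRaster().setSample(2, 2, 0, 180); // one bright pixel on black
        BufferedImage out = smooth(img);
        System.out.println("centre before: 180, after: " + out.getRaster().getSample(2, 2, 0));
    }
}
```

The single bright pixel gets spread over its 3 x 3 neighbourhood. With a
georeferenced wrapper around the image, the same result would still know
where on the ground each smoothed pixel sits.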

Martin wrote all this code so he could study tuna in the Indian
Ocean based on satellite imagery and oceanographic studies. So the bias
in the design was originally towards exploiting GridCoverages for
scientific statistical analysis. Actually visualizing the data is really
only a side benefit of his leveraging JAI in the core design. Anyhow,
the general idea is that once you have a GridCoverage, you have the data
structure against which you can do your work.

Going from data source to GridCoverage:
---------------------------------------

Martin, having produced the metadata, referencing, and coverage
packages, did not add the last piece, which would provide a unified
approach to reading data. This is simply because that code has not yet
been written.

That said, there are two things to keep in mind. First, anyone can come
up relatively quickly with their own way to handle their own data and
build themselves the required objects to make a GridCoverage. Second,
the fully general, efficient, and flexible approach to accessing files
in various formats, from simple 2D georeferenced images, to sets of
tiles and pyramids, to 5-dimensional data sets, has not yet even been
fully defined.

As it stands today, there is an interface defined, 'GridCoverageReader',
but Martin did not create any general, abstract implementation, leaving
it to the users of GeoTools to define their own implementation for their
own needs, presumably leveraging the work of Java Advanced Imaging's
ImageIO project, other libraries, and their ingenuity.

The general solution to accessing 'massive matrix' geospatial data
sources is currently being explored. One of the steps along the way
involves defining an XML schema for the relevant pieces of metadata that
would need to be defined for the different data sources. There are a few
discussions going on around the Web on defining this metadata format.
This does require an enormous grasp of all the different types of data
possible and all the relevant information one would need to know about
them so this is not a small task, just by itself.

It is here that there is currently a split or duplication of approaches,
at this bottom level of how to implement the GridCoverageReader
interface to provide access to the contents of a 'massive matrix' 
data source in order, first, to extract 
     1. the metadata relevant to the data contents and 
     2. the data contents themselves, 
and second, to use that information to create GeoTools GridCoverage
objects. 
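
The two-phase shape described above (metadata first, then values, then
assembly) might look like the sketch below. Everything in it is
hypothetical: the TinyGridReader class, the GridMetadata and Grid types,
and the made-up text format of one header line "cols rows minX maxY
cellSize" followed by one line of values per row. None of it is the
GeoTools GridCoverageReader interface or any real file format; it only
illustrates the split.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical reader for a made-up text grid; not the GeoTools GridCoverageReader. */
final class TinyGridReader {

    /** Step 1 result: what the numbers mean and where they sit. */
    static final class GridMetadata {
        final int cols, rows;
        final double minX, maxY, cellSize;
        GridMetadata(int cols, int rows, double minX, double maxY, double cellSize) {
            this.cols = cols; this.rows = rows;
            this.minX = minX; this.maxY = maxY; this.cellSize = cellSize;
        }
    }

    /** Final result: metadata and values bundled into one object. */
    static final class Grid {
        final GridMetadata metadata;
        final double[][] values; // values[row][col], row 0 at the top
        Grid(GridMetadata metadata, double[][] values) {
            this.metadata = metadata; this.values = values;
        }
    }

    /** Step 1: parse the header line "cols rows minX maxY cellSize". */
    static GridMetadata readMetadata(BufferedReader in) throws IOException {
        String header = in.readLine();
        if (header == null) throw new IOException("missing header line");
        String[] f = header.trim().split("\\s+");
        if (f.length != 5) throw new IOException("malformed header: " + header);
        return new GridMetadata(Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Double.parseDouble(f[2]), Double.parseDouble(f[3]), Double.parseDouble(f[4]));
    }

    /** Step 2: read the values, one text line per grid row. */
    static double[][] readValues(BufferedReader in, GridMetadata md) throws IOException {
        double[][] values = new double[md.rows][md.cols];
        for (int r = 0; r < md.rows; r++) {
            String line = in.readLine();
            if (line == null) throw new IOException("missing row " + r);
            String[] f = line.trim().split("\\s+");
            if (f.length != md.cols) throw new IOException("row " + r + " has " + f.length + " values");
            for (int c = 0; c < md.cols; c++) values[r][c] = Double.parseDouble(f[c]);
        }
        return values;
    }

    /** Steps 1 and 2, then assembly of the combined object. */
    static Grid read(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        GridMetadata md = readMetadata(in);
        return new Grid(md, readValues(in, md));
    }

    /** Convenience for in-memory text; rethrows IOException unchecked. */
    static Grid parse(String text) {
        try {
            return read(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Grid g = parse("3 2 100.0 52.0 0.5\n1 2 3\n4 5 6\n");
        System.out.println(g.metadata.cols + " x " + g.metadata.rows
                + ", value[1][2] = " + g.values[1][2]);
    }
}
```

Keeping the metadata step separate means a caller can inspect size and
position before deciding whether to pay for reading the values at all.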

You, and everyone, can decide which approach you prefer. You can:
     1. figure out your own way, 
     2. use the 'coverageio' approach, 
     3. use one of the 'coverage.grid*' / spike / imageio-ext approaches

Option 1 is probably what you want to do in your case since you have 
a highly specific data structure coming from GRASS along with all the 
metadata associated with it that you know about and can leverage. You 
may wish to follow the spirit of someone else's design but that's a 
separate issue for you to decide.

Option 2 is Martin's ongoing work. He's currently doing fun things
storing the metadata in his "PostGRID" database schema for a
"PostGIS/PostgreSQL" system and doing image mosaics/pyramids. There
is a coverageio package for NetCDF files, I believe, that can give you
the full code he uses to go from a complex file to GridCoverage.

Option 3 is the work of GeoSolutions which Simone already wrote you
about and which I understand even less well than the rest of the stuff I
present above. If you are interested in that, talk to him about his
different packages and projects.

Hopefully, this message lays out an overview of where we stand, where we
are going, and gives you the background to then go drill into code. 

A final note: The code is not easy for several reasons. The foremost of
these is that the science behind this is a wee bit complex. Lots of
people generate these 'massive matrices' all for totally different
reasons and each in totally different ways: some by lobbing buses into
space to take pictures of the ground, others by building computer
clusters to generate general circulation models of some fluid basin, and
jokers like me go out and take photographs of a landscape with their
digital cameras and want to use the result. Understanding the
consequences of those approaches, how they may want to be exploited for
analysis, and creating a general framework for handling all that is
*complex*. Fortunately, on one side, some brilliant people thought long
and hard to lead the way by creating some specifications which others
have worked hard to improve and on the other, Martin worked long and
hard to implement a core foundation of code for us to work with. 

The second difficulty is that understanding Martin's code is not easy.
Above all, his work needs a set of overview documents for
each big chunk of work, summarizing both the related specifications and
the scientific background while tracing the big picture through the core
classes. However, while it is *hard* to get started, actually working
with Martin's code can be a real joy: it's well written, thoroughly
documented, and has lots of helper code. Understanding Martin's code and
his design can bring a better understanding of the complexity of
handling GridCoverages and can teach most of us how to write elegant and
well-structured Java code.

--adrian

        P.S. The above document may well have errors/omissions/other
        issues. It is provided on a 'best effort' basis to help those
        who want to get started but only the code itself is
        authoritative.

        P.P.S. I'm shooting at the mirror, you just happen to be 
        sitting in front of it.

August 2008
Montpellier

On Wed, 2008-08-27 at 06:33 +0200, andrea antonello wrote:
> Hi list,
> this will sound strange, but since I tend to stay in the udig life and
> not to get my hands too dirty in geotools, this comes all new to me.
> 
> I followed some fighting of the coverage gods in the past. But what I
> noticed only recently is that there are two separate efforts on the
> coverage side (is this the moment many of you are saying that this is
> no news? well I didn't know and probably others do not).
> I don't want to enter the philosophical part of this story, but since
> now I need to finish my port to geotools of the GRASS raster
> reader/writer, I need to discuss a bit.
> 
> My impression is that the two trails are somewhat different:
> - imageio-ext seems to be dedicated to imagery and its optimization in
> visualization
> - coverageio seems to be more scientific and with the need to follow
> the standards strictly
> 
> I agree that this is a very superficial analysis, which is more due to
> my current needs than to other things
> 
> In fact JGrass is a scientific library for terrain analysis and we are
> going to propose at the next openmi technical meeting some spatial
> implementations, for which it is mandatory for us to follow strict
> standards.
> Based on these facts coverageio seems to be the proper choice for us,
> even if we already started on the imageio-ext side.
> I'm getting a headache...
> 
> Please, can I ask for some comment on all this,
> Thanks,
> The piano player
> 
> 
> 
> PS: don't shoot the piano player
