On Fri, 2008-06-13 at 12:27 +0100, Jon Blower wrote:
> Hi Michael,
> 
> Thanks very much for this.
> 
> > My naive 2p / 2c worth is that the domain of a coverage is simply that
> > region within which data are defined.
> 
> I like this definition because I understand it!  However, I'm not sure
> that everyone has the same view.  I think the $64000 question is: does
> the domain for a single coverage have to be contiguous?  If so, this
> would seem to rule out the use of a Coverage for a discretely-sampled
> domain in which you don't want to apply interpolation of any kind.

No, a coverage domain does not need to be contiguous. A coverage, in its
most abstract form, is merely:

                         some set of direct positions 
                            for all of which we have
                               a set of values. 
        
The set of direct positions may be finite (e.g. a random or regular
group of points) or infinite (e.g. a set of polygons, or a mix of points,
lines and polygons). The values can be of any kind of measure: nominal,
ordinal, interval, or ratio, and can also be a vector of possibly mixed
values. The coverage is however *required* to have a value for each
position in the original set. The key to the discrete/continuous
distinction will be how the values are generated.
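
To make that abstract definition concrete, here is a minimal Python
sketch. It is not the GeoAPI or GeoTools interface; the names Coverage,
contains, value_at, STATIONS and the sample values are all invented for
illustration. It only shows "a set of positions, each of which has a
value", once with a finite domain and once with an infinite one.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

Position = Tuple[float, float]


@dataclass(frozen=True)
class Coverage:
    """Hypothetical abstraction: a domain test plus a value lookup."""
    contains: Callable[[Position], bool]   # is this position in the domain?
    value_at: Callable[[Position], Any]    # value for a position in the domain

    def evaluate(self, position: Position) -> Any:
        # A coverage must have a value for every position in its domain,
        # and has nothing to say about positions outside it.
        if not self.contains(position):
            raise ValueError(f"{position} is outside the coverage domain")
        return self.value_at(position)


# Finite domain: three isolated points, each with a record of mixed values.
# The station names and temperatures are made up.
STATIONS = {
    (0.0, 0.0): {"name": "alpha", "temperature_c": 11.5},
    (5.0, 2.0): {"name": "beta", "temperature_c": 9.0},
    (9.0, 9.0): {"name": "gamma", "temperature_c": 14.2},
}
station_coverage = Coverage(
    contains=STATIONS.__contains__,
    value_at=STATIONS.__getitem__,
)

# Infinite domain: every position in the unit disc, with a ratio value
# (distance from the origin).
disc_coverage = Coverage(
    contains=lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0,
    value_at=lambda p: (p[0] ** 2 + p[1] ** 2) ** 0.5,
)
```

Note that station_coverage refuses to answer between the sample points:
nothing in the definition forces an interpolation onto a
discretely-sampled domain.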

An example of a discrete coverage might be "countryNameCoverage". The
domain of such a coverage could be the set of polygons of territories
claimed to be occupied by some nation state (i.e. in today's world, all
the land masses.) For our purposes let's call this a set of mostly
non-overlapping polygons. The coverage, by its construction, guarantees
that for any point within those polygons we can get a "name" for a
country. So if we give it a point in Boston we get 'USA', if we give it
a point in the Mississippi we get 'USA', but if we give it a point in the
Amazon, we could get 'Brazil'. The coverage can define rules about how
it resolves disputed areas like land in Antarctica. So this is
'discrete' in that we get the same result wherever we are within a
polygon---any point we ask for within the Alaska polygon will always
give us 'USA'.

An example of a continuous coverage, based on the same polygons, could be
"populationDensityCoverage". There, two queries in the Alaska
polygon could return completely different values and indeed, in general,
we expect different points to have different values even within the same
polygon.

Do not confuse a continuous coverage with continuity of the values,
however. We could have a third coverage using the same polygons,
"sexOfclosestHumanCoverage", which would return 'female', 'unknown' or
'male' for any position in the polygons. Again every point in Alaska
would have an answer, but that answer would be different for different
places in Alaska, so the coverage is continuous although the values are
not.
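
The three examples above can be sketched in Python over one shared
domain. Everything here is hypothetical: two non-touching rectangles
stand in for the country polygons, and the region keys, the density
gradient and the labelled sample points are invented. The point is only
that the same, non-contiguous domain supports a discrete coverage, a
continuous one, and a continuous one with categorical values.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

# Hypothetical domain: two rectangles that neither touch nor overlap.
REGIONS: Dict[str, Box] = {
    "A": (0.0, 0.0, 10.0, 10.0),
    "B": (20.0, 0.0, 30.0, 10.0),
}

# Hypothetical labelled sample points used by the third coverage.
SAMPLES: List[Tuple[Position, str]] = [
    ((2.0, 2.0), "female"),
    ((8.0, 8.0), "male"),
    ((25.0, 5.0), "unknown"),
]


def region_of(p: Position) -> str:
    """Return the key of the rectangle containing p; fail outside the domain."""
    x, y = p
    for key, (xmin, ymin, xmax, ymax) in REGIONS.items():
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return key
    raise ValueError(f"{p} is outside the coverage domain")


def region_name_coverage(p: Position) -> str:
    """Discrete: the same value everywhere within one rectangle."""
    return region_of(p)


def density_coverage(p: Position) -> float:
    """Continuous: the value varies with position inside a rectangle.

    The linear fall-off from the west edge is a made-up rule.
    """
    xmin, _, xmax, _ = REGIONS[region_of(p)]
    return 100.0 * (xmax - p[0]) / (xmax - xmin)


def nearest_label_coverage(p: Position) -> str:
    """Continuous coverage whose values are categorical, not continuous."""
    region_of(p)  # raises if p is outside the domain
    return min(SAMPLES, key=lambda s: math.dist(s[0], p))[1]
```

Two queries inside rectangle "A" always agree for region_name_coverage
but generally differ for the other two, and a query in the gap between
the rectangles fails for all three.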

An 'image' can be turned into a coverage in several ways although a
common one will be to characterize the domain as a single, continuous
rectangular block everywhere over which the coverage can return a vector
of values, say one value for each of the image bands. The return vector
would vary with position across the single domain so this would be a
continuous construction. Alternatively, we could define an 'image' as a
multi-domain discrete coverage where the domain is a set of equally
sized polygons arranged side-by-side and the value returned is the same
for each position within each polygon. Note in passing that we are not
talking about how the values are generated---that's a detail of the
internals of the coverage.
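
As a rough Python sketch of those two readings of an image: the 2x2
two-band IMAGE below and both function names are invented, and bilinear
interpolation is just one possible choice for the continuous reading,
not something the spec prescribes. Both functions take a position
(x, y) in the rectangle [0, COLS] x [0, ROWS] and return one value per
band.

```python
from typing import Tuple

# Hypothetical image: 2 rows x 2 columns, two bands per pixel.
IMAGE = [
    [(0.0, 10.0), (2.0, 10.0)],
    [(4.0, 30.0), (6.0, 30.0)],
]
ROWS, COLS = len(IMAGE), len(IMAGE[0])


def _check(x: float, y: float) -> None:
    """Reject positions outside the rectangular image extent."""
    if not (0.0 <= x <= COLS and 0.0 <= y <= ROWS):
        raise ValueError(f"({x}, {y}) is outside the coverage domain")


def cell_coverage(x: float, y: float) -> Tuple[float, ...]:
    """Discrete reading: unit-square cells, one constant vector per cell."""
    _check(x, y)
    col = min(int(x), COLS - 1)  # the far edge belongs to the last cell
    row = min(int(y), ROWS - 1)
    return IMAGE[row][col]


def bilinear_coverage(x: float, y: float) -> Tuple[float, ...]:
    """Continuous reading: one rectangle, value varies with position.

    Pixel values are assumed to sit at cell centres; positions nearer
    the border than half a cell are clamped to the outermost centres.
    """
    _check(x, y)
    fx = min(max(x - 0.5, 0.0), COLS - 1)
    fy = min(max(y - 0.5, 0.0), ROWS - 1)
    c0, r0 = int(fx), int(fy)
    c1, r1 = min(c0 + 1, COLS - 1), min(r0 + 1, ROWS - 1)
    tx, ty = fx - c0, fy - r0
    bands = []
    for b in range(len(IMAGE[0][0])):
        top = IMAGE[r0][c0][b] * (1 - tx) + IMAGE[r0][c1][b] * tx
        bottom = IMAGE[r1][c0][b] * (1 - tx) + IMAGE[r1][c1][b] * tx
        bands.append(top * (1 - ty) + bottom * ty)
    return tuple(bands)
```

Both functions agree at the cell centres; they differ only in what they
return in between, which is exactly the internal detail mentioned above.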

In many ways coverages are the end goal of GI Systems, so they are rich
and complex. Also, GeoTools has for a long time mixed up the notion of
GridCoverage with the notion of Coverage, causing some confusion.

For details of the construction of these things, please look at the spec
itself. 

Also, you should all be aware there's a big doc written by Bryce trying
to play with these notions. He's good at giving a 'read' of the specs he
looks at so one can compare one's own understanding to someone else's
(i.e. his) as one reads. See
  http://docs.codehaus.org/display/GEOTOOLS/ISO+19123+progress+and+future+plan
for details.

Hope that helps---it's hard territory.

--adrian



> 
> If domains are allowed to be non-contiguous then this complicates
> things somewhat but not impossibly so.
> 
> I think this has revealed a problem with terminology.  It seems that
> GeoAPI/Tools interprets a DiscreteCoverage to be a
> discretely-*sampled* coverage, which is nevertheless conceptually a
> contiguous region (with the gaps filled in by nearest-neighbour
> interpolation).  A ContinuousCoverage might also be discretely sampled
> but the gaps are filled by some other interpolation method.  I don't
> think this is a very obvious use of the terms.
> 
> By contrast, I believe that the Climate Science Modelling Language (a
> GML Application Schema) regards a DiscreteCoverage to be
> non-contiguous, i.e. the domain consists of a number of sub-domains
> that do not touch or intersect.  Andrew Woolf or Dom Lowe would be
> able to confirm whether my interpretation is correct here.
> 
> For the record the CSML interpretation is more obvious to me.  The
> GeoAPI/Tools interpretation differentiates on the basis of
> interpolation method, which I do not find very satisfactory.
> 
> But I haven't read the ISO specs - I'm hoping that someone else will
> do that! ;-)
> 
> Cheers, Jon
> 
> P.S. I know this conversation has become a bit fragmented.  I'll try
> to find time to type this up on a web page or something.
> 
> On Fri, Jun 13, 2008 at 9:29 AM, Michael Bedward
> <[EMAIL PROTECTED]> wrote:
> > G'day Jon,
> >
> > Good questions - I like it when computing, science and epistemology collide :)
> >
> > My naive 2p / 2c worth is that the domain of a coverage is simply that
> > region within which data are defined.  I will now try to argue that
> > that is not a tautology...
> >
> > Following on from your example of a set of points, yes - we might
> > decide to restrict ourselves to the convex hull and call that the
> > domain, but there are many other possibilities.  Based on our
> > knowledge of the data, prior experience, available literature etc. we
> > may well feel confident in defining a domain boundary that extends
> > some way beyond the data points.  This may end up being represented
> > digitally as a coverage within which there is a data domain, possibly
> > quite complex in shape, with some surrounding NODATA area.  Extending
> > this idea further, we might get trendily Bayesian :) and dispense with
> > a hard domain boundary altogether, defining instead a gradient of
> > 'reliability of interpolation' or 'expected predictive accuracy' or
> > some such term.  Then, when we use data directly from this domain, or
> > aggregate it, or make inferences from it, we will also take into
> > account the predictive accuracy to put bounds around our results.
> >
> > I think there is an argument for not attaching an interpolation method
> > to a coverage.  I'll give a real example here.  Decisions about the
> > conservation status of plant and animal species are frequently made on
> > the basis of fairly coarse raster data, e.g. national or state-wide
> > censuses where data from a wide variety of sources, collected with
> > different methods and at different scales, have been aggregated into a
> > relatively small number of grid cells.   If part of the decision
> > process involves determining the area over which a species occurs then
> > the grid cell size is obviously important.  There are examples where
> > it has been impossible for any species to be rated at the highest
> > status because there was an area threshold that was smaller than the
> > grid cell size!  Some researchers have looked at ways of
> > interpolating within cells in such raster data, based on theoretical
> > patterns of distribution (e.g. fractal scaling) and/or expert
> > knowledge of fine scale factors influencing a species' occurrence.
> > Whether or not you want to do this will depend on (a) the nature of
> > the exercise (b) the available data and your confidence in the
> > theoretical underpinnings of the approach (c) convincing the punters
> > to accept it :)  These are case-specific decisions and not something
> > that is bound up in the coverage itself.
> >
> > Jeepers, I've gone on a bit there - sorry.  But it's very interesting
> > stuff and well worth discussing because of all the very practical
> > implications of alternative approaches.
> >
> > cheers
> > Michael
> 


_______________________________________________
Geotools-gt2-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
