Re: [CF-metadata] CF feature types and definitions

2010-12-29 Thread Nan Galbraith

Bob's message reminds me that once these terms are agreed upon
they may be used for many purposes, across many different types
of systems, while a lot of the discussion of this convention has been
focused on just describing the shape of a chunk of data to be returned
by software queries.

For that reason, I still feel strongly that 'timeSeries' data should be
defined in a way that allows data from moored instruments at multiple
depths in a single file.  In the current version, the term 'location' in the
definition of the timeSeries feature type is misleading; it can be taken
as x-y location, but I think you mean x-y-z. The definition needs to
be more specific, and I hope it will allow multiple depths.

A time-consuming search of the email archive didn't turn up a specific
message, but I recall that in the past I've been told that moorings with
data at different depths should be classified as a collection of profiles.
While that might be acceptable in terms of defining the shape of the
data returned in a query, it will not be useful when these terms are used
in other ways.

There's a big difference between a series of profiles at a single x-y location
and a time series of data taken at the same x-y location by different
instruments at different depths. Not only are different variables measured
at different depths, the characteristics of the measurements can be different:
instruments have different response times, resolution/accuracy, ranges, etc.


As I've said before, it would be a pretty hard sell to describe our station
time series data sets as collections of profiles; a terrific 2d time series
is not likely to be seen as a good collection of profiles.

I hope we can find a better way to include data from moored buoys in
this vocabulary, without having to distort the data into something else.

Thanks -
Nan



On 12/23/10 11:06 AM, John Caron wrote:
Attached is a message from Bob Simons at ERD/NOAA pointing out the 
inconsistencies in data type and feature type names in various 
Unidata-related efforts. The almost-ready CF discrete sampling 
proposal has made a start at standardizing some of these names, and 
there is an interest, I think, among Steve, Jon and me to extend that 
to other types like grid. Essentially it's a controlled vocabulary for 
classifying data.


If this group is interested, I would propose a new ticket that would 
probably add an Appendix specifying this vocabulary and its 
definitions. I anticipate it will be added to and clarified over time.


John

-------- Original Message --------
Subject: Re: [netcdf-java] CDM names
Date:    Thu, 23 Dec 2010 08:48:41 -0700
From:    John Caron ca...@unidata.ucar.edu
To:      netcdf-j...@unidata.ucar.edu



Hi Bob:

Yes, you are right, there are too many forms of the data type and 
feature type names, with different lineages.


1) The CF discrete sampling proposal will be the recommended one for 
point data when that's finalized. Unfortunately, it will be somewhat 
different from what's gone before. The CF: prefix is dropped until the 
namespace proposal can be completed. So those feature types are now 
proposed to be:


* *point*: one or more parameters measured at a set of points in
  time and space
* *timeSeries*: a time-series of data points at the same location,
  with varying time
* *trajectory*: a connected set of data points along a 1D curve in
  time and space
* *profile*: a set of data points along a vertical line
* *timeSeriesProfile*: a time-series of profiles at a named location
* *trajectoryProfile*: a collection of profiles which originate
  along a trajectory

The CDM will be backwards compatible, including:

*   accepting the CF: prefix
*   being case insensitive
*   station and stationTimeSeries as aliases for timeSeries
*   stationProfile as alias for timeSeriesProfile
*   section as alias for trajectoryProfile
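
The backward-compatibility rules above can be sketched in a few lines. This is only an illustration in Python, not the CDM library's actual code: the function name `normalize_feature_type` and the lookup tables are hypothetical, and the only facts taken from the message are the six proposed names, the optional CF: prefix, case insensitivity, and the listed aliases.

```python
# Hypothetical sketch of the compatibility rules listed above.
# None of these names come from the CDM library itself.

# The six proposed feature type names, in their canonical spelling.
CANONICAL = (
    "point",
    "timeSeries",
    "trajectory",
    "profile",
    "timeSeriesProfile",
    "trajectoryProfile",
)

# Older names accepted as aliases, per the list above.
ALIASES = {
    "station": "timeSeries",
    "stationTimeSeries": "timeSeries",
    "stationProfile": "timeSeriesProfile",
    "section": "trajectoryProfile",
}

# Lower-cased lookup table so that matching is case insensitive.
_LOOKUP = {name.lower(): name for name in CANONICAL}
_LOOKUP.update({alias.lower(): target for alias, target in ALIASES.items()})


def normalize_feature_type(raw):
    """Return the canonical feature type name, or None if unrecognized."""
    name = raw.strip()
    # Accept, but do not require, the legacy "CF:" prefix.
    if name.lower().startswith("cf:"):
        name = name[3:].strip()
    return _LOOKUP.get(name.lower())
```

Returning None for an unknown name, rather than raising, is a design choice of this sketch; types such as grid or swath are not yet part of the proposed list, so a caller would have to decide how to treat them.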

I know that CF wants to standardize on other feature types also. It's 
hard to anticipate what they will come up with, but it's likely:


* grid
* swath

maybe:

* image
* radial
* unstructuredGrid

2) The DataDiscoveryAttConvention is due for another round of work, 
especially in light of the ISO work that Ted and Dave have done. That 
might be a good opportunity to try to reconcile the names.


3) I will work on the CDM library to standardize.

Thanks for bringing this up.

On 12/22/2010 4:36 PM, Bob Simons wrote:
It is unfortunate that the CDM names listed at the sites below are 
all slightly different (different sets of names, different names for 
the same feature, different case).
And it is unfortunate that there are two global attributes to 
identify the CDM feature/data type (#2 and #3 below).


Is it possible that these could be standardized?

1) http://www.unidata.ucar.edu/software/netcdf-java/v4.0/javadoc/index.html
   CF.FeatureType  (e.g., stationTimeSeries)


2) The cdm_data_type global attribute:

Re: [CF-metadata] CF feature types and definitions

2010-12-29 Thread Lowry, Roy K.

Hi Nan,

Let's consider how your moored instrument data are mapped into NetCDF.  If a 
given parameter is stored as a 2-D array with time as one dimension and 
instrument depth as the other then I would describe the data in that array as a 
'profile series'.   If they are stored as a set of 1-D vectors against a common 
time channel, then I would call them a 'time series'.  It is perfectly possible 
to have a mixture of 'profile series' and 'time series' in a single file - such 
as 2-D current arrays and 1-D water temperature from an ADCP.
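
As a rough illustration of the per-variable rule just described, the sketch below labels a variable from its dimension names alone. It is written in Python under stated assumptions: the dimension names "time" and "depth" and the function name `classify_variable` are hypothetical, and real files may name their dimensions differently.

```python
def classify_variable(dims, time_dim="time", depth_dim="depth"):
    """Label one variable from its dimension names.

    dims is a sequence of dimension names, e.g. ("time", "depth").
    The names "time" and "depth" are assumptions of this sketch.
    """
    dims = tuple(dims)
    # A 1-D vector against the common time channel.
    if dims == (time_dim,):
        return "time series"
    # A 2-D array with time on one axis and instrument depth on the other,
    # in either order.
    if len(dims) == 2 and set(dims) == {time_dim, depth_dim}:
        return "profile series"
    # Anything else falls outside the two cases discussed here.
    return None
```

For the ADCP example, a current variable with dimensions ("time", "depth") would be labelled 'profile series' and a water temperature variable with dimensions ("time",) would be labelled 'time series', both in the same file.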

So far I have considered the feature type as a property of individual 
variables, not the file as a whole.  Is this the CF intention (it's been a long 
time since I read the proposal)?  If so, I can't see Nan's problem. However, I 
think she is talking about file-level feature types, which we also need in our 
system to drive visualisation software. What I've done is to define our 
'profile series' equivalent as a file containing one or more 2D arrays (plotted 
as contoured parameters) with zero or more 1D vectors (plotted as time series 
plots stacked with the contour plots on a common time x-axis).  Our 'time 
series' is defined as one or more 1D vectors that are plotted as a stacked time 
series plot. In our working practice, these come from a single instrument, but 
providing all instruments have a common time channel this doesn't have to be 
so. Consequently, 'profile series' and 'time series' work for me as all I want 
to do is plot the data as discrete variables.
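
The file-level definitions in the paragraph above amount to a small decision rule, sketched here in Python. The function name `classify_file` and the idea of passing in one rank (number of dimensions) per data variable are assumptions made for illustration; this is not code from any existing system.

```python
def classify_file(variable_ranks):
    """Label a file from the ranks of its data variables.

    variable_ranks holds one integer per data variable:
    2 for a (time, depth) array, 1 for a vector against time.
    """
    ranks = list(variable_ranks)
    # Only 1-D vectors and 2-D arrays are covered by the two definitions.
    if not ranks or any(rank not in (1, 2) for rank in ranks):
        return None
    # One or more 2-D arrays, with zero or more 1-D vectors alongside.
    if 2 in ranks:
        return "profile series"
    # Otherwise the file holds one or more 1-D vectors only.
    return "time series"
```

Under this rule a file with two 2-D current arrays and one 1-D temperature vector is a 'profile series', which drives a contour plot stacked with a time series plot on a common time axis.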

However, Nan, I'm guessing that you have other use cases.  It would be helpful 
for my understanding of what you need if you could give examples of the 
variables that would be in your files and the information that you expect to 
obtain from the file feature type. This should help clarify any extensions 
required to the feature type vocabulary.

Cheers, Roy.

From: cf-metadata-boun...@cgd.ucar.edu [cf-metadata-boun...@cgd.ucar.edu] On 
Behalf Of Nan Galbraith [ngalbra...@whoi.edu]
Sent: 29 December 2010 18:41
To: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] CF feature types and definitions
