Re: [CF-metadata] CF feature types and definitions
Bob's message reminds me that once these terms are agreed upon they may be used for many purposes, across many different types of systems, while a lot of the discussion of this convention has been focused on just describing the shape of a chunk of data to be returned by software queries. For that reason, I still feel strongly that 'timeSeries' data should be defined in a way that allows data from moored instruments at multiple depths in a single file. In the current version, the term 'location' in the definition of the timeSeries feature type is misleading; it can be taken as x-y location, but I think you mean x-y-z. The definition needs to be more specific, and I hope it will allow multiple depths. A time-consuming search of the email archive didn't turn up a specific message, but I recall that in the past I've been told that moorings with data at different depths should be classified as a collection of profiles. While that might be acceptable in terms of defining the shape of the data returned in a query, it will not be useful when these terms are used in other ways. There's a big difference between a series of profiles at a single x-y location and a time series of data taken at the same x-y location by different instruments at different depths. Not only are different variables measured at different depths, the characteristics of the measurements can be different - instruments have different response times, resolution/accuracy, ranges, etc. As I've said before, it would be a pretty hard sell to describe our station time series data sets as collections of profiles; a terrific 2d time series is not likely to be seen as a good collection of profiles. I hope we can find a better way to include data from moored buoys in this vocabulary, without having to distort the data into something else. 
Thanks - Nan

On 12/23/10 11:06 AM, John Caron wrote:

Attached is a message from Bob Simons at ERD/NOAA pointing out the inconsistencies in data type and feature type names in various Unidata-related efforts. The almost-ready CF discrete sampling proposal has made a start at standardizing some of these names, and there is an interest, I think, among Steve, Jon and me to extend that to other types like grid. Essentially it's a controlled vocabulary for classifying data. If this group is interested, I would propose a new ticket that would add, probably as an Appendix, a specification of this vocabulary and its definitions. I anticipate it will be added to and clarified over time.

John

Original Message
Subject: Re: [netcdf-java] CDM names
Date: Thu, 23 Dec 2010 08:48:41 -0700
From: John Caron ca...@unidata.ucar.edu
To: netcdf-j...@unidata.ucar.edu

Hi Bob:

Yes, you are right, there are too many forms of the data type and feature type names, with different lineages.

1) The CF discrete sampling proposal will be the recommended one for point data when that's finalized. Unfortunately, it will be somewhat different from what's gone before. The CF: prefix is dropped until the namespace proposal can be completed.
So those feature types are now proposed to be:

* *point*: one or more parameters measured at a set of points in time and space
* *timeSeries*: a time-series of data points at the same location, with varying time
* *trajectory*: a connected set of data points along a 1D curve in time and space
* *profile*: a set of data points along a vertical line
* *timeSeriesProfile*: a time-series of profiles at a named location
* *trajectoryProfile*: a collection of profiles which originate along a trajectory

The CDM will be backwards compatible, including:

* accepting the CF: prefix
* being case insensitive
* station and stationTimeSeries as aliases for timeSeries
* stationProfile as alias for timeSeriesProfile
* section as alias for trajectoryProfile

I know that CF wants to standardize on other feature types also. It's hard to anticipate what they will come up with, but it's likely:

* grid
* swath

maybe:

* image
* radial
* unstructuredGrid

2) The DataDiscoveryAttConvention is due for another round of work, esp. in light of the ISO work that Ted and Dave have done. That might be a good opportunity to try to reconcile.

3) I will work on the CDM library to standardize.

Thanks for bringing this up.

On 12/22/2010 4:36 PM, Bob Simons wrote:

It is unfortunate that the CDM names listed at the sites below are all slightly different (different sets of names, different names for the same feature, different case). And it is unfortunate that there are two global attributes to identify the CDM feature/data type (#2 and #3 below). Is it possible that these could be standardized?

1) http://www.unidata.ucar.edu/software/netcdf-java/v4.0/javadoc/index.html CF.FeatureType (e.g., stationTimeSeries)
2) The cdm_data_type global attribute:
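The backward-compatibility rules in John Caron's reply above (accept the CF: prefix, ignore case, map the older aliases onto the proposed names) can be illustrated with a small lookup. This is only a sketch of those stated rules, not the netcdf-java/CDM implementation; the function name `normalize_feature_type` and the table names are hypothetical and chosen for illustration.

```python
# Proposed feature type names, as listed in the message above.
CANONICAL = (
    "point",
    "timeSeries",
    "trajectory",
    "profile",
    "timeSeriesProfile",
    "trajectoryProfile",
)

# Older names that the message says will be accepted as aliases.
ALIASES = {
    "station": "timeSeries",
    "stationTimeSeries": "timeSeries",
    "stationProfile": "timeSeriesProfile",
    "section": "trajectoryProfile",
}

# Case-insensitive lookup table: lower-cased name -> proposed name.
_LOOKUP = {name.lower(): name for name in CANONICAL}
_LOOKUP.update({alias.lower(): target for alias, target in ALIASES.items()})


def normalize_feature_type(raw):
    """Return the proposed feature type name for `raw`, or None if unknown.

    Hypothetical helper: strips an optional 'CF:' prefix, ignores case,
    and resolves the aliases listed above.
    """
    name = raw.strip()
    if name.lower().startswith("cf:"):
        name = name[3:]
    return _LOOKUP.get(name.strip().lower())


if __name__ == "__main__":
    for value in ("CF:stationTimeSeries", "SECTION", "profile", "grid"):
        print(value, "->", normalize_feature_type(value))
```

Names such as grid or swath return None here because the message lists them only as likely future additions, not as part of the current proposal.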
Re: [CF-metadata] CF feature types and definitions
Hi Nan,

Let's consider how your moored instrument data are mapped into NetCDF. If a given parameter is stored as a 2-D array with time as one dimension and instrument depth as the other, then I would describe the data in that array as a 'profile series'. If they are stored as a set of 1-D vectors against a common time channel, then I would call them a 'time series'. It is perfectly possible to have a mixture of 'profile series' and 'time series' in a single file - such as 2-D current arrays and 1-D water temperature from an ADCP.

So far I have considered the feature type as a property of individual variables, not the file as a whole. Is this the CF intention (it's been a long time since I read the proposal)? If so, I can't see Nan's problem. However, I think she is talking about file-level feature types, which we also need in our system to drive visualisation software. What I've done is to define our 'profile series' equivalent as a file containing one or more 2D arrays (plotted as contoured parameters) with zero or more 1D vectors (plotted as time series plots stacked with the contour plots on a common time x-axis). Our 'time series' is defined as one or more 1D vectors that are plotted as a stacked time series plot. In our working practice, these come from a single instrument, but providing all instruments have a common time channel this doesn't have to be so.

Consequently, 'profile series' and 'time series' work for me as all I want to do is plot the data as discrete variables. However, Nan, I'm guessing that you have other use cases. It would be helpful for my understanding of what you need if you could give examples of the variables that would be in your files and the information that you expect to obtain from the file feature type. This should help clarify any extensions required to the feature type vocabulary.

Cheers, Roy.
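Roy's two levels of classification (per variable, then per file) can be sketched from variable dimensions alone. The following is a minimal illustration under stated assumptions: the dimension names "time" and "depth", the function names, and the example variable names are all hypothetical, and nothing here is part of the CF proposal or of any existing library.

```python
def classify_variable(dims, time_dim="time", depth_dim="depth"):
    """Classify one variable by its dimensions, following Roy's description.

    A 2-D (time, depth) array is a 'profile series'; a 1-D vector on the
    time dimension is a 'time series'; anything else is left unclassified.
    """
    dims = tuple(dims)
    if len(dims) == 2 and set(dims) == {time_dim, depth_dim}:
        return "profile series"
    if dims == (time_dim,):
        return "time series"
    return None


def classify_file(variables, time_dim="time", depth_dim="depth"):
    """Derive a file-level type from a mapping of variable name -> dimensions.

    One or more 2-D arrays (with zero or more 1-D vectors) gives
    'profile series'; 1-D vectors only gives 'time series'.
    Returns (file_type, per_variable_types).
    """
    per_variable = {
        name: classify_variable(dims, time_dim, depth_dim)
        for name, dims in variables.items()
    }
    kinds = set(per_variable.values())
    if "profile series" in kinds:
        file_type = "profile series"
    elif "time series" in kinds:
        file_type = "time series"
    else:
        file_type = None
    return file_type, per_variable


if __name__ == "__main__":
    # Hypothetical ADCP file: 2-D currents plus 1-D water temperature.
    adcp = {
        "current_east": ("time", "depth"),
        "current_north": ("time", "depth"),
        "water_temperature": ("time",),
    }
    print(classify_file(adcp))
```

The sketch also shows where Nan's case falls outside this scheme: a mooring with different instruments at different fixed depths is classified here purely by array shape, which says nothing about whether the depth levels are comparable measurements.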
From: cf-metadata-boun...@cgd.ucar.edu [cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Nan Galbraith [ngalbra...@whoi.edu]
Sent: 29 December 2010 18:41
To: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] CF feature types and definitions