I spent an instructive evening reading through the previous discussions (thanks 
for the links, Corey) and the arguments for and against using hierarchical 
structures. I also re-read the CF conventions documents (1.6 and the 1.7 
draft), and the standard currently appears to ignore groups rather than 
explicitly forbid their use. As far as I can tell, a netCDF dataset with 
groups could still conform to the CF conventions as they are currently written, 
even with all the other restrictions that the standard imposes. I'd be 
interested in seeing, and possibly helping with, CF conventions that support 
the enhanced model.

After reading the previous discussions, I thought the list might be interested 
in how we use groups in our netCDF products, as our use is somewhat different 
from the other cases that were discussed.

Our netCDF datasets have to cope with a number of different needs from various 
parties: archive, end-users, higher-level processing, reprocessing, 
monitoring, etc. To keep things simple, we wanted a single format per 
instrument/processing level that is flexible enough to contain all the data or 
a subset of the data, depending on the consumer's needs. To do this, we created 
a hierarchical data structure that encapsulates data in related but independent 
groups. These groups can be present in or missing from the dataset as the 
consumer requires. So a level 2 processing function might receive a 
product containing 20 instrument channels at 2 different resolutions, whereas 
the dissemination function might receive a product with just 5 of these 
channels at the lowest resolution. Both of these products are described by a 
single format specification.
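
To make the "one specification, many products" idea concrete, here is a rough 
sketch in plain Python. It does not use the netCDF API, and the group names 
(channel_01/high_res and so on) are invented for illustration; they are not our 
actual product layout. The only rule it assumes is that a product conforms if 
every group it carries is one the specification defines.

```python
# Hypothetical specification: the set of groups a product MAY contain.
# 20 channels, each at two resolutions (names are made up for this sketch).
SPEC_GROUPS = {
    f"channel_{n:02d}/{res}"
    for n in range(1, 21)
    for res in ("high_res", "low_res")
}


def conforms(product_groups, spec_groups=SPEC_GROUPS):
    """Return True if the product only contains groups defined by the spec."""
    return set(product_groups) <= spec_groups


# Full product for level 2 processing: 20 channels at 2 resolutions.
level2_product = sorted(SPEC_GROUPS)

# Reduced product for dissemination: 5 channels at the lowest resolution.
dissemination_product = [f"channel_{n:02d}/low_res" for n in range(1, 6)]

print(conforms(level2_product), conforms(dissemination_product))
```

Both products pass the same check, because omitting a group never violates the 
specification; only an unknown group would.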

This model of including or omitting independent groups also supports other 
needs, for example adding data that is produced at irregular intervals but 
needs to be in the product whenever it is available. In addition, by tagging 
groups with a specific attribute, we should be able to offer end-users a 
single, generic method for subsetting data retrieved from the archive without 
requiring specific knowledge of each netCDF product. They should be able to 
select only the tagged groups (which might correspond to instrument channels, 
for instance) that they want in their retrieved datasets.
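
The tag-based subsetting could look roughly like the sketch below. Again this 
is plain Python standing in for a netCDF file, not the netCDF API: the nested 
dictionaries, the attribute name "subset_tag" and the group names are all 
assumptions made for the example. It also assumes that untagged groups are 
always kept, which is one possible policy rather than a settled design.

```python
def make_group(attrs=None, variables=None, groups=None):
    """Build a hypothetical in-memory group: attributes, variables, child groups."""
    return {
        "attrs": attrs or {},
        "variables": variables or {},
        "groups": groups or {},
    }


def subset(group, wanted, tag="subset_tag"):
    """Copy `group`, dropping tagged child groups whose tag is not in `wanted`.

    Untagged child groups are always kept (an assumption of this sketch),
    so shared data such as geolocation stays in the retrieved product.
    """
    kept = {}
    for name, child in group["groups"].items():
        value = child["attrs"].get(tag)
        if value is None or value in wanted:
            kept[name] = subset(child, wanted, tag)
    return make_group(dict(group["attrs"]), dict(group["variables"]), kept)


# Made-up product: one shared group and two tagged channel groups.
product = make_group(groups={
    "geolocation": make_group(variables={"lat": [0.0], "lon": [0.0]}),
    "channel_a": make_group(attrs={"subset_tag": "channel_a"},
                            variables={"radiance": [1.0]}),
    "channel_b": make_group(attrs={"subset_tag": "channel_b"},
                            variables={"radiance": [2.0]}),
})

retrieved = subset(product, wanted={"channel_b"})
print(sorted(retrieved["groups"]))  # ['channel_b', 'geolocation']
```

The selection code only looks at the tag attribute, so it needs no knowledge 
of what a particular product's groups contain.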

This gives us a single, easily understood format definition that encompasses a 
wide variety of possible variations.

Any feedback on the idiocy or genius of this (ab)use of the netCDF format is 
welcome.

Thanks,

Tim 

---------------------

Dr. Timothy Patterson
Instrument Data Simulation
Product Format Specification

EUMETSAT
Eumetsat-Allee 1
64295 Darmstadt
Germany
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
