Re: [CF-metadata] Proposal for better handling vector quantities in CF

Steve Hankin Tue, 13 Dec 2011 10:34:27 -0800

John, Jon, Thomas, et al.,

I will weigh in here with a vote _*against*_ creating another dimension (a new axis type) to represent vector components. In higher-level code, creating a multi-dimensional vector object may well be an elegant approach -- but I will argue in the points below that at the file definition level it can add complexity and create a number of significant inconsistencies in the code pipelines, as well as backwards compatibility problems.


1. There will always be two classes of data access need for vectors --
   1) looking at the individual components; 2) looking at the
   multi-component vector quantity.  Accessing individual components is
   *very common* (I'd speculate that it may be the more common of the
   two modes.)   If we group the components into a single variable
   using an additional dimension it means that the code to treat the
   individual vector components will become different from the scalar
   variable code, despite the fact that there is a nearly identical
   list of use cases for vector components and scalar variables.  (This
   would be a step away from elegance.)
2. We would almost certainly find that staggered grids become a
   slippery slope of complexity.  The specific index ranges needed for
   the individual staggered components depend on the operation that is
   being performed:  vector plots, curl, divergence, volume integrals,
   etc. ...  These needs are not consistent with a single index range
   applying to all components.
3. There are many use cases in which the analysis pipeline is different
   for different components of a vector.  Some examples:  the vectors
   may be stored in separate files (e.g. the entire CMIP5 archive ...
   and we know what a challenge it is to get data providers to utilize
   the aggregation tools); the Z vector component of ocean data is
   often generated through an on-the-fly conservation-of-mass
   analysis step, rather than stored in the file; the Z component often
   requires special scaling -- e.g. when making vector plots.  Such
   cases illustrate why it is more elegant to make the vector
   associations in higher level code, rather than at the file level.
4. 3-vector components are often plotted and analyzed in 2-dimensional
   views.  With a vector dimension of length 3, we cannot do a
   multi-dimensional access in the XZ plane without reading the Y
   component, too -- illustrating where the vector dimension at the
   file level can add complexity.
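
As a rough illustration of points 1, 2 and 4, here is a minimal Python sketch in which nested lists stand in for netCDF variables. Everything in it is an assumption made for illustration: the sizes, the names u/v/w, the face-centered staggering, and the restriction to a single contiguous (unstrided) index range. It is not taken from any of the proposals under discussion.

```python
# Illustrative sketch only: nested lists stand in for netCDF variables.
# All names, sizes and the staggering layout are hypothetical.

NX, NY, NZ = 4, 3, 2
COMPONENTS = ("u", "v", "w")


def make_field(fill):
    """Return a scalar field indexed as field[z][y][x]."""
    return [[[fill for _ in range(NX)] for _ in range(NY)] for _ in range(NZ)]


# Layout A: one variable per component, each shaped like any scalar field.
separate = {"u": make_field(1.0), "v": make_field(2.0), "w": make_field(0.01)}

# Layout B: a single variable with an extra leading "component" dimension.
packed = [separate[name] for name in COMPONENTS]


def xz_slice(field, y):
    """Extract an XZ section at fixed y from a scalar field."""
    return [[field[z][y][x] for x in range(NX)] for z in range(NZ)]


def xz_slice_packed(var, component, y):
    """Same section from layout B: the caller must supply an extra index."""
    return xz_slice(var[component], y)


def components_in_range(first, last):
    """Component indices covered by one contiguous index range."""
    return list(range(first, last + 1))


# Point 1: the scalar routine works unchanged on a component in layout A,
# while layout B needs a separate code path with a component index.
section_a = xz_slice(separate["u"], 1)
section_b = xz_slice_packed(packed, COMPONENTS.index("u"), 1)

# Point 4: an XZ vector plot needs only u and w. In layout B, one contiguous
# range over the component dimension that reaches both also covers v.
wanted = sorted(COMPONENTS.index(name) for name in ("u", "w"))
read_names = [COMPONENTS[i] for i in components_in_range(wanted[0], wanted[-1])]

# Point 2: with face-centered staggering (assumed here), the horizontal
# components do not even share a shape, so one index range cannot fit both.
u_shape = (NY, NX + 1)  # u on x-faces
v_shape = (NY + 1, NX)  # v on y-faces
same_shape = u_shape == v_shape

print(section_a == section_b)  # True
print(read_names)              # ['u', 'v', 'w']
print(same_shape)              # False
```

In layout A the XZ plot simply opens u and w and never touches v. In layout B the same request either reads all three components in one range or has to be split into several reads.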

    - Steve

=======================

On 12/9/2011 11:43 AM, John Caron wrote:

On 12/9/2011 11:37 AM, Jonathan Gregory wrote:
Dear John
I prefer the idea that Thomas has put forward of an umbrella, rather than
containing the vector/tensor components in one data variable, because
* I really don't like the idea of units being mixed within a data variable. I think things with different units must certainly be different quantities and could not be regarded as the same field. You can get away with it if they are all m s-1, for instance, but not so easily if the vertical one is orders of magnitude different from the horizontal, and not at all if the vector is
expressed in polar coordinates.
I think the common case is that the vector components have the same unit. One could restrict to that case.
* I think it would be very inconvenient, and would break a lot of existing software, if the coordinates were not what they appeared to be, because an offset had to be added. Also, in general, the component fields of a staggered grid do not have the same dimensions, as well as differing in the coordinates.
I'm not sure what "an offset had to be added" means.
I think the common case of staggered grids could be handled with a convention defining the staggering, rather than separate dimensions. I pull out the one Rich Signell and I came up with a long time ago, for its own sake.
* It avoids the need to define a convention for labelling vector/tensor
components.
I think this convention would be about as complex as the one you will need for Thomas' proposal.
* It is completely backwards-compatible as regards the component fields, which are exactly as before; we're just adding some separate information linking
them. This seems neat to me.
I agree that's a strong reason for Thomas' method.
OTOH, if we start thinking in terms of the extended model, a Structure ("compound type" in HDF5 parlance) might be useful. What do you think about starting to think about possible uses of the extended data model?
Thanks for your thoughts, as always, interesting.

John

_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
