John, Jon, Thomas, et al.,

I will weigh in here with a vote _*against*_ creating another dimension (a new axis type) to hold vector components. In higher-level code, creating a multi-dimensional vector object may well be an elegant approach -- but I will argue in the bullets below that at the file-definition level it can add complexity, create a number of significant inconsistencies in code pipelines, and cause backwards-compatibility problems.

1. There will always be two classes of data access needs for vectors --
   1) looking at the individual components; 2) looking at the
   multi-component vector quantity.  Accessing individual components is
   *very common* (I'd speculate that it may be the more common of the
   two modes).  If we group the components into a single variable
   using an additional dimension, the code that handles the
   individual vector components will become different from the scalar
   variable code, despite the fact that there is a nearly identical
   list of use cases for vector components and scalar variables.  (This
   would be a step away from elegance.)
2. We would almost certainly find that staggered grids become a
   slippery slope of complexity.  The specific index ranges needed for
   the individual staggered components depend on the operation that is
   being performed:  vector plots, curl, divergence, volume integrals,
   etc.  These needs are not consistent with a single index range
   applying to all components.
3. There are many use cases in which the analysis pipeline is different
   for different components of a vector.  Some examples:  the vectors
   may be stored in separate files (e.g. the entire CMIP5 archive ...
   and we know what a challenge it is to get data providers to utilize
   the aggregation tools); the Z vector component of ocean data is
   often generated through an on-the-fly conservation-of-mass
   analysis step, rather than stored in the file; the Z component often
   requires special scaling -- e.g. when making vector plots.  Such
   cases illustrate why it is more elegant to make the vector
   associations in higher level code, rather than at the file level.
4. 3-vector components are often plotted and analyzed in 2-dimensional
   views.  With a vector dimension of length 3, we cannot do a
   multi-dimensional access in the XZ plane without reading the Y
   component, too -- illustrating where the vector dimension at the
   file level can add complexity.
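
Points 1 and 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: nested lists stand in for file variables, the grid is assumed to be an Arakawa C grid with unit spacing, and every name (area_mean, u_faces, v_faces, and so on) is hypothetical rather than taken from any library or convention.

```python
# Illustrative sketch only: nested lists stand in for file variables, and
# every name here (area_mean, u_faces, v_faces, ...) is hypothetical.

def area_mean(field):
    """Mean of a 2-D field given as a list of rows."""
    values = [value for row in field for value in row]
    return sum(values) / len(values)

# Point 1: with one variable per component, a scalar variable and a vector
# component go through exactly the same code path.
temperature = [[10.0, 12.0], [14.0, 16.0]]
u_center = [[1.0, 2.0], [3.0, 4.0]]
mean_t = area_mean(temperature)
mean_u = area_mean(u_center)

# Point 2: on a staggered (Arakawa C) grid with ny x nx cells and unit
# spacing, u lives on x-faces and v on y-faces, so their shapes differ.
ny, nx = 2, 2
u_faces = [[0.0, 1.0, 2.0],
           [0.0, 1.0, 2.0]]          # shape (ny, nx + 1)
v_faces = [[0.0, 0.0],
           [1.0, 1.0],
           [2.0, 2.0]]               # shape (ny + 1, nx)

def divergence_at_centers(u, v):
    """Uses every face; the result has shape (ny, nx)."""
    return [[(u[j][i + 1] - u[j][i]) + (v[j + 1][i] - v[j][i])
             for i in range(nx)] for j in range(ny)]

def curl_at_interior_corners(u, v):
    """Uses only faces meeting at interior corners; shape (ny - 1, nx - 1)."""
    return [[(v[j][i] - v[j][i - 1]) - (u[j][i] - u[j - 1][i])
             for i in range(1, nx)] for j in range(1, ny)]

def vectors_at_centers(u, v):
    """Averages opposite faces so both components share shape (ny, nx)."""
    uc = [[0.5 * (u[j][i] + u[j][i + 1]) for i in range(nx)] for j in range(ny)]
    vc = [[0.5 * (v[j][i] + v[j + 1][i]) for i in range(nx)] for j in range(ny)]
    return uc, vc

div = divergence_at_centers(u_faces, v_faces)
curl = curl_at_interior_corners(u_faces, v_faces)
uc, vc = vectors_at_centers(u_faces, v_faces)
```

Each operation walks a different set of indices over u and v, and the two components do not even share a shape, so a single index range along a packed vector dimension would not describe them.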

    - Steve

=======================

On 12/9/2011 11:43 AM, John Caron wrote:
On 12/9/2011 11:37 AM, Jonathan Gregory wrote:
Dear John

I prefer the idea that Thomas has put forward of an umbrella, rather than
containing the vector/tensor components in one data variable, because

* I really don't like the idea of units being mixed within a data variable. I think things with different units must certainly be different quantities and could not be regarded as the same field. You can get away with it if they are all m s-1, for instance, but not so easily if the vertical one is orders of magnitude different from the horizontal, and not at all if the vector is expressed in polar coordinates.

I think the common case is that the vector components have the same unit. One could restrict to that case.


* I think it would be very inconvenient, and would break a lot of existing software, if the coordinates were not what they appeared to be, because an offset had to be added. Also, in general, the component fields of a staggered grid do not have the same dimensions, as well as differing in the coordinates.

I'm not sure what "an offset had to be added" means.

I think the common case of staggered grids could be handled with a convention defining the staggering, rather than separate dimensions. I'll pull out the one Rich Signell and I came up with a long time ago, for its own sake.


* It avoids the need to define a convention for labelling vector/tensor components.

I think this convention would be about as complex as the one you will need for Thomas' proposal.


* It is completely backwards-compatible as regards the component fields, which are exactly as before; we're just adding some separate information linking them. This seems neat to me.

I agree that's a strong reason for Thomas' method.
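
As a rough illustration of the backwards-compatibility argument, here is a minimal Python sketch. It assumes a plain dictionary as a stand-in for a file, and the names vector_groups, components, and wind_vector are hypothetical: they are not part of CF and are not meant to represent the details of Thomas' proposal.

```python
# Hypothetical sketch: a dict stands in for a file's variables. The linking
# names ("vector_groups", "components", "wind_vector") are made up here.
variables = {
    "u": {"units": "m s-1", "data": [1.0, 2.0]},
    "v": {"units": "m s-1", "data": [3.0, 4.0]},
    "w": {"units": "cm s-1", "data": [0.1, 0.2]},
}

# Separate linking information; the component variables are untouched.
vector_groups = {"wind_vector": {"components": ["u", "v", "w"]}}

def legacy_read(name):
    """Reader that knows nothing about vector groups; works as before."""
    return variables[name]["data"]

def read_vector(group_name):
    """Vector-aware reader: follows the link to the component variables."""
    names = vector_groups[group_name]["components"]
    return {name: variables[name] for name in names}

u_data = legacy_read("u")
wind = read_vector("wind_vector")
```

Existing code that reads "u" is unaffected by the added grouping, and each component keeps its own units, which also sidesteps the mixed-units concern.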

OTOH, if we start thinking in terms of the extended model, a Structure ("compound type" in HDF5 parlance) might be useful. What do you think about starting to think about possible uses of the extended data model?
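
To make the Structure idea concrete, here is a loose analogy in Python using the standard-library ctypes module. It is not HDF5 or netCDF-4 code; the record name Wind and its members u, v, w are assumptions chosen for illustration.

```python
import ctypes

# Loose analogy to a compound type: a fixed-layout record with named members.
# "Wind" and its member names are hypothetical.
class Wind(ctypes.Structure):
    _fields_ = [
        ("u", ctypes.c_float),
        ("v", ctypes.c_float),
        ("w", ctypes.c_float),
    ]

# An array of two records, as a variable of compound type might hold.
records = (Wind * 2)(Wind(1.0, 2.0, 0.5), Wind(3.0, 4.0, 0.25))

# Pulling out one member means visiting every record.
u_values = [record.u for record in records]
record_size = ctypes.sizeof(Wind)
```

The members travel together in each record, so extracting a single component still touches every record.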

Thanks for your thoughts; as always, interesting.

John

_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
