Re: [CF-metadata] Proposal for better handling vector quantities in CF

2012-01-02 Thread Randy Horne
Folks:

Depending on how loosely we define "vector" in this context, this umbrella 
variable concept can achieve my objective, where multiple data variables share 
the same quality flags.

In the problem domain I am working in, there are multiple related data variables 
in the same coordinate space for which the quality flags are the same.  It 
would be a stretch to call these multiple related data variables "vectors" in 
the true mathematical sense.

Given this looser definition of "vector", the notion of having standard names 
associated with the umbrella data variable does not seem to make sense as 
different projects would potentially have project-unique groupings of variables 
where it is desirable to share quality flags, etc.

very respectfully,

randy


[CF-metadata] Proposal for better handling vector quantities in CF
Thomas Lavergne x
Thu Nov 24 14:53:52 MST 2011 




Dear all,

This email is a proposal to strengthen the storage and exploitation of 
vector/tensor data in CF. Thanks to Jonathan for commenting on an early version 
of this note.

As far as I can tell, vectors are not handled as such by CF, only their 
components (via the standard names defining them, e.g. sea_ice_x_velocity, 
northward_sea_ice_velocity, eastward_wind, etc.). Life and some applications 
(e.g. plotting) would be easier if it were possible to group all components of 
a vector field into a single "vector" object. 

Here is my use case: I have an ice drift product, thus two datasets to define 
my vectors: sea_ice_x_displacement and sea_ice_y_displacement. Note that it 
could be any combination of x/y, north/east, or magnitude/direction. Moreover, 
it is not limited to ice drift, but applies to any 2D (or 3D) vector variable. 
As far as I know, the current CF does not provide me a way to "group" these two 
components and reunite them into a vector. Two consequences follow: 
1) I cannot define a third variable (say status_flag) that would apply to the 
vector object (thus to both its components). 
2) Computer programmes (that, for example, want to draw vectors instead of 
colour contours) have to "guess" that my CF file contains a vector. The 
software has to skim through my variables, check whether any pair of standard 
names defines a vector, and propose a "vector plot" option to the user. This 
might work in simple files, but will fail if my CF file contains 2 sets of 
vectors, say one from a model and the other from a satellite: X_model, Y_model, 
X_sat, Y_sat. Will software be smart enough to avoid proposing an 
(X_model, Y_sat) vector plot when all 4 share the same standard names, 
sea_ice_(x|y)_displacement? 
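
The ambiguity described above can be illustrated with a minimal Python sketch. 
It assumes a hypothetical in-memory table mapping variable names to standard 
names (not any real netCDF library API) and pairs components the way a naive 
plotting tool might, using standard names only:

```python
from itertools import product

# Hypothetical variable table: variable name -> standard_name.
# The variable names come from the example in the text; the table itself
# is an assumption made for illustration.
variables = {
    "X_model": "sea_ice_x_displacement",
    "Y_model": "sea_ice_y_displacement",
    "X_sat": "sea_ice_x_displacement",
    "Y_sat": "sea_ice_y_displacement",
}


def candidate_vectors(variables):
    """Pair every x component with every y component, by standard name only."""
    xs = sorted(n for n, s in variables.items() if s == "sea_ice_x_displacement")
    ys = sorted(n for n, s in variables.items() if s == "sea_ice_y_displacement")
    return list(product(xs, ys))


pairs = candidate_vectors(variables)
for x_name, y_name in pairs:
    print(x_name, y_name)
```

With two sets of vectors in the file, the standard names alone yield four 
candidate pairs, including the mixed (X_model, Y_sat) and (X_sat, Y_model) 
ones, and nothing in the metadata says which two are meaningful.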

Here, an approach could be that the X dataset defines its corresponding Y 
dataset as an "auxiliary variable" (and the Y dataset would do the same with 
X). This would probably work, but does not solve my concern number 1 to share a 
3rd variable with both X and Y.

The solution I propose for discussion is to allow an umbrella "dummy" dataset 
(like the proj/mapping ones: no dimension, no data, just attributes). This 
umbrella variable would have a valid standard name 
"sea_ice_displacement_vector" (definition of "vector"). We would then define a 
new standard attribute, components, whose value is a list of strings, e.g. 
"dX dY dir". The string values in the list are the names of the datasets 
containing the components of the vector. Note that even for a 2D 
vector, I could choose to have both x/y and speed/dir in the same CF file, 
hence the need to allow more than just 2 "components", even for a 2D vector. We 
must have at least 2.

So in my case:

The two X and Y datasets and the direction:

float dX(time, yc, xc) ;
 dX:long_name = "component of the displacement along the x axis of the grid" ;
 dX:standard_name = "sea_ice_x_displacement" ;
 dX:units = "km" ;
 dX:_FillValue = -1.e+10f ;
 dX:coordinates = "lat lon" ;
 dX:grid_mapping = "Polar_Stereographic_Grid" ;

float dY(time, yc, xc) ;
 dY:long_name = "component of the displacement along the y axis of the grid" ;
 dY:standard_name = "sea_ice_y_displacement" ;
 dY:units = "km" ;
 dY:_FillValue = -1.e+10f ;
 dY:coordinates = "lat lon" ;
 dY:grid_mapping = "Polar_Stereographic_Grid" ;

float dir(time, yc, xc) ;
 dir:long_name = "direction of the displacement" ;
 dir:standard_name = "direction_of_sea_ice_displacement" ;
 dir:units = "degrees" ;
 dir:_FillValue = -1.e+10f ;
 dir:coordinates = "lat lon" ;
 dir:grid_mapping = "Polar_Stereographic_Grid" ;


The new dummy umbrella:

int ice_drift_vector ;
 ice_drift_vector:standard_name = "sea_ice_displacement" ;
 ice_drift_vector:long_name = "sea ice drift vector" ;
 ice_drift_vector:components = "dX dY dir" ;
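
As a rough illustration of how reading software might use the proposed 
components attribute, here is a minimal Python sketch. It assumes the file has 
already been loaded into a plain dictionary of variable names and attributes; 
the dictionary layout and the helper names resolve_components and find_vectors 
are hypothetical, and the components attribute itself is only a proposal, not 
part of CF:

```python
# Hypothetical in-memory view of the file: variable name -> attributes.
variables = {
    "dX": {"standard_name": "sea_ice_x_displacement"},
    "dY": {"standard_name": "sea_ice_y_displacement"},
    "dir": {"standard_name": "direction_of_sea_ice_displacement"},
    "ice_drift_vector": {
        "standard_name": "sea_ice_displacement",
        "components": "dX dY dir",
    },
}


def resolve_components(variables, umbrella_name):
    """Return the component variable names listed by an umbrella variable."""
    names = variables[umbrella_name]["components"].split()
    # The proposal requires at least 2 components.
    if len(names) < 2:
        raise ValueError("a vector needs at least 2 components")
    missing = [n for n in names if n not in variables]
    if missing:
        raise KeyError(f"components not found in file: {missing}")
    return names


def find_vectors(variables):
    """Map each umbrella variable to its list of component variables."""
    return {
        name: resolve_components(variables, name)
        for name, attrs in variables.items()
        if "components" in attrs
    }


vectors = find_vectors(variables)
print(vectors)
```

With an explicit grouping like this, the software no longer has to guess 
which variables belong together from their standard names.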

A status flag for the vector:

byte status_flag(time

Re: [CF-metadata] Convention attribute

2012-01-02 Thread John Graybeal
I wasn't sure how to parse these, I'm a little slow today I guess.  After 
trying a few ways, I decided they mostly use spaces to separate convention 
identifiers, and slashes to designate hierarchy. (Except the first two embed a 
space within "OceanSITES x.x", which I think should be a hyphen.)  

Then I finally got your point, that / (slash) is also a delimiter, but one with 
an explicit meaning.

> I guess the $64,000 question is whether any application program cares about 
> such subtleties and I think the answer is probably not. 

As of today, certainly no application program cares about such subtleties, 
since no standard or community practice exists to make use of them. But even if 
we state the question as "what is likely to be the most useful in the future, 
while being usable now?" (likely what you meant), I can't find a use case where 
the computer would use the order information, other than for display -- the 
context being of some educational value to users of the data sets, as you say.

But this leads to an implementation question, using the examples (modified with 
hyphens per the above comment): 

> (A) CF-x.x OceanSITES-x.x SeaDataNet-x.x -- 3 conventions are listed
> (B) CF-x.x/OceanSITES-x.x/SeaDataNet-x.x -- 1 convention is listed, 
> with its relationship to two others
> (C) CF-x.x/SeaDataNet-x.x CF-x.x/OceanSITES-x.x -- 2 conventions are 
> listed, each with its relationship to a third

From these strings, in the first and third cases we don't really know either 
way about some of the relationships -- the fact that SeaDataNet-x.x is listed 
separately from OceanSITES-x.x may mean it is truly independent, but it could 
just as well be derivative, unless we explicitly preclude that usage. Stated 
another way, do I have to list every derivative relationship in my Convention 
Attribute? That is, if SeaDataNet profiles OceanSITES, which profiles CF, do I 
have to use form (B) for a SeaDataNet Convention identifier, or are forms (A) 
and (C) equally acceptable? 

If all agree form (B) is the only acceptable form, then the profiles can 
reflect that explicitly when they define the identifier. The SeaDataNet 
profile ID will always be of the form (B), and every time CF or OceanSITES 
updates its profile, SeaDataNet needs to double-check that no conflicts have 
been created with its own profile, and perhaps put out an updated 'version 
conformance' list each time to reflect that the check was successful. 
Validation software could then reasonably be expected to validate appropriate 
sequences and reject invalid ones.

If we say any of these forms is OK, then I expect slash will become an 
equivalent to space over time.  Users will start using the (B) form, but get 
things out of order (which software won't have any reason to catch), and soon 
the attribute will just be treated as a list that supports multiple separators.
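
The reading discussed above (spaces separate convention entries, slashes 
designate hierarchy) can be sketched in Python as follows. The function names 
parse_conventions and flat_names are hypothetical, and the sketch assumes the 
hyphenated identifiers from forms (A) to (C); it is not an agreed CF parsing 
rule:

```python
def parse_conventions(attr):
    """Split on whitespace into entries, then on '/' into a hierarchy chain.

    Each entry becomes a list running from the base convention to the
    most specific profile.
    """
    return [entry.split("/") for entry in attr.split()]


def flat_names(attr):
    """What a reader sees if it treats '/' as just another separator."""
    return set(attr.replace("/", " ").split())


form_a = "CF-x.x OceanSITES-x.x SeaDataNet-x.x"
form_b = "CF-x.x/OceanSITES-x.x/SeaDataNet-x.x"
form_c = "CF-x.x/SeaDataNet-x.x CF-x.x/OceanSITES-x.x"

for form in (form_a, form_b, form_c):
    print(parse_conventions(form), flat_names(form))
```

All three forms give the same flat set of names. Only a reader that keeps the 
slash structure can tell them apart, which is why software that ignores the 
hierarchy would have no reason to catch entries listed out of order.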

John


On Dec 31, 2011, at 03:50, Lowry, Roy K. wrote:

> We therefore end up with several possibilities for the Convention Attribute:
> 
> CF-x.x OceanSITES x.x SeaDataNet-x.x
> CF-x.x/OceanSITES x.x/SeaDataNet-x.x
> CF-x.x/SeaDataNetx.x CF-x.x/OceanSITES x.x
> 
> These have subtly different semantics.  The first says nothing about the 
> convention relationship.  The second states SeaDataNet is a profile of 
> OceanSITES which is in turn a profile of CF.  The third states that the file 
> is conformant to two independent profiles of CF.
> 
> I guess the $64,000 question is whether any application program cares about 
> such subtleties and I think the answer is probably not. Most will simply 
> search for the convention required by the application within the attribute 
> string.  Therefore we should be more concerned to ensure that our convention 
> designators avoid including well-known designators - like 'CF' - as 
> substrings than with delimiters. Therefore, having information about the 
> relationship between conventions recorded within the file is useful 
> provenance metadata that could be achieved at virtually no cost.



John Graybeal    
phone: 858-534-2162
Product Manager
Ocean Observatories Initiative Cyberinfrastructure Project: 
http://ci.oceanobservatories.org
Marine Metadata Interoperability Project: http://marinemetadata.org   

___
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


Re: [CF-metadata] Convention attribute

2012-01-02 Thread Lowry, Roy K.
Hi John,

Guess we're on the same wavelength.  99% of usage of the conventions attribute 
string will be searches using functions like INSTR (a function that locates a 
substring within a string), which make delimiters irrelevant but, as Benno 
pointed out, carry a (minimal if we're aware) element of risk.  Therefore those 
of us who care are free to use delimiter conventions to convey semantics, on 
the understanding that their point will be missed by the vast majority of users.
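
The element of risk in substring searches can be shown with a small Python 
sketch. The designator "XCF-1.0" is made up for illustration, and the two 
helper functions are hypothetical; Python's "in" operator stands in for an 
INSTR-style search:

```python
import re


def substring_match(attr, designator):
    """INSTR-style check: is the designator anywhere in the string?"""
    return designator in attr


def token_match(attr, designator):
    """Split on whitespace and '/', then compare whole convention names."""
    tokens = re.split(r"[\s/]+", attr.strip())
    return any(token.startswith(designator + "-") for token in tokens)


# "XCF-1.0" is an invented designator that happens to contain "CF".
attr = "XCF-1.0 OceanSITES-x.x"

print(substring_match(attr, "CF"))
print(token_match(attr, "CF"))
print(token_match("CF-x.x/OceanSITES-x.x", "CF"))
```

The substring search reports CF for a file that never claims it, while the 
token-based check does not. This is the case that choosing designators 
without well-known substrings is meant to avoid.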

Cheers, Roy.


From: John Graybeal [jbgrayb...@mindspring.com]
Sent: 02 January 2012 18:47
To: Lowry, Roy K.
Cc: CF Metadata List; sdn2-t...@seadatanet.org
Subject: Re: [CF-metadata] Convention attribute
