Hello Jim

I like your thinking on this topic.  I agree with you that this feels like 
metadata about the coordinate and that the coordinate could carry further 
information about the ensemble.

A new coordinate attribute named ensemble_size, which is limited in scope to 
being used on a coordinate with a standard_name of realization sounds like a 
neat and simple solution to me.

If this approach is interesting, then I think that your suggestion of a new 
coordinate type in Chapter 5 is a good one.  This would provide some nice 
consistency with spatial and temporal coordinates and give scope for future 
work in more detailed descriptions of ensembles.

At the moment I think there is some value in separating the description of an 
ensemble dimension from the description of statistical processes performed 
with respect to an ensemble dimension.  This leads me away from using 
cell_methods to meet my use case.

many thanks
mark

________________________________
From: Jim Biard [jbi...@cicsnc.org]
Sent: 29 July 2015 16:02
To: Hedley, Mark; cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] original_ensemble_size

Mark,

It seems to me that we have quite a few examples of coordinate variables that 
have extra attributes that further define the contents. Time coordinate 
variables, for example, have the calendar attribute. There are many standard 
names that direct the developer to specify an attribute (most always comment or 
flag_values/flag_meanings attributes) that further defines the contents. Any 
variable can validly have attributes associated with it.

If having a specific attribute name that's not mentioned in the CF Conventions 
document called for in a standard name definition is too troubling, the 
standard name definition could call for putting a string of the form 'ensemble 
size N' in a comment attribute. Or it could call for putting the ensemble size 
in a comment section in the cell_methods attribute on the data variable, as 
Jonathan and Karl suggested. Jonathan and Karl's suggestions imply a change 
to the conventions, since they propose new standardized cell method comment 
names.
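
To make the comment-string option concrete, here is a minimal Python sketch of how software might recover the number from free text. The wording 'ensemble size N' is the form suggested above; the regular expression, the function name and the sample comment strings are assumptions for illustration only, since CF defines no such grammar.

```python
import re

# Hypothetical pattern for the 'ensemble size N' wording; not a CF-defined grammar.
_ENSEMBLE_SIZE = re.compile(r"\bensemble size (\d+)\b")


def ensemble_size_from_comment(comment):
    """Return the ensemble size found in a free-text comment, or None."""
    match = _ENSEMBLE_SIZE.search(comment)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(ensemble_size_from_comment("perturbed physics run; ensemble size 23"))
    print(ensemble_size_from_comment("no size recorded"))
```

Any variation in the wording defeats the pattern, which is the weakness of carrying this value in a comment rather than in a dedicated attribute.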

In all of these cases the information will be available to a human who reads a 
file dump, but none of them make the information immediately available to 
software automation. The addition to the cell_methods attribute grammar is 
likely the least intrusive way to make it something that people can write 
general software for. The downside to this approach is that the information is 
not held with the coordinate variable, which is the most natural place for it.

Another alternative is to add a new section to Chapter 5 that defines an 
ensemble or sample pool coordinate type (or whatever name you prefer). It may 
be worth the extra trouble to go ahead and give it formal recognition instead 
of trying to work it into existing forms that, in my opinion, don't fit it too 
well. I appreciate the desire to find the least intrusive way to modify the 
conventions, but we can end up painting ourselves into corners in the process.

Grace and peace,

Jim

On 7/29/15 3:24 AM, Hedley, Mark wrote:
Hello Jim

This is a really neat alternative approach.

I agree that the information about the ensemble_size is closely related to the 
realization coordinate and less closely related to the data variable, so this 
method encapsulates the metadata nicely.

Whilst the solution is elegant, I cannot see a previous example of a coordinate 
variable within CF defining extra attributes, so I'm a bit wary that this 
approach will require a change to the conventions document, not just a new 
standard_name.

Is there a neat way to use CF to provide metadata about a coordinate, rather 
than about a data variable?

I think it's well worth considering, but it may be a path of some resistance.

many thanks
mark

________________________________
From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu] on behalf of Jim Biard 
[jbi...@cicsnc.org]
Sent: 23 July 2015 13:11
To: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] original_ensemble_size

Hi.

It seems to me that you would want a coordinate variable with the standard name 
'realization' (whether scalar or multi-valued) and give it an attribute with 
the name 'ensemble_size'. You can store the realization number in the variable 
and the ensemble size in the attribute.
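
As a rough sketch of what this could look like in a file, the Python below renders a CDL fragment for such a coordinate variable. The attribute name 'ensemble_size' is the one proposed in this thread and is not part of the CF Conventions; the function name and the member numbers are hypothetical.

```python
def realization_coordinate_cdl(members, ensemble_size):
    """Render a CDL fragment for a realization coordinate variable.

    'ensemble_size' is the attribute name proposed in this thread; it is
    not defined by the CF Conventions.
    """
    values = ", ".join(str(m) for m in members)
    return "\n".join([
        "dimensions:",
        f"  realization = {len(members)} ;",
        "variables:",
        "  int realization(realization) ;",
        '    realization:standard_name = "realization" ;',
        f"    realization:ensemble_size = {ensemble_size} ;",
        "data:",
        f"  realization = {values} ;",
    ])


if __name__ == "__main__":
    # Hypothetical case: three members present out of an ensemble of 23.
    print(realization_coordinate_cdl([0, 5, 11], 23))
```

The dimension length records how many members are present, while the attribute records how many the ensemble contained.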

Grace and peace,

Jim

On 7/23/15 6:11 AM, Hedley, Mark wrote:
I use the 'coordinates' attribute on my data variable, referencing the scalar 
'ensemble_size' variable, thus defining this ensemble_size as a scalar 
coordinate variable for the temperature dataset.
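
A minimal Python sketch of that association, using plain dictionaries in place of a real netCDF file. The variable names, the values and the helper function are hypothetical, and 'ensemble_size' is the standard name under discussion here, not one that has been adopted.

```python
# Plain dictionaries standing in for the variables of a netCDF file.
variables = {
    "air_temperature": {
        "dims": ("latitude", "longitude"),
        "attrs": {
            "standard_name": "air_temperature",
            "coordinates": "realization ensemble_size",
        },
    },
    "realization": {
        "dims": (),
        "value": 5,
        "attrs": {"standard_name": "realization"},
    },
    "ensemble_size": {
        "dims": (),
        "value": 23,
        "attrs": {"standard_name": "ensemble_size"},  # proposed name only
    },
}


def find_ensemble_size(variables, data_name):
    """Follow the 'coordinates' attribute to a scalar ensemble_size variable."""
    names = variables[data_name]["attrs"].get("coordinates", "").split()
    for name in names:
        var = variables.get(name)
        if var is None:
            continue
        is_scalar = var["dims"] == ()
        if is_scalar and var["attrs"].get("standard_name") == "ensemble_size":
            return var["value"]
    return None


if __name__ == "__main__":
    print(find_ensemble_size(variables, "air_temperature"))
```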

mark

________________________________
From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu] on behalf of Karl Taylor 
[taylo...@llnl.gov]
Sent: 22 July 2015 22:53
Cc: CF Metadata List
Subject: Re: [CF-metadata] original_ensemble_size

Hi all,

I'm still curious about something:

Suppose we have the temperature field stored from one member of an ensemble of 
size 10.   We want to make the size of the ensemble known to the user.   We 
store 10 as a scalar variable with standard name "ensemble_size", but how does 
that scalar get associated with our temperature variable (other than by 
being stored in the same file)?

cheers,
Karl

On 7/22/15 1:59 AM, Hedley, Mark wrote:
Hello John, Karl et al

I'm not sure I agree with John's last statement. I think that an ensemble is a 
defined collection of members, so my need is the need for ensemble size to be 
defined explicitly.
The distinction that not all members may be present characterises the need for 
this metadata descriptor, rather than just using the dimension size of 
realization, which does not meet my requirement.

On reflection, I think that I prefer Karl's name of 'ensemble_size'

To restate my use case, I have a data set from an ensemble, where there is a 
coordinate variable called 'realization'.  Let's say there are 23 members, so 
this dimension is size 23.

I want to reference the number of members in the ensemble, whilst sub-setting 
the data variable in various ways.

The suggestion is to add a scalar coordinate to my original dataset, which 
contains the number of members in the ensemble.  Then any sub-setting operation 
will retain this coordinate, and I will always be able to state that this 
member is member 0 of 23, 5 of 23, etc.

One requirement I have is to slice this variable so that the result is a 2D 
data array with two 1D coordinate variables, latitude and longitude, and all 
other coordinates as scalars.
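
The slicing described above can be sketched in Python with plain lists and dictionaries. This is only an illustration under assumed names: the dataset layout, the function names and the sample values are hypothetical, and no particular library's subsetting behaviour is implied.

```python
def select_member(dataset, index):
    """Slice one realization out of a (realization, latitude, longitude) array.

    The scalar 'ensemble_size' is carried over unchanged, so the slice still
    records how many members the ensemble had.
    """
    return {
        "data": dataset["data"][index],                # 2D: latitude x longitude
        "latitude": dataset["latitude"],               # 1D coordinate
        "longitude": dataset["longitude"],             # 1D coordinate
        "realization": dataset["realization"][index],  # now a scalar
        "ensemble_size": dataset["ensemble_size"],     # scalar, retained
    }


def describe(member):
    """Label a sliced member relative to the whole ensemble."""
    return f"member {member['realization']} of {member['ensemble_size']}"


if __name__ == "__main__":
    # Hypothetical file holding only 3 members of an ensemble of 23.
    dataset = {
        "data": [[[280.0, 281.0], [282.0, 283.0]]] * 3,
        "latitude": [50.0, 51.0],
        "longitude": [0.0, 1.0],
        "realization": [0, 5, 11],
        "ensemble_size": 23,
    }
    print(describe(select_member(dataset, 1)))
```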

If it is reasonable to talk about an ensemble as a defined collection of 
members, then I agree with Karl, that a standard_name of 'ensemble_size' fits 
the bill.  The description fits my use case nicely.

many thanks
mark


________________________________
From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu] on behalf of John Graybeal 
[jbgrayb...@mindspring.com]
Sent: 22 July 2015 05:52
To: Karl Taylor
Cc: CF Metadata List
Subject: Re: [CF-metadata] original_ensemble_size

Karl,

To my understanding (then and now), the use case is explicitly not what your 
definition describes. The entire point of the request was to provide a label 
that was clearly distinguished from the typical concept of ensemble size.

John



On Jul 21, 2015, at 16:36, Karl Taylor <taylo...@llnl.gov> wrote:

Dear all,

I wonder if the following might also meet requirements of the use case:

name: ensemble_size

description: The number of member realizations in an ensemble.  This name 
provides context for any specific realization, which might not be co-located 
with the other members of the ensemble.

Karl

On 7/20/15 9:49 PM, John Graybeal wrote:
To save others the lookup, the use case wording that Mark signed on to was: 
"In my use case, the whole ensemble is not present, I only have a 
subset of the members. I have a metadata element telling me how many members 
there were at the time the ensemble was created, which I would like to encode." 
The entire thread is titled 'realization | x of n', but it is pretty, umm, 
rich with detail.

The last email before discussion went silent appears to be mine:

Modified to fit Mark's use case, I think suitable text is:

name: original_ensemble_size

description: The number of member realizations in the originally constituted 
ensemble. This provides context for any specific realization, for example 
orienting a member relative to its original group (even if the group is no 
longer intact).

This does not mention forecasting, preserves the origination concept, and gives 
a bit of context, without constraining the application. It could even be an 
ensemble of observations, or cat videos, or ... you get the idea.

I will let someone else provide the example of how that is associated with the 
variable; it will be more authoritative!

John


On Jul 20, 2015, at 14:42, Karl Taylor <taylo...@llnl.gov> wrote:

Hi Mark,

I didn't quite understand how the standard name gets associated with a variable 
(containing 1 or more realizations from the ensemble).   Someone said it was 
through a scalar coordinate variable, but I don't see how the ensemble member 
is a function of the ensemble size, so why would this be appropriate?

Could you supply an example?

Also, I didn't follow why "original" was included in "original ensemble size".  
Surely, you wouldn't report this number unless you thought the ensemble size 
was pretty much set and wouldn't change.  In that case there shouldn't be a 
need for a "modified ensemble size", so wouldn't "ensemble size" suffice?

thanks,
Karl


On 7/20/15 9:24 AM, Hedley, Mark wrote:
Hello CF

Late last year we had a discussion about storing

original_ensemble_size

in a CF file
http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2014/thread.html#57756

There were a few options discussed, with John Graybeal making the suggestion

original_ensemble_size
description: The number of members constituting an ensemble.


for a new standard_name definition, which seemed to fit the case very well

It does not seem to have been adopted into the standard names list as yet.

Please may this name and definition be adopted, or the reasons not to adopt it 
be detailed here?

thank you
mark





_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


--
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA National Centers for Environmental Information <http://ncdc.noaa.gov/>
formerly NOAA’s National Climatic Data Center
151 Patton Ave, Asheville, NC 28801
e: jbi...@cicsnc.org
o: +1 828 271 4900

Connect with us on Facebook for 
climate<https://www.facebook.com/NOAANCEIclimate> and ocean and 
geophysics<https://www.facebook.com/NOAANCEIoceangeo> information, and follow 
us on Twitter at @NOAANCEIclimate<https://twitter.com/NOAANCEIclimate> and 
@NOAANCEIocngeo<https://twitter.com/NOAANCEIocngeo>.

