Re: [CF-metadata] Branching "history"

2011-09-06 Thread Nan Galbraith

Hi all -


> Are there any existing practices (either established or experimental)
> for the use of the "history" attribute when dealing with complex,
> branching processing histories?
>
> Given the "history" attribute is only intended to be human readable, I
> suspect the answer is "no". In which case, what would be more palatable:
> inventing a new syntax, or throwing away everything prior to the last
> linear sequence?


I'm not sure how many existing practices there are, but a new
syntax is *definitely* preferable to loss of this information.

The standard lets you continuously append processing 'events' to
the global history, using a timestamp; in theory you can append
all the branching histories, adding information about which
component was modified (maybe going from datestamp/action to
datestamp/component/action ?). The component identifiers would
need to include enough information to let the user know what
slice, and what variable, was modified by each action.
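
A minimal sketch of what such datestamp/component/action entries might look
like, in Python. The " / " separator, the ISO-style timestamp, and the
component identifiers (variable name plus slice) are assumptions made for
illustration only; none of them is defined by the NUG or by CF.

```python
from datetime import datetime, timezone


def format_entry(when, component, action):
    """Return one history line: 'datestamp / component / action'."""
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} / {component} / {action}"


def append_entry(history, when, component, action):
    """Append one entry to a newline-separated history string."""
    line = format_entry(when, component, action)
    return f"{history}\n{line}" if history else line


# Hypothetical component identifiers: variable name plus the slice touched.
history = ""
history = append_entry(
    history,
    datetime(2011, 9, 6, 12, 0, tzinfo=timezone.utc),
    "TEMP[time=0:100]",
    "despike threshold=3.0",
)
history = append_entry(
    history,
    datetime(2011, 9, 6, 12, 5, tzinfo=timezone.utc),
    "PSAL[time=0:100]",
    "apply_calibration coeffs=v2",
)
print(history)
```

Because every line names its own component, entries from several branches
can be interleaved in one attribute and still be sorted out afterwards.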

John's SSDS example shows an interesting way to accumulate
this information, which could be expanded for branched provenance.
It's not quite the same as the NetCDF-defined history attribute,
but it looks like a great way to make this field machine readable.

By the way, here's the definition, from the NetCDF Users' Guide:


> history
>
> A global attribute for an audit trail. This is a character array with
> a line for each invocation of a program that has modified the
> dataset. Well-behaved generic netCDF applications should append a
> line containing: date, time of day, user name, program name and
> command arguments.


and a snippet from the CF standard:


> 2.6.2. Description of file contents
>
> The NUG defines title and history to be global attributes. We wish
> to allow the newly defined attributes, i.e., institution, source,
> references, and comment, to be either global or assigned to
> individual variables. When an attribute appears both globally and as
> a variable attribute, the variable's version has precedence. ...
>
> history
>
> Provides an audit trail for modifications to the original data.
> Well-behaved generic netCDF filters will automatically append their
> name and the parameters with which they were invoked to the global
> history attribute of an input netCDF file. We recommend that each
> line begin with a timestamp indicating the date and time of day that
> the program was executed.
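
As an illustration of the two definitions above, here is a rough Python
sketch of a program building its own history line (timestamp first, then
user name, program name and command arguments) and appending it to an
existing history string. The function names and the exact timestamp layout
are assumptions; neither document prescribes a format.

```python
import getpass
import sys
from datetime import datetime, timezone


def nug_history_line(argv, user=None, when=None):
    """Build one line: date, time of day, user, program name and arguments."""
    when = when or datetime.now(timezone.utc)
    user = user or getpass.getuser()
    stamp = when.strftime("%Y-%m-%d %H:%M:%S UTC")
    return f"{stamp} {user} {' '.join(argv)}"


def append_history(existing, line):
    """Append a line to an existing history string, keeping earlier lines."""
    if existing and existing.strip():
        return f"{existing.rstrip()}\n{line}"
    return line


if __name__ == "__main__":
    # sys.argv[0] is the program name, the rest are its arguments.
    print(append_history("", nug_history_line(sys.argv)))
```

With the netCDF4 Python package, the returned string could then be assigned
to the dataset's global history attribute.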


- Nan

___
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


[CF-metadata] use of volume_* and *_optical_thickness in variable names

2011-09-06 Thread Markus Fiebig
Dear CF-metadata list members,

having just signed up to the mailing list, I should probably "waste" a few
words on a brief introduction. I'm managing the WMO Global Atmosphere Watch 
(GAW) World Data Centre for Aerosol (WDCA) at NILU. WDCA collects data on 
atmospheric aerosol properties measured at ground stations connected to the GAW 
network. One primary use of the data is for validating climate and weather 
prediction models, the latter to the degree they include aerosol explicitly. 
WMO is currently working towards a better integration of its several GAW data 
centres, as well as the aerosol observation and modelling communities in 
general. In this context, the need came up for a common, well-defined 
vocabulary for variables, for which we are considering the CF names. I'm not 
the only one to decide this, but if this is approved by the relevant WMO 
bodies, I will likely come to propose a few new standard names that correspond 
to aerosol properties observed at the GAW ground stations.

At this point, however, I'm still trying to find out what these names will
probably look like. To this end, I have a few questions concerning the CF
naming philosophy which I couldn't properly clarify by looking at the mailing
list archives:

1) To what degree are qualifications part of the standard_name? In the 
guidelines for constructing standard_names, qualifications such as _in_air or 
_due_to_dry_aerosol are separated from the standard name, while they are 
included with some standard_names in the table. Would a suitable standard_name 
be "surface_volume_scattering_coefficient_at_stp_in_air_due_to_pm1_dry_aerosol" 
or just "volume_scattering_coefficient" with the qualifications optional? Would 
the qualifications I used in the example be correct?

2) How are the terms _optical_depth and _optical_thickness used? I know there
are deep ideological divides about the correct use of these terms, and I don't
plan to re-open any possible previous discussions. I would only like to know
how these terms are used in the CF convention. I found the standard name
atmosphere_optical_thickness_due_to_ambient_aerosol in the standard_name table,
and the explanation "The optical thickness is the integral along the path of
radiation of a volume scattering/absorption/attenuation coefficient." Is this
path meant as the slant path pointing, e.g., from the surface toward the sun, or
along the vertical axis from the surface through the atmosphere?

3) I found the standard_name
"volume_extinction_coefficient_in_air_due_to_ambient_aerosol" in the table. In
what sense does the term "volume_" qualify the name, i.e. how would the
"volume_extinction_coefficient" be different from the "extinction_coefficient"?

Thanks for your help!

Best regards,
Markus

___
Dr. Markus Fiebig

Dept. Atmospheric and Climate Research (ATMOS)
Norwegian Institute for Air Research (NILU)
P.O. Box 100
N-2027 Kjeller
Norway

Tel.: +47 6389-8235
Fax : +47 6389-8050
e-mail: markus.fie...@nilu.no
skype: markus.fiebig




Re: [CF-metadata] standard names for stations (Jonathan Gregory)

2011-09-06 Thread Schultz, Martin
> Date: Wed, 31 Aug 2011 10:33:26 +0100
> From: Jonathan Gregory 
> Subject: Re: [CF-metadata] standard names for stations
> Dear Nan
>
> > Do we need to specify whether the _id is numeric or character? I'd
> > prefer to leave that to the user and his code.
>
> Yes, I think we have to specify this for standard_names; in the standard
> name table, all of them are either assigned units => numeric, or stated to be
> "string". Of course, a number can be written in a string, and maybe that's the
> right thing to do if this variable would never be processed as a number.
>
> Best wishes
>
> Jonathan
>

Dear Jonathan,

... but strings may be formatted differently. "001" is not the same as "1" or 
"01". I think you are right about using strings instead of numbers (if 
sometimes the ID can actually be a string), but someone (in the user community) 
then needs to define the formatting of number IDs.
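
To make the formatting point concrete, a small Python sketch: numeric IDs
written as strings only compare equal once a fixed format is agreed. The
three-character zero-padded width and the function name are assumptions
chosen for illustration, not a proposed convention.

```python
def normalize_station_id(raw, width=3):
    """Zero-pad purely numeric IDs to a fixed width; leave other IDs as-is."""
    text = str(raw).strip()
    return text.zfill(width) if text.isdigit() else text


# As plain strings these are all different identifiers.
print("001" == "1")  # False

# "KJE" is a made-up non-numeric ID, included to show it passes through.
ids = ["001", "1", "01", 1, "KJE"]
print([normalize_station_id(i) for i in ids])
```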

Martin




Forschungszentrum Juelich GmbH
52425 Juelich




Re: [CF-metadata] Branching "history"

2011-09-06 Thread Lowry, Roy K.
Hi All,

Something in the back of my mind from a project in which I have peripheral 
engagement (so may be a red herring or worse).  Didn't Argo put a lot of effort 
into a NetCDF history encoding for their format?

Cheers, Roy.




Re: [CF-metadata] Branching "history"

2011-09-06 Thread Nan Galbraith

You're good, Roy!

There's a new section (4 pages - maybe more) in the current Argo users
manual about their history implementation.

It looks like they're using character variables to store structured, global-
level provenance information.  It doesn't look like it would expand easily
to allow "branched" metadata for files that have multiple inputs, partly
because the various history records are numbered and tied to pre-defined
processing levels. It's interesting, though! Another approach, altogether.

argodatamgt.org/content/download/4729/34634/file/argo-dm-user-manual-version-2.3.pdf

Cheers - Nan



On 9/6/11 12:46 PM, Lowry, Roy K. wrote:

> Hi All,
>
> Something in the back of my mind from a project in which I have peripheral
> engagement (so may be a red herring or worse).  Didn't Argo put a lot of effort
> into a NetCDF history encoding for their format?
>
> Cheers, Roy.




--
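*******************************************************
* Nan Galbraith                        (508) 289-2444 *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                                *
*******************************************************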


