Hi John,

    the problem is that the compound name is obvious to a human, but very hard 
to extract for a machine, because we don't have a strict set of grammar rules. 
What you are suggesting sounds almost as if you wanted to replace the 
standard_names with some other mechanism of controlled vocabulary: a collection 
of URIs from different fields and different servers, each pointing to the 
actual reference term? Perhaps I got you wrong here, but I would 
feel rather uneasy about going too far in this direction at present. We were 
very happy to find out in Dublin that the community (of atmospheric chemists) 
is beginning (!) to recognize standard_names as a valuable resource enabling 
them to speak about the same thing with the same words (even if the names are 
sometimes a bit clumsy), and having one "master list" of terms seems much 
simpler and more resilient to me at present. Yet, it may be good to reflect within the 
standard_name list what is often brought up in the list discussions anyhow, 
that is that some communities have established controlled vocabulary for their 
field, and - as far as I follow the discussions - this is usually a good 
argument for accepting a standard_name proposal, unless it is in conflict with 
other rules.

    The specific situation in atmospheric chemistry (maybe not so specific but 
at least very prominent) is that the "variable name space" is not 
1-dimensional, but multi-dimensional, i.e. for each (new) compound we can 
easily add a dozen or more new terms (= standard_names) which describe the 
molar fraction or mass content in the atmosphere, emission or deposition fluxes 
(due to a myriad individual processes if need be), chemical reaction rates or 
turnover rates, etc. My proposal to add the compound_name and a URI/URL to the 
accepted standard vocabulary list for compounds merely aims at making sure we 
can link the various compound properties together, so that an application can 
understand that "mole_fraction_of_trimethylbenzene_in_air" is linked to 
"tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air_due_to_emissions_from_traffic",
 for example. If you show me a parser that can extract all compound names from 
the standard_name table and which would work for all future versions of the 
standard_name table, then we might not need this (although the reference to a 
controlled vocabulary list might still be useful and take a little 
responsibility away from CF).
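
To illustrate the linking described above, here is a minimal sketch in Python. It assumes a standard_name table that already carries the proposed compound_name tag; the table excerpt, the root element name and the function group_by_compound are hypothetical and only serve the example. It groups standard_names by the explicit tag instead of parsing the names themselves.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical excerpt of a standard_name table carrying the proposed tag.
# The root element name and the layout are assumptions for this sketch.
TABLE = """<standard_name_table>
  <entry id="mole_fraction_of_trimethylbenzene_in_air">
    <compound_name>Trimethylbenzene</compound_name>
  </entry>
  <entry id="tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air_due_to_emissions_from_traffic">
    <compound_name>Trimethylbenzene</compound_name>
  </entry>
  <entry id="air_temperature"/>
</standard_name_table>"""


def group_by_compound(table_xml):
    """Map each compound_name to the standard_names that carry it."""
    groups = defaultdict(list)
    for entry in ET.fromstring(table_xml).iter("entry"):
        name = entry.findtext("compound_name")
        if name:  # entries without the conditional tag are skipped
            groups[name.strip()].append(entry.get("id"))
    return dict(groups)


linked = group_by_compound(TABLE)
for compound, standard_names in linked.items():
    print(compound, "->", standard_names)
```

Grouping by the compound_codelist URI instead of the plain name would work the same way and might be more robust against spelling variants.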

Cheers,

Martin



From: John Graybeal [mailto:jgrayb...@ucsd.edu]
Sent: Monday, 10 September 2012 18:28
To: Schultz, Martin
Cc: Lowry, Roy K.; cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] Expanding the standard_name metadata

Congratulations on your great meeting!

Concur that when the name is derivable fairly obviously from the other matter, 
it should not be required. In this case the CF name is supposed to be clear 
enough that the compound name should be within it already. Suggest this be 
available as an option if you value it highly (it is perhaps as much the label, 
as the unique identifier?).

We are bootstrapping best semantic practices for a long lifetime of their use 
(hopefully), and so having a URL (well, URI/IRI; yours works) is the principal 
computational reference. (How does the computer know with some confidence what 
the thing is?) Yes, definitely a web 2.0 kind of answer. Although a particular 
unique identifier may no longer be maintained in 10 or 20 years, it is likely 
enough of a 'standard reference' that it has been mapped to its replacement, or 
even forward linked from the old URL. Absolute worst case, a web search should 
find traces of it.

To generalize this (for creatures, phenomena, etc.), could we call it not 
"compound_codelist", but "object_codelist" or "object_IRI", as the compound is 
the direct object of the prepositional phrase?  OK, that's pretty 
grammar-centric and therefore obscure, but I see the names quickly described 
via their mapped components (a great thing!). This is very much the first step 
of that.

John

On Sep 10, 2012, at 02:35, Schultz, Martin wrote:


Hi Roy,

     thanks for supporting this idea. Why include the "compound_name"? I didn't 
really think about this, but only copied what is common practice in ISO 
metadata files. They usually pair a name with the link to the controlled 
vocabulary list. It could have to do with resilience. What do you do if the 
controlled vocabulary server doesn't work at the time when you need it? 
Actually, I would tend to think that the "compound_name" tag is the more 
important one, and I would see the URL more in the sense of a bibliographic 
reference. In a sense, this bibliographic reference lends some weight to the 
name. But perhaps I am still living too much in the web 1.0 world?
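
The resilience argument can be sketched as follows, in Python. This is only an illustration under stated assumptions: the function compound_label and the resolve callable (whatever would look up a label on the vocabulary server) are hypothetical, and whether the served label or the stored name should take precedence is a design choice that is still open in this discussion.

```python
def compound_label(stored_name, codelist_uri, resolve):
    """Return the label served for codelist_uri, or the stored name if the
    vocabulary server cannot be reached.

    resolve is a hypothetical callable taking a URI and returning a label;
    network failures are expected to surface as OSError subclasses.
    """
    try:
        label = resolve(codelist_uri)
    except OSError:
        label = None
    return label or stored_name


def offline(uri):
    """Stand-in resolver that simulates an unreachable vocabulary server."""
    raise ConnectionError("vocabulary server unreachable")


URI = "http://rdfdata.eionet.europa.eu/airquality/components/10"
print(compound_label("Carbon monoxide", URI, offline))
```

With the name stored next to the URL, the application still has something meaningful to show when the lookup fails.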

Cheers,

Martin


From: Lowry, Roy K. [mailto:r...@bodc.ac.uk]
Sent: Monday, 10 September 2012 11:03
To: Schultz, Martin; cf-metadata@cgd.ucar.edu
Subject: RE: Expanding the standard_name metadata

Hello Martin,

I really like the idea of linking the Standard Name to a resolvable URL for 
the compound, but would question the need for adding the compound name to the 
standard name table as well as the URL.  The plaintext compound name has to be 
included in the Standard Name and is available through resolution of the URL.  
Why introduce a further duplicate of the information with the inherent risk of 
discrepancies creeping in?

In a similar vein, should Standard Names get deeper into biological parameters 
it would be good to include a link to the World Register for Marine Species 
(WoRMS) for the taxon.

Cheers, Roy.
________________________________
From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Schultz, Martin [m.schu...@fz-juelich.de]
Sent: 10 September 2012 09:33
To: cf-metadata@cgd.ucar.edu
Subject: [CF-metadata] Expanding the standard_name metadata

Dear all,

     last week, we had a rather successful workshop on "Metadata for air 
quality and atmospheric composition" in Dublin. It was nice to see that the 
community (i.e. those present) seemed to agree without much discussion, that 
ISO 19115 (-1) is the way to go for discovery metadata, while CF is the way 
forward for descriptive metadata to be stored in (usually) netcdf data files. 
The main discussions at the workshop centered around ISO issues, but there was 
one interesting point that came up with respect to CF standard_names and their 
relation to controlled vocabulary:

    We did have discussions on this list earlier about a more grammar-oriented 
approach, and this was also brought up at our workshop again, mainly in light 
of the "threat" that the atmospheric composition group will soon begin to flood 
this email list with hundreds of new names in order to add additional chemical 
compounds. As we have seen with the problem of standard_names for emissions, 
this is stretching the limits of the current ways to operate and publish new 
standard_names. I don't want to argue against the concept of one "flat" master 
list (we have been through this and there are good reasons for sticking to this 
concept), but I would like to stimulate a discussion about adding more 
"metadata" to the standard_name table in order to better link it to other 
controlled vocabulary lists and avoid confusing inconsistencies, for example in 
the naming of chemical compounds. Specifically, I would like to propose two 
"conditional" tags compound_name and compound_codelist in the standard_name 
list which shall appear for all standard_names having to do with chemical 
compounds. Example:

<entry id="atmosphere_mass_content_of_carbon_monoxide">
   <compound_name>Carbon monoxide</compound_name>
   <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
   <canonical_units>kg m-2</canonical_units>
   <description>"Content" indicates a quantity per unit area. The "atmosphere 
content" of a quantity refers to the vertical integral from the surface to the 
top of the atmosphere. For the content between specified levels in the 
atmosphere, standard names including content_of_atmosphere_layer are used. The 
chemical formula of carbon monoxide is CO.</description>
</entry>

    In a way, this may be seen as duplication of information, but it would 
really help to tie ends together, because it is practically impossible to parse 
the standard_names in order to extract such information (due to the lack of a 
strict grammar). There may be other tags which could be useful to add, and one 
will have to decide about the pros and cons in each case. However, for compound 
names I would see a clear need arising now.
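
A small Python sketch may make the parsing problem concrete. The regular expression below is a deliberately naive, hypothetical heuristic (not an existing CF tool): it takes whatever sits between the first "_of_" and the next "_in_". The three standard_names are the ones mentioned in this thread. The second half reads the compound from the proposed tag instead.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical heuristic: text between the first "_of_" and the next "_in_".
NAIVE = re.compile(r"_of_(.+?)_in_")


def guess_compound(standard_name):
    """Guess the compound from the name alone; returns None if no match."""
    match = NAIVE.search(standard_name)
    return match.group(1) if match else None


names = [
    "mole_fraction_of_trimethylbenzene_in_air",
    "tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air_due_to_emissions_from_traffic",
    "atmosphere_mass_content_of_carbon_monoxide",
]
guesses = [guess_compound(n) for n in names]
# Only the first guess is a clean compound name; the second swallows part of
# the quantity, and the third name has no "_in_" so nothing is found.

# With the proposed tag, no guessing is needed.
ENTRY = """<entry id="atmosphere_mass_content_of_carbon_monoxide">
  <compound_name>Carbon monoxide</compound_name>
  <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
  <canonical_units>kg m-2</canonical_units>
</entry>"""
entry = ET.fromstring(ENTRY)
explicit = (entry.findtext("compound_name"), entry.findtext("compound_codelist"))
print(guesses)
print(explicit)
```

A more elaborate rule set could handle these three names, but each new naming pattern in the table would need another special case, which is the maintenance burden the explicit tags are meant to avoid.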

Best regards,

Martin


PD Dr. Martin G. Schultz
IEK-8, Forschungszentrum Jülich
D-52425 Jülich
Ph: +49 2461 61 2831




_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


----------------
John Graybeal    <mailto:jgrayb...@ucsd.edu>     phone: 858-534-2162
Product Manager
Ocean Observatories Initiative Cyberinfrastructure Project: 
http://ci.oceanobservatories.org
Marine Metadata Interoperability Project: http://marinemetadata.org






