Re: [CF-metadata] Usage of the 'Conventions' attribute (Nan Galbraith)
Dear Jonathan and Martin,

What I was trying to do here was legitimise an already established practice in at least two communities in the observational oceanographic domain, where thousands of files already exist that use the Conventions attribute in this non-standard way. I also carried the practice over into the specification of the SeaDataNet NetCDF generation tool. I can see absolutely no benefit in pushing through a new attribute for this purpose, so I will circumvent the compliance issue for SeaDataNet by moving the variable attribute into our namespace (i.e. calling it SDN_Conventions). This leaves Derrick's glider community in the lurch, but I suggest they either follow us or follow whatever solution the European GROOM glider community adopts.

Martin's comments on namespace highlight a concern I identified whilst doing the research for the SeaDataNet specification. Several communities have added large numbers of both global and variable attributes with no indication of namespace. Not only does this make it difficult to tease out what is CF and what is a community extension, but it creates an accident in waiting. What happens if CF creates a new attribute with a name already in community usage? In my view it's too late to introduce a CF namespace, and I prefer the idea that for a CF-compliant file CF should be the default namespace, with communities taking responsibility for their extensions. This is what I've done for SeaDataNet.

Cheers, Roy.

From: CF-metadata [cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Schultz, Martin [m.schu...@fz-juelich.de]
Sent: 24 January 2013 18:54
To: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] Usage of the 'Conventions' attribute (Nan Galbraith)

Dear all,

I fully agree with Jonathan's view that it would not be a good idea to call this new thing Convention. On the other hand, I don't really like the term flag_convention_name either, because it doesn't tell me anything. If I understand correctly, your desire is different from defining a pointer to a controlled vocabulary. If it is a similar idea, then I suggest we try to follow the ideas and naming of ISO19115 (without being able to tell you right away what the appropriate term would be). If it's CF-specific and indeed refers to the version of a CF document (or annex or whatever), shouldn't the attribute then have a name that starts with cf_? E.g. cf_attribute_convention?

Cheers, Martin

Message: 4
Date: Thu, 24 Jan 2013 12:23:33 +1100
From: Jonathan Gregory j.m.greg...@reading.ac.uk
To: cf-metadata@cgd.ucar.edu
Subject: [CF-metadata] Usage of the 'Conventions' attribute

Dear Roy

I understand the need, but I tend to think this would not be an appropriate use of the Conventions attribute, which is a general netCDF attribute, not specific to CF, and refers to files. I do appreciate the logical similarity, but I would have thought it better to define something more specific in CF for this purpose. If this is a need which several users have and is required for exchange of data, then it is reasonable to propose to add something to CF. Since it's an extra piece of information referring to flags, maybe an attribute such as flag_convention_name would work? We could make it a requirement that this attribute was allowed only if the variable had flag_meanings as well, to prevent its being used as a substitute for spelling out what the flags mean (as is necessary for the file to be self-describing).

Best wishes
Jonathan

___
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
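Jonathan's suggested rule (flag_convention_name allowed only alongside flag_meanings) can be sketched as a small check. This is an illustration only: flag_convention_name is the name floated in the message above and is not an agreed CF attribute, and the variable names and attribute values below are hypothetical. Variables are modelled as plain dictionaries so that no netCDF library is needed.

```python
def check_flag_convention(variables):
    """Return the names of variables that carry flag_convention_name
    without also carrying flag_meanings (the rule suggested in the thread)."""
    problems = []
    for name, attrs in variables.items():
        if "flag_convention_name" in attrs and "flag_meanings" not in attrs:
            problems.append(name)
    return problems


# Hypothetical variables and attribute values, for illustration only.
variables = {
    "temp_qc": {
        "flag_values": [0, 1, 2],
        "flag_meanings": "no_qc good bad",
        "flag_convention_name": "hypothetical QC scheme",
    },
    "psal_qc": {
        # Names a flag scheme but does not spell the flags out.
        "flag_convention_name": "hypothetical QC scheme",
    },
}

print(check_flag_convention(variables))
```

Only the second variable is reported, because it names a flag scheme without describing the flags in the file itself.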
Re: [CF-metadata] Usage of the 'Conventions' attribute
Roy et al.,

"Martin's comments on namespace highlight a concern I identified whilst doing the research for the SeaDataNet specification. Several communities have added large numbers of both global and variable attributes with no indication of namespace. Not only does this make it difficult to tease out what is CF and what is a community extension, but it creates an accident in waiting. What happens if CF creates a new attribute with a name already in community usage? In my view it's too late to introduce a CF namespace and prefer the idea that for a CF-compliant file CF should be the default namespace, with communities taking responsibility for their extensions. This is what I've done for SeaDataNet."

In working up a local metadata profile of CF for use here at the Met Office, we also spent much time thinking about the 'namespace problem'. In an early draft of our metadata profile, and after having reviewed previous discussions (e.g. https://cf-pcmdi.llnl.gov/trac/ticket/27), we elected to use the double underscore character sequence ('__') as a namespace separator. Our namespace prefixes were then mnemonics like 'ukmo' for the Met Office, 'dc' for Dublin Core, 'cim' for the Common Information Model, and so on. And we devised additional (fairly simple) machinery to associate the prefixes with target namespaces, just as in the XML world.

Thus, we envisaged using netcdf attributes along the lines of:

    variables:
        float myvar(t, y, x) ;
            myvar:ukmo__stashcode = "m01s01i123" ;
            myvar:ukmo__runid = "abcde" ;

    // global attributes
        :dc__rights = "Copyright (c) 2013, Acme Wind and Rain Corp." ;
        :dc__created = "2013-01-01 ..." ;

In the end, driven by a practical need to release a simpler, more digestible release 1.0 of our metadata specification, we dropped all the aforementioned namespace stuff.

As part of some subsequent low-level netcdf work, however, I chanced upon the fact that the '.' character is not treated in any special way within netcdf names (or rather, it is one of netcdf's original special characters, but not one that needs to be escaped in the way that, say, the ':' character does).

This got me to thinking that the '.' character might be the ideal namespace separator for use in CF/netCDF attribute names. Since '.' is not in the set of characters currently permitted in CF attribute names, we can be reasonably sure that it is not being used in existing CF-compliant netcdf files. The '.' character also has collateral appeal for python/java developers in that it is the familiar namespace separator used by those languages.

Applied to the previous example, then, we'd now have netcdf attributes such as ukmo.stashcode, ukmo.runid, dc.rights, dc.created, and so on, which looks considerably more elegant, IMO. In your context, Roy, you might elect to use namespace'd attributes called sdn.conventions, sdn.foo, sdn.bar, etc. Or bodc.foo, bodc.bar, etc. for BODC stuff.

Clearly there are several technical issues that would need to be addressed (e.g. how/when to use the 'cf.' prefix, what would the default namespace be, how would prefixes and their namespaces be associated, how should software interpret namespaces, and so on). But, assuming these could be resolved, what do people think about use of '.' as a namespace separator? Good idea? Bad idea?

Some recent postings to this list have suggested using a 'cf_' prefix, with the implied suggestion of a '_' namespace separator. IMHO, this approach has the limitation that client software would not be able to disambiguate existing names which include the '_' character. For example, would the name 'cell_methods' refer to a property called 'cell_methods' in some default namespace, or a property called 'methods' in the 'cell' namespace? Likewise for some possible new attribute called, e.g. 'cf_my_new_thing', what namespace would that be in? cf? cf_my? cf_my_new?

Regards,
Phil
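The ambiguity Phil describes can be made concrete with a short sketch. The helper below is hypothetical (it is not part of any netCDF or CF library); it simply lists every way an attribute name could be split into a namespace prefix and a local name for a given separator.

```python
def split_candidates(name, sep):
    """Return every (prefix, local_name) pair obtained by splitting
    `name` at a single occurrence of the separator `sep`."""
    pairs = []
    start = 0
    while True:
        i = name.find(sep, start)
        if i == -1:
            break
        # Skip splits that would leave an empty prefix or local name.
        if i > 0 and i + len(sep) < len(name):
            pairs.append((name[:i], name[i + len(sep):]))
        start = i + 1
    return pairs


# With '_' as separator, a client cannot tell which split is intended.
for prefix, local in split_candidates("cf_my_new_thing", "_"):
    print(f"namespace={prefix!r} local={local!r}")

# With '.', the example names from the message split exactly one way,
# and an existing name such as 'cell_methods' is never mistaken for a
# namespaced one.
print(split_candidates("ukmo.stashcode", "."))
print(split_candidates("cell_methods", "."))
```

The same reasoning applies to the earlier '__' separator, provided that single underscores at the boundary are not also allowed to appear doubled inside ordinary names.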
Re: [CF-metadata] Proposal for new standard_names for biomass burning emissions
Dear Martin and Angelika,

Thank you for your proposals. Thank you also to Philip and Jonathan for their useful comments in the discussion of these names.

I think the molecular hydrogen and dimethyl sulfide names are unproblematic and well defined. The following four names are accepted for inclusion in the standard name table:

tendency_of_atmosphere_mass_content_of_molecular_hydrogen_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_molecular_hydrogen_due_to_emission_from_forest_fires
tendency_of_atmosphere_mass_content_of_dimethyl_sulfide_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_dimethyl_sulfide_due_to_emission_from_forest_fires

all with units of kg m-2 s-1. They will be added in the February update.

Regarding the discussion of nitrogen dioxide and nitrogen monoxide names, I would like to check my understanding of exactly what is now being proposed. There seems to be agreement that it is fine to have nox_expressed_as_nitrogen_monoxide names, so are you in fact proposing that we should add the following eleven names?

tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_forest_fires
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_agricultural_production
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_agricultural_waste_burning
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_industrial_processes_and_combustion
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_energy_production_and_distribution
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_land_transport
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_maritime_transport
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_waste_treatment_and_disposal
tendency_of_atmosphere_mass_content_of_nox_expressed_as_nitrogen_monoxide_due_to_emission_from_residential_and_commercial_combustion
units: kg m-2 s-1

tendency_of_mass_concentration_of_nox_expressed_as_nitrogen_monoxide_in_air_due_to_emission_from_aviation
units: kg m-3 s-1

Are these in addition to the recently added nitrogen_monoxide emission names (i.e. I assume you are not intending the existing names to become aliases of these more recent proposals)?

For nitrogen dioxide I think you are currently sticking with your original two proposals of:

tendency_of_atmosphere_mass_content_of_nitrogen_dioxide_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_nitrogen_dioxide_due_to_emission_from_forest_fires

Is that correct?

Best wishes,
Alison

--
Alison Pamment
Tel: +44 1235 778065
NCAS/British Atmospheric Data Centre
Email: alison.pamm...@stfc.ac.uk
STFC Rutherford Appleton Laboratory
R25, 2.22
Harwell Oxford, Didcot, OX11 0QX, U.K.

From: CF-metadata [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Schultz, Martin
Sent: 03 January 2013 12:04
To: cf-metadata@cgd.ucar.edu
Subject: [CF-metadata] Proposal for new standard_names for biomass burning emissions

Dear all,

as per the general CF philosophy to add terms when needed, we propose the addition of the following standard_names for emissions from biomass burning. These follow the syntax of existing names and merely add three chemical species, namely nitrogen_dioxide, molecular_hydrogen, and dimethyl_sulfide (aka DMS):

tendency_of_atmosphere_mass_content_of_nitrogen_dioxide_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_nitrogen_dioxide_due_to_emission_from_forest_fires
tendency_of_atmosphere_mass_content_of_molecular_hydrogen_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_molecular_hydrogen_due_to_emission_from_forest_fires
tendency_of_atmosphere_mass_content_of_dimethyl_sulfide_due_to_emission_from_savanna_and_grassland_fires
tendency_of_atmosphere_mass_content_of_dimethyl_sulfide_due_to_emission_from_forest_fires
units: kg m-2 s-1

Note that the definition of emissions for nitrogen_dioxide is tricky: most inventories only give emissions for NOx (the sum of NO and NO2), and sometimes it is not even clear to an uninformed user if these are expressed_as nitrogen_monoxide or nitrogen_dioxide, or even nitrogen. The ratio of emitted NO to emitted NO2 varies by source category, and this is now sometimes taken into account in models. Therefore it becomes necessary to define emissions specifically for nitrogen_monoxide (already defined as standard_name) and nitrogen_dioxide (new proposal). We should
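The surface-emission names discussed in this thread share one pattern, which can be sketched as a template. This is an illustration of the naming pattern only, not the procedure by which standard names are proposed or accepted; the template string and function name are assumptions made for the example, and the aviation name uses a different pattern (mass_concentration ... in_air) that the template does not cover.

```python
from itertools import product

# Pattern shared by the surface-emission names quoted in the thread.
TEMPLATE = (
    "tendency_of_atmosphere_mass_content_of_{species}"
    "_due_to_emission_from_{source}"
)


def emission_names(species, sources):
    """Build one name per (species, source) pair, species-major order."""
    return [
        TEMPLATE.format(species=sp, source=src)
        for sp, src in product(species, sources)
    ]


# Reproduces the four names accepted in Alison's message.
accepted = emission_names(
    ["molecular_hydrogen", "dimethyl_sulfide"],
    ["savanna_and_grassland_fires", "forest_fires"],
)
for name in accepted:
    print(name, "(kg m-2 s-1)")
```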
Re: [CF-metadata] Usage of the 'Conventions' attribute
With respect to netcdf (at least the C version), it is the case that these characters can appear unescaped: _.@+-

It should be noted however that dot in particular causes problems for accessing remote datasets through DAP because the dot character is used in DAP constraints to specify fields inside DAP Sequences or Structures or Grids.

The problem you have is that no matter what choice of character(s) you make, someone may use the characters in a different way. This means that whatever choice you make, you need to enshrine it in a standard somewhere so that at least there is a chance that people will avoid it.

Personally, I would think that a two character sequence is least likely to be used by others, but two underscores is probably not a good choice. I would think something like @@ ++ might be a better choice.

=Dennis Heimbigner
Unidata

(In reply to the message from Philip Bentley.)
Re: [CF-metadata] Usage of the 'Conventions' attribute
Any of these special characters, other than the '_', would probably cause problems for code that deals with NetCDF files. The '.' is used in Matlab for access to structures, and the '@' is used to identify a variable as a function handle. There are work-arounds, but they likely wouldn't add efficiency or elegance to our code.

I agree with Roy that CF should be the default namespace in a CF compliant file, and that this problem belongs to groups that are writing extensions. In OceanSITES we've mostly ignored this problem-waiting-to-happen, but our code checks the versions of CF (and our OTS spec) in our files, which I hope offers some protection.

Should more of these community conventions be added to CF? I'm sure there are SDN and NCADD (data discovery) attributes that would be helpful to some CF users; it would be awfully nice to have a list of already-defined attributes - in one place - to choose from when putting together a CF-based spec for a project.

Cheers - Nan

(In reply to Dennis Heimbigner's message of 1/28/13, 1:17 PM.)
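The kind of version check Nan mentions for OceanSITES could look roughly like the sketch below. It is not the OceanSITES code: the function names, the supported-version table, and the example Conventions strings are all hypothetical, and it assumes each convention is written as name-version in a space- or comma-separated global Conventions attribute.

```python
import re

# Hypothetical table of versions that a reader is prepared to handle.
SUPPORTED = {"CF": {"1.5", "1.6"}, "OceanSITES": {"1.2"}}


def parse_conventions(value):
    """Map each convention name to its version string, assuming
    'NAME-VERSION' tokens separated by commas and/or whitespace."""
    versions = {}
    for token in re.split(r"[,\s]+", value.strip()):
        if not token:
            continue
        name, sep, version = token.rpartition("-")
        if sep:
            versions[name] = version
        else:
            versions[token] = None  # no version given
    return versions


def unsupported(value, supported=SUPPORTED):
    """Return the conventions that are missing or at an unhandled version."""
    found = parse_conventions(value)
    return sorted(
        name for name, allowed in supported.items()
        if found.get(name) not in allowed
    )


print(unsupported("CF-1.6, OceanSITES-1.2"))
print(unsupported("CF-1.4 OceanSITES-1.2"))
```

A check of this kind guards against a file written to a later CF version in which a community attribute name has acquired a different CF meaning, which is the collision Roy describes.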
Re: [CF-metadata] Usage of the 'Conventions' attribute
Hi Nan,

Would the CF web site be an appropriate place for communities to post the attributes they have added to CF - either with or without namespace prefixes?

Cheers, Roy.

(In reply to the message from Nan Galbraith [ngalbra...@whoi.edu], sent 28 January 2013 20:25.)