Any of these special characters, other than the '_', would probably
cause problems for code that deals with NetCDF files. The '.' is used
in Matlab for access to structures, and the '@' is used to identify a
variable as a function handle. There are work-arounds, but they
likely wouldn't add efficiency or elegance to our code.

I agree with Roy that CF should be the default namespace in a CF
compliant file, and that this problem belongs to groups that are writing
extensions. In OceanSITES we've mostly ignored this problem-waiting-
to-happen, but our code checks the versions of CF (and our OTS spec) in
our files, which I hope offers some protection.
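
As a rough illustration of that kind of version check, here is a minimal
Python sketch. It assumes the versions are advertised in a Conventions-style
string of "Name-x.y" tokens; the sample value and both function names are
made up for this example and are not taken from the OceanSITES code.

```python
import re


def parse_conventions(value):
    """Split a Conventions-style string into {name: version_tuple}."""
    found = {}
    # Tokens are assumed to be separated by commas and/or whitespace.
    for token in re.split(r"[,\s]+", value.strip()):
        match = re.fullmatch(r"([A-Za-z]+)-(\d+(?:\.\d+)*)", token)
        if match:
            name, version = match.groups()
            # Compare versions numerically, so that 1.10 sorts after 1.6.
            found[name] = tuple(int(part) for part in version.split("."))
    return found


def check_minimum(value, name, minimum):
    """Return True if convention `name` is present at or above `minimum`."""
    version = parse_conventions(value).get(name)
    return version is not None and version >= minimum


conventions = "CF-1.6, OceanSITES-1.2"  # hypothetical attribute value
print(parse_conventions(conventions))
print(check_minimum(conventions, "CF", (1, 4)))
```

A check like this only tells the reader which spec a file claims to follow;
it does not detect attribute names that collide between specs.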

Should more of these community conventions be added to CF? I'm sure
there are SDN and NCADD (data discovery) attributes that would be
helpful to some CF users; it would be awfully nice to have a list of
already-defined attributes - in one place - to choose from when putting
together a CF-based spec for a project.

Cheers - Nan



On 1/28/13 1:17 PM, Dennis Heimbigner wrote:
With respect to netcdf (at least the C version),
it is the case that these characters can appear
unescaped:  _.@+-

It should be noted however that dot in particular
causes problems for accessing remote datasets
through DAP because the dot character is used
in DAP constraints to specify fields inside
DAP Sequences or Structures or Grids.
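
To make the clash concrete, here is a toy Python sketch of how a client
might read the projection part of a DAP-style constraint. It is a
deliberate simplification and not the real DAP grammar; the function name
and the sample constraints are invented for illustration.

```python
def parse_projection(constraint):
    """Toy reading of a projection: '?a.b,c&...' -> [['a', 'b'], ['c']]."""
    # Keep only the projection clause, i.e. the text before the first '&'.
    projection = constraint.lstrip("?").split("&", 1)[0]
    # Each dot is taken to mean "field inside a container".
    return [item.split(".") for item in projection.split(",") if item]


# Intended use: fields inside a Sequence called 'profile'.
print(parse_projection("?profile.temperature,profile.depth&profile.depth>10"))

# A single name that merely contains a dot is read the same way.
print(parse_projection("?ukmo.stashcode"))
```

In this toy reading, a name such as ukmo.stashcode cannot be told apart
from a field called stashcode inside a container called ukmo.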

The problem you have is that no matter what
choice of character(s) you make, someone may
use the characters in a different way.
This means that whatever choice you make, you need
to enshrine it in a standard somewhere so that at
least there is a chance that people will avoid it.

Personally, I would think that a two character sequence
is least likely to be used by others, but two underscores
is probably not a good choice. I would think something
like @@ or ++ might be a better choice.


=Dennis Heimbigner
 Unidata

Bentley, Philip wrote:
Roy et al.,
Martin's comments on namespace highlight a concern I identified whilst
doing the research for the SeaDataNet specification. Several communities
have added large numbers of both global and variable attributes with no
indication of namespace. Not only does this make it difficult to tease
out what is CF and what is a community extension, but it creates an
accident in waiting. What happens if CF creates a new attribute with a
name already in community usage? In my view it's too late to introduce a
CF namespace and prefer the idea that for a CF-compliant file CF should
be the default namespace, with communities taking responsibility for
their extensions. This is what I've done for SeaDataNet.

In working up a local metadata profile of CF for use here at the Met
Office, we also spent much time thinking about the 'namespace problem'.
In an early draft of our metadata profile, and after having reviewed
previous discussions (e.g. https://cf-pcmdi.llnl.gov/trac/ticket/27), we
elected to use the double underscore character sequence ('__') as a
namespace separator. Our namespace prefixes were then mnemonics like
'ukmo' for the Met Office, 'dc' for Dublin Core, 'cim' for the Common
Information Model, and so on. And we devised additional (fairly simple)
machinery to associate the prefixes with target namespaces, just as in
the XML world.

Thus, we envisaged using netcdf attributes along the lines of:

variables:
  float myvar(t, y, x) ;
    myvar:ukmo__stashcode = "m01s01i123" ;
    myvar:ukmo__runid = "abcde" ;

// global attributes
  :dc__rights = "Copyright (c) 2013, Acme Wind and Rain Corp." ;
  :dc__created = "2013-01-01 ..." ;
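
The prefix-to-namespace machinery mentioned above could look roughly like
the following Python sketch. This is not the Met Office implementation:
the NAMESPACES table, its placeholder URIs, and the resolve() helper are
all assumptions made for illustration.

```python
SEPARATOR = "__"

# Hypothetical prefix table; the URIs are placeholders, not real namespaces.
NAMESPACES = {
    "ukmo": "http://example.org/ns/ukmo",
    "dc": "http://example.org/ns/dublin-core",
    "cim": "http://example.org/ns/cim",
}


def resolve(attr_name, namespaces=NAMESPACES, default=None):
    """Return (namespace, local_name) for a netCDF attribute name.

    Names without a registered prefix stay whole in the default namespace.
    """
    prefix, sep, local = attr_name.partition(SEPARATOR)
    if sep and local and prefix in namespaces:
        return namespaces[prefix], local
    return default, attr_name


for name in ("ukmo__stashcode", "dc__rights", "cell_methods", "foo__bar"):
    print(name, "->", resolve(name))
```

Only prefixes that appear in the table are treated as namespaces, so a
plain CF name such as cell_methods passes through unchanged.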

In the end, driven by a practical need to release a simpler, more
digestible release 1.0 of our metadata specification, we dropped all the
aforementioned namespace stuff.

As part of some subsequent low-level netcdf work, however, I chanced
upon the fact that the '.' character is not treated in any special way
within netcdf names (or rather, it is one of netcdf's original special
characters, but not one that needs to be escaped in the way that, say,
the ':' character does).

This got me to thinking that the '.' character might be the ideal
namespace separator for use in CF/netCDF attribute names. Since '.' is
not in the set of characters currently permitted in CF attribute names,
we can be reasonably sure that it is not being used in existing
CF-compliant netcdf files.

The '.' character also has collateral appeal for Python/Java developers
in that it is the familiar namespace separator used by those languages.

Applied to the previous example, then, we'd now have netcdf attributes
such as ukmo.stashcode, ukmo.runid, dc.rights, dc.created, and so on.
Which looks considerably more elegant, IMO.

While in your context, Roy, you might elect to use namespace'd
attributes called sdn.conventions, sdn.foo, sdn.bar, etc. Or bodc.foo,
bodc.bar, etc. for BODC stuff.

Clearly there are several technical issues that would need to be
addressed (e.g. how/when to use the 'cf.' prefix, what would the default
namespace be, how would prefixes and their namespaces be associated, how
should software interpret namespaces, and so on).

But, assuming these could be resolved, what do people think about use of
'.' as a namespace separator? Good idea? Bad idea?

Some recent postings to this list have suggested using a 'cf_' prefix,
with the implied suggestion of a '_' namespace separator. IMHO, this
approach has the limitation that client software would not be able to
disambiguate existing names which include the '_' character. For
example, would the name 'cell_methods' refer to a property called
'cell_methods' in some default namespace, or a property called 'methods'
in the 'cell' namespace? Likewise for some possible new attribute
called, e.g. 'cf_my_new_thing', what namespace would that be in? cf?
cf_my? cf_my_new?
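
The ambiguity can be shown with a short Python sketch that lists every
way a name could be split into a prefix and a local name for a given
separator. The helper name prefixed_readings() is made up for this
illustration.

```python
def prefixed_readings(name, separator):
    """List every (prefix, local_name) split of `name` at `separator`."""
    readings = []
    start = name.find(separator)
    while start != -1:
        prefix = name[:start]
        local = name[start + len(separator):]
        # Ignore splits that would leave an empty prefix or local name.
        if prefix and local:
            readings.append((prefix, local))
        start = name.find(separator, start + 1)
    return readings


for name, sep in [
    ("cell_methods", "_"),
    ("cf_my_new_thing", "_"),
    ("cell_methods", "."),
    ("cf.my_new_thing", "."),
]:
    print(repr(name), "split on", repr(sep), "->", prefixed_readings(name, sep))
```

With '_' an existing name like cell_methods already has a possible
prefixed reading and cf_my_new_thing has three, whereas with '.' the
existing name has none and the new name has exactly one.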

Regards,
Phil
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata



--
*******************************************************
* Nan Galbraith                        (508) 289-2444 *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                                *
*******************************************************


