Hi all -

Species taxonomies are not like chemical vocabularies, in that terms for organisms change over time. There are some big projects involved in maintaining these taxonomies, and we probably don't want to commit to launching a parallel effort.

The ubio project has a decent description of the problem, at http://www.ubio.org/index.php?pagename=background_intro

So, it seems to me that if we're going to expand CF to accommodate biological data, we should follow Roy's advice, and have a 'generic' standard name that means 'organism count' and add at least one required attribute pointing to a taxonomic name server (with a version date). The 'current' species name could be included in the long_name attribute.

Although it's a valid point that existing search tools don't know about extra attributes, the effort of keeping up with the changes in terms could render CF useless for this kind of data otherwise.

Regards - Nan

On 3/25/13 5:00 AM, Jonathan Gregory wrote:
Dear all

I agree with Philip that cfu should be spelled out.

I was also going to make the same point about Roy's proposal being different from our treatment of chemical species, which are encoded in the standard name; this system seems to be working. One reason for keeping this approach was the "green dog" problem. That particular phrase is actually Roy's, if I remember correctly. That is, we wish to prevent nonsensical constructions, by approving each name which makes (chemical) sense individually.

However Roy argues that there is an order of magnitude more biological species to deal with than chemical. I don't think that keeping the same approach (encoding in the standard name) would break the system, but it would make the standard name table very large. Perhaps more importantly, if there were so many species, I expect that data-writers would simply assume that each of the possible combinations of pattern and species did already exist in the standard name table, without bothering to check or have them approved. That would defeat the object of the system of individual approval.

We don't have to follow the chemical approach. For named geographical regions and surface area types (vegetation types etc.) we use string-valued coordinate variables, rather like Roy proposes here. To follow that approach we would need a new table, subsidiary to the standard name table, containing a list of controlled names of biological species. We would use the same approval process to add names to this list as we do for the standard name table. (This is what we do for geographical regions and area types.)

We would then have a standard_name such as
  number_concentration_of_biological_species_in_sea_water
whose definition would note that a data variable with this standard_name must have a string-valued auxiliary coordinate variable of biological_species containing a valid name from the biological species table.
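
A minimal sketch of that construction, in Python, with plain dictionaries standing in for netCDF variables. Only the two standard names come from the proposal above. The species table entries, the variable names abundance and species_name, the 'species' dimension, the units and the helper check_species_coordinate are all assumptions made for illustration, not an agreed CF design.

```python
# Hypothetical controlled list standing in for the proposed
# biological species table; these entries are illustrative only.
SPECIES_TABLE = {"calanus_finmarchicus", "calanus_helgolandicus"}

STANDARD_NAME = "number_concentration_of_biological_species_in_sea_water"


def check_species_coordinate(data_var, aux_coord, table=SPECIES_TABLE):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if data_var["attrs"].get("standard_name") != STANDARD_NAME:
        problems.append("data variable has an unexpected standard_name")
    if aux_coord["attrs"].get("standard_name") != "biological_species":
        problems.append("auxiliary coordinate is not a biological_species")
    # CF links auxiliary coordinates to data through the 'coordinates' attribute.
    if aux_coord["name"] not in data_var["attrs"].get("coordinates", "").split():
        problems.append("auxiliary coordinate is not listed in 'coordinates'")
    # Every species label must be an approved entry in the controlled table.
    for name in aux_coord["values"]:
        if name not in table:
            problems.append(f"'{name}' is not in the species table")
    return problems


# A data variable (assumed layout) with a string-valued auxiliary coordinate.
abundance = {
    "name": "abundance",
    "dims": ("time", "species"),
    "attrs": {
        "standard_name": STANDARD_NAME,
        "units": "m-3",
        "coordinates": "species_name",
    },
}
species_name = {
    "name": "species_name",
    "dims": ("species",),
    "attrs": {"standard_name": "biological_species"},
    "values": ["calanus_finmarchicus", "calanus_helgolandicus"],
}

print(check_species_coordinate(abundance, species_name))
```

The point of the check is that the species label is validated against a separately approved list, so a nonsensical label would be rejected without every pattern-and-species combination needing its own standard name.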
If there is just one species, the auxiliary coordinate variable wouldn't need a dimension, but this construction would also allow a single data variable to contain data for several species, by having a dimension of size greater than one.

Cheers

Jonathan
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--
*******************************************************
* Nan Galbraith        Information Systems Specialist *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                 (508) 289-2444 *
*******************************************************