Dear all,

Jim asked,
"As some examples of the confusing situation we have now, why do we have a separate word modifier number_of_observations instead of a number_of_observations_of_X transformation modifier? Why don't we have variance_of_X or anomaly_of_X transformations (or separate word modifiers variance or anomaly)? Why isn't there a cell method for standard error? I can't discern any logic behind the current partitioning."

I've tried to explain how this came about, but perhaps I am not being clear, so let me try again:

* We introduced the modifiers like number_of_observations for those situations where it was thought likely that a large number of standard names would need them. Factorising out this dimension thus avoids a large expansion of the standard name table. So far, only four anomaly_of names have been requested, so it seems the right judgement not to have a standard_name modifier for that.

* That was also one of the motivations for cell_methods: there would be vastly more standard names if we had to include all the cell_methods information too. The other motivation for cell_methods is that the statistical operations relate to particular axes. For instance, just "mean" is too vague: does it mean time-mean, zonal-mean, mean over radiation wavelength, or what? The same is true for variance. The cell_methods attribute makes this precise.

* There is not a cell method for standard error because it does not relate to a particular dimension. The standard error is a metadata property of the individual data. The cell methods statistically describe the variation of the quantity within cells. These are different purposes.

While you may not agree with the logic, does this help to explain what it is? If the situation is perceived as confusing and easily misunderstood, I am all in favour of clarifying it by inserting more explanation and discussion in the CF standard document. That could be done with a defect ticket. As Philip says, it could shorten future discussions.
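
To illustrate the point about axes, here is a minimal Python sketch that splits a simple cell_methods string into (axes, method) pairs. The function name parse_cell_methods and the axis names "time", "lat" and "lon" are assumptions made for this example; the sketch handles only the basic "name: method" form and does not attempt the "where", "over" or "within" clauses.

```python
import re


def parse_cell_methods(cell_methods):
    """Split a simple CF cell_methods string into (axes, method) pairs.

    Illustrative only: handles the basic "name: [name: ...] method" form.
    Parenthesised comments are dropped; "where", "over" and "within"
    clauses are not supported.
    """
    # Drop parenthesised comments such as "(interval: 1 hour)".
    text = re.sub(r"\([^)]*\)", "", cell_methods)

    pairs = []
    axes = []
    for token in text.split():
        if token.endswith(":"):
            # A token ending in a colon names an axis the method applies to.
            axes.append(token[:-1])
        else:
            # Any other token is taken to be the method for the axes seen so far.
            pairs.append((tuple(axes), token))
            axes = []
    return pairs


# A time-mean and a zonal-mean are distinguished by the axis, not by the name.
time_mean = parse_cell_methods("time: mean")
zonal_mean = parse_cell_methods("lon: mean")
combined = parse_cell_methods("time: mean (interval: 1 hour) lat: lon: variance")
```

With this form the same standard name can carry "time: mean" or "lon: mean", and the two cases stay distinct without any addition to the standard name table.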
But we can also change the standard, of course. However, changes to existing attributes are difficult for existing software. I do not think we need or ought to change the existing attributes. While I appreciate the reason for the suggestion, I feel that suffixing something to the standard_name to indicate "something" has been done to it would not really help, because there is almost *always* something done to it! Cell methods are recommended to be specified in any case where the default "point" or "sum" is not correct. They should be present if the quantity is a mean, in particular. A mean is also a transformation, just like a standard deviation.

I am not convinced yet by the argument that we have to modify the CF standard because the standard_name may be misunderstood or misused by software which catalogues or serves datasets. CF introduced the standard_name attribute. If it's being used now, software must already have been modified to support CF. Well then, why can't it be modified again to support CF more fully or correctly? If we explained more clearly in the standard what the intention was, that would no doubt help with future software design.

Instead of changing what we have, I think we should add to it. It seems to me, as I've said before, that the existing proposal for "CF strings" summarising some essential metadata (similar to the earlier proposal for common concepts in some ways) would solve this problem. It is *that* kind of string, not the standard name, that the user should be offered to select an appropriate variable. It's a combination of attributes. It's not hard to assemble that information from the separate attributes, but if that's an obstacle, we could help software over it by recommending that this extra attribute be included.

Please have a look at https://cf-pcmdi.llnl.gov/trac/ticket/94 and add your comments on it.
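
As a rough illustration of how little is needed to assemble such a summary from the separate attributes, here is a minimal Python sketch. The function name describe_variable and the output layout "standard_name [cell_methods]" are purely hypothetical; they are not the format proposed in the ticket, only an example of combining the two attributes.

```python
def describe_variable(attrs):
    """Combine standard_name and cell_methods into one label.

    The layout "standard_name [cell_methods]" is a hypothetical format
    chosen for this example, not one defined by CF.
    """
    name = attrs.get("standard_name", "(no standard_name)")
    methods = attrs.get("cell_methods")
    if methods:
        # Normalise whitespace so equivalent strings give the same label.
        return f"{name} [{' '.join(methods.split())}]"
    return name


# Two variables with the same standard_name but different cell_methods
# receive different labels, so a catalogue can tell them apart.
monthly_mean = describe_variable(
    {"standard_name": "air_temperature", "cell_methods": "time: mean"}
)
monthly_max = describe_variable(
    {"standard_name": "air_temperature", "cell_methods": "time: maximum"}
)
instantaneous = describe_variable({"standard_name": "air_temperature"})
```

Software offering variables for selection could show a label of this kind instead of the bare standard name.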

Best wishes

Jonathan