Dear Martin

In my reading of 7.3.3 and the conformance document, it seems clear that "where" is intended to be used with area types.

> There is an issue, it appears, about the use of the "where" modifier for
> cell_methods elements other than "area:". Jonathan believes "where" should
> only apply for area on the basis that this is where the motivation comes from
> in the first paragraph of section 7.3.3. The subsequent paragraphs in section
> 7.3.3 describe the use of "where" with a generic element "name: ....". The
> compliance document clearly states that "where" can be used with any string.

I'm sorry, I can't find that - please could you point it out? In
http://cfconventions.org/Data/cf-documents/requirements-recommendations/requirements-recommendations-1.6.html
regarding method [where type1 [over type2]] it says

    The valid values for type1 are the name of a string-valued auxiliary or
    scalar coordinate variable with a standard_name of area_type, or any
    string value allowed for a variable of standard_name of area_type.

We could generalise area_types to mean "states" so they can apply in time as well as space. I think all the existing ones could be interpreted in this way i.e. with the sense of "when" rather than "where". Vegetation is sometimes present and sometimes absent at any given spot, for instance, just as it is present in some spots and not others at any given time.

Suppose you want to calculate a radiative flux for a grid-box in cloud-free air. You can do this on each instantaneous timestep for the cloud-free fraction of the grid-box, and then calculate a time-mean of these timestep values i.e. "area: mean where clear_sky time: mean". If the input data supplies a higher spatial resolution than the grid-box, so you have many timeseries, you could alternatively do it the other way round, and first calculate, for each of the points, the value of the flux for those timesteps when there is no cloud, then calculate an area-mean of these local values i.e. "time: mean where clear_sky area: mean". These aren't the same because they imply different weights.

For example, suppose you have three points within the grid-box and two times, and the data is as follows:

    a X X
    b c X

where X means cloudy, and a, b, c are clear-sky values. According to the first method, the value is a/2 + b/4 + c/4, and according to the second method it is a/4 + b/4 + c/2, if I've done my sums right. There is a third method, in which we consider both time and space together: "time: area: mean where clear_sky". In this case the value is a/3 + b/3 + c/3.

If I'm right about this, I think we could make this generalisation and it would not be problematic. However, as usual, we should only make the change if there is a use-case which demands it.

Best wishes

Jonathan
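
The three sets of weights in the example can be checked with a short script. This is only a sketch under stated assumptions: the data grid is read with rows as the two times and columns as the three points, a point or time with no clear-sky value at all is left out of the subsequent mean, and the function names are hypothetical, not part of CF or any library. Each clear-sky value is represented by its weight vector on (a, b, c), so the result of each method is the tuple of weights.

```python
from fractions import Fraction

# Weight vectors on (a, b, c); None marks a cloudy cell (X).
A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Assumed layout: rows are times, columns are points.
GRID = [
    [A, None, None],  # time 1: a X X
    [B, C, None],     # time 2: b c X
]


def mean(vectors):
    """Average the weight vectors, skipping None; None if nothing is clear."""
    clear = [v for v in vectors if v is not None]
    if not clear:
        return None
    return tuple(Fraction(sum(col), len(clear)) for col in zip(*clear))


def area_then_time(grid):
    # "area: mean where clear_sky time: mean"
    return mean([mean(row) for row in grid])


def time_then_area(grid):
    # "time: mean where clear_sky area: mean"
    return mean([mean(col) for col in zip(*grid)])


def time_and_area(grid):
    # "time: area: mean where clear_sky"
    return mean([cell for row in grid for cell in row])


for method in (area_then_time, time_then_area, time_and_area):
    weights = method(GRID)
    print(method.__name__, ", ".join(str(w) for w in weights))
```

With this layout the script gives weights 1/2, 1/4, 1/4 for the first method, 1/4, 1/4, 1/2 for the second, and 1/3, 1/3, 1/3 for the third, in agreement with the sums above.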