Dear Dan

I agree with you that it would be better to store F(x) than to use your sign
convention for return periods. However, it would also be fine to split the
return periods for the two tails into separate data variables and give them
distinct standard names. We already have some standard names of a similar
kind, e.g.
  
spell_length_of_days_with_lwe_thickness_of_precipitation_amount_above_threshold
and you could propose suitable ones.
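
As a rough illustration of the two options, the signed return periods
described in the quoted message could be converted either to F(x) or to two
separate tail values along the following lines. This is only a sketch: the
function names are invented for the example, None stands in for a missing
value, and it assumes that a return period always has a magnitude of at least
1 year.

```python
def cdf_from_signed_return_period(n):
    """Convert a signed return period n (years) to F(x).

    n > 0 refers to the wet (right) tail: F(x) = 1 - 1/n.
    n < 0 refers to the dry (left) tail:  F(x) = 1/|n|.
    """
    if abs(n) < 1:
        # Assumption for this sketch: shorter return periods are not meaningful.
        raise ValueError("return period must have magnitude >= 1 year")
    if n > 0:
        return 1.0 - 1.0 / n
    return 1.0 / abs(n)


def split_signed_return_period(n):
    """Split a signed return period into (wet_tail, dry_tail) values.

    The tail that does not apply is returned as None, standing in for a
    missing value in the corresponding data variable.
    """
    if abs(n) < 1:
        raise ValueError("return period must have magnitude >= 1 year")
    if n > 0:
        return (float(n), None)
    return (None, float(abs(n)))


# Example values taken from the quoted message: N = 10 and N = -10.
print(cdf_from_signed_return_period(10))
print(cdf_from_signed_return_period(-10))
print(split_signed_return_period(-10))
```

Either form removes the need for the sign convention: F(x) is a single
variable covering both tails, while the split form keeps return periods in
years but in two variables.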

If you store F(x), I think it would be a data variable, not a coordinate or
ancillary variable, and it should have a standard name. I believe the guidance
you quote is about probability distribution functions rather than cumulative
(probability) distribution functions. Following a similar approach, however,
we could have a standard name such as
  cumulative_distribution_function_of_precipitation_amount
for F(x), where x is precipitation_amount, which would be a coordinate. Is
that what you have in mind?
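
To make the suggestion concrete, here is one possible reading of that layout,
written as a plain Python description rather than with any particular NetCDF
library. It is a sketch under assumptions: the variable names and dimension
names are invented, the cumulative_distribution_function_of_precipitation_amount
name is only a proposal and is not in the standard name table, and the
rainfall amount is assumed to be attached to F(x) as an auxiliary coordinate
through the coordinates attribute because it varies from one grid point to
another.

```python
# Hypothetical layout; variable names, dimension names and the CDF standard
# name are illustrative only.
layout = {
    "precipitation_amount": {
        "dimensions": ("time", "y", "x"),
        "attributes": {
            "standard_name": "precipitation_amount",
            "units": "kg m-2",
        },
    },
    "cdf_of_precipitation_amount": {
        "dimensions": ("time", "y", "x"),
        "attributes": {
            # Proposed name, not an existing standard name.
            "standard_name": "cumulative_distribution_function_of_precipitation_amount",
            "units": "1",
            # Links each F(x) value to the rainfall amount x it refers to.
            "coordinates": "precipitation_amount",
        },
    },
}


def check_coordinates(layout):
    """Return a list of problems found in the 'coordinates' attributes.

    Each named coordinate must be a variable in the layout, and its
    dimensions must be a subset of the data variable's dimensions.
    """
    problems = []
    for name, var in layout.items():
        for coord in var["attributes"].get("coordinates", "").split():
            if coord not in layout:
                problems.append(f"{name}: unknown coordinate {coord}")
            elif not set(layout[coord]["dimensions"]) <= set(var["dimensions"]):
                problems.append(f"{name}: {coord} has extra dimensions")
    return problems


print(check_coordinates(layout))
```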

Cheers

Jonathan


----- Forwarded message from "Hollis, Dan" <dan.hol...@metoffice.gov.uk> -----

> Dear all,
> 
> Here is another question related to migrating our UK climate grids to NetCDF.
> 
> As well as grids of the monthly rainfall total (in mm) we also generate grids 
> of the estimated return period of the rainfall total (in years). Currently 
> these two quantities are stored in separate files (with only the file name 
> and location to tell us they are related). I've been trying to think how to 
> store the return period information using CF-NetCDF and would be grateful for 
> advice.
> 
> Some further details:
> 
> Our existing grids contain the return period in years i.e. if the return 
> period for a particular grid point is N years then this means that we 
> estimate that the rainfall total for that grid point will be exceeded on 
> average once every N years. This is equivalent to saying that each year there 
> is a probability of 1/N of exceeding that rainfall amount i.e. the 
> cumulative distribution function, F(x) = 1 - 1/N. For example, if N = 10 
> then F(x) = 0.9. Additionally, as we are also interested in droughts, we have 
> adopted our own convention of using negative values to refer to the left 
> (dry) tail of the rainfall distribution. For example N = -10 is used to mean 
> that F(x) = 0.1 i.e. we estimate that rainfall amounts *less* than the 
> observed value will occur once every 10 years on average.
> 
> This use of positive and negative values to indicate return periods relating 
> to the right (wet) and left (dry) tails is convenient but unconventional. My 
> initial thought is that we should store F(x) itself and only convert to 
> return period for the purposes of presentation e.g. creating maps.
> 
> So, how to store F(x)? The main problem is that the value to which the return 
> period relates (i.e. the rainfall amount) varies from one grid point to 
> another. Two possibilities occur to me, both of which involve storing F(x) 
> alongside the rainfall total:
> 
> - Store F(x) as an auxiliary coordinate
> 
> - Store F(x) as ancillary data
> 
> It's not clear to me whether one is better than the other, or even whether 
> either approach is valid.
> 
> The other question is what to call the F(x) values. The guidance for 
> ancillary data says to use standard name modifiers to indicate the 
> relationship, but there doesn't seem to be anything suitable for describing 
> F(x).
> 
> The other thing I've looked at is the guidance for constructing standard 
> names. I can't seem to locate this on the current CF web site so I've referred 
> to the archived copy available here:
> 
> https://web.archive.org/web/20130728212039/http://cf-pcmdi.llnl.gov/documents/cf-standard-names/guidelines
> 
> The section on transformations includes 
> 'probability_distribution_of_X[_over_Z]' in the list, however it's unclear to 
> me whether this is what I need, or even how I might use it in other 
> circumstances. The notes state:
> 
> "probability distribution (i.e. a number in the range 0.0-1.0 for each range 
> of X) of variations (over Z) of X. The data variable should have an axis for 
> X."
> 
> The reference to 'each range of X' is the bit I find confusing. Is the idea 
> to store F(X1), F(X2), F(X3) etc, or is it intended to be F(X2) - F(X1), 
> F(X3) - F(X2), F(X4) - F(X3) etc? The former doesn't quite fit the 
> description, but the latter has the problem that the number of ranges (= the 
> number of data values) will be one less than the number of X values. I can't 
> see any existing names that use this transformation to use as a guide.
> 
> If anyone can help that would be much appreciated.
> 
> Thanks,
> 
> Dan
> 
> 
> Dan Hollis   Climatologist
> Met Office   Hadley Centre   FitzRoy Road   Exeter   Devon   EX1 3PB   United 
> Kingdom
> Tel: +44 (0)1392 886780   Fax: +44 (0)1392 885681
> E-mail: dan.hol...@metoffice.gov.uk   Website: http://www.metoffice.gov.uk
> For UK climate and past weather information, visit 
> http://www.metoffice.gov.uk/climate
> 
> 

> _______________________________________________
> CF-metadata mailing list
> CF-metadata@cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


----- End forwarded message -----