Hi,

The methods for estimating probabilities are non-trivial, and may change over 
time. Because of that I would prefer to keep information about the exact 
process outside of the generated file.

I have not been able to find any references to realization_weight in the 
standard documents. Could you please refer me to the right place?

VG



----- Original Message -----
From: "Jamie Kettleborough" <jamie.kettleboro...@metoffice.gov.uk>
To: "Vegard Bønes" <vegard.bo...@met.no>, "Jonathan Gregory" 
<j.m.greg...@reading.ac.uk>
Cc: cf-metadata@cgd.ucar.edu, "Jamie Kettleborough" 
<jamie.kettleboro...@metoffice.gov.uk>
Sent: 16 November 2011 11:53:22
Subject: RE: [CF-metadata] standards for probabilities

Hello Vegard,

How do you generate your cdf from your realisations?  Do you simply weight each 
ensemble member equally?  I think there are cases where you may weight by some 
measure of how 'good' you think the ensemble member is (some sort of measure of 
its error - you downweight those with high errors).  If you are storing the 
output from ensemble members in the file then I think CF allows for this using 
the 'realization_weight' standard name - to store your errors/weights in the 
file.

Furthermore you may want to know the sensitivity of your cdf to your error 
estimates so you could have more than one cdf for the same variable, but based 
on different ways of deriving the errors/weights.
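
To make the weighting question concrete, here is a minimal sketch of how a 
percentile could be read off a weighted empirical cdf. It is only an 
illustration: the function name weighted_percentile, the sample values and the 
weights are all hypothetical, and it is not meant to describe how any 
particular centre derives its probabilities.

```python
def weighted_percentile(values, weights, q):
    """Return the smallest member value whose cumulative weight reaches q.

    values  -- one value per ensemble member (hypothetical data)
    weights -- non-negative weight per member; equal weights give the
               ordinary empirical cdf
    q       -- probability level between 0 and 1, e.g. 0.25
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Walk through the members in order of increasing value, accumulating weight.
    cumulative = 0.0
    pairs = sorted(zip(values, weights))
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Four hypothetical ensemble members of precipitation amount.
members = [4.0, 1.0, 3.0, 2.0]

# Equal weighting versus a weighting that strongly favours the first member.
equal = weighted_percentile(members, [1, 1, 1, 1], 0.5)
favoured = weighted_percentile(members, [5, 1, 1, 1], 0.5)
```

With the same members, the two weightings give different medians, which is 
the sensitivity to the choice of errors/weights mentioned above.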

Is this something CF needs to worry about, or is it a case of trying to add 
something that's not really needed yet?  Or maybe this is not in scope for CF 
anyway, and it should be left to something more like 'audit/history/provenance' 
meta data?

Jamie 

> -----Original Message-----
> From: cf-metadata-boun...@cgd.ucar.edu 
> [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Vegard Bønes
> Sent: 15 November 2011 13:15
> To: Jonathan Gregory
> Cc: cf-metadata@cgd.ucar.edu
> Subject: Re: [CF-metadata] standards for probabilities
> 
> Thank you, Jonathan! :)
> 
> So, a bit more concrete, this is option 1:
> 
> float rain_25(time, y, x);
>  rain_25:standard_name = "precipitation_amount";
>  rain_25:cell_methods = "realization: percentile(25)";
> 
> The only problem I see with this is that in the resulting cdm 
> realization is not used anywhere, apart from possibly in cell 
> methods. But maybe this is ok?
> 
> 
> If I understand the second option correctly, this would lead 
> to something like this:
> 
> float precipitation_amount(time, percentile, y, x);  ...
> float percentile(percentile);
>  percentile:units = "1";
>  percentile:standard_name = 
> "cumulative_distribution_function_of_precipitation_amount";
> 
> But what is the purpose of explicitly referring to 
> precipitation_amount in the standard name? Would not 
> cumulative_distribution_function be better? Then the same 
> dimension could be used for other data, such as air_temperature.
> 
> Or, if we want to add something about the nature of the 
> source data for the function, it could be called something 
> like cumulative_distribution_function_due_to_realization?
> 
> 
> I am still a bit uncertain about what is the best, though.
> 
> 
> -- Vegard
> 
> 
> 
> 
> ----- Original Message -----
> From: "Jonathan Gregory" <j.m.greg...@reading.ac.uk>
> To: "Vegard Bønes" <vegard.bo...@met.no>
> Cc: cf-metadata@cgd.ucar.edu
> Sent: 15 November 2011 11:11:52
> Subject: Re: [CF-metadata] standards for probabilities
> 
> Dear Vegard
> 
> > I want to express such things as "25th percentile 
> > precipitation amount" (based on ensemble data), and 
> > probability that air temperature will be within 2.5 degrees 
> > of the forecast. How should I do this? 
> 
> You are right, this case has not yet been dealt with, 
> although the guidelines for construction of standard names 
> foresee that needs like this might arise!
> 
> If the quantity is a precipitation_amount, it's fine to use 
> that standard name. The question is how to record that it is 
> the 25th percentile. Two possible ways to do this would be:
> 
> * To extend the possible syntax of cell_methods so that it 
> can describe percentiles. It is already possible to indicate 
> a median in cell_methods, and that is a particular 
> percentile. The advantage of this way of doing it would be 
> that you would record whether the distribution of 
> precipitation amounts being considered was for 
> time-variation, or spatial variation, or some other kind of 
> variation. Obviously you could have a probability 
> distribution with percentiles for many different independent 
> variables.
> 
> * To use a size-1 or scalar coordinate variable to record the 
> probability, with a new standard_name, perhaps 
> cumulative_distribution_function_of_precipitation_amount.
> The value of this coordinate would be 0.25 for the 25th 
> percentile. The advantage of this method would be that you 
> could have several different percentiles in the same 
> variable, by having a multivalued probability coord.
> If you wanted to be specific about what the independent 
> variable was, that would have to be included in the standard 
> name as well e.g.
> cumulative_distribution_function_of_precipitation_amount_over_time.
> 
> What do you think?
> 
> Cheers
> 
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata@cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> 