Thanks for bringing this up, Lars. I had a similar question a while back 
(http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2015/018489.html) and 
the suggestion was to use flag_values as you have proposed. As Jonathan 
indicated, the use of flag_values is not prohibited but does come with 
some possible issues, particularly with software that automatically 
converts values outside the valid range to NaN or a single missing value 
indicator. Some people use this methodology instead of using the 
missing_value attribute. I don't know of any software that does this 
automatically, but it could be an issue.
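
To make the concern concrete, here is a minimal Python sketch of what a 
reader that is unaware of CF flag_values might do. The function name 
mask_outside_valid_range, the valid range of 1 to 366 and the flag value 
of -1 are all made up for illustration; no particular software package 
is implied.

```python
import math


def mask_outside_valid_range(values, valid_min, valid_max):
    """Replace every value outside [valid_min, valid_max] with NaN."""
    return [v if valid_min <= v <= valid_max else math.nan for v in values]


# Hypothetical day-of-year data: -1 is a flag value meaning "not occurring",
# -9999 is the missing value indicator.
raw = [120.0, -1.0, -9999.0, 300.0]
masked = mask_outside_valid_range(raw, 1.0, 366.0)

# Both special values are now NaN, so the two cases can no longer be told apart.
indistinguishable = math.isnan(masked[1]) and math.isnan(masked[2])
```

After this kind of masking the distinction between "not occurring" and 
"missing" is lost unless it is also recorded somewhere else.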

The proposal I gave was for some existing data that had a history and 
could not be updated, so I was looking for a solution that would work 
without changes. Discussions I had with others at meetings suggested 
using an ancillary state variable. I find that a better solution since 
it separates the data from the metadata and would neither require 
looping over multiple missing values (I don't know if that is supported) 
nor run into the issue of code automatically changing data outside the 
valid range.

My program uses ancillary variables to contain corresponding quality 
control information, and one of the pieces of information is that the 
data value is set to the missing value. This allows for faster quality 
control analysis, since you only need to look at the ancillary variable 
to find missing data instead of looking at both the quality control and 
data variables.
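
A rough Python sketch of that lookup, using only the ancillary variable. 
The function name find_missing and the flag numbers are assumptions for 
illustration (here flags 1, 2 and 3 are taken to mean "set to missing 
value" for different reasons); this is not the code my program uses.

```python
def find_missing(qc_flags, missing_flags):
    """Return the indices whose quality flag says the data value is missing."""
    return [i for i, flag in enumerate(qc_flags) if flag in missing_flags]


# Hypothetical quality control values for four samples of a data variable.
qc = [0, 1, 0, 3]

# Only the ancillary variable is inspected; the data variable is never read.
missing = find_missing(qc, missing_flags={1, 2, 3})
```

The data values themselves are only needed once you want to work with 
the samples that passed.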

My suggestion is to use a single missing value indicator with the data 
variable and then indicate the greater detail of why it is missing in 
the ancillary variable using flag_values. Additional quality information 
could be provided beyond the reason for being set to missing value. You 
can also provide multiple pieces of information (inclusive states) on a 
single value by using flag_masks instead of flag_values.

For example:

cloud_layer_base_height(time, layer): float
     long_name = "Cloud base height of hydrometeor layers" ;
     units = "m" ;
     missing_value = -9999.f ;
     ancillary_variables = "qc_cloud_layer_base_height" ;
qc_cloud_layer_base_height(time, layer): short
     long_name = "Quality information for Cloud base height of hydrometeor layers" ;
     units = "1" ;
     flag_values = 0s, 1s, 2s, 3s, 4s ;
     flag_meanings = "data_available input_value_missing input_data_exists_but_the_computation_did_not_result_in_a_valid_numeric_value value_missing_because_of_birds value_was_computed_but_I_would_not_use_it" ;
     standard_name = "status_flag" ;
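
Attributes like these could be decoded along the following lines. This 
is a sketch in plain Python rather than any netCDF library's API; the 
function names, the shortened meaning strings and the flag_masks bit 
assignments are invented for illustration.

```python
def decode_flag_values(flag, flag_values, flag_meanings):
    """Return the single meaning that matches an exclusive flag value."""
    meanings = flag_meanings.split()
    return meanings[flag_values.index(flag)]


def decode_flag_masks(flag, flag_masks, flag_meanings):
    """Return every meaning whose bit is set (inclusive states)."""
    meanings = flag_meanings.split()
    return [m for mask, m in zip(flag_masks, meanings) if flag & mask]


# flag_values: each sample carries exactly one state.
values = [0, 1, 2, 3, 4]
value_meanings = ("data_available input_value_missing no_valid_result "
                  "birds not_recommended")
single = decode_flag_values(3, values, value_meanings)

# flag_masks: each bit is an independent state, so states can be combined.
masks = [1, 2, 4]
mask_meanings = "input_value_missing birds not_recommended"
combined = decode_flag_masks(6, masks, mask_meanings)
```

With flag_values a sample gets one reason; with flag_masks the value 6 
above reports two reasons at once because bits 2 and 4 are both set.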

I'm curious to hear your thoughts and those of others.

Ken


On 2019-7-19 06:57, Jonathan Gregory wrote:
> Dear Lars
>
> I think that using a flag_value would be a good CF way to do this. I am not
> sure whether it's a good idea to choose a value which is outside the valid
> range. That's not a problem for CF (that is, it's not prohibited), but maybe
> it might not suit some software, which could object if it wasn't aware of CF
> flag_values.
>
> Best wishes
>
> Jonathan
>
>
> ----- Forwarded message from Bärring Lars <lars.barr...@smhi.se> -----
>
>> Date: Fri, 19 Jul 2019 10:20:35 +0000
>> From: Bärring Lars <lars.barr...@smhi.se>
>> To: "cf-metadata@cgd.ucar.edu" <cf-metadata@cgd.ucar.edu>
>> Subject: [CF-metadata] How to encode "not occurring" as distinct from
>>      "missing data"
>>
>> Dear all,
>>
>> We are considering how best to store data produced by some computation where 
>> there has to be a distinction between missing input data (i.e. no input data 
>> available) and "not occurring" (i.e. input data exists but the computation 
>> did not result in a valid numeric value).
>>
>> In practice, the situation is reasonably similar to what was discussed back 
>> in 2017 (in the thread 'Recording "day of year on which something happens"') 
>> where Jim Biard offered a solution 
>> (http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2017/019238.html).
>>
>> We have considered his solution to use flag values outside the valid_range 
>> of the data variable to indicate "no_occurrence". We have also considered 
>> using a separate quality variable with flag values as a mask 
>> (combined with _MissVal in the data variable).
>>
>> In this work the following questions surfaced,
>>
>> -- Is there any experience regarding how 'standard software' would handle 
>> either of these alternatives, is one more generally accepted?
>>
>> -- Is there any experience to guide us regarding which is better, or 
>> generally more "in line with the CF Conventions"?
>>
>> -- Is there another better approach that we have not thought of?
>>
>>
>> Many thanks,
>> Lars
>>
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata@cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
> ----- End forwarded message -----

-- 
Kenneth E. Kehoe
   Research Associate - University of Oklahoma
   Cooperative Institute for Mesoscale Meteorological Studies
   ARM Climate Research Facility - Data Quality Office
   e-mail: kke...@ou.edu | Office: 303-497-4754
