Re: [CF-metadata] New standard names for satellite obs data (time as ISO strings)

2010-10-22 Thread Lowry, Roy K
Hi Jon,

Full ISO8601 does carry time zone expressed in hours relative to UT in the 
syntax Zx where x is the offset from Zulu at the right-hand end of the string. 

Cheers, Roy.

-Original Message-
From: cf-metadata-boun...@cgd.ucar.edu 
[mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Jon Blower
Sent: 21 October 2010 23:28
To: Benno Blumenthal
Cc: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data (time as 
ISO strings)

Hi Benno,

 No one I know beyond the age of four thinks Sep 2009 is ambiguous

Do you mean beyond the age of precisely 4.000... years or beyond the age of 
4.999... years?  Or is the ambiguous temporal metadata concept of the age of 
four sufficient?

;-)

All ISO8601 dates (year, month or day resolution) are inherently ambiguous 
because they carry no time zone information (your precise bounds are likely to 
be 5 hours different from mine, or something more complex if daylight saving is 
involved).  So with ISO8601 alone I don't think there's such a thing as the 
preciseSep2009 in your argument below, unless I've misunderstood what you 
mean, in which case apologies.

Hmm... come to think of it, this might actually argue *against* using ISO8601 
strings alone as indicators of time resolution.  If we really *do* mean that 
data are representative of a 24-hour period starting at midnight UTC on the 
first of September, we can't represent this unambiguously as 2009-09-01 
because of the time zone problem.  (I think that 2009-09-01Z is illegal.)  In 
this case we would be better off representing this period as a nominal value 
plus explicit bounds, or a nominal value plus time zone plus some additional 
information stating that any precision finer than 1 day can be discarded.

*However*, it's still very useful to know the resolution of the time axis by 
some means other than inspecting the coordinate bounds.  An application (e.g. 
automatically generating a time selector widget in a GUI) will probably not 
want to look at all the bounds of all the time coordinates to infer the time 
resolution: apart from being generally tedious, it would be very difficult for 
monthly data (because months are different lengths, and vary between 
calendars).  Additionally, this inference would be complicated if 
floating-point numbers were used to represent time coordinates, since these are 
usually slightly inaccurate (dangerous to compare floats for equality, etc. 
etc.)


So, I'm starting to like the idea of an additional (and optional) CF attribute 
to specify time coordinate resolution.  This could be specified in precise 
numeric terms (e.g. instrument precision of 0.12 ms) or in less-precise human 
calendar terms for certain kinds of data (e.g. P1M for monthly resolution).  
It would be additional to the coordinate bounds: data providers could specify 
one or the other, or both if they are consistent (or neither if appropriate?)

This would not require or preclude the use of ISO8601 strings to represent time 
coordinate values.


Best wishes,
Jon


-Original Message-
From: bennoblument...@gmail.com [mailto:bennoblument...@gmail.com] On Behalf Of 
Benno Blumenthal
Sent: 21 October 2010 21:44
To: Jon Blower
Cc: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data (time as 
ISO strings)

Hi Jon,

Sorry, I am not buying it.

No one I know beyond the age of four thinks Sep 2009 is ambiguous, and
I don't read your examples as needing vagueness on the time
specifically.

Suppose, for a moment, that you succeed beyond your wildest dreams,
and it is possible to express in CF some relationship to a vague
notion of Sep2009, i.e.

data hasADataRelationshipWith vagueSep2009.

I would say there is another relationship

vagueSep2009 isAVagueVersionOf preciseSep2009

And you could have just as easily coded in CF

data hasADataRelationshipWithAVagueVersionOf preciseSep2009

i.e. there is no reason why the vagueness cannot be coded as a
dependent data property.  Which is what CF is currently set up to do,
with a possible extension of the cell_methods vocabulary.

Furthermore, you said

 You *could* modify CF so that to represent data that are representative of 
 September 2010, you specify a nominal date half-way through September and 
 set the bounds to the first and last instants of September.  And perhaps use 
 a new cell_methods of representative.  But the half-way point and the 
 bounds would be quite (very) tedious to compute in the general case (months 
 and years are of variable length for example and depend on the calendar 
 system).


That is not a modification of CF -- that is the way it is currently
encoded in CF (though there is no meaning to the nominal value, so you
can set it to whatever).  And yes, you have to generate the edges,
which you have to do anyway if you are going to sensibly handle
computations with the data.

And let me repeat my main original point, so that it does not get
completely buried

Re: [CF-metadata] New standard names for satellite obs data (time as ISO strings)

2010-10-21 Thread Benno Blumenthal
While expressing precision in CF is an interesting issue, in this case
the Wikipedia quote is using the term in a different sense than I
(hopefully we) usually mean -- ISO8601 lets one express time intervals
succinctly in a single string, e.g. 2010-09 to mean all of September
2010, which is not an accuracy issue, it is a precise specification of
a larger interval.  It lets you write 2010-09-01/10-05 as well, i.e.
it is not limited to intervals that involve special notational
boundaries.   As Steve points out CF expresses this using a bounds
coordinate, i.e. giving the precise edges of each interval.  Of
course, how the data is actually related to that interval is where the
notion of precision might come in, which cell methods/measures
addresses, perhaps inadequately for the purpose at hand.

ISO8601 is quite neat in the sense that it forces one to always
specify an interval, and CF software reading time bounds data and
rendering ISO8601 strings would do us all a lot of good.
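
A minimal sketch of what "reading time bounds and rendering ISO8601 strings"
could look like.  Everything here is an assumption for illustration: the
function name is hypothetical, the bounds are taken to be in "days since
1970-01-01 00:00:00 UTC", and the calendar is the standard Gregorian one.
Real CF files can use other units and calendars, which this does not handle.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference time for "days since 1970-01-01 00:00:00 UTC".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bounds_to_iso_interval(lower_days, upper_days):
    """Render one pair of numeric time bounds as an ISO 8601 start/end interval.

    Hypothetical helper: assumes Gregorian calendar and UTC.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = EPOCH + timedelta(days=lower_days)
    end = EPOCH + timedelta(days=upper_days)
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


# Bounds covering September 2010 (day 14853 is 2010-09-01 in these units).
september_2010 = bounds_to_iso_interval(14853.0, 14883.0)
```

Going the other way (parsing an interval string back into numeric bounds)
would need the same assumptions about units and calendar.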

Benno

On Wed, Oct 20, 2010 at 6:34 PM, Steve Hankin steven.c.han...@noaa.gov wrote:
 Hi Jon,

 Why do you see this as an issue of date-times as ISO strings in particular?
 The same issues of precision are found in longitudes expressed as a
 degrees-minutes-seconds string compared to a floating point.  Or indeed to a
 depth expressed as a decimal string of known numbers of digits.  (100.00
 communicates different precision than 100 though both are represented by the
 same binary value.)

 CF provides the bounds attribute and the cell methods/measures to clarify
 (somewhat) these points.  What is your proposal for improved representation
 of precisions?  And wouldn't a general improvement in how to specify
 coordinate precision be preferable to a solution that applies to time, only?

     - Steve

 =


 On 10/20/2010 9:41 AM, Jon Blower wrote:

 Hi all,

 I haven't followed this debate closely, but I've had cause to do a fair
 amount of thinking (outside the CF context) on the pros and cons of
 identifying date/times as strings or numbers.  I could probably write a
 very boring essay on this but in summary, they are not exactly
 equivalent ways of representing the same information.

 One way in which they are different is precision.  A value of x seconds
 since y has no implied precision - typically in programs we take the
 precision to be milliseconds, but there's nothing to suggest this in the
 actual metadata (anyone who tries to populate a GUI from CF metadata
 struggles with this).  Semantically it's a time instant; i.e. an
 infinitesimal position in a temporal coordinate reference system.
 However, an ISO8601 string can have various precisions.  (The string
 2009-10 is not considered equivalent to 2009-10-01T00:00:00.000Z.)

 From Wikipedia (http://en.wikipedia.org/wiki/ISO_8601):

 For reduced accuracy, any number of values may be dropped from any of
 the date and time representations, but in the order from the least to
 the most significant. For example, 2004-05 is a valid ISO 8601 date,
 which indicates May (the fifth month) 2004. This format will never
 represent the 5th day of an unspecified month in 2004, nor will it
 represent a time-span extending from 2004 into 2005.

 I've argued before in a previous thread on this list that it would be
 good to be able to specify the precision of time coordinates in terms of
 calendar date/time fields (which isn't the same thing as providing a
 tolerance value on the numeric coordinate value of a time axis).

 I'm not saying that we should definitely allow time strings in CF, just
 pointing out that they have some use cases we currently can't fulfil.
 I'm not sure they are definitively bad practice in all cases.

 (Regarding a technical point raised below, yes, it's a pain to represent
 variable length strings in NetCDF, but there is a maximum length for
 ISO8601 strings.)

 Hope this helps,
 Jon

 -Original Message-
 From: cf-metadata-boun...@cgd.ucar.edu
 [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Lowry, Roy K
 Sent: 20 October 2010 10:00
 To: Ben Hetland; cf-metadata@cgd.ucar.edu
 Subject: Re: [CF-metadata] New standard names for satellite obs data

 Dear All,

 As others have said, I think this debate is irrelevant as there should
 be no need for string timestamps in NetCDF. Providing a Standard Name
 only encourages what I consider to be bad practice.

 Cheers, Roy.

 -Original Message-
 From: cf-metadata-boun...@cgd.ucar.edu
 [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Ben Hetland
 Sent: 20 October 2010 09:14
 To: cf-metadata@cgd.ucar.edu
 Subject: Re: [CF-metadata] New standard names for satellite obs data

 On 19.10.2010 16:27, Seth McGinnis wrote:

 What about using 'date' for string-valued times?  That was my homebrew
 solution when I was considering a similar problem.

 If I may butt in and contribute here, I usually prefer names like
 'datetime' or 'timestamp' in cases like this, because 'date' is
 

Re: [CF-metadata] New standard names for satellite obs data (time as ISO strings)

2010-10-21 Thread Jon Blower
Hi Benno,

2010-09 is not necessarily a precise specification of a month - time zones make 
it a little fuzzy for one thing.  Separate to this, there are parallel 
conversations going on in the ISO/OGC community about what time strings 
actually mean.  A metadata person might say that 2010-09 is simply a 
shorthand for the fuzzy concept of September 2010 and does not represent a 
precise interval (i.e. a square-wave function that is 1 during September and 0 
outside).  Apart from the time zone issue which blurs the boundaries, this 
square-wave is simply not what humans mean when, for example, they tag a report 
as having been written in September 2010.  It just distinguishes it from 
version 2 of the report, which was written in November.  In this context, it's 
just a label with some temporal meaning.

These metadata guys are in discussion with the positioning guys who view 
date/times as precisely-defined positions within a temporal CRS.  You may (or 
may not!) like to look at the GeoAPI mailing list, in which we are trying to 
figure out whether we can actually use the same Java types for both of these 
subtly-different views of date/times (we hope we can but haven't agreed).  One 
might think that they are obviously the same thing, but I don't think so.

You *could* modify CF so that to represent data that are representative of 
September 2010, you specify a nominal date half-way through September and set 
the bounds to the first and last instants of September.  And perhaps use a new 
cell_methods of representative.  But the half-way point and the bounds would 
be quite (very) tedious to compute in the general case (months and years are of 
variable length for example and depend on the calendar system).
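
To make the "tedious to compute" point concrete, here is a rough sketch of
the bounds and half-way point for one month.  It assumes the standard
(Gregorian) calendar and UTC, and the function name is hypothetical.  Other
CF calendars (e.g. 360-day or no-leap) would need different logic, which is
exactly the difficulty.

```python
from datetime import datetime, timezone


def month_bounds_utc(year, month):
    """Return (start, midpoint, end) of a Gregorian month in UTC.

    `end` is the first instant of the following month, so the interval
    is half-open: start <= t < end.
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    midpoint = start + (end - start) / 2
    return start, midpoint, end


start, midpoint, end = month_bounds_utc(2010, 9)
```

For September 2010 (30 days) the midpoint falls on midnight of the 16th; for
a 29-day February it falls at noon on the 15th, so the nominal value is not
even on a consistent time of day from month to month.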

 Of course, how the data is actually related to that interval is where the
 notion of precision might come in

Actually, you've probably gathered that I also consider the notion of precision 
to apply to the interval itself, not just how the data relates to it.

This discussion repeats a bit of the previous discussion on this list entitled 
bounds/precision for time axis.  I like Jonathan's distinction between the 
concepts of temporal resolution and representivity: 
http://www.mail-archive.com/cf-metadata@cgd.ucar.edu/msg01341.html.

And just for completeness we should note that ISO8601 strings are not 
fixed-length, nor do they have a maximum length (in contrast to what I said 
before, sorry).  So I can see some implementation challenges in NetCDF.

Cheers, Jon


-Original Message-
From: bennoblument...@gmail.com [mailto:bennoblument...@gmail.com] On Behalf Of 
Benno Blumenthal
Sent: 21 October 2010 15:43
To: Steve Hankin
Cc: Jon Blower; cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data (time as 
ISO strings)

While expressing precision in CF is an interesting issue, in this case
the Wikipedia quote is using the term in a different sense than I
(hopefully we) usually mean -- ISO8601 lets one express time intervals
succinctly in a single string, e.g. 2010-09 to mean all of September
2010, which is not an accuracy issue, it is a precise specification of
a larger interval.  It lets you write 2010-09-01/10-05 as well, i.e.
it is not limited to intervals that involve special notational
boundaries.   As Steve points out CF expresses this using a bounds
coordinate, i.e. giving the precise edges of each interval.  Of
course, how the data is actually related to that interval is where the
notion of precision might come in, which cell methods/measures
addresses, perhaps inadequately for the purpose at hand.

ISO8601 is quite neat in the sense that it forces one to always
specify an interval, and CF software reading time bounds data and
rendering ISO8601 strings would do us all a lot of good.

Benno

On Wed, Oct 20, 2010 at 6:34 PM, Steve Hankin steven.c.han...@noaa.gov wrote:
 Hi Jon,

 Why do you see this as an issue of date-times as ISO strings in particular?
 The same issues of precision are found in longitudes expressed as a
 degrees-minutes-seconds string compared to a floating point.  Or indeed to a
 depth expressed as a decimal string of known numbers of digits.  (100.00
 communicates different precision than 100 though both are represented by the
 same binary value.)

 CF provides the bounds attribute and the cell methods/measures to clarify
 (somewhat) these points.  What is your proposal for improved representation
 of precisions?  And wouldn't a general improvement in how to specify
 coordinate precision be preferable to a solution that applies to time, only?

     - Steve

 =


 On 10/20/2010 9:41 AM, Jon Blower wrote:

 Hi all,

 I haven't followed this debate closely, but I've had cause to do a fair
 amount of thinking (outside the CF context) on the pros and cons of
 identifying date/times as strings or numbers.  I could probably write a
 very boring essay on this but in summary, they are not exactly
 equivalent ways

Re: [CF-metadata] New standard names for satellite obs data (time as ISO strings)

2010-10-21 Thread Jon Blower
Hi Benno,

 No one I know beyond the age of four thinks Sep 2009 is ambiguous

Do you mean beyond the age of precisely 4.000... years or beyond the age of 
4.999... years?  Or is the ambiguous temporal metadata concept of the age of 
four sufficient?

;-)

All ISO8601 dates (year, month or day resolution) are inherently ambiguous 
because they carry no time zone information (your precise bounds are likely to 
be 5 hours different from mine, or something more complex if daylight saving is 
involved).  So with ISO8601 alone I don't think there's such a thing as the 
preciseSep2009 in your argument below, unless I've misunderstood what you 
mean, in which case apologies.

Hmm... come to think of it, this might actually argue *against* using ISO8601 
strings alone as indicators of time resolution.  If we really *do* mean that 
data are representative of a 24-hour period starting at midnight UTC on the 
first of September, we can't represent this unambiguously as 2009-09-01 
because of the time zone problem.  (I think that 2009-09-01Z is illegal.)  In 
this case we would be better off representing this period as a nominal value 
plus explicit bounds, or a nominal value plus time zone plus some additional 
information stating that any precision finer than 1 day can be discarded.

*However*, it's still very useful to know the resolution of the time axis by 
some means other than inspecting the coordinate bounds.  An application (e.g. 
automatically generating a time selector widget in a GUI) will probably not 
want to look at all the bounds of all the time coordinates to infer the time 
resolution: apart from being generally tedious, it would be very difficult for 
monthly data (because months are different lengths, and vary between 
calendars).  Additionally, this inference would be complicated if 
floating-point numbers were used to represent time coordinates, since these are 
usually slightly inaccurate (dangerous to compare floats for equality, etc. 
etc.)
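
A small illustration of the floating-point problem, as a sketch only: the
coordinate values and the helper name are made up for the example, and the
tolerance is an arbitrary choice.

```python
import math


def is_uniform(times, rel_tol=1e-9):
    """Check whether consecutive time steps are equal within a tolerance."""
    steps = [b - a for a, b in zip(times, times[1:])]
    return all(math.isclose(s, steps[0], rel_tol=rel_tol) for s in steps)


# Hypothetical time axis: nominally one value every 0.1 days.
times = [i * 0.1 for i in range(5)]
steps = [b - a for a, b in zip(times, times[1:])]

# Exact comparison fails because the steps differ in the last few bits.
exactly_equal = len(set(steps)) == 1

# Comparison within a tolerance succeeds.
uniform = is_uniform(times)
```

Even with a tolerance, this only recovers a fixed numeric step; it does not
help with monthly data, where the steps genuinely differ in length.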


So, I'm starting to like the idea of an additional (and optional) CF attribute 
to specify time coordinate resolution.  This could be specified in precise 
numeric terms (e.g. instrument precision of 0.12 ms) or in less-precise human 
calendar terms for certain kinds of data (e.g. P1M for monthly resolution).  
It would be additional to the coordinate bounds: data providers could specify 
one or the other, or both if they are consistent (or neither if appropriate?)

This would not require or preclude the use of ISO8601 strings to represent time 
coordinate values.


Best wishes,
Jon


-Original Message-
From: bennoblument...@gmail.com [mailto:bennoblument...@gmail.com] On Behalf Of 
Benno Blumenthal
Sent: 21 October 2010 21:44
To: Jon Blower
Cc: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data (time as 
ISO strings)

Hi Jon,

Sorry, I am not buying it.

No one I know beyond the age of four thinks Sep 2009 is ambiguous, and
I don't read your examples as needing vagueness on the time
specifically.

Suppose, for a moment, that you succeed beyond your wildest dreams,
and it is possible to express in CF some relationship to a vague
notion of Sep2009, i.e.

data hasADataRelationshipWith vagueSep2009.

I would say there is another relationship

vagueSep2009 isAVagueVersionOf preciseSep2009

And you could have just as easily coded in CF

data hasADataRelationshipWithAVagueVersionOf preciseSep2009

i.e. there is no reason why the vagueness cannot be coded as a
dependent data property.  Which is what CF is currently set up to do,
with a possible extension of the cell_methods vocabulary.

Furthermore, you said

 You *could* modify CF so that to represent data that are representative of 
 September 2010, you specify a nominal date half-way through September and 
 set the bounds to the first and last instants of September.  And perhaps use 
 a new cell_methods of representative.  But the half-way point and the 
 bounds would be quite (very) tedious to compute in the general case (months 
 and years are of variable length for example and depend on the calendar 
 system).


That is not a modification of CF -- that is the way it is currently
encoded in CF (though there is no meaning to the nominal value, so you
can set it to whatever).  And yes, you have to generate the edges,
which you have to do anyway if you are going to sensibly handle
computations with the data.

And let me repeat my main original point, so that it does not get
completely buried  -- CF software really needs to render time bounds
as ISO8601 conveniently and universally (both directions seems to be
essential, i.e. reading and writing), so that the CF convention can be
easily used in this regard.

Sorry I couldn't be more helpful,

Benno


On Thu, Oct 21, 2010 at 11:57 AM, Jon Blower j.d.blo...@reading.ac.uk wrote:
 Hi Benno,

 2010-09 is not necessarily a precise specification of a month - time zones 
 make it a little fuzzy for one thing.  Separate

Re: [CF-metadata] New standard names for satellite obs data (time as ISO strings)

2010-10-20 Thread Steve Hankin

 Hi Jon,

Why do you see this as an issue of date-times as ISO strings in 
particular?  The same issues of precision are found in longitudes 
expressed as a degrees-minutes-seconds string compared to a floating 
point.  Or indeed to a depth expressed as a decimal string of known 
numbers of digits.  (100.00 communicates different precision than 
100 though both are represented by the same binary value.)
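
The decimal-string point can be checked directly.  This is only a sketch
using Python's standard decimal module; the helper name is hypothetical.

```python
from decimal import Decimal


def decimal_places(text):
    """Number of digits written after the decimal point in a decimal string."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)


# As binary floating-point values the two strings are indistinguishable...
same_binary_value = float("100.00") == float("100")

# ...but the strings themselves carry different numbers of decimal places.
places_a = decimal_places("100.00")
places_b = decimal_places("100")
```

Once the value is stored as a float, the information about how many digits
were written is gone, which is why it would have to travel as separate
metadata.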


CF provides the bounds attribute and the cell methods/measures to 
clarify (somewhat) these points.  What is your proposal for improved 
representation of precisions?  And wouldn't a general improvement in how 
to specify coordinate precision be preferable to a solution that applies 
to time, only?


- Steve

=


On 10/20/2010 9:41 AM, Jon Blower wrote:

Hi all,

I haven't followed this debate closely, but I've had cause to do a fair
amount of thinking (outside the CF context) on the pros and cons of
identifying date/times as strings or numbers.  I could probably write a
very boring essay on this but in summary, they are not exactly
equivalent ways of representing the same information.

One way in which they are different is precision.  A value of x seconds
since y has no implied precision - typically in programs we take the
precision to be milliseconds, but there's nothing to suggest this in the
actual metadata (anyone who tries to populate a GUI from CF metadata
struggles with this).  Semantically it's a time instant; i.e. an
infinitesimal position in a temporal coordinate reference system.
However, an ISO8601 string can have various precisions.  (The string
2009-10 is not considered equivalent to 2009-10-01T00:00:00.000Z.)
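
As a rough sketch of how software might read the precision off an ISO8601
string: the pattern table below is an assumption for illustration and covers
only a few extended-format calendar forms (no week dates, ordinal dates or
basic format), and the function name is hypothetical.

```python
import re

_ZONE = r"(Z|[+-]\d{2}:\d{2})?"

# Ordered from finest to coarsest; illustrative, not a complete ISO 8601 parser.
_PATTERNS = [
    ("subsecond", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[.,]\d+" + _ZONE)),
    ("second", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" + _ZONE)),
    ("minute", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}" + _ZONE)),
    ("day", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("month", re.compile(r"\d{4}-\d{2}")),
    ("year", re.compile(r"\d{4}")),
]


def iso_precision(text):
    """Return the name of the finest field present in an ISO 8601 string."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(text):
            return name
    raise ValueError(f"unrecognised ISO 8601 form: {text!r}")


month_level = iso_precision("2009-10")
instant_level = iso_precision("2009-10-01T00:00:00.000Z")
```

The two strings parse to the same starting instant but report different
precisions, which is the information a numeric "x seconds since y" value
cannot carry on its own.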

 From Wikipedia (http://en.wikipedia.org/wiki/ISO_8601):

For reduced accuracy, any number of values may be dropped from any of
the date and time representations, but in the order from the least to
the most significant. For example, 2004-05 is a valid ISO 8601 date,
which indicates May (the fifth month) 2004. This format will never
represent the 5th day of an unspecified month in 2004, nor will it
represent a time-span extending from 2004 into 2005.

I've argued before in a previous thread on this list that it would be
good to be able to specify the precision of time coordinates in terms of
calendar date/time fields (which isn't the same thing as providing a
tolerance value on the numeric coordinate value of a time axis).

I'm not saying that we should definitely allow time strings in CF, just
pointing out that they have some use cases we currently can't fulfil.
I'm not sure they are definitively bad practice in all cases.

(Regarding a technical point raised below, yes, it's a pain to represent
variable length strings in NetCDF, but there is a maximum length for
ISO8601 strings.)

Hope this helps,
Jon

-Original Message-
From: cf-metadata-boun...@cgd.ucar.edu
[mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Lowry, Roy K
Sent: 20 October 2010 10:00
To: Ben Hetland; cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data

Dear All,

As others have said, I think this debate is irrelevant as there should
be no need for string timestamps in NetCDF. Providing a Standard Name
only encourages what I consider to be bad practice.

Cheers, Roy.

-Original Message-
From: cf-metadata-boun...@cgd.ucar.edu
[mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Ben Hetland
Sent: 20 October 2010 09:14
To: cf-metadata@cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data

On 19.10.2010 16:27, Seth McGinnis wrote:

What about using 'date' for string-valued times?  That was my homebrew
solution when I was considering a similar problem.

If I may butt in and contribute here, I usually prefer names like
'datetime' or 'timestamp' in cases like this, because 'date' is
potentially confusing. It may not be immediately obvious to a future
reader (or programmer) that a variable called 'date' supports points in
time down to for example seconds of accuracy.



(Note that string data is a big pain to deal with in NetCDF-3, because
you're limited to fixed-length character arrays.  You need to use
NetCDF-4 / HDF5 to get Strings as a data type.)

(It may not be such a practical issue with ISO 8601 strings, as a
reasonable max. length can be determined, I presume.)
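
A sketch of the fixed-length workaround, in plain Python rather than any
particular NetCDF library.  The helper names are hypothetical, and the width
is chosen from the data at hand; as noted elsewhere in this thread, ISO 8601
strings do not strictly have a maximum length, so any fixed width is an
assumption about the forms actually in use.

```python
def to_fixed_width(strings, width=None):
    """Encode strings as NUL-padded byte rows of equal length."""
    encoded = [s.encode("ascii") for s in strings]
    if width is None:
        width = max(len(e) for e in encoded)
    for e in encoded:
        if len(e) > width:
            raise ValueError(f"{e!r} does not fit in {width} characters")
    return [e.ljust(width, b"\x00") for e in encoded], width


def from_fixed_width(rows):
    """Strip the NUL padding and decode back to text."""
    return [r.rstrip(b"\x00").decode("ascii") for r in rows]


timestamps = ["2009-10", "2009-10-01T00:00:00Z"]
rows, width = to_fixed_width(timestamps)
round_trip = from_fixed_width(rows)
```

Each row could then be stored along a character dimension of length `width`.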

___
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata