Re: TAP datetime (was Re: Current state of TAP::Diagnostics)

A. Pagaltzis Thu, 06 Sep 2007 13:30:37 -0700

* Michael G Schwern <[EMAIL PROTECTED]> [2007-09-04 23:35]:
> A. Pagaltzis wrote:
> > Actually ISO 8601 gives many more options than RFC 3339,
> > which is why the latter was written in the first place. See
> > 5.3 (“Rarely Used Options”) in RFC 3339.
> 
> That's why I'm inclined to go with one based on ISO 8601 rather
> than just RFC 3339. More possibilities for expression.


Yes, and therefore more possibilities for confusion, because most
people write against sample data from a running system and never
look at specs. So minimising variability maximises
interoperability. That’s what RFC 3339 section 5.3 talks about.

> If they're really not useful and just complicate matters I'm
> quite open to being convinced otherwise.

OK; the goal here, I think, is to make TAP as simple as possible
to generate as well as consume, without limiting expressiveness
unnecessarily.

As far as I can see, there is no appreciable difference in the
complexity of generating datetimes in either format.

There is quite a bit of difference on the consumption end.

Timezone information in RFC 3339 is numeric only. This obviates
the need for a long mapping table in parser code, and makes a
coarse validation of the datetime very easy – even at a human
glance.

RFC 3339 also does not permit omitting the day-of-month. Only
fractions of a second are optional (the seconds themselves are
required), so the only optional part is always at the end of the
string, before the timezone. Thus, datetimes with the same
timezone and the same precision can be sorted
correctly with just a string compare. And accounting for the
timezone prior to sorting isn’t hard since it’s just a numeric
offset anyway.
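
To illustrate the sorting point, here is a minimal sketch in
Python. The timestamps are made-up sample data, and I am assuming
Python 3.7 or later for datetime.fromisoformat, which accepts
this numeric-offset form. It is only one way to do it, not a
proposal for how a TAP consumer must work.

```python
from datetime import datetime

# Hypothetical sample timestamps, all with the same offset and the
# same precision (whole seconds).
same_zone = [
    "2007-09-06T13:30:37-07:00",
    "2007-09-04T23:35:00-07:00",
    "2007-09-06T09:05:12-07:00",
]

# Same offset: a plain string sort is already chronological.
sorted_same_zone = sorted(same_zone)

# Hypothetical timestamps with differing numeric offsets.
mixed = [
    "2007-09-06T13:30:37-07:00",  # 20:30:37 UTC
    "2007-09-06T22:00:00+02:00",  # 20:00:00 UTC
    "2007-09-06T19:59:59+00:00",  # 19:59:59 UTC
]

# Differing offsets: parse each string and compare the instants.
# Timezone-aware datetime objects compare by their UTC instant.
sorted_mixed = sorted(mixed, key=datetime.fromisoformat)
```

With mixed offsets the plain string sort and the instant-based
sort disagree, which is why the offset has to be applied first;
but that is a subtraction, not a table lookup.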

ISO 8601, conversely, has a number of ways to express a single
date. You can base dates off the week number instead of the
month, or write them with day-of-year with no middle component
at all. As far as I can tell, this mainly makes it easier to
generate dates when dealing with long-duration processes.
Comparing or sorting such datetimes *requires* date math.
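
As a concrete illustration, the following Python sketch writes
one calendar date in the three ISO 8601 forms mentioned above.
The sample date is arbitrary, and date.fromisocalendar assumes
Python 3.8 or later.

```python
from datetime import date, datetime

d = date(2007, 9, 6)  # arbitrary sample date

# Calendar date: year, month, day-of-month.
calendar_form = d.strftime("%Y-%m-%d")

# Week date: ISO year, week number, weekday (Monday is 1).
iso_year, iso_week, iso_weekday = d.isocalendar()
week_form = f"{iso_year}-W{iso_week:02d}-{iso_weekday}"

# Ordinal date: year and day-of-year, no middle component.
ordinal_form = d.strftime("%Y-%j")

# Comparing the forms as strings is meaningless; each one has to
# be converted back to a calendar date first.
from_week = date.fromisocalendar(iso_year, iso_week, iso_weekday)
from_ordinal = datetime.strptime(ordinal_form, "%Y-%j").date()
```

Here calendar_form is "2007-09-06", week_form is "2007-W36-4" and
ordinal_form is "2007-249". All three denote the same day, but
only a calendar-aware conversion shows that.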

The question is, do we need that flexibility? As far as I can
tell, datetimes in TAP would almost always denote instants in
time, not durations nor long-duration recurring events, and it
will always be easy to come up with the current month and
day-in-month. When timespans have to be expressed, it seems to me
they will be such that expressing them as a starting instant in
time and an ending instant will be perfectly adequate.

So it seems to me we’ll do the world a favour if we don’t force
every future consumer of TAP to deal with the whole of ISO 8601,
when (I dare predict) almost everyone will use no more than an
RFC 3339-ish subset of the format anyway. 

RFC 3339 datetimes are so simple that you can write a regex off
the top of your head and it’ll parse them correctly on the first
try. Sorting is so easy you can write a few more lines of code
and it too will work correctly on the first try. In contrast,
ISO 8601 datetimes require a real parser and a date math library.
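
For what it is worth, this is roughly the kind of regex I have in
mind, sketched in Python. The names RFC3339_RE and
looks_like_rfc3339 are made up for this example. It is a coarse
shape check only: it requires seconds, allows an optional
fraction, and accepts "Z" or a numeric offset, but it does not
range-check the fields (month 13 would pass).

```python
import re

# Coarse shape check for an RFC 3339 date-time; no range checks.
RFC3339_RE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"    # year-month-day
    r"[Tt]([0-9]{2}):([0-9]{2}):([0-9]{2})"  # hour:minute:second
    r"(\.[0-9]+)?"                          # optional fraction
    r"([Zz]|[+-][0-9]{2}:[0-9]{2})"         # Z or numeric offset
)

def looks_like_rfc3339(text):
    """Return True if text has the overall shape of an RFC 3339 datetime."""
    return RFC3339_RE.fullmatch(text) is not None
```

A string such as "2007-09-06T13:30:37-07:00" passes, while an ISO
8601 week date like "2007-W36-4" or a timestamp with a named zone
like "2007-09-06 13:30 PDT" does not.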

> What specifically turned me off to RFC 3339 is that all dates
> had to be in the current era. That works for a communications
> protocol where the date is describing current events (like "I
> sent this message at"), but not so much for a data interchange
> format.

How? What RFC 3339 means by current era is that the year in the
date cannot exceed 9999AD (nor go lower than 0000AD). Do you
foresee any problems with that? Will generated TAP have to denote
events in 10000AD or beyond, or events farther in the past than
0000AD?

It should be noted that as far as I can tell from any ISO 8601
docs I can find, it does not permit more than 4 digits for the
year anyway. So the two standards don’t in fact seem to differ
on this point.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
