mktime does not specify EINVAL and should

Geoff Clare via austin-group-l at The Open Group Tue, 22 Nov 2022 04:49:57 -0800

Having returned refreshed from my break, I have re-examined this issue
and I now have a clear understanding of why the C standard allows
mktime() to return -1 for times in the gap but POSIX does not.
Previously I had been fixated on the tm_isdst text that is a
non-normative footnote in C but normative text in POSIX, but that is
a red herring.  The key to the real reason is the wording of the
RETURN VALUE ("Returns" in C) section. But I'm getting ahead of
myself; there are other parts of kre's latest email that deserve a
response too...


Robert Elz wrote, on 11 Nov 2022:

>     From:        "Geoff Clare via austin-group-l at The Open Group" 
> <austin-group-l@opengroup.org>
> 
> You seem to be of the opinion that mktime()'s prime purpose is to allow
> people to increment time fields, and get a time_t back.   Almost as if
> that is its only use.

Of course I am not of that opinion. I have just been mentioning that type
of usage of mktime() a lot in this thread because it is a use case that
would break if mktime() returns -1 for times in the gap.

>   | However, having previously said I would not object to adding an
>   | EINVAL error for timezone changes, I have now gone to the trouble
>   | of testing your Singapore example.  Every implementation I tested
>   | does not return -1.  (The same ones I tested the DST gap on, except
>   | for HP-UX as the version I have is so old it does not have the
>   | Area/Location timezone database).  So I am now more dubious of
>   | whether Issue 8 should allow this error.
>   |
>   | I modified the existing example code to have:
>   |
>   | /* 1981-12-31 23:40:00 */
>   | t.tm_year = 1981 - 1900; t.tm_mon = 11; t.tm_mday = 31;
>   | t.tm_hour = 23; t.tm_min = 40; t.tm_sec = 0;
>   |
>   | and ran it with TZ=:Asia/Singapore
> 
> Why did you pick that particular date/time ?

Because when the change happened, 1981-12-31 23:30:00 in the old time
zone became 1982-01-01 00:00:00 in the new timezone.  See this press
release from the Singapore government announcing the change:

https://corporate.nas.gov.sg/media/collections-and-research/time-zone-adjustment

(The relevant text is in an image, so I can't copy and paste it.)

>   | I also modified the code to print the mktime() return value.  They
>   | all returned either 378663000 or 378661200
> 
> What implementation returned the latter?

Solaris 11.

> As best I can tell, 378663000
> is the only correct (by anyone's standard) answer to that example.

Apparently Solaris has the correct transition time and some other
implementations have it happening half an hour later than it actually
did.
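
As a cross-check on the two return values, here is a minimal sketch of the arithmetic. It assumes Singapore's offsets were UTC+07:30 before the change and UTC+08:00 after it, and it ignores leap seconds. The helper names days_from_civil and epoch_at_offset are made up for this illustration and are not part of any implementation discussed here.

```c
#include <stdint.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date
   (the usual "days from civil" arithmetic; m is 1..12, d is 1..31). */
int64_t days_from_civil(int64_t y, int m, int d)
{
    y -= (m <= 2);                               /* treat Jan/Feb as part of the previous year */
    int64_t era = (y >= 0 ? y : y - 399) / 400;  /* 400-year cycle number */
    int64_t yoe = y - era * 400;                 /* year of era, 0..399 */
    int64_t doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Seconds since the Epoch for a wall-clock reading taken at a fixed UTC
   offset (utc_offset is in seconds east of Greenwich). */
int64_t epoch_at_offset(int y, int m, int d, int hh, int mm, int ss,
                        long utc_offset)
{
    return days_from_civil(y, m, d) * 86400
           + hh * 3600L + mm * 60L + ss - utc_offset;
}
```

Under those assumptions, epoch_at_offset(1981, 12, 31, 23, 40, 0, 27000) gives 378663000 and the same call with an offset of 28800 gives 378661200, so the two reported values are the same wall-clock time read at the old and the new offset respectively.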

>   | Does NetBSD return -1?
> 
> To that example, no, of course not.    But if I set it as:
> 
>   | /* 1982-01-01 00:20:00 */
>   | t.tm_year = 1982 - 1900; t.tm_mon = 0; t.tm_mday = 1;
>   | t.tm_hour = 0; t.tm_min = 20; t.tm_sec = 0;
> 
> then yes, it does.   The thirty minute jump forward in Singapore went
> from 1981-12-31 23:59:59 to 1982-01-01 00:30:00

No it didn't.

> Or at least that's what all the research that's been done leads us
> to believe, and that's what Asia/Singapore (with or without that
> leading ':') should encode.

Clearly either the research or the conclusions from it were flawed, if
they disagree with the information I linked to above, which is from
"the horse's mouth".  Wikipedia also has the right info (but perhaps
it has been corrected since the flawed research you mention was done).

Anyway, I have now tried 1982-01-01 00:20:00 to see if I could get -1
from any of the other systems I used before.

Glibc produced 378663600 and "Thu Dec 31 23:50:00 1981", so it is the
same as Solaris except for the change being at the wrong time.

FreeBSD and MacOS returned -1.

In an early draft of this email I was going to say at this point that
I would revert to my previous position on this one and say I'm okay
with adding an EINVAL error for these kinds of timezone change.
However, I have since realised that this would create a conflict with
the C standard, which I will explain below.

> To return briefly to the earlier point - notice that is a transition
> where all of year, month, mday, hour, and minute altered, if starting
> from a time before the transition and going to one after it, that would
> be observed (ie: one had started at 1981-12-01 00:20:00, and did either
> tm_mon++ or tm_mday += 31, and then ran a mktime() that behaves the way
> you believe it should, you'd see changes to all of those fields in the
> resulting struct tm.

Doesn't matter. The fact that tm_hour and tm_min changed is enough to
detect that a transition occurred.  Additional changes are irrelevant.

> Consider Pacific/Kwajalein (and no, I have no idea where that is, though
> I assume a google search would reveal something) where the local time in
> 1993 advanced
> 
>       1993 Aug 20 23:59:59 -> 1993 Aug 22 00:00:00   (one second later).
> 
> August 21 simply never existed.   If you stepped forward a month from
> July 21 (or 31 days - another fortunate case where the two months both
> have the same number of days) you'd end up at the same time of day,
> but on Aug 22 instead.   The month changed (but you were expecting and
> wanting that) but the day of month altered as well (unexpected).
> Time of day would not be affected at all.

Okay, so there really are cases where checking just tm_hour and tm_min
is not enough. Good to know.

> It isn't even all that implausible to imagine a small island nation which
> has a strong relationship with parts of North America, wanting to be using
> the same dates (same working days) as apply there, during the period when
> those activities take place (say winter - maybe tourists escaping the cold)
> but in summer, that trade drops to almost nothing (too hot, too humid, too
> many monsoons) but there's another major relationship with (pick one:
> China, Japan, South Korea, Taiwan) and it is beneficial to be working the
> same days of the week as they are.  In circumstances like those one could
> imagine something like the final Sunday (or Saturday) of March no longer
> existing, in any future year, but the second Sunday (or...) of every September
> gets run twice.   That's just summer time with a 24 hour offset shift.
> Easily encoded as a POSIX TZ string even.
> 
> Code should work when that happens - as while it seems unlikely, or even
> inconceivable right now, economics (or perhaps the pockets of the politicians)
> can generate some very strange results sometimes.

Adding a check of tm_mday would catch all the cases you've identified
so far. Can you think of one where checking tm_mday, tm_hour, and tm_min
would not suffice?
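
A minimal sketch of the kind of check meant here, assuming the caller has already put in-range values in the struct tm (so that any change on return comes from the timezone rules rather than from ordinary normalisation). The name wall_clock_shifted is hypothetical.

```c
#include <time.h>

/* Hypothetical helper: normalise *want with mktime() and report whether the
   wall-clock fields came back different from what was requested.
   Returns 1 if tm_mday, tm_hour or tm_min changed and 0 if they did not;
   the mktime() result is stored in *out.
   Assumption: the requested fields were already in range. */
int wall_clock_shifted(struct tm *want, time_t *out)
{
    int mday = want->tm_mday;
    int hour = want->tm_hour;
    int min  = want->tm_min;

    want->tm_isdst = -1;          /* let the implementation decide on DST */
    *out = mktime(want);

    return want->tm_mday != mday
        || want->tm_hour != hour
        || want->tm_min  != min;
}
```

On an implementation that returns (time_t)-1 for times in the gap, the struct tm may come back unchanged, so a caller would also need to look at the value stored in *out.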

>   | > Not "will not fit", that is not what the text says.   "Cannot be
>   | > represented in" is what it actually says.
>   |
>   | For integer values they mean the same thing.
> 
> I disagree.   Though not so much to the "for integer values" part, but
> to the belief that there is an integer value in the error case.   That's
> the issue, there isn't.
> 
> To return to the sqrt() example for a minute, the arg is a float/double
> (promotion to double applies) the result is a double.   There is no (within
> the range/precision of the type) double value that cannot be represented
> in a variable which is type double.
> 
> But when I do sqrt(-2.0) what happens?   There is an answer, but it is
> not representable as a double (the result type of sqrt()).   In some floating
> formats an indication of the answer can be stored in the double as a NaN - but
> not in all.   In those this answer, and certainly not the complex answer,
> simply cannot be represented.
> 
> The issue we have is an exact analogy of that - the data passed in is
> not within the domain of the function (which is the representation of local
> times in some local timezone) there is no integer which represents the
> answer.   There is no representation of the local time (which in at least
> many cases, needs no normalisation at all - it isn't adjusted, all values
> are within the appropriate ranges (even tm_mday within the limits for
> the number of days selected by tm_mon and tm_year).

This would be a valid analogy if the standard allowed mktime() to
return -1 when the data passed in is not within the domain of the
function, but it doesn't. It says:

    If the time since the Epoch cannot be represented, the function
    shall return the value (time_t)-1

This text refers to the "time since the Epoch", which is an integer
value (the number of seconds).  It does NOT refer to the broken-down
time that is passed to mktime().

Thank you for bringing my attention to this, because now that I realise
the full implications of this wording, I need to retract my earlier
statement that the general error number rules allow mktime() to return
an EINVAL error for such things as an invalid TZ value.  This would be
true if the RETURN VALUE section was worded in the typical way as
"Upon successful completion [...] Otherwise, mktime() shall return
(time_t)-1 and [...]", but it is not worded that way.  It specifically
describes the (time_t)-1 return as meaning the (integer) time since
the Epoch cannot be represented in a time_t.

So, according to the RETURN VALUE section:

If mktime() returns any value other than (time_t)-1, it is a
time since the Epoch (number of seconds).

If mktime() returns (time_t)-1, it means that the time since the
Epoch (number of seconds) to be returned cannot be represented in
a time_t.

No other meaning for the return value is possible, so mktime() has no
way of indicating that any kind of error other than a "time_t overflow"
occurred.  Setting errno to EINVAL would contradict the meaning of the
(time_t)-1 return value.
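
Since (time_t)-1 is also the time since the Epoch for one second before the Epoch, a caller needs some way to tell the two apart. One widely used idiom, sketched below, is to plant a sentinel in tm_wday before the call. It relies on the assumption that a failing mktime() leaves tm_wday untouched, which is not stated in the text quoted in this thread. The name checked_mktime is hypothetical.

```c
#include <time.h>

/* Hypothetical wrapper: returns 0 and stores the result in *out on success;
   returns -1 if mktime() appears to have failed.
   Assumption: on failure mktime() does not modify tm_wday. */
int checked_mktime(struct tm *tm, time_t *out)
{
    tm->tm_wday = -1;             /* sentinel: a successful call sets 0..6 */
    *out = mktime(tm);

    if (*out == (time_t)-1 && tm->tm_wday == -1)
        return -1;                /* failure; POSIX specifies EOVERFLOW here */

    return 0;                     /* success, even if *out == (time_t)-1 */
}
```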

NetBSD returning (time_t)-1 for times in the gap clearly does not
conform to this RETURN VALUE wording.  The (time_t)-1 return means the
calculated integer value to be returned is not representable in a
time_t, but NetBSD's mktime() did not get as far as calculating an
integer value to be able to decide whether or not it is representable
in a time_t.

> I suspect that the language in
> POSIX was copied (as close to word for word as made sense) from the
> language in the C standard, perhaps without a lot of thought as to
> why the C standard used that particular phrasing for the error case.
> My impression is that the C standard gets much more scrutiny, and so
> a lot more care that the exact right words are chosen, than applies
> here.   When they chose "cannot be represented" rather than something
> else "does not fit" wouldn't come close to acceptable I'd guess,
> but "outside the range of values that can be held in a time_t" might.
> I suspect they did it for a reason.  That some times that can be
> encoded (validly) in a struct tm without representing any existing local
> time is not some new revelation - that's been common knowledge for
> longer than I have any history to guide me.   Almost certainly someone
> dealing with this would have known that.
> 
>   | And by saying that, you have demonstrated that you are indeed
>   | misinterpreting it.
> 
> Since the words are from the C standards group, and since they have
> stated that an error can be generated because of an invalid, or
> ambiguous, struct tm, together with this being the only kind of
> error return they offer, I suspect it is you that is misinterpreting it.

Okay, let's examine the text in C89/C90:

    The mktime function converts the broken-down time, expressed as
    local time, in the structure pointed to by timeptr into a calendar
    time value with the same encoding as that of the values returned
    by the time function.
    [...]

    Returns
    The mktime function returns the specified calendar time encoded as
    a value of type time_t. If the calendar time cannot be represented,
    the function returns the value (time_t)-1.

(In C99 and C17 it is the same except for additional parentheses
around "-1").

This wording is almost identical to POSIX, except for "shallification",
the use of "time since the Epoch" in POSIX instead of "calendar time" in
C99, and the POSIX requirement to set errno.

The same analysis I gave above for POSIX applies: the "calendar time" is
a numeric (time_t) value.  The C standard says that (time_t)-1 is returned
when the calendar time to be returned cannot be represented in a time_t.
Not that the broken-down time can't be represented, only the (numeric)
calendar time.

However, there is a big difference in the requirements that arise from
these almost identical wordings, and that is because local time and DST
are implementation-defined in C, but in POSIX they are not.

In order for a non-POSIX implementation of mktime() to return (time_t)-1
for a time in the gap, all it has to do is define local time and DST in
such a way that times in the gap are converted to a value that cannot be
represented in a time_t.  For example, it could say they are converted
to UINT64_MAX if time_t is a signed 64-bit integer type.  Then the
requirement in the C standard would kick in, requiring mktime() to
return (time_t)-1 because UINT64_MAX can't be represented in that time_t
type.
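
To make the representability point concrete, a small sketch assuming time_t is a signed 64-bit integer type, as in the hypothetical implementation just described. The helper name fits_in_int64 is made up for this illustration.

```c
#include <stdint.h>

/* Nonzero if v can be represented in a signed 64-bit integer type. */
int fits_in_int64(uint64_t v)
{
    return v <= (uint64_t)INT64_MAX;
}
```

fits_in_int64(UINT64_MAX) is 0, which is the condition under which the "Returns" wording requires (time_t)-1, whereas an ordinary value such as 378663000 passes.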

This "loophole" is not present in POSIX because local time and DST are
not implementation-defined.  So, although the C committee was perfectly
correct when it responded to Paul Eggert's DR by saying that C90 allowed
mktime() to return -1 for times in the gap, the same is not true for POSIX,
and therefore adding an EINVAL error to POSIX would create a conflict with
the C standard (because it would necessitate changing the meaning of the
(time_t)-1 return value in the RETURN VALUE section).

If the C committee changes their "Returns" wording in a future revision,
that would perhaps allow us to add an EINVAL error in Issue 9, but it
will not be possible for Issue 8.

>   | For integer values they mean the same thing.
> 
> If we had an integer value, yes, the point that you keep overlooking
> is that we don't.   There is no integer (or float, or even complex)
> value of seconds which represents the arg structure.   That is what
> cannot be represented, not some integer value which for some reason
> we don't like.

The standard clearly says (time_t)-1 is returned when the integer time
since the Epoch value cannot be represented in a time_t.  You are
misinterpreting it when you try to make the "cannot be represented"
phrase apply to the broken-down time supplied to mktime() instead of
to the integer value it wants to return.

> OK, now, long long ago, but still within this message, I promised an
> example of a problem with the approach being advocated (the results being 
> not errors, and things working as it has been explained that some
> application programmers expect them to work).

There is one word that best describes your example: "contrived".
You effectively admit as much through the presence of this comment in
the code:

    tm.tm_isdst = -1;   /* this is required, so we are told */

I don't know who you think told you it was required; it certainly
wasn't me.  No real world application would set tm_isdst to -1 when
incrementing the time by an hour.  What I said was that an application
would set tm_isdst to -1 when it wants the time shown by a wall-clock
at the calculated time to be the same (when possible) as it is now.
If it wants the wall-clock time to be the same, it is not going to
change tm_hour (nor tm_min nor tm_sec).  The example I gave added a
month, but setting tm_isdst to -1 would be a reasonable thing to do
when adding any number of whole days.
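
A minimal sketch of that usage, for adding whole days while asking for the same wall-clock time. The name add_days is hypothetical.

```c
#include <time.h>

/* Hypothetical helper: advance *tm by a whole number of days, asking for the
   same wall-clock time on the target date.  tm_hour, tm_min and tm_sec are
   deliberately left alone; tm_isdst is set to -1 because DST may be in
   effect on one date and not on the other. */
time_t add_days(struct tm *tm, int days)
{
    tm->tm_mday += days;          /* mktime() normalises out-of-range tm_mday */
    tm->tm_isdst = -1;
    return mktime(tm);
}
```

Starting from 2001-07-10 12:00:00 and adding 30 days, the struct tm comes back as 2001-08-09 with tm_hour still 12, assuming no timezone transition lands on that wall-clock time.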

Your example code would misbehave on almost all existing implementations.
The fact that applications have been happily using mktime() on those
implementations for 30 years means that something equivalent to your
example code is never used in real applications.

In anticipation that your next move would be to change your example to
increment by a day instead of an hour and say it will misbehave when TZ
specifies a DST change of 24 hours, I'll point out again that the fact
that applications have been happily using mktime() for 30 years on
implementations that do not return -1 for times in the gap means that
this situation simply never happens in practice.

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England

Re: [1003.1(2016/18)/Issue7+TC2 0001614]: XSH 3/mktime does not specify EINVAL and should
