Robert Elz wrote, on 12 Dec 2022:
>
> Date: Mon, 12 Dec 2022 12:02:39 +0000
> From: "Geoff Clare via austin-group-l at The Open Group"
> <austin-group-l@opengroup.org>
>
> C23 is apparently going to have timegm() (the mktime() equivalent for UTC
> instead of localtime). Using gmtime() modifying the struct tm, and then
> timegm() to get the time_t back would work much better, at least if the
> specification of timegm() is better than that of mktime() (I haven't
> seen it). I know it is getting very late in the process, but perhaps
> we should also be adding timegm() now.

It is too late to add timegm() in Issue 8. It will automatically get
added in Issue 9 as that will (presumably) align with C23 or later.

> | By a strict reading, you may be right, but it is strongly implied by
> | "shall be set to represent the specified time since the Epoch".
>
> That's fine when the specified time (that is, the time passed in in *timeptr)
> is a time that exists.

This statement provides a big clue as to why you are misinterpreting
the standard, and why your attitude towards mktime() is so different
from everybody else's.

You are suffering from a misconception that *timeptr somehow
"specifies" a time since the Epoch. It does not! It specifies a
broken-down time. The standard describes, in detail (in the paragraph
beginning "The relationship between ..."), how this broken-down time
is *converted* to an integer "time since the Epoch" value.

When the standard says "shall be set to represent the specified time
since the Epoch" it is talking about the integer value that *it*
specifies to be calculated from the broken-down time in *timeptr.
It is not in any way suggesting that *timeptr "specifies" a time since
the Epoch.

In trying to treat *timeptr as "specifying" a time since the Epoch,
you are misunderstanding the intention and misinterpreting the meaning
of much of the mktime() text.

Since I mentioned attitudes, I'll explain mine. It is that mktime()
follows the well-known principle "be liberal in what you accept, and
conservative in what you send" (which originated in relation to
communication protocols but I think applies very well here).

Applying this principle to mktime() means you can give it an
"incorrect" broken-down time and it will make sense of it and give
you back a correct time. For example:

* If you give it Feb 29 in a non-leap year it treats that as the day
  after Feb 28 and gives you back Mar 1.
* If you give it Feb 0 it treats that as the day before Feb 1 and
  gives you back Jan 31.
* If you give it 21:65 it treats that as 6 minutes after 21:59 and
  gives you back 22:05.
* If you give it tm_isdst=0 for a time when DST is in effect, it
  gives you back a positive tm_isdst and alters the other fields
  appropriately.
* If there is a DST transition where 02:00 standard time becomes
  03:00 DST and you give mktime() 02:30 (with negative tm_isdst),
  it treats that as either 30 minutes after 02:00 standard time or
  30 minutes before 03:00 DST and gives you back a zero or positive
  tm_isdst, respectively, with the tm_hour field altered appropriately.
* If a geographical timezone changes its UTC offset such that
  "old 00:00" becomes "new 00:30" and you give it 00:20, it treats
  that as either 20 minutes after "old 00:00" or 10 minutes before
  "new 00:30", and gives you back appropriately altered struct tm
  fields.

And yes, having listed that last case along with the others, I see no
reason that it should not follow the same principle. The "treats it as"
wording is much the same as the DST transition case.

Returning -1 for any of these cases violates the "be liberal in what
you accept" part of the principle.

> mday 312, minute -1234, hour 999, second -23456789, year (anything that
> doesn't cause time_t overflow for the implementation) tm_isdst anything
> represents. If you can find something somewhere that specifies what
> that means, in the C or POSIX standards (or just about any other standard
> you care to reference) then great. mktime() allows that input, but I
> see nothing that says which particular time_t value should be returned.
>
> You might be imagining how an implementation might deal with this, as can
> I, the two might even be the same - but it is certainly not specified
> anywhere.

I agree it's not clear for pathological cases like that.

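To make the first three cases in the list above concrete, here is a minimal C sketch. It is an illustration under assumptions, not anything from the standard: the helper names normalize_utc() and make_tm() are hypothetical, POSIX setenv() is assumed to be available, and TZ is forced to "UTC0" so that no DST rules get involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: force TZ to UTC so that no DST rules apply,
   then let mktime() normalize *tm in place. */
static time_t normalize_utc(struct tm *tm)
{
    assert(tm != NULL);
    setenv("TZ", "UTC0", 1);   /* assumption: POSIX setenv() is available */
    tzset();
    tm->tm_isdst = 0;
    return mktime(tm);
}

/* Hypothetical helper: build a broken-down time from calendar values.
   Out-of-range values are stored as given; nothing is checked here. */
static struct tm make_tm(int year, int mon, int mday, int hour, int min)
{
    struct tm tm = {0};
    tm.tm_year = year - 1900;  /* tm_year counts from 1900 */
    tm.tm_mon  = mon - 1;      /* tm_mon counts from 0 */
    tm.tm_mday = mday;
    tm.tm_hour = hour;
    tm.tm_min  = min;
    return tm;
}
```

Passing make_tm(2023, 2, 29, 0, 0) through normalize_utc() should give back Mar 1, make_tm(2023, 2, 0, 0, 0) should give back Jan 31, and make_tm(2023, 6, 15, 21, 65) should give back 22:05. For the pathological inputs quoted above, which time_t comes back depends on how the implementation derives tm_yday from the other members.
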
It comes down to this statement:

    the tm_yday value used in the expression is the day of the year
    from 0 to 365 inclusive, calculated from the other tm structure
    members

It may be worth trying to improve this, if implementations all do the
tm_yday calculation the same way, but it has no real relevance in the
matter of whether mktime() can return -1 for "incorrect" broken-down
times. If it doesn't allow it when all of the tm fields are in their
normal ranges, then it also doesn't allow it when they are outside
those ranges.

> | In any case, it is being clarified by bug 1613.
>
> Unless you made more changes there than I thought, no, it isn't.
> The extra text that was added there just says what the returned
> struct tm (in *timeptr) must be, in relationship to the time_t
> returned. It says nothing at all about how that time_t is selected.

And I didn't claim that it does. What I said (which you trimmed) was:

    By a strict reading, you may be right, but it is strongly implied by
    "shall be set to represent the specified time since the Epoch".
    In any case, it is being clarified by bug 1613.

My point was entirely about this "shall be set to represent" text,
i.e. about what the returned struct tm fields must contain.
The context for this was Don's point that Feb 29 2023 has the tm
fields in their stated ranges and so the standard, as written,
allows the returned struct tm to be left as Feb 29 2023. The change
in bug 1613 requires them to be set to the values that would be
returned by localtime(), so this will no longer be allowed.

> | This would definitely not meet the requirement "shall be set to
> | represent the specified time since the Epoch".
>
> Of course it could. If the time passed in contains out of range
> values, there is no defined meaning that can be attributed to them.
> If you can find somewhere where that's stated, then please, enlighten us.

The above quote is all that's needed, provided "the specified time
since the Epoch" is correctly interpreted (which you are not doing).

The time since the Epoch being referred to here is a known integer
value which mktime() is going to return. The above text requires
mktime() to set the struct tm fields to represent that specific,
known, time since the Epoch value. (The adjustment to bring struct tm
fields into range is done after this value is known - see below).

The sort of adjustment you were suggesting,
"if (t->tm_sec < 0) t->tm_sec = 0", etc. would cause the fields to
no longer represent that time since the Epoch.

> | and then requires (on successful completion) that the fields in the
> | broken-down time are updated to
> | "represent the specified time since the Epoch".
>
> Yes, this part is not controversial.
>
> | Your suggested other adjustments would not represent the time since
> | the Epoch that is going to be returned.
>
> Of course it would, the adjustments are made to create a struct tm
> that only contains in-range values, and then from that a time_t is
> produced.

No, the adjustment to bring struct tm fields into range is done after
the time since the Epoch value has been calculated. This is clear
just from the order in which things are described on the mktime()
page, but also from the use of "Upon successful completion", since
mktime() can't know whether it will complete successfully until
it has calculated the time_t value it is going to return.

> | Huh? The struct tm values don't need altering in this case (except
> | for tm_isdst obviously).
>
> Agreed. But we need to pick tm_isdst = 0 or tm_isdst = 1, and
> which we pick will alter what time_t value gets returned. There's
> nothing anywhere that suggests which one should be selected.

Correct, and since the standard is silent on this, either behaviour
is allowed.

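Going back to the ordering point above (time_t value first, struct tm fields second), here is a minimal sketch of how an application could check it. It is an illustration only: the helper name fields_match_localtime() is hypothetical, TZ is forced to "UTC0" as an assumption to keep DST out of the picture, and a (time_t)-1 return is simply treated as failure even though it can also be a valid time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: returns 1 if, after a successful mktime(), the
   updated fields equal what localtime_r() yields for the returned
   time_t; returns 0 otherwise. The caller's struct is not modified. */
static int fields_match_localtime(const struct tm *in)
{
    struct tm work, back;
    time_t t;

    assert(in != NULL);
    work = *in;                /* mktime() updates its argument; use a copy */

    setenv("TZ", "UTC0", 1);   /* assumption: UTC, so no DST ambiguity */
    tzset();

    t = mktime(&work);
    if (t == (time_t)-1)       /* sketch only: -1 is treated as failure */
        return 0;
    if (localtime_r(&t, &back) == NULL)
        return 0;

    return work.tm_sec   == back.tm_sec
        && work.tm_min   == back.tm_min
        && work.tm_hour  == back.tm_hour
        && work.tm_mday  == back.tm_mday
        && work.tm_mon   == back.tm_mon
        && work.tm_year  == back.tm_year
        && work.tm_wday  == back.tm_wday
        && work.tm_yday  == back.tm_yday
        && work.tm_isdst == back.tm_isdst;
}
```

Calling this with Jan 1 2023 00:00 and tm_sec = -30 should return 1 where mktime() sets the fields to Dec 31 2022 23:59:30. An implementation that instead applied "if (t->tm_sec < 0) t->tm_sec = 0" to the fields would leave them at 00:00:00, which no longer represents the returned value.
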
> | > As you indicate, the actual ranges within which the struct tm values are
> | > "forced" is one which matches values that localtime() would return
> |
> | Which is what Issue 8 will require (courtesy of bug 1613).
>
> No, that's not what that says. I can see you're presuming that the
> implementation calculates a time_t first, and then adjusts the tm to
> match. That's not required.

Yes it is. See above.

> | Future applications could
> | check errno, but it would be preferable to disallow the (time_t)-1
> | return for times in the gap so that existing applications are guaranteed
> | not to misbehave when ported to any (existing or future) conforming
> | system.
>
> But unless it is specified what the result must be in that case,
> applications moved from a system which generates one result might
> fail on one which generates a different one. You've already demonstrated
> that there are implementations which return different results for these
> times - and you seem to consider that OK. I don't.

The only real potential for problems here is if an application does
small (less than a day) additions/subtractions using the struct tm
fields and sets tm_isdst=-1. Then it might work fine on one
implementation but get into the kind of loop you described in an
earlier mail on an implementation that behaves the other way.
But, as I pointed out in that earlier discussion, no real application
would do that. The fact that no such problems have come to light in
the last 30 years also means that in practice this is a non-problem.

--
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England