Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
Robert Haas writes:
> More to the point, it's also what 8.3.7 does:

Well, no, because the cases at issue are where an interval qualifier
is specified.  8.3 did this:

regression=# select '99 seconds'::interval second;
 interval
----------
 00:00:39
(1 row)

and even more amusingly,

regression=# select interval '99' minute;
 interval
----------
 00:01:00
(1 row)

regression=# select interval '99' hour;
 interval
----------
 00:00:00
(1 row)

It was all pretty broken back then.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
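The three 8.3 results quoted above are consistent with a plain
modulo-and-truncate step applied to the time field.  What follows is a
minimal sketch in C, not the 8.3 source: the function names are invented
for illustration, and it assumes that 8.3 read the bare literal '99' as
99 seconds before the qualifier was applied.

```c
#include <assert.h>
#include <stdint.h>

#define USECS_PER_SEC    INT64_C(1000000)
#define USECS_PER_MINUTE (60 * USECS_PER_SEC)
#define USECS_PER_HOUR   (60 * USECS_PER_MINUTE)

/*
 * Hypothetical model of the pre-8.4 SECOND qualifier: whole minutes are
 * thrown away, so only the part below one minute survives.
 */
static int64_t
old_second_typmod(int64_t time_usec)
{
    assert(time_usec >= 0);     /* the sketch only covers non-negative input */
    return time_usec % USECS_PER_MINUTE;
}

/*
 * Hypothetical model of the pre-8.4 MINUTE qualifier: whole hours are
 * thrown away, then the remainder is truncated to whole minutes.
 */
static int64_t
old_minute_typmod(int64_t time_usec)
{
    assert(time_usec >= 0);
    time_usec %= USECS_PER_HOUR;
    return (time_usec / USECS_PER_MINUTE) * USECS_PER_MINUTE;
}

/*
 * HOUR qualifier: truncate to whole hours.  With only a time field as
 * input there is nothing higher-order left to discard in this sketch.
 */
static int64_t
old_hour_typmod(int64_t time_usec)
{
    assert(time_usec >= 0);
    return (time_usec / USECS_PER_HOUR) * USECS_PER_HOUR;
}
```

Fed 99 seconds, these return 39 seconds, 60 seconds and 0 respectively,
which matches 00:00:39, 00:01:00 and 00:00:00.  The same modulo turns
999 seconds into 39 seconds.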
Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
On Mon, Jun 1, 2009 at 8:04 PM, Tom Lane wrote:
> I wrote:
>> Sam Mason writes:
>>> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
>>>> There is some case to be made that we should throw error here,
>>>> which we could do by putting error tests where the attached patch
>>>> has comments suggesting an error test.
>
>>> With things as they are I think it would be useful to throw an error
>>> here; if the user means 25 hours they should say 25 hours!
>
>> Well, maybe, but I'm not really convinced.
>
> I've gone ahead and committed the patch as-is, without the error tests.
> There's still time to change it if anyone has a killer argument, but
> I thought of another issue that would have to be dealt with: consider
> values such as INTERVAL '13' MONTH.  Since per spec we should not
> reduce this to 1 month, what is going to happen barring significant
> redesign on the output side is that the value will print out as
> '1 year 1 month'.  If we were to consider that as illegal input for
> INTERVAL MONTH then we'd be creating a situation where valid data
> fails to dump and reload.  This won't happen for all cases (eg 60
> days doesn't overflow into months) but it shows the danger of throwing
> error for cases that we can't clearly distinguish on both input and
> output.  So I think we should be satisfied for now with accepting
> inputs that are valid per spec, and not worry too much about whether
> we are rejecting all values that are a bit outside spec.

Well, there is the possibility that if we implement something fully
spec-compliant in the future, we might run into a situation where
someone puts 13 months in, dumps and reloads, then puts 13 months in
again, compares the two, and surprisingly they turn out to be unequal.
But I'm having a hard time caring.  The behavior your patch implements
is clearly a lot more useful than what it replaced, and I think it's
arguably more useful than the spec behavior as well.

More to the point, it's also what 8.3.7 does:

Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

rhaas=# select '99 seconds'::interval;
 interval
----------
 00:01:39
(1 row)

rhaas=# select '99 minutes'::interval;
 interval
----------
 01:39:00
(1 row)

rhaas=# select '99 hours'::interval;
 interval
----------
 99:00:00
(1 row)

rhaas=# select '99 days'::interval;
 interval
----------
 99 days
(1 row)

rhaas=# select '99 weeks'::interval;
 interval
----------
 693 days
(1 row)

rhaas=# select '99 months'::interval;
    interval
----------------
 8 years 3 mons
(1 row)

I haven't checked, but hopefully these all now match the 8.4 behavior?

...Robert
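The unqualified 8.3.7 outputs above can be read as three separately
stored quantities (months, days, time) that are each split up only at
display time.  The sketch below illustrates that reading in C.  The
struct and function names are hypothetical and the code handles only
non-negative values; it is not PostgreSQL's output routine.

```c
#include <assert.h>
#include <stdint.h>

#define MONTHS_PER_YEAR  12
#define USECS_PER_SEC    INT64_C(1000000)
#define USECS_PER_MINUTE (60 * USECS_PER_SEC)
#define USECS_PER_HOUR   (60 * USECS_PER_MINUTE)

/* Hypothetical stand-in for the three stored interval fields. */
typedef struct
{
    int64_t time_usec;          /* hours, minutes and seconds together */
    int32_t day;                /* days; weeks arrive here as 7 days each */
    int32_t month;              /* months; years arrive here as 12 each */
} SketchInterval;

/* What a display routine would print, field by field. */
typedef struct
{
    int32_t years;
    int32_t mons;
    int32_t days;
    int64_t hours;
    int     minutes;
    int     seconds;
} SketchDisplay;

static SketchDisplay
sketch_display(SketchInterval iv)
{
    SketchDisplay d;
    int64_t       rest;

    assert(iv.month >= 0 && iv.day >= 0 && iv.time_usec >= 0);

    /* months are split into years and months for display */
    d.years = iv.month / MONTHS_PER_YEAR;
    d.mons = iv.month % MONTHS_PER_YEAR;

    /* days are printed as stored, never carried into months */
    d.days = iv.day;

    /* time is printed as hh:mm:ss, with hours never carried into days */
    d.hours = iv.time_usec / USECS_PER_HOUR;
    rest = iv.time_usec % USECS_PER_HOUR;
    d.minutes = (int) (rest / USECS_PER_MINUTE);
    d.seconds = (int) ((rest % USECS_PER_MINUTE) / USECS_PER_SEC);
    return d;
}
```

With month = 99 this gives 8 years and 3 months, with day = 99 * 7 it
gives 693 days, and with 99 hours in the time field it gives 99:00:00,
in line with the session above.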
Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
I wrote:
> Sam Mason writes:
>> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
>>> There is some case to be made that we should throw error here,
>>> which we could do by putting error tests where the attached patch
>>> has comments suggesting an error test.

>> With things as they are I think it would be useful to throw an error
>> here; if the user means 25 hours they should say 25 hours!

> Well, maybe, but I'm not really convinced.

I've gone ahead and committed the patch as-is, without the error tests.
There's still time to change it if anyone has a killer argument, but
I thought of another issue that would have to be dealt with: consider
values such as INTERVAL '13' MONTH.  Since per spec we should not
reduce this to 1 month, what is going to happen barring significant
redesign on the output side is that the value will print out as
'1 year 1 month'.  If we were to consider that as illegal input for
INTERVAL MONTH then we'd be creating a situation where valid data
fails to dump and reload.  This won't happen for all cases (eg 60
days doesn't overflow into months) but it shows the danger of throwing
error for cases that we can't clearly distinguish on both input and
output.  So I think we should be satisfied for now with accepting
inputs that are valid per spec, and not worry too much about whether
we are rejecting all values that are a bit outside spec.

			regards, tom lane
Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
Sam Mason writes:
> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
>> There is some case to be made that we should throw error here,
>> which we could do by putting error tests where the attached patch
>> has comments suggesting an error test.

> With things as they are I think it would be useful to throw an error
> here; if the user means 25 hours they should say 25 hours!

Well, maybe, but I'm not really convinced.  By the same logic, if the
column is INTERVAL MONTH and the user puts in '1 year 1 month', then
we ought to throw an error for that too.  But AdjustIntervalForTypmod
is incapable of doing that because months and years are folded into a
single field.  There is a logical reason for the difference --- 1 year
is always exactly 12 months whereas 1 day is not always exactly 24 hours
--- but that difference isn't acknowledged by the SQL standard, which
last I checked still pretends daylight savings time doesn't exist.

The real bottom line here is that our underlying implementation and
semantics for INTERVAL are considerably different from what the SQL
standard has in mind.  AFAICS they intend an interval to contain six
separate numeric fields with "what you see is what you get" behavior.
We could argue some other time about whether that's a better design
than what we're using, but it's surely not going to change for 8.4.
My ambitions for the moment are limited to making sure that we accept
all spec-compliant interval literals and interpret them in a fashion
reasonably compatible with what the spec says the value is.  I don't
feel that we need to throw error for stuff we used to accept in order
to meet that goal.

> It would only be different when the interval is used with values of type
> timestamptz, or am I missing something?  How much sense does it make to
> have a timezone aware interval where this distinction is true and leave
> the current interval as timezone naive.

Doesn't seem practical, certainly not for 8.4.  In any case I'm
uncomfortable with the idea that a value would be accepted at entry
and then fail later on depending on how you used it.

			regards, tom lane
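The point about years and months being folded into a single field,
while days and hours are not, can be shown with a small C sketch.  The
struct and helpers are hypothetical stand-ins, not the real Interval
type, and sketch_same_fields compares stored fields only; it says
nothing about how the SQL equality operator treats intervals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MONTHS_PER_YEAR 12
#define USECS_PER_HOUR  INT64_C(3600000000)

/* Hypothetical stand-in for the three stored interval fields. */
typedef struct
{
    int64_t time_usec;          /* hours and anything smaller */
    int32_t day;                /* kept apart from time: a day is not
                                 * always 24 hours across a DST change */
    int32_t month;              /* years and months together */
} SketchInterval;

/* Build a value the way the thread describes input being stored. */
static SketchInterval
sketch_make(int years, int months, int days, int hours)
{
    SketchInterval iv;

    assert(years >= 0 && months >= 0 && days >= 0 && hours >= 0);

    iv.month = years * MONTHS_PER_YEAR + months;    /* year folded in */
    iv.day = days;                                  /* not folded */
    iv.time_usec = hours * USECS_PER_HOUR;
    return iv;
}

/* True when two values are indistinguishable once stored. */
static bool
sketch_same_fields(SketchInterval a, SketchInterval b)
{
    return a.month == b.month &&
           a.day == b.day &&
           a.time_usec == b.time_usec;
}
```

Here sketch_make(1, 1, 0, 0) and sketch_make(0, 13, 0, 0) store the same
thing, so nothing downstream could reject one and accept the other.
sketch_make(0, 0, 1, 1) and sketch_make(0, 0, 0, 25) remain distinct.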
Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
> regression=# select '999'::interval second;
> The correct interpretation of the input value is certainly 999 seconds.

Agreed; silent truncation like this is confusing and will lead to
unnecessary bugs in users' code.

> the attached patch we would render as
> regression=# select '1 day 1 hour'::interval hour;
>  1 day 01:00:00
>
> There is some case to be made that we should throw error here,
> which we could do by putting error tests where the attached patch
> has comments suggesting an error test.

With things as they are I think it would be useful to throw an error
here; if the user means 25 hours they should say 25 hours!

> However I'm inclined to think
> that doing that would expose an implementation dependency rather more
> than we should.  It is usually not clear to novices that '1 day 1 hour'
> is different from '25 hours', and it would be even less clear why the
> latter would be acceptable input for an INTERVAL HOUR field when the
> former isn't.  So I'm proposing the patch as-is rather than with the
> extra error tests, but am open to being convinced otherwise.

It would only be different when the interval is used with values of type
timestamptz, or am I missing something?  How much sense does it make to
have a timezone aware interval where this distinction is true and leave
the current interval as timezone naive.  Not sure if that would help to
clean up the semantics at all or if it's just adding more unnecessary
complexity.  I have a feeling it's probably the latter, but thought it
may help things.

--
  Sam  http://samason.me.uk/
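The difference between the old and the proposed treatment of
'1 day 1 hour'::interval hour can be sketched in C as below.  The old
variant zeroes month and day before truncating, as in the lines the
patch removes; the new variant leaves the fields to the left alone and
only clears what lies to the right of HOUR.  The struct and function
names are hypothetical, and this is an illustration rather than the
committed code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USECS_PER_HOUR INT64_C(3600000000)

/* Hypothetical stand-in for the three stored interval fields. */
typedef struct
{
    int64_t time_usec;
    int32_t day;
    int32_t month;
} SketchInterval;

/* Pre-patch HOUR handling: high-order fields are silently discarded. */
static void
sketch_hour_typmod_old(SketchInterval *iv)
{
    assert(iv != NULL);
    iv->month = 0;
    iv->day = 0;
    iv->time_usec = (iv->time_usec / USECS_PER_HOUR) * USECS_PER_HOUR;
}

/*
 * Proposed HOUR handling: month and day stay as entered, and only the
 * sub-hour part of the time field is zeroed.
 */
static void
sketch_hour_typmod_new(SketchInterval *iv)
{
    assert(iv != NULL);
    iv->time_usec = (iv->time_usec / USECS_PER_HOUR) * USECS_PER_HOUR;
}
```

Starting from day = 1 and one hour in the time field, the old routine
leaves 01:00:00 and the new one leaves 1 day 01:00:00.  Both would
reduce a time field of 90 minutes to one hour.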
[HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data
As I mentioned a bit ago
http://archives.postgresql.org/pgsql-hackers/2009-05/msg01505.php
there seems to be a definite problem still remaining with our handling
of interval literals.  To wit, this behavior is absolutely not per spec:

regression=# select '999'::interval second;
 interval
----------
 00:00:39
(1 row)

The correct interpretation of the input value is certainly 999 seconds.
The spec would allow us to throw error if it exceeds the available range
of the field, but a silent modulo operation is not per spec and seems
against our general design principle of not silently discarding data.
I propose the attached patch to make the code not throw away high-order
values in this fashion.

A somewhat more debatable case is this:

regression=# select '1 day 1 hour'::interval hour;
 interval
----------
 01:00:00
(1 row)

which with the attached patch we would render as

regression=# select '1 day 1 hour'::interval hour;
    interval
----------------
 1 day 01:00:00
(1 row)

There is some case to be made that we should throw error here,
which we could do by putting error tests where the attached patch
has comments suggesting an error test.  However I'm inclined to think
that doing that would expose an implementation dependency rather more
than we should.  It is usually not clear to novices that '1 day 1 hour'
is different from '25 hours', and it would be even less clear why the
latter would be acceptable input for an INTERVAL HOUR field when the
former isn't.  So I'm proposing the patch as-is rather than with the
extra error tests, but am open to being convinced otherwise.

The reason I'm bringing this up now is that we've already changed the
behavior of interval literals quite a bit in 8.4.  I would rather try to
finish getting it right in this release than have the behavior change
twice in successive releases.

Comments?
			regards, tom lane

Index: src/backend/utils/adt/timestamp.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.199
diff -c -r1.199 timestamp.c
*** src/backend/utils/adt/timestamp.c	26 May 2009 02:17:50 -0000	1.199
--- src/backend/utils/adt/timestamp.c	31 May 2009 22:13:35 -0000
***************
*** 962,967 ****
--- 962,979 ----
  		int			range = INTERVAL_RANGE(typmod);
  		int			precision = INTERVAL_PRECISION(typmod);
  
+ 		/*
+ 		 * Our interpretation of intervals with a limited set of fields
+ 		 * is that fields to the right of the last one specified are zeroed
+ 		 * out, but those to the left of it remain valid.  Since we do not
+ 		 * have any equivalent of SQL's <interval leading field precision>,
+ 		 * we can't properly enforce any limit on the leading field.  (Before
+ 		 * PG 8.4 we interpreted a limited set of fields as actually causing
+ 		 * a "modulo" operation on a given value, potentially losing high-order
+ 		 * as well as low-order information; but there is no support for such
+ 		 * behavior in the standard.)
+ 		 */
+ 
  		if (range == INTERVAL_FULL_RANGE)
  		{
  			/* Do nothing... */
***************
*** 974,1000 ****
  		}
  		else if (range == INTERVAL_MASK(MONTH))
  		{
- 			interval->month %= MONTHS_PER_YEAR;
  			interval->day = 0;
  			interval->time = 0;
  		}
  		/* YEAR TO MONTH */
  		else if (range == (INTERVAL_MASK(YEAR) | INTERVAL_MASK(MONTH)))
  		{
- 			/* month is already year to month */
  			interval->day = 0;
  			interval->time = 0;
  		}
  		else if (range == INTERVAL_MASK(DAY))
  		{
! 			interval->month = 0;
  			interval->time = 0;
  		}
  		else if (range == INTERVAL_MASK(HOUR))
  		{
! 			interval->month = 0;
! 			interval->day = 0;
! 
  #ifdef HAVE_INT64_TIMESTAMP
  			interval->time = (interval->time / USECS_PER_HOUR) *
  				USECS_PER_HOUR;
--- 986,1008 ----
  		}
  		else if (range == INTERVAL_MASK(MONTH))
  		{
  			interval->day = 0;
  			interval->time = 0;
  		}
  		/* YEAR TO MONTH */
  		else if (range == (INTERVAL_MASK(YEAR) | INTERVAL_MASK(MONTH)))
  		{
  			interval->day = 0;
  			interval->time = 0;
  		}
  		else if (range == INTERVAL_MASK(DAY))
  		{
! 			/* throw error if month is not 0?