Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
Robert Haas  writes:
> More to the point, it's also what 8.3.7 does:

Well, no, because the cases at issue are where an 
is specified.  8.3 did this:

regression=# select '99 seconds'::interval second;
 interval 
--
 00:00:39
(1 row)

and even more amusingly,

regression=# select interval '99' minute;
 interval 
--
 00:01:00
(1 row)

regression=# select interval '99' hour;  
 interval 
--
 00:00:00
(1 row)

It was all pretty broken back then.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Robert Haas
On Mon, Jun 1, 2009 at 8:04 PM, Tom Lane  wrote:
> I wrote:
>> Sam Mason  writes:
>>> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
 There is some case to be made that we should throw error here,
 which we could do by putting error tests where the attached patch
 has comments suggesting an error test.
>
>>> With things as they are I think it would be useful to throw an error
>>> here; if the user means 25 hours they should say 25 hours!
>
>> Well, maybe, but I'm not really convinced.
>
> I've gone ahead and committed the patch as-is, without the error tests.
> There's still time to change it if anyone has a killer argument, but
> I thought of another issue that would have to be dealt with: consider
> values such as INTERVAL '13' MONTH.  Since per spec we should not
> reduce this to 1 month, what is going to happen barring significant
> redesign on the output side is that the value will print out as
> '1 year 1 month'.  If we were to consider that as illegal input for
> INTERVAL MONTH then we'd be creating a situation where valid data
> fails to dump and reload.  This won't happen for all cases (eg 60
> days doesn't overflow into months) but it shows the danger of throwing
> error for cases that we can't clearly distinguish on both input and
> output.  So I think we should be satisfied for now with accepting
> inputs that are valid per spec, and not worry too much about whether
> we are rejecting all values that are a bit outside spec.

Well, there is the possibility that if we implement something fully
spec-compliant in the future, we might run into a situation where
someone puts 13 months in, dumps and reloads, then puts in 13 months
in again, compares the two, and surprisingly they turn out to be
unequal.  But I'm having a hard time caring.  The behavior your patch
implements is clearly a lot more useful than what it replaced, and I
think it's arguably more useful than the spec behavior as well.

More to the point, it's also what 8.3.7 does:

Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

rhaas=# select '99 seconds'::interval;
 interval
--
 00:01:39
(1 row)

rhaas=# select '99 minutes'::interval;
 interval
--
 01:39:00
(1 row)

rhaas=# select '99 hours'::interval;
 interval
--
 99:00:00
(1 row)

rhaas=# select '99 days'::interval;
 interval
--
 99 days
(1 row)

rhaas=# select '99 weeks'::interval;
 interval
--
 693 days
(1 row)

rhaas=# select '99 months'::interval;
interval

 8 years 3 mons
(1 row)

I haven't checked, but hopefully these all now match the 8.4 behavior?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
I wrote:
> Sam Mason  writes:
>> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
>>> There is some case to be made that we should throw error here,
>>> which we could do by putting error tests where the attached patch
>>> has comments suggesting an error test.

>> With things as they are I think it would be useful to throw an error
>> here; if the user means 25 hours they should say 25 hours!

> Well, maybe, but I'm not really convinced.

I've gone ahead and committed the patch as-is, without the error tests.
There's still time to change it if anyone has a killer argument, but
I thought of another issue that would have to be dealt with: consider
values such as INTERVAL '13' MONTH.  Since per spec we should not
reduce this to 1 month, what is going to happen barring significant
redesign on the output side is that the value will print out as
'1 year 1 month'.  If we were to consider that as illegal input for
INTERVAL MONTH then we'd be creating a situation where valid data
fails to dump and reload.  This won't happen for all cases (eg 60
days doesn't overflow into months) but it shows the danger of throwing
error for cases that we can't clearly distinguish on both input and
output.  So I think we should be satisfied for now with accepting
inputs that are valid per spec, and not worry too much about whether
we are rejecting all values that are a bit outside spec.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Tom Lane
Sam Mason  writes:
> On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
>> There is some case to be made that we should throw error here,
>> which we could do by putting error tests where the attached patch
>> has comments suggesting an error test.

> With things as they are I think it would be useful to throw an error
> here; if the user means 25 hours they should say 25 hours!

Well, maybe, but I'm not really convinced.  By the same logic, if the
column is INTERVAL MONTH and the user puts in '1 year 1 month', then
we ought to throw an error for that too.  But AdjustIntervalForTypmod
is incapable of doing that because months and years are folded into
a single field.  There is a logical reason for the difference --- 1 year
is always exactly 12 months whereas 1 day is not always exactly 24 hours
--- but that difference isn't acknowledged by the SQL standard, which
last I checked still pretends daylight savings time doesn't exist.

The real bottom line here is that our underlying implementation and
semantics for INTERVAL are considerably different from what the SQL
standard has in mind.  AFAICS they intend an interval to contain six
separate numeric fields with "what you see is what you get" behavior.
We could argue some other time about whether that's a better design
than what we're using, but it's surely not going to change for 8.4.
My ambitions for the moment are limited to making sure that we accept
all spec-compliant interval literals and interpret them in a fashion
reasonably compatible with what the spec says the value is.  I don't
feel that we need to throw error for stuff we used to accept in order
to meet that goal.

> It would only be different when the interval is used with values of type
> timestamptz, or am I missing something?  How much sense does it make to
> have a timezone aware interval where this distinction is true and leave
> the current interval as timezone naive.

Doesn't seem practical, certainly not for 8.4.  In any case I'm
uncomfortable with the idea that a value would be accepted at entry
and then fail later on depending on how you used it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-06-01 Thread Sam Mason
On Sun, May 31, 2009 at 06:32:53PM -0400, Tom Lane wrote:
> regression=# select '999'::interval second;
> The correct interpretation of the input value is certainly 999 seconds.

Agreed; silent truncation like this is confusing and will lead to
unnecessary bugs in users' code.

> the attached patch we would render as
> regression=# select '1 day 1 hour'::interval hour;
>  1 day 01:00:00
> 
> There is some case to be made that we should throw error here,
> which we could do by putting error tests where the attached patch
> has comments suggesting an error test.

With things as they are I think it would be useful to throw an error
here; if the user means 25 hours they should say 25 hours!

> However I'm inclined to think
> that doing that would expose an implementation dependency rather more
> than we should.  It is usually not clear to novices that '1 day 1 hour'
> is different from '25 hours', and it would be even less clear why the
> latter would be acceptable input for an INTERVAL HOUR field when the
> former isn't.  So I'm proposing the patch as-is rather than with the
> extra error tests, but am open to being convinced otherwise.

It would only be different when the interval is used with values of type
timestamptz, or am I missing something?  How much sense does it make to
have a timezone aware interval where this distinction is true and leave
the current interval as timezone naive.  Not sure if that would help to
clean up the semantics at all or if it's just adding more unnecessary
complexity.  I have a feeling it's probably the latter, but thought it
may help things.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Patch: AdjustIntervalForTypmod shouldn't discard high-order data

2009-05-31 Thread Tom Lane
As I mentioned a bit ago
http://archives.postgresql.org/pgsql-hackers/2009-05/msg01505.php
there seems to be a definite problem still remaining with our handling
of interval literals.  To wit, this behavior is absolutely not per spec:

regression=# select '999'::interval second;
 interval 
--
 00:00:39
(1 row)

The correct interpretation of the input value is certainly 999 seconds.
The spec would allow us to throw error if it exceeds the available range
of the field, but a silent modulo operation is not per spec and seems
against our general design principle of not silently discarding data.
I propose the attached patch to make the code not throw away high-order
values in this fashion.

A somewhat more debatable case is this:

regression=# select '1 day 1 hour'::interval hour;
 interval 
--
 01:00:00
(1 row)

which with the attached patch we would render as

regression=# select '1 day 1 hour'::interval hour;
interval

 1 day 01:00:00
(1 row)

There is some case to be made that we should throw error here,
which we could do by putting error tests where the attached patch
has comments suggesting an error test.  However I'm inclined to think
that doing that would expose an implementation dependency rather more
than we should.  It is usually not clear to novices that '1 day 1 hour'
is different from '25 hours', and it would be even less clear why the
latter would be acceptable input for an INTERVAL HOUR field when the
former isn't.  So I'm proposing the patch as-is rather than with the
extra error tests, but am open to being convinced otherwise.

The reason I'm bringing this up now is that we've already changed the
behavior of interval literals quite a bit in 8.4.  I would rather try to
finish getting it right in this release than have the behavior change
twice in successive releases.

Comments?

regards, tom lane

Index: src/backend/utils/adt/timestamp.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.199
diff -c -r1.199 timestamp.c
*** src/backend/utils/adt/timestamp.c   26 May 2009 02:17:50 -  1.199
--- src/backend/utils/adt/timestamp.c   31 May 2009 22:13:35 -
***
*** 962,967 
--- 962,979 
int range = INTERVAL_RANGE(typmod);
int precision = INTERVAL_PRECISION(typmod);
  
+   /*
+* Our interpretation of intervals with a limited set of fields
+* is that fields to the right of the last one specified are 
zeroed
+* out, but those to the left of it remain valid.  Since we do 
not
+* have any equivalent of SQL's ,
+* we can't properly enforce any limit on the leading field.  
(Before
+* PG 8.4 we interpreted a limited set of fields as actually 
causing
+* a "modulo" operation on a given value, potentially losing 
high-order
+* as well as low-order information; but there is no support 
for such
+* behavior in the standard.)
+*/
+ 
if (range == INTERVAL_FULL_RANGE)
{
/* Do nothing... */
***
*** 974,1000 
}
else if (range == INTERVAL_MASK(MONTH))
{
-   interval->month %= MONTHS_PER_YEAR;
interval->day = 0;
interval->time = 0;
}
/* YEAR TO MONTH */
else if (range == (INTERVAL_MASK(YEAR) | INTERVAL_MASK(MONTH)))
{
-   /* month is already year to month */
interval->day = 0;
interval->time = 0;
}
else if (range == INTERVAL_MASK(DAY))
{
!   interval->month = 0;
interval->time = 0;
}
else if (range == INTERVAL_MASK(HOUR))
{
!   interval->month = 0;
!   interval->day = 0;
! 
  #ifdef HAVE_INT64_TIMESTAMP
interval->time = (interval->time / USECS_PER_HOUR) *
USECS_PER_HOUR;
--- 986,1008 
}
else if (range == INTERVAL_MASK(MONTH))
{
interval->day = 0;
interval->time = 0;
}
/* YEAR TO MONTH */
else if (range == (INTERVAL_MASK(YEAR) | INTERVAL_MASK(MONTH)))
{
interval->day = 0;
interval->time = 0;
}
else if (range == INTERVAL_MASK(DAY))
{
!   /* throw error if month is not 0?