Re: [PATCHES] [HACKERS] Interval aggregate regression failure
On Sep 1, 2006, at 20:39 , Michael Glaesemann wrote: Here's a patch that appears to work. Gives the same output with and without --enable-integer-datetimes. Answers look like they're correct. I'm basically treating the components as three different intervals (with the other two components zero), rounding them each to usecs, and adding them together. While it might be nice to carry a little extra precision around, it doesn't seem to be needed in these cases. If errors do arise, they should be at most 3 usec, which is pretty much noise for the floating point case, I suspect. Okay. Here's the patch. Bruce, does it work for you? Michael Glaesemann grzm seespotcode net Index: src/backend/utils/adt/timestamp.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v retrieving revision 1.165 diff -c -r1.165 timestamp.c *** src/backend/utils/adt/timestamp.c 13 Jul 2006 16:49:16 - 1.165 --- src/backend/utils/adt/timestamp.c 1 Sep 2006 12:28:23 - *** *** 2494,2504 float8 factor = PG_GETARG_FLOAT8(1); double month_remainder, day_remainder, ! month_remainder_days; Interval *result; result = (Interval *) palloc(sizeof(Interval)); month_remainder = span->month * factor; day_remainder = span->day * factor; result->month = (int32) month_remainder; --- 2494,2507 float8 factor = PG_GETARG_FLOAT8(1); double month_remainder, day_remainder, ! month_remainder_days, ! month_remainder_time, ! day_remainder_time; Interval *result; result = (Interval *) palloc(sizeof(Interval)); + month_remainder = span->month * factor; day_remainder = span->day * factor; result->month = (int32) month_remainder; *** *** 2506,2511 --- 2509,2553 month_remainder -= result->month; day_remainder -= result->day; + month_remainder_days = month_remainder * DAYS_PER_MONTH; + + /* + if month_remainder_days is not an integer, check to see if it's an + integer when converted to SECS or USECS. + If it is, round month_remainder_days to the nearest integer + +*/ + + if (month_remainder_days != (int32)month_remainder_days && + TSROUND(month_remainder_days * SECS_PER_DAY) == + rint(month_remainder_days * SECS_PER_DAY)) + month_remainder_days = rint(month_remainder_days); + + result->day += (int32)month_remainder_days; + + #ifdef HAVE_INT64_TIMESTAMP + month_remainder_time = rint((month_remainder_days - + (int32)month_remainder_days) * USECS_PER_DAY); + + day_remainder_time = rint(day_remainder * USECS_PER_DAY); + + + result->time = rint(rint(span->time * factor) + day_remainder_time + + month_remainder_time); + #else + month_remainder_time = rint((month_remainder_days - + (int32)month_remainder_days) * SECS_PER_DAY); + day_remainder_time = rint(day_remainder * SECS_PER_DAY); + + result->time = span->time * factor + day_remainder_time + + month_remainder_time; + #endif + + + day_remainder = span->day * factor; + result->day = (int32) day_remainder; + day_remainder -= result->day; + /* * The above correctly handles the whole-number part of the month and day * products, but we have to do something with any fractional part *** *** 2518,2531 /* fractional months full days into days */ month_remainder_days = month_remainder * DAYS_PER_MONTH; ! result->day += (int32) month_remainder_days; ! /* fractional months partial days into time */ ! day_remainder += month_remainder_days - (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP ! result->time = rint(span->time * factor + day_remainder * USECS_PER_DAY); #else ! result->time = span->time * factor + day_remainder * SECS_PER_DAY; #endif PG_RETURN_INTERVAL_P(result); --- 2560,2599 /* fractional months full days into days */ month_remainder_days = month_remainder * DAYS_PER_MONTH; ! /* !* The remainders suffer from float rounding, so if they are !* within 1us of an integer, we round them to integers. !* It seems doing two multiplies is causing the imprecision. !*/ #ifdef HAVE_INT64_TIMESTAMP ! if (month_remainder_days != (int32)month_remainder_day
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Michael Glaesemann <[EMAIL PROTECTED]> writes: > Is it worth looking into the overflow and subtraction issues for 8.2? > It seems to me they're bugs rather than features. Or are these 8.3 > since it's so late? IMHO they're bugs not new features, and therefore perfectly fair game to work on during beta. But for the moment I'd suggest staying focused on the interval_mul/interval_div roundoff issue. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Please ignore the patch I just sent. Much too quick with the send button. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Here's a patch that appears to work. Gives the same output with and without --enable-integer-datetimes. Answers look like they're correct. I'm basically treating the components as three different intervals (with the other two components zero), rounding them each to usecs, and adding them together. While it might be nice to carry a little extra precision around, it doesn't seem to be needed in these cases. If errors do arise, they should be at most 3 usec, which is pretty much noise for the floating point case, I suspect. Bruce, how's it look on your machine? If it looks good, I'll add the examples we've been using to the regression tests. Michael Glaesemann grzm seespotcode net Index: src/backend/utils/adt/timestamp.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v retrieving revision 1.165 diff -c -r1.165 timestamp.c *** src/backend/utils/adt/timestamp.c 13 Jul 2006 16:49:16 - 1.165 --- src/backend/utils/adt/timestamp.c 1 Sep 2006 11:26:12 - *** *** 2494,2511 float8 factor = PG_GETARG_FLOAT8(1); double month_remainder, day_remainder, ! month_remainder_days; Interval *result; result = (Interval *) palloc(sizeof(Interval)); ! month_remainder = span->month * factor; ! day_remainder = span->day * factor; result->month = (int32) month_remainder; result->day = (int32) day_remainder; month_remainder -= result->month; day_remainder -= result->day; /* * The above correctly handles the whole-number part of the month and day * products, but we have to do something with any fractional part --- 2494,2553 float8 factor = PG_GETARG_FLOAT8(1); double month_remainder, day_remainder, ! month_remainder_days, ! month_remainder_time, ! day_remainder_time; Interval *result; result = (Interval *) palloc(sizeof(Interval)); ! ! month_remainder = span->month / factor; ! day_remainder = span->day / factor; result->month = (int32) month_remainder; result->day = (int32) day_remainder; month_remainder -= result->month; day_remainder -= result->day; + month_remainder_days = month_remainder * DAYS_PER_MONTH; + + /* + if month_remainder_days is not an integer, check to see if it's an + integer when converted to SECS or USECS. + If it is, round month_remainder_days to the nearest integer + +*/ + + if (month_remainder_days != (int32)month_remainder_days && + TSROUND(month_remainder_days * SECS_PER_DAY) == + rint(month_remainder_days * SECS_PER_DAY)) + month_remainder_days = rint(month_remainder_days); + + result->day += (int32)month_remainder_days; + + #ifdef HAVE_INT64_TIMESTAMP + month_remainder_time = rint((month_remainder_days - + (int32)month_remainder_days) * USECS_PER_DAY); + + day_remainder_time = rint(day_remainder * USECS_PER_DAY); + + + result->time = rint(rint(span->time * factor) + day_remainder_time + + month_remainder_time); + #else + month_remainder_time = rint((month_remainder_days - + (int32)month_remainder_days) * SECS_PER_DAY); + day_remainder_time = rint(day_remainder * SECS_PER_DAY); + + result->time = span->time * factor + day_remainder_time + + month_remainder_time; + #endif + + + day_remainder = span->day * factor; + result->day = (int32) day_remainder; + day_remainder -= result->day; + /* * The above correctly handles the whole-number part of the month and day * products, but we have to do something with any fractional part *** *** 2518,2531 /* fractional months full days into days */ month_remainder_days = month_remainder * DAYS_PER_MONTH; ! result->day += (int32) month_remainder_days; ! /* fractional months partial days into time */ ! day_remainder += month_remainder_days - (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP ! result->time = rint(span->time * factor + day_remainder * USECS_PER_DAY); #else ! result->time = span->time * factor + day_remainder * SECS_PER_DAY; #endif PG_RETURN_INTERVAL_P(result); --- 2560,2599 /* fractional months full days into days */ month_remainder_days = month_remainder * DAYS_PER_MONTH; ! /* !* The remain
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
On Sep 1, 2006, at 11:03 , Bruce Momjian wrote: I am unclear about this report. The patch was not meant to fix every interval issue, but merely to improve multiplication and division computations. Does it do that? I think the 23:60 is a time rounding issue that isn't covered in this patch. I am not against fixing it, but does the submitted patch improve things or not? Given we are post-feature freeze, we don't have time to fix all the interval issues. Your patch doesn't fix the things Tom referenced (nor did you intend it to). I just wanted to to collect examples of all the known issues with the interval code in one place. Probably too ambitious for September 1. Is it worth looking into the overflow and subtraction issues for 8.2? It seems to me they're bugs rather than features. Or are these 8.3 since it's so late? Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
On Sep 1, 2006, at 11:31 , Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian <[EMAIL PROTECTED]> writes: I am unclear about this report. The patch was not meant to fix every interval issue, but merely to improve multiplication and division computations. Does it do that? According to Michael's last report, your patch fails under --enable-integer-datetimes. Where does it fail? Here? select interval '41 mon 12 days 360:00' * 0.3 as product_a , interval '-41 mon -12 days +360:00' * 0.3 as product_b , interval '-41 mon 12 days 360:00' * 0.3 as product_c , interval '-41 mon -12 days -360:00' * 0.3 as product_d; product_a | product_b | product_c |product_d --+- ++- 1 year 11 days 146:24:00 | -1 years -11 days +69:36:00 | -1 years -5 days +98:24:00 | -1 years -11 days -146:23:60.00 - That is wrong, but I think we need another fix for that. Notice the problem is in minutes/seconds, not hours. I was sure it was more wrong than that the first time I saw it, but looks like I can't be sure of anything today :(. I need more sleep. Sorry for the noise on this one. Off work now, so I'm back at it. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > I am unclear about this report. The patch was not meant to fix every > > interval issue, but merely to improve multiplication and division > > computations. Does it do that? > > According to Michael's last report, your patch fails under > --enable-integer-datetimes. Where does it fail? Here? select interval '41 mon 12 days 360:00' * 0.3 as product_a , interval '-41 mon -12 days +360:00' * 0.3 as product_b , interval '-41 mon 12 days 360:00' * 0.3 as product_c , interval '-41 mon -12 days -360:00' * 0.3 as product_d; product_a | product_b | product_c |product_d --+- ++- 1 year 11 days 146:24:00 | -1 years -11 days +69:36:00 | -1 years -5 days +98:24:00 | -1 years -11 days -146:23:60.00 - That is wrong, but I think we need another fix for that. Notice the problem is in minutes/seconds, not hours. > This is an issue where you have to be "as simple as possible, but no > simpler". I think the approach you are proposing is too simple. > Michael's last patch here: > http://archives.postgresql.org/pgsql-patches/2006-08/msg00447.php > looks considerably more likely to lead to a workable answer. I see he is taking the fractional part of the result and finding if that should round. I am confused why that would help the -146:23:60.00 value above. Notice we only see it for negative values too. I do like that he is rounding the computation spillover, and not the total time value, which is what I started with. I believe my provides a more accurate computation, and doesn't have the problems of rounding. The only bug we can find is the powerpc one for -146:23:60 minutes. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
I am unclear about this report. The patch was not meant to fix every interval issue, but merely to improve multiplication and division computations. Does it do that? I think the 23:60 is a time rounding issue that isn't covered in this patch. I am not against fixing it, but does the submitted patch improve things or not? Given we are post-feature freeze, we don't have time to fix all the interval issues. --- Michael Glaesemann wrote: > > On Sep 1, 2006, at 5:05 , Bruce Momjian wrote: > > > Tom Lane wrote: > >> Bruce Momjian <[EMAIL PROTECTED]> writes: > >>> Well, the patch only multiplies by 30, so the interval would have to > >>> span +5 million years to overflow. I don't see any reason to add > >>> rounding until we get an actual query that needs it > >> > >> Have you tried your patch against the various cases that have been > >> discussed in the past? In particular there were several distinct > >> examples of this behavior posted at the beginning of the thread, and > >> I'd not assume that a fix for one handles them all. > > > > Yes, it fixes all posted examples, except one that displays 23:60. I > > cannot reproduce that failure from Powerpc so am waiting for > > Michael to > > test it. > > Here's your patch tested on my machine, both with and without -- > enable-integer-datetimes. I've tweaked the ad hoc test suite to > include a case where the days and time differ in sign and added a > couple of queries to the ad hoc test suite to include the problems > Tom referred to--not that this patch will fix them, but to keep the > known problems together. I hope to add more to this to test more edge > cases. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Bruce Momjian <[EMAIL PROTECTED]> writes: > I am unclear about this report. The patch was not meant to fix every > interval issue, but merely to improve multiplication and division > computations. Does it do that? According to Michael's last report, your patch fails under --enable-integer-datetimes. This is an issue where you have to be "as simple as possible, but no simpler". I think the approach you are proposing is too simple. Michael's last patch here: http://archives.postgresql.org/pgsql-patches/2006-08/msg00447.php looks considerably more likely to lead to a workable answer. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
On Sep 1, 2006, at 5:05 , Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian <[EMAIL PROTECTED]> writes: Well, the patch only multiplies by 30, so the interval would have to span +5 million years to overflow. I don't see any reason to add rounding until we get an actual query that needs it Have you tried your patch against the various cases that have been discussed in the past? In particular there were several distinct examples of this behavior posted at the beginning of the thread, and I'd not assume that a fix for one handles them all. Yes, it fixes all posted examples, except one that displays 23:60. I cannot reproduce that failure from Powerpc so am waiting for Michael to test it. Here's your patch tested on my machine, both with and without -- enable-integer-datetimes. I've tweaked the ad hoc test suite to include a case where the days and time differ in sign and added a couple of queries to the ad hoc test suite to include the problems Tom referred to--not that this patch will fix them, but to keep the known problems together. I hope to add more to this to test more edge cases. Unfortunately the problem still occur (see product_d), and --enable- integer-datetimes is pretty broken with this patch. Michael Glaesemann grzm seespotcode net -- test queries select interval '41 mon 12 days 360:00' * 0.3 as product_a , interval '-41 mon -12 days +360:00' * 0.3 as product_b , interval '-41 mon 12 days 360:00' * 0.3 as product_c , interval '-41 mon -12 days -360:00' * 0.3 as product_d; select interval '41 mon 12 days 360:00' / 10 as quotient_a , interval '-41 mon -12 days +360:00' / 10 as quotient_b , interval '-41 mon 12 days 360:00' / 10 as quotient_c , interval '-41 mon -12 days -360:00' / 10 as quotient_d; select interval '-12 days' * 0.3; select 1 * '100 hours'::interval as "ten billion"; set time zone 'EST5EDT'; select '2005-10-29 13:22:00-04'::timestamptz + '1 day'::interval as "2005-01-30 13:22:00-05"; select '2005-10-30 13:22:00-05'::timestamptz - '2005-10-29 13:22:00-04'::timestamptz as "a day"; set time zone local; -- end test queries -- without --enable-integer-datetimes select interval '41 mon 12 days 360:00' * 0.3 as product_a , interval '-41 mon -12 days +360:00' * 0.3 as product_b , interval '-41 mon 12 days 360:00' * 0.3 as product_c , interval '-41 mon -12 days -360:00' * 0.3 as product_d; product_a | product_b | product_c |product_d --+- ++- 1 year 11 days 146:24:00 | -1 years -11 days +69:36:00 | -1 years -5 days +98:24:00 | -1 years -11 days -146:23:60.00 (1 row) select interval '41 mon 12 days 360:00' / 10 as quotient_a , interval '-41 mon -12 days +360:00' / 10 as quotient_b , interval '-41 mon 12 days 360:00' / 10 as quotient_c , interval '-41 mon -12 days -360:00' / 10 as quotient_d; quotient_a |quotient_b | quotient_c |quotient_d +--- +---+--- 4 mons 4 days 40:48:00 | -4 mons -4 days +31:12:00 | -4 mons -2 days +40:48:00 | -4 mons -4 days -40:48:00 (1 row) select interval '-12 days' * 0.3; ?column? -- -3 days -14:23:60.00 (1 row) select 1 * '100 hours'::interval as "ten billion"; ten billion -- 2147483647:00:00 (1 row) set time zone 'EST5EDT'; SET select '2005-10-29 13:22:00-04'::timestamptz + '1 day'::interval as "2005-01-30 13:22:00-05"; 2005-01-30 13:22:00-05 2005-10-30 13:22:00-05 (1 row) select '2005-10-30 13:22:00-05'::timestamptz - '2005-10-29 13:22:00-04'::timestamptz as "a day"; a day 1 day 01:00:00 (1 row) set time zone local; SET -- with --enable-integer-datetimes select interval '41 mon 12 days 360:00' * 0.3 as product_a , interval '-41 mon -12 days +360:00' * 0.3 as product_b , interval '-41 mon 12 days 360:00' * 0.3 as product_c , interval '-41 mon -12 days -360:00' * 0.3 as product_d; product_a | product_b | product_c | product_d --+- ++-- 1 year 11 days 146:24:00 | -1 years -11 days +69:36:00 | -1 years -5 days +98:24:00 | -1 years -11 days -146:24:00 (1 row) select interval '41 mon 12 days 360:00' / 10 as quotient_a , interval '-41 mon -12 days +360:00' / 10 as quotient_b , interval '-41 mon 12 days 360:00' / 10 as quotient_c , interval '-41 mon -12 days -360:00' / 10 as quotient_d; quotient_a |quotient_b | quotient_c |quotient_d +
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Well, the patch only multiplies by 30, so the interval would have to > > span +5 million years to overflow. I don't see any reason to add > > rounding until we get an actual query that needs it > > Have you tried your patch against the various cases that have been > discussed in the past? In particular there were several distinct > examples of this behavior posted at the beginning of the thread, and > I'd not assume that a fix for one handles them all. Yes, it fixes all posted examples, except one that displays 23:60. I cannot reproduce that failure from Powerpc so am waiting for Michael to test it. > BTW, while trolling for examples I came across this: > http://archives.postgresql.org/pgsql-bugs/2005-10/msg00307.php > pointing out some issues that still haven't been addressed. Yea, that is a bunch of issues. They are already on the TODO list. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Bruce Momjian <[EMAIL PROTECTED]> writes: > Well, the patch only multiplies by 30, so the interval would have to > span +5 million years to overflow. I don't see any reason to add > rounding until we get an actual query that needs it Have you tried your patch against the various cases that have been discussed in the past? In particular there were several distinct examples of this behavior posted at the beginning of the thread, and I'd not assume that a fix for one handles them all. BTW, while trolling for examples I came across this: http://archives.postgresql.org/pgsql-bugs/2005-10/msg00307.php pointing out some issues that still haven't been addressed. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > OK, here is a much nicer patch. The fix is to do no rounding, but to > > find the number of days before applying the factor adjustment. > > You have forgotten the problem of the factor not being exactly > representable (eg, things like '10 days' * 0.1 not giving the expected > result). Also, as coded this is subject to integer-overflow risks > that weren't there before. That could be fixed, but it's still only > addressing a subset of the problems. I don't think you can fix them > all without rounding somewhere. Well, the patch only multiplies by 30, so the interval would have to span +5 million years to overflow. I don't see any reason to add rounding until we get an actual query that needs it (and because rounding is arbitrary). I think the proposed fix is the best we can do at this time. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] Interval aggregate regression failure
Bruce Momjian <[EMAIL PROTECTED]> writes: > OK, here is a much nicer patch. The fix is to do no rounding, but to > find the number of days before applying the factor adjustment. You have forgotten the problem of the factor not being exactly representable (eg, things like '10 days' * 0.1 not giving the expected result). Also, as coded this is subject to integer-overflow risks that weren't there before. That could be fixed, but it's still only addressing a subset of the problems. I don't think you can fix them all without rounding somewhere. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match