[HACKERS] timestamp datatype cleanup
PostgreSQL hackers,

Here's an initial bit of my attempt at cleaning up the timestamp datatype. I have gone through the backend and made a couple of small changes to stop using the HAVE_INT64_TIMESTAMP define to select a type in code, by creating typedefs in a header and using the typedefs in the code. I think this small bit is ready for inclusion, but I have a couple of questions for further work.

1) Is there a reason that header information is duplicated between the normal PostgreSQL includes and the ecpg includes, instead of defining the info in one place and #including it into the files that need it?

2) Would it be reasonable to change timestamp.h into a file that includes other files defining the specific parts that depend on HAVE_INT64_TIMESTAMP, instead of testing for HAVE_INT64_TIMESTAMP many times throughout timestamp.h? I think this might more cleanly separate the logic for the different timestamp types.

Thanks, wt
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Clean up date/time variable type selection
I have removed all instances of using HAVE_INT64_TIMESTAMP to determine the type of a variable in files depending on timestamp.h.
---
 src/backend/utils/adt/nabstime.c | 6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/nabstime.c b/src/backend/utils/adt/nabstime.c
index 992f7ab..5a662f3 100644
--- a/src/backend/utils/adt/nabstime.c
+++ b/src/backend/utils/adt/nabstime.c
@@ -832,11 +832,7 @@ interval_reltime(PG_FUNCTION_ARGS)
 				month,
 				day;
-#ifdef HAVE_INT64_TIMESTAMP
-	int64		span;
-#else
-	double		span;
-#endif
+	TimeOffset	span;

 	year = interval->month / MONTHS_PER_YEAR;
 	month = interval->month % MONTHS_PER_YEAR;
--
1.5.2.5
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add TimeOffset and DateOffset typedefs
I added TimeOffset and DateOffset typedefs to get rid of the instances of the HAVE_INT64_TIMESTAMP define being used to determine the types of variables or functions in timestamp.c.
---
 src/backend/utils/adt/timestamp.c | 77 +++-
 src/include/utils/timestamp.h     |  4 ++
 2 files changed, 19 insertions(+), 62 deletions(-)

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 9b90873..4759e22 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -44,11 +44,7 @@
 TimestampTz PgStartTime;

-#ifdef HAVE_INT64_TIMESTAMP
-static int64 time2t(const int hour, const int min, const int sec, const fsec_t fsec);
-#else
-static double time2t(const int hour, const int min, const int sec, const fsec_t fsec);
-#endif
+static TimeOffset time2t(const int hour, const int min, const int sec, const fsec_t fsec);
 static int	EncodeSpecialTimestamp(Timestamp dt, char *str);
 static Timestamp dt2local(Timestamp dt, int timezone);
 static void AdjustTimestampForTypmod(Timestamp *time, int32 typmod);
@@ -977,11 +973,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
 	}
 	else if (range == INTERVAL_MASK(MINUTE))
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		int64		hour;
-#else
-		double		hour;
-#endif
+		TimeOffset	hour;

 		interval->month = 0;
 		interval->day = 0;
@@ -998,11 +990,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
 	}
 	else if (range == INTERVAL_MASK(SECOND))
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		int64		minute;
-#else
-		double		minute;
-#endif
+		TimeOffset	minute;

 		interval->month = 0;
 		interval->day = 0;
@@ -1076,11 +1064,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
 	else if (range == (INTERVAL_MASK(MINUTE) | INTERVAL_MASK(SECOND)))
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		int64		hour;
-#else
-		double		hour;
-#endif
+		TimeOffset	hour;

 		interval->month = 0;
 		interval->day = 0;
@@ -1342,11 +1326,7 @@ timestamptz_to_str(TimestampTz t)
 void
 dt2time(Timestamp jd, int *hour, int *min, int *sec, fsec_t *fsec)
 {
-#ifdef HAVE_INT64_TIMESTAMP
-	int64		time;
-#else
-	double		time;
-#endif
+	TimeOffset	time;

 	time = jd;
@@ -1547,13 +1527,8 @@ recalc_t:
 int
 tm2timestamp(struct pg_tm * tm, fsec_t fsec, int *tzp, Timestamp *result)
 {
-#ifdef HAVE_INT64_TIMESTAMP
-	int			date;
-	int64		time;
-#else
-	double		date,
-				time;
-#endif
+	DateOffset	date;
+	TimeOffset	time;

 	/* Julian day routines are not correct for negative Julian days */
 	if (!IS_VALID_JULIAN(tm->tm_year, tm->tm_mon, tm->tm_mday))
@@ -1596,13 +1571,8 @@ tm2timestamp(struct pg_tm * tm, fsec_t fsec, int *tzp, Timestamp *result)
 int
 interval2tm(Interval span, struct pg_tm * tm, fsec_t *fsec)
 {
-#ifdef HAVE_INT64_TIMESTAMP
-	int64		time;
-	int64		tfrac;
-#else
-	double		time;
-	double		tfrac;
-#endif
+	TimeOffset	time;
+	TimeOffset	tfrac;

 	tm->tm_year = span.month / MONTHS_PER_YEAR;
 	tm->tm_mon = span.month % MONTHS_PER_YEAR;
@@ -1658,19 +1628,15 @@ tm2interval(struct pg_tm * tm, fsec_t fsec, Interval *span)
 	return 0;
 }

-#ifdef HAVE_INT64_TIMESTAMP
-static int64
+static TimeOffset
 time2t(const int hour, const int min, const int sec, const fsec_t fsec)
 {
+#ifdef HAVE_INT64_TIMESTAMP
 	return (((((hour * MINS_PER_HOUR) + min) * SECS_PER_MINUTE) + sec) * USECS_PER_SEC) + fsec;
-}	/* time2t() */
 #else
-static double
-time2t(const int hour, const int min, const int sec, const fsec_t fsec)
-{
 	return (((hour * MINS_PER_HOUR) + min) * SECS_PER_MINUTE) + sec + fsec;
-}	/* time2t() */
 #endif
+}	/* time2t() */

 static Timestamp
 dt2local(Timestamp dt, int tz)
@@ -2042,13 +2008,8 @@ timestamptz_cmp_timestamp(PG_FUNCTION_ARGS)
 static int
 interval_cmp_internal(Interval *interval1, Interval *interval2)
 {
-#ifdef HAVE_INT64_TIMESTAMP
-	int64		span1,
+	TimeOffset	span1,
 				span2;
-#else
-	double		span1,
-				span2;
-#endif

 	span1 = interval1->time;
 	span2 = interval2->time;
@@ -2388,11 +2349,7 @@ interval_justify_interval(PG_FUNCTION_ARGS)
 	Interval   *span =
Re: [HACKERS] timestamp datatype cleanup
On Sun, Mar 09, 2008 at 12:32:20AM -0800, Warren Turkal wrote:
> 1) Is there a reason that header information is duplicated between the normal PostgreSQL includes and the ecpg includes, instead of defining the info in one place and #including it into the files that need it?

As long as both implementations are kept in sync I don't see a reason.

Michael
-- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Lazy constraints / defaults
Hello,

I would like to be able to add a CONSTRAINT and/or DEFAULT without affecting old rows. Yes, it sounds strange, but... Let's say I have a big table and I want to add a new column with DEFAULT and NOT NULL. Normally that means a long exclusive lock. So right now I'm adding a plain new column, then the DEFAULT, then an UPDATE on all rows in chunks, then NOT NULL... Can it be a little simpler?

-- Regards, Michał Zaborowski (TeXXaS)
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp datatype cleanup
Michael Meskes wrote:
> On Sun, Mar 09, 2008 at 12:32:20AM -0800, Warren Turkal wrote:
>> 1) Is there a reason that header information is duplicated between the normal PostgreSQL includes and the ecpg includes, instead of defining the info in one place and #including it into the files that need it?

Merlin and I ran into this question while working on the libpq type system, found here: http://archives.postgresql.org/pgsql-patches/2008-03/msg00057.php

It opens up the binary parameterized interface in libpq by providing data type converters, to and from external binary format and C data types, when performing queries or getting results. This required copying and pasting some code from the backend for a couple of types, like datetime and numeric. We had to work around several backend compile-time checks by using run-time checks, since servers are compiled differently.

The most modular approach would be to abstract the backend/utils/adt API so the backend and client stuff can share the functionality. We did mention this during one of our patch submissions, but didn't push it as it is a large change outside the scope of our patch.

> As long as both implementations are kept in sync I don't see a reason.

Sharing the backend data type converters with client stuff is an obvious win, but it's a tedious process probably lacking any motivation. We did look at it though; it is possible.

-- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Idea: Comments on system catalogs?
On Fri, Mar 7, 2008 at 3:51 PM, Bruce Momjian [EMAIL PROTECTED] wrote:
> Jim C. Nasby wrote:
>> On Wed, Jul 04, 2007 at 01:03:20PM +0200, Dawid Kuroczko wrote:
>>> Hello. I think it could be a nice idea to put descriptions from http://www.postgresql.org/docs/8.2/static/catalogs.html into the system catalogs itself. I.e., make a bunch of COMMENT ON COLUMN pg_class.relname IS 'Name of the table, index, view, etc.'; ... COMMENT ON COLUMN pg_class.relkind IS 'r = ordinary table, i = index, S = sequence, v = view, c = composite type, t = TOAST table';
>> Actually, this does exist for some things in the catalog; I suspect it just wasn't done in the past (perhaps Postgres didn't originally have comments). I think it would be a useful addition. But I think it'd need to be more than just a .sql file (initdb would probably need to be modified). Ideally, we'd be able to suck the info out of the appropriate .sgml files.
> Added to TODO:
> * Add comments on system tables/columns using the information in catalogs.sgml
> Ideally the information would be pulled from the SGML file automatically.

Since I have raised the issue back then, I volunteer for doing this TODO. As I am lazy by nature, I am sure I will try to use the SGML files instead of a tedious copy-paste routine. ;-)

Regards, Dawid
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lazy constraints / defaults
Michał Zaborowski [EMAIL PROTECTED] writes:
> I would like to be able to add CONSTRAINT and/or DEFAULT without affecting old rows.

You mean without actually checking that the old rows satisfy the constraint? There's approximately zero chance that that proposal will be accepted.

> Yes, it sounds strange, but... Let's say I have a big table, I want to add a new column, with DEFAULT and NOT NULL. Normally it means a long exclusive lock. So - right now I'm adding a plain new column, then DEFAULT, then UPDATE on all rows in chunks, then NOT NULL... Can it be a little simpler?

Just do it all in one ALTER command.

alter table tab add column col integer not null default 42 check (col > 0);

regards, tom lane
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp datatype cleanup
> Sharing the backend data type converters with client stuff is an obvious win, but it's a tedious process probably lacking any motivation. We did look at it though; it is possible.

I thought we were just talking about some definitions. If we were to move the whole logic into one place we might be facing a performance penalty inside the backend, maybe not a large one though. Others around here are better suited to judge this.

Michael
-- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp datatype cleanup
> I thought we were just talking about some definitions.

I was expanding on your #1 question, which was directly talking about shared headers rather than just cleaning out HAVE_INT64_TIMESTAMP. I had the same experience but also ran into the need for shared library code, which BTW ecpg could benefit from as well.

> performance penalty inside the backend

The idea requires reorganizing, not reimplementing. There shouldn't be a change in performance in either direction.

-- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Maximum statistics target
Tom Lane wrote:
> Martijn van Oosterhout [EMAIL PROTECTED] writes:
>> On Fri, Mar 07, 2008 at 07:25:25PM +0100, Peter Eisentraut wrote:
>>> What's the problem with setting it to ten million if I have ten million values in the table and I am prepared to spend the resources to maintain those statistics?
>> That it'll probably take 10 million seconds to calculate the plans using it? I think Tom pointed out there are a few places that are O(n^2) in the number of entries...
> I'm not wedded to the number 1000 in particular --- obviously that's just a round number. But it would be good to see some performance tests with larger settings before deciding that we don't need a limit.

I recently encountered a situation where I would have liked to be able to try a larger limit (amongst other ideas for improving my situation): I have a field whose distribution of frequencies of values is roughly geometric, rather than flat.

Total rows = 36 million
relpages = 504864
Distinct field values in use = 169
10 values account for 50% of the rows.
41 values account for 90% of the rows.

After setting the statistics target to 1000 for that field and analyzing the table, the statistics row for that field had 75 most frequent values and a histogram with 76 entries in it, estimating 151 values in total.

For this situation, using a larger statistics target should result in more pages being read and a more accurate record of statistics. It shouldn't result in significantly more work for the planner. It wouldn't solve my problem though, which is frequent over-estimation of rows when restricting by this field with values not known at plan time.

Regards, Stephen Denne.

Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.
__ This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite. Please see http://www.dmzglobal.com/services/bqem.htm for details. __ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lazy constraints / defaults
On Sun, Mar 9, 2008 at 7:25 PM, Tom Lane [EMAIL PROTECTED] wrote:
> Michał Zaborowski [EMAIL PROTECTED] writes:
>> I would like to be able to add CONSTRAINT and/or DEFAULT without affecting old rows.
> You mean without actually checking that the old rows satisfy the constraint? There's approximately zero chance that that proposal will be accepted.

I think the problem here is to minimize the time when the table is held by an exclusive lock, something similar to CREATE INDEX CONCURRENTLY (i.e. hold the exclusive lock for a jiffy, then do the actual work for the old tuples). So, the proposal would read as to add the ability to perform:

ALTER TABLE CONCURRENTLY ALTER COLUMN foo SET NOT NULL

...where an exclusive lock would be held to place the constraint (so all new tuples would satisfy it), the lock would be released, and the old tuples would be checked to make sure the constraint is valid. Should a NULL value be found or should the backend die, the constraint should disappear or be marked invalid.

>> Yes, it sounds strange, but... Let's say I have a big table, I want to add a new column, with DEFAULT and NOT NULL. Normally it means a long exclusive lock. So - right now I'm adding a plain new column, then DEFAULT, then UPDATE on all rows in chunks, then NOT NULL... Can it be a little simpler?
> Just do it all in one ALTER command.
> alter table tab add column col integer not null default 42 check (col > 0);

I think this will not solve the OP's problem. He wants to minimize the time a table is under an exclusive lock, and this ALTER command will effectively rewrite the whole table (to add the new not null column).
Probably a workable solution would be to play with inheritance:

-- Add the NULL col column:
ALTER TABLE tab ADD COLUMN col integer;

-- Create a table which will have col NOT NULL
CREATE TABLE tab_new (LIKE tab INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES) INHERITS (tab);
ALTER TABLE tab_new ALTER COLUMN col SET NOT NULL;

-- Make the new values go to tab_new; if simple enough, the same might be done for UPDATEs
CREATE RULE insert_new AS ON INSERT TO tab DO INSTEAD INSERT INTO tab_new VALUES (NEW.*);

-- Now, make a job which will do something like this:
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE ONLY tab SET col = 42 WHERE id BETWEEN n AND n + 1000;
INSERT INTO tab_new SELECT * FROM ONLY tab WHERE id BETWEEN n AND n + 1000;
-- or better:
-- INSERT INTO tab_new SELECT a, b, c, 42 AS col FROM ONLY tab WHERE id BETWEEN n AND n + 1000 FOR UPDATE;
DELETE FROM ONLY tab WHERE id BETWEEN n AND n + 1000;
COMMIT;

-- Finally, exchange parti^W^W get rid of old tab:
SELECT count(*) FROM ONLY tab; -- should be zero
ALTER TABLE tab RENAME TO tab_old;
ALTER TABLE tab_new RENAME TO tab;
ALTER TABLE tab NO INHERIT tab_old;

Of course each step should be done in a transaction, probably starting with an explicit LOCK. And extra care should be taken with respect to UNIQUE constraints. In short: unless you are 100% sure what you are doing, don't. :-)

Regards, Dawid
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Estimating geometric distributions
I wrote: I have a field whose distribution of frequencies of values is roughly geometric, rather than flat. Total rows = 36 million relpages=504864 Distinct field values in use = 169 10 values account for 50% of the rows. 41 values account for 90% of the rows. After setting statistics target to 1000 for that field, and analyzing the table, the statistics row for that field had 75 most frequent values and a histogram with 77 entries in it. Estimating 152 values in total. public;mytable;myfield;0;4;152;{202,179,8,181,173,207,6,118,107,205,182,4,54,247,168,77,169,53,120,159,149,174,167,156,148,150,56,108,66,119,5,99,96,175,97,208,1,130,10,102,228,101,121,50,11,152,32,12,78,221,55,244,241,252,203,116,103,184,154,153,238,65,49,220,83,98,111,85,139,242,240,260,7,109,114};{0.0836433,0.0781667,0.0738367,0.0598533,0.04629,0.04447,0.0359833,0.0314267,0.0278333,0.0268,0.0251433,0.0244867,0.02438,0.0223433,0.0207567,0.0189667,0.0168833,0.01582,0.0150267,0.0141767,0.0130467,0.0128933,0.0125767,0.0123567,0.0116567,0.0114967,0.01048,0.01037,0.00994667,0.00987667,0.00977667,0.00965333,0.00916333,0.00828667,0.00732667,0.00712,0.00629,0.00624,0.00576667,0.00558667,0.00477667,0.00475333,0.00410333,0.00405667,0.00371667,0.00334667,0.00334,0.00312667,0.00312667,0.00302,0.00300333,0.00295,0.00287333,0.00271,0.00267,0.00240667,0.00224,0.00221333,0.00215333,0.0021,0.00205667,0.00202667,0.00197333,0.00197333,0.00168667,0.00166,0.00159333,0.00159,0.00154667,0.00150333,0.00149,0.0013,0.00132,0.00112667,0.00104};{2,9,9,9,67,76,84,84,86,87,87,88,95,100,100,100,104,105,105,110,112,112,128,137,137,138,143,144,144,144,151,155,155,155,157,157,158,171,171,183,185,185,185,185,187,194,199,199,200,200,201,204,204,209,209,214,214,214,214,215,217,225,225,225,229,239,239,249,250,250,253,253,255,257,261,262,266};0.449246 My problem is frequent over-estimation of rows when restricting by this field with values not known at plan time. 
examples:

select * from mytable where myfield = ?;
select * from mytable where myfield in (subquery);

My arithmetic mean of the frequencies is 214200. My geometric mean is 13444. However analyze didn't find all my values, and thinks that there are only 152 of them, so it uses a mean of 238046.

When the subquery is estimated to return three myfield values, the query estimates 714138 rows, and chooses a sequential scan over mytable (myfield is indexed).

explain select * from mytable where myfield in (values (1),(2),(3));
 Hash IN Join  (cost=0.08..1009521.37 rows=714138 width=86)
   Hash Cond: (mytable.myfield = *VALUES*.column1)
   ->  Seq Scan on mytable  (cost=0.00..866693.76 rows=36182976 width=86)
   ->  Hash  (cost=0.04..0.04 rows=3 width=4)
         ->  Values Scan on *VALUES*  (cost=0.00..0.04 rows=3 width=4)

I think this query is much more likely to return around 4 rows, and a Bitmap Index Scan should be used.

explain select * from mytable where myfield in (values (1),(2));
 Nested Loop  (cost=4445.11..931383.93 rows=476092 width=86)
   ->  Unique  (cost=0.04..0.04 rows=2 width=4)
         ->  Sort  (cost=0.04..0.04 rows=2 width=4)
               Sort Key: *VALUES*.column1
               ->  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2 width=4)
   ->  Bitmap Heap Scan on mytable  (cost=4445.08..462716.37 rows=238046 width=86)
         Recheck Cond: (mytable.myfield = *VALUES*.column1)
         ->  Bitmap Index Scan on myindex  (cost=0.00..4385.56 rows=238046 width=0)
               Index Cond: (mytable.myfield = *VALUES*.column1)

The expected number of loops (2 here, 3 above) through the Bitmap Heap Scan * 462716.37 > 1009521.37, but the cost estimate is far too high in the general case. It should be closer to 26000 per loop if adjusted for my expectation of the number of rows, being 13444 per loop. As such, you should need to expect close to 40 myfield values being returned by the subquery before choosing a sequential scan.

Is there any facility already in PostgreSQL to help me here? Hopefully an index type that I don't know about yet?
(Geometric distributions are similar to those found in word count distributions.)

If not... is there any merit in this idea: during the analyze process, the geometric mean of the sampled rows would be calculated and, if determined to be significantly different from the arithmetic mean, stored in a new stats column. When estimating the number of rows that will be returned by queries of the form shown above, if there is a geometric mean stored, use it instead of the arithmetic mean.

Regards, Stephen Denne.
[HACKERS] Cleaning up the commit fest to-do list
I've gone through the patch queue and marked all or at least most of the threads that don't contain any commit-fest material. (The remaining threads contain actual patches, or at least design discussions that seem worth getting more eyeballs on at the present time.) Could you pull out all that stuff into a separate TODO mailbox? It should not go into pgpatches-hold because it's not going to be material for future commit-fests, either, until someone does some actual work. I don't want to have to repeat this sorting exercise for the next commit fest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp datatype cleanup
Michael Meskes [EMAIL PROTECTED] writes:
>>> As long as both implementations are kept in sync I don't see a reason.
>> Sharing the backend data type converters with client stuff is an obvious win, but it's a tedious process probably lacking any motivation. We did look at it though; it is possible.
> I thought we were just talking about some definitions. If we were to move the whole logic into one place we might be facing a performance penalty inside the backend, maybe not a large one though. Others around here are better suited to judge this.

It's already a huge maintenance problem that there are two copies of the datetime code (and no, they are *not* being kept in sync; most patches to the core code have not gone into ecpg). The prospect of extending that mess to even more datatypes fills me with dread.

I don't know what we could do about it though. We can't really replicate the conveniences of the backend coding environment in a frontend application, and yet I surely don't want to tighten up the expectations on backend datatypes to try to make them work in a frontend-ish environment. The chances of that happening reliably are about nil.

My feeling is that we should resist the temptation to provide any such functionality on the frontend side (and yes, that includes ripping out ecpg's datetime stuff). It'll be a black hole sucking infinite numbers of maintenance man-hours, with not nearly enough benefit in return.

regards, tom lane
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] timestamp datatype cleanup
> conveniences of the backend coding environment in a frontend application

I don't think this is required. I'm thinking about light-weight wrappers around A-to-B style converters, with no ties to the backend or client. A free-standing util library. The idea doesn't require a backend environment.

> want to tighten up the expectations on backend datatypes to try to make them work in a frontend-ish environment.

Making timestamp2tm or tm2timestamp (for example) work in the backend as well as the client seems rather straightforward, unless I am missing something. I am not talking about making the data type functional, like add, divide, and multiply functions. I am only referring to data type converters.

> The chances of that happening reliably are about nil.

Why? Again, I could be missing something.

-- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump additional options for performance
Bruce Momjian [EMAIL PROTECTED] writes:
> Decibel! wrote:
>> On Feb 26, 2008, at 4:36 PM, Tom Lane wrote:
>>> I think a sane way to think about what Simon would like to accomplish is not turn psql into a parallel job scheduler
>> My $0.02: I often find myself wishing I could perform parallel operations in psql. There was a proposal for that that came up during 8.3 development; whatever happened to it?
> The concurrent psql patch was never updated to an acceptable state for it to be reviewed.

Well, I got tied in knots trying to get the SIGINT handling working properly. I want to try rewriting it from scratch to try to produce a cleaner version. But that doesn't mean nobody should look at the patch and give comments. It would be silly for me to rewrite it keeping things basically the way they are now, just cleaned up, only to then find out that people disagree with the basic approach and don't accept it anyway.

This may be a trivial example, but some pretty major patches which required a lot of work have been submitted before where nobody read any of the WIP patches, and then basic questions were raised long after a lot of work had been committed. We seem to often think of review as equivalent to commit. But patch authors often want to get some confirmation they're on the right track before they move on to the next step. As a result many patches kind of get stuck in a catch-22 where they're not ready for review and not ready for development either.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cleaning up the commit fest to-do list
Tom Lane wrote:
> I've gone through the patch queue and marked all or at least most of the threads that don't contain any commit-fest material. (The remaining threads contain actual patches, or at least design discussions that seem worth getting more eyeballs on at the present time.) Could you pull out all that stuff into a separate TODO mailbox? It should not go into pgpatches-hold because it's not going to be material for future commit-fests, either, until someone does some actual work. I don't want to have to repeat this sorting exercise for the next commit fest.

I will deal with them, add them to the TODO list, and remove them. I was hoping more people would add comments before I made them TODOs, but perhaps not.

-- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Include Lists for Text Search
Simon Riggs [EMAIL PROTECTED] writes:
> I've coded a small patch to allow CaseSensitive synonyms.

Applied with corrections (it'd be good if you at least pretended to test stuff before submitting it). Would a similar parameter be useful for any of the other dictionary types?

regards, tom lane
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Negative LIMIT and OFFSET?
Simon Riggs [EMAIL PROTECTED] writes:
> On Thu, 2007-12-13 at 22:23 -0800, Neil Conway wrote:
>> If LIMIT n means "emit at most n tuples", then a query that produces 0 rows with n < 0 is arguably violating its spec, since it has produced more tuples than the LIMIT specified (0 > n). Interpreted this way, no result set can be consistent with a negative limit, so I'd vote for throwing an error.
> I even found an existing, unused error code called ERRCODE_INVALID_LIMIT_VALUE, so here's a patch.

Applied, but using just ERRCODE_INVALID_PARAMETER_VALUE rather than guessing what the SQL committee intended with that SQLSTATE.

regards, tom lane
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Need -fwrapv or -fno-strict-overflow for gcc-4.3
A while back Tom Lane presented the potential problem with gcc's signed overflow handling, but now it's not just a theoretical problem.

http://archives.postgresql.org/pgsql-hackers/2005-12/msg00635.php

Gcc 4.3 has started to perform optimizations based on the denial of the existence of signed overflow. Building CVS HEAD with gcc 4.3rc2 I get the following warnings:

localtime.c:1223: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false
localtime.c:1227: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false
array_userfuncs.c:100: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false
float.c:2696: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false
float.c:2712: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false
oracle_compat.c:1479: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false

I don't understand the difference between -fwrapv and -fno-strict-overflow, but it seems we need at least one of them.

http://www.airs.com/blog/archives/120

Kris Jurka
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers