Re: [HACKERS] roundoff problem in time datatype
On Sun, 25 Sep 2005, Tom Lane wrote: Alternatively: why are we forbidding the value 24:00:00 anyway? Is there a reason not to allow the hours field to exceed 23? One reason is that it's what the standard demands. Another is that it isn't a proper time, just like Feb 31 isn't a proper date. -- /Dennis Björklund ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] expected authentication request from server, but received...
Hello, I have a very old postgres 6.4 installation. I'm trying to connect to it from a fresh, up-to-date Debian host, but I'm constantly getting (from DBI and PHP as well) this error message: # ./update_dns.pl DBI connect('dbname=dns;host=sql;authtype=password','dns',...) failed: expected authentication request from server, but received U at ./update_dns.pl line 237 Connection with psql works OK. Connecting from different (older) hosts with DBI/PHP works perfectly as well. Something must have changed in recent versions of libpq and now it doesn't work. I looked into the source code and found that the server's message at the very start of the conversation appears to be something other than what libpq expects. But how to fix it?! Has the protocol changed or what? P.S. Don't tell me to upgrade postgres :) It must stay like that for some reasons which are beyond me. -- Samotnik Michał Jęczalik, +48.695.64.75.14 www.zagle.org.pl - sea cruises
Re: [HACKERS] expected authentication request from server, but received...
On Mon, Sep 26, 2005 at 09:59:53AM +0200, Michał Jęczalik wrote: Connection with psql works OK. Connecting from different (older) hosts with DBI/PHP works perfectly as well. Something must have changed in recent versions of libpq and now it doesn't work. I looked into the source code and found that the server's message at the very start of the conversation appears to be something other than what libpq expects. But how to fix it?! Has the protocol changed or what? Umm, I think the protocol version is up to 3 or 4 now. I think libpq supports all the way back to 7.0; I don't know if anyone tests earlier than that. If you really need this to work, I suggest pulling the libpq from that release (CVS or tarball) or slightly later and installing that on the new machines. But you should upgrade: 6.4 hasn't had support for years, and who knows how many bugs... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
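For illustration, the "expected authentication request" check can be sketched like this. A protocol-3.0 StartupMessage is an int32 length, the int32 version number 196608, then NUL-terminated key/value pairs; after sending it, the client expects the server's first reply byte to be 'R' (authentication request) or 'E' (error). The helper names below are mine, not libpq's, and this is a sketch rather than libpq's actual logic:

```python
import struct

PROTOCOL_3_0 = 196608  # (3 << 16) | 0

def build_startup(user, database):
    """Build a protocol-3.0 StartupMessage: int32 length, int32 version,
    then NUL-terminated key/value pairs ending with an extra NUL."""
    body = struct.pack("!i", PROTOCOL_3_0)
    body += b"user\x00" + user.encode() + b"\x00"
    body += b"database\x00" + database.encode() + b"\x00\x00"
    return struct.pack("!i", len(body) + 4) + body

def classify_first_byte(b):
    # A modern client expects an authentication request ('R') or an
    # error ('E'); anything else (like the 'U' in this report) triggers
    # the "expected authentication request" complaint.
    return {b"R": "authentication request", b"E": "error"}.get(b, "unexpected")
```

A server that answers the startup packet with any other byte, as the old installation here apparently does, falls into the "unexpected" branch.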
Re: [HACKERS] expected authentication request from server, but
On Mon, 26 Sep 2005, Martijn van Oosterhout wrote: On Mon, Sep 26, 2005 at 09:59:53AM +0200, Michał Jęczalik wrote: Connection with psql works OK. Connecting from different (older) hosts with DBI/PHP works perfectly as well. Something must have changed in recent versions of libpq and now it doesn't work. I looked into the source code and found that the server's message at the very start of the conversation appears to be something other than what libpq expects. But how to fix it?! Has the protocol changed or what? Umm, I think the protocol version is up to 3 or 4 now. I think libpq supports all the way back to 7.0; I don't know if anyone tests earlier than that. If you really need this to work, I suggest pulling the libpq from that release (CVS or tarball) or slightly later and installing that on the new machines. Heh, that's what I'm currently doing, but it's not so easy to compile 6.4 on a new system. configure states that the compiler cannot create executables, but I ripped this check out of it. Anyway, something goes wrong while generating the Makefile:
sed: file conftest.s1 line 33: unterminated `s' command
creating backend/catalog/genbki.sh
sed: file conftest.s1 line 33: unterminated `s' command
creating backend/utils/Gen_fmgrtab.sh
sed: file conftest.s1 line 33: unterminated `s' command
creating bin/pg_dump/Makefile
sed: file conftest.s1 line 33: unterminated `s' command
creating bin/pg_version/Makefile
sed: file conftest.s1 line 33: unterminated `s' command
:( -- Samotnik Michał Jęczalik, +48.695.64.75.14 www.zagle.org.pl - sea cruises
[HACKERS] Inheritance + references fixup
A common problem with inheritance is that references work ONLY on the top table and are not propagated to children. Is it possible to write triggers (like utils/adt/ri_triggers.c) which will act on INSERT/UPDATE/DELETE and check/cascade/restrict properly? pg_attribute and pg_inherits can be used along with an additional (non-system) table to store the foreign keys.
Re: [HACKERS] expected authentication request from server, but
Michal Jeczalik wrote: On Mon, 26 Sep 2005, Martijn van Oosterhout wrote: On Mon, Sep 26, 2005 at 09:59:53AM +0200, Michał Jęczalik wrote: Connection with psql works OK. Connecting from different (older) hosts with DBI/PHP works perfectly as well. Something must have changed in recent versions of libpq and now it doesn't work. I looked into the source code and found that the server's message at the very start of the conversation appears to be something other than what libpq expects. But how to fix it?! Has the protocol changed or what? Umm, I think the protocol version is up to 3 or 4 now. I think libpq supports all the way back to 7.0; I don't know if anyone tests earlier than that. If you really need this to work, I suggest pulling the libpq from that release (CVS or tarball) or slightly later and installing that on the new machines. Heh, that's what I'm currently doing, but it's not so easy to compile 6.4 on a new system. configure states that the compiler cannot create executables, but I ripped this check out of it. Anyway, something goes wrong while generating the Makefile: Try to compile 7.0 or 7.1; those versions will probably still support the 6.4 wire protocol. I am sure someone with more historical knowledge (or reading the release notes back to 6.4) can tell you exactly the latest version compatible with 6.4. It was already said, but please -- upgrade. Any version before 7.2 is going to eat your data after a certain number of transactions. Best Regards, Michael Paesold
Re: [HACKERS] roundoff problem in time datatype
Actually, I think there is a case where 24:00 is a proper time. Isn't it used for adding leap seconds? Dave On 26-Sep-05, at 3:39 AM, Dennis Bjorklund wrote: On Sun, 25 Sep 2005, Tom Lane wrote: Alternatively: why are we forbidding the value 24:00:00 anyway? Is there a reason not to allow the hours field to exceed 23? One reason is that it's what the standard demands. Another is that it isn't a proper time, just like Feb 31 isn't a proper date. -- /Dennis Björklund
Re: [HACKERS] roundoff problem in time datatype
Dave Cramer [EMAIL PROTECTED] writes: Actually, I think there is a case where 24:00 is a proper time. Isn't it used for adding leap seconds? No, I think the usual notation for a leap-second is '23:59:60'. We do allow 60 in the seconds field for this purpose. I suppose there's another possible approach, which is to special-case the output of this value to look like '23:59:60' instead of '24:00:00'. Then it could be reloaded. On the whole though, most people who came across that behavior would probably think it's a bug... regards, tom lane
Re: [HACKERS] expected authentication request from server, but received...
Michał Jęczalik [EMAIL PROTECTED] writes: I have a very old postgres 6.4 installation. I'm trying to connect to it from a fresh, up-to-date Debian host, but I'm constantly getting (from DBI and PHP as well) this error message: # ./update_dns.pl DBI connect('dbname=dns;host=sql;authtype=password','dns',...) failed: expected authentication request from server, but received U at ./update_dns.pl line 237 You certain it's 6.4, and not even older? A look in the CVS logs says that we were using protocol 2.0 starting in 6.4, so current libpq should still talk to it. P.S. Don't tell me to upgrade postgres :) It must stay like that for some reasons which are beyond me. 6.4 *will* eat your data eventually. Don't say you weren't warned. regards, tom lane
[HACKERS] Server process exited with unexpected status 128.
Hello pgsql-hackers, When I try to execute the next SQL statement, the server crashed: DELETE FROM ma_data WHERE id in (-1,212803,. ); The elided part is 500k of text: ids separated by commas, about 10 values. Run from a pl/pgsql function, like EXECUTE st;. postgresql-2005-09-25_00.log:
2005-09-26 15:45:52 LOG: server process (PID 2040) exited with unexpected status 128
2005-09-26 15:45:52 LOG: terminating any other active server processes
2005-09-26 15:45:52 WARNING: terminating connection because of crash of another server process
2005-09-26 15:45:52 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2005-09-26 15:45:52 HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-09-26 15:45:52 WARNING: terminating connection because of crash of another server process
2005-09-26 15:45:52 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2005-09-26 15:45:52 HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-09-26 15:45:52 WARNING: terminating connection because of crash of another server process
2005-09-26 15:45:52 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2005-09-26 15:45:52 HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-09-26 15:45:52 LOG: all server processes terminated; reinitializing
2005-09-26 15:45:52 LOG: database system was interrupted at 2005-09-26 15:45:48 FLE Daylight Time
2005-09-26 15:45:52 LOG: checkpoint record is at 1/1720CF88
2005-09-26 15:45:52 LOG: redo record is at 1/17008C80; undo record is at 0/0; shutdown FALSE
2005-09-26 15:45:52 LOG: next transaction ID: 40476; next OID: 1836657
2005-09-26 15:45:52 LOG: next MultiXactId: 102; next MultiXactOffset: 202
2005-09-26 15:45:52 LOG: database system was not properly shut down; automatic recovery in progress
2005-09-26 15:45:52 LOG: redo starts at 1/17008C80
2005-09-26 15:45:53 LOG: unexpected pageaddr 1/11A7C000 in log file 1, segment 24, offset 10993664
2005-09-26 15:45:53 LOG: redo done at 1/18A7AA08
2005-09-26 15:46:02 LOG: database system is ready

#---
# RESOURCE USAGE (except WAL)
#---
# - Memory -
shared_buffers = 8000 # min 16 or max_connections*2, 8KB each
temp_buffers = 1 # min 100, 8KB each
#max_prepared_transactions = 5 # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 20 # min 64, size in KB
maintenance_work_mem = 65536 # min 1024, size in KB
max_stack_depth = 65536 # min 100, size in KB

PG 8.1beta2 WIN32. -- Best regards, Repko Andrey Vladimirovich mailto:[EMAIL PROTECTED]
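As the later replies show, a 500 kB IN (...) list produces a very deep parse tree, which is what trips the stack-depth check. One hedged workaround (a sketch of the statement-building side only; it assumes a driver that can pass an array parameter, e.g. via a `%s` placeholder) is to send one `= ANY(array)` parameter instead of expanding every id into the SQL text:

```python
def delete_by_ids_sql(ids):
    """Return (sql, params) using one array parameter rather than a huge
    IN (...) list; '= ANY(array)' keeps the parse tree flat no matter how
    many ids are passed."""
    if not ids:
        raise ValueError("empty id list")
    return "DELETE FROM ma_data WHERE id = ANY(%s)", ([int(i) for i in ids],)
```

The statement text stays constant-size regardless of how many ids are deleted.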
Re: [HACKERS] expected authentication request from server, but
On Mon, Sep 26, 2005 at 10:28:48AM +0200, Michal Jeczalik wrote: Heh, that's what I'm currently doing, but it's not so easy to compile 6.4 on a new system. configure states that the compiler cannot create executables, but I ripped this check out of it. Anyway, something goes wrong while generating the Makefile: sed: file conftest.s1 line 33: unterminated `s' command I bet this is the multiline gcc --version. You could try wrapping gcc with a script that returns a single-line --version. (Or maybe using an older gcc release.) -- Alvaro Herrera http://www.PlanetPostgreSQL.org World domination is proceeding according to plan (Andrew Morton)
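Alvaro's suggested wrapper can be sketched like this (a hypothetical shim, not a tested fix for 6.4's configure): put it first on the PATH under the name `gcc`, and it collapses the multi-line `--version` output, whose embedded newline is what breaks the old configure's generated sed `s` command, to a single line while passing everything else through:

```python
import subprocess

def first_line(text):
    """Collapse multi-line version output to its first line, which is all
    the old configure's sed substitution can digest."""
    return text.splitlines()[0] if text else ""

def run_gcc(args):
    """Hypothetical wrapper body: delegate to the real gcc, but print only
    the first line when --version is requested."""
    if "--version" in args:
        out = subprocess.run(["gcc"] + args, capture_output=True, text=True).stdout
        print(first_line(out))
        return 0
    return subprocess.call(["gcc"] + args)
```

A one-line `/bin/sh` script doing `gcc --version | head -1` would serve the same purpose.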
Re: [HACKERS] roundoff problem in time datatype
On Mon, 26 Sep 2005, Tom Lane wrote: Actually, I think there is a case where 24:00 is a proper time. Isn't it used for adding leap seconds? No, I think the usual notation for a leap-second is '23:59:60'. We do allow 60 in the seconds field for this purpose. Yes, and it can go up to 23:59:60.99 (depending on how many fractional seconds one wants). I suppose there's another possible approach, which is to special-case the output of this value to look like '23:59:60' instead of '24:00:00'. You would get the same problem with 23:59:60.9, which I guess you want to round up. One solution is to round '23:59:59.9'::time(0) up to '00:00:00'. That is normally the next following time value, after all. I know why you might not want to round it up to 00:00:00, but it's one logical solution. By the way, here is another example of the same problem:
# SELECT time '23:59:59.9' + interval '0.1';
 ?column?
----------
 24:00:00
# SELECT time '23:59:59.9' + interval '0.11';
  ?column?
-------------
 00:00:00.01
(1 row)
-- /Dennis Björklund
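Dennis's wrap can be reproduced on a microseconds-since-midnight representation; whether the rounded value is allowed to reach 24:00:00 or is reduced modulo 24 hours is exactly the policy question under discussion. This is a sketch with names of my own choosing, not PostgreSQL's code:

```python
DAY_US = 24 * 60 * 60 * 1_000_000  # microseconds in a day

def round_time_us(us, precision, wrap=False):
    """Round a time, given as microseconds since midnight, to `precision`
    fractional digits (0..6). With wrap=True the result is reduced modulo
    24 hours, so '23:59:59.9'::time(0) becomes midnight; without it the
    result can reach DAY_US, i.e. the disputed '24:00:00'."""
    q = 10 ** (6 - precision)
    r = (us + q // 2) // q * q   # round half up to the nearest quantum
    return r % DAY_US if wrap else r

# '23:59:59.9' is 86_399_900_000 microseconds into the day
```

Rounding that value at precision 0 yields exactly DAY_US, the 24:00:00 case; the wrap=True policy instead carries it over to 00:00:00.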
Re: [HACKERS] roundoff problem in time datatype
Dennis Bjorklund [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Tom Lane wrote: No, I think the usual notation for a leap-second is '23:59:60'. We do allow 60 in the seconds field for this purpose. Yes, and it can go up to 23:59:60.99 (depending on how many fractional seconds one wants). That's an urban legend. There never have been, and never will be, two leap seconds instituted in the same minute. We really should reject anything larger than '23:59:60'. One solution is to round '23:59:59.9'::time(0) up to '00:00:00'. 7.2 did that, and we concluded it was broken. regards, tom lane
Re: [HACKERS] roundoff problem in time datatype
On Mon, 26 Sep 2005, Tom Lane wrote: Yes, and it can go up to 23:59:60.99 (depending on how many fractional seconds one wants). That's an urban legend. There never have been, and never will be, two leap seconds instituted in the same minute. We really should reject anything larger than '23:59:60'. The above is still just one leap second. The time continues to tick until it wraps over to 00:00:00. So, for example, a time value of 23:59:60.42 exists even if we allow just one leap second. One solution is to round '23:59:59.9'::time(0) up to '00:00:00'. 7.2 did that, and we concluded it was broken. Doesn't mean that it was necessarily a correct conclusion (and I'm not stating that it was wrong; I would like to think about it for a while before I claim something like that). Does the SQL standard say anything on the matter? -- /Dennis Björklund
Re: [HACKERS] Server process exited with unexpected status 128.
Андрей Репко [EMAIL PROTECTED] writes: When I try to execute the next SQL statement, the server crashed: DELETE FROM ma_data WHERE id in (-1,212803,. ); The elided part is 500k of text: ids separated by commas, about 10 values. I wouldn't be too surprised that that ran the server out of memory. The recovery ought to be a little more graceful though :-( ... and it is, on my machine:
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter max_stack_depth.
CONTEXT: SQL statement DELETE FROM ma_data WHERE id in (1,2,3,4,5,6,7,8, ... much omitted ... 0,1,2,3,4,5,6,7,8,9,10,0); PL/pgSQL function blowup line 7 at execute statement
I'm guessing something is wrong with the stack depth check on Windows. It's passing the regression test though, so maybe the issue is specific to your machine? What variant of Windows have you got exactly? regards, tom lane
Re: [HACKERS] Server process exited with unexpected status 128.
[ looking again... ] Андрей Репко [EMAIL PROTECTED] writes: max_stack_depth = 65536 # min 100, size in KB Hmm, maybe this is the problem. Are we sure Windows will allow a 64M stack? regards, tom lane
Re: [HACKERS] Server process exited with unexpected status 128.
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane Sent: 26 September 2005 15:47 To: Андрей Репко Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Server process exited with unexpected status 128. [ looking again... ] Андрей Репко [EMAIL PROTECTED] writes: max_stack_depth = 65536 # min 100, size in KB Hmm, maybe this is the problem. Are we sure Windows will allow a 64M stack? Looks like we used 4MB in the backend by default: http://archives.postgresql.org/pgsql-committers/2005-01/msg00386.php Regards, Dave.
Re: [HACKERS] Server process exited with unexpected status 128.
Dave Page dpage@vale-housing.co.uk writes: [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane max_stack_depth = 65536 # min 100, size in KB Hmm, maybe this is the problem. Are we sure Windows will allow a 64M stack? Looks like we used 4MB in the backend by default: http://archives.postgresql.org/pgsql-committers/2005-01/msg00386.php D'oh. Well, at the very least we have a documentation issue here. Is it sensible to try to prevent people from raising the GUC variable higher than the platform will allow? It seems we can know the limit on Windows, but on most other platforms I don't think there's any good way to find it out. (Which is why max_stack_depth is a SUSET variable --- you're assumed to know what you are doing if you change it.) regards, tom lane
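On Unix-like platforms there is at least a partial answer to Tom's question: compare the proposed depth against the process's RLIMIT_STACK soft limit, minus a safety margin. A sketch with hypothetical names (the 512 kB margin is an assumption; on Windows one would instead have to consult the stack reserve baked into the executable):

```python
import resource  # Unix-only

STACK_MARGIN = 512 * 1024  # hypothetical safety margin, in bytes

def depth_ok(proposed_bytes, soft_limit_bytes, margin=STACK_MARGIN):
    """True if the proposed max stack depth fits under the soft rlimit."""
    if soft_limit_bytes == resource.RLIM_INFINITY:
        return True  # nothing to check against
    return proposed_bytes + margin <= soft_limit_bytes

def max_stack_depth_is_safe(proposed_kb):
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return depth_ok(proposed_kb * 1024, soft)
```

With this check, the reported setting of 65536 kB would be rejected against a 4 MB backend stack, while still allowing anything when the limit is unlimited or unknowable.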
Re: [HACKERS] roundoff problem in time datatype
Tom Lane [EMAIL PROTECTED] writes: That's an urban legend. There never have been, and never will be, two leap seconds instituted in the same minute. We really should reject anything larger than '23:59:60'. I don't understand. The last second of a normal minute goes from 59.0 to 59.999 (etc) before the next minute begins. So surely the last second of a minute containing a leap second goes from 60.0 to 60.999? -- greg
Re: [HACKERS] Server process exited with unexpected status 128.
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: 26 September 2005 16:01 To: Dave Page Cc: Андрей Репко; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Server process exited with unexpected status 128. Dave Page dpage@vale-housing.co.uk writes: [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane max_stack_depth = 65536 # min 100, size in KB Hmm, maybe this is the problem. Are we sure Windows will allow a 64M stack? Looks like we used 4MB in the backend by default: http://archives.postgresql.org/pgsql-committers/2005-01/msg00386.php D'oh. Well, at the very least we have a documentation issue here. Is it sensible to try to prevent people from raising the GUC variable higher than the platform will allow? It seems we can know the limit on Windows, but on most other platforms I don't think there's any good way to find it out. (Which is why max_stack_depth is a SUSET variable --- you're assumed to know what you are doing if you change it.) I think it's sensible if it's a limit we can find relatively easily. In this case, though, it sounds like that is not the case. Perhaps we could issue a warning at startup if the value seems like it might be over the top? I assume the current limit is purely down to the data type. Regards, Dave
Re: [HACKERS] roundoff problem in time datatype
On 9/26/05, Dennis Bjorklund wrote: On Sun, 25 Sep 2005, Tom Lane wrote: Alternatively: why are we forbidding the value 24:00:00 anyway? Is there a reason not to allow the hours field to exceed 23? One reason is that it's what the standard demands. Could you cite that? The only thing I can find in the SQL standard is that the hour field in an INTERVAL cannot exceed 23, not in datetimes. Another is that it isn't a proper time, just like Feb 31 isn't a proper date. IIRC ISO 8601 (to which the SQL standard points) says 2005-10-01T24:00:00 is valid (and happens to be the same as 2005-10-02T00:00:00). It does seem a bit inconsistent with the spec of an interval, though. Jochem
Re: [HACKERS] expected authentication request from server, but
On Mon, 26 Sep 2005, Tom Lane wrote: Michał Jęczalik [EMAIL PROTECTED] writes: I have a very old postgres 6.4 installation. I'm trying to connect to it from a fresh, up-to-date Debian host, but I'm constantly getting (from DBI and PHP as well) this error message: # ./update_dns.pl DBI connect('dbname=dns;host=sql;authtype=password','dns',...) failed: expected authentication request from server, but received U at ./update_dns.pl line 237 I went back and built the 6.4 branch just to confirm that current libpq and psql would still talk to a 6.4 server. They do (modulo most of psql's \d commands not working). So there's something more here than a simple version mismatch. Is it possible that DBI is using some private all-Perl client library rather than libpq to talk to the server? Quite possible. I managed to compile 7.4 and that error message disappeared. Anyway, perl dumped core at the very start of the connection. ;) I downgraded DBD::Pg to 1.22 and... it worked! I did not try to upgrade libpq back to 8.x, but... PHP didn't work either with the recent version of the libs. After downgrading to 7.4 it worked fine. So there must be something about libpq anyway. Regards, -- Samotnik Michał Jęczalik, +48.695.64.75.14 www.zagle.org.pl - sea cruises
Re: [HACKERS] Vacuum questions...
Joshua D. Drake wrote: Hannu Krosing wrote: On L, 2005-09-24 at 20:25 -0700, Joshua D. Drake wrote: Actually this also probably would not gain you much in 8.1, as vacuum in theory is already dealing with itself. Interesting. Could you explain it in a more detailed way? How does vacuum deal with itself in 8.1? Autovacuum is integrated into the backend for 8.1. Can I set the autovacuum parameters per table instead of per engine? I'm using pg_autovacuum right now in 7.4 and it is not enough, because some tables (one that implements a materialized view, for example) fall outside average engine usage, and other tables are so huge as not to be analyzed for months. Regards Gaetano Mendola
Re: [HACKERS] roundoff problem in time datatype
Jochem van Dieten [EMAIL PROTECTED] writes: On 9/26/05, Dennis Bjorklund wrote: One reason is that it's what the standard demands. Could you cite that? The only thing I can find in the SQL standard is that the hour field in an INTERVAL cannot exceed 23, not in datetimes. SQL99 has Table 11, "Valid values for datetime fields":

  YEAR             0001 to 9999
  MONTH            01 to 12
  DAY              Within the range 1 (one) to 31, but further constrained
                   by the value of the MONTH and YEAR fields, according to
                   the rules for well-formed dates in the Gregorian calendar.
  HOUR             00 to 23
  MINUTE           00 to 59
  SECOND           00 to 61.9(N), where 9(N) indicates the number of digits
                   specified by time fractional seconds precision.
  TIMEZONE_HOUR    -12 to 13
  TIMEZONE_MINUTE  -59 to 59

NOTE 62 - Datetime data types will allow dates in the Gregorian format to be stored in the date range 0001-01-01 CE through 9999-12-31 CE. The range for SECOND allows for as many as two leap seconds. Interval arithmetic that involves leap seconds or discontinuities in calendars will produce implementation-defined results.

The urban legend about needing 2 leap seconds in the same minute has infected the standard, I see. It should only allow 60 as the max value for SECOND. Note however that we feel free to exceed the spec in other aspects of this --- we exceed their year range, for instance. So I don't think we necessarily have to reject '24:00:00'. Also, the spec explicitly states that arithmetic on TIME values is done modulo 24 hours. So it's correct for '23:59:59'::time + '1 second'::interval to yield '00:00:00', but this does not necessarily mean that we should cause rounding to behave that way. Depends whether you think that rounding is an arithmetic operation or not ... regards, tom lane
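The spec's modulo-24-hours rule for TIME arithmetic is easy to state on a microseconds-since-midnight representation (a sketch; the names are mine), and it reproduces both of Dennis's earlier examples:

```python
DAY_US = 24 * 60 * 60 * 1_000_000  # microseconds in a day

def time_plus_interval(time_us, interval_us):
    """SQL-spec arithmetic on TIME: the sum is reduced modulo 24 hours,
    so '23:59:59' + '1 second' yields '00:00:00'."""
    return (time_us + interval_us) % DAY_US
```

Note that under this rule '23:59:59.9' + '0.11' lands at 00:00:00.01, whereas '23:59:59.9' + '0.1' lands exactly on the day boundary, which is where the 24:00:00-versus-00:00:00 question arises.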
Re: [HACKERS] roundoff problem in time datatype
Dennis Bjorklund [EMAIL PROTECTED] writes: Does the SQL standard say anything on the matter? It doesn't seem very helpful. AFAICS, we should interpret storing '23:59:59.99' into a TIME(0) field as a cast from TIME(2) to TIME(0), and the spec defines that as: 15) If TD is the datetime data type TIME WITHOUT TIME ZONE, then let TSP be the time precision of TD. b) If SD is TIME WITHOUT TIME ZONE, then TV is SV, with implementation-defined rounding or truncation if necessary. So it's implementation-defined what we do. regards, tom lane
Re: [HACKERS] roundoff problem in time datatype
Tom Lane wrote: Dennis Bjorklund [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Tom Lane wrote: No, I think the usual notation for a leap-second is '23:59:60'. We do allow 60 in the seconds field for this purpose. Yes, and it can go up to 23:59:60.99 (depending on how many fractional seconds one wants). That's an urban legend. There never have been, and never will be, two leap seconds instituted in the same minute. We really should reject anything larger than '23:59:60'. Mmm. The second 60 has a duration of 1 second, so 23:59:60.4 has a right to exist. Regards Gaetano Mendola
Re: [HACKERS] Vacuum questions...
On Mon, Sep 26, 2005 at 05:41:24PM +0200, Gaetano Mendola wrote: Joshua D. Drake wrote: Autovacuum is integrated into the backend for 8.1 Can I set the autovacuum parameter per table instead of per engine? Yes. -- Alvaro Herrera Architect, http://www.EnterpriseDB.com Jude: I wish humans laid eggs Ringlord: Why would you want humans to lay eggs? Jude: So I can eat them
Re: [HACKERS] roundoff problem in time datatype
Tom Lane wrote: Dennis Bjorklund [EMAIL PROTECTED] writes: Does the SQL standard say anything on the matter? It doesn't seem very helpful. AFAICS, we should interpret storing '23:59:59.99' into a TIME(0) field as a cast from TIME(2) to TIME(0), and the spec defines that as: 15) If TD is the datetime data type TIME WITHOUT TIME ZONE, then let TSP be the time precision of TD. b) If SD is TIME WITHOUT TIME ZONE, then TV is SV, with implementation-defined rounding or truncation if necessary. So it's implementation-defined what we do. IMHO, since 23:59:59.99 probably means the last milliseconds of this day, as far as the precision allows expressing it, this should be truncated to 23:59:59, not rounded to 24:00:00. Until the last microsecond has elapsed, it isn't 24 hours (you wouldn't round to happy new year at 23:59:30 from a clock with only minutes, either). Regards, Andreas
Re: [HACKERS] roundoff problem in time datatype
On Mon, 26 Sep 2005, Tom Lane wrote: b) If SD is TIME WITHOUT TIME ZONE, then TV is SV, with implementation-defined rounding or truncation if necessary. So it's implementation-defined what we do. Truncation would avoid the problem, but of course loses some of the info. So, what are the alternatives:

* Truncation.
* Rounding, letting it wrap when rounding up towards midnight.
* Rounding, never letting it wrap. The cases that would wrap go to 23:59:59 (or 23:59:59.9 and so on for other precisions), or to 23:59:60 (or 23:59:60.9 and so on) if one starts with a leap-second time.

Are there any more viable cases? -- /Dennis Björklund
Re: [HACKERS] expected authentication request from server, but
On Mon, 26 Sep 2005, Michael Paesold wrote: to be something other than what libpq expects. But how to fix it?! Has the protocol changed or what? Umm, I think the protocol version is up to 3 or 4 now. I think libpq supports all the way back to 7.0; I don't know if anyone tests earlier than that. If you really need this to work, I suggest pulling the libpq from that release (CVS or tarball) or slightly later and installing that on the new machines. Heh, that's what I'm currently doing, but it's not so easy to compile 6.4 on a new system. configure states that the compiler cannot create executables, but I ripped this check out of it. Anyway, something goes wrong while generating the Makefile: Try to compile 7.0 or 7.1; those versions will probably still support the 6.4 wire protocol. I am sure someone with more historical knowledge (or reading the release notes back to 6.4) can tell you exactly the latest version compatible with 6.4. OK. It worked with 7.4 + DBD no newer than 1.22; more recent versions simply make perl dump core while connecting to postgres. ;) -- Samotnik Michał Jęczalik, +48.695.64.75.14 www.zagle.org.pl - sea cruises
[HACKERS] vacuum analyze hanging
We have what may be a bug in beta2. We have two databases running beta2, one on Linux and one on Windows. The hardware is identical. The configuration files are identical. They are being fed identical streams of data modifications (primarily inserts, very few deletes). We've been running this setup for a few days. The performance has matched very closely -- until now. We are running the new autovacuum with default parameters. We have run an explicit vacuum analyze verbose on the full database (as the database owner, not the superuser) twice. These went fine, and didn't show any numbers that led us to believe we needed to adjust the defaults for autovacuum. About an hour ago we started an explicit vacuum analyze (without verbose this time). The Windows box finished in about one minute, and the Linux box -- I was going to say it was still running, but it completed as I was typing this, after running over an hour. While it was stuck, the message about skipping pg_database had not yet appeared (the other five messages like it had appeared). The pg_database message appeared very close to the time of completion. The normal processing continued while the vacuums were run, in all cases.
This remained static during the stuck state:

postgres=# select * from pg_locks where transaction = 8536365;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction |  pid  |           mode           | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+--------------------------+---------
 relation      |    30793 |     2666 |      |       |               |         |       |          |     8536365 | 31798 | RowExclusiveLock         | t
 relation      |    30793 |     2666 |      |       |               |         |       |          |     8536365 | 31798 | ShareUpdateExclusiveLock | t
 relation      |    30793 |     2664 |      |       |               |         |       |          |     8536365 | 31798 | ShareUpdateExclusiveLock | t
 relation      |    30793 |     2665 |      |       |               |         |       |          |     8536365 | 31798 | ShareUpdateExclusiveLock | t
 transactionid |          |          |      |       |       8536365 |         |       |          |     8536365 | 31798 | ExclusiveLock            | t
 relation      |    30793 |     2606 |      |       |               |         |       |          |     8536365 | 31798 | ShareUpdateExclusiveLock | t
 relation      |    30793 |     2667 |      |       |               |         |       |          |     8536365 | 31798 | ShareUpdateExclusiveLock | t
(7 rows)

The only other locks were on other connections and were very short-lived. Time accumulated on the VACUUM process also remained constant (at 3:04), while the other connections slowly increased (from 0:41) while I watched:

linpost:/opt/ccap/dtr # ps aux|grep ^postgres
postgres 31603  0.0  0.0 170968   4952 ?  S  12:37  0:00 /usr/local/pgsql/bin/postmaster -D /var/pgsql/data
postgres 31605  0.0  1.9 171196 165672 ?  S  12:37  0:03 postgres: writer process
postgres 31606  0.2  0.0   7240   2788 ?  S  12:37  0:39 postgres: stats buffer process
postgres 31607  0.2  0.0   6500   2020 ?  S  12:37  0:35 postgres: stats collector process
postgres 31614  0.5  1.6 171992 135064 ?  S  12:39  1:23 postgres: dtr dtr 127.0.0.1(33384) idle
postgres 31615  0.5  1.6 172008 133152 ?  S  12:39  1:23 postgres: dtr dtr 127.0.0.1(33386) idle
postgres 31616  0.5  1.6 172008 133264 ?  S  12:39  1:22 postgres: dtr dtr 127.0.0.1(33388) idle
postgres 31617  0.5  1.6 172008 132964 ?  S  12:39  1:23 postgres: dtr dtr 127.0.0.1(33390) idle
postgres 31618  0.5  1.6 172008 133168 ?  S  12:39  1:23 postgres: dtr dtr 127.0.0.1(33392) idle
postgres 31619  0.5  1.6 171992 133952 ?  S  12:39  1:24 postgres: dtr dtr 127.0.0.1(33394) idle
postgres 31798  1.5  2.0 189036 167752 ?  S  13:42  3:04 postgres: dtr dtr 165.219.88.77(2313) VACUUM

Upon completion of the VACUUM, that 3:04 went to 3:21, without any other commands being issued on the connection. Here is what is not commented out in the postgresql.conf file:

listen_addresses = '*'           # what IP interface(s) to listen on;
max_connections = 100            # note: increasing max_connections costs
shared_buffers = 2               # min 16 or max_connections*2, 8KB each
work_mem = 10240                 # min 64, size in KB
wal_buffers = 20                 # min 4, 8KB each
effective_cache_size = 393216    # typically 8KB each
random_page_cost = 2             # units are one sequential page fetch
stats_start_collector = on
stats_row_level = on
autovacuum = true                # enable autovacuum subprocess?
lc_messages = 'C'                # locale for system error message
lc_monetary = 'C'
Re: [HACKERS] openbsd, plpython, missing threading symbols
Patch applied. Thanks. --- Marko Kreen wrote: On Fri, Aug 12, 2005 at 10:27:16PM -0400, Bruce Momjian wrote: Where are we going with this patch? It doesn't test specific OS's known to fail. I hoped people more familiar with the problem would tune it... Here is updated patch where I test specifically 'openbsd*|freebsd*' instead of '*bsd*'. Although AFAIK all BSD's use similar libc/libc_r setup so *bsd* should have been fine. Also I 'clarified' the error message a bit. There is one notable feature of this patch - it will check for threaded Python on all platforms and print the result, this hopefully helps tracking problems on other platforms too. -- marko [ Attachment, skipping... ] ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Open items list for 8.1
Here are the open items for 8.1:

PostgreSQL 8.1 Open Items
=========================

Current version at http://candle.pha.pa.us/cgi-bin/pgopenitems or from
http://www.postgresql.org/developer/beta.

Changes
-------
Win32 signal handling patch (Magnus)
fix pg_dump --clean for roles
consider O_SYNC as default when O_DIRECT exists
test terminate_backend()?
/contrib move to pgfoundry
bump major library version number?
foreign trigger timing issue
pgindent
make sure bitmap scan optimizer settings are reasonable
fix ALTER SCHEMA RENAME for sequence dependency, or remove feature
spinlock performance
fix semantic issues of granted permissions in roles
fix pgxs for Win32 paths

Documentation
-------------
document control over partial page writes

Fixed Since Last Beta
---------------------

-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] On Logging
Folks, I've run into something that concerns me. It's pretty much an 8.2 issue, but I'm hoping to stimulate some discussion on it. It's PostgreSQL's log files. Right now, they're (sometimes just barely ;) human-readable, but they take significant effort to parse. For example, pqa, a very clever piece of code, is mostly devoted to parsing said files and works only with significant tweaking and restrictions on log file formats in 8.0. Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: * CSV * YAML * XML * Piped logs, as Apache can do * DB handle. I know this one will be controversial. I'm thinking that a GUC variable (or should there be a class of them?) called log_format would be part of the user interface to this and would be able to switch from the cheap default code path to one that's more expensive, just as log_statement does. So, a few questions: 1. Am I the only one who wants an option for machine-readable logs? 2. Am I way off with the idea for an architecture for same? 3. What big things am I missing here? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] roundoff problem in time datatype
Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: b) If SD is TIME WITHOUT TIME ZONE, then TV is SV, with implementation-defined rounding or truncation if necessary. So it's implementation-defined what we do. IMHO Since 23:59:59.99 probably means the last milliseconds of this day, as far as precision allows to express it, this should be truncated to 23:59:59, not rounded to 24:00:00. Until the last microsecond has elapsed, it's not 24 hours (you wouldn't round happy new year at 23:59:30 from a clock with minutes only either) Hm, so the proposal is round unless that would produce 24:00:00, in which case truncate? Seems a bit ugly but it would follow the letter of the spec, and avoid rejecting inputs that we used to accept. It's still not very clear what to do with '23:59:60.9' though. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
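The round-unless-it-would-read-24:00:00 rule under discussion can be sketched in a few lines of C. This is an illustrative sketch only, not PostgreSQL's actual timetypes.c code; the function and constant names are made up for the example. Time-of-day is represented as microseconds since midnight, and precision is the number of fractional digits of the time(p) type:

```c
#include <stdint.h>

#define USECS_PER_DAY INT64_C(86400000000)

/* Round t_usec to "precision" fractional digits, but fall back to
 * truncation whenever rounding would overflow to 24:00:00. */
static int64_t time_round(int64_t t_usec, int precision)
{
    static const int64_t unit_for[7] = {
        INT64_C(1000000), INT64_C(100000), INT64_C(10000),
        INT64_C(1000), INT64_C(100), INT64_C(10), INT64_C(1)
    };
    int64_t unit = unit_for[precision];
    int64_t rounded = ((t_usec + unit / 2) / unit) * unit;

    if (rounded >= USECS_PER_DAY)           /* would read 24:00:00 */
        rounded = (t_usec / unit) * unit;   /* truncate instead */
    return rounded;
}
```

Under this rule '23:59:59.99' cast to time(0) truncates to 23:59:59 as Andreas suggests, while an ordinary value like 12:30:00.6 still rounds up to 12:30:01.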
Re: [HACKERS] Patching dblink.c to avoid warning about open transaction
[ Joe, would you review this? ] Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Bruce Momjian wrote: Tom Lane wrote: Jonathan Beit-Aharon [EMAIL PROTECTED] writes: nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; if (!conn)br nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp;nbsp; DBLINK_CONN_NOT_AVAIL;br Could we see this in a less broken format? Here is the patch in text format. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073

--- dblink.c	Sat Jan  1 00:43:05 2005
+++ /home/jbeitaharon/dev/third/postgreSQL/contrib/dblink/dblink.c	Thu Sep 22 16:10:20 2005
@@ -329,12 +329,16 @@
 	if (!conn)
 		DBLINK_CONN_NOT_AVAIL;
+
+	if (rcon)
+		rcon->remoteTrFlag = (PQtransactionStatus(conn) != PQTRANS_IDLE);
 
-	res = PQexec(conn, "BEGIN");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("begin error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		res = PQexec(conn, "BEGIN");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("begin error");
+		PQclear(res);
+	}
 
 	appendStringInfo(str, "DECLARE %s CURSOR FOR %s", curname, sql);
 	res = PQexec(conn, str->data);
@@ -424,12 +428,13 @@
 	PQclear(res);
 
-	/* commit the transaction */
-	res = PQexec(conn, "COMMIT");
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		DBLINK_RES_INTERNALERROR("commit error");
-
-	PQclear(res);
+	if ((!rcon) || (!(rcon->remoteTrFlag))) {
+		/* commit the transaction */
+		res = PQexec(conn, "COMMIT");
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			DBLINK_RES_INTERNALERROR("commit error");
+		PQclear(res);
+	}
 
 	PG_RETURN_TEXT_P(GET_TEXT("OK"));
 }

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian|
http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Open items list for 8.1
/contrib move to pgfoundry Is this actually happening? -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] On Logging
On 9/26/05, David Fetter [EMAIL PROTECTED] wrote: I've run into something that concerns me. It's pretty much an 8.2 issue, but I'm hoping to stimulate some discussion on it. It's PostgreSQL's log files. Right now, they're (sometimes just barely ;) human-readable, but they take significant effort to parse. For example, pqa, a very clever piece of code, is mostly devoted to parsing said files and works only with significant tweaking and restrictions on log file formats in 8.0. In a previous life (oh, like 6 months ago), I spent all my time working on parsing log files from dozens of different software products, and I learned something that made parsing some files orders of magnitude easier than others: Always use message codes. Cisco does this, and it helps a lot. A few other vendors do this, and it helps a lot. While this might seem an old mainframeism, it's terribly useful to have something at the beginning that tells you what the message is, what it means, and most importantly, how to parse the rest. I would be happy to help create this catalog, though it's definitely a big step to implement. It would also require identifying every message that could be generated -- something few open source projects do, but it is critical to those of us who have to process the output! Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: * CSV CSV is the best format, ever. Trivially simple to parse, it requires no extra processing so long as you abide by a few extra rules, such as escaping. * YAML Nice, but I think perhaps not the best format for logging. It's more of a configuration file format in my mind, and it requires a bit more oompf to parse. Not going to happen in AWK. :-) * Piped logs, as Apache can do Useful, but doesn't create any new capabilities, just simplifies some of them.
Focus on new capabilities first, then added functionality if required. * DB handle. I know this one will be controversial. I can't imagine why. :-) 1. Am I the only one who would wants an option for machine-readable logs? Not likely. I'd love it. It makes monitoring and reporting easier. Chris -- | Christopher Petrilli | [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend
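The "few extra rules, such as escaping" that make CSV trivially parseable amount to: quote any field containing a delimiter, a quote, or a newline, and double embedded quotes. A minimal C sketch of a writer-side escaper (my own illustration of the rule, not code from any PostgreSQL logger; the caller must supply a buffer of at least 2*strlen(field)+3 bytes):

```c
#include <stddef.h>
#include <string.h>

/* Write the CSV-escaped form of field into out; returns the number
 * of characters written, excluding the terminating NUL. */
static size_t csv_escape(const char *field, char *out)
{
    size_t n = 0;
    int quote = strpbrk(field, ",\"\r\n") != NULL;

    if (quote)
        out[n++] = '"';
    for (const char *p = field; *p; p++) {
        if (*p == '"')
            out[n++] = '"';   /* "" stands for one embedded quote */
        out[n++] = *p;
    }
    if (quote)
        out[n++] = '"';
    out[n] = '\0';
    return n;
}
```

With this, a logged statement like say "hi", world becomes the single unambiguous field "say ""hi"", world", and an AWK or spreadsheet consumer needs no further cleverness.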
Re: [HACKERS] Open items list for 8.1
Joshua D. Drake wrote: /contrib move to pgfoundry Is this actually happening? Josh has talked about it, but not sure where he is. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] roundoff problem in time datatype
Tom Lane wrote: Hm, so the proposal is round unless that would produce 24:00:00, in which case truncate? Seems a bit ugly but it would follow the letter of the spec, and avoid rejecting inputs that we used to accept. It's still not very clear what to do with '23:59:60.9' though. I'd handle it the same: 23:59:60.9 -> 23:59:60, since this is apparently a leap second. A normal second should never become a leap second from some conversion, but a leap second should stay one. Regards, Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] Releasing memory during External sorting?
-Original Message- From: Ron Peacetree [mailto:[EMAIL PROTECTED] Sent: Saturday, September 24, 2005 3:31 AM To: Dann Corbit; pgsql-hackers@postgresql.org; pgsql- [EMAIL PROTECTED] Subject: RE: [HACKERS] [PERFORM] Releasing memory during External sorting? From: Dann Corbit [EMAIL PROTECTED] Sent: Sep 23, 2005 5:38 PM Subject: RE: [HACKERS] [PERFORM] Releasing memory during External sorting? _C Unleashed_ also explains how to use a callback function to perform arbitrary radix sorts (you simply need a method that returns the [bucketsize] most significant bits for a given data type, for the length of the key). So you can sort fairly arbitrary data in linear time (of course if the key is long then O(n*log(n)) will be better anyway.) But in any case, if we are talking about external sorting, then disk time will be so totally dominant that the choice of algorithm is practically irrelevant. Horsefeathers. Jim Gray's sorting contest site: http://research.microsoft.com/barc/SortBenchmark/ proves that the choice of algorithm can have a profound effect on performance. Picklesmoke. I was referring to the algorithm used to perform the sort stage, and not the algorithm used to perform the IO which has a dominant effect on the overall sort time. I thought that should be clear from context. After all, the amount of IO done is the most important of the things that you should be optimizing for in choosing an external sorting algorithm. Replacement selection uses a terrible O(f(n)) algorithm. The only reason it is a customary choice for external sorting is because the runs are twice as long. Using a classical merge sequence, you have half as many reads and writes using replacement selection as with other methods. That saves ONE read and ONE write pass, because of having half as many subfiles. Suppose, for instance, that you have 64 subfiles.
Using any classical merge algorithm, they will have to be read and merged in a first pass, giving 32, then again giving 16 then again giving 8 then again giving 4, then again giving two and one final pass to create one file. So, if replacement selection were applied, there would be 6 read/write passes instead of seven in this problem set. After the creation of the original subfiles, the algorithm I listed reads once and writes once and is done. So what about the argument for skipping around? Well, first of all the OS is going to cache the reads to a large degree. And second of all, if we read a single record with no buffering and wrote a single record for each operation, then because we only have to read once, that is better than skipping around 7 times for every read and write because of physically reading and writing the files over and over. But don't take my word for it. Try it yourself. It is laughably trivial to implement it. Clearly, if we know or can assume the range of the data in question the theoretical minimum amount of IO is one pass through all of the data (otherwise, we are back in O(lg(n!)) land ). Equally clearly, for HD's that one pass should involve as few seeks as possible. In fact, such a principle can be applied to _all_ forms of IO: HD, RAM, and CPU cache. The absolute best that any sort can possibly do is to make one pass through the data and deduce the proper ordering of the data during that one pass. It's usually also important that our algorithm be Stable, preferably Wholly Stable. Let's call such a sort Optimal External Sort (OES). Just how much faster would it be than current practice? The short answer is the difference between how long it currently takes to sort a file vs how long it would take to cat the contents of the same file to a RAM buffer (_without_ displaying it). IOW, there's SIGNIFICANT room for improvement over current standard practice in terms of sorting performance, particularly external sorting performance. 
Since sorting is a fundamental operation in many parts of a DBMS, this is a Big Deal. This discussion has gotten my creative juices flowing. I'll post some Straw Man algorithm sketches after I've done some more thought. Ron -Original Message- From: Dann Corbit [EMAIL PROTECTED] Sent: Friday, September 23, 2005 2:21 PM Subject: Re: [HACKERS] [PERFORM] Releasing memory during ... For the subfiles, load the top element of each subfile into a priority queue. Extract the min element and write it to disk. If the next value is the same, then the queue does not need to be adjusted. If the next value in the subfile changes, then adjust it. Then, when the lowest element in the priority queue changes, adjust the queue. Keep doing that until the queue is empty. You can create all the subfiles in one pass over the data. You can read all the subfiles, merge them, and write them out in a second pass (no matter how many of them there are). The Gotcha with Priority Queues is that their performance depends
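Dann's priority-queue merge of the subfiles can be sketched with a small binary min-heap. This is an illustrative sketch only: the "runs" are in-memory arrays standing in for sorted subfiles, where a real external sort would refill each head from its file; all runs are assumed non-empty at the start.

```c
#include <stddef.h>

typedef struct {
    const int *run;   /* remaining elements of this sorted run */
    size_t     len;
} RunHead;

/* Restore the min-heap property at index i, ordering heads by their
 * current front element. */
static void heap_sift_down(RunHead *h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && h[l].run[0] < h[m].run[0]) m = l;
        if (r < n && h[r].run[0] < h[m].run[0]) m = r;
        if (m == i) return;
        RunHead tmp = h[i]; h[i] = h[m]; h[m] = tmp;
        i = m;
    }
}

/* Merge k sorted runs into out in one pass; returns elements written. */
static size_t kway_merge(RunHead *h, size_t k, int *out)
{
    size_t n = 0;
    for (size_t i = k; i-- > 0;)          /* build the heap */
        heap_sift_down(h, k, i);
    while (k > 0) {
        out[n++] = h[0].run[0];           /* emit current minimum */
        h[0].run++; h[0].len--;
        if (h[0].len == 0)                /* run exhausted: drop it */
            h[0] = h[--k];
        heap_sift_down(h, k, 0);
    }
    return n;
}
```

This is the shape of the second pass Dann describes: however many subfiles the first pass produced, they are all read once, merged through the heap, and written once.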
Re: [HACKERS] On Logging
David Fetter wrote: Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: * CSV * YAML * XML * Piped logs, as Apache can do * DB handle. I know this one will be controversial. This list doesn't seem to me to be all in the same category. The first 3 concern format, the last 2 concern destination (and as such probably don't belong in this discussion). ISTM what we need is a proposal for an abstract structure that will account for all the possible logging messages. I.e. the important issue is not what structuring mechanism is used, but what structure it reflects. For example, we might decide that there are 10 message types and that each has certain fields. (And much as I know you like YAML, I don't think its use is sufficiently widespread to belong here anyway). cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] On Logging
Interesting. I am thinking we could put markers like '|' in the log output, and then have some secondary process either remove them or add special formatting to match the requested output format. --- David Fetter wrote: Folks, I've run into something that concerns me. It's pretty much an 8.2 issue, but I'm hoping to stimulate some discussion on it. It's PostgreSQL's log files. Right now, they're (sometimes just barely ;) human-readable, but they take significant effort to parse. For example, pqa, a very clever piece of code, is mostly devoted to parsing said files and works only with significant tweaking and restrictions on log file formats in 8.0. Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: * CSV * YAML * XML * Piped logs, as Apache can do * DB handle. I know this one will be controversial. I'm thinking that a GUC variable (or should there be a class of them?) called log_format would be part of the user interface to this and would be able to switch from the cheap default code path to one that's more expensive, just as log_statement does. So, a few questions: 1. Am I the only one who wants an option for machine-readable logs? 2. Am I way off with the idea for an architecture for same? 3. What big things am I missing here? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote!
---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Open items list for 8.1
Bruce Momjian wrote: Joshua D. Drake wrote: /contrib move to pgfoundry Is this actually happening? Josh has talked about it, but not sure where he is. Well pgFoundry isn't ready to have a load of code that is that actively maintained put on it. It still needs to be moved to its new servers. Also we should probably seriously consider which contrib modules get moved. IMHO shipping PostgreSQL without TSearch2 and pgcrypto (of course I think those should be core anyway) is a non-starter. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Open items list for 8.1
Joshua D. Drake wrote: Bruce Momjian wrote: Joshua D. Drake wrote: /contrib move to pgfoundry Is this actually happening? Josh has talked about it, but not sure where he is. Well pgFoundry isn't ready to have a load of code that is that actively maintained put on it. It still needs to be moved to its new servers. Also we should probably seriously consider which contrib modules get moved. IMHO shipping PostgreSQL without TSearch2 and pgcrypto (of course I think those should be core anyway) is a non-starter. Agreed. The idea was to move _some_ /contrib to pgfoundry. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] On Logging
David Fetter wrote: Folks, I've run into something that concerns me. It's pretty much an 8.2 issue, but I'm hoping to stimulate some discussion on it. It's PostgreSQL's log files. Right now, they're (sometimes just barely ;) human-readable, but they take significant effort to parse. For example, pqa, a very clever piece of code, is mostly devoted to parsing said files and works only with significant tweaking and restrictions on log file formats in 8.0. Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: There are two other restrictions about the log files: - There's no means of restricting logging on some patterns (e.g. specific backends only, certain clients, certain events except for log_duration) - query is truncated due to UDP restrictions. I'd call this not necessarily a logging issue, but a profiling issue. I regularly use MSSQL's profiler to tap an application's query traffic, to find out what's going on, and I'd like the same feature on pgsql. This issue comes up on -hackers regularly, namely logging to tables/logging as inserts, and several others (I can cite them if necessary). What I'd like is an extended logging/profiling facility that can be en/disabled with finer granularity (performance/data volume issues), going to an intermediate file/whatever and regularly converted to table data for easier evaluation (which would fix the format question in the most pgsql like way). Regards, Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Open items list for 8.1
Joshua D. Drake [EMAIL PROTECTED] writes: /contrib move to pgfoundry Well pgFoundry isn't ready to have a load of code that is that actively maintained put on it. It still needs to be moved to its new servers. The modules proposed to be moved out aren't actively maintained now; if they were we'd probably be keeping them in core. Also we should probably seriously consider which contrib modules get moved. You seem to have forgotten the discussion entirely. These are the modules proposed to be moved: adddepend, dbase, dbmirror, fulltextindex, mSQL-interface, mac, oracle, tips. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] \d on database with a lot of tables is slow
On Sun, Sep 25, 2005 at 10:10:22AM +0300, Hannu Krosing wrote: Also, not sure if this matters, but they're occasionally getting errors like 'Did not find relation named table that exists' (where 'table that exists' is the name of some table that is in the catalog) from \d. Does anyone know what causes that? mostly this happens on temp tables from other connections, which have managed to disappear by the time their detailed info is requested, and which would actually not show up in \d output due to visibility checks. They are seeing this doing \d on regular tables that won't be getting deleted. Maybe there's some locking that should be happening but isn't? Or maybe \d should be doing things in one statement instead of a bunch? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Vacuum questions...
On Sun, Sep 25, 2005 at 11:50:14AM -0400, Tom Lane wrote: Hannu Krosing [EMAIL PROTECTED] writes: Perhaps VACUUM could send some statistics after each N pages and this would then be available through something similar to pg_statistics table. Why not just have it send some text to be displayed in the current command field of pg_stat_activity? The infrastructure is all there already for that. If someone wanted to write a nice interface showing the status of a vacuum it would be easier if they didn't have to parse a text field... but *anything* would be a vast improvement over what we have now. newbie TODO? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vacuum questions...
On Sat, Sep 24, 2005 at 08:25:30PM -0700, Joshua D. Drake wrote: Jim C. Nasby wrote: Would it be difficult to vacuum as part of a dump? The reasoning behind this is that you have to read the table to do the dump anyway, so it would be a good time to be able to piggy-back other operations that need to read the entire table on top. I know vacuuming of indexes complicates this, so it's probably not as simple as just firing off a vacuum and copy at the same time (although that idea is probably worth testing, since it might still be a win). This would be a nightmare on a large database. Think of how long it takes to dump 20 gig, now add how long it is going to take to vacuum that size of DB, now think about a 500 gig database. Who says that the speed of the dump is limited by the disks the database is on? I suspect in many cases it's not. Anyway, this would certainly be an optional step, so if it turns out it hurts you in your environment, you don't have to use it. Actually this also probably would not gain you much in 8.1 as vacuum in theory is already dealing with itself. Maybe, maybe not. If you're already reading the entire table to dump it, why read the entire table again some other time to vacuum it? Just get it all done at once. Whenever we get a 'map of pages that need vacuuming', which will presumably greatly speed up vacuum, then maybe your argument makes sense. Right now I don't see how it helps anything. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] On Logging
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: - query is truncated due to UDP restrictions. Are you confusing the logs with pg_stat_activity? Not confused. I'm talking about the case where statement logging is enabled, I could have mentioned that... Regards, Andreas ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_autovacuum startup from /etc/rc fails after system crash
As a work-around, you can use the scripts at http://cvs.distributed.net/viewcvs.cgi/stats-sql/tools/ On Thu, Sep 22, 2005 at 02:16:58PM -0400, Jonathan Beit-Aharon wrote: Hi, I'm not a member of this list, so please CC me on responses and discussion. After a system crash PostgreSQL startup is slow as the database recovers. So the db_connect() call from pg_autovacuum terminates as soon as it tries to connect to template1. Looking at the README file, I find this note: pg_autovacuum does not get started automatically by either the postmaster or by pg_ctl. Similarly, when the postmaster exits, no one tells pg_autovacuum. The result of that is that at the start of the next loop, pg_autovacuum will fail to connect to the server and exit(). Any time it fails to connect pg_autovacuum exit()s. So the failure we're experiencing is an unintended result of an intended solution. Any suggestions on how I can work around this problem? Would it make sense to put the first db_connect() call in the init_db_list() routine inside a [configurable repetition] loop, sleeping after each failed attempt to connect, and breaking out on success? That way, I think, when pg_autovacuum is initiated, we assume the postmaster is up, but when the VacuumLoop connection fails, we assume the postmaster went away, and take our exit(). Thanks, Jonathan -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
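The retry loop Jonathan proposes for the initial connect is simple to sketch. This is not pg_autovacuum's actual code; to keep the sketch self-contained and testable, the connect attempts are simulated by an array where results[i] is what attempt i+1 would return (0 = failure), standing in for db_connect() against template1:

```c
/* Try up to max_attempts times; return the 1-based attempt that
 * succeeded, or 0 if every attempt failed. */
static int connect_with_retry(const int *results, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (results[attempt - 1])   /* db_connect() succeeded */
            return attempt;
        /* otherwise sleep(retry_delay) here before the next try;
         * after a crash the postmaster may still be in recovery */
    }
    return 0;   /* exhausted: fall back to today's behavior, exit() */
}
```

With the loop only around the first connect in init_db_list(), the later VacuumLoop failure can keep its current meaning: the postmaster really went away, so exit().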
Re: [HACKERS] \d on database with a lot of tables is slow
On Sun, Sep 25, 2005 at 10:10:22AM +0300, Hannu Krosing wrote: On L, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote: I have a client with a database that contains 4000 relations according to vacuum verbose, and \d in psql is painfully slow. In particular... - Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1) Filter: (((relkind = 'r'::char) OR (relkind = 'v'::char) OR (relkind = 'S'::char) OR (relkind = ''::char)) AND pg_table_is_visible(oid)) That's off my laptop, but they're seeing similar issues on an 8-way Opteron as well... I expect the issue on the 8-way Opteron to be more one of high load than of slow scans. It seems that sometimes a database with lots of activity slows down considerably. I suspect some locking issues, but I'm not sure this is the case. No temp tables in use. The issue didn't appear to be load-dependent, either. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: [HACKERS] On Logging
David Fetter wrote: ...log file formats in 8.0 * CSV * YAML * XML * Piped logs, as Apache can do * DB handle. I know this one will be controversial. [...] 1. Am I the only one who wants an option for machine-readable logs? I'd very much like a format that can be easily loaded into a database (not necessarily the same one producing the logs :-) ) in real time and/or be visible as a table through something like dbi-link. I suppose any of the first three formats you suggest could work with dbi-link; alternatively, another format (SQL INSERT statements) would work if piped logs were supported, by sending the output to psql.
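The last idea (rendering log records as INSERT statements and piping them into psql) is easy to picture. A minimal sketch, with an assumed log-table name and only naive single-quote escaping:

```python
def log_to_insert(table, record):
    """Render one log record (a dict of column -> value) as an INSERT
    statement that could be piped into psql. Table and column names
    are illustrative; only basic single-quote escaping is handled."""
    cols = ", ".join(record)
    vals = ", ".join("'" + str(v).replace("'", "''") + "'" for v in record.values())
    return f"INSERT INTO {table} ({cols}) VALUES ({vals});"
```

A real implementation would of course need to handle NULLs, non-text types, and encoding, but the pipeline shape (log producer -> text filter -> psql) is the point here.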
Re: [HACKERS] On Logging
Andreas Pflug [EMAIL PROTECTED] writes: - query is truncated due to UDP restrictions. Are you confusing the logs with pg_stat_activity? regards, tom lane
Re: [HACKERS] pg_config/share_dir
Jim C. Nasby [EMAIL PROTECTED] writes: * Add options to pg_config to show the share_dir, sysconfdir, pkgincludedir, and localedir Should that be marked as a beginner TODO? It could, but I think it's going to get DONE in the next few days as a necessary step in fixing the pgxs relocatability issue. So if any beginners want to do it, they should step right up. regards, tom lane
Re: [HACKERS] \d on database with a lot of tables is slow
Jim C. Nasby [EMAIL PROTECTED] writes: I have a client with a database that contains 4000 relations according to vacuum verbose, and \d in psql is painfully slow. In particular... It's presumably mostly in the pg_table_is_visible() calls. Not sure if we can do much to speed those up, but: how many schemas in your search path? What's the distribution of pg_class entries among the schemas? regards, tom lane
Re: [HACKERS] pg_config/share_dir
Should that be marked as a beginner TODO? On Thu, Sep 22, 2005 at 11:04:23PM -0400, Bruce Momjian wrote: Added to TODO: * Add options to pg_config to show the share_dir, sysconfdir, pkgincludedir, and localedir --- Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: Andrew Dunstan wrote: pg_config doesn't currently seem to have an option to report the share_dir. Should it? Is there a case where a user would need anything from there? pg_config serves a function of recording the configuration, so I tend to agree with Andrew that this should be available. I notice that SYSCONFDIR, PKGINCLUDEDIR, and LOCALEDIR aren't available either. regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: [HACKERS] On Logging
On Mon, Sep 26, 2005 at 01:13:08PM -0400, Christopher Petrilli wrote: On 9/26/05, David Fetter [EMAIL PROTECTED] wrote: I've run into something that concerns me. It's pretty much an 8.2 issue, but I'm hoping to stimulate some discussion on it. It's PostgreSQL's log files. Right now, they're (sometimes just barely ;) human-readable, but they take significant effort to parse. For example, pqa, a very clever piece of code, is mostly devoted to parsing said files and works only with significant tweaking and restrictions on log file formats in 8.0. In a previous life (oh, like 6 months ago), I spent all my time working on parsing log files from dozens of different software products, and I learned something that made parsing some files orders of magnitude easier than others: Always use message codes. Could you elucidate a bit on this as to how this might affect PostgreSQL logging? Cisco does this, and it helps a lot. A few other vendors do this, and it helps a lot. While this might seem an old mainframeism, ^ You say that like it's a *bad* thing. I think some fruitful communication is possible and has been missed over the decades between mainframe people and *n*x people. The same applies to supercomputing people and *n*x people, but that's a story for another day. it's terribly useful to have something at the beginning that tells you what the message is, what it means, and most importantly, how to parse the rest. OK I would be happy to help create this catalog, though it's definitely a big step to implement. It would also require identifying every message that could be generated -- something few open source projects do, but it is critical to those of us who have to process the output! Right. How big a project is this, and what kind of framework would we need in order to assure that new messages come with new message codes? 
Simple logging is a default that should probably not change, but I'm thinking that for people who want to find something out from the logs, we could see about a kind of plugin architecture which would enable things like: * CSV CSV is the best format, ever. Trivially simple to parse, it requires no extra processing so long as you abide by a few extra rules, such as escaping. I agree that it's nice, but seeing as how many smart people have stubbed their toes on the various incarnations of CSV, I must disagree as to its simplicity. * YAML Nice, but I think perhaps not the best format for logging. It's more of a configuration file format in my mind, and it requires a bit more oompf to parse. Not going to happen in AWK. :-) It's not bad for logging, partly because it's a lot fewer bytes than XML or SGML, but it maintains a structure. Of course, it's not as simple in some sense as CSV. * Piped logs, as Apache can do Useful, but doesn't create any new capabilities, just simplifies some of them. Focus on new capabilities first, then added functionality if required. Fair enough :) * DB handle. I know this one will be controversial. I can't imagine why. :-) Heh 1. Am I the only one who wants an option for machine-readable logs? Not likely. I'd love it. It makes monitoring and reporting easier. That's where I've run across this :) Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote!
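The message-code idea above can be made concrete with a small sketch: when every log line starts with a fixed code, the consumer dispatches on the code alone and never has to guess the shape of the rest of the line. The code scheme and field layouts here are invented for illustration, not an actual PostgreSQL proposal:

```python
# Hypothetical scheme: each log line begins with a fixed-width message
# code that tells the parser exactly how to interpret the remainder.
HANDLERS = {
    "QRY001": lambda rest: {"event": "query", "text": rest},
    "CON002": lambda rest: {"event": "connect", "user": rest},
}

def parse_line(line):
    """Dispatch on the leading message code; unknown codes still yield a
    well-defined record instead of a parse failure."""
    code, _, rest = line.partition(" ")
    handler = HANDLERS.get(code, lambda rest: {"event": "unknown", "raw": rest})
    record = handler(rest)
    record["code"] = code
    return record
```

Contrast this with today's free-form log text, where tools like pqa must pattern-match on prose that changes between releases; with a code catalog, new message formats only require a new entry in the handler table.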
Re: [HACKERS] Open items list for 8.1
Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: /contrib move to pgfoundry Well pgFoundry isn't ready to have a load of code that is that actively maintained put on it. It still needs to be moved to its new servers. The modules proposed to be moved out aren't actively maintained now; if they were we'd probably be keeping them in core. Speaking as a pgFoundry admin, I would say if they aren't actively maintained we don't want them either. pgFoundry is not a dumping ground for modules that are dying. If they are not maintained then drop them. They can always be recovered from the CVS archive. cheers andrew
Re: [HACKERS] Open items list for 8.1
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: The modules proposed to be moved out aren't actively maintained now; if they were we'd probably be keeping them in core. Speaking as a pgFoundry admin, I would say if they aren't actively maintained we don't want them either. pgFoundry is not a dumping ground for modules that are dying. I didn't say they were dying --- the ones we thought were dead, we already dropped. I was responding to Joshua's concern that they might get enough update traffic to pose a noticeable load on the pgfoundry server. Most of them seem to have been touched only once or twice in the past year. That does not indicate that they don't have user communities, though. There was already very extensive discussion about this in this thread: http://archives.postgresql.org/pgsql-hackers/2005-06/msg00302.php and no one objected to the summary proposal I posted here: http://archives.postgresql.org/pgsql-hackers/2005-06/msg00976.php so I'm not inclined to think that the floor is still open for debate about what to move. It's just a matter of someone getting it done. regards, tom lane
Re: [HACKERS] Re-run query on automatic reconnect
On Mon, Sep 26, 2005 at 01:23:03PM -0500, Jim C. Nasby wrote: Is there any particular reason why psql doesn't re-run a query that failed due to disconnect from the server after re-connecting? I've thought maybe it's because it might somehow be dangerous, but I can't think of any case where that's actually true. What if the query itself resulted in the disconnect by causing the backend to crash? Re-running such a query automatically would be a bad idea. Or did I misunderstand what you're asking? -- Michael Fuhr
[HACKERS] [PERFORM] A Better External Sort?
From: Ron Peacetree [EMAIL PROTECTED] Sent: Sep 24, 2005 6:30 AM Subject: Re: [HACKERS] [PERFORM] Releasing memory during External sorting? ... the amount of IO done is the most important of the things that you should be optimizing for in choosing an external sorting algorithm. snip Since sorting is a fundamental operation in many parts of a DBMS, this is a Big Deal. This discussion has gotten my creative juices flowing. I'll post some Straw Man algorithm sketches after I've done some more thought. As a thought experiment, I've been considering the best way to sort 1TB (2^40B) of 2-4KB (2^11-2^12B) records. That's 2^28-2^29 records. Part I: A Model of the System The performance of such external sorts is limited by HD IO, then memory IO, and finally CPU throughput. On commodity HW, single HD IO is ~1/2048 (single HD realistic worst case) to ~1/128 (single HD best case. No more than one seek every ~14.7ms for a ~50MB/s 7200rpm SATA II HD) the throughput of RAM. RAID HD IO will be in the range from as low as a single HD (RAID 1) to ~1/8 (a RAID system saturating the external IO bus) the throughput of RAM. RAM is ~1/8-1/16 the throughput and ~128x the latency of the data pathways internal to the CPU. This model suggests that HD IO will greatly dominate every other factor, particularly if we are talking about a single HD rather than a peripheral bus saturating RAID subsystem. If at all possible, we want to access the HD subsystem only once for each data item, and we want to avoid seeking more than the critical number of seeks implied above when doing it. It also suggests that at a minimum, it's worth it to spend ~8 memory operations or ~64 CPU operations to avoid a HD access. Far more than that if we are talking about a single random access. It's worth spending ~128 CPU operations to avoid a single random RAM access, and literally 10's or even 100's of thousands of CPU operations to avoid a random HD access. 
In addition, there are many indications in current ECE and IT literature that the performance gaps between these pieces of computer systems are increasing and expected to continue to do so for the foreseeable future. In short, _internal_ sorts have some, and are going to increasingly have more, of the same IO problems usually associated with external sorts. Part II: a Suggested Algorithm The simplest case is one where we have to order the data using a key that only has two values. Given 2^40B of data using 2KB or 4KB per record, the most compact representation we can make of such a data set is to assign a 32b= 4B RID or Rptr for location + a 1b key for each record. Just the RID's would take up 1.25GB (250M records) or 2.5GB (500M records). Enough space that even an implied ordering of records may not fit into RAM. Still, sorting 1.25GB or 2.5GB of RIDs is considerably less expensive in terms of IO operations than sorting the actual 1TB of data. That IO cost can be lowered even further if instead of actually physically sorting the RIDs, we assign a RID to the appropriate category inside the CPU as we scan the data set and append the entries in a category from CPU cache to a RAM file in one IO burst whenever said category gets full inside the CPU. We can do the same with either RAM file to HD whenever they get full. The sorted order of the data is found by concatenating the appropriate files at the end of the process. As simple as this example is, it has many of the characteristics we are looking for: A= We access each piece of data once on HD and in RAM. B= We do the minimum amount of RAM and HD IO, and almost no random IO in either case. C= We do as much work as possible within the CPU. D= This process is stable. Equal keys stay in the original order they are encountered. To generalize this method, we first need our 1b Key to become a sufficiently large Key or KeyPrefix to be useful, yet not so big as to be CPU cache unfriendly. 
Cache lines (also sometimes called blocks) are usually 64B= 512b in size. Therefore our RID+Key or KeyPrefix should never be larger than this. For a 2^40B data set, a 5B RID leaves us with potentially as much as 59B of Key or KeyPrefix. Since the data can't take on more than 40b worth of different values (actually 500M= 29b for our example), we have more than adequate space for Key or KeyPrefix. We just have to figure out how to use it effectively. A typical CPU L2 cache can hold 10's or 100's of thousands of such cache lines. That's enough that we should be able to do a significant amount of useful work within the CPU w/o having to go off-die. The data structure we are using to represent the sorted data also needs to be generalized. We want a space efficient DS that allows us to find any given element in as few accesses as possible and that allows us to insert new elements or reorganize the DS as efficiently as possible. This being a DB discussion list, a B+ tree seems like a fairly obvious suggestion ;-) A B+ tree where each
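The two-value-key procedure described above can be sketched in miniature. Records are never moved; each RID is appended to its key's bucket, staged buffers are flushed in bursts (standing in for the CPU-cache to RAM-file to HD staging in the text), and the buckets are concatenated at the end. The buffer sizes here are toy values, and this is an illustrative sketch of the idea rather than anything from the PostgreSQL sort code:

```python
def rid_partition(keys, flush_size=4):
    """Stable ordering by a two-valued key using RID buckets.

    keys: sequence of 0/1 key values, one per record; a record's RID is
    simply its index. Each RID is appended to its key's staging buffer,
    which is flushed to the key's "file" in one burst when full; the
    sorted order is the concatenation of the per-key files."""
    staged = {0: [], 1: []}   # "CPU cache" staging areas
    files = {0: [], 1: []}    # "RAM/HD files"
    for rid, key in enumerate(keys):
        staged[key].append(rid)
        if len(staged[key]) >= flush_size:
            files[key].extend(staged[key])  # one burst write
            staged[key].clear()
    for key in (0, 1):        # final flush of partially full buffers
        files[key].extend(staged[key])
    return files[0] + files[1]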
[HACKERS] Re-run query on automatic reconnect
Is there any particular reason why psql doesn't re-run a query that failed due to disconnect from the server after re-connecting? I've thought maybe it's because it might somehow be dangerous, but I can't think of any case where that's actually true. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: [HACKERS] Patching dblink.c to avoid warning about open transaction
Bruce Momjian wrote: [ Joe, would you review this? ] Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. The patch itself is pretty simple, but I'm unclear on the use case. Jonathan, can you elaborate a bit? Thanks, Joe p.s. I'm on a business trip in Asia again, so my responses may be delayed a bit.
Re: [HACKERS] Open items list for 8.1
Changes --- Win32 signal handling patch (Magnus) Unless someone else steps up to doing this one, please remove it from the list. I will not have time to dig into this patch before 8.1. //Magnus
Re: [HACKERS] Re-run query on automatic reconnect
Jim C. Nasby [EMAIL PROTECTED] writes: Is there any particular reason why psql doesn't re-run a query that failed due to disconnect from the server after re-connecting? I've thought maybe it's because it might somehow be dangerous, but I can't think of any case where that's actually true. You haven't thought very hard. (1) psql cannot tell whether the query was already completed before the connection dropped; if so, reissuing it would make its effects if any happen twice (eg, duplicate insert, or adding something to a bank balance twice). (2) If inside a transaction block, the query might or might not depend on previous operations in the same transaction. (3) The query might depend on previous session-local operations, such as SET commands or creation of temp tables. (4) If the query actually caused the server crash, re-issuing it will probably cause another crash. Instant infinite loop, complete with denial of service to all other database users. regards, tom lane
Re: [HACKERS] Open items list for 8.1
Tom Lane wrote: Speaking as a pgFoundry admin, I would say if they aren't actively maintained we don't want them either. pgFoundry is not a dumping ground for modules that are dying. I didn't say they were dying --- the ones we thought were dead, we already dropped. I was responding to Joshua's concern that they might get enough update traffic to pose a noticeable load on the pgfoundry server. Most of them seem to have been touched only once or twice in the past year. That does not indicate that they don't have user communities, though. OK. I agree that we do not need to wait, any more than we are waiting now on other newly registered projects. What we do need is an owner in each case. cheers andrew
Re: [HACKERS] [PERFORM] A Better External Sort?
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Ron Peacetree Sent: Monday, September 26, 2005 10:47 AM To: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org Subject: [HACKERS] [PERFORM] A Better External Sort? snip This model suggests that HD IO will greatly dominate every other factor, particularly if we are talking about a single HD rather than a peripheral bus saturating RAID subsystem. If at all possible, we want to access the HD subsystem only once for each data item, If you can achieve that, I think you should be given a Nobel Prize, and I mean that sincerely. I also think that your analysis is interesting. and we want to avoid seeking more than the critical number of seeks implied above when doing it. 
snip In short, _internal_ sorts have some, and are going to increasingly have more, of the same IO problems usually associated with external sorts. Knuth has made the observation (confirmed by others) that 40% of mainframe CPU cycles are spent on sorting. Hence, any sort of optimization in this area is a potential for enormous savings. Part II: a Suggested Algorithm The simplest case is one where we have to order the data using a key that only has two values. I suggest testing against a very large class of distributions. All of the common statistical models are a start (Gaussian, Poisson, etc.) and also single value, two distinct values, to some limit.
Re: [HACKERS] 64-bit API for large objects
On Sat, Sep 24, 2005 at 12:13:11PM -0400, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Fri, Sep 23, 2005 at 05:40:09PM -0400, Tom Lane wrote: For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. What platforms that PG supports don't have int64 arithmetic? We claim to build with any ANSI C compiler, and there is no requirement for a 64-bit type in ANSI C. The historical project policy is that we should still build without such a type, and everything should still work except that the effective bounds of bigint data correspond to int32 instead of int64 limits. I see no reason to back off that policy. It's not very much harder to do it right. So what happens if you attempt to put a value greater than 2^32 into a bigint on a non-int64 platform? I would argue that by default we should not allow users to even create bigints on any platform where bigint = int. And if the default is overridden, we should still throw a warning. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Re: [HACKERS] [PERFORM] A Better External Sort?
Ron, Having rested my brain for the last few days, your theory made for interesting reading... Rather than argue the technical specs, I'd love to see an implementation :) -Jonah
Re: [HACKERS] 64-bit API for large objects
Jim C. Nasby [EMAIL PROTECTED] writes: So what happens if you attempt to put a value greater than 2^32 into a bigint on a non-int64 platform? You get the same error as if you tried to store a value greater than 2^64. regards, tom lane
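Tom's answer can be illustrated with a toy version of the range check: on a platform with no 64-bit integer type, the same "out of range" error simply fires at the int32 limit instead of the int64 limit, so nothing is silently truncated. The function and parameter names below are invented for illustration, not PostgreSQL's actual input routine:

```python
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

def check_bigint(value, have_int64=True):
    """Sketch of the policy described above: where no 64-bit type exists,
    bigint's effective bounds shrink to int32 limits, and an overflowing
    value raises the usual out-of-range error rather than wrapping."""
    limit = INT64_MAX if have_int64 else INT32_MAX
    if not (-limit - 1 <= value <= limit):
        raise OverflowError("bigint out of range")
    return value
```

Either way the user gets an error, never a corrupted value; the only platform-dependent part is where the limit sits.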
[HACKERS] State of support for back PG branches
I had originally been planning to back-port this fix: http://archives.postgresql.org/pgsql-committers/2005-08/msg00213.php as far as 7.2. I've completed the back-port as far as 7.3, but found that 7.2 would be significantly more work because the API of heap_fetch() would need to change from what it was back then, and the patch would therefore have to touch many places that it does not touch in later versions. Given that the bug is so low-probability that it's escaped detection all this time, it doesn't really seem worth the effort (not to mention the risk of creating new bugs). This brings up the question of whether we should officially abandon support for 7.2 and/or later branches. I don't think anyone is planning on supporting old branches forever, but when do we stop? I have a corporate need to keep supporting 7.3, at least to the extent of critical bug fixes, because Red Hat is still on the hook to support that version in RHEL3 for a while longer. I have no such interest in 7.2 (which is one reason I'm not excited about doing the extra work to back-patch the VACUUM/ctid fix). I can definitely see that the community might not want to expend more effort on 7.3, though. I have no idea what the needs of other distributions might be. Thoughts anyone? regards, tom lane
Re: [HACKERS] State of support for back PG branches
Hi, On Mon, 26 Sep 2005, Tom Lane wrote: snipped This brings up the question of whether we should officially abandon support for 7.2 and/or later branches. I don't think anyone is planning on supporting old branches forever, but when do we stop? I have a corporate need to keep supporting 7.3, at least to the extent of critical bug fixes, because Red Hat is still on the hook to support that version in RHEL3 for awhile longer. I have no such interest in 7.2 (which is one reason I'm not excited about doing the extra work to back-patch the VACUUM/ctid fix). I can definitely see that the community might not want to expend more effort on 7.3, though. I have no idea what the needs of other distributions might be. Doesn't Red Hat support RHEL 2.1, and with it PostgreSQL 7.1? Anyway, IMHO PGDG should stop supporting 7.2 when 8.1 is officially released. But at this point (recalling the vacuum bug), it may now be time to abandon supporting 7.2. Also, as the RPM maintainer of PGDG, it is hard for us to support 7.2, too. Compiling 7.2 on newer platforms becomes a pain... There are some 7.3 users around (I remember some on Slony lists, etc), therefore we should keep supporting it. But maybe we can announce that 7.3 will become unsupported after XXX time so that people will know before we abandon the support. The best time to stop supporting 7.3 might be when 8.2 is released. However, I believe that 7.4 should live longer, since that's the last of the 7.X branch. Regards, -- Devrim GUNDUZ Kivi Bilişim Teknolojileri - http://www.kivi.com.tr devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Database file compatibility
If a database is created with a 64-bit version of initdb, would a 32-bit backend be able to talk to it? Likewise, would a backend compiled by a different compiler be able to? If there was some kind of incompatibility, would the backend just refuse to start, or would it start and silently trash data? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] State of support for back PG branches
Devrim GUNDUZ [EMAIL PROTECTED] writes: There are some 7.3 users around (I remember some on Slony lists, etc), therefore we should keep supporting it. But maybe we can announce that 7.3 will become unsupported after XXX time so that people will know before we abandon the support. The best time for not supporting 7.3 might be when 8.2 will be released. However, I believe that 7.4 should live longer, since that's the last of the 7.X branch. Well, the distinction between the 7.X and 8.X branches is more marketing than anything else ;-). I just went back through the release notes and recalled that 7.2 is the first branch we *ever* continued to support past the initial release of the next major version --- for all the older branches, the last point release predates initial release of the next branch. And I think we really only started that policy because we found some pretty serious data-loss bugs shortly after 7.3 came out (see 7.2.4 release notes), and felt we had to do a 7.2 update. To my mind the main rationale for continuing to support 7.2 is that it was the last pre-schema release, and so people whose apps haven't yet been fixed to cope with schemas will be on their own once we drop it. While each release has some portability gotchas, I don't think there have been any quite that big since then. If we drop support for 7.2, it wouldn't be out of the question for us to drop 7.3 and 7.4 too (at least not from where I sit ... I'm sure some will differ). If we want to have some sort of fixed policy for support lifespan, I would suggest it be like X amount of time after the release of the following major version. But X probably has to depend on how big the compatibility gotchas are in the following version, so we're still really talking about a judgment call here. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Database file compatibility
Jim C. Nasby [EMAIL PROTECTED] wrote If a database is created with a 64 bit version of initdb, would a 32bit backend be able to talk to it? Likewise, would a backend compiled by a different compiler be able to? The key problem, I believe, is the series of ALIGNOF macros, especially MAX_ALIGNOF. Different hardware/OS/compiler combinations will have different understandings of it. Compare the values in your two versions of PG; if they match, then there is a good chance you can exchange their data. If there was some kind of incompatibility, would the backend just refuse to start, or would it start and start silently trashing data? -- Undefined. Mostly a core dump. Regards, Qingqing ---(end of broadcast)--- TIP 6: explain analyze is your friend
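To illustrate why alignment matters here, the following sketch uses Python's ctypes to report the sizes and alignments of C types on the local platform, which is the kind of information PostgreSQL's configure-time ALIGNOF tests capture. This is only an illustration of the concept; it is not how the server itself performs any compatibility check:

```python
import ctypes

# Report per-type size and alignment as the local compiler/ABI defines them.
for ctype in (ctypes.c_short, ctypes.c_int, ctypes.c_long,
              ctypes.c_longlong, ctypes.c_double):
    print(ctype.__name__,
          "size:", ctypes.sizeof(ctype),
          "alignment:", ctypes.alignment(ctype))

# The widest of these is the analogue of MAX_ALIGNOF. If two builds
# disagree on it (e.g. 32-bit x86 often aligns doubles to 4 bytes while
# 64-bit platforms use 8), tuples laid out by one build cannot be read
# safely by the other.
max_align = max(ctypes.alignment(t)
                for t in (ctypes.c_long, ctypes.c_longlong, ctypes.c_double))
print("widest alignment on this platform:", max_align)
```

Running this on the two machines in question and comparing the output gives a quick, rough indication of whether their data layouts could even plausibly match.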
Re: [HACKERS] Gerbil build farm failure
Gerbil's looking better lately: http://pgbuildfarm.org/cgi-bin/show_history.pl?nm=gerbil&br=REL8_0_STABLE -- Michael Fuhr ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Database file compatibility
Qingqing Zhou [EMAIL PROTECTED] writes: Jim C. Nasby [EMAIL PROTECTED] wrote If a database is created with a 64 bit version of initdb, would a 32bit backend be able to talk to it? Likewise, would a backend compiled by a different compiler be able to? The key problem I believe is the serials of ALIGNOF macros. Especially for MAX_ALIGNOF. Different Hardware/OS/compiler will have different understanding of it. Yeah. It might be worth adding MAX_ALIGNOF to the set of configuration data stored in pg_control, just to be sure you couldn't shoot yourself in the foot that way. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] State of support for back PG branches
Tom Lane wrote: If we want to have some sort of fixed policy for support lifespan, I would suggest it be like X amount of time after the release of the following major version. But X probably has to depend on how big the compatibility gotchas are in the following version, so we're still really talking about a judgment call here. I'm not sure that that's going to help users much. I should think around 3 years (or some such predictable period) is a reasonable lifetime goal for a piece of software like this, accompanied by some weasel words. Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] State of support for back PG branches
On Mon, 26 Sep 2005, Andrew Dunstan wrote: Tom Lane wrote: If we want to have some sort of fixed policy for support lifespan, I would suggest it be like X amount of time after the release of the following major version. But X probably has to depend on how big the compatibility gotchas are in the following version, so we're still really talking about a judgment call here. I'm not sure that that's going to help users much. I should think around 3 years (or some such predictable period) is a reasonable lifetime goal for a piece of software like this, accompanied by some weasel words. Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. This sounds reasonable to me ... I think it is more than most software projects do, isn't it? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Questions about proper newline handling in psql output
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 3. How to show that a value is continued? As you can see below I use ... I remember a discussion on this before but couldn't find it in the archives. Either a reference or some other hints would be appreciated. I think we liked the + character. 4. Some system output like pg_views has really really long strings, would it be acceptable to change the output there to add newlines at various places to make it output nicer with this change? I'd say no, until you propose a more concrete set of rules on how and when. 5. Auto string folding. If a string is really long, fold it so it fits in a screen width, perhaps with '\' continuation. I haven't done this but I can imagine some people (including me) would love it. Sounds good to me. Query is: select oid, prosrc as HdrLine1 HdrLine2, proacl from pg_proc limit 1; Some smaller samples with a third column (and a large 2nd one) might be nice. Thanks for tackling this, it should be a nice improvement. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200509262011 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFDOI5OvJuQZxSWSsgRAoiZAJ4sNyNMFE5+xiA+uDlRnWZA03cbmACfWv67 lFKR/iwZnkp7mb0BvLLFkkk= =L9nw -----END PGP SIGNATURE----- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Making pgxs builds work with a relocated installation
I looked at the recently noted problem that pgxs builds only work if the installation paths recorded in Makefile.global are accurate; which pretty much breaks our claim to supporting relocatable installations. What I propose we do about this is change the path setup section of Makefile.global to look like (for each path variable) ifdef PGXS pkglibdir = $(shell pg_config --pkglibdir) else # existing code to set up pkglibdir endif Since a pgxs build has already assumed it could use pg_config to find pgxs.mk, this isn't introducing any new dependency, and it will allow the existing relocatable-path code to do its thing. Not all of the path variables set up in Makefile.global are currently available from pg_config; the missing ones are prefix exec_prefix sbindir mandir localedir libexecdir datadir sysconfdir pkgincludedir docdir The first three of these don't seem to be directly referenced anywhere in the Makefiles, so I propose just removing them from Makefile.global. The other ones will need to be added to pg_config's repertoire, unless someone can make a pretty good case that no pgxs-using module would ever need to install into that directory. Also note that I'm assuming the following path variables can continue to be defined as they are, ie, relative to other path variables that will get the pg_config treatment: includedir_server = $(pkgincludedir)/server includedir_internal = $(pkgincludedir)/internal pgxsdir = $(pkglibdir)/pgxs Comments? The other open issue in this area was that on Windows, pg_config needs to return space-free path names to avoid breaking the makefiles. It was suggested that this could be handled by passing pg_config's result path names through GetShortPathName() on that platform. That sounds OK to me but I'm not in a position to write or test such a patch; can someone take care of that? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Gerbil build farm failure
Michael Fuhr [EMAIL PROTECTED] writes: Gerbil's looking better lately: http://pgbuildfarm.org/cgi-bin/show_history.pl?nm=gerbil&br=REL8_0_STABLE Yeah. We've been poking at it off-list, and it seems that the problem was a local build failure due to not having a clean copy of the repository (ye olde junk-in-the-supposedly-clean-vpath-tree problem). regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] State of support for back PG branches
[ Forgot to answer this part ] Devrim GUNDUZ [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Tom Lane wrote: I have a corporate need to keep supporting 7.3, at least to the extent of critical bug fixes, because Red Hat is still on the hook to support that version in RHEL3 for awhile longer. Doesn't Red Hat support RHEL 2.1, and so that PostgreSQL 7.1? Yeah, but I'm not expecting any help from the community on that ;-). (In practice it's hard to imagine any support request for which I'd not tell the user to get off 7.1 anyway...) 7.3 is still reasonably supportable, ie, it's not so full of unfixable problems. Nonetheless it's not clear that the PG development community ought to be spending time on it. One way to phrase this discussion is whether the community still wants to help me support 7.3/RHEL3. If y'all feel you have better uses of your time, I understand completely. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] State of support for back PG branches
Marc G. Fournier [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Andrew Dunstan wrote: Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. This sounds reasonable to me ... I think it is more then most software projects do, isn't it? To translate that into reality: 7.2 (2002-02-04) would be dead already, and 7.3 (2002-11-27) will be dead around the time we are likely to release 8.1. Do people feel comfortable with that? It seems to fit with what I'd like to do right at the moment, which is to release updates back to 7.3 but not 7.2. I'd prefer to measure the time from the release of the follow-on version, so I'd make it 2 years from release of following major version; that would give people a clearer idea of the time frame in which they're expected to update their applications. But I'm not wedded to that. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Vacuum questions...
Alvaro Herrera wrote: On Mon, Sep 26, 2005 at 05:41:24PM +0200, Gaetano Mendola wrote: Joshua D. Drake wrote: Autovacuum is integrated into the backend for 8.1 Can I set the autovacuum parameter per table instead of per engine ? Yes. Finally :-) good work. Regards Gaetano Mendola ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] State of support for back PG branches
Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Andrew Dunstan wrote: Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. This sounds reasonable to me ... I think it is more then most software projects do, isn't it? To translate that into reality: 7.2 (2002-02-04) would be dead already, and 7.3 (2002-11-27) will be dead around the time we are likely to release 8.1. It doesn't say we must drop it, it says we can after 3 years (or 1 release + 2 years if you like) without any troubles of conscience. ISTM that we could and possibly should keep supporting it until it appeared some major patch was required that was too much work or too dangerous. Remember, many people don't want to jump onto a release right away - I know of large enterprises that have a policy not to use the .0 version of anything. So a 3 year cycle is more likely to be a 2 1/2 year cycle in reality. Then factor in testing and migration time and the production life in the field between deployment and end of life might be only about 2 years. That's plenty short enough, especially as we still don't have a nice pg_upgrade utility. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Database file compatibility
Jim C. Nasby wrote: If a database is created with a 64 bit version of initdb, would a 32bit backend be able to talk to it? Likewise, would a backend compiled by a different compiler be able to? Not in my experience, at least from going 32-bit Intel to 64-bit Opteron. If there was some kind of incompatibility, would the backend just refuse to start, or would it start and start silently trashing data? -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] State of support for back PG branches
On Mon, 26 Sep 2005, Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Andrew Dunstan wrote: Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. This sounds reasonable to me ... I think it is more than most software projects do, isn't it? To translate that into reality: 7.2 (2002-02-04) would be dead already, and 7.3 (2002-11-27) will be dead around the time we are likely to release 8.1. Do people feel comfortable with that? It seems to fit with what I'd like to do right at the moment, which is to release updates back to 7.3 but not 7.2. IMHO ... after 3 years of running on a version, if someone hasn't hit some of the bugs that we're back-patching for, they either aren't going to, or should have that as an encouragement to upgrade ... in most cases, I believe that a lot of the ones you've back-patched for, it's something you've fixed in a recent release, and ended up going looking for in past releases to make sure they were safe ... no? I'd prefer to measure the time from the release of the follow-on version, so I'd make it 2 years from release of following major version; that would give people a clearer idea of the time frame in which they're expected to update their applications. But I'm not wedded to that. 'k, if you mean 'major version' == x.0 (ie. 7.0.0, 8.0.0), then I think the span of time + 2 years is *way* too long, considering an average of, what, 5 years between major releases ... Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] State of support for back PG branches
Marc G. Fournier [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Tom Lane wrote: I'd prefer to measure the time from the release of the follow-on version, so I'd make it 2 years from release of following major version; that would give people a clearer idea of the time frame in which they're expected to update their applications. But I'm not wedded to that. 'k, if you mean 'major version' == x.0 (ie. 7.0.0, 8.0.0), then I think the span of time + 2 years is *way* too long, considering an average of, what, 5 years between major releases ... No, I mean the clock starts to run on 8.0 when we release 8.1. It's been about a year between major releases lately, so 1 release + 2 years is in the same ballpark as 3 years. But I think the former gives people more clarity about how much time they have to do upgrades. It's not a big deal either way, probably --- for instance, as of now 7.2 is dead and 7.3 still alive by either rule. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] State of support for back PG branches
This sounds reasonable to me ... I think it is more than most software projects do, isn't it? To translate that into reality: 7.2 (2002-02-04) would be dead already, and 7.3 (2002-11-27) will be dead around the time we are likely to release 8.1. Do people feel comfortable with that? It seems to fit with what I'd like to do right at the moment, which is to release updates back to 7.3 but not 7.2. I think there should be levels of support. Previous major release less than 18 months old (this would cover 7.4): bug fixes, security fixes. Previous major release greater than 18 months but not over 3 years: security fixes. Over 3 years... you're on your own. Although this will be tougher, as versions such as 7.4 could easily be running in another 3 years, since it is a reasonably stable version without any significant issue (meaning production-issue bugs). Also, from a commercial perspective, the community would be freed up a little to concentrate on delivering the kick-ass product, where commercial interests could help keep up with bug fixes, security fixes on older releases etc... Heck, that is what RedHat does. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] State of support for back PG branches
Joshua D. Drake [EMAIL PROTECTED] writes: I think there should be levels of support. There already are, in that only fairly major bugs get backpatched to the way-back releases. I think that's right --- the older a release is, the more it means that people still using it value stability over the latest fixes. The question at hand is when are we willing to pull the plug completely and declare that even security holes and data-loss risks won't get fixed. Although this will be tougher as versions such as 7.4 could easily be running in another 3 years as it is a reasonable stable version without any significant issue (meaning production issue bugs). Yeah. The biggest reason we declared 7.1 unsupported is that it has the unfixable transaction-ID-wraparound problem, and we wanted to encourage people to stop using 7.1 and before ASAP. 7.2 has some pretty serious unfixable problems too, such as the lack of error checking associated with OPAQUE-using functions (ye olde select cash_out(2) crash). 7.3 is the oldest version that I think is actually supportable, in that there are no known, unfixable security or data-loss risks. So another way we might approach this is that it's time to kill 7.2 because we want to encourage people to get off it sooner not later, but 7.3 and later still have an indefinite support lifespan ahead of them. In that mindset, we'd only pull the plug on a version when an identifiable reason to kill it emerges. I'd still not commit to an infinite lifespan --- but we might be willing to support solid versions for as long as, say, five years. Or, as you say, we could take the viewpoint that there are commercial companies willing to take on the burden of supporting back releases, and the development community ought not spend its limited resources on doing that. I'm hesitant to push that idea very hard myself, because it would look too much like I'm pushing the interests of my employer Red Hat ... but certainly there's a reasonable case to be made there. 
regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] State of support for back PG branches
On Mon, Sep 26, 2005 at 05:57:08PM -0400, Tom Lane wrote: I had originally been planning to back-port this fix: http://archives.postgresql.org/pgsql-committers/2005-08/msg00213.php as far as 7.2. I've completed the back-port as far as 7.3, but found that 7.2 would be significantly more work because the API of heap_fetch() would need to change from what it was back then, and the patch would therefore have to touch many places that it does not touch in later versions. Given that the bug is so low-probability that it's escaped detection all this time, it doesn't really seem worth the effort (not to mention risk of creating new bugs). This brings up the question of whether we should officially abandon support for 7.2 and/or later branches. I don't think anyone is planning on supporting old branches forever, but when do we stop? I have a corporate need to keep supporting 7.3, at least to the extent of critical bug fixes, because Red Hat is still on the hook to support that version in RHEL3 for awhile longer. I have no such interest in 7.2 (which is one reason I'm not excited about doing the extra work to back-patch the VACUUM/ctid fix). I can definitely see that the community might not want to expend more effort on 7.3, though. I have no idea what the needs of other distributions might be. Thoughts anyone? Expiry dates are good. Although there are people who don't read anything, enough people will read release notes that have a definite date attached, as in: PostgreSQL 8.1 Released: November 30, 2005 Community support ends: November 30, 2008 This could be softened a bit in a sentence or two below :) Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] State of support for back PG branches
The question at hand is when are we willing to pull the plug completely and declare that even security holes and data-loss risks won't get fixed. It is definitely a sensitive issue because we (my community hat on) want to make sure and not alienate people because we won't support a version for very long. However most major projects do this to a degree. RedHat does not release fixes for 7.3 anymore for example. Although the fedora-legacy project does. I don't know if it is affiliated with RedHat like Fedora is though. 7.3 is the oldest version that I think is actually supportable, in that there are no known, unfixable security or data-loss risks. I would think that 7.3 would be security fixes and data loss fixes only. It will be 3 years old in two months. In OSS terms that is quite a long time. This isn't like Windows where you see a release every 5 years. From a commercial perspective, I do have quite a few customers still on 7.3. Frankly I won't be able to get many of them to upgrade *until* the community deprecates 7.3. So another way we might approach this is that it's time to kill 7.2 because we want to encourage people to get off it sooner not later, but 7.3 and later still have an indefinite support lifespan ahead of them. Well from a community perspective that is definitely a very nice way to approach. Possibly a sub community or pgFoundry project --- postgresql-legacy? In that mindset, we'd only pull the plug on a version when an identifiable reason to kill it emerges. I'd still not commit to an infinite lifespan --- but we might be willing to support solid versions for as long as, say, five years. Five years is an awful long time in our community. We would in theory still be supporting 7.1. 7.1 was a great distro in comparison to 7.0. Although it did have the XID issue. 
Or, as you say, we could take the viewpoint that there are commercial companies willing to take on the burden of supporting back releases, and the development community ought not spend its limited resources on doing that. I'm hesitant to push that idea very hard myself, because it would look too much like I'm pushing the interests of my employer Red Hat ... but certainly there's a reasonable case to be made there. Well one way to look at it, is that by doing so the community is enabling a commercial opportunity which in turn, could (and hopefully would) encourage said commercial entities to donate more resources to the project. Look at how much RedHat gives to Gnome, or Novell to mono. I know it is not the community responsibility to ensure a commercial opportunity but it can't hurt either. Sincerely, Joshua D. Drake regards, tom lane -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] State of support for back PG branches
Tom, Or, as you say, we could take the viewpoint that there are commercial companies willing to take on the burden of supporting back releases, and the development community ought not spend its limited resources on doing that. I'm hesitant to push that idea very hard myself, because it would look too much like I'm pushing the interests of my employer Red Hat ... but certainly there's a reasonable case to be made there. Well, I think you know my opinion on this. Since there *are* commercial companies available, I think we should use them to reduce back-patching effort. I suggest that our policy should be: the community will patch two old releases, and beyond that if it's convenient, but no promises. In other words, when 8.1 comes out we'd be telling 7.3 users We'll be patching this only where we can apply 7.4 patches. Otherwise, better get a support contract. Of course, a lot of this is up to individual initiative; if someone fixes a patch so it applies back to 7.2, there's no reason not to make it available. However, there's no reason *you* should make it a priority. --Josh ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] State of support for back PG branches
On Mon, Sep 26, 2005 at 09:27:28PM -0400, Andrew Dunstan wrote: Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: On Mon, 26 Sep 2005, Andrew Dunstan wrote: Maybe something like this would do: We will attempt to maintain support of each major version for 3 years after its release, although this will not always be possible. After that time any major support requirement is likely to result in support being ended. This sounds reasonable to me ... I think it is more then most software projects do, isn't it? To translate that into reality: 7.2 (2002-02-04) would be dead already, and 7.3 (2002-11-27) will be dead around the time we are likely to release 8.1. It doesn't say we must drop it, it says we can after 3 years (or 1 release + 2 years if you like) without any troubles of conscience. ISTM that we could and possibly should keep supporting it until it appeared some major patch was required that was too much work or too dangerous. Remember, many people don't want to jump onto a release right away - I know of large enterprises that have a policy not to use the .0 version of anything. So a 3 year cycle is more likely to be a 2 1/2 year cycle in reality. Then factor in testing and migration time and the production life in the field between deployment and end of life might be only about 2 years. That's plenty short enough, especially as we still don't have a nice pg_upgrade utility. We started our upgrade from 7.2 to 7.4 about 20 months ago and finished it about 10 months ago, skipping 7.3 entirely. We've only just today hit our first problem in 7.4, and it's fixed by upgrading to 7.4.current, rather than the 7.4.something we originally upgraded to from 7.2.something. We'll be skipping 8.0 completely and the next step will probably be to 8.1.something (or possibly 8.2.something, depending on how bizgres looks in 3 months time). We'd probably consider upgrading our customers more often, but a dump and restore is extremely painful. 
Just a view from the pg-based-enterprise-application world.

A nice pg_upgrade utility would make a big difference. Clearly an in-place upgrade is possible, but maintaining one is hard. There are two broad ways of running a pg_upgrade project: one that is entirely independent of the main codebase, and one that puts requirements on the main codebase developers (if you change $foo, you provide code to translate old $foo to new $foo). Any feel for the relative difficulty of the two approaches? And how much push-back there'd be on the latter?

Cheers, Steve

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] State of support for back PG branches
A nice pg_upgrade utility would make a big difference. Clearly an in-place upgrade is possible, but maintaining one is hard. There are two broad ways of running a pg_upgrade project: one that is entirely independent of the main codebase, and one that puts requirements on the main codebase developers (if you change $foo, you provide code to translate old $foo to new $foo). Any feel for the relative difficulty of the two approaches? And how much push-back there'd be on the latter?

You can do in-place upgrades with Slony-I and Mammoth Replicator.

Sincerely, Joshua D. Drake

-- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] State of support for back PG branches
Steve Atkins [EMAIL PROTECTED] writes: We'll be skipping 8.0 completely and the next step will probably be to 8.1.something (or possibly 8.2.something, depending on how bizgres looks in 3 months' time).

[ raised eyebrow... ] Has bizgres obtained a crystal ball from somewhere? There is *no* way anyone could provide you anything that has any legitimate claim on the name PG 8.2 three months from now.

regards, tom lane

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] unchecked malloc
There are several places in both the backend and the tools that forget to check the return value of malloc(). For example (8.0.1):

backend/port/dynloader/beos.c/pg_dlopen()
backend/bootstrap/bootstrap.c/AddStr()
port/strdup.c/strdup()
bin/pg_dump/common.c/findParentsByOid()
...

I am thinking we should fix them. Basically we have two ways: one is to define a pg_malloc() as psql already does, the other is to fix these places one by one. I prefer the first method, since it hides the return-value check details in the function and is less error-prone. To report the out-of-memory error, we should differentiate whether ErrorContext is already set up.

Comments?

Regards, Qingqing

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster