Re: [GENERAL] Postgre connect on Postgre
Hi,

Anderson dos Santos Donda wrote:
> Thanks man!! I'll study this module!!

You should also be aware that sometimes, instead of connecting two
separate databases via dblink or similar, two schemas in one database can
be used instead. It really depends on what you are actually doing whether
there are even more solutions (like Slony, for example).

Regards
Tino

PS: the database likes to be called PostgreSQL or Postgres :-)

> On Tue, Aug 26, 2008 at 11:19 PM, Christophe [EMAIL PROTECTED] wrote:
>> On Aug 26, 2008, at 7:10 PM, Anderson dos Santos Donda wrote:
>>> How can I connect a postgre database to another postgre database, and
>>> manipulate the data on both databases?
>>
>> There is a module in contrib just for such a purpose:
>> http://www.postgresql.org/docs/8.3/interactive/dblink.html
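For reference, a minimal sketch of the contrib/dblink call being
recommended here; the connection-string values and the remote table and
column names are made up for illustration:

    -- run a query on a remote database and expose the result as a local row set
    SELECT *
      FROM dblink('host=otherhost dbname=otherdb user=me',
                  'SELECT id, name FROM remote_table')
        AS t(id integer, name text);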
[GENERAL] ENABLE / DISABLE ALL TRIGGERS IN DATABASE
Hi,

I think this has been asked before, but I can't find the answer in the
archives or on Google. How do I disable/enable all the triggers in a
database? I have a problem with disabled triggers after a database
restore.

If there is no simple way, could this be done in a function where you
find the table names and construct the needed commands as strings? If so,
how do I get all the table names from the database?

Best regards and thanks!
Teemu Juntunen
[GENERAL] update and group by/aggregate
How can I update catalog_items.Authors

create table catalog_items(
  ItemID bigint primary key,
  Authors varchar(1024)
);

taking results from

select ia.ItemID, array_accum(trim(' \t]' from a.Name))
  from catalog_itemauthor ia
  join catalog_author a on a.AuthorID = ia.AuthorID
  where a.Name is not null
    and length(trim(' \t' from a.Name)) > 1
  group by ia.ItemID;

Currently I'm achieving the same result with a plpgsql function with a
for loop, and I'm planning to see if using aggregates is going to be
faster, and then see if it is even worth keeping an extra column...

create or replace function UpdateAuthors()
returns void as
$$
declare
  _row record;
  _ItemID bigint;
  _Authors varchar(1024);
  _Name varchar(50);
begin
  _Authors := '';
  _ItemID := null;
  for _row in
    select a.Name, ia.ItemID
      from {catalog_itemauthor} ia
      join {catalog_author} a on a.AuthorID = ia.AuthorID
      order by ia.ItemID
  loop
    if (_row.ItemID <> _ItemID) then
      if (length(_Authors) > 2) then
        _Authors := substring(_Authors from 3);
        update {catalog_items} set Authors = _Authors
          where ItemID = _ItemID;
      end if;
      _Authors := '';
    end if;
    _ItemID := _row.ItemID;
    _Name := trim(' \t' from _row.Name);
    if (length(_Name) > 0) then
      _Authors := _Authors || ', ' || _Name;
    end if;
  end loop;
  return;
end;
$$ language plpgsql volatile;

BTW I've noticed that array_append gives back non-uniform results:

select array_append(ARRAY['az','e','i'],'');
  {az,e,i,""}

while I'd expect

  {az,e,i,}

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
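Note that array_accum is not a built-in aggregate in 8.3; the poster is
presumably using the user-defined aggregate from the PostgreSQL manual,
defined as:

    CREATE AGGREGATE array_accum (anyelement) (
        sfunc = array_append,
        stype = anyarray,
        initcond = '{}'
    );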
Re: [GENERAL] update and group by/aggregate
2008/8/27 Ivan Sergio Borgonovo [EMAIL PROTECTED]:
> BTW I've noticed that array_append gives back non-uniform results:
>
> select array_append(ARRAY['az','e','i'],'');
>   {az,e,i,""}
>
> while I'd expect
>
>   {az,e,i,}

It's correct: double quotes are used only for elements containing spaces,
or for the empty string. If you want to print the array content nicely,
use the array_to_string function.

regards
Pavel Stehule
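A quick sketch of Pavel's suggestion: array_to_string joins the elements
with a separator and bypasses the array-literal quoting rules entirely,
so the empty element prints as nothing:

    select array_to_string(ARRAY['az','e','i',''], ',');
    -- yields: az,e,i,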
Re: [GENERAL] ENABLE / DISABLE ALL TRIGGERS IN DATABASE
Hi,

I made the function myself. Here it is, if anyone else has a need for
this.

Teemu

/* Enable/disable all the triggers in database */
CREATE OR REPLACE FUNCTION fn_triggerall(DoEnable boolean)
  RETURNS integer AS
$BODY$
DECLARE
  mytables RECORD;
BEGIN
  FOR mytables IN
    SELECT relname FROM pg_class
      WHERE reltriggers > 0 AND NOT relname LIKE 'pg_%'
  LOOP
    IF DoEnable THEN
      EXECUTE 'ALTER TABLE ' || mytables.relname || ' ENABLE TRIGGER ALL';
    ELSE
      EXECUTE 'ALTER TABLE ' || mytables.relname || ' DISABLE TRIGGER ALL';
    END IF;
  END LOOP;
  RETURN 1;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

ALTER FUNCTION fn_triggerall(DoEnable boolean) OWNER TO postgres;
COMMENT ON FUNCTION fn_triggerall(DoEnable boolean)
  IS 'Enable/disable all the triggers in database';
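Usage would then look something like this (a sketch; the caller needs
enough privileges to ALTER the affected tables):

    SELECT fn_triggerall(false);   -- disable triggers on all user tables
    -- ... run the restore or bulk load ...
    SELECT fn_triggerall(true);    -- re-enable them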
Re: [GENERAL] ENABLE / DISABLE ALL TRIGGERS IN DATABASE
On Wednesday 27 August 2008 04:24, Teemu Juntunen wrote:
> How do I disable/enable all the triggers in a database? I have a
> problem with disabled triggers after a database restore. [...]

You can accomplish this by manipulating the run-time parameter
session_replication_role. For example, from within your psql session:

  SET SESSION session_replication_role = replica;

This will prevent all triggers from firing for the entire session, except
those defined as replica triggers. We use this all the time.

HTH...
--
Terry Lee Tucker
Turbo's IT Manager
Turbo, division of Ozburn-Hessey Logistics
2251 Jesse Jewell Pkwy NE
Gainesville, GA 30501
Tel: (336) 372-6812 Fax: (336) 372-6812 Cell: (336) 404-6987
[EMAIL PROTECTED]
www.turbocorp.com
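A minimal sketch of the whole pattern (session_replication_role is
per-session, defaults to 'origin', and changing it requires superuser
rights; the COPY here is just a stand-in for whatever bulk load or
restore you are running):

    SET SESSION session_replication_role = replica;
    -- ordinary (origin) triggers are suppressed from here on
    COPY mytable FROM '/tmp/mytable.dat';
    SET SESSION session_replication_role = DEFAULT;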
Re: [GENERAL] update and group by/aggregate
On Wed, Aug 27, 2008 at 10:32:43AM +0200, Ivan Sergio Borgonovo wrote:
> How can I update catalog_items.Authors
>
> create table catalog_items(
>   ItemID bigint primary key,
>   Authors varchar(1024)
> );

The type varchar(1024) looks a little awkward; wouldn't an unadorned TEXT
be easier? If you explicitly want to limit things to 1024 characters then
what you're doing is correct.

> taking results from
>
> select ia.ItemID, array_accum(trim(' \t]' from a.Name))
>   from catalog_itemauthor ia
>   join catalog_author a on a.AuthorID = ia.AuthorID
>   where a.Name is not null
>     and length(trim(' \t' from a.Name)) > 1
>   group by ia.ItemID;

UPDATE catalog_items SET authors = array_to_string(x.authors, ', ')
FROM (
  SELECT ia.itemid, array_accum(trim(' \t]' from a.name)) AS authors
  FROM catalog_itemauthor ia
    JOIN catalog_author a ON a.authorid = ia.authorid
  WHERE a.name IS NOT NULL
    AND length(trim(' \t' from a.name)) > 1
  GROUP BY ia.itemid) x;

is a reasonably direct translation. Though I may be tempted to use
something more like:

UPDATE catalog_items SET authors = array_to_string(x.authors, ', ')
FROM (
  SELECT ia.itemid, array_accum(a.name) AS authors
  FROM catalog_itemauthor ia, (
    SELECT authorid, trim(' \t' from name) AS name
    FROM catalog_author) a
  WHERE ia.authorid = a.authorid
    AND a.name IS NOT NULL
    AND length(a.name) > 1
  GROUP BY ia.itemid) x;

to ensure that the characters trimmed from the authors' names are
consistent.

  Sam
Re: [GENERAL] update and group by/aggregate
Oops, both my statements were horribly broken. They needed a WHERE clause
for the UPDATE.

On Wed, Aug 27, 2008 at 11:44:20AM +0100, Sam Mason wrote:
> UPDATE catalog_items SET authors = array_to_string(x.authors, ', ')
> FROM (
>   SELECT ia.itemid, array_accum(trim(' \t]' from a.name)) AS authors
>   FROM catalog_itemauthor ia
>     JOIN catalog_author a ON a.authorid = ia.authorid
>   WHERE a.name IS NOT NULL
>     AND length(trim(' \t' from a.name)) > 1
>   GROUP BY ia.itemid) x;

should be:

UPDATE catalog_items i SET authors = array_to_string(x.authors, ', ')
FROM (
  SELECT ia.itemid, array_accum(trim(' \t]' from a.name)) AS authors
  FROM catalog_itemauthor ia
    JOIN catalog_author a ON a.authorid = ia.authorid
  WHERE a.name IS NOT NULL
    AND length(trim(' \t' from a.name)) > 1
  GROUP BY ia.itemid) x
WHERE i.itemid = x.itemid;

and:

> UPDATE catalog_items SET authors = array_to_string(x.authors, ', ')
> FROM (
>   SELECT ia.itemid, array_accum(a.name) AS authors
>   FROM catalog_itemauthor ia, (
>     SELECT authorid, trim(' \t' from name) AS name
>     FROM catalog_author) a
>   WHERE ia.authorid = a.authorid
>     AND a.name IS NOT NULL
>     AND length(a.name) > 1
>   GROUP BY ia.itemid) x;

should be:

UPDATE catalog_items i SET authors = array_to_string(x.authors, ', ')
FROM (
  SELECT ia.itemid, array_accum(a.name) AS authors
  FROM catalog_itemauthor ia, (
    SELECT authorid, trim(' \t' from name) AS name
    FROM catalog_author) a
  WHERE ia.authorid = a.authorid
    AND a.name IS NOT NULL
    AND length(a.name) > 1
  GROUP BY ia.itemid) x
WHERE i.itemid = x.itemid;

Sorry!

  Sam
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, 27 Aug 2008 10:32:43 +0200 Ivan Sergio Borgonovo [EMAIL PROTECTED] wrote:
> create or replace function UpdateAuthors() returns void as $$
> [the plpgsql loop quoted in full earlier in the thread]

Replying to myself:

update catalog_items set Authors = s.Authors
from (
  select ia.ItemID,
         array_to_string(array_accum(trim(' \t]' from a.Name)), ', ') as Authors
    from catalog_itemauthor ia
    join catalog_author a on a.AuthorID = ia.AuthorID
   where a.Name is not null
     and length(trim(' \t' from a.Name)) > 1
   group by ia.ItemID) as s
where s.ItemID = catalog_items.ItemID;

but this looks much slower than the function:
function: 113 sec vs. single statement: 488 sec.
I repeated the test 3 times with similar results.

Can anybody explain why aggregates underperform so badly?

I just read that most of the time I can't be smarter than the planner,
and I thought that this would be one of the circumstances where the
planner could outperform my handwritten function.

Here is the explain:

Hash Join  (cost=137408.51..271599.78 rows=209674 width=221)
  Hash Cond: ("outer".itemid = "inner".itemid)
  ->  HashAggregate  (cost=32994.81..36664.11 rows=209674 width=58)
        ->  Hash Join  (cost=8544.62..31946.44 rows=209674 width=58)
              Hash Cond: ("outer".authorid = "inner".authorid)
              ->  Seq Scan on catalog_itemauthor ia  (cost=0.00..10297.21 rows=629021 width=12)
              ->  Hash  (cost=8309.00..8309.00 rows=94248 width=54)
                    ->  Seq Scan on catalog_author a  (cost=0.00..8309.00 rows=94248 width=54)
                          Filter: ((name IS NOT NULL) AND (length(btrim((name)::text, E' \011'::text)) > 1))
  ->  Hash  (cost=79538.96..79538.96 rows=833496 width=189)
        ->  Seq Scan on catalog_items  (cost=0.00..79538.96 rows=833496 width=189)

Thanks to Pavel, who suggested the use of array_to_string.

Sam... I made your same error the first time I wrote the above
statement... missing the where clause. But I can't understand the need
for your longer version to ensure that the characters trimmed from the
authors' names are consistent.

I prefer to put a constraint on the length of varchar as an early warning
for something that is unexpected. E.g. if the loop was not working as
expected I'd get an error. In fact that's exactly what happened during
development of the above function. If Authors was a text field I'd be
waiting forever, fill the DB with rubbish, etc...

There are no unlimited fields... there are fields where the upper limit
may be hard to guess. 1024 is actually an inherited choice... but I did
some research to find a reasonable upper limit (highest # of authors *
longest author name).

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
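A hedged suggestion when chasing a discrepancy like this: EXPLAIN ANALYZE
(rather than plain EXPLAIN) reports actual row counts and per-node
timings alongside the estimates, which usually shows where the time is
really going. For the aggregate half of the statement, for example:

    EXPLAIN ANALYZE
    select ia.ItemID,
           array_to_string(array_accum(trim(' \t]' from a.Name)), ', ') as Authors
      from catalog_itemauthor ia
      join catalog_author a on a.AuthorID = ia.AuthorID
     where a.Name is not null
       and length(trim(' \t' from a.Name)) > 1
     group by ia.ItemID;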
Re: [GENERAL] ENABLE / DISABLE ALL TRIGGERS IN DATABASE
Teemu Juntunen wrote:
> How do I disable/enable all the triggers in a database? [...] If so,
> how do I get all the table names from the database?

Use the information_schema, see
http://www.postgresql.org/docs/current/static/information-schema.html

There is a view called triggers that will be useful.

Yours,
Laurenz Albe
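For example, a rough query against that view to see which tables have
triggers (information_schema only exposes objects the current user has
access to):

    SELECT DISTINCT trigger_schema, event_object_table, trigger_name
      FROM information_schema.triggers
     ORDER BY 1, 2, 3;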
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, Aug 27, 2008 at 01:19:30PM +0200, Ivan Sergio Borgonovo wrote:
> but this looks much slower than the function:
> function: 113 sec vs. single statement: 488 sec.
> I repeated the test 3 times with similar results.

Have you been vacuuming (non-full) between runs? And, as always, are the
stats reasonably up to date?

> Can anybody explain why aggregates underperform so badly?

You could try just running the SELECT part to see how long the
aggregation takes.

> I just read that most of the time I can't be smarter than the planner,
> and I thought that this would be one of the circumstances where the
> planner could outperform my handwritten function.

It's quite easy to outsmart the planner with large amounts of data, but
it's surprising how well it does most of the time. Generally you can just
write whatever is obvious and the planner will do something good with it.
If it doesn't do the right thing then you can worry about performance,
rather than most languages, where you have to worry about performance
from the start.

> here is the explain:
> [explain output quoted above]

The seq scan of catalog_items looks a little strange to me; have you set
work_mem to some obscenely big value?

> Sam... I made your same error the first time I wrote the above
> statement... missing the where clause. But I can't understand the need
> for your longer version to ensure that the characters trimmed from the
> authors' names are consistent.

You're trimming an extra close square bracket (']') in one of the trim
statements and not in the other. If you just do it in one place then you
don't have to worry about inconsistency.

> I prefer to put a constraint on the length of varchar as an early
> warning for something that is unexpected. E.g. if the loop was not
> working as expected I'd get an error. In fact that's exactly what
> happened during development of the above function.

OK, Knuth generally seemed to recommend doing similar things. Most modern
programming people tend to advocate removing artificial limits as some
fundamental good. I'm never quite so sure, either way!

  Sam
Re: [GENERAL] PITR - base backup question
On Tue, Aug 26, 2008 at 9:04 PM, Richard Broersma [EMAIL PROTECTED] wrote:
> On Tue, Aug 26, 2008 at 5:38 PM, Merlin Moncure [EMAIL PROTECTED] wrote:
>> If you ever want to mess around with log shipping, I strongly suggest
>> you go through the motions of setting up a warm standby via the
>> pg_standby utility and practice popping the standby out of recovery.
>
> Thanks for the reply! I want to be certain that I understand the point
> you are making: is setting up a warm standby server good practice for
> gaining experience in PITR, or do you mean that PITR is a good starting
> point for setting up a warm standby server?

Setting up a warm standby is good for understanding PITR. A warm standby
just 'lays on top' of PITR, and you should learn how to do it.

merlin
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, 27 Aug 2008 12:45:42 +0100 Sam Mason [EMAIL PROTECTED] wrote:
> On Wed, Aug 27, 2008 at 01:19:30PM +0200, Ivan Sergio Borgonovo wrote:
> > but this looks much slower than the function:
> > function: 113 sec vs. single statement: 488 sec.
> > I repeated the test 3 times with similar results.
>
> Have you been vacuuming (non-full) between runs? And, as always, are
> the stats reasonably up to date?

There is autovacuum running regularly, and I vacuumed full, analyzed and
reindexed everything recently: that means that all changes to the DB were
done BEFORE the maintenance stuff, but then I executed several selects on
tables (including the ones involved). I tried to run the function and the
statement 3 times, one after the other... so they should have been in a
similar situation.

> > Can anybody explain why aggregates underperform so badly?
>
> You could try just running the SELECT part to see how long the
> aggregation takes.

33 sec. Surprising.

> It's quite easy to outsmart the planner with large amounts of data,
> but it's surprising how well it does most of the time. [...]

I really thought that in this case the planner was going to outsmart me,
since in that statement it could see a lot more optimisation than me,
knowing the nature of the data.

> > here is the explain:
> > [explain output quoted above]
>
> The seq scan of catalog_items looks a little strange to me; have you
> set work_mem to some obscenely big value?

32Mb.

I think postgresql users, and its image, would greatly benefit from some
sort of tool that gives suggestions on how to set postgresql.conf
according to the hardware... and maybe asks some questions of the user.

Tom suggested raising that number into the range of tens of Mb for
another problem. I saw the hot spot was at 16Mb, and considering I was
expecting the DB to get bigger, I increased it to 32Mb.

> You're trimming an extra close square bracket (']') in one of the trim
> statements and not in the other. If you just do it in one place then
> you don't have to worry about inconsistency.

Guess: that was a typo. regexp brain context switching :)

> > I prefer to put a constraint on the length of varchar as an early
> > warning for something that is unexpected. E.g. if the loop was not
> > working as expected I'd get an error. In fact that's exactly what
> > happened during development of the above function.
>
> OK, Knuth generally seemed to recommend doing similar things. Most
> modern programming people tend to advocate removing artificial limits
> as some fundamental good. I'm never quite so sure, either way!

I do know Knuth and I think I share his opinions. I don't know modern
programming people and I'm alien to the concept of fundamental good. But
well, I'm here to learn. Could you point me to some explanation of why it
should be a fundamental good in a DB context?

I'd say that:
a) you may discover that your forecast of the upper limit was wrong, and
   you could see things breaking suddenly;
b) you incur a small overhead (is it?).

I do think that most of the time it is worth it (and part of the problem)
to make a reasonable forecast. If I can't really make one, or it requires
too much effort to make a reasonable one, at least I start developing
with an unreasonable constraint just to catch some errors earlier, then
drop the constraint and leave a comment.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] PITR - base backup question
I have what I have sometimes called a 'tepid spare' backup. Once a week I
copy the physical files over to another system (actually to two of them)
and every few hours I make sure the archived WAL log files are in sync
(using rsync).

Anyway, here's the cookbook guide I wrote for updating one of the backup
servers, which I have used several times, so I don't have to rediscover
the steps each time. I think it has sufficient detail that someone other
than me could follow it. Obviously it would have to be changed to apply
to your file organization scheme. Enjoy!
--
Mike Nolan

How to restore the PostgreSQL low-level backup tar files and WAL files to
do a point-in-time recovery (PITR) on a backup server.

This note will explain how to perform a point-in-time recovery of the
PostgreSQL database using the low-level backup files and the archived WAL
(write-ahead log) files. (These steps should work on either server, with
any differences in actual file locations dealt with using symbolic
links.)

The low-level backup files and WAL log files should already be present on
the backup servers. The low-level tar files are copied when they are
created (usually early on a Tuesday morning each week), and the WAL log
files are rsync'ed to both servers every few hours, so that both backup
servers should have files enabling them to be restored to a state that is
no more than a few hours out of sync with the live database server. (In
the future, we hope to keep one of the backup servers in full real-time
synchronization with the live server, using a different technique.)

The steps given below will generally take 3-5 hours to run, depending on
how many archived log files need to be processed when the PostgreSQL
server is restarted.

These instructions assume some familiarity with Unix/Linux system
administration tools, including the vi editor, and with database
administration procedures, though not necessarily detailed knowledge of
PostgreSQL. For more details, see chapter 23 of the PostgreSQL
documentation, especially section 23.3. The documentation for version 8.2
is at http://www.postgresql.org/docs/8.2/static/backup.html

1. su to root on the backup server that the restore will be performed on.

2. su to postgres (su - postgres).

3. Shut down the PostgreSQL server running on the backup server, if any:

      pg_ctl stop

   (Use 'ps ax' to make sure the server is stopped.)

4. cd to /disk1/postgres/data and copy two files to /tmp:

      cd /disk1/postgres/data
      cp pg_hba.conf /tmp
      cp recovery.done /tmp

5. Delete the entire contents of the /disk1/postgres/data directory tree.
   MAKE ABSOLUTELY SURE YOU ARE IN THE /disk1/postgres/data directory!

      cd /disk1/postgres/data
      pwd
      rm -rf *

6. Restore the tar file for the low-level backup from the live server:

      tar xvf /usr/local/pgsql/tardir/pg_part1.tar

   (This restore may take 2-3 hours.)

7. Remove the PostgreSQL log file and the WAL files that were restored
   from the tar archive:

      rm log.out
      cd pg_xlog
      rm 00*

   (Do not remove the archive_status subdirectory.)

8. Copy the pg_hba.conf file back from the /tmp directory:

      cd /disk1/postgres/data
      cp /tmp/pg_hba.conf .

   (This is necessary to keep this database server configured for
   restricted access. If more expansive access is needed, the restored
   pg_hba.conf file may be more appropriate.)

9. Copy the recovery configuration file from the /tmp directory (changing
   the file name):

      cp /tmp/recovery.done recovery.conf

   (This file may be edited to do a point-in-time recovery other than to
   the end of the PITR log files; see the PostgreSQL documentation for
   details.)

10. Delete the entire contents of the /disk2/postgres/data directory
    tree. MAKE ABSOLUTELY SURE YOU ARE IN THE /disk2/postgres/data
    directory!

      cd /disk2/postgres/data
      pwd
      rm -rf *

11. Restore the 2nd tar archive:

      tar xvf /usr/local/pgsql/tardir/pg_part2.tar

    (This restore may take around half an hour.)

12. Go to the directory where the WAL files have been archived on the
    server and remove all files older than the file matching the last
    .backup file. The fastest way to do this is as follows:

      cd /usr/local/pgsql/archivedir
      ls -1 > files

    This will produce a file listing all files in the archivedir
    directory. Now edit that file (using vi) to take out all lines after
    the latest 'backup' file (including the file just ahead of that
    backup file, since it is actually the first file that needs to be
    processed). (Note: the ones after the latest backup file are the
    files we're NOT going to delete, so we don't want those file names in
    this text file.) For example, the directory list may look like this:

      00010096000A
      00010096000B
      00010096000C
      00010096000C.004027C0.backup
      00010096000D

    There will
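For reference, the recovery.conf mentioned in step 9 would, for an
8.2-era setup like this one, typically contain little more than a
restore_command; a sketch using the archive directory from step 12 (the
commented-out line is the sort of thing you would edit in for a true
point-in-time recovery, with an illustrative timestamp):

    restore_command = 'cp /usr/local/pgsql/archivedir/%f "%p"'
    #recovery_target_time = '2008-08-26 19:30:00'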
[GENERAL] Dumping/Restoring with constraints?
Hello. I have a database dump, with data and schema, which includes all
the constraints and rules. But it seems the pg_dumpall command does not
copy data in such a way that foreign keys are satisfied upon restoring.
Because tables are inter-related, importing them keeps giving errors, and
eventually no data is imported.

Neither pg_dumpall nor pg_restore seems to have a "without constraints"
or "delay constraints check" type of option. What am I missing? Thanks
for any advice.
Re: [GENERAL] Dumping/Restoring with constraints?
On Wednesday 27 August 2008 09:19, Phoenix Kiula wrote:
> Hello. I have a database dump, with data and schema, which includes all
> the constraints and rules. But it seems the pg_dumpall command does not
> copy data in such a way that foreign keys are satisfied upon restoring.
> [...]

We have all sorts of constraints and foreign keys, and we have never had
any problem with pg_restore related to dumping such that foreign keys are
satisfied. You must have data already in the database that violates the
constraints.

You can restore in two phases; that is, by restoring the schema, and then
the data using --disable-triggers. I'm assuming you are doing a binary
dump. See the man page for pg_restore.

HTH
--
Terry Lee Tucker
Turbo's IT Manager
www.turbocorp.com
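A sketch of the two-phase restore Terry describes, assuming a
custom-format ('binary') dump and a database named mydb (note that
--disable-triggers generally requires connecting as a superuser):

    pg_dump -Fc mydb > mydb.dump                        # binary (custom-format) dump
    pg_restore -d mydb --schema-only mydb.dump          # phase 1: schema
    pg_restore -d mydb --data-only --disable-triggers mydb.dump   # phase 2: data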
Re: [GENERAL] PITR - base backup question
On Wed, Aug 27, 2008 at 9:18 AM, Michael Nolan [EMAIL PROTECTED] wrote:
> I have what I have sometimes called a 'tepid spare' backup. Once a week
> I copy the physical files over to another system (actually to two of
> them) and every few hours I make sure the archived WAL log files are in
> sync (using rsync).

I have a couple of comments... see below:

> 3. Shut down the PostgreSQL server running on the backup server, if
>    any:
>       pg_ctl stop
>    (Use 'ps ax' to make sure the server is stopped.)

Probably pg_ctl -m fast stop, or -m immediate... since we are overwriting
it.

> 5. Delete the entire contents of the /disk1/postgres/data directory
>    tree. MAKE ABSOLUTELY SURE YOU ARE IN THE /disk1/postgres/data
>    directory!
>       cd /disk1/postgres/data
>       pwd
>       rm -rf *

Why not just rm -rf /disk1/postgres/data?

> 6. Restore the tar file for the low-level backup from the live server:
>       tar xvf /usr/local/pgsql/tardir/pg_part1.tar
>    (This restore may take 2-3 hours.)
>
> 7. Remove the PostgreSQL log file and the WAL files that were restored
>    from the tar archive

I prefer to exclude WAL files from the original tar, and recreate the
folders here (being careful to chown them to the postgres account). Every
little bit helps.

> 12. Go to the directory where the WAL files have been archived on the
>     server and remove all files older than the file matching the last
>     .backup file. The fastest way to do this is as follows:
>       cd /usr/local/pgsql/archivedir
>       ls -1 > files

This is a nice touch. With a little bash-fu you could do a find | xargs
rm and list/kill the files in one pass.

In the standby setups I've done, I usually script the whole process: a
prep on the main and a startup on the standby.

merlin
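One possible shape of that one-pass cleanup, as a hedged bash sketch: it
assumes the archive directory from Mike's guide and a single timeline (so
WAL file names sort chronologically), and should be tried against a
scratch directory before being trusted with a real archive:

    cd /usr/local/pgsql/archivedir
    latest=$(ls -1 *.backup | tail -n 1)   # newest backup history file
    first=${latest%%.*}                    # WAL segment the backup started in: keep it
    for f in *; do
        # delete anything that sorts before the first segment we still need
        [ "$f" \< "$first" ] && rm -- "$f"
    done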
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 9:29 PM, Terry Lee Tucker [EMAIL PROTECTED] wrote:
> We have all sorts of constraints and foreign keys, and we have never
> had any problem with pg_restore related to dumping such that foreign
> keys are satisfied. You must have data already in the database that
> violates the constraints. You can restore in two phases; that is, by
> restoring the schema, and then the data using --disable-triggers. I'm
> assuming you are doing a binary dump. See the man page for pg_restore.

Thanks for this. I don't have any foreign key violations in my existing
database. I think the violation is happening because, upon restoring, the
table that is being populated checks another table that doesn't yet have
data.

I am not using pg_restore. I am just using the psql --file=FILENAME
syntax. Is that an issue?
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, Aug 27, 2008 at 02:58:18PM +0200, Ivan Sergio Borgonovo wrote:
> On Wed, 27 Aug 2008 12:45:42 +0100 Sam Mason [EMAIL PROTECTED] wrote:
> > Have you been vacuuming (non-full) between runs? And, as always, are
> > the stats reasonably up to date?
>
> There is autovacuum running regularly, and I vacuumed full, analyzed
> and reindexed everything recently [...] I tried to run the function and
> the statement 3 times, one after the other... so they should have been
> in a similar situation.

I'd probably start by doing a normal vacuum, then run a command that
touches every row, then run a normal vacuum again. Between every
iteration you want to run a normal vacuum, otherwise the later runs will
see more garbage than the previous run (updating a row leaves a dead row
behind).

> > > Can anybody explain why aggregates underperform so badly?
> >
> > You could try just running the SELECT part to see how long the
> > aggregation takes.
>
> 33 sec. Surprising.

OK, so it's the UPDATE that's taking the time. I'm not experienced enough
to say why though.

> > It's quite easy to outsmart the planner with large amounts of data,
> > but it's surprising how well it does most of the time. [...]
>
> I really thought that in this case the planner was going to outsmart
> me, since in that statement it could see a lot more optimisation than
> me, knowing the nature of the data.

An experienced human thinking for several seconds (or probably minutes in
this example) is almost always going to be better than a computer
thinking for a millisecond.

> > have you set work_mem to some obscenely big value?
>
> 32Mb

OK. It's just that it was doing a hash aggregation involving 160MB of
data. That will cause it to spill to disk, and I'd think it would prefer
to do something else.

> Tom suggested raising that number into the range of tens of Mb for
> another problem. I saw the hot spot was at 16Mb, and considering I was
> expecting the DB to get bigger, I increased it to 32Mb.

32MB seems a reasonable default on modern hardware.

> > You're trimming an extra close square bracket (']') in one of the
> > trim statements and not in the other. If you just do it in one place
> > then you don't have to worry about inconsistency.
>
> Guess: that was a typo. regexp brain context switching :)

I do the same thing far too often, hence I tend to do similar query
rewrites, as I did on yours, to prevent this (human bug/feature) from
happening.

> I do know Knuth and I think I share his opinions. I don't know modern
> programming people and I'm alien to the concept of fundamental good.

A lot of programming courses will try and teach you to remove all
constraints from your code, whether they're going to be hit or not. In
the real world, resources are finite and effort has to be spent
appropriately. In my eyes this is the difference between computer science
and software engineering. Software engineering is just interested in
making something that works now; computer science is about pushing back
the boundaries of what's possible. Too often the two get confused.

> But well, I'm here to learn. Could you point me to some explanation of
> why it should be a fundamental good in a DB context?

As in why using TEXT is good over a large VARCHAR? It's an engineering
choice, normally. Some database systems optimize one or the other a lot
more, so people tend to prefer one for arbitrary reasons. In PG it
doesn't really matter in terms of performance, and you should use
whichever expresses the data you're trying to store appropriately.

> I do think that most of the time it is worth it (and part of the
> problem) to make a reasonable forecast. If I can't really make one, or
> it requires too much effort to make a reasonable one, at least I start
> developing with an unreasonable constraint just to catch some errors
> earlier, then drop the constraint and leave a comment.

Yes; however, this requires experience of what's worked so far, and is
difficult to teach.

  Sam
Re: [GENERAL] PITR - base backup question
On Wed, Aug 27, 2008 at 8:32 AM, Merlin Moncure [EMAIL PROTECTED] wrote:
> > 3. Shut down the PostgreSQL server running on the backup server, if
> >    any:
> >       pg_ctl stop
> >    (Use 'ps ax' to make sure the server is stopped.)
>
> probably pg_ctl -m fast stop or -m immediate... since we are
> overwriting it.

Good point, but sometimes there are tasks running on the backup server
(recently I did a test of a revised procedure that took 5 days to run).
I'll probably update the cookbook guide to deal with that possibility.

> why not just rm -rf /disk1/postgres/data?

I prefer the visual confirmation that I am indeed in the right directory,
rather than risk a typo in the longer rm command.

> I prefer to exclude WAL files from the original tar, and recreate the
> folders here (being careful to chown them to the postgres account).
> Every little bit helps.

Another good point.

> This is a nice touch. With a little bash-fu you could do a find | xargs
> rm and list/kill the files in one pass. In the standby setups I've
> done, I usually script the whole process: a prep on the main and a
> startup on the standby.

The scripts to create the system-level backups, copy them to the backup
servers, and rsync the WAL files are both cron jobs.

I've considered writing a script to do all the restore tasks on the
backup server, but I figure if someone other than me ever has to do it,
for example if there is a problem when I'm unavailable, that person will
probably want to watch the entire process carefully, since he or she will
be far less familiar with the configuration.

As disk space permits, I actually keep TWO sets of the tar archive base
files: the most recent one and the one from the previous week. That way I
could go back further for a PITR recovery up to some incident. I've never
needed to do that, and I hope I never do, but it's nice to have that
capability just in case.
--
Mike Nolan
Re: [GENERAL] Dumping/Restoring with constraints?
Phoenix Kiula [EMAIL PROTECTED] writes:
> Hello. I have a database dump, with data and schema, which includes all
> the constraints and rules. But it seems the pg_dumpall command does not
> copy data in such a way that foreign keys are satisfied upon restoring.
> Because tables are inter-related, importing them keeps giving errors,
> and eventually no data is imported.

This shouldn't be possible in a regular dump, at least not with any
remotely modern version of PG. However, if you are trying to restore a
data-only dump into an already-created set of tables, it is possible,
because pg_dump doesn't know how to order the data loads in that case.
(The problem may in fact be insoluble, since circular foreign key
relationships are allowed.)

The recommended solution is to use a regular schema-and-data dump. If you
really have to separate schema and data, the --disable-triggers switch
might help, though you open yourself to the possibility of loading
inconsistent data that way.

                        regards, tom lane
Re: [GENERAL] PITR - base backup question
--- On Tue, 8/26/08, Richard Broersma [EMAIL PROTECTED] wrote:
> From the following link:
> http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html#BACKUP-BASE-BACKUP
>
> Step 3 says to perform the backup. Does this mean a file system backup
> of the data directory? OR does this mean performing a pg_dumpall and
> backing up the dump file?

It is a file system backup of the data directory.
--
Regards,
Richard Broersma Jr.
Re: [GENERAL] Dumping/Restoring with constraints?
> The recommended solution is to use a regular schema-and-data dump. If
> you really have to separate schema and data, the --disable-triggers
> switch might help, though you open yourself to the possibility of
> loading inconsistent data that way.

Thanks Tom. This is the dump command being used on an 8.2.3 database on
Linux:

  $ pg_dumpall > mydb.sql
  $ umask 077
  $ gzip mydb.sql

Then I download the mydb.sql.gz file onto my local computer (Mac OS X
with 8.3.3) and unzip it to mydb.sql. The local database is already
created, with all tables and constraints and all. Here is the command I
use to restore:

  $ psql -d mydb -U myuser -h localhost --file=mydb.sql

Is this not the recommended method?
Re: [GENERAL] Dumping/Restoring with constraints?
On Wednesday 27 August 2008 09:36, Phoenix Kiula wrote:
> Thanks for this. I don't have any foreign key violations in my existing
> database. I think the violation is happening because, upon restoring,
> the table that is being populated checks another table that doesn't yet
> have data.
>
> I am not using pg_restore. I am just using the psql --file=FILENAME
> syntax. Is that an issue?

The errors you are having, then, must be related to your own trigger
code. It sounds like you will need to prevent those triggers from firing,
and the only way I know how to accomplish that is to do a binary dump and
then use pg_restore as I indicated earlier. There is no way to disable
triggers with the method referenced above.
--
Terry Lee Tucker
Turbo's IT Manager
www.turbocorp.com
Re: [GENERAL] PITR - base backup question
On Wed, Aug 27, 2008 at 9:52 AM, Michael Nolan [EMAIL PROTECTED] wrote:
> I've considered writing a script to do all the restore tasks on the
> backup server, but I figure if someone other than me ever has to do it,
> for example if there is a problem when I'm unavailable, that person
> will probably want to watch the entire process carefully, since he or
> she will be far less familiar with the configuration.

Here are some basic facts of life about PITR/log shipping: it is a
disaster prevention feature. Here's the scenario. You are going to depend
upon it at 3 a.m. early Saturday morning three years in the future, only
you are not going to be there. A drive just went out on the main, but
instead of degrading, the entire backplane went amber. You are going to
be in Barbados on your honeymoon, with no connectivity to the outside
world (your blushing bride made you leave the laptop at home). The guy
doing the switchover to the standby is the only one that could be gotten
hold of; he still hasn't gotten over the 12-hour bender from Friday. He's
never really understood why your company took your advice and went with
PostgreSQL instead of SQL Server, is cranky, and doesn't like you that
much. He secretly hopes the standby won't come up, and barely knows how
to use a console.

Write the script. Test it.

merlin
Re: [GENERAL] Dumping/Restoring with constraints?
In response to Phoenix Kiula [EMAIL PROTECTED]:
> Then I download the mydb.sql.gz file onto my local computer (Mac OS X
> with 8.3.3) and unzip it to mydb.sql. The local database is already
> created, with all tables and constraints and all.

Don't do that. Do one of the following:

*) Allow the dump file to create all tables.

*) In the early step, dump the data only, with:
   pg_dumpall --disable-triggers -a

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/
[EMAIL PROTECTED]
Phone: 412-422-3463x4023
Re: [GENERAL] Dumping/Restoring with constraints?
Thanks Bill for this recommendation:

> *) In the early step, dump the data only, with:
>    pg_dumpall --disable-triggers -a

Dumb question: will this kind of pg_dumpall lead to downtime? I mean, is
there a database lock during this time? Thanks!
Re: [GENERAL] Dumping/Restoring with constraints?
In response to Phoenix Kiula [EMAIL PROTECTED]:
> Dumb question: will this kind of pg_dumpall lead to downtime? I mean,
> is there a database lock during this time?

No.
--
Bill Moran
Collaborative Fusion Inc.
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 10:34 PM, Bill Moran [EMAIL PROTECTED] wrote:
> > Dumb question: will this kind of pg_dumpall lead to downtime? I mean,
> > is there a database lock during this time?
>
> No.

Thanks. But there seems to be a tangible slowdown of DB operations during
the time that pg_dump is running. Perhaps some of my postgresql.conf
variables are not geared to this. (I vaguely recall setting it up so that
wal_buffers or checkpoint_segments needed to be set upwards for
maintenance tasks.)

My question: is it possible to interactively set up these variables so
that pg_dumpall can work very fast, behind the scenes, without slowing
stuff down?

Thanks
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 10:42 PM, Tom Lane [EMAIL PROTECTED] wrote:
> For pg_dumpall, you shouldn't even have created the databases, just
> start from a virgin installation.

Should I have CREATEd the db at least, without the table structure?

I dropped the database locally. Entirely. Gone. Then I tried this:

  $ pg_restore -h localhost mydb.sql

This tells me:

  pg_restore: [archiver] input file does not appear to be a valid archive

What gives?
Re: [GENERAL] Dumping/Restoring with constraints?
Phoenix Kiula [EMAIL PROTECTED] writes:
> Then I tried this:
>   $ pg_restore -h localhost mydb.sql
> This tells me:
>   pg_restore: [archiver] input file does not appear to be a valid archive

No, you were right the first time: just feed the dump script to psql.

                        regards, tom lane
[GENERAL] 8.3.1 query plan
Hello List,

I have inherited an existing application. It has a table of events, and
some events may reference an earlier event. We have a script that saves
the first N events for each device, but we also want to save any
additional event that is referenced by one of the first N events. The
following query was developed to do this. It seemed to work OK for a
while, but one time when it was run it never finished after running for a
day. So I did an explain, and I see it is doing a seq scan in the last
sub plan. There are about 375,000 rows in myevents; why isn't it using
the index instead of doing a seq scan?

create unique index indx1myevents on myevents (event_log_no);
CREATE INDEX

vacuum analyze;
VACUUM

explain
insert into myevents
select * from t_unit_event_log a
where exists
  (select b.event_log_no
     from myevents b
    where a.event_status = 1
      and a.event_ref_log_no IS NOT NULL
      and a.event_ref_log_no = b.event_log_no
      and a.event_log_no not in (select event_log_no from myevents))

Seq Scan on t_unit_event_log a  (cost=0.00..25863477934.49 rows=645692 width=145)
  Filter: (subplan)
  SubPlan
    ->  Result  (cost=20019.39..20027.70 rows=1 width=4)
          One-Time Filter: (($1 = 1) AND ($2 IS NOT NULL) AND (NOT (subplan)))
          ->  Index Scan using indx1myevents on myevents b  (cost=20019.39..20027.70 rows=1 width=4)
                Index Cond: ($2 = event_log_no)
          SubPlan
            ->  Materialize  (cost=16579.16..22379.12 rows=432196 width=4)
                  ->  Seq Scan on myevents  (cost=0.00..14668.96 rows=432196 width=4)

Why wouldn't the planner use the index instead of doing a seq scan? Any
advice would be greatly appreciated.

Thanks,
Steve
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, 27 Aug 2008 14:47:24 +0100 Sam Mason [EMAIL PROTECTED] wrote:
> > There is autovacuum running regularly, and I vacuumed full, analyzed
> > and reindexed everything recently [...]
>
> I'd probably start by doing a normal vacuum, then run a command [...]

Pardon me if I'll waste 2 posts... I'll try to do it later, but I doubt
it will make any difference.

> > > You could try just running the SELECT part to see how long the
> > > aggregation takes.
> >
> > 33 sec. Surprising.
>
> OK, so it's the UPDATE that's taking the time. I'm not experienced
> enough to say why though.

I can't get it either. I'd say that:

- use of memory is larger in the single statement that uses the
  aggregate;
- even if swapping is not a factor, walking down a large memory
  structure may introduce some overhead.

Still, the aggregate version is 4 times slower. It's hard to justify such
a difference just with the overhead of walking down an array in memory,
considering the box has 4Gb. But both versions happen inside one
transaction... I'd think that one large update should be faster than
several updates, which should make the aggregate version faster. It would
be nice if someone came up with an explanation.

> It's quite easy to outsmart the planner with large amounts of data,
> but it's surprising how well it does most of the time. [...]

Well... but the function version doesn't exploit any information about
the actual data contained in the tables. In the single-statement version
the planner knows everything that is going to be touched. Does the
planner consider all the statements in a function to optimise it, or does
the optimising work happen statement by statement?

> An experienced human thinking for several seconds (or probably minutes
> in this example) is almost always going to be better than a computer
> thinking for a millisecond.

I just couldn't come up with the single-statement version. I asked if
anyone knew whether I could rewrite the function as a single statement
because I thought it was going to run faster and be easier to understand
once written. When I wrote the function, optimisation was the last of my
thoughts.

> > > have you set work_mem to some obscenely big value?
> >
> > 32Mb

On a 4Gb RAM box running Apache, PHP and Postgres, considering the main
table is going to contain 1M records (currently ~800K, data size should
be around 300-400Mb), is 32Mb a reasonable choice?

> > You're trimming an extra close square bracket (']') in one of the
> > trim statements and not in the other. If you just do it in one place
> > then you don't have to worry about inconsistency.
> >
> > Guess: that was a typo. regexp brain context switching :)
>
> I do the same thing far too often, hence I tend to do similar query
> rewrites, as I did on yours, to prevent this (human bug/feature) from
> happening.

Oh well, every time I cut&paste I know I'm doing something I'll be
punished for ;) and there is something to be fixed. In fact, once you
made me notice it, I did... I thought I could just clean
catalog_author.Name during import. Still, it is annoying that this
doesn't work and you have to keep 2 copies of the same snippet around:

  select trim(' \t' from Name) as _Name from catalog_author
    where length(_Name) > 1;

can't work.

> > I do know Knuth and I think I share his opinions. I don't know
> > modern programming people and I'm alien to the concept of
> > fundamental good.
>
> A lot of programming courses will try and teach you to remove all
> constraints from your code, whether they're going to be hit or not.

Maybe that's why I'm so ignorant. I studied physics ;)

> In the real world, resources are finite and effort has to be spent
> appropriately. In my eyes this is the difference between computer [...]

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
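An aside on that last snippet: the alias _Name isn't visible in the WHERE
clause because WHERE is evaluated before the select list, so the usual
workaround is to push the expression into a subselect; a sketch:

    select _Name
      from (select trim(' \t' from Name) as _Name
              from catalog_author) s
     where length(_Name) > 1;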
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 10:53 PM, Tom Lane [EMAIL PROTECTED] wrote: No, you were right the first time: just feed the dump script to psql. Ok. Tried that. Because there's no database, I have to execute the psql command as postgres otherwise it doesn't work. So here's my command: $ psql -d postgres -U postgres -h localhost < mydb.sql This seems to be it. This is what I should be executing, except that it spews out many errors, like: ERROR: invalid byte sequence for encoding UTF8: 0x80 This prevents my main table from being copied - Why can't the dump and the restore just copy the file as-is, including the encoding and such? What am I not doing right? Thanks -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 10:40:41PM +0800, Phoenix Kiula wrote: Thanks. But there seems to be a tangible slowdown of DB operations during the time that pg_dump is running. Yes. Pg_dump copies all the data out, so it puts load on your database and disks. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 11:20:55PM +0800, Phoenix Kiula wrote: This seems to be it. This is what I should be executing, except that it spews out many errors, like: ERROR: invalid byte sequence for encoding UTF8: 0x80 You have bad data in your database. Apparently, you have an encoding of UTF-8, but you have data in there that's not UTF-8 data. I'll bet your other encoding is SQL_ASCII. This prevents my main table from being copied - Why can't the dump and the restore just copy the file as-is, including the encoding and such? What am I not doing right? My bet is that you did initdb on one system with a locale of C and on another with a locale of utf-8 (somehow). You can use pg_controldata to find out: run it against the data areas on each system. If I'm right, then you probably want to run initdb again on the target system. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
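As a hedged sketch of the check Andrew describes (the data directory paths below are placeholders; adjust to each machine's actual cluster location):

    $ pg_controldata /var/lib/pgsql/data       # on the Linux server
    $ pg_controldata /usr/local/pgsql/data     # on the Mac

Among the fields pg_controldata prints on 8.2/8.3-era clusters are the LC_COLLATE and LC_CTYPE settings the cluster was initialized with; if the two machines report different values, that confirms the mismatch.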
Re: [GENERAL] Dumping/Restoring with constraints?
In response to Phoenix Kiula [EMAIL PROTECTED]: On Wed, Aug 27, 2008 at 10:53 PM, Tom Lane [EMAIL PROTECTED] wrote: No, you were right the first time: just feed the dump script to psql. Ok. Tried that. Because there's no database, I have to execute the psql command as postgres otherwise it doesn't work. So here's my command: $ psql -d postgres -U postgres -h localhost mydb.sql This seems to be it. This is what I should be executing, except that it spews out many errors, like: ERROR: invalid byte sequence for encoding UTF8: 0x80 In psql, do a \l on both the original and the DB you're restoring to. The encoding should be the same, if it's not, then fix that before restoring. If you're using pg_dump (which you don't mention ... it's getting difficult to follow what you're doing in this thread) then you can also use the -C option to have pg_dump add a CREATE DATABASE statement to the dump file for you. If you actually want to switch database encodings, that's an entirely different question thread unto itself. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ [EMAIL PROTECTED] Phone: 412-422-3463x4023 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
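A sketch of the -C variant Bill mentions (host names and database name are placeholders):

    $ pg_dump -C -h oldhost -U postgres mydb > mydb.sql
    $ psql -h newhost -U postgres -d postgres -f mydb.sql

With -C the dump itself issues CREATE DATABASE and reconnects, so the restore is run while connected to some other database (here, postgres).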
Re: [GENERAL] Dumping/Restoring with constraints?
Ok. Here is what it is. OLD DB: Since 2002. May contain non-UTF8 data. But I thought I had modified it all when I changed it to a UTF-8 pgsql database (it was originally MySQL). The database works very well on a very busy website. Everything on that website is now UTF-8. I wish to mirror this database locally on my home machine. Server is Linux with 6GB RAM and pgsql 8.2.3; home machine is an iMac running OS X with 4GB RAM and 8.3.3 (new pg). NEW DB: Just installed fresh. There's nothing in it. I can basically wipe all data out. WHAT I AM DOING: 1. On the server, I am executing pg_dumpall > mydb.out. Simple. 2. FTP that mydb.out file to local home machine. 3. Here, locally, I do this: psql -h localhost -d postgres -U postgres -f mydb.out. This is what gives me the error. MY QUESTION: What can I do to: (a) Make sure all data on pgsql on the server (OLD DB) is UTF-8. Is there a command I can execute to convert all data? I thought I had converted it all to UTF-8 using PHP some time ago, which went through each and every row and column! (b) Once that data is utf8-ed, how can I bring it home and have a mirror of the db. Thanks. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 8.3.1 query plan
On Wed, Aug 27, 2008 at 11:00:54AM -0400, Steve Clark wrote: So I did an explain and I see it is doing a seq scan in the last sub plan - there are about 375,000 rows in myevent - why isn't it using the index instead of doing a seq scan? create unique index indx1myevents on myevents (event_log_no) CREATE INDEX Are there NULLs in this subtable? NOT IN must check the entire table for NULLs. Try changing the query to use NOT EXISTS. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines.
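The general shape of the rewrite Martijn suggests, using the table names from this thread (a sketch, not the poster's final query):

    -- NOT IN is NULL-sensitive: one NULL in the subquery result and
    -- no row can ever pass the test
    select * from t_unit_event_log a
    where a.event_log_no not in (select event_log_no from myevents);

    -- NOT EXISTS is NULL-safe and can use the index on event_log_no
    select * from t_unit_event_log a
    where not exists (select 1 from myevents b
                      where b.event_log_no = a.event_log_no);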
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 11:54:27PM +0800, Phoenix Kiula wrote: OLD DB: Since 2002. May contain non-UTF8 data. But I thought I had modified it all when I changed it to UTF-8 pgsql database (it was originally Mysql). The database works very well on a very busy website. Everything on that website is now UTF-8. I wish to mirror this database locally on my home machine. Server is linux with 6GB ram and pgsql 8.2.3, home machine is osx with imac of 4GB ram and 8.3.3 (new pg). Somehow, you're getting non-UTF-8 chars in there, either because your conversion didn't work, or because there's still bugs in your application that send non-UTF-8 data. If your database encoding is not UTF-8, then it is possible to get non-UTF-8 data anyway. That's why people asked about the database encoding. SQL_ASCII, please note, does not enforce that you're in the bottom 7 bits: it'll take anything you put in there. So if someone put (say) ISO 8859-1 in, you'll get in trouble. WHAT I AM DOING: 1. On the server, I am executing pg_dumpall > mydb.out. Simple. 2. FTP that mydb.out file to local home machine. 3. Here, locally, I do this: psql -h localhost -d postgres -U postgres -f mydb.out. This is what gives me the error. Right. So the file includes data that doesn't match the encoding of the target database. AFAIR -- and my memory's not what it used to be, so check the release notes -- the UTF-8 checking in 8.2 was as good as in 8.3. One good test of this would be to install 8.2 on your home machine, and try restoring that too. If it works, then we know more. MY QUESTION: What can I do to: (a) Make sure all data on pgsql on server (OLD DB) is utf-8. Is there a command I can execute to convert all data? I thought I had converted it all to utf-8 using PHP sometime ago, which went through each and every row and column! The usual advice is to use iconv. Your Mac should have it installed. (b) Once that data is utf8-ed, how can I bring it home and have a mirror of the db. If you run iconv on the data dump before you load it, then it should work. This is not a trivial job, however. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
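A hedged sketch of the iconv pass Andrew describes. The source encoding here is a guess (ISO-8859-1 is a common culprit for stray 0x80-range bytes); -c drops sequences that cannot be converted instead of failing:

    $ iconv -f ISO-8859-1 -t UTF-8 mydb.out > mydb.utf8.out
    # or, to silently discard untranslatable bytes:
    $ iconv -c -f ISO-8859-1 -t UTF-8 mydb.out > mydb.utf8.out

The converted file can then be fed to psql as before.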
Re: [GENERAL] loop vs. aggregate was: update and group by/aggregate
On Wed, Aug 27, 2008 at 05:14:46PM +0200, Ivan Sergio Borgonovo wrote: On Wed, 27 Aug 2008 14:47:24 +0100 Sam Mason [EMAIL PROTECTED] wrote: OK, so it's the UPDATE that's taking the time. I'm not experienced enough to say why though. I can't get it either. I'd say that: - use of memory is larger in the single statement that uses the aggregate - if swapping is not a factor, walking down a large memory structure may introduce some overhead Doing everything in memory in PG would be quicker than lots of transitions into and out of the kernel to read data from the cache. I'm guessing the performance is because the code is using an index scan to actually update the table. I've not tried doing this myself, but you may have luck convincing PG to use index scans by increasing effective_cache_size and/or reducing random_page_cost. Well... but the function version doesn't exploit any information on actual data contained in the tables. In the single statement version the planner knows everything is going to be touched. Does the planner consider all the statements in a function to optimise it, or does the optimising work happen statement by statement? It's just working statement by statement; it'll plan right at the beginning, the first time the function is called, as well. I just couldn't come up with the single statement version. I asked if anyone knew if I could rewrite the function in a single statement because I thought it was going to run faster and be easier to understand once written. I'd certainly expect the UPDATE to be quicker! When I wrote the function optimisation was the last of my thoughts. but then you got lucky with your implementation. it would be easy to bung everything into a big hash table and work from there; doing some of the work in the database and having it come out ordered makes it work quickly. so you've kind of implicitly optimized it. On a 4GB RAM box running Apache, PHP and Postgres, considering the main table is going to contain 1M records (currently ~800K, data size should be around 300-400MB) is 32MB a reasonable choice? no idea, it certainly isn't a bad choice. putting some connection pooling software in front of PG is probably a better thing to think about than work_mem size if you want to worry about something! Still it is annoying that this doesn't work and you still have to have 2 copies of the same snippet around: select trim(' \t' from Name) as _Name from catalog_author where length(_Name) > 1; can't work. yes, SQL is a bit annoying. you have to make a subquery to do anything like that In the real world, resources are finite and effort has to be spent appropriately. In my eyes this is the difference between computer science and software engineering. Software engineering is just interested in making something that works now, computer science is about pushing back the boundaries of what's possible. Too often the two get confused. Well... coming from a different background I'd say if you can't measure where the limits are, you can't know if you're pushing them back. But you'll know if you're solving an immediate problem or some theoretical possibility. As far as I know, using varchar(N) should have a small overhead in postgresql and not the other way around. you're right but it's small enough not to worry about it Sam -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
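Sam's subquery workaround, spelled out as a sketch against the thread's table:

    -- The alias _Name can't be referenced in the same SELECT's WHERE clause,
    -- but it can be once the expression is wrapped in a subquery:
    select _Name
    from (select trim(' \t' from Name) as _Name from catalog_author) s
    where length(_Name) > 1;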
Re: [GENERAL] 8.3.1 query plan
Martijn van Oosterhout wrote: On Wed, Aug 27, 2008 at 11:00:54AM -0400, Steve Clark wrote: So I did an explain and I see it is doing a seq scan in the last sub plan - there are about 375,000 rows in myevent - why isn't it using the index instead of doing a seq scan? create unique index indx1myevents on myevents (event_log_no) CREATE INDEX Are there NULLs in this subtable? NOT IN must check the entire table for NULLs. Try changing the query to use NOT EXISTS. Have a nice day, Hi Martijn, I guess I misunderstand what you are saying because this is what I get now: srm2=# explain insert into myevents select * from t_unit_event_log a where exists srm2-# (select b.event_log_no from myevents b srm2(# where a.event_status = 1 and a.event_ref_log_no IS NOT NULL srm2(# and a.event_ref_log_no = b.event_log_no and a.event_log_no not exists srm2(# (select event_log_no from myevents) srm2(# ); ERROR: syntax error at or near exists LINE 4: ...nt_ref_log_no = b.event_log_no and a.event_log_no not exists Regards, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Dumping/Restoring with constraints?
On Wed, Aug 27, 2008 at 8:40 AM, Phoenix Kiula [EMAIL PROTECTED] wrote: Thanks. But there seems to be a tangible slowdown of DB operations during the time that pg_dump is running. Perhaps some of my postgresql.conf variables are not geared to this. (I vaguely recall setting it up so that the wal_buffers or checkpoint_segments needed to be set upwards for maintenance tasks). My question: is it possible to interactively set up these variables so that pg_dumpall can work very fast? And behind the scenes, without slowing stuff down? Nope, it's not a software problem, it's a hardware problem. Your drive subsystem can only deliver data so fast. The fix is usually better hardware. Big RAID controller and lots of drives. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Do I have a corrupted database?
Craig Ringer wrote: William Garrison wrote: I fear I have a corrupted database, and I'm not sure what to do. First, make sure you have a recent backup. If your backups rotate, stop the rotation so that all currently available historical copies of the database are preserved from now on - just in case you need them. Since I made my post, we found that we can't do a pg_dump. :( Every time this error appears in the logs, postgres forcibly closes any connections (including any running instances of pgadmin or pg_dump) when it runs this little recovery process. We have backups from some days ago plus transaction logs. We also have a snapshot of the file system, and I'm hoping to find a way to attach that onto another system. I've had trouble with that in the past. As for the SAN and the Windows event log: Our IT guy says the SAN reported no failures at the time. I don't know much about the SAN itself, I just know it uses dual fiber-channels and all the drives are in some RAID array. I think it also is hardened against nuclear strikes and has a built-in laser defense system. At the time of the problem, the Windows event log indicates no problems writing to the drives, or any other failures of any kind really. No other apps crashed, no unusual memory usage, plenty of disk space. So the cause is a complete mystery. :( So for now, I'm focused on repair. We tried to REINDEX each table, and we are getting duplicate key errors so the reindex fails. I can fix those records manually, but I was hoping to dump the database, find the duplicates using another system, then delete/repair the bad records and restore onto the production machine. But since the backup/restore isn't working, that isn't looking like a viable option. Are there any kind of repair tools for a postgres database? Any sort of routine where I can take it offline and run something like pg_fsck --all and it will come back with a report or a repair procedure? Now, if possible, dump your database with pg_dump. Restore the dump to a test database instance and make sure that it all goes OK. Once that's done, you know you have a decent recovery point to work from in case you make a mistake during your recovery efforts. After that I don't have all that much to offer, especially as you're using an operating system I don't have much experience with Pg on and you're using an (unspecified) SAN. Normally I'd ask if you'd verified your RAID array / tested your disks. In this case, I'm wondering if there's any chance there was a service interruption on the SAN that might've caused some sort of intermittent or partial writes. 2008-08-23 20:00:27 ERROR: xlog flush request E0/293CF278 is not satisfied --- flushed only to E0/21B1B7F0 2008-08-23 20:00:27 CONTEXT: writing block 94218 of relation 16712/16713/16725 2008-08-23 20:04:36 DETAIL: Multiple failures --- write error may be permanent. Yeah, I'm really wondering about the SAN and SAN connection. What sort of SAN is it? How is the host connected? Does it have any sort of logging and monitoring that might let you see if there was a problem around the time Pg was complaining? Have you checked the Windows error logs? -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
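For the duplicate-key cleanup William mentions, a generic sketch (table and column names are placeholders, not from his schema):

    -- Locate values that violate a supposedly-unique key before REINDEX:
    select keycol, count(*) as copies
    from some_table
    group by keycol
    having count(*) > 1;

Each reported value then has to be inspected and repaired by hand before the REINDEX can succeed.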
Re: [GENERAL] Dumping/Restoring with constraints?
Phoenix Kiula [EMAIL PROTECTED] writes: This is the dump command being used on an 8.2.3 database on Linux: $ pg_dumpall > mydb.sql $ umask 077 $ gzip mydb.sql Then I download the mydb.sql.gz file to my local computer (Mac OSX with 8.3.3) and unzip it to mydb.sql. That's all fine ... The local database is already created, with all tables and constraints and all. ... but this isn't. A pg_dump script expects to restore into an empty database. For pg_dumpall, you shouldn't even have created the databases, just start from a virgin installation. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Restoring a database from a file system snapshot
I have a PostgreSQL database on Windows Server 2003, and the database is kept on a SAN that has the ability to make instantaneous snapshots. Once I have made such a snapshot, I am unclear how to re-attach it to another postgres database on another machine. Postgres seems to create a directory structure that looks like this:

    Z:\MyDatabase
    Z:\MyDatabase\PG_VERSION
    Z:\MyDatabase\1234567
    Z:\MyDatabase\lots of files...

The 1234567 number above changes with each new database I create. So to restore a snapshot, I create a new database, which makes a X:\NewDatabase\98765 directory. Do I then drop the contents of Z:\MyDatabase\1234567\* inside X:\NewDatabase\98765? I attempted this some time ago, and postgres refused to start. (I haven't tried it again so I don't know the exact error). I got the impression that this is not the correct procedure, since the number (which I think corresponds to the hard link inside C:\Program Files\PostgreSQL\8.3\data\pg_tblspc) did not match. I am not sure what else must be altered in order for the snapshot to attach. Any suggestions? Is my procedure correct? Would I need to also copy the transaction logs or something like that? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Restoring a database from a file system snapshot
Any suggestions? Is my procedure correct? Would I need to also copy the transaction logs or something like that? the 'by the book' procedure for this operation is to use pg_dumpall > dump_file.sql and later psql -f dump_file.sql postgres pg_dumpall gives you a transaction aware dump. I'm not sure how you'd go about that old snapshot, sorry. joao -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Restoring a database from a file system snapshot
William Garrison wrote: I have a PostgreSQL database on Windows Server 2003, and the database is kept on a SAN that has the ability to make instantaneous snapshots. Once I have made such a snapshot, I am unclear how to re-attach it to another postgres database on another machine. Postgres seems to create a directory structure that looks like this: Z:\MyDatabase Z:\MyDatabase\PG_VERSION Z:\MyDatabase\1234567 Z:\MyDatabase\lots of files... The 1234567 number above changes with each new database I create. It doesn't work. There's a procedure for restoring files, but you need to also save the pg_xlog files as a stream, for which you need to set up an archive_command in postgresql.conf beforehand. If you're interested in this, see the Point in time recovery chapter in the documentation. pg_dump/pg_restore is the easiest combination to use anyway. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
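A sketch of the archive_command setup Alvaro refers to. The archive paths are placeholders; %p and %f are expanded by the server to the WAL segment's path and file name:

    # postgresql.conf, on the Windows server under discussion:
    archive_command = 'copy "%p" "Z:\\walarchive\\%f"'
    # the Unix equivalent would be:
    # archive_command = 'cp %p /mnt/server/walarchive/%f'
    # (on 8.3 you must also set archive_mode = on)

Only with such an archive in place can a filesystem-level copy taken while the server is running be rolled forward to a consistent state.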
[GENERAL] strange explain analyze output
This is in version 8.3.1 (I also tried 8.3.3). It looks like the sort is producing more rows than the input. The hash aggregate produces 10k, but the sort produces 10M. Am I just misinterpreting this output? Even the optimizer thinks that the output of the hashagg and the output of the sort should be identical. Regards, Jeff Davis

=# explain analyze
     select a, b, c_max
     from (select a, max(c) as c_max from t group by a) dummy1
     natural join (select a, b from t) dummy2;

                                   QUERY PLAN
 ---------------------------------------------------------------------------
 Merge Join  (cost=199211.12..660979.37 rows=9998773 width=12) (actual time=8887.540..27866.804 rows=10000000 loops=1)
   Merge Cond: (public.t.a = public.t.a)
   ->  Index Scan using t_a_idx on t  (cost=0.00..286789.72 rows=9998773 width=8) (actual time=19.784..5676.407 rows=10000000 loops=1)
   ->  Sort  (cost=199211.12..199217.72 rows=2641 width=8) (actual time=8867.749..11692.015 rows=10000000 loops=1)
         Sort Key: public.t.a
         Sort Method: quicksort  Memory: 647kB
         ->  HashAggregate  (cost=199001.60..199034.61 rows=2641 width=8) (actual time=8854.848..8859.306 rows=10001 loops=1)
               ->  Seq Scan on t  (cost=0.00..149007.73 rows=9998773 width=8) (actual time=0.007..3325.292 rows=10000000 loops=1)
 Total runtime: 30355.218 ms
(9 rows)

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Restoring a database from a file system snapshot
Sorry Alvaro, I sent the message to you directly ;P It is the same approach (restore from a snapshot) that MySQL uses on ZFS. If you don't change anything in the database, why doesn't it work? You just restart the service with the same path. The catch is that you need to stop Postgres BEFORE taking the snapshot. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Postgres in a solaris zone
Hi, Does anybody have experience running a Postgres database server from a Solaris zone? Do you see any problems with such a setup? Any traps? Thanks, -'\/\/\/`v^v^v^v^v^v^v Nina Markova, Database Analyst/ Analyst de base de données (613) 992-3753 facsimile / télécopieur (613) 992-8836 [EMAIL PROTECTED] Geological Survey of Canada / Commission géologique du Canada Natural Resources Canada/ Ressources naturelles Canada Government of Canada/ Gouvernement du Canada -'\/\/\/`v^v^v^v^v^v^v
Re: [GENERAL] Restoring a database from a file system snapshot
Alvaro Herrera wrote: William Garrison wrote: I have a PostgreSQL database on Windows Server 2003, and the database is kept on a SAN that has the ability to make instantaneous snapshots. Once I have made such a snapshot, I am unclear how to re-attach it to another postgres database on another machine. Postgres seems to create a directory structure that looks like this: Z:\MyDatabase Z:\MyDatabase\PG_VERSION Z:\MyDatabase\1234567 Z:\MyDatabase\lots of files... The 1234567 number above changes with each new database I create. It doesn't work. There's a procedure for restoring files, but you need to also save the pg_xlog files as a stream, for which you need to set up an archive_command in postgresql.conf beforehand. If you're interested in this, see the Point in time recovery chapter in the documentation. pg_dump/pg_restore is the easiest combination to use anyway. I've read (and I am reading it again now) that chapter, and I'm making this post because that documentation doesn't seem to address the scenario I am in. The PITR article (http://www.postgresql.org/docs/8.2/static/continuous-archiving.html) shows you how to use the WAL archives after a file system backup that is not consistent. But it doesn't address two vital things: 1) I have a file system backup that *IS* consistent. So I should not need any WAL files at all, right? 2) It doesn't explain exactly what folders you should be creating a backup of, or how exactly to restore those files onto another system, or how to munge the tablespaces to work. Specifically, it says Restore the database files from your backup dump... If you are using tablespaces, you should verify that the symbolic links in pg_tblspc/ were correctly restored. That's exactly what I'm asking how to do. **update** I got it working. Here's how:
1) I have a file system snapshot. No WAL files were required.
2) Using my example from my first post, here is how I manually rearranged the files so that postgres saw the new database.
- Create a new tablespace on the new server
- Create a new database on the new server, using that tablespace. I placed it into X:\NewDatabase\
- PostgreSQL will create a folder X:\NewDatabase\98765. Notice that the number postgres created is NOT the same as your old one.
- Stop PostgreSQL
- Move your files from Z:\MyDatabase\1234567 into the X:\NewDatabase folder
- Delete the 98765 directory.
- Start PostgreSQL
What I was doing before was moving the files from Z:\MyDatabase\1234567 into X:\NewDatabase\98765. The log file would then log a message like 2008-08-27 13:24:23 FATAL: database Your Database Name does not exist 2008-08-27 13:24:23 DETAIL: The database subdirectory pg_tblspc/32789/12345 is missing. It specifically wants the old folder name, not the new folder name of the new database you are attaching into. --- The referenced article is lacking an explanation of how postgres arranges the files and how the symlinks for the tablespaces are made. Without that knowledge, attaching to another database is guesswork. It really only took me a few tries to get the folder correct, but I think there needs to be an attach procedure for attaching an existing database, or the article should describe the folder structure used by postgres. It isn't as simple as just copying the folder. You have to also rename the directory structure to match. That's what I needed to know. meh, it was easy to do, just unclear.
Re: [GENERAL] Dumping/Restoring with constraints?
Hi, Phoenix Kiula wrote: Hello. I have a database dump. With data and schema, which includes all the constraints and rules. But it seems the pg_dumpall command does not copy data in such a way that foreign keys are satisfied upon restoring. Because tables are inter-related, importing them keeps giving errors and eventually no data is imported. Neither pg_dumpall nor pg_restore seems to have a without constraints or delay constraints check type of command. What am I missing? Thanks for any advice. Apart from disabling triggers temporarily, if you have enough space in xlog, you could try to add BEGIN work; to the start of the restore SQL and COMMIT; at the end. Regards Tino
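The same effect without editing the dump file, as a sketch (the -1/--single-transaction switch exists in newer psql versions; with older ones, wrapping the file manually as below achieves the same thing):

    $ psql -1 -d mydb -f mydb.sql
    # manual equivalent:
    $ (echo 'BEGIN;'; cat mydb.sql; echo 'COMMIT;') | psql -d mydb

Either way, a single failed statement rolls the whole restore back rather than leaving a half-loaded database behind.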
Re: [GENERAL] Restoring a database from a file system snapshot
William Garrison wrote: 1) I have a file system backup that *IS* consistent. So I should not need any WAL files at all right? It is consistent only if it was taken when the postmaster was down. **update** I got it working. Here's how 1) I have a file system snapshot. No WAL files were required. 2) Using my example from my first post, here is how I manually rearranged the files so that postgres saw the new database. - Create a new tablespace on the new server - Create a new database on the new server, using that tablespace. I placed it into X:\NewDatabase\ - PostgreSQL will create a folder X:\NewDatabase\98765. Notice that the number postgres created is NOT the same as your old one. - Stop PostgreSQL - Move your files from Z:\MyDatabase\1234567 into the X:\NewDatabase folder - Delete the 98765 directory. - Start PostgreSQL This does not really work, because you're missing the necessary pg_clog files. You can make it sort-of-work by doing a VACUUM FREEZE and a CHECKPOINT on the database before taking the snapshot. However, you'd still be missing the entries in shared catalogs. The only one you've recreated is the one in pg_database, but there are more. --- The referenced article is lacking an explanation for how postgres arranges the files and how the symlinks for the tablespaces are made. Without that knowledge, attaching to another database is guesswork. Attaching to another database is not supported at all. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Do I have a corrupted database?
On Wed, Aug 27, 2008 at 01:45:43PM -0400, William Garrison wrote: Since I made my post, we found that we can't do a pg_dump. :( Every time this error appears in the logs, postgres forcibly closes any connections (including any running instances of pgadmin or pg_dump) when it runs this little recovery process. We have backups from some days ago plus transaction logs. We also have a snapshot of the file system, and I'm hoping to find a way to attach that onto another system. I've had trouble with that in the past. You're going to have to be more specific. What do you mean by this error? It is possible to start up PostgreSQL such that it will not use any system indexes. Are there any kind of repair tools for a postgres database? Any sort of routine where I can take it offline and run something like pg_fsck --all and it will come back with a report or a repair procedure? There are no tools that do fixing, only the DB server itself. If you can't get it to work within postgresql, then pgfsck can attempt to do a raw data dump. It doesn't guarantee the integrity of the data but it may be able to get your data out. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines.
Re: [GENERAL] Dumping/Restoring with constraints?
Is there a mechanism to check exactly which rows may have characters that are not UTF8? I am trying with this kind of a command: select id from employee where modify_date between '2008-03-01' and '2008-07-01' and joint_field_of_name_etc <> convert(joint_field_of_name_etc, 'UTF8') This is of course not the best way of doing so. Is there a better system-provided way of checking for rows that may not be UTF8? Or to even have a constraint to this effect, to disallow any non-UTF8 data from getting in there? Thanks -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Do I have a corrupted database?
Wait... there really is a pgfsck...? I just made that up as an example of something I wanted. Great! And... how would I tell postgres to start without using any indexes? Martijn van Oosterhout wrote: On Wed, Aug 27, 2008 at 01:45:43PM -0400, William Garrison wrote: Are there any kind of repair tools for a postgres database? Any sort of routine where I can take it offline and run like pg_fsck --all and it will come back with a report or a repair procedure? You're going to have to be more specific. What do you mean by this error? It is possible to startup postgresql such that it will not use any system indexes. There is no tools that do fixing, only the DB server itself. If you can't get it to work within postgresql, then pgfsck can attempt to do a raw data dump. It doesn't guarentee the integrity of the data but it may be able to get your data out. Have a nice day,
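On the system-index question, the server has a switch for exactly this (a sketch; the data directory path is a placeholder, and the -P flag is documented for the 8.x postgres/postmaster binary):

    # Start the server ignoring system indexes, e.g. while repairing catalogs:
    postgres -D /path/to/data -P

This only bypasses system catalog indexes; it is meant for recovery work, not for normal operation.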
Re: [GENERAL] Restoring a database from a file system snapshot
I have just come to a horrible realization about PostgreSQL that I'm sure is supposed to be pre-requisite knowledge even going into this. So everyone may laugh at me now. We have a SAN volume, and we created a tablespace that points to that SAN volume (Z: drive). This put all the table files on Z:. It was our assumption that the table files + the archived transaction logs would now be on the Z: drive, and that that was enough to restore the database. It is shocking to me that I told PostgreSQL to put the database on Z:, but it is only putting a subset of the necessary files on that drive. That is just frightening. A database is not just tables - it is tables and transaction logs. Why on earth would PostgreSQL put the tables separately from the transaction logs? This is having a chilling effect on me as I realize that the transaction log files are not separated by database. So when I have multiple database systems, I have one single shared set of transaction logs. Even though those databases are on completely separate SANs. I'm used to things like MS SQL Server where I say database Foo goes on Z: and database Bar goes on X: and you can take it for granted that the transaction logs for database Foo also go on Z: and the transaction logs for database Bar go on X:. I'm still reeling from the thought that there can somehow be a single transaction log for multiple databases. How is that even possible? Are the transaction ID numbers shared across databases too? I need to educate our IT group about this. They set up the SAN volumes based on my incorrect assumptions about how PostgreSQL worked. It sounds like, on Windows, we need to just flat-out reinstall postgres and install it onto the Z: drive so that the entire data directory is on the SAN volume. Installing it to C: and having only parts of the database on the SAN is not good. (Thanks to everyone who is replying - this is clearing up a lot of misconceptions for me) P.S. I guess on Unix, you guys all just set up the data directory to be a hard-link to some other location? Alvaro Herrera wrote: William Garrison wrote: 1) I have a file system backup that *IS* consistent. So I should not need any WAL files at all right? It is consistent only if it was taken when the postmaster was down. **update** I got it working. Here's how 1) I have a file system snapshot. No WAL files were required. 2) Using my example from my first post, here is how I manually rearranged the files so that postgres saw the new database. - Create a new tablesspace on the new server - Create a new database on the new server, using that tablespace. I placed it into X:\NewDatabase\ - PostgreSQL will create a folder X:\NewDatabase\98765. Notice that the number postgres created is NOT the same as your old one. - Stop PostgreSQL - Move your files from Z:\MyDatabase\1234567 into the X:\NewDatabase folder - Delete the 98765 directory. - Start PostgreSql This does not really work, because you're missing the necessay pg_clog files. You can make it sort-of-work by doing a VACUUM FREEZE and a CHECKPOINT on the database before taking the snapshot. However, you'd still be missing the entries in shared catalogs. The only one you've recreated is the one on pg_database, but there are more. --- The referenced article is lacking an explanation for how postgres arranges the files and how the symlinks for the tablespaces are made. Without that knowledge, attaching to another database is guesswork. Attaching to another database is not supported at all.
[GENERAL] Postgresql RPM upgrade (8.2.3 - 8.2.9)
I hope this is a simple question. I checked the PG wiki but didn't find an answer: http://wiki.postgresql.org/wiki/RPM_Installation I have my 8.2.3 RPMs installed. I would like to upgrade to 8.2.9. Because this is a minor version upgrade (within 8.2) I understand that I should be able to upgrade the RPMs *without* a dump/restore step. Now, is it correct that I don't have to stop my server and I can just execute these commands: rpm -Uvh --test postgresql-libs-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-contrib-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-devel-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-server-8.2.9-1PGDG.rhel4.i386.rpm And all will be okay? (I would run these without the --test) Right now, these tests give errors. My question: is it right that I do not have to change my CONF files and data, and the upgrade should work with the above? The Wiki does not address this in the upgrade section. Thanks. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Restoring a database from a file system snapshot
William Garrison wrote: [snip] A database is not just tables - it is tables and transaction logs. Why on earth would PostgreSQL put the tables separately from the transaction logs? Because you told it to. If you want everything on Z:\postgresql you just initdb that location and point PG at that location (or just install there). Tablespaces let you store sets of tables/indexes on specific disks (well, filesystem mount-points). How is that even possible? Are the transaction ID numbers shared across databases too? Yes. The PG term for this is a database cluster - an installation that shares transaction logs, ids, users. I need to educate our IT group about this. They setup the SAN volumes based on my incorrect assumptions about how PostgreSQL worked. It sounds like, on Windows, we need to just flat-out reinstall postgres and install it into the Z: drive so that the entire data directory is on the SAN volume. Installing it to C: and having only parts of the database on the SAN is not good. Yes. A dump/restore is probably the simplest way of doing this. P.S. I guess on Unix, you guys all just setup the data directory to be a hard-link to some other location? Mount a filesystem at the desired point in the directory tree, or just use soft-links. Which is how tablespaces are implemented, as it happens. -- Richard Huxton Archonet Ltd -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
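A sketch of what Richard describes, putting the whole cluster on the SAN volume (paths and encoding are placeholders; on Windows the 8.x installer can also simply be pointed at the target data directory):

    # initialize a complete cluster -- tables, WAL, catalogs -- on Z:
    initdb -D Z:\pgdata -E UTF8
    # Unix equivalent: initdb -D /mnt/san/pgdata, or mount/symlink that path

With the entire data directory on one volume, no tablespace is needed just to place the tables on the SAN.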
Re: [GENERAL] Postgresql RPM upgrade (8.2.3 - 8.2.9)
Hi, On Thu, 2008-08-28 at 09:00 +0800, Phoenix Kiula wrote: I hope this is a simple question. I checked the PG wiki but didn't find an answer: http://wiki.postgresql.org/wiki/RPM_Installation That document may not be up2date. I'll check. Until then, http://pgfoundry.org/docman/view.php/148/1338/PostgreSQL-RPM-Installation-PGDG.pdf and http://www.westnet.com/~gsmith/gregsmith/content/postgresql/pgrpm.htm may help you. I have my 8.2.3 RPMs installed. I would like to upgrade to 8.2.9. Because this is a minor version upgrade (within 8.2) I understand that I should be able to upgrade the rpm *without* dump/restore step. Well, you still need to check the release notes. There have been *very few* exceptions to that. But in principle, the answer is no. Now, is it correct that i don't have to stop my server and I can just execute these commands: rpm -Uvh --test postgresql-libs-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-contrib-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-devel-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-server-8.2.9-1PGDG.rhel4.i386.rpm And all will be okay? (I would run these without the --test) Right now, these tests give errors. What errors? Also, I think you should update all packages at once, like: rpm -Uvh postgresql-server-8.2.9-1PGDG.rhel4.i386.rpm postgresql-8.2.9-1PGDG.rhel4.i386.rpm ... Regards, -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org
Re: [GENERAL] Postgresql RPM upgrade (8.2.3 - 8.2.9)
On Thu, Aug 28, 2008 at 3:00 AM, Phoenix Kiula [EMAIL PROTECTED] wrote: Now, is it correct that i don't have to stop my server and I can just execute these commands: rpm -Uvh --test postgresql-libs-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-contrib-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-devel-8.2.9-1PGDG.rhel4.i386.rpm rpm -Uvh --test postgresql-server-8.2.9-1PGDG.rhel4.i386.rpm And all will be okay? (I would run these without the --test) Right now, these tests give errors. Most probably your errors are caused by dependencies between the packages; as someone else already suggested, try to write the command as: rpm -Uvh --test postgresql-8.2.9... postgresql-server... etc. In that way all the dependencies should be addressed. If that does not solve the problem, post the error message. Regards Marco -- Marco Bizzarri http://iliveinpisa.blogspot.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgresql RPM upgrade (8.2.3 - 8.2.9)
Thanks all. Turns out I didn't have any problems at all. I just ran this command, all rpms together as Devrim Gunduz suggested - -- rpm -Uvh postgresql-8.2.9-1PGDG.rhel4.i386.rpm postgresql-contrib-8.2.9-1PGDG.rhel4.i386.rpm postgresql-devel-8.2.9-1PGDG.rhel4.i386.rpm postgresql-libs-8.2.9-1PGDG.rhel4.i386.rpm postgresql-server-8.2.9-1PGDG.rhel4.i386.rpm -- Database working in tiptop condition :) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Some server processes stalled with aborted client sockets
Hi all! I'm playing with client thread abort issues in Npgsql. And with a test sample one of our users provided us, we are seeing that even after the client finishes its processing, some stalled server processes are still processing the query. The problem is that those server processes seem not to die when the client disconnects. Even worse, if I try to stop the server, it can't shut down because of them. Have you seen something like that? Is it possible that I can mess up the frontend protocol so that the server process keeps waiting for something? What is strange is that even after the socket is closed, the server process is still there. Also, I'd like to ask what is the best way of handling an abrupt client abortion. On my tests I'm doing the following: I'm sending a cancelrequest message followed by closing the socket. I know this isn't the most elegant way of doing it. For me the ideal would be to clear the protocol of any garbage the abrupt interruption may have left and return the connection to our internal pool. But I don't have any idea about how to clear the protocol state other than to send the cancelrequest and try to eat any remaining bytes in the stream until I receive an errorresponse or readyforquery (in case the query was successfully executed before the cancelrequest). But this approach may lead me to read too many bytes before cleaning the protocol, or am I missing something? I'm using Postgresql 8.3.3 on OSX 10.5.4 Thanks in advance for any advice about this issue. -- Regards, Francisco Figueiredo Jr. http://fxjr.blogspot.com http://www.npgsql.org -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] strange explain analyze output
Jeff Davis [EMAIL PROTECTED] writes: This is in version 8.3.1 (I also tried 8.3.3). It looks like the sort is producing more rows than the input. The hash aggregate produces 10k, but the sort produces 10M. Merge Join (cost=199211.12..660979.37 rows=9998773 width=12) (actual time=8887.540..27866.804 rows=10000000 loops=1) Merge Cond: (public.t.a = public.t.a) -> Index Scan using t_a_idx on t (cost=0.00..286789.72 rows=9998773 width=8) (actual time=19.784..5676.407 rows=10000000 loops=1) -> Sort (cost=199211.12..199217.72 rows=2641 width=8) (actual time=8867.749..11692.015 rows=10000000 loops=1) Sort Key: public.t.a Sort Method: quicksort Memory: 647kB -> HashAggregate (cost=199001.60..199034.61 rows=2641 width=8) (actual time=8854.848..8859.306 rows=10001 loops=1) What this shows is that the HashAggregate emitted 10001 output rows, which necessarily was the number of rows sorted. The Sort node was (successfully) called on to produce an output row 10000000 times. The reason that these statements are not inconsistent is that the Sort is the inner relation for a mergejoin. In the presence of duplicate keys in the outer relation, a mergejoin will rewind and rescan duplicate keys in the inner relation; that is, any row in the inner relation will be fetched approximately as many times as it has matches in the outer relation. So it looks like you've got roughly 1000X duplication in these tables? (BTW, the planner knows that this is expensive and will avoid mergejoins when there are many duplicate keys. But apparently a hash join seemed even worse for the stats of this particular pair of tables.) regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 8.3.1 query plan
Steve Clark [EMAIL PROTECTED] writes: explain insert into myevents select * from t_unit_event_log a where exists (select b.event_log_no from myevents b where a.event_status = 1 and a.event_ref_log_no IS NOT NULL and a.event_ref_log_no = b.event_log_no and a.event_log_no not in (select event_log_no from myevents) ) Consider testing the conditions on A at the top level, instead of redundantly checking them inside the sub-query on B. I'm not certain exactly how much that's hurting you (EXPLAIN ANALYZE output would've been more informative), but it can't be good. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
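Putting Tom's advice together with Martijn's earlier NOT EXISTS suggestion, the query might be rewritten roughly as follows (a sketch, not a statement from the thread):

    insert into myevents
    select * from t_unit_event_log a
    where a.event_status = 1
      and a.event_ref_log_no is not null
      and exists (select 1 from myevents b
                  where b.event_log_no = a.event_ref_log_no)
      and not exists (select 1 from myevents c
                      where c.event_log_no = a.event_log_no);

The conditions on a are tested once at the top level, and both subqueries can use the unique index on myevents(event_log_no).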
[GENERAL] indexes on functions and create or replace function
Given table T(c1 int) and function F(arg int), create an index on T using F(c1). It appears that if you execute create or replace function F and provide a different implementation, the index still contains the results from the original implementation; thus if you execute something like select * from T where F(c1) after replacing the function, it now misses rows that should be returned. In other words, the index isn't aware the function is now returning different values. That's not the correct/expected behavior, is it? I would have expected that replacing the function would have caused any indexes that depend on that function to be reindexed/recreated with the new function implementation.
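A sketch reproducing the behavior described (names follow the post; the function bodies are invented for illustration, and IMMUTABLE is required to build an index on a function at all):

    create table T (c1 int);
    create function F(arg int) returns int
        language sql immutable as $$ select arg + 1 $$;
    create index t_f_idx on T (F(c1));
    -- Replacing F changes its behavior, but the stored index entries
    -- still reflect the old implementation:
    create or replace function F(arg int) returns int
        language sql immutable as $$ select arg * 2 $$;
    -- Bringing the index back in line has to be done explicitly:
    reindex index t_f_idx;

This is why functions used in indexes are supposed to be truly immutable; redefining one amounts to breaking that promise, and the system leaves the cleanup to you.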
[GENERAL] Feature Request: additional extension to UPDATE
Regarding: row-wise update. In a previous thread: http://archives.postgresql.org/pgsql-sql/2008-08/msg00122.php it was mentioned that the UPDATE ... SET ... FROM ... WHERE syntax is an extension. (an extension employed by many other db flavors.) This creates a problem, since each db flavor is left to its own devices in how to implement it. However the ANSI SQL syntax can be a real hassle when you have a lot of fields to update using (complicated) correlated sub-queries: UPDATE Foo SET a = ( SELECT newA FROM Bar WHERE Foo.id = Bar.id ), b = ( SELECT newB FROM Bar WHERE Foo.id = Bar.id ), c = ( SELECT newC FROM Bar WHERE Foo.id = Bar.id ), ... n = ( SELECT newN FROM Bar WHERE Foo.id = Bar.id ) WHERE Foo.id > 100; Maybe a row-wise update could simplify this process and at the same time produce results in line with those defined by the SQL Standard. UPDATE Foo SET ( a, b, c, ..., n ) = ( SELECT newA, newB, newC, ..., newN FROM Bar WHERE Foo.id = Bar.id ) WHERE Foo.id > 100; Any thoughts on this? -- Regards, Richard Broersma Jr. Visit the Los Angeles PostgreSQL Users Group (LAPUG) http://pugs.postgresql.org/lapug -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general