* Aidan Van Dyk <[EMAIL PROTECTED]> [081031 15:11]:
> How about something like the attached. It's been spun quickly, passed
> regression tests, and some simple hand tests on REL8_3_STABLE. It seems like
> HEAD can't initdb on my machine (quad opteron with SW raid1), I tried a few
> revisions in the last few days, and initdb dies on them all...
OK, HEAD does work, I don't know what was going on previously... Attached is my
patch against head.
I'll try and pull out some machines on Monday to really thrash/crash this but
I'm running out of time today to set that up.
But in running head, I've come across this:
regression=# SELECT pg_stop_backup();
WARNING: pg_stop_backup still waiting for archive to complete (60 seconds elapsed)
WARNING: pg_stop_backup still waiting for archive to complete (120 seconds elapsed)
WARNING: pg_stop_backup still waiting for archive to complete (240 seconds elapsed)
My archive script is *not* running, it ran and exited:
[EMAIL PROTECTED]:~/projects/postgresql/PostgreSQL/src/test/regress$ ps -ewf | grep post
mountie 2904 1 0 16:31 pts/14 00:00:00 /home/mountie/projects/postgresql/PostgreSQL/src/test/regress/tmp_check/install/usr/local/pgsql
mountie 2906 2904 0 16:31 ? 00:00:01 postgres: writer process
mountie 2907 2904 0 16:31 ? 00:00:00 postgres: wal writer process
mountie 2908 2904 0 16:31 ? 00:00:00 postgres: archiver process last was 00000001000000000000001F
mountie 2909 2904 0 16:31 ? 00:00:01 postgres: stats collector process
mountie 2921 2904 1 16:31 ? 00:00:18 postgres: mountie regression 127.0.0.1(56455) idle
Those all match up:
[EMAIL PROTECTED]:~/projects/postgresql/PostgreSQL/src/test/regress$ pstree -acp 2904
postgres,2904 -D/home/mountie/projects/postgres
├─postgres,2906
├─postgres,2907
├─postgres,2908
├─postgres,2909
└─postgres,2921
strace on the "archiver process" postgres:
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
getppid() = 2904
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
getppid() = 2904
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
getppid() = 2904
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
getppid() = 2904
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
getppid() = 2904
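For anyone reading the trace cold: that pattern is just a one-second sleep (select() with no descriptors and a {1, 0} timeout) followed by a check that the parent is still pid 2904, i.e. the process is idle and polling. A minimal standalone sketch of one such iteration, assuming nothing about the real archiver code; sleep_then_parent_alive is a made-up name for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): sleep for sec/usec by
 * calling select() with no file descriptors, then report whether our parent
 * is still the process we expect. Returns 1 if it is, 0 otherwise.
 */
static int
sleep_then_parent_alive(pid_t expected_parent, long sec, long usec)
{
    struct timeval tv;

    assert(sec >= 0 && usec >= 0 && usec < 1000000);
    tv.tv_sec = sec;
    tv.tv_usec = usec;

    /* With zero descriptors, select() only waits for the timeout. */
    (void) select(0, NULL, NULL, NULL, &tv);

    /* If the parent died, getppid() no longer returns its pid. */
    return getppid() == expected_parent;
}
```

Calling sleep_then_parent_alive(2904, 1, 0) in a loop would produce the same select/getppid pairs as the strace output above.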
It *does* finally finish, postmaster log looks like ("Archiving ..." is what my
archive script prints, bytes is the gzip'ed size):
Archiving 000000010000000000000016 [16397 bytes]
Archiving 000000010000000000000017 [4405457 bytes]
Archiving 000000010000000000000018 [3349243 bytes]
Archiving 000000010000000000000019 [3349505 bytes]
LOG: ZEROING xlog file 0 segment 27 from 7954432 - 16777216 [8822784 bytes]
Archiving 00000001000000000000001A [3349590 bytes]
Archiving 00000001000000000000001B [1596676 bytes]
LOG: ZEROING xlog file 0 segment 28 from 8192 - 16777216 [16769024 bytes]
Archiving 00000001000000000000001C [16398 bytes]
LOG: ZEROING xlog file 0 segment 29 from 8192 - 16777216 [16769024 bytes]
Archiving 00000001000000000000001D [16397 bytes]
LOG: ZEROING xlog file 0 segment 30 from 8192 - 16777216 [16769024 bytes]
Archiving 00000001000000000000001E [16393 bytes]
Archiving 00000001000000000000001E.00000020.backup [146 bytes]
WARNING: pg_stop_backup still waiting for archive to complete (60 seconds elapsed)
WARNING: pg_stop_backup still waiting for archive to complete (120 seconds elapsed)
WARNING: pg_stop_backup still waiting for archive to complete (240 seconds elapsed)
LOG: ZEROING xlog file 0 segment 31 from 8192 - 16777216 [16769024 bytes]
Archiving 00000001000000000000001F [16395 bytes]
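To line the two kinds of log entries up: the segment numbers in the ZEROING lines are decimal, while the last field of the archived file names is the same number in hex (segment 27 is ...1B, segment 31 is ...1F), and each byte count is the 16777216-byte segment size minus the starting offset. A small sketch of that bookkeeping, assuming the three-field, 8-hex-digit naming visible in the log above; seg_file_name and tail_bytes are names invented here for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Segment size as printed in the ZEROING lines (16 MB). */
#define SEG_SIZE 16777216u

/*
 * Illustrative helper: build a segment file name from timeline, log id and
 * segment number as three 8-digit upper-case hex fields.
 * dst must hold at least 25 bytes (24 hex digits plus the terminator).
 */
static void
seg_file_name(char *dst, size_t dstlen,
              uint32_t tli, uint32_t log_id, uint32_t seg)
{
    snprintf(dst, dstlen, "%08X%08X%08X",
             (unsigned) tli, (unsigned) log_id, (unsigned) seg);
}

/* Illustrative helper: bytes left in a segment after write offset `off`. */
static uint32_t
tail_bytes(uint32_t off)
{
    assert(off <= SEG_SIZE);
    return SEG_SIZE - off;
}
```

With these, seg_file_name(name, sizeof(name), 1, 0, 27) yields 00000001000000000000001B and tail_bytes(7954432) yields 8822784, matching the first ZEROING line and the file archived after it.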
So what's this 5-minute "pg_stop_backup still waiting for archive to complete"
state? I've not seen that before (running 8.2 and 8.3).
a.
--
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED] command like a king,
http://www.highrise.ca/ work like a slave.
commit fba38257e52564276bb106d55aef14d0de481169
Author: Aidan Van Dyk <[EMAIL PROTECTED]>
Date: Fri Oct 31 12:35:24 2008 -0400
WIP: Zero xlog tail on a forced switch
If XLogWrite is called with xlog_switch, an XLog switch has been forced, either
by a timeout based switch (archive_timeout), or an interactive forced xlog
switch (pg_switch_xlog/pg_stop_backup). In those cases, we assume we can
afford a little extra IO bandwidth to make xlogs so much more compressible.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 003098f..c6f9c79 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1600,6 +1600,30 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch)
              */
             if (finishing_seg || (xlog_switch && last_iteration))
             {
+                /*
+                 * If we've had an xlog switch forced, then we want to zero
+                 * out the rest of the segment. We zero it out here because at the
+                 * force switch time, IO bandwidth isn't a problem.
+                 * -- AIDAN
+                 */
+                if (xlog_switch)
+                {
+                    char buf[1024];
+                    uint32 left = (XLogSegSize - openLogOff);
+                    ereport(LOG,
+                        (errmsg("ZEROING xlog file %u segment %u from %u - %u [%u bytes]",
+                                openLogId, openLogSeg,
+                                openLogOff, XLogSegSize, left)
+                        ));
+                    memset(buf, 0, sizeof(buf));
+                    while (left > 0)
+                    {
+                        size_t len = (left > sizeof(buf)) ? sizeof(buf) : left;
+                        write(openLogFile, buf, len);
+                        left -= len;
+                    }
+                }
+
                 issue_xlog_fsync();
                 LogwrtResult.Flush = LogwrtResult.Write; /* end of page */
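
One thing the WIP hunk glosses over is that the result of write() is ignored, so a short write or an error would leave part of the tail unzeroed without anyone noticing. A standalone sketch of the same zero-fill loop with the result checked might look like the following. It is not part of the patch and uses no PostgreSQL internals; zero_fill_tail and its arguments are hypothetical names chosen for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `left` zero bytes at the current offset of fd.
 * Handles short writes and EINTR. Returns 0 on success, -1 on failure with
 * errno set.
 */
static int
zero_fill_tail(int fd, size_t left)
{
    char buf[1024];

    assert(fd >= 0);
    memset(buf, 0, sizeof(buf));

    while (left > 0)
    {
        size_t  len = (left > sizeof(buf)) ? sizeof(buf) : left;
        ssize_t n = write(fd, buf, len);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        if (n == 0)
        {
            errno = ENOSPC;     /* no progress; treat as out of space */
            return -1;
        }
        left -= (size_t) n;     /* advance by what was actually written */
    }
    return 0;
}
```

In the patch's terms the call would be roughly zero_fill_tail(openLogFile, XLogSegSize - openLogOff), with a failure reported through the usual error path instead of being dropped.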
