Re: [HACKERS] Making pg_standby compression-friendly

2008-10-27 Thread Charles Duffy

Koichi Suzuki wrote:

As Heikki pointed out, the issue is not only how to decompress the
compressed WAL, but also how we can keep the archived log compressed
after it is handled by pg_standby.


pg_standby makes a *copy* of the segment from the archive, and need only 
ensure that the copy is decompressed; it has no reason to ever 
decompress the original version in the archive.
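
Concretely -- a minimal sketch, assuming gzip-compressed segments and
the '-p' option proposed in my patch (the choice of decompressor is
only an example):

  restore_command = 'pg_standby -p "gunzip -c" /mnt/server/archiverdir %f %p %r'

The compressed segment in the archive is never rewritten; only the
copy written out to %p is decompressed.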


I don't see the problem here.




Re: [HACKERS] Making pg_standby compression-friendly

2008-10-24 Thread Charles Duffy
In the absence of further feedback from y'all (and in the presence of
some positive results from internal QA), I'm adding the posted patch
as-is to the 2008-11 CommitFest queue. That said, any additional
feedback would be greatly appreciated.





Re: [HACKERS] Making pg_standby compression-friendly

2008-10-22 Thread Charles Duffy
On Thu, Oct 23, 2008 at 1:15 AM, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:

> Charles Duffy wrote:
>
>> I'm interested in compressing archived WAL segments in an environment
>> set up for PITR in the interests of reducing both network traffic and
>> storage requirements. However, pg_standby presently checks file sizes,
>> requiring that an archive segment be exactly the right size to be
>> considered valid. The idea of compressing log segments is not new --
>> the clearxlogtail project in pgfoundry provides a tool to make such
>> compression more effective, and is explicitly intended for said
>> purpose -- but as of 8.3.4, pg_standby appears not to support such
>> environments; I propose adding such support.
>>
>
> Can't you decompress the files in whatever script you use to copy them to
> the archive location?


To be sure I understand -- you're proposing a scenario in which the
archive_command on the master compresses the files, passes them to the
slave while compressed, and then decompresses them on the slave so that
they are stored uncompressed? That succeeds in the goal of decreasing
network bandwidth, but (1) it isn't necessarily easy to implement over
NFS, and (2) it doesn't decrease storage requirements on the slave.
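
A minimal sketch of that flow, assuming gzip and a shared NFS-mounted
archive directory (names hypothetical, and the write shown is not
atomic):

  # on the master, in archive_command: ship the segment compressed
  archive_command = 'gzip -c %p > /mnt/server/archiverdir/%f.gz'

  # on the slave, some separate agent must then decompress in place
  gunzip /mnt/server/archiverdir/*.gz

Since archive_command runs on the master, that decompression step needs
its own slave-side process, which is where the NFS awkwardness comes in.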

(While pg_standby's behavior is to delete segments that are no longer
needed to keep a warm standby slave running, I maintain a separate
archive for PITR use with hardlinked copies of those same archive
segments; storage on the slave is a much bigger issue in this
environment than it would be if the space used for segments were
deallocated as soon as pg_standby chose to unlink them.)


[Heikki, please accept my apologies for the initial off-list response; I
wasn't paying enough attention to gmail's default reply behavior].


[HACKERS] Making pg_standby compression-friendly

2008-10-22 Thread Charles Duffy
Howdy, all.

I'm interested in compressing archived WAL segments in an environment
set up for PITR in the interests of reducing both network traffic and
storage requirements. However, pg_standby presently checks file sizes,
requiring that an archive segment be exactly the right size to be
considered valid. The idea of compressing log segments is not new --
the clearxlogtail project in pgfoundry provides a tool to make such
compression more effective, and is explicitly intended for said
purpose -- but as of 8.3.4, pg_standby appears not to support such
environments; I propose adding such support.

To allow pg_standby to operate in an environment where archive
segments are compressed, two behaviors are necessary:

 - suppressing the file-size checks. This puts the onus on the user to
create these files via an atomic mechanism, but is necessary to allow
compressed files to be considered.
 - allowing a custom restore command to be provided. This permits the
user to specify the mechanism used to decompress the segment.
One bikeshed is determining whether the user should pass in a command
suitable for use in a pipeline or a command which accepts the input and
output files as arguments; both styles are sketched below.
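
A minimal sketch of the two calling conventions, using the flag names
from the attached patch ('gunzip -c' and 'my_decompress' are examples
only, the latter hypothetical):

  # pipeline style: pg_standby runs "COMMAND < archivefile > xlogfile"
  restore_command = 'pg_standby -p "gunzip -c" /mnt/server/archiverdir %f %p %r'

  # argument style: pg_standby runs "COMMAND archivefile xlogfile"
  restore_command = 'pg_standby -C my_decompress /mnt/server/archiverdir %f %p %r'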

A sample implementation is attached, intended only to kickstart
discussion; I'm not attached to either its implementation or its
proposed command-line syntax.

Thoughts?
--- pg_standby.c.orig	2008-07-08 10:12:04.0 -0500
+++ pg_standby.c	2008-10-22 19:05:41.0 -0500
@@ -50,9 +50,11 @@
 bool		triggered = false;	/* have we been triggered? */
 bool		need_cleanup = false;		/* do we need to remove files from
 		 * archive? */
+bool		disable_size_checks = false;	/* avoid checking segment size */
 
 static volatile sig_atomic_t signaled = false;
 
+char	   *customRestore;	/* Filter or command used to restore segments */
 char	   *archiveLocation;	/* where to find the archive? */
 char	   *triggerPath;		/* where to find the trigger file? */
 char	   *xlogFilePath;		/* where we are going to restore to */
@@ -66,6 +68,8 @@
 
 #define RESTORE_COMMAND_COPY 0
 #define RESTORE_COMMAND_LINK 1
+#define RESTORE_COMMAND_PIPE 2
+#define RESTORE_COMMAND_CUST 3
 int			restoreCommandType;
 
 #define XLOG_DATA			 0
@@ -112,8 +116,15 @@
 	snprintf(WALFilePath, MAXPGPATH, "%s\\%s", archiveLocation, nextWALFileName);
 	switch (restoreCommandType)
 	{
+		case RESTORE_COMMAND_PIPE:
+			snprintf(restoreCommand, MAXPGPATH, "%s <\"%s\" >\"%s\"", customRestore, WALFilePath, xlogFilePath);
+			break;
+		case RESTORE_COMMAND_CUST:
+			SET_RESTORE_COMMAND(customRestore, WALFilePath, xlogFilePath);
+			break;
 		case RESTORE_COMMAND_LINK:
 			SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
+			break;
 		case RESTORE_COMMAND_COPY:
 		default:
 			SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
@@ -123,6 +134,12 @@
 	snprintf(WALFilePath, MAXPGPATH, "%s/%s", archiveLocation, nextWALFileName);
 	switch (restoreCommandType)
 	{
+		case RESTORE_COMMAND_PIPE:
+			snprintf(restoreCommand, MAXPGPATH, "%s <\"%s\" >\"%s\"", customRestore, WALFilePath, xlogFilePath);
+			break;
+		case RESTORE_COMMAND_CUST:
+			snprintf(restoreCommand, MAXPGPATH, "%s \"%s\" \"%s\"", customRestore, WALFilePath, xlogFilePath);
+			break;
 		case RESTORE_COMMAND_LINK:
 #if HAVE_WORKING_LINK
 			SET_RESTORE_COMMAND("ln -s -f", WALFilePath, xlogFilePath);
@@ -170,7 +187,7 @@
 			nextWALFileType = XLOG_BACKUP_LABEL;
 			return true;
 		}
-		else if (stat_buf.st_size == XLOG_SEG_SIZE)
+		else if (disable_size_checks || stat_buf.st_size == XLOG_SEG_SIZE)
 		{
 #ifdef WIN32
 
@@ -190,7 +207,7 @@
 		/*
 		 * If still too small, wait until it is the correct size
 		 */
-		if (stat_buf.st_size > XLOG_SEG_SIZE)
+		if (!disable_size_checks && stat_buf.st_size > XLOG_SEG_SIZE)
 		{
 			if (debug)
 			{
@@ -432,12 +449,15 @@
 	fprintf(stderr, "note space between ARCHIVELOCATION and NEXTWALFILE\n");
 	fprintf(stderr, "with main intended use as a restore_command in the recovery.conf\n");
 	fprintf(stderr, "	 restore_command = 'pg_standby [OPTION]... ARCHIVELOCATION %%f %%p %%r'\n");
-	fprintf(stderr, "e.g. restore_command = 'pg_standby -l /mnt/server/archiverdir %%f %%p %%r'\n");
+	fprintf(stderr, "e.g. restore_command = 'pg_standby -l /mnt/server/archiverdir %%f %%p %%r'\n\n");
+	fprintf(stderr, "If -C or -p is used, the archive must be populated using atomic calls (e.g. rename).\n");
 	fprintf(stderr, "\nOptions:\n");
+	fprintf(stderr, "  -C COMMAND		invoke command for retrieval from the archive (as \"COMMAND source dest\")\n");
 	fprintf(stderr, "  -c			copies file from archive (default)\n");
 	fprintf(stderr, "  -d			generate lots of debugging output (testing only)\n");
 	fprintf(stderr, "  -k NUMFILESTOKEEP	if RESTARTWALFILE not used, removes files prior to limit (0 keeps all)\n");
 	fprintf(stderr, "  -l			links into archive (leaves file in archive)\n");
+	fprintf(stderr, "  -p COMMAND		pipe through command on retrieval from the archive (e.g. 'gunzip -c')\n");
 	fp

Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()

2006-07-28 Thread Charles Duffy

On 7/15/06, Tom Lane <[EMAIL PROTECTED]> wrote:

Anyway, Qingqing's question still needs to be answered: how can a sort
of under 30k items take so long?



It happens because (as previously suggested by Tom) the dataset for
the 'short' sort (~10k rows, 0.3 sec) has no rows whose leftmost fields
evaluate as 'equal' when passed to the qsort compare function. The
'long' sort (~30k rows, 78 sec) has plenty of rows whose first 6
columns all evaluate as 'equal' when the rows are compared.

For the 'long' data, the compare moves rightward until it encounters
'flato', a TEXT column with an average length of 7.5k characters (and
some rows up to 400k). The first 6 columns are mostly INTEGER, so
compares on them are relatively inexpensive. All the expensive compares
on 'flato' account for the disproportionate difference in sort times
relative to the number of rows in each set.
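
As a simplified model of the effect (not PostgreSQL's actual comparetup
code, which goes through datatype-specific, locale-aware comparators --
just the shape of the cost):

#include <string.h>

/*
 * Sketch of a multi-column row comparison.  While any of the six cheap
 * leading keys differ, a comparison costs a few integer operations;
 * once they all tie, every comparison pays for a strcmp() over
 * multi-kilobyte text, and that dominates the total sort time.
 */
struct row
{
	int		keys[6];		/* the mostly-INTEGER leading columns */
	char   *flato;			/* long TEXT column, avg ~7.5k chars here */
};

static int
compare_rows(const struct row *a, const struct row *b)
{
	int			i;

	for (i = 0; i < 6; i++)
	{
		if (a->keys[i] != b->keys[i])
			return (a->keys[i] < b->keys[i]) ? -1 : 1;
	}
	/* leading keys all equal: O(common-prefix length) on long text */
	return strcmp(a->flato, b->flato);
}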

As for the potential for memory leaks -- I'm still thinking about it.

Thanks,

Charles Duffy.


Peter Eisentraut <[EMAIL PROTECTED]> writes:
> The merge sort is here:

> http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21&content-type=text/x-cvsweb-markup&cvsroot=glibc

> It uses alloca, so we're good here.

Uh ... but it also uses malloc, and potentially a honkin' big malloc at
that (up to a quarter of physical RAM).  So I'm worried again.

Anyway, Qingqing's question still needs to be answered: how can a sort
of under 30k items take so long?

regards, tom lane

   Column   |  Type   | Modifiers
------------+---------+-----------
 record     | integer |
 commr1     | integer |
 envr1      | oid     |
 docin      | integer |
 creat      | integer |
 flati      | text    |
 flato      | text    |
 doc        | text    |
 docst      | integer |
 vlord      | integer |
 vl0        | integer |
 vl1        | date    |
 vl2        | text    |
 vl3        | text    |
 vl4        | text    |
 vl5        | text    |
 vl6        | text    |
 vl7        | date    |
 vl8        | text    |
 vl9        | integer |
