On Thursday 12 February 2009 17:41:01 Peter Eisentraut wrote:
> I know we've already had a discussion on the naming of the pg_restore -m
> option, but in any case this description in pg_restore --help is confusing:
>
> -m, --multi-thread=NUM   use this many parallel connections to restore
>
> Either it is using that many threads in the client, or it is using that
> many connections to the server.  I assume the implementation does
> approximately both, but we should be clear about what we promise to the
> user.  Either: Reserve this many connections on the server.  Or: Reserve
> this many threads in the kernel of the client.  The documentation in the
> reference/man page is equally confused.
>
> Also, the term "multi" is redundant, because whether it is multi or single
> is obviously determined by the value of NUM.

After reviewing the discussion and the implementation, I would say "workers"
would be the best description of the feature, but unfortunately the options -w
and -W are already taken.  I'd also avoid -n or -N for "num..." because pg_dump
already uses -n and -N for something else, and we are now trying to avoid
inconsistent options between these programs.  Also, option names usually don't
start with a quantity prefix (imagine --num-shared-buffers or --num-port).

While I think "jobs" isn't a totally accurate description, I would still
propose -j/--jobs for the option name, because it is neutral about the
implementation and has strong precedent as the option for increasing
parallelism to get the work done faster.  I also noticed that Andrew D.
used "jobs" in his own emails when commenting on the feature. :-)

The attached patch also updates the documentation to give some additional
advice about which values to use.
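
As a side note, the patch keeps the existing atoi() call for the option
argument, so something like "-j abc" quietly becomes 0 and falls back to a
non-parallel restore.  Below is a minimal standalone sketch of how the value
could be checked more strictly.  It is not part of the attached patch: the
helper names parse_jobs_arg() and jobs_conflict_with_single_txn() are
hypothetical, and rejecting values below 1 is my assumption about what we
would want, not something the current code does.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: parse the argument of
 * -j/--jobs strictly.  Returns the job count (>= 1), or -1 if the
 * argument is not a positive decimal integer that fits in an int.
 */
static int
parse_jobs_arg(const char *arg)
{
	char	   *end;
	long		val;

	assert(arg != NULL);

	errno = 0;
	val = strtol(arg, &end, 10);

	/* reject an empty string or trailing junk such as "4x" */
	if (end == arg || *end != '\0')
		return -1;

	/* reject overflow, zero, and negative values */
	if (errno == ERANGE || val < 1 || val > INT_MAX)
		return -1;

	return (int) val;
}

/*
 * Hypothetical helper mirroring the check already in pg_restore.c:
 * more than one job cannot be combined with --single-transaction.
 */
static bool
jobs_conflict_with_single_txn(int number_of_jobs, bool single_txn)
{
	return single_txn && number_of_jobs > 1;
}
```

In the 'j' case of the option switch, the result of parse_jobs_arg(optarg)
could be tested for -1 and reported as an error before assigning
opts->number_of_jobs.  Whether that stricter behavior is wanted is a
separate question from the renaming.
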
Index: doc/src/sgml/ref/pg_restore.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_restore.sgml,v
retrieving revision 1.80
diff -u -3 -p -r1.80 pg_restore.sgml
--- doc/src/sgml/ref/pg_restore.sgml	26 Feb 2009 16:02:37 -0000	1.80
+++ doc/src/sgml/ref/pg_restore.sgml	19 Mar 2009 21:18:32 -0000
@@ -216,6 +216,46 @@
      </varlistentry>
 
      <varlistentry>
+      <term><option>-j <replaceable class="parameter">number-of-jobs</replaceable></option></term>
+      <term><option>--jobs=<replaceable class="parameter">number-of-jobs</replaceable></option></term>
+      <listitem>
+       <para>
+        Run the most time-consuming parts
+        of <application>pg_restore</> &mdash; those which load data,
+        create indexes, or create constraints &mdash; using multiple
+        concurrent jobs.  This option can dramatically reduce the time
+        to restore a large database to a server running on a
+        multi-processor machine.
+       </para>
+
+       <para>
+        Each job is one process or one thread, depending on the
+        operating system, and uses a separate connection to the
+        server.
+       </para>
+
+       <para>
+        The optimal value for this option depends on the hardware
+        setup of the server, of the client, and of the network.
+        Factors include the number of CPU cores and the disk setup.  A
+        good place to start is the number of CPU cores on the server,
+        but values larger than that can also lead to faster restore
+        times in many cases.  Of course, values that are too high will
+        lead to decreasing performance because of thrashing.
+       </para>
+
+       <para>
+        Only the custom archive format is supported with this option.
+        The input file must be a regular file (not, for example, a
+        pipe).  This option is ignored when emitting a script rather
+        than connecting directly to a database server.  Also, multiple
+        jobs cannot be used together with the
+        option <option>--single-transaction</option>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-l</option></term>
       <term><option>--list</option></term>
       <listitem>
@@ -242,28 +282,6 @@
      </varlistentry>
 
      <varlistentry>
-      <term><option>-m <replaceable class="parameter">number-of-threads</replaceable></option></term>
-      <term><option>--multi-thread=<replaceable class="parameter">number-of-threads</replaceable></option></term>
-      <listitem>
-       <para>
-        Run the most time-consuming parts of <application>pg_restore</>
-        &mdash; those which load data, create indexes, or create
-        constraints &mdash; using multiple concurrent connections to the
-        database. This option can dramatically reduce the time to restore a
-        large database to a server running on a multi-processor machine.
-       </para>
-
-       <para>
-        This option is ignored when emitting a script rather than connecting
-        directly to a database server.  Multiple threads cannot be used
-        together with <option>--single-transaction</option>.  Also, the input
-        must be a plain file (not, for example, a pipe), and at present only
-        the custom archive format is supported.
-       </para>
-      </listitem>
-     </varlistentry>
-
-     <varlistentry>
       <term><option>-n <replaceable class="parameter">namespace</replaceable></option></term>
       <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
       <listitem>
Index: src/bin/pg_dump/pg_backup.h
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup.h,v
retrieving revision 1.50
diff -u -3 -p -r1.50 pg_backup.h
--- src/bin/pg_dump/pg_backup.h	26 Feb 2009 16:02:37 -0000	1.50
+++ src/bin/pg_dump/pg_backup.h	19 Mar 2009 21:18:32 -0000
@@ -139,7 +139,7 @@ typedef struct _restoreOptions
 	int			suppressDumpWarnings;	/* Suppress output of WARNING entries
 										 * to stderr */
 	bool		single_txn;
-	int			number_of_threads;
+	int			number_of_jobs;
 
 	bool	   *idWanted;		/* array showing which dump IDs to emit */
 } RestoreOptions;
Index: src/bin/pg_dump/pg_backup_archiver.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.167
diff -u -3 -p -r1.167 pg_backup_archiver.c
--- src/bin/pg_dump/pg_backup_archiver.c	13 Mar 2009 22:50:44 -0000	1.167
+++ src/bin/pg_dump/pg_backup_archiver.c	19 Mar 2009 21:18:32 -0000
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX, RestoreOpti
 	 *
 	 * In parallel mode, turn control over to the parallel-restore logic.
 	 */
-	if (ropt->number_of_threads > 1 && ropt->useDB)
+	if (ropt->number_of_jobs > 1 && ropt->useDB)
 		restore_toc_entries_parallel(AH);
 	else
 	{
@@ -3061,7 +3061,7 @@ static void
 restore_toc_entries_parallel(ArchiveHandle *AH)
 {
 	RestoreOptions *ropt = AH->ropt;
-	int			n_slots = ropt->number_of_threads;
+	int			n_slots = ropt->number_of_jobs;
 	ParallelSlot *slots;
 	int			work_status;
 	int			next_slot;
Index: src/bin/pg_dump/pg_restore.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_restore.c,v
retrieving revision 1.95
diff -u -3 -p -r1.95 pg_restore.c
--- src/bin/pg_dump/pg_restore.c	11 Mar 2009 03:33:29 -0000	1.95
+++ src/bin/pg_dump/pg_restore.c	19 Mar 2009 21:18:32 -0000
@@ -93,8 +93,8 @@ main(int argc, char **argv)
 		{"host", 1, NULL, 'h'},
 		{"ignore-version", 0, NULL, 'i'},
 		{"index", 1, NULL, 'I'},
+		{"jobs", 1, NULL, 'j'},
 		{"list", 0, NULL, 'l'},
-		{"multi-thread", 1, NULL, 'm'},
 		{"no-privileges", 0, NULL, 'x'},
 		{"no-acl", 0, NULL, 'x'},
 		{"no-owner", 0, NULL, 'O'},
@@ -146,7 +146,7 @@ main(int argc, char **argv)
 		}
 	}
 
-	while ((c = getopt_long(argc, argv, "acCd:ef:F:h:iI:lL:m:n:Op:P:RsS:t:T:U:vwWxX:1",
+	while ((c = getopt_long(argc, argv, "acCd:ef:F:h:iI:j:lL:n:Op:P:RsS:t:T:U:vwWxX:1",
 							cmdopts, NULL)) != -1)
 	{
 		switch (c)
@@ -181,6 +181,10 @@ main(int argc, char **argv)
 				/* ignored, deprecated option */
 				break;
 
+			case 'j':			/* number of restore jobs */
+				opts->number_of_jobs = atoi(optarg);
+				break;
+
 			case 'l':			/* Dump the TOC summary */
 				opts->tocSummary = 1;
 				break;
@@ -189,10 +193,6 @@ main(int argc, char **argv)
 				opts->tocFile = strdup(optarg);
 				break;
 
-			case 'm':			/* number of restore threads */
-				opts->number_of_threads = atoi(optarg);
-				break;
-
 			case 'n':			/* Dump data for this schema only */
 				opts->schemaNames = strdup(optarg);
 				break;
@@ -318,9 +318,9 @@ main(int argc, char **argv)
 	}
 
 	/* Can't do single-txn mode with multiple connections */
-	if (opts->single_txn && opts->number_of_threads > 1)
+	if (opts->single_txn && opts->number_of_jobs > 1)
 	{
-		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple threads\n"),
+		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
 				progname);
 		exit(1);
 	}
@@ -417,9 +417,9 @@ usage(const char *progname)
 	printf(_("  -C, --create             create the target database\n"));
 	printf(_("  -e, --exit-on-error      exit on error, default is to continue\n"));
 	printf(_("  -I, --index=NAME         restore named index\n"));
+	printf(_("  -j, --jobs=NUM           use this many parallel jobs to restore\n"));
 	printf(_("  -L, --use-list=FILENAME  use table of contents from this file for\n"
 		 "                           selecting/ordering output\n"));
-	printf(_("  -m, --multi-thread=NUM   use this many parallel connections to restore\n"));
 	printf(_("  -n, --schema=NAME        restore only objects in this schema\n"));
 	printf(_("  -O, --no-owner           skip restoration of object ownership\n"));
 	printf(_("  -P, --function=NAME(args)\n"
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)