On Wed, Nov 28, 2012 at 03:22:32PM -0500, Bruce Momjian wrote:
> On Tue, Nov 27, 2012 at 09:35:10PM -0800, Jeff Janes wrote:
> > > I tested custom format with pg_restore -j and -1, as well as text
> > > restore.  The winner was pg_dump -Fc | pg_restore -1;
> >
> > I don't have the numbers at hand, but if my relcache patch is
> > accepted, then "-1" stops being faster.
> >
> > -1 gets rid of the AtEOXact relcache N^2 behavior, but at the cost of
> > invoking a different N^2, that one in the stats system.
>
> OK, here are the testing results:
>
> 	 #tbls      git      -1    AtEOXact    both
> 	     1    11.06   13.06     10.99     13.20
> 	  1000    21.71   22.92     22.20     22.51
> 	  2000    32.86   31.09     32.51     31.62
> 	  4000    55.22   49.96     52.50     49.99
> 	  8000   105.34   82.10     95.32     82.94
> 	 16000   223.67  164.27    187.40    159.53
> 	 32000   543.93  324.63    366.44    317.93
> 	 64000  1697.14  791.82    767.32    752.57
>
> Up to 2k, they are all similar.  4k & 8k have the -1 patch as a win,
> and 16k+ really need both patches.
>
> I will continue working on the -1 patch, and hopefully we can get your
> AtEOXact patch in soon.  Is someone reviewing that?
I have polished up the patch (attached) and it is ready for application to 9.3.

Since there is no pg_dump/pg_restore pipe parallelism, I had the old cluster
create per-database dump files, so I don't need to have the old and new
clusters running at the same time, which would have required two port numbers
and made shared memory exhaustion more likely.  We now create a dump file per
database, so thousands of database dump files might cause a performance
problem.

This also adds status output so you can see the database names as their
schemas are dumped and restored.  This was requested by users.

I retained custom mode for pg_dump because it is measurably faster than text
mode (not sure why, psql overhead?):

	 #tbls      git      -Fc      -Fp
	     1    11.04    11.08    11.02
	  1000    22.37    19.68    21.64
	  2000    32.39    28.62    31.40
	  4000    56.18    48.53    51.15
	  8000   105.15    81.23    91.84
	 16000   227.64   156.72   177.79
	 32000   542.80   323.19   371.81
	 64000  1711.77   789.17   865.03

Text dump files are slightly easier to debug, but probably not by much.

Single-transaction restores were recommended to me over a year ago (by
Magnus?), but I wanted to get pg_upgrade rock-solid before doing
optimization, and now is the right time to optimize.

One risk of single-transaction restores is max_locks_per_transaction
exhaustion, but you will need to increase that on the old cluster for
pg_dump anyway, because pg_dump runs in a single transaction; the only new
requirement is that the new cluster might also need a larger
max_locks_per_transaction.

I was able to remove split_old_dump() because pg_dumpall now produces a
full global restore file and we do database dumps separately.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
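For anyone wanting to reproduce the per-database approach by hand, the
command pair the patch builds for each database is roughly the following
sketch (BINDIR, DB_NAME, and DB_OID are made-up placeholders here; the real
code derives them from the old cluster's catalogs and adds connection
options via cluster_conn_opts()):

	# Sketch of the per-database dump/restore pair pg_upgrade now issues.
	# Placeholder values; pg_upgrade fills these in from cluster info.
	BINDIR=/usr/pgsql/bin
	DB_NAME=mydb
	DB_OID=16384
	DUMP_FILE="pg_upgrade_dump_${DB_OID}.custom"

	# schema-only, custom-format dump of one old-cluster database
	DUMP_CMD="\"$BINDIR/pg_dump\" --schema-only --binary-upgrade --format=custom --file=\"$DUMP_FILE\" \"$DB_NAME\""

	# single-transaction restore of that dump into the new cluster
	RESTORE_CMD="\"$BINDIR/pg_restore\" --exit-on-error --single-transaction --dbname \"$DB_NAME\" \"$DUMP_FILE\""

	echo "$DUMP_CMD"
	echo "$RESTORE_CMD"

The OID-based file name matches the new DB_DUMP_FILE_MASK, which avoids any
quoting problems with odd database names.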
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index 285f10c..bccceb1
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*************** output_check_banner(bool *live_check)
*** 72,78 ****
  
  
  void
! check_old_cluster(bool live_check, char **sequence_script_file_name)
  {
  	/* -- OLD -- */
  
--- 72,78 ----
  
  
  void
! check_and_dump_old_cluster(bool live_check, char **sequence_script_file_name)
  {
  	/* -- OLD -- */
  
*************** check_old_cluster(bool live_check, char
*** 131,140 ****
  	 * the old server is running.
  	 */
  	if (!user_opts.check)
- 	{
  		generate_old_dump();
- 		split_old_dump();
- 	}
  
  	if (!live_check)
  		stop_postmaster(false);
--- 131,137 ----
diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
new file mode 100644
index 577ccac..d206e98
*** a/contrib/pg_upgrade/dump.c
--- b/contrib/pg_upgrade/dump.c
***************
*** 16,110 ****
  void
  generate_old_dump(void)
  {
! 	/* run new pg_dumpall binary */
! 	prep_status("Creating catalog dump");
  
! 	/*
! 	 * --binary-upgrade records the width of dropped columns in pg_class, and
! 	 * restores the frozenid's for databases and relations.
! 	 */
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/pg_dumpall\" %s --schema-only --binary-upgrade %s -f %s",
  			  new_cluster.bindir, cluster_conn_opts(&old_cluster),
  			  log_opts.verbose ? "--verbose" : "",
! 			  ALL_DUMP_FILE);
! 	check_ok();
! }
! 
! 
! /*
!  *	split_old_dump
!  *
!  *	This function splits pg_dumpall output into global values and
!  *	database creation, and per-db schemas.  This allows us to create
!  *	the support functions between restoring these two parts of the
!  *	dump.  We split on the first "\connect " after a CREATE ROLE
!  *	username match;  this is where the per-db restore starts.
!  *
!  *	We suppress recreation of our own username so we don't generate
!  *	an error during restore
!  */
! void
! split_old_dump(void)
! {
! 	FILE	   *all_dump,
! 			   *globals_dump,
! 			   *db_dump;
! 	FILE	   *current_output;
! 	char		line[LINE_ALLOC];
! 	bool		start_of_line = true;
! 	char		create_role_str[MAX_STRING];
! 	char		create_role_str_quote[MAX_STRING];
! 	char		filename[MAXPGPATH];
! 	bool		suppressed_username = false;
! 
! 
! 	/*
! 	 * Open all files in binary mode to avoid line end translation on Windows,
! 	 * both for input and output.
! 	 */
! 
! 	snprintf(filename, sizeof(filename), "%s", ALL_DUMP_FILE);
! 	if ((all_dump = fopen(filename, PG_BINARY_R)) == NULL)
! 		pg_log(PG_FATAL, "Could not open dump file \"%s\": %s\n",
! 			   filename, getErrorText(errno));
! 	snprintf(filename, sizeof(filename), "%s", GLOBALS_DUMP_FILE);
! 	if ((globals_dump = fopen_priv(filename, PG_BINARY_W)) == NULL)
! 		pg_log(PG_FATAL, "Could not write to dump file \"%s\": %s\n",
! 			   filename, getErrorText(errno));
! 	snprintf(filename, sizeof(filename), "%s", DB_DUMP_FILE);
! 	if ((db_dump = fopen_priv(filename, PG_BINARY_W)) == NULL)
! 		pg_log(PG_FATAL, "Could not write to dump file \"%s\": %s\n",
! 			   filename, getErrorText(errno));
! 
! 	current_output = globals_dump;
! 
! 	/* patterns used to prevent our own username from being recreated */
! 	snprintf(create_role_str, sizeof(create_role_str),
! 			 "CREATE ROLE %s;", os_info.user);
! 	snprintf(create_role_str_quote, sizeof(create_role_str_quote),
! 			 "CREATE ROLE %s;", quote_identifier(os_info.user));
! 
! 	while (fgets(line, sizeof(line), all_dump) != NULL)
  	{
! 		/* switch to db_dump file output? */
! 		if (current_output == globals_dump && start_of_line &&
! 			suppressed_username &&
! 			strncmp(line, "\\connect ", strlen("\\connect ")) == 0)
! 			current_output = db_dump;
! 
! 		/* output unless we are recreating our own username */
! 		if (current_output != globals_dump || !start_of_line ||
! 			(strncmp(line, create_role_str, strlen(create_role_str)) != 0 &&
! 			 strncmp(line, create_role_str_quote, strlen(create_role_str_quote)) != 0))
! 			fputs(line, current_output);
! 		else
! 			suppressed_username = true;
! 
! 		if (strlen(line) > 0 && line[strlen(line) - 1] == '\n')
! 			start_of_line = true;
! 		else
! 			start_of_line = false;
  	}
  
! 	fclose(all_dump);
! 	fclose(globals_dump);
! 	fclose(db_dump);
  }
--- 16,49 ----
  void
  generate_old_dump(void)
  {
! 	int			dbnum;
  
! 	prep_status("Creating catalog dump\n");
! 
! 	pg_log(PG_REPORT, OVERWRITE_MESSAGE, "global objects");
! 
! 	/* run new pg_dumpall binary for globals */
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/pg_dumpall\" %s --schema-only --globals-only --binary-upgrade %s -f %s",
  			  new_cluster.bindir, cluster_conn_opts(&old_cluster),
  			  log_opts.verbose ? "--verbose" : "",
! 			  GLOBALS_DUMP_FILE);
! 
! 	/* create per-db dump files */
! 	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
  	{
! 		char		file_name[MAXPGPATH];
! 		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
! 
! 		pg_log(PG_REPORT, OVERWRITE_MESSAGE, old_db->db_name);
! 		snprintf(file_name, sizeof(file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
! 
! 		exec_prog(RESTORE_LOG_FILE, NULL, true,
! 				  "\"%s/pg_dump\" %s --schema-only --binary-upgrade --format=custom %s --file=\"%s\" \"%s\"",
! 				  new_cluster.bindir, cluster_conn_opts(&old_cluster),
! 				  log_opts.verbose ? "--verbose" : "", file_name, old_db->db_name);
  	}
  
! 	end_progress_output();
! 	check_ok();
  }
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
new file mode 100644
index 76247fd..35de541
*** a/contrib/pg_upgrade/exec.c
--- b/contrib/pg_upgrade/exec.c
*************** exec_prog(const char *log_file, const ch
*** 104,111 ****
  
  	if (result != 0)
  	{
! 		report_status(PG_REPORT, "*failure*");
  		fflush(stdout);
  		pg_log(PG_VERBOSE, "There were problems executing \"%s\"\n", cmd);
  		if (opt_log_file)
  			pg_log(throw_error ? PG_FATAL : PG_REPORT,
--- 104,113 ----
  
  	if (result != 0)
  	{
! 		/* we might be in on a progress status line, so go to the next line */
! 		report_status(PG_REPORT, "\n*failure*");
  		fflush(stdout);
+ 
  		pg_log(PG_VERBOSE, "There were problems executing \"%s\"\n", cmd);
  		if (opt_log_file)
  			pg_log(throw_error ? PG_FATAL : PG_REPORT,
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
new file mode 100644
index 4d2e79c..bdc6d13
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*************** main(int argc, char **argv)
*** 92,98 ****
  
  	check_cluster_compatibility(live_check);
  
! 	check_old_cluster(live_check, &sequence_script_file_name);
  
  
  	/* -- NEW -- */
--- 92,98 ----
  
  	check_cluster_compatibility(live_check);
  
! 	check_and_dump_old_cluster(live_check, &sequence_script_file_name);
  
  
  	/* -- NEW -- */
*************** create_new_objects(void)
*** 282,287 ****
--- 282,292 ----
  
  	prep_status("Adding support functions to new cluster");
  
+ 	/*
+ 	 * Technically, we only need to install these support functions in new
+ 	 * databases that also exist in the old cluster, but for completeness
+ 	 * we process all new databases.
+ 	 */
  	for (dbnum = 0; dbnum < new_cluster.dbarr.ndbs; dbnum++)
  	{
  		DbInfo	   *new_db = &new_cluster.dbarr.dbs[dbnum];
*************** create_new_objects(void)
*** 292,302 ****
  	}
  	check_ok();
  
! 	prep_status("Restoring database schema to new cluster");
! 	exec_prog(RESTORE_LOG_FILE, NULL, true,
! 			  "\"%s/psql\" " EXEC_PSQL_ARGS " %s -f \"%s\"",
! 			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
! 			  DB_DUMP_FILE);
  	check_ok();
  
  	/* regenerate now that we have objects in the databases */
--- 297,323 ----
  	}
  	check_ok();
  
! 	prep_status("Restoring database schema to new cluster\n");
! 
! 	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
! 	{
! 		char		file_name[MAXPGPATH];
! 		DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
! 
! 		pg_log(PG_REPORT, OVERWRITE_MESSAGE, old_db->db_name);
! 		snprintf(file_name, sizeof(file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
! 
! 		/*
! 		 * Using pg_restore --single-transaction is faster than other
! 		 * methods, like --jobs.  pg_dump only produces its output at the
! 		 * end, so there is little parallelism using the pipe.
! 		 */
! 		exec_prog(RESTORE_LOG_FILE, NULL, true,
! 				  "\"%s/pg_restore\" %s --exit-on-error --single-transaction --verbose --dbname \"%s\" \"%s\"",
! 				  new_cluster.bindir, cluster_conn_opts(&new_cluster),
! 				  old_db->db_name, file_name);
! 	}
! 	end_progress_output();
  	check_ok();
  
  	/* regenerate now that we have objects in the databases */
*************** cleanup(void)
*** 455,468 ****
  	/* Remove dump and log files? */
  	if (!log_opts.retain)
  	{
  		char	  **filename;
  
  		for (filename = output_files; *filename != NULL; filename++)
  			unlink(*filename);
  
! 		/* remove SQL files */
! 		unlink(ALL_DUMP_FILE);
  		unlink(GLOBALS_DUMP_FILE);
! 		unlink(DB_DUMP_FILE);
  	}
  }
--- 476,498 ----
  	/* Remove dump and log files? */
  	if (!log_opts.retain)
  	{
+ 		int			dbnum;
  		char	  **filename;
  
  		for (filename = output_files; *filename != NULL; filename++)
  			unlink(*filename);
  
! 		/* remove dump files */
  		unlink(GLOBALS_DUMP_FILE);
! 
! 		if (old_cluster.dbarr.dbs)
! 			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
! 			{
! 				char		file_name[MAXPGPATH];
! 				DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
! 
! 				snprintf(file_name, sizeof(file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
! 				unlink(file_name);
! 			}
  	}
  }
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index ace56e5..d981035
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
***************
*** 29,38 ****
  #define OVERWRITE_MESSAGE	"  %-" MESSAGE_WIDTH "." MESSAGE_WIDTH "s\r"
  #define GET_MAJOR_VERSION(v)	((v) / 100)
  
- #define ALL_DUMP_FILE		"pg_upgrade_dump_all.sql"
  /* contains both global db information and CREATE DATABASE commands */
  #define GLOBALS_DUMP_FILE	"pg_upgrade_dump_globals.sql"
! #define DB_DUMP_FILE		"pg_upgrade_dump_db.sql"
  
  #define SERVER_LOG_FILE		"pg_upgrade_server.log"
  #define RESTORE_LOG_FILE	"pg_upgrade_restore.log"
--- 29,37 ----
  #define OVERWRITE_MESSAGE	"  %-" MESSAGE_WIDTH "." MESSAGE_WIDTH "s\r"
  #define GET_MAJOR_VERSION(v)	((v) / 100)
  
  /* contains both global db information and CREATE DATABASE commands */
  #define GLOBALS_DUMP_FILE	"pg_upgrade_dump_globals.sql"
! #define DB_DUMP_FILE_MASK	"pg_upgrade_dump_%u.custom"
  
  #define SERVER_LOG_FILE		"pg_upgrade_server.log"
  #define RESTORE_LOG_FILE	"pg_upgrade_restore.log"
*************** extern OSInfo os_info;
*** 296,307 ****
  
  /* check.c */
  
  void		output_check_banner(bool *live_check);
! void		check_old_cluster(bool live_check, char **sequence_script_file_name);
  void		check_new_cluster(void);
  void		report_clusters_compatible(void);
  void		issue_warnings(char *sequence_script_file_name);
! void		output_completion_banner(char *analyze_script_file_name, char *deletion_script_file_name);
  void		check_cluster_versions(void);
  void		check_cluster_compatibility(bool live_check);
--- 295,306 ----
  
  /* check.c */
  
  void		output_check_banner(bool *live_check);
! void		check_and_dump_old_cluster(bool live_check, char **sequence_script_file_name);
  void		check_new_cluster(void);
  void		report_clusters_compatible(void);
  void		issue_warnings(char *sequence_script_file_name);
! void		output_completion_banner(char *analyze_script_file_name, char *deletion_script_file_name);
  void		check_cluster_versions(void);
  void		check_cluster_compatibility(bool live_check);
*************** void disable_old_cluster(void);
*** 319,325 ****
  
  /* dump.c */
  
  void		generate_old_dump(void);
- void		split_old_dump(void);
  
  /* exec.c */
  
--- 318,323 ----
*************** __attribute__((format(PG_PRINTF_ATTRIBUT
*** 433,438 ****
--- 431,437 ----
  void
  pg_log(eLogType type, char *fmt,...)
  __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
+ void		end_progress_output(void);
  void
  prep_status(const char *fmt,...)
  __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
new file mode 100644
index 7dbaac9..14e66df
*** a/contrib/pg_upgrade/relfilenode.c
--- b/contrib/pg_upgrade/relfilenode.c
*************** transfer_all_new_dbs(DbInfoArr *old_db_a
*** 82,90 ****
  		}
  	}
  
! 	prep_status(" ");			/* in case nothing printed; pass a space so
! 								 * gcc doesn't complain about empty format
! 								 * string */
  	check_ok();
  
  	return msg;
--- 82,88 ----
  		}
  	}
  
! 	end_progress_output();
  	check_ok();
  
  	return msg;
diff --git a/contrib/pg_upgrade/util.c b/contrib/pg_upgrade/util.c
new file mode 100644
index 1d4bc89..0c1eccc
*** a/contrib/pg_upgrade/util.c
--- b/contrib/pg_upgrade/util.c
*************** report_status(eLogType type, const char
*** 35,40 ****
--- 35,52 ----
  }
  
  
+ /* force blank output for progress display */
+ void
+ end_progress_output(void)
+ {
+ 	/*
+ 	 * In case nothing printed; pass a space so gcc doesn't complain about
+ 	 * empty format string.
+ 	 */
+ 	prep_status(" ");
+ }
+ 
+ 
  /*
   * prep_status
   *
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
new file mode 100644
index ca95bad..83eae81
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
*************** main(int argc, char *argv[])
*** 502,508 ****
  		}
  
  		/* Dump CREATE DATABASE commands */
! 		if (!globals_only && !roles_only && !tablespaces_only)
  			dumpCreateDB(conn);
  
  		/* Dump role/database settings */
--- 502,508 ----
  		}
  
  		/* Dump CREATE DATABASE commands */
! 		if (binary_upgrade || (!globals_only && !roles_only && !tablespaces_only))
  			dumpCreateDB(conn);
  
  		/* Dump role/database settings */
*************** dumpRoles(PGconn *conn)
*** 745,753 ****
  		 * will acquire the right properties even if it already exists (ie, it
  		 * won't hurt for the CREATE to fail).  This is particularly important
  		 * for the role we are connected as, since even with --clean we will
! 		 * have failed to drop it.
  		 */
! 		appendPQExpBuffer(buf, "CREATE ROLE %s;\n", fmtId(rolename));
  		appendPQExpBuffer(buf, "ALTER ROLE %s WITH", fmtId(rolename));
  
  		if (strcmp(PQgetvalue(res, i, i_rolsuper), "t") == 0)
--- 745,755 ----
  		 * will acquire the right properties even if it already exists (ie, it
  		 * won't hurt for the CREATE to fail).  This is particularly important
  		 * for the role we are connected as, since even with --clean we will
! 		 * have failed to drop it.  binary_upgrade cannot generate any errors,
! 		 * so we assume the role is already created.
  		 */
! 		if (!binary_upgrade)
! 			appendPQExpBuffer(buf, "CREATE ROLE %s;\n", fmtId(rolename));
  		appendPQExpBuffer(buf, "ALTER ROLE %s WITH", fmtId(rolename));
  
  		if (strcmp(PQgetvalue(res, i, i_rolsuper), "t") == 0)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers