On Tue, Mar 13, 2012 at 09:15:52PM -0400, Bruce Momjian wrote:
> On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote:
> > On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote:
> > > Bruce Momjian <br...@momjian.us> wrote:
> > >  
> > > > What is the target=10 duration?  I think 10 is as low as we can
> > > > acceptably recommend.  Should we recommend they run vacuumdb
> > > > twice, once with default_statistics_target = 4, and another with
> > > > the default?
> > >  
> > > Here are the results at various settings.
> > >  
> > > 1   :  172198.892 ms
> > > 2   :  295536.814 ms
> > > 4   :  474319.826 ms
> > > 10  :  750458.312 ms
> > > 100 :  3433794.609 ms
> > 
> > Thanks, good numbers to know.
> 
> OK, new crazy idea.  Kevin has shown that his database can get a single
> bucket in 2.8 minutes.  What if we have pg_upgrade create a vacuumdb
> script that generates increasingly accurate statistics, e.g. it runs for
> default_statistics_target values of 1, 10, and default (100).  That
> would give basic statistics quickly (2.8 minutes), and full statistics
> in an hour, for Kevin's database.
> 
>       PGOPTIONS='-c default_statistics_target=1' vacuumdb --all --analyze-only
>       PGOPTIONS='-c default_statistics_target=10' vacuumdb --all 
> --analyze-only
>       vacuumdb --all --analyze-only
> 
> The only problem I see is that users who use non-default statistics per
> table would not be affected by the increasing default_statistics_target
> values.
> 
> The upside is this would work for all releases of Postgres.

OK, I have modified pg_upgrade with the attached patch to do exactly
this.  I have also attached the script pg_upgrade creates that should be
run instead of vacuumdb.

Based on Kevin's numbers above, the first vacuumdb will be done in 2.8
minutes (1 target), the 10 target vacuumdb done after 15 minutes, and
the 100 target vacuumdb done after 72 minutes (times accumulate).  Here
is what the output looks like:

        Generating minimal optimizer statistics (1 target)
        --------------------------------------------------
        vacuumdb: vacuuming database "postgres"
        vacuumdb: vacuuming database "template1"
        vacuumdb: vacuuming database "test"
        
        The server is now available with minimal optimizer statistics.
        Query performance will be optimal once this script completes.
        
        Generating medium optimizer statistics (10 targets)
        ---------------------------------------------------
        vacuumdb: vacuuming database "postgres"
        vacuumdb: vacuuming database "template1"
        vacuumdb: vacuuming database "test"
        
        Generating default (full) optimizer statistics (100 targets?)
        -------------------------------------------------------------
        vacuumdb: vacuuming database "postgres"
        vacuumdb: vacuuming database "template1"
        vacuumdb: vacuuming database "test"

        Done

If we don't want to try migrating the statistics from the old system,
this seems like the best approach.

Does anyone know how bad the queries will be with only one target?

I did see if vacuumdb --analyze-only was somehow being throttled by the
vacuum settings, but saw the drive at 100% utilization analying a 36GB
table on a 24GB RAM server, so it seems I/O bound.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index cf43384..4481de0
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*************** static void check_for_reg_data_type_usag
*** 23,28 ****
--- 23,33 ----
  static void check_for_support_lib(ClusterInfo *cluster);
  static void get_bin_version(ClusterInfo *cluster);
  
+ #ifndef WIN32
+ #define	ECHO_QUOTE	"'"
+ #else
+ #define	ECHO_QUOTE	""
+ #endif
  
  void
  output_check_banner(bool *live_check)
*************** issue_warnings(char *sequence_script_fil
*** 193,213 ****
  
  
  void
! output_completion_banner(char *deletion_script_file_name)
  {
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
  		pg_log(PG_REPORT,
! 			   "Optimizer statistics are not transferred by pg_upgrade so\n"
! 			   "consider running:\n"
! 			   "    vacuumdb --all --analyze-only\n"
! 			   "on the newly-upgraded cluster.\n\n");
  	else
  		pg_log(PG_REPORT,
  			   "Optimizer statistics and free space information are not transferred\n"
! 			   "by pg_upgrade so consider running:\n"
! 			   "    vacuumdb --all --analyze\n"
! 			   "on the newly-upgraded cluster.\n\n");
  
  	pg_log(PG_REPORT,
  		   "Running this script will delete the old cluster's data files:\n"
--- 198,217 ----
  
  
  void
! output_completion_banner(char *analyze_script_file_name,
! 						 char *deletion_script_file_name)
  {
  	/* Did we copy the free space files? */
  	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
  		pg_log(PG_REPORT,
! 			   "Optimizer statistics are not transferred by pg_upgrade so,\n"
! 			   "once you start the new server, consider running:\n"
! 			   "    %s\n\n", analyze_script_file_name);
  	else
  		pg_log(PG_REPORT,
  			   "Optimizer statistics and free space information are not transferred\n"
! 			   "by pg_upgrade so, once you start the new server, consider running:\n"
! 			   "    %s\n\n", analyze_script_file_name);
  
  	pg_log(PG_REPORT,
  		   "Running this script will delete the old cluster's data files:\n"
*************** check_new_cluster_is_empty(void)
*** 379,384 ****
--- 383,477 ----
  }
  
  
+ /*
+  * create_script_for_cluster_analyze()
+  *
+  *	This incrementally generates better optimizer statistics
+  */
+ void
+ create_script_for_cluster_analyze(char **analyze_script_file_name)
+ {
+ 	FILE	   *script = NULL;
+ 
+ 	*analyze_script_file_name = pg_malloc(MAXPGPATH);
+ 
+ 	prep_status("Creating script to analyze new cluster");
+ 
+ 	snprintf(*analyze_script_file_name, MAXPGPATH, "analyze_new_cluster.%s",
+ 			 SCRIPT_EXT);
+ 
+ 	if ((script = fopen_priv(*analyze_script_file_name, "w")) == NULL)
+ 		pg_log(PG_FATAL, "Could not open file \"%s\": %s\n",
+ 			   *analyze_script_file_name, getErrorText(errno));
+ 
+ #ifndef WIN32
+ 	/* add shebang header */
+ 	fprintf(script, "#!/bin/sh\n\n");
+ 
+ 	fprintf(script, "PGOPTIONS='-c default_statistics_target=1'\n");
+ 	/* only need to export once */
+ 	fprintf(script, "export PGOPTIONS\n");
+ #else
+ 	fprintf(script, "SET PGOPTIONS=-c default_statistics_target=1\n");
+ #endif
+ 
+ 	fprintf(script, "echo %sGenerating minimal optimizer statistics (1 target)%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "echo %s--------------------------------------------------%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "vacuumdb --all --analyze-only\n");
+ 	fprintf(script, "echo\n");
+ 	fprintf(script, "echo %sThe server is now available with minimal optimizer statistics.%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "echo %sQuery performance will be optimal once this script completes.%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "echo\n\n");
+ 
+ #ifndef WIN32
+ 	fprintf(script, "PGOPTIONS='-c default_statistics_target=10'\n");
+ #else
+ 	fprintf(script, "SET PGOPTIONS=-c default_statistics_target=10\n");
+ #endif
+ 
+ 	fprintf(script, "echo %sGenerating medium optimizer statistics (10 targets)%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "echo %s---------------------------------------------------%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "vacuumdb --all --analyze-only\n");
+ 	fprintf(script, "echo\n\n");
+ 
+ #ifndef WIN32
+ 	fprintf(script, "unset PGOPTIONS\n");
+ #else
+ 	fprintf(script, "SET PGOPTIONS\n");
+ #endif
+ 
+ 	fprintf(script, "echo %sGenerating default (full) optimizer statistics (100 targets?)%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	fprintf(script, "echo %s-------------------------------------------------------------%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 	/* Did we copy the free space files? */
+ 	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 804)
+ 		fprintf(script, "vacuumdb --all --analyze-only\n");
+ 	else
+ 		fprintf(script, "vacuumdb --all --analyze\n");
+ 
+ 	fprintf(script, "echo\n\n");
+ 	fprintf(script, "echo %sDone%s\n",
+ 					ECHO_QUOTE, ECHO_QUOTE);
+ 
+ 	fclose(script);
+ 
+ #ifndef WIN32
+ 	if (chmod(*analyze_script_file_name, S_IRWXU) != 0)
+ 		pg_log(PG_FATAL, "Could not add execute permission to file \"%s\": %s\n",
+ 			   *analyze_script_file_name, getErrorText(errno));
+ #endif
+ 
+ 	check_ok();
+ }
+ 
+ 
  /*
   * create_script_for_old_cluster_deletion()
   *
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
new file mode 100644
index b0720c3..fa48ef1
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*************** int
*** 67,72 ****
--- 67,73 ----
  main(int argc, char **argv)
  {
  	char	   *sequence_script_file_name = NULL;
+ 	char	   *analyze_script_file_name = NULL;
  	char	   *deletion_script_file_name = NULL;
  	bool		live_check = false;
  
*************** main(int argc, char **argv)
*** 142,147 ****
--- 143,149 ----
  			  new_cluster.pgdata, UTILITY_LOG_FILE);
  	check_ok();
  
+ 	create_script_for_cluster_analyze(&analyze_script_file_name);
  	create_script_for_old_cluster_deletion(&deletion_script_file_name);
  
  	issue_warnings(sequence_script_file_name);
*************** main(int argc, char **argv)
*** 149,156 ****
  	pg_log(PG_REPORT, "\nUpgrade Complete\n");
  	pg_log(PG_REPORT, "----------------\n");
  
! 	output_completion_banner(deletion_script_file_name);
  
  	pg_free(deletion_script_file_name);
  	pg_free(sequence_script_file_name);
  
--- 151,160 ----
  	pg_log(PG_REPORT, "\nUpgrade Complete\n");
  	pg_log(PG_REPORT, "----------------\n");
  
! 	output_completion_banner(analyze_script_file_name,
! 							 deletion_script_file_name);
  
+ 	pg_free(analyze_script_file_name);
  	pg_free(deletion_script_file_name);
  	pg_free(sequence_script_file_name);
  
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index 46f9169..da37249
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
*************** void check_old_cluster(bool live_check,
*** 283,292 ****
  void		check_new_cluster(void);
  void		report_clusters_compatible(void);
  void		issue_warnings(char *sequence_script_file_name);
! void		output_completion_banner(char *deletion_script_file_name);
  void		check_cluster_versions(void);
  void		check_cluster_compatibility(bool live_check);
  void		create_script_for_old_cluster_deletion(char **deletion_script_file_name);
  
  
  /* controldata.c */
--- 283,294 ----
  void		check_new_cluster(void);
  void		report_clusters_compatible(void);
  void		issue_warnings(char *sequence_script_file_name);
! void		output_completion_banner(char *analyze_script_file_name,
! 									 char *deletion_script_file_name);
  void		check_cluster_versions(void);
  void		check_cluster_compatibility(bool live_check);
  void		create_script_for_old_cluster_deletion(char **deletion_script_file_name);
+ void		create_script_for_cluster_analyze(char **analyze_script_file_name);
  
  
  /* controldata.c */

Attachment: analyze_new_cluster.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to