On Tue, Apr 22, 2025 at 11:03:29PM +0200, Christoph Berg wrote:
> Re: Nathan Bossart
>> In any case, IMO it's unfortunate
>> that we might end up recommending roughly the same post-upgrade steps as
>> before even though the optimizer statistics are carried over.
>
> Maybe the docs (and the pg_upgrade scripts) should recommend the old
> procedure by default until this gap is closed? People could then still
> opt to use the new procedure in specific cases.

I think we'd still want to modify the --analyze-in-stages recommendation
(from what is currently recommended for supported versions). If we don't,
you'll wipe out the optimizer stats you brought over from the old version.
Here is a rough draft of what I am thinking.
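As a rough illustration (not part of the patch), the two-step sequence could be sketched as follows. The vacuumdb flags are the ones used in the draft below; the function name, the default bindir, and the job count are hypothetical placeholders. The function only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# Sketch only: prints the two post-upgrade commands from the draft patch.
# The bindir and job count are hypothetical placeholders.
post_upgrade_cmds() {
    bindir=${1:-/usr/local/pgsql/bin}   # assumed location of the new cluster's binaries
    jobs=${2:-4}                        # arbitrary parallelism for --jobs

    # Step 1: build minimal optimizer statistics, but only for relations
    # that have none, so statistics carried over by pg_upgrade are kept.
    printf '%s/vacuumdb --all --analyze-in-stages --missing-stats-only --jobs=%s\n' "$bindir" "$jobs"

    # Step 2: analyze everything once so the cumulative statistics used
    # for triggering vacuum and analyze are updated.
    printf '%s/vacuumdb --all --analyze-only --jobs=%s\n' "$bindir" "$jobs"
}

post_upgrade_cmds "$@"
```

Piping the output to sh would run the commands against the new cluster, with connection parameters set as needed.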
--
nathan

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index df13365b287..648c6e2967c 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -833,17 +833,19 @@ psql --username=postgres --file=script.sql postgres
<para>
Because not all statistics are not transferred by
- <command>pg_upgrade</command>, you will be instructed to run a command to
+ <command>pg_upgrade</command>, you will be instructed to run commands to
regenerate that information at the end of the upgrade. You might need to
set connection parameters to match your new cluster.
</para>
<para>
- Using <command>vacuumdb --all --analyze-only --missing-stats-only</command>
- can efficiently generate such statistics. Alternatively,
+ First, use
<command>vacuumdb --all --analyze-in-stages --missing-stats-only</command>
- can be used to generate minimal statistics quickly. For either command,
- the use of <option>--jobs</option> can speed it up.
+ to quickly generate minimal optimizer statistics for relations without
+ any. Then, use <command>vacuumdb --all --analyze-only</command> to ensure
+ all relations have updated cumulative statistics for triggering vacuum and
+ analyze. For both commands, the use of <option>--jobs</option> can speed
+ it up.
If <varname>vacuum_cost_delay</varname> is set to a non-zero
value, this can be overridden to speed up statistics generation
using <envar>PGOPTIONS</envar>, e.g., <literal>PGOPTIONS='-c
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 18c2d652bb6..f1b90c5957e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -814,9 +814,12 @@ output_completion_banner(char *deletion_script_file_name)
}
pg_log(PG_REPORT,
- "Some optimizer statistics may not have been transferred by pg_upgrade.\n"
+ "Some statistics are not transferred by pg_upgrade.\n"
"Once you start the new server, consider running:\n"
- " %s/vacuumdb %s--all --analyze-in-stages --missing-stats-only", new_cluster.bindir, user_specification.data);
+ " %s/vacuumdb %s--all --analyze-in-stages --missing-stats-only\n"
+ " %s/vacuumdb %s--all --analyze-only",
+ new_cluster.bindir, user_specification.data,
+ new_cluster.bindir, user_specification.data);
if (deletion_script_file_name)
pg_log(PG_REPORT,