On 2020/03/11 3:39, Magnus Hagander wrote:
On Tue, Mar 10, 2020 at 6:19 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:



On 2020/03/10 22:43, Amit Langote wrote:
On Tue, Mar 10, 2020 at 6:09 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
So, I will make the patch adding support for --no-estimate-size option
in pg_basebackup.

Patch attached.

Like the idea and the patch looks mostly good.

Thanks for reviewing the patch!

+      total size. If the estimation is disabled in
+      <application>pg_basebackup</application>
+      (i.e., <literal>--no-estimate-size</literal> option is specified),
+      this is always <literal>0</literal>.

"always" seems unnecessary.

Fixed.

+        This option prevents the server from estimating the total
+        amount of backup data that will be streamed. In other words,
+        <literal>backup_total</literal> column in the
+        <structname>pg_stat_progress_basebackup</structname>
+        view always indicates <literal>0</literal> if this option is enabled.

Here too.

Fixed.

Attached is the updated version of the patch.

Would it perhaps be better to return NULL instead of 0 in the
statistics view if there is no data?

Also, should it really  be the server version that decides how this
feature behaves, and not the pg_basebackup version? Given that the
implementation is entirely in the client, it seems that's more
logical?

Yeah, you're right. I changed the patch that way.
Attached is the updated version of the patch.
and a few docs nitpicks:

         <para>
          Whether this is enabled or not, the
          <structname>pg_stat_progress_basebackup</structname> view
-        report the progress of the backup in the server side. But note
-        that the total amount of data that will be streamed is estimated
-        and reported only when this option is enabled. In other words,
-        <literal>backup_total</literal> column in the view always
-        indicates <literal>0</literal> if this option is disabled.
+        report the progress of the backup in the server side.
+       </para>
+       <para>
+        This option is not allowed when using
+        <option>--no-estimate-size</option>.
         </para>

I think you should just remove that whole paragraph. The details are
now listed under the disable parameter.

Fixed.

+        This option prevents the server from estimating the total
+        amount of backup data that will be streamed. In other words,
+        <literal>backup_total</literal> column in the
+        <structname>pg_stat_progress_basebackup</structname>
+        view indicates <literal>0</literal> if this option is enabled.

I suggest just "This option prevents the server from estimating the
total amount of backup data that will be streamed, resulting in the
ackup_total column in pg_stat_progress_basebackup to be (zero or NULL
depending on above)".

(Markup needed on that of course ,but you get the idea)

Yes, fixed.

+        When this is disabled, the backup will start by enumerating

I'd try to avoid the double negation, with something "without this
parameter, the backup will start..."

Fixed. I used "Without using this option ...".
+  <para>
+   <application>pg_basebackup</application> asks the server to estimate
+   the total amount of data that will be streamed by default (unless
+   <option>--no-estimate-size</option> is specified) in version 13 or later,
+   and does that only when <option>--progress</option> is specified in
+   the older versions.
+  </para>

That's an item for the release notes, not for the reference page, I
think. It's already explained under the --disable parameter, so I
suggest removing this paragraph as well.

Fixed.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 987580d6df..2c9f51daf4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4388,10 +4388,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
      <entry><structfield>backup_total</structfield></entry>
      <entry><type>bigint</type></entry>
      <entry>
-      Total amount of data that will be streamed. If progress reporting
-      is not enabled in <application>pg_basebackup</application>
-      (i.e., <literal>--progress</literal> option is not specified),
-      this is <literal>0</literal>. Otherwise, this is estimated and
+      Total amount of data that will be streamed. This is estimated and
       reported as of the beginning of
       <literal>streaming database files</literal> phase. Note that
       this is only an approximation since the database
@@ -4399,7 +4396,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       and WAL log may be included in the backup later. This is always
       the same value as <structfield>backup_streamed</structfield>
       once the amount of data streamed exceeds the estimated
-      total size.
+      total size. If the estimation is disabled in
+      <application>pg_basebackup</application>
+      (i.e., <literal>--no-estimate-size</literal> option is specified),
+      this is <literal>0</literal>.
      </entry>
     </row>
     <row>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml 
b/doc/src/sgml/ref/pg_basebackup.sgml
index 29bf2f9b97..90638aad0e 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -460,21 +460,6 @@ PostgreSQL documentation
         in this case the estimated target size will increase once it passes the
         total estimate without WAL.
        </para>
-       <para>
-        When this is enabled, the backup will start by enumerating the size of
-        the entire database, and then go back and send the actual contents.
-        This may make the backup take slightly longer, and in particular it
-        will take longer before the first data is sent.
-       </para>
-       <para>
-        Whether this is enabled or not, the
-        <structname>pg_stat_progress_basebackup</structname> view
-        report the progress of the backup in the server side. But note
-        that the total amount of data that will be streamed is estimated
-        and reported only when this option is enabled. In other words,
-        <literal>backup_total</literal> column in the view always
-        indicates <literal>0</literal> if this option is disabled.
-       </para>
       </listitem>
      </varlistentry>
 
@@ -552,6 +537,30 @@ PostgreSQL documentation
        </para>
       </listitem>
      </varlistentry>
+
+     <varlistentry>
+      <term><option>--no-estimate-size</option></term>
+      <listitem>
+       <para>
+        This option prevents the server from estimating the total
+        amount of backup data that will be streamed, resulting in the
+        <literal>backup_total</literal> column in the
+        <structname>pg_stat_progress_basebackup</structname>
+        to be <literal>0</literal>.
+       </para>
+       <para>
+        Without this option, the backup will start by enumerating
+        the size of the entire database, and then go back and send
+        the actual contents. This may make the backup take slightly
+        longer, and in particular it will take longer before the first
+        data is sent. This option is useful to avoid such estimation
+        time if it's too long.
+       </para>
+       <para>
+        This option is not allowed when using <option>--progress</option>.
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
 
diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 48bd838803..c5d95958b2 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -121,6 +121,7 @@ static char *label = "pg_basebackup base backup";
 static bool noclean = false;
 static bool checksum_failure = false;
 static bool showprogress = false;
+static bool estimatesize = true;
 static int     verbose = 0;
 static int     compresslevel = 0;
 static IncludeWal includewal = STREAM_WAL;
@@ -386,6 +387,7 @@ usage(void)
        printf(_("      --no-slot          prevent creation of temporary 
replication slot\n"));
        printf(_("      --no-verify-checksums\n"
                         "                         do not verify checksums\n"));
+       printf(_("      --no-estimate-size do not estimate backup size in 
server side\n"));
        printf(_("  -?, --help             show this help, then exit\n"));
        printf(_("\nConnection options:\n"));
        printf(_("  -d, --dbname=CONNSTR   connection string\n"));
@@ -1741,7 +1743,7 @@ BaseBackup(void)
        basebkp =
                psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s",
                                 escaped_label,
-                                showprogress ? "PROGRESS" : "",
+                                estimatesize ? "PROGRESS" : "",
                                 includewal == FETCH_WAL ? "WAL" : "",
                                 fastcheckpoint ? "FAST" : "",
                                 includewal == NO_WAL ? "" : "NOWAIT",
@@ -2066,6 +2068,7 @@ main(int argc, char **argv)
                {"waldir", required_argument, NULL, 1},
                {"no-slot", no_argument, NULL, 2},
                {"no-verify-checksums", no_argument, NULL, 3},
+               {"no-estimate-size", no_argument, NULL, 4},
                {NULL, 0, NULL, 0}
        };
        int                     c;
@@ -2234,6 +2237,9 @@ main(int argc, char **argv)
                        case 3:
                                verify_checksums = false;
                                break;
+                       case 4:
+                               estimatesize = false;
+                               break;
                        default:
 
                                /*
@@ -2356,6 +2362,14 @@ main(int argc, char **argv)
        }
 #endif
 
+       if (showprogress && !estimatesize)
+       {
+               pg_log_error("--progress and --no-estimate-size are 
incompatible options");
+               fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                               progname);
+               exit(1);
+       }
+
        /* connection in replication mode to server */
        conn = GetConnection();
        if (!conn)

Reply via email to