typedef struct LogicalDecodingContext
Hi, During a recent code review, I noticed a lot of 'struct LogicalDecodingContext' usage. There are many function prototypes where the params are (for no apparent reason to me) a mixture of structs and typedef structs. AFAICT just by pre-declaring the typedef struct LogicalDecodingContext, all of those 'struct LogicalDecodingContext' can be culled, resulting in cleaner and more consistent function signatures. The PG Docs were similarly modified. PSA patch for this. It passes make check-world. (I recognize this is potentially the tip of an iceberg. If this patch is deemed OK, I can hunt down similar underuse of typedefs for other structs) Thoughts? -- Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Use-typedef-struct-LogicalDecodingContext.patch Description: Binary data
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are some review comments for v9-0001, but these are only very trivial.

== Commit Message

1. Nitpick. The new text is jagged-looking. It should wrap at ~80 chars.

~~~

2.

2. Another reason is for that parallel streaming, the transaction will be opened immediately by the parallel apply worker. Therefore, if the walsender is delayed in sending the final record of the transaction, the parallel apply worker must wait to receive it with an open transaction. This would result in the locks acquired during the transaction not being released until the min_send_delay has elapsed.

~

The text already said there are "two reasons", and already this is numbered as reason 2. So it doesn't need to keep saying "Another reason" here.

"Another reason is for that parallel streaming" --> "For parallel streaming..."

== src/backend/replication/walsender.c

3. WalSndDelay

+ /* die if timeout was reached */
+ WalSndCheckTimeOut();

Other nearby comments start uppercase, so this should too.

== src/include/replication/walreceiver.h

4. WalRcvStreamOptions

@@ -187,6 +187,7 @@ typedef struct
 * prepare time */
 char *origin; /* Only publish data originating from the
 * specified origin */
+ int32 min_send_delay; /* The minimum send delay */
 } logical;
 } proto;
} WalRcvStreamOptions;

~

Should that comment mention the units are "(ms)"?

-- Kind Regards, Peter Smith. Fujitsu Australia
Re: Rework LogicalOutputPluginWriterUpdateProgress
(!ctx->accept_writes)
- elog(ERROR, "writes are only accepted in commit, begin and change callbacks");
+ elog(ERROR, "writes are only accepted in callbacks in the OutputPluginCallbacks structure (except startup, shutdown, filter_by_origin and filter_prepare callbacks)");

This error message seems confusing. Can it be worded better?

Also, I noticed this flag is never used except in this one place where it throws an error, so would an "Assert" be more appropriate here?

~~~

8. rollback_prepared_cb_wrapper

/*
 * If the plugin support two-phase commits then rollback prepared callback
 * is mandatory
+ *
+ * FIXME: This should have been caught much earlier.
 */
if (ctx->callbacks.rollback_prepared_cb == NULL)

~

Is this FIXME related to the current patch, or should this be an entirely different topic?

~~~

9. is_skip_threshold_change

The current usage for this function is like:

if (is_skip_threshold_change(ctx))
+ update_progress_and_keepalive(ctx, false);

~

IMO a better name for this function might be like 'is_change_threshold_exceeded()' (or 'is_keepalive_threshold_exceeded()' etc) because it seems more readable to say

if (is_change_threshold_exceeded())
    do_something();

~~~

10. is_skip_threshold_change

static bool
is_skip_threshold_change(struct LogicalDecodingContext *ctx)
{
    static int changes_count = 0; /* used to accumulate the number of
                                   * changes */

    /* If the change was published, reset the counter and return false */
    if (ctx->did_write)
    {
        changes_count = 0;
        return false;
    }

    /*
     * It is possible that the data is not sent to downstream for a long time
     * either because the output plugin filtered it or there is a DDL that
     * generates a lot of data that is not processed by the plugin. So, in
     * such cases, the downstream can timeout. To avoid that we try to send a
     * keepalive message if required. Trying to send a keepalive message
     * after every change has some overhead, but testing showed there is no
     * noticeable overhead if we do it after every ~100 changes.
     */
#define CHANGES_THRESHOLD 100

    if (!ctx->did_write && ++changes_count >= CHANGES_THRESHOLD)
    {
        changes_count = 0;
        return true;
    }

    return false;
}

~

That 2nd condition checking

if (!ctx->did_write && ++changes_count >= CHANGES_THRESHOLD)

does not seem right. There is no need to check the ctx->did_write; it must be false because it was checked earlier in the function:

BEFORE
if (!ctx->did_write && ++changes_count >= CHANGES_THRESHOLD)

SUGGESTION1
Assert(!ctx->did_write);
if (++changes_count >= CHANGES_THRESHOLD)

SUGGESTION2
if (++changes_count >= CHANGES_THRESHOLD)

~~~

11. update_progress_and_keepalive

/*
 * Update progress tracking and send keep alive (if required).
 */
static void
update_progress_and_keepalive(struct LogicalDecodingContext *ctx,
                              bool finished_xact)
{
    if (!ctx->update_progress_and_keepalive)
        return;

    ctx->update_progress_and_keepalive(ctx, ctx->write_location,
                                       ctx->write_xid, ctx->did_write,
                                       finished_xact);
}

~

Maybe it's simpler to code this without the return. e.g.

if (ctx->update_progress_and_keepalive)
{
    ctx->update_progress_and_keepalive(ctx, ctx->write_location,
                                       ctx->write_xid, ctx->did_write,
                                       finished_xact);
}

(it is just generic suggested code for example -- I made some other review comments overlapping this)

== .../replication/logical/reorderbuffer.c

12. ReorderBufferAbort

+ UpdateDecodingProgressAndKeepalive((LogicalDecodingContext *)rb->private_data,
+xid, lsn, !TransactionIdIsValid(txn->toplevel_xid));
+

I didn't really recognise how the "!TransactionIdIsValid(txn->toplevel_xid)" maps to the boolean 'finished_xact' param. Can this call have an explanatory comment about how it works?

== src/backend/replication/walsender.c

13.
WalSndUpdateProgressAndKeepalive

- if (pending_writes || (!end_xact &&
+ if (pending_writes || (!finished_xact &&
 wal_sender_timeout > 0 &&
 now >= TimestampTzPlusMilliseconds(last_reply_timestamp,
 wal_sender_timeout / 2)))
- ProcessPendingWrites();
+ WalSndSendPending();

Is this new function name OK to be WalSndSendPending? From this code, we can see it can also be called in other scenarios even when there is nothing "pending" at all.

-- Kind Regards, Peter Smith. Fujitsu Australia
Re: PGDOCS - sgml linkend using single-quotes
On Mon, Feb 27, 2023 at 7:04 PM Heikki Linnakangas wrote: > ... > > There were also a few "id" attributes using single-quotes. Fixed those > too, and pushed. Thanks! > Thank you for pushing. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Allow logical replication to copy tables in binary format
d to enhance the "binary" mode so it can be made to fall back to text mode if it needs to in the same way that binary replication does. If such an enhanced COPY format mode worked, then most of the patch is redundant - there is no need for any new option - tablesync COPY could then *always* use this "binary-with-fallback" mode

--

[1] Melih initially wanted a unified binary mode - https://www.postgresql.org/message-id/CAGPVpCQYi9AYQSS%3DRmGgVNjz5ZEnLB8mACwd9aioVhLmbgiAMA%40mail.gmail.com
[2] Bharath doesn't like the binary/copy_format inter-dependency - https://www.postgresql.org/message-id/CALj2ACVPt-BaLMm3Ezy1-rfUzH9qStxePcyGrHPamPESEZSBFA%40mail.gmail.com
[3] Shi-san found binary mode has the ability to fall back to text sometimes - https://www.postgresql.org/message-id/OSZPR01MB6310B58F069FF8E148B247FDFDA49%40OSZPR01MB6310.jpnprd01.prod.outlook.com
[4] Jelte's idea to enhance the copy_data option - https://www.postgresql.org/message-id/CAGECzQS393xiME%2BEySLU7ceO4xOB81kPjPqNBjaeW3sLgfLhNw%40mail.gmail.com
[5] Kuroda-san etc. expecting the copy_data=false/copy_format=binary combination is not allowed - https://www.postgresql.org/message-id/TYAPR01MB5866DDF02B3CEE59DA024CC3F5AB9%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[6] Jelte says it is the operator's responsibility to use the correct options - https://www.postgresql.org/message-id/CAGECzQSStdb%2Bx1BxzCktZd1uSjvd6eYq1wcHV3vPCykrGqxYKQ%40mail.gmail.com

Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
The patch v19 LGTM. - v19 applies cleanly for me - Full clean build OK - HTML docs build and render OK - The 'make check' tests all pass for me - Also cfbot reports latest patch has no errors -- http://cfbot.cputube.org/ So, I marked it a "Ready for Committer" in the CF -- https://commitfest.postgresql.org/42/4162/ -- Kind Regards, Peter Smith. Fujitsu Australia
Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes
On Thu, Feb 23, 2023 at 11:28 AM Michael Paquier wrote: > > On Wed, Feb 22, 2023 at 10:21:51AM +, shiy.f...@fujitsu.com wrote: > > Thanks for your reply. Using two flags makes sense to me. > > Attach the updated patch. > > Fine by me as far as it goes. Any thoughts from others? > -- Should the 'relation_callback_registered' variable name be plural? Otherwise, LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
Here are my review comments for patch v17-0001. == src/test/regress/sql/xml.sql The blank line(s) which previously separated the xmlserialize tests from the xml IS [NOT] DOCUMENT tests are now missing... e.g. -- indent different encoding (returns UTF-8) SELECT xmlserialize(DOCUMENT '' AS text INDENT); SELECT xmlserialize(CONTENT '' AS text INDENT); -- 'no indent' = not using 'no indent' SELECT xmlserialize(DOCUMENT '42' AS text) = xmlserialize(DOCUMENT '42' AS text NO INDENT); SELECT xmlserialize(CONTENT '42' AS text) = xmlserialize(CONTENT '42' AS text NO INDENT); SELECT xml 'bar' IS DOCUMENT; SELECT xml 'barfoo' IS DOCUMENT; SELECT xml '' IS NOT DOCUMENT; SELECT xml 'abc' IS NOT DOCUMENT; SELECT '<>' IS NOT DOCUMENT; ~~ Apart from that, patch v17 LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
Patch v6 LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
Here are some review comments for patch v16-0001. == > src/backend/executor/execExprInterp.c > > 2. ExecEvalXmlExpr > > @@ -3829,7 +3829,8 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op) >{ >Datum*argvalue = op->d.xmlexpr.argvalue; >bool*argnull = op->d.xmlexpr.argnull; > - > + boolindent = op->d.xmlexpr.xexpr->indent; > + text*data; >/* argument type is known to be xml */ >Assert(list_length(xexpr->args) == 1); > Missing whitespace after the variable declarations Whitespace added. ~ Oh, I meant something different to that fix. I meant there is a missing blank line after the last ('data') variable declaration. == Test code. I wondered if there ought to be a test that demonstrates explicitly saying NO INDENT will give the identical result to just omitting it. For example: test=# -- no indent is default test=# SELECT xmlserialize(DOCUMENT '42' AS text) = xmlserialize(DOCUMENT '42' AS text NO INDENT); ?column? -- t (1 row) test=# SELECT xmlserialize(CONTENT '42' AS text) = xmlserialize(CONTENT '42' AS text NO INDENT); ?column? -- t (1 row) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
Here are some review comments for patch v15-0001 FYI, the patch applies clean and tests OK for me. == doc/src/sgml/datatype.sgml 1. XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type [ { NO INDENT | INDENT } ] ) ~ Another/shorter way to write that syntax is like below. For me, it is easier to read. YMMV. XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type [ [NO] INDENT ] ) == src/backend/executor/execExprInterp.c 2. ExecEvalXmlExpr @@ -3829,7 +3829,8 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op) { Datum*argvalue = op->d.xmlexpr.argvalue; bool*argnull = op->d.xmlexpr.argnull; - + boolindent = op->d.xmlexpr.xexpr->indent; + text*data; /* argument type is known to be xml */ Assert(list_length(xexpr->args) == 1); Missing whitespace after the variable declarations ~~~ 3. + + data = xmltotext_with_xmloption(DatumGetXmlP(value), + xexpr->xmloption); + if(indent) + *op->resvalue = PointerGetDatum(xmlformat(data)); + else + *op->resvalue = PointerGetDatum(data); + } Unnecessary blank line at the end. == src/backend/utils/adt/xml. 4. xmlformat +#else + NO_XML_SUPPORT(); +return 0; +#endif Wrong indentation (return 0) in the indentation function? ;-) ------ Kind Regards, Peter Smith. Fujitsu Australia
Re: "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes
On Wed, Feb 22, 2023 at 12:03 PM Kyotaro Horiguchi wrote: > > At Tue, 21 Feb 2023 10:31:29 +, "shiy.f...@fujitsu.com" > wrote in > > Thanks for your reply. I agree that's expensive. Attach a new patch which > > adds a > > static boolean to avoid duplicate registration. > > Thank you for the patch. It is exactly what I had in my mind. But now > that I've had a chance to mull it over, I came to think it might be > better to register the callbacks at one place. I'm thinking we could > create a new function called register_callbacks() or something and > move all the calls to CacheRegisterSyscacheCallback() into that. What > do you think about that refactoring? > > I guess you could say that that refactoring somewhat weakens the > connection or dependency between init_rel_sync_cache and > rel_sync_cache_relation_cb, but anyway the callback works even if > RelationSyncCache is not around. > If you are going to do that, then won't just copying the CacheRegisterSyscacheCallback(PUBLICATIONOID... into function init_rel_sync_cache() be effectively the same as doing that? Then almost nothing else to do...e.g. no need for a new extra static boolean if static RelationSyncCache is acting as the one-time guard anyway. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are some very minor review comments for the patch v4-0001 == Commit Message 1. The other possibility was to apply the delay at the end of the parallel apply transaction but that would cause issues related to resource bloat and locks being held for a long time. ~ The reply [1] for review comment #2 says that this was "slightly reworded", but AFAICT nothing is changed here. ~~~ 2. Eariler versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato Typo: "Eariler" == doc/src/sgml/ref/create_subscription.sgml 3. + + By default, the publisher sends changes as soon as possible. This + parameter allows the user to delay changes by given time period. If + the value is specified without units, it is taken as milliseconds. + The default is zero (no delay). See + for details on the available valid time units. + "by given time period" --> "by the given time period" == src/backend/replication/pgoutput/pgoutput.c 4. parse_output_parameters + else if (strcmp(defel->defname, "min_send_delay") == 0) + { + unsigned long parsed; + char*endptr; I think 'parsed' is a fairly meaningless variable name. How about calling this variable something useful like 'delay_val' or 'min_send_delay_value', or something like those? Yes, I recognize that you copied this from some existing code fragment, but IMO that doesn't make it good. == src/backend/replication/walsender.c 5. + /* Sleep until we get reply from worker or we time out */ + WalSndWait(WL_SOCKET_READABLE, +Min(timeout_sleeptime_ms, remaining_wait_time_ms), +WAIT_EVENT_WALSENDER_SEND_DELAY); In my previous review [2] comment #14, I questioned if this comment was correct. It looks like that was accidentally missed. == src/include/replication/logical.h 6. + /* + * The minimum delay, in milliseconds, by the publisher before sending + * COMMIT/PREPARE record + */ + int32 min_send_delay; The comment is missing a period. -- [1] Kuroda-san replied to my review v3-0001. 
https://www.postgresql.org/message-id/TYAPR01MB5866C6BCA4D9386D9C486033F5A59%40TYAPR01MB5866.jpnprd01.prod.outlook.com [2] My previous review v3-0001. https://www.postgresql.org/message-id/CAHut%2BPu6Y%2BBkYKg6MYGi2zGnx6imeK4QzxBVhpQoPMeDr1npnQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: [Proposal] Add foreign-server health checks infrastructure
Here are some review comments for v32-0001. == Commit message 1. PQconnCheck() function allows to check the status of the socket by polling the socket. This function is currently available only on systems that support the non-standard POLLRDHUP extension to the poll system call, including Linux. ~ (missed fix from previous review) "status of the socket" --> "status of the connection" doc/src/sgml/libpq.sgml 2. PQconnCheck + + This function check the health of the connection. Unlike , + this check is performed by polling the corresponding socket. This + function is currently available only on systems that support the + non-standard POLLRDHUP extension to the poll + system call, including Linux. + returns greater than zero if the remote peer seems to be closed, returns + 0 if the socket is valid, and returns -1 + if the connection has already been closed or an error has occurred. + "check the health" --> "checks the health" ~~~ 3. PQcanConnCheck + + Returns true (1) or false (0) to indicate if the PQconnCheck function + is supported on this platform. Should the reference to PQconnCheck be a link as it previously was? == src/interfaces/libpq/fe-misc.c 4. PQconnCheck +/* + * Check whether the socket peer closed connection or not. + * + * Returns >0 if remote peer seems to be closed, 0 if it is valid, + * -1 if the input connection is bad or an error occurred. + */ +int +PQconnCheck(PGconn *conn) +{ + return pqSocketCheck(conn, 0, 0, (time_t) 0); +} I'm confused. This comment says =0 means connection is valid. But the pqSocketCheck comment says =0 means it timed out. So those two function comments don't seem compatible ~~~ 5. PQconnCheckable +/* + * Check whether PQconnCheck() works well on this platform. + * + * Returns true (1) if this can use PQconnCheck(), otherwise false (0). + */ +int +PQconnCheckable(void) +{ +#if (defined(HAVE_POLL) && defined(POLLRDHUP)) + return true; +#else + return false; +#endif +} Why say "works well"? 
IMO it either works or doesn't work – there is no "well". SUGGESTION1 Check whether PQconnCheck() works on this platform. SUGGESTION2 Check whether PQconnCheck() can work on this platform. ~~~ 6. pqSocketCheck /* * Checks a socket, using poll or select, for data to be read, written, - * or both. Returns >0 if one or more conditions are met, 0 if it timed + * or both. Moreover, when neither forRead nor forWrite is requested and + * timeout is disabled, try to check the health of socket. + * + * Returns >0 if one or more conditions are met, 0 if it timed * out, -1 if an error occurred. * * If SSL is in use, the SSL buffer is checked prior to checking the socket ~ See review comment #4. (e.g. This says =0 if it timed out). ~~~ 7. pqSocketPoll + * When neither forRead nor forWrite are set and timeout is disabled, + * + * - If the timeout is disabled, try to check the health of the socket + * - Otherwise this immediately returns 0 + * + * Return >0 if condition is met, 0 if a timeout occurred, -1 if an error + * or interrupt occurred. Don't say "and timeout is disabled," because it clashes with the 1st bullet which also says "- If the timeout is disabled,". -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are some review comments for the v3-0001 test code. == src/test/regress/sql/subscription.sql 1. +-- fail - utilizing streaming = parallel with time-delayed replication is not supported +CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123); "utilizing" --> "specifying" ~~~ 2. +-- success -- min_send_delay value without unit is take as milliseconds +CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123); +\dRs+ "without unit is take as" --> "without units is taken as" ~~~ 3. +-- success -- min_send_delay value with unit is converted into ms and stored as an integer +ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d'); +\dRs+ "with unit is converted into ms" --> "with units other than ms is converted to ms" ~~~ 4. Missing tests? Why have the previous ALTER SUBSCRIPTION tests been removed? AFAIK, currently, there are no regression tests for error messages like: test_sub=# ALTER SUBSCRIPTION sub1 SET (min_send_delay = 123); ERROR: cannot set min_send_delay for subscription in parallel streaming mode == src/test/subscription/t/001_rep_changes.pl 5. +# This test is successful if and only if the LSN has been applied with at least +# the configured apply delay. +ok( time() - $publisher_insert_time >= $delay, + "subscriber applies WAL only after replication delay for non-streaming transaction" +); It's not strictly an "apply delay". Maybe this comment only needs to say like below: SUGGESTION # This test is successful only if at least the configured delay has elapsed. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));

Should the validation also be checking/asserting no negative numbers, or actually should the min_send_delay be defined as a uint32 in the first place?

~~~

12. pgoutput_startup

@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 else
 ctx->twophase_opt_given = true;

+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;

IMO it doesn't make sense to compare this new feature with the unrelated LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM protocol version. I think we should define a new constant LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM (even if it has the same value as the LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM).

== src/backend/replication/walsender.c

13. WalSndDelay

+ long diffms;
+ long timeout_interval_ms;

IMO some more informative names for these would make the code read better:

'diffms' --> 'remaining_wait_time_ms'
'timeout_interval_ms' --> 'timeout_sleeptime_ms'

~~~

14.

+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+Min(timeout_interval_ms, diffms),
+WAIT_EVENT_WALSENDER_SEND_DELAY);

Sorry, I didn't understand this comment "reply from worker"... AFAIK here we are just sleeping, not waiting for replies from anywhere (???)

== src/include/replication/logical.h

15.
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext LogicalOutputPluginWriterPrepareWrite prepare_write; LogicalOutputPluginWriterWrite write; LogicalOutputPluginWriterUpdateProgress update_progress; + LogicalOutputPluginWriterDelay delay; ~ 15a. Question: Is there some advantage to introducing another callback, instead of just doing the delay inline? ~ 15b. Should this be a more informative member name like 'delay_send'? ~~~ 16. @@ -100,6 +105,8 @@ typedef struct LogicalDecodingContext */ bool twophase_opt_given; + int32 min_send_delay; + Missing comment for this new member. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [Proposal] Add foreign-server health checks infrastructure
Here is a code review only for patch v31-0001.

== General Comment

1. PQcanConnCheck seemed like a strange API name. Maybe it can have the same prefix as the other? e.g.

- PQconnCheck() - PQconnCheckSupported()
or
- PQconnCheck() - PQconnCheckable()

== Commit Message

2. PqconnCheck() function allows to check the status of socket by polling the socket. This function is currently available only on systems that support the non-standard POLLRDHUP extension to the poll system call, including Linux. PqcanConnCheck() checks whether above function is available or not.

~

2a. "status of socket" --> "status of the connection"

~

2b. "above function" --> "the above function"

== doc/src/sgml/libpq.sgml

3. PQconnCheck

Returns the health of the socket. int PQconnCheck(PGconn *conn); Unlike PQstatus, this function checks socket health. This check is performed by polling the socket. This function is currently available only on systems that support the non-standard POLLRDHUP extension to the poll system call, including Linux. PQconnCheck returns greater than zero if the remote peer seems to be closed, returns 0 if the socket is valid, and returns -1 if the connection has been already invalid or an error is error occurred.

~

3a. Should these descriptions be referring to the health of the *connection* rather than the health of the socket?

~

3b. "has been already invalid" ?? wording

~~~

4. PQcanConnCheck

Returns whether PQconnCheck is available on this platform. PQcanConnCheck returns 1 if the function is supported, otherwise returns 0.

~

I thought this should be worded using "true" and "false" same as other boolean functions on this page.

SUGGESTION
Returns true (1) or false (0) to indicate if the PQconnCheck function is supported on this platform.

== src/interfaces/libpq/fe-misc.c

5.
-static int pqSocketCheck(PGconn *conn, int forRead, int forWrite, - time_t end_time); +static int pqSocketIsReadableOrWritableOrValid(PGconn *conn, int forRead, + int forWrite, time_t end_time); I was not 100% sure overloading this API is the right thing to do. Doesn't this introduce a subtle side-effect on some of the existing callers? e.g. Previously pqWaitTimed would ALWAYS return 0 if forRead/forWrite were both false. But now other return values like errors will be possible. Is that OK? ~~~ 6. pqSocketPoll /* * Check a file descriptor for read and/or write data, possibly waiting. * If neither forRead nor forWrite are set, immediately return a timeout * condition (without waiting). Return >0 if condition is met, 0 * if a timeout occurred, -1 if an error or interrupt occurred. * * Timeout is infinite if end_time is -1. Timeout is immediate (no blocking) * if end_time is 0 (or indeed, any time before now). * * Moreover, when neither forRead nor forWrite is requested and timeout is * disabled, try to check the health of socket. */ The new comment "Moreover..." is contrary to the earlier part of the same comment which already said, "If neither forRead nor forWrite are set, immediately return a timeout condition (without waiting)." There might be side-effects to previous/existing callers of this function (e.g. pqWaitTimed via pqSocketCheck) ~~~ 7. if (!forRead && !forWrite) - return 0; + { + /* Try to check the health if requested */ + if (end_time == 0) +#if defined(POLLRDHUP) + input_fd.events = POLLRDHUP | POLLHUP | POLLNVAL; +#else + return 0; +#endif /* defined(POLLRDHUP) */ + else + return 0; + } FYI - I think the new code can be simpler without needing #else by calling your other new function. 
SUGGESTION if (!forRead && !forWrite) { if (!PQcanConnCheck() || end_time != 0) return 0; /* Check the connection health when end_time is 0 */ Assert(PQcanConnCheck() && end_time == 0); #if defined(POLLRDHUP) input_fd.events = POLLRDHUP | POLLHUP | POLLNVAL; #endif } ~~~ 8. PQconnCheck +/* + * Check whether PQconnCheck() can work well on this platform. + * + * Returns 1 if this can use PQconnCheck(), otherwise 0. + */ +int +PQcanConnCheck(void) +{ +#if (defined(HAVE_POLL) && defined(POLLRDHUP)) + return true; +#else + return false; +#endif +} ~ 8a. "can work well" --> "works" ~ 8b. Maybe better to say "true (1)" and "otherwise false (0)" -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
On Fri, Feb 17, 2023 at 5:57 PM Peter Smith wrote: > > FYI, I accidentally left this (v23) patch's TAP test > t/032_subscribe_use_index.pl still lurking even after removing all > other parts of this patch. > > In this scenario, the t/032 test gets stuck (build of latest HEAD) > > IIUC the patch is only meant to affect performance, so I expected this > 032 test to work regardless of whether the rest of the patch is > applied. > > Anyway, it hangs every time for me. I didn't dig looking for the > cause, but if it requires patched code for this new test to pass, I > thought it indicates something wrong either with the test or something > more sinister the new test has exposed. Maybe I am mistaken > Sorry, probably the above was a false alarm. After a long time (minutes) the stuck test did eventually timeout with: t/032_subscribe_use_index.pl ... # poll_query_until timed out executing this query: # select (idx_scan = 1) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx'; # expecting this output: # t # last actual query output: # f # with stderr: t/032_subscribe_use_index.pl ... Dubious, test returned 29 (wstat 7424, 0x1d00) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
FYI, I accidentally left this (v23) patch's TAP test t/032_subscribe_use_index.pl still lurking even after removing all other parts of this patch. In this scenario, the t/032 test gets stuck (build of latest HEAD) IIUC the patch is only meant to affect performance, so I expected this 032 test to work regardless of whether the rest of the patch is applied. Anyway, it hangs every time for me. I didn't dig looking for the cause, but if it requires patched code for this new test to pass, I thought it indicates something wrong either with the test or something more sinister the new test has exposed. Maybe I am mistaken -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
LGTM. My only comment is about the commit message. == Commit message d9d7fe6 reuse existing wait event when sending data in apply worker. But we should have invent a new wait state if we are waiting at a new place, so fix this. ~ SUGGESTION d9d7fe6 made use of an existing wait event when sending data from the apply worker, but we should have invented a new wait state since the code was waiting at a new place. This patch corrects the mistake by using a new wait state "LogicalApplySendData". -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
On Thu, Feb 16, 2023 at 12:49 AM Jim Jones wrote: > > Accidentally left the VERBOSE settings out -- sorry! > > Now it matches the approach used in a xpath test in xml.sql, xml.out, > xml_1.out and xml_2.out > > -- Since different libxml versions emit slightly different > -- error messages, we suppress the DETAIL in this test. > \set VERBOSITY terse > SELECT xpath('/*', ''); > ERROR: could not parse XML document > \set VERBOSITY default > > v11 now correctly sets xml_2.out. > > Best, Jim Firstly, sorry, it seems I made a mistake and was premature in calling bingo above for v9. - today I repeated v9 'make check' and found it failing still. - the new xmlformat tests are OK, but some pre-existing xmlparse tests are broken. - see attached file pretty-v9-results OTOH, the absence of xml_2.out from this patch appears to be the correct explanation for why my results have been differing. Today I fetched and tried the latest v11. It is failing too, but only just. It seems to be due only to a whitespace EOF issue in the xml_2.out - see attached file pretty-v11-results @@ -1679,4 +1679,4 @@ -- XML format: empty string SELECT xmlformat(''); ERROR: invalid XML document -\set VERBOSITY default \ No newline at end of file +\set VERBOSITY default -- The attached patch update (v12-0002) fixes the xml_2.out for me. -- Kind Regards, Peter Smith. Fujitsu Australia v12-0001-Add-pretty-printed-XML-output-option.patch Description: Binary data v12-0002-PS-fix-EOF-for-xml_2.out.patch Description: Binary data pretty-v9-results Description: Binary data pretty-v11-results Description: Binary data
Re: [PATCH] Add pretty-printed XML output option
On Wed, Feb 15, 2023 at 6:10 PM Jim Jones wrote: > > On 15.02.23 02:09, Peter Smith wrote: > > With v8, in my environment, in psql I see something slightly different: > > > > > > test_pub=# SET XML OPTION CONTENT; > > SET > > test_pub=# SELECT xmlformat(''); > > ERROR: invalid XML document > > DETAIL: line 1: switching encoding : no input > > line 1: Document is empty > > test_pub=# SET XML OPTION DOCUMENT; > > SET > > test_pub=# SELECT xmlformat(''); > > ERROR: invalid XML document > > LINE 1: SELECT xmlformat(''); > > ^ > > DETAIL: line 1: switching encoding : no input > > line 1: Document is empty > > > > ~~ > > > > test_pub=# SET XML OPTION CONTENT; > > SET > > test_pub=# INSERT INTO xmltest VALUES (3, ' > ERROR: relation "xmltest" does not exist > > LINE 1: INSERT INTO xmltest VALUES (3, ' > ^ > > test_pub=# SET XML OPTION DOCUMENT; > > SET > > test_pub=# INSERT INTO xmltest VALUES (3, ' > ERROR: relation "xmltest" does not exist > > LINE 1: INSERT INTO xmltest VALUES (3, ' > ^ > > > > ~~ > > Yes... a cfbot also complained about the same thing. > > Setting the VERBOSITY to terse might solve this issue: > > postgres=# \set VERBOSITY terse > postgres=# SELECT xmlformat(''); > ERROR: invalid XML document > > postgres=# \set VERBOSITY default > postgres=# SELECT xmlformat(''); > ERROR: invalid XML document > DETAIL: line 1: switching encoding : no input > > ^ > line 1: Document is empty > > ^ > > v9 wraps the corner test cases with VERBOSITY terse to reduce the error > message output. > Bingo!! Your v9 patch now passes all 'make check' tests for me. But I'll leave it to a committer to decide if this VERBOSITY toggle is the best fix. (I don't understand, maybe someone can explain, how the patch managed to mess up the verbosity of the existing tests.) -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Support logical replication of DDLs
On Thu, Feb 9, 2023 at 8:55 PM Ajin Cherian wrote: > > On Fri, Feb 3, 2023 at 11:41 AM Peter Smith wrote: > > > > Here are some review comments for patch v63-0001. > > > > == > > src/backend/catalog/aclchk.c > > > > 3. ExecuteGrantStmt > > > > + /* Copy the grantor id needed for DDL deparsing of Grant */ > > + istmt.grantor_uid = grantor; > > + > > > > SUGGESTION (comment) > > Copy the grantor id to the parsetree, needed for DDL deparsing of Grant > > > > didn't change this, as Alvaro said this was not a parsetree. Perhaps there is more to do here? Sorry, I did not understand the details of Alvaro's post [1], but I did not recognize the difference between ExecuteGrantStmt and ExecSecLabelStmt so it was my impression either one or both of these places are either wrongly commented, or maybe are doing something that should not be done. > > == > > src/backend/utils/adt/regproc.c > > > > 9. > > + > > +/* > > + * Append the parenthesized arguments of the given pg_proc row into the > > output > > + * buffer. force_qualify indicates whether to schema-qualify type names > > + * regardless of visibility. > > + */ > > +static void > > +format_procedure_args_internal(Form_pg_proc procform, StringInfo buf, > > +bool force_qualify) > > +{ > > + int i; > > + char* (*func[2])(Oid) = {format_type_be, format_type_be_qualified}; > > + > > + appendStringInfoChar(buf, '('); > > + for (i = 0; i < procform->pronargs; i++) > > + { > > + Oid thisargtype = procform->proargtypes.values[i]; > > + char*argtype = NULL; > > + > > + if (i > 0) > > + appendStringInfoChar(buf, ','); > > + > > + argtype = func[force_qualify](thisargtype); > > + appendStringInfoString(buf, argtype); > > + pfree(argtype); > > + } > > + appendStringInfoChar(buf, ')'); > > +} > > > > 9a. > > Assign argtype = NULL looks redundant because it will always be > > overwritten anyhow. > > > > changed this. > > > ~ > > > > 9b. 
> > I understand why this function was put here beside the other static > > functions in "Support Routines" but IMO it really belongs nearby (i.e. > > directly above) the only caller (format_procedure_args). Keeping both > > those functional together will improve the readability of both, and > > will also remove the need to have the static forward declaration. > > There was no reply for 9b. Was it accidentally overlooked, or did you choose not to do it? -- [1] https://www.postgresql.org/message-id/20230213090752.27ftbb6byiw3qcbl%40alvherre.pgsql Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
On Sat, Feb 11, 2023 at 3:21 AM vignesh C wrote: > > On Thu, 9 Feb 2023 at 03:47, Peter Smith wrote: > > > > Hi Vignesh, thanks for addressing my v63-0002 review comments. > > > > I confirmed most of the changes. Below is a quick follow-up for the > > remaining ones. > > > > On Mon, Feb 6, 2023 at 10:32 PM vignesh C wrote: > > > > > > On Mon, 6 Feb 2023 at 06:47, Peter Smith wrote: > > > > > > ... > > > > > > > > 8. > > > > + value = findJsonbValueFromContainer(container, JB_FOBJECT, ); > > > > > > > > Should the code be checking or asserting value is not NULL? > > > > > > > > (IIRC I asked this a long time ago - sorry if it was already answered) > > > > > > > > > > Yes, this was already answered by Zheng, quoting as "The null checking > > > for value is done in the upcoming call of expand_one_jsonb_element()." > > > in [1] > > > > Thanks for the info. I saw that Zheng-san only wrote it is handled in > > the “upcoming call of expand_one_jsonb_element”, but I don’t know if > > that is sufficient. For example, if the execution heads down the other > > path (expand_jsonb_array) with a NULL jsonarr then it going to crash, > > isn't it? So I still think some change may be needed here. > > Added an Assert for this. > Was this a correct change to make here? IIUC this Assert is now going to intercept both cases including the expand_one_jsonb_element() which previously would have thrown a proper ERROR. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
ier in this function. + */ + idxoid = GetRelationIdentityOrPK(localrel); + if (OidIsValid(idxoid)) + return idxoid; + + /* If index scans are disabled, use a sequential scan */ + if (!enable_indexscan) + return InvalidOid; ~ IMO that "Note" really belongs with the if (!enable)indexscan) more like this: SUGGESTION /* * Simple case, we already have a primary key or a replica identity index. */ idxoid = GetRelationIdentityOrPK(localrel); if (OidIsValid(idxoid)) return idxoid; /* * If index scans are disabled, use a sequential scan. * * Note we hesitate to move this check to earlier in this function * because allowing primary key or replica identity even when index scan * is disabled is the legacy behaviour. */ if (!enable_indexscan) return InvalidOid; == src/backend/replication/logical/worker.c 10. get_usable_indexoid +/* + * Decide whether we can pick an index for the relinfo (e.g., the relation) + * we're actually deleting/updating from. If it is a child partition of + * edata->targetRelInfo, find the index on the partition. + * + * Note that if the corresponding relmapentry has invalid usableIndexOid, + * the function returns InvalidOid. + */ "(e.g., the relation)" --> "(i.e. the relation)" -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
On Wed, Feb 15, 2023 at 11:05 AM Jim Jones wrote: > > On 14.02.23 23:45, Peter Smith wrote: > > Those results implied to me that this function code (in my environment > > anyway) is somehow introducing a side effect causing the *other* XML > > tests to fail. > > I believe I've found the issue. It is probably related to the XML OPTION > settings, as it seems to deliver different error messages when set to > DOCUMENT or CONTENT: > > postgres=# SET XML OPTION CONTENT; > SET > postgres=# SELECT xmlformat(''); > ERROR: invalid XML document > DETAIL: line 1: switching encoding : no input > > ^ > line 1: Document is empty > > ^ > postgres=# SET XML OPTION DOCUMENT; > SET > postgres=# SELECT xmlformat(''); > ERROR: invalid XML document > LINE 1: SELECT xmlformat(''); > ^ > DETAIL: line 1: switching encoding : no input > > ^ > line 1: Document is empty > > ^ > > v8 attached reintroduces the SELECT xmlformat('') test case and adds SET > XML OPTION DOCUMENT to the regression tests. > With v8, in my environment, in psql I see something slightly different: test_pub=# SET XML OPTION CONTENT; SET test_pub=# SELECT xmlformat(''); ERROR: invalid XML document DETAIL: line 1: switching encoding : no input line 1: Document is empty test_pub=# SET XML OPTION DOCUMENT; SET test_pub=# SELECT xmlformat(''); ERROR: invalid XML document LINE 1: SELECT xmlformat(''); ^ DETAIL: line 1: switching encoding : no input line 1: Document is empty ~~ test_pub=# SET XML OPTION CONTENT; SET test_pub=# INSERT INTO xmltest VALUES (3, '
Re: [PATCH] Add pretty-printed XML output option
On Wed, Feb 15, 2023 at 8:55 AM Jim Jones wrote: > > On 13.02.23 13:15, Jim Jones wrote: > > diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out > /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out > --- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out 2023-02-12 > 09:02:57.077569000 + > +++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out > 2023-02-12 09:05:45.14810 + > @@ -1695,10 +1695,7 @@ > -- XML format: empty string > SELECT xmlformat(''); > ERROR: invalid XML document > -DETAIL: line 1: switching encoding : no input > - > -^ > -line 1: Document is empty > +DETAIL: line 1: Document is empty > > ^ > -- XML format: invalid string (whitespaces) > > I couldn't figure out why the error messages are different -- I'm wondering > if the issue is the test environment itself. I just removed the troubling > test case for now > > SELECT xmlformat(''); > > v7 attached. > > Thanks for reviewing this patch! > Yesterday I looked at those cfbot configs and noticed all those machines have different versions of libxml. 2.10.3 2.6.23 2.9.10 2.9.13 But I don't know whether the version numbers have any impact on the different error details or not. ~ The thing that puzzled me most is that in MY environment (CentOS7; libxml 20901; PG --with-libxml build) I get this behaviour. - Without your v6 patch 'make check' is all OK. - With your v6 patch other XML tests (not only yours) of 'make check' failed with different error messages. - Similarly, if I keep the v6 patch but just change (in xmlformat) the #ifdef USE_LIBXML to be #if 0, then only the new xmlformat tests fail, but the other XML tests are working OK again. Those results implied to me that this function code (in my environment anyway) is somehow introducing a side effect causing the *other* XML tests to fail. But so far I was unable to identify the reason. Sorry, I don't know this XML API well enough to help more. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Feb 14, 2023 at 5:04 PM Amit Kapila wrote: > > On Fri, Feb 10, 2023 at 8:56 AM Peter Smith wrote: > > > > On Fri, Feb 10, 2023 at 1:32 PM houzj.f...@fujitsu.com > > wrote: > > > > > > On Tuesday, February 7, 2023 11:17 AM Amit Kapila > > > wrote: > > > > > > > > On Mon, Feb 6, 2023 at 3:43 PM houzj.f...@fujitsu.com > > > > wrote: > > > > > > > > > > while reading the code, I noticed that in pa_send_data() we set wait > > > > > event to WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE while > > > > sending > > > > > the message to the queue. Because this state is used in multiple > > > > > places, user might not be able to distinguish what they are waiting > > > > > for. So It seems we'd better to use WAIT_EVENT_MQ_SEND here which will > > > > > be eaier to distinguish and understand. Here is a tiny patch for that. > > > > > > > > > > > As discussed[1], we'd better invent a new state for this purpose, so here > > > is the patch > > > that does the same. > > > > > > [1] > > > https://www.postgresql.org/message-id/CAA4eK1LTud4FLRbS0QqdZ-pjSxwfFLHC1Dx%3D6Q7nyROCvvPSfw%40mail.gmail.com > > > > > > > My first impression was the > > WAIT_EVENT_LOGICAL_PARALLEL_APPLY_SEND_DATA name seemed misleading > > because that makes it sound like the parallel apply worker is doing > > the sending, but IIUC it's really the opposite. > > > > So, how about WAIT_EVENT_LOGICAL_APPLY_SEND_DATA? > Yes, IIUC all the LR events are named WAIT_EVENT_LOGICAL_xxx. So names like the below seem correct format: a) WAIT_EVENT_LOGICAL_APPLY_SEND_DATA b) WAIT_EVENT_LOGICAL_LEADER_SEND_DATA c) WAIT_EVENT_LOGICAL_LEADER_APPLY_SEND_DATA Of those, I prefer option c) because saying LEADER_APPLY_xxx matches the name format of the existing WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
FYI - the latest patch cannot be applied. See cfbot [1] -- [1] http://cfbot.cputube.org/patch_42_3595.log Kind Regards, Peter Smith. Fujitsu Australia
Re: Adding "large" to PG_TEST_EXTRA
On Tue, Feb 14, 2023 at 10:44 AM Andres Freund wrote: > > Hi, > > On 2023-02-14 09:26:47 +1100, Peter Smith wrote: > > I've observed suggested test cases get rejected as being overkill, or > > because they would add precious seconds to the test execution. OTOH, I > > felt such tests would still help gain some additional percentages from > > the "code coverage" stats. The kind of tests I am thinking of don't > > necessarily need a huge disk/CPU - but they just take longer to run > > than anyone has wanted to burden the build-farm with. > > I'd say it depend on the test whether it's worth adding. Code coverage for its > own sake isn't that useful, they have to actually test something useful. And > tests have costs beyond runtime, e.g. new tests tend to fail in some edge > cases. > > E.g. just having tests hit more lines, without verifying that the behaviour is > actually correct, only provides limited additional assurance. It's also not > very useful to add a very expensive test that provides only a very small > additional amount of coverage. > > IOW, even if we add more test categories, it'll still be a tradeoff. > > > > Sorry for the thread interruption -- but I thought this might be the > > right place to ask: What is the recommended way to deal with such > > tests intended primarily for better code coverage? > > I don't think that exists today. > > Do you have an example of the kind of test you're thinking of? No, nothing specific in mind. But maybe like these: - tests for causing obscure errors that would never otherwise be reached without something deliberately designed to fail a certain way - tests for trivial user errors apparently deemed not worth bloating the regression tests with -- e.g. many errorConflictingDefElem not being called [1]. - timing-related or error tests where some long (multi-second) delay is a necessary part of the setup. -- [1] https://coverage.postgresql.org/src/backend/commands/subscriptioncmds.c.gcov.html Kind Regards, Peter Smith. 
Fujitsu Australia
Re: Adding "large" to PG_TEST_EXTRA
On Tue, Feb 14, 2023 at 5:42 AM Andres Freund wrote: > > Hi, > > I'm working on rebasing [1], my patch to make relation extension scale > better. > > As part of that I'd like to add tests for relation extension. To be able to > test the bulk write strategy path, we need to have a few backends concurrently > load > 16MB files. > > It seems pretty clear that doing that on all buildfarm machines wouldn't be > nice / welcome. And it also seems likely that this won't be the last case > where that'd be useful. > > So I'd like to add a 'large' class to PG_TEST_EXTRA, that we can use in tests > that we only want to execute on machines with sufficient resources. > Oh, I have been thinking about a similar topic recently, but I was unaware of PG_TEST_EXTRA [1]. I've observed suggested test cases get rejected as being overkill, or because they would add precious seconds to the test execution. OTOH, I felt such tests would still help gain some additional percentages from the "code coverage" stats. The kind of tests I am thinking of don't necessarily need a huge disk/CPU - but they just take longer to run than anyone has wanted to burden the build-farm with. ~ Sorry for the thread interruption -- but I thought this might be the right place to ask: What is the recommended way to deal with such tests intended primarily for better code coverage? I didn't see anything that looked pre-built for 'coverage'. Did I miss something, or is it a case of just invent-your-own extra tests/values for your own ad-hoc requirements? e.g. make check EXTRA_TESTS=extra_regress_for_coverage make check-world PG_TEST_EXTRA='extra_tap_tests_for_coverage' Thanks! -- [1] https://www.postgresql.org/docs/devel/regress-run.html Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
t; 0) { TimestampTz next_feedback_due_ts; long next_feedback_due_ms; /* * Find if the next feedback is due earlier than the remaining delay ms. */ next_feedback_due_ts = TimestampTzPlusMilliseconds(send_time, status_interval_ms); next_feedback_due_ms = TimestampDifferenceMilliseconds(now, next_feedback_due_ts); if (next_feedback_due_ms < remaining_delay_ms) { /* delay before feedback */ WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, next_feedback_due_ms, WAIT_EVENT_LOGICAL_APPLY_DELAY); send_feedback(last_received, true, false, true); continue; } } /* delay before apply */ WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, remaining_delay_ms, WAIT_EVENT_LOGICAL_APPLY_DELAY); } == src/include/utils/wait_event.h 3. @@ -149,7 +149,8 @@ typedef enum WAIT_EVENT_REGISTER_SYNC_REQUEST, WAIT_EVENT_SPIN_DELAY, WAIT_EVENT_VACUUM_DELAY, - WAIT_EVENT_VACUUM_TRUNCATE + WAIT_EVENT_VACUUM_TRUNCATE, + WAIT_EVENT_LOGICAL_APPLY_DELAY } WaitEventTimeout; FYI - The PGDOCS has a section with "Table 28.13. Wait Events of Type Timeout" so if you are going to add a new Timeout Event then you also need to document it (alphabetically) in that table. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
On Sat, Feb 11, 2023 at 3:31 AM vignesh C wrote: > > On Fri, 10 Feb 2023 at 21:50, vignesh C wrote: > > The attached v68 version patch has the changes for the same. > > I was not sure if we should support ddl replication of > create/alter/drop subscription commands as there might be some data > inconsistency issues in the following cases: > #node1 who is running in port 5432 > create publication pub_node1 for all tables with ( PUBLISH = 'insert, > update, delete, truncate'); > > #node2 who is running in port 5433 > create publication pub_node2 for all tables with(PUBLISH = 'insert, > update, delete, truncate', ddl = 'all'); > create subscription sub_node2 connection 'dbname=postgres host=node1 > port=5432' publication pub_node1; > > #node3 > create subscription sub_node3 connection 'dbname=postgres host=node2 > port=5433 publication pub_node2; > > #node1 > create table t1(c1 int ); > > #node2 > create table t1(c1 int); > alter subscription sub_node2 refresh publication; > > # Additionally this command will be replicated to node3, creating a > subscription sub2_node2 in node3 which will subscribe data from node1 > create subscription sub2_node2 connection 'dbname=postgres host=node1 > port=5432' publication pub_node1; > > After this any insert into t1 from node1 will be replicated to node2 > and node3, additionally node2's replicated data(which was replicated > from node1) will also be sent to node3 causing inconsistency. If the > table has unique or primary key constraints, it will lead to an error. > > Another option would be to replicate the create subscription in > disabled state and not support few ddl replication of alter > subscription which will connect to publisher like: > 1) Alter subscription sub1 enable; > 2) Alter subscription sub1 refresh publication; > > But in this case also, we will be able to support few alter > subscription commands and not support few alter subscription commands. 
> I feel it is better that we do not need to support ddl replication of > create/alter/drop subscription command and let users handle the > subscription commands. > Thoughts? > So essentially, node3 is subscribed 2x from the same table in node1 node1 --> node2 || ddl |V +-> node3 I think the suggested options are: option #1. If you support CREATE SUBSCRIPTION DDL replication then you can end up with the conflict troubles you described above option #2. If you support CREATE SUBSCRIPTION DDL replication but only make sure it is disabled first then your above scenario might be OK but you will also need to *not* allow DDL replication of the ALTER SUBSCRIPTION which might cause it to become re-enabled. IMO adding tricky rules is just inviting more problems. option #3. Do nothing, don't support it. +1 but see below for a variation of this ~ Actually, I am sort of expecting lots of potential difficulties with DDL replication and this CREATE SUBSCRIPTION problem is just one of them. IMO it would be a mistake to try and make the first version of these patches try to do *everything*. E.g. Why invent tricky solutions to problems without yet knowing user expectations/requirements? Therefore, my first impression is to do a generic option #4: option #4. This is a variation of "do nothing". My suggestion is you can still replicate every DDL via logical replication messages but just don't actually *apply* anything on the subscriber side for the commands which are problematic (like this one is). Instead of applying, the subscriber can just log a NOTICE about each command that was skipped. This will make it easier for the user to know what didn’t happen, but they can just cut/paste that command from the NOTICE if they really want to. Also, this option #4 is deliberately generic, which means you can do the same for every kind of DDL that proves too difficult to automate in the first version (I doubt CREATE SUBSCRIPTION will be the only example). 
The option #4 result might look like this: test_sub=# create subscription sub1 connection 'dbname=test_pub' publication pub1; NOTICE: created replication slot "sub1" on publisher CREATE SUBSCRIPTION NOTICE: subscription "sub1" skipping DDL: create subscription sub_node2 connection 'dbname=postgres host=node1 port=5432' publication pub_node1; ... (And if it turns out users really do want this then it can be revisited in later patch versions) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Exit walsender before confirming remote flush in logical replication
Here are some comments for patch v7-0002. == Commit Message 1. This commit extends START_REPLICATION to accept SHUTDOWN_MODE clause. It is currently implemented only for logical replication. ~ "to accept SHUTDOWN_MODE clause." --> "to accept a SHUTDOWN_MODE clause." == doc/src/sgml/protocol.sgml 2. START_REPLICATION SLOT slot_name LOGICAL XXX/XXX [ SHUTDOWN_MODE { 'wait_flush' | 'immediate' } ] [ ( option_name [ option_value ] [, ...] ) ] ~ IMO this should say shutdown_mode as it did before: START_REPLICATION SLOT slot_name LOGICAL XXX/XXX [ SHUTDOWN_MODE shutdown_mode ] [ ( option_name [ option_value ] [, ...] ) ] ~~~ 3. + +shutdown_mode + + + Determines the behavior of the walsender process at shutdown. If + shutdown_mode is 'wait_flush', the walsender waits + for all the sent WALs to be flushed on the subscriber side. This is + the default when SHUTDOWN_MODE is not specified. If shutdown_mode is + 'immediate', the walsender exits without + confirming the remote flush. + + + Is the font of the "shutdown_mode" correct? I expected it to be like the others (e.g. slot_name) == src/backend/replication/walsender.c 4. +static void +CheckWalSndOptions(const StartReplicationCmd *cmd) +{ + if (cmd->shutdownmode) + { + char*mode = cmd->shutdownmode; + + if (pg_strcasecmp(mode, "wait_flush") == 0) + shutdown_mode = WALSND_SHUTDOWN_MODE_WAIT_FLUSH; + else if (pg_strcasecmp(mode, "immediate") == 0) + shutdown_mode = WALSND_SHUTDOWN_MODE_IMMEDIATE; + else + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid value for shutdown mode: \"%s\"", mode), + errhint("Available values: wait_flush, immediate.")); + } + +} Unnecessary extra whitespace at end of the function. == src/include/nodes/replnodes. 5. 
@@ -83,6 +83,7 @@ typedef struct StartReplicationCmd char*slotname; TimeLineID timeline; XLogRecPtr startpoint; + char*shutdownmode; List*options; } StartReplicationCmd; IMO those last two members should have a comment something like: /* Only for logical replication */ because that will make it more clear why sometimes they are assigned and sometimes they are not. == src/include/replication/walreceiver.h 6. Should the protocol version be bumped (and documented) now that START_REPLICATION supports a new extended syntax? Or is that done only for messages sent by pgoutput? -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: [PATCH] Add pretty-printed XML output option
Something is misbehaving. Using the latest HEAD, and before applying the v6 patch, 'make check' is working OK. But after applying the v6 patch, some XML regression tests are failing because the DETAIL part of the message indicating the wrong syntax position is not getting displayed. Not only for your new tests -- but for other XML tests too. My ./configure looks like this: ./configure --prefix=/usr/local/pg_oss --with-libxml --enable-debug --enable-cassert --enable-tap-tests CFLAGS="-ggdb -O0 -g3 -fno-omit-frame-pointer" resulting in: checking whether to build with XML support... yes checking for libxml-2.0 >= 2.6.23... yes ~ e.g.(these are a sample of errors) xml ... FAILED 2561 ms @@ -344,8 +326,6 @@ ^ line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced - -^ SELECT xmlparse(document ''); xmlparse - @@ -1696,14 +1676,8 @@ SELECT xmlformat(''); ERROR: invalid XML document DETAIL: line 1: switching encoding : no input - -^ line 1: Document is empty - -^ -- XML format: invalid string (whitespaces) SELECT xmlformat(' '); ERROR: invalid XML document DETAIL: line 1: Start tag expected, '<' not found - - ^ ~~ Separately (but maybe it's related?), the CF-bot test also reported a failure [1] with strange error detail differences. diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out --- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out 2023-02-12 09:02:57.077569000 + +++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out 2023-02-12 09:05:45.14810 + @@ -1695,10 +1695,7 @@ -- XML format: empty string SELECT xmlformat(''); ERROR: invalid XML document -DETAIL: line 1: switching encoding : no input - -^ -line 1: Document is empty +DETAIL: line 1: Document is empty ^ -- XML format: invalid string (whitespaces) -- [1] https://api.cirrus-ci.com/v1/artifact/task/4802219812323328/testrun/build/testrun/regress/regress/regression.diffs Kind Regards, Peter Smith. 
Fujitsu Australia
Re: Exit walsender before confirming remote flush in logical replication
, then IMO maybe it's better to remove the "= 0" because the explicit assignment made me expect that it had special meaning, and then it was confusing when I could not find a reason. ~~~ 10. ProcessPendingWrites + /* + * In this function, there is a possibility that the walsender is + * stuck. It is caused when the opposite worker is stuck and then the + * send-buffer of the walsender becomes full. Therefore, we must add + * an additional path for shutdown for immediate shutdown mode. + */ + if (shutdown_mode == WALSND_SHUTDOWN_MODE_IMMIDEATE && + got_STOPPING) + WalSndDone(XLogSendLogical); 10a. Can this comment say something like "receiving worker" instead of "opposite worker"? SUGGESTION This can happen when the receiving worker is stuck, and then the send-buffer of the walsender... ~ 10b. IMO it makes more sense to check this around the other way. E.g. we don't care what is the shutdown_mode value unless got_STOPPING is true. SUGGESTION if (got_STOPPING && (shutdown_mode == WALSND_SHUTDOWN_MODE_IMMEDIATE)) ~~~ 11. WalSndDone + * If we are in the immediate shutdown mode, flush location and output + * buffer is not checked. This may break the consistency between nodes, + * but it may be useful for the system that has high-latency network to + * reduce the amount of time for shutdown. Add some quotes for the mode. SUGGESTION 'immediate' shutdown mode ~~~ 12. +/* + * Check options for walsender itself and set flags accordingly. + * + * Currently only one option is accepted. + */ +static void +CheckWalSndOptions(const StartReplicationCmd *cmd) +{ + if (cmd->shutdownmode) + ParseShutdownMode(cmd->shutdownmode); +} + +/* + * Parse given shutdown mode. 
+ * + * Currently two values are accepted - "wait_flush" and "immediate" + */ +static void +ParseShutdownMode(char *shutdownmode) +{ + if (pg_strcasecmp(shutdownmode, "wait_flush") == 0) + shutdown_mode = WALSND_SHUTDOWN_MODE_WAIT_FLUSH; + else if (pg_strcasecmp(shutdownmode, "immediate") == 0) + shutdown_mode = WALSND_SHUTDOWN_MODE_IMMIDEATE; + else + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid value for shutdown mode: \"%s\"", shutdownmode), + errhint("Available values: wait_flush, immediate.")); +} IMO the ParseShutdownMode function seems unnecessary because it's not really "parsing" anything and it is only called in one place. I suggest wrapping everything into the CheckWalSndOptions function. The end result is still only a simple function: SUGGESTION static void CheckWalSndOptions(const StartReplicationCmd *cmd) { if (cmd->shutdownmode) { char *mode = cmd->shutdownmode; if (pg_strcasecmp(mode, "wait_flush") == 0) shutdown_mode = WALSND_SHUTDOWN_MODE_WAIT_FLUSH; else if (pg_strcasecmp(mode, "immediate") == 0) shutdown_mode = WALSND_SHUTDOWN_MODE_IMMEDIATE; else ereport(ERROR, errcode(ERRCODE_SYNTAX_ERROR), errmsg("invalid value for shutdown mode: \"%s\"", mode), errhint("Available values: wait_flush, immediate.")); } } == src/include/replication/walreceiver.h 13. @@ -170,6 +170,7 @@ typedef struct * false if physical stream. */ char*slotname; /* Name of the replication slot or NULL. */ XLogRecPtr startpoint; /* LSN of starting point. */ + char*shutdown_mode; /* Name of specified shutdown name */ union { ~ 13a. Typo (shutdown name?) SUGGESTION /* The specified shutdown mode string, or NULL. */ ~ 13b. Because they have the same member names I kept confusing this option shutdown_mode with the other enum also called shutdown_mode. I wonder if is it possible to call this one something like 'shutdown_mode_str' to make reading the code easier? ~ 13c. Is this member in the right place? AFAIK this is not even implemented for physical replication. 
e.g. Why isn't this new member part of the 'logical' sub-structure in the union? == src/test/subscription/t/001_rep_changes.pl 14. -# Set min_apply_delay parameter to 3 seconds +# Check restart on changing min_apply_delay to 3 seconds my $delay = 3; $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')"); +$node_publisher->poll_query_until('postgres', + "SELECT pid != $oldpid FROM pg_stat_replication WHERE application_name = 'tap_sub_renamed' AND state = 'streaming';" + ) + or die + "Timed out while waiting for the walsender to restart after changing min_apply_delay to non-zero value"; IIUC this test is for verifying that a new walsender worker was started if the delay was changed from 0 to non-zero. I.e. I think it is testing your new logic in maybe_reread_subscription. Probably more complete testing also needs to check the other scenarios: * min_apply_delay from one non-zero value to another non-zero value --> verify a new worker is NOT started. * change min_apply_delay from non-zero to zero --> verify a new worker IS started -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
On Fri, Feb 10, 2023 at 1:32 PM houzj.f...@fujitsu.com wrote: > > On Tuesday, February 7, 2023 11:17 AM Amit Kapila > wrote: > > > > On Mon, Feb 6, 2023 at 3:43 PM houzj.f...@fujitsu.com > > wrote: > > > > > > while reading the code, I noticed that in pa_send_data() we set wait > > > event to WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE while > > sending > > > the message to the queue. Because this state is used in multiple > > > places, users might not be able to distinguish what they are waiting > > > for. So it seems we'd better use WAIT_EVENT_MQ_SEND here, which will > > > be easier to distinguish and understand. Here is a tiny patch for that. > > > > > As discussed[1], we'd better invent a new state for this purpose, so here is > the patch > that does the same. > > [1] > https://www.postgresql.org/message-id/CAA4eK1LTud4FLRbS0QqdZ-pjSxwfFLHC1Dx%3D6Q7nyROCvvPSfw%40mail.gmail.com > My first impression was the WAIT_EVENT_LOGICAL_PARALLEL_APPLY_SEND_DATA name seemed misleading because that makes it sound like the parallel apply worker is doing the sending, but IIUC it's really the opposite. And since WAIT_EVENT_LOGICAL_PARALLEL_APPLY_LEADER_SEND_DATA seems too verbose, how about shortening the prefix for both events? E.g. BEFORE WAIT_EVENT_LOGICAL_PARALLEL_APPLY_SEND_DATA, WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE, AFTER WAIT_EVENT_LOGICAL_PA_LEADER_SEND_DATA, WAIT_EVENT_LOGICAL_PA_STATE_CHANGE, -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
On Thu, Feb 9, 2023 at 7:17 PM Jim Jones wrote: > > On 09.02.23 08:23, Tom Lane wrote: > > Um ... why are you using PG_TRY here at all? It seems like > > you have full control of the actually likely error cases. > > The only plausible error out of the StringInfo calls is OOM, > > and you probably don't want to trap that at all. > > My intention was to catch any unexpected error from > xmlDocDumpFormatMemory and handle it properly. But I guess you're right, > I can control the likely error cases by checking doc and nbytes. > > You suggest something along these lines? > > xmlDocPtr doc; > xmlChar *xmlbuf = NULL; > text *arg = PG_GETARG_TEXT_PP(0); > StringInfoData buf; > int nbytes; > > doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, > GetDatabaseEncoding(), NULL); > > if (!doc) > elog(ERROR, "could not parse the given XML document"); > > xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1); > > xmlFreeDoc(doc); > > if (!nbytes) > elog(ERROR, "could not indent the given XML document"); > > initStringInfo(&buf); > appendStringInfoString(&buf, (const char *) xmlbuf); > > xmlFree(xmlbuf); > > PG_RETURN_XML_P(stringinfo_to_xmltype(&buf)); > > > Thanks! > Something like that LGTM, but here are some other minor comments. == 1. FYI, there are some whitespace warnings applying the v5 patch [postgres@CentOS7-x64 oss_postgres_misc]$ git apply ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:26: trailing whitespace. ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:29: trailing whitespace. ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:33: trailing whitespace. ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:37: trailing whitespace. ../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:41: trailing whitespace. warning: squelched 48 whitespace errors warning: 53 lines add whitespace errors. == src/test/regress/sql/xml.sql 2.
The v5 patch was already testing NULL, but it might be good to add more tests to verify the function is behaving how you want for edge cases. For example, +-- XML pretty print: NULL, empty string, spaces only... SELECT xmlpretty(NULL); SELECT xmlpretty(''); SELECT xmlpretty(' '); ~~ 3. The function is returning XML anyway, so is the '::xml' casting in these tests necessary? e.g. SELECT xmlpretty(NULL)::xml; --> SELECT xmlpretty(NULL); == src/include/catalog/pg_proc.dat 4. + { oid => '4642', descr => 'Indented text from xml', + proname => 'xmlpretty', prorettype => 'xml', + proargtypes => 'xml', prosrc => 'xmlpretty' }, Spurious leading space for this new entry. == doc/src/sgml/func.sgml 5. + + + 42 + + + +(1 row) + +]]> A spurious blank line in the example after the "(1 row)" ~~~ 6. Do these function docs belong in section 9.15.1 "Producing XML Content"? Or (since it is not really producing content) should they be moved to the 9.15.3 "Processing XML" section? -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
> The comment adjustment suggested by Peter-san above > was also included in this v33. > Please have a look at the attached patch. Patch v33 LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [PATCH] Add pretty-printed XML output option
On Thu, Feb 9, 2023 at 10:42 AM Jim Jones wrote: > > On 09.02.23 00:09, Peter Smith wrote: > > I noticed the xmlFreeDoc(doc) within the PG_CATCH is guarded but the > > other xmlFreeDoc(doc) is not. As the doc is assigned outside the > > PG_TRY shouldn't those both be the same? > > Hi Peter, > > My logic there was the following: if the program reached that part of the > code it means that the xml_parse() and xmlDocDumpFormatMemory() worked, > which consequently means that the variables doc and xmlbuf are != NULL, > therefore not needing to be checked. Am I missing something? > Thanks. I think I understand it better now -- I expect xmlDocDumpFormatMemory will cope OK when passed a NULL doc (see this source [1]): it will return nbytes of 0, and your code will still throw ERROR, meaning the guard for a NULL doc is necessary in the PG_CATCH. In that case, everything LGTM. ~ OTOH, if you are having to check for a NULL doc anyway, maybe it's just as easy to do that check only once, up-front. Then you could quick-exit the function without calling xmlDocDumpFormatMemory etc. in the first place. For example: doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, GetDatabaseEncoding(), NULL); if (!doc) return 0; -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: [PATCH] Add pretty-printed XML output option
On Thu, Feb 9, 2023 at 7:31 AM Jim Jones wrote: > > while working on another item of the TODO list I realized that I should > be using a PG_TRY() block in the xmlDocDumpFormatMemory call. > > Fixed in v5. > I noticed the xmlFreeDoc(doc) within the PG_CATCH is guarded but the other xmlFreeDoc(doc) is not. As the doc is assigned outside the PG_TRY shouldn't those both be the same? -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Support logical replication of DDLs
Hi Vignesh, thanks for addressing my v63-0002 review comments. I confirmed most of the changes. Below is a quick follow-up for the remaining ones. On Mon, Feb 6, 2023 at 10:32 PM vignesh C wrote: > > On Mon, 6 Feb 2023 at 06:47, Peter Smith wrote: > > ... > > > > 8. > > + value = findJsonbValueFromContainer(container, JB_FOBJECT, ); > > > > Should the code be checking or asserting value is not NULL? > > > > (IIRC I asked this a long time ago - sorry if it was already answered) > > > > Yes, this was already answered by Zheng, quoting as "The null checking > for value is done in the upcoming call of expand_one_jsonb_element()." > in [1] Thanks for the info. I saw that Zheng-san only wrote it is handled in the "upcoming call of expand_one_jsonb_element", but I don't know if that is sufficient. For example, if the execution heads down the other path (expand_jsonb_array) with a NULL jsonarr then it is going to crash, isn't it? So I still think some change may be needed here. > > 11. > > +/* > > + * Expand a JSON value as an operator name. > > + */ > > +static void > > +expand_jsonval_operator(StringInfo buf, JsonbValue *jsonval) > > > > Should this function comment be more like the comment for > > expand_jsonval_dottedname by saying there can be an optional > > "schemaname"? > > Modified Is it really OK for the "objname" to be optional here? (Yes, I know the code is currently implemented as if it is OK, but I am doubtful.) That would mean everything can be optional and the buf result might be nothing. It could also mean if the "schemaname" is provided but the "objname" is not, then the buf will have a trailing ".". It doesn't sound quite right to me. > > > ~~~ > > > > 12.
> > +static bool > > +expand_jsonval_string(StringInfo buf, JsonbValue *jsonval) > > +{ > > + if (jsonval->type == jbvString) > > + { > > + appendBinaryStringInfo(buf, jsonval->val.string.val, > > +jsonval->val.string.len); > > + } > > + else if (jsonval->type == jbvBinary) > > + { > > + json_trivalue present; > > + > > + present = find_bool_in_jsonbcontainer(jsonval->val.binary.data, > > + "present"); > > + > > + /* > > + * If "present" is set to false, this element expands to empty; > > + * otherwise (either true or absent), fall through to expand "fmt". > > + */ > > + if (present == tv_false) > > + return false; > > + > > + expand_fmt_recursive(jsonval->val.binary.data, buf); > > + } > > + else > > + return false; > > + > > + return true; > > +} > > > > I felt this could be simpler if there is a new 'expanded' variable > > because then you can have just a single return point instead of 3 > > returns; > > > > If you choose to do this there is a minor tweak to the "fall through" > > comment. > > > > SUGGESTION > > expand_jsonval_string(StringInfo buf, JsonbValue *jsonval) > > { > > bool expanded = true; > > > > if (jsonval->type == jbvString) > > { > > appendBinaryStringInfo(buf, jsonval->val.string.val, > >jsonval->val.string.len); > > } > > else if (jsonval->type == jbvBinary) > > { > > json_trivalue present; > > > > present = find_bool_in_jsonbcontainer(jsonval->val.binary.data, > > "present"); > > > > /* > > * If "present" is set to false, this element expands to empty; > > * otherwise (either true or absent), expand "fmt". > > */ > > if (present == tv_false) > > expanded = false; > > else > > expand_fmt_recursive(jsonval->val.binary.data, buf); > > } > > > > return expanded; > > } > > I'm not sure if this change is required as this will introduce a new > variable and require it to be set, this variable should be set to > "expand = false" in else after else if also, instead I preferred the > existing code. 
I did not make any change for this unless you are > seeing some bigger optimization. > Sorry, I messed up the previous code suggestion. It should have said: SUGGESTION expand_jsonval_string(StringInfo buf, JsonbValue *jsonval) { bool expanded = false; if (jsonval->type == jbvString) { appendBinaryStringInfo(buf, jsonval->val.string.val, jsonval->val.strin
Re: Deadlock between logrep apply worker and tablesync worker
On Fri, Feb 3, 2023 at 6:58 PM houzj.f...@fujitsu.com wrote: > > On Thursday, February 2, 2023 7:21 PM Amit Kapila > wrote: > > > > On Thu, Feb 2, 2023 at 12:05 PM houzj.f...@fujitsu.com > > wrote: > > > > > > On Tuesday, January 31, 2023 1:07 AM vignesh C > > wrote: > > > > On Mon, 30 Jan 2023 at 17:30, vignesh C wrote: > > > > > > > > > > I also tried to test the time of "src/test/subscription/t/002_types.pl" > > > before and after the patch(change the lock level) and Tom's > > > patch(split > > > transaction) like what Vignesh has shared on -hackers. > > > > > > I run about 100 times for each case. Tom's and the lock level patch > > > behave similarly on my machines[1]. > > > > > > HEAD: 3426 ~ 6425 ms > > > HEAD + Tom: 3404 ~ 3462 ms > > > HEAD + Vignesh: 3419 ~ 3474 ms > > > HEAD + Tom + Vignesh: 3408 ~ 3454 ms > > > > > > Even apart from the testing time reduction, reducing the lock level > > > and lock the specific object can also help improve the lock contention > > > which user(that use the exposed function) , table sync worker and > > > apply worker can also benefit from it. So, I think pushing the patch to > > > change > > the lock level makes sense. > > > > > > And the patch looks good to me. > > > > > > > Thanks for the tests. I also see a reduction in test time variability with > > Vignesh's > > patch. I think we can release the locks in case the origin is concurrently > > dropped as in the attached patch. I am planning to commit this patch > > tomorrow unless there are more comments or objections. > > > > > While on it, after pushing the patch, I think there is another case > > > might also worth to be improved, that is the table sync and apply > > > worker try to drop the same origin which might cause some delay. This > > > is another case(different from the deadlock), so I feel we can try to > > > improve > > this in another patch. 
> > > > > > > Right, I think that case could be addressed by Tom's patch to some extent > > but > > I am thinking we should also try to analyze if we can completely avoid the > > need > > to remove origins from both processes. One idea could be to introduce > > another relstate something like PRE_SYNCDONE and set it in a separate > > transaction before we set the state as SYNCDONE and remove the slot and > > origin in tablesync worker. > > Now, if the tablesync worker errors out due to some reason during the second > > transaction, it can remove the slot and origin after restart by checking > > the state. > > However, it would add another relstate which may not be the best way to > > address this problem. Anyway, that can be accomplished as a separate patch. > > Here is an attempt to achieve the same. > Basically, the patch removes the code that drop the origin in apply worker. > And > add a new state PRE_SYNCDONE after synchronization finished in front of apply > (sublsn set), but before dropping the origin and other final cleanups. The > tablesync will restart and redo the cleanup if it failed after reaching the > new > state. Besides, since the changes can already be applied on the table in > PRE_SYNCDONE state, so I also modified the check in > should_apply_changes_for_rel(). And some other conditions for the origin drop > in subscription commands are were adjusted in this patch. > BTW, the tablesync.c has a large file header comment which describes all about the relstates including some examples. So this patch needs to include modifications to that comment. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Wed, Feb 8, 2023 at 8:03 PM Hayato Kuroda (Fujitsu) wrote: > ... > > == > > > > src/backend/replication/logical/worker.c > > > > 2. maybe_apply_delay > > > > + if (wal_receiver_status_interval > 0 && > > + diffms > wal_receiver_status_interval * 1000L) > > + { > > + WaitLatch(MyLatch, > > + WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, > > + wal_receiver_status_interval * 1000L, > > + WAIT_EVENT_RECOVERY_APPLY_DELAY); > > + send_feedback(last_received, true, false, true); > > + } > > > > I felt that introducing another variable like: > > > > long statusinterval_ms = wal_receiver_status_interval * 1000L; > > > > would help here by doing 2 things: > > 1) The condition would be easier to read because the ms units would be the > > same > > 2) Won't need * 1000L repeated in two places. > > > > Only, do take care to assign this variable in the right place in this > > loop in case the configuration is changed. > > Fixed. Calculations are done on two lines - the first one is at the entrance of the loop, > and the second one is after SIGHUP is detected. > TBH, I expected you would write this as just a *single* variable assignment before the condition like below: SUGGESTION (tweaked comment and put single assignment before condition) /* * Call send_feedback() to prevent the publisher from exiting by * timeout during the delay, when the status interval is greater than * zero. */ status_interval_ms = wal_receiver_status_interval * 1000L; if (status_interval_ms > 0 && diffms > status_interval_ms) { ... ~ I understand that, in theory, your code is more efficient, but in practice, I think the overhead of a single variable assignment every loop iteration (which is doing WaitLatch anyway) is of insignificant concern, whereas having one assignment is simpler than having two IMO. But, if you want to keep it the way you have then that is OK. Otherwise, this patch v32 LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Deadlock between logrep apply worker and tablesync worker
On Tue, Feb 7, 2023 at 6:46 PM houzj.f...@fujitsu.com wrote: > > On Tuesday, February 7, 2023 12:12 PM Peter Smith > wrote: > > On Fri, Feb 3, 2023 at 6:58 PM houzj.f...@fujitsu.com > > > > wrote: > > > > > ... > > > > Right, I think that case could be addressed by Tom's patch to some > > > > extent but I am thinking we should also try to analyze if we can > > > > completely avoid the need to remove origins from both processes. One > > > > idea could be to introduce another relstate something like > > > > PRE_SYNCDONE and set it in a separate transaction before we set the > > > > state as SYNCDONE and remove the slot and origin in tablesync worker. > > > > Now, if the tablesync worker errors out due to some reason during > > > > the second transaction, it can remove the slot and origin after restart > > > > by > > checking the state. > > > > However, it would add another relstate which may not be the best way > > > > to address this problem. Anyway, that can be accomplished as a separate > > patch. > > > > > > Here is an attempt to achieve the same. > > > Basically, the patch removes the code that drop the origin in apply > > > worker. And add a new state PRE_SYNCDONE after synchronization > > > finished in front of apply (sublsn set), but before dropping the > > > origin and other final cleanups. The tablesync will restart and redo > > > the cleanup if it failed after reaching the new state. Besides, since > > > the changes can already be applied on the table in PRE_SYNCDONE state, > > > so I also modified the check in should_apply_changes_for_rel(). And > > > some other conditions for the origin drop in subscription commands are > > were adjusted in this patch. > > > > > > > Here are some review comments for the 0001 patch > > > > == > > General Comment > > > > 0. > > The idea of using the extra relstate for clean-up seems OK, but the > > implementation of the new state in this patch appears misordered and > > misnamed to me. 
> > > > The state name should indicate what it is doing (PRE_SYNCDONE is > > meaningless). The patch describes in several places that this state means > > "synchronized, but not yet cleaned up" therefore IMO it means the SYNCDONE > > state should be *before* this new state. And since this new state is for > > "cleanup" then let's call it something like that. > > > > To summarize, I don’t think the meaning of SYNCDONE should be touched. > > SYNCDONE means the synchronization is done, same as before. And your new > > "cleanup" state belongs directly *after* that. IMO it should be like this: > > > > 1. STATE_INIT > > 2. STATE_DATASYNC > > 3. STATE_FINISHEDCOPY > > 4. STATE_SYNCDONE > > 5. STATE_CLEANUP <-- new relstate > > 6. STATE_READY > > > > Of course, this is going to impact almost every aspect of the patch, but I > > think > > everything will be basically the same as you have it now > > -- only all the state names and comments need to be adjusted according to > > the > > above. > > Although I agree the CLEANUP is easier to understand, but I am a bit concerned > that the changes would be a bit invasive. > > If we add a CLEANUP state at the end as suggested, it will change the meaning > of the existing SYNCDONE state, before the change it means both data sync and > cleanup have been done, but after the change it only mean the data sync is > over. This also means all the current C codes that considered the SYNCDONE as > the final state of table sync will need to be changed. Moreover, it's common > for user to query the relation state from pg_subscription_rel to identify if > the table sync of a table is finished(e.g. check relstate IN ('r', 's')), but > if we add a new state(CLEANUP) as the final state, then all these SQLs would > need to be changed as they need to check like relstate IN ('r', 'x'(new > cleanup > state)). 
> IIUC, you are saying that we still want to keep the SYNCDONE state as the last state before READY mainly because otherwise there is too much impact on user/test SQL that is currently checking those ('s','r') states. OTOH, in the current 001 patch you had the SUBREL_STATE_PRE_SYNCDONE meaning "synchronized but not yet cleaned up" (that's verbatim from your PGDOCS). And there is C code where you are checking SUBREL_STATE_PRE_SYNCDONE and essentially giving the state before the SYNCDONE an equal status to the SYNCDONE (e.g. should_apply_changes_for_rel seemed to be doing th
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are my review comments for v31-0001 == doc/src/sgml/glossary.sgml 1. + + Replication setup that applies time-delayed copy of the data. + That sentence seemed a bit strange to me. SUGGESTION Replication setup that delays the application of changes by a specified minimum time-delay period. == src/backend/replication/logical/worker.c 2. maybe_apply_delay + if (wal_receiver_status_interval > 0 && + diffms > wal_receiver_status_interval * 1000L) + { + WaitLatch(MyLatch, + WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, + wal_receiver_status_interval * 1000L, + WAIT_EVENT_RECOVERY_APPLY_DELAY); + send_feedback(last_received, true, false, true); + } I felt that introducing another variable like: long statusinterval_ms = wal_receiver_status_interval * 1000L; would help here by doing 2 things: 1) The condition would be easier to read because the ms units would be the same 2) Won't need * 1000L repeated in two places. Only, do take care to assign this variable in the right place in this loop in case the configuration is changed. == src/test/subscription/t/001_rep_changes.pl 3. +# Test time-delayed logical replication +# +# If the subscription sets min_apply_delay parameter, the logical replication +# worker will delay the transaction apply for min_apply_delay milliseconds. We +# look the time duration between tuples are inserted on publisher and then +# changes are replicated on subscriber. This comment and the other one appearing later in this test are both explaining the same test strategy. I think both comments should be combined into one big one up-front, like this: SUGGESTION If the subscription sets min_apply_delay parameter, the logical replication worker will delay the transaction apply for min_apply_delay milliseconds. We verify this by looking at the time difference between a) when tuples are inserted on the publisher, and b) when those changes are replicated on the subscriber. Even on slow machines, this strategy will give predictable behavior. ~~ 4. 
+my $delay = 3; + +# Set min_apply_delay parameter to 3 seconds +$node_subscriber->safe_psql('postgres', + "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')"); IMO that "my $delay = 3;" assignment should be *after* the comment: e.g. + +# Set min_apply_delay parameter to 3 seconds +my $delay = 3; +$node_subscriber->safe_psql('postgres', + "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')"); ~~~ 5. +# Make new content on publisher and check its presence in subscriber depending +# on the delay applied above. Before doing the insertion, get the +# current timestamp that will be used as a comparison base. Even on slow +# machines, this allows to have a predictable behavior when comparing the +# delay between data insertion moment on publisher and replay time on subscriber. Most of this comment is now redundant because this was already explained in the big comment up-front (see #3). Only one useful sentence is left. SUGGESTION Before doing the insertion, get the current timestamp that will be used as a comparison base. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Tue, Feb 7, 2023 at 4:02 PM Amit Kapila wrote: > > On Tue, Feb 7, 2023 at 10:07 AM Kyotaro Horiguchi > wrote: > > > > At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila > > wrote in > > > On Tue, Feb 7, 2023 at 6:03 AM Peter Smith wrote: > > > > 5b. > > > > Since there are no translator considerations here why not write the > > > > second error like: > > > > > > > > errmsg("%d ms is outside the valid range for parameter > > > > \"min_apply_delay\" (%d .. %d)", > > > > result, 0, PG_INT32_MAX)) > > > > > > > > > > I see that existing usage in the code matches what the patch had > > > before this comment. See below and similar usages in the code. > > > if (start <= 0) > > > ereport(ERROR, > > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > > errmsg("invalid value for parameter \"%s\": %d", > > > "start", start))); > > > > The same errmsg text occurs many times in the tree. On the other hand > > the pointed message is the only one. I suppose Peter considered this > > aspect. > > > > # "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)" > > # also appears just once > > > > As for me, it seems to me a good practice to do that regardless of the > > number of duplicates to (semi)mechanically avoid duplicates. > > > > (But I believe I would do as Peter suggests by myself for the first > > cut, though:p) > > > > Personally, I would prefer consistency. I think we can later start a > new thread to change the existing message and if there is a consensus > and value in the same then we could use the same style here as well. > Of course, if there is a convention then we should stick to it. My understanding was that (string literal) message parameters are specified separately from the message format string primarily as an aid to translators. That makes good sense for parameters with names that are also English words (like "start" etc), but for non-word parameters like "min_apply_delay" there is no such ambiguity in the first place.
Anyway, I am fine with it being written either way. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Deadlock between logrep apply worker and tablesync worker
= synchronization done, c = clean-up done, r = ready (normal replication) == src/backend/commands/subscriptioncmds.c 4. AlterSubscription_refresh Some adjustments are needed according to my "General Comment". ~~~ 5. DropSubscription Some adjustments are needed according to my "General Comment". == src/backend/replication/logical/tablesync.c 6. + * Update the state of the table to SUBREL_STATE_SYNCDONE and cleanup the + * tablesync slot and drop the tablesync's origin tracking. + */ +static void +finish_synchronization(bool restart_after_crash) 6a. Suggest calling this function something like 'cleanup_after_synchronization' ~ 6b. Some adjustments to states and comments are needed according to my "General Comment". ~~~ 7. process_syncing_tables_for_sync - MyLogicalRepWorker->relstate = SUBREL_STATE_SYNCDONE; + MyLogicalRepWorker->relstate = SUBREL_STATE_PRE_SYNCDONE; MyLogicalRepWorker->relstate_lsn = current_lsn; This should just be setting SUBREL_STATE_SYNCDONE as it previously did. Other states/comments in this function should change according to my "General Comment". ~ 8. if (rstate->state == SUBREL_STATE_SYNCDONE) { /* * Apply has caught up to the position where the table sync has * finished. Mark the table as ready so that the apply will just * continue to replicate it normally. */ That should now be checking for SUBREL_STATE_CLEANUPDONE according to my "General Comment" ~~~ 9. process_syncing_tables_for_apply Some adjustments to states and comments are needed according to my "General Comment". ~~~ 10. LogicalRepSyncTableStart Some adjustments to states and comments are needed according to my "General Comment". == src/backend/replication/logical/worker.c 11. should_apply_changes_for_rel Some adjustments to states according to my "General Comment". == src/include/catalog/pg_subscription_rel.h 12.
@@ -62,8 +62,10 @@ DECLARE_UNIQUE_INDEX_PKEY(pg_subscription_rel_srrelid_srsubid_index, 6117, Subsc * NULL) */ #define SUBREL_STATE_FINISHEDCOPY 'f' /* tablesync copy phase is completed * (sublsn NULL) */ -#define SUBREL_STATE_SYNCDONE 's' /* synchronization finished in front of - * apply (sublsn set) */ +#define SUBREL_STATE_PRE_SYNCDONE 'p' /* synchronization finished in front of + * apply (sublsn set), but the final + * cleanup has not yet been performed */ +#define SUBREL_STATE_SYNCDONE 's' /* synchronization complete */ #define SUBREL_STATE_READY 'r' /* ready (sublsn set) */ Some adjustments to states and comments are needed according to my "General Comment". -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are my review comments for v29-0001. == Commit Message 1. Discussion: https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com tmp ~ What's that "tmp" doing there? A typo? == doc/src/sgml/catalogs.sgml 2. + + + subminapplydelay int4 + + + The minimum delay (ms) for applying changes. + + For consistency remove the period (.) because the other single-sentence descriptions on this page do not have one. == src/backend/commands/subscriptioncmds.c 3. AlterSubscription + errmsg("cannot set parallel streaming mode for subscription with %s", +"min_apply_delay")); Since there are no translator considerations here why not write it like this: errmsg("cannot set parallel streaming mode for subscription with min_apply_delay") ~~~ 4. AlterSubscription + errmsg("cannot set %s for subscription in parallel streaming mode", +"min_apply_delay")); Since there are no translator considerations here why not write it like this: errmsg("cannot set min_apply_delay for subscription in parallel streaming mode") ~~~ 5. +defGetMinApplyDelay(DefElem *def) +{ + char*input_string; + int result; + const char *hintmsg; + + input_string = defGetString(def); + + /* + * Parse given string as parameter which has millisecond unit + */ + if (!parse_int(input_string, , GUC_UNIT_MS, )) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid value for parameter \"%s\": \"%s\"", + "min_apply_delay", input_string), + hintmsg ? errhint("%s", _(hintmsg)) : 0)); + + /* + * Check both the lower boundary for the valid min_apply_delay range and + * the upper boundary as the safeguard for some platforms where INT_MAX is + * wider than int32 respectively. Although parse_int() has confirmed that + * the result is less than or equal to INT_MAX, the value will be stored + * in a catalog column of int32. 
+ */ + if (result < 0 || result > PG_INT32_MAX) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)", + result, + "min_apply_delay", + 0, PG_INT32_MAX))); + + return result; +} 5a. Since there are no translator considerations here why not write the first error like: errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"", input_string) ~ 5b. Since there are no translator considerations here why not write the second error like: errmsg("%d ms is outside the valid range for parameter \"min_apply_delay\" (%d .. %d)", result, 0, PG_INT32_MAX)) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Sat, Feb 4, 2023 at 5:04 PM Takamichi Osumi (Fujitsu) wrote: > ... > > Kindly have a look at the attached v27. > Here are some review comments for patch v27-0001. == src/test/subscription/t/032_apply_delay.pl 1. +# Confirm the time-delayed replication has been effective from the server log +# message where the apply worker emits for applying delay. Moreover, verify +# that the current worker's remaining wait time is sufficiently bigger than the +# expected value, in order to check any update of the min_apply_delay. +sub check_apply_delay_log ~ "has been effective from the server log" --> "worked, by inspecting the server log" ~~~ 2. +my $delay = 3; Might be better to name this variable as 'min_apply_delay'. ~~~ 3. +# Now wait for replay to complete on publisher. We're done waiting when the +# subscriber has applyed up to the publisher LSN. +$node_publisher->wait_for_catchup($appname); 3a. Something seemed wrong with the comment. Was it meant to say more like? "The publisher waits for the replication to complete". Typo: "applyed" ~ 3b. Instead of doing this wait_for_catchup stuff why don't you just use a synchronous pub/sub and then the publication will just block internally like you require but without you having to block using test code? ~~~ 4. +# Run a query to make sure that the reload has taken effect. +$node_publisher->safe_psql('postgres', q{SELECT 1}); SUGGESTION (for the comment) # Running a dummy query causes the config to be reloaded. ~~~ 5. +# Confirm the record is not applied expectedly +my $result = $node_subscriber->safe_psql('postgres', + "SELECT count(a) FROM tab_int WHERE a = 0;"); +is($result, qq(0), "check the delayed transaction was not applied"); "expectedly" ?? SUGGESTION (for comment) # Confirm the record was not applied -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
s also be checking/asserting that the type is jbvNumeric? ~~~ 14. +/* + * Expand a JSON value as a role name. If the is_public element is set to + * true, PUBLIC is expanded (no quotes); otherwise, expand the given role name, + * quoting as an identifier. + */ +static void +expand_jsonval_role(StringInfo buf, JsonbValue *jsonval) Maybe more readable to quote that param? BEFORE If the is_public element is set... AFTER If the 'is_public' element is set... ~~~ 15. + * + * Returns false if no actual expansion was made (due to the "present" flag + * being set to "false" in formatted string expansion). + */ +static bool +expand_one_jsonb_element(StringInfo buf, char *param, JsonbValue *jsonval, + convSpecifier specifier, const char *fmt) +{ + bool result = true; + ErrorContextCallback sqlerrcontext; ~ 15a. Looking at the implementation, maybe that comment can be made more clear. Something like below: SUGGESTION Returns true, except for the formatted string case if no actual expansion was made (due to the "present" flag being set to "false"). ~ 15b. Maybe use a better variable name. "result" --> "string_expanded" == src/include/catalog/pg_proc.dat 16. @@ -11891,4 +11891,10 @@ prorettype => 'bytea', proargtypes => 'pg_brin_minmax_multi_summary', prosrc => 'brin_minmax_multi_summary_send' }, +{ oid => '4642', descr => 'deparse the DDL command into JSON format string', + proname => 'ddl_deparse_to_json', prorettype => 'text', + proargtypes => 'pg_ddl_command', prosrc => 'ddl_deparse_to_json' }, +{ oid => '4643', descr => 'expand JSON format DDL to a plain DDL command', + proname => 'ddl_deparse_expand_command', prorettype => 'text', + pr 16a. "deparse the DDL command into JSON format string" ==> "deparse the DDL command into a JSON format string" ~ 16b. "expand JSON format DDL to a plain DDL command" --> "expand JSON format DDL to a plain text DDL command" == src/include/tcop/ddl_deparse.h 17. 
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group "2022" --> "2023" ~~~ +extern char *deparse_utility_command(CollectedCommand *cmd, bool verbose_mode); +extern char *deparse_ddl_json_to_string(char *jsonb); +extern char *deparse_drop_command(const char *objidentity, const char *objecttype, + DropBehavior behavior); + + +#endif /* DDL_DEPARSE_H */ Double blank lines. == src/include/tcop/deparse_utility.h 18. @@ -100,6 +103,12 @@ typedef struct CollectedCommand { ObjectType objtype; } defprivs; + + struct + { + ObjectAddress address; + Node *real_create; + } ctas; } d; All the other sub-structures have comments. IMO this one should have a comment too. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Support logical replication of DDLs
On Fri, Feb 3, 2023 at 9:21 PM Alvaro Herrera wrote: > > On 2023-Feb-03, Peter Smith wrote: > ... > > 3. ExecuteGrantStmt > > > > + /* Copy the grantor id needed for DDL deparsing of Grant */ > > + istmt.grantor_uid = grantor; > > + > > > > SUGGESTION (comment) > > Copy the grantor id to the parsetree, needed for DDL deparsing of Grant > > Is istmt really "the parse tree" actually? As I recall, it's a derived > struct that's created during execution of the grant/revoke command, so > modifying the comment like this would be a mistake. > I thought this comment was analogous to another one from this same patch 0001 (see seclabel.c), so the suggested change above was simply to make the wording consistent. @@ -134,6 +134,9 @@ ExecSecLabelStmt(SecLabelStmt *stmt) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("must specify provider when multiple security label providers have been loaded"))); provider = (LabelProvider *) linitial(label_provider_list); + + /* Copy the provider name to the parsetree, needed for DDL deparsing of SecLabelStmt */ + stmt->provider = pstrdup(provider->provider_name); So if the suggestion for the ExecuteGrantStmt comment was a mistake then perhaps the ExecSecLabelStmt comment is wrong also? -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Fri, Feb 3, 2023 at 4:21 PM Amit Kapila wrote: > > On Fri, Feb 3, 2023 at 6:41 AM Peter Smith wrote: > > > > On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu) > > wrote: > > > > > ... > > > > > > > > > > Besides, I am not sure it's a stable test to check the log. Is it > > > > possible that there's > > > > no such log on a slow machine? I modified the code to sleep 1s at the > > > > beginning > > > > of apply_dispatch(), then the new added test failed because the server > > > > log > > > > cannot match. > > > To get the log by itself is necessary to ensure > > > that the delay is conducted by the apply worker, because we emit the > > > diffms > > > only if it's bigger than 0 in maybe_apply_delay(). If we omit the step, > > > we are not sure the delay is caused by other reasons or the time-delayed > > > feature. > > > > > > As you mentioned, it's possible that no log is emitted on slow machine. > > > Then, > > > the idea to make the test safer for such machines should be to make the > > > delayed time longer. > > > But we shortened the delay time to 1 second to mitigate the long test > > > execution time of this TAP test. > > > So, I'm not sure if it's a good idea to make it longer again. > > > > I think there are a couple of things that can be done about this problem: > > > > 1. If you need the code/test to remain as-is then at least the test > > message could include some comforting text like "(this can fail on > > slow machines when the delay time is already exceeded)" so then a test > > failure will not cause undue alarm. > > > > 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that > > it will *always* log the remaining wait time even if that wait time > > becomes negative. Then I think the test cases can be made > > deterministic instead of relying on good luck. This seems like the > > better option. 
> > > > I don't understand why we have to do any of this instead of using 3s > as min_apply_delay similar to what we are doing in > src/test/recovery/t/005_replay_delay. Also, I think we should use > exactly the same way to verify the test even though we want to keep > the log level as DEBUG2 to check logs in case of any failures. > IIUC the reasons are due to conflicting requirements. e.g. - A longer delay like 3s might work better for testing this feature, but OTOH - A longer delay will also cause the whole BF execution to take longer -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Time delayed LR (WAS Re: logical replication restrictions)
Here are my review comments for patch v26-0001.

On Thu, Feb 2, 2023 at 7:18 PM Takamichi Osumi (Fujitsu) wrote:
>
> Hi,
>
> On Wednesday, February 1, 2023 1:37 PM Peter Smith wrote:
> > Here are my review comments for the patch v25-0001.
> Thank you for your review !
>
> > 8.
> > + if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
> > + opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
> > + ereport(ERROR,
> > + errcode(ERRCODE_SYNTAX_ERROR),
> >
> > Saying "> 0" (in the condition) is not strictly necessary here, since it is
> > never < 0.
> This check is necessary.
>
> For example, imagine a case when we CREATE a subscription with streaming = on
> and then try to ALTER the subscription with streaming = parallel
> without any settings for min_apply_delay. The ALTER command
> throws an error of "min_apply_delay > 0 and streaming = parallel are
> mutually exclusive options." then.
>
> This is because min_apply_delay is supported by ALTER command
> (so the first condition becomes true) and we set
> streaming = parallel (which makes the 2nd condition true).
>
> So, we need to check the opts's actual min_apply_delay value
> to make the irrelavent case pass.

I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the > 0 as:

if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
    opts->min_apply_delay && opts->streaming == LOGICALREP_STREAM_PARALLEL)
    ereport(ERROR,

> > ~~~
> >
> > 9. AlterSubscription
> >
> > + /*
> > + * The combination of parallel streaming mode and
> > + * min_apply_delay is not allowed. See
> > + * parse_subscription_options for details of the reason.
> > + */
> > + if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
> > + if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
> > + (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
> >
> > Saying "> 0" (in the condition) is not strictly necessary here, since it is
> > never < 0.
> This is also necessary.
>
> For example, imagine a case that
> there is a subscription whose min_apply_delay is 1 day.
> Then, you want to try to execute ALTER SUBSCRIPTION
> with (min_apply_delay = 0, streaming = parallel).
> If we remove the condition of otps.min_apply_delay > 0,
> then we error out in this case too.
>
> First we pass the first condition
> of the opts.streaming == LOGICALREP_STREAM_PARALLEL,
> since we use streaming option.
> Then, we also set min_apply_delay in this example,
> then without checking the value of min_apply_delay,
> the second condition becomes true
> (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)).
>
> So, we need to make this case (min_apply_delay = 0) pass.
> Meanwhile, checking the "sub" value is necessary for checking existing
> subscription value.

I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the > 0 as:

if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
    if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay) ||
        (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay))
        ereport(ERROR,

> > ~~~
> >
> > 10.
> > + if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
> > + {
> > + /*
> > + * The combination of parallel streaming mode and
> > + * min_apply_delay is not allowed.
> > + */
> > + if (opts.min_apply_delay > 0)
> >
> > Saying "> 0" (in the condition) is not strictly necessary here, since it is
> > never < 0.
> This is also required to check the value equals to 0 or not.
> Kindly imagine a case when we want to execute ALTER min_apply_delay from 1day
> with a pair of (min_apply_delay = 0 and streaming = parallel).
> If we remove this check, then this ALTER command fails with error.
> Without the check, when we set min_apply_delay and parallel streaming mode,
> even when making the min_apply_delay 0, the error is invoked.
>
> The check for sub.stream is necessary for existing definition of target
> subscription.

I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the > 0 as:

if (opts.min_apply_delay)
    if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
        (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
        ereport(ERROR,

--
Kind Regards,
Peter Smith.
Fujitsu Australia
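To spell out why dropping the "> 0" is safe: for an int that is known to be non-negative, `x > 0` and plain `x` evaluate identically in a boolean context. A throwaway sketch of that equivalence (the function names are hypothetical, not the patch code):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the check, spelled with the explicit "> 0". */
bool
incompatible_explicit(int min_apply_delay, bool streaming_parallel)
{
    return min_apply_delay > 0 && streaming_parallel;
}

/* The same check spelled with implicit truthiness, as suggested. */
bool
incompatible_implicit(int min_apply_delay, bool streaming_parallel)
{
    return min_apply_delay && streaming_parallel;
}
```

For every non-negative delay and either streaming mode the two forms agree; they would only differ for negative delays, which cannot occur here.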
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu) wrote: > ... > > > > Besides, I am not sure it's a stable test to check the log. Is it possible > > that there's > > no such log on a slow machine? I modified the code to sleep 1s at the > > beginning > > of apply_dispatch(), then the new added test failed because the server log > > cannot match. > To get the log by itself is necessary to ensure > that the delay is conducted by the apply worker, because we emit the diffms > only if it's bigger than 0 in maybe_apply_delay(). If we omit the step, > we are not sure the delay is caused by other reasons or the time-delayed > feature. > > As you mentioned, it's possible that no log is emitted on slow machine. Then, > the idea to make the test safer for such machines should be to make the > delayed time longer. > But we shortened the delay time to 1 second to mitigate the long test > execution time of this TAP test. > So, I'm not sure if it's a good idea to make it longer again. I think there are a couple of things that can be done about this problem: 1. If you need the code/test to remain as-is then at least the test message could include some comforting text like "(this can fail on slow machines when the delay time is already exceeded)" so then a test failure will not cause undue alarm. 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that it will *always* log the remaining wait time even if that wait time becomes negative. Then I think the test cases can be made deterministic instead of relying on good luck. This seems like the better option. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
of the given pg_proc row into the output
+ * buffer. force_qualify indicates whether to schema-qualify type names
+ * regardless of visibility.
+ */
+static void
+format_procedure_args_internal(Form_pg_proc procform, StringInfo buf,
+                               bool force_qualify)
+{
+ int i;
+ char *(*func[2]) (Oid) = {format_type_be, format_type_be_qualified};
+
+ appendStringInfoChar(buf, '(');
+ for (i = 0; i < procform->pronargs; i++)
+ {
+ Oid thisargtype = procform->proargtypes.values[i];
+ char *argtype = NULL;
+
+ if (i > 0)
+ appendStringInfoChar(buf, ',');
+
+ argtype = func[force_qualify](thisargtype);
+ appendStringInfoString(buf, argtype);
+ pfree(argtype);
+ }
+ appendStringInfoChar(buf, ')');
+}

9a.
Assign argtype = NULL looks redundant because it will always be
overwritten anyhow.

~

9b.
I understand why this function was put here beside the other static
functions in "Support Routines" but IMO it really belongs nearby (i.e.
directly above) the only caller (format_procedure_args). Keeping both
those functions together will improve the readability of both, and will
also remove the need to have the static forward declaration.

==
src/backend/utils/adt/ruleutils.c

10.
+void
+pg_get_ruledef_detailed(Datum ev_qual, Datum ev_action,
+                        char **whereClause, List **actions)
+{
+ int prettyFlags = 0;
+ char *qualstr = TextDatumGetCString(ev_qual);
+ char *actionstr = TextDatumGetCString(ev_action);
+ List *actionNodeList = (List *) stringToNode(actionstr);
+ StringInfoData buf;
+
+ *whereClause = NULL;
+ *actions = NIL;
+ initStringInfo(&buf);
+ if (strlen(qualstr) > 0 && strcmp(qualstr, "<>") != 0)
+ {

If you like, that condition could have been written more simply as:

if (*qualstr && strcmp(qualstr, "<>") != 0)

~~~

11.
+/*
+ * Parse back the TriggerWhen clause of a trigger given the pg_trigger record and
+ * the expression tree (in nodeToString() representation) from pg_trigger.tgqual
+ * for the trigger's WHEN condition.
+ */
+char *
+pg_get_trigger_whenclause(Form_pg_trigger trigrec, Node *whenClause, bool pretty)
+{

It seemed "Parse back" is a typo. I assume it was meant to say
something like "Passes back", or maybe just "Returns" is better.

==
src/include/replication/logicalrelation.h

12.
@@ -14,6 +14,7 @@
 #include "access/attmap.h"
 #include "replication/logicalproto.h"
+#include "storage/lockdefs.h"

What is this needed here for? I tried without this change and
everything still builds OK.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
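On the condition simplification suggested in comment 10: `*qualstr` is the idiomatic constant-time empty-string test, equivalent here to `strlen(qualstr) > 0`. A tiny standalone sketch (the helper name is hypothetical, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when qualstr is non-empty and is not the
 * "<>" placeholder, written with the suggested *qualstr spelling.
 */
bool
has_real_qual(const char *qualstr)
{
    /* *qualstr is false exactly when the string is empty ('\0' first) */
    return *qualstr && strcmp(qualstr, "<>") != 0;
}
```

The behavior is identical to the `strlen(qualstr) > 0` form; the only difference is that the dereference does not walk the whole string.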
Re: Perform streaming logical transactions by background workers and parallel apply
Some minor review comments for v91-0001 == doc/src/sgml/config.sgml 1. -Allows streaming or serializing changes immediately in logical decoding. -The allowed values of logical_replication_mode are -buffered and immediate. When set -to immediate, stream each change if +The allowed values are buffered and +immediate. The default is buffered. +This parameter is intended to be used to test logical decoding and +replication of large transactions for which otherwise we need to generate +the changes till logical_decoding_work_mem is +reached. The effect of logical_replication_mode is +different for the publisher and subscriber: + The "for which otherwise..." part is only relevant for the publisher-side. So it seemed slightly strange to give the reason why to use the GUC for one side but not the other side. Maybe we can just to remove that "for which otherwise..." part, since the logical_decoding_work_mem gets mentioned later in the "On the publisher side,..." paragraph anyway. ~~~ 2. -This parameter is intended to be used to test logical decoding and -replication of large transactions for which otherwise we need to -generate the changes till logical_decoding_work_mem -is reached. +On the subscriber side, if the streaming option is set to +parallel, logical_replication_mode +can be used to direct the leader apply worker to send changes to the +shared memory queue or to serialize changes to the file. When set to +buffered, the leader sends changes to parallel apply +workers via a shared memory queue. When set to +immediate, the leader serializes all changes to files +and notifies the parallel apply workers to read and apply them at the +end of the transaction. "or serialize changes to the file." --> "or serialize all changes to files." (just to use same wording as later in this same paragraph, and also same wording as the GUC hint text). -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
time.

~~~

8.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),

Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.

~~~

9. AlterSubscription

+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))

Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.

~~~

10.
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)

Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.

~~~

11. defGetMinApplyDelay

+ /*
+ * Check lower bound. parse_int() has already been confirmed that result
+ * is less than or equal to INT_MAX.
+ */

The parse_int already checks < INT_MAX. But on return from that
function, don't you need to check again that it is < PG_INT32_MAX (in
case those are different)?

(I think Kuroda-san already suggested same as this)

==
src/backend/replication/logical/worker.c

12.
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not make positions of the flushed and apply LSN overwritten
+ * by the last received latest LSN. See send_feedback() for details.
+ */

"we should not make positions of the flushed and apply LSN overwritten"
--> "we should not overwrite positions of the flushed and apply LSN"

~~~

14. send_feedback

@@ -3738,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
 /*
  * No outstanding transactions to flush, we can report the latest received
  * position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, it just
+ * sends a feedback message to avoid a replication timeout during the
+ * delay.
  */

"Instead, it just sends" --> "Instead, just send"

==
src/bin/pg_dump/pg_dump.h

15. SubscriptionInfo

@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
 char *subdisableonerr;
 char *suborigin;
 char *subsynccommit;
+ int subminapplydelay;
 char *subpublications;
} SubscriptionInfo;

Should this also be "int32" to match the other member type changes?

==
src/test/subscription/t/032_apply_delay.pl

16.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");

"knows to wait for more than" --> "waits for more than"

(this occurs in a couple of places)

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Logical replication timeout problem
Here are my review comments for v13-1.

==
Commit message

1.
The DDLs like Refresh Materialized views that generate lots of temporary
data due to rewrite rules may not be processed by output plugins (for
example pgoutput). So, we won't send keep-alive messages for a long time
while processing such commands and that can lead the subscriber side to
timeout.

~

SUGGESTION (minor rearranged way to say the same thing)
For DDLs that generate lots of temporary data due to rewrite rules
(e.g. REFRESH MATERIALIZED VIEW) the output plugins (e.g. pgoutput) may
not be processed for a long time. Since we don't send keep-alive
messages while processing such commands that can lead the subscriber
side to timeout.

~~~

2.
The commit message says what the problem is, but it doesn't seem to
describe what this patch does to fix the problem.

==
src/backend/replication/logical/reorderbuffer.c

3.
+ /*
+ * It is possible that the data is not sent to downstream for a
+ * long time either because the output plugin filtered it or there
+ * is a DDL that generates a lot of data that is not processed by
+ * the plugin. So, in such cases, the downstream can timeout. To
+ * avoid that we try to send a keepalive message if required.
+ * Trying to send a keepalive message after every change has some
+ * overhead, but testing showed there is no noticeable overhead if
+ * we do it after every ~100 changes.
+ */

3a.
"data is not sent to downstream" --> "data is not sent downstream" (?)

~

3b.
"So, in such cases," --> "In such cases,"

~~~

4.
+#define CHANGES_THRESHOLD 100
+
+ if (++changes_count >= CHANGES_THRESHOLD)
+ {
+ rb->update_progress_txn(rb, txn, change->lsn);
+ changes_count = 0;
+ }

I was wondering if it would have been simpler to write this code like
below. Also, by doing it this way the 'changes_count' variable name
makes more sense IMO, otherwise (for current code) maybe it should be
called something like 'changes_since_last_keepalive'.

SUGGESTION
if (++changes_count % CHANGES_THRESHOLD == 0)
    rb->update_progress_txn(rb, txn, change->lsn);

--
Kind Regards,
Peter Smith.
Fujitsu Australia
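The modulo form in the SUGGESTION can be exercised in isolation. Below is a minimal standalone sketch (the names are hypothetical; the real code calls rb->update_progress_txn rather than incrementing a counter), confirming the callback fires exactly once per CHANGES_THRESHOLD changes:

```c
#include <assert.h>

#define CHANGES_THRESHOLD 100

/*
 * Count how many keepalive-style callbacks fire while processing
 * nchanges changes, using the suggested modulo style: the counter
 * is never reset, and the action fires every CHANGES_THRESHOLD-th
 * change.
 */
int
count_keepalives(int nchanges)
{
    int changes_count = 0;
    int fired = 0;

    for (int i = 0; i < nchanges; i++)
    {
        if (++changes_count % CHANGES_THRESHOLD == 0)
            fired++;            /* rb->update_progress_txn(...) in the patch */
    }
    return fired;
}
```

Behavior matches the reset-to-zero version in the patch hunk: nothing fires for the first 99 changes, then once at change 100, 200, and so on.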
Re: Perform streaming logical transactions by background workers and parallel apply
Thanks for the updates to address all of my previous review comments.

Patch v90-0001 LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
On Mon, Jan 30, 2023 at 5:23 PM houzj.f...@fujitsu.com wrote: > > On Monday, January 30, 2023 12:13 PM Peter Smith > wrote: > > > > Here are my review comments for v88-0002. > > Thanks for your comments. > > > > > == > > General > > > > 1. > > The test cases are checking the log content but they are not checking for > > debug logs or untranslated elogs -- they are expecting a normal ereport LOG > > that might be translated. I’m not sure if that is OK, or if it is a > > potential problem. > > We have tests that check the ereport ERROR and ereport WARNING message(by > search for the ERROR or WARNING keyword for all the tap tests), so I think > checking the LOG should be fine. > > > == > > doc/src/sgml/config.sgml > > > > 2. > > On the publisher side, logical_replication_mode allows allows streaming or > > serializing changes immediately in logical decoding. When set to immediate, > > stream each change if streaming option (see optional parameters set by > > CREATE SUBSCRIPTION) is enabled, otherwise, serialize each change. When set > > to buffered, the decoding will stream or serialize changes when > > logical_decoding_work_mem is reached. > > > > 2a. > > typo "allows allows" (Kuroda-san reported same) > > > > 2b. > > "if streaming option" --> "if the streaming option" > > Changed. Although you replied "Changed" for the above, AFAICT my review comment #2b. was accidentally missed. Otherwise, the patch LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: pub/sub - specifying optional parameters without values.
On Tue, Jan 31, 2023 at 4:00 AM Tom Lane wrote: > > Peter Smith writes: > > The v3 patch LGTM (just for the logical replication commands). > > Pushed then. > Thanks for pushing the v3 patch. I'd forgotten about the 'streaming' option -- AFAIK this was previously a boolean parameter and so its [= value] part can also be omitted. However, in PG16 streaming became an enum type (on/off/parallel), and the value can still be omitted but that is not really being covered by the new generic text note about booleans added by yesterday's patch. e.g. The enum 'streaming' value part can still be omitted. test_sub=# create subscription sub1 connection 'host=localhost dbname=test_pub' publication pub1 with (streaming); Perhaps a small top-up patch to CREATE SUBSCRIPTION is needed to describe this special case? PSA. (I thought mentioning this special streaming case again for ALTER SUBSCRIPTION might be overkill) -- Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Added-note-for-streaming-parameter-with-value-par.patch Description: Binary data
Re: pub/sub - specifying optional parameters without values.
On Mon, Jan 30, 2023 at 8:36 AM Tom Lane wrote: > > Zheng Li writes: > > The behavior is due to the following code > > https://github.com/postgres/postgres/blob/master/src/backend/commands/define.c#L113 > > Yeah, so you can grep for places that have this behavior by looking > for defGetBoolean calls ... and there are quite a few. That leads > me to the conclusion that we'd better invent a fairly stylized > documentation solution that we can plug into a lot of places, > rather than thinking of slightly different ways to say it and > places to say it. I'm not necessarily opposed to Peter's desire > to fix replication-related commands first, but we have more to do > later. > > I'm also not that thrilled with putting the addition up at the top > of the relevant text. This behavior is at least two decades old, > so if we've escaped documenting it at all up to now, it can't be > that important to most people. > > I also notice that ALTER SUBSCRIPTION has fully three different > sub-sections with about equal claims on this note, if we're going > to stick it directly into the affected option lists. > > That all leads me to propose that we add the new text at the end of > the Parameters in the affected man pages. So about > like the attached. (I left out alter_publication.sgml, as I'm not > sure it needs its own copy of this text --- it doesn't describe > individual parameters at all, just refer to CREATE PUBLICATION.) > The v3 patch LGTM (just for the logical replication commands). -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [DOCS] Stats views and functions not in order?
On Fri, Jan 27, 2023 at 10:30 PM Peter Eisentraut wrote: > > On 19.01.23 00:45, Peter Smith wrote: > > The original $SUBJECT requirements evolved to also try to make each > > view appear on a separate page after that was suggested by DavidJ [2]. > > I was unable to achieve per-page views "without radically changing the > > document structure." [3], but DavidJ found a way [4] to do it using > > refentry. I then wrote the patch v8-0003 using that strategy, which > > after more rebasing became the v10-0001 you see today. > > > > I did prefer the view-per-page results (although I also only use HTML > > docs). But my worry is that there seem still to be a few unknowns > > about how this might affect other (not the HTML) renderings of the > > docs. If you think that risk is too great, or if you feel this patch > > will cause unwarranted link/bookmark grief, then I am happy to just > > drop it. > > I'm wary of making semantic markup changes to achieve an ad-hoc > presentation effects. Sometimes it's necessary, but it should be > considered carefully and globally. > > We could change the chunking boundary to be sect2 globally. This is > easily configurable (chunk.section.depth). > > Thinking about it now, maybe this is what we need. As the documentation > grows, as it clearly does, the depth of the structure increases and > pages get longer. This can also be seen in other chapters. > > Of course, this would need to be tested and checked in more detail. > This chunk configuration idea sounds a better approach. If somebody else wants to champion that change separately then I can maybe help to review it. Meanwhile, this pagination topic has strayed far away from the original $SUBJECT, so I guess since there is nothing else pending this thread's CF entry [1] can just be marked as "Committed" now? -- [1] https://commitfest.postgresql.org/41/3904/ Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are my review comments for v88-0002.

==
General

1.
The test cases are checking the log content but they are not checking
for debug logs or untranslated elogs -- they are expecting a normal
ereport LOG that might be translated. I'm not sure if that is OK, or if
it is a potential problem.

==
doc/src/sgml/config.sgml

2.
On the publisher side, logical_replication_mode allows allows streaming
or serializing changes immediately in logical decoding. When set to
immediate, stream each change if streaming option (see optional
parameters set by CREATE SUBSCRIPTION) is enabled, otherwise, serialize
each change. When set to buffered, the decoding will stream or
serialize changes when logical_decoding_work_mem is reached.

2a.
typo "allows allows" (Kuroda-san reported same)

2b.
"if streaming option" --> "if the streaming option"

~~~

3.
On the subscriber side, if streaming option is set to parallel,
logical_replication_mode also allows the leader apply worker to send
changes to the shared memory queue or to serialize changes.

SUGGESTION
On the subscriber side, if the streaming option is set to parallel,
logical_replication_mode can be used to direct the leader apply worker
to send changes to the shared memory queue or to serialize changes.

==
src/backend/utils/misc/guc_tables.c

4.
 {
  {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS,
- gettext_noop("Controls when to replicate each change."),
- gettext_noop("On the publisher, it allows streaming or serializing each change in logical decoding."),
+ gettext_noop("Controls the internal behavior of logical replication publisher and subscriber"),
+ gettext_noop("On the publisher, it allows streaming or "
+              "serializing each change in logical decoding. On the "
+              "subscriber, in parallel streaming mode, it allows "
+              "the leader apply worker to serialize changes to "
+              "files and notifies the parallel apply workers to "
+              "read and apply them at the end of the transaction."),
  GUC_NOT_IN_SAMPLE
 },

Suggest re-wording the long description (subscriber part) to be more
like the documentation text.

BEFORE
On the subscriber, in parallel streaming mode, it allows the leader
apply worker to serialize changes to files and notifies the parallel
apply workers to read and apply them at the end of the transaction.

SUGGESTION
On the subscriber, if the streaming option is set to parallel, it
directs the leader apply worker to send changes to the shared memory
queue or to serialize changes and apply them at the end of the
transaction.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Patch v88-0001 LGTM. Below are just some minor review comments about the commit message. == Commit message 1. We have discussed having this parameter as a subscription option but exposing a parameter that is primarily used for testing/debugging to users didn't seem advisable and there is no other such parameter. The other option we have discussed is to have a separate GUC for subscriber-side testing but it appears that for the current testing existing parameter is sufficient and avoids adding another GUC. SUGGESTION We discussed exposing this parameter as a subscription option, but it did not seem advisable since it is primarily used for testing/debugging and there is no other such developer option. We also discussed having separate GUCs for publisher/subscriber-side, but for current testing/debugging requirements, one GUC is sufficient. ~~ 2. Reviewed-by: Pater Smith, Kuroda Hayato, Amit Kapila "Pater" --> "Peter" ------ Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are my review comments for patch v87-0002. == doc/src/sgml/config.sgml 1. -Allows streaming or serializing changes immediately in logical decoding. The allowed values of logical_replication_mode are -buffered and immediate. When set -to immediate, stream each change if +buffered and immediate. The default +is buffered. + I didn't think it was necessary to say “of logical_replication_mode”. IMO that much is already obvious because this is the first sentence of the description for logical_replication_mode. (see also review comment #4) ~~~ 2. + +On the publisher side, it allows streaming or serializing changes +immediately in logical decoding. When set to +immediate, stream each change if streaming option (see optional parameters set by CREATE SUBSCRIPTION) is enabled, otherwise, serialize each change. When set to -buffered, which is the default, decoding will stream -or serialize changes when logical_decoding_work_mem -is reached. +buffered, decoding will stream or serialize changes +when logical_decoding_work_mem is reached. 2a. "it allows" --> "logical_replication_mode allows" 2b. "decoding" --> "the decoding" ~~~ 3. + +On the subscriber side, if streaming option is set +to parallel, this parameter also allows the leader +apply worker to send changes to the shared memory queue or to serialize +changes. When set to buffered, the leader sends +changes to parallel apply workers via shared memory queue. When set to +immediate, the leader serializes all changes to +files and notifies the parallel apply workers to read and apply them at +the end of the transaction. + "this parameter also allows" --> "logical_replication_mode also allows" ~~~ 4. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem -is reached. +is reached. Moreover, this can also be used to test the transmission of +changes between the leader and parallel apply workers. 
"Moreover, this can also" --> "It can also" I am wondering would this sentence be better put at the top of the GUC description. So then the first paragraph becomes like this: SUGGESTION (I've also added another sentence "The effect of...") The allowed values are buffered and immediate. The default is buffered. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem is reached. It can also be used to test the transmission of changes between the leader and parallel apply workers. The effect of logical_replication_mode is different for the publisher and subscriber: On the publisher side... On the subscriber side... == .../replication/logical/applyparallelworker.c 5. + /* + * In immeidate mode, directly return false so that we can switch to + * PARTIAL_SERIALIZE mode and serialize remaining changes to files. + */ + if (logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE) + return false; Typo "immediate" Also, I felt "directly" is not needed. "return false" and "directly return false" is the same. SUGGESTION Using ‘immediate’ mode returns false to cause a switch to PARTIAL_SERIALIZE mode so that the remaining changes will be serialized. == src/backend/utils/misc/guc_tables.c 6. { {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS, - gettext_noop("Allows streaming or serializing each change in logical decoding."), - NULL, + gettext_noop("Controls the behavior of logical replication publisher and subscriber"), + gettext_noop("If set to immediate, on the publisher side, it " + "allows streaming or serializing each change in " + "logical decoding. On the subscriber side, in " + "parallel streaming mode, it allows the leader apply " + "worker to serialize changes to files and notifies " + "the parallel apply workers to read and apply them at " + "the end of the transaction."), GUC_NOT_IN_SAMPLE }, 6a. 
short description: The user PoV behaviour should be the same. Instead, maybe say "controls the internal behavior" or something like that? ~ 6b. long description: IMO the long description shouldn't mention 'immediate' mode first as it does. BEFORE If set to immediate, on the publisher side, ... AFTER On the publisher side, ... -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Jan 24, 2023 at 11:49 PM houzj.f...@fujitsu.com wrote: > ... > > Sorry, the patch set was somehow attached twice. Here is the correct new > version > patch set which addressed all comments so far. > Here are my review comments for patch v87-0001. == src/backend/replication/logical/reorderbuffer.c 1. @@ -210,7 +210,7 @@ int logical_decoding_work_mem; static const Size max_changes_in_memory = 4096; /* XXX for restore only */ /* GUC variable */ -int logical_decoding_mode = LOGICAL_DECODING_MODE_BUFFERED; +int logical_replication_mode = LOGICAL_REP_MODE_BUFFERED; I noticed that the comment /* GUC variable */ is currently only above the logical_replication_mode, but actually logical_decoding_work_mem is a GUC variable too. Maybe this should be rearranged somehow, and the comment changed "GUC variable" -> "GUC variables"? == src/backend/utils/misc/guc_tables.c 2. @@ -4908,13 +4908,13 @@ struct config_enum ConfigureNamesEnum[] = }, { - {"logical_decoding_mode", PGC_USERSET, DEVELOPER_OPTIONS, + {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS, gettext_noop("Allows streaming or serializing each change in logical decoding."), NULL, GUC_NOT_IN_SAMPLE }, - &logical_decoding_mode, - LOGICAL_DECODING_MODE_BUFFERED, logical_decoding_mode_options, + &logical_replication_mode, + LOGICAL_REP_MODE_BUFFERED, logical_replication_mode_options, NULL, NULL, NULL }, That gettext_noop string seems incorrect. I think Kuroda-san previously reported the same, but then you replied it has been fixed already [1] > I felt the description seems not to be suitable for current behavior. > A short description should be like "Sets a behavior of logical replication", and > further descriptions can be added in lond description. I adjusted the description here. But this doesn't look fixed to me. (??) == src/include/replication/reorderbuffer.h 3.
@@ -18,14 +18,14 @@ #include "utils/timestamp.h" extern PGDLLIMPORT int logical_decoding_work_mem; -extern PGDLLIMPORT int logical_decoding_mode; +extern PGDLLIMPORT int logical_replication_mode; Probably there should also be a comment here to say "/* GUC variables */" -- [1] https://www.postgresql.org/message-id/OS0PR01MB5716AE9F095F9E7888987BC794C99%40OS0PR01MB5716.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are some review comments for v86-0002 == Commit message 1. Use the use the existing developer option logical_replication_mode to test the parallel apply of large transaction on subscriber. ~ Typo "Use the use the" SUGGESTION (rewritten) Give additional functionality to the existing developer option 'logical_replication_mode' to help test parallel apply of large transactions on the subscriber. ~~~ 2. Maybe the commit message should also mention the extra TAP tests that have been added to exercise the serialization part of the parallel apply? BTW, I can see the TAP tests are testing full serialization (when the GUC is 'immediate') but I am not sure how "partial" serialization (when it has to switch halfway from shmem to files) is being tested. == doc/src/sgml/config.sgml 3. Allows streaming or serializing changes immediately in logical decoding. The allowed values of logical_replication_mode are buffered and immediate. When set to immediate, stream each change if streaming option (see optional parameters set by CREATE SUBSCRIPTION) is enabled, otherwise, serialize each change. When set to buffered, which is the default, decoding will stream or serialize changes when logical_decoding_work_mem is reached. On the subscriber side, if streaming option is set to parallel, this parameter also allows the leader apply worker to send changes to the shared memory queue or to serialize changes. When set to buffered, the leader sends changes to parallel apply workers via shared memory queue. When set to immediate, the leader serializes all changes to files and notifies the parallel apply workers to read and apply them at the end of the transaction. ~ Because this same developer GUC now affects both the publisher side and the subscriber side differently, IMO this whole description should be re-structured accordingly. SUGGESTION (something like) The allowed values of logical_replication_mode are buffered and immediate. The default is buffered. On the publisher side, ...
On the subscriber side, ... ~~~ 4. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem is reached. ~ Maybe this paragraph needs rewording or moving. e.g. Isn't that misleading now? Although this might be an explanation for the publisher side, it does not seem relevant to the subscriber side's behaviour. == .../replication/logical/applyparallelworker.c 5. @ -1149,6 +1149,9 @@ pa_send_data(ParallelApplyWorkerInfo *winfo, Size nbytes, const void *data) Assert(!IsTransactionState()); Assert(!winfo->serialize_changes); + if (logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE) + return false; + I felt that code should have some comment, even if it is just something quite basic like "/* For developer testing */" == .../t/018_stream_subxact_abort.pl 6. +# Clean up test data from the environment. +$node_publisher->safe_psql('postgres', "TRUNCATE TABLE test_tab_2"); +$node_publisher->wait_for_catchup($appname); Is it necessary to TRUNCATE the table here? If everything is working shouldn't the data be rolled back anyway? ~~~ 7. +$node_publisher->safe_psql( + 'postgres', q{ + BEGIN; + INSERT INTO test_tab_2 values(1); + SAVEPOINT sp; + INSERT INTO test_tab_2 values(1); + ROLLBACK TO sp; + COMMIT; + }); Perhaps this should insert 2 different values so then the verification code can check the correct value remains instead of just checking COUNT(*)? -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Tue, Jan 24, 2023 at 5:58 PM Amit Kapila wrote: > > On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi > wrote: > > > > > Attached the updated patch v19. > > > > + maybe_delay_apply(TransactionId xid, TimestampTz finish_ts) > > > > I look this spelling strange. How about maybe_apply_delay()? > > > > +1. It depends on how you read it. I read it like this: maybe_delay_apply === means "maybe delay [the] apply" (which is exactly what the function does) versus maybe_apply_delay === means "maybe [the] apply [needs a] delay" (which is also correct, but it seemed a more awkward way to say it IMO) ~ Perhaps it's better to rename it more fully like *maybe_delay_the_apply* to remove any ambiguous interpretations. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Logical replication timeout problem
On Tue, Jan 24, 2023 at 1:45 PM wangw.f...@fujitsu.com wrote: > > On Tues, Jan 24, 2023 at 8:28 AM Peter Smith wrote: > > Hi Hou-san, Here are my review comments for v5-0001. > > Thanks for your comments. ... > > Changed as suggested. > > Attach the new patch. Thanks! Patch v6 LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are my review comments for patch v86-0001. == General 1. IIUC the GUC name was made generic 'logical_replication_mode' so that multiple developer GUCs are not needed later. But IMO those current option values (buffered/immediate) for that GUC are maybe a bit too generic. Perhaps in future, we might want more granular control than that allows. e.g. I can imagine there might be multiple different meanings for what "buffered" means. If there is any chance of the generic values being problematic later then maybe they should be made more specific up-front. e.g. maybe like: logical_replication_mode = buffered_decoding logical_replication_mode = immediate_decoding Thoughts? == Commit message 2. Since we may extend the developer option logical_decoding_mode to to test the parallel apply of large transaction on subscriber, rename this option to logical_replication_mode to make it easier to understand. ~ 2a typo "to to" typo "large transaction on subscriber" --> "large transactions on the subscriber" ~ 2b. IMO better to rephrase the whole paragraph like shown below. SUGGESTION Rename the developer option 'logical_decoding_mode' to the more flexible name 'logical_replication_mode' because doing so will make it easier to extend this option in future to help test other areas of logical replication. == doc/src/sgml/config.sgml 3. Allows streaming or serializing changes immediately in logical decoding. The allowed values of logical_replication_mode are buffered and immediate. When set to immediate, stream each change if streaming option (see optional parameters set by CREATE SUBSCRIPTION) is enabled, otherwise, serialize each change. When set to buffered, which is the default, decoding will stream or serialize changes when logical_decoding_work_mem is reached. ~ IMO it's more clear to say the default when the options are first mentioned. 
So I suggested removing the "which is the default" part, and instead saying: BEFORE The allowed values of logical_replication_mode are buffered and immediate. AFTER The allowed values of logical_replication_mode are buffered and immediate. The default is buffered. == src/backend/utils/misc/guc_tables.c 4. @@ -396,8 +396,8 @@ static const struct config_enum_entry ssl_protocol_versions_info[] = { }; static const struct config_enum_entry logical_decoding_mode_options[] = { - {"buffered", LOGICAL_DECODING_MODE_BUFFERED, false}, - {"immediate", LOGICAL_DECODING_MODE_IMMEDIATE, false}, + {"buffered", LOGICAL_REP_MODE_BUFFERED, false}, + {"immediate", LOGICAL_REP_MODE_IMMEDIATE, false}, {NULL, 0, false} }; I noticed this array is still called "logical_decoding_mode_options". Was that deliberate? -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Logical replication timeout problem
Hi Hou-san, Here are my review comments for v5-0001. == src/backend/replication/logical/reorderbuffer.c 1. @@ -2446,6 +2452,23 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, elog(ERROR, "tuplecid value in changequeue"); break; } + + /* + * Sending keepalive messages after every change has some overhead, but + * testing showed there is no noticeable overhead if keepalive is only + * sent after every ~100 changes. + */ +#define CHANGES_THRESHOLD 100 + + /* + * Try to send a keepalive message after every CHANGES_THRESHOLD + * changes. + */ + if (++changes_count >= CHANGES_THRESHOLD) + { + rb->update_progress_txn(rb, txn, change); + changes_count = 0; + } I noticed you put the #define adjacent to the only usage of it, instead of with the other variable declaration like it was before. Probably it is better how you have done it, but: 1a. The comment indentation is incorrect. ~ 1b. Since the #define is adjacent to its only usage IMO now the 2nd comment is redundant. So the code can just say /* * Sending keepalive messages after every change has some overhead, but * testing showed there is no noticeable overhead if keepalive is only * sent after every ~100 changes. */ #define CHANGES_THRESHOLD 100 if (++changes_count >= CHANGES_THRESHOLD) { rb->update_progress_txn(rb, txn, change); changes_count = 0; } -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Mon, Jan 23, 2023 at 9:44 PM Amit Kapila wrote: > > On Mon, Jan 23, 2023 at 1:36 PM Peter Smith wrote: > > > > Here are my review comments for v19-0001. > > > ... > > > > 5. parse_subscription_options > > > > + /* > > + * The combination of parallel streaming mode and min_apply_delay is not > > + * allowed. The subscriber in the parallel streaming mode applies each > > + * stream on arrival without the time of commit/prepare. So, the > > + * subscriber needs to depend on the arrival time of the stream in this > > + * case, if we apply the time-delayed feature for such transactions. Then > > + * there is a possibility where some unnecessary delay will be added on > > + * the subscriber by network communication break between nodes or other > > + * heavy work load on the publisher. On the other hand, applying the delay > > + * at the end of transaction with parallel apply also can cause issues of > > + * used resource bloat and locks kept in open for a long time. Thus, those > > + * features can't work together. > > + */ > > > > IMO some re-wording might be warranted here. I am not sure quite how > > to do it. Perhaps like below? > > > > SUGGESTION > > > > The combination of parallel streaming mode and min_apply_delay is not > > allowed. > > > > Here are some reasons why these features are incompatible: > > a. In the parallel streaming mode the subscriber applies each stream > > on arrival without knowledge of the commit/prepare time. This means we > > cannot calculate the underlying network/decoding lag between publisher > > and subscriber, and so always waiting for the full 'min_apply_delay' > > period might include unnecessary delay. > > b. If we apply the delay at the end of the transaction of the parallel > > apply then that would cause issues related to resource bloat and locks > > being held for a long time. > > > > ~~~ > > > > How about something like: > The combination of parallel streaming mode and min_apply_delay is not > allowed. 
This is because we start applying the transaction stream as > soon as the first change arrives without knowing the transaction's > prepare/commit time. This means we cannot calculate the underlying > network/decoding lag between publisher and subscriber, and so always > waiting for the full 'min_apply_delay' period might include > unnecessary delay. > > The other possibility is to apply the delay at the end of the parallel > apply transaction but that would cause issues related to resource > bloat and locks being held for a long time. > +1. That's better. > > > 6. defGetMinApplyDelay > > ... > > > > 6b. > > I thought this function should be implemented as static and located at > > the top of the subscriptioncmds.c source file. > > > > I agree that this should be a static function but I think its current > location is a better place as other similar function is just above it. > But, why not do everything, instead of settling on a half-fix? e.g. 1. Change the new function (defGetMinApplyDelay) to be static as it should be 2. And move defGetMinApplyDelay to the top of the file where IMO it really belongs 3. And then remove the (now) redundant forward declaration of defGetMinApplyDelay 4. And also move the existing function (defGetStreamingMode) to the top of the file so that those similar functions (defGetMinApplyDelay and defGetStreamingMode) can remain together -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
rce file. == src/backend/replication/logical/worker.c 7. maybe_delay_apply +static void maybe_delay_apply(TransactionId xid, TimestampTz finish_ts); Is there a reason why this is here? AFAIK the static implementation precedes any usage so I doubt this forward declaration is required. ~~~ 8. send_feedback @@ -3775,11 +3912,12 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply) pq_sendint64(reply_message, now); /* sendTime */ pq_sendbyte(reply_message, requestReply); /* replyRequested */ - elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X", + elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d", force, LSN_FORMAT_ARGS(recvpos), LSN_FORMAT_ARGS(writepos), - LSN_FORMAT_ARGS(flushpos)); + LSN_FORMAT_ARGS(flushpos), + in_delayed_apply); Wondering if it is better to write this as: "sending feedback (force %d, in_delayed_apply %d) to recv %X/%X, write %X/%X, flush %X/%X" == src/test/regress/sql/subscription.sql 9. Add new test? Should there be an additional test to check redundant parameter setting -- eg. "... WITH (min_apply_delay=123, min_apply_delay=456)" (this is related to the review comment #4) ~ 10. Add new tests? Should there be other tests just to verify different units (like 'd', 'h', 'min') are working OK? == src/test/subscription/t/032_apply_delay.pl 11. +# Confirm the time-delayed replication has been effective from the server log +# message where the apply worker emits for applying delay. Moreover, verifies +# that the current worker's delayed time is sufficiently bigger than the +# expected value, in order to check any update of the min_apply_delay. +sub check_apply_delay_log "the current worker's delayed time..." --> "the current worker's remaining wait time..." ?? ~~~ 12. + # Get the delay time from the server log + my $contents = slurp_file($node_subscriber->logfile, $offset); "Get the delay time" --> "Get the remaining wait time..." ~~~ 13. 
+# Create a subscription that applies the trasaction after 50 milliseconds delay +$node_subscriber->safe_psql('postgres', + "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '50ms', streaming = 'on')" +); 13a. typo: "trasaction" ~ 13b 50ms seems an extremely short time – How do you even know if this is testing anything related to the time delay? You may just be detecting the normal lag between publisher and subscriber without time delay having much to do with anything. ~ 14. +# Note that we cannot call check_apply_delay_log() here because there is a +# possibility that the delay is skipped. The event happens when the WAL +# replication between publisher and subscriber is delayed due to a mechanical +# problem. The log output will be checked later - substantial delay-time case. + +# Verify that the subscriber lags the publisher by at least 50 milliseconds +check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05'); 14a. "The event happens..." ?? Did you mean "This might happen if the WAL..." ~ 14b. The log output will be checked later - substantial delay-time case. I think that needs re-wording to clarify. e.g1. you have nothing called a "substantial delay-time" case. e.g2. the word "later" confused me. Originally, I thought you meant it is not tested yet but that you will check it "later", but now IIUC you are just referring to the "1 day 5 minutes" test that comes below in this location TAP file (??) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Logical replication timeout problem
Here are my review comments for patch v4-0001 == General 1. It makes no real difference, but I was wondering about: "update txn progress" versus "update progress txn" I thought that the first way sounds more natural. YMMV. If you change this then there is an impact for the typedef, function names, comments, member names: ReorderBufferUpdateProgressTxnCB --> ReorderBufferUpdateTxnProgressCB "/* update progress txn callback */" --> "/* update txn progress callback */" update_progress_txn_cb_wrapper --> update_txn_progress_cb_wrapper update_progress_txn --> update_txn_progress == Commit message 2. The problem is when there is a DDL in a transaction that generates lots of temporary data due to rewrite rules, these temporary data will not be processed by the pgoutput plugin. The previous commit (f95d53e) only fixed timeouts caused by filtering out changes in pgoutput. Therefore, the previous fix for DML had no impact on this case. ~ IMO this still needs some rewording to say up-front what the actual problem is -- i.e. an avoidable timeout occurring. SUGGESTION (or something like this...) When there is a DDL in a transaction that generates lots of temporary data due to rewrite rules, this temporary data will not be processed by the pgoutput plugin. This means it is possible for a timeout to occur if a sufficiently long time elapses since the last pgoutput message. A previous commit (f95d53e) fixed a similar scenario in this area, but that only fixed timeouts for DML going through pgoutput, so it did not address this DDL timeout case. == src/backend/replication/logical/logical.c 3. update_progress_txn_cb_wrapper +/* + * Update progress callback while processing a transaction. + * + * Try to update progress and send a keepalive message during sending data of a + * transaction (and its subtransactions) to the output plugin.
+ * + * For a large transaction, if we don't send any change to the downstream for a + * long time (exceeds the wal_receiver_timeout of standby) then it can timeout. + * This can happen when all or most of the changes are either not published or + * got filtered out. + */ +static void +update_progress_txn_cb_wrapper(ReorderBuffer *cache, ReorderBufferTXN *txn, +ReorderBufferChange *change) Simplify the "Try to..." paragraph. And the other part should also mention DDL. SUGGESTION Try to send a keepalive message during transaction processing. This is done because if we don't send any change to the downstream for a long time (exceeds the wal_receiver_timeout of standby), then it can timeout. This can happen for large DDL, or for large transactions when all or most of the changes are either not published or got filtered out. == .../replication/logical/reorderbuffer.c 4. ReorderBufferProcessTXN @@ -2105,6 +2105,19 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, PG_TRY(); { + /* + * Static variable used to accumulate the number of changes while + * processing txn. + */ + static int changes_count = 0; + + /* + * Sending keepalive messages after every change has some overhead, but + * testing showed there is no noticeable overhead if keepalive is only + * sent after every ~100 changes. + */ +#define CHANGES_THRESHOLD 100 + IMO these can be relocated to be declared/defined inside the "while" loop -- i.e. closer to where they are being used. ~~~ 5. + if (++changes_count >= CHANGES_THRESHOLD) + { + rb->update_progress_txn(rb, txn, change); + changes_count = 0; + } When there is no update_progress function, this code still incurs some small additional overhead for incrementing and testing the THRESHOLD every time, and also needlessly calls the wrapper every 100 changes. This overhead could be avoided with a simpler up-front check as shown below. OTOH, maybe the overhead is insignificant enough that just leaving the current code is neater?
LogicalDecodingContext *ctx = rb->private_data; ... if (ctx->update_progress_txn && (++changes_count >= CHANGES_THRESHOLD)) { rb->update_progress_txn(rb, txn, change); changes_count = 0; } -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Fri, Jan 20, 2023 at 2:47 PM shveta malik wrote: > ... > 2) > Logging: > 2023-01-19 17:33:16.202 IST [404797] DEBUG: logical replication apply > delay: 19979 ms > 2023-01-19 17:33:26.212 IST [404797] DEBUG: logical replication apply > delay: 9969 ms > 2023-01-19 17:34:25.730 IST [404962] DEBUG: logical replication apply > delay: 179988 ms-->previous wait over, started for next txn > 2023-01-19 17:34:35.737 IST [404962] DEBUG: logical replication apply > delay: 169981 ms > 2023-01-19 17:34:45.746 IST [404962] DEBUG: logical replication apply > delay: 159972 ms > > Is there a way to distinguish between these logs? Maybe dumping xids > along-with? > +1 Also, I was thinking of some other logging enhancements: a) the message should say that this is the *remaining* time left to wait. b) it might be convenient to know from the log what was the original min_apply_delay value in the 1st place. For example, the logs might look something like this: DEBUG: time-delayed replication for txid 1234, min_apply_delay = 16 ms. Remaining wait time: 159972 ms DEBUG: time-delayed replication for txid 1234, min_apply_delay = 16 ms. Remaining wait time: 142828 ms DEBUG: time-delayed replication for txid 1234, min_apply_delay = 16 ms. Remaining wait time: 129994 ms DEBUG: time-delayed replication for txid 1234, min_apply_delay = 16 ms. Remaining wait time: 110001 ms ... -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
apply(TimestampTz finish_ts) That last sentence "Hence,... delay at the time" does not sound correct. Is there a typo or missing words here? Maybe it meant to say "... at the STREAM START time."? ~~~ 9. + /* This might change wal_receiver_status_interval */ + if (ConfigReloadPending) + { + ConfigReloadPending = false; + ProcessConfigFile(PGC_SIGHUP); + } I was unsure why you made a special mention of 'wal_receiver_status_interval' here. I mean, aren't there also other GUCs that might change and affect something here, so was there some special reason only this one was mentioned? == src/test/subscription/t/032_apply_delay.pl 10. + +# Compare inserted time on the publisher with applied time on the subscriber to +# confirm the latter is applied after expected time. +sub check_apply_delay_time Maybe the comment could also mention that the time is automatically stored in the table column 'c'. ~~~ 11. +# Confirm the suspended record doesn't get applied expectedly by the ALTER +# DISABLE command. +$result = $node_subscriber->safe_psql('postgres', + "SELECT count(a) FROM test_tab WHERE a = 0;"); +is($result, qq(0), "check if the delayed transaction doesn't get applied expectedly"); The use of "doesn't get applied expectedly" (in 2 places here) seemed strange. Maybe it's better to say something like: SUGGESTION # Confirm disabling the subscription by ALTER DISABLE did not cause the delayed transaction to be applied. $result = $node_subscriber->safe_psql('postgres', "SELECT count(a) FROM test_tab WHERE a = 0;"); is($result, qq(0), "check the delayed transaction was not applied"); -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Logical replication timeout problem
On Fri, Jan 20, 2023 at 3:35 PM Amit Kapila wrote: > > On Fri, Jan 20, 2023 at 7:40 AM Peter Smith wrote: > > > > Here are some review comments for patch v3-0001. > > > > == > > src/backend/replication/logical/logical.c > > > > 3. forward declaration > > > > +/* update progress callback */ > > +static void update_progress_cb_wrapper(ReorderBuffer *cache, > > +ReorderBufferTXN *txn, > > +ReorderBufferChange *change); > > > > I felt this function wrapper name was a bit misleading... AFAIK every > > other wrapper really does just wrap their respective functions. But > > this one seems a bit different because it calls the wrapped function > > ONLY if some threshold is exceeded. IMO maybe this function could have > > some name that conveys this better: > > > > e.g. update_progress_cb_wrapper_with_threshold > > > > I am wondering whether it would be better to move the threshold logic > to the caller. Previously this logic was inside the function because > it was being invoked from multiple places but now that won't be the > case. Also, then your concern about the name would also be addressed. > > > > > ~ > > > > 7b. > > Would it be neater to just call OutputPluginUpdateProgress here instead? > > > > e.g. > > BEFORE > > ctx->update_progress(ctx, ctx->write_location, ctx->write_xid, false); > > AFTER > > OutputPluginUpdateProgress(ctx, false); > > > > We already check whether ctx->update_progress is defined or not which > is the only extra job done by OutputPluginUpdateProgress but probably > we can consolidate the checks and directly invoke > OutputPluginUpdateProgress. > Yes, I saw that, but I thought it was better to keep the early exit from update_progress_cb_wrapper, so incurring just one additional boolean check for every 100 changes was not anything to worry about. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Logical replication timeout problem
Here are some review comments for patch v3-0001. == Commit message 1. The problem is when there is a DDL in a transaction that generates lots of temporary data due to rewrite rules, these temporary data will not be processed by the pgoutput - plugin. Therefore, the previous fix (f95d53e) for DML had no impact on this case. ~ 1a. IMO this comment needs to give a bit of background about the original problem here, rather than just starting with "The problem is" which is describing the flaws of the previous fix. ~ 1b. "pgoutput - plugin" -> "pgoutput plugin" ?? ~~~ 2. To fix this, we introduced a new ReorderBuffer callback - 'ReorderBufferUpdateProgressCB'. This callback is called to try to update the process after each change has been processed during sending data of a transaction (and its subtransactions) to the output plugin. IIUC it's not really "after each change" - shouldn't this comment mention something about the CHANGES_THRESHOLD 100? == src/backend/replication/logical/logical.c 3. forward declaration +/* update progress callback */ +static void update_progress_cb_wrapper(ReorderBuffer *cache, +ReorderBufferTXN *txn, +ReorderBufferChange *change); I felt this function wrapper name was a bit misleading... AFAIK every other wrapper really does just wrap their respective functions. But this one seems a bit different because it calls the wrapped function ONLY if some threshold is exceeded. IMO maybe this function could have some name that conveys this better: e.g. update_progress_cb_wrapper_with_threshold ~~~ 4. update_progress_cb_wrapper +/* + * Update progress callback + * + * Try to update progress and send a keepalive message if too many changes were + * processed when processing txn. + * + * For a large transaction, if we don't send any change to the downstream for a + * long time (exceeds the wal_receiver_timeout of standby) then it can timeout. + * This can happen when all or most of the changes are either not published or + * got filtered out. 
+ */ SUGGESTION (instead of the "Try to update" sentence) Send a keepalive message whenever more than CHANGES_THRESHOLD changes are encountered while processing a transaction. ~~~ 5. +static void +update_progress_cb_wrapper(ReorderBuffer *cache, ReorderBufferTXN *txn, +ReorderBufferChange *change) +{ + LogicalDecodingContext *ctx = cache->private_data; + LogicalErrorCallbackState state; + ErrorContextCallback errcallback; + static int changes_count = 0; /* Static variable used to accumulate + * the number of changes while + * processing txn. */ + IMO this may be more readable if the static 'changes_count' local var was declared first and separated from the other vars by a blank line. ~~~ 6. + /* + * We don't want to try sending a keepalive message after processing each + * change as that can have overhead. Tests revealed that there is no + * noticeable overhead in doing it after continuously processing 100 or so + * changes. + */ +#define CHANGES_THRESHOLD 100 6a. I think it might be better to define this right at the top of the function adjacent to the 'changes_count' variable (e.g. a bit like the original HEAD code looked) ~ 6b. SUGGESTION (for the comment) Sending keepalive messages after every change has some overhead, but testing showed there is no noticeable overhead if keepalive is only sent after every ~100 changes. ~~~ 7. + + /* + * After continuously processing CHANGES_THRESHOLD changes, we + * try to send a keepalive message if required. + */ + if (++changes_count >= CHANGES_THRESHOLD) + { + ctx->update_progress(ctx, ctx->write_location, ctx->write_xid, false); + changes_count = 0; + } + 7a. SUGGESTION (for comment) Send a keepalive message after every CHANGES_THRESHOLD changes. ~ 7b. Would it be neater to just call OutputPluginUpdateProgress here instead? e.g. BEFORE ctx->update_progress(ctx, ctx->write_location, ctx->write_xid, false); AFTER OutputPluginUpdateProgress(ctx, false); -- Kind Regards, Peter Smith. Fujitsu Australia
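The threshold pattern discussed in comments 4-7 can be sketched as standalone C. Note this is an illustrative stand-in, not the actual logical.c code: the function name, and the no-op where a keepalive would be sent, are hypothetical replacements for update_progress_cb_wrapper() and ctx->update_progress().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative stand-in for update_progress_cb_wrapper().  A keepalive is
 * only attempted after every CHANGES_THRESHOLD changes, so the per-change
 * overhead stays negligible.
 */
#define CHANGES_THRESHOLD 100

/* Returns true when this change crosses the threshold (keepalive sent). */
bool
process_change_maybe_keepalive(void)
{
	static int	changes_count = 0;	/* accumulated while processing txn */

	if (++changes_count >= CHANGES_THRESHOLD)
	{
		/* here the real code would attempt a progress update / keepalive */
		changes_count = 0;
		return true;
	}
	return false;
}
```

With this shape the wrapper stays cheap in the common case: 99 out of every 100 calls do nothing but increment a counter, which is why the extra boolean check mentioned later in the thread is not a concern.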
Re: Use appendStringInfoSpaces more
On Thu, Jan 19, 2023 at 8:45 PM David Rowley wrote: > > In [1] I noticed a bit of a poor usage of appendStringInfoString which > just appends 4 spaces in a loop, one for each indent level of the > jsonb. It should be better just to use appendStringInfoSpaces and > just append all the spaces in one go rather than appending 4 spaces in > a loop. That'll save having to check enlargeStringInfo() once for each > loop. > Should the add_indent function also have a check to avoid making unnecessary calls to appendStringInfoSpaces when the level is 0? e.g. if (indent) { appendStringInfoCharMacro(out, '\n'); if (level > 0) appendStringInfoSpaces(out, level * 4); } V. if (indent) { appendStringInfoCharMacro(out, '\n'); appendStringInfoSpaces(out, level * 4); } -- Kind Regards, Peter Smith. Fujitsu Australia
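The win David describes — one capacity check for all the spaces instead of one per loop iteration — can be sketched with a toy growable buffer. This is a hypothetical miniature of StringInfo, not the real API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable string buffer; a stand-in for StringInfo. */
typedef struct MiniBuf
{
	char	   *data;
	int			len;
	int			cap;
} MiniBuf;

static void
minibuf_init(MiniBuf *buf)
{
	buf->cap = 16;
	buf->len = 0;
	buf->data = malloc(buf->cap);
	buf->data[0] = '\0';
}

/* Analogous to appendStringInfoSpaces: grow once, then memset all spaces. */
static void
minibuf_append_spaces(MiniBuf *buf, int count)
{
	if (count <= 0)
		return;					/* a zero count is already a cheap no-op */
	while (buf->len + count + 1 > buf->cap)
		buf->cap *= 2;
	buf->data = realloc(buf->data, buf->cap);
	memset(buf->data + buf->len, ' ', count);
	buf->len += count;
	buf->data[buf->len] = '\0';
}
```

Note the early return for count <= 0: whether an explicit level > 0 guard at the call site buys anything depends on how cheap the callee's zero-count path is, which is essentially the question raised above.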
Re: Deduplicate logicalrep_read_tuple()
On Wed, Jan 18, 2023 at 6:26 PM Bharath Rupireddy wrote: > > Hi, > > logicalrep_read_tuple() duplicates code for LOGICALREP_COLUMN_TEXT and > LOGICALREP_COLUMN_BINARY introduced by commit 9de77b5. While it > doesn't hurt anyone, deduplication makes code a bit leaner by 57 bytes > [1]. I've attached a patch for $SUBJECT. > > Thoughts? > The code looks the same but there is a subtle comment difference where previously only LOGICALREP_COLUMN_BINARY case said: /* not strictly necessary but per StringInfo practice */ So if you de-duplicate the code then should that comment be modified to say /* not strictly necessary for LOGICALREP_COLUMN_BINARY but per StringInfo practice */ -- Kind Regards, Peter Smith. Fujitsu Australia
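The shape of the deduplication — two case arms sharing one body, with the comment adjusted to say which kind it originally applied to — can be sketched with hypothetical stand-in constants (this is not the actual logicalrep_read_tuple() code):

```c
#include <assert.h>

/* Hypothetical stand-ins for the LOGICALREP_COLUMN_* kind bytes. */
#define COLUMN_NULL		'n'
#define COLUMN_TEXT		't'
#define COLUMN_BINARY	'b'

/*
 * Both the text and binary kinds carry a length-prefixed value, so after
 * deduplication the two case labels fall through to one shared body.
 */
int
column_carries_value(char kind)
{
	switch (kind)
	{
		case COLUMN_TEXT:
		case COLUMN_BINARY:
			/*
			 * not strictly necessary for COLUMN_BINARY but per StringInfo
			 * practice (the merged-comment wording suggested above)
			 */
			return 1;
		case COLUMN_NULL:
		default:
			return 0;
	}
}
```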
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith wrote: > > Here are my review comments for the latest patch v16-0001. (excluding > the test code) > And here are some review comments for the v16-0001 test code. == src/test/regress/sql/subscription.sql 1. General For all comments "time delayed replication" -> "time-delayed replication" maybe is better? ~~~ 2. -- fail - utilizing streaming = parallel with time delayed replication is not supported. For readability please put a blank line before this test. ~~~ 3. -- success -- value without unit is taken as milliseconds "value" -> "min_apply_delay value" ~~~ 4. -- success -- interval is converted into ms and stored as integer "interval" -> "min_apply_delay interval" "integer" -> "an integer" ~~~ 5. You could also add another test where min_apply_delay is 0 Then the following combination can be confirmed OK -- success create subscription with (streaming=parallel, min_apply_delay=0) ~~~ 6. -- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set. CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel); There is another way to do this test without creating a brand-new subscription. You could just alter the existing subscription like: ALTER ... SET (min_apply_delay = 0) then ALTER ... SET (streaming = parallel) then ALTER ... SET (min_apply_delay = 123) == src/test/subscription/t/032_apply_delay.pl 7. sub check_apply_delay_log my ($node_subscriber, $message, $expected) = @_; Why pass in the message text? It is always the same, so it can be hardwired in this function, right? ~~~ 8. # Get the delay time in the server log "int the server log" -> "from the server log" (?) ~~~ 9. qr/$message: (\d+) ms/ or die "could not get delayed time"; my $logged_delay = $1; # Is it larger than expected? cmp_ok($logged_delay, '>', $expected, "The wait time of the apply worker is long enough expectedly" ); 9a. 
"could not get delayed time" -> "could not get the apply worker wait time" 9b. "The wait time of the apply worker is long enough expectedly" -> "The apply worker wait time has expected duration" ~~~ 10. sub check_apply_delay_time Maybe a brief explanatory comment for this function is needed to explain the unreplicated column c. ~~~ 11. $node_subscriber->safe_psql('postgres', "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')" I think there should be a comment here highlighting that you are setting up a subscriber time delay of 3 seconds, and then later you can better describe the parameters for the checking functions... e.g. (add this comment) # verifies that the subscriber lags the publisher by at least 3 seconds check_apply_delay_time($node_publisher, $node_subscriber, '5', '3'); e.g. # verifies that the subscriber lags the publisher by at least 3 seconds check_apply_delay_time($node_publisher, $node_subscriber, '8', '3'); ~~~ 12. # Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker # (1 day 1 minute). $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 8646)" ); Update the comment with another note. # Note - The extra 1 min is to account for any decoding/network overhead. ~~~ 13. # Make sure we have long enough min_apply_delay after the ALTER command check_apply_delay_log($node_subscriber, "logical replication apply delay", "8000"); IMO the expectation of 1 day (8646 ms) wait time might be a better number for your "expected" value. So update the comment/call like this: # Make sure the apply worker knows to wait for more than 1 day (8640 ms) check_apply_delay_log($node_subscriber, "logical replication apply delay", "8640"); -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith wrote: > > Here are my review comments for the latest patch v16-0001. (excluding > the test code) > ... > > 8. AlterSubscription (general) > > I observed during testing there are 3 different errors…. > > At subscription CREATE time you can get this error: > ERROR: min_apply_delay > 0 and streaming = parallel are mutually > exclusive options > > If you try to ALTER the min_apply_delay when already streaming = > parallel you can get this error: > ERROR: cannot enable min_apply_delay for subscription in streaming = > parallel mode > > If you try to ALTER the streaming to be parallel if there is already a > min_apply_delay > 0 then you can get this error: > ERROR: cannot enable streaming = parallel mode for subscription with > min_apply_delay > > ~ > > IMO there is no need to have 3 different error message texts. I think > all these cases are explained by just the first text (ERROR: > min_apply_delay > 0 and streaming = parallel are mutually exclusive > options) > > After checking the regression test output I can see the merit of your separate error messages like this, even if they are maybe not strictly necessary. So feel free to ignore my previous review comment. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: [DOCS] Stats views and functions not in order?
On Thu, Jan 19, 2023 at 2:55 AM David G. Johnston wrote: > > On Wed, Jan 18, 2023 at 8:38 AM Tom Lane wrote: >> >> "David G. Johnston" writes: >> > ... I was going for the html effect >> > of having these views chunked into their own pages, any other changes being >> > non-detrimental. >> >> But is that a result we want? It will for example break any bookmarks >> that people might have for these documentation entries. It will also >> pretty thoroughly break the cross-version navigation links in this >> part of the docs. >> >> >> Maybe the benefit is worth those costs, but I'm entirely not convinced >> of that. I think we need to tread pretty lightly when rearranging >> longstanding documentation-layout decisions. >> > David already gave a good summary [1], but since I was the OP here is the background of v10-0001 from my PoV. ~ The original $SUBJECT requirements evolved to also try to make each view appear on a separate page after that was suggested by DavidJ [2]. I was unable to achieve per-page views "without radically changing the document structure." [3], but DavidJ found a way [4] to do it using refentry. I then wrote the patch v8-0003 using that strategy, which after more rebasing became the v10-0001 you see today. I did prefer the view-per-page results (although I also only use HTML docs). But my worry is that there seem still to be a few unknowns about how this might affect other (not the HTML) renderings of the docs. If you think that risk is too great, or if you feel this patch will cause unwarranted link/bookmark grief, then I am happy to just drop it. 
-- [1] DJ overview - https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com [2] DJ suggested view-per-page - https://www.postgresql.org/message-id/CAKFQuwa9JtoCBVc6CJb7NC5FqMeEAy_A8X4H8t6kVaw7fz9LTw%40mail.gmail.com [3] PS don't know how to do it - https://www.postgresql.org/message-id/CAHut%2BPv5Efz1TLWOLSoFvoyC0mq%2Bs92yFSd534ctWSdjEFtKCw%40mail.gmail.com [4] DJ how to do it using refentry - https://www.postgresql.org/message-id/CAKFQuwYkM5UZT%2B6tG%2BNgZvDcd5VavS%2BxNHsGsWC8jS-KJsxh7w%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Time delayed LR (WAS Re: logical replication restrictions)
ly”? ~~~ 21. @@ -3737,8 +3869,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply) /* * No outstanding transactions to flush, we can report the latest received * position. This is important for synchronous replication. + * + * During the delay of time-delayed replication, do not tell the publisher + * that the received latest LSN is already applied and flushed at this + * stage, since we don't apply the transaction yet. If we do so, it leads + * to a wrong assumption of logical replication progress on the publisher + * side. Here, we just send a feedback message to avoid publisher's + * timeout during the delay. */ Minor rewording of the comment SUGGESTION If the subscriber side apply is delayed (because of time-delayed replication) then do not tell the publisher that the received latest LSN is already applied and flushed, otherwise, it leads to the publisher side making a wrong assumption of logical replication progress. Instead, we just send a feedback message to avoid a publisher timeout during the delay. == src/bin/pg_dump/pg_dump.c 22. @@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout) LOGICALREP_TWOPHASE_STATE_DISABLED); if (fout->remoteVersion >= 16) - appendPQExpBufferStr(query, " s.suborigin\n"); + appendPQExpBufferStr(query, + " s.suborigin,\n" + " s.subminapplydelay\n"); else - appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY); + { + appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY); + appendPQExpBufferStr(query, " 0 AS subminapplydelay\n"); + } Can't those appends in the else part be combined into a single appendPQExpBuffer? appendPQExpBuffer(query, " '%s' AS suborigin,\n" " 0 AS subminapplydelay\n", LOGICALREP_ORIGIN_ANY); == src/include/catalog/pg_subscription.h 23. 
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW XLogRecPtr subskiplsn; /* All changes finished at this LSN are * skipped */ + int64 subminapplydelay; /* Replication apply delay */ + NameData subname; /* Name of the subscription */ Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */ SUGGESTION (for comment) Replication apply delay (ms) ~~ 24. @@ -120,6 +122,7 @@ typedef struct Subscription * in */ XLogRecPtr skiplsn; /* All changes finished at this LSN are * skipped */ + int64 minapplydelay; /* Replication apply delay */ SUGGESTION (for comment) Replication apply delay (ms) -- Kind Regards, Peter Smith. Fujitsu Australia
PGDOCS - sgml linkend using single-quotes
Hi, I happened to notice some examples of SGML linkends that were using single quotes instead of double quotes. It didn't seem to be the conventional style because grepping (from doc/src/sgml folder) showed only a tiny fraction using single quotes. (single-quotes) $ grep --include=*.sgml -rn . -e "linkend='" | wc -l 12 (double-quotes) $ grep --include=*.sgml -rn . -e 'linkend="' | wc -l 5915 ~~ PSA patch that makes them all use double quotes. -- Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Replace-linkend-single-quotes-with-double-quotes.patch Description: Binary data
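The mechanical conversion the patch performs could be sketched as a throwaway helper. This is hypothetical — it rewrites an in-memory buffer, whereas the actual patch simply edits the .sgml files — and it assumes linkend values contain no embedded quotes, which holds for SGML ids:

```c
#include <assert.h>
#include <string.h>

/* Rewrite linkend='...' to linkend="..." in place. */
void
fix_linkend_quotes(char *text)
{
	char	   *p = text;

	while ((p = strstr(p, "linkend='")) != NULL)
	{
		char	   *open = p + strlen("linkend=");	/* the opening quote */
		char	   *close = strchr(open + 1, '\'');	/* the closing quote */

		if (close == NULL)
			break;				/* unterminated; leave the rest alone */
		*open = '"';
		*close = '"';
		p = close + 1;
	}
}
```

Double-quoted linkends are untouched because strstr only matches the single-quote form, matching the grep counts above.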
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Jan 17, 2023 at 2:37 PM houzj.f...@fujitsu.com wrote: > > On Tuesday, January 17, 2023 11:32 AM Peter Smith > wrote: > > > > On Tue, Jan 17, 2023 at 1:21 PM houzj.f...@fujitsu.com > > wrote: > > > > > > On Tuesday, January 17, 2023 5:43 AM Peter Smith > > wrote: > > > > > > > > On Mon, Jan 16, 2023 at 5:41 PM Amit Kapila > > > > > > > > wrote: > > > > > > > > > > On Mon, Jan 16, 2023 at 10:24 AM Peter Smith > > > > > > > > > wrote: > > > > > > > > > > > > 2. > > > > > > > > > > > > /* > > > > > > + * Return the pid of the leader apply worker if the given pid > > > > > > +is the pid of a > > > > > > + * parallel apply worker, otherwise return InvalidPid. > > > > > > + */ > > > > > > +pid_t > > > > > > +GetLeaderApplyWorkerPid(pid_t pid) { int leader_pid = > > > > > > +InvalidPid; int i; > > > > > > + > > > > > > + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED); > > > > > > + > > > > > > + for (i = 0; i < max_logical_replication_workers; i++) { > > > > > > + LogicalRepWorker *w = >workers[i]; > > > > > > + > > > > > > + if (isParallelApplyWorker(w) && w->proc && pid == > > > > > > + w->proc->pid) { leader_pid = w->leader_pid; break; } } > > > > > > + > > > > > > + LWLockRelease(LogicalRepWorkerLock); > > > > > > + > > > > > > + return leader_pid; > > > > > > +} > > > > > > > > > > > > 2a. > > > > > > IIUC the IsParallelApplyWorker macro does nothing except check > > > > > > that the leader_pid is not InvalidPid anyway, so AFAIK this > > > > > > algorithm does not benefit from using this macro because we will > > > > > > want to return InvalidPid anyway if the given pid matches. > > > > > > > > > > > > So the inner condition can just say: > > > > > > > > > > > > if (w->proc && w->proc->pid == pid) { leader_pid = > > > > > > w->leader_pid; break; } > > > > > > > > > > > > > > > > Yeah, this should also work but I feel the current one is explicit > > > > > and more clear. > > > > > > > > OK. 
> > > > > > > > But, I have one last comment about this function -- I saw there are > > > > already other functions that iterate max_logical_replication_workers > > > > like this looking for things: > > > > - logicalrep_worker_find > > > > - logicalrep_workers_find > > > > - logicalrep_worker_launch > > > > - logicalrep_sync_worker_count > > > > > > > > So I felt this new function (currently called > > > > GetLeaderApplyWorkerPid) ought to be named similarly to those ones. > > > > e.g. call it something like "logicalrep_worker_find_pa_leader_pid". > > > > > > > > > > I am not sure we can use the name, because currently all the API name > > > in launcher that used by other module(not related to subscription) are > > > like AxxBxx style(see the functions in logicallauncher.h). > > > logicalrep_worker_xxx style functions are currently only declared in > > > worker_internal.h. > > > > > > > OK. I didn't know there was another header convention that you were > > following. > > In that case, it is fine to leave the name as-is. > > Thanks for confirming! > > Attach the new version 0001 patch which addressed all other comments. > OK. I checked the differences between patches v81-0001/v82-0001 and found everything I was expecting to see. I have no more review comments for v82-0001. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Jan 17, 2023 at 1:21 PM houzj.f...@fujitsu.com wrote: > > On Tuesday, January 17, 2023 5:43 AM Peter Smith > wrote: > > > > On Mon, Jan 16, 2023 at 5:41 PM Amit Kapila > > wrote: > > > > > > On Mon, Jan 16, 2023 at 10:24 AM Peter Smith > > wrote: > > > > > > > > 2. > > > > > > > > /* > > > > + * Return the pid of the leader apply worker if the given pid is > > > > +the pid of a > > > > + * parallel apply worker, otherwise return InvalidPid. > > > > + */ > > > > +pid_t > > > > +GetLeaderApplyWorkerPid(pid_t pid) > > > > +{ > > > > + int leader_pid = InvalidPid; > > > > + int i; > > > > + > > > > + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED); > > > > + > > > > + for (i = 0; i < max_logical_replication_workers; i++) { > > > > + LogicalRepWorker *w = >workers[i]; > > > > + > > > > + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid) { > > > > + leader_pid = w->leader_pid; break; } } > > > > + > > > > + LWLockRelease(LogicalRepWorkerLock); > > > > + > > > > + return leader_pid; > > > > +} > > > > > > > > 2a. > > > > IIUC the IsParallelApplyWorker macro does nothing except check that > > > > the leader_pid is not InvalidPid anyway, so AFAIK this algorithm > > > > does not benefit from using this macro because we will want to > > > > return InvalidPid anyway if the given pid matches. > > > > > > > > So the inner condition can just say: > > > > > > > > if (w->proc && w->proc->pid == pid) > > > > { > > > > leader_pid = w->leader_pid; > > > > break; > > > > } > > > > > > > > > > Yeah, this should also work but I feel the current one is explicit and > > > more clear. > > > > OK. 
> > > > But, I have one last comment about this function -- I saw there are already > > other functions that iterate max_logical_replication_workers like this > > looking > > for things: > > - logicalrep_worker_find > > - logicalrep_workers_find > > - logicalrep_worker_launch > > - logicalrep_sync_worker_count > > > > So I felt this new function (currently called GetLeaderApplyWorkerPid) ought > > to be named similarly to those ones. e.g. call it something like > > "logicalrep_worker_find_pa_leader_pid". > > > > I am not sure we can use the name, because currently all the API name in > launcher that > used by other module(not related to subscription) are like > AxxBxx style(see the functions in logicallauncher.h). > logicalrep_worker_xxx style functions are currently only declared in > worker_internal.h. > OK. I didn't know there was another header convention that you were following. In that case, it is fine to leave the name as-is. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
On Mon, Jan 16, 2023 at 5:41 PM Amit Kapila wrote: > > On Mon, Jan 16, 2023 at 10:24 AM Peter Smith wrote: > > > > 2. > > > > /* > > + * Return the pid of the leader apply worker if the given pid is the pid > > of a > > + * parallel apply worker, otherwise return InvalidPid. > > + */ > > +pid_t > > +GetLeaderApplyWorkerPid(pid_t pid) > > +{ > > + int leader_pid = InvalidPid; > > + int i; > > + > > + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED); > > + > > + for (i = 0; i < max_logical_replication_workers; i++) > > + { > > + LogicalRepWorker *w = >workers[i]; > > + > > + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid) > > + { > > + leader_pid = w->leader_pid; > > + break; > > + } > > + } > > + > > + LWLockRelease(LogicalRepWorkerLock); > > + > > + return leader_pid; > > +} > > > > 2a. > > IIUC the IsParallelApplyWorker macro does nothing except check that > > the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does > > not benefit from using this macro because we will want to return > > InvalidPid anyway if the given pid matches. > > > > So the inner condition can just say: > > > > if (w->proc && w->proc->pid == pid) > > { > > leader_pid = w->leader_pid; > > break; > > } > > > > Yeah, this should also work but I feel the current one is explicit and > more clear. OK. But, I have one last comment about this function -- I saw there are already other functions that iterate max_logical_replication_workers like this looking for things: - logicalrep_worker_find - logicalrep_workers_find - logicalrep_worker_launch - logicalrep_sync_worker_count So I felt this new function (currently called GetLeaderApplyWorkerPid) ought to be named similarly to those ones. e.g. call it something like "logicalrep_worker_find_pa_leader_pid". > > > ~ > > > > 2b. > > A possible alternative comment. > > > > BEFORE > > Return the pid of the leader apply worker if the given pid is the pid > > of a parallel apply worker, otherwise return InvalidPid. 
> > > > > > AFTER > > If the given pid has a leader apply worker then return the leader pid, > > otherwise, return InvalidPid. > > > > I don't think that is an improvement. > > > == > > > > src/backend/utils/adt/pgstatfuncs.c > > > > 3. > > > > @@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS) > > values[28] = Int32GetDatum(leader->pid); > > nulls[28] = false; > > } > > + else > > + { > > + int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid); > > + > > + if (leader_pid != InvalidPid) > > + { > > + values[28] = Int32GetDatum(leader_pid); > > + nulls[28] = false; > > + } > > + > > > > 3a. > > There is an existing comment preceding this if/else but it refers only > > to leaders of parallel groups. Should that comment be updated to > > mention the leader apply worker too? > > > > Yeah, we can slightly adjust the comments. How about something like the below: > index 415e711729..7eb668634a 100644 > --- a/src/backend/utils/adt/pgstatfuncs.c > +++ b/src/backend/utils/adt/pgstatfuncs.c > @@ -410,9 +410,9 @@ pg_stat_get_activity(PG_FUNCTION_ARGS) > > /* > * If a PGPROC entry was retrieved, display > wait events and lock > -* group leader information if any. To avoid > extra overhead, no > -* extra lock is being held, so there is no guarantee > of > -* consistency across multiple rows. > +* group leader or apply leader information if > any. To avoid extra > +* overhead, no extra lock is being held, so > there is no guarantee > +* of consistency across multiple rows. > */ > if (proc != NULL) > { > @@ -428,7 +428,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS) > /* > * Show the leader only for active > parallel workers. This > * leaves the field as NULL for the > leader of a parallel > -* group. > +* group or the leader of a parallel apply. > */ > if (leader && leader->pid != > beentry->st_procpid) > The updated comment LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are some review comments for v81-0001. == Commit Message 1. Additionally, update the leader_pid column in pg_stat_activity as well to display the PID of the leader apply worker for parallel apply workers. ~ Probably it should not say both "Additionally" and "as well" in the same sentence. == src/backend/replication/logical/launcher.c 2. /* + * Return the pid of the leader apply worker if the given pid is the pid of a + * parallel apply worker, otherwise return InvalidPid. + */ +pid_t +GetLeaderApplyWorkerPid(pid_t pid) +{ + int leader_pid = InvalidPid; + int i; + + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED); + + for (i = 0; i < max_logical_replication_workers; i++) + { + LogicalRepWorker *w = &LogicalRepCtx->workers[i]; + + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid) + { + leader_pid = w->leader_pid; + break; + } + } + + LWLockRelease(LogicalRepWorkerLock); + + return leader_pid; +} 2a. IIUC the IsParallelApplyWorker macro does nothing except check that the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does not benefit from using this macro because we will want to return InvalidPid anyway if the given pid matches. So the inner condition can just say: if (w->proc && w->proc->pid == pid) { leader_pid = w->leader_pid; break; } ~ 2b. A possible alternative comment. BEFORE Return the pid of the leader apply worker if the given pid is the pid of a parallel apply worker, otherwise return InvalidPid. AFTER If the given pid has a leader apply worker then return the leader pid, otherwise, return InvalidPid. == src/backend/utils/adt/pgstatfuncs.c 3. @@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS) values[28] = Int32GetDatum(leader->pid); nulls[28] = false; } + else + { + int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid); + + if (leader_pid != InvalidPid) + { + values[28] = Int32GetDatum(leader_pid); + nulls[28] = false; + } + 3a. 
There is an existing comment preceding this if/else but it refers only to leaders of parallel groups. Should that comment be updated to mention the leader apply worker too? ~ 3b. It may be unrelated to this patch, but it seems strange to me that the nulls[28]/values[28] assignments are done where they are. Every other nulls/values assignment of this function here is pretty much in the correct numerical order except this one, so IMO this code ought to be relocated to later in this same function. -- Kind Regards, Peter Smith. Fujitsu Australia.
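Stripped of the locking and shared-memory details, the lookup under discussion reduces to a linear scan of the worker slots. A standalone sketch with hypothetical types — it mirrors, but does not reproduce, the launcher.c code — also shows why comment 2a's simplified condition works: a non-parallel worker's slot carries InvalidPid as its leader anyway.

```c
#include <assert.h>
#include <stdbool.h>

#define InvalidPid (-1)

/* Hypothetical slimmed-down worker slot; the real one is LogicalRepWorker. */
typedef struct WorkerSlot
{
	bool		in_use;
	int			pid;
	int			leader_pid;		/* InvalidPid unless parallel apply worker */
} WorkerSlot;

/*
 * If the given pid belongs to a parallel apply worker, return its leader's
 * pid; otherwise InvalidPid.  The real function additionally takes
 * LogicalRepWorkerLock in shared mode around this scan.
 */
int
leader_pid_for(const WorkerSlot *slots, int nslots, int pid)
{
	for (int i = 0; i < nslots; i++)
	{
		const WorkerSlot *w = &slots[i];

		if (w->in_use && w->pid == pid)
			return w->leader_pid;	/* InvalidPid for non-parallel workers */
	}
	return InvalidPid;
}
```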
Re: Perform streaming logical transactions by background workers and parallel apply
Here are some review comments for patch v79-0002. == General 1. I saw that earlier in this thread Hou-san [1] and Amit [2] also seemed to say there is not much point for this patch. So I wanted to +1 that same opinion. I feel this patch just adds more complexity for almost no gain: - reducing the 'max_parallel_apply_workers_per_subscription' seems not very common in the first place. - even when the GUC is reduced, at that point in time all the workers might be in use so there may be nothing that can be immediately done. - IIUC the excess workers (for a reduced GUC) are going to get freed naturally anyway over time as more transactions are completed so the pool size will reduce accordingly. ~ OTOH some refactoring parts of this patch (e.g. the new pa_stop_worker function) look better to me. I would keep those ones but remove all the pa_stop_idle_workers function/call. *** NOTE: The remainder of these review comments are maybe only relevant if you are going to keep this pa_stop_idle_workers behaviour... == Commit message 2. If the max_parallel_apply_workers_per_subscription is changed to a lower value, try to stop free workers in the pool to keep the number of workers lower than half of the max_parallel_apply_workers_per_subscription SUGGESTION If the GUC max_parallel_apply_workers_per_subscription is changed to a lower value, try to stop unused workers to keep the pool size lower than half of max_parallel_apply_workers_per_subscription. == .../replication/logical/applyparallelworker.c 3. pa_free_worker if (winfo->serialize_changes || list_length(ParallelApplyWorkerPool) > (max_parallel_apply_workers_per_subscription / 2)) { pa_stop_worker(winfo); return; } winfo->in_use = false; winfo->serialize_changes = false; ~ IMO the above code can be more neatly written using if/else because then there is only one return point, and there is a place to write the explanatory comment about the else. 
SUGGESTION if (winfo->serialize_changes || list_length(ParallelApplyWorkerPool) > (max_parallel_apply_workers_per_subscription / 2)) { pa_stop_worker(winfo); } else { /* Don't stop the worker. Only mark it available for re-use. */ winfo->in_use = false; winfo->serialize_changes = false; } == src/backend/replication/logical/worker.c 4. pa_stop_idle_workers /* * Try to stop parallel apply workers that are not in use to keep the number of * workers lower than half of the max_parallel_apply_workers_per_subscription. */ void pa_stop_idle_workers(void) { List*active_workers; ListCell *lc; int max_applyworkers = max_parallel_apply_workers_per_subscription / 2; if (list_length(ParallelApplyWorkerPool) <= max_applyworkers) return; active_workers = list_copy(ParallelApplyWorkerPool); foreach(lc, active_workers) { ParallelApplyWorkerInfo *winfo = (ParallelApplyWorkerInfo *) lfirst(lc); pa_stop_worker(winfo); /* Recheck the number of workers. */ if (list_length(ParallelApplyWorkerPool) <= max_applyworkers) break; } list_free(active_workers); } ~ 4a. function comment SUGGESTION Try to keep the worker pool size lower than half of the max_parallel_apply_workers_per_subscription. ~ 4b. function name This is not stopping all idle workers, so maybe a more meaningful name for this function is something more like "pa_reduce_workerpool" ~ 4c. IMO the "max_applyworkers" var is a misleading name. Maybe something like "goal_poolsize" is better? ~ 4d. Maybe I misunderstand the logic for the pool, but shouldn't this be checking the winfo->in_use flag before blindly stopping each worker? == src/backend/replication/logical/worker.c 5. @@ -3630,6 +3630,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received) { ConfigReloadPending = false; ProcessConfigFile(PGC_SIGHUP); + + /* + * Try to stop free workers in the pool in case the + * max_parallel_apply_workers_per_subscription is changed to a + * lower value. + */ + pa_stop_idle_workers(); } 5a. 
SUGGESTED COMMENT
If max_parallel_apply_workers_per_subscription is changed to a lower
value, try to reduce the worker pool to match.

~

5b.

Instead of unconditionally calling pa_stop_idle_workers, shouldn't
this code compare the value of
max_parallel_apply_workers_per_subscription before/after the
ProcessConfigFile, so it only calls the function if the GUC was
lowered?

------
[1] Hou-san - https://www.postgresql.org/message-id/OS0PR01MB5716E527412A3481F90B4397941A9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] Amit - https://www.postgresql.org/message-id/CAA4eK1J%3D9m-VNRMHCqeG8jpX0CTn3Ciad2o4H-ogrZMDJ3tn4w%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
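The guard suggested in comment 5b can be sketched in isolation. The code below is a hypothetical stand-in (an integer "pool" replaces the real ParallelApplyWorkerPool list, and the ProcessConfigFile call is simulated); it only shows the shape of "remember the old GUC value, reload, and trim only if the value was lowered":

```c
#include <assert.h>

/* Hypothetical stand-ins for the real GUC and worker pool. */
static int max_parallel_apply_workers_per_subscription = 4;
static int pool_size = 5;

/* Stand-in for pa_stop_idle_workers(): trim the pool to half the GUC. */
static void
pa_stop_idle_workers(void)
{
    int goal_poolsize = max_parallel_apply_workers_per_subscription / 2;

    while (pool_size > goal_poolsize)
        pool_size--;            /* the real code would call pa_stop_worker() */
}

/*
 * Stand-in for the SIGHUP branch of LogicalRepApplyLoop: remember the GUC
 * value, reload the config, and trim the pool only if the GUC was lowered.
 */
static void
reload_config(int new_guc_value)
{
    int old_max = max_parallel_apply_workers_per_subscription;

    /* ProcessConfigFile(PGC_SIGHUP) would happen here */
    max_parallel_apply_workers_per_subscription = new_guc_value;

    if (max_parallel_apply_workers_per_subscription < old_max)
        pa_stop_idle_workers();
}
```

With this shape, raising the GUC leaves the pool untouched, and only an actual decrease pays the cost of walking the pool.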
Re: Perform streaming logical transactions by background workers and parallel apply
On Fri, Jan 13, 2023 at 2:37 PM Amit Kapila wrote:
>
> > 3.
> >
> > + leader_pid integer
> > +
> > +
> > + Process ID of the leader apply worker if this process is a parallel
> > + apply worker; NULL if this process is a leader apply worker or does not
> > + participate in parallel apply, or a synchronization worker
> > +
> >
> > I felt this change is giving too many details and ended up just
> > muddying the water.
> >
>
> I see that we give a similar description for other parameters as well.
> For example leader_pid in pg_stat_activity, see client_dn,
> client_serial in pg_stat_ssl. It is better to be consistent here and
> this gives the reader a bit more information when the value is NULL
> for the new column.
>

It is OK to give extra details as those other examples do, but my
point -- where I wrote "the leader apply worker and the (not leader)
apply worker are one-and-the-same process" -- was that there are
currently only 3 kinds of workers possible (leader apply, parallel
apply, tablesync). If it is not a "parallel apply" worker then it can
only be one of the other 2. So I think it is sufficient and less
confusing to say:

Process ID of the leader apply worker if this process is a parallel
apply worker; NULL if this process is a leader apply worker or a
synchronization worker.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
Here are my review comments for v79-0001.

==
General

1.

When Amit suggested [1] changing the name just to "leader_pid" instead
of "leader_apply_pid", I thought he was only referring to changing the
view column name, not also the internal member names of the worker
structure. Maybe it is OK anyway, but please check if that was the
intention.

==
Commit message

2.

leader_pid is the process ID of the leader apply worker if this
process is a parallel apply worker. If this field is NULL, it
indicates that the process is a leader apply worker or does not
participate in parallel apply, or a synchronization worker.

~

This text is just cut/paste from monitoring.sgml. In a review comment
below I suggest some changes to that text, so this commit message
should then be changed to match.

==
doc/src/sgml/monitoring.sgml

3.

+ leader_pid integer
+
+
+ Process ID of the leader apply worker if this process is a parallel
+ apply worker; NULL if this process is a leader apply worker or does not
+ participate in parallel apply, or a synchronization worker
+

I felt this change is giving too many details and ended up just
muddying the water. E.g. now this says basically "NULL if AAA or BBB,
or CCC", but that makes it sound like there are 3 other things the
process could be instead of a parallel worker. That is not really true
unless you are making some distinction between the main "apply worker"
which is a leader versus a main apply worker which is not a leader.
IMO we should not be making any distinction at all - the leader apply
worker and the main (not leader) apply worker are one-and-the-same
process.

So, I still prefer my previous suggestion (see [2] #5b).

==
src/backend/catalog/system_views.sql

4.

@@ -949,6 +949,7 @@ CREATE VIEW pg_stat_subscription AS
     su.oid AS subid,
     su.subname,
     st.pid,
+    st.leader_pid,
     st.relid,
     st.received_lsn,
     st.last_msg_send_time,

IMO it would be very useful to have an additional "kind" attribute for
this view.
This will save the user from needing to do mental gymnastics every
time just to recognise what kind of process they are looking at.

For example, I tried this:

CREATE VIEW pg_stat_subscription AS
    SELECT
        su.oid AS subid,
        su.subname,
        CASE
            WHEN st.relid IS NOT NULL THEN 'tablesync'
            WHEN st.leader_pid IS NOT NULL THEN 'parallel apply'
            ELSE 'leader apply'
        END AS kind,
        st.pid,
        st.leader_pid,
        st.relid,
        st.received_lsn,
        st.last_msg_send_time,
        st.last_msg_receipt_time,
        st.latest_end_lsn,
        st.latest_end_time
    FROM pg_subscription su
        LEFT JOIN pg_stat_get_subscription(NULL) st
            ON (st.subid = su.oid);

and it results in much more readable output IMO:

test_sub=# select * from pg_stat_subscription;
 subid | subname |     kind     | pid  | leader_pid | relid | received_lsn |      last_msg_send_time       |     last_msg_receipt_time     | latest_end_lsn |        latest_end_time
-------+---------+--------------+------+------------+-------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------
 16388 | sub1    | leader apply | 5281 |            |       | 0/1901378    | 2023-01-13 12:39:03.984249+11 | 2023-01-13 12:39:03.986157+11 | 0/1901378      | 2023-01-13 12:39:03.984249+11
(1 row)

Thoughts?

--
[1] Amit - https://www.postgresql.org/message-id/CAA4eK1KYUbnthSPyo4VjnhMygB0c1DZtp0XC-V2-GSETQ743ww%40mail.gmail.com
[2] My v78-0001 review - https://www.postgresql.org/message-id/CAHut%2BPvA10Bp9Jaw9OS2%2BpuKHr7ry_xB3Tf2-bbv5gyxD5E_gw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Perform streaming logical transactions by background workers and parallel apply
~~~

10.

@@ -3249,7 +3263,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
     Time of last write-ahead log location reported to origin WAL
-    sender
+    sender; null for the parallel apply worker

(same as #6)

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

==
src/backend/catalog/system_views.sql

11.

@@ -949,6 +949,7 @@ CREATE VIEW pg_stat_subscription AS
     su.oid AS subid,
     su.subname,
     st.pid,
+    st.apply_leader_pid,
     st.relid,
     st.received_lsn,
     st.last_msg_send_time,

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

==
src/backend/replication/logical/launcher.c

12.

+ if (worker.apply_leader_pid == InvalidPid)
      nulls[3] = true;
  else
-     values[3] = LSNGetDatum(worker.last_lsn);
- if (worker.last_send_time == 0)
+     values[3] = Int32GetDatum(worker.apply_leader_pid);
+

12a.
(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

~~

12b.
I wondered if the code here should be using the
isParallelApplyWorker(worker) macro for readability. e.g.

if (isParallelApplyWorker(worker))
    values[3] = Int32GetDatum(worker.apply_leader_pid);
else
    nulls[3] = true;

==
src/include/catalog/pg_proc.dat

13.

+ proallargtypes => '{oid,oid,oid,int4,int4,pg_lsn,timestamptz,timestamptz,pg_lsn,timestamptz}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subid,relid,pid,apply_leader_pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}',

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

==
src/test/regress/expected/rules.out

14.
@@ -2094,6 +2094,7 @@ pg_stat_ssl| SELECT s.pid,
 pg_stat_subscription| SELECT su.oid AS subid,
     su.subname,
     st.pid,
+    st.apply_leader_pid,
     st.relid,
     st.received_lsn,
     st.last_msg_send_time,
@@ -2101,7 +2102,7 @@ pg_stat_subscription| SELECT su.oid AS subid,
     st.latest_end_lsn,
     st.latest_end_time
    FROM (pg_subscription su
-     LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+     LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, apply_leader_pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
 pg_stat_subscription_stats| SELECT ss.subid,
     s.subname,
     ss.apply_error_count,

(Same comment as elsewhere)

"apply_leader_pid" --> "leader_apply_pid"

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: [DOCS] Stats views and functions not in order?
On Wed, Jan 4, 2023 at 6:08 PM vignesh C wrote:
>
> On Mon, 2 Jan 2023 at 13:47, Peter Eisentraut wrote:
> >
> > On 08.12.22 03:30, Peter Smith wrote:
> > > PSA patches for v9*
> > >
> > > v9-0001 - Now the table rows are ordered per PeterE's suggestions [1]
> >
> > committed

Thanks for pushing.

> > > v9-0002 - All the review comments from DavidJ [2] are addressed
> >
> > I'm not sure about this one. It removes the "see [link] for details"
> > phrases and instead makes the view name a link. I think this loses the
> > cue that there is more information elsewhere. Otherwise, one could
> > think that, say, the entry about pg_stat_activity is the primary source
> > and the link just links to itself. Also keep in mind that people use
> > media where links are not that apparent (PDF), so the presence of a link
> > by itself cannot be the only cue about the flow of the information.

PSA new patch for v10-0001:

v9-0001 --> pushed, thanks!
v9-0002 --> removed, based on the rejection reason above
v9-0003 --> v10-0001

> I'm not sure if anything is pending for v9-0003, if there is something
> pending, please post an updated patch for the same.
>

Thanks for the reminder. PSA v10.

--
Kind Regards,
Peter Smith.
Fujitsu Australia

v10-0001-Add-Statistics-Views-section-and-refentry-for-ea.patch
Description: Binary data
Re: pgsql: Doc: Explain about Column List feature.
On Wed, Dec 21, 2022 at 6:59 PM Alvaro Herrera wrote:
>
> On 2022-Dec-21, Peter Smith wrote:
>
> > By "searching" I also meant just scanning visually, although I was
> > thinking more about scanning the PDF.
> >
> > Right now, the intention of any text box is obvious at a glance
> > because of those titles like "Caution", "Tip", "Note", "Warning".
> > Sure, the HTML rendering also uses colours to convey the purpose, but
> > in the PDF version [1] everything is black-and-white so apart from the
> > title all boxes look the same. That's why I felt using non-standard
> > box titles might be throwing away some of the meaning - e.g. the
> > reader of the PDF won't know anymore at a glance whether they are
> > looking at a warning or a tip.
>
> Oh, I see. It's been so long since I last looked at the PDFs that I
> failed to realize that they don't use color. I agree that would be a
> problem. Maybe we can change the title to have the word:
>
> Warning: Combining Column Lists from Multiple Publications
>

That last idea LGTM. But no patch at all LGTM too.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.
Re: Force streaming every change in logical decoding
On Wed, Dec 21, 2022 at 6:22 PM Masahiko Sawada wrote:
>
> On Tue, Dec 20, 2022 at 7:49 PM Amit Kapila wrote:
> >
> > On Tue, Dec 20, 2022 at 2:46 PM Hayato Kuroda (Fujitsu) wrote:
> > >
> > > Dear hackers,
> > >
> > > > We have discussed three different ways to provide GUC for these
> > > > features. (1) Have separate GUCs like force_server_stream_mode,
> > > > force_server_serialize_mode, force_client_serialize_mode (we can use
> > > > different names for these) for each of these; (2) Have two sets of
> > > > GUCs for server and client. We can have logical_decoding_mode with
> > > > values as 'stream' and 'serialize' for the server and then
> > > > logical_apply_serialize = true/false for the client. (3) Have one GUC
> > > > like logical_replication_mode with values as 'server_stream',
> > > > 'server_serialize', 'client_serialize'.
> > >
> > > I also agreed with adding new GUC parameters (and I have already done
> > > so partially in parallel apply[1]), and basically option 2 made sense
> > > to me. But is it OK that we can choose "serialize" mode even if
> > > subscribers require streaming?
> > >
> > > Currently the reorder buffer transactions are serialized on the
> > > publisher only when there are no streamable transactions. So what
> > > happens if logical_decoding_mode = "serialize" but the streaming
> > > option is on? If we break the first one and serialize changes on the
> > > publisher anyway, it may not be suitable for testing normal operation.
> > >
> >
> > I think the change will be streamed as soon as the next change is
> > processed even if we serialize based on this option. See
> > ReorderBufferProcessPartialChange. However, I see your point that when
> > the streaming option is given, the value 'serialize' for this GUC may
> > not make much sense.
> >
> > > > > > "normal" is a default value, which is same as current HEAD. Changes are > > > streamed > > > or serialized when the buffered size exceeds logical_decoding_work_mem. > > > > > > When users set to "immediate", the walsenders starts to stream or > > > serialize all > > > changes. The choice is depends on the subscription option. > > > > > > > The other possibility to achieve what you are saying is that we allow > > a minimum value of logical_decoding_work_mem as 0 which would mean > > stream or serialize each change depending on whether the streaming > > option is enabled. I think we normally don't allow a minimum value > > below a certain threshold for other *_work_mem parameters (like > > maintenance_work_mem, work_mem), so we have followed the same here. > > And, I think it makes sense from the user's perspective because below > > a certain threshold it will just add overhead by either writing small > > changes to the disk or by sending those over the network. However, it > > can be quite useful for testing/debugging. So, not sure, if we should > > restrict setting logical_decoding_work_mem below a certain threshold. > > What do you think? > > I agree with (2), having separate GUCs for publisher side and > subscriber side. Also, on the publisher side, Amit's idea, controlling > the logical decoding behavior by changing logical_decoding_work_mem, > seems like a good idea. > > But I'm not sure it's a good idea if we lower the minimum value of > logical_decoding_work_mem to 0. I agree it's helpful for testing and > debugging but setting logical_decoding_work_mem = 0 doesn't benefit > users at all, rather brings risks. > > I prefer the idea Kuroda-san previously proposed; setting > logical_decoding_mode = 'immediate' means setting > logical_decoding_work_mem = 0. We might not need to have it as an enum > parameter since it has only two values, though. 
Did you mean that one GUC (logical_decoding_mode) will cause a
side-effect implicit value change on another GUC value
(logical_decoding_work_mem)? If so, then that seems like a potential
source of confusion IMO.

- e.g. the actual value is invisibly set differently from what the
user sees in the conf file

- e.g. will it depend on the order they get assigned?

Are there any GUC precedents for something like that?

--
Kind Regards,
Peter Smith.
Fujitsu Australia
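The order-dependence concern above can be illustrated in miniature. The snippet below is purely hypothetical stand-in code (plain variables and functions, not PostgreSQL's actual GUC assign-hook machinery): one "GUC" whose assignment has the proposed side-effect on another, showing why the final state can depend on assignment order:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the two GUCs under discussion. */
static int  logical_decoding_work_mem = 65536;
static char logical_decoding_mode[16] = "normal";

/* A stand-in "assign hook": setting the mode clobbers work_mem. */
static void
assign_logical_decoding_mode(const char *newval)
{
    strncpy(logical_decoding_mode, newval, sizeof(logical_decoding_mode) - 1);
    if (strcmp(newval, "immediate") == 0)
        logical_decoding_work_mem = 0;      /* invisible side-effect */
}

static void
assign_logical_decoding_work_mem(int newval)
{
    logical_decoding_work_mem = newval;
}
```

If both settings appear in the conf file, the effective value of logical_decoding_work_mem differs depending on which assignment runs last, even though the file contents are identical in both cases.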
Re: Force streaming every change in logical decoding
Here are some review comments for patch v2.

Since the GUC is still under design, maybe these comments can be
ignored for now, but I guess similar comments will apply in the future
anyhow (just with some name changes).

==

1. Commit message

Add a new GUC force_stream_mode, when it is set on, send the change to
output plugin immediately in streaming mode. Otherwise, send until
logical_decoding_work_mem is exceeded.

~

Is that quite right? I thought it was more like shown below:

SUGGESTION
Add a new GUC 'force_stream_mode' which modifies the behavior when
streaming mode is enabled. If force_stream_mode is on, the changes are
sent to the output plugin immediately. Otherwise (when
force_stream_mode is off), changes are written to memory until
logical_decoding_work_mem is exceeded.

==

2. doc/src/sgml/config.sgml

+
+ Specifies whether to force sending the changes to output plugin
+ immediately in streaming mode. If set to off (the
+ default), send until logical_decoding_work_mem is
+ exceeded.
+

Suggest a slight rewording like #1.

SUGGESTION
This modifies the behavior when streaming mode is enabled. If set to
on, the changes are sent to the output plugin immediately. If set to
off (the default), changes are written to memory until
logical_decoding_work_mem is exceeded.

==

3. More docs.

It might be helpful if this developer option is also referenced on the
page "31.10.1 Logical Replication > Configuration Settings >
Publishers" [1]

==
src/backend/replication/logical/reorderbuffer.c

4. GUCs

+/*
+ * Whether to send the change to output plugin immediately in streaming mode.
+ * When it is off, wait until logical_decoding_work_mem is exceeded.
+ */
+bool force_stream_mode;

4a.
"to output plugin" -> "to the output plugin"

~

4b.
By convention (see [2]) IIUC there should be some indication that
these (this and 'logical_decoding_work_mem') are GUC variables. e.g.
these should be refactored to be grouped together without the other
static var in between.
And add a comment for them both like:

/* GUC variables. */

~

4c.
Also (see [2]), it makes the code more readable to give the GUC an
explicit initial value.

SUGGESTION
bool force_stream_mode = false;

~~~

5. ReorderBufferCheckMemoryLimit

+ /* we know there has to be one, because the size is not zero */

Uppercase comment. Although not strictly added by this patch, you
might as well make this change while adjusting the indentation.

==
src/backend/utils/misc/guc_tables.c

6.

+ {
+     {"force_stream_mode", PGC_USERSET, DEVELOPER_OPTIONS,
+         gettext_noop("Force sending the changes to output plugin immediately if streaming is supported, without waiting till logical_decoding_work_mem."),
+         NULL,
+         GUC_NOT_IN_SAMPLE
+     },
+     &force_stream_mode,
+     false,
+     NULL, NULL, NULL
+ },

"without waiting till logical_decoding_work_mem." seems like an
incomplete sentence.

SUGGESTION
Force sending any streaming changes to the output plugin immediately,
without waiting until logical_decoding_work_mem is exceeded.

--
[1] https://www.postgresql.org/docs/devel/logical-replication-config.html
[2] GUC declarations - https://github.com/postgres/postgres/commit/d9d873bac67047cfacc9f5ef96ee488f2cb0f1c3

Kind Regards,
Peter Smith.
Fujitsu Australia
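For reference, the intended behavior described in these comments boils down to a single decision. The sketch below is illustrative stand-in code only (the names mirror the patch, but the real check lives inside ReorderBufferCheckMemoryLimit and is considerably more involved):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GUCs in this patch. */
static bool   force_stream_mode = false;    /* explicit initial value, per 4c */
static size_t logical_decoding_work_mem = 65536;    /* pretend units */

/*
 * Stand-in for the decision being reviewed: should the buffered changes
 * be sent to the output plugin now?
 */
static bool
should_stream_now(size_t buffered_size)
{
    if (force_stream_mode)
        return true;            /* stream each change immediately */

    /* default behavior: only once the memory limit is exceeded */
    return buffered_size > logical_decoding_work_mem;
}
```

In other words, with the GUC on, the logical_decoding_work_mem threshold is bypassed entirely, which is what makes it useful as a developer/testing option.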
Re: Perform streaming logical transactions by background workers and parallel apply
On Tue, Dec 20, 2022 at 5:22 PM Peter Smith wrote:
>
> On Tue, Dec 20, 2022 at 2:20 PM Amit Kapila wrote:
> >
> > On Tue, Dec 20, 2022 at 8:17 AM Peter Smith wrote:
> > >
> > > Summary
> > > -------
> > >
> > > In summary, everything I have tested so far appeared to be working
> > > properly. In other words, for overlapping streamed transactions of
> > > different kinds, and regardless of whether zero/some/all of those
> > > transactions are getting processed by a PA worker, the resulting
> > > replicated data looked consistently OK.
> > >
> >
> > Thanks for doing the detailed testing of this patch. I think the one
> > area where we can focus more is the switch-to-serialization mode while
> > sending changes to the parallel worker.
> >
> > >
> > > NOTE - all testing described in this post above was using v58-0001
> > > only. However, the point of implementing these as a .spec test was to
> > > be able to repeat these same regression tests on newer versions with
> > > minimal manual steps required. Later I plan to fetch/apply the most
> > > recent patch version and repeat these same tests.
> > >
> >
> > That would be really helpful.
> >

FYI, my pub-sub.spec tests gave the same result (i.e. pass) when
re-run with the latest v63 (0001,0002,0003) applied.

--
Kind Regards,
Peter Smith.
Fujitsu Australia