Re: Request for further clarification on synchronous_commit

2020-08-18 Thread Bruce Momjian
On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
> Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync.  If you want to go in order of severity, with the most severe
> first, it is:
> 
>   remote_apply
>   on
>   remote_write
>   local
> 
> This is seen in the C enum ordering for synchronous_commit, but in
> reverse order:
> 
>   typedef enum
>   {
>   SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
>   SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
>   SYNCHRONOUS_COMMIT_REMOTE_WRITE,/* wait for local flush and 
> remote
>* write */
>   SYNCHRONOUS_COMMIT_REMOTE_FLUSH,/* wait for local and remote 
> flush */
>   SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote 
> apply */
>   }   SyncCommitLevel;

Also, there is some logic to say that the postgresql.conf
synchronous_commit options list should be reordered from:

#synchronous_commit = on# synchronization level;
# off, local, remote_write, 
remote_apply, or on

to

#synchronous_commit = on# synchronization level;
# off, local, remote_write, on, 
or remote_apply

I think we should backpatch the doc changes, but maybe not the
postgresql.conf one --- I am not sure.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: need for "see also" section or similar thing at hub pages

2020-08-18 Thread David G. Johnston
On Wed, Aug 5, 2020 at 5:29 PM Bruce Momjian  wrote:

> On Tue, Aug  4, 2020 at 03:56:32PM +, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/12/functions-textsearch.html
> > Description:
> >
> > A suggestion. This page,  "functions-textsearch", is fine,  and the best
> > place to explain functions like array_to_tsvector() and
> jsonb_to_tsvector().
> > The suggestion is just to cite these native functions in the Guide's "hub
> > page" of respective prefixes...
>
> > Simple citation anywhere in the hub page, or a simple "See also"
> section  to
> > list the non-explained functions that can be also classifyed as valid
> > function for that hub page.
>
> Uh, I am not sure it helps to mention that there are text
> search-specific functions in the generic sections.
>

By that logic array_to_text should exist under string functions and not
array functions (I would seriously consider this specific change).
However, the premise that the array and json pages are "hub pages" is not
one that I support.  They are not lacking information but simply do not
attempt to fulfill such a broad mandate.  If a user wishs to know every
function/operator in their database that accepts an array as input or
outputs an array that would be better served by querying the database and
not scanning documentation.  Nor do I find it plausible that given a
function name array_to_tsvector that is difficult to locate - even without
using search - if the function is only listed in the text search section.

In short, I see no usability gap here that needs to be fixed.  I'll agree
that it could be different but I don't believe that it would be an obvious
improvement even if an actual patch were presented for discussion.

David J.


Re: obsolete indexing method "rtree"

2020-08-18 Thread David G. Johnston
On Fri, Aug 7, 2020 at 8:46 AM Jürgen Purtz  wrote:

>
> On 06.08.20 10:12, David G. Johnston wrote:
>
> On Thu, Aug 6, 2020 at 12:18 AM Jürgen Purtz  wrote:
>
>>
>> > Why?
>>
>> Because it can hinder the learning process.
>
>
>
> Such a "make current" patch that covers this complaint seems reasonable;
> IOW, why not if you are in there anyway - rtree has some baggage and no
> particular merit such that a different label would be worse.
>
>
> Patch with four additional comments and one modified comment attached.
>
Works for me.

David J.


Re: Request for further clarification on synchronous_commit

2020-08-18 Thread Bruce Momjian
On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> Hi, thanks for the reply.
> 
> To be honest I don't think it is better. Previously paragraph about
> remote_apply was after paragraph about `on` and before remote_write which
> followed natural order in terms of how strict these parameters are (i.e. how
> strong are the guarantees they provide). Because of that I think that
> remote_apply should return to its previous position.

Uh, not really --- see below.

> My original concern was about the fact that the difference between `on`,
> remote_write and remote_apply wasn't perfectly clear.
> I am not sure if I understand this difference correctly but maybe such a table
> could be helpful to me and others:
> 
> +-+---+
> | | synchronous_commit|
> +-+-+--+--+---+
> | operation on standby server | on  | remote_apply | remote_write | local |
> +-+-+--+--+---+
> | written to WAL  | Yes | Yes  | Yes  | No|
> +-+-+--+--+---+
> | commit transaction  | Yes | Yes  | No   | No|
> +-+-+--+--+---+
> | fsync   | Yes | No   | No   | No|
> +-+-+--+--+---+
> 
> From which we can clearly see that only `on` option guarantees fsync, and the
> only difference between remote_write and remote_apply is the visibility of
> transaction results to the queries.

Un, 'on' does _not_ apply the WAL data, and remote_apply does do remote
fsync.  If you want to go in order of severity, with the most severe
first, it is:

remote_apply
on
remote_write
local

This is seen in the C enum ordering for synchronous_commit, but in
reverse order:

typedef enum
{
SYNCHRONOUS_COMMIT_OFF, /* asynchronous commit */
SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
SYNCHRONOUS_COMMIT_REMOTE_WRITE,/* wait for local flush and 
remote
 * write */
SYNCHRONOUS_COMMIT_REMOTE_FLUSH,/* wait for local and remote 
flush */
SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote 
apply */
}   SyncCommitLevel;

and this defines the 'on' behavior:

/* Define the default setting for synchronous_commit */
#define SYNCHRONOUS_COMMIT_ON   SYNCHRONOUS_COMMIT_REMOTE_FLUSH

I will clarify this comment, and the docs, to say that remote_apply
includes remote flush.

Obviously these docs need improvement.  Updated patch attached.  I have
to admit I was kind of confused if remote_apply did remote fsync, but
never had the time to research it until you asked.  remote_apply is so
different from the rest, and so heavy, that I put it last in its own
paragraph.

> +         Finally, when set to remote_apply, commits will
> +         wait until replies from the current synchronous standby(s) indicate
> +         they have received the commit record of the transaction and applied
> +         it, so that it has become visible to queries on the standby(s).
> +         This can cause much larger commit delays than previous settings
> +         since it involves WAL replay.
> 'This can cause much' - What does it mean that it can cause? Under what
> circumstances it will/won't cause it?

Uh, I think we can change this to "will cause", because I can't think of
a case where it will not.

> "since it involves WAL replay" - What is a WAL replay? 

Well, there is a doc section that talks about WAL:

https://www.postgresql.org/docs/12/wal.html

and other parts of the config docs that talk about WAL.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 7a7177c..095caa1
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include_dir 'conf.d'
*** 2726,2752 
  for their WAL records to be replicated to the standby server(s).
  When set to on, commits will wait until replies
  from the current synchronous standby(s) indicate they have received
! the commit record of the transaction and flushed it to disk.  This
  ensures the transaction will not be lost unless both the primary and
  all synchronous standbys suffer corruption of their database storage.
- When set to remote_apply, commits will wait until replies
- from the current synchro

Re: Request for further clarification on synchronous_commit

2020-08-18 Thread Kasper Kondzielski
Hi, thanks for the reply.

To be honest I don't think it is better. Previously paragraph about
remote_apply was after paragraph about `on` and before remote_write which
followed natural order in terms of how strict these parameters are (i.e.
how strong are the guarantees they provide). Because of that I think that
remote_apply should return to its previous position.

My original concern was about the fact that the difference between `on`,
remote_write and remote_apply wasn't perfectly clear.
I am not sure if I understand this difference correctly but maybe such a
table could be helpful to me and others:

+-+---+
| | synchronous_commit|
+-+-+--+--+---+
| operation on standby server | on  | remote_apply | remote_write | local |
+-+-+--+--+---+
| written to WAL  | Yes | Yes  | Yes  | No|
+-+-+--+--+---+
| commit transaction  | Yes | Yes  | No   | No|
+-+-+--+--+---+
| fsync   | Yes | No   | No   | No|
+-+-+--+--+---+

>From which we can clearly see that only `on` option guarantees fsync, and
the only difference between remote_write and remote_apply is the visibility
of transaction results to the queries.

Also when it comes to the content of your reply, I have few questions:

+ Finally, when set to remote_apply, commits will
+ wait until replies from the current synchronous standby(s)
indicate
+ they have received the commit record of the transaction and
applied
+ it, so that it has become visible to queries on the standby(s).
+ This can cause much larger commit delays than previous settings
+ since it involves WAL replay.
'This can cause much' - What does it mean that it can cause? Under what
circumstances it will/won't cause it?

"since it involves WAL replay" - What is a WAL replay?

Best regards,
Kasper Kondzielski

pon., 17 sie 2020 o 19:47 Bruce Momjian  napisał(a):

> On Fri, Aug 14, 2020 at 01:32:35PM +, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/12/runtime-config-wal.html
> > Description:
> >
> > Hello,
> >
> > First of all I would like to say that PostgreSQL has the best
> documentation
> > I've ever seen. It is very clear and comprehensive. That's the main
> reason
> > why I decided to add my little 2 cents and make it even better.
> >
> > I think that the distinction between first three values of
> > synchronous_commit parameter is not clear enough
> > (
> https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
> ).
> >
> > " When set to on, commits will wait until replies from the current
> > synchronous standby(s) indicate they have received the commit record of
> the
> > transaction and flushed it to disk. This ensures the transaction will
> not be
> > lost unless both the primary and all synchronous standbys suffer
> corruption
> > of their database storage. When set to remote_apply, commits will wait
> until
> > replies from the current synchronous standby(s) indicate they have
> received
> > the commit record of the transaction and applied it, so that it has
> become
> > visible to queries on the standby(s). When set to remote_write, commits
> will
> > wait until replies from the current synchronous standby(s) indicate they
> > have received the commit record of the transaction and written it out to
> > their operating system. This setting is sufficient to ensure data
> > preservation even if a standby instance of PostgreSQL were to crash, but
> not
> > if the standby suffers an operating-system-level crash, since the data
> has
> > not necessarily reached stable storage on the standby"
> >
> > The last sentence : "This setting is sufficient to ensure data
> preservation
> > even if a standby instance..." seems to refer only to the remote_write
> > option while in my option it should refer to both remote_write and
> > remote_apply options, as the fsync is performed only when
> synchronous_commit
> > is set to ON.
> >
> > In other words I think that the documentation should be more clear in
> terms
> > of which option uses fsync.
>
> I think this paragraph just has just too complex.  I have moved the
> mention of remote_apply into its own paragraph, and simplified the
> sentences about remote_write.  Is this attached patch better?
>
> --
>   Bruce Momjian  https://momjian.us
>   EnterpriseDB https://enterprisedb.com
>
>   The usefulness of a cup is in its emptiness