Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-08 Thread Petr Jelinek
On 06/05/17 19:38, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 06/05/17 19:15, Tom Lane wrote:
>>> (Or, wait a minute.  That documentation only applies to v10, but we
>>> need to be writing this relnote for 9.6 users.  What terminology should
>>> we be using anyway?)
> 
>> Yeah we need to somehow mention that it only affects 3rd party tools
>> using logical decoding.
> 
>> "The initial snapshot created for a logical decoding slot was
>> potentially incorrect.  This could allow the 3rd party tools using
>> the logical decoding to copy incomplete existing(?) data.  This was
>> more likely to happen if the source server was busy at the time of
>> slot creation, or if two slots were created concurrently."
> 
>>> Also, do we need to recommend that people not trust any logical replicas
>>> at this point, but recreate them after installing the update?
> 
>> Yes, but only if there was preexisting data *and* there was concurrent
>> activity on the server when the "replication" was setup.
> 
> OK, I can work with this.  Thanks for the help!
> 

Great, thanks.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Tom Lane
Petr Jelinek  writes:
> On 06/05/17 19:15, Tom Lane wrote:
>> (Or, wait a minute.  That documentation only applies to v10, but we
>> need to be writing this relnote for 9.6 users.  What terminology should
>> we be using anyway?)

> Yeah we need to somehow mention that it only affects 3rd party tools
> using logical decoding.

> "The initial snapshot created for a logical decoding slot was
> potentially incorrect.  This could allow the 3rd party tools using
> the logical decoding to copy incomplete existing(?) data.  This was
> more likely to happen if the source server was busy at the time of
> slot creation, or if two slots were created concurrently."

>> Also, do we need to recommend that people not trust any logical replicas
>> at this point, but recreate them after installing the update?

> Yes, but only if there was preexisting data *and* there was concurrent
> activity on the server when the "replication" was setup.

OK, I can work with this.  Thanks for the help!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Petr Jelinek
On 06/05/17 19:15, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 06/05/17 18:16, Tom Lane wrote:
>>> Hmm, I'm hoping for something more user-oriented.  Is the corruption
>>> time-limited?  What's an "exported snapshot" anyway, is it the same
>>> thing as pg_export_snapshot(), and if so what's that got to do with
>>> logical replication?
> 
>> Okay to explain what's happening. When you create logical replication
>> slot via walsender, it exports snapshot (like the one exported by
>> pg_export_snapshot() indeed) which corresponds to exact point in time
>> where the decoding will start for the slot. You can import this snapshot
>> to another backend and use it to copy any existing data before starting
>> the replication. The bugs cause that these snapshots would be corrupted
>> and you'd have inconsistent data (some tuples missing). One of them
>> would export snapshot that's only valid for system catalogs but not user
>> tables (the ondisk snapshot, this needs at least one preexisting slot).
>> The other would not guarantee that tuples needed by the snapshot weren't
>> removed by vacuum of HOT pruning (the window being only the time it took
>> to generate the snapshot).
> 
> OK, that's better.  But I'm still not really seeing a reason to treat
> these as two separate items for release-note purposes: I don't think users

Sure my main issue was that the text combined the info from two commits
in a way that made the statement no longer correct.

> will care.  Now that I've read section 31.4, I'd suggest that we phrase
> the release notes in the terms it uses.  How about saying something like
> "The initial snapshot created for a logical replication slot was
> incorrect.  This could allow the apply process to copy incomplete or
> inconsistent data.  This was more likely to happen if the source server
> was busy at the time of slot creation, or if two slots were created
> concurrently" ?
> 
> (Or, wait a minute.  That documentation only applies to v10, but we
> need to be writing this relnote for 9.6 users.  What terminology should
> we be using anyway?)

Yeah we need to somehow mention that it only affects 3rd party tools
using logical decoding.

"The initial snapshot created for a logical decoding slot was
potentially incorrect.  This could allow the 3rd party tools using
the logical decoding to copy incomplete existing(?) data.  This was
more likely to happen if the source server was busy at the time of
slot creation, or if two slots were created concurrently."

> 
> Also, do we need to recommend that people not trust any logical replicas
> at this point, but recreate them after installing the update?
> 

Yes, but only if there was preexisting data *and* there was concurrent
activity on the server when the "replication" was setup.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Tom Lane
Petr Jelinek  writes:
> On 06/05/17 18:16, Tom Lane wrote:
>> Hmm, I'm hoping for something more user-oriented.  Is the corruption
>> time-limited?  What's an "exported snapshot" anyway, is it the same
>> thing as pg_export_snapshot(), and if so what's that got to do with
>> logical replication?

> Okay to explain what's happening. When you create logical replication
> slot via walsender, it exports snapshot (like the one exported by
> pg_export_snapshot() indeed) which corresponds to exact point in time
> where the decoding will start for the slot. You can import this snapshot
> to another backend and use it to copy any existing data before starting
> the replication. The bugs cause that these snapshots would be corrupted
> and you'd have inconsistent data (some tuples missing). One of them
> would export snapshot that's only valid for system catalogs but not user
> tables (the ondisk snapshot, this needs at least one preexisting slot).
> The other would not guarantee that tuples needed by the snapshot weren't
> removed by vacuum of HOT pruning (the window being only the time it took
> to generate the snapshot).

OK, that's better.  But I'm still not really seeing a reason to treat
these as two separate items for release-note purposes: I don't think users
will care.  Now that I've read section 31.4, I'd suggest that we phrase
the release notes in the terms it uses.  How about saying something like
"The initial snapshot created for a logical replication slot was
incorrect.  This could allow the apply process to copy incomplete or
inconsistent data.  This was more likely to happen if the source server
was busy at the time of slot creation, or if two slots were created
concurrently" ?

(Or, wait a minute.  That documentation only applies to v10, but we
need to be writing this relnote for 9.6 users.  What terminology should
we be using anyway?)

Also, do we need to recommend that people not trust any logical replicas
at this point, but recreate them after installing the update?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Petr Jelinek
On 06/05/17 18:16, Tom Lane wrote:
> Petr Jelinek  writes:
>> On 06/05/17 17:25, Tom Lane wrote:
>>> OK, can you suggest better wording?
> 
>> Something like the attached (it requires some polishing of English
>> probably).
> 
> Hmm, I'm hoping for something more user-oriented.  Is the corruption
> time-limited?  What's an "exported snapshot" anyway, is it the same
> thing as pg_export_snapshot(), and if so what's that got to do with
> logical replication?
> 

Well user does not care unless they use something like pglogical, or
bottledwater, or wal2json, etc.

Okay to explain what's happening. When you create logical replication
slot via walsender, it exports snapshot (like the one exported by
pg_export_snapshot() indeed) which corresponds to exact point in time
where the decoding will start for the slot. You can import this snapshot
to another backend and use it to copy any existing data before starting
the replication. The bugs cause that these snapshots would be corrupted
and you'd have inconsistent data (some tuples missing). One of them
would export snapshot that's only valid for system catalogs but not user
tables (the ondisk snapshot, this needs at least one preexisting slot).
The other would not guarantee that tuples needed by the snapshot weren't
removed by vacuum of HOT pruning (the window being only the time it took
to generate the snapshot).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Tom Lane
Petr Jelinek  writes:
> On 06/05/17 17:25, Tom Lane wrote:
>> OK, can you suggest better wording?

> Something like the attached (it requires some polishing of English
> probably).

Hmm, I'm hoping for something more user-oriented.  Is the corruption
time-limited?  What's an "exported snapshot" anyway, is it the same
thing as pg_export_snapshot(), and if so what's that got to do with
logical replication?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Petr Jelinek
On 06/05/17 17:25, Tom Lane wrote:
> Petr Jelinek  writes:
>> Hmm, the 2bef06d51 and 56e19d938 are fixes for different bugs, we can
>> keep them together since result of both is corrupted snapshot, but the
>> description can't just mangle pieces of text from different commits
>> together like this as that's misleading.
> 
> OK, can you suggest better wording?
> 

Something like the attached (it requires some polishing of English
probably).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/release-9.6.sgml b/doc/src/sgml/release-9.6.sgml
index cdff084..195cbb1 100644
--- a/doc/src/sgml/release-9.6.sgml
+++ b/doc/src/sgml/release-9.6.sgml
@@ -92,17 +92,24 @@ Branch: REL9_5_STABLE [54270d7eb] 2017-04-27 15:29:43 -0700
 Branch: REL9_4_STABLE [b6ecf26cc] 2017-04-27 15:29:52 -0700
 -->
  
-  Fix possibly-corrupt initial snapshot during logical decoding
-  (Petr Jelinek, Andres Freund)
+  Fix two bugs resulting in possibly-corrupt initial exported snapshot
+  during logical decoding (Petr Jelinek, Andres Freund)
  
 
  
   If a logical decoding replication slot was created while another slot
-  already exists, it was initialized with a potentially-corrupted
-  snapshot, allowing wrong data to be returned during decoding.
-  The time window during which this snapshot continued to be used
-  depended on how busy the server was; under low load it would be hard
-  to see any problem.
+  already exists, it would exported a potentially-corrupted snapshot,
+  allowing wrong data to be returned when such snapshot was imported to
+  another backend.
+ 
+
+ 
+  Logical decoding normally preserves all required historical catalog
+  tuples, it didn't guarantee that the non-catalog tuples also
+  were preserved for exported snapshot. This could have resulted in
+  concurrent VACUUM or HOT pruning to remove tuples needed by the
+  snapshot. The time window for this was very small and the problem
+  required server to be very busy during slot creation.
  
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Tom Lane
Petr Jelinek  writes:
> Hmm, the 2bef06d51 and 56e19d938 are fixes for different bugs, we can
> keep them together since result of both is corrupted snapshot, but the
> description can't just mangle pieces of text from different commits
> together like this as that's misleading.

OK, can you suggest better wording?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes for next week's back-branch releases

2017-05-06 Thread Petr Jelinek
On 06/05/17 01:37, Tom Lane wrote:
> I've written $SUBJECT ... you can find them at
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=54dbd4dc78b045ffcc046b9a43681770c3992dd4
> 

Hmm, the 2bef06d51 and 56e19d938 are fixes for different bugs, we can
keep them together since result of both is corrupted snapshot, but the
description can't just mangle pieces of text from different commits
together like this as that's misleading.

The 2bef06d51 is about not preserving global xmin correctly, window for
that is indeed small. But the 56e19d938 was about using ondisk snapshot
(which only contain transactions changing catalogs) as full snapshots,
it's more about creating multiple logical replication slots while
concurrent writes happen.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Draft release notes for next week's back-branch releases

2017-05-05 Thread Tom Lane
I've written $SUBJECT ... you can find them at

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=54dbd4dc78b045ffcc046b9a43681770c3992dd4

and in an hour or so they should be built in the devel docs at

https://www.postgresql.org/docs/devel/static/release-9-6-3.html

As usual, this one page contains entries for all back branches;
I'll split them up later.

Please review by Sunday.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Draft release notes for next week's back-branch releases

2016-08-06 Thread Tom Lane
I've put up draft notes for 9.5.4 at

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3676631c687009c2fadcb35e7d312e9eb8a98182

The website should have them after guaibasaurus's next run, due a couple
hours from now.

As usual, the older branches will have subsets of these notes (but there
is one item about pg_ctl that applies only to 9.1).  Please review in the
next 24 hours.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers