Re: [BUG]Update Toast data failure in logical replication
Hi, On 2022-02-14 14:54:41 +0530, Amit Kapila wrote: > On Fri, Feb 11, 2022 at 4:27 PM Amit Kapila wrote: > > > > On Fri, Feb 11, 2022 at 12:00 PM tanghy.f...@fujitsu.com > > wrote: > > > > > > Attached the patches which fixed the above two comments and the first > > > comment in > > > my previous mail [1], the rest is the same as before. > > > I ran the tests on all branches, they all passed as expected. > > > > > > > Thanks, these look good to me. I'll push these early next week > > (Monday) unless there are any more suggestions or comments. > > > > Pushed! Thanks for all the work on this!
Re: [BUG]Update Toast data failure in logical replication
On Mon, Feb 14, 2022 at 2:54 PM Amit Kapila wrote: > > On Fri, Feb 11, 2022 at 4:27 PM Amit Kapila wrote: > > > > On Fri, Feb 11, 2022 at 12:00 PM tanghy.f...@fujitsu.com > > wrote: > > > > > > Attached the patches which fixed the above two comments and the first > > > comment in > > > my previous mail [1], the rest is the same as before. > > > I ran the tests on all branches, they all passed as expected. > > > > > > > Thanks, these look good to me. I'll push these early next week > > (Monday) unless there are any more suggestions or comments. > > > > Pushed! > Thanks!! -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Fri, Feb 11, 2022 at 4:27 PM Amit Kapila wrote: > > On Fri, Feb 11, 2022 at 12:00 PM tanghy.f...@fujitsu.com > wrote: > > > > Attached the patches which fixed the above two comments and the first > > comment in > > my previous mail [1], the rest is the same as before. > > I ran the tests on all branches, they all passed as expected. > > > > Thanks, these look good to me. I'll push these early next week > (Monday) unless there are any more suggestions or comments. > Pushed! -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Fri, Feb 11, 2022 at 12:00 PM tanghy.f...@fujitsu.com wrote: > > Attached the patches which fixed the above two comments and the first comment > in > my previous mail [1], the rest is the same as before. > I ran the tests on all branches, they all passed as expected. > Thanks, these look good to me. I'll push these early next week (Monday) unless there are any more suggestions or comments. -- With Regards, Amit Kapila.
RE: [BUG]Update Toast data failure in logical replication
On Thu, Feb 10, 2022 9:34 PM Amit Kapila wrote: > > On Mon, Feb 7, 2022 at 1:27 PM Dilip Kumar wrote: > > > > On Mon, Feb 7, 2022 at 12:25 PM Amit Kapila > wrote: > > > > > > Attached please find the modified patches. > > > > I have looked into the latest modification and back branch patches and > > they look fine to me. > > > > Today, while looking at this patch again, I think I see one problem > with the below change (referring pg10 patch): > + if (attrnum < 0) > + { > + if (attrnum != ObjectIdAttributeNumber && > + attrnum != TableOidAttributeNumber) > + { > + modified = bms_add_member(modified, > + attrnum - > + FirstLowInvalidHeapAttributeNumber); > + continue; > + } > + } > ... > ... > + /* No need to check attributes that can't be stored externally. */ > + if (isnull1 || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1) > + continue; > > I think it is possible that we use TupleDescAttr for system attribute > (in this case ObjectIdAttributeNumber/TableOidAttributeNumber) which > will be wrong as it contains only user attributes, not system > attributes. See comments atop TupleDescData. > > I think this check should be modified to if (attrnum < 0 || isnull1 > || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1). What do you > think? > I agree with you. > * Another minor comment: > + if (!heap_attr_equals(RelationGetDescr(relation), attrnum, value1, > + value2, isnull1, isnull2)) > > I think here we can directly use tupdesc instead of > RelationGetDescr(relation). > +1. Attached the patches which fixed the above two comments and the first comment in my previous mail [1], the rest is the same as before. I ran the tests on all branches, they all passed as expected. [1] https://www.postgresql.org/message-id/OS0PR01MB61134DD41BE6D986B9DB80CCFB2E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com Regards, Tang 13-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: 13-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch 14-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: 14-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch HEAD-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch Description: HEAD-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch 10-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: 10-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch 11-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: 11-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch 12-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: 12-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Re: [BUG]Update Toast data failure in logical replication
On Mon, Feb 7, 2022 at 1:27 PM Dilip Kumar wrote: > > On Mon, Feb 7, 2022 at 12:25 PM Amit Kapila wrote: > > > > Attached please find the modified patches. > > I have looked into the latest modification and back branch patches and > they look fine to me. > Today, while looking at this patch again, I think I see one problem with the below change (referring pg10 patch): + if (attrnum < 0) + { + if (attrnum != ObjectIdAttributeNumber && + attrnum != TableOidAttributeNumber) + { + modified = bms_add_member(modified, + attrnum - + FirstLowInvalidHeapAttributeNumber); + continue; + } + } ... ... + /* No need to check attributes that can't be stored externally. */ + if (isnull1 || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1) + continue; I think it is possible that we use TupleDescAttr for system attribute (in this case ObjectIdAttributeNumber/TableOidAttributeNumber) which will be wrong as it contains only user attributes, not system attributes. See comments atop TupleDescData. I think this check should be modified to if (attrnum < 0 || isnull1 || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1). What do you think? * Another minor comment: + if (!heap_attr_equals(RelationGetDescr(relation), attrnum, value1, + value2, isnull1, isnull2)) I think here we can directly use tupdesc instead of RelationGetDescr(relation). -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Wed, Feb 9, 2022 at 11:08 AM tanghy.f...@fujitsu.com wrote: > > On Tue, Feb 8, 2022 3:18 AM Andres Freund wrote: > > > > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote: > > > Right, and it is getting changed. We are just printing the first 200 > > > characters (by using SQL [1]) from the decoded tuple so what is shown > > > in the results is the initial 200 bytes. > > > > Ah, I knew I must have been missing something. > > > > > > > The complete decoded data after the patch is as follows: > > > > Hm. I think we should change the way the strings are shortened - otherwise > > we > > don't really verify much in that test. Perhaps we could just replace the > > long > > repetitive strings with something shorter in the output? > > > > E.g. using something like regexp_replace(data, > > '(1234567890|9876543210){200}', '\1{200}','g') > > inside the substr(). > > > > Wonder if we should deduplicate the number of different toasted strings in > > the > > file to something that'd allow us to have a single "redact_toast" function > > or > > such. There's too many different ones to have a reasonbly simple redaction > > function right now. But that's perhaps better done separately. > > > > I tried to make the output shorter using your suggestion like the following > SQL, > please see the attached patch, which is based on v8 patch[1]. > > SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}', > '\1{200}','g'), 1, 200) FROM pg_logical_slot_get_changes('regression_slot', > NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); > > Note that some strings are still longer than 200 characters even though they > have > been shorter, so they can't be shown entirely. > > e.g. > table public.toasted_key: UPDATE: old-key: > toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 > toasted_key[text]:unchanged-toast-datum > toasted_col1[text]:unchanged-toast-datum toasted_col2[te > > The entire string is: > table public.toasted_key: UPDATE: old-key: > toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 > toasted_key[text]:unchanged-toast-datum > toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}' > > Maybe it's better to change the substr length to 250 to show the entire > string, or we > can do it as separate HEAD only improvement where we can deduplicate some of > the > other long strings as well. Thoughts? > I think it is better to do this as a separate HEAD-only improvement as it can affect other tests results. We can also try to deduplicate some of the other long strings used in toast.sql file along with it. -- With Regards, Amit Kapila.
RE: [BUG]Update Toast data failure in logical replication
On Tue, Feb 8, 2022 3:18 AM Andres Freund wrote: > > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote: > > Right, and it is getting changed. We are just printing the first 200 > > characters (by using SQL [1]) from the decoded tuple so what is shown > > in the results is the initial 200 bytes. > > Ah, I knew I must have been missing something. > > > > The complete decoded data after the patch is as follows: > > Hm. I think we should change the way the strings are shortened - otherwise we > don't really verify much in that test. Perhaps we could just replace the long > repetitive strings with something shorter in the output? > > E.g. using something like regexp_replace(data, > '(1234567890|9876543210){200}', '\1{200}','g') > inside the substr(). > > Wonder if we should deduplicate the number of different toasted strings in the > file to something that'd allow us to have a single "redact_toast" function or > such. There's too many different ones to have a reasonbly simple redaction > function right now. But that's perhaps better done separately. > I tried to make the output shorter using your suggestion like the following SQL, please see the attached patch, which is based on v8 patch[1]. SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}','g'), 1, 200) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); Note that some strings are still longer than 200 characters even though they have been shorter, so they can't be shown entirely. e.g. table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[te The entire string is: table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}' Maybe it's better to change the substr length to 250 to show the entire string, or we can do it as separate HEAD only improvement where we can deduplicate some of the other long strings as well. Thoughts? [1] https://www.postgresql.org/message-id/CAA4eK1L_Z_2LDwMNbGrwoO%2BFc-2Q04YORQSA9UfGUTMQpy2O1Q%40mail.gmail.com Regards, Tang improve_toast_test.diff Description: improve_toast_test.diff
Re: [BUG]Update Toast data failure in logical replication
On Wed, Feb 9, 2022 at 7:16 AM Euler Taveira wrote: > > On Tue, Feb 8, 2022, at 10:18 PM, tanghy.f...@fujitsu.com wrote: > > 2) > + /* > + * Check if the old tuple's attribute is stored externally and is a > + * member of external_cols. > + */ > + if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) && > + bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber, > + external_cols)) > + *has_external = true; > > If has_external is already true, it seems we don't need this check, so should > we > check has_external first? > > Is it worth it? I don't think so. > I also don't think it is worth adding such a check. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Tue, Feb 8, 2022, at 10:18 PM, tanghy.f...@fujitsu.com wrote: > 2) > + /* > + * Check if the old tuple's attribute is stored externally and is a > + * member of external_cols. > + */ > + if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) && > + bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber, > + external_cols)) > + *has_external = true; > > If has_external is already true, it seems we don't need this check, so should > we > check has_external first? Is it worth it? I don't think so. It complicates a non-critical path. In general, the condition will be executed once or twice. -- Euler Taveira EDB https://www.enterprisedb.com/
RE: [BUG]Update Toast data failure in logical replication
On Mon, Feb 7, 2022 2:55 PM Amit Kapila wrote: > > On Sat, Feb 5, 2022 at 6:10 AM Amit Kapila wrote: > > > > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera > > wrote: > > > > > > > > > I have some suggestions > > > on the comments and docs though. > > > > > > > Thanks, your suggestions look good to me. I'll take care of these in > > the next version. > > > > Attached please find the modified patches. > Thanks for your patch. I tried it and it works well. Two small comments: 1) +static Bitmapset *HeapDetermineColumnsInfo(Relation relation, + Bitmapset *interesting_cols, + Bitmapset *external_cols, + HeapTuple oldtup, HeapTuple newtup, + bool *id_has_external); +HeapDetermineColumnsInfo(Relation relation, +Bitmapset *interesting_cols, +Bitmapset *external_cols, +HeapTuple oldtup, HeapTuple newtup, +bool *has_external) The declaration and the definition of this function use different parameter names for the last parameter (id_has_external and has_external), it's better to be consistent. 2) + /* +* Check if the old tuple's attribute is stored externally and is a +* member of external_cols. +*/ + if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) && + bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber, + external_cols)) + *has_external = true; If has_external is already true, it seems we don't need this check, so should we check has_external first? Regards, Tang
Re: [BUG]Update Toast data failure in logical replication
On Tue, Feb 8, 2022 at 12:48 AM Andres Freund wrote: > > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote: > > Right, and it is getting changed. We are just printing the first 200 > > characters (by using SQL [1]) from the decoded tuple so what is shown > > in the results is the initial 200 bytes. > > Ah, I knew I must have been missing something. > > > > The complete decoded data after the patch is as follows: > > Hm. I think we should change the way the strings are shortened - otherwise we > don't really verify much in that test. Perhaps we could just replace the long > repetitive strings with something shorter in the output? > > E.g. using something like regexp_replace(data, > '(1234567890|9876543210){200}', '\1{200}','g') > inside the substr(). > This sounds like a good idea. Shall we do this as part of this patch itself or as a separate improvement? > Wonder if we should deduplicate the number of different toasted strings in the > file to something that'd allow us to have a single "redact_toast" function or > such. There's too many different ones to have a reasonbly simple redaction > function right now. > I think this is also worth trying. > But that's perhaps better done separately. > +1. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Tue, Feb 8, 2022 at 12:48 AM Andres Freund wrote: > > Hi, > > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote: > > Right, and it is getting changed. We are just printing the first 200 > > characters (by using SQL [1]) from the decoded tuple so what is shown > > in the results is the initial 200 bytes. > > Ah, I knew I must have been missing something. > > > > The complete decoded data after the patch is as follows: > > Hm. I think we should change the way the strings are shortened - otherwise we > don't really verify much in that test. Perhaps we could just replace the long > repetitive strings with something shorter in the output? > > E.g. using something like regexp_replace(data, > '(1234567890|9876543210){200}', '\1{200}','g') > inside the substr(). IMHO, in this particular case using regexp_replace as you explained would be a good option as we will be verifying complete data instead of just the first 200 characters. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
Hi, On 2022-02-07 08:44:00 +0530, Amit Kapila wrote: > Right, and it is getting changed. We are just printing the first 200 > characters (by using SQL [1]) from the decoded tuple so what is shown > in the results is the initial 200 bytes. Ah, I knew I must have been missing something. > The complete decoded data after the patch is as follows: Hm. I think we should change the way the strings are shortened - otherwise we don't really verify much in that test. Perhaps we could just replace the long repetitive strings with something shorter in the output? E.g. using something like regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}','g') inside the substr(). Wonder if we should deduplicate the number of different toasted strings in the file to something that'd allow us to have a single "redact_toast" function or such. There's too many different ones to have a reasonbly simple redaction function right now. But that's perhaps better done separately. Greetings, Andres Freund
Re: [BUG]Update Toast data failure in logical replication
On Mon, Feb 7, 2022 at 12:25 PM Amit Kapila wrote: > > On Sat, Feb 5, 2022 at 6:10 AM Amit Kapila wrote: > > > > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera > > wrote: > > > > > > > > > I have some suggestions > > > on the comments and docs though. > > > > > > > Thanks, your suggestions look good to me. I'll take care of these in > > the next version. > > > > Attached please find the modified patches. I have looked into the latest modification and back branch patches and they look fine to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Sat, Feb 5, 2022 at 6:10 AM Amit Kapila wrote: > > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera wrote: > > > > > > I have some suggestions > > on the comments and docs though. > > > > Thanks, your suggestions look good to me. I'll take care of these in > the next version. > Attached please find the modified patches. -- With Regards, Amit Kapila. HEAD-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data 14-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data 13-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data 12-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data 11-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data 10-v8-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch Description: Binary data
Re: [BUG]Update Toast data failure in logical replication
On Sun, Feb 6, 2022 at 5:04 AM Andres Freund wrote: > > On 2022-02-04 17:45:36 +0530, Amit Kapila wrote: > > diff --git a/contrib/test_decoding/expected/toast.out > > b/contrib/test_decoding/expected/toast.out > > index cd03e9d..a757e7d 100644 > > --- a/contrib/test_decoding/expected/toast.out > > +++ b/contrib/test_decoding/expected/toast.out > > @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM > > pg_logical_slot_get_changes('regression_slot', > > table public.toasted_key: INSERT: id[integer]:1 > > toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 > > COMMIT > > BEGIN > > - table public.toasted_key: UPDATE: id[integer]:1 > > toasted_key[text]:unchanged-toast-datum > > toasted_col1[text]:unchanged-toast-datum > > toasted_col2[text]:'987654321098765432109876543210987654321098765432109 > > + table public.toasted_key: UPDATE: old-key: > > toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 > > Hm. This looks weird. What happened to the the change to toasted_col2 that was > in the "removed" line? > > This corresponds to the following statement I think: > -- test update of a toasted key without changing it > UPDATE toasted_key SET toasted_col2 = toasted_col1; > which previously was inserted as: > INSERT INTO toasted_key(toasted_key, toasted_col1) > VALUES(repeat('1234567890', 200), repeat('9876543210', 200)); > > so toasted_col2 should have changed from NULL to repeat('9876543210', 200) > Right, and it is getting changed. We are just printing the first 200 characters (by using SQL [1]) from the decoded tuple so what is shown in the results is the initial 200 bytes. The complete decoded data after the patch is as follows: table public.toasted_key: UPDATE: old-key: toasted_key[text]:'12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890' new-tuple: id[integer]:2 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'98765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321098765
Re: [BUG]Update Toast data failure in logical replication
Hi, On 2022-02-04 17:45:36 +0530, Amit Kapila wrote: > diff --git a/contrib/test_decoding/expected/toast.out > b/contrib/test_decoding/expected/toast.out > index cd03e9d..a757e7d 100644 > --- a/contrib/test_decoding/expected/toast.out > +++ b/contrib/test_decoding/expected/toast.out > @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM > pg_logical_slot_get_changes('regression_slot', > table public.toasted_key: INSERT: id[integer]:1 > toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 > COMMIT > BEGIN > - table public.toasted_key: UPDATE: id[integer]:1 > toasted_key[text]:unchanged-toast-datum > toasted_col1[text]:unchanged-toast-datum > toasted_col2[text]:'987654321098765432109876543210987654321098765432109 > + table public.toasted_key: UPDATE: old-key: > toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 Hm. This looks weird. What happened to the the change to toasted_col2 that was in the "removed" line? This corresponds to the following statement I think: -- test update of a toasted key without changing it UPDATE toasted_key SET toasted_col2 = toasted_col1; which previously was inserted as: INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES(repeat('1234567890', 200), repeat('9876543210', 200)); so toasted_col2 should have changed from NULL to repeat('9876543210', 200) Am I misreading something? Greetings, Andres Freund
Re: [BUG]Update Toast data failure in logical replication
On 2022-Feb-05, Amit Kapila wrote: > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera wrote: > > > > I don't have a reason not to commit this patch. > > > > It is not very clear to me from this so just checking again, are you > fine with back-patching this as well? Hmm, of course, I never thought it'd be a good idea to let the bug unfixed in back branches. Thanks! -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El sudor es la mejor cura para un pensamiento enfermo" (Bardia)
Re: [BUG]Update Toast data failure in logical replication
On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera wrote: > > I don't have a reason not to commit this patch. > It is not very clear to me from this so just checking again, are you fine with back-patching this as well? > > I have some suggestions > on the comments and docs though. > Thanks, your suggestions look good to me. I'll take care of these in the next version. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
I don't have a reason not to commit this patch. I have some suggestions on the comments and docs though. > @@ -8359,14 +8408,15 @@ log_heap_new_cid(Relation relation, HeapTuple tup) > * Returns NULL if there's no need to log an identity or if there's no > suitable > * key defined. > * > - * key_changed should be false if caller knows that no replica identity > - * columns changed value. It's always true in the DELETE case. > + * key_required should be false if caller knows that no replica identity > + * columns changed value and it doesn't has any external data. It's always > + * true in the DELETE case. > * > * *copy is set to true if the returned tuple is a modified copy rather than > * the same tuple that was passed in. > */ > static HeapTuple > -ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, > +ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_required, I find the new comment pretty hard to interpret. I would say something like "Pass key_required true if any replica identity columns changed value, or if any of them have external data. DELETE must always pass true". > diff --git a/doc/src/sgml/ref/alter_table.sgml > b/doc/src/sgml/ref/alter_table.sgml > index dee026e..d67ef7c 100644 > --- a/doc/src/sgml/ref/alter_table.sgml > +++ b/doc/src/sgml/ref/alter_table.sgml > @@ -873,8 +873,10 @@ WITH ( MODULUS class="parameter">numeric_literal, REM >This form changes the information which is written to the write-ahead > log >to identify rows which are updated or deleted. This option has no > effect >except when logical replication is in use. > - In all cases, no old values are logged unless at least one of the > columns > - that would be logged differs between the old and new versions of the > row. > + In all cases except toasted values, no old values are logged unless at > + least one of the columns that would be logged differs between the old > and > + new versions of the row. We detoast the unchanged old toast values > and log > + them. Here we're patching with a minimal wording change with almost incomprehensible results. I think we should patch more extensively. I suggest: This form changes the information which is written to the write-ahead log to identify rows which are updated or deleted. In most cases, the old value of each column is only logged if it differs from the new value; however, if the old value is stored externally, it is always logged regardless of whether it changed. This option has no effect unless logical replication is in use. I didn't get a chance to review the code, but I think this is valuable. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jan 31, 2022 at 9:03 AM Dilip Kumar wrote: > > > > > I have changed this and various other comments in the patch. I have > > modified the docs as well to reflect the changes. I thought of adding > > a test but I think the current test in toast.sql seems sufficient. > > Kindly let me know what you think of the attached? I think we should > > backpatch this till v10. What do you think? > > Looks fine to me. > Attached are the patches for back-branches till v10. I have made two modifications: (a) changed heap_tuple_attr_equals() to heap_attr_equals() as we are not passing tuple to it; (b) changed parameter name 'check_external_cols' to 'external_cols' to make it sound similar to existing parameter 'interesting_cols' in HeapDetermine* function. Let me know what you think of the attached? Do you see any reason not to back-patch this fix? -- With Regards, Amit Kapila. HEAD-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch Description: Binary data 14-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-a.patch Description: Binary data 13-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-a.patch Description: Binary data 12-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-a.patch Description: Binary data 11-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-a.patch Description: Binary data 10-v7-0001-WAL-log-unchanged-toasted-replica-identity-key-a.patch Description: Binary data
Re: [BUG]Update Toast data failure in logical replication
On Sat, Jan 29, 2022 at 3:57 PM Amit Kapila wrote: > > On Fri, Jan 28, 2022 at 12:16 PM Dilip Kumar wrote: > > > + /* > + * If it's a whole-tuple reference, say "not equal". It's not really > + * worth supporting this case, since it could only succeed after a > + * no-op update, which is hardly a case worth optimizing for. > + */ > + if (attrnum == 0) > + continue; > + > + /* > + * Likewise, automatically say "not equal" for any system attribute > + * other than tableOID; we cannot expect these to be consistent in a > + * HOT chain, or even to be set correctly yet in the new tuple. > + */ > + if (attrnum < 0) > + { > + if (attrnum != TableOidAttributeNumber) > + continue; > + } > > These two cases need to be considered as the corresponding attribute > is modified, so the attnum needs to be added in the bitmapset of > modified attrs. Yeah right. > > I have changed this and various other comments in the patch. I have > modified the docs as well to reflect the changes. I thought of adding > a test but I think the current test in toast.sql seems sufficient. > Kindly let me know what you think of the attached? I think we should > backpatch this till v10. What do you think? Looks fine to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Fri, Jan 28, 2022 at 12:16 PM Dilip Kumar wrote: > > I think the best way is to do some refactoring and renaming of the > function, because as part of HeapDetermineModifiedColumns we are > already processing the tuple so we can not put extra overhead of > reprocessing it again. In short I like the idea of renaming the > HeapDetermineModifiedColumns and moving part of heap_tuple_attr_equals > code into the caller. Here is the patch set for the same. I have > divided it into two patches which can eventually be merged, 0001- for > refactoring 0002- does the actual work. > + /* + * If it's a whole-tuple reference, say "not equal". It's not really + * worth supporting this case, since it could only succeed after a + * no-op update, which is hardly a case worth optimizing for. + */ + if (attrnum == 0) + continue; + + /* + * Likewise, automatically say "not equal" for any system attribute + * other than tableOID; we cannot expect these to be consistent in a + * HOT chain, or even to be set correctly yet in the new tuple. + */ + if (attrnum < 0) + { + if (attrnum != TableOidAttributeNumber) + continue; + } These two cases need to be considered as the corresponding attribute is modified, so the attnum needs to be added in the bitmapset of modified attrs. I have changed this and various other comments in the patch. I have modified the docs as well to reflect the changes. I thought of adding a test but I think the current test in toast.sql seems sufficient. Kindly let me know what you think of the attached? I think we should backpatch this till v10. What do you think? Does anyone else have better ideas to fix this? -- With Regards, Amit Kapila. v6-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch Description: Binary data
Re: [BUG]Update Toast data failure in logical replication
On Tue, Jan 25, 2022 at 11:59 AM Amit Kapila wrote: > > On Tue, Jan 25, 2022 at 12:26 AM Euler Taveira wrote: > > > > I am not sure if your proposal is much different compared to v4 or how > much it improves the situation? I see you didn't consider > 'check_external_attr' parameter and I think that is important to know > if the key has any external toast value. Overall, I see your point > that the change of APIs looks a bit ugly. But, I guess that is more > due to their names and current purpose. I think it could be better if > we bring all the code of heap_tuple_attr_equals in its only caller > HeapDetermineModifiedColumns or at least part of the code where we get > attr value and can determine whether the value is stored externally. > Then change name of HeapDetermineModifiedColumns to > HeapDetermineColumnsInfo with additional parameters. I think the best way is to do some refactoring and renaming of the function, because as part of HeapDetermineModifiedColumns we are already processing the tuple so we can not put extra overhead of reprocessing it again. In short I like the idea of renaming the HeapDetermineModifiedColumns and moving part of heap_tuple_attr_equals code into the caller. Here is the patch set for the same. I have divided it into two patches which can eventually be merged, 0001- for refactoring 0002- does the actual work. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From db839d20ab109488670de8b480b9f30f4f30c084 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Fri, 28 Jan 2022 11:35:12 +0530 Subject: [PATCH v5 2/2] WAL log unchanged toasted eplica identity key attributes Currently, if the unchanged replica identity key attributes are not logged separately because they are getting logged as part of the new tuple. But if it is stored externally then the untoasted value is not getting logged as part of the new tuple. So we need to log that as part of the old_key_tuple. In order to do that we need to identify whether any of the unchanged replica identity key attributes is stored externally or not. So, we are trying to identify this while identifying the modified attributes because at that time we are already processing the tuple so there will not be any extra overhead. --- contrib/test_decoding/expected/toast.out | 2 +- src/backend/access/heap/heapam.c | 77 2 files changed, 60 insertions(+), 19 deletions(-) diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out index cd03e9d..a757e7d 100644 --- a/contrib/test_decoding/expected/toast.out +++ b/contrib/test_decoding/expected/toast.out @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot', table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 COMMIT BEGIN - table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109 + table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 COMMIT BEGIN table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 86eb016..ab17f2e 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -78,9 +78,11 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf, Buffer newbuf, HeapTuple oldtup, HeapTuple newtup, HeapTuple old_key_tuple, bool all_visible_cleared, bool new_all_visible_cleared); -static Bitmapset *HeapDetermineModifiedColumns(Relation relation, - Bitmapset *interesting_cols, - HeapTuple oldtup, HeapTuple newtup); +static Bitmapset *HeapDetermineColumnsInfo(Relation relation, + Bitmapset *interesting_cols, + Bitmapset *check_external_attr, + HeapTuple oldtup, HeapTuple newtup, + bool *id_has_external); static bool heap_acquire_tuplock(Relation relation, ItemPointer tid, LockTupleMode mode, LockWaitPolicy wait_policy, bool *have_tuple_lock); @@ -106,7 +108,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status static void index_delete_sort(TM_IndexDeleteOp *delstate); static int bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate); static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup); -static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_changed, +s
Re: [BUG]Update Toast data failure in logical replication
On Tue, Jan 25, 2022 at 12:26 AM Euler Taveira wrote: > > On Mon, Jan 24, 2022, at 12:58 AM, Michael Paquier wrote: > > FWIW, I find the API changes of HeapDetermineModifiedColumns() and > ExtractReplicaIdentity() a bit grotty. Shouldn't we try to flatten > the old tuple instead? There are things like > toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies > HeapTupleHasExternal(), or just heap_copy_tuple_as_datum(). > > I checked v4 and I don't like the HeapDetermineModifiedColumns() and > heap_tuple_attr_equals() changes either. It seems it is hijacking these > functions to something else. I would suggest to change the signature to > > static void > heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum, >HeapTuple tup1, HeapTuple tup2, >bool *is_equal, bool *key_has_external); > > and > > static void > HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols, > HeapTuple oldtup, HeapTuple newtup, > Bitmapset *modified_attrs, bool > *key_has_external); > > I didn't figure out a cheap way to check if the key has external value other > than slightly modifying the HeapDetermineModifiedColumns() function and its > subroutine heap_tuple_attr_equals(). > I am not sure if your proposal is much different compared to v4 or how much it improves the situation? I see you didn't consider 'check_external_attr' parameter and I think that is important to know if the key has any external toast value. Overall, I see your point that the change of APIs looks a bit ugly. But, I guess that is more due to their names and current purpose. I think it could be better if we bring all the code of heap_tuple_attr_equals in its only caller HeapDetermineModifiedColumns or at least part of the code where we get attr value and can determine whether the value is stored externally. Then change name of HeapDetermineModifiedColumns to HeapDetermineColumnsInfo with additional parameters. > As Alvaro said I don't think adding > HeapTupleHasExternal() (as in v3) is a good idea because it does not optimize > genuine cases such as a table whose PK is an integer and contains a single > TOAST column. > > Another suggestion is to keep key_changed and add another attribute > (key_has_external) to ExtractReplicaIdentity(). If we need key_changed in the > future we'll have to decompose it again. > True, but we can make the required changes at that point as well. OTOH, we can do what you are suggesting as well but I am not sure if that is required. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jan 24, 2022 at 4:42 PM Andres Freund wrote: > On 2022-01-24 16:31:08 -0500, Robert Haas wrote: > > That seems consistent with what's been described on this thread so > > far, but I still don't quite understand why the logic that reassembles > > TOAST chunks doesn't solve it. > > There are no toast chunks to reassemble if the update didn't change the > primary key. So this just hits the path we'd also hit for an unchanged toasted > non-key column. Oh. Hmm. That's bad. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On 2022-01-24 16:31:08 -0500, Robert Haas wrote: > That seems consistent with what's been described on this thread so > far, but I still don't quite understand why the logic that reassembles > TOAST chunks doesn't solve it. There are no toast chunks to reassemble if the update didn't change the primary key. So this just hits the path we'd also hit for an unchanged toasted non-key column.
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jan 24, 2022 at 4:17 PM Andres Freund wrote: > Possibly the root of the problem is that we/I didn't think of cases where the > primary key is an external toast datum - in moast scenarios you'd an error > about a too wide index tuple. But of course that neglects cases where toasting > happens due to SET STORAGE or due to the aggregate tuple width, rather than > individual column width. That seems consistent with what's been described on this thread so far, but I still don't quite understand why the logic that reassembles TOAST chunks doesn't solve it. I mean, decoding doesn't know whether any changes are happening on the subscriber side, so it's not like we can (a) query for the row and then (b) decide to reassemble TOAST chunks if we find it, or something like that. The decoding has to say, well, here's the new tuple and the old key columns, and then the subscriber does whatever it does. I guess it could check whether the old and new values are identical to decide whether to drop that column out of the result, but it shouldn't compare a TOAST pointer to a detoasted value and go "yeah, that looks equal" -- Robert Haas EDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
Hi, On 2022-01-24 15:10:05 -0500, Robert Haas wrote: > I think we realized when we were working on the logical decoding stuff > that the key columns of the old tuple would have to be detoasted in > order for the mechanism to work, because I remember worrying about > whether it would potentially be a problem that the WAL record would > end up huge. However, I think we believed that the new tuple wouldn't > need to have the detoasted values, because logical decoding is > designed to notice all the TOAST insertions for the new tuple and > reassemble those separate chunks to get the original value back. Possibly the root of the problem is that we/I didn't think of cases where the primary key is an external toast datum - in moast scenarios you'd an error about a too wide index tuple. But of course that neglects cases where toasting happens due to SET STORAGE or due to the aggregate tuple width, rather than individual column width. > And off-hand I'm not sure why that logic doesn't apply just as much to the > key columns as any others. The difference is that it's mostly fine to not have unchanging key columns as part of decoded update - you just don't update those columns. But you can't do that without knowing the replica identity... Greetings, Andres Freund
Re: [BUG]Update Toast data failure in logical replication
On Tue, Aug 10, 2021 at 1:20 AM Amit Kapila wrote: > It seems to me this problem exists from the time we introduced > wal_level = logical in the commit e55704d8b2 [1], or another > possibility is that logical replication commit didn't consider > something to make it work. Andres, Robert, Petr, can you guys please > comment because otherwise, we might miss something here. I'm belatedly getting around to looking at this thread. My recollection of this is: I think we realized when we were working on the logical decoding stuff that the key columns of the old tuple would have to be detoasted in order for the mechanism to work, because I remember worrying about whether it would potentially be a problem that the WAL record would end up huge. However, I think we believed that the new tuple wouldn't need to have the detoasted values, because logical decoding is designed to notice all the TOAST insertions for the new tuple and reassemble those separate chunks to get the original value back. And off-hand I'm not sure why that logic doesn't apply just as much to the key columns as any others. But the evidence does suggest that there's some kind of bug here, so evidently there's some flaw in that line of thinking. I'm not sure off-hand what it is, though. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jan 24, 2022, at 12:58 AM, Michael Paquier wrote: > FWIW, I find the API changes of HeapDetermineModifiedColumns() and > ExtractReplicaIdentity() a bit grotty. Shouldn't we try to flatten > the old tuple instead? There are things like > toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies > HeapTupleHasExternal(), or just heap_copy_tuple_as_datum(). > I checked v4 and I don't like the HeapDetermineModifiedColumns() and heap_tuple_attr_equals() changes either. It seems it is hijacking these functions to something else. I would suggest to change the signature to static void heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum, HeapTuple tup1, HeapTuple tup2, bool *is_equal, bool *key_has_external); and static void HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols, HeapTuple oldtup, HeapTuple newtup, Bitmapset *modified_attrs, bool *key_has_external); I didn't figure out a cheap way to check if the key has external value other than slightly modifying the HeapDetermineModifiedColumns() function and its subroutine heap_tuple_attr_equals(). As Alvaro said I don't think adding HeapTupleHasExternal() (as in v3) is a good idea because it does not optimize genuine cases such as a table whose PK is an integer and contains a single TOAST column. Another suggestion is to keep key_changed and add another attribute (key_has_external) to ExtractReplicaIdentity(). If we need key_changed in the future we'll have to decompose it again. You also encapsulates that optimization into the function that helps with future improvements/fixes. static HeapTuple ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, bool key_has_external, bool *copy); -- Euler Taveira EDB https://www.enterprisedb.com/
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jan 24, 2022 at 9:28 AM Michael Paquier wrote: > > On Wed, Aug 11, 2021 at 06:14:55PM +0530, Dilip Kumar wrote: > > Right > > Amit, are you planning to look more at this patch? It has been a > couple of months since the last update, and this is still a bug as far > as I understand. > > FWIW, I find the API changes of HeapDetermineModifiedColumns() and > ExtractReplicaIdentity() a bit grotty. Shouldn't we try to flatten > the old tuple instead? There are things like > toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies > HeapTupleHasExternal(), or just heap_copy_tuple_as_datum(). > That can add overhead in cases where we don't need to log the toasted values of the old tuple. We only need it for the case where we have unchanged toasted replica identity columns. In the previous version [1], we were doing something like you are suggesting and that seems to have overhead as explained in the second paragraph of the email [2]. Also, Alvaro seems to have some reservations about that change. I don't know if there is a better way to fix this but I could be missing something. [1] - https://www.postgresql.org/message-id/CAFiTN-sTS4bB7W3UJV3iUm%3DwKdr9EpOwyK97hNr77MzFQm_NBw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAA4eK1KgZr%3DQSBE_Qh0Qfb2ma1Tc6%2BZxkMaUHO7aC7b9WSCRaw%40mail.gmail.com -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Wed, Aug 11, 2021 at 06:14:55PM +0530, Dilip Kumar wrote: > Right Amit, are you planning to look more at this patch? It has been a couple of months since the last update, and this is still a bug as far as I understand. FWIW, I find the API changes of HeapDetermineModifiedColumns() and ExtractReplicaIdentity() a bit grotty. Shouldn't we try to flatten the old tuple instead? There are things like toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies HeapTupleHasExternal(), or just heap_copy_tuple_as_datum(). -- Michael signature.asc Description: PGP signature
Re: [BUG]Update Toast data failure in logical replication
On Wed, Sep 8, 2021 at 11:26 AM Ajin Cherian wrote: > On Wed, Aug 11, 2021 at 10:45 PM Dilip Kumar > wrote: > > > Yeah we can avoid that by detecting any toasted replica identity key > > in HeapDetermineModifiedColumns, check the attached patch. > > > > I had a second look at this, and I just had a small doubt. Since the > convention is that for UPDATES, the old tuple/key is written to > WAL only if the one of the columns in the key has changed as part of > the update, and we are breaking that convention with this patch by > also including > the old key if it is toasted and is stored in disk even if it is not > changed. > Why do we not include the detoasted key as part of the new tuple > rather than the old tuple? Then we don't really break this convention. > The purpose of including the toasted old data is only for the replica identity, but if you include it in the new tuple then it will affect the general wal replay, basically, now you will have large detoasted data in your new tuple which your are directly going to memcpy on the standby while replaying so that will create corruption. So basically, you can not include this in the new tuple without changing a lot of logic around replay which is completely useless. So let this tuple be for a specific purpose and that is replica identity in our case. > And one small typo in the patch: > > The header above ExtractReplicaIdentity() > > Before: > * key_required should be false if caller knows that no replica identity > * columns changed value and it doesn't has any external data. > * It's always true in the DELETE case. > > After: > * key_required should be false if caller knows that no replica identity > * columns changed value and it doesn't have any external data. > * It's always true in the DELETE case. > Okay, I will change this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Wed, Aug 11, 2021 at 10:45 PM Dilip Kumar wrote: > Yeah we can avoid that by detecting any toasted replica identity key > in HeapDetermineModifiedColumns, check the attached patch. > I had a second look at this, and I just had a small doubt. Since the convention is that for UPDATES, the old tuple/key is written to WAL only if the one of the columns in the key has changed as part of the update, and we are breaking that convention with this patch by also including the old key if it is toasted and is stored in disk even if it is not changed. Why do we not include the detoasted key as part of the new tuple rather than the old tuple? Then we don't really break this convention. And one small typo in the patch: The header above ExtractReplicaIdentity() Before: * key_required should be false if caller knows that no replica identity * columns changed value and it doesn't has any external data. * It's always true in the DELETE case. After: * key_required should be false if caller knows that no replica identity * columns changed value and it doesn't have any external data. * It's always true in the DELETE case. regards, Ajin Cherian Fujitsu Australia
Re: [BUG]Update Toast data failure in logical replication
On Wed, Aug 11, 2021 at 10:45 PM Dilip Kumar wrote: > > Yeah we can avoid that by detecting any toasted replica identity key > in HeapDetermineModifiedColumns, check the attached patch. > The patch applies cleanly, all tests pass, I tried out a few toast combination tests and they seem to be working fine. No review comments, the patch looks good to me. regards, Ajin Cherian Fujitsu Australia
Re: [BUG]Update Toast data failure in logical replication
On Wed, Aug 11, 2021 at 10:30 AM Amit Kapila wrote: > > On Tue, Aug 10, 2021 at 8:08 PM Alvaro Herrera > wrote: > > > > On 2021-Jul-30, Amit Kapila wrote: > > > > Reading Dilip's last posted patch that day, I had some reservations > > about the API of ExtractReplicaIdentity. The new argument makes for a > > very strange to explain behavior "return the key values if they are > > unchanged, *or* if they are toasted" ... ??? > > > > I think we can say it as "Return the key values if they are changed > *or* if they are toasted". Currently, we have an exception for Deletes > where the caller always passed key_changed as true, so maybe we can > have a similar exception when the tuple has toasted values. Can we > think of changing the flag to "key_required" instead of "key_changed" > and let the caller identify and set its value? For Deletes, it will > work the same but for Updates, the caller needs to compute it by > checking if any of the key columns are modified or has a toast value. > We can try to see if the caller can identify it cheaply along with > determining the modified_attrs as at that time we will anyway check > replica key attrs. Right > > Currently, in proposed patch first, we check that the tuple has any > toast values and then it deforms and forms the new key tuple. After > that, it checks if the key has any toast values and then only decides > to return the tuple. If as described in the previous paragraph, we can > cheaply identify whether the key has toasted values, then we can avoid > deform/form cost in some cases. Also, I think we need to change the > Replica Identity description in the docs[1]. Yeah we can avoid that by detecting any toasted replica identity key in HeapDetermineModifiedColumns, check the attached patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From 3879809bf1a995995dccf5064eb9282f72af0412 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Wed, 11 Aug 2021 18:09:25 +0530 Subject: [PATCH v4] Extract unchanged external replica identity key If replica identity is set to key and the key is not modified we don't log key seperately because it should be logged along with the updated tuple. But if the key is stored externally we must have to detoast and log it separately. --- contrib/test_decoding/expected/toast.out | 2 +- src/backend/access/heap/heapam.c | 51 ++-- 2 files changed, 37 insertions(+), 16 deletions(-) diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out index 75c4d22..769ab0e 100644 --- a/contrib/test_decoding/expected/toast.out +++ b/contrib/test_decoding/expected/toast.out @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot', table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 COMMIT BEGIN - table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109 + table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 COMMIT BEGIN table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 2433998..e788926 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -80,7 +80,9 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf, bool all_visible_cleared, bool new_all_visible_cleared); static Bitmapset *HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols, - HeapTuple oldtup, HeapTuple newtup); + Bitmapset *check_external_attr, + HeapTuple oldtup, HeapTuple newtup, + bool *id_has_external); static bool heap_acquire_tuplock(Relation relation, ItemPointer tid, LockTupleMode mode, LockWaitPolicy wait_policy, bool *have_tuple_lock); @@ -106,7 +108,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status static void index_delete_sort(TM_IndexDeleteOp *delstate); static int bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate); static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup); -static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_changed, +static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_required, bool *copy); @@ -3185,6 +3187,7 @@ heap_update(Relation relation, ItemPointer otid, H
Re: [BUG]Update Toast data failure in logical replication
On Tue, Aug 10, 2021 at 8:08 PM Alvaro Herrera wrote: > > On 2021-Jul-30, Amit Kapila wrote: > > Reading Dilip's last posted patch that day, I had some reservations > about the API of ExtractReplicaIdentity. The new argument makes for a > very strange to explain behavior "return the key values if they are > unchanged, *or* if they are toasted" ... ??? > I think we can say it as "Return the key values if they are changed *or* if they are toasted". Currently, we have an exception for Deletes where the caller always passed key_changed as true, so maybe we can have a similar exception when the tuple has toasted values. Can we think of changing the flag to "key_required" instead of "key_changed" and let the caller identify and set its value? For Deletes, it will work the same but for Updates, the caller needs to compute it by checking if any of the key columns are modified or has a toast value. We can try to see if the caller can identify it cheaply along with determining the modified_attrs as at that time we will anyway check replica key attrs. Currently, in proposed patch first, we check that the tuple has any toast values and then it deforms and forms the new key tuple. After that, it checks if the key has any toast values and then only decides to return the tuple. If as described in the previous paragraph, we can cheaply identify whether the key has toasted values, then we can avoid deform/form cost in some cases. Also, I think we need to change the Replica Identity description in the docs[1]. [1] - https://www.postgresql.org/docs/devel/sql-altertable.html -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On 2021-Jul-30, Amit Kapila wrote: > I was thinking of using toast pointer but that won't work because it > can be different on the subscriber-side. I don't see any better ideas > to fix this issue. This problem seems to be from the time Logical > Replication has been introduced, so adding others (who are generally > involved in this area) to see what they think about this bug? I think > people might not be using toasted columns for Replica Identity due to > which this problem has been reported yet but I feel this is quite a > fundamental issue and we should do something about this. In the evening before going offline a week ago I was looking at this and my conclusion was that this was a legitimate problem: the original implementation is faulty in that the full detoasted value is required to be transmitted in order for downstream to be able to read the value. I am not sure if at the level of logical decoding it is a problem theoretically, but at least for logical replication it is clearly a practical problem. Reading Dilip's last posted patch that day, I had some reservations about the API of ExtractReplicaIdentity. The new argument makes for a very strange to explain behavior "return the key values if they are unchanged, *or* if they are toasted" ... ??? I tried to make sense of that, and tried to find a concept that would make sense to the whole, but couldn't find any obvious angle in the short time I looked at it. I haven't looked at it again. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte" (Ijon Tichy en Viajes, Stanislaw Lem)
Re: [BUG]Update Toast data failure in logical replication
On Fri, Jul 30, 2021 at 10:21 AM Amit Kapila wrote: > > This problem seems to be from the time Logical > Replication has been introduced, so adding others (who are generally > involved in this area) to see what they think about this bug? I think > people might not be using toasted columns for Replica Identity due to > which this problem has been reported yet but I feel this is quite a > fundamental issue and we should do something about this. > > Let me summarize the problem for the ease of others. > > The logical replica can go out of sync for UPDATES when there is a > toast column as part of REPLICA IDENTITY. In such cases, updates are > not replicated if the key column doesn't change because we don't log > the actual key value for the unchanged toast key. It is neither logged > as part of old_key_tuple nor for new tuple due to which we are not > able to find the tuple to be updated on the subscriber-side and the > update is ignored on the subscriber-side. We log this in DEBUG1 mode > but I don't think the user can do anything about this and the replica > will go out-of-sync. This works when the replica identity column value > is not toasted because then it will be part of the new tuple and we > use that to fetch the tuple on the subscriber. > It seems to me this problem exists from the time we introduced wal_level = logical in the commit e55704d8b2 [1], or another possibility is that logical replication commit didn't consider something to make it work. Andres, Robert, Petr, can you guys please comment because otherwise, we might miss something here. [1] - commit e55704d8b2fe522fbc9435acbb5bc59033478bd5 Author: Robert Haas Date: Tue Dec 10 18:33:45 2013 -0500 Add new wal_level, logical, sufficient for logical decoding. When wal_level=logical, we'll log columns from the old tuple as configured by the REPLICA IDENTITY facility added in commit 07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65. This makes it possible a properly-configured logical replication solution to correctly follow table updates even if they change the chosen key columns, or, with REPLICA IDENTITY FULL, even if the table has no key at all. Note that updates which do not modify the replica identity column won't log anything extra, making the choice of a good key (i.e. one that will rarely be changed) important to performance when wal_level=logical is configured. .. Andres Freund, reviewed in various versions by myself, Heikki Linnakangas, KONDO Mitsumasa, and many others. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Mon, Jul 26, 2021 at 10:45 AM Dilip Kumar wrote: > > I was thinking more about this idea, but IMHO, unless we send the key > toasted tuple from the publisher how is the subscriber supposed to > fetch it. Because that is the key value for finding the tuple on the > subscriber side and if we haven't sent the key value, how are we > supposed to find the tuple on the subscriber side? > I was thinking of using toast pointer but that won't work because it can be different on the subscriber-side. I don't see any better ideas to fix this issue. This problem seems to be from the time Logical Replication has been introduced, so adding others (who are generally involved in this area) to see what they think about this bug? I think people might not be using toasted columns for Replica Identity due to which this problem has been reported yet but I feel this is quite a fundamental issue and we should do something about this. Let me summarize the problem for the ease of others. The logical replica can go out of sync for UPDATES when there is a toast column as part of REPLICA IDENTITY. In such cases, updates are not replicated if the key column doesn't change because we don't log the actual key value for the unchanged toast key. It is neither logged as part of old_key_tuple nor for new tuple due to which we are not able to find the tuple to be updated on the subscriber-side and the update is ignored on the subscriber-side. We log this in DEBUG1 mode but I don't think the user can do anything about this and the replica will go out-of-sync. This works when the replica identity column value is not toasted because then it will be part of the new tuple and we use that to fetch the tuple on the subscriber. Now, it is not clear if the key-value (for the toast column which is part of replica identity) is not present in WAL how we can find the tuple to update on subscriber? We can't use the toast pointer in the new tuple to fetch the toast information as that can be different on subscribers. The simple way is to WAL LOG the unchanged toasted value as part of old_key_tuple, this will be required only for toast columns. Note that Delete works because we WAL Log the unchanged key tuple in that case. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Fri, Jul 23, 2021 at 8:58 AM Amit Kapila wrote: > > Okay, thanks. I think one point we need to consider here is that on > the subscriber side, we use dirtysnapshot to search the key, so we > need to ensure that we don't fetch the wrong data. I am not sure what > will happen when by the time we try to search the tuple in the > subscriber-side for the update, it has been removed and re-inserted > with the same values by the user. Do we find the newly inserted tuple > and update it? If so, can it also happen even if logged the unchanged > old_key_tuple as the patch is doing currently? > I was thinking more about this idea, but IMHO, unless we send the key toasted tuple from the publisher how is the subscriber supposed to fetch it. Because that is the key value for finding the tuple on the subscriber side and if we haven't sent the key value, how are we supposed to find the tuple on the subscriber side? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Thu, Jul 22, 2021 at 8:02 PM Dilip Kumar wrote: > > On Thu, Jul 22, 2021 at 4:11 PM Amit Kapila wrote: > > > > On Thu, Jun 3, 2021 at 5:15 PM Dilip Kumar wrote: > > > > > > On Wed, Jun 2, 2021 at 7:23 PM Dilip Kumar wrote: > > > > > > > > On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh > > > > wrote: > > > > > > > > > > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar > > > > > wrote: > > > > > > > > > > > > Yes, you are right. I will change it in the next version, along > > > > > > with > > > > > > the test case. > > > > > > > > > > > +/* > > > > > + * if the key hasn't changed and we're only logging the key, > > > > > we're done. > > > > > + * But if tuple has external data then we might have to detoast > > > > > the key. > > > > > + */ > > > > > This doesn't really mention why we need to detoast the key even when > > > > > the key remains the same. I guess we can add some more details here. > > > > > > > > Noted, let's see what others have to say about fixing this, then I > > > > will fix this along with one other pending comment and I will also add > > > > the test case. Thanks for looking into this. > > > > > > I have fixed all the pending issues, I see there is already a test > > > case for this so I have changed the output for that. > > > > > > > IIUC, this issue occurs because we don't log the actual key value for > > unchanged toast key. It is neither logged as part of old_key_tuple nor > > for new tuple due to which we are not able to find it in the > > apply-side when we searched it via FindReplTupleInLocalRel. Now, I > > think it will work if we LOG the entire unchanged toasted value as you > > have done in the patch but can we explore some other way to fix it. In > > the subscriber-side, can we detect that the key column has toasted > > value in the new tuple and try to first fetch it and then do the index > > search for the fetched toasted value? I am not sure about the > > feasibility of this but wanted to see if we can someway avoid logging > > unchanged toasted key value as that might save us from additional WAL. > > Yeah if we can do this then it will be a better approach, I think as > you mentioned we can avoid logging unchanged toast key data. I will > investigate this next week and update the thread. > Okay, thanks. I think one point we need to consider here is that on the subscriber side, we use dirtysnapshot to search the key, so we need to ensure that we don't fetch the wrong data. I am not sure what will happen when by the time we try to search the tuple in the subscriber-side for the update, it has been removed and re-inserted with the same values by the user. Do we find the newly inserted tuple and update it? If so, can it also happen even if logged the unchanged old_key_tuple as the patch is doing currently? -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Thu, Jul 22, 2021 at 4:11 PM Amit Kapila wrote: > > On Thu, Jun 3, 2021 at 5:15 PM Dilip Kumar wrote: > > > > On Wed, Jun 2, 2021 at 7:23 PM Dilip Kumar wrote: > > > > > > On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh > > > wrote: > > > > > > > > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar > > > > wrote: > > > > > > > > > > Yes, you are right. I will change it in the next version, along with > > > > > the test case. > > > > > > > > > +/* > > > > + * if the key hasn't changed and we're only logging the key, we're > > > > done. > > > > + * But if tuple has external data then we might have to detoast > > > > the key. > > > > + */ > > > > This doesn't really mention why we need to detoast the key even when > > > > the key remains the same. I guess we can add some more details here. > > > > > > Noted, let's see what others have to say about fixing this, then I > > > will fix this along with one other pending comment and I will also add > > > the test case. Thanks for looking into this. > > > > I have fixed all the pending issues, I see there is already a test > > case for this so I have changed the output for that. > > > > IIUC, this issue occurs because we don't log the actual key value for > unchanged toast key. It is neither logged as part of old_key_tuple nor > for new tuple due to which we are not able to find it in the > apply-side when we searched it via FindReplTupleInLocalRel. Now, I > think it will work if we LOG the entire unchanged toasted value as you > have done in the patch but can we explore some other way to fix it. In > the subscriber-side, can we detect that the key column has toasted > value in the new tuple and try to first fetch it and then do the index > search for the fetched toasted value? I am not sure about the > feasibility of this but wanted to see if we can someway avoid logging > unchanged toasted key value as that might save us from additional WAL. Yeah if we can do this then it will be a better approach, I think as you mentioned we can avoid logging unchanged toast key data. I will investigate this next week and update the thread. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Thu, Jun 3, 2021 at 5:15 PM Dilip Kumar wrote: > > On Wed, Jun 2, 2021 at 7:23 PM Dilip Kumar wrote: > > > > On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh > > wrote: > > > > > > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar wrote: > > > > > > > > Yes, you are right. I will change it in the next version, along with > > > > the test case. > > > > > > > +/* > > > + * if the key hasn't changed and we're only logging the key, we're > > > done. > > > + * But if tuple has external data then we might have to detoast the > > > key. > > > + */ > > > This doesn't really mention why we need to detoast the key even when > > > the key remains the same. I guess we can add some more details here. > > > > Noted, let's see what others have to say about fixing this, then I > > will fix this along with one other pending comment and I will also add > > the test case. Thanks for looking into this. > > I have fixed all the pending issues, I see there is already a test > case for this so I have changed the output for that. > IIUC, this issue occurs because we don't log the actual key value for unchanged toast key. It is neither logged as part of old_key_tuple nor for new tuple due to which we are not able to find it in the apply-side when we searched it via FindReplTupleInLocalRel. Now, I think it will work if we LOG the entire unchanged toasted value as you have done in the patch but can we explore some other way to fix it. In the subscriber-side, can we detect that the key column has toasted value in the new tuple and try to first fetch it and then do the index search for the fetched toasted value? I am not sure about the feasibility of this but wanted to see if we can someway avoid logging unchanged toasted key value as that might save us from additional WAL. -- With Regards, Amit Kapila.
Re: [BUG]Update Toast data failure in logical replication
On Fri, Jun 4, 2021 at 8:25 AM tanghy.f...@fujitsu.com wrote: > > On Thu, Jun 3, 2021 7:45 PMDilip Kumar wrote: > > > > I have fixed all the pending issues, I see there is already a test > > case for this so I have changed the output for that. > > Thanks for your patch. I tested it for all branches(10,11,12,13,HEAD) and all > of them passed.(This bug was introduced in PG-10.) > I also tested the scenario where I found this bug, data could be synchronized > after your fix. Thanks for verifying this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [BUG]Update Toast data failure in logical replication
On Thu, Jun 3, 2021 7:45 PMDilip Kumar wrote: > > I have fixed all the pending issues, I see there is already a test > case for this so I have changed the output for that. Thanks for your patch. I tested it for all branches(10,11,12,13,HEAD) and all of them passed.(This bug was introduced in PG-10.) I also tested the scenario where I found this bug, data could be synchronized after your fix. Regards Tang
Re: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 at 7:23 PM Dilip Kumar wrote: > > On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh > wrote: > > > > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar wrote: > > > > > > Yes, you are right. I will change it in the next version, along with > > > the test case. > > > > > +/* > > + * if the key hasn't changed and we're only logging the key, we're > > done. > > + * But if tuple has external data then we might have to detoast the > > key. > > + */ > > This doesn't really mention why we need to detoast the key even when > > the key remains the same. I guess we can add some more details here. > > Noted, let's see what others have to say about fixing this, then I > will fix this along with one other pending comment and I will also add > the test case. Thanks for looking into this. I have fixed all the pending issues, I see there is already a test case for this so I have changed the output for that. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From fe1b9131c6c53a6335dd4fd7ceb5f8c18aacdba9 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Mon, 31 May 2021 14:37:26 +0530 Subject: [PATCH v3] Extract unchanged replica identity key if it is stored externally If replica identity is set to key and the key is not modified we don't log key seperately because it should be logged along with the updated tuple. But if the key is stored externally we must have to detoast and log it separately. --- contrib/test_decoding/expected/toast.out | 2 +- src/backend/access/heap/heapam.c | 20 ++-- 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out index 75c4d22..769ab0e 100644 --- a/contrib/test_decoding/expected/toast.out +++ b/contrib/test_decoding/expected/toast.out @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot', table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 COMMIT BEGIN - table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109 + table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 COMMIT BEGIN table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678 diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 2433998..afd775e 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -8339,8 +8339,14 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, return tp; } - /* if the key hasn't changed and we're only logging the key, we're done */ - if (!key_changed) + /* + * If the key hasn't changed and we're only logging the key, we're done. + * But if tuple has external data then it's possible that key might have + * stored externally, if so we need to extract its value because the value + * of the externally stored key data will not be logged with the main tuple + * so we will have to log as old key tuple for replica identity. + */ + if ((!key_changed) && !HeapTupleHasExternal(tp)) return NULL; /* find out the replica identity columns */ @@ -8391,6 +8397,16 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, key_tuple = toast_flatten_tuple(oldtup, desc); heap_freetuple(oldtup); } + /* + * If key tuple doesn't have any external data and key is not changed then + * just free the key tuple and return NULL. + */ + else if (!key_changed) + { + heap_freetuple(key_tuple); + *copy = false; + return NULL; + } return key_tuple; } -- 1.8.3.1
Re: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh wrote: > > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar wrote: > > > > Yes, you are right. I will change it in the next version, along with > > the test case. > > > +/* > + * if the key hasn't changed and we're only logging the key, we're done. > + * But if tuple has external data then we might have to detoast the key. > + */ > This doesn't really mention why we need to detoast the key even when > the key remains the same. I guess we can add some more details here. Noted, let's see what others have to say about fixing this, then I will fix this along with one other pending comment and I will also add the test case. Thanks for looking into this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar wrote: > > Yes, you are right. I will change it in the next version, along with > the test case. > +/* + * if the key hasn't changed and we're only logging the key, we're done. + * But if tuple has external data then we might have to detoast the key. + */ This doesn't really mention why we need to detoast the key even when the key remains the same. I guess we can add some more details here. -- Thanks & Regards, Kuntal Ghosh
Re: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 at 2:37 PM tanghy.f...@fujitsu.com wrote: > > On Wed, Jun 2, 2021 2:44 PM Dilip Kumar wrote: > > Attached patch fixes that, I haven't yet added the test case. Once > > someone confirms on the approach then I will add a test case to the > > patch. > > key_tuple = heap_form_tuple(desc, values, nulls); > *copy = true; > ... > key_tuple = toast_flatten_tuple(oldtup, desc); > heap_freetuple(oldtup); > } > + /* > +* If key tuple doesn't have any external data and key is not changed > then > +* just free the key tuple and return NULL. > +*/ > + else if (!key_changed) > + { > + heap_freetuple(key_tuple); > + return NULL; > + } > > return key_tuple; > } > > I think "*copy = false" should be added before return NULL because we don't > return a modified copy tuple here. Thoughts? Yes, you are right. I will change it in the next version, along with the test case. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 2:44 PM Dilip Kumar wrote: > Attached patch fixes that, I haven't yet added the test case. Once > someone confirms on the approach then I will add a test case to the > patch. key_tuple = heap_form_tuple(desc, values, nulls); *copy = true; ... key_tuple = toast_flatten_tuple(oldtup, desc); heap_freetuple(oldtup); } + /* +* If key tuple doesn't have any external data and key is not changed then +* just free the key tuple and return NULL. +*/ + else if (!key_changed) + { + heap_freetuple(key_tuple); + return NULL; + } return key_tuple; } I think "*copy = false" should be added before return NULL because we don't return a modified copy tuple here. Thoughts? Regards Tang
Re: [BUG]Update Toast data failure in logical replication
On Tue, Jun 1, 2021 at 3:39 PM Dilip Kumar wrote: > > On Tue, Jun 1, 2021 at 12:29 PM tanghy.f...@fujitsu.com > > I noticed that in case1, ExtractReplicaIdentity function returned NULL on > > HEAD. But after your fix, it didn’t return NULL. There is no problem with > > this case on HEAD, but the patch modified its return value. I’m not sure if > > it would bring new problems. Have you checked it? > > Good observation, basically, my check says that any field in the tuple > is toasted then prepare the key tuple, actually, after that, I should > recheck whether the key field specifically toasted or not and if it is > not then we can continue returning NULL. I will fix this in the next > version. Attached patch fixes that, I haven't yet added the test case. Once someone confirms on the approach then I will add a test case to the patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From 3be161266dd013b8fdd68352dc1c6bd34ed36069 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Mon, 31 May 2021 14:37:26 +0530 Subject: [PATCH v2] Extract unchanged replica identity key if it is stored externally If replica identity is set to key and the key is not modified we don't log key seperately because it should be logged along with the updated tuple. But if the key is stored externally we must have to detoast and log it separately. --- src/backend/access/heap/heapam.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index bd60129..9b53cdc 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -8407,8 +8407,11 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, return tp; } - /* if the key hasn't changed and we're only logging the key, we're done */ - if (!key_changed) + /* + * if the key hasn't changed and we're only logging the key, we're done. + * But if tuple has external data then we might have to detoast the key. + */ + if ((!key_changed) && !HeapTupleHasExternal(tp)) return NULL; /* find out the replica identity columns */ @@ -8459,6 +8462,15 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, key_tuple = toast_flatten_tuple(oldtup, desc); heap_freetuple(oldtup); } + /* + * If key tuple doesn't have any external data and key is not changed then + * just free the key tuple and return NULL. + */ + else if (!key_changed) + { + heap_freetuple(key_tuple); + return NULL; + } return key_tuple; } -- 1.8.3.1
Re: [BUG]Update Toast data failure in logical replication
On Tue, Jun 1, 2021 at 12:29 PM tanghy.f...@fujitsu.com wrote: > > Hi > > > > I have some questions with your patch. Here are two cases I used to check the > bug. > > > > Case1(PK toasted_key is short), data could be synchronized on HEAD. > > --- > > INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES('111', > repeat('9876543210', 200)); > > UPDATE toasted_key SET toasted_col2 = toasted_col1; > > --- > > > > Case2(PK toasted_key is very long), data couldn’t be synchronized on > HEAD.(which is the bug) > > --- > > INSERT INTO toasted_key(toasted_key, toasted_col1) > VALUES(repeat('9876543210', 200), '111'); > > UPDATE toasted_key SET toasted_col2 = toasted_col1; > > --- > > > > So I think the bug is only related with the length of primary key. > > I noticed that in case1, ExtractReplicaIdentity function returned NULL on > HEAD. But after your fix, it didn’t return NULL. There is no problem with > this case on HEAD, but the patch modified its return value. I’m not sure if > it would bring new problems. Have you checked it? Good observation, basically, my check says that any field in the tuple is toasted then prepare the key tuple, actually, after that, I should recheck whether the key field specifically toasted or not and if it is not then we can continue returning NULL. I will fix this in the next version. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [BUG]Update Toast data failure in logical replication
Hi I have some questions with your patch. Here are two cases I used to check the bug. Case1(PK toasted_key is short), data could be synchronized on HEAD. --- INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES('111', repeat('9876543210', 200)); UPDATE toasted_key SET toasted_col2 = toasted_col1; --- Case2(PK toasted_key is very long), data couldn’t be synchronized on HEAD.(which is the bug) --- INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES(repeat('9876543210', 200), '111'); UPDATE toasted_key SET toasted_col2 = toasted_col1; --- So I think the bug is only related with the length of primary key. I noticed that in case1, ExtractReplicaIdentity function returned NULL on HEAD. But after your fix, it didn’t return NULL. There is no problem with this case on HEAD, but the patch modified its return value. I’m not sure if it would bring new problems. Have you checked it? Regards Tang
Re: [BUG]Update Toast data failure in logical replication
On Mon, 31 May 2021 at 3:33 PM, tanghy.f...@fujitsu.com < tanghy.f...@fujitsu.com> wrote: > On Mon, May 31, 2021 5:12 PM Dilip Kumar wrote: > > > > The problem is if the key attribute is not changed we don't log it as > > it should get logged along with the updated tuple, but if it is > > externally stored then the complete key will never be logged because > > there is no log from the toast table. For fixing this if the key is > > externally stored then always log that. > > > > Please test with the attached patch. > > Thanks for your patch. I tested it and the bug was fixed. > Thanks for confirming this. > I'm still trying to understand your fix, please allow me to ask more(maybe > silly) questions if I found any. > > +* if the key hasn't changedand we're only logging the key, we're > done. > > I think "changedand" should be "changed and". > Okay, I will fix it. Lets see what others have to say about this fix, if we agree with this then I think we might have to change the test output. I will do that in the next version along with your comment fix.
RE: [BUG]Update Toast data failure in logical replication
On Mon, May 31, 2021 5:12 PM Dilip Kumar wrote: > > The problem is if the key attribute is not changed we don't log it as > it should get logged along with the updated tuple, but if it is > externally stored then the complete key will never be logged because > there is no log from the toast table. For fixing this if the key is > externally stored then always log that. > > Please test with the attached patch. Thanks for your patch. I tested it and the bug was fixed. I'm still trying to understand your fix, please allow me to ask more(maybe silly) questions if I found any. +* if the key hasn't changedand we're only logging the key, we're done. I think "changedand" should be "changed and". Regards Tang
Re: [BUG]Update Toast data failure in logical replication
On Mon, May 31, 2021 at 12:20 PM Dilip Kumar wrote: > > On Mon, May 31, 2021 at 8:04 AM tanghy.f...@fujitsu.com > wrote: > > > > On Friday, May 28, 2021 6:51 PM, Dilip Kumar wrote: > > > Seems you did not set the replica identity for updating the tuple. > > > Try this before updating, and it should work. > > > > Thanks for your reply. I tried it. > > > > > ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey; > > > > This didn't work. > > > > > or > > > > > > ALTER TABLE toasted_key REPLICA IDENTITY FULL. > > > > It worked. > > > > And I noticed if the length of PRIMARY KEY (toasted_key) is short, data > > could be synchronized successfully with default replica identity. > > Could you tell me why we need to set replica identity? > > Looks like some problem if the replica identity is an index and the > value is stored externally, I will debug this and let you know. The problem is if the key attribute is not changed we don't log it as it should get logged along with the updated tuple, but if it is externally stored then the complete key will never be logged because there is no log from the toast table. For fixing this if the key is externally stored then always log that. Please test with the attached patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From 965c18471919e231ec94edac9025f7178f121206 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Mon, 31 May 2021 14:37:26 +0530 Subject: [PATCH v1] Extract unchanged replica identity key if it is stored externally If replica identity is set to key and the key is not modified we don't log key seperately because it should be logged along with the updated tuple. But if the key is stored externally we must have to detoast and log it separately. --- src/backend/access/heap/heapam.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index bd60129..27d60e6 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -8407,8 +8407,11 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, return tp; } - /* if the key hasn't changed and we're only logging the key, we're done */ - if (!key_changed) + /* + * if the key hasn't changedand we're only logging the key, we're done. + * But if tuple has external data then we might have to detoast the key. + */ + if ((!key_changed) && !HeapTupleHasExternal(tp)) return NULL; /* find out the replica identity columns */ -- 1.8.3.1
Re: [BUG]Update Toast data failure in logical replication
On Mon, May 31, 2021 at 8:04 AM tanghy.f...@fujitsu.com wrote: > > On Friday, May 28, 2021 6:51 PM, Dilip Kumar wrote: > > Seems you did not set the replica identity for updating the tuple. > > Try this before updating, and it should work. > > Thanks for your reply. I tried it. > > > ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey; > > This didn't work. > > > or > > > > ALTER TABLE toasted_key REPLICA IDENTITY FULL. > > It worked. > > And I noticed if the length of PRIMARY KEY (toasted_key) is short, data could > be synchronized successfully with default replica identity. > Could you tell me why we need to set replica identity? Looks like some problem if the replica identity is an index and the value is stored externally, I will debug this and let you know. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 6:51 PM, Dilip Kumar wrote: > Seems you did not set the replica identity for updating the tuple. > Try this before updating, and it should work. Thanks for your reply. I tried it. > ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey; This didn't work. > or > > ALTER TABLE toasted_key REPLICA IDENTITY FULL. It worked. And I noticed if the length of PRIMARY KEY (toasted_key) is short, data could be synchronized successfully with default replica identity. Could you tell me why we need to set replica identity? Regards Tang
Re: [BUG]Update Toast data failure in logical replication
On Fri, May 28, 2021 at 12:31 PM tanghy.f...@fujitsu.com wrote: > > On Friday, May 28, 2021 3:02 PM, tanghy.f...@fujitsu.com wrote: > > FYI. The problem also occurs in PG-13. I will try to check from which > > version it > > got introduced. > > I reproduced it in PG-10,11,12,13. > I think the problem has been existing since Logical replication introduced in > PG-10. Seems you did not set the replica identity for updating the tuple. Try this before updating, and it should work. ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey; or ALTER TABLE toasted_key REPLICA IDENTITY FULL. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 3:02 PM, tanghy.f...@fujitsu.com wrote: > FYI. The problem also occurs in PG-13. I will try to check from which version > it > got introduced. I reproduced it in PG-10,11,12,13. I think the problem has been existing since Logical replication introduced in PG-10. Regards Tang
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 2:16 PM, tanghy.f...@fujitsu.com wrote: >I think I just found a bug in logical replication. Data couldn't be >synchronized while updating toast data. >Could anyone take a look at it? FYI. The problem also occurs in PG-13. I will try to check from which version it got introduced. Regards, Tang