Re: Delay the variable initialization in get_rel_sync_entry
On Wed, Jan 5, 2022 at 10:31 AM Michael Paquier wrote: > On Fri, Dec 24, 2021 at 01:27:26PM +, houzj.f...@fujitsu.com wrote: > > Here is the perf result of pgoutput_change after applying the patch. > > I didn't notice something else that stand out. > > > > |--2.99%--pgoutput_change > > |--1.80%--get_rel_sync_entry > > |--1.56%--hash_search > > > > Also attach complete profiles. > > Thanks. I have also done my own set of measurements, and the > difference is noticeable in the profiles I looked at. So, applied > down to 13. Thanks. I agree the variables were being defined in the wrong place before this patch. -- Amit Langote EDB: http://www.enterprisedb.com
RE: Delay the variable initialization in get_rel_sync_entry
On Wednesday, January 5, 2022 9:31 AM Michael Paquier wrote: > On Fri, Dec 24, 2021 at 01:27:26PM +, houzj.f...@fujitsu.com wrote: > > Here is the perf result of pgoutput_change after applying the patch. > > I didn't notice something else that stand out. > > > > |--2.99%--pgoutput_change > > |--1.80%--get_rel_sync_entry > > |--1.56%--hash_search > > > > Also attach complete profiles. > > Thanks. I have also done my own set of measurements, and the difference is > noticeable in the profiles I looked at. So, applied down to 13. Thanks for pushing! Best regards, Hou zj
Re: Delay the variable initialization in get_rel_sync_entry
On Fri, Dec 24, 2021 at 01:27:26PM +, houzj.f...@fujitsu.com wrote: > Here is the perf result of pgoutput_change after applying the patch. > I didn't notice something else that stand out. > > |--2.99%--pgoutput_change > |--1.80%--get_rel_sync_entry > |--1.56%--hash_search > > Also attach complete profiles. Thanks. I have also done my own set of measurements, and the difference is noticeable in the profiles I looked at. So, applied down to 13. -- Michael signature.asc Description: PGP signature
RE: Delay the variable initialization in get_rel_sync_entry
On Friday, December 24, 2021 8:13 AM Michael Paquier wrote: > > On Thu, Dec 23, 2021 at 12:54:41PM -0300, Euler Taveira wrote: > > On Wed, Dec 22, 2021, at 10:11 AM, houzj.f...@fujitsu.com wrote: > >> The extra cost could sometimes be noticeable because get_rel_sync_entry is > a > >> hot function which is executed for each change. And the 'am_partition' and > >> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. > >> > >> Here is the perf result for the case when inserted large amounts of data > >> into > a > >> un-published table in which case the cost is noticeable. > >> > >> --12.83%--pgoutput_change > >> |--11.84%--get_rel_sync_entry > >> |--4.76%--get_rel_relispartition > >> |--4.70%--get_rel_relkind > > How does the perf balance change once you apply the patch? Do we have > anything else that stands out? Getting rid of this bottleneck is fine > by itself, but I am wondering if there are more things to worry about > or not. Thanks for the response. Here is the perf result of pgoutput_change after applying the patch. I didn't notice something else that stand out. |--2.99%--pgoutput_change |--1.80%--get_rel_sync_entry |--1.56%--hash_search Also attach complete profiles. > > Good catch. WFM. Deferring variable initialization close to its first use is > > good practice. > > Yeah, it is usually a good practice to have the declaration within > the code block that uses it rather than piling everything at the > beginning of the function. Being able to see that in profiles is > annoying, and the change is simple, so I'd like to backpatch it. +1 > This is a period of vacations for a lot of people, so I'll wait until > the beginning-ish of January before doing anything. Thanks, added it to CF. https://commitfest.postgresql.org/36/3471/ Best regards, Hou zj <>
Re: Delay the variable initialization in get_rel_sync_entry
On Thu, Dec 23, 2021 at 12:54:41PM -0300, Euler Taveira wrote: > On Wed, Dec 22, 2021, at 10:11 AM, houzj.f...@fujitsu.com wrote: >> The extra cost could sometimes be noticeable because get_rel_sync_entry is a >> hot function which is executed for each change. And the 'am_partition' and >> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. >> >> Here is the perf result for the case when inserted large amounts of data >> into a >> un-published table in which case the cost is noticeable. >> >> --12.83%--pgoutput_change >> |--11.84%--get_rel_sync_entry >> |--4.76%--get_rel_relispartition >> |--4.70%--get_rel_relkind How does the perf balance change once you apply the patch? Do we have anything else that stands out? Getting rid of this bottleneck is fine by itself, but I am wondering if there are more things to worry about or not. > Good catch. WFM. Deferring variable initialization close to its first use is > good practice. Yeah, it is usually a good practice to have the declaration within the code block that uses it rather than piling everything at the beginning of the function. Being able to see that in profiles is annoying, and the change is simple, so I'd like to backpatch it. This is a period of vacations for a lot of people, so I'll wait until the beginning-ish of January before doing anything. -- Michael signature.asc Description: PGP signature
Re: Delay the variable initialization in get_rel_sync_entry
On Wed, Dec 22, 2021, at 10:11 AM, houzj.f...@fujitsu.com wrote: > When reviewing some logical replication patches. I noticed that in > function get_rel_sync_entry() we always invoke get_rel_relispartition() > and get_rel_relkind() at the beginning which could cause unnecessary > cache access. > > --- > get_rel_sync_entry(PGOutputData *data, Oid relid) > { > RelationSyncEntry *entry; > bool am_partition = get_rel_relispartition(relid); > char relkind = get_rel_relkind(relid); > --- > > The extra cost could sometimes be noticeable because get_rel_sync_entry is a > hot function which is executed for each change. And the 'am_partition' and > 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. > > Here is the perf result for the case when inserted large amounts of data into > a > un-published table in which case the cost is noticeable. > > --12.83%--pgoutput_change > |--11.84%--get_rel_sync_entry > |--4.76%--get_rel_relispartition > |--4.70%--get_rel_relkind Good catch. WFM. Deferring variable initialization close to its first use is good practice. -- Euler Taveira EDB https://www.enterprisedb.com/
Re: Delay the variable initialization in get_rel_sync_entry
At Wed, 22 Dec 2021 13:11:38 +, "houzj.f...@fujitsu.com" wrote in > Hi, > > When reviewing some logical replication patches. I noticed that in > function get_rel_sync_entry() we always invoke get_rel_relispartition() > and get_rel_relkind() at the beginning which could cause unnecessary > cache access. > > --- > get_rel_sync_entry(PGOutputData *data, Oid relid) > { > RelationSyncEntry *entry; > boolam_partition = get_rel_relispartition(relid); > charrelkind = get_rel_relkind(relid); > --- > > The extra cost could sometimes be noticeable because get_rel_sync_entry is a > hot function which is executed for each change. And the 'am_partition' and > 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. > > Here is the perf result for the case when inserted large amounts of data into > a > un-published table in which case the cost is noticeable. > > --12.83%--pgoutput_change > |--11.84%--get_rel_sync_entry > |--4.76%--get_rel_relispartition > |--4.70%--get_rel_relkind > > So, I think it would be better if we do the initialization only when > RelationSyncEntry in invalid. > > Attach a small patch which delay the initialization. > > Thoughts ? A simple benchmarking that replicates pgbench workload showed me that the function doesn't enter the path to use the am_partition and relkind in almost all (99.999..%) cases and I don't think it is a special case. Thus I think we can expect that we gain about 10% without any possibility of loss. Addition to that, it is simply a good practice to keep variable scopes narrow. So +1 for this change. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Delay the variable initialization in get_rel_sync_entry
Hi, When reviewing some logical replication patches. I noticed that in function get_rel_sync_entry() we always invoke get_rel_relispartition() and get_rel_relkind() at the beginning which could cause unnecessary cache access. --- get_rel_sync_entry(PGOutputData *data, Oid relid) { RelationSyncEntry *entry; boolam_partition = get_rel_relispartition(relid); charrelkind = get_rel_relkind(relid); --- The extra cost could sometimes be noticeable because get_rel_sync_entry is a hot function which is executed for each change. And the 'am_partition' and 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. Here is the perf result for the case when inserted large amounts of data into a un-published table in which case the cost is noticeable. --12.83%--pgoutput_change |--11.84%--get_rel_sync_entry |--4.76%--get_rel_relispartition |--4.70%--get_rel_relkind So, I think it would be better if we do the initialization only when RelationSyncEntry in invalid. Attach a small patch which delay the initialization. Thoughts ? Best regards, Hou zj 0001-Delay-the-variable-initialization-in-get_rel_sync_en.patch Description: 0001-Delay-the-variable-initialization-in-get_rel_sync_en.patch