Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-21 Thread Masahiko Sawada
On Thu, Aug 21, 2025 at 2:55 AM Amit Kapila wrote: > > On Thu, Aug 21, 2025 at 2:03 PM 赵宇鹏(宇彭) > wrote: > > > > From what we see in our users’ production environments, the situation is > > exactly > > as previously described. Creating a “publication for all tables” is very > > common, > > beca

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-21 Thread Amit Kapila
On Thu, Aug 21, 2025 at 10:53 AM Hayato Kuroda (Fujitsu) wrote: > > > I have concerns about the performance implications of iterating > > through all entries in the caches within > > maybe_cleanup_rel_sync_cache(). If the cache contains numerous > > entries, this iteration could potentially cause
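
One generic way to bound the cost of such a sweep is to gate it on a counter, so the full iteration runs only after enough entries have been invalidated. The following is a minimal sketch of that idea, not the patch under discussion: the fixed-size array, `DEMO_SWEEP_THRESHOLD`, and every `demo_` name are assumptions made for illustration, and the real cache is a hash table rather than an array.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CACHE_SIZE 8
#define DEMO_SWEEP_THRESHOLD 4  /* hypothetical tuning knob */

/* Hypothetical stand-in for one cache entry. */
typedef struct DemoSlot
{
    bool used;   /* slot currently holds an entry */
    bool valid;  /* false once the entry has been invalidated */
} DemoSlot;

static DemoSlot demo_cache[DEMO_CACHE_SIZE];
static int demo_invalid_slots = 0;  /* entries awaiting cleanup */
static int demo_sweeps = 0;         /* number of full iterations performed */

/* Mark one entry invalid; O(1), no iteration. */
static void
demo_mark_invalid(size_t i)
{
    if (demo_cache[i].used && demo_cache[i].valid)
    {
        demo_cache[i].valid = false;
        demo_invalid_slots++;
    }
}

/* Iterate over the whole cache only when enough garbage has accumulated. */
static void
demo_maybe_cleanup(void)
{
    if (demo_invalid_slots < DEMO_SWEEP_THRESHOLD)
        return;                     /* cheap early exit on the hot path */

    for (size_t i = 0; i < DEMO_CACHE_SIZE; i++)
    {
        if (demo_cache[i].used && !demo_cache[i].valid)
            demo_cache[i].used = false;  /* drop the invalidated entry */
    }
    demo_invalid_slots = 0;
    demo_sweeps++;
}
```

With a gate like this the full iteration is amortized over at least `DEMO_SWEEP_THRESHOLD` invalidations, though a single sweep over a very large cache can still be expensive.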

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-21 Thread Amit Kapila
On Thu, Aug 21, 2025 at 2:03 PM 赵宇鹏(宇彭) wrote: > > From what we see in our users’ production environments, the situation is > exactly > as previously described. Creating a “publication for all tables” is very > common, > because manually choosing individual tables to publish can be cumbersome. >

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-21 Thread 赵宇鹏(宇彭)
Hi, From what we see in our users’ production environments, the situation is exactly as previously described. Creating a “publication for all tables” is very common, because manually choosing individual tables to publish can be cumbersome. Regular CREATE/DROP TABLE activity is also normal, and

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-20 Thread Hayato Kuroda (Fujitsu)
> I also considered it at first but did not choose it because of the code complexity. > After further consideration, it is not so difficult; PSA the new file. v3 contained 100_cachectm_oom.pl, which won't succeed. Here is a patch that removes the test file. Best regards, Hayato Kuroda FUJITSU LIMITED v4-0001

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-20 Thread Hayato Kuroda (Fujitsu)
Dear Xuneng, > This may not be ideal. It decrements on every lookup of an existing > entry, not just when consuming an invalidation, which could make the > counter go > negative. Do we need decrementing logic? Not perfect 1:1 tracking > seems ok in here; though it might make the clean-up a bit mo

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-20 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, > It decrements the counter whenever we successfully find the entry from > the cache but I'm not sure this is the right approach. What if no > cache invalidation happens at all but we retrieve entries from the > cache many times? Oh, right. I tried to handle the case that invalid
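
The concern quoted above can be illustrated with a small standalone sketch. It is not the patch's code: the `DemoEntry` type, the `demo_invalid_count` counter, and both lookup variants are hypothetical names chosen for illustration. Variant A decrements on every cache hit, so repeated lookups with no invalidation push the counter negative. Variant B, one possible alternative and not necessarily what the thread settled on, decrements only when a lookup actually consumes an invalidation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one cache entry. */
typedef struct DemoEntry
{
    bool valid;  /* false after an invalidation until the entry is rebuilt */
} DemoEntry;

/* Hypothetical counter of entries currently marked invalid. */
static int demo_invalid_count = 0;

/* An invalidation marks the entry and bumps the counter once. */
static void
demo_invalidate(DemoEntry *e)
{
    if (e->valid)
    {
        e->valid = false;
        demo_invalid_count++;
    }
}

/* Variant A: decrement on every successful lookup (the behavior questioned). */
static void
demo_lookup_decrement_always(DemoEntry *e)
{
    demo_invalid_count--;   /* runs even if no invalidation ever happened */
    e->valid = true;
}

/* Variant B: decrement only when the lookup rebuilds an invalidated entry. */
static void
demo_lookup_decrement_on_rebuild(DemoEntry *e)
{
    if (!e->valid)
    {
        e->valid = true;
        demo_invalid_count--;
    }
}
```

Under variant B the counter stays equal to the number of entries still marked invalid, at the cost of one extra branch per lookup.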

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-19 Thread Xuneng Zhou
Hi, On Wed, Aug 20, 2025 at 8:44 AM Masahiko Sawada wrote: > > On Sun, Aug 17, 2025 at 11:30 PM Hayato Kuroda (Fujitsu) > wrote: > > > > Dear Sawada-san, > > > > > I've not verified, but even if that's true, IIUC only one relation's > > > cache entry can set in_use to true at a time. > > > > I a

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-19 Thread Masahiko Sawada
On Sun, Aug 17, 2025 at 11:30 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Sawada-san, > > > I've not verified, but even if that's true, IIUC only one relation's > > cache entry can set in_use to true at a time. > > I also think so. > > > If my understanding is > > correct, when the walsender accept

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-17 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, > I've not verified, but even if that's true, IIUC only one relation's > cache entry can set in_use to true at a time. I also think so. > If my understanding is > correct, when the walsender accepts invalidation messages in > logicalrep_write_tuple() as you mentioned, it doesn't

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-15 Thread Masahiko Sawada
On Wed, Aug 13, 2025 at 11:43 PM 赵宇鹏(宇彭) wrote: > > Hi all, > > We recently ran into a memory leak in a production logical-replication > WAL-sender > process. A simplified reproduction script is attached. > > If you run the script and then call MemoryContextStats(TopMemoryContext), you > will see

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-15 Thread Masahiko Sawada
On Fri, Aug 15, 2025 at 5:07 AM Hayato Kuroda (Fujitsu) wrote: > > Dear Sawada-san, > > > Given that cache invalidation is executed upon replaying > > REORDER_BUFFER_CHANGE_INVALIDATION and the end of a transaction > > replay, in which case do we keep the relcache (i.e. just setting > > replicate_
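
The behavior being asked about, keeping an entry and only clearing `replicate_valid` while `in_use` is set, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not pgoutput's actual callback: `DemoRelEntry`, its `payload` field, and `demo_invalidate_entry()` are hypothetical, and only the field names `in_use` and `replicate_valid` come from the thread.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified cache entry. */
typedef struct DemoRelEntry
{
    bool  in_use;           /* entry is being used by an in-progress write */
    bool  replicate_valid;  /* false means the entry must be rebuilt */
    char *payload;          /* stands in for the entry's allocated substructure */
} DemoRelEntry;

/*
 * Handle an invalidation for one entry.
 * Returns true if the substructure was freed, false if it was kept
 * because the entry is still in use.
 */
static bool
demo_invalidate_entry(DemoRelEntry *e)
{
    e->replicate_valid = false;   /* always force a rebuild on next use */

    if (e->in_use)
        return false;             /* a caller still holds pointers into it */

    free(e->payload);
    e->payload = NULL;
    return true;
}
```

In this sketch, memory held by an in-use entry is released only by a later invalidation or by whatever rebuilds the entry, which is the kind of case the question above is probing.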

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-15 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, > Given that cache invalidation is executed upon replaying > REORDER_BUFFER_CHANGE_INVALIDATION and the end of a transaction > replay, in which case do we keep the relcache (i.e. just setting > replicate_valid=false) because of in_use=true? Per old discussion [1], logicalrep_writ

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-15 Thread Masahiko Sawada
On Thu, Aug 14, 2025 at 10:26 PM Zhijie Hou (Fujitsu) wrote: > > On Friday, August 15, 2025 10:59 AM Xuneng Zhou wrote: > > Thanks for your clarification! > > > > On Fri, Aug 15, 2025 at 10:10 AM Zhijie Hou (Fujitsu) > > > > wrote: > > > > > > On Thursday, August 14, 2025 8:49 PM Hayato Kuroda

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Zhijie Hou (Fujitsu)
On Friday, August 15, 2025 10:59 AM Xuneng Zhou wrote: > Thanks for your clarification! > > On Fri, Aug 15, 2025 at 10:10 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu) > wrote: > > > > > > Dear Xuneng, > > > > > > > Is it safe to free the

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Xuneng Zhou
Hi Zhijie and Hayato-san, Thanks for your clarification! On Fri, Aug 15, 2025 at 10:10 AM Zhijie Hou (Fujitsu) wrote: > > On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu) > wrote: > > > > Dear Xuneng, > > > > > Is it safe to free the substructure from within > > > rel_sync_cache_r

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Zhijie Hou (Fujitsu)
On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Xuneng, > > > Is it safe to free the substructure from within > > rel_sync_cache_relation_cb()? > > You are referring to the comment in rel_sync_cache_relation_cb(), right? My understanding of that comment > is that we must not access any *

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Hayato Kuroda (Fujitsu)
Dear Xuneng, > Is it safe to free the substructure from within rel_sync_cache_relation_cb()? You are referring to the comment in rel_sync_cache_relation_cb(), right? My understanding of that comment is that we must not access any *system caches*. Here we do not rebuild caches, so we do not access

Re: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Xuneng Zhou
Hi, Thanks for the patch. On Thu, Aug 14, 2025 at 6:39 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Zhao, > > Thanks for raising the issue. > > > If you run the script and then call MemoryContextStats(TopMemoryContext), > > you > > will see something like: > > "logical replication cache context:

RE: memory leak in logical WAL sender with pgoutput's cachectx

2025-08-14 Thread Hayato Kuroda (Fujitsu)
Dear Zhao, Thanks for raising the issue. > If you run the script and then call MemoryContextStats(TopMemoryContext), you > will see something like: > "logical replication cache context: 562044928 total in 77 blocks;" > meaning “cachectx” has grown to ~500 MB, and it keeps growing as the number I