Re: memory leak in pgoutput
> By the way, if possible, could you send an updated version of the > patch to show what you have in mind? Yeah, here is the new patch: I have verifed that this patch works for REL_[15-17]_STABLE and master. The memory consumption of "logical replication cache context" remains consistently at 112 MB during the benchmark mentioned above. > sysbench oltp_write_only --db-driver=pgsql \ > --pgsql-port=5432 \ > --pgsql-db=postgres \ > --pgsql-user=postgres \ > --tables=15000 --table_size=100 \ > --report-interval=1 \ > --threads=10 run \ > --time=180 v2-0001-Fix-memory-leak-in-pgoutput.patch Description: v2-0001-Fix-memory-leak-in-pgoutput.patch
Re: memory leak in pgoutput
> You should be more careful with the amount of tests you are doing > here. This fails while waiting for some changes to be streamed when > creating a subscription: > cd src/test/subscription/ && PROVE_TESTS=t/017_stream_ddl.pl make check I apologize for the obvious error in the previous patch. I have corrected it in the new patch(v3) and pass the regression testing. v3-0001-Fix-memory-leak-in-pgoutput.patch Description: v3-0001-Fix-memory-leak-in-pgoutput.patch
Re: memory leak in pgoutput
> Here, after freeing the tupledesc, the ExecDropSingleTupleTableSlot will still > access the freed tupledesc->tdrefcount which is an illegal memory access. Yes, I overlooked that. > I think we can do something like below instead: > > + TupleDesc desc = > entry->old_slot->tts_tupleDescriptor; > + > + Assert(desc->tdrefcount == -1); > + > ExecDropSingleTupleTableSlot(entry->old_slot); > + FreeTupleDesc(desc); It seems a bit odd because "entry->old_slot->tts_tupleDescriptor" is accessed after "entry->old_slot" has been freed. I think we can avoid this by assigning "desc" to NULL before ExecDropSingleTupleTableSlot(). ``` + TupleDesc desc = entry->old_slot->tts_tupleDescriptor; + + Assert(desc->tdrefcount == -1); + + FreeTupleDesc(desc); + desc = NULL; ExecDropSingleTupleTableSlot(entry->old_slot); ``` By the way, this issue is introduced in 52e4f0cd472d39d. Therefore, we may need to backport the patch to v15. Best Regards, Boyu Yang
memory leak in pgoutput
Hello, I recently noticed a unusually large memory consumption in pgoutput where "RelationSyncCache" is maintaining approximately 3 GB of memory while handling 15,000 tables. Upon investigating the cause, I found that "tts_tupleDescriptor" in both "old_slot" and "new_slot" wouldn't be freed before the reusing the entry. The refcount of the tuple descriptor is initialized as -1 in init_tuple_slot(), which means it will not be refcounted. So, when cleaning the outdated slots, ExecDropSingleTupleTableSlot() would omit tuple descriptors. To address this issue, calling FreeTupleDesc() to release the tuple descriptor before ExecDropSingleTupleTableSlot() might be a suitable solution. However, I am uncertain whether this approach is appropriate. Reproduct = It's easy to reproduce the issue with sysbench: 1. Create emtpy tables on primary. ``` sysbench oltp_read_only --db-driver=pgsql \ --pgsql-port=5432 \ --pgsql-db=postgres \ --pgsql-user=postgres \ --tables=15000 --table_size=0 \ --report-interval=1 \ --threads=10 prepare ``` 2. Create a standby and promote it. 3. Create publication on primary and subscription on standby. ``` CREATE PUBLICATION pub1 FOR ALL TABLES; CREATE SUBSCRIPTION sub1 CONNECTION 'port=5432 user=postgres dbname=postgres' PUBLICATION pub1 WITH (enabled, create_slot, slot_name='pub1_to_sub1', copy_data=false); ``` 4. Bench on primary. ``` sysbench oltp_write_only --db-driver=pgsql \ --pgsql-port=5432 \ --pgsql-db=postgres \ --pgsql-user=postgres \ --tables=15000 --table_size=100 \ --report-interval=1 \ --threads=10 run \ --time=180 ``` Tested in PostgreSQL 17, the memory consumption of walsender is 804MB after the benchmark, and it decreases to 441MB after applying the patch. Thanks for your feedback. Kind Regards, Boyu Yang 0001-Fix-memory-leak-in-pgoutput.patch Description: 0001-Fix-memory-leak-in-pgoutput.patch