Hi, Per complain in another thread[1], I started to look into the global variables in pgoutput.
Currently we have serval global variables in pgoutput, but each of them is inherently local to an individual pgoutput instance. This could cause issues if we switch to different output plugin instance in one session and could miss to reset their value in case of errors. The analysis for each variable is as follows: - static HTAB *RelationSyncCache = NULL; pgoutput creates this hash table under cacheMemoryContext to remember the relation schemas that have been sent, but it's local to an individual pgoutput instance, and because it's under global memory context, the hashtable is not properly cleared in error paths which means it has a risk of being accessed in a different output plugin instance. This was also mentioned in another thread[2]. So I think we'd better allocate this under output plugin private context. But note that, instead of completely moving the hash table into the output plugin private data, we need to to keep the static pointer variable for the map to be accessed by the syscache callbacks. This is because syscache callbacks won't be un-registered even after shutting down the output plugin, so we need a static pointer to cache the map pointer so that callbacks can check it. - static bool publish_no_origin; This flag is also local to pgoutput instance, and we didn't reset the flag in output shutdown callback, so if we consume changes from different slots, then the second call would reuse the flag value that is set in the first call which is unexpected. To completely avoid this issue, we think we'd better move this flag to output plugin private data structure. Example: SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_1', NULL, NULL, 'proto_version', '1', 'publication_names', 'pub', 'origin', 'none'); --- Set origin in this call. SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_2', NULL, NULL, 'proto_version', '1', 'publication_names', 'pub'); -- Didn't set origin, but will reuse the origin flag in the first call. - static bool in_streaming; While on it, I feel we can also move this flag to private data, although I didn't see problems for this one. - static bool publications_valid; I thought we need to move this to private data as well, but we need to access this in a syscache callback, which means we need to keep the static variable. Attach the patches to change in_streaming, publish_no_origin and RelationSyncCache. Suggestions and comments are welcome. [1] https://www.postgresql.org/message-id/20230821182732.t3qc75i5s5xvovls%40awork3.anarazel.de [2] https://www.postgresql.org/message-id/CAA4eK1LJ%3DCSsxETs5ydqP58OiWPiwodx%3DJqw89LQ7fMrRWqK9w%40mail.gmail.com Best Regards, Hou Zhijie
v1-0003-Move-in_streaming-to-output-private-data.patch
Description: v1-0003-Move-in_streaming-to-output-private-data.patch
v1-0001-Allocate-RelationSyncCache-of-pgoutput-under-priv.patch
Description: v1-0001-Allocate-RelationSyncCache-of-pgoutput-under-priv.patch
v1-0002-Maintain-publish_no_origin-in-output-plugin-priva.patch
Description: v1-0002-Maintain-publish_no_origin-in-output-plugin-priva.patch