On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
On 2/16/23 2:38 PM, Alvaro Herrera wrote:
On 2023-Feb-16, Jonathan S. Katz wrote:

On 2/16/23 12:53 PM, Alvaro Herrera wrote:

I don't think this is the fault of logical replication.  Consider that
for the backend server, the function source code is just an opaque
string that is given to the plpgsql engine to interpret.  So there's no
way for the logical DDL replication engine to turn this into runnable
code if the table name is not qualified.

Sure, that's fair. That said, the example above would fall under a "typical use case", i.e. I'm replicating functions that call tables without schema
qualification. This is pretty common, and as logical replication becomes
used for more types of workloads (e.g. high availability), we'll definitely
see this.

Hmm, I think you're saying that replay should turn check_function_bodies
off, and I think I agree with that.

Yes, exactly. +1

I drilled into this a bit more using the SQL standard bodies (BEGIN ATOMIC) to see if there were any other behaviors we needed to account for. Overall, it worked well but I ran into one issue.

First, functions with "BEGIN ATOMIC" ignores "check_function_bodies" which is by design based on how this feature works. We should still turn "check_function_bodies" to "off" though, per above discussion.

In the context of DDL replication, "BEGIN ATOMIC" does support schema-unqualified functions, presumably because it includes the parsed content?

I created an updated example[1] where I converted the SQL functions to use the standard syntax and I returned the table names to be schema unqualified. This seemed to work, but I ran into a weird case with this function:

CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int, calendar_date date)
RETURNS void
LANGUAGE SQL
BEGIN ATOMIC
    WITH delete_calendar AS (
        DELETE FROM calendar
        WHERE
            room_id = $1 AND
            calendar_date = $2
    )
    INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
    SELECT $1, c.status, $2, c.calendar_range
    FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
END;

This produced an error on the subscriber, with the following message:

2023-02-16 20:58:24.096 UTC [26864] ERROR: missing FROM-clause entry for table "calendar_1" at character 322 2023-02-16 20:58:24.096 UTC [26864] CONTEXT: processing remote data for replication origin "pg_18658" during message type "DDL" in transaction 980, finished at 0/C099A7D8 2023-02-16 20:58:24.096 UTC [26864] STATEMENT: CREATE OR REPLACE FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100 BEGIN ATOMIC
         WITH delete_calendar AS (
                  DELETE FROM public.calendar
WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=) calendar_manage.room_id) AND (calendar_1.calendar_date OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
                 )
INSERT INTO public.calendar (room_id, status, calendar_date, calendar_range) SELECT calendar_manage.room_id,
                     c.status,
                     calendar_manage.calendar_date,
                     c.calendar_range
FROM public.calendar_generate_calendar(calendar_manage.room_id, pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+) 1))::timestamp with time zone)) c(status, calendar_range);
        END

This seemed to add an additional, incorrect reference to the origin table for the "room_id" and "calendar_date" attributes within the CTE of this function. I don't know if this is directly related to the DDL replication patch, but reporting it as I triggered the behavior through it.

Thanks,

Jonathan

[1] https://gist.github.com/jkatz/fe29006b724fd6f32ee849a96dc01608

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Reply via email to