On Fri, Feb 26, 2021 at 4:07 PM Amit Langote <[email protected]> wrote: > > Hi Greg, > > Replying to an earlier email in the thread because I think I found a > problem relevant to the topic that was brought up. > > On Wed, Feb 17, 2021 at 10:34 PM Amit Langote <[email protected]> wrote: > > On Wed, Feb 17, 2021 at 10:44 AM Greg Nancarrow <[email protected]> wrote: > > > Is the use of "table_close(rel, NoLock)'' intentional? That will keep > > > the lock (lockmode) until end-of-transaction. > > > > I think we always keep any locks on relations that are involved in a > > plan until end-of-transaction. What if a partition is changed in an > > unsafe manner between being considered safe for parallel insertion and > > actually performing the parallel insert? > > I realized that there is a race condition that will allow a concurrent > backend to invalidate a parallel plan (for insert into a partitioned > table) at a point in time when it's too late for plancache.c to detect > it. It has to do with how plancache.c locks the relations affected by > a cached query and its cached plan. Specifically, > AcquireExecutorLocks(), which locks the relations that need to be > locked before the plan could be considered safe to execute, does not > notice the partitions that would have been checked for parallel safety > when the plan was made. Given that AcquireExecutorLocks() only loops > over PlannedStmt.rtable and locks exactly the relations found there, > partitions are not locked. This means that a concurrent backend can, > for example, add an unsafe trigger to a partition before a parallel > worker inserts into it only to fail when it does. > > Steps to reproduce (tried with v19 set of patches): > > drop table if exists rp, foo; > create table rp (a int) partition by range (a); > create table rp1 partition of rp for values from (minvalue) to (0); > create table rp2 partition of rp for values from (0) to (maxvalue); > create table foo (a) as select generate_series(1, 1000000); > set plan_cache_mode to force_generic_plan; > set enable_parallel_dml to on; > deallocate all; > prepare q as insert into rp select * from foo where a%2 = 0; > explain analyze execute q; > > At this point, attach a debugger to this backend and set a breakpoint > in AcquireExecutorLocks() and execute q again: > > -- will execute the cached plan > explain analyze execute q; > > Breakpoint will be hit. Continue till return from the function and > stop. Start another backend and execute this: > > -- will go through, because no lock still taken on the partition > create or replace function make_table () returns trigger language > plpgsql as $$ begin create table bar(); return null; end; $$ parallel > unsafe; > create trigger ai_rp2 after insert on rp2 for each row execute > function make_table(); > > Back to the debugger, hit continue to let the plan execute. You > should see this error: > > ERROR: cannot start commands during a parallel operation > CONTEXT: SQL statement "create table bar()" > PL/pgSQL function make_table() line 1 at SQL statement parallel worker > > The attached patch fixes this, although I am starting to have second > thoughts about how we're tracking partitions in this patch. Wondering > if we should bite the bullet and add partitions into the main range > table instead of tracking them separately in partitionOids, which > might result in a cleaner patch overall. >
Thanks Amit, I was able to reproduce the problem using your instructions (though I found I had to run that explain an extra time, in order to hit the breakpoint). Also, I can confirm that the problem doesn't occur after application of your patch. I'll leave it to your better judgement as to what to do next - if you feel the current tracking method is not sufficient Regards, Greg Nancarrow Fujitsu Australia
