On Thu, Mar 4, 2021 at 4:37 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Fri, Feb 26, 2021 at 10:37 AM Amit Langote <amitlangot...@gmail.com> wrote:
> >
> > I realized that there is a race condition that will allow a concurrent
> > backend to invalidate a parallel plan (for insert into a partitioned
> > table) at a point in time when it's too late for plancache.c to detect
> > it. It has to do with how plancache.c locks the relations affected by
> > a cached query and its cached plan. Specifically,
> > AcquireExecutorLocks(), which locks the relations that need to be
> > locked before the plan can be considered safe to execute, does not
> > notice the partitions that would have been checked for parallel safety
> > when the plan was made. Given that AcquireExecutorLocks() only loops
> > over PlannedStmt.rtable and locks exactly the relations found there,
> > partitions are not locked. This means that a concurrent backend can,
> > for example, add an unsafe trigger to a partition before a parallel
> > worker inserts into it, only to fail when it does.
> >
> > Steps to reproduce (tried with the v19 set of patches):
> >
> > drop table if exists rp, foo;
> > create table rp (a int) partition by range (a);
> > create table rp1 partition of rp for values from (minvalue) to (0);
> > create table rp2 partition of rp for values from (0) to (maxvalue);
> > create table foo (a) as select generate_series(1, 1000000);
> > set plan_cache_mode to force_generic_plan;
> > set enable_parallel_dml to on;
> > deallocate all;
> > prepare q as insert into rp select * from foo where a%2 = 0;
> > explain analyze execute q;
> >
> > At this point, attach a debugger to this backend, set a breakpoint
> > in AcquireExecutorLocks(), and execute q again:
> >
> > -- will execute the cached plan
> > explain analyze execute q;
> >
> > The breakpoint will be hit. Continue till return from the function and
> > stop.
> > Start another backend and execute this:
> >
> > -- will go through, because no lock has been taken on the partition yet
> > create or replace function make_table () returns trigger language
> > plpgsql as $$ begin create table bar(); return null; end; $$ parallel
> > unsafe;
> > create trigger ai_rp2 after insert on rp2 for each row execute
> > function make_table();
> >
> > Back in the debugger, hit continue to let the plan execute. You
> > should see this error:
> >
> > ERROR:  cannot start commands during a parallel operation
> > CONTEXT:  SQL statement "create table bar()"
> > PL/pgSQL function make_table() line 1 at SQL statement
> > parallel worker
> >
> > The attached patch fixes this,
>
> One thing I am a bit worried about with this fix is that after this,
> for parallel mode, we will maintain partitionOids in two places: one in
> plannedstmt->relationOids and the other in plannedstmt->partitionOids.
> I guess you have to do that because, in AcquireExecutorLocks, we can't
> find which relationOids correspond to partitionOids, am I right? If
> so, why not just maintain them in plannedstmt->partitionOids and then
> make PlanCacheRelCallback consider it? Also, in standard_planner, we
> should add these partitionOids only for parallel mode.
>
One more point I was thinking about is whether we need to worry about
locking indexes during prepared query execution (similar to what we do
in AcquireExecutorLocks). I think we shouldn't be bothered to lock
those, or even to retain locks during parallel-safety checks, because
one cannot change an index expression or predicate. Is my understanding
correct, or am I missing something and we should be worried about them
as well?

--
With Regards,
Amit Kapila.