Re: Parallel INSERT (INTO ... SELECT ...)

Amit Langote Thu, 25 Feb 2021 21:07:51 -0800

Hi Greg,

Replying to an earlier email in the thread because I think I found a
problem relevant to the topic that was brought up.

On Wed, Feb 17, 2021 at 10:34 PM Amit Langote <amitlangot...@gmail.com> wrote:
> On Wed, Feb 17, 2021 at 10:44 AM Greg Nancarrow <gregn4...@gmail.com> wrote:
> > Is the use of "table_close(rel, NoLock)'' intentional? That will keep
> > the lock (lockmode) until end-of-transaction.
>
> I think we always keep any locks on relations that are involved in a
> plan until end-of-transaction.  What if a partition is changed in an
> unsafe manner between being considered safe for parallel insertion and
> actually performing the parallel insert?

I realized that there is a race condition that will allow a concurrent
backend to invalidate a parallel plan (for insert into a partitioned
table) at a point in time when it's too late for plancache.c to detect
it.  It has to do with how plancache.c locks the relations affected by
a cached query and its cached plan.  Specifically,
AcquireExecutorLocks(), which locks the relations that need to be
locked before the plan could be considered safe to execute, does not
notice the partitions that would have been checked for parallel safety
when the plan was made.  Given that AcquireExecutorLocks() only loops
over PlannedStmt.rtable and locks exactly the relations found there,
partitions are not locked.  This means that a concurrent backend can,
for example, add an unsafe trigger to a partition before a parallel
worker inserts into it only to fail when it does.

Steps to reproduce (tried with v19 set of patches):

drop table if exists rp, foo;
create table rp (a int) partition by range (a);
create table rp1 partition of rp for values from (minvalue) to (0);
create table rp2 partition of rp for values from (0) to (maxvalue);
create table foo (a) as select generate_series(1, 1000000);
set plan_cache_mode to force_generic_plan;
set enable_parallel_dml to on;
deallocate all;
prepare q as insert into rp select * from foo where a%2 = 0;
explain analyze execute q;

At this point, attach a debugger to this backend and set a breakpoint
in AcquireExecutorLocks() and execute q again:

-- will execute the cached plan
explain analyze execute q;

Breakpoint will be hit.  Continue till return from the function and
stop. Start another backend and execute this:

-- will go through, because no lock still taken on the partition
create or replace function make_table () returns trigger language
plpgsql as $$ begin create table bar(); return null; end; $$ parallel
unsafe;
create trigger ai_rp2 after insert on rp2 for each row execute
function make_table();

Back to the debugger, hit continue to let the plan execute.  You
should see this error:

ERROR:  cannot start commands during a parallel operation
CONTEXT:  SQL statement "create table bar()"
PL/pgSQL function make_table() line 1 at SQL statement parallel worker

The attached patch fixes this, although I am starting to have second
thoughts about how we're tracking partitions in this patch.  Wondering
if we should bite the bullet and add partitions into the main range
table instead of tracking them separately in partitionOids, which
might result in a cleaner patch overall.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

AcquireExecutorLocks-partition-insert.patch
Description: Binary data

Re: Parallel INSERT (INTO ... SELECT ...)

Reply via email to