On Wed, Sep 13, 2017 at 12:39 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Tue, Sep 12, 2017 at 3:46 AM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> In this case, AcquireExecutorLocks will lock all the relations in >> PlannedStmt.rtable, which must include all partitioned tables of all >> partition trees involved in the query. Of those, it will lock the tables >> whose RT indexes appear in PlannedStmt.nonleafResultRelations with >> RowExclusiveLock mode. PlannedStmt.nonleafResultRelations is a global >> list of all partitioned table RT indexes obtained by concatenating >> partitioned_rels lists of all ModifyTable nodes involved in the query >> (set_plan_refs does that). We need to distinguish nonleafResultRelations, >> because we need to take the stronger lock on a given table before any >> weaker one if it happens to appear in the query as a non-result relation >> too, to avoid lock strength upgrade deadlock hazard. > > Hmm. The problem with this theory in my view is that it doesn't > explain why InitPlan() and ExecOpenScanRelation() lock the relations > instead of just assuming that they are already locked either by > AcquireExecutorLocks or by planning. If ExecLockNonLeafAppendTables() > doesn't really need to take locks, then ExecOpenScanRelation() must > not need to do it either. We invented ExecLockNonLeafAppendTables() > on the occasion of removing the scans of those tables which would > previously have caused ExecOpenScanRelation() to be invoked, so as to > keep the locking behavior unchanged. > > AcquireExecutorLocks() looks like an odd bit of code to me. The > executor itself locks result tables in InitPlan() and then everything > else during InitPlan() and all of the others later on while walking > the plan tree -- comments in InitPlan() say that this is to avoid a > lock upgrade hazard if a result rel is also a source rel. But > AcquireExecutorLocks() has no such provision; it just locks everything > in RTE order. In theory, that's a deadlock hazard of another kind, as > we just talked about in the context of EIBO. In fact, expanding in > bound order has made the situation worse: before, expansion order and > locking order were the same, so maybe having AcquireExecutorLocks() > work in RTE order coincidentally happened to give the same result as > the executor code itself as long as there are no result relations. > But this is certainly not true any more. I'm not sure it's worth > expending a lot of time on this -- it's evidently not a problem in > practice, or somebody probably would've complained before now. > > But that having been said, I don't think we should assume that all the > locks taken from the executor are worthless because plancache.c will > always do the job for us. I don't know of a case where we execute a > saved plan without going through the plan cache, but that doesn't mean > that there isn't one or that there couldn't be one in the future. > It's not the job of these partitioning patches to whack around the way > we do locking in general -- they should preserve the existing behavior > as much as possible. If we want to get rid of the locking in the > executor altogether, that's a separate discussion where, I have a > feeling, there will prove to be better reasons for the way things are > than we are right now supposing. >
I agree that it's not the job of these patches to change the locking or even get rid of partitioned_rels. In order to continue returning partitioned_rels in Append paths esp. in the case of queries involving set operations and partitioned table e.g "select 1 from t1 union all select 2 from t1;" in which t1 is multi-level partitioned table, we need a fix in add_paths_to_append_rels(). The fix provided in [1] is correct but we will need a longer explanation of why we have to involve RTE_SUBQUERY with RELKIND_PARTITIONED_TABLE. The explanation is complicated. If we get rid of partitioned_rels, we don't need to fix that code in add_paths_to_append_rel(). I suggested that [2] -- (excerpt from [2]) Actually, the original problem that caused this discussion started with an assertion failure in get_partitioned_child_rels() as Assert(list_length(result) >= 1); This assertion fails if result is NIL when an intermediate partitioned table is passed. May be we should assert (result == NIL || list_length(result) == 1) and allow that function to be called even for intermediate partitioned partitions for which the function will return NIL. That will leave the code in add_paths_to_append_rel() simple. Thoughts? -- Amit Langote agrees with this. It kind of makes the assertion lame but keeps the code sane. What do you think? [1] https://www.postgresql.org/message-id/d2f1cdcb-ebb4-76c5-e471-79348ca5d...@lab.ntt.co.jp [2] https://www.postgresql.org/message-id/CAFjFpRfJ3GRRmmOugaMA-q4i=se5p6yjz_c6a6hdrdqqtgx...@mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers