On Sat, Nov 1, 2014 at 9:09 PM, Robert Haas <robertmh...@gmail.com> wrote:
> 1. Any non-trivial piece of PostgreSQL code is likely to contain
> syscache lookups.
> 2. Syscache lookups had better work in parallel workers, or they'll be
> all but useless.

I've been using parallel sorts and index builds as my mental model of how this will be used. I note that sorts go out of their way to look up all the syscache entries in advance, precisely so that tuplesort doesn't start doing catalog lookups in the middle of the sort. In general I think what people are imagining is that the parallel workers will be running low-level code like tuplesort that has all the databasey stuff like catalog lookups done in advance and just operates on C data structures like function pointers. And I think that's a valuable coding discipline to enforce: it avoids having low-level infrastructure call up into higher-level abstractions, which quickly becomes hard to reason about.

However, in practice I think you're actually right, but not for the reasons you've been saying. I think the parallel workers *should* be written as low-level infrastructure and should not directly do syscache lookups, tuple locking, etc. However, there are a million ways in which Postgres is extensible, and they cause loops in the call graph that aren't apparent in the direct code structure. For instance, what happens if the index you're building is an expression index or partial index? Worse, what happens if those expressions call a plpython function that runs queries using SPI...

But those cases, where user code exploits extensibility, are the situations where we need a deadlock detector and where you might need this infrastructure. We wouldn't and shouldn't need a deadlock detector for our own core server code.

Ideally there would be some sort of compromise: enforce careful locking rules in the core code, where all locks are acquired in advance and parallel workers are prohibited from obtaining locks, while still allowing users a free-for-all and detecting deadlocks at runtime for them. But I'm not sure there's any real middle ground here.

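To make the "look everything up in advance" discipline concrete, here is a rough sketch in plain C. It is not PostgreSQL code: lookup_comparator(), SortPlan, prepare_sort() and worker_sort() are all hypothetical names invented for illustration, and the "catalog" is just a strcmp on a type name. The point is only that the leader resolves the comparator once, and the worker then touches nothing but a struct holding a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Comparator signature, same shape qsort() expects. */
typedef int (*cmp_fn)(const void *, const void *);

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Counts how often the pretend catalog is consulted. */
static int catalog_lookups = 0;

/*
 * Hypothetical stand-in for a syscache lookup: maps a type name to a
 * comparator.  In this sketch only the leader is supposed to call it.
 */
static cmp_fn lookup_comparator(const char *type_name)
{
    catalog_lookups++;
    if (strcmp(type_name, "int4") == 0)
        return cmp_int;
    return NULL;
}

/* Everything a worker needs, resolved ahead of time (hypothetical). */
typedef struct SortPlan
{
    cmp_fn cmp;
    size_t elem_size;
} SortPlan;

/* Leader side: do all "catalog" work up front.  Returns 1 on success. */
static int prepare_sort(SortPlan *plan, const char *type_name)
{
    plan->cmp = lookup_comparator(type_name);
    plan->elem_size = sizeof(int);
    return plan->cmp != NULL;
}

/* Worker side: no lookups, just C data and a function pointer. */
static void worker_sort(const SortPlan *plan, void *base, size_t n)
{
    assert(plan->cmp != NULL);
    qsort(base, n, plan->elem_size, plan->cmp);
}
```

A caller would run prepare_sort() once in the leader and hand the filled-in SortPlan to each worker_sort() call. The loop in the call graph shows up as soon as plan->cmp is something user-supplied (an expression index, a plpython function) that goes back to the catalog behind the worker's back, which nothing in this structure can prevent.
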
--
greg