On Tue, Mar 16, 2021 at 2:41 PM Thomas Munro <thomas.mu...@gmail.com> wrote: > On Mon, Mar 15, 2021 at 8:25 PM Bharath Rupireddy > <bharath.rupireddyforpostg...@gmail.com> wrote: > > > > The problem with a case like REFRESH MATERIALIZED VIEW is that there's > > > > nothing to prevent something that gets run in the course of the query > > > > from trying to access the view (and the heavyweight lock won't prevent > > > > that, due to group locking). That's probably a stupid thing to do, > > > > but it can't be allowed to break the world. The other cases are safe > > > > from that particular problem because the table doesn't exist yet. > > > > Please correct me if my understanding of the above comment (from the > > commit e9baa5e9) is wrong - even if the leader opens the matview > > relation in exclusive mode, because of group locking(in case we allow > > parallel workers to feed in the data to the new heap that gets created > > for RMV, see ExecRefreshMatView->make_new_heap), can other sessions > > still access the matview relation with older data? > > > > I performed below testing to prove myself wrong for the above understanding: > > session 1: > > 1) added few rows to the table t1 on which the mv1 is defined; > > 2) refresh materialized view mv1; > > > > session 2: > > 1) select count(*) from mv1; ---> this query is blocked until > > session 1's step (2) is completed and gives the latest result even if > > the underlying data-generating query runs select part in parallel. > > > > Is there anything I'm missing here? > > I think he was talking about things like functions that try to access > the mv from inside the same query, in a worker. I haven't figured out > exactly which hazards he meant. I thought about wrong-relfilenode > hazards and combocid hazards, but considering the way this thing > always inserts into a fresh table before performing merge or swap > steps later, I don't yet see why this is different from any other > insert-from-select-with-gather.
I asked Robert if he had some hazard in mind that we haven't already discussed here when he wrote that, and didn't recall any. I think we're OK here. I added the "concurrently" variant to the regression test, just to get it exercised too. The documentation needed a small tweak where we have a list of data-writing commands that are allowed to use parallelism. That run of sentences was getting a bit tortured so I converted it into a bullet list; I hope I didn't upset the documentation style police. Pushed. Thanks for working on this! This is really going to fly with INSERT pushdown. The 3 merge queries used by CONCURRENTLY will take some more work.