Re: Performance degradation of REFRESH MATERIALIZED VIEW

Justin Pryzby Tue, 27 Apr 2021 07:04:13 -0700

On Tue, Apr 27, 2021 at 03:43:07PM +0200, Tomas Vondra wrote:
> On 4/27/21 7:34 AM, Masahiko Sawada wrote:
> > On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <[email protected]> wrote:
> > > On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> > > > On 4/26/21 9:27 PM, Andres Freund wrote:
> > > > > On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > > > > > I'm not sure what to do about this :-( I don't have any ideas about 
> > > > > > how to
> > > > > > eliminate this overhead, so the only option I see is reverting the 
> > > > > > changes
> > > > > > in heap_insert. Unfortunately, that'd mean inserts into TOAST 
> > > > > > tables won't
> > > > > > be frozen ...
> > > > > 
> > > > > ISTM that the fundamental issue here is not that we acquire pins that 
> > > > > we
> > > > > shouldn't, but that we do so at a much higher frequency than needed.
> > > > > 
> > > > > It's probably too invasive for 14, but I think it might be worth 
> > > > > exploring
> > > > > passing down a BulkInsertState in nodeModifyTable.c's 
> > > > > table_tuple_insert() iff
> > > > > the input will be more than one row.
> > > > > 
> > > > > And then add the vm buffer of the target page to BulkInsertState, so 
> > > > > that
> > > > > hio.c can avoid re-pinning the buffer.
> > > > > 
> > > > 
> > > > Yeah. The question still is what to do about 14, though. Shall we leave 
> > > > the
> > > > code as it is now, or should we change it somehow? It seem a bit 
> > > > unfortunate
> > > > that a COPY FREEZE optimization should negatively influence other (more)
> > > > common use cases, so I guess we can't just keep the current code ...
> > > 
> > > I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
> > > and see whether that fixes the regression.
> 
> What Andres is suggesting (I think) is to modify ExecInsert() to pass a
> valid bistate to table_tuple_insert, instead of just NULL, and store the
> vmbuffer in it. Not sure how to identify when inserting more than just a
> single row, though ...


Maybe this is relevant.
https://commitfest.postgresql.org/33/2553/
| INSERT SELECT: BulkInsertState and table_multi_insert

The biistate part is small - Simon requested to also use table_multi_insert,
which makes the patch much bigger, and there's probably lots of restrictions I
haven't even thought to check.

This uses a GUC threshold for bulk insert, but I'm still not sure it's really
problematic to use a biistate for a single row.

        /* Use bulk insert after a threshold number of tuples */
        // XXX: maybe this should only be done if it's not a partitioned table 
or
        // if the partitions don't support miinfo, which uses its own bistates
        mtstate->ntuples++;
        if (mtstate->bistate == NULL &&
                        mtstate->ntuples > bulk_insert_ntuples &&
                        bulk_insert_ntuples >= 0)
        {
                elog(DEBUG1, "enabling bulk insert");
                mtstate->bistate = GetBulkInsertState();

-- 
Justin

Re: Performance degradation of REFRESH MATERIALIZED VIEW

Reply via email to