Hi,

On 2021-10-24 17:10:55 -0400, Tom Lane wrote:
> 0004 is not committable as-is, because it assumes that the source
> server has single-array unnest(), which is not true before 8.4.
> We could fix that by using "oid = ANY(array-constant)" conditions
> instead, but I'm unsure about the performance properties of that
> for large OID arrays on those old server versions.

It doesn't seem bad at all. On an 8.3 assert build:

CREATE TABLE foo(oid oid primary key);
INSERT INTO foo SELECT generate_series(1, 1000000);
postgres[1164129][1]=# explain ANALYZE SELECT count(*) FROM foo WHERE oid = ANY(ARRAY(SELECT generate_series(1100000, 1, -1)));
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                             QUERY PLAN                                                              │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=81.54..81.55 rows=1 width=0) (actual time=2433.656..2433.656 rows=1 loops=1)                                       │
│   InitPlan                                                                                                                          │
│     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.004..149.425 rows=1100000 loops=1)                                  │
│   ->  Bitmap Heap Scan on foo  (cost=42.70..81.50 rows=10 width=0) (actual time=2275.137..2369.478 rows=1000000 loops=1)            │
│         Recheck Cond: (oid = ANY (($0)::oid[]))                                                                                     │
│         ->  Bitmap Index Scan on foo_pkey  (cost=0.00..42.69 rows=10 width=0) (actual time=2274.077..2274.077 rows=1000000 loops=1) │
│               Index Cond: (oid = ANY (($0)::oid[]))                                                                                 │
│ Total runtime: 2436.201 ms                                                                                                          │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)

Time: 2437.568 ms (00:02.438)
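
For reference, the two filter shapes being compared look roughly like this
(a sketch with made-up OID values; not the patch's exact query text):

```
-- On 8.4 and later, batched lookups can join against single-array unnest():
SELECT p.oid, p.proname
FROM pg_catalog.pg_proc p
JOIN unnest('{1255,1256}'::pg_catalog.oid[]) AS u(oid) ON p.oid = u.oid;

-- The pre-8.4 fallback discussed above uses an = ANY(array-constant)
-- condition instead:
SELECT p.oid, p.proname
FROM pg_catalog.pg_proc p
WHERE p.oid = ANY ('{1255,1256}'::pg_catalog.oid[]);
```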


> Lastly, 0006 implements the other idea we'd discussed in the other
> thread: for queries that are issued repetitively but not within a
> single pg_dump function invocation, use PREPARE/EXECUTE to cut down
> the overhead.  This gets only diminishing returns after 0004, but
> it still brings "pg_dump -s regression" down from 0.39s to 0.33s,
> so maybe it's worth doing.

I think it's worth doing. There are things the batch approach won't help
with, and even if it doesn't help much with the regression test database, I'd
expect it to help plenty with other cases.

A test database I had around with lots of functions got drastically faster to
dump (7.4s to 2.5s), even though the number of queries didn't change
significantly. According to pg_stat_statements, plan_time for the dumpFunc
query went from 2352ms to 0.4ms; interestingly, execution time nearly halved
as well.
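
Those numbers can be pulled out with something like the following (requires
the pg_stat_statements extension, and plan times are only collected with
pg_stat_statements.track_planning = on):

```
SELECT query, calls, total_plan_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_plan_time DESC
LIMIT 5;
```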


> I stopped after caching the plans for
> functions/aggregates/operators/types, though.  The remaining sorts
> of objects aren't likely to appear in typical databases enough times
> to be worth worrying over.  (This patch will be a net loss if there
> are more than zero but less than perhaps 10 instances of an object type,
> so there's definitely reasons beyond laziness for not doing more.)

Seems reasonable.


> @@ -7340,25 +7340,37 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo)
>                               i_consrc;
>       int                     ntups;
>  
> -     query = createPQExpBuffer();
> +     static bool query_prepared = false;
>  
> -     if (fout->remoteVersion >= 90100)
> -             appendPQExpBuffer(query, "SELECT tableoid, oid, conname, "
> -                                               "pg_catalog.pg_get_constraintdef(oid) AS consrc, "
> -                                               "convalidated "
> -                                               "FROM pg_catalog.pg_constraint "
> -                                               "WHERE contypid = '%u'::pg_catalog.oid "
> -                                               "ORDER BY conname",
> -                                               tyinfo->dobj.catId.oid);
> +     if (!query_prepared)
> +     {

I wonder if it'd be better to store this in Archive or such. The approach with
static variables might run into problems with parallel pg_dump at some
point. These objects aren't dumped in parallel yet, but still...

Greetings,

Andres Freund
