On 04/12/2017 05:00 PM, Andreas Karlsson wrote:
Looked at this an option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all.
Here is my proof of concept patch. It does basically the same thing as Andres's patch except that it handles quoted values a bit better and does not try to support anything other than the regproc type.
The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work on compilation.
Two things which could be improved in this patch if people deem it important:
- Refactor the code to be more generic, like Andres patch, so it is easier to add lookups for other data types.
- Write something closer to a real .bki parser to actually understand quoted values and _null_ when parsing the data. Right now this patch only splits each row into its fields (while being aware of quotes) but does not attempt to remove quotes. The PoC patch treats "foo" and foo as different.
Andreas
commit bc8ede661b12407c0b6a9e4c0932d21028f87fc1 Author: Andreas Karlsson <andr...@proxel.se> Date: Wed Apr 12 21:00:49 2017 +0200 WIP: Resolve regproc entires when generating .bki diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index 6e9d57aa8d..f918c9ef8a 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -102,6 +102,7 @@ print $bki "# PostgreSQL $major_version\n"; # vars to hold data needed for schemapg.h my %schemapg_entries; my @tables_needing_macros; +my %procs; our @types; # produce output, one catalog at a time @@ -164,20 +165,41 @@ foreach my $catname (@{ $catalogs->{names} }) $row->{bki_values} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g; $row->{bki_values} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g; + # Split values into tokens without interpreting their meaning. + my %bki_values; + @bki_values{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g; + + # Substitute regproc entires with oids + foreach my $att (keys %bki_values) + { + next if $bki_attr{$att}->{type} ne 'regproc'; + next if $bki_values{$att} =~ /\A(\d+|-|_null_)\z/; + + $bki_values{$att} = $procs{$bki_values{$att}}; + } + + # Save pg_proc oids for use by later catalogs. This relies on the + # order we process the files, but the bootstrap code also relies on + # pg_proc being loaded first. + if ($catname eq 'pg_proc') + { + $procs{$bki_values{proname}} = $row->{oid}; + } + # Save pg_type info for pg_attribute processing below if ($catname eq 'pg_type') { - my %type; + my %type = %bki_values; $type{oid} = $row->{oid}; - @type{@attnames} = split /\s+/, $row->{bki_values}; push @types, \%type; } # Write to postgres.bki my $oid = $row->{oid} ? "OID = $row->{oid} " : ''; - printf $bki "insert %s( %s)\n", $oid, $row->{bki_values}; + printf $bki "insert %s( %s)\n", $oid, + join(' ', @bki_values{@attnames}); - # Write comments to postgres.description and postgres.shdescription + # Write comments to postgres.description and postgres.shdescription if (defined $row->{descr}) { printf $descr "%s\t%s\t0\t%s\n", $row->{oid}, $catname, diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index 2af9b355e7..10d9c6abc7 100644 --- a/src/backend/utils/Gen_fmgrtab.pl +++ b/src/backend/utils/Gen_fmgrtab.pl @@ -58,16 +58,9 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} }) my $data = $catalogs->{pg_proc}->{data}; foreach my $row (@$data) { - - # To construct fmgroids.h and fmgrtab.c, we need to inspect some - # of the individual data fields. Just splitting on whitespace - # won't work, because some quoted fields might contain internal - # whitespace. We handle this by folding them all to a simple - # "xxx". Fortunately, this script doesn't need to look at any - # fields that might need quoting, so this simple hack is - # sufficient. - $row->{bki_values} =~ s/"[^"]*"/"xxx"/g; - @{$row}{@attnames} = split /\s+/, $row->{bki_values}; + # Split values into tokens without interpreting their meaning. + # This is safe becease we are going to use the values as-is. + @{$row}{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g; # Select out just the rows for internal-language procedures. # Note assumption here that INTERNALlanguageId is 12. diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c index 702924a958..a4237b0661 100644 --- a/src/backend/utils/adt/regproc.c +++ b/src/backend/utils/adt/regproc.c @@ -87,51 +87,11 @@ regprocin(PG_FUNCTION_ARGS) /* Else it's a name, possibly schema-qualified */ /* - * In bootstrap mode we assume the given name is not schema-qualified, and - * just search pg_proc for a unique match. This is needed for - * initializing other system catalogs (pg_namespace may not exist yet, and - * certainly there are no schemas other than pg_catalog). + * We should never get here in bootstrap mode, as all references should + * have been resolved by genbki.pl. */ if (IsBootstrapProcessingMode()) - { - int matches = 0; - Relation hdesc; - ScanKeyData skey[1]; - SysScanDesc sysscan; - HeapTuple tuple; - - ScanKeyInit(&skey[0], - Anum_pg_proc_proname, - BTEqualStrategyNumber, F_NAMEEQ, - CStringGetDatum(pro_name_or_oid)); - - hdesc = heap_open(ProcedureRelationId, AccessShareLock); - sysscan = systable_beginscan(hdesc, ProcedureNameArgsNspIndexId, true, - NULL, 1, skey); - - while (HeapTupleIsValid(tuple = systable_getnext(sysscan))) - { - result = (RegProcedure) HeapTupleGetOid(tuple); - if (++matches > 1) - break; - } - - systable_endscan(sysscan); - heap_close(hdesc, AccessShareLock); - - if (matches == 0) - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_FUNCTION), - errmsg("function \"%s\" does not exist", pro_name_or_oid))); - - else if (matches > 1) - ereport(ERROR, - (errcode(ERRCODE_AMBIGUOUS_FUNCTION), - errmsg("more than one function named \"%s\"", - pro_name_or_oid))); - - PG_RETURN_OID(result); - } + elog(ERROR, "regprocin with textual values is not supported in bootstrap mode"); /* * Normal case: parse the name into components and see if it matches any
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers