Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Mar 20, 2019 at 9:24 PM Tom Lane wrote:
> Joel Jacobson writes:
> > I've seen a performance trick in other hash functions [1]
> > to instead read multiple bytes in each iteration,
> > and then handle the remaining bytes after the loop.
> > [1] https://github.com/wangyi-fudan/wyhash/blob/master/wyhash.h#L29
>
> I can't get very excited about this, seeing that we're only going to
> be hashing short strings.  I don't really believe your 30% number
> for short strings; and even if I did, there's no evidence that the
> hash functions are worth any further optimization in terms of our
> overall performance.

I went ahead and tested this approach anyway, since I need this
algorithm in a completely different project.

The benchmark below shows stats for three different keywords per
length, compiled with -O2:

$ c++ -O2 -std=c++14 -o bench_perfect_hash bench_perfect_hash.cc
$ ./bench_perfect_hash

keyword            length  char-a-time (ns)  word-a-time (ns)  diff (%)
as                      2              3.30              2.62     -0.21
at                      2              3.54              2.66     -0.25
by                      2              3.30              2.59     -0.22
add                     3              4.01              3.15     -0.21
all                     3              4.04              3.11     -0.23
and                     3              3.84              3.11     -0.19
also                    4              4.50              3.17     -0.30
both                    4              4.49              3.06     -0.32
call                    4              4.95              3.42     -0.31
abort                   5              6.09              4.02     -0.34
admin                   5              5.26              3.65     -0.31
after                   5              5.18              3.76     -0.27
access                  6              5.97              3.91     -0.34
action                  6              5.86              3.89     -0.34
always                  6              6.10              3.77     -0.38
analyse                 7              6.67              4.64     -0.30
analyze                 7              7.09              4.87     -0.31
between                 7              7.02              4.66     -0.34
absolute                8              7.49              3.82     -0.49
backward                8              7.13              3.88     -0.46
cascaded                8              7.23              4.17     -0.42
aggregate               9              8.04              4.49     -0.44
assertion               9              7.98              4.52     -0.43
attribute               9              8.03              4.44     -0.45
assignment             10              8.58              4.67     -0.46
asymmetric             10              9.07              4.57     -0.50
checkpoint             10              9.15              4.53     -0.51
constraints            11              9.58              5.14     -0.46
insensitive            11              9.62              5.30     -0.45
publication            11             10.30              5.60     -0.46
concurrently           12             10.36              4.81     -0.54
current_date           12             11.17              5.48     -0.51
current_role           12             11.15              5.10     -0.54
authorization          13             11.87              5.50     -0.54
configuration          13             11.50              5.51     -0.52
xmlattributes          13             11.72              5.66     -0.52
current_schema         14             12.17              5.58     -0.54
localtimestamp         14             11.78              5.46     -0.54
characteristics        15             12.77              5.97     -0.53
current_catalog        15             12.65              5.87     -0.54
current_timestamp      17             14.19              6.12     -0.57

> Also, as best I can tell, the approach you propose would result
> in an endianness dependence, meaning we'd have to have separate
> lookup tables for BE and LE machines.  That's not a dealbreaker
> perhaps, but it is certainly another point on the "it's not worth it"
> side of the argument.

I can see how the same problem has been worked around in e.g. pg_crc32.h:

#ifdef WORDS_BIGENDIAN
#define FIN_CRC32C(crc) ((crc) = pg_bswap32(crc) ^ 0xFFFFFFFF)
#else
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#endif

So I used the same trick in PerfectHash.pm:

$f .= sprintf "#ifdef WORDS_BIGENDIAN\n";
$f .= sprintf "\t\tc4 = pg_bswap32(c4);\n";
$f .= sprintf "#endif\n";

I've also tried to measure the overall effect by hacking postgres.c:

+	struct timespec start, stop;
+	clock_gettime( CLOCK_REALTIME, &start);
+	for (int i=0; i<10; i++)
+	{
+		List	   *parsetree_list2;
+		MemoryContext oldcontext2;
+
+		oldcontext2 = MemoryContextSwitchTo(MessageContext);
+		parsetree_list2 = pg_parse_query(query_string);
+		MemoryContextSwitchTo(oldcontext2);
+//		MemoryContextReset(MessageContext);
+		CHECK_F
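The pg_bswap32 trick above can be sketched as a standalone C function. This is an illustrative sketch, not the generated PerfectHash.pm output: bswap32 stands in for pg_bswap32, and the runtime endianness probe stands in for the configure-time WORDS_BIGENDIAN macro.

```c
#include <stdint.h>
#include <string.h>

/* Portable byte swap, standing in for pg_bswap32. */
static uint32_t
bswap32(uint32_t x)
{
	return ((x & 0x000000FF) << 24) |
		   ((x & 0x0000FF00) << 8) |
		   ((x & 0x00FF0000) >> 8) |
		   ((x & 0xFF000000) >> 24);
}

/* Runtime endianness probe, standing in for WORDS_BIGENDIAN. */
static int
is_big_endian(void)
{
	const uint32_t one = 1;
	unsigned char first;

	memcpy(&first, &one, 1);
	return first == 0;
}

/*
 * Load 4 bytes and normalize to the little-endian interpretation, so that
 * BE and LE machines feed identical values into the hash and can share a
 * single lookup table.
 */
static uint32_t
load_word_le(const char *k)
{
	uint32_t	v;

	memcpy(&v, k, 4);
	if (is_big_endian())
		v = bswap32(v);
	return v;
}
```

Either way, the loaded word is the same on both architectures, which is what lets the generated table be endianness-independent.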
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Joel Jacobson writes:
> I've seen a performance trick in other hash functions [1]
> to instead read multiple bytes in each iteration,
> and then handle the remaining bytes after the loop.
> [1] https://github.com/wangyi-fudan/wyhash/blob/master/wyhash.h#L29

I can't get very excited about this, seeing that we're only going to
be hashing short strings.  I don't really believe your 30% number
for short strings; and even if I did, there's no evidence that the
hash functions are worth any further optimization in terms of our
overall performance.

Also, as best I can tell, the approach you propose would result
in an endianness dependence, meaning we'd have to have separate
lookup tables for BE and LE machines.  That's not a dealbreaker
perhaps, but it is certainly another point on the "it's not worth it"
side of the argument.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Many thanks for working on this, amazing work, and really nice that you
made it a separate reusable Perl module.

The generated hash function reads one character at a time.  I've seen a
performance trick in other hash functions [1] to instead read multiple
bytes in each iteration, and then handle the remaining bytes after the
loop.

[1] https://github.com/wangyi-fudan/wyhash/blob/master/wyhash.h#L29

I've done some testing and it looks like a ~30% speed-up of the
generated ScanKeywords_hash_func() function would be possible.

If you think this approach is promising, I would be happy to prepare a
patch for it, but I wanted to check with the project that this idea has
not already been considered and ruled out for some technical reason
I've failed to see.  Is there any?

For this to work you would need to use larger constants for $hash_mult1
and $hash_mult2 though.  I've successfully used these values:

$hash_mult1: 0x2c1b3c6d
$hash_mult2: (0x297a2d39, 0x85ebca6b, 0xc2b2ae35, 0x7feb352d, 0x846ca68b)

Here is the idea.  Generated C code:

	for (; keylen >= 4; keylen -= 4, k += 4)
	{
		uint32_t	v;

		memcpy(&v, k, 4);
		v |= 0x20202020;
		a = a * 739982445 + v;
		b = b * 2246822507 + v;
	}

	uint32_t	v = 0;

	switch (keylen)
	{
		case 3:
			memcpy(&v, k, 3);
			v |= 0x202020;
			break;
		case 2:
			memcpy(&v, k, 2);
			v |= 0x2020;
			break;
		case 1:
			memcpy(&v, k, 1);
			v |= 0x20;
			break;
	}
	a = a * 739982445 + v;
	b = b * 2246822507 + v;
	return h[a % 883] + h[b % 883];

(Reading 8 bytes at a time instead would perhaps be a win, since some
keywords are quite long.)
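As a self-contained illustration, a runnable version of the word-at-a-time loop above might look like the following. The multipliers, the collapsed tail handling, and the final combination are illustrative choices only, not the output of an actual PerfectHash.pm run, and the byte layout assumed by the tail masks is little-endian (the endianness issue discussed in this thread).

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/*
 * Word-at-a-time hash sketch: read 4 bytes per iteration with memcpy
 * (safe with respect to alignment), OR in 0x20 per byte for ASCII case
 * folding, then fold the 0-3 trailing bytes in one final step.
 */
static uint32_t
hash_word_at_a_time(const char *k, size_t keylen)
{
	uint32_t	a = 0;
	uint32_t	b = 0;
	uint32_t	v;

	for (; keylen >= 4; keylen -= 4, k += 4)
	{
		memcpy(&v, k, 4);
		v |= 0x20202020;		/* fold 'A'-'Z' to 'a'-'z' in each byte */
		a = a * 0x2c1b3c6d + v;
		b = b * 0x85ebca6b + v;
	}

	/* Tail: keylen is now 0..3; unread bytes of v stay zero. */
	v = 0;
	memcpy(&v, k, keylen);
	if (keylen > 0)
		v |= 0x20202020 >> (8 * (4 - keylen));	/* mask assumes LE layout */
	a = a * 0x2c1b3c6d + v;
	b = b * 0x85ebca6b + v;

	/* A real perfect hash would instead return h[a % N] + h[b % N]. */
	return a + b;
}
```

Because every byte is OR-ed with 0x20, upper- and lower-case spellings of a keyword hash identically, matching the case-folding behavior of the generated char-at-a-time function.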
Perl code:

sub _calc_hash
{
	my ($key, $mult, $seed) = @_;

	my $result = $seed;
	my $i      = 0;
	my $keylen = length($key);
	for (; $keylen >= 4; $keylen -= 4, $i += 4)
	{
		my $cn =
		  (ord(substr($key, $i + 0, 1)) << 0) |
		  (ord(substr($key, $i + 1, 1)) << 8) |
		  (ord(substr($key, $i + 2, 1)) << 16) |
		  (ord(substr($key, $i + 3, 1)) << 24);
		$cn |= 0x20202020 if $case_fold;
		$result = ($result * $mult + $cn) % 4294967296;
	}

	my $cn = 0;
	if ($keylen == 3)
	{
		$cn =
		  (ord(substr($key, $i + 0, 1)) << 0) |
		  (ord(substr($key, $i + 1, 1)) << 8) |
		  (ord(substr($key, $i + 2, 1)) << 16);
		$cn |= 0x202020 if $case_fold;
	}
	elsif ($keylen == 2)
	{
		$cn =
		  (ord(substr($key, $i + 0, 1)) << 0) |
		  (ord(substr($key, $i + 1, 1)) << 8);
		$cn |= 0x2020 if $case_fold;
	}
	elsif ($keylen == 1)
	{
		$cn = (ord(substr($key, $i + 0, 1)) << 0);
		$cn |= 0x20 if $case_fold;
	}
	$result = ($result * $mult + $cn) % 4294967296;
	return $result;
}
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On Wed, Jan 9, 2019 at 5:33 PM Tom Lane wrote:
>> really need here.  We could go with "[no-]case-insensitive", perhaps.
>> Or "[no-]case-fold", which is at least a little shorter and less
>> double-negative-y.

> I'd be in favor of --[no-]case-fold.

Yeah, I like that better too; I've been having to stop and think every
time as to which direction is which with the [in]sensitive terminology.
I'll make it "case-fold" throughout.

> This comment in PerfectHash.pm:
>	# (It does change '_', else we could just skip adjusting
>	# $cn here at all, for typical keyword strings.)
> ...seems a bit out of place in the module, because of its reference to
> keywords, of interest right now to its only caller.  Maybe a bit of
> context here.  (I also don't quite understand why we could
> hypothetically skip the adjustment.)

Were it not for the underscore case, we could plausibly assume that the
supplied keywords are already all-lower-case and don't need any further
folding.  But I agree that this comment is probably more confusing than
helpful; it's easier just to see that the code is applying the same
transform as the runtime lookup will do.

> Lastly, the keyword headers still have a dire warning about ASCII
> order and binary search.  Those could be softened to match the comment
> in gen_keywordlist.pl.

Agreed, will do.  Thanks for reviewing!

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Jan 9, 2019 at 5:33 PM Tom Lane wrote:
> really need here.  We could go with "[no-]case-insensitive", perhaps.
> Or "[no-]case-fold", which is at least a little shorter and less
> double-negative-y.

I'd be in favor of --[no-]case-fold.

On Tue, Jan 8, 2019 at 5:53 PM Tom Lane wrote:
> I improved the comment about how come the hash table entry assignment
> works.

I've gone over the algorithm in more detail and I don't see any nicer
way to write it.

This comment in PerfectHash.pm:

	# (It does change '_', else we could just skip adjusting
	# $cn here at all, for typical keyword strings.)

...seems a bit out of place in the module, because of its reference to
keywords, of interest right now to its only caller.  Maybe a bit of
context here.  (I also don't quite understand why we could
hypothetically skip the adjustment.)

Lastly, the keyword headers still have a dire warning about ASCII
order and binary search.  Those could be softened to match the comment
in gen_keywordlist.pl.

-- 
John Naylor  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On Tue, Jan 8, 2019 at 5:31 PM Tom Lane wrote:
>> The length macro was readily available there so I used it.  AFAIR
>> that wasn't true elsewhere, though I might've missed something.
>> It's pretty much just belt-and-suspenders coding anyway, since all
>> those arrays are machine generated ...

> I tried using the available num_keywords macro in plpgsql and it
> worked fine, but it makes the lines really long.  Alternatively, as in
> the attached, we could remove the single use of the core macro and
> maybe add comments to the generated magic numbers.

Meh, I'm not excited about removing the option just because there's
only one use of it now.  There might be more-compelling uses later.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On Wed, Jan 9, 2019 at 2:04 PM Tom Lane wrote:
>> Now the impedance mismatch about case sensitivity is handled with
>>
>> my $f = PerfectHash::generate_hash_function(\@keywords, $funcname,
>>     case_insensitive => !$case_sensitive);
>>
>> which is at least a little clearer than before, though I'm not sure
>> if it entirely solves the problem.

> It's a bit clearer, but thinking about this some more, it makes sense
> for gen_keywordlist.pl to use $case_insensitive, because right now
> every instance of the var is "!$case_sensitive".  In the attached (on
> top of v4), I change the command line option to --citext, and add the
> ability to negate it within the option, as '--no-citext'.  It's kind of
> a double negative for the C-keywords invocation, but we can have the
> option for both cases, so we don't need to worry about what the
> default is (which is case_insensitive=1).

Ah, I didn't realize that Getopt allows having a boolean option
defaulting to "on".  That makes it more practical to do something here.

I'm not in love with "citext" as the option name, though ... that has
little to recommend it except brevity, which is not a property we
really need here.  We could go with "[no-]case-insensitive", perhaps.
Or "[no-]case-fold", which is at least a little shorter and less
double-negative-y.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On Wed, Jan 9, 2019 at 2:44 PM Tom Lane wrote:
>> [patch to shrink oid index]

> It would help maintain its newfound svelteness if we warned when a
> higher oid was assigned, as in the attached.  I used 6200 as a soft
> limit, but that could be anything similar.

I think the reason we have this issue is that people tend to use high
OIDs during development of a patch, so that their elbows won't be
joggled by unrelated changes.  Then sometimes they forget to renumber
them down before committing.

A warning like this would lead to lots of noise during the development
stage, which nobody would thank us for.  If we could find a way to
notice this only when we were about to commit, it'd be good ... but I
don't have an idea about a nice way to do that.  (No, I don't want a
commit hook on gitmaster; that's warning too late, which is not better.)

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Jan 9, 2019 at 2:44 PM Tom Lane wrote:
> [patch to shrink oid index]

It would help maintain its newfound svelteness if we warned when a
higher oid was assigned, as in the attached.  I used 6200 as a soft
limit, but that could be anything similar.

-- 
John Naylor  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index e7ead7dcc0..201343c2f6 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -74,6 +74,7 @@ my %catalog_data;
 my @toast_decls;
 my @index_decls;
 my %oidcounts;
+my $max_proc_oid = 6200;
 
 foreach my $header (@input_files)
 {
@@ -230,6 +231,13 @@ foreach my $row (@{ $catalog_data{pg_opfamily} })
 my %procoids;
 foreach my $row (@{ $catalog_data{pg_proc} })
 {
+	# Warn if a large OID has been newly assigned.
+	if ($row->{oid} > $max_proc_oid)
+	{
+		warn sprintf "Consider using an OID smaller than %d for %s (see commit 8ff5f824dca75)\n",
+		  $max_proc_oid, $row->{proname};
+	}
+
 	# Generate an entry under just the proname (corresponds to regproc lookup)
 	my $prokey = $row->{proname};
 	if (defined $procoids{$prokey})
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Tue, Jan 8, 2019 at 5:31 PM Tom Lane wrote:
> John Naylor writes:
>> In the committed keyword patch, I noticed that in common/keywords.c,
>> the array length is defined with
>> ScanKeywordCategories[SCANKEYWORDS_NUM_KEYWORDS]
>> but other keyword arrays just have ...[].  Is there a reason for the
>> difference?
>
> The length macro was readily available there so I used it.  AFAIR
> that wasn't true elsewhere, though I might've missed something.
> It's pretty much just belt-and-suspenders coding anyway, since all
> those arrays are machine generated ...

I tried using the available num_keywords macro in plpgsql and it
worked fine, but it makes the lines really long.  Alternatively, as in
the attached, we could remove the single use of the core macro and
maybe add comments to the generated magic numbers.

-- 
John Naylor  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/common/keywords.c b/src/common/keywords.c
index 84f779feb9..edec4a5094 100644
--- a/src/common/keywords.c
+++ b/src/common/keywords.c
@@ -26,7 +26,7 @@
 
 #define PG_KEYWORD(kwname, value, category) category,
 
-const uint8 ScanKeywordCategories[SCANKEYWORDS_NUM_KEYWORDS] = {
+const uint8 ScanKeywordCategories[] = {
 #include "parser/kwlist.h"
 };
 
diff --git a/src/tools/gen_keywordlist.pl b/src/tools/gen_keywordlist.pl
index 927511d7c4..8c4c5da828 100644
--- a/src/tools/gen_keywordlist.pl
+++ b/src/tools/gen_keywordlist.pl
@@ -147,11 +147,6 @@ foreach my $name (@keywords)
 
 print $kwdef "};\n\n";
 
-# Emit a macro defining the number of keywords.
-# (In some places it's useful to have access to that as a constant.)
-
-printf $kwdef "#define %s_NUM_KEYWORDS %d\n\n", uc $varname, scalar @keywords;
-
 # Emit the definition of the hash function.
 
 my $funcname = $varname . "_hash_func";
@@ -168,8 +163,8 @@ printf $kwdef "const ScanKeywordList %s = {\n", $varname;
 printf $kwdef qq|\t%s_kw_string,\n|, $varname;
 printf $kwdef qq|\t%s_kw_offsets,\n|, $varname;
 printf $kwdef qq|\t%s,\n|, $funcname;
-printf $kwdef qq|\t%s_NUM_KEYWORDS,\n|, uc $varname;
-printf $kwdef qq|\t%d\n|, $max_len;
+printf $kwdef qq|\t%d,\t/* number of keywords */\n|, scalar @keywords;
+printf $kwdef qq|\t%d\t/* max keyword length */\n|, $max_len;
 printf $kwdef "};\n\n";
 
 printf $kwdef "#endif\t\t\t\t\t\t\t/* %s_H */\n", uc $base_filename;
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Jan 9, 2019 at 2:04 PM Tom Lane wrote:
> I wrote:
>> John Naylor writes:
>>> -There is a bit of a cognitive clash between $case_sensitive in
>>> gen_keywordlist.pl and $case_insensitive in PerfectHash.pm. They each
>>> make sense in their own file, but might it be worth using one or the
>>> other?
>
> Working on the fmgr-oid-lookup idea gave me the thought that
> PerfectHash.pm ought to support fixed-length keys.  Rather than start
> adding random parameters to the function, I borrowed an idea from
> PostgresNode.pm and made the options be keyword-style parameters.  Now
> the impedance mismatch about case sensitivity is handled with
>
> my $f = PerfectHash::generate_hash_function(\@keywords, $funcname,
>     case_insensitive => !$case_sensitive);
>
> which is at least a little clearer than before, though I'm not sure
> if it entirely solves the problem.

It's a bit clearer, but thinking about this some more, it makes sense
for gen_keywordlist.pl to use $case_insensitive, because right now
every instance of the var is "!$case_sensitive".  In the attached (on
top of v4), I change the command line option to --citext, and add the
ability to negate it within the option, as '--no-citext'.  It's kind of
a double negative for the C-keywords invocation, but we can have the
option for both cases, so we don't need to worry about what the
default is (which is case_insensitive=1).

-- 
John Naylor  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/common/Makefile b/src/common/Makefile
index d0c2b970eb..5f9327de59 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -124,7 +124,7 @@ libpgcommon_srv.a: $(OBJS_SRV)
 # generate SQL keyword lookup table to be included into keywords*.o.
 kwlist_d.h: $(top_srcdir)/src/include/parser/kwlist.h $(GEN_KEYWORDLIST_DEPS)
-	$(GEN_KEYWORDLIST) --extern $<
+	$(GEN_KEYWORDLIST) --extern --citext $<
 
 # Dependencies of keywords*.o need to be managed explicitly to make sure
 # that you don't get broken parsing code, even in a non-enable-depend build.
diff --git a/src/interfaces/ecpg/preproc/Makefile b/src/interfaces/ecpg/preproc/Makefile
index 6c02f97622..96dd76317f 100644
--- a/src/interfaces/ecpg/preproc/Makefile
+++ b/src/interfaces/ecpg/preproc/Makefile
@@ -60,10 +60,10 @@ preproc.y: ../../../backend/parser/gram.y parse.pl ecpg.addons ecpg.header ecpg.
 # generate keyword headers
 c_kwlist_d.h: c_kwlist.h $(GEN_KEYWORDLIST_DEPS)
-	$(GEN_KEYWORDLIST) --varname ScanCKeywords --case $<
+	$(GEN_KEYWORDLIST) --varname ScanCKeywords --no-citext $<
 
 ecpg_kwlist_d.h: ecpg_kwlist.h $(GEN_KEYWORDLIST_DEPS)
-	$(GEN_KEYWORDLIST) --varname ScanECPGKeywords $<
+	$(GEN_KEYWORDLIST) --varname ScanECPGKeywords --citext $<
 
 # Force these dependencies to be known even without dependency info built:
 ecpg_keywords.o c_keywords.o keywords.o preproc.o pgc.o parser.o: preproc.h
diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index 8a0f294323..bc4ee50682 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -80,10 +80,10 @@ plerrcodes.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-plerrcodes.p
 # generate keyword headers for the scanner
 pl_reserved_kwlist_d.h: pl_reserved_kwlist.h $(GEN_KEYWORDLIST_DEPS)
-	$(GEN_KEYWORDLIST) --varname ReservedPLKeywords $<
+	$(GEN_KEYWORDLIST) --varname ReservedPLKeywords --citext $<
 
 pl_unreserved_kwlist_d.h: pl_unreserved_kwlist.h $(GEN_KEYWORDLIST_DEPS)
-	$(GEN_KEYWORDLIST) --varname UnreservedPLKeywords $<
+	$(GEN_KEYWORDLIST) --varname UnreservedPLKeywords --citext $<
 
 check: submake
diff --git a/src/tools/gen_keywordlist.pl b/src/tools/gen_keywordlist.pl
index 2744e1dc26..927511d7c4 100644
--- a/src/tools/gen_keywordlist.pl
+++ b/src/tools/gen_keywordlist.pl
@@ -35,13 +35,13 @@ use PerfectHash;
 my $output_path = '';
 my $extern = 0;
-my $case_sensitive = 0;
+my $case_insensitive = 1;
 my $varname = 'ScanKeywords';
 
 GetOptions(
 	'output:s' => \$output_path,
 	'extern' => \$extern,
-	'case-sensitive' => \$case_sensitive,
+	'citext!' => \$case_insensitive,
 	'varname:s' => \$varname) || usage();
 
 my $kw_input_file = shift @ARGV || die "No input file.\n";
@@ -97,7 +97,7 @@ while (<$kif>)
 }
 
 # When being case-insensitive, insist that the input be all-lower-case.
-if (!$case_sensitive)
+if ($case_insensitive)
 {
 	foreach my $kw (@keywords)
 	{
@@ -157,7 +157,7 @@ printf $kwdef "#define %s_NUM_KEYWORDS %d\n\n", uc $varname, scalar @keywords;
 my $funcname = $varname . "_hash_func";
 
 my $f = PerfectHash::generate_hash_function(\@keywords, $funcname,
-	case_insensitive => !$case_sensitive);
+	case_insensitive => $case_insensitive);
 
 printf $kwdef qq|static %s\n|, $f;
 
@@ -178,11 +178,11 @@ printf $kwdef "#endif\t\t\t\t\t\t\t/* %s_H */\n", uc $base_filename;
 sub usage
 {
 	die <] [--varname/-v ] [--extern/-e] input_file
+Usage: gen_keywordlist.pl [--output
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Jan 09, 2019 at 02:04:15PM -0500, Tom Lane wrote:
> Also, in view of finding that the original multiplier choices failed
> on the fmgr oid problem, I spent a little effort making the code
> able to try more combinations of hash multipliers and seeds.  It'd
> be nice to have some theory rather than just heuristics about what
> will work, though ...

The theory is that the code needs two families of pair-wise independent
hash functions to give the O(1) number of tries.  For practical
purposes, the Jenkins hash easily qualifies, as does any cryptographic
hash function.  The downside is that those hash functions are very
expensive, though.

A multiplicative hash, especially one using the high bits of the result
(e.g. the top 32 bits of a 32x32->64 multiplication), qualifies for the
requirements, but for string input it would need a pair of constants
per word.

So the choice of a hash function family for a performance-sensitive
part is a heuristic in the sense of trying to get away with as simple a
function as possible.

Joerg
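A minimal sketch of the multiplicative scheme Joerg describes, taking the high bits of a 32x32->64 multiplication; the constant in the usage below is an arbitrary example, not one selected by PerfectHash.pm:

```c
#include <stdint.h>

/*
 * Multiplicative hash: multiply the 32-bit key by a 32-bit constant and
 * keep the top `bits` bits of the 64-bit product.  The high bits mix
 * information from the whole key, unlike the low bits of the product.
 */
static uint32_t
mult_hash(uint32_t key, uint32_t mult, int bits)
{
	uint64_t	product = (uint64_t) key * mult;

	/* Take the top `bits` bits of the 32x32->64 multiplication. */
	return (uint32_t) (product >> (64 - bits));
}
```

For string keys, each 32-bit word of input would be mixed with its own constant (a "pair of constants per word" for the two independent families), which is essentially what the word-at-a-time proposal earlier in this thread does.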
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes:
> On 2019-01-09 15:03:35 -0500, Tom Lane wrote:
>> (Speaking of which, I've been wondering for awhile if libpq ought not
>> obtain the OIDs of lo_create and friends by #including fmgroids.h
>> instead of doing a runtime query on every connection.  If we did that,
>> we'd be forever giving up the option to renumber them ... but do you
>> really want to bet that nobody else has done this already in some
>> other client code?)

> I'm not enthusiastic about that.  I kinda hope we're going to evolve
> that interface further, which'd make it version dependent anyway (we
> don't require all of them right now...).  And it's not that expensive
> to query their oids once.

Version dependency doesn't seem like much of an argument: we'd just
teach libpq to pay attention to the server version, which it knows
anyway (and uses for other purposes already).

But this is a bit off-topic for this thread, perhaps.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi,

On 2019-01-09 15:03:35 -0500, Tom Lane wrote:
> Alvaro Herrera writes:
>> On 2019-Jan-09, Tom Lane wrote:
>>> We could make the index table still smaller if we wanted to reassign
>>> a couple dozen high-numbered functions down to lower OIDs, but I dunno
>>> if it's worth the trouble.  It certainly isn't from a performance
>>> standpoint, because those unused entry ranges will never be touched
>>> in normal usage; but it'd make the server executable a couple KB smaller.
>
>> Or two couples KB smaller, if we abandoned the idea that pg_proc OIDs
>> must not collide with those in any other catalog, and we renumbered all
>> functions to start at OID 1 or so.  duplicate_oids would complain about
>> that, though, I suppose ... and nobody who has ever hardcoded a function
>> OID would love this idea much.
>
> I think that'd be a nonstarter for commonly-used functions.  I'm guessing
> that pg_replication_origin_create() and so on, which are the immediate
> problem, haven't been around long enough or get used often enough for
> someone to have hard-coded their OIDs.  But I could be wrong.

I don't think it's likely that it'd be useful to hardcode them, and
therefore hope that nobody would do so.  I personally feel limited
sympathy for people hardcoding oids across major versions.  The benefits
of making pg easier to maintain and more efficient seem higher than
allowing for that.

> (Speaking of which, I've been wondering for awhile if libpq ought not
> obtain the OIDs of lo_create and friends by #including fmgroids.h
> instead of doing a runtime query on every connection.  If we did that,
> we'd be forever giving up the option to renumber them ... but do you
> really want to bet that nobody else has done this already in some
> other client code?)

I'm not enthusiastic about that.  I kinda hope we're going to evolve
that interface further, which'd make it version dependent anyway (we
don't require all of them right now...).  And it's not that expensive to
query their oids once.

Greetings,

Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Alvaro Herrera writes:
> On 2019-Jan-09, Tom Lane wrote:
>> We could make the index table still smaller if we wanted to reassign
>> a couple dozen high-numbered functions down to lower OIDs, but I dunno
>> if it's worth the trouble.  It certainly isn't from a performance
>> standpoint, because those unused entry ranges will never be touched
>> in normal usage; but it'd make the server executable a couple KB smaller.

> Or two couples KB smaller, if we abandoned the idea that pg_proc OIDs
> must not collide with those in any other catalog, and we renumbered all
> functions to start at OID 1 or so.  duplicate_oids would complain about
> that, though, I suppose ... and nobody who has ever hardcoded a function
> OID would love this idea much.

I think that'd be a nonstarter for commonly-used functions.  I'm guessing
that pg_replication_origin_create() and so on, which are the immediate
problem, haven't been around long enough or get used often enough for
someone to have hard-coded their OIDs.  But I could be wrong.

(Speaking of which, I've been wondering for awhile if libpq ought not
obtain the OIDs of lo_create and friends by #including fmgroids.h
instead of doing a runtime query on every connection.  If we did that,
we'd be forever giving up the option to renumber them ... but do you
really want to bet that nobody else has done this already in some
other client code?)

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes:
> On 2019-01-09 14:44:24 -0500, Tom Lane wrote:
>> 	/* fast lookup only possible if original oid still assigned */
>> -	if (id >= FirstGenbkiObjectId)
>> +	if (id > fmgr_last_builtin_oid)
>> 		return NULL;

> An extern reference here will make the code a bit less efficient, but
> it's probably not worth generating a header with a define for it
> instead...

Yeah, also that would be significantly more fragile, in that it'd be
hard to be sure where that OID had propagated to when rebuilding.  We
haven't chosen to make fmgr_nbuiltins a #define either, and I think
it's best to treat this the same way.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi,

On 2019-01-09 14:44:24 -0500, Tom Lane wrote:
> I wrote:
>> Also, I fail to understand why fmgr_builtin_oid_index has 10000 entries
>> anyway.  We could easily have fmgrtab.c expose the last actually assigned
>> builtin function OID (presently 6121) and make the index array only
>> that big, which just about eliminates the space advantage completely.
>
> Concretely, like the attached.

Seems like a good improvement.

> We could make the index table still smaller if we wanted to reassign
> a couple dozen high-numbered functions down to lower OIDs, but I dunno
> if it's worth the trouble.  It certainly isn't from a performance
> standpoint, because those unused entry ranges will never be touched
> in normal usage; but it'd make the server executable a couple KB smaller.

Probably indeed not worth it.  I'm not 100% convinced on the performance
POV, but in contrast to the earlier binary search either approach is
fast enough that it's probably hard to measure any difference.

> diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
> index b41649f..506eeef 100644
> --- a/src/backend/utils/fmgr/fmgr.c
> +++ b/src/backend/utils/fmgr/fmgr.c
> @@ -75,12 +75,12 @@ fmgr_isbuiltin(Oid id)
> 	uint16		index;
>
> 	/* fast lookup only possible if original oid still assigned */
> -	if (id >= FirstGenbkiObjectId)
> +	if (id > fmgr_last_builtin_oid)
> 		return NULL;

An extern reference here will make the code a bit less efficient, but
it's probably not worth generating a header with a define for it
instead...

Greetings,

Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 2019-Jan-09, Tom Lane wrote:
> We could make the index table still smaller if we wanted to reassign
> a couple dozen high-numbered functions down to lower OIDs, but I dunno
> if it's worth the trouble.  It certainly isn't from a performance
> standpoint, because those unused entry ranges will never be touched
> in normal usage; but it'd make the server executable a couple KB smaller.

Or two couples KB smaller, if we abandoned the idea that pg_proc OIDs
must not collide with those in any other catalog, and we renumbered all
functions to start at OID 1 or so.  duplicate_oids would complain about
that, though, I suppose ... and nobody who has ever hardcoded a function
OID would love this idea much.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote: > Also, I fail to understand why fmgr_builtin_oid_index has 1 entries > anyway. We could easily have fmgrtab.c expose the last actually assigned > builtin function OID (presently 6121) and make the index array only > that big, which just about eliminates the space advantage completely. Concretely, like the attached. We could make the index table still smaller if we wanted to reassign a couple dozen high-numbered functions down to lower OIDs, but I dunno if it's worth the trouble. It certainly isn't from a performance standpoint, because those unused entry ranges will never be touched in normal usage; but it'd make the server executable a couple KB smaller. regards, tom lane diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index cafe408..f970940 100644 *** a/src/backend/utils/Gen_fmgrtab.pl --- b/src/backend/utils/Gen_fmgrtab.pl *** foreach my $datfile (@input_files) *** 80,90 $catalog_data{$catname} = Catalog::ParseData($datfile, $schema, 0); } - # Fetch some values for later. - my $FirstGenbkiObjectId = - Catalog::FindDefinedSymbol('access/transam.h', $include_path, - 'FirstGenbkiObjectId'); - # Collect certain fields from pg_proc.dat. my @fmgr = (); --- 80,85 *** my %bmap; *** 225,230 --- 220,226 $bmap{'t'} = 'true'; $bmap{'f'} = 'false'; my @fmgr_builtin_oid_index; + my $last_builtin_oid = 0; my $fmgr_count = 0; foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr) { *** foreach my $s (sort { $a->{oid} <=> $b-> *** 232,237 --- 228,234 " { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, \"$s->{prosrc}\", $s->{prosrc} }"; $fmgr_builtin_oid_index[ $s->{oid} ] = $fmgr_count++; + $last_builtin_oid = $s->{oid}; if ($fmgr_count <= $#fmgr) { *** foreach my $s (sort { $a->{oid} <=> $b-> *** 244,274 } print $tfh "};\n"; ! print $tfh qq| const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin)); ! |; # Create fmgr_builtins_oid_index table. ! # ! 
# Note that the array has to be filled up to FirstGenbkiObjectId, ! # as we can't rely on zero initialization as 0 is a valid mapping. ! print $tfh qq| ! const uint16 fmgr_builtin_oid_index[FirstGenbkiObjectId] = { ! |; ! for (my $i = 0; $i < $FirstGenbkiObjectId; $i++) { my $oid = $fmgr_builtin_oid_index[$i]; ! # fmgr_builtin_oid_index is sparse, map nonexistant functions to # InvalidOidBuiltinMapping if (not defined $oid) { $oid = 'InvalidOidBuiltinMapping'; } ! if ($i + 1 == $FirstGenbkiObjectId) { print $tfh " $oid\n"; } --- 241,270 } print $tfh "};\n"; ! printf $tfh qq| const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin)); ! ! const Oid fmgr_last_builtin_oid = %u; ! |, $last_builtin_oid; # Create fmgr_builtins_oid_index table. ! printf $tfh qq| ! const uint16 fmgr_builtin_oid_index[%u] = { ! |, $last_builtin_oid + 1; ! for (my $i = 0; $i <= $last_builtin_oid; $i++) { my $oid = $fmgr_builtin_oid_index[$i]; ! # fmgr_builtin_oid_index is sparse, map nonexistent functions to # InvalidOidBuiltinMapping if (not defined $oid) { $oid = 'InvalidOidBuiltinMapping'; } ! if ($i == $last_builtin_oid) { print $tfh " $oid\n"; } diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c index b41649f..506eeef 100644 *** a/src/backend/utils/fmgr/fmgr.c --- b/src/backend/utils/fmgr/fmgr.c *** fmgr_isbuiltin(Oid id) *** 75,86 uint16 index; /* fast lookup only possible if original oid still assigned */ ! if (id >= FirstGenbkiObjectId) return NULL; /* * Lookup function data. If there's a miss in that range it's likely a ! * nonexistant function, returning NULL here will trigger an ERROR later. */ index = fmgr_builtin_oid_index[id]; if (index == InvalidOidBuiltinMapping) --- 75,86 uint16 index; /* fast lookup only possible if original oid still assigned */ ! if (id > fmgr_last_builtin_oid) return NULL; /* * Lookup function data. If there's a miss in that range it's likely a ! 
* nonexistent function, returning NULL here will trigger an ERROR later. */ index = fmgr_builtin_oid_index[id]; if (index == InvalidOidBuiltinMapping) diff --git a/src/include/utils/fmgrtab.h b/src/include/utils/fmgrtab.h index a778f88..e981f34 100644 *** a/src/include/utils/fmgrtab.h --- b/src/include/utils/fmgrtab.h *** extern const FmgrBuiltin fmgr_builtins[] *** 36,46 extern const int fmgr_nbuiltins; /* number of entries in table */ /* ! * Mapping from a builtin function's oid to the index in the fmgr_builtins ! * array. */ #define InvalidOidBuiltinMapping PG_UINT16_MAX ! extern const uint16 fmgr_builtin_oid_index[FirstGenbkiObjectId]; #endif /* FMGRTAB_H */ --- 36,48
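The shape of the lookup after this change can be sketched standalone as follows. This is an illustration, not the actual fmgr code: the table contents, OIDs, and function name here are invented; only the control flow (range check against the last builtin OID, then a sentinel check) mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define InvalidOidBuiltinMapping UINT16_MAX

/* Toy stand-ins for the arrays Gen_fmgrtab.pl emits; the OID
 * assignments below are made up for illustration. */
static const uint32_t last_builtin_oid = 7;
static const uint16_t builtin_oid_index[7 + 1] = {
    InvalidOidBuiltinMapping,   /* OID 0: never assigned */
    0, 1,                       /* OIDs 1-2 are builtins */
    InvalidOidBuiltinMapping,   /* OID 3: hole in the numbering */
    2,
    InvalidOidBuiltinMapping,   /* OID 5: another hole */
    3, 4                        /* OIDs 6-7 */
};

/* Return the fmgr_builtins[] slot for an OID, or -1 for non-builtins. */
int builtin_slot(uint32_t oid)
{
    uint16_t index;

    /* fast lookup only possible while the OID is in the indexed range */
    if (oid > last_builtin_oid)
        return -1;
    index = builtin_oid_index[oid];
    if (index == InvalidOidBuiltinMapping)
        return -1;
    return index;
}
```

The point of sizing the array to `last_builtin_oid + 1` rather than `FirstGenbkiObjectId` is visible here: OIDs above the last assigned one are rejected by the range check, so the trailing unused entries never needed to exist.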
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote: > John Naylor writes: >> -There is a bit of a cognitive clash between $case_sensitive in >> gen_keywordlist.pl and $case_insensitive in PerfectHash.pm. They each >> make sense in their own file, but might it be worth using one or the >> other? > Yeah, dunno. It seems to make sense for the command-line-level default of > gen_keywordlist.pl to be "case insensitive", since most users want that. > But that surely shouldn't be the default in PerfectHash.pm, and I'm not > very sure how to reconcile the discrepancy. Working on the fmgr-oid-lookup idea gave me the thought that PerfectHash.pm ought to support fixed-length keys. Rather than start adding random parameters to the function, I borrowed an idea from PostgresNode.pm and made the options be keyword-style parameters. Now the impedance mismatch about case sensitivity is handled with my $f = PerfectHash::generate_hash_function(\@keywords, $funcname, case_insensitive => !$case_sensitive); which is at least a little clearer than before, though I'm not sure if it entirely solves the problem. Also, in view of finding that the original multiplier choices failed on the fmgr oid problem, I spent a little effort making the code able to try more combinations of hash multipliers and seeds. It'd be nice to have some theory rather than just heuristics about what will work, though ... Barring objections or further review, I plan to push this soon. 
regards, tom lane diff --git a/src/common/Makefile b/src/common/Makefile index 317b071..d0c2b97 100644 *** a/src/common/Makefile --- b/src/common/Makefile *** OBJS_FRONTEND = $(OBJS_COMMON) fe_memuti *** 63,68 --- 63,73 OBJS_SHLIB = $(OBJS_FRONTEND:%.o=%_shlib.o) OBJS_SRV = $(OBJS_COMMON:%.o=%_srv.o) + # where to find gen_keywordlist.pl and subsidiary files + TOOLSDIR = $(top_srcdir)/src/tools + GEN_KEYWORDLIST = $(PERL) -I $(TOOLSDIR) $(TOOLSDIR)/gen_keywordlist.pl + GEN_KEYWORDLIST_DEPS = $(TOOLSDIR)/gen_keywordlist.pl $(TOOLSDIR)/PerfectHash.pm + all: libpgcommon.a libpgcommon_shlib.a libpgcommon_srv.a distprep: kwlist_d.h *** libpgcommon_srv.a: $(OBJS_SRV) *** 118,125 $(CC) $(CFLAGS) $(subst -DFRONTEND,, $(CPPFLAGS)) -c $< -o $@ # generate SQL keyword lookup table to be included into keywords*.o. ! kwlist_d.h: $(top_srcdir)/src/include/parser/kwlist.h $(top_srcdir)/src/tools/gen_keywordlist.pl ! $(PERL) $(top_srcdir)/src/tools/gen_keywordlist.pl --extern $< # Dependencies of keywords*.o need to be managed explicitly to make sure # that you don't get broken parsing code, even in a non-enable-depend build. --- 123,130 $(CC) $(CFLAGS) $(subst -DFRONTEND,, $(CPPFLAGS)) -c $< -o $@ # generate SQL keyword lookup table to be included into keywords*.o. ! kwlist_d.h: $(top_srcdir)/src/include/parser/kwlist.h $(GEN_KEYWORDLIST_DEPS) ! $(GEN_KEYWORDLIST) --extern $< # Dependencies of keywords*.o need to be managed explicitly to make sure # that you don't get broken parsing code, even in a non-enable-depend build. diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index d72842e..6545480 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *** *** 35,94 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *text, const ScanKeywordList *keywords) { ! int len, ! i; ! char word[NAMEDATALEN]; ! const char *kw_string; ! const uint16 *kw_offsets; ! const uint16 *low; ! const uint16 *high; ! ! 
len = strlen(text); if (len > keywords->max_kw_len) ! return -1;/* too long to be any keyword */ ! ! /* We assume all keywords are shorter than NAMEDATALEN. */ ! Assert(len < NAMEDATALEN); /* ! * Apply an ASCII-only downcasing. We must not use tolower() since it may ! * produce the wrong translation in some locales (eg, Turkish). */ ! for (i = 0; i < len; i++) ! { ! char ch = text[i]; ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! word[i] = ch; ! } ! word[len] = '\0'; /* ! * Now do a binary search using plain strcmp() comparison. */ ! kw_string = keywords->kw_string; ! kw_offsets = keywords->kw_offsets; ! low = kw_offsets; ! high = kw_offsets + (keywords->num_keywords - 1); ! while (low <= high) { ! const uint16 *middle; ! int difference; ! middle = low + (high - low) / 2; ! difference = strcmp(kw_string + *middle, word); ! if (difference == 0) ! return middle - kw_offsets; ! else if (difference < 0) ! low = middle + 1; ! else ! high = middle - 1; } ! return -1; } --- 35,85 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *str, const ScanKeywordList *keywords) { ! size_t len; ! int h; ! const char *kw; + /* + * Reject immediately if too long to be any keyword. This saves useless + * hashing and downcasing work on long strings. +
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Tue, Jan 08, 2019 at 05:53:25PM -0500, Tom Lane wrote: > John Naylor writes: > > -As for the graph algorithm, I'd have to play with it to understand > > how it works. > > I improved the comment about how come the hash table entry assignment > works. One thing I'm not clear about myself is > > # A cycle-free graph is either empty or has some vertex of degree 1. > > That sounds like a standard graph theory result, but I'm not familiar > with a proof for it. Suppose instead that the graph is acyclic, non-empty, and every vertex has degree > 1. Pick any vertex and start constructing a path from it. It is connected to at least one other vertex, so follow that edge. The next vertex must in turn be connected to at least one further vertex, and the path can't return to the starting point (that would be a cycle). Each subsequent vertex still has another connection, and the path can never revisit any already visited vertex. So the path keeps growing until you run out of vertices, which is impossible in a finite graph; hence a non-empty acyclic graph must contain a vertex of degree 1. Joerg
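The same degree-1 argument is what drives the module's assignment step: repeatedly peel off a vertex of degree at most 1, and the graph was acyclic exactly when no edge survives the peeling. A minimal sketch over a toy adjacency-matrix graph (illustrative only, not the PerfectHash.pm code; assumes no self-loops):

```c
#include <stdbool.h>

#define NV 6                    /* number of vertices in the toy graph */

/* Return true if the undirected graph (adjacency matrix, no self-loops)
 * is acyclic, by repeatedly deleting a vertex of degree <= 1.  If any
 * edge survives, every remaining vertex had degree >= 2, which by the
 * argument above implies a cycle. */
bool is_acyclic(bool adj[NV][NV])
{
    bool removed[NV] = {false};

    for (;;)
    {
        int v = -1;

        /* find a live vertex of degree <= 1 */
        for (int i = 0; i < NV && v < 0; i++)
        {
            int deg = 0;

            if (removed[i])
                continue;
            for (int j = 0; j < NV; j++)
                if (!removed[j] && adj[i][j])
                    deg++;
            if (deg <= 1)
                v = i;
        }
        if (v < 0)
            break;              /* every live vertex has degree >= 2 */
        removed[v] = true;      /* peel it, deleting its (at most one) edge */
    }

    /* any surviving edge connects vertices of degree >= 2: a cycle */
    for (int i = 0; i < NV; i++)
        for (int j = 0; j < NV; j++)
            if (!removed[i] && !removed[j] && adj[i][j])
                return false;
    return true;
}
```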
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > -As for the graph algorithm, I'd have to play with it to understand > how it works. I improved the comment about how come the hash table entry assignment works. One thing I'm not clear about myself is # A cycle-free graph is either empty or has some vertex of degree 1. That sounds like a standard graph theory result, but I'm not familiar with a proof for it. regards, tom lane #-- # # PerfectHash.pm #Perl module that constructs minimal perfect hash functions # # This code constructs a minimal perfect hash function for the given # set of keys, using an algorithm described in # "An optimal algorithm for generating minimal perfect hash functions" # by Czech, Havas and Majewski in Information Processing Letters, # 43(5):256-264, October 1992. # This implementation is loosely based on NetBSD's "nbperf", # which was written by Joerg Sonnenberger. # # The resulting hash function is perfect in the sense that if the presented # key is one of the original set, it will return the key's index in the set # (in range 0..N-1). However, the caller must still verify the match, # as false positives are possible. Also, the hash function may return # values that are out of range (negative, or >= N). This indicates that # the presented key is definitely not in the set. # # # Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group # Portions Copyright (c) 1994, Regents of the University of California # # src/tools/PerfectHash.pm # #-- package PerfectHash; use strict; use warnings; # At runtime, we'll compute two simple hash functions of the input key, # and use them to index into a mapping table. The hash functions are just # multiply-and-add in uint32 arithmetic, with different multipliers but # the same initial seed. All the complexity in this module is concerned # with selecting hash parameters that will work and building the mapping # table. 
# We support making case-insensitive hash functions, though this only # works for a strict-ASCII interpretation of case insensitivity, # ie, A-Z maps onto a-z and nothing else. my $case_insensitive = 0; # # Construct a C function implementing a perfect hash for the given keys. # The C function definition is returned as a string. # # The keys can be any set of Perl strings; it is caller's responsibility # that there not be any duplicates. (Note that the "strings" can be # binary data, but endianness is the caller's problem.) # # The name to use for the function is caller-specified, but its signature # is always "int f(const void *key, size_t keylen)". The caller may # prepend "static " to the result string if it wants a static function. # # If $ci is true, the function is case-insensitive, for the limited idea # of case-insensitivity explained above. # sub generate_hash_function { my ($keys_ref, $funcname, $ci) = @_; # It's not worth passing this around as a parameter; just use a global. $case_insensitive = $ci; # Try different hash function parameters until we find a set that works # for these keys. In principle we might need to change multipliers, # but these two multipliers are chosen to be primes that are cheap to # calculate via shift-and-add, so don't change them without care. my $hash_mult1 = 31; my $hash_mult2 = 2053; # We just try successive hash seed values until we find one that works. # (Commonly, random seeds are tried, but we want reproducible results # from this program so we don't do that.) my $hash_seed; my @subresult; for ($hash_seed = 0; $hash_seed < 1000; $hash_seed++) { @subresult = _construct_hash_table($keys_ref, $hash_mult1, $hash_mult2, $hash_seed); last if @subresult; } # Choke if we didn't succeed in a reasonable number of tries. die "failed to generate perfect hash" if !@subresult; # Extract info from the function result array. 
my $elemtype = $subresult[0]; my @hashtab = @{ $subresult[1] }; my $nhash= scalar(@hashtab); # OK, construct the hash function definition including the hash table. my $f = ''; $f .= sprintf "int\n"; $f .= sprintf "%s(const void *key, size_t keylen)\n{\n", $funcname; $f .= sprintf "\tstatic const %s h[%d] = {\n", $elemtype, $nhash; for (my $i = 0; $i < $nhash; $i++) { $f .= sprintf "%s%6d,%s", ($i % 8 == 0 ? "\t\t" : " "), $hashtab[$i], ($i % 8 == 7 ? "\n" : ""); } $f .= sprintf "\n" if ($nhash % 8 != 0); $f .= sprintf "\t};\n\n"; $f .= sprintf "\tconst unsigned char *k = key;\n"; $f .=
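The hash bodies the module generates are just this multiply-and-add loop. A standalone C rendering (the function name and parameterization are my own for illustration, but the multipliers 31 and 2053 and the ASCII-only case folding match what the module's comments describe):

```c
#include <stddef.h>
#include <stdint.h>

/* One multiply-and-add string hash of the kind PerfectHash.pm emits.
 * The module uses mult = 31 for one hash and 2053 for the other, with
 * the same searched-for seed; case folding is the strict-ASCII
 * A-Z -> a-z mapping only. */
uint32_t maa_hash(const void *key, size_t keylen, uint32_t mult,
                  uint32_t seed, int case_insensitive)
{
    const unsigned char *k = key;
    uint32_t h = seed;

    while (keylen-- > 0)
    {
        unsigned char ch = *k++;

        if (case_insensitive && ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        h = h * mult + ch;      /* uint32 arithmetic wraps, as intended */
    }
    return h;
}
```

Both multipliers are cheap as shift-and-add: 31*h is (h<<5)-h and 2053*h is (h<<11)+(h<<2)+h, which is why the module warns against changing them casually.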
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > Just a couple comments about the module: > -If you qualify the function's module name as you did > (PerfectHash::generate_hash_function), you don't have to export the > function into the caller's namespace, so you can skip the @EXPORT_OK > setting. Most of our modules don't export. OK by me. I was more concerned about hiding the stuff that isn't supposed to be exported. > -There is a bit of a cognitive clash between $case_sensitive in > gen_keywordlist.pl and $case_insensitive in PerfectHash.pm. They each > make sense in their own file, but might it be worth using one or the > other? Yeah, dunno. It seems to make sense for the command-line-level default of gen_keywordlist.pl to be "case insensitive", since most users want that. But that surely shouldn't be the default in PerfectHash.pm, and I'm not very sure how to reconcile the discrepancy. > In the committed keyword patch, I noticed that in common/keywords.c, > the array length is defined with > ScanKeywordCategories[SCANKEYWORDS_NUM_KEYWORDS] > but other keyword arrays just have ...[]. Is there a reason for the > difference? The length macro was readily available there so I used it. AFAIR that wasn't true elsewhere, though I might've missed something. It's pretty much just belt-and-suspenders coding anyway, since all those arrays are machine generated ... regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Tue, Jan 8, 2019 at 3:04 PM Tom Lane wrote: > > I'll take a crack at separating into a module. I'll wait a bit in > case there are any stylistic suggestions on the patch as it stands. > > I had a go at that myself. I'm sure there's plenty to criticize in > the result, but at least it passes make check-world ;-) Just a couple comments about the module: -If you qualify the function's module name as you did (PerfectHash::generate_hash_function), you don't have to export the function into the caller's namespace, so you can skip the @EXPORT_OK setting. Most of our modules don't export. -There is a bit of a cognitive clash between $case_sensitive in gen_keywordlist.pl and $case_insensitive in PerfectHash.pm. They each make sense in their own file, but might it be worth using one or the other? -As for the graph algorithm, I'd have to play with it to understand how it works. In the committed keyword patch, I noticed that in common/keywords.c, the array length is defined with ScanKeywordCategories[SCANKEYWORDS_NUM_KEYWORDS] but other keyword arrays just have ...[]. Is there a reason for the difference?
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > On 2019-01-08 13:41:16 -0500, John Naylor wrote: >> Do you mean the fmgr table? > Not the entire fmgr table, but just the builtin oid index, generated by > the following section: > ... > The generated fmgr_builtin_oid_index is pretty sparse, and a more dense > hashtable might e.g. be more efficient from a cache perspective. I experimented with this, but TBH I think it's a dead loss. We currently have 2768 built-in functions, so the perfect hash table requires 5537 int16 entries, which is not *that* much less than the 10000 entries that are in fmgr_builtin_oid_index presently. When you consider the extra cycles needed to do the hashing, and the fact that you have to touch (usually) two cache lines not one in the lookup table, it's hard to see how this could net out as a win performance-wise. Also, I fail to understand why fmgr_builtin_oid_index has 10000 entries anyway. We could easily have fmgrtab.c expose the last actually assigned builtin function OID (presently 6121) and make the index array only that big, which just about eliminates the space advantage completely. BTW, I found out while trying this that Joerg's fear of the hash multipliers being too simplistic is valid: the perfect hash generator failed until I changed them. I picked a larger value that should be just as easy to use for shift-and-add purposes. regards, tom lane diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index cafe408..9fceb60 100644 *** a/src/backend/utils/Gen_fmgrtab.pl --- b/src/backend/utils/Gen_fmgrtab.pl *** use Catalog; *** 18,23 --- 18,24 use strict; use warnings; + use PerfectHash; # Collect arguments my @input_files; *** foreach my $s (sort { $a->{oid} <=> $b-> *** 219,237 print $pfh "extern Datum $s->{prosrc}(PG_FUNCTION_ARGS);\n"; } ! # Create the fmgr_builtins table, collect data for fmgr_builtin_oid_index print $tfh "\nconst FmgrBuiltin fmgr_builtins[] = {\n"; my %bmap; $bmap{'t'} = 'true'; $bmap{'f'} = 'false'; ! 
my @fmgr_builtin_oid_index; my $fmgr_count = 0; foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr) { print $tfh " { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, \"$s->{prosrc}\", $s->{prosrc} }"; ! $fmgr_builtin_oid_index[ $s->{oid} ] = $fmgr_count++; if ($fmgr_count <= $#fmgr) { --- 220,244 print $pfh "extern Datum $s->{prosrc}(PG_FUNCTION_ARGS);\n"; } ! # Create the fmgr_builtins table, collect data for hash function print $tfh "\nconst FmgrBuiltin fmgr_builtins[] = {\n"; my %bmap; $bmap{'t'} = 'true'; $bmap{'f'} = 'false'; ! my @fmgr_builtin_oids; ! my $prev_oid = 0; my $fmgr_count = 0; foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr) { print $tfh " { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, \"$s->{prosrc}\", $s->{prosrc} }"; ! die "duplicate OIDs" if $s->{oid} <= $prev_oid; ! $prev_oid = $s->{oid}; ! ! push @fmgr_builtin_oids, pack("n", $s->{oid}); ! ! $fmgr_count++; if ($fmgr_count <= $#fmgr) { *** print $tfh "};\n"; *** 246,283 print $tfh qq| const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin)); - |; - - # Create fmgr_builtins_oid_index table. - # - # Note that the array has to be filled up to FirstGenbkiObjectId, - # as we can't rely on zero initialization as 0 is a valid mapping. - print $tfh qq| - const uint16 fmgr_builtin_oid_index[FirstGenbkiObjectId] = { |; - for (my $i = 0; $i < $FirstGenbkiObjectId; $i++) - { - my $oid = $fmgr_builtin_oid_index[$i]; ! # fmgr_builtin_oid_index is sparse, map nonexistant functions to ! # InvalidOidBuiltinMapping ! if (not defined $oid) ! { ! $oid = 'InvalidOidBuiltinMapping'; ! } ! if ($i + 1 == $FirstGenbkiObjectId) ! { ! print $tfh " $oid\n"; ! } ! else ! { ! print $tfh " $oid,\n"; ! } ! } ! print $tfh "};\n"; # And add the file footers. --- 253,267 print $tfh qq| const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin)); |; ! # Create perfect hash function for searching fmgr_builtin by OID. ! 
print $tfh PerfectHash::generate_hash_function(\@fmgr_builtin_oids, ! "fmgr_builtin_oid_hash", ! 0); # And add the file footers. diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c index b41649f..ad93032 100644 *** a/src/backend/utils/fmgr/fmgr.c --- b/src/backend/utils/fmgr/fmgr.c *** extern Datum fmgr_security_definer(PG_FU *** 72,92 static const FmgrBuiltin * fmgr_isbuiltin(Oid id) { ! uint16 index; /* fast lookup only possible if original oid still assigned */ if (id >= FirstGenbkiObjectId) return NULL; /* ! * Lookup function data. If there's a miss in that range it's likely a ! * nonexistant function, returning NULL here will trigger an ERROR later.
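The hashed lookup this patch installs in fmgr_isbuiltin() then has the following shape. Everything below is a toy stand-in: the OIDs are invented and the hand-made "perfect" hash replaces the machine-generated fmgr_builtin_oid_hash(); only the hash/range-check/verify flow reflects the experiment.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the generated fmgr_builtins[] table and the
 * generated perfect hash; the OIDs 11-13 and the hash are made up. */
typedef struct
{
    uint32_t    foid;
    const char *prosrc;
} ToyBuiltin;

static const ToyBuiltin toy_builtins[] = {
    {11, "f11"}, {12, "f12"}, {13, "f13"},
};
#define TOY_NBUILTINS 3

static int toy_oid_hash(uint32_t oid)
{
    return (int) (oid % 5) - 1; /* happens to be perfect for 11,12,13 */
}

/* Hashed replacement for the index-array lookup: hash the OID,
 * range-check the result, then verify the slot really holds that OID,
 * since a perfect hash promises nothing about non-member inputs. */
const ToyBuiltin *toy_isbuiltin(uint32_t oid)
{
    int h = toy_oid_hash(oid);

    if (h < 0 || h >= TOY_NBUILTINS)
        return NULL;
    if (toy_builtins[h].foid != oid)
        return NULL;
    return &toy_builtins[h];
}
```

The verification step is the part that replaces the old sentinel check: a non-builtin OID either hashes out of range or lands on a slot whose stored OID disagrees.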
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > On Tue, Jan 8, 2019 at 12:06 PM Andrew Dunstan > wrote: >> If he doesn't I will. > I'll take a crack at separating into a module. I'll wait a bit in > case there are any stylistic suggestions on the patch as it stands. I had a go at that myself. I'm sure there's plenty to criticize in the result, but at least it passes make check-world ;-) I resolved the worry I had last night about the range of table values by putting in logic to check the range and choose a suitable table element type. There are a couple of existing calls where we manage to fit the hashtable elements into int8 that way; of course, by definition that doesn't save a whole lot of space since such tables couldn't have many elements, but it seems cleaner anyway. regards, tom lane diff --git a/src/common/Makefile b/src/common/Makefile index 317b071..d0c2b97 100644 *** a/src/common/Makefile --- b/src/common/Makefile *** OBJS_FRONTEND = $(OBJS_COMMON) fe_memuti *** 63,68 --- 63,73 OBJS_SHLIB = $(OBJS_FRONTEND:%.o=%_shlib.o) OBJS_SRV = $(OBJS_COMMON:%.o=%_srv.o) + # where to find gen_keywordlist.pl and subsidiary files + TOOLSDIR = $(top_srcdir)/src/tools + GEN_KEYWORDLIST = $(PERL) -I $(TOOLSDIR) $(TOOLSDIR)/gen_keywordlist.pl + GEN_KEYWORDLIST_DEPS = $(TOOLSDIR)/gen_keywordlist.pl $(TOOLSDIR)/PerfectHash.pm + all: libpgcommon.a libpgcommon_shlib.a libpgcommon_srv.a distprep: kwlist_d.h *** libpgcommon_srv.a: $(OBJS_SRV) *** 118,125 $(CC) $(CFLAGS) $(subst -DFRONTEND,, $(CPPFLAGS)) -c $< -o $@ # generate SQL keyword lookup table to be included into keywords*.o. ! kwlist_d.h: $(top_srcdir)/src/include/parser/kwlist.h $(top_srcdir)/src/tools/gen_keywordlist.pl ! $(PERL) $(top_srcdir)/src/tools/gen_keywordlist.pl --extern $< # Dependencies of keywords*.o need to be managed explicitly to make sure # that you don't get broken parsing code, even in a non-enable-depend build. 
--- 123,130 $(CC) $(CFLAGS) $(subst -DFRONTEND,, $(CPPFLAGS)) -c $< -o $@ # generate SQL keyword lookup table to be included into keywords*.o. ! kwlist_d.h: $(top_srcdir)/src/include/parser/kwlist.h $(GEN_KEYWORDLIST_DEPS) ! $(GEN_KEYWORDLIST) --extern $< # Dependencies of keywords*.o need to be managed explicitly to make sure # that you don't get broken parsing code, even in a non-enable-depend build. diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index d72842e..9dc1fee 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *** *** 35,94 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *text, const ScanKeywordList *keywords) { ! int len, ! i; ! char word[NAMEDATALEN]; ! const char *kw_string; ! const uint16 *kw_offsets; ! const uint16 *low; ! const uint16 *high; ! ! len = strlen(text); if (len > keywords->max_kw_len) ! return -1;/* too long to be any keyword */ ! ! /* We assume all keywords are shorter than NAMEDATALEN. */ ! Assert(len < NAMEDATALEN); /* ! * Apply an ASCII-only downcasing. We must not use tolower() since it may ! * produce the wrong translation in some locales (eg, Turkish). */ ! for (i = 0; i < len; i++) ! { ! char ch = text[i]; ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! word[i] = ch; ! } ! word[len] = '\0'; /* ! * Now do a binary search using plain strcmp() comparison. */ ! kw_string = keywords->kw_string; ! kw_offsets = keywords->kw_offsets; ! low = kw_offsets; ! high = kw_offsets + (keywords->num_keywords - 1); ! while (low <= high) { ! const uint16 *middle; ! int difference; ! middle = low + (high - low) / 2; ! difference = strcmp(kw_string + *middle, word); ! if (difference == 0) ! return middle - kw_offsets; ! else if (difference < 0) ! low = middle + 1; ! else ! high = middle - 1; } ! return -1; } --- 35,89 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *str, const ScanKeywordList *keywords) { ! size_t len; ! int h; ! 
const char *kw; + /* + * Reject immediately if too long to be any keyword. This saves useless + * hashing and downcasing work on long strings. + */ + len = strlen(str); if (len > keywords->max_kw_len) ! return -1; /* ! * Compute the hash function. We assume it was generated to produce ! * case-insensitive results. Since it's a perfect hash, we need only ! * match to the specific keyword it identifies. */ ! h = keywords->hash(str, len); ! /* ! * An out-of-range result implies no match. (This can happen for ! * non-keyword inputs because the hash function will sum two unrelated ! * hashtable entries.) ! */ ! if (h < 0 || h >= keywords->num_keywords) ! return -1; /* ! * Compare character-by-character to see if we have a match, applying an ! * ASCII-only
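Putting the pieces together, the new ScanKeywordLookup() control flow can be sketched as a self-contained function. A hard-wired two-keyword "hash" stands in for the generated perfect hash here; only the length cutoff, range check, and downcase-as-you-compare loop mirror the patch.

```c
#include <stddef.h>
#include <string.h>

static const char *const toy_kws[] = {"add", "abort"};
#define TOY_NKW 2
#define TOY_MAX_KW_LEN 5

/* Invented stand-in for the generated hash: maps "add" -> 0 and
 * "abort" -> 1; other inputs may land anywhere, including out of range. */
static int toy_hash(const char *str, size_t len)
{
    (void) str;
    return ((int) len - 3) / 2;
}

int toy_keyword_lookup(const char *str)
{
    size_t len = strlen(str);
    int h;
    const char *kw;

    if (len > TOY_MAX_KW_LEN)
        return -1;              /* too long to be any keyword */

    h = toy_hash(str, len);
    if (h < 0 || h >= TOY_NKW)
        return -1;              /* out of range: certainly not a keyword */

    /* Verify the single candidate, downcasing ASCII A-Z on the fly;
     * tolower() is avoided because of locale surprises (e.g. Turkish). */
    kw = toy_kws[h];
    while (*str != '\0')
    {
        char ch = *str++;

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        if (ch != *kw++)
            return -1;
    }
    return (*kw == '\0') ? h : -1;
}
```

Compared with the old code, there is no NAMEDATALEN scratch buffer and no binary search: one hash, one candidate, one pass over the input.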
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2019-01-08 13:41:16 -0500, John Naylor wrote: > On Tue, Jan 8, 2019 at 12:06 PM Andrew Dunstan > wrote: > > On 1/7/19 7:52 PM, Andres Freund wrote: > > > Builtin functions for one, which we'd swatted down last time round due > > > to gperf's deficiencies. > > Do you mean the fmgr table? Not the entire fmgr table, but just the builtin oid index, generated by the following section: # Create the fmgr_builtins table, collect data for fmgr_builtin_oid_index print $tfh "\nconst FmgrBuiltin fmgr_builtins[] = {\n"; my %bmap; $bmap{'t'} = 'true'; $bmap{'f'} = 'false'; my @fmgr_builtin_oid_index; my $fmgr_count = 0; foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr) { print $tfh " { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, \"$s->{prosrc}\", $s->{prosrc} }"; $fmgr_builtin_oid_index[ $s->{oid} ] = $fmgr_count++; if ($fmgr_count <= $#fmgr) { print $tfh ",\n"; } else { print $tfh "\n"; } } print $tfh "};\n"; print $tfh qq| const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin)); |; The generated fmgr_builtin_oid_index is pretty sparse, and a more dense hashtable might e.g. be more efficient from a cache perspective. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Tue, Jan 8, 2019 at 12:06 PM Andrew Dunstan wrote: > On 1/7/19 7:52 PM, Andres Freund wrote: > > Builtin functions for one, which we'd swatted down last time round due > > to gperf's deficiencies. Do you mean the fmgr table? > >> But in any case, that sounds like a task for someone with > >> more sense of Perl style than I have. > > John, any chance you could help out with that... :) > > If he doesn't I will. I'll take a crack at separating into a module. I'll wait a bit in case there are any stylistic suggestions on the patch as it stands.
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 1/7/19 7:52 PM, Andres Freund wrote: > Hi, > > On 2019-01-07 19:37:51 -0500, Tom Lane wrote: >> Andres Freund writes: >>> Hm, shouldn't we extract the perfect hash generation into a perl module >>> or such? It seems that there's plenty of other possible uses for it. >> Such as? > Builtin functions for one, which we'd swatted down last time round due > to gperf's deficiencies. But I think there's plenty more potential, > e.g. it'd make sense from a performance POV to use a perfect hash > function for locks on builtin objects (the hashtable for lookups therein > shows up prominently in a fair number of profiles, and they are a large > percentage of the acquisitions). I'm certain there's plenty more, I've > not thought too much about it. > Yeah, this is pretty neat. >> But in any case, that sounds like a task for someone with >> more sense of Perl style than I have. > John, any chance you could help out with that... :) > If he doesn't I will. cheers andrew -- Andrew Dunstan  https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2019-01-07 19:37:51 -0500, Tom Lane wrote: > Andres Freund writes: > > Hm, shouldn't we extract the perfect hash generation into a perl module > > or such? It seems that there's plenty of other possible uses for it. > > Such as? Builtin functions for one, which we'd swatted down last time round due to gperf's deficiencies. But I think there's plenty more potential, e.g. it'd make sense from a performance POV to use a perfect hash function for locks on builtin objects (the hashtable for lookups therein shows up prominently in a fair number of profiles, and they are a large percentage of the acquisitions). I'm certain there's plenty more, I've not thought too much about it. > But in any case, that sounds like a task for someone with > more sense of Perl style than I have. John, any chance you could help out with that... :) Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > Hm, shouldn't we extract the perfect hash generation into a perl module > or such? It seems that there's plenty of other possible uses for it. Such as? But in any case, that sounds like a task for someone with more sense of Perl style than I have. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote: > Probably there's a lot to be criticized about the Perl style below; > anybody feel a need to rewrite it? Here's a somewhat better version. I realized that I was being too slavishly tied to the data structures used in the C version; in Perl it's easier to manage the lists of edges as hashes. I can't see any need to distinguish left and right edge sets, either, so this just has one such hash per vertex. Also, it seems to me that we *can* make intelligent use of unused hashtable entries to exit early on many non-keyword inputs. The reason the existing code fails to do so is that it computes the sums and differences of hashtable entries in unsigned modulo arithmetic; but if we make the hashtable entries signed, we can set them up as exact differences and drop the final modulo operation in the hash function. Then, any out-of-range sum must indicate an input that is not a keyword (because it is touching a pair of hashtable entries that didn't go together) and we can exit early from the caller. This in turn lets us mark unused hashtable entries with large values to ensure that sums involving them will be out of range. A weak spot in that argument is that it's not entirely clear how large the differences can get --- with an unlucky series of collisions, maybe they could get large enough to overflow int16? I don't think that's likely for the size of problem this script is going to encounter, so I just put in an error check for it. But it could do with closer analysis before deciding that this is a general-purpose solution. regards, tom lane diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index d72842e..9dc1fee 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *** *** 35,94 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *text, const ScanKeywordList *keywords) { ! int len, ! i; ! char word[NAMEDATALEN]; ! const char *kw_string; ! const uint16 *kw_offsets; ! const uint16 *low; ! const uint16 *high; ! ! 
len = strlen(text); if (len > keywords->max_kw_len) ! return -1;/* too long to be any keyword */ ! ! /* We assume all keywords are shorter than NAMEDATALEN. */ ! Assert(len < NAMEDATALEN); /* ! * Apply an ASCII-only downcasing. We must not use tolower() since it may ! * produce the wrong translation in some locales (eg, Turkish). */ ! for (i = 0; i < len; i++) ! { ! char ch = text[i]; ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! word[i] = ch; ! } ! word[len] = '\0'; /* ! * Now do a binary search using plain strcmp() comparison. */ ! kw_string = keywords->kw_string; ! kw_offsets = keywords->kw_offsets; ! low = kw_offsets; ! high = kw_offsets + (keywords->num_keywords - 1); ! while (low <= high) { ! const uint16 *middle; ! int difference; ! middle = low + (high - low) / 2; ! difference = strcmp(kw_string + *middle, word); ! if (difference == 0) ! return middle - kw_offsets; ! else if (difference < 0) ! low = middle + 1; ! else ! high = middle - 1; } ! return -1; } --- 35,89 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *str, const ScanKeywordList *keywords) { ! size_t len; ! int h; ! const char *kw; + /* + * Reject immediately if too long to be any keyword. This saves useless + * hashing and downcasing work on long strings. + */ + len = strlen(str); if (len > keywords->max_kw_len) ! return -1; /* ! * Compute the hash function. We assume it was generated to produce ! * case-insensitive results. Since it's a perfect hash, we need only ! * match to the specific keyword it identifies. */ ! h = keywords->hash(str, len); ! /* ! * An out-of-range result implies no match. (This can happen for ! * non-keyword inputs because the hash function will sum two unrelated ! * hashtable entries.) ! */ ! if (h < 0 || h >= keywords->num_keywords) ! return -1; /* ! * Compare character-by-character to see if we have a match, applying an ! * ASCII-only downcasing to the input characters. We must not use ! 
* tolower() since it may produce the wrong translation in some locales ! * (eg, Turkish). */ ! kw = GetScanKeyword(h, keywords); ! while (*str != '\0') { ! char ch = *str++; ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! if (ch != *kw++) ! return -1; } + if (*kw != '\0') + return -1; ! /* Success! */ ! return h; } diff --git a/src/include/common/kwlookup.h b/src/include/common/kwlookup.h index 39efb35..dbff367 100644 *** a/src/include/common/kwlookup.h --- b/src/include/common/kwlookup.h *** *** 14,19 --- 14,22 #ifndef KWLOOKUP_H #define KWLOOKUP_H + /* Hash function used by ScanKeywordLookup */ + typedef int (*ScanKeywordHashFunc) (const void *key, size_t keylen); + /*
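[Editor's sketch] To make the signed-difference trick described above concrete, here is a minimal standalone example of mine, not the generated code: a hand-built table for the two-keyword set {"as", "at"}, with multipliers 31/127 chosen arbitrarily and unused slots set to a large sentinel so that any sum touching them falls outside 0..num_keywords-1 and the caller can reject without a string comparison.

```c
#include <stddef.h>
#include <string.h>

/*
 * Toy perfect-hash table for the keyword set {"as", "at"}.
 * Entries are signed exact differences (the trick described above), so
 * no final modulo is needed; unused slots hold a large sentinel so that
 * any sum touching them is out of range and the caller can exit early.
 */
#define NUM_KEYWORDS 2
#define GSIZE 7

static const short g[GSIZE] = {0, 0, 0, 1, 9999, 9999, 9999};
static const char *const keywords[NUM_KEYWORDS] = {"as", "at"};

static int
keyword_hash(const char *key, size_t len)
{
    unsigned int a = 0, b = 0;

    while (len--)
    {
        unsigned char c = (unsigned char) *key++;

        a = a * 31 + c;
        b = b * 127 + c;
    }
    return (int) g[a % GSIZE] + (int) g[b % GSIZE];
}

/* Returns keyword index, or -1 if not a keyword. */
static int
keyword_lookup(const char *word)
{
    int h = keyword_hash(word, strlen(word));

    /* Out-of-range sum: the input touched unrelated (or unused) slots. */
    if (h < 0 || h >= NUM_KEYWORDS)
        return -1;
    /* An in-range hash still needs one exact comparison. */
    return strcmp(keywords[h], word) == 0 ? h : -1;
}
```

An in-range hash can still be a coincidence (for instance, "by" happens to sum to 0 here), which is why the final comparison is kept, exactly as in the patch.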
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2019-01-07 16:11:04 -0500, Tom Lane wrote: > I wrote: > > I took a quick look through the NetBSD nbperf sources at > > http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/nbperf/ > > and I concur with your judgment that we could manage translating > > that into Perl, especially if we only implement the parts we need. > > Here's an implementation of that, using the hash functions you showed > upthread. The speed of the Perl script seems to be pretty acceptable; > less than 100ms to handle the main SQL keyword list, on my machine. > Yeah, the C version might be less than 1ms, but I don't think that > we need to put up with non-Perl build tooling for that. > > Using the same test case as before (parsing information_schema.sql), > I get runtimes around 3560 ms, a shade better than my jury-rigged > prototype. > > Probably there's a lot to be criticized about the Perl style below; > anybody feel a need to rewrite it? Hm, shouldn't we extract the perfect hash generation into a perl module or such? It seems that there's plenty other possible uses for it. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote: > I took a quick look through the NetBSD nbperf sources at > http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/nbperf/ > and I concur with your judgment that we could manage translating > that into Perl, especially if we only implement the parts we need. Here's an implementation of that, using the hash functions you showed upthread. The speed of the Perl script seems to be pretty acceptable; less than 100ms to handle the main SQL keyword list, on my machine. Yeah, the C version might be less than 1ms, but I don't think that we need to put up with non-Perl build tooling for that. Using the same test case as before (parsing information_schema.sql), I get runtimes around 3560 ms, a shade better than my jury-rigged prototype. Probably there's a lot to be criticized about the Perl style below; anybody feel a need to rewrite it? regards, tom lane diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index d72842e..445b99d 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *** *** 35,94 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *text, const ScanKeywordList *keywords) { ! int len, ! i; ! char word[NAMEDATALEN]; ! const char *kw_string; ! const uint16 *kw_offsets; ! const uint16 *low; ! const uint16 *high; ! ! len = strlen(text); if (len > keywords->max_kw_len) ! return -1;/* too long to be any keyword */ ! ! /* We assume all keywords are shorter than NAMEDATALEN. */ ! Assert(len < NAMEDATALEN); /* ! * Apply an ASCII-only downcasing. We must not use tolower() since it may ! * produce the wrong translation in some locales (eg, Turkish). */ ! for (i = 0; i < len; i++) ! { ! char ch = text[i]; ! ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! word[i] = ch; ! } ! word[len] = '\0'; /* ! * Now do a binary search using plain strcmp() comparison. */ ! kw_string = keywords->kw_string; ! kw_offsets = keywords->kw_offsets; ! low = kw_offsets; ! high = kw_offsets + (keywords->num_keywords - 1); ! 
while (low <= high) { ! const uint16 *middle; ! int difference; ! middle = low + (high - low) / 2; ! difference = strcmp(kw_string + *middle, word); ! if (difference == 0) ! return middle - kw_offsets; ! else if (difference < 0) ! low = middle + 1; ! else ! high = middle - 1; } ! return -1; } --- 35,81 * receive a different case-normalization mapping. */ int ! ScanKeywordLookup(const char *str, const ScanKeywordList *keywords) { ! size_t len; ! uint32 h; ! const char *kw; + /* + * Reject immediately if too long to be any keyword. This saves useless + * hashing and downcasing work on long strings. + */ + len = strlen(str); if (len > keywords->max_kw_len) ! return -1; /* ! * Compute the hash function. We assume it was generated to produce ! * case-insensitive results. Since it's a perfect hash, we need only ! * match to the specific keyword it identifies. */ ! h = keywords->hash(str, len); /* ! * Compare character-by-character to see if we have a match, applying an ! * ASCII-only downcasing to the input characters. We must not use ! * tolower() since it may produce the wrong translation in some locales ! * (eg, Turkish). */ ! kw = GetScanKeyword(h, keywords); ! while (*str != '\0') { ! char ch = *str++; ! if (ch >= 'A' && ch <= 'Z') ! ch += 'a' - 'A'; ! if (ch != *kw++) ! return -1; } + if (*kw != '\0') + return -1; ! /* Success! */ ! return h; } diff --git a/src/include/common/kwlookup.h b/src/include/common/kwlookup.h index 39efb35..a6609ee 100644 *** a/src/include/common/kwlookup.h --- b/src/include/common/kwlookup.h *** *** 14,19 --- 14,22 #ifndef KWLOOKUP_H #define KWLOOKUP_H + /* Hash function used by ScanKeywordLookup */ + typedef uint32 (*ScanKeywordHashFunc) (const void *key, size_t keylen); + /* * This struct contains the data needed by ScanKeywordLookup to perform a * search within a set of keywords. 
The contents are typically generated by *** typedef struct ScanKeywordList *** 23,28 --- 26,32 { const char *kw_string; /* all keywords in order, separated by \0 */ const uint16 *kw_offsets; /* offsets to the start of each keyword */ + ScanKeywordHashFunc hash; /* perfect hash function for keywords */ int num_keywords; /* number of keywords */ int max_kw_len; /* length of longest keyword */ } ScanKeywordList; diff --git a/src/interfaces/ecpg/preproc/Makefile b/src/interfaces/ecpg/preproc/Makefile index b5b74a3..abfe3cc 100644 *** a/src/interfaces/ecpg/preproc/Makefile --- b/src/interfaces/ecpg/preproc/Makefile *** preproc.y: ../../../backend/parser/gram. *** 57,63 # generate keyword headers c_kwlist_d.h: c_kwlist.h $(GEN_KEYWORDLIST) ! $(PERL) $(GEN_KEYWORDL
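[Editor's sketch] For readers who don't want to dig through nbperf itself, the core of the CHM construction that the script implements can be sketched in a few dozen lines of C. This is a toy of my own: the 31/127 multipliers and table size are arbitrary, and a real generator retries with fresh hash seeds when the edge graph has a cycle, which this sketch merely reports.

```c
#include <string.h>

/*
 * Toy CHM-style perfect-hash construction: each keyword becomes an edge
 * (h1(kw), h2(kw)) in a graph over GSIZE vertices.  If the graph is
 * acyclic, a BFS over each tree can fix up g[] so that
 * g[h1] + g[h2] == the keyword's index.
 */
#define GSIZE 11

static const char *const keywords[] = {"as", "at", "by"};
#define NUM_KEYWORDS 3

static int g[GSIZE];

static void
edge_for(const char *s, int *v1, int *v2)
{
    unsigned int a = 0, b = 0;

    for (; *s; s++)
    {
        a = a * 31 + (unsigned char) *s;
        b = b * 127 + (unsigned char) *s;
    }
    *v1 = a % GSIZE;
    *v2 = b % GSIZE;
}

/* Returns 1 on success, 0 if the edge graph turned out to be cyclic. */
static int
build(void)
{
    int e[NUM_KEYWORDS][2];
    int visited[GSIZE] = {0};
    int queue[GSIZE];
    int i, v;

    for (i = 0; i < NUM_KEYWORDS; i++)
        edge_for(keywords[i], &e[i][0], &e[i][1]);

    for (v = 0; v < GSIZE; v++)
    {
        int qh = 0, qt = 0;

        if (visited[v])
            continue;
        visited[v] = 1;
        g[v] = 0;
        queue[qt++] = v;
        while (qh < qt)
        {
            int u = queue[qh++];

            for (i = 0; i < NUM_KEYWORDS; i++)
            {
                int other;

                if (e[i][0] == u)
                    other = e[i][1];
                else if (e[i][1] == u)
                    other = e[i][0];
                else
                    continue;
                if (!visited[other])
                {
                    visited[other] = 1;
                    g[other] = i - g[u];    /* force g[h1] + g[h2] == i */
                    queue[qt++] = other;
                }
                else if (g[e[i][0]] + g[e[i][1]] != i)
                    return 0;   /* cycle: a real generator would retry */
            }
        }
    }
    return 1;
}

static int
lookup(const char *word)
{
    int v1, v2, h;

    edge_for(word, &v1, &v2);
    h = ((g[v1] + g[v2]) % NUM_KEYWORDS + NUM_KEYWORDS) % NUM_KEYWORDS;
    /* the perfect hash only narrows to one candidate; still must compare */
    return strcmp(keywords[h], word) == 0 ? h : -1;
}
```

The signed-difference refinement described upthread goes one step further by dropping the modulo in lookup() and marking unused g[] slots with sentinels, so out-of-range sums reject non-keywords early.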
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 1/6/19, Tom Lane wrote: > I've pushed that version (v8 + max_kw_len); if the buildfarm doesn't > fall over, we can move on with looking at hashing. Thank you. The API adjustment looks good, and I'm glad that splitting out the aux info led to even further cleanups. -John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Joerg Sonnenberger writes: > On Sun, Jan 06, 2019 at 02:29:05PM -0500, Tom Lane wrote: >> * We should extend the ScanKeywordList representation to include a >> field holding the longest keyword length in the table, which >> gen_keywordlist.pl would have no trouble providing. Then we could >> skip downcasing and/or hashing for any word longer than that, replacing >> the current NAMEDATALEN test, and thereby putting a tight bound on >> the cost of downcasing and/or hashing. > Correct, possibly even have an array for each class of keywords. I added that change to v8 and noted a further small improvement in my test case. That probably says something about the prevalence of long identifiers in information_schema.sql ;-), but anyway we can figure it's not a net loss. I've pushed that version (v8 + max_kw_len); if the buildfarm doesn't fall over, we can move on with looking at hashing. I took a quick look through the NetBSD nbperf sources at http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/nbperf/ and I concur with your judgment that we could manage translating that into Perl, especially if we only implement the parts we need. I'm curious what further changes you've made locally, and what parameters you were using. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Joerg Sonnenberger writes: > On Sun, Jan 06, 2019 at 02:29:05PM -0500, Tom Lane wrote: >> So we probably can't have inlined hashing code --- I imagine the >> hash generator needs the flexibility to pick different values of >> those multipliers. > Right now, only the initial values are randomized. Picking a different > set of hash functions is possible, but someone that should be done only > if there is an actual need. That was what I meant with stronger mixing > might be necessary for "annoying" keyword additions. Hmm. I'm still leaning towards using generated, out-of-line hash functions though, because then we could have a generator switch indicating whether to apply the |0x20 case coercion or not. (I realize that we could blow off that consideration and use a case-insensitive hash function all the time, but it seems cleaner to me not to make assumptions about how variable the hash function parameters will need to be.) > There are two ways for dealing with it: > (1) Have one big hash table with all the various keywords and a class > mask stored. If there is enough overlap between the keyword tables, it > can significantly reduce the amount of space needed. In terms of code > complexity, it adds one class check at the end, i.e. a bitmap test. No, this would be a bad idea IMO, because it makes the core, plpgsql, and ecpg keyword sets all interdependent. If you add a keyword to any one of those and forget to rebuild the other components, you got trouble. Maybe we could make that reliable, but I don't think it's worth fooling with for hypothetical benefit. Also, it'd make the net space usage more not less, because each of those executables/shlibs would contain copies of all the keywords for the other ones' needs. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Sun, Jan 06, 2019 at 02:29:05PM -0500, Tom Lane wrote: > * It's too bad that the hash function doesn't have a return convention > that allows distinguishing "couldn't possibly match any keyword" from > "might match keyword 0". I imagine a lot of the zero entries in its > hashtable could be interpreted as the former, so at the cost of perhaps > a couple more if-tests we could skip work at the caller. As this patch > stands, we could only skip the strcmp() so it might not be worth the > trouble --- but if we use Joerg's |0x20 hack then we could hash before > downcasing, allowing us to skip the downcasing step if the word couldn't > possibly be a keyword. Likely that would be worth the trouble. The hash function itself doesn't have enough data in it to know whether it's a match or not. A strcmp or memcmp at the end will always be necessary if you don't already know that it is a keyword. > * We should extend the ScanKeywordList representation to include a > field holding the longest keyword length in the table, which > gen_keywordlist.pl would have no trouble providing. Then we could > skip downcasing and/or hashing for any word longer than that, replacing > the current NAMEDATALEN test, and thereby putting a tight bound on > the cost of downcasing and/or hashing. Correct, possibly even have an array for each class of keywords. > * If we do hash first, then we could replace the downcasing loop and > strcmp with an integrated loop that downcases and compares one > character at a time, removing the need for the NAMEDATALEN-sized > buffer variable. This is also an option. Assuming that the character set for keywords doesn't change (letter or _), one 64-bit bit test per input character would ensure that the |0x20 hack gives a correct result for comparison as well. Any other input would be an instant mismatch. If digits are valid keyword characters, it would be two tests. > * I think it matters to the speed of the hashing loop that the > magic multipliers are hard-coded. 
(Examining the assembly code > shows that, at least on my x86_64 hardware, gcc prefers to use > shift-and-add sequences here instead of multiply instructions.) Right, that's one of the reasons for choosing them. The other is that it gives decent mixing for ASCII-only input. At the moment, they are hard-coded. > So we probably can't have inlined hashing code --- I imagine the > hash generator needs the flexibility to pick different values of > those multipliers. Right now, only the initial values are randomized. Picking a different set of hash functions is possible, but that should be done only if there is an actual need. That was what I meant when I said stronger mixing might be necessary for "annoying" keyword additions. > I envision making this work by having > gen_keywordlist.pl emit a function definition for the hash step and > include a function pointer to it in ScanKeywordList. That extra > function call will make things fractionally slower than what I have > here, but I don't think it will change the conclusions any. This > approach would also greatly alleviate the concern I had yesterday > about ecpg's c_keywords.c having a second copy of the hashing code; > what it would have is its own generated function, which isn't much > of a problem. There are two ways for dealing with it: (1) Have one big hash table with all the various keywords and a class mask stored. If there is enough overlap between the keyword tables, it can significantly reduce the amount of space needed. In terms of code complexity, it adds one class check at the end, i.e. a bitmap test. (2) Build independent hash tables for each input class. A bit simpler to manage, but can result in a bit of wasted space. From the generator side, it doesn't matter which choice is taken. 
> * Given that the generator's runtime is negligible when coded in C, > I suspect that we might able to tolerate the speed hit from translating > it to Perl, and frankly I'd much rather do that than cope with the > consequences of including C code in our build process. I'm just not fluent enough in Perl to be much help for that, but I can sit down and write a trivial Python version of it :) There are a couple of changes that are useful to have in this context, e.g. the ability to directly provide the offsets in the result table to allow dropping the index -> offset table completely. Joerg
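[Editor's sketch] The "one 64-bit bit test per input character" Joerg describes can be sketched like this; the mask construction is my own illustration of the idea, not nbperf code. After c | 0x20, every valid keyword character ([A-Za-z_]) lands in a single 64-value window, so one shift-and-test both validates the character class and guarantees the |0x20 case-fold is correct.

```c
#include <stdint.h>

/*
 * One 64-bit test to check that a byte can be part of a keyword
 * ([A-Za-z_]) *and* that c | 0x20 case-folds it correctly.
 * After c | 0x20, letters land on 'a'..'z' (97..122) and '_' (0x5F)
 * lands on 0x7F (127); subtracting 0x40 maps those to bits 33..58
 * and 63 of a 64-bit mask.  (DEL, 0x7F, also maps to bit 63, but a
 * false positive is harmless: the final memcmp still decides.)
 */
static int
is_keyword_char(unsigned char c)
{
    /* bits 33..58 for 'a'..'z', bit 63 for '_' */
    const uint64_t mask = (((UINT64_C(1) << 26) - 1) << 33) | (UINT64_C(1) << 63);
    unsigned int x = (unsigned int) (c | 0x20) - 0x40;

    /* x >= 64 (including wrapped-around values) is an instant mismatch */
    return x < 64 && ((mask >> x) & 1);
}
```

Digits fall outside this window ('0'..'9' | 0x20 stays at 0x30..0x39, which underflows after the subtraction), so admitting them would need a second window and test, matching Joerg's remark above.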
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Joerg Sonnenberger writes: > On Mon, Jan 07, 2019 at 03:11:55AM +1300, David Rowley wrote: >> What I'm most interested in is how long it took to generate the hash >> function in hash2.c? > It's within the noise floor of time(1) on my laptop, e.g. ~1ms. I decided to do some simple performance measurements to see if we're actually getting any useful results here. I set up a test case that just runs raw_parser in a tight loop: while (count-- > 0) { List *parsetree_list; MemoryContext oldcontext; oldcontext = MemoryContextSwitchTo(mycontext); parsetree_list = pg_parse_query(query_string); MemoryContextSwitchTo(oldcontext); MemoryContextReset(mycontext); CHECK_FOR_INTERRUPTS(); } and exercised it on the contents of information_schema.sql. I think that's a reasonably representative test case considering that we're only examining the speed of the flex+bison stages. (Since it's mostly DDL, including parse analysis might be a bit unlike normal workloads, but for raw parsing it should be fine.) On my workstation, in a non-cassert build, HEAD requires about 4700 ms for 1000 iterations (with maybe 1% cross-run variation). Applying the v8 patch I posted yesterday, the time drops to ~4520 ms or about a 4% savings. So not a lot, but it's pretty repeatable, and it shows that reducing the cache footprint of keyword lookup is worth something. I then tried plastering in Joerg's example hash function, as in the attached delta patch on top of v8. This is *not* a usable patch; it breaks plpgsql and ecpg, because ScanKeywordLookup no longer works for non-core keyword sets. But that doesn't matter for the information_schema test case, and on that I find the runtime drops to ~3820 ms, or 19% better than HEAD. So this is definitely an idea worth pursuing. Some additional thoughts: * It's too bad that the hash function doesn't have a return convention that allows distinguishing "couldn't possibly match any keyword" from "might match keyword 0". 
I imagine a lot of the zero entries in its hashtable could be interpreted as the former, so at the cost of perhaps a couple more if-tests we could skip work at the caller. As this patch stands, we could only skip the strcmp() so it might not be worth the trouble --- but if we use Joerg's |0x20 hack then we could hash before downcasing, allowing us to skip the downcasing step if the word couldn't possibly be a keyword. Likely that would be worth the trouble. * We should extend the ScanKeywordList representation to include a field holding the longest keyword length in the table, which gen_keywordlist.pl would have no trouble providing. Then we could skip downcasing and/or hashing for any word longer than that, replacing the current NAMEDATALEN test, and thereby putting a tight bound on the cost of downcasing and/or hashing. * If we do hash first, then we could replace the downcasing loop and strcmp with an integrated loop that downcases and compares one character at a time, removing the need for the NAMEDATALEN-sized buffer variable. * I think it matters to the speed of the hashing loop that the magic multipliers are hard-coded. (Examining the assembly code shows that, at least on my x86_64 hardware, gcc prefers to use shift-and-add sequences here instead of multiply instructions.) So we probably can't have inlined hashing code --- I imagine the hash generator needs the flexibility to pick different values of those multipliers. I envision making this work by having gen_keywordlist.pl emit a function definition for the hash step and include a function pointer to it in ScanKeywordList. That extra function call will make things fractionally slower than what I have here, but I don't think it will change the conclusions any. This approach would also greatly alleviate the concern I had yesterday about ecpg's c_keywords.c having a second copy of the hashing code; what it would have is its own generated function, which isn't much of a problem. 
* Given that the generator's runtime is negligible when coded in C, I suspect that we might be able to tolerate the speed hit from translating it to Perl, and frankly I'd much rather do that than cope with the consequences of including C code in our build process. I'm eagerly awaiting seeing Joerg's code, but I think in the meantime I'm going to go look up NetBSD's nbperf to get an idea of how painful it might be to do in Perl. (Now, bearing in mind that I'm not exactly fluent in Perl, there are probably other people around here who could produce a better translation ...) regards, tom lane diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index db62623..8c55f40 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *** *** 17,22 --- 17,149 #include "common/kwlookup.h" + static uint32 + hash(const void *key, size_t keylen) + { + static const uint16 g[881] = { + 0x015b, 0x, 0x0070, 0x
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Mon, Jan 07, 2019 at 03:11:55AM +1300, David Rowley wrote: > What I'm most interested in is how long it took to generate the hash > function in hash2.c? It's within the noise floor of time(1) on my laptop, e.g. ~1ms. Joerg
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Sat, 5 Jan 2019 at 09:20, John Naylor wrote: > > On 1/3/19, Joerg Sonnenberger wrote: > > Hello John, > > I was pointed at your patch on IRC and decided to look into adding my > > own pieces. What I can provide you is a fast perfect hash function > > generator. I've attached a sample hash function based on the current > > main keyword list. hash() essentially gives you the number of the only > > possible match, a final strcmp/memcmp is still necessary to verify that > > it is an actual keyword though. The |0x20 can be dropped if all cases > > have pre-lower-cased the input already. This would replace the binary > > search in the lookup functions. Returning offsets directly would be easy > > as well. That allows writing a single string where each entry is prefixed > > with a type mask, the token id, the length of the keyword and the actual > > keyword text. Does that sound useful to you? > > Judging by previous responses, there is still interest in using > perfect hash functions, so thanks for this. I'm not knowledgeable > enough to judge its implementation, so I'll leave that for others. Well, I'm quite impressed by the resulting hash function. The resulting hash value can be used directly to index the existing 440 element ScanKeywords[] array (way better than the 1815 element array Andrew got from gperf). If we also happened to store the length of the keyword in that array then we could compare the length of the word after hashing. If the length is the same then we could perform a memcmp() to confirm the match, which should be a little cheaper than a strcmp(), and we should be able to store the length for free on 64-bit machines. If the length is not the same then it's not a keyword. It may also save some cycles to determine the input word's length at the same time as lowering it. 
The keyword length could also be easily determined by changing the PG_KEYWORD macro to become: #define PG_KEYWORD(a,b,c) {a,0,c,sizeof(a)-1}, after, of course adding a new field to the ScanKeyword struct. What I'm most interested in is how long it took to generate the hash function in hash2.c? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
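[Editor's sketch] The length-in-the-struct idea above can be sketched as follows. The struct layout and PG_KEYWORD shape follow David's outline; the token/category values here are dummies, and the match_keyword() helper is illustrative, not an actual PostgreSQL function.

```c
#include <string.h>

/*
 * Keyword entry that carries its own length, so the post-hash check can
 * reject on length before comparing any bytes, and use memcmp() instead
 * of strcmp() when the lengths do match.
 */
typedef struct ScanKeyword
{
    const char *name;
    short       value;          /* grammar token (dummy 0 here) */
    short       category;
    unsigned char len;          /* keyword length, computed at compile time */
} ScanKeyword;

/* sizeof(a) - 1 is strlen(a) for a string literal, evaluated by the compiler */
#define PG_KEYWORD(a,b,c) {a, 0, c, sizeof(a) - 1},

static const ScanKeyword ScanKeywords[] = {
    PG_KEYWORD("as", 0, 0)
    PG_KEYWORD("at", 0, 0)
};

/*
 * Given the candidate index h from a perfect hash, confirm the match.
 * The caller is assumed to have computed wordlen while downcasing the
 * input, as suggested above.
 */
static int
match_keyword(int h, const char *word, size_t wordlen)
{
    const ScanKeyword *kw = &ScanKeywords[h];

    if (wordlen != kw->len)
        return -1;              /* cheap rejection, no byte comparison */
    return memcmp(kw->name, word, wordlen) == 0 ? h : -1;
}
```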
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote: > I spent some time hacking on this today, and I think it's committable > now, but I'm putting it back up in case anyone wants to have another > look (and also so the cfbot can check it on Windows). ... and indeed, the cfbot doesn't like it. Here's v8, with the missing addition to Mkvcbuild.pm. regards, tom lane diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index e8ef966..9131991 100644 *** a/contrib/pg_stat_statements/pg_stat_statements.c --- b/contrib/pg_stat_statements/pg_stat_statements.c *** fill_in_constant_lengths(pgssJumbleState *** 3075,3082 /* initialize the flex scanner --- should match raw_parser() */ yyscanner = scanner_init(query, &yyextra, ! ScanKeywords, ! NumScanKeywords); /* we don't want to re-emit any escape string warnings */ yyextra.escape_string_warning = false; --- 3075,3082 /* initialize the flex scanner --- should match raw_parser() */ yyscanner = scanner_init(query, &yyextra, ! &ScanKeywords, ! ScanKeywordTokens); /* we don't want to re-emit any escape string warnings */ yyextra.escape_string_warning = false; diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index 7e9b122..4c0c258 100644 *** a/src/backend/parser/parser.c --- b/src/backend/parser/parser.c *** raw_parser(const char *str) *** 41,47 /* initialize the flex scanner */ yyscanner = scanner_init(str, &yyextra.core_yy_extra, ! ScanKeywords, NumScanKeywords); /* base_yylex() only needs this much initialization */ yyextra.have_lookahead = false; --- 41,47 /* initialize the flex scanner */ yyscanner = scanner_init(str, &yyextra.core_yy_extra, ! 
&ScanKeywords, ScanKeywordTokens); /* base_yylex() only needs this much initialization */ yyextra.have_lookahead = false; diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l index fbeb86f..e1cae85 100644 *** a/src/backend/parser/scan.l --- b/src/backend/parser/scan.l *** bool escape_string_warning = true; *** 67,72 --- 67,87 bool standard_conforming_strings = true; /* + * Constant data exported from this file. This array maps from the + * zero-based keyword numbers returned by ScanKeywordLookup to the + * Bison token numbers needed by gram.y. This is exported because + * callers need to pass it to scanner_init, if they are using the + * standard keyword list ScanKeywords. + */ + #define PG_KEYWORD(kwname, value, category) value, + + const uint16 ScanKeywordTokens[] = { + #include "parser/kwlist.h" + }; + + #undef PG_KEYWORD + + /* * Set the type of YYSTYPE. */ #define YYSTYPE core_YYSTYPE *** other . *** 504,521 * We will pass this along as a normal character string, * but preceded with an internally-generated "NCHAR". */ ! const ScanKeyword *keyword; SET_YYLLOC(); yyless(1); /* eat only 'n' this time */ ! keyword = ScanKeywordLookup("nchar", ! yyextra->keywords, ! yyextra->num_keywords); ! if (keyword != NULL) { ! yylval->keyword = keyword->name; ! return keyword->value; } else { --- 519,536 * We will pass this along as a normal character string, * but preceded with an internally-generated "NCHAR". */ ! int kwnum; SET_YYLLOC(); yyless(1); /* eat only 'n' this time */ ! kwnum = ScanKeywordLookup("nchar", ! yyextra->keywordlist); ! if (kwnum >= 0) { ! yylval->keyword = GetScanKeyword(kwnum, ! yyextra->keywordlist); ! return yyextra->keyword_tokens[kwnum]; } else { *** other . *** 1021,1039 {identifier} { ! const ScanKeyword *keyword; char *ident; SET_YYLLOC(); /* Is it a keyword? */ ! keyword = ScanKeywordLookup(yytext, ! yyextra->keywords, ! yyextra->num_keywords); ! if (keyword != NULL) { ! yylval->keyword = keyword->name; ! 
return keyword->value; } /* --- 1036,1054 {identifier} { ! int kwnum; char *ident; SET_YYLLOC(); /* Is it a keyword? */ ! kwnum = ScanKeywordLookup(yytext, ! yyextra->keywordlist); ! if (kwnum >= 0) { ! yylval->keyword = GetScanKeyword(kwnum, ! yyextra->keywordlist); ! return yyextra->keyword_tokens[kwnum]; } /* *** scanner_yyerror(const char *message, cor *** 1142,1149 core_yyscan_t scanner_init(const char *str, core_yy_extra_type *yyext, ! const ScanKeyword *keywords, ! int num_keywords) { Size slen = strlen(str); yyscan_t scanner; --- 115
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > [ v6-0001-Use-offset-based-keyword-lookup.patch ] I spent some time hacking on this today, and I think it's committable now, but I'm putting it back up in case anyone wants to have another look (and also so the cfbot can check it on Windows). Given the discussion about possibly switching to perfect hashing, I thought it'd be a good idea to try to make the APIs less dependent on the exact table representation. So in the attached, I created a struct ScanKeywordList that holds all the data ScanKeywordLookup needs, and the generated headers declare variables of that type, and we just pass around a pointer to that instead of passing several different things. I also went ahead with the idea of splitting the category and token data into separate arrays. That allows moving the backend token array out of src/common entirely, which I think is a good thing because of the dependency situation: we no longer need to run the bison build before we can compile src/common/keywords_srv.o. There's one remaining refactoring issue that I think we'd want to consider before trying to jack this up and wheel a perfect-hash lookup under it: where to do the downcasing transform. Right now, ecpg's c_keywords.c has its own copy of the binary-search logic because it doesn't want the downcasing transform that ScanKeywordLookup does. So unless we want it to also have a copy of the hash lookup logic, we need to rearrange that somehow. We could give ScanKeywordLookup a "bool downcase" argument, or we could refactor things so that the downcasing is done by callers if they need it (which many don't). I'm not very sure which of those three alternatives is best. 
My argument upthread that we could always do the downcasing before keyword lookup now feels a bit shaky, because I was reminded while working on this code that we actually have different downcasing rules for keywords and identifiers (yes, really), so that it's not possible for those code paths to share a downcasing transform. So the idea of moving the keyword-downcasing logic to the callers is likely to not work out quite as nicely as I thought. (This might also mean that I was overly hasty to reject Joerg's |0x20 hack. It's still an ugly hack, but it would save doing the keyword downcasing transform if we don't get a hashcode match...) regards, tom lane diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index e8ef966..9131991 100644 *** a/contrib/pg_stat_statements/pg_stat_statements.c --- b/contrib/pg_stat_statements/pg_stat_statements.c *** fill_in_constant_lengths(pgssJumbleState *** 3075,3082 /* initialize the flex scanner --- should match raw_parser() */ yyscanner = scanner_init(query, &yyextra, ! ScanKeywords, ! NumScanKeywords); /* we don't want to re-emit any escape string warnings */ yyextra.escape_string_warning = false; --- 3075,3082 /* initialize the flex scanner --- should match raw_parser() */ yyscanner = scanner_init(query, &yyextra, ! &ScanKeywords, ! ScanKeywordTokens); /* we don't want to re-emit any escape string warnings */ yyextra.escape_string_warning = false; diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index 7e9b122..4c0c258 100644 *** a/src/backend/parser/parser.c --- b/src/backend/parser/parser.c *** raw_parser(const char *str) *** 41,47 /* initialize the flex scanner */ yyscanner = scanner_init(str, &yyextra.core_yy_extra, ! ScanKeywords, NumScanKeywords); /* base_yylex() only needs this much initialization */ yyextra.have_lookahead = false; --- 41,47 /* initialize the flex scanner */ yyscanner = scanner_init(str, &yyextra.core_yy_extra, ! 
&ScanKeywords, ScanKeywordTokens); /* base_yylex() only needs this much initialization */ yyextra.have_lookahead = false; diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l index fbeb86f..e1cae85 100644 *** a/src/backend/parser/scan.l --- b/src/backend/parser/scan.l *** bool escape_string_warning = true; *** 67,72 --- 67,87 bool standard_conforming_strings = true; /* + * Constant data exported from this file. This array maps from the + * zero-based keyword numbers returned by ScanKeywordLookup to the + * Bison token numbers needed by gram.y. This is exported because + * callers need to pass it to scanner_init, if they are using the + * standard keyword list ScanKeywords. + */ + #define PG_KEYWORD(kwname, value, category) value, + + const uint16 ScanKeywordTokens[] = { + #include "parser/kwlist.h" + }; + + #undef PG_KEYWORD + + /* * Set the type of YYSTYPE. */ #define YYSTYPE core_YYSTYPE *** other . *** 504,521 * We will pass this along as a normal character string, * but preceded with an internally-gene
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Fri, Jan 04, 2019 at 02:36:15PM -0800, Andres Freund wrote: > Hi, > > On 2019-01-04 16:43:39 -0500, Tom Lane wrote: > > Joerg Sonnenberger writes: > > >> * What's the generator written in? (if the answer's not "Perl", wedging > > >> it into our build processes might be painful) > > > > > Plain C, nothing really fancy in it. > > > > That's actually a bigger problem than you might think, because it > > doesn't fit in very nicely in a cross-compiling build: we might not > > have any C compiler at hand that generates programs that can execute > > on the build machine. That's why we prefer Perl for tools that need > > to execute during the build. However, if the code is pretty small > > and fast, maybe translating it to Perl is feasible. Or perhaps > > we could add sufficient autoconfiscation infrastructure to identify > > a native C compiler. It's not very likely that there isn't one, > > but it is possible that nothing we learned about the configured > > target compiler would apply to it :-( There is a pre-made autoconf macro for doing the basic glue for CC_FOR_BUILD, it's been used by various projects already including libXt and friends. > I think it might be ok if we included the output of the generator in the > buildtree? Not being able to add keywords while cross-compiling sounds like > an acceptable restriction to me. I assume we'd likely grow further users > of such a generator over time, and some of the input lists might be big > enough that we'd not want to force it to be recomputed on every machine. This is quite reasonable as well. I wouldn't worry about the size of the input list at all. Processing the Webster dictionary needs something less than 0.4s on my laptop for 235k entries. Joerg
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2019-01-04 16:43:39 -0500, Tom Lane wrote: > Joerg Sonnenberger writes: > >> * What's the generator written in? (if the answer's not "Perl", wedging > >> it into our build processes might be painful) > > > Plain C, nothing really fancy in it. > > That's actually a bigger problem than you might think, because it > doesn't fit in very nicely in a cross-compiling build: we might not > have any C compiler at hand that generates programs that can execute > on the build machine. That's why we prefer Perl for tools that need > to execute during the build. However, if the code is pretty small > and fast, maybe translating it to Perl is feasible. Or perhaps > we could add sufficient autoconfiscation infrastructure to identify > a native C compiler. It's not very likely that there isn't one, > but it is possible that nothing we learned about the configured > target compiler would apply to it :-( I think it might be ok if we included the output of the generator in the buildtree? Not being able to add keywords while cross-compiling sounds like an acceptable restriction to me. I assume we'd likely grow further users of such a generator over time, and some of the input lists might be big enough that we'd not want to force it to be recomputed on every machine. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Joerg Sonnenberger writes: > On Fri, Jan 04, 2019 at 03:31:11PM -0500, Tom Lane wrote: >> The sample hash function certainly looks great. I'm not terribly on board >> with using |0x20 as a substitute for lower-casing, but that's a minor >> detail. > Yeah, I've included that part more because I don't know the current use > cases enough. If all instances are already doing lower-casing in > advance, it is trivial to drop. I think we probably don't need that, because we'd always need to generate a lower-cased version of the input anyway: either to compare to the potential keyword match, or to use as the normalized identifier if it turns out not to be a keyword. I don't think there are any cases where it's useful to delay downcasing till after the keyword lookup. >> * What's the generator written in? (if the answer's not "Perl", wedging >> it into our build processes might be painful) > Plain C, nothing really fancy in it. That's actually a bigger problem than you might think, because it doesn't fit in very nicely in a cross-compiling build: we might not have any C compiler at hand that generates programs that can execute on the build machine. That's why we prefer Perl for tools that need to execute during the build. However, if the code is pretty small and fast, maybe translating it to Perl is feasible. Or perhaps we could add sufficient autoconfiscation infrastructure to identify a native C compiler. It's not very likely that there isn't one, but it is possible that nothing we learned about the configured target compiler would apply to it :-( >> * What license is it under? > Two clause BSD license. OK, that works, at least. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Fri, Jan 04, 2019 at 03:31:11PM -0500, Tom Lane wrote:
> John Naylor writes:
> > On 1/3/19, Joerg Sonnenberger wrote:
> >> I was pointed at your patch on IRC and decided to look into adding my
> >> own pieces. What I can provide you is a fast perfect hash function
> >> generator. I've attached a sample hash function based on the current
> >> main keyword list. hash() essentially gives you the number of the only
> >> possible match, a final strcmp/memcmp is still necessary to verify that
> >> it is an actual keyword though. The |0x20 can be dropped if all cases
> >> have pre-lower-cased the input already. This would replace the binary
> >> search in the lookup functions. Returning offsets directly would be easy
> >> as well. That allows writing a single string where each entry is prefixed
> >> with a type mask, the token id, the length of the keyword and the actual
> >> keyword text. Does that sound useful to you?
>
> > Judging by previous responses, there is still interest in using
> > perfect hash functions, so thanks for this. I'm not knowledgeable
> > enough to judge its implementation, so I'll leave that for others.
>
> We haven't actually seen the implementation, so it's hard to judge ;-).

It's a temporary hacked version of nbperf in the NetBSD tree.

> The sample hash function certainly looks great. I'm not terribly on board
> with using |0x20 as a substitute for lower-casing, but that's a minor
> detail.

Yeah, I've included that part more because I don't know the current use
cases enough. If all instances are already doing lower-casing in
advance, it is trivial to drop.

> The foremost questions in my mind are:
>
> * What's the generator written in? (if the answer's not "Perl", wedging
> it into our build processes might be painful)

Plain C, nothing really fancy in it.

> * What license is it under?

Two clause BSD license.

> * Does it always succeed in producing a single-level lookup table?

This question is a bit tricky. The short answer is: yes.
The longer answer: The chosen hash function in the example is very
simple (e.g. just two variations of a DJB-style hash), so with that:
no, not without potentially fiddling a bit with the hash function if
things ever get nasty, like having two keywords that hit a funnel for
both variants. The main concern for the choice was to be fast.

When using two families of independent hash functions, the generator
requires probabilistic linear time in the number of keys. That means
with a strong enough hash function, like the Jenkins hash used in PG
elsewhere, it will succeed very fast. So if it fails on new keywords,
making the mixing a bit stronger should be enough.

Joerg
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/27/18, Tom Lane wrote: > If you really are hot about saving that other 440 bytes, the way to > do it would be to drop the struct entirely and use two parallel > arrays, an int16[] for value and a char[] (or better uint8[]) for > category. Those would be filled by reading kwlist.h twice with > different definitions for PG_KEYWORD. Not sure it's worth the > trouble though --- in particular, not clear that it's a win from > the standpoint of number of cache lines touched. Understood. That said, after re-implementing all keyword lookups, I wondered if there'd be a notational benefit to dropping the struct, especially since as yet no caller uses both token and category. It makes pl_scanner.c and its reserved keyword list a bit nicer, and gets rid of the need to force frontend to have 'zero' token numbers, but I'm not sure it's a clear win. I've attached a patch (applies on top of v6), gzipped to avoid confusing the cfbot. -John Naylor keyword-nostruct.patch.gz Description: GNU Zip compressed data
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > On 2019-01-04 12:26:18 -0500, Tom Lane wrote: >> I'm wondering where we want to go with this. Is anyone excited >> about pursuing the perfect-hash-function idea? (Joerg's example >> function looked pretty neat to me.) If we are going to do that, >> does it make sense to push this version beforehand? > I think it does make sense to push this version beforehand. Most of > the code would be needed anyway, so it's not like this is going to > cause a lot of churn. Yeah, I'm leaning in that direction too, first on the grounds of "don't let the perfect be the enemy of the good", and second because if we do end up with perfect hashing, we'd still need a table-generation step. The build infrastructure this adds would support a generator that produces perfect hashes just as well as what this is doing, even if we end up having to whack the API of ScanKeywordLookup around some more. So barring objections, I'll have a look at pushing this, and then we can think about using perfect hashing instead. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On 1/3/19, Joerg Sonnenberger wrote:
>> I was pointed at your patch on IRC and decided to look into adding my
>> own pieces. What I can provide you is a fast perfect hash function
>> generator. I've attached a sample hash function based on the current
>> main keyword list. hash() essentially gives you the number of the only
>> possible match, a final strcmp/memcmp is still necessary to verify that
>> it is an actual keyword though. The |0x20 can be dropped if all cases
>> have pre-lower-cased the input already. This would replace the binary
>> search in the lookup functions. Returning offsets directly would be easy
>> as well. That allows writing a single string where each entry is prefixed
>> with a type mask, the token id, the length of the keyword and the actual
>> keyword text. Does that sound useful to you?

> Judging by previous responses, there is still interest in using
> perfect hash functions, so thanks for this. I'm not knowledgeable
> enough to judge its implementation, so I'll leave that for others.

We haven't actually seen the implementation, so it's hard to judge ;-).

The sample hash function certainly looks great. I'm not terribly on board
with using |0x20 as a substitute for lower-casing, but that's a minor
detail.

The foremost questions in my mind are:

* What's the generator written in? (if the answer's not "Perl", wedging
it into our build processes might be painful)

* What license is it under?

* Does it always succeed in producing a single-level lookup table?

regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 2019-01-04 12:26:18 -0500, Tom Lane wrote:
> I wrote:
> > Andres Freund writes:
> >> On 2018-12-29 16:59:52 -0500, John Naylor wrote:
> >>> I think 0001 with complete keyword lookup replacement is in decent
> >>> enough shape to post. Make check-world passes. A few notes and
> >>> caveats:
>
> >> I tried to take this for a spin, and for me the build fails because various
> >> frontend programs don't have KeywordOffsets/Strings defined, but reference it
> >> through various functions exposed to the frontend (like fmtId()). That I see
> >> that error but you don't is probably related to me using -fuse-ld=gold in
> >> CFLAGS.
>
> > I was just about to point out that the cfbot is seeing that too ...
>
> Aside from the possible linkage problem, this will need a minor rebase
> over 4879a5172, which rearranged some of plpgsql's calls of
> ScanKeywordLookup.
>
> While I don't think it's going to be hard to resolve these issues,
> I'm wondering where we want to go with this. Is anyone excited
> about pursuing the perfect-hash-function idea? (Joerg's example
> function looked pretty neat to me.) If we are going to do that,
> does it make sense to push this version beforehand?

I think it does make sense to push this version beforehand. Most of
the code would be needed anyway, so it's not like this is going to
cause a lot of churn.

Greetings,

Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 1/3/19, Joerg Sonnenberger wrote: > Hello John, > I was pointed at your patch on IRC and decided to look into adding my > own pieces. What I can provide you is a fast perfect hash function > generator. I've attached a sample hash function based on the current > main keyword list. hash() essentially gives you the number of the only > possible match, a final strcmp/memcmp is still necessary to verify that > it is an actual keyword though. The |0x20 can be dropped if all cases > have pre-lower-cased the input already. This would replace the binary > search in the lookup functions. Returning offsets directly would be easy > as well. That allows writing a single string where each entry is prefixed > with a type mask, the token id, the length of the keyword and the actual > keyword text. Does that sound useful to you? Judging by previous responses, there is still interest in using perfect hash functions, so thanks for this. I'm not knowledgeable enough to judge its implementation, so I'll leave that for others. -John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/30/18, Andres Freund wrote:
> I tried to take this for a spin, and for me the build fails because various
> frontend programs don't have KeywordOffsets/Strings defined, but reference it
> through various functions exposed to the frontend (like fmtId()). That I see
> that error but you don't is probably related to me using -fuse-ld=gold in
> CFLAGS.
>
> I can "fix" this by including kwlist_d.h in common/keywords.c
> regardless of FRONTEND. That also led me to discover that the build
> dependencies somewhere aren't correctly set up, because I need to
> force a clean rebuild to trigger the problem again, just changing
> keywords.c back doesn't trigger the problem.

Hmm, that was a typo, and I didn't notice even when I found I had to
include kwlist_d.h in ecpg/keywords.c. :-( I've fixed both of those in
the attached v6. As far as dependencies, I'm far from sure I have it up
to par. That piece could use some discussion.

On 1/4/19, Tom Lane wrote:
> Aside from the possible linkage problem, this will need a minor rebase
> over 4879a5172, which rearranged some of plpgsql's calls of
> ScanKeywordLookup.
>
> While I don't think it's going to be hard to resolve these issues,
> I'm wondering where we want to go with this. Is anyone excited
> about pursuing the perfect-hash-function idea? (Joerg's example
> function looked pretty neat to me.) If we are going to do that,
> does it make sense to push this version beforehand?

If it does, for v6 I've also done the rebase, updated the copyright
year, and fixed an error in MSVC.

-John Naylor

From fb846eeabd450b6932e57f7e8d4f0aebc125d178 Mon Sep 17 00:00:00 2001
From: John Naylor
Date: Fri, 4 Jan 2019 14:35:40 -0500
Subject: [PATCH v6] Use offset-based keyword lookup.

Replace binary search over an array of ScanKeyword structs with a
binary search over an array of offsets into a keyword string. Access
auxiliary data only after a keyword hit. This has better locality of
reference and a smaller memory footprint.
--- .../pg_stat_statements/pg_stat_statements.c | 2 + src/backend/parser/parser.c | 4 +- src/backend/parser/scan.l | 38 ++-- src/backend/utils/adt/misc.c | 2 +- src/backend/utils/adt/ruleutils.c | 9 +- src/common/.gitignore | 1 + src/common/Makefile | 25 ++- src/common/keywords.c | 42 +++-- src/fe_utils/string_utils.c | 9 +- src/include/common/keywords.h | 19 +- src/include/parser/kwlist.h | 2 +- src/include/parser/scanner.h | 8 +- src/interfaces/ecpg/preproc/.gitignore| 2 + src/interfaces/ecpg/preproc/Makefile | 20 ++- src/interfaces/ecpg/preproc/c_keywords.c | 65 ++- src/interfaces/ecpg/preproc/c_kwlist.h| 53 ++ src/interfaces/ecpg/preproc/ecpg_keywords.c | 81 ++--- src/interfaces/ecpg/preproc/ecpg_kwlist.h | 68 +++ src/interfaces/ecpg/preproc/keywords.c| 6 +- src/interfaces/ecpg/preproc/pgc.l | 22 +-- src/interfaces/ecpg/preproc/preproc_extern.h | 8 +- src/pl/plpgsql/src/.gitignore | 2 + src/pl/plpgsql/src/Makefile | 16 +- src/pl/plpgsql/src/pl_reserved_kwlist.h | 55 ++ src/pl/plpgsql/src/pl_scanner.c | 167 -- src/pl/plpgsql/src/pl_unreserved_kwlist.h | 111 src/tools/gen_keywords.pl | 136 ++ src/tools/msvc/Solution.pm| 44 + src/tools/msvc/clean.bat | 6 + 29 files changed, 696 insertions(+), 327 deletions(-) create mode 100644 src/common/.gitignore create mode 100644 src/interfaces/ecpg/preproc/c_kwlist.h create mode 100644 src/interfaces/ecpg/preproc/ecpg_kwlist.h create mode 100644 src/pl/plpgsql/src/pl_reserved_kwlist.h create mode 100644 src/pl/plpgsql/src/pl_unreserved_kwlist.h create mode 100644 src/tools/gen_keywords.pl diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index e8ef966bb5..4776de7595 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -3076,6 +3076,8 @@ fill_in_constant_lengths(pgssJumbleState *jstate, const char *query, yyscanner = scanner_init(query, &yyextra, ScanKeywords, + KeywordString, + KeywordOffsets, 
NumScanKeywords); /* we don't want to re-emit any escape string warnings */ diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index 7e9b1222fd..19fa7bb8ae 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -40,8 +40,8 @@ raw_parser(const char *str) int yyresult; /* initialize the flex scanner */ - yyscanner = scanner_init(str, &yyextra.core_yy_extra, - ScanKeywords, Nu
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I wrote:
> Andres Freund writes:
>> On 2018-12-29 16:59:52 -0500, John Naylor wrote:
>>> I think 0001 with complete keyword lookup replacement is in decent
>>> enough shape to post. Make check-world passes. A few notes and
>>> caveats:

>> I tried to take this for a spin, and for me the build fails because various
>> frontend programs don't have KeywordOffsets/Strings defined, but reference it
>> through various functions exposed to the frontend (like fmtId()). That I see
>> that error but you don't is probably related to me using -fuse-ld=gold in
>> CFLAGS.

> I was just about to point out that the cfbot is seeing that too ...

Aside from the possible linkage problem, this will need a minor rebase
over 4879a5172, which rearranged some of plpgsql's calls of
ScanKeywordLookup.

While I don't think it's going to be hard to resolve these issues,
I'm wondering where we want to go with this. Is anyone excited
about pursuing the perfect-hash-function idea? (Joerg's example
function looked pretty neat to me.) If we are going to do that,
does it make sense to push this version beforehand?

regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Sun, Dec 16, 2018 at 11:50:15AM -0500, John Naylor wrote: > A few months ago I was looking into faster search algorithms for > ScanKeywordLookup(), so this is interesting to me. While an optimal > full replacement would be a lot of work, the above ideas are much less > invasive and would still have some benefit. Unless anyone intends to > work on this, I'd like to flesh out the offset-into-giant-string > approach a bit further: Hello John, I was pointed at your patch on IRC and decided to look into adding my own pieces. What I can provide you is a fast perfect hash function generator. I've attached a sample hash function based on the current main keyword list. hash() essentially gives you the number of the only possible match, a final strcmp/memcmp is still necessary to verify that it is an actual keyword though. The |0x20 can be dropped if all cases have pre-lower-cased the input already. This would replace the binary search in the lookup functions. Returning offsets directly would be easy as well. That allows writing a single string where each entry is prefixed with a type mask, the token id, the length of the keyword and the actual keyword text. Does that sound useful to you? 
Joerg

/* Attached sample hash function; the archived copy is garbled (the
 * included header names were stripped and the mixing table is cut off
 * mid-array).  The recoverable skeleton: */

#include <stddef.h>
#include <stdint.h>

uint32_t
hash(const void * __restrict key, size_t keylen)
{
	static const uint16_t g[881] = {
		/* ... 881 mixing-table entries; the values and the remainder
		 * of the function are garbled and truncated in the archive ... */
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes:
> On 2018-12-29 16:59:52 -0500, John Naylor wrote:
>> I think 0001 with complete keyword lookup replacement is in decent
>> enough shape to post. Make check-world passes. A few notes and
>> caveats:

> I tried to take this for a spin, and for me the build fails because various
> frontend programs don't have KeywordOffsets/Strings defined, but reference it
> through various functions exposed to the frontend (like fmtId()). That I see
> that error but you don't is probably related to me using -fuse-ld=gold in
> CFLAGS.

I was just about to point out that the cfbot is seeing that too ...

regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi,

On 2018-12-29 16:59:52 -0500, John Naylor wrote:
> I think 0001 with complete keyword lookup replacement is in decent
> enough shape to post. Make check-world passes. A few notes and
> caveats:

I tried to take this for a spin, and for me the build fails because
various frontend programs don't have KeywordOffsets/Strings defined,
but reference it through various functions exposed to the frontend
(like fmtId()). That I see that error but you don't is probably related
to me using -fuse-ld=gold in CFLAGS.

I can "fix" this by including kwlist_d.h in common/keywords.c
regardless of FRONTEND. That also led me to discover that the build
dependencies somewhere aren't correctly set up, because I need to
force a clean rebuild to trigger the problem again, just changing
keywords.c back doesn't trigger the problem.

Greetings,

Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
I think 0001 with complete keyword lookup replacement is in decent enough shape to post. Make check-world passes. A few notes and caveats: -I added an --extern option to the script for the core keyword headers. This also capitalizes variables. -ECPG keyword lookup is a bit different in that the ecpg and sql lookup functions are wrapped in a single function rather than called separately within pgc.l. It might be worth untangling that, but I have not done so. -Some variable names haven't changed even though now they're only referring to token values, which might be confusing. -I haven't checked if I need to install the generated headers. -I haven't measured performance or binary size. If anyone is excited enough to do that, great, otherwise I'll do that as time permits. -There are probably makefile bugs. Now, on to previous review points: On 12/27/18, Tom Lane wrote: > +/* Payload data for keywords */ > +typedef struct ScanKeywordAux > +{ > + int16 value; /* grammar's token code */ > + charcategory; /* see codes above */ > +} ScanKeywordAux; > > There isn't really any point in changing category to "char", because > alignment considerations will mandate that sizeof(ScanKeywordAux) be > a multiple of 2 anyway. With some compilers we could get around that > with a pragma to force non-aligned storage, but doing so would be a > net loss on most non-Intel architectures. Reverted, especially since we can skip the struct entirely for some callers as you pointed out below. > diff --git a/src/pl/plpgsql/src/.gitignore b/src/pl/plpgsql/src/.gitignore > @@ -1,3 +1,4 @@ > +/*kwlist_d.h > > Not a fan of using wildcards in .gitignore files, at least not when > there's just one or two files you intend to match. Removed. 
> # Force these dependencies to be known even without dependency info built: > -pl_gram.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o pl_scanner.o: > plpgsql.h pl_gram.h plerrcodes.h > +pl_gram.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o pl_scanner.o: > plpgsql.h pl_gram.h plerrcodes.h pl_unreserved_kwlist_d.h > > Hm, do we really need any more than pl_scanner.o to depend on that header? I think you're right, so separated into a new rule. > +# Parse keyword header for names. > +my @keywords; > +while (<$kif>) > +{ > + if (/^PG_KEYWORD\("(\w+)",\s*\w+,\s*\w+\)/) > > This is assuming more than it should about the number of arguments for > PG_KEYWORD, as well as what's in them. I think it'd be sufficient to > match like this: > > if (/^PG_KEYWORD\("(\w+)",/) ...and... > diff --git a/src/pl/plpgsql/src/pl_unreserved_kwlist.h > b/src/pl/plpgsql/src/pl_unreserved_kwlist.h > +/* name, value, category */ > +PG_KEYWORD("absolute", K_ABSOLUTE, UNRESERVED_KEYWORD) > > Likewise, I'd just have these be two-argument macros. There's no reason > for the various kwlist.h headers to agree on the number of payload > arguments for PG_KEYWORD. Both done, however... > +/* FIXME: Have to redefine this symbol for the WIP. */ > +#undef PG_KEYWORD > +#define PG_KEYWORD(kwname, value, category) {value, category}, > + > +static const ScanKeywordAux unreserved_keywords[] = { > +#include "pl_unreserved_kwlist.h" > }; > > The category isn't useful for this keyword list, so couldn't you > just make this an array of uint16 values? Yes, this works for the unreserved keywords. The reserved ones still need the aux struct to work with the core scanner, even though scan.l doesn't reference category either. This has the consequence that we can't dispense with category, e.g.: PG_KEYWORD("all", K_ALL, RESERVED_KEYWORD) ...unless we do without the struct entirely, but that's not without disadvantages as you mentioned. 
I decided to export the struct (rather than just int16 for category) to the frontend, even though we have to set the token values to zero, since there might someday be another field of use to the frontend. Also to avoid confusion. > I don't mind allowing the prefix to default to empty. What I was > concerned about was that base_filename could end up undefined. > Probably the thing to do is to generate base_filename separately, > say by stripping any initial ".*/" sequence and then substitute > '_' for '.'. I removed assumptions about the filename. > +Options: > +-o output path > +-p optional prefix for generated data structures > > This usage message is pretty vague about how you write the options > (cf gripe above). I tried it like this: Usage: gen_keywords.pl [--output/-o ] [--prefix/-p ] input_file --output Output directory --prefix String prepended to var names in the output file On 12/27/18, Andrew Dunstan wrote: > I would rather we used the standard perl module Getopt::Long, as > numerous programs we have already do. Done. I'll also send a patch later to bring some other scripts in line. -John Naylor From 87e3e5e045823070a865933e969262d55e156137 Mon Sep 17 00:00:00 2001 From: Joh
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 2018-12-27 14:22:11 -0500, Tom Lane wrote:
> Andrew Dunstan writes:
> > A smaller project might be to see if we can replace the binary keyword
> > search in ScanKeyword with a perfect hashing function generated by
> > gperf, or something similar. I had a quick look at that, too.
>
> Yeah, we've looked at gperf before, eg
>
> https://www.postgresql.org/message-id/20170927183156.jqzcsy7ocjcbd...@alap3.anarazel.de
>
> Perhaps it'd be a win but I'm not very convinced.

Note that the tradeoffs mentioned there, by memory, aren't necessarily
applicable here. As we're dealing with strings anyway, gperf wanting to
deal with strings rather than being able to deal with numbers isn't
problematic.

> I don't know much about the theory of perfect hashing, but I wonder
> if we could just roll our own tool for that. Since we're not dealing
> with extremely large keyword sets, perhaps brute force search for a
> set of multipliers for a hash computation like
> (char[0] * some_prime + char[1] * some_other_prime ...) mod table_size
> would work.

The usual way to do perfect hashing is to basically have a two-stage
hashtable, with the first stage keyed by a "normal" hash function, and
the second one disambiguating the values that hash into the same
bucket, by additionally keying a hash function with the value in the
cell of the intermediate hash table. Determining the parameters of the
intermediate table is what takes time.

That most perfect hash functions look that way is also a good part of
the reason why I doubt it's worthwhile to go there over a simple linear
probing hashtable with a good hash function - computing two hash values
will usually be worse than linear probing for *small* and *not
modified* hashtables.

A simple (i.e. slow for large numbers of keys) implementation for
generating a perfect hash function isn't particularly hard. E.g.
look at the python implementation at http://iswsa.acm.org/mphf/index.html and http://stevehanov.ca/blog/index.php?id=119 for an easy explanation with graphics. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Thu, Dec 27, 2018 at 07:04:41PM -0300, Alvaro Herrera wrote:
> On 2018-Dec-27, Tom Lane wrote:
>
> > I poked around a little on my own machines, and I can confirm that
> > Getopt::Long is present in a default Perl install-from-source at
> > least as far back as perl 5.6.1. It's barely conceivable that some
> > packager might omit it from their minimal package, but Red Hat,
> > Apple, NetBSD, and OpenBSD all include it. So it sure looks to
> > me like relying on it should be non-problematic.
>
> In Debian it's included in package perl-modules-5.24, which packages
> perl and libperl5.24 depend on. I suppose it's possible to install
> perl-base and not install perl-modules, but it'd be a really bare-bones
> machine. I'm not sure it's possible to build Postgres in such a
> machine.

$ corelist -a Getopt::Long

Data for 2018-11-29
Getopt::Long was first released with perl 5
  5         undef
  5.001     undef
  5.002     2.01
  5.00307   2.04
  5.004     2.10
  5.00405   2.19
  5.005     2.17
  5.00503   2.19
  5.00504   2.20
  [much output elided]

Fortunately, this has been part of Perl core a lot further back than we
promise to support for builds, so I think we're clear to use it
everywhere we process options.

Best,
David.

--
David Fetter http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 2018-Dec-27, Tom Lane wrote: > I poked around a little on my own machines, and I can confirm that > Getopt::Long is present in a default Perl install-from-source at > least as far back as perl 5.6.1. It's barely conceivable that some > packager might omit it from their minimal package, but Red Hat, > Apple, NetBSD, and OpenBSD all include it. So it sure looks to > me like relying on it should be non-problematic. In Debian it's included in package perl-modules-5.24, which packages perl and libperl5.24 depend on. I suppose it's possible to install perl-base and not install perl-modules, but it'd be a really bare-bones machine. I'm not sure it's possible to build Postgres in such a machine. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andrew Dunstan writes: > On 12/27/18 3:34 PM, Tom Lane wrote: >> ... that's a pretty thin argument, and if Getopt::Long is present even >> in the most minimal Perl installations then it's certainly moot. > It's bundled separately, but on both systems I looked at it's needed by > the base perl package. I don't recall ever seeing a system where it's > not available. I'm reasonably careful about what packages the buildfarm > requires, and it's used Getopt::Long from day one. I poked around a little on my own machines, and I can confirm that Getopt::Long is present in a default Perl install-from-source at least as far back as perl 5.6.1. It's barely conceivable that some packager might omit it from their minimal package, but Red Hat, Apple, NetBSD, and OpenBSD all include it. So it sure looks to me like relying on it should be non-problematic. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/27/18 3:34 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 12/27/18 3:00 PM, John Naylor wrote: >>> This style was cargo-culted from the catalog scripts. I can settle on >>> just the first form if you like. >> I would rather we used the standard perl module Getopt::Long, as >> numerous programs we have already do. > Hmm ... grepping finds that used only in > > src/tools/pgindent/pgindent > src/tools/git_changelog > src/pl/plperl/text2macro.pl > > so I'm not quite sure about the "numerous" claim. Adopting that > here would possibly impose the requirement of having Getopt::Long > on some developers who are getting by without it today. However, > that's a pretty thin argument, and if Getopt::Long is present even > in the most minimal Perl installations then it's certainly moot. It's bundled separately, but on both systems I looked at it's needed by the base perl package. I don't recall ever seeing a system where it's not available. I'm reasonably careful about what packages the buildfarm requires, and it's used Getopt::Long from day one. > > On the whole I'm +1 for this. Perhaps also, as an independent patch, > we should change the catalog scripts to use Getopt::Long. > > Probably some others, too. cheers andrew -- Andrew Dunstan  https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andrew Dunstan writes: > On 12/27/18 3:00 PM, John Naylor wrote: >> This style was cargo-culted from the catalog scripts. I can settle on >> just the first form if you like. > I would rather we used the standard perl module Getopt::Long, as > numerous programs we have already do. Hmm ... grepping finds that used only in src/tools/pgindent/pgindent src/tools/git_changelog src/pl/plperl/text2macro.pl so I'm not quite sure about the "numerous" claim. Adopting that here would possibly impose the requirement of having Getopt::Long on some developers who are getting by without it today. However, that's a pretty thin argument, and if Getopt::Long is present even in the most minimal Perl installations then it's certainly moot. On the whole I'm +1 for this. Perhaps also, as an independent patch, we should change the catalog scripts to use Getopt::Long. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/27/18 3:00 PM, John Naylor wrote: > On 12/27/18, Tom Lane wrote: >> diff --git a/src/tools/gen_keywords.pl b/src/tools/gen_keywords.pl >> +elsif ($arg =~ /^-o/) >> +{ >> +$output_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV; >> +} >> >> My perl-fu is not great, but it looks like this will accept arguments >> like "-ofilename", which is a style I don't like at all. I'd rather >> either insist on the filename being separate or write the switch like >> "-o=filename". Also, project style when taking both forms is usually >> more like >> -o filename >> --offset=filename > This style was cargo-culted from the catalog scripts. I can settle on > just the first form if you like. > I would rather we used the standard perl module Getopt::Long, as numerous programs we have already do. cheers andrew -- Andrew Dunstan  https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > On 12/27/18, Tom Lane wrote: >> +$kw_input_file =~ /((\w*)kwlist)\.h/; >> +my $base_filename = $1; >> +$prefix = $2 if !defined $prefix; >> >> Hmm, what happens if the input filename does not end with "kwlist.h"? > If that's a maintainability hazard, I can force every invocation to > provide a prefix instead. I don't mind allowing the prefix to default to empty. What I was concerned about was that base_filename could end up undefined. Probably the thing to do is to generate base_filename separately, say by stripping any initial ".*/" sequence and then substitute '_' for '.'. >> I looked very briefly at v4-0002, and I'm not very convinced about >> the "middle" aspect of that optimization. It seems unmaintainable, >> plus you've not exhibited how the preferred keywords would get selected >> in the first place (wiring them into the Perl script is surely not >> acceptable). > What if the second argument of the macro held this info? Yeah, you'd have to do something like that. But I'm still concerned about the maintainability aspect: if we mark say "commit" as the starting point in the "c" group, future additions or deletions of keywords starting with "c" might render that an increasingly poor choice. But most likely nobody would ever notice that the marking was getting more and more suboptimal. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/27/18, Tom Lane wrote: > diff --git a/src/tools/gen_keywords.pl b/src/tools/gen_keywords.pl > + elsif ($arg =~ /^-o/) > + { > + $output_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV; > + } > > My perl-fu is not great, but it looks like this will accept arguments > like "-ofilename", which is a style I don't like at all. I'd rather > either insist on the filename being separate or write the switch like > "-o=filename". Also, project style when taking both forms is usually > more like > -o filename > --offset=filename This style was cargo-culted from the catalog scripts. I can settle on just the first form if you like. > +$kw_input_file =~ /((\w*)kwlist)\.h/; > +my $base_filename = $1; > +$prefix = $2 if !defined $prefix; > > Hmm, what happens if the input filename does not end with "kwlist.h"? If that's a maintainability hazard, I can force every invocation to provide a prefix instead. > I looked very briefly at v4-0002, and I'm not very convinced about > the "middle" aspect of that optimization. It seems unmaintainable, > plus you've not exhibited how the preferred keywords would get selected > in the first place (wiring them into the Perl script is surely not > acceptable). What if the second argument of the macro held this info? Something like: PG_KEYWORD("security", FULL_SEARCH, UNRESERVED_KEYWORD) PG_KEYWORD("select", OPTIMIZE, SELECT, RESERVED_KEYWORD) with a warning emitted if more than one keyword per range has OPTIMIZE. That would require all keyword lists to have that second argument, but selecting a preferred keyword would be optional. -John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andrew Dunstan writes: > RD parsers are not terribly hard to write. Sure, as long as they are for grammars that are (a) small, (b) static, and (c) LL(1), which is strictly weaker than the LALR(1) grammar class that bison can handle. We already have a whole lot of constructs that are at the edges of what bison can handle, which makes me dubious that an RD parser could be built at all without a lot of performance-eating lookahead and/or backtracking. > A smaller project might be to see if we can replace the binary keyword > search in ScanKeyword with a perfect hashing function generated by > gperf, or something similar. I had a quick look at that, too. Yeah, we've looked at gperf before, eg https://www.postgresql.org/message-id/20170927183156.jqzcsy7ocjcbd...@alap3.anarazel.de Perhaps it'd be a win but I'm not very convinced. I don't know much about the theory of perfect hashing, but I wonder if we could just roll our own tool for that. Since we're not dealing with extremely large keyword sets, perhaps brute force search for a set of multipliers for a hash computation like (char[0] * some_prime + char[1] * some_other_prime ...) mod table_size would work. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > Barring additional bikeshedding on 0001, I'll plan on implementing > offset-based lookup for the other keyword types and retire the old > ScanKeyword. Once that's done, we can benchmark and compare with the > optimizations in 0002. Sounds like a plan. Assorted minor bikeshedding on v4-0001 (just from eyeballing it, I didn't test it): +/* Like ScanKeywordLookup, but uses offsets into a keyword string. */ +int +ScanKeywordLookupOffset(const char *string_to_lookup, + const char *kw_strings, Not really "like" it, since the return value is totally different and so is the representation of the keyword list. I realize that your plan is probably to get rid of ScanKeywordLookup and then adapt the latter's comment for this code, but don't forget that you need to adjust said comment. +/* Payload data for keywords */ +typedef struct ScanKeywordAux +{ + int16 value; /* grammar's token code */ + char category; /* see codes above */ +} ScanKeywordAux; There isn't really any point in changing category to "char", because alignment considerations will mandate that sizeof(ScanKeywordAux) be a multiple of 2 anyway. With some compilers we could get around that with a pragma to force non-aligned storage, but doing so would be a net loss on most non-Intel architectures. If you really are hot about saving that other 440 bytes, the way to do it would be to drop the struct entirely and use two parallel arrays, an int16[] for value and a char[] (or better uint8[]) for category. Those would be filled by reading kwlist.h twice with different definitions for PG_KEYWORD. Not sure it's worth the trouble though --- in particular, not clear that it's a win from the standpoint of number of cache lines touched. diff --git a/src/pl/plpgsql/src/.gitignore b/src/pl/plpgsql/src/.gitignore @@ -1,3 +1,4 @@ +/*kwlist_d.h Not a fan of using wildcards in .gitignore files, at least not when there's just one or two files you intend to match. 
# Force these dependencies to be known even without dependency info built: -pl_gram.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o pl_scanner.o: plpgsql.h pl_gram.h plerrcodes.h +pl_gram.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o pl_scanner.o: plpgsql.h pl_gram.h plerrcodes.h pl_unreserved_kwlist_d.h Hm, do we really need any more than pl_scanner.o to depend on that header? +/* FIXME: Have to redefine this symbol for the WIP. */ +#undef PG_KEYWORD +#define PG_KEYWORD(kwname, value, category) {value, category}, + +static const ScanKeywordAux unreserved_keywords[] = { +#include "pl_unreserved_kwlist.h" }; The category isn't useful for this keyword list, so couldn't you just make this an array of uint16 values? diff --git a/src/pl/plpgsql/src/pl_unreserved_kwlist.h b/src/pl/plpgsql/src/pl_unreserved_kwlist.h +/* name, value, category */ +PG_KEYWORD("absolute", K_ABSOLUTE, UNRESERVED_KEYWORD) Likewise, I'd just have these be two-argument macros. There's no reason for the various kwlist.h headers to agree on the number of payload arguments for PG_KEYWORD. diff --git a/src/tools/gen_keywords.pl b/src/tools/gen_keywords.pl + elsif ($arg =~ /^-o/) + { + $output_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV; + } My perl-fu is not great, but it looks like this will accept arguments like "-ofilename", which is a style I don't like at all. I'd rather either insist on the filename being separate or write the switch like "-o=filename". Also, project style when taking both forms is usually more like -o filename --offset=filename +$kw_input_file =~ /((\w*)kwlist)\.h/; +my $base_filename = $1; +$prefix = $2 if !defined $prefix; Hmm, what happens if the input filename does not end with "kwlist.h"? +# Parse keyword header for names. +my @keywords; +while (<$kif>) +{ + if (/^PG_KEYWORD\("(\w+)",\s*\w+,\s*\w+\)/) This is assuming more than it should about the number of arguments for PG_KEYWORD, as well as what's in them. 
I think it'd be sufficient to match like this: if (/^PG_KEYWORD\("(\w+)",/) +Options: +-o output path +-p optional prefix for generated data structures This usage message is pretty vague about how you write the options (cf gripe above). I looked very briefly at v4-0002, and I'm not very convinced about the "middle" aspect of that optimization. It seems unmaintainable, plus you've not exhibited how the preferred keywords would get selected in the first place (wiring them into the Perl script is surely not acceptable). If you want to pursue that, please separate it into an 0002 that just adds the letter-range aspect and then an 0003 that adds the "middle" business on top. Then we can do testing to see whether either of those ideas are worthwhile. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/27/18 12:12 PM, Tom Lane wrote: >> I don't buy that we're inable to write a descent parser that way. > I do not think that we could write one for the current state of the > PG grammar without an investment of effort so large that it's not > going to happen. Even if such a parser were to spring fully armed > from somebody's forehead, we absolutely cannot expect that it would > continue to work correctly after non-wizard contributors modify it. I just did a quick survey of generator tools. Unfortunately, the best candidate alternative (ANTLR) no longer supports generating plain C code. I don't know of another tool that is well maintained, supports C, and generates top down parsers. Twenty-five years ago or so I wrote a top-down table-driven parser generator, but that was in another country, and besides, the wench is dead. There are well known techniques (See s 4.4 of the Dragon Book, if you have a copy) for formal analysis of grammars to determine predictive parser action. They aren't hard, and the tables they produce are typically much smaller than those used for LALR parsers. Still, probably not for the faint of heart. The tools that have moved to using hand cut RD parsers have done so precisely because they get a significant performance benefit from doing so. RD parsers are not terribly hard to write. Yes, the JSON grammar is tiny, but I think I wrote the basics of the RD parser we use for JSON in about an hour. I think arguing that our hacker base is not competent to maintain such a thing for the SQL grammar is wrong. We successfully maintain vastly more complex pieces of code. Having said all that, I don't intend to spend any time on implementing an alternative parser. It would as you say involve a heck of a lot of time, which I don't have. It would be a fine academic research project for some student. 
A smaller project might be to see if we can replace the binary keyword search in ScanKeyword with a perfect hashing function generated by gperf, or something similar. I had a quick look at that, too. Unfortunately the smallest hash table I could generate for our 440 symbols had 1815 entries, so I'm not sure how well that would work. Worth investigating, though. cheers andrew -- Andrew Dunstan  https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > On 2018-12-26 14:03:57 -0500, Tom Lane wrote: >> It's impossible to write correct RD parsers by hand for any but the most >> trivial, conflict-free languages, and what we have got to deal with >> is certainly neither of those; moreover, it's a constantly moving target. >> We'd be buying into an endless landscape of parser bugs if we go that way. >> It's *not* worth it. > It's not exactly new that people end up moving to bison to recursive > descent parsers once they hit the performance problems and want to give > better error messages. E.g. both gcc and clang have hand-written > recursive-descent parsers for C and C++ these days. Note that they are dealing with fixed language definitions. Furthermore, there's no need to worry about whether that code has to be hacked on by less-than-expert people. Neither condition applies to us. The thing that most concerns me about not using a grammar tool of some sort is that with handwritten RD, it's very easy to get into situations where you've "defined" (well, implemented, because you never did have a formal definition) a language that is ambiguous, admitting of more than one valid parse interpretation. You won't find out until someone files a bug report complaining that some apparently-valid statement isn't doing what they expect. At that point you are in a world of hurt, because it's too late to fix it without changing the language definition and thus creating user-visible compatibility breakage. Now bison isn't perfect in this regard, because you can shoot yourself in the foot with ill-considered precedence specifications (and we've done so ;-(), but it is light-years more likely to detect ambiguous grammar up-front than any handwritten parser logic is. If we had a tool that proved a BNF grammar non-ambiguous and then wrote an RD parser for it, that'd be fine with me --- but we need a tool, not somebody claiming he can write an error-free RD parser for an arbitrary language. 
My position is that anyone claiming that is just plain deluded. I also do not buy your unsupported-by-any-evidence claim that the error reports would be better. I've worked on RD parsers in the past, and they're not really better, at least not without expending enormous amounts of effort --- and run-time cycles --- specifically on the error reporting aspect. Again, I don't see that happening for us. > I don't buy that we're unable to write a descent parser that way. I do not think that we could write one for the current state of the PG grammar without an investment of effort so large that it's not going to happen. Even if such a parser were to spring fully armed from somebody's forehead, we absolutely cannot expect that it would continue to work correctly after non-wizard contributors modify it. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/26/18, John Naylor wrote: > On 12/26/18, Robert Haas wrote: >> I wonder if we could do something really simple like a lookup based on >> the first character of the scan keyword. It looks to me like there are >> 440 keywords right now, and the most common starting letter is 'c', >> which is the first letter of 51 keywords. So dispatching based on the >> first letter clips at least 3 steps off the binary search. I don't >> know whether that's enough to be worthwhile, but it's probably pretty >> simple to implement. > I agree it'd be fairly simple to do, and might raise the bar for doing > anything more complex. I went ahead and did this for v4, but split out into a separate patch. In addition, I used a heuristic to bypass binary search for the most common keywords. Normally, the middle value is computed mathematically, but I found that in each range of keywords beginning with the same letter, there is often 1 or 2 common keywords that are good first guesses, such as select, from, join, limit, where. I taught the lookup to try those first, and then compute subsequent steps the usual way. Barring additional bikeshedding on 0001, I'll plan on implementing offset-based lookup for the other keyword types and retire the old ScanKeyword. Once that's done, we can benchmark and compare with the optimizations in 0002. -John Naylor From ed78318df69bd3fa1543f3dc04d0a6ca9794a81c Mon Sep 17 00:00:00 2001 From: John Naylor Date: Wed, 26 Dec 2018 21:28:27 -0500 Subject: [PATCH v4 1/2] WIP Use offset-based keyword lookup. (For WIP, only for plpgsql unreserved keywords) Replace binary search over an array of ScanKeyword structs with a binary search over an array of offsets into a keyword string. Access auxiliary data only after a keyword hit. This has better locality of reference and a smaller memory footprint. 
--- src/common/keywords.c | 55 + src/include/common/keywords.h | 12 ++ src/pl/plpgsql/src/.gitignore | 1 + src/pl/plpgsql/src/Makefile | 13 +- src/pl/plpgsql/src/pl_scanner.c | 128 +--- src/pl/plpgsql/src/pl_unreserved_kwlist.h | 108 + src/tools/gen_keywords.pl | 137 ++ src/tools/msvc/Solution.pm| 10 ++ 8 files changed, 361 insertions(+), 103 deletions(-) create mode 100644 src/pl/plpgsql/src/pl_unreserved_kwlist.h create mode 100644 src/tools/gen_keywords.pl diff --git a/src/common/keywords.c b/src/common/keywords.c index 0c0c794c68..b0e5a721b6 100644 --- a/src/common/keywords.c +++ b/src/common/keywords.c @@ -112,3 +112,58 @@ ScanKeywordLookup(const char *text, return NULL; } + +/* Like ScanKeywordLookup, but uses offsets into a keyword string. */ +int +ScanKeywordLookupOffset(const char *string_to_lookup, + const char *kw_strings, + const uint16 *kw_offsets, + int num_keywords) +{ + int len, +i; + char word[NAMEDATALEN]; + const uint16 *low; + const uint16 *high; + + len = strlen(string_to_lookup); + /* We assume all keywords are shorter than NAMEDATALEN. */ + if (len >= NAMEDATALEN) + return -1; + + /* + * Apply an ASCII-only downcasing. We must not use tolower() since it may + * produce the wrong translation in some locales (eg, Turkish). + */ + for (i = 0; i < len; i++) + { + char ch = string_to_lookup[i]; + + if (ch >= 'A' && ch <= 'Z') + ch += 'a' - 'A'; + word[i] = ch; + } + word[len] = '\0'; + + /* + * Now do a binary search using plain strcmp() comparison. 
+ */ + low = kw_offsets; + high = kw_offsets + (num_keywords - 1); + while (low <= high) + { + const uint16 *middle; + int difference; + + middle = low + (high - low) / 2; + difference = strcmp(kw_strings + *middle, word); + if (difference == 0) + return middle - kw_offsets; + else if (difference < 0) + low = middle + 1; + else + high = middle - 1; + } + + return -1; +} diff --git a/src/include/common/keywords.h b/src/include/common/keywords.h index 0b31505b66..201d0fcc7a 100644 --- a/src/include/common/keywords.h +++ b/src/include/common/keywords.h @@ -28,6 +28,13 @@ typedef struct ScanKeyword int16 category; /* see codes above */ } ScanKeyword; +/* Payload data for keywords */ +typedef struct ScanKeywordAux +{ + int16 value; /* grammar's token code */ + char category; /* see codes above */ +} ScanKeywordAux; + #ifndef FRONTEND extern PGDLLIMPORT const ScanKeyword ScanKeywords[]; extern PGDLLIMPORT const int NumScanKeywords; @@ -41,4 +48,9 @@ extern const ScanKeyword *ScanKeywordLookup(const char *text, const ScanKeyword *keywords, int num_keywords); +int ScanKeywordLookupOffset(const char *string_to_lookup, + const char *kw_strings, + const uint16 *kw_offsets, + int num_keywords); + #endif /* KEYWORDS_H */ diff --git a/src/pl/plpgsql/src/.gitignore b/src/pl/plpgsql/src/.gitignore index ff6ac965fd..a649302fdb 100644 --- a/src/pl/plpgsql/src/.gitignore +++ b/s
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2018-12-26 14:03:57 -0500, Tom Lane wrote: > Andres Freund writes: > > My bet is, and has been for quite a while, that we'll have to go for a > > hand-written recursive descent type parser. > > I will state right up front that that will happen over my dead body. > > It's impossible to write correct RD parsers by hand for any but the most > trivial, conflict-free languages, and what we have got to deal with > is certainly neither of those; moreover, it's a constantly moving target. > We'd be buying into an endless landscape of parser bugs if we go that way. > It's *not* worth it. It's not exactly new that people end up moving from bison to recursive descent parsers once they hit the performance problems and want to give better error messages. E.g. both gcc and clang have hand-written recursive-descent parsers for C and C++ these days. I don't buy that we're unable to write a descent parser that way. What I *do* buy is that it's more problematic for the design of our SQL dialect, because the use of bison often uncovers ambiguities in new extensions of the language. And I don't really have a good idea how to handle that. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/26/18, Robert Haas wrote: > I wonder if we could do something really simple like a lookup based on > the first character of the scan keyword. It looks to me like there are > 440 keywords right now, and the most common starting letter is 'c', > which is the first letter of 51 keywords. So dispatching based on the > first letter clips at least 3 steps off the binary search. I don't > know whether that's enough to be worthwhile, but it's probably pretty > simple to implement. Using radix tree structures for the top couple of node levels is a known technique to optimize tries that need to be more space-efficient at lower levels, so this has precedent. In this case there would be a space trade off of (alphabet size, rounded up) * (size of index to lower boundary + size of index to upper boundary) = 32 * (2 + 2) = 128 bytes which is pretty small compared to what we'll save by offset-based lookup. On average, there'd be 4.1 binary search steps, which is nice. I agree it'd be fairly simple to do, and might raise the bar for doing anything more complex. -John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > On 2018-12-26 10:45:11 -0500, Robert Haas wrote: >> I'm not sure that I understand quite what you have in mind for a >> serialized non-perfect hashtable. Are you thinking that we'd just >> construct a simplehash and serialize it? > I was basically thinking that we'd have the perl script implement a > simple hash and put the keyword (pointers) into an array, handling > conflicts with the simplest linear probing thinkable. As there's never a > need for modifications, that ought to be fairly simple. I think it was Knuth who said that when you use hashing, you are putting a great deal of faith in the average case, because the worst case is terrible. The applicability of that to this problem is that if you hit a bad case (say, a long collision chain affecting some common keywords) you could end up with poor performance that affects a lot of people for a long time. And our keyword list is not so static that you could prove once that the behavior is OK and then forget about it. So I'm suspicious of proposals to use simplistic hashing here. There might well be some value in Robert's idea of keying off the first letter to get rid of the first few binary-search steps, not least because those steps are particularly terrible from a cache-footprint perspective. I'm not sold on doing anything significantly more invasive than that. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > My bet is, and has been for quite a while, that we'll have to go for a > hand-written recursive descent type parser. I will state right up front that that will happen over my dead body. It's impossible to write correct RD parsers by hand for any but the most trivial, conflict-free languages, and what we have got to deal with is certainly neither of those; moreover, it's a constantly moving target. We'd be buying into an endless landscape of parser bugs if we go that way. It's *not* worth it. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2018-12-26 10:45:11 -0500, Robert Haas wrote: > I'm not sure that I understand quite what you have in mind for a > serialized non-perfect hashtable. Are you thinking that we'd just > construct a simplehash and serialize it? I was basically thinking that we'd have the perl script implement a simple hash and put the keyword (pointers) into an array, handling conflicts with the simplest linear probing thinkable. As there's never a need for modifications, that ought to be fairly simple. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2018-12-26 11:50:18 -0500, Robert Haas wrote: > On Wed, Dec 26, 2018 at 11:22 AM Tom Lane wrote: > > I think there's a lot of goalpost-moving going on here. The original > > idea was to trim the physical size of the data structure, as stated > > in the thread subject, and just reap whatever cache benefits we got > > along the way from that. I am dubious that we actually have any > > performance problem in this code that needs a big dollop of added > > complexity to fix. > > I have seen ScanKeywordLookup show up in profiles quite often and > fairly prominently -- like several percent of total runtime. I'm not > trying to impose requirements on John's patch, and I agree that > reducing the physical size of the structure is a good step whether > anything else is done or not. However, I don't see that as a reason to > shut down further discussion of other possible improvements. If his > patch makes this disappear from profiles, cool, but if it doesn't, > then sooner or later somebody's going to want to do more. I agree. And most of the patch would be a pre-requisite for anything more elaborate anyway. > FWIW, my bet is this helps but isn't enough to get rid of the problem > completely. A 9-step binary search has got to be slower than a really > well-optimized hash table lookup. Yea, at least with a non-optimized layout. If we'd used a binary search optimized lookup order it might be different, but probably at best equivalent to a good hashtable. > > In my hands, the only part of the low-level parsing code that commonly > > shows up as interesting in profiles is the Bison engine. That's probably > > because the grammar tables are circa half a megabyte and blow out cache > > pretty badly :-(. I don't know of any way to make that better, > > unfortunately. I suspect that it's just going to get worse, because > > people keep submitting additions to the grammar. 
> > I'm kinda surprised that you haven't seen ScanKeywordLookup() in > there, but I agree with you that the size of the main parser tables is > a real issue, and that there's no easy solution. At various times > there has been discussion of using some other parser generator, and > I've also toyed with the idea of writing one specifically for > PostgreSQL. Unfortunately, it seems like bison is all but > unmaintained; the alternatives are immature and have limited adoption > and limited community; and writing something from scratch is a ton of > work. :-( My bet is, and has been for quite a while, that we'll have to go for a hand-written recursive descent type parser. They can be *substantially* faster, and performance isn't as affected by the grammar size. And, about as important, they also allow for a lot more heuristics around grammar errors - I do think we'll soon have to do better than to throw a generic syntax error for the cases where the grammar doesn't match at all. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Robert Haas writes: > I'm kinda surprised that you haven't seen ScanKeywordLookup() in > there, but I agree with you that the size of the main parser tables is > a real issue, and that there's no easy solution. At various times > there has been discussion of using some other parser generator, and > I've also toyed with the idea of writing one specifically for > PostgreSQL. Unfortunately, it seems like bison is all but > unmaintained; the alternatives are immature and have limited adoption > and limited community; and writing something from scratch is a ton of > work. :-( Yeah, and also: SQL is a damn big and messy language, and so it's not very clear that it's really bison's fault that it's slow to parse. We might do a ton of work to implement an alternative, and then find ourselves no better off. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
David Fetter writes: > On Wed, Dec 26, 2018 at 11:22:39AM -0500, Tom Lane wrote: >> In my hands, the only part of the low-level parsing code that >> commonly shows up as interesting in profiles is the Bison engine. > Should we be considering others? We've looked around before, IIRC, and not really seen any arguably better tools. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Dec 26, 2018 at 11:22 AM Tom Lane wrote: > I think there's a lot of goalpost-moving going on here. The original > idea was to trim the physical size of the data structure, as stated > in the thread subject, and just reap whatever cache benefits we got > along the way from that. I am dubious that we actually have any > performance problem in this code that needs a big dollop of added > complexity to fix. I have seen ScanKeywordLookup show up in profiles quite often and fairly prominently -- like several percent of total runtime. I'm not trying to impose requirements on John's patch, and I agree that reducing the physical size of the structure is a good step whether anything else is done or not. However, I don't see that as a reason to shut down further discussion of other possible improvements. If his patch makes this disappear from profiles, cool, but if it doesn't, then sooner or later somebody's going to want to do more. FWIW, my bet is this helps but isn't enough to get rid of the problem completely. A 9-step binary search has got to be slower than a really well-optimized hash table lookup. In a perfect world the latter touches the cache line containing the keyword -- which presumably is already in cache since we just scanned it -- then computes a hash value without touching any other cache lines -- and then goes straight to the right entry. So it touches ONE new cache line. That might be a level of optimization that's hard to achieve in practice, but I don't think it's crazy to want to get there. > In my hands, the only part of the low-level parsing code that commonly > shows up as interesting in profiles is the Bison engine. That's probably > because the grammar tables are circa half a megabyte and blow out cache > pretty badly :-(. I don't know of any way to make that better, > unfortunately. I suspect that it's just going to get worse, because > people keep submitting additions to the grammar. 
I'm kinda surprised that you haven't seen ScanKeywordLookup() in there, but I agree with you that the size of the main parser tables is a real issue, and that there's no easy solution. At various times there has been discussion of using some other parser generator, and I've also toyed with the idea of writing one specifically for PostgreSQL. Unfortunately, it seems like bison is all but unmaintained; the alternatives are immature and have limited adoption and limited community; and writing something from scratch is a ton of work. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Dec 26, 2018 at 11:22:39AM -0500, Tom Lane wrote: > > In my hands, the only part of the low-level parsing code that > commonly shows up as interesting in profiles is the Bison engine. Should we be considering others? As I understand it, steps have been made in this field since yacc was originally designed. Is LALR actually suitable for languages like SQL, or is it just there for historical reasons? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Robert Haas writes: > On Wed, Dec 19, 2018 at 8:01 PM Andres Freund wrote: >> The last time I looked into perfect hash functions, it wasn't easy to >> find a generator that competed with a decent normal hashtable (in >> particular gperf's are very unconvincing). The added tooling is a >> concern imo. OTOH, we're comparing not with a hashtable, but a binary >> search, where the latter will usually lose. Wonder if we shouldn't >> generate a serialized non-perfect hashtable instead. The lookup code for >> a read-only hashtable without concern for adversarial input is pretty >> trivial. > I wonder if we could do something really simple like a lookup based on > the first character of the scan keyword. It looks to me like there are > 440 keywords right now, and the most common starting letter is 'c', > which is the first letter of 51 keywords. So dispatching based on the > first letter clips at least 3 steps off the binary search. I don't > know whether that's enough to be worthwhile, but it's probably pretty > simple to implement. I think there's a lot of goalpost-moving going on here. The original idea was to trim the physical size of the data structure, as stated in the thread subject, and just reap whatever cache benefits we got along the way from that. I am dubious that we actually have any performance problem in this code that needs a big dollop of added complexity to fix. In my hands, the only part of the low-level parsing code that commonly shows up as interesting in profiles is the Bison engine. That's probably because the grammar tables are circa half a megabyte and blow out cache pretty badly :-(. I don't know of any way to make that better, unfortunately. I suspect that it's just going to get worse, because people keep submitting additions to the grammar. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On Wed, Dec 19, 2018 at 8:01 PM Andres Freund wrote: > The last time I looked into perfect hash functions, it wasn't easy to > find a generator that competed with a decent normal hashtable (in > particular gperf's are very unconvincing). The added tooling is a > concern imo. OTOH, we're comparing not with a hashtable, but a binary > search, where the latter will usually lose. Wonder if we shouldn't > generate a serialized non-perfect hashtable instead. The lookup code for > a read-only hashtable without concern for adversarial input is pretty > trivial. I wonder if we could do something really simple like a lookup based on the first character of the scan keyword. It looks to me like there are 440 keywords right now, and the most common starting letter is 'c', which is the first letter of 51 keywords. So dispatching based on the first letter clips at least 3 steps off the binary search. I don't know whether that's enough to be worthwhile, but it's probably pretty simple to implement. I'm not sure that I understand quite what you have in mind for a serialized non-perfect hashtable. Are you thinking that we'd just construct a simplehash and serialize it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/22/18, Tom Lane wrote: > John Naylor writes: >> Using a single file also gave me another idea: Take value and category >> out of ScanKeyword, and replace them with an index into another array >> containing those, which will only be accessed in the event of a hit. >> That would shrink ScanKeyword to 4 bytes (offset, index), further >> increasing locality of reference. Might not be worth it, but I can try >> it after moving on to the core scanner. > > I like that idea a *lot*, actually, because it offers the opportunity > to decouple this mechanism from all assumptions about what the > auxiliary data for a keyword is. Okay, in that case I went ahead and did it for WIP v3. > (it'd be a good idea to have a switch that allows specifying the > prefix of these constant names). Done as an optional switch, and tested, but not yet used in favor of the previous method as a fallback. I'll probably do it in the final version to keep lines below 80, and to add 'core_' to the core keyword vars.

> /* Payload data for keywords */
> typedef struct MyKeyword
> {
>     int16    value;
>     int16    category;
> } MyKeyword;

I tweaked this a bit to

typedef struct ScanKeywordAux
{
	int16		value;			/* grammar's token code */
	char		category;		/* see codes above */
} ScanKeywordAux;

It seems that category was only 2 bytes to make ScanKeyword a power of 2 (of course that was on 32 bit machines and doesn't hold true anymore). Using char will save another few hundred bytes in the core scanner. Since we're only accessing this once per identifier, we may not need to worry so much about memory alignment. > Aside from being arguably better from the locality-of-reference > standpoint, this gets us out of the weird ifdef'ing you've got in > the v2 patch. The kwlist_d.h headers can be very ordinary headers. Yeah, that's a nice (and for me unexpected) bonus.
-John Naylor

 src/common/keywords.c                     |  55
 src/include/common/keywords.h             |  12 +++
 src/pl/plpgsql/src/.gitignore             |   1 +
 src/pl/plpgsql/src/Makefile               |  13 +--
 src/pl/plpgsql/src/pl_scanner.c           | 127 +++
 src/pl/plpgsql/src/pl_unreserved_kwlist.h | 107 +++
 src/tools/gen_keywords.pl                 | 139 ++
 src/tools/msvc/Solution.pm                |  10 +++
 8 files changed, 361 insertions(+), 103 deletions(-)

diff --git a/src/common/keywords.c b/src/common/keywords.c
index 0c0c794c68..b0e5a721b6 100644
--- a/src/common/keywords.c
+++ b/src/common/keywords.c
@@ -112,3 +112,58 @@ ScanKeywordLookup(const char *text,
 	return NULL;
 }
+
+/* Like ScanKeywordLookup, but uses offsets into a keyword string. */
+int
+ScanKeywordLookupOffset(const char *string_to_lookup,
+						const char *kw_strings,
+						const uint16 *kw_offsets,
+						int num_keywords)
+{
+	int			len,
+				i;
+	char		word[NAMEDATALEN];
+	const uint16 *low;
+	const uint16 *high;
+
+	len = strlen(string_to_lookup);
+	/* We assume all keywords are shorter than NAMEDATALEN. */
+	if (len >= NAMEDATALEN)
+		return -1;
+
+	/*
+	 * Apply an ASCII-only downcasing. We must not use tolower() since it may
+	 * produce the wrong translation in some locales (eg, Turkish).
+	 */
+	for (i = 0; i < len; i++)
+	{
+		char		ch = string_to_lookup[i];
+
+		if (ch >= 'A' && ch <= 'Z')
+			ch += 'a' - 'A';
+		word[i] = ch;
+	}
+	word[len] = '\0';
+
+	/*
+	 * Now do a binary search using plain strcmp() comparison.
+	 */
+	low = kw_offsets;
+	high = kw_offsets + (num_keywords - 1);
+	while (low <= high)
+	{
+		const uint16 *middle;
+		int			difference;
+
+		middle = low + (high - low) / 2;
+		difference = strcmp(kw_strings + *middle, word);
+		if (difference == 0)
+			return middle - kw_offsets;
+		else if (difference < 0)
+			low = middle + 1;
+		else
+			high = middle - 1;
+	}
+
+	return -1;
+}
diff --git a/src/include/common/keywords.h b/src/include/common/keywords.h
index 0b31505b66..201d0fcc7a 100644
--- a/src/include/common/keywords.h
+++ b/src/include/common/keywords.h
@@ -28,6 +28,13 @@ typedef struct ScanKeyword
 	int16		category;		/* see codes above */
 } ScanKeyword;
+/* Payload data for keywords */
+typedef struct ScanKeywordAux
+{
+	int16		value;			/* grammar's token code */
+	char		category;		/* see codes above */
+} ScanKeywordAux;
+
 #ifndef FRONTEND
 extern PGDLLIMPORT const ScanKeyword ScanKeywords[];
 extern PGDLLIMPORT const int NumScanKeywords;
@@ -41,4 +48,9 @@ extern const ScanKeyword *ScanKeywordLookup(const char *text,
 			const ScanKeyword *keywords,
 			int num_keywords);
+int ScanKeywordLookupOffset(const char *string_to_lookup,
+						const char *kw_strings,
+						const uint16 *kw_offsets,
+						int num_keywords);
+
 #endif							/* KEYWORDS_H */
diff --git a/src/pl/plpgsql/src/.gitignore b/src/pl/plpgsql/src/.gitignore
index ff6ac965fd..a649302fdb 100644
--- a/src/pl/plpgsql/src/.
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andres Freund writes: > On 2018-12-22 12:20:00 -0500, Tom Lane wrote: >> I like that idea a *lot*, actually, because it offers the opportunity >> to decouple this mechanism from all assumptions about what the >> auxiliary data for a keyword is. > OTOH, it doubles or triples the number of cachelines accessed when > encountering a keyword. Compared to what? The current situation in that regard is a mess. Also, AFAICS this proposal involves the least amount of data touched during the lookup phase of anything we've discussed, so I do not even accept that your criticism is correct. One extra cacheline fetch to get the aux data for a particular keyword after the search is not going to tip the scales away from this being a win. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2018-12-22 12:20:00 -0500, Tom Lane wrote: > John Naylor writes: > > Using a single file also gave me another idea: Take value and category > > out of ScanKeyword, and replace them with an index into another array > > containing those, which will only be accessed in the event of a hit. > > That would shrink ScanKeyword to 4 bytes (offset, index), further > > increasing locality of reference. Might not be worth it, but I can try > > it after moving on to the core scanner. > > I like that idea a *lot*, actually, because it offers the opportunity > to decouple this mechanism from all assumptions about what the > auxiliary data for a keyword is. OTOH, it doubles or triples the number of cachelines accessed when encountering a keyword. The fraction of keywords to not-keywords in SQL makes me wonder whether that makes it a good deal. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > Using a single file also gave me another idea: Take value and category > out of ScanKeyword, and replace them with an index into another array > containing those, which will only be accessed in the event of a hit. > That would shrink ScanKeyword to 4 bytes (offset, index), further > increasing locality of reference. Might not be worth it, but I can try > it after moving on to the core scanner. I like that idea a *lot*, actually, because it offers the opportunity to decouple this mechanism from all assumptions about what the auxiliary data for a keyword is. Basically, we'd redefine ScanKeywordLookup as having the API "given a string, return a keyword index if it is a keyword, -1 if it isn't"; then the caller would use the keyword index to look up the auxiliary data in a table that it owns, and ScanKeywordLookup doesn't know about at all. So that leads to a design like this: the master data is in a header that's just like kwlist.h is today, except now we are thinking of PG_KEYWORD as an N-argument macro not necessarily exactly 3 arguments. The Perl script reads that, paying attention only to the first argument of the macro calls, and outputs a file containing, say,

static const uint16 kw_offsets[] = { 0, 6, 15, ... };

static const char kw_strings[] =
	"abort\0"
	"absolute\0"
	... ;

(it'd be a good idea to have a switch that allows specifying the prefix of these constant names). 
Then ScanKeywordLookup has the signature

int ScanKeywordLookup(const char *string_to_lookup,
					  const char *kw_strings,
					  const uint16 *kw_offsets,
					  int num_keywords);

and a file using this stuff looks something like

/* Payload data for keywords */
typedef struct MyKeyword
{
	int16		value;
	int16		category;
} MyKeyword;

#define PG_KEYWORD(kwname, value, category) {value, category},

static const MyKeyword MyKeywords[] = {
#include "kwlist.h"
};

/* String lookup table for keywords */
#include "kwlist_d.h"

/* Lookup code looks about like this: */

kwnum = ScanKeywordLookup(str, kw_strings, kw_offsets,
						  lengthof(kw_offsets));
if (kwnum >= 0)
	... look into MyKeywords[kwnum] for info ...

Aside from being arguably better from the locality-of-reference standpoint, this gets us out of the weird ifdef'ing you've got in the v2 patch. The kwlist_d.h headers can be very ordinary headers. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/20/18, Tom Lane wrote: > I'd be inclined to put the script in src/tools, I think. IMO src/common > is for code that actually gets built into our executables. Done. >> which takes >> pl_unreserved_kwlist.h as input and outputs >> pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h. > > I wonder whether we'd not be better off producing just one output > file, in which we have the offsets emitted as PG_KEYWORD macros > and then the giant string emitted as a macro definition, ie > something like > > #define PG_KEYWORD_STRING \ > "absolute\0" \ > "alias\0" \ > ... > > That simplifies the Makefile-hacking, at least, and it possibly gives > callers more flexibility about what they actually want to do with the > string. Okay, I tried that. Since the script is turning one header into another, I borrowed the "*_d.h" nomenclature from the catalogs. Using a single file required some #ifdef hacks in the output file. Maybe there's a cleaner way to do this, but I don't know what it is. Using a single file also gave me another idea: Take value and category out of ScanKeyword, and replace them with an index into another array containing those, which will only be accessed in the event of a hit. That would shrink ScanKeyword to 4 bytes (offset, index), further increasing locality of reference. Might not be worth it, but I can try it after moving on to the core scanner. > I'm for "not change things unnecessarily". People might well be > scraping the keyword list out of parser/kwlist.h for other purposes > right now --- indeed, it's defined the way it is exactly to let > people do that. I don't see a good reason to force them to redo > whatever tooling they have that depends on that. So let's build > kwlist_offsets.h alongside that, but not change kwlist.h itself. Done. >> I used the global .gitignore, but maybe that's an abuse of it. > > Yeah, I'd say it is. Moved. >> +# TODO: Error out if the keyword names are not in ASCII order. 
> > +many for including such a check. Done. > Also note that we don't require people to have Perl installed when > building from a tarball. Therefore, these derived headers must get > built during "make distprep" and removed by maintainer-clean but > not distclean. I think this also has some implications for VPATH > builds, but as long as you follow the pattern used for other > derived header files (e.g. fmgroids.h), you should be fine. Done. I also blindly added support for MSVC.

-John Naylor

 src/common/keywords.c                     |  55
 src/include/common/keywords.h             |  14 +++
 src/pl/plpgsql/src/.gitignore             |   1 +
 src/pl/plpgsql/src/Makefile               |   9 +-
 src/pl/plpgsql/src/pl_scanner.c           | 119 ++
 src/pl/plpgsql/src/pl_unreserved_kwlist.h | 107 +++
 src/tools/gen_keywords.pl                 | 136 ++
 src/tools/msvc/Solution.pm                |  10 +++
 8 files changed, 356 insertions(+), 95 deletions(-)

diff --git a/src/common/keywords.c b/src/common/keywords.c
index 0c0c794c68..0fb14a0372 100644
--- a/src/common/keywords.c
+++ b/src/common/keywords.c
@@ -112,3 +112,58 @@ ScanKeywordLookup(const char *text,
 	return NULL;
 }
+
+/* Like ScanKeywordLookup, but uses offsets into a keyword string. */
+const ScanKeywordOffset *
+ScanKeywordLookupOffset(const char *text,
+						const ScanKeywordOffset *keywords,
+						int num_keywords,
+						const char *keyword_string)
+{
+	int			len,
+				i;
+	char		word[NAMEDATALEN];
+	const ScanKeywordOffset *low;
+	const ScanKeywordOffset *high;
+
+	len = strlen(text);
+	/* We assume all keywords are shorter than NAMEDATALEN. */
+	if (len >= NAMEDATALEN)
+		return NULL;
+
+	/*
+	 * Apply an ASCII-only downcasing. We must not use tolower() since it may
+	 * produce the wrong translation in some locales (eg, Turkish).
+	 */
+	for (i = 0; i < len; i++)
+	{
+		char		ch = text[i];
+
+		if (ch >= 'A' && ch <= 'Z')
+			ch += 'a' - 'A';
+		word[i] = ch;
+	}
+	word[len] = '\0';
+
+	/*
+	 * Now do a binary search using plain strcmp() comparison.
+	 */
+	low = keywords;
+	high = keywords + (num_keywords - 1);
+	while (low <= high)
+	{
+		const ScanKeywordOffset *middle;
+		int			difference;
+
+		middle = low + (high - low) / 2;
+		difference = strcmp(keyword_string + middle->offset, word);
+		if (difference == 0)
+			return middle;
+		else if (difference < 0)
+			low = middle + 1;
+		else
+			high = middle - 1;
+	}
+
+	return NULL;
+}
diff --git a/src/include/common/keywords.h b/src/include/common/keywords.h
index 0b31505b66..7cd7bf4461 100644
--- a/src/include/common/keywords.h
+++ b/src/include/common/keywords.h
@@ -28,11 +28,20 @@ typedef struct ScanKeyword
 	int16		category;		/* see codes above */
 } ScanKeyword;
+typedef struct ScanKeywordOffset
+{
+	int32		offset;			/* offset into a keyword string */
+	int16		value;			/* grammar's token code */
+	int16		category;		/* see codes
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes: > On 12/18/18, Tom Lane wrote: >> I'd be kind of inclined to convert all uses of ScanKeyword to the new way, >> if only for consistency's sake. On the other hand, I'm not the one >> volunteering to do the work. > That's reasonable, as long as the design is nailed down first. Along > those lines, attached is a heavily WIP patch that only touches plpgsql > unreserved keywords, to test out the new methodology in a limited > area. After settling APIs and name/directory bikeshedding, I'll move > on to the other four keyword types. Let the bikeshedding begin ... > There's a new Perl script, src/common/gen_keywords.pl, I'd be inclined to put the script in src/tools, I think. IMO src/common is for code that actually gets built into our executables. > which takes > pl_unreserved_kwlist.h as input and outputs > pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h. I wonder whether we'd not be better off producing just one output file, in which we have the offsets emitted as PG_KEYWORD macros and then the giant string emitted as a macro definition, ie something like

#define PG_KEYWORD_STRING \
	"absolute\0" \
	"alias\0" \
	...

That simplifies the Makefile-hacking, at least, and it possibly gives callers more flexibility about what they actually want to do with the string. > The > output headers are not installed or symlinked anywhere. Since the > input keyword lists will never be #included directly, they might be > better as .txt files, like errcodes.txt. If we went that far, we might > also remove the PG_KEYWORD macros (they'd still be in the output > files) and rename parser/kwlist.h to common/core_kwlist.txt. There's > also a case for not changing things unnecessarily, especially if > there's ever a new reason to include the base keyword list directly. I'm for "not change things unnecessarily". 
People might well be scraping the keyword list out of parser/kwlist.h for other purposes right now --- indeed, it's defined the way it is exactly to let people do that. I don't see a good reason to force them to redo whatever tooling they have that depends on that. So let's build kwlist_offsets.h alongside that, but not change kwlist.h itself. > To keep the other keyword types functional, I had to add a separate > new struct ScanKeywordOffset and new function > ScanKeywordLookupOffset(), so the patch is a bit messier than the > final will be. Check. > I used the global .gitignore, but maybe that's an abuse of it. Yeah, I'd say it is. > +# TODO: Error out if the keyword names are not in ASCII order. +many for including such a check. Also note that we don't require people to have Perl installed when building from a tarball. Therefore, these derived headers must get built during "make distprep" and removed by maintainer-clean but not distclean. I think this also has some implications for VPATH builds, but as long as you follow the pattern used for other derived header files (e.g. fmgroids.h), you should be fine. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/19/18, Andrew Gierth wrote: > Is there any particular reason not to go further and use a perfect hash > function for the lookup, rather than binary search? When I was investigating faster algorithms, I ruled out gperf based on discussions in the archives. The approach here has modest goals and shouldn't be too invasive. With the makefile support and separate keyword files in place, that'll be one less thing to do if we ever decide to replace binary search. The giant string will likely be useful as well. Since we're on the subject, I think some kind of trie would be ideal performance-wise, but a large amount of work. The nice thing about a trie is that it can be faster than a hash table for a key miss. I found a paper that described some space-efficient trie variations [1], but we'd likely have to code the algorithm and a way to emit a C code representation of it. I've found some libraries, but that would have more of the same difficulties in practicality that gperf had. [1] https://infoscience.epfl.ch/record/64394/files/triesearches.pdf -John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Andrew Gierth writes: > Is there any particular reason not to go further and use a perfect hash > function for the lookup, rather than binary search? Tooling? I seem to recall having looked at gperf and deciding that it pretty much sucked, so it's not real clear to me what we would use. regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Hi, On 2018-12-20 00:54:39 +, Andrew Gierth wrote: > > "John" == John Naylor writes: > > > On 12/18/18, Tom Lane wrote: > >> I'd be kind of inclined to convert all uses of ScanKeyword to the > >> new way, if only for consistency's sake. On the other hand, I'm not > >> the one volunteering to do the work. > > John> That's reasonable, as long as the design is nailed down first. > John> Along those lines, attached is a heavily WIP patch that only > John> touches plpgsql unreserved keywords, to test out the new > John> methodology in a limited area. After settling APIs and > John> name/directory bikeshedding, I'll move on to the other four > John> keyword types. > > Is there any particular reason not to go further and use a perfect hash > function for the lookup, rather than binary search? The last time I looked into perfect hash functions, it wasn't easy to find a generator that competed with a decent normal hashtable (in particular gperf's are very unconvincing). The added tooling is a concern imo. OTOH, we're comparing not with a hashtable, but a binary search, where the latter will usually lose. Wonder if we shouldn't generate a serialized non-perfect hashtable instead. The lookup code for a read-only hashtable without concern for adversarial input is pretty trivial. Greetings, Andres Freund
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
> "John" == John Naylor writes: > On 12/18/18, Tom Lane wrote: >> I'd be kind of inclined to convert all uses of ScanKeyword to the >> new way, if only for consistency's sake. On the other hand, I'm not >> the one volunteering to do the work. John> That's reasonable, as long as the design is nailed down first. John> Along those lines, attached is a heavily WIP patch that only John> touches plpgsql unreserved keywords, to test out the new John> methodology in a limited area. After settling APIs and John> name/directory bikeshedding, I'll move on to the other four John> keyword types. Is there any particular reason not to go further and use a perfect hash function for the lookup, rather than binary search? -- Andrew (irc:RhodiumToad)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/18/18, Tom Lane wrote: > I'd be kind of inclined to convert all uses of ScanKeyword to the new way, > if only for consistency's sake. On the other hand, I'm not the one > volunteering to do the work. That's reasonable, as long as the design is nailed down first. Along those lines, attached is a heavily WIP patch that only touches plpgsql unreserved keywords, to test out the new methodology in a limited area. After settling APIs and name/directory bikeshedding, I'll move on to the other four keyword types. There's a new Perl script, src/common/gen_keywords.pl, which takes pl_unreserved_kwlist.h as input and outputs pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h. The output headers are not installed or symlinked anywhere. Since the input keyword lists will never be #included directly, they might be better as .txt files, like errcodes.txt. If we went that far, we might also remove the PG_KEYWORD macros (they'd still be in the output files) and rename parser/kwlist.h to common/core_kwlist.txt. There's also a case for not changing things unnecessarily, especially if there's ever a new reason to include the base keyword list directly. To keep the other keyword types functional, I had to add a separate new struct ScanKeywordOffset and new function ScanKeywordLookupOffset(), so the patch is a bit messier than the final will be. With a 4-byte offset, ScanKeywordOffset is 8 bytes, down from 12, and is now a power of 2. I used the global .gitignore, but maybe that's an abuse of it. Make check passes, but I don't know how well it stresses keyword use. I'll create a commitfest entry soon. 
-John Naylor

 .gitignore                                |   2 +
 src/common/gen_keywords.pl                | 151 ++
 src/common/keywords.c                     |  55 +++
 src/include/common/keywords.h             |  14 +++
 src/pl/plpgsql/src/Makefile               |  13 ++-
 src/pl/plpgsql/src/pl_scanner.c           | 108 -
 src/pl/plpgsql/src/pl_unreserved_kwlist.h | 107 +
 7 files changed, 355 insertions(+), 95 deletions(-)

diff --git a/.gitignore b/.gitignore
index 794e35b73c..8a9a05c20a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -31,6 +31,8 @@ win32ver.rc
 *.exe
 lib*dll.def
 lib*.pc
+*kwlist_offset.h
+*kwlist_string.h
 # Local excludes in root directory
 /GNUmakefile
diff --git a/src/common/gen_keywords.pl b/src/common/gen_keywords.pl
new file mode 100644
index 00..a8a8576d43
--- /dev/null
+++ b/src/common/gen_keywords.pl
@@ -0,0 +1,151 @@
+#--
+
+# gen_keywords.pl
+#	Perl script that generates *_offset.h and *_string.h from a given
+#	keyword list file. These headers are then included into files that
+#	call ScanKeywordLookup() on that keyword list. The keyword name is
+#	is represented as an offset into a single string.
+#
+# Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/commen/gen_keywords.pl
+#
+#--
+
+my $output_path = '';
+my $kw_input_file;
+
+# Process command line switches.
+while (@ARGV)
+{
+	my $arg = shift @ARGV;
+	if ($arg !~ /^-/)
+	{
+		$kw_input_file = $arg;
+	}
+	elsif ($arg =~ /^-o/)
+	{
+		$output_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV;
+	}
+	else
+	{
+		usage();
+	}
+}
+
+# Make sure output_path ends in a slash.
+if ($output_path ne '' && substr($output_path, -1) ne '/')
+{
+	$output_path .= '/';
+}
+
+$kw_input_file =~ /(\w*kwlist)\.h/;
+my $base_filename = $1;
+
+my $kw_offset_file = $output_path . $base_filename . '_offset.h';
+my $kw_string_file = $output_path . $base_filename . '_string.h';
+
+open(my $kif, '<', $kw_input_file) || die "$kw_input_file: $!";
+open(my $kof, '>', $kw_offset_file) || die "$kw_offset_file: $!";
+open(my $ksf, '>', $kw_string_file) || die "$kw_string_file: $!";
+
+# Opening boilerplate for keyword offset header.
+printf $kof <)
+{
+	if (/^PG_KEYWORD\("(\w+)",\s+(\w+),\s+(\w+)\)/)
+	{
+		$name = $1;
+		$value = $2;
+		$category = $3;
+
+		push @keywords, $name;
+
+		# Emit ScanKeyword macros with numerical offsets instead of text.
+		print $kof "PG_KEYWORD($offset, $value, $category)\n";
+
+		# Calculate the cumulative offset of the next keyword,
+		# taking into account the null terminator.
+		$offset += length($name) + 1;
+	}
+}
+
+# TODO: Error out if the keyword names are not in ASCII order.
+
+printf $ksf qq|const char *%s_string =\n"|, $base_filename;
+print $ksf join qq|\\0"\n"|, @keywords;
+printf $ksf qq|";\n\n#endif\t\t\t\t\t\t\t/* %s_STRING_H */\n|, uc $base_filename;
+
+sub usage
+{
+	die <= NAMEDATALEN)
+		return NULL;
+
+	/*
+	 * Apply an ASCII-only downcasing. We must not use tolower() since it may
+	 * produce the wrong translation in some locales (eg, Turkish).
+	 */
+	for (i = 0;
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> On 12/17/18, Tom Lane wrote:
>> Also, wouldn't we also adopt this technology for its unreserved keywords,
>> too?

> We wouldn't be forced to, but there might be other reasons to do so.
> Were you thinking of code consistency (within pl_scanner.c or
> globally)? Or something else?

> If we did adopt this setup for plpgsql unreserved keywords,
> ecpg/preproc/ecpg_keywords.c and ecpg/preproc/c_keywords.c would be
> left using the current ScanKeyword struct for search. Using offset
> search for all 5 types of keywords would be globally consistent, but
> it also means additional headers, generated headers, and makefile
> rules.

I'd be kind of inclined to convert all uses of ScanKeyword to the new
way, if only for consistency's sake.  On the other hand, I'm not the
one volunteering to do the work.

			regards, tom lane
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 12/17/18, Tom Lane wrote:
> John Naylor writes:
>> Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
>> reserved_keywords[], too. Those don't change much, so we can probably
>> get away with hard-coding the offsets and the giant string in that
>> case. (If that's not acceptable, we could separate that out to
>> pl_reserved_kwlist.h and reuse the above tooling to generate
>> pl_reserved_kwlist_{offset,string}.h, but that's more complex.)
>
> plpgsql isn't as stable as all that: people propose new syntax for it
> all the time.  I do not think a hand-maintained array would be pleasant
> at all.

Okay.

> Also, wouldn't we also adopt this technology for its unreserved keywords,
> too?

We wouldn't be forced to, but there might be other reasons to do so.
Were you thinking of code consistency (within pl_scanner.c or
globally)? Or something else?

If we did adopt this setup for plpgsql unreserved keywords,
ecpg/preproc/ecpg_keywords.c and ecpg/preproc/c_keywords.c would be
left using the current ScanKeyword struct for search. Using offset
search for all 5 types of keywords would be globally consistent, but
it also means additional headers, generated headers, and makefile
rules.

-John Naylor
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
John Naylor writes:
> A few months ago I was looking into faster search algorithms for
> ScanKeywordLookup(), so this is interesting to me. While an optimal
> full replacement would be a lot of work, the above ideas are much less
> invasive and would still have some benefit. Unless anyone intends to
> work on this, I'd like to flesh out the offset-into-giant-string
> approach a bit further:

Have at it...

> Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
> reserved_keywords[], too. Those don't change much, so we can probably
> get away with hard-coding the offsets and the giant string in that
> case. (If that's not acceptable, we could separate that out to
> pl_reserved_kwlist.h and reuse the above tooling to generate
> pl_reserved_kwlist_{offset,string}.h, but that's more complex.)

plpgsql isn't as stable as all that: people propose new syntax for it
all the time.  I do not think a hand-maintained array would be pleasant
at all.

Also, wouldn't we also adopt this technology for its unreserved keywords,
too?

			regards, tom lane
reducing the footprint of ScanKeyword (was Re: Large writable variables)
On 10/15/18, Tom Lane wrote:
> Andres Freund writes:
>> On 2018-10-15 16:36:26 -0400, Tom Lane wrote:
>>> We could possibly fix these by changing the data structure so that
>>> what's in a ScanKeywords entry is an offset into some giant string
>>> constant somewhere.  No idea how that would affect performance, but
>>> I do notice that we could reduce the sizeof(ScanKeyword), which can't
>>> hurt.
>
>> Yea, that might even help performancewise. Alternatively we could change
>> ScanKeyword to store the keyword name inline, but that'd be a measurable
>> size increase...
>
> Yeah.  It also seems like doing it this way would improve locality of
> access: the pieces of the giant string would presumably be in the same
> order as the ScanKeywords entries, whereas with the current setup,
> who knows where the compiler has put 'em or in what order.
>
> We'd need some tooling to generate the constants that way, though;
> I can't see how to make it directly from kwlist.h.

A few months ago I was looking into faster search algorithms for
ScanKeywordLookup(), so this is interesting to me. While an optimal
full replacement would be a lot of work, the above ideas are much less
invasive and would still have some benefit. Unless anyone intends to
work on this, I'd like to flesh out the offset-into-giant-string
approach a bit further:

Since there are several callers of the current approach that don't use
the core keyword list, we'd have to keep the existing struct and lookup
function, to keep the complexity manageable. Once we have an
offset-based struct and function, it makes sense to use it for all
searches of core keywords. This includes not only the core scanner, but
also adt/ruleutils.c, fe_utils/string_utils.c, and
ecpg/preproc/keywords.c.

There would need to be a header with offsets replacing name strings,
generated from parser/kwlist.h, maybe kwlist_offset.h. It'd probably be
convenient if it was emitted into the common/ dir. The giant string
would likely need its own header (kwlist_string.h?).

Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
reserved_keywords[], too. Those don't change much, so we can probably
get away with hard-coding the offsets and the giant string in that
case. (If that's not acceptable, we could separate that out to
pl_reserved_kwlist.h and reuse the above tooling to generate
pl_reserved_kwlist_{offset,string}.h, but that's more complex.)

The rest should be just a SMOP. Any issues I left out?

-John Naylor