[PATCHES] ilike multi-byte pattern cache
The attached patch implements a one-value pattern cache for the multi-byte encoding case for ILIKE. This reduces calls to lower() by (50% -1) in the common case where the pattern is a constant. My own testing and Guillaume Smet's show that this cuts roughly in half the performance penalty we inflicted by using lower() in that case. Is this sufficiently low risk to sneak into 8.3? cheers andrew Index: src/backend/utils/adt/like.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/like.c,v retrieving revision 1.71 diff -c -r1.71 like.c *** src/backend/utils/adt/like.c 22 Sep 2007 03:58:34 - 1.71 --- src/backend/utils/adt/like.c 22 Sep 2007 12:16:23 - *** *** 139,144 --- 139,149 *p; int slen, plen; + static char patcache[512], lpatcache[512]; + static int patcachelen = 0, lpatcachelen = 0; + + p = VARDATA_ANY(pat); + plen = VARSIZE_ANY_EXHDR(pat); /* For efficiency reasons, in the single byte case we don't call * lower() on the pattern and text, but instead call to_lower on each *** *** 147,156 if (pg_database_encoding_max_length() 1) { /* lower's result is never packed, so OK to use old macros here */ - pat = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(pat))); - p = VARDATA(pat); - plen = (VARSIZE(pat) - VARHDRSZ); str = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(str))); s = VARDATA(str); slen = (VARSIZE(str) - VARHDRSZ); --- 152,192 if (pg_database_encoding_max_length() 1) { + if (plen 0 plen == patcachelen strncmp(p,patcache,plen) == 0) + { + p = lpatcache; + plen = lpatcachelen; + } + else + { + char *lp; + int lplen; + + pat = DatumGetTextP(DirectFunctionCall1(lower, + PointerGetDatum(pat))); + + /* lower's result is never packed, so OK to use old macros here */ + lp = VARDATA(pat); + lplen = (VARSIZE(pat) - VARHDRSZ); + + if (plen 512 lplen 512) + { + patcachelen = plen; + lpatcachelen = lplen; + memcpy(patcache,p,plen); + memcpy(lpatcache,lp,lplen); + } + else + { + patcachelen = 0; + lpatcachelen = 0; + } + + p = lp; + plen = lplen; + } + /* lower's result is never packed, so OK to use old macros here */ str = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(str))); s = VARDATA(str); slen = (VARSIZE(str) - VARHDRSZ); *** *** 161,168 } else { - p = VARDATA_ANY(pat); - plen = VARSIZE_ANY_EXHDR(pat); s = VARDATA_ANY(str); slen = VARSIZE_ANY_EXHDR(str); return SB_IMatchText(s, slen, p, plen); --- 197,202 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] ilike multi-byte pattern cache
Andrew Dunstan [EMAIL PROTECTED] writes: The attached patch implements a one-value pattern cache for the multi-byte encoding case for ILIKE. This reduces calls to lower() by (50% -1) in the common case where the pattern is a constant. My own testing and Guillaume Smet's show that this cuts roughly in half the performance penalty we inflicted by using lower() in that case. Is this sufficiently low risk to sneak into 8.3? This seems awfully ugly ... and considering that you don't get to avoid lower() on the data side, it seems pretty dubious that it buys very much percentagewise. It would also be a net loss for non-constant patterns, which are by no means unheard of --- or even two constant patterns used in the same query. We've lived with this in 8.2 without much complaint. I think we can let it go until we think of a better solution. To my mind this is all tied up in the problem of handling locales in a better fashion --- a lot of the inefficiency of lower() is due to having a poor impedance match to the libc locale-related functions, and might be eliminated if we had locale support with better APIs. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] ilike multi-byte pattern cache
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: The attached patch implements a one-value pattern cache for the multi-byte encoding case for ILIKE. This reduces calls to lower() by (50% -1) in the common case where the pattern is a constant. My own testing and Guillaume Smet's show that this cuts roughly in half the performance penalty we inflicted by using lower() in that case. Is this sufficiently low risk to sneak into 8.3? This seems awfully ugly ... and considering that you don't get to avoid lower() on the data side, it seems pretty dubious that it buys very much percentagewise. It would also be a net loss for non-constant patterns, which are by no means unheard of --- or even two constant patterns used in the same query. The cost of using lower() is demonstrably high. Even on a very small pattern the speedup is easily measurable. The cost of the patch is effectively 1 call to strcmp() per call and 2 calls to memcpy() per cache miss, which should be quite cheap. We've lived with this in 8.2 without much complaint. I think we can let it go until we think of a better solution. To my mind this is all tied up in the problem of handling locales in a better fashion --- a lot of the inefficiency of lower() is due to having a poor impedance match to the libc locale-related functions, and might be eliminated if we had locale support with better APIs. Well, we have a complaint now :-( This was aimed at being some temporary relief rather than a long term fix. I guess Guillaume can use this as a patch if he wants. I agree that we need better locale APIs. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Various fixes for syslogger
ITAGAKI Takahiro [EMAIL PROTECTED] writes: Andrew Dunstan [EMAIL PROTECTED] wrote: Can you pinpoint where this change in argv/argc was made? I'm having trouble locating it. Oops, I misread the codes. There was the same bug in 8.0, too. Therefore, the fix should be appled to 8.0, 8.1, 8.2 and 8.3dev. We've had the bug in all versions after syslogger was added. I've applied this fix, but only as far back as 8.2, on two grounds: 1. It's only likely to be of interest to developers. 2. We are no longer supporting 8.0 or 8.1 on Windows. While it's theoretically not a Windows-specific bug, in practice no one is going to use the EXEC_BACKEND option on other platforms. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [DOCS] Patch to update log levels
Joshua D. Drake [EMAIL PROTECTED] writes: Do we have some kind of correlation for eventlog on windows? Then I could just use a table to show the relationships. Done. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] msvc, build and install with cygwin in the PATH
Hannes Eder wrote: Magnus Hagander wrote: Hannes Eder wrote: Is it worth doing this the Perl-way and using File::Find? If so, I can work an a patch for that. It's certainly cleaner that way, but I don't find it a major issue. But I'd rather see that fix than the other one. Here we go. See attached patch. Your comments are welcome. I have committed a fix that is somewhat similar to this. The Install.pm module needs some love, but that will have to wait till the next cycle. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?
I had some spare cycles so I went ahead and patched this. Patch includes documentation and new regression tests. While I was in there I also added regression tests for quote_ident(), which appeared to be absent. quote_literal doesn't seem to have any regression tests either, but I decided to leave that for another patch. With thanks to Neil Conway for his assistance on IRC. Cheers BJ On 9/15/07, Bruce Momjian [EMAIL PROTECTED] wrote: This has been saved for the 8.4 release: Brendan Jurd wrote: Hi hackers, I note that we currently expose the usefulness of the quote_identifier function to the user with quote_ident(text). Is there any reason we shouldn't do the same with quote_qualified_identifier? We could just add a quote_qualified_ident(text, text) ... it would make forming dynamic queries more convenient in databases that use multiple schemas. Clearly a DBA could just create this function himself in SQL (and it wouldn't be difficult), but is that a good reason not to have it in our standard set of functions? Would be happy to cook up a patch for this. Cheers, BJ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/func.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.397 diff -c -r1.397 func.sgml *** doc/src/sgml/func.sgml 19 Sep 2007 03:13:57 - 1.397 --- doc/src/sgml/func.sgml 22 Sep 2007 03:07:26 - *** *** 1276,1281 --- 1276,1284 primaryquote_ident/primary /indexterm indexterm + primaryquote_qualified_ident/primary +/indexterm +indexterm primaryquote_literal/primary /indexterm indexterm *** *** 1541,1546 --- 1544,1563 /row row + entryliteralfunctionquote_qualified_ident/function(parameterschema/parameter typetext/type, parameteridentifier/parameter typetext/type)/literal/entry +entrytypetext/type/entry +entry + Return the given schema and identifier suitably quoted to be used as a + fully qualified identifier in an acronymSQL/acronym statement + string. Quoting is performed as for functionquote_ident/function, + but parameterschema/parameter and parameteridentifier/parameter + are quoted separately. +/entry +entryliteralquote_ident('Some schema','A table')/literal/entry +entryliteralSome schema.A table/literal/entry + /row + + row entryliteralfunctionquote_literal/function(parameterstring/parameter)/literal/entry entrytypetext/type/entry entry Index: src/backend/utils/adt/quote.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/quote.c,v retrieving revision 1.22 diff -c -r1.22 quote.c *** src/backend/utils/adt/quote.c 27 Feb 2007 23:48:08 - 1.22 --- src/backend/utils/adt/quote.c 22 Sep 2007 03:07:26 - *** *** 46,51 --- 46,77 } /* + * quote_qualified_ident - + *returns a properly quoted, schema-qualified identifier + */ + Datum + quote_qualified_ident(PG_FUNCTION_ARGS) + { + text*schema = PG_GETARG_TEXT_P(0); + text*ident = PG_GETARG_TEXT_P(1); + text*result; + const char *quoted; + char*schema_s; + char*ident_s; + + schema_s = DatumGetCString(DirectFunctionCall1(textout, + PointerGetDatum(schema))); + ident_s = DatumGetCString(DirectFunctionCall1(textout, + PointerGetDatum(ident))); + + quoted = quote_qualified_identifier(schema_s, ident_s); + + result = DatumGetTextP(DirectFunctionCall1(textin, + CStringGetDatum(quoted))); + PG_RETURN_TEXT_P(result); + } + + /* * quote_literal - * returns a properly quoted literal * Index: src/include/catalog/pg_proc.h === RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v retrieving revision 1.471 diff -c -r1.471 pg_proc.h *** src/include/catalog/pg_proc.h 20 Sep 2007 17:56:32
Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?
Brendan Jurd [EMAIL PROTECTED] writes: Patch includes documentation and new regression tests. While I was in there I also added regression tests for quote_ident(), which appeared to be absent. This seems rather pointless, since it's equivalent to quote_ident(schemaname) || '.' || quote_ident(relname). regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?
On 9/23/07, Tom Lane [EMAIL PROTECTED] wrote: This seems rather pointless, since it's equivalent to quote_ident(schemaname) || '.' || quote_ident(relname). Yes it is, and I brought that up in the OP: I wrote: Clearly a DBA could just create this function himself in SQL (and it wouldn't be difficult), but is that a good reason not to have it in our standard set of functions? But since nobody arced up about it I thought I might as well move things along and produce a patch. Many of the functions provided by postgres are easy to write yourself. That doesn't mean they shouldn't be there. After all, there is *exactly* one way to do quote_qualified_ident. Why require every DBA who needs this functionality to go through the motions? I'll admit that it's a minor improvement, but that seems reasonable given it has a miniscule cost. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 'Waiting on lock'
On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote: related TODO items: - add a WAIT n clause in same SQL locations as NOWAIT - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting) to provide better control over lock waits. are these actual TODO items? i can't find them on the TODO list and i don't remember any discussion nor patch about this -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings