[PATCHES] ilike multi-byte pattern cache

2007-09-22 Thread Andrew Dunstan


The attached patch implements a one-value pattern cache for the 
multi-byte encoding case for ILIKE. This reduces calls to lower() by 
(50% -1) in the common case where the pattern is a constant.  My own 
testing and Guillaume Smet's show that this cuts roughly in half the 
performance penalty we inflicted by using lower() in that case.


Is this sufficiently low risk to sneak into 8.3?

cheers

andrew
Index: src/backend/utils/adt/like.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/like.c,v
retrieving revision 1.71
diff -c -r1.71 like.c
*** src/backend/utils/adt/like.c	22 Sep 2007 03:58:34 -	1.71
--- src/backend/utils/adt/like.c	22 Sep 2007 12:16:23 -
***
*** 139,144 
--- 139,149 
  			   *p;
  	int			slen,
  plen;
+ 	static char patcache[512], lpatcache[512];
+ static int  patcachelen = 0, lpatcachelen = 0;
+ 
+ 	p = VARDATA_ANY(pat);
+ 	plen = VARSIZE_ANY_EXHDR(pat);
  
  	/* For efficiency reasons, in the single byte case we don't call
  	 * lower() on the pattern and text, but instead call to_lower on each
***
*** 147,156 
  
  	if (pg_database_encoding_max_length()  1)
  	{
  		/* lower's result is never packed, so OK to use old macros here */
- 		pat = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(pat)));
- 		p = VARDATA(pat);
- 		plen = (VARSIZE(pat) - VARHDRSZ);
  		str = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(str)));
  		s = VARDATA(str);
  		slen = (VARSIZE(str) - VARHDRSZ);
--- 152,192 
  
  	if (pg_database_encoding_max_length()  1)
  	{
+ 		if (plen  0   plen == patcachelen  strncmp(p,patcache,plen) == 0)
+ 		{
+ 			p = lpatcache;
+ 			plen = lpatcachelen;
+ 		}
+ 		else
+ 		{
+ 			char *lp;
+ 			int   lplen;
+ 
+ 			pat = DatumGetTextP(DirectFunctionCall1(lower, 
+ 	PointerGetDatum(pat)));
+ 
+ 			/* lower's result is never packed, so OK to use old macros here */
+ 			lp = VARDATA(pat);
+ 			lplen = (VARSIZE(pat) - VARHDRSZ);
+ 
+ 			if (plen  512  lplen  512)
+ 			{
+ patcachelen = plen;
+ lpatcachelen = lplen;
+ memcpy(patcache,p,plen);
+ memcpy(lpatcache,lp,lplen);
+ 			}
+ 			else
+ 			{
+ patcachelen = 0;
+ lpatcachelen = 0;
+ 			}
+ 
+ 			p = lp;
+ 			plen = lplen;
+ 		}
+ 
  		/* lower's result is never packed, so OK to use old macros here */
  		str = DatumGetTextP(DirectFunctionCall1(lower, PointerGetDatum(str)));
  		s = VARDATA(str);
  		slen = (VARSIZE(str) - VARHDRSZ);
***
*** 161,168 
  	}
  	else
  	{
- 		p = VARDATA_ANY(pat);
- 		plen = VARSIZE_ANY_EXHDR(pat);
  		s = VARDATA_ANY(str);
  		slen = VARSIZE_ANY_EXHDR(str);
  		return SB_IMatchText(s, slen, p, plen);
--- 197,202 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] ilike multi-byte pattern cache

2007-09-22 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 The attached patch implements a one-value pattern cache for the 
 multi-byte encoding case for ILIKE. This reduces calls to lower() by 
 (50% -1) in the common case where the pattern is a constant.  My own 
 testing and Guillaume Smet's show that this cuts roughly in half the 
 performance penalty we inflicted by using lower() in that case.

 Is this sufficiently low risk to sneak into 8.3?

This seems awfully ugly ... and considering that you don't get to
avoid lower() on the data side, it seems pretty dubious that it
buys very much percentagewise.  It would also be a net loss for
non-constant patterns, which are by no means unheard of --- or even
two constant patterns used in the same query.

We've lived with this in 8.2 without much complaint.  I think we can
let it go until we think of a better solution.  To my mind this is
all tied up in the problem of handling locales in a better fashion
--- a lot of the inefficiency of lower() is due to having a poor
impedance match to the libc locale-related functions, and might be
eliminated if we had locale support with better APIs.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] ilike multi-byte pattern cache

2007-09-22 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  
The attached patch implements a one-value pattern cache for the 
multi-byte encoding case for ILIKE. This reduces calls to lower() by 
(50% -1) in the common case where the pattern is a constant.  My own 
testing and Guillaume Smet's show that this cuts roughly in half the 
performance penalty we inflicted by using lower() in that case.



  

Is this sufficiently low risk to sneak into 8.3?



This seems awfully ugly ... and considering that you don't get to
avoid lower() on the data side, it seems pretty dubious that it
buys very much percentagewise.  It would also be a net loss for
non-constant patterns, which are by no means unheard of --- or even
two constant patterns used in the same query.
  


The cost of using lower() is demonstrably high. Even on a very small 
pattern the speedup is easily measurable.


The cost of the patch is effectively 1 call to strcmp() per call and 2 
calls to memcpy() per cache miss, which should be quite cheap.



We've lived with this in 8.2 without much complaint.  I think we can
let it go until we think of a better solution.  To my mind this is
all tied up in the problem of handling locales in a better fashion
--- a lot of the inefficiency of lower() is due to having a poor
impedance match to the libc locale-related functions, and might be
eliminated if we had locale support with better APIs.


  


Well, we have a complaint now :-( This was aimed at being some temporary 
relief rather than a long term fix. I guess Guillaume can use this as a 
patch if he wants.


I agree that we need better locale APIs.

cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] Various fixes for syslogger

2007-09-22 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 Andrew Dunstan [EMAIL PROTECTED] wrote:
 Can you pinpoint where this change in argv/argc was made? I'm having 
 trouble locating it.

 Oops, I misread the codes. There was the same bug in 8.0, too.
 Therefore, the fix should be appled to 8.0, 8.1, 8.2 and 8.3dev.
 We've had the bug in all versions after syslogger was added.

I've applied this fix, but only as far back as 8.2, on two grounds:
1. It's only likely to be of interest to developers.
2. We are no longer supporting 8.0 or 8.1 on Windows.

While it's theoretically not a Windows-specific bug, in practice
no one is going to use the EXEC_BACKEND option on other platforms.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [DOCS] Patch to update log levels

2007-09-22 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Do we have some kind of correlation for eventlog on windows? Then I
 could just use a table to show the relationships.

Done.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] msvc, build and install with cygwin in the PATH

2007-09-22 Thread Andrew Dunstan



Hannes Eder wrote:

Magnus Hagander wrote:
Hannes Eder wrote:
 Is it worth doing this the Perl-way and using File::Find? If so, 
I can

 work an a patch for that.

 It's certainly cleaner that way, but I don't find it a major issue. 
But I'd

 rather see that fix than the other one.

Here we go. See attached patch. Your comments are welcome.




I have committed a fix that is somewhat similar to this. The Install.pm 
module needs some love, but that will have to wait till the next cycle.


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?

2007-09-22 Thread Brendan Jurd
I had some spare cycles so I went ahead and patched this.

Patch includes documentation and new regression tests.  While I was in
there I also added regression tests for quote_ident(), which appeared
to be absent.

quote_literal doesn't seem to have any regression tests either, but I
decided to leave that for another patch.

With thanks to Neil Conway for his assistance on IRC.

Cheers
BJ

On 9/15/07, Bruce Momjian [EMAIL PROTECTED] wrote:
 This has been saved for the 8.4 release:
 Brendan Jurd wrote:
  Hi hackers,
 
  I note that we currently expose the usefulness of the quote_identifier
  function to the user with quote_ident(text).
 
  Is there any reason we shouldn't do the same with 
  quote_qualified_identifier?
 
  We could just add a quote_qualified_ident(text, text) ... it would
  make forming dynamic queries more convenient in databases that use
  multiple schemas.
 
  Clearly a DBA could just create this function himself in SQL (and it
  wouldn't be difficult), but is that a good reason not to have it in
  our standard set of functions?
 
  Would be happy to cook up a patch for this.
 
  Cheers,
  BJ
 
  ---(end of broadcast)---
  TIP 9: In versions below 8.0, the planner will ignore your desire to
 choose an index scan if your joining column's datatypes do not
 match

 --
   Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
   EnterpriseDB   http://www.enterprisedb.com

   + If your life is a hard drive, Christ can be your backup. +

Index: doc/src/sgml/func.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.397
diff -c -r1.397 func.sgml
*** doc/src/sgml/func.sgml  19 Sep 2007 03:13:57 -  1.397
--- doc/src/sgml/func.sgml  22 Sep 2007 03:07:26 -
***
*** 1276,1281 
--- 1276,1284 
  primaryquote_ident/primary
 /indexterm
 indexterm
+ primaryquote_qualified_ident/primary
+/indexterm
+indexterm
  primaryquote_literal/primary
 /indexterm
 indexterm
***
*** 1541,1546 
--- 1544,1563 
/row
  
row
+
entryliteralfunctionquote_qualified_ident/function(parameterschema/parameter
 typetext/type, parameteridentifier/parameter 
typetext/type)/literal/entry
+entrytypetext/type/entry
+entry
+   Return the given schema and identifier suitably quoted to be 
used as a
+   fully qualified identifier in an acronymSQL/acronym 
statement
+   string.  Quoting is performed as for 
functionquote_ident/function,
+   but parameterschema/parameter and 
parameteridentifier/parameter
+   are quoted separately.
+/entry
+entryliteralquote_ident('Some schema','A table')/literal/entry
+entryliteralSome schema.A table/literal/entry
+   /row
+ 
+   row
 
entryliteralfunctionquote_literal/function(parameterstring/parameter)/literal/entry
 entrytypetext/type/entry
 entry
Index: src/backend/utils/adt/quote.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/quote.c,v
retrieving revision 1.22
diff -c -r1.22 quote.c
*** src/backend/utils/adt/quote.c   27 Feb 2007 23:48:08 -  1.22
--- src/backend/utils/adt/quote.c   22 Sep 2007 03:07:26 -
***
*** 46,51 
--- 46,77 
  }
  
  /*
+  * quote_qualified_ident -
+  *returns a properly quoted, schema-qualified identifier
+  */
+ Datum
+ quote_qualified_ident(PG_FUNCTION_ARGS)
+ {
+   text*schema = PG_GETARG_TEXT_P(0);
+   text*ident = PG_GETARG_TEXT_P(1);
+   text*result;
+   const char  *quoted;
+   char*schema_s;
+   char*ident_s;
+ 
+   schema_s = DatumGetCString(DirectFunctionCall1(textout, 
+   
   PointerGetDatum(schema)));
+   ident_s = DatumGetCString(DirectFunctionCall1(textout, 
+   
  PointerGetDatum(ident)));
+ 
+   quoted = quote_qualified_identifier(schema_s, ident_s);
+ 
+   result = DatumGetTextP(DirectFunctionCall1(textin, 
+   
   CStringGetDatum(quoted)));
+   PG_RETURN_TEXT_P(result);
+ }
+ 
+ /*
   * quote_literal -
   *  returns a properly quoted literal
   *
Index: src/include/catalog/pg_proc.h
===
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.471
diff -c -r1.471 pg_proc.h
*** src/include/catalog/pg_proc.h   20 Sep 2007 17:56:32 

Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?

2007-09-22 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 Patch includes documentation and new regression tests.  While I was in
 there I also added regression tests for quote_ident(), which appeared
 to be absent.

This seems rather pointless, since it's equivalent to
quote_ident(schemaname) || '.' || quote_ident(relname).

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?

2007-09-22 Thread Brendan Jurd
On 9/23/07, Tom Lane [EMAIL PROTECTED] wrote:
 This seems rather pointless, since it's equivalent to
 quote_ident(schemaname) || '.' || quote_ident(relname).

Yes it is, and I brought that up in the OP:

I wrote:
 Clearly a DBA could just create this function himself in SQL (and it
 wouldn't be difficult), but is that a good reason not to have it in
 our standard set of functions?

But since nobody arced up about it I thought I might as well move
things along and produce a patch.

Many of the functions provided by postgres are easy to write yourself.
 That doesn't mean they shouldn't be there.  After all, there is
*exactly* one way to do quote_qualified_ident.  Why require every DBA
who needs this functionality to go through the motions?

I'll admit that it's a minor improvement, but that seems reasonable
given it has a miniscule cost.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] 'Waiting on lock'

2007-09-22 Thread Jaime Casanova
On 6/19/07, Simon Riggs [EMAIL PROTECTED] wrote:

 related TODO items:
 - add a WAIT n clause in same SQL locations as NOWAIT
 - add a lock_wait_timeout (USERSET), default = 0 (unlimited waiting)

 to provide better control over lock waits.


are these actual TODO items? i can't find them on the TODO list and i
don't remember any discussion nor patch about this

-- 
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
   Richard Cook

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings