Re: [PATCHES] Proposed patch to change TOAST compression strategy

2008-02-29 Thread Teodor Sigaev

Tsvector dump (taken by Magnus from mail archives of pgsql's lists)
http://www.sigaev.ru/misc/tstest.sql.bz2

Query:
select sum(ts_rank( vector, 'asdfjkl' )) from tstest ;

ts_rank  detoasts value in any case, even tsvector doesn't contain needed 
lexemes.

Test was on my notebook: Core2 Duo 1.8MHz, 2Gb with default postgres.conf

8.4 without patch:
Time: 10883,368 ms

8.4 with patch (db was reinited)
Time: 9654,266 ms



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Tom Lane wrote:

I agree with Peter.  There are a whole lot of include files that are
needed by way more than 3 .c files, and yet are not folded into
postgres.h.  c.h is right out.
  
My concern is that when we start adding more probes (not just the 
backend), we will have to add the following 5 lines in .c files that use 
the Dtrace macros. This seems intrusive and messy to me instead of in a 
centralized place like c.h. What are the disadvantages for keeping the 
way it is now?


#ifdef ENABLE_DTRACE
#include utils/probes.h
#else
#include utils/probes_null.h
#endif

Regards,
-Robert

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Alvaro Herrera
Robert Lor wrote:

 My concern is that when we start adding more probes (not just the  
 backend), we will have to add the following 5 lines in .c files that use  
 the Dtrace macros. This seems intrusive and messy to me instead of in a  
 centralized place like c.h. What are the disadvantages for keeping the  
 way it is now?

 #ifdef ENABLE_DTRACE
 #include utils/probes.h
 #else
 #include utils/probes_null.h
 #endif

Why can't this block be centralized in probes.h?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Alvaro Herrera wrote:

Robert Lor wrote:

  
My concern is that when we start adding more probes (not just the  
backend), we will have to add the following 5 lines in .c files that use  
the Dtrace macros. This seems intrusive and messy to me instead of in a  
centralized place like c.h. What are the disadvantages for keeping the  
way it is now?


#ifdef ENABLE_DTRACE
#include utils/probes.h
#else
#include utils/probes_null.h
#endif



Why can't this block be centralized in probes.h?
  
probes.h is auto generated and it can certainly be massaged to include 
the above logic, but I'd like to avoid doing that if possible.


The thinking initially was to make this tracing feature more like a 
framework and make it as simple as possible to add new probes and as 
un-intrusive as possible, that's why I thought and still think that 
putting the includes in c.h makes sense, unless there are obvious 
disadvantages I'm not aware of.


Regards,
-Robert




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[PATCHES] remove TCL_ARRAYS

2008-02-29 Thread Alvaro Herrera
This patch removes the TCL_ARRAY symbol.  This seems to be a leftover
from when pgtcl was around in the backend; if enabled, it causes
array_out to emit bogus array values:

alvherre=# create table bar ( a text);
CREATE TABLE
alvherre=# insert into bar values ('foo');
INSERT 0 1
alvherre=# select array_append('{}', a) from bar;
 array_append 
--
 {foo}
(1 ligne)

The correct value is

alvherre=# select array_append('{}', a) from bar;
 array_append 
--
 {foo\}
(1 ligne)


Of course, the system does not accept the TCL_ARRAY value back.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/utils/adt/arrayfuncs.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/utils/adt/arrayfuncs.c,v
retrieving revision 1.140
diff -c -p -r1.140 arrayfuncs.c
*** src/backend/utils/adt/arrayfuncs.c	1 Jan 2008 19:45:52 -	1.140
--- src/backend/utils/adt/arrayfuncs.c	29 Feb 2008 14:30:28 -
*** array_out(PG_FUNCTION_ARGS)
*** 1030,1038 
  if (ch == '' || ch == '\\')
  {
  	needquote = true;
- #ifndef TCL_ARRAYS
  	overall_length += 1;
- #endif
  }
  else if (ch == '{' || ch == '}' || ch == typdelim ||
  		 isspace((unsigned char) ch))
--- 1030,1036 
*** array_out(PG_FUNCTION_ARGS)
*** 1103,1109 
  		if (needquotes[k])
  		{
  			APPENDCHAR('');
- #ifndef TCL_ARRAYS
  			for (tmp = values[k]; *tmp; tmp++)
  			{
  char		ch = *tmp;
--- 1101,1106 
*** array_out(PG_FUNCTION_ARGS)
*** 1113,1121 
  *p++ = ch;
  			}
  			*p = '\0';
- #else
- 			APPENDSTR(values[k]);
- #endif
  			APPENDCHAR('');
  		}
  		else
--- 1110,1115 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Peter Eisentraut wrote:
Another thing that is concerning me about this new approach is the way the 
probes are named.  For example, we'd now have a call


POSTGRESQL_LWLOCK_ACQUIRE()

in the code.  This does not say we are *tracing* lock aquisition, but it looks 
like a macro that actually acquires a lock.
  

Definitely a valid concern.
I understand that these probe names follow some global naming scheme.  Is it 
hard to change that?  I'd feel more comfortable with, say, 
(D)TRACE_POSTGRESQL_LWLOCK_ACQUIRE().
  
Because the macro is auto generated and follows certain naming 
conventions, prepending TRACE_ will not work. If you did that, the probe 
name will be called postgresql-lwlock-aquire and the provider will be 
trace which is not what we want.


To avoid the confusion, how about just adding a simple comment like /* 
DTrace probe or  Trace point or something similar */  before all 
occurrences of the macro calls?


Regards,
-Robert

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Alvaro Herrera
Robert Lor wrote:
 Alvaro Herrera wrote:

 Why can't this block be centralized in probes.h?
   
 probes.h is auto generated and it can certainly be massaged to include  
 the above logic, but I'd like to avoid doing that if possible.

Hmm, so let's have a third file that's not autogenerated, which is the
file we will use for #includes, and contains just that block.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Fix for initdb failures on Vista

2008-02-29 Thread Dave Page
On Fri, Feb 29, 2008 at 3:25 PM, Magnus Hagander [EMAIL PROTECTED] wrote:

 I noticed that as well when looking at the code, but since I ran my tests
 on non-vista platforms I didn't hit the actual problem.

 Dave - it shuold be a simple case of adding the same line of code to the
 regression tests, no?

Yeah, that should do it.

 Meanwhile, I'll apply what we have with my additios and cleanups per mine
 and Heikkis comments, because they fix the most important codepaths.

Thanks for putting the final polish on this.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Fix for initdb failures on Vista

2008-02-29 Thread Magnus Hagander
On Fri, Feb 29, 2008 at 12:17:51AM -0500, Andrew Dunstan wrote:
 
 
 Dave Page wrote:
 The attached patch fixes problems reported primarily on Vista, but
 also on some Windows 2003 and XP installations in which initdb reports
 that it cannot find postgres.exe.
 
 This occurs because of security-related changes implemented in Windows
 Vista and recent patches on older OS's. When running initdb or pg_ctl
 we currently create a restricted security token with the
 Administrators and Power Users groups (and thus their privileges)
 removed and re-execute the same program using the restricted token.
 This ensures that the process is run without potentially dangerous
 privileges no matter what user account it was started from. On Vista
 and friends however, the default DACL (list of Access Control Entries)
 used in the restricted token contains Administrators (the group) 
 System when we run as Administrator, vs. User + System when run as
 other users. Because we then drop Administrators, we are left with
 only the System ACE in the DACL, which does not allow us to use
 CreatePipe()/CreateProcess().
 
 To fix this, when we create the restricted process, we initially start
 it in suspended mode. We modify it's DACL to explicitly add an ACE for
 the current user, and then resume the child process. This remains
 secure because administrative privileges are granted to the groups
 that we've dropped, not the user itself.
 
 I've tested on Vista and XP, but additional testing would be useful
 (Andrew, Magnus?). Please apply to head, 8.3 and 8.2
 
   
 
 This appears to work for initdb. But make check fails after the initdb 
 stage, I think because pg_regress doesn't use pg_ctl to start the 
 postmaster. The log just reads Access is denied'
 
 I don't have too much difficulty with that as long as we stipulate that 
 postgres has to be built, or at least checked, as a non-privileged user 
 (c.f. recent discussion of building RPMs as root). Alternatively, we 
 should also patch pg_regress.c
 

I noticed that as well when looking at the code, but since I ran my tests
on non-vista platforms I didn't hit the actual problem.

Dave - it shuold be a simple case of adding the same line of code to the
regression tests, no?

Meanwhile, I'll apply what we have with my additios and cleanups per mine
and Heikkis comments, because they fix the most important codepaths.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Tom Lane
Robert Lor [EMAIL PROTECTED] writes:
 Peter Eisentraut wrote:
 I understand that these probe names follow some global naming scheme.  Is it 
 hard to change that?  I'd feel more comfortable with, say, 
 (D)TRACE_POSTGRESQL_LWLOCK_ACQUIRE().
 
 Because the macro is auto generated and follows certain naming 
 conventions, prepending TRACE_ will not work. If you did that, the probe 
 name will be called postgresql-lwlock-aquire and the provider will be 
 trace which is not what we want.

Ugh.  Is this tool really so badly designed that it thinks it has
ownership of the *entire* namespace within the target program?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] remove TCL_ARRAYS

2008-02-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 This patch removes the TCL_ARRAY symbol.

If you're going to remove it you should actually remove it
(eg from pg_config_manual.h).

 This seems to be a leftover
 from when pgtcl was around in the backend;

Yeah, it was supporting some kluge or other in the libpgtcl client.
AFAICT from the 7.x code, the client-side kluge was never enabled by
default anyway, so it seems unlikely anyone will miss it.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Robert Lor wrote:
 probes.h is auto generated and it can certainly be massaged to include  
 the above logic, but I'd like to avoid doing that if possible.

 Hmm, so let's have a third file that's not autogenerated, which is the
 file we will use for #includes, and contains just that block.

Or just two files.  Call probes_null something else, have it be included
where needed, have it include the autogenerated file when appropriate.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:
  

Hmm, so let's have a third file that's not autogenerated, which is the
file we will use for #includes, and contains just that block.



Or just two files.  Call probes_null something else, have it be included
where needed, have it include the autogenerated file when appropriate.

  


I really like this idea. So, we're now back to having  pg_trace.h which 
includes the autogenerated probes.h when Dtrace is enabled, else null 
macros will be used.


Currently, pg_trace.h is included in c.h, and I feel strongly that it 
should remains there because by design I'd like to
1)  have the tracing feature be available both in the frontend and 
backend without having to do anything extra, which also means that 
probes.h needs to be generated before any compilation
2)  centralize the include of this header just in case the 
implementation needs to be changed for some reason (eg, if this file 
needs to be splitted, etc)
3)  reduce the number of changes to a minimal when adding new probes to 
new .c files


I haven't heard any major disadvantages about keeping it in c.h, but if 
you are still adamant about keeping it out of c.h, I'll will go along 
with that.


Regards,
-Robert

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Tom Lane
Robert Lor [EMAIL PROTECTED] writes:
 I haven't heard any major disadvantages about keeping it in c.h, but if 
 you are still adamant about keeping it out of c.h, I'll will go along 
 with that.

I was thinking that pg_trace.h involved some backend-only code, but
I'm not sure why I thought that :-(.  Yeah, your plan to do it by
restructuring the contents of pg_trace.h sounds fine.

We still have what I consider a big problem with the names of the
macros.  Perhaps that could be fixed by passing the auto-generated
file through a sed script to put a prefix on the macro names before
we start to use it?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] CopyReadLineText optimization

2008-02-29 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
Attached is a patch that modifies CopyReadLineText so that it uses 
memchr to speed up the scan. The nice thing about memchr is that we can 
take advantage of any clever optimizations that might be in libc or 
compiler.


Here's an updated version of the patch. The principle is the same, but 
the same optimization is now used for CSV input as well, and there's 
more comments.


I still need to do more benchmarking. I mentioned a ~5% speedup on the 
test I ran earlier, which was a load of the lineitem table from TPC-H. 
It looks like with cheaper data types the gain can be much bigger; 
here's an oprofile from loading the TPC-H partsupp table,


Before:

samples  %image name   symbol name
5146 25.7635  postgres CopyReadLine
4089 20.4716  postgres DoCopy
1449  7.2544  reiserfs (no symbols)
1369  6.8539  postgres pg_verify_mbstr_len
1013  5.0716  libc-2.7.so  memcpy
749   3.7499  libc-2.7.so  strtod_l_internal
598   2.9939  postgres heap_formtuple
548   2.7436  libc-2.7.so  strtol_l_internal
403   2.0176  libc-2.7.so  memset
309   1.5470  libc-2.7.so  strlen
208   1.0414  postgres AllocSetAlloc
...

After:

samples  %image name   symbol name
4165 25.7879  postgres DoCopy
1574  9.7455  postgres pg_verify_mbstr_len
1520  9.4112  reiserfs (no symbols)
1005  6.2225  libc-2.7.so  memchr
986   6.1049  libc-2.7.so  memcpy
632   3.9131  libc-2.7.so  strtod_l_internal
589   3.6468  postgres heap_formtuple
546   3.3806  libc-2.7.so  strtol_l_internal
386   2.3899  libc-2.7.so  memset
366   2.2661  postgres CopyReadLine
287   1.7770  libc-2.7.so  strlen
215   1.3312  postgres LWLockAcquire
208   1.2878  postgres hash_any
176   1.0897  postgres LWLockRelease
161   0.9968  postgres InputFunctionCall
157   0.9721  postgres AllocSetAlloc
...

Profile shows that with the patch, ~8.5% of the CPU time is spent in 
CopyReadLine+memchr, vs. 25.5% before. That's a quite significant speedup.


I still need to test the worst-case performance, with input that has a 
lot of escapes. It would be interesting to hear reports with this patch 
from people on different platforms. These results are from my laptop 
with 32-bit Intel CPU, running Linux. There could be big differences in 
the memchr implementations.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/copy.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.295
diff -c -r1.295 copy.c
*** src/backend/commands/copy.c	1 Jan 2008 19:45:48 -	1.295
--- src/backend/commands/copy.c	29 Feb 2008 17:43:53 -
***
*** 67,72 
--- 67,275 
  	EOL_CRNL
  } EolType;
  
+ 
+ /* Fast byte searcher functions *
+  *
+  * CopyReadLine needs to search for newlines, as well as quoting and escape
+  * characters to determine which newlines are real and which ones are escaped.
+  * Doing that in the naive way, looping through the input byte by byte and
+  * comparing against the interesting characters, can be slow.
+  *
+  * To speed that up, we provide a searcher interface, that can be used to
+  * search a piece of memory for up to 4 different bytes simultaneously. It's
+  * similar to memchr, but allows searching for multiple chars at the same 
+  * time.
+  * 
+  * To start a search on a new block of memory, call init_searcher. Then call
+  * next_searcher repeatedly to return all the occurrances of the bytes being
+  * searched for.
+  *
+  * The searcher implementation uses memchr to search for the bytes, and keeps
+  * track of where the next occurrance of each is. Using memchr allows us
+  * to take advantage of any platform-specific optimizations.
+  */
+ 
+ /*
+  * Struct used to store state of the current search between calls to 
+  * next_searcher. Initialized by init_search.
+  */
+ typedef struct {
+ 	/* Each element in c corresponds the element in s. These are sorted
+ 	 * by the pointers (ptr) */
+ 	int n;		/* elements used in the arrays */
+ 	char c[4];	/* bytes we're looking for */
+ 	char *ptr[4]; /* pointers to next occurances of the bytes */
+ 	char *end;	/* end of block being searched. Last byte to search is end-1 */
+ } searcher;
+ 
+ /* Functions internal to searcher */
+ 
+ #define swap_searcher_entries(searcher, a, b) do { \
+ 	char tmpc = (searcher)-c[a]; \
+ 	char *tmpptr = (searcher)-ptr[a]; \
+ 	(searcher)-c[a] 

Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Peter Eisentraut
Am Freitag, 29. Februar 2008 schrieb Robert Lor:
 My concern is that when we start adding more probes (not just the
 backend), we will have to add the following 5 lines in .c files that use
 the Dtrace macros.

I had already solved this in my intermediate patch I sent you by symlinking 
probes_null.h to probes.h.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Fix for initdb failures on Vista

2008-02-29 Thread Magnus Hagander

Dave Page wrote:

On Fri, Feb 29, 2008 at 3:25 PM, Magnus Hagander [EMAIL PROTECTED] wrote:

I noticed that as well when looking at the code, but since I ran my tests
on non-vista platforms I didn't hit the actual problem.

Dave - it shuold be a simple case of adding the same line of code to the
regression tests, no?


Yeah, that should do it.


Will you provide a patch for that, or should I? (I remind you I have no 
box ready to reproduce it on, so it helps a lot if you can do it :-P)


//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Peter Eisentraut
Am Freitag, 29. Februar 2008 schrieb Robert Lor:
 Currently, pg_trace.h is included in c.h, and I feel strongly that it
 should remains there because by design I'd like to
 1)  have the tracing feature be available both in the frontend and
 backend without having to do anything extra,

I think each component would have its own probes definition file.

 which also means that 
 probes.h needs to be generated before any compilation

Well, you are going to have to do a lot more work on the makefiles if you want 
to do it that way.  Make works by defining dependencies between files, not by 
hoping that people will execute the commands in the order you write them.  If 
you want every single file in the tree to depend on a rule, you will have to 
do something different.

 2)  centralize the include of this header just in case the
 implementation needs to be changed for some reason (eg, if this file
 needs to be splitted, etc)

We have no evidence that anything like that will ever happen.

 3)  reduce the number of changes to a minimal when adding new probes to
 new .c files

These arguments seem irrelevant in my mind.  When you add new function calls, 
you will usually have to add new header files as well.  It's the normal way 
to do things.

 I haven't heard any major disadvantages about keeping it in c.h, but if
 you are still adamant about keeping it out of c.h, I'll will go along
 with that.

Including only what you need is a principle.  It keeps the namespace clean, it 
speads up compilation time, it makes the build system simpler and more 
efficient.  Otherwise we'd only need one header file for everything.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PATCHES] sinval.c / sinvaladt.c restructuring

2008-02-29 Thread Alvaro Herrera
I just modified the interactions in sinval.c and sinvaladt.c per
http://thread.gmane.org/gmane.comp.db.postgresql.devel.patches/18820/focus=18822

It looks a lot saner this way: the code that actually deals with the
queue, including locking etc, is all in sinvaladt.c.  This means that
the struct definition of the queue, and the queue pointer, are now
internal implementation details inside sinvaladt.c.

One side effect of this change is that the call to SendPostmasterSignal
now occurs after the lock has been released.  ISTM this is a good idea
on general principles (no syscalls in lwlocked code), but I'm wondering
if I created a thundering hoard problem that did not exist before.

All tests pass.

As a test of Review Board, I uploaded the patch to it:
http://reviewdemo.postgresql.org/r/19/

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/storage/ipc/sinval.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/ipc/sinval.c,v
retrieving revision 1.83
diff -c -p -r1.83 sinval.c
*** src/backend/storage/ipc/sinval.c	1 Jan 2008 19:45:51 -	1.83
--- src/backend/storage/ipc/sinval.c	29 Feb 2008 17:14:26 -
*** static void ProcessCatchupEvent(void);
*** 56,62 
  void
  CreateSharedInvalidationState(void)
  {
- 	/* SInvalLock must be initialized already, during LWLock init */
  	SIBufferInit();
  }
  
--- 56,61 
*** CreateSharedInvalidationState(void)
*** 67,83 
  void
  InitBackendSharedInvalidationState(void)
  {
! 	int			flag;
! 
! 	LWLockAcquire(SInvalLock, LW_EXCLUSIVE);
! 	flag = SIBackendInit(shmInvalBuffer);
! 	LWLockRelease(SInvalLock);
! 	if (flag  0)/* unexpected problem */
! 		elog(FATAL, shared cache invalidation initialization failed);
! 	if (flag == 0)/* expected problem: MaxBackends exceeded */
! 		ereport(FATAL,
! (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
!  errmsg(sorry, too many clients already)));
  }
  
  /*
--- 66,72 
  void
  InitBackendSharedInvalidationState(void)
  {
! 	SIBackendInit();
  }
  
  /*
*** SendSharedInvalidMessage(SharedInvalidat
*** 89,97 
  {
  	bool		insertOK;
  
! 	LWLockAcquire(SInvalLock, LW_EXCLUSIVE);
! 	insertOK = SIInsertDataEntry(shmInvalBuffer, msg);
! 	LWLockRelease(SInvalLock);
  	if (!insertOK)
  		elog(DEBUG4, SI buffer overflow);
  }
--- 78,84 
  {
  	bool		insertOK;
  
! 	insertOK = SIInsertDataEntry(msg);
  	if (!insertOK)
  		elog(DEBUG4, SI buffer overflow);
  }
*** ReceiveSharedInvalidMessages(
*** 123,141 
  		 */
  		catchupInterruptOccurred = 0;
  
! 		/*
! 		 * We can run SIGetDataEntry in parallel with other backends running
! 		 * SIGetDataEntry for themselves, since each instance will modify only
! 		 * fields of its own backend's ProcState, and no instance will look at
! 		 * fields of other backends' ProcStates.  We express this by grabbing
! 		 * SInvalLock in shared mode.  Note that this is not exactly the
! 		 * normal (read-only) interpretation of a shared lock! Look closely at
! 		 * the interactions before allowing SInvalLock to be grabbed in shared
! 		 * mode for any other reason!
! 		 */
! 		LWLockAcquire(SInvalLock, LW_SHARED);
! 		getResult = SIGetDataEntry(shmInvalBuffer, MyBackendId, data);
! 		LWLockRelease(SInvalLock);
  
  		if (getResult == 0)
  			break;/* nothing more to do */
--- 110,116 
  		 */
  		catchupInterruptOccurred = 0;
  
! 		getResult = SIGetDataEntry(MyBackendId, data);
  
  		if (getResult == 0)
  			break;/* nothing more to do */
*** ReceiveSharedInvalidMessages(
*** 155,165 
  
  	/* If we got any messages, try to release dead messages */
  	if (gotMessage)
! 	{
! 		LWLockAcquire(SInvalLock, LW_EXCLUSIVE);
! 		SIDelExpiredDataEntries(shmInvalBuffer);
! 		LWLockRelease(SInvalLock);
! 	}
  }
  
  
--- 130,136 
  
  	/* If we got any messages, try to release dead messages */
  	if (gotMessage)
! 		SIDelExpiredDataEntries(false);
  }
  
  
Index: src/backend/storage/ipc/sinvaladt.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/storage/ipc/sinvaladt.c,v
retrieving revision 1.66
diff -c -p -r1.66 sinvaladt.c
*** src/backend/storage/ipc/sinvaladt.c	1 Jan 2008 19:45:51 -	1.66
--- src/backend/storage/ipc/sinvaladt.c	29 Feb 2008 17:11:20 -
***
*** 24,30 
  #include storage/sinvaladt.h
  
  
! SISeg	   *shmInvalBuffer;
  
  static LocalTransactionId nextLocalTransactionId;
  
--- 24,105 
  #include storage/sinvaladt.h
  
  
! /*
!  * Conceptually, the shared cache invalidation messages are stored in an
!  * infinite array, where maxMsgNum is the next array subscript to store a
!  * submitted message in, minMsgNum is the smallest array subscript containing a
!  * message not yet read by all backends, and we always have maxMsgNum 

Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Peter Eisentraut wrote:
I had already solved this in my intermediate patch I sent you by symlinking 
probes_null.h to probes.h.


  


Now I see why you created the symlink. But I thinkt the suggestion by 
Tom/Avaro to include probes.h and the content of probes_null.h in a 
separate header (pg_trace.h) is a good one.


Regards,
-Robert

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Tom Lane
Robert Lor [EMAIL PROTECTED] writes:
 And don't think adding a simple comment before the macro call is 
 sufficient? This can be documented so everyone knows the convention.

It's stupid.  The need for a comment is proof that the macro is badly
named.  I don't intend to hold still for letting poorly-designed tools
dictate our coding conventions.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PATCHES] Remove FATAL from pg_lzdecompress

2008-02-29 Thread Zdenek Kotala


I attach patch which adds boundaries check and memory overwriting 
protection when compressed data are corrupted.


Current behavior let code overwrite a memory and after that check if 
unpacked size is same as expected value. In this case elog execution 
fails (at least on Solaris - malloc has corrupted structures) and no 
message appears in a log file.


I did not add any extra information into the message. Reasonable 
solution seems to be use errcontext how was recommended by Alvaro. But I 
'm not sure if printtup is good place for it, because pg_detoast is 
called from many places. However, is can be solved in separate patch.


I'm also think that this modification should be backported to other 
version too.


Thanks Zdenek
Index: src/backend/utils/adt/pg_lzcompress.c
===
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/utils/adt/pg_lzcompress.c,v
retrieving revision 1.29
diff -c -r1.29 pg_lzcompress.c
*** src/backend/utils/adt/pg_lzcompress.c	1 Jan 2008 19:45:52 -	1.29
--- src/backend/utils/adt/pg_lzcompress.c	29 Feb 2008 19:07:50 -
***
*** 633,638 
--- 633,639 
  {
  	const unsigned char *dp;
  	const unsigned char *dend;
+ 	const unsigned char *destend;
  	unsigned char *bp;
  	unsigned char ctrl;
  	int32		ctrlc;
***
*** 643,656 
  	dp = ((const unsigned char *) source) + sizeof(PGLZ_Header);
  	dend = ((const unsigned char *) source) + VARSIZE(source);
  	bp = (unsigned char *) dest;
  
! 	while (dp  dend)
  	{
  		/*
  		 * Read one control byte and process the next 8 items.
  		 */
  		ctrl = *dp++;
! 		for (ctrlc = 0; ctrlc  8  dp  dend; ctrlc++)
  		{
  			if (ctrl  1)
  			{
--- 644,658 
  	dp = ((const unsigned char *) source) + sizeof(PGLZ_Header);
  	dend = ((const unsigned char *) source) + VARSIZE(source);
  	bp = (unsigned char *) dest;
+ 	destend = ((const unsigned char *)dest) + source-rawsize;
  
! 	while (dp  dend  bp  destend)
  	{
  		/*
  		 * Read one control byte and process the next 8 items.
  		 */
  		ctrl = *dp++;
! 		for (ctrlc = 0; ctrlc  8  dp  dend  bp  destend; ctrlc++)
  		{
  			if (ctrl  1)
  			{
***
*** 673,678 
--- 675,682 
   * memcpy() here, because the copied areas could overlap
   * extremely!
   */
+ if( bp+len  destend) /* data are corrupted, do not continue */
+ 	break;
  while (len--)
  {
  	*bp = bp[-off];
***
*** 696,708 
  	}
  
  	/*
! 	 * Check we decompressed the right amount, else die.  This is a FATAL
! 	 * condition if we tromped on more memory than expected (we assume we have
! 	 * not tromped on shared memory, though, so need not PANIC).
  	 */
! 	destsize = (char *) bp - dest;
! 	if (destsize != source-rawsize)
! 		elog(destsize  source-rawsize ? FATAL : ERROR,
  			 compressed data is corrupt);
  
  	/*
--- 700,709 
  	}
  
  	/*
! 	 * Check we decompressed the right amount.
  	 */
! 	if (bp != destend || dp != dend)
! 		elog(ERROR,
  			 compressed data is corrupt);
  
  	/*

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Tom Lane wrote:

We still have what I consider a big problem with the names of the
macros.  Perhaps that could be fixed by passing the auto-generated
file through a sed script to put a prefix on the macro names before
we start to use it?
  


Post processing the auto generated header may work, but I think it could 
be unnecessarily complicated and error proned. First, the formats of the 
header between Mac OS X and Solaris are different, and I'm pretty sure 
it'll be different for FreeBSD when it comes out with Dtrace support. 
Second, if there are changes in the  content of the header in the 
future, the sed script may break. I can investigate this approach 
further, but I personally prefer a solution this is less error proned.


And don't think adding a simple comment before the macro call is 
sufficient? This can be documented so everyone knows the convention.


Regards,
-Robert

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Alvaro Herrera
Robert Lor wrote:

 Post processing the auto generated header may work, but I think it could  
 be unnecessarily complicated and error proned.

Would it work to name the traces trace_transaction__start etc instead?
AFAICS that would cause the macros to be named

POSTGRESQL_TRACE_TRANSACTION_START()

which is not ideal but at least it's a bit more obvious what it's all
about.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Peter Eisentraut wrote:

I think each component would have its own probes definition file.
  


A while back when we met in Toronto, the consensus was to only have one 
provider called postgresql and all probes whether they be from the 
backend or frontend will be grouped together in this one provider.


Well, you are going to have to do a lot more work on the makefiles if you want 
to do it that way.  Make works by defining dependencies between files, not by 
hoping that people will execute the commands in the order you write them.  If 
you want every single file in the tree to depend on a rule, you will have to 
do something different.


  
Actually, I was able to get it to work without doing much based on your 
patch.  Please comment on this updated patch I submitted yesterday

http://archives.postgresql.org/pgsql-patches/2008-02/msg00173.php

Including only what you need is a principle.  It keeps the namespace clean, it 
speads up compilation time, it makes the build system simpler and more 
efficient.  Otherwise we'd only need one header file for everything.
  


Okay, will move the header into individual .c files.

Regards,
-Robert


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] remove TCL_ARRAYS

2008-02-29 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  This patch removes the TCL_ARRAY symbol.
 
 If you're going to remove it you should actually remove it
 (eg from pg_config_manual.h).

Oops.  Thanks, removed from there too.


-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] CopyReadAttributesCSV optimization

2008-02-29 Thread Heikki Linnakangas
Here's a patch to speed up CopyReadAttributesCSV. On the test case I've 
been playing with, loading the TPC-H partsupp table, about 20% 
CopyReadAttributesCSV (inlined into DoCopy, DoCopy itself is insignificant):


samples  %image name   symbol name
8136 25.8360  postgres CopyReadLine
6350 20.1645  postgres DoCopy
2181  6.9258  postgres pg_verify_mbstr_len
2157  6.8496  reiserfs (no symbols)
1668  5.2968  libc-2.7.so  memcpy
1142  3.6264  libc-2.7.so  strtod_l_internal
951   3.0199  postgres heap_formtuple
904   2.8707  libc-2.7.so  strtol_l_internal
619   1.9656  libc-2.7.so  memset
442   1.4036  libc-2.7.so  strlen
341   1.0828  postgres hash_any
329   1.0447  postgres pg_atoi
300   0.9527  postgres AllocSetAlloc

With this patch, the usage of that function goes down to ~13%

samples  %image name   symbol name
7191 28.7778  postgres CopyReadLine
3257 13.0343  postgres DoCopy
2127  8.5121  reiserfs (no symbols)
1914  7.6597  postgres pg_verify_mbstr_len
1413  5.6547  libc-2.7.so  memcpy
920   3.6818  libc-2.7.so  strtod_l_internal
784   3.1375  libc-2.7.so  strtol_l_internal
745   2.9814  postgres heap_formtuple
508   2.0330  libc-2.7.so  memset
398   1.5928  libc-2.7.so  strlen
315   1.2606  postgres hash_any
255   1.0205  postgres AllocSetAlloc

The trick is to split the loop in CopyReadAttributesCSV into two parts, 
inside quotes, and outside quotes, saving some instructions in both 
parts.


Your mileage may vary, but I'm quite happy with this. I haven't tested 
it much yet, but I wouldn't expect it to be a loss in any interesting 
scenario. The code also doesn't look much worse after the patch, perhaps 
even better.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/copy.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.295
diff -c -r1.295 copy.c
*** src/backend/commands/copy.c	1 Jan 2008 19:45:48 -	1.295
--- src/backend/commands/copy.c	29 Feb 2008 20:57:09 -
***
*** 2913,2919 
  	for (;;)
  	{
  		bool		found_delim = false;
- 		bool		in_quote = false;
  		bool		saw_quote = false;
  		char	   *start_ptr;
  		char	   *end_ptr;
--- 3146,3151 
***
*** 2934,3000 
  		{
  			char		c;
  
! 			end_ptr = cur_ptr;
! 			if (cur_ptr = line_end_ptr)
! break;
! 			c = *cur_ptr++;
! 			/* unquoted field delimiter */
! 			if (c == delimc  !in_quote)
  			{
! found_delim = true;
! break;
! 			}
! 			/* start of quoted field (or part of field) */
! 			if (c == quotec  !in_quote)
! 			{
! saw_quote = true;
! in_quote = true;
! continue;
  			}
! 			/* escape within a quoted field */
! 			if (c == escapec  in_quote)
  			{
! /*
!  * peek at the next char if available, and escape it if it is
!  * an escape char or a quote char
!  */
! if (cur_ptr  line_end_ptr)
! {
! 	char		nextc = *cur_ptr;
  
! 	if (nextc == escapec || nextc == quotec)
  	{
! 		*output_ptr++ = nextc;
! 		cur_ptr++;
! 		continue;
  	}
  }
! 			}
  
! 			/*
! 			 * end of quoted field. Must do this test after testing for escape
! 			 * in case quote char and escape char are the same (which is the
! 			 * common case).
! 			 */
! 			if (c == quotec  in_quote)
! 			{
! in_quote = false;
! continue;
  			}
- 
- 			/* Add c to output string */
- 			*output_ptr++ = c;
  		}
  
  		/* Terminate attribute value in output area */
  		*output_ptr++ = '\0';
  
- 		/* Shouldn't still be in quote mode */
- 		if (in_quote)
- 			ereport(ERROR,
- 	(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
- 	 errmsg(unterminated CSV quoted field)));
- 
  		/* Check whether raw input matched null marker */
  		input_len = end_ptr - start_ptr;
  		if (!saw_quote  input_len == cstate-null_print_len 
--- 3166,3241 
  		{
  			char		c;
  
! 			/* Not in quote */
! 			for (;;)
  			{
! end_ptr = cur_ptr;
! if (cur_ptr = line_end_ptr)
! 	goto endfield;
! c = *cur_ptr++;
! /* unquoted field delimiter */
! if (c == delimc)
! {
! 	found_delim = true;
! 	goto endfield;
! }
! /* start of quoted field (or part of field) */
! if (c == quotec)
! {
! 	saw_quote = true;
! 	break;
! }
! /* Add c to output string */
! *output_ptr++ = c;
  			}
! 
! 			/* In quote */
! 			for (;;)
  			{
! end_ptr = cur_ptr;
! if 

Re: [PATCHES] DTrace probe patch for OS X Leopard

2008-02-29 Thread Robert Lor

Alvaro Herrera wrote:

Would it work to name the traces trace_transaction__start etc instead?
AFAICS that would cause the macros to be named

POSTGRESQL_TRACE_TRANSACTION_START()
  
Correct, and that would work. With this approach, all the probe names 
will start with trace-, and this particular one will be called 
trace-transaction-start and can be used this way.


postgresql*:::trace-transaction-start
{
...
}

Actually, it reads better to me than having trace in front. If the above 
macro name is acceptable, I'll take this route.



Regards,
-Robert

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[PATCHES] Proposing correction to posix_fadvise() usage in xlog.c

2008-02-29 Thread Mark Wong
I believe I have a correction to the usage of posix_fadvise() in
xlog.c.  Basically posix_fadvise() is being called right before the
WAL segment file is closed, which effectively doesn't do anything as
opposed to when the file is opened.  This proposed correction calls
posix_fadvise() in three locations in order to make sure
POSIX_FADV_DONTNEED is set correctly since there are three cases for
opening a WAL segment file for writing.

I'm hesitant to post any data I have because I only have a little pc
with a SATA drive in it.  My hardware knowledge on SATA controllers
and drives is a little weak, but my testing with dbt-2 is showing the
performance dropping.  I am guessing that SATA drives have write cache
enabled by default so it seems to make sense that using
POSIX_FADV_DONTNEED will cause writes to be slower by writing through
the disk cache.  Again, assuming that is possible with SATA hardware.

If memory serves, one of the wins here is suppose to be that in a
scenario where we are not expecting to re-read writes to the WAL we
also do not want the writes to disk to flush out other data from the
operating system disk cache.  But I'm not sure how best to test
correctness.

Anyway, I hope I'm not way off but I'm sure someone will correct me. :)

Regards,
Mark


pgsql-log-fadvise.patch
Description: Binary data

---(end of broadcast)---
TIP 6: explain analyze is your friend