Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-25 Thread Stefan Kaltenbrunner
Marshall, Steve wrote:
 I'm glad to see the patch making its way through the process.  I'm also
 glad you guys do comprehensive testing before accepting it, since we are
 only able to test in a more limited range of environments.
 
 We have applied the patch to our 8.2.4 installations and are running it
 in a high transaction rate system (processing lots and lots of
 continually changing weather data).  Let me know if there is any
 information we could provide that would be of help in making the
 back-patching decision.

I have re-enabled tcl builds(for -HEAD) on lionfish (mipsel) and quagga
(arm) a few days ago so we should get a bit of additional coverage from
boxes that definitly had problems with the tcl-threading behaviour.


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-21 Thread Marshall, Steve
I'm glad to see the patch making its way through the process.  I'm also
glad you guys do comprehensive testing before accepting it, since we are
only able to test in a more limited range of environments.

We have applied the patch to our 8.2.4 installations and are running it
in a high transaction rate system (processing lots and lots of
continually changing weather data).  Let me know if there is any
information we could provide that would be of help in making the
back-patching decision.

Yours,
Steve Marshall

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 20, 2007 8:33 PM
To: Marshall, Steve
Cc: pgsql-patches@postgresql.org
Subject: Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming
multithreaded 

Marshall, Steve [EMAIL PROTECTED] writes:
 There is a problem in PL/TCL that can cause the postgres backend to 
 become multithreaded.   Postgres is not designed to be multithreaded,
so 
 this causes downstream errors in signal handling.  We have seen this 
 cause a number of unexpected state errors associated with 
 notification handling; however, unpredictable signal handling would be

 likely to cause other errors as well.

I've applied this patch to CVS HEAD (8.3-to-be).  I'm a bit hesitant to
back-patch it however, at least not till it gets through some beta
testing.

Thanks for the detailed explanation, test case, and patch!

regards, tom lane


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-20 Thread Tom Lane
Marshall, Steve [EMAIL PROTECTED] writes:
 There is a problem in PL/TCL that can cause the postgres backend to 
 become multithreaded.   Postgres is not designed to be multithreaded, so 
 this causes downstream errors in signal handling.  We have seen this 
 cause a number of unexpected state errors associated with notification 
 handling; however, unpredictable signal handling would be likely to 
 cause other errors as well. 

I've applied this patch to CVS HEAD (8.3-to-be).  I'm a bit hesitant
to back-patch it however, at least not till it gets through some beta
testing.

Thanks for the detailed explanation, test case, and patch!

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-16 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 yeah testing that patch now (seems to apply just fine on -HEAD) but it
 seems that there is something strange going on because I just got:
 
 ! ERROR:  could not read block 2 of relation 1663/16384/2606: read only 0 of 
 8192 bytes
 
 Is that repeatable?  What sort of filesystem are you testing on?
 (soft-mounted NFS by any chance?)

doesn't seem to be repeatable :-(

on the postitive side - the pltcl patch does seem to fix the mentioned
regression failures quagga triggered earlier this year. I did about a
dozends of full buildfarm runs with that patch and about ten manual
executions of the pltcl regression tests without any sign of
misbehaviour so this seems like a clear candidate for at least -HEAD.


Stefan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-15 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 hmm i wonder if that could be related to:
 http://archives.postgresql.org/pgsql-hackers/2007-01/msg00377.php
 
 I had forgotten that thread, but it sure does look related doesn't it?
 Do you want to try Steve's proposed patch and see if it fixes it?

yeah testing that patch now (seems to apply just fine on -HEAD) but it
seems that there is something strange going on because I just got:

http://www.kaltenbrunner.cc/files/regression.diffs

on the first manual buildfarm run on quagga...


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-15 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 yeah testing that patch now (seems to apply just fine on -HEAD) but it
 seems that there is something strange going on because I just got:

! ERROR:  could not read block 2 of relation 1663/16384/2606: read only 0 of 
8192 bytes

Is that repeatable?  What sort of filesystem are you testing on?
(soft-mounted NFS by any chance?)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-15 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 yeah testing that patch now (seems to apply just fine on -HEAD) but it
 seems that there is something strange going on because I just got:
 
 ! ERROR:  could not read block 2 of relation 1663/16384/2606: read only 0 of 
 8192 bytes
 
 Is that repeatable?  What sort of filesystem are you testing on?

that box is slow - still testing ...

 (soft-mounted NFS by any chance?)

no - local SATA disk with ext3 (no OS related errors and SMART is clean too)



Stefan


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-14 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Marshall, Steve wrote:
 There is a problem in PL/TCL that can cause the postgres backend to 
 become multithreaded.   Postgres is not designed to be multithreaded, so 
 this causes downstream errors in signal handling.  We have seen this 
 cause a number of unexpected state errors associated with notification 
 handling; however, unpredictable signal handling would be likely to 
 cause other errors as well. 
 
 Some sample scripts are attached which will reproduce this problem when 
 running against a multithreaded version of TCL, but will work without 
 error with single-threaded TCL library.  The scripts are a combination 
 of Unix shell, perl DBI, and SQL commands.  The postgres process can be 
 seen to have multiple threads using the Linux command ps -Lwfu 
 postgres.  In this command the NLWP columns will be 2 for multithreaded 
 backend processes.  The threaded/non-threaded state of the TCL library 
 can be ascertained on Linux using ldd to determine if libpthread.so is 
 linked to the TCL library (e.g. ldd /usr/lib/libtcl8.4.so).
 
 The multithreaded behavior occurs the first time PL/TCL is used in a 
 postgres backend, but only when postgres is linked against a 
 multithread-enabled version of libtcl.  Thus, this problem can be 
 side-stepped by linking against the proper TCL library.  However 
 multithreaded TCL libraries are becoming the norm in Linux distributions 
 and seems ubiquitous in the Windows world.  Therefore a fix to the 
 PL/TCL code is warrented.
 
 We determined that postgres became multithreaded during the creation of 
 the TCL interpreter in a function called tcl_InitNotifier.  This 
 function is part of TCL's Notifier subsystem, which is used to monitor 
 for events asynchronously from the TCL event loop.  Although initialized 
 when an interpreter is created, the Notifier subsystem is not used until 
 a process enters the TCL event loop.  This never happens within a 
 postgres process, because postgres implements its own event loop.  
 Therefore the initialization of the Notifier subsystem is not necessary 
 within the context of PL/TCL.
 
 Our solution was to disable the Notifier subsystem by overriding the 
 functions associated with it using the Tcl_SetNotifier function.  This 
 allows 8 functions related to the Notifier to overriden.  Even though we 
 found only two of the functions were ever called within postgres, we 
 overrode 8 functions with no-op versions, just for completeness.  A 
 patch file containing the changes to pltcl.c from its 8.2.4 version is 
 also attached.
 
 We tested this patch with PostgreSQL 8.2.4 on both RedHat Enterprise 4.0 
 usingTCL 8.4 (single threaded) and RHE 5.0 using TCL 8.4.13 
 (multithreaded).  We expect this solution to work with Windows as well, 
 although we have not tested it.  There may be some problems using this 
 solution with old versions of TCL that pre-date the Tcl_SetNotifier 
 function.  However this function has been around for quite a while; it 
 was added in in the TCL 8.2 release, circa 2000.
 
 We hope this patch will be considered for a future PostgreSQL release.
 
 Steve Marshall
 Paul Bayer
 Doug Knight
 WSI Corporation
 
 

[ application/x-gzip is not supported, skipping... ]

 *** pltcl.c.orig  2007-09-10 12:58:34.0 -0400
 --- pltcl.c   2007-09-11 11:37:33.363222114 -0400
 ***
 *** 163,168 
 --- 163,258 
   static void pltcl_build_tuple_argument(HeapTuple tuple, TupleDesc tupdesc,
  Tcl_DString *retval);
   
 + /**
 +  *  Declarations for functions overriden using Tcl_SetNotifier. 
 +  **/
 + static int fakeThreadKey;   /* To give valid address for ClientData */
 + 
 + static ClientData
 + pltcl_InitNotifier _ANSI_ARGS_((void));
 + 
 + static void
 + pltcl_FinalizeNotifier _ANSI_ARGS_((ClientData clientData));
 + 
 + static void
 + pltcl_SetTimer _ANSI_ARGS_((Tcl_Time *timePtr));
 + 
 + static void
 + pltcl_AlertNotifier _ANSI_ARGS_((ClientData clientData));
 + 
 + static void
 + pltcl_CreateFileHandler _ANSI_ARGS_((int fd, int mask, Tcl_FileProc *proc, 
 ClientData clientData));
 + 
 + static void
 + pltcl_DeleteFileHandler _ANSI_ARGS_((int fd));
 + 
 + static void
 + pltcl_ServiceModeHook _ANSI_ARGS_((int mode));
 + 
 + static int
 + pltcl_WaitForEvent _ANSI_ARGS_((Tcl_Time *timePtr));
 + 
 + /**
 +  *  Definitions for functions overriden using Tcl_SetNotifier. 
 +  *  These implementations effectively disable the TCL Notifier subsystem.
 +  *  This is okay because we never enter the TCL event loop from postgres,
 +  *  so the notifier capabilities are 

Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-14 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 This has been saved for the 8.4 release:

   http://momjian.postgresql.org/cgi-bin/pgpatches_hold

 ---

 Marshall, Steve wrote:
 There is a problem in PL/TCL that can cause the postgres backend to 
 become multithreaded.   Postgres is not designed to be multithreaded, so 
 this causes downstream errors in signal handling.  

Um, this is a bug fix. Unless you had some problem with it?

Do we have anyone actively maintaining pltcl these days? I'm intentionally
quite unfamiliar with Tcl or I would be happy to verify it's reasonable. But
the explanation seems pretty convincing. If we don't have anyone maintaining
it then we're pretty much at the mercy of applying whatever patches come in
from people who are more familiar with it than us. In my experience that's how
new maintainers for modules of free software are often found anyways.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-14 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Bruce Momjian [EMAIL PROTECTED] writes:
 There is a problem in PL/TCL that can cause the postgres backend to 
 become multithreaded.   Postgres is not designed to be multithreaded, so 
 this causes downstream errors in signal handling.  

 Um, this is a bug fix. Unless you had some problem with it?

I haven't reviewed that patch yet, but I concur we should consider it
for 8.3.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-14 Thread Andrew Dunstan



Gregory Stark wrote:

Do we have anyone actively maintaining pltcl these days? I'm intentionally
quite unfamiliar with Tcl or I would be happy to verify it's reasonable. But
the explanation seems pretty convincing. If we don't have anyone maintaining
it then we're pretty much at the mercy of applying whatever patches come in
from people who are more familiar with it than us. In my experience that's how
new maintainers for modules of free software are often found anyways.

  


I was hoping that Jan or somebody else with some Tcl-fu would comment. I 
agree it probably shouldn't be deferred.


It does bother me a bit that the Tcl engine startup just starts firing 
up threads without advertisement - I wondered if we shouldn't kick back 
and say it's their problem to fix.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-14 Thread Bruce Momjian

OK, moved to patches list.

---

Andrew Dunstan wrote:
 
 
 Gregory Stark wrote:
  Do we have anyone actively maintaining pltcl these days? I'm intentionally
  quite unfamiliar with Tcl or I would be happy to verify it's reasonable. But
  the explanation seems pretty convincing. If we don't have anyone maintaining
  it then we're pretty much at the mercy of applying whatever patches come in
  from people who are more familiar with it than us. In my experience that's 
  how
  new maintainers for modules of free software are often found anyways.
 

 
 I was hoping that Jan or somebody else with some Tcl-fu would comment. I 
 agree it probably shouldn't be deferred.
 
 It does bother me a bit that the Tcl engine startup just starts firing 
 up threads without advertisement - I wondered if we shouldn't kick back 
 and say it's their problem to fix.
 
 cheers
 
 andrew

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded

2007-09-11 Thread Marshall, Steve
There is a problem in PL/TCL that can cause the postgres backend to 
become multithreaded.   Postgres is not designed to be multithreaded, so 
this causes downstream errors in signal handling.  We have seen this 
cause a number of unexpected state errors associated with notification 
handling; however, unpredictable signal handling would be likely to 
cause other errors as well. 

Some sample scripts are attached which will reproduce this problem when 
running against a multithreaded version of TCL, but will work without 
error with single-threaded TCL library.  The scripts are a combination 
of Unix shell, perl DBI, and SQL commands.  The postgres process can be 
seen to have multiple threads using the Linux command ps -Lwfu 
postgres.  In this command the NLWP columns will be 2 for multithreaded 
backend processes.  The threaded/non-threaded state of the TCL library 
can be ascertained on Linux using ldd to determine if libpthread.so is 
linked to the TCL library (e.g. ldd /usr/lib/libtcl8.4.so).


The multithreaded behavior occurs the first time PL/TCL is used in a 
postgres backend, but only when postgres is linked against a 
multithread-enabled version of libtcl.  Thus, this problem can be 
side-stepped by linking against the proper TCL library.  However 
multithreaded TCL libraries are becoming the norm in Linux distributions 
and seems ubiquitous in the Windows world.  Therefore a fix to the 
PL/TCL code is warrented.


We determined that postgres became multithreaded during the creation of 
the TCL interpreter in a function called tcl_InitNotifier.  This 
function is part of TCL's Notifier subsystem, which is used to monitor 
for events asynchronously from the TCL event loop.  Although initialized 
when an interpreter is created, the Notifier subsystem is not used until 
a process enters the TCL event loop.  This never happens within a 
postgres process, because postgres implements its own event loop.  
Therefore the initialization of the Notifier subsystem is not necessary 
within the context of PL/TCL.


Our solution was to disable the Notifier subsystem by overriding the 
functions associated with it using the Tcl_SetNotifier function.  This 
allows 8 functions related to the Notifier to overriden.  Even though we 
found only two of the functions were ever called within postgres, we 
overrode 8 functions with no-op versions, just for completeness.  A 
patch file containing the changes to pltcl.c from its 8.2.4 version is 
also attached.


We tested this patch with PostgreSQL 8.2.4 on both RedHat Enterprise 4.0 
usingTCL 8.4 (single threaded) and RHE 5.0 using TCL 8.4.13 
(multithreaded).  We expect this solution to work with Windows as well, 
although we have not tested it.  There may be some problems using this 
solution with old versions of TCL that pre-date the Tcl_SetNotifier 
function.  However this function has been around for quite a while; it 
was added in in the TCL 8.2 release, circa 2000.


We hope this patch will be considered for a future PostgreSQL release.

Steve Marshall
Paul Bayer
Doug Knight
WSI Corporation




pltcl_multithread_bug_test.tar.gz
Description: GNU Zip compressed data
*** pltcl.c.orig2007-09-10 12:58:34.0 -0400
--- pltcl.c 2007-09-11 11:37:33.363222114 -0400
***
*** 163,168 
--- 163,258 
  static void pltcl_build_tuple_argument(HeapTuple tuple, TupleDesc tupdesc,
   Tcl_DString *retval);
  
+ /**
+  *  Declarations for functions overriden using Tcl_SetNotifier. 
+  **/
+ static int fakeThreadKey;   /* To give valid address for ClientData */
+ 
+ static ClientData
+ pltcl_InitNotifier _ANSI_ARGS_((void));
+ 
+ static void
+ pltcl_FinalizeNotifier _ANSI_ARGS_((ClientData clientData));
+ 
+ static void
+ pltcl_SetTimer _ANSI_ARGS_((Tcl_Time *timePtr));
+ 
+ static void
+ pltcl_AlertNotifier _ANSI_ARGS_((ClientData clientData));
+ 
+ static void
+ pltcl_CreateFileHandler _ANSI_ARGS_((int fd, int mask, Tcl_FileProc *proc, 
ClientData clientData));
+ 
+ static void
+ pltcl_DeleteFileHandler _ANSI_ARGS_((int fd));
+ 
+ static void
+ pltcl_ServiceModeHook _ANSI_ARGS_((int mode));
+ 
+ static int
+ pltcl_WaitForEvent _ANSI_ARGS_((Tcl_Time *timePtr));
+ 
+ /**
+  *  Definitions for functions overriden using Tcl_SetNotifier. 
+  *  These implementations effectively disable the TCL Notifier subsystem.
+  *  This is okay because we never enter the TCL event loop from postgres,
+  *  so the notifier capabilities are initialized, but never used.
+  *  
+  *  NOTE: Only InitNotifier and DeleteFileHandler ever seem to get called
+  *by postgres, but we implement all the functions for completeness.
+  **/
+ 
+ ClientData