Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-24 Thread Heikki Linnakangas

Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjian [EMAIL PROTECTED] writes:

I have been thinking we could improve how quickly VACUUM can expire rows
if we update procArray.xmin more frequently for non-SERIALIZABLE
transactions.
The attached patch updates procArray.xmin in this manner.

This patch is incredibly broken.  Didn't you understand what I said
about how we don't track which snapshots are still alive?  You can't
advance procArray.xmin past the xmin of the oldest live snapshot in the
backend, and you can't assume that there are no live snapshots at the
places where this patch assumes that.  (Open cursors are one obvious
counterexample, but I think there are more.)

To make intra-transaction advancing of xmin possible, we'd need to
explicitly track all of the backend's live snapshots, probably by
introducing a snapshot cache manager that gives out tracked refcounts
as we do for some other structures like catcache entries.  This might
have some other advantages (I think most of the current CopySnapshot
operations could be replaced by refcount increments) but it's a whole
lot more complicated and invasive than what you've got here.


I updated the patch to save the MyProc->xid at the time the first cursor
is created, and not allow the MyProc->xid to be set lower than that
saved value in the current transaction.  It added only a few more lines
to the patch.


It seems to me a lot cleaner to do the reference counting like Tom 
suggested. Increase the refcount on CopySnapshot, and decrease it on 
FreeSnapshot. Assuming that all callers of CopySnapshot free the 
snapshot with FreeSnapshot when they're done with it.
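
A minimal sketch of the bookkeeping such refcounting implies (toy types
and names, not the real CopySnapshot/FreeSnapshot code) might look like:

    /*
     * Toy model of the refcounting idea -- simplified types and names,
     * not the actual PostgreSQL snapshot code.
     */
    #include <stdlib.h>

    typedef struct ToySnapshot
    {
        int         refcount;   /* holders still referencing this snapshot */
        /* xmin, xmax, xip[] would live here in the real structure */
    } ToySnapshot;

    static int  LiveSnapshotCount = 0;  /* live snapshots in this backend */

    /* "CopySnapshot" becomes: hand out another reference */
    static ToySnapshot *
    snapshot_ref(ToySnapshot *snap)
    {
        if (snap == NULL)
        {
            snap = calloc(1, sizeof(ToySnapshot));
            LiveSnapshotCount++;
        }
        snap->refcount++;
        return snap;
    }

    /* "FreeSnapshot" becomes: drop a reference, free on the last one */
    static void
    snapshot_unref(ToySnapshot *snap)
    {
        if (snap != NULL && --snap->refcount == 0)
        {
            free(snap);
            LiveSnapshotCount--;    /* at zero, xmin could be advanced */
        }
    }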


BTW: I really like the idea of doing this. It's a relatively simple
thing to do and gives some immediate benefit. And it opens the door for
more tricks to vacuum more aggressively in the presence of long-running
transactions. It would also allow us to vacuum tuples that were inserted
and deleted in the same transaction even while that transaction is still
running, which helps some pathological cases where a transaction updates
a counter column many times. We could also use it to free entries in the
new combocids hash table earlier (not that it's a problem as it is,
though).


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-24 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 It seems to me a lot cleaner to do the reference counting like Tom 
 suggested. Increase the refcount on CopySnapshot, and decrease it on 
 FreeSnapshot. Assuming that all callers of CopySnapshot free the 
 snapshot with FreeSnapshot when they're done with it.

I don't believe we bother at the moment; which is one of the reasons
it'd be a nontrivial patch.  I do think it might be worth doing though.
In the simple case where you're just issuing successive non-cursor
commands within a READ COMMITTED transaction, a refcounted
implementation would be able to recognize that there are *no* live
snapshots between commands and therefore reset MyProc->xmin to 0
whenever the backend is idle.
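
A hypothetical sketch of that idle-time reset (LiveSnapshotCount and
MyProcXmin are illustrative stand-ins, not real backend symbols):

    typedef unsigned int TransactionId;
    #define InvalidTransactionId ((TransactionId) 0)

    extern int           LiveSnapshotCount; /* maintained by snapshot refcounting */
    extern TransactionId MyProcXmin;        /* stands in for MyProc->xmin */

    /* Called between commands, once the backend goes idle. */
    static void
    MaybeResetXminAtIdle(void)
    {
        /*
         * With refcounting we know whether any snapshot is still live.  If
         * none is, nothing in this backend needs old tuple versions, so we
         * can stop advertising an xmin at all.
         */
        if (LiveSnapshotCount == 0)
            MyProcXmin = InvalidTransactionId;
    }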

OTOH, do we have any evidence that this is worth bothering with at all?
I fear that the cases of long-running transactions that are problems
in the real world wouldn't be helped much --- for instance, pg_dump
wouldn't change behavior because it uses a serializable transaction.
Also, at some point a long-running transaction becomes a bottleneck
simply because its XID is itself the oldest thing visible in the
ProcArray and is determining everyone's xmin.  How much daylight is
there really between "your xmin is old" and "your xid is old"?

regards, tom lane



Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-24 Thread Bruce Momjian
Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
  It seems to me a lot cleaner to do the reference counting like Tom 
  suggested. Increase the refcount on CopySnapshot, and decrease it on 
  FreeSnapshot. Assuming that all callers of CopySnapshot free the 
  snapshot with FreeSnapshot when they're done with it.
 
 I don't believe we bother at the moment; which is one of the reasons
 it'd be a nontrivial patch.  I do think it might be worth doing though.
 In the simple case where you're just issuing successive non-cursor
 commands within a READ COMMITTED transaction, a refcounted
 implementation would be able to recognize that there are *no* live
 snapshots between commands and therefore reset MyProc->xmin to 0
 whenever the backend is idle.

Attached is my current version of the patch.  It stopped working once I
tried to do reference counting for Snapshots, but I will stop here now that
Tom is considering redesigning the snapshot mechanism.

 OTOH, do we have any evidence that this is worth bothering with at all?
 I fear that the cases of long-running transactions that are problems
 in the real world wouldn't be helped much --- for instance, pg_dump
 wouldn't change behavior because it uses a serializable transaction.
 Also, at some point a long-running transaction becomes a bottleneck
 simply because its XID is itself the oldest thing visible in the
 ProcArray and is determining everyone's xmin.  How much daylight is
 there really between "your xmin is old" and "your xid is old"?

Well, interesting you mention that, because I have a second idea on how
to improve things.  We start with MyProc->xmin equal to our own xid, and
then look for earlier transactions.  It should be possible to skip
considering our own xid when computing MyProc->xmin.  This would obviously
help VACUUM during long-running transactions.  While our transaction is
running, our xid isn't committed, so VACUUM isn't going to touch any of
our rows, and if other transactions complete before a _statement_ in our
multi-statement transaction starts, we can't see the rows they deleted,
so why keep those deleted rows around?  Consider this case:

Session #:
1                           2                           3

BEGIN;
SELECT 1;
                            CREATE TABLE test(x int);
                            INSERT INTO test VALUES (1);
                            DELETE FROM test;
                                                        SELECT 1;
                            VACUUM VERBOSE test;
                            (row can be reused)
COMMIT;
                            VACUUM VERBOSE test;
                            (normal row reuse)

As I understand it, in READ COMMITTED mode we have to skip transactions
that are in progress when our _statement_ starts, but we see anything
committed before that, and we don't see the dead rows those committed
transactions created.
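
A rough sketch of the skip-our-own-xid idea, written against a simplified
GetSnapshotData-style scan (the types, names, and the non-wraparound-aware
comparisons are illustrative only, not the actual patch):

    typedef unsigned int TransactionId;
    #define InvalidTransactionId ((TransactionId) 0)

    typedef struct ProcEntry        /* stand-in for a ProcArray slot */
    {
        TransactionId xid;          /* backend's current xid, or invalid */
        TransactionId xmin;         /* oldest xid its snapshots might see */
    } ProcEntry;

    static TransactionId
    compute_my_xmin(const ProcEntry *procs, int nprocs,
                    TransactionId my_xid, TransactionId latest_completed)
    {
        /* start just past the newest completed xid, not at our own xid */
        TransactionId result = latest_completed + 1;
        int         i;

        for (i = 0; i < nprocs; i++)
        {
            TransactionId xid = procs[i].xid;

            /* skip empty slots and, per the proposal, our own transaction */
            if (xid == InvalidTransactionId || xid == my_xid)
                continue;
            if (xid < result)
                result = xid;
            if (procs[i].xmin != InvalidTransactionId &&
                procs[i].xmin < result)
                result = procs[i].xmin;
        }
        return result;              /* wraparound handling omitted */
    }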

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/catalog/index.c
===
RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.280
diff -c -c -r1.280 index.c
*** src/backend/catalog/index.c	3 Mar 2007 20:08:41 -	1.280
--- src/backend/catalog/index.c	24 Mar 2007 19:17:56 -
***
*** 1358,1364 
  	ExprContext *econtext;
  	Snapshot	snapshot;
  	TransactionId OldestXmin;
! 
  	/*
  	 * sanity checks
  	 */
--- 1358,1364 
  	ExprContext *econtext;
  	Snapshot	snapshot;
  	TransactionId OldestXmin;
! 	
  	/*
  	 * sanity checks
  	 */
***
*** 1555,1561 
  	ExecDropSingleTupleTableSlot(slot);
  
  	FreeExecutorState(estate);
! 
  	/* These may have been pointing to the now-gone estate */
  	indexInfo->ii_ExpressionsState = NIL;
  	indexInfo->ii_PredicateState = NIL;
--- 1555,1563 
  	ExecDropSingleTupleTableSlot(slot);
  
  	FreeExecutorState(estate);
! 	if (snapshot != SnapshotNow)
! 		FreeSnapshot(snapshot);
! 		
  	/* These may have been pointing to the now-gone estate */
  	indexInfo->ii_ExpressionsState = NIL;
  	indexInfo->ii_PredicateState = NIL;
Index: src/backend/commands/cluster.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.157
diff -c -c -r1.157 cluster.c
*** src/backend/commands/cluster.c	13 Mar 2007 00:33:39 -	1.157
--- src/backend/commands/cluster.c	24 Mar 2007 19:17:56 -
***
*** 204,209 
--- 204,210 
  			/* Start a new transaction for each relation. */
  			StartTransactionCommand();
  			/* functions in indexes may want a snapshot set */
+ 			FreeSnapshot(ActiveSnapshot);
  			ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
  	

Re: [PATCHES] [BUGS] BUG #3095: LDAP authentication parsing incorrectly

2007-03-24 Thread Bruce Momjian

I have researched this problem, and the incorrect behavior seems to be
totally caused by the fact that unquoted commas are treated as item
separators in pg_hba.conf.

I have updated the documentation in 8.2 and CVS HEAD to indicate that
the LDAP URL should be double-quoted, and double-quoted the example URL
for emphasis.
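
For example, the pg_hba.conf entry from the report below would then be
written with the whole URL double-quoted (illustrative only, reusing the
reporter's values):

    host all all 127.0.0.1/24 ldap "ldap://ActiveDirectory/dc=domain,dc=com;cn=;,cn=users"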

If double-quoting does not 100% fix your problem, please let us know. 
Thanks.

Documentation patch attached.

---

Joey Wang wrote:
 
 The following bug has been logged online:
 
 Bug reference:  3095
 Logged by:  Joey Wang
 Email address:  [EMAIL PROTECTED]
 PostgreSQL version: 8.2.3
 Operating system:   Linux
 Description:LDAP authentication parsing incorrectly
 Details: 
 
 LDAP authentication parsing has two bugs.
 
 When pg_hba.conf contains a line
 
 host all all 127.0.0.1/24 ldap
 ldap://ActiveDirectory/dc=domain,dc=com;cn=;,cn=users
 
 We expect the parsing will construct a user DN as
 
 cn=userid,cn=users,dc=domain,dc=com
 
 But
 
 (1) dc=domain,dc=com is ignored. This is the src code from auth.c:
 
 .
 
 /* ldap, no port number */
 r = sscanf(port->auth_arg,
            "ldap://%127[^/]/%127[^;];%127[^;];%127s;",
            server, basedn, prefix, suffix);
 
 .
 
 snprintf(fulluser, sizeof(fulluser), "%s%s%s",
          prefix, port->user_name, suffix);
 fulluser[sizeof(fulluser) - 1] = '\0';
 
 r = ldap_simple_bind_s(ldap, fulluser, passwd);
 
 We can see the code did not use basedn.
 
 (2) A suffix containing ',' is converted to another character. This bug is
 caused by the parsing algorithm treating the comma as a token separator.
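
For illustration, the DN construction the report expects, with the base DN
appended, would be something like the following sketch (illustrative values,
not the actual auth.c fix):

    #include <stdio.h>

    int
    main(void)
    {
        const char *prefix = "cn=";         /* from the LDAP URL */
        const char *suffix = ",cn=users";   /* from the LDAP URL */
        const char *basedn = "dc=domain,dc=com";
        const char *user = "userid";
        char        fulluser[512];

        /* prefix + user + suffix + "," + basedn */
        snprintf(fulluser, sizeof(fulluser), "%s%s%s,%s",
                 prefix, user, suffix, basedn);

        /* prints: cn=userid,cn=users,dc=domain,dc=com */
        puts(fulluser);
        return 0;
    }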
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/client-auth.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v
retrieving revision 1.97
diff -c -c -r1.97 client-auth.sgml
*** doc/src/sgml/client-auth.sgml	31 Jan 2007 20:56:16 -	1.97
--- doc/src/sgml/client-auth.sgml	24 Mar 2007 21:44:29 -
***
*** 929,937 
  <synopsis>
  ldap[<replaceable>s</>]://<replaceable>servername</>[:<replaceable>port</>]/<replaceable>base dn</replaceable>[;<replaceable>prefix</>[;<replaceable>suffix</>]]
  </synopsis>
! for example:
  <synopsis>
! ldap://ldap.example.net/dc=example,dc=net;EXAMPLE\
  </synopsis>
  
  </para>
--- 929,941 
  <synopsis>
  ldap[<replaceable>s</>]://<replaceable>servername</>[:<replaceable>port</>]/<replaceable>base dn</replaceable>[;<replaceable>prefix</>[;<replaceable>suffix</>]]
  </synopsis>
! Commas are used to specify multiple items in an <literal>ldap</>
! component.  However, because unquoted commas are treated as item
! separators in <filename>pg_hba.conf</filename>, it is wise to
! double-quote the <literal>ldap</> URL to preserve any commas present,
! e.g.:
  <synopsis>
! "ldap://ldap.example.net/dc=example,dc=net;EXAMPLE\"
  </synopsis>
  
  </para>



Re: [PATCHES] Improvement of procArray.xmin for VACUUM

2007-03-24 Thread Gregory Stark
Tom Lane wrote:
 OTOH, do we have any evidence that this is worth bothering with at all?
 I fear that the cases of long-running transactions that are problems
 in the real world wouldn't be helped much --- for instance, pg_dump
 wouldn't change behavior because it uses a serializable transaction.
Well I think this would be the same infrastructure we would need to do the
other discussed improvement to address pg_dump's impact. That would require us
to publish the youngest xmax of the live snapshots. Vacuum could deduce that
that xid cannot possibly see any transactions between the youngest extant xmax
and the oldest in-progress transaction.
-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] LIMIT/SORT optimization

2007-03-24 Thread Gregory Stark
 Did Greg push a version which didn't carry the WIP label to it?  As
 far as I remember he was still asking how to make the Sort and Limit
 nodes communicate.

 Good question.  I asked for a new version of this patch and the WIP was
 only in the email subject line. Greg, is this ready for review?
Well, my question was whether it was acceptable to do things this way or
whether there was a better way. If it's the right way then sure; if not then
no. I guess that's what review is for.
In short, I'm not sure what constitutes "ready for review". Are you asking if
it's ready to apply? I don't know, that's why we have reviews. Or are you
asking if it's ready for someone to look at? What's the point of posting WIP
patches if you don't want someone to look at them?
-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PATCHES] LIMIT/SORT optimization

2007-03-24 Thread Bruce Momjian
Gregory Stark wrote:
  Did Greg push a version which didn't carry the WIP label to it?  As
  far as I remember he was still asking how to make the Sort and Limit
  nodes communicate.
 
  Good question.  I asked for a new version of this patch and the WIP was
  only in the email subject line. Greg, is this ready for review?
 Well, my question was whether it was acceptable to do things this way or
 whether there was a better way. If it's the right way then sure; if not then
 no. I guess that's what review is for.
 In short, I'm not sure what constitutes "ready for review". Are you asking if
 it's ready to apply? I don't know, that's why we have reviews. Or are you
 asking if it's ready for someone to look at? What's the point of posting WIP
 patches if you don't want someone to look at them?

I am asking if _you_ are done working on it.  Seems you are, so I will
add it to the patch queue.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] Automatic adjustment of bgwriter_lru_maxpages

2007-03-24 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


ITAGAKI Takahiro wrote:
 Sorry, I had a mistake in the patch I sent.
 This is a fixed version.
 
 I wrote:
 
  I'm working on making the bgwriter write almost all of the dirty pages. This is
  the proposal for it, using automatic adjustment of bgwriter_lru_maxpages.
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center
 

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] Optimized pgbench for 8.3

2007-03-24 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


ITAGAKI Takahiro wrote:
 The attached is a patch to optimize contrib/pgbench using new 8.3 features.
 
 - Use DROP IF EXISTS to suppress errors during initial loading.
 - Use a combination of TRUNCATE and COPY to reduce WAL on creating
   the accounts table.
 
 Also, there are some cosmetic changes.
 
 - Change the output of the -v option from "starting full vacuum..."
   to "starting vacuum accounts..." to reflect what it actually does.
 - Consolidate duplicated error checks into executeStatement().
 
 
 There is a big performance win from the COPY-with-no-WAL feature.
 Thanks for the efforts!
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] POSIX Shared memory, revised, again

2007-03-24 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Chris Marcellino wrote:
 So I've finished reformulating this patch to use the device/inode  
 instead of the hash idea I had earlier.
 
 I have tested this patch (on Darwin): it permits the postmaster to
 recover after a crash or being killed, and it will not permit the
 postmaster to restart while another postmaster or backends are still
 in the database (regardless of the lockfile status). The ideas are
 the same as before, but are described via a flowchart in the attached
 PDF. The three template files opt those particular templates out of
 the POSIX shmem, but this could alternatively be done by opting in the
 desired ones. Re-running autoconf on the patched configure.in file is,
 of course, required.
 
 I appreciate the feedback and consideration.
 
 Thanks again,
 Chris Marcellino
 
 
 

[ Attachment, skipping... ]

[ Attachment, skipping... ]

[ Attachment, skipping... ]

[ Attachment, skipping... ]

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] POSIX shared memory support

2007-03-24 Thread Bruce Momjian

Newest version added:

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Chris Marcellino wrote:
 In case you haven't had enough, here is another version of the code  
 to make Postgres use POSIX shared memory. Along with the issues that  
 have already been addressed, this version ensures that orphaned  
 backends are not in the database when restarting Postgres by using a  
 single 1 byte SysV segment to see who is attached to the segment  
 using shmctl/IPC_STAT/nattach.
 
 This effectively frees Postgres from the SHMMAX and SHMALL limits.  
 Since this still takes one SysV segment, SHMMNI can still be reached  
 on most platforms if a ton of databases are opened simultaneously  
 (i.e. 32 on Mac OS X, 256 on Linux and Solaris).
 
 If you have the need to ship a product with Postgres embedded in it  
 and are unable to change kernel settings (like myself), this might be  
 of use to you. I have tested all of the failure situations I could  
 think of by various combinations of deleting lockfiles while in use,  
 changing the PID inside the lockfile and trying to restart and run  
 more than one postmaster simultaneously.
 
 Of course, since this requires both POSIX and SysV shared memory,
 this doesn't increase the portability of Postgres, which might
 make it less appropriate for mass distribution; I thought I would put
 it out there for any feedback either way.
 
 Thanks again,
 Chris Marcellino
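
The attachment check Chris describes, using shmctl/IPC_STAT and the
shm_nattch field, might look roughly like this sketch (illustrative only,
not his actual patch):

    #include <stdbool.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    /*
     * Returns true if any process besides the caller is attached to the
     * given SysV segment.  Assumes the caller is itself attached, so a
     * count greater than one means orphaned backends remain.
     */
    static bool
    segment_has_other_attachers(int shmid)
    {
        struct shmid_ds buf;

        if (shmctl(shmid, IPC_STAT, &buf) < 0)
            return false;       /* segment is gone; nothing attached */

        return buf.shm_nattch > 1;
    }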
 

[ Attachment, skipping... ]

 
 
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] Load distributed checkpoint V3

2007-03-24 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


ITAGAKI Takahiro wrote:
 Folks,
 
 Here is the latest version of the Load Distributed Checkpoint patch.
 
 I've fixed some bugs, including missing-file errors and overlapping
 asynchronous checkpoint requests.
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
