Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Bruce Momjian wrote:
> Tom Lane wrote:
>> Bruce Momjian <[EMAIL PROTECTED]> writes:
>>> I have been thinking we could improve how quickly VACUUM can expire
>>> rows if we update procArray.xmin more frequently for non-SERIALIZABLE
>>> transactions. The attached patch updates procArray.xmin in this manner.
>>
>> This patch is incredibly broken. Didn't you understand what I said
>> about how we don't track which snapshots are still alive? You can't
>> advance procArray.xmin past the xmin of the oldest live snapshot in the
>> backend, and you can't assume that there are no live snapshots at the
>> places where this patch assumes that. (Open cursors are one obvious
>> counterexample, but I think there are more.)
>>
>> To make intra-transaction advancing of xmin possible, we'd need to
>> explicitly track all of the backend's live snapshots, probably by
>> introducing a snapshot cache manager that gives out tracked refcounts
>> as we do for some other structures like catcache entries. This might
>> have some other advantages (I think most of the current CopySnapshot
>> operations could be replaced by refcount increments) but it's a whole
>> lot more complicated and invasive than what you've got here.
>
> I updated the patch to save MyProc->xid at the time the first cursor is
> created, and not allow MyProc->xid to be set lower than that saved
> value in the current transaction. It added only a few more lines to the
> patch.

It seems to me a lot cleaner to do the reference counting like Tom
suggested: increase the refcount on CopySnapshot, and decrease it on
FreeSnapshot, assuming that all callers of CopySnapshot free the
snapshot with FreeSnapshot when they're done with it.

BTW: I really like the idea of doing this. It's a relatively simple
thing to do and gives some immediate benefit. And it opens the door for
more tricks to vacuum more aggressively in the presence of long-running
transactions.
And it allows us to vacuum tuples that were inserted and deleted in the
same transaction even while the transaction is still running, which
helps some pathological cases where a transaction updates a counter
column many times within a transaction. We could also use it to free
entries in the new combocids hash table earlier (not that it's a problem
as it is, though).

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> It seems to me a lot cleaner to do the reference counting like Tom
> suggested. Increase the refcount on CopySnapshot, and decrease it on
> FreeSnapshot. Assuming that all callers of CopySnapshot free the
> snapshot with FreeSnapshot when they're done with it.

I don't believe we bother at the moment; which is one of the reasons
it'd be a nontrivial patch. I do think it might be worth doing, though.
In the simple case where you're just issuing successive non-cursor
commands within a READ COMMITTED transaction, a refcounted
implementation would be able to recognize that there are *no* live
snapshots between commands, and therefore reset MyProc->xmin to 0
whenever the backend is idle.

OTOH, do we have any evidence that this is worth bothering with at all?
I fear that the cases of long-running transactions that are problems in
the real world wouldn't be helped much --- for instance, pg_dump
wouldn't change behavior because it uses a serializable transaction.
Also, at some point a long-running transaction becomes a bottleneck
simply because its XID is itself the oldest thing visible in the
ProcArray and is determining everyone's xmin. How much daylight is there
really between "your xmin is old" and "your xid is old"?

			regards, tom lane
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> It seems to me a lot cleaner to do the reference counting like Tom
>> suggested. Increase the refcount on CopySnapshot, and decrease it on
>> FreeSnapshot. Assuming that all callers of CopySnapshot free the
>> snapshot with FreeSnapshot when they're done with it.
>
> I don't believe we bother at the moment; which is one of the reasons
> it'd be a nontrivial patch. I do think it might be worth doing though.
> In the simple case where you're just issuing successive non-cursor
> commands within a READ COMMITTED transaction, a refcounted
> implementation would be able to recognize that there are *no* live
> snapshots between commands and therefore reset MyProc->xmin to 0
> whenever the backend is idle.

Attached is my current version of the patch. It doesn't work now that I
tried to do reference counting for Snapshots, but I will stop now that
Tom is considering redesigning the snapshot mechanism.

> OTOH, do we have any evidence that this is worth bothering with at all?
> I fear that the cases of long-running transactions that are problems in
> the real world wouldn't be helped much --- for instance, pg_dump
> wouldn't change behavior because it uses a serializable transaction.
> Also, at some point a long-running transaction becomes a bottleneck
> simply because its XID is itself the oldest thing visible in the
> ProcArray and is determining everyone's xmin. How much daylight is
> there really between "your xmin is old" and "your xid is old"?

Well, interesting you mention that, because I have a second idea on how
to improve things. We start with MyProc->xmin equal to our own xid, and
then look for earlier transactions. It should be possible to skip
considering our own xid for MyProc->xmin. This would obviously help
VACUUM during long-running transactions.
While our transaction is running, our xid isn't committed, so VACUUM
isn't going to touch any of our rows, and if other transactions complete
before our multi-transaction _statement_ starts, we can't see deleted
rows from those transactions, so why keep the deleted rows around?
Consider this case:

	Session #:   1                  2                              3

	             BEGIN;
	             SELECT 1;
	                                CREATE TABLE test(x int);
	                                INSERT INTO test VALUES (1);
	                                DELETE FROM test;
	             SELECT 1;
	                                                               VACUUM VERBOSE test;
	                                                               (row can be reused)
	             COMMIT;
	                                                               VACUUM VERBOSE test;
	                                                               (normal row reuse)

As I understand it, in READ COMMITTED mode, we have to skip transactions
in progress when our _statement_ starts, but anything committed before
that we can see, and we don't see dead rows created by them.

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Index: src/backend/catalog/index.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.280
diff -c -c -r1.280 index.c
*** src/backend/catalog/index.c	3 Mar 2007 20:08:41 -0000	1.280
--- src/backend/catalog/index.c	24 Mar 2007 19:17:56 -0000
***************
*** 1358,1364 ****
  	ExprContext *econtext;
  	Snapshot	snapshot;
  	TransactionId OldestXmin;
! 
  	/*
  	 * sanity checks
  	 */
--- 1358,1364 ----
  	ExprContext *econtext;
  	Snapshot	snapshot;
  	TransactionId OldestXmin;
! 
  	/*
  	 * sanity checks
  	 */
***************
*** 1555,1561 ****
  	ExecDropSingleTupleTableSlot(slot);
  	FreeExecutorState(estate);
! 
  	/* These may have been pointing to the now-gone estate */
  	indexInfo->ii_ExpressionsState = NIL;
  	indexInfo->ii_PredicateState = NIL;
--- 1555,1563 ----
  	ExecDropSingleTupleTableSlot(slot);
  	FreeExecutorState(estate);
! 
! 	if (snapshot != SnapshotNow)
! 		FreeSnapshot(snapshot);
! 
  	/* These may have been pointing to the now-gone estate */
  	indexInfo->ii_ExpressionsState = NIL;
  	indexInfo->ii_PredicateState = NIL;

Index: src/backend/commands/cluster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.157
diff -c -c -r1.157 cluster.c
*** src/backend/commands/cluster.c	13 Mar 2007 00:33:39 -0000	1.157
--- src/backend/commands/cluster.c	24 Mar 2007 19:17:56 -0000
***************
*** 204,209 ****
--- 204,210 ----
  	/* Start a new transaction for each relation. */
  	StartTransactionCommand();
  	/* functions in indexes may want a snapshot set */
+ 	FreeSnapshot(ActiveSnapshot);
  	ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
Re: [PATCHES] [BUGS] BUG #3095: LDAP authentication parsing incorrectly
I have researched this problem, and the incorrect behavior seems to be
entirely caused by the fact that unquoted commas are treated as item
separators in pg_hba.conf. I have updated the documentation in 8.2 and
CVS HEAD to indicate that the LDAP URL should be double-quoted, and
double-quoted the example URL for emphasis. If double-quoting does not
100% fix your problem, please let us know. Thanks.

Documentation patch attached.

---------------------------------------------------------------------------

Joey Wang wrote:
> The following bug has been logged online:
>
> Bug reference:      3095
> Logged by:          Joey Wang
> Email address:      [EMAIL PROTECTED]
> PostgreSQL version: 8.2.3
> Operating system:   Linux
> Description:        LDAP authentication parsing incorrectly
> Details:
>
> LDAP authentication parsing has two bugs. When pg_hba.conf contains a
> line
>
>   host all all 127.0.0.1/24 ldap ldap://ActiveDirectory/dc=domain,dc=com;cn=;,cn=users
>
> we expect the parsing will construct a user DN as
>
>   cn=userid,cn=users,dc=domain,dc=com
>
> But (1) dc=domain,dc=com is ignored. This is the source code from
> auth.c:
>
>   /* ldap, no port number */
>   r = sscanf(port->auth_arg,
>              "ldap://%127[^/]/%127[^;];%127[^;];%127s",
>              server, basedn, prefix, suffix);
>   ...
>   snprintf(fulluser, sizeof(fulluser), "%s%s%s",
>            prefix, port->user_name, suffix);
>   fulluser[sizeof(fulluser) - 1] = '\0';
>   r = ldap_simple_bind_s(ldap, fulluser, passwd);
>
> We can see the code does not use basedn.
>
> (2) A suffix containing ',' is converted to another character. This bug
> is caused by the parsing algorithm treating the comma as a token
> separator.

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/client-auth.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v
retrieving revision 1.97
diff -c -c -r1.97 client-auth.sgml
*** doc/src/sgml/client-auth.sgml	31 Jan 2007 20:56:16 -0000	1.97
--- doc/src/sgml/client-auth.sgml	24 Mar 2007 21:44:29 -0000
***************
*** 929,937 ****
  <synopsis>
  ldap[<replaceable>s</replaceable>]://<replaceable>servername</replaceable>[:<replaceable>port</replaceable>]/<replaceable>base dn</replaceable>[;<replaceable>prefix</replaceable>[;<replaceable>suffix</replaceable>]]
  </synopsis>
! for example:
  <synopsis>
! ldap://ldap.example.net/dc=example,dc=net;EXAMPLE\
  </synopsis>
  </para>
--- 929,941 ----
  <synopsis>
  ldap[<replaceable>s</replaceable>]://<replaceable>servername</replaceable>[:<replaceable>port</replaceable>]/<replaceable>base dn</replaceable>[;<replaceable>prefix</replaceable>[;<replaceable>suffix</replaceable>]]
  </synopsis>
! Commas are used to specify multiple items in an <literal>ldap</literal>
! component.  However, because unquoted commas are treated as item
! separators in <filename>pg_hba.conf</filename>, it is wise to
! double-quote the <literal>ldap</literal> URL to preserve any commas
! present, e.g.:
  <synopsis>
! "ldap://ldap.example.net/dc=example,dc=net;EXAMPLE\"
  </synopsis>
  </para>
Re: [PATCHES] Improvement of procArray.xmin for VACUUM
> OTOH, do we have any evidence that this is worth bothering with at all?
> I fear that the cases of long-running transactions that are problems in
> the real world wouldn't be helped much --- for instance, pg_dump
> wouldn't change behavior because it uses a serializable transaction.

Well, I think this would be the same infrastructure we would need to do
the other discussed improvement to address pg_dump's impact. That would
require us to publish the youngest xmax of the live snapshots. Vacuum
could deduce that that xid cannot possibly see any transactions between
the youngest extant xmax and the oldest in-progress transaction.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
Re: [PATCHES] LIMIT/SORT optimization
>> Did Greg push a version which didn't carry the WIP label to it? As far
>> as I remember he was still asking how to make the Sort and Limit nodes
>> communicate.
>
> Good question. I asked for a new version of this patch and the WIP was
> only in the email subject line. Greg, is this ready for review?

Well, my question was whether it was acceptable to do things this way or
whether there was a better way. If it's the right way then sure, if not
then no. I guess that's what review is for.

In short, I'm not sure what constitutes "ready for review". Are you
asking if it's ready to apply? I don't know, that's why we have reviews.
Or are you asking if it's ready for someone to look at? What's the point
of posting WIP patches if you don't want someone to look at them?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
Re: [PATCHES] LIMIT/SORT optimization
Gregory Stark wrote:
>>> Did Greg push a version which didn't carry the WIP label to it? As
>>> far as I remember he was still asking how to make the Sort and Limit
>>> nodes communicate.
>>
>> Good question. I asked for a new version of this patch and the WIP was
>> only in the email subject line. Greg, is this ready for review?
>
> Well, my question was whether it was acceptable to do things this way
> or whether there was a better way. If it's the right way then sure, if
> not then no. I guess that's what review is for.
>
> In short, I'm not sure what constitutes "ready for review". Are you
> asking if it's ready to apply? I don't know, that's why we have
> reviews. Or are you asking if it's ready for someone to look at? What's
> the point of posting WIP patches if you don't want someone to look at
> them?

I am asking if _you_ are done working on it. Seems you are, so I will
add it to the patch queue.

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] Automatic adjustment of bgwriter_lru_maxpages
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> Sorry, I had a mistake in the patch I sent. This is a fixed version.
>
> I wrote:
>> I'm working on making the bgwriter write out almost all of the dirty
>> pages. This is a proposal for it, using automatic adjustment of
>> bgwriter_lru_maxpages.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
> [ Attachment, skipping... ]

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] Optimized pgbench for 8.3
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> The attached is a patch to optimize contrib/pgbench using new 8.3
> features.
>
> - Use DROP IF EXISTS to suppress errors on initial loading.
> - Use a combination of TRUNCATE and COPY to reduce WAL when creating
>   the accounts table.
>
> Also, there are some cosmetic changes.
>
> - Change the output of the -v option from "starting full vacuum..." to
>   "starting vacuum accounts..." to reflect what it actually does.
> - Move duplicated error checks into executeStatement().
>
> There is a big performance win from the COPY with no WAL feature.
> Thanks for the efforts!
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
> [ Attachment, skipping... ]

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] POSIX Shared memory, revised, again
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Chris Marcellino wrote:
> So I've finished reformulating this patch to use the device/inode
> approach instead of the hash idea I had earlier. I have tested this
> patch (on Darwin) to confirm that it permits the postmaster to recover
> after a crash or being killed, and that it does not permit the
> postmaster to restart while another postmaster or backends are still in
> the database (regardless of the lockfile status).
>
> The ideas are the same as before, but are described via flowchart in
> the attached PDF. The three template files opt those particular
> templates out of the POSIX shmem support, but this could alternately be
> done by opting in the desired ones. An autoconf'ing of the patched
> configure.in file is, of course, required.
>
> I appreciate the feedback and consideration. Thanks again,
>
> Chris Marcellino
>
> [ Attachment, skipping... ]
> [ Attachment, skipping... ]
> [ Attachment, skipping... ]
> [ Attachment, skipping... ]
> [ Attachment, skipping... ]

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] POSIX shared memory support
Newest version added:

Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Chris Marcellino wrote:
> In case you haven't had enough, here is another version of the code to
> make Postgres use POSIX shared memory. Along with the issues that have
> already been addressed, this version ensures that orphaned backends are
> not in the database when restarting Postgres, by using a single 1-byte
> SysV segment and checking who is attached to it via
> shmctl/IPC_STAT/nattch. This effectively frees Postgres from the SHMMAX
> and SHMALL limits.
>
> Since this still takes one SysV segment, SHMMNI can still be reached on
> most platforms if a ton of databases are opened simultaneously (i.e. 32
> on Mac OS X, 256 on Linux and Solaris). If you have the need to ship a
> product with Postgres embedded in it and are unable to change kernel
> settings (like myself), this might be of use to you.
>
> I have tested all of the failure situations I could think of by various
> combinations of deleting lockfiles while in use, changing the PID
> inside the lockfile, and trying to restart and run more than one
> postmaster simultaneously.
>
> Of course, since this requires both POSIX and SysV shared memory, it
> doesn't increase the portability of Postgres, which might make it less
> appropriate for mass distribution; I thought I would put it out there
> for any feedback either way.
>
> Thanks again,
> Chris Marcellino
>
> [ Attachment, skipping... ]

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] Load distributed checkpoint V3
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> Folks,
>
> Here is the latest version of the Load distributed checkpoint patch.
> I've fixed some bugs, including in the cases of missing-file errors and
> overlapping asynchronous checkpoint requests.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
> [ Attachment, skipping... ]

--
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +