[HACKERS] Re: Synch Rep: direct transfer of WAL file from the primary to the standby
Hi,

On Tue, Jun 16, 2009 at 3:13 PM, Fujii Masao <masao.fu...@gmail.com> wrote:

> The main part of this capability is the new function to read the specified WAL file. The following is the definition of it.
>
> pg_read_xlogfile(filename text [, restore bool]) returns setof bytea
>
> - filename: name of the file to read
> - restore: indicates whether to try to restore the file from the archive
> - returns: the content of the specified file (the max size of one row is 8KB, i.e. this function returns 2,048 rows when a 16MB WAL file is requested)
>
> If restore=true, this function tries to retrieve the file from the archive first. This requires restore_command, which needs to be specified in postgresql.conf.

In order for the primary server (i.e. a normal backend) to read an archived file, restore_command needs to be specified in postgresql.conf as well. In this case, how should we handle restore_command in recovery.conf?

1) Delete restore_command from recovery.conf. A user would then have to specify it in postgresql.conf instead of recovery.conf when doing PITR. This is simple, but tempts me to merge the two configuration files; I'm not sure why the parameters for recovery should be kept apart from postgresql.conf.

2) Leave restore_command in recovery.conf; it can then be set in both or either of the two configuration files. We would give recovery.conf priority over postgresql.conf during recovery if it's set in both; after recovery, postgresql.conf takes priority. In this case, recovery.conf also needs to be re-read during recovery when SIGHUP arrives. This might be complicated for a user.

3) Split restore_command into two parameters, for example:
- normal_restore_command: used by a normal backend
- recovery_restore_command: used by the startup process for PITR
In this case, it's bothersome that the same command must be set in both configuration files.

I'm leaning toward 1), i.e. restore_command is simply moved from recovery.conf to postgresql.conf. What's your opinion?

Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
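For illustration, fetching a segment through the proposed function might look like this (a sketch: pg_read_xlogfile() exists only in the patch under discussion, and the segment name below is hypothetical):

-- Fetch a 16MB WAL segment in 8KB bytea chunks, falling back to the
-- archive if it is no longer in pg_xlog; reassembling the chunks is
-- left to the caller.
SELECT data
FROM pg_read_xlogfile('000000010000000000000001', true) AS t(data);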
Re: [HACKERS] 8.5 development schedule
Robert Haas wrote: On Fri, Jul 3, 2009 at 1:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes: Robert Haas wrote:

What I've seen of Heikki's work thus far has led me to believe that his reasons for rejecting the patch were good ones, but I don't know specifically what they were. It would be helpful, I think, to reiterate them or repost links to the relevant messages in the archives; it would also be great if we could get an estimate of how close the patch is to being committable. Does it still need massive work, or is it getting fairly close, or what? Are the issues code cleanliness/maintainability, bugs, missing functionality?

This is where we left off: http://archives.postgresql.org/message-id/49a64d16.8010...@enterprisedb.com

There were adjacent remarks suggesting that large other parts of the patch remained to be reviewed, as well. http://archives.postgresql.org/pgsql-hackers/2009-02/msg01268.php

Thanks to both of you, this is very helpful. Two other questions:

1. Are there any chunks of functionality in this patch that seem like they might be able to be severed and committed separately?

I don't think so.

2. Was the latest version of this patch posted to the mailing list, and if so can you provide a link?

Yes: http://archives.postgresql.org/message-id/49a64d73.6090...@enterprisedb.com

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] First CommitFest: July 15th
On Saturday 04 July 2009 00:54:11 Robert Haas wrote:

I think what would be more useful is if we could somehow associate metadata with each commit. Right now, for example, the author of a patch is not stored with the patch in any structured way; it's just typed in, usually but not always as the last line of the commit message. So you can't easily find out what lines of code a certain person has touched, for example. The sorts of problems that you're talking about seem broadly in the same vein.

I have been trying to follow a convention, on and off, to put the author of the patch in the last line of the commit message, like

Author: First Last <n...@example.com>

A tool such as git-cvsimport will actually parse that and put it into the author field of a git commit. (The tool we use, fromcvs, doesn't do that, but it could conceivably be patched easily to do it.)

I also found the following resource helpful in crafting commit messages: http://www.tpope.net/node/106

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
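A hypothetical commit message combining both conventions (the subject-line style from the linked article plus the Author trailer; the subject and body here are made up):

Fix frobnication of widgets in the WAL replay path.

A longer explanation of what the change does and why goes here,
after one blank line, wrapped at around 72 columns.

Author: First Last <n...@example.com>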
Re: [HACKERS] tsvector extraction patch
On Friday 03 July 2009 10:49:41 Hans-Juergen Schoenig -- PostgreSQL wrote:

hello, this patch has not made it through yesterday, so i am trying to send it again. i made a small patch which i found useful for my personal tasks. it would be nice to see this in 8.5, if not core then maybe contrib. it transforms a tsvector to table format, which is really nice for text processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

Sounds useful. But in the interest of orthogonality (or whatever), how about instead you write a cast from tsvector to text[], and then you can use unnest() to convert that to a table, e.g.,

SELECT * FROM unnest(CAST(to_tsvector('...') AS text[]));

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
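To make the suggestion concrete, the intended usage once such a cast existed might be (hypothetical: no tsvector-to-text[] cast ships today, so this shows the interface, not working code):

-- Hypothetical: relies on a yet-to-be-written cast from tsvector to text[].
SELECT lexeme
FROM unnest(CAST(to_tsvector('english',
                             'i am pretty sure this is a good patch')
                 AS text[])) AS t(lexeme);
-- Expected: one row per lexeme (good, patch, pretti, sure), though the
-- positions shown as "rank" above would be lost in such a cast.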
Re: [HACKERS] Feedback on writing extensible modules
Tom Lane <t...@sss.pgh.pa.us> writes: Dimitri Fontaine <dfonta...@hi-media.com> writes:

Please find attached a little patch which runs process_local_preload_libraries from within a transaction.

This is inevitably going to break other people's code. Put the transaction wrapper in your own stuff if you have to have it.

The module is working fine on HEAD without any patch if it takes care of starting a transaction itself in _PG_init(), even when _PG_init() is called at function call time rather than at local_preload_libraries time. My reservation was that I thought a transaction around _PG_init() already existed in a 'normal' call, so that the explicit creation of one in the module would fail:

StartTransactionCommand();
...
CommitTransactionCommand();

Now my only problem is related to making the module 8.3 compliant:

pre_prepare.c:19:27: error: utils/snapmgr.h: No such file or directory
pre_prepare.c: In function ‘_PG_init’:
pre_prepare.c:188: warning: implicit declaration of function ‘PushActiveSnapshot’
pre_prepare.c:207: warning: implicit declaration of function ‘PopActiveSnapshot’

I guess I can document that having pre_prepare in local_preload_libraries with preprepare.at_init = on is only supported from 8.4 onward...

Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
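The pattern being described might look roughly like this inside the module (a sketch under stated assumptions: that _PG_init() can run outside any transaction when preloaded, and that the snapshot calls are the 8.4-only part the compile errors above point at; this is not the actual pre_prepare code):

#include "postgres.h"
#include "fmgr.h"
#include "access/xact.h"
#include "utils/snapmgr.h"      /* 8.4+: Push/PopActiveSnapshot() */

PG_MODULE_MAGIC;

void _PG_init(void);

void
_PG_init(void)
{
    bool        started = false;

    /* Start a transaction ourselves if we were loaded outside of one. */
    if (!IsTransactionState())
    {
        StartTransactionCommand();
        started = true;
    }
    PushActiveSnapshot(GetTransactionSnapshot());

    /* ... whatever catalog access the module needs goes here ... */

    PopActiveSnapshot();
    if (started)
        CommitTransactionCommand();
}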
[HACKERS] FYI: fdatasync vs sync_file_range
Hi,

Using sync_file_range(2) as wal_sync_method might speed up the XLOG flush. So, I made a patch to introduce a new valid value (sync_file_range) for wal_sync_method, and performed a comparative performance measurement of fdatasync vs. sync_file_range using this patch. The patch is attached to this mail. This is just reference information; I'm not planning to submit the patch for the CommitFest now.

Environment:
- PowerEdge 1850 (Xeon 2.8GHz, Mem 512MB)
- Fedora 11
- PostgreSQL v8.4 with the patch

Measurement:
- pgbench -i -s64
- pgbench -c16 -t1000 -Mprepared [20 times]
- postgresql.conf: checkpoint_segments = 64
- The above measurement was repeated 3 times

Result (the values are pgbench throughput in tps):

The first set
      fdatasync   sync_file_range
  1        60.6              58.9
  2        63.1              58.8
  3        61.3              62.3
  4        70.3              66.8
  5        67.4              66.2
  6        67.8              71.1
  7        74.3              67.5
  8        70.0              71.9
  9        71.7              72.8
 10        74.0              72.0
 11        72.3              72.1
 12        79.9              78.6
 13        73.3              73.3
 14        72.9              71.2
 15        78.6              78.6
 16        81.7              76.7
 17        75.5              75.9
 18        78.0              73.3
 19        75.3              78.9
 20        83.0              77.3
avg        72.5              71.2

The second set
      fdatasync   sync_file_range
  1        52.6              60.3
  2        57.4              65.9
  3        62.6              63.7
  4        59.0              68.9
  5        67.0              72.2
  6        61.5              72.2
  7        69.0              73.4
  8        64.3              75.6
  9        67.6              74.8
 10        69.1              75.7
 11        65.7              77.7
 12        72.6              76.6
 13        68.8              75.5
 14        69.4              79.4
 15        74.2              81.2
 16        71.4              77.5
 17        71.3              78.0
 18        73.1              80.4
 19        73.5              80.2
 20        73.7              80.7
avg        67.2              74.5

The third set
      fdatasync   sync_file_range
  1        60.9              59.5
  2        58.3              64.1
  3        64.7              62.9
  4        66.6              68.0
  5        67.9              70.9
  6        69.9              69.4
  7        70.0              72.6
  8        72.3              76.6
  9        70.7              74.7
 10        70.3              70.2
 11        77.2              78.2
 12        74.8              73.9
 13        69.6              79.0
 14        79.3              80.7
 15        78.0              74.6
 16        77.8              78.9
 17        73.6              81.0
 18        81.5              77.6
 19        76.1              78.5
 20        79.1              83.7
avg        71.9              73.8

According to the results, using sync_file_range instead of fdatasync has little effect on the performance of postgres. This time I just used sync_file_range with the following combination of the flags:

SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER

This might be a stupid way, so there might be room for improvement.

Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

[Attachment, abridged: the patch adds an autoconf check for sync_file_range() to configure, skipped on Solaris because "sync_file_range() is a no-op on Solaris, so don't incur function overhead by calling it"; the generated-configure hunk that followed here was garbled in extraction and is truncated mid-file.]
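For readers unfamiliar with the call, a minimal standalone sketch of the flag combination used in the benchmark above (Linux-specific; the file path is hypothetical and error handling is minimal):

#define _GNU_SOURCE
#include <fcntl.h>      /* sync_file_range() and SYNC_FILE_RANGE_* flags */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
    /* hypothetical WAL segment path, for illustration only */
    int fd = open("pg_xlog/000000010000000000000001", O_WRONLY);

    if (fd < 0)
    {
        perror("open");
        return EXIT_FAILURE;
    }

    /*
     * Flush the whole file (offset 0, nbytes 0 means "to end of file"),
     * waiting for any in-flight writeback before starting and for the
     * initiated writeback to complete -- semantically close to
     * fdatasync(), which is consistent with the benchmark result above.
     */
    if (sync_file_range(fd, 0, 0,
                        SYNC_FILE_RANGE_WAIT_BEFORE |
                        SYNC_FILE_RANGE_WRITE |
                        SYNC_FILE_RANGE_WAIT_AFTER) != 0)
        perror("sync_file_range");

    close(fd);
    return EXIT_SUCCESS;
}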
Re: [HACKERS] WIP: generalized index constraints
On Sun, 2009-07-05 at 17:28 -0700, Jeff Davis wrote:

This is a follow-up to my old proposal here: http://archives.postgresql.org/pgsql-hackers/2008-06/msg00404.php Any input is appreciated (design problems, implementation, language ideas, or anything else). I'd like to get it into shape for the July 15 commitfest if no major problems are found.

I was concerned that your definition of "concurrently inserted" didn't seem to match the size of the shared memory array required. How will you cope with a large COPY? Surely there can be more than one concurrent insert from any backend?

It would be useful to see a real example of what this can be used for.

I think it will be useful to separate the concept of a constraint from the concept of an index. It seems possible to have a UNIQUE constraint that doesn't help at all in locating rows, just in proving that the rows are unique.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, Jul 6, 2009 at 11:56 AM, Simon Riggssi...@2ndquadrant.com wrote: How will you cope with a large COPY? Surely there can be more than one concurrent insert from any backend? He only needs to handle inserts for the period they're actively being inserted into the index. Once they're in the index he'll find them using the index scan. In other words this is all a proxy for the way btree locks index pages while it looks for a unique key violation. I'm a bit concerned about the use of tid. You might have to look at a lot of heap pages to check for conflicts. I suppose they're almost certainly all in shared memory though. Also, it sounds like you're anticipating the possibility of dead entries in the array but if you do then you need to store the xmin also to protect against a tuple that's been vacuumed and had its line pointer reused since. But I don't see the necessity for that anyways since you can just clean up the entry on abort. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ALTER SET DISTINCT vs. Oracle-like DBMS_STATS
Hello,

A new feature, ALTER TABLE ... ALTER COLUMN ... SET DISTINCT, is submitted to the next commitfest: http://archives.postgresql.org/message-id/603c8f070905041913r667b3f32oa068d758ba5f1...@mail.gmail.com but I have another approach to the plan stability issues. It might conflict with the ALTER SET DISTINCT patch in terms of duplicated functionality, so I want to discuss them together.

It is similar to Oracle's DBMS_STATS package. http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_stats.htm With it,

ALTER TABLE tablename ALTER COLUMN 3rd-column SET DISTINCT 100

could be written as:

INSERT INTO dbms_stats.columns(starelid, staattnum, stadistinct) VALUES ('tablename'::regclass, 3, 100);

Of course ALTER SET DISTINCT is easier to use, so it could be an alias for the above INSERT command.

The DBMS_STATS for Postgres is based on the new statistics hooks in 8.4 -- get_relation_info_hook, get_attavgwidth_hook, get_relation_stats_hook and get_index_stats_hook. The module has dbms_stats.relations and dbms_stats.columns tables and hides pg_class and pg_statistic when enabled. So, once you set a value in dbms_stats.columns.stadistinct, the value hides pg_statistic.stadistinct and the planner always uses it for planning. You can modify the statistics of your tables by inserting values directly into the relations and columns tables. Also, the lock() and unlock() functions are useful for using slightly customized stats based on existing values.

- TABLE dbms_stats.relations: hides pg_class.relpages, reltuples.
- TABLE dbms_stats.columns: hides pg_statistic.
- FUNCTION dbms_stats.lock(): copies pg_class and pg_statistic to the above tables.
- FUNCTION dbms_stats.unlock(): deletes rows from the above tables.

The module also supports a backup-statistics feature:

- TABLE dbms_stats.backup, relations_backup, columns_backup
- FUNCTION dbms_stats.backup(): backs up statistics to the above tables.
- FUNCTION dbms_stats.restore(): restores statistics from the above tables.
- FUNCTION dbms_stats.export(): exports statistics to an external text file.
- FUNCTION dbms_stats.import(): imports statistics from an external text file.

If acceptable, I'd like to submit the DBMS_STATS for Postgres module to the September commitfest. I'm not sure whether the feature should be in core, in contrib, or on pgFoundry... Comments welcome.

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
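For illustration, usage of the proposed module might look like this (a sketch: the dbms_stats schema, tables, and functions exist only in the proposal above, and the exact signatures are assumptions):

-- Pin ndistinct for the third column of "tablename", overriding ANALYZE:
INSERT INTO dbms_stats.columns (starelid, staattnum, stadistinct)
    VALUES ('tablename'::regclass, 3, 100);

-- Or freeze the current statistics, tweak one value, and later undo it all:
SELECT dbms_stats.lock();
UPDATE dbms_stats.columns
   SET stadistinct = 100
 WHERE starelid = 'tablename'::regclass AND staattnum = 3;
SELECT dbms_stats.unlock();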
Re: [HACKERS] FYI: fdatasync vs sync_file_range
On Mon, 2009-07-06 at 17:54 +0900, Fujii Masao wrote:

According to the results, using sync_file_range instead of fdatasync has little effect on the performance of postgres. [...when flushing XLOG]

Why did you think it would? AFAICS the range of dirty pages will be restricted to a fairly tight range anyway. The only difference between the two would indicate an OS inefficiency. I don't see an opportunity for XLOG to be more efficient by using a finer-grained API.

I think there is still a valid use for sync_file_range at checkpoint, since for some large tables this could reduce the number of pages needing to be written at checkpoint time. That would help smooth out larger writes.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FYI: fdatasync vs sync_file_range
Fujii Masao wrote:

According to the results, using sync_file_range instead of fdatasync has little effect on the performance of postgres.

When we flush the WAL, we flush everything we've written up to that point. I'm not surprised that sync_file_range makes no difference; it does the same amount of I/O as fsync().

sync_file_range() might be a useful replacement for the data file fsync()s at checkpoint, though. You could avoid the I/O storm that fsync() causes by flushing the files in smaller chunks with sync_file_range(), with a small delay in between. But since I don't recall any complaints about I/O storms at checkpoints since the smoothed checkpoints patch in 8.3, it might not be worth it.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
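A sketch of the chunked flushing described above (illustrative only, not PostgreSQL code: the chunk size, delay, and omitted error handling are all arbitrary assumptions):

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Sketch: flush a data file in 1MB chunks with a short pause between
 * chunks, instead of one big fsync() at checkpoint time.  "len" is the
 * file's size in bytes; errors are ignored for brevity.
 */
static void
flush_in_chunks(int fd, off_t len)
{
    const off_t chunk = 1024 * 1024;    /* arbitrary 1MB chunks */
    off_t       offset;

    for (offset = 0; offset < len; offset += chunk)
    {
        (void) sync_file_range(fd, offset, chunk,
                               SYNC_FILE_RANGE_WRITE |
                               SYNC_FILE_RANGE_WAIT_AFTER);
        usleep(10000);                  /* arbitrary 10ms breather */
    }
}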
Re: [HACKERS] First CommitFest: July 15th
Peter Eisentraut wrote: On Saturday 04 July 2009 00:54:11 Robert Haas wrote:

I think what would be more useful is if we could somehow associate metadata with each commit. Right now, for example, the author of a patch is not stored with the patch in any structured way; it's just typed in, usually but not always as the last line of the commit message. So you can't easily find out what lines of code a certain person has touched, for example. The sorts of problems that you're talking about seem broadly in the same vein.

I have been trying to follow a convention on-and-off to put the author of the patch in the last line of the commit message, like

Author: First Last <n...@example.com>

Sure, I can use that format if we decide to be consistent.

A tool such as git-cvsimport will actually parse that and put it into the author field of a git commit. (The tool we use, fromcvs, doesn't do that, but it could conceivably be patched easily to do it.) I also found the following resource helpful in crafting commit messages: http://www.tpope.net/node/106

Interesting idea to have a subject line for the commit message.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Reduce the memcpy call from SearchCatCache
Hi,

Here are the oprofile results of pgbench.

CPU: P4 / Xeon with 2 hyper-threads, speed 2793.55 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events with a unit mask of 0x01 (mandatory) count 10
samples  %       app name       symbol name
134521   6.8312  ipmi_si        (no symbols)
94515    4.7996  vmlinux        schedule
52609    2.6716  postgres       AllocSetAlloc
39659    2.0140  postgres       base_yyparse
34605    1.7573  vmlinux        mwait_idle
33234    1.6877  vmlinux        _spin_lock
31353    1.5922  libc-2.3.4.so  memcpy

I think that the performance improves if the call frequency of memcpy is reduced. I measured the places where postgres uses memcpy (test program: pgbench -t 4000):

 calls    total-size  avg-size  caller
 636185   111968560   176       catcache.c:1129
 68236    18436197    270       xlog.c:947
 3909     13822874    3536      xlog.c:940
 20003    3520528     176       catcache.c:1376
 56010    2071477     36        pgstat.c:2288
 125524   1902864     15        dynahash.c:948
 20001    1760088     88        setrefs.c:205

catcache.c:1129 is the memcpy in SearchCatCache, and catcache.c:1376 is the memcpy in SearchCatCacheList:

memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));

The attached patch reduces the memcpy calls from SearchCatCache and SearchCatCacheList. The patch directly uses cache->cc_skey when looking up the hash table. Here is the effect of the patch.

original:
Counted GLOBAL_POWER_EVENTS events
samples  %       app name       symbol name
31353    1.5922  libc-2.3.4.so  memcpy

patched:
Counted GLOBAL_POWER_EVENTS events
samples  %       app name       symbol name
20629    1.0684  libc-2.3.4.so  memcpy

--- Atsushi Ogawa

*** ./src/backend/utils/cache/catcache.c.orig  2009-07-06 22:06:52.0 +0900
--- ./src/backend/utils/cache/catcache.c       2009-07-06 13:51:48.0 +0900
***************
*** 1124,1140 ****
  	/*
  	 * initialize the search key information
  	 */
! 	memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
! 	cur_skey[0].sk_argument = v1;
! 	cur_skey[1].sk_argument = v2;
! 	cur_skey[2].sk_argument = v3;
! 	cur_skey[3].sk_argument = v4;

  	/*
  	 * find the hash bucket in which to look for the tuple
  	 */
! 	hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys, cur_skey);
  	hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

  	/*
--- 1124,1141 ----
  	/*
  	 * initialize the search key information
+ 	 * use cache->cc_skey directly in looking for hash table
  	 */
! 	cache->cc_skey[0].sk_argument = v1;
! 	cache->cc_skey[1].sk_argument = v2;
! 	cache->cc_skey[2].sk_argument = v3;
! 	cache->cc_skey[3].sk_argument = v4;

  	/*
  	 * find the hash bucket in which to look for the tuple
  	 */
! 	hashValue = CatalogCacheComputeHashValue(cache, cache->cc_nkeys,
! 											 cache->cc_skey);
  	hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

  	/*
***************
*** 1160,1166 ****
  			HeapKeyTest(&ct->tuple, cache->cc_tupdesc, cache->cc_nkeys,
! 						cur_skey, res);
  			if (!res)
  				continue;
--- 1161,1167 ----
  			HeapKeyTest(&ct->tuple, cache->cc_tupdesc, cache->cc_nkeys,
! 						cache->cc_skey, res);
  			if (!res)
  				continue;
***************
*** 1222,1227 ****
--- 1223,1234 ----
  	 */
  	relation = heap_open(cache->cc_reloid, AccessShareLock);

+ 	/*
+ 	 * We need to copy the ScanKey data, because systable_beginscan changes
+ 	 * the ScanKey data.
+ 	 */
+ 	memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
+
  	scandesc = systable_beginscan(relation, cache->cc_indexoid,
  								  IndexScanOK(cache, cur_skey),
***************
*** 1371,1389 ****
  	/*
  	 * initialize the search key information
  	 */
! 	memcpy(cur_skey, cache->cc_skey, sizeof(cur_skey));
! 	cur_skey[0].sk_argument = v1;
! 	cur_skey[1].sk_argument = v2;
! 	cur_skey[2].sk_argument = v3;
! 	cur_skey[3].sk_argument = v4;

  	/*
  	 * compute a hash value of the given keys for faster search.  We don't
  	 * presently divide the CatCList items into buckets, but this still lets
  	 * us skip non-matching items quickly most of the time.
  	 */
[remainder of patch truncated in the archive]
Re: [HACKERS] commitfest.postgresql.org
Peter Eisentraut <pete...@gmx.net> wrote: On Saturday 04 July 2009 01:19:23 Joshua D. Drake wrote:

A button says, "I am about to perform X." A link *always* says, "I am about to go to a new web page."

That was my feeling. In addition, if the action will be preceded by a dialog (for options or confirmation), the button text should end with '...'.

-Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, Jul 06, 2009 at 11:56:41AM +0100, Simon Riggs wrote: On Sun, 2009-07-05 at 17:28 -0700, Jeff Davis wrote:

This is a follow-up to my old proposal here: http://archives.postgresql.org/pgsql-hackers/2008-06/msg00404.php Any input is appreciated (design problems, implementation, language ideas, or anything else). I'd like to get it into shape for the July 15 commitfest if no major problems are found.

I was concerned that your definition of "concurrently inserted" didn't seem to match the size of the shared memory array required. How will you cope with a large COPY? Surely there can be more than one concurrent insert from any backend?

It would be useful to see a real example of what this can be used for.

Constraints like "these intervals can't overlap" would be one. It's handy in calendaring applications, for example.

I think it will be useful to separate the concept of a constraint from the concept of an index. It seems possible to have a UNIQUE constraint that doesn't help at all in locating rows, just in proving that the rows are unique.

Interesting idea. Are you thinking of this in terms of things the planner can do once it knows a set is all distinct values, or...?

Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Determining client_encoding from client locale
Here's my first attempt at setting client_encoding automatically from the client locale. It adds a new conninfo parameter to libpq, client_encoding. If set to "auto", libpq uses the encoding returned by pg_get_encoding_from_locale(). Any other value is passed through to the server as is. psql is modified to set client_encoding=auto, unless overridden by PGCLIENTENCODING.

BTW, I had to modify psql to use PQconnectdb() instead of PQsetdbLogin(), so that it can pass the extra parameter. I found it a bit laboursome to construct the conninfo string with proper escaping, just to have libpq parse and split it into components again. Could we have a version of PQconnectdb() with an API more suited for setting the params programmatically? The PQsetdbLogin() approach doesn't scale as parameters are added/removed in future versions, but we could have something like this:

PGconn *PQconnectParams(const char **params)

where params is an array with an even number of elements, forming key/value pairs. Usage example:

char *connparams[] = {
    "dbname", "mydb",
    "user", "username",
    NULL    /* terminate with NULL */
};
conn = PQconnectParams(connparams);

This is similar to what I did internally in psql in the attached patch. Another idea is to use an array of PQconninfoOption structs:

PGconn *PQconnectParams(PQconninfoOption *params);

This would be quite natural since that's the format returned by PQconndefaults() and PQconninfoParse(), but a bit more cumbersome to use in applications that don't use those functions, as in the previous example.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

commit 24f6d68ddd3725c1f9a98c47f7535b2973ffc492
Author: Heikki Linnakangas <hei...@enterprisedb.com>
Date:   Mon Jul 6 16:54:00 2009 +0300

    Add client_encoding conninfo parameter. By specifying the special value
    'auto', libpq will determine the encoding from the current locale.
    Modify psql to use the 'auto' mode if PGCLIENTENCODING is not set.

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 86affb0..a5d45b2 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -236,6 +236,19 @@
       </listitem>
      </varlistentry>

+     <varlistentry id="libpq-connect-client-encoding" xreflabel="client_encoding">
+      <term><literal>client_encoding</literal></term>
+      <listitem>
+       <para>
+        Character encoding to use. This sets the <varname>client_encoding</varname>
+        configuration option for this connection. In addition to the values
+        accepted by the corresponding server option, you can use 'auto' to
+        determine the right encoding from the current locale in the client
+        (LC_CTYPE environment variable on Unix systems).
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="libpq-connect-options" xreflabel="options">
       <term><literal>options</literal></term>
       <listitem>
@@ -5871,6 +5884,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
       <xref linkend="libpq-connect-connect-timeout"> connection parameter.
      </para>
     </listitem>
+
+    <listitem>
+     <para>
+      <indexterm>
+       <primary><envar>PGCLIENTENCODING</envar></primary>
+      </indexterm>
+      <envar>PGCLIENTENCODING</envar> behaves the same as the <xref
+      linkend="libpq-connect-client-encoding"> connection parameter.
+     </para>
+    </listitem>
   </itemizedlist>
  </para>
@@ -5907,17 +5930,6 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
    <listitem>
     <para>
      <indexterm>
-      <primary><envar>PGCLIENTENCODING</envar></primary>
-     </indexterm>
-     <envar>PGCLIENTENCODING</envar> sets the default client character
-     set encoding. (Equivalent to <literal>SET client_encoding TO
-     ...</literal>.)
-    </para>
-   </listitem>
-
-   <listitem>
-    <para>
-     <indexterm>
      <primary><envar>PGGEQO</envar></primary>
     </indexterm>
     <envar>PGGEQO</envar> sets the default mode for the genetic query
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 0955e13..6991e7a 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1239,8 +1239,7 @@ do_connect(char *dbname, char *user, char *host, char *port)

 	while (true)
 	{
-		n_conn = PQsetdbLogin(host, port, NULL, NULL,
-							  dbname, user, password);
+		n_conn = PSQLconnect(host, port, dbname, user, password);

 		/* We can immediately discard the password -- no longer needed */
 		if (password)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 6b2de37..a5a0b0a 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -32,6 +32,8 @@ static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);

+static char *construct_conninfo(const char * const *optarray);
+
 /*
  * Safe wrapper around strdup()
  */
@@ -1538,3 +1540,75 @@ expand_tilde(char **filename)

 	return *filename;
 }
+
+/*
+ * Establish a
[remainder of patch truncated in the archive]
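For comparison, the PQconninfoOption variant might be used like this (a sketch: PQconnectParams() is the API proposed above and does not exist in libpq; only the keyword and val fields matter here, which is part of what makes this form cumbersome to fill in by hand):

#include <stddef.h>
#include <string.h>
#include <libpq-fe.h>

PGconn *
connect_with_options(void)
{
    PQconninfoOption params[3];

    /* zero everything; the zeroed last element terminates the array */
    memset(params, 0, sizeof(params));
    params[0].keyword = "dbname";
    params[0].val = "mydb";
    params[1].keyword = "user";
    params[1].val = "username";

    return PQconnectParams(params);     /* proposed API, not real libpq */
}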
Re: [HACKERS] First CommitFest: July 15th
On Mon, Jul 06, 2009 at 09:12:55AM -0400, Bruce Momjian wrote: Peter Eisentraut wrote: On Saturday 04 July 2009 00:54:11 Robert Haas wrote: I think what would be more useful is if we could somehow associated metadata with each commit. Right now, for example, the author of a patch is not stored with the patch in any structured way; it's just typed in, usually but not always as the last line of the commit. So you can't easily find out what lines of code a certain person has touched, for example. The sorts of problems that you're talking about seem broadly in the same vein. I have been trying to follow a convention on-and-off to put the author of the patch in the last line of the commit message, like Author: First Last n...@example.com Sure, I can use that format if we decide to be consistent. A tool such as git-cvsimport will actually parse that and put it into the author field of a git commit. (The tool we use, fromcvs, doesn't do that, but it could conceivably be patched easily to do it.) I also found the following resource helpful in crafting commit messages: http://www.tpope.net/node/106 Interesting idea to have a subject line for the commit message. It would help me a lot when putting together the patches section in the PostgreSQL Weekly News. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Synch Rep: direct transfer of WAL file from the primary to the standby
Fujii Masao <masao.fu...@gmail.com> writes:

In order for the primary server (i.e. a normal backend) to read an archived file, restore_command needs to be specified in postgresql.conf as well. In this case, how should we handle restore_command in recovery.conf?

I confess to not having paid much attention to this thread so far, but ... what is the rationale for having such a capability at all? It seems to me to be exposing implementation details that we do not need to expose, as well as making assumptions that we shouldn't make (like "there is exactly one archive" and "the primary server has read access to it").

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] First CommitFest: July 15th
David Fetter wrote: On Mon, Jul 06, 2009 at 09:12:55AM -0400, Bruce Momjian wrote: Peter Eisentraut wrote: On Saturday 04 July 2009 00:54:11 Robert Haas wrote:

I think what would be more useful is if we could somehow associate metadata with each commit. Right now, for example, the author of a patch is not stored with the patch in any structured way; it's just typed in, usually but not always as the last line of the commit message. So you can't easily find out what lines of code a certain person has touched, for example. The sorts of problems that you're talking about seem broadly in the same vein.

I have been trying to follow a convention on-and-off to put the author of the patch in the last line of the commit message, like

Author: First Last <n...@example.com>

Sure, I can use that format if we decide to be consistent.

A tool such as git-cvsimport will actually parse that and put it into the author field of a git commit. (The tool we use, fromcvs, doesn't do that, but it could conceivably be patched easily to do it.) I also found the following resource helpful in crafting commit messages: http://www.tpope.net/node/106

Interesting idea to have a subject line for the commit message.

It would help me a lot when putting together the patches section in the PostgreSQL Weekly News.

OK, someone want to write a wiki page that explains our new preferred commit message format?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tsvector extraction patch
On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig -- PostgreSQL <postg...@cybertec.at> wrote:

hello, this patch has not made it through yesterday, so i am trying to send it again. i made a small patch which i found useful for my personal tasks. it would be nice to see this in 8.5, if not core then maybe contrib. it transforms a tsvector to table format, which is really nice for text processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure this is a good patch'));
  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

This looks very useful! I wonder if providing a weight column would be relatively simple? I think this would present problems with the cast-to-text[] idea that Peter suggests, though.

-- Mike Rylander | VP, Research and Design | Equinox Software, Inc. / The Evergreen Experts | phone: 1-877-OPEN-ILS (673-6457) | email: mi...@esilibrary.com | web: http://www.esilibrary.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER SET DISTINCT vs. Oracle-like DBMS_STATS
Itagaki Takahiro escribió:

It is similar to Oracle's DBMS_STATS package. http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_stats.htm With it,

ALTER TABLE tablename ALTER COLUMN 3rd-column SET DISTINCT 100

could be written as:

INSERT INTO dbms_stats.columns(starelid, staattnum, stadistinct) VALUES ('tablename'::regclass, 3, 100);

Of course ALTER SET DISTINCT is easier to use, so it could be an alias for the above INSERT command.

Why wouldn't you implement this through reloptions? (I ask because the syntax you propose above is awfully similar to what we used for pg_autovacuum, which ended up being replaced by reloptions.)

-- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 12:28 +0100, Greg Stark wrote:

He only needs to handle inserts for the period they're actively being inserted into the index. Once they're in the index he'll find them using the index scan. In other words this is all a proxy for the way btree locks index pages while it looks for a unique key violation.

Exactly, that was my design:

/*
 * We have to find all tuples, even those not visible
 * yet. Other transactions may have inserted many tuples (or
 * the transaction might be a prepared transaction), so there
 * may be some tuples that are not in the shared memory
 * structure and not visible.
 */

I'm a bit concerned about the use of tid. You might have to look at a lot of heap pages to check for conflicts. I suppose they're almost certainly all in shared memory though.

That was my hope. The 8.4 bulk insert code might defeat that to some degree, however. Maybe that could be disabled when inserting into an index with constraints? I didn't think about it before, but the bulk insert buffer ring would affect unique btrees, too, right?

Also, it sounds like you're anticipating the possibility of dead entries in the array but if you do then you need to store the xmin also to protect against a tuple that's been vacuumed and had its line pointer reused since. But I don't see the necessity for that anyways since you can just clean up the entry on abort.

Can you tell me a little more specifically the problem you're worried about? If the tuple has been VACUUMed and removed, then the TID search will either find a tuple, and do a spurious constraint check against it; or not find a tuple, and just move on.

I could have the commit and abort paths clear the entry, which might optimize away some of the TransactionIdIsInProgress() calls for transactions that ended normally. But that didn't strike me as a big cost compared to the index scan.

Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 11:56 +0100, Simon Riggs wrote: I think it will be useful to separate the concepts of a constraint from the concept of an index. It seems possible to have a UNIQUE constraint that doesn't help at all in locating rows, just in proving that the rows are unique. That would be interesting. Do you have a use case? Checking the constraint would surely be slower in a lot of cases. I could imagine different constraint-checking schemes that could be fast against a heap. For instance, if it's greater than the max or less than the min value, that would be cheap to check. That might be an interesting way to handle the constraint for a sequence-generated column, or timestamp column that is always ascending. However, the problem is I don't see a lot of room for a practical use case. In the above situations, you'd almost certainly want indexes anyway: what's the point of a sequence number unless you're going to do lookups? And if you have an ascending timestamp column, I would think that you might do range lookups occasionally (which will be even better because the heap will be clustered). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 07:30 -0700, David Fetter wrote: It would be useful to see a real example of what this can be used for. Constraints like "these intervals can't overlap" would be one. It's handy in calendaring applications, for example. Exactly, you already know my use case ;) My goal is a temporal key, where you can't have overlapping intervals of time, e.g. the constraint "nobody can be in two places at the same time". I think it will be useful to separate the concept of a constraint from the concept of an index. It seems possible to have a UNIQUE constraint that doesn't help at all in locating rows, just in proving that the rows are unique. Interesting idea. Are you thinking of this in terms of things the planner can do once it knows a set is all distinct values, or...? I think that's an orthogonal idea. It's a good idea though; I would like the planner to be smarter about those kinds of things. A simple example is that if a table has a non-partial unique constraint anywhere, then "select * from foo union select * from foo" can be transformed into "select * from foo" (eliminating the expensive union). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, Jul 6, 2009 at 4:57 PM, Jeff Davispg...@j-davis.com wrote: Exactly, you already know my use case ;) My goal is a temporal key, where you can't have overlapping intervals of time, e.g. the constraint nobody can be two places at the same time. Incidentally to handle non-overlapping ranges you don't need GIST, you can actually use a plain btree. Since there are no overlapping ranges the ranges have a complete ordering and you can get that by just sorting by either endpoint. To enforce the constraint you only have to compare with the previous and following element in the btree. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
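To make the neighbours-only check concrete, here is a sketch in plain SQL/PLpgSQL (hypothetical table and column names; INSERT only, half-open intervals, and it deliberately ignores concurrent insertions, which is precisely the hard part the proposed patch addresses):

CREATE TABLE booking (t_start timestamptz NOT NULL, t_end timestamptz NOT NULL);
CREATE INDEX booking_start_idx ON booking (t_start);

CREATE FUNCTION booking_no_overlap() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
    neighbor booking%ROWTYPE;
BEGIN
    -- Existing rows never overlap, so ordering by t_start is a total
    -- order; only the nearest row on each side can conflict with NEW.
    SELECT * INTO neighbor FROM booking
     WHERE t_start <= NEW.t_start
     ORDER BY t_start DESC LIMIT 1;
    IF FOUND AND neighbor.t_end > NEW.t_start THEN
        RAISE EXCEPTION 'overlapping interval';
    END IF;

    SELECT * INTO neighbor FROM booking
     WHERE t_start >= NEW.t_start
     ORDER BY t_start ASC LIMIT 1;
    IF FOUND AND neighbor.t_start < NEW.t_end THEN
        RAISE EXCEPTION 'overlapping interval';
    END IF;

    RETURN NEW;
END $$;

-- BEFORE trigger, so NEW is not yet in the table when we scan it.
CREATE TRIGGER booking_no_overlap BEFORE INSERT ON booking
    FOR EACH ROW EXECUTE PROCEDURE booking_no_overlap();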
[HACKERS] TODO items: Alter view add column
Hi,

This one is still in the TODO (and marked as not done), but I think it is partially done (at least the last entry should be removed), right?

Improve ability to modify views via ALTER TABLE
* Re: idea: storing view source in system catalogs
* modifying views
* Re: patch: Add columns via CREATE OR REPLACE VIEW

-- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.4, One-Time Filter and subquery ( ... FROM function() union all ... )
Sergey Burladyan <eshkin...@gmail.com> writes:

8.4 always executes the functions in this subquery, even if the result does not need them. 8.3 correctly optimizes this and does not execute these functions; here is an example:

create function foo() returns int language sql as
$$ select pg_sleep(5); select 1 $$;

EXPLAIN ANALYZE select * from (select 1 as i, r from foo() r union all select 2, r from foo() r) as x where i = 3;

Hmm. This doesn't actually have anything to do with functions; for example, in 8.3:

regression=# explain select * from (select 1 as i, * from tenk1 a union all select 2, * from tenk1 b) as x where i = 3;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Result  (cost=0.00..916.02 rows=2 width=248)
   ->  Append  (cost=0.00..916.02 rows=2 width=248)
         ->  Result  (cost=0.00..458.00 rows=1 width=244)
               One-Time Filter: false
               ->  Seq Scan on tenk1 a  (cost=0.00..458.00 rows=1 width=244)
         ->  Result  (cost=0.00..458.00 rows=1 width=244)
               One-Time Filter: false
               ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244)
(8 rows)

but in 8.4:

regression=# explain select * from (select 1 as i, * from tenk1 a union all select 2, * from tenk1 b) as x where i = 3;
                             QUERY PLAN
--------------------------------------------------------------------
 Result  (cost=0.00..966.00 rows=100 width=276)
   ->  Append  (cost=0.00..966.00 rows=100 width=276)
         ->  Seq Scan on tenk1 a  (cost=0.00..483.00 rows=50 width=276)
               Filter: (1 = 3)
         ->  Seq Scan on tenk1 b  (cost=0.00..483.00 rows=50 width=276)
               Filter: (2 = 3)
(6 rows)

The reason for the change is that 8.4 is smart enough to flatten UNION ALL subqueries that have non-Var select list items. Which means that when set_append_rel_pathlist pushes the appendrel's "i = 3" restriction down into the member queries, it's pushing the modified restrictions into plain relation scans instead of subquery scans. Before, const-simplification and recognition of the resulting constant-false quals happened when the whole planner was recursively invoked on the subquery, but for plain relation scans we assume all that was already done.

So we have a layer of processing that's getting missed out in examples like these. It was never important before because the old code couldn't produce a constant qual condition that way (since the substituted expression would necessarily be a Var). I'm inclined to think the right fix involves making set_append_rel_pathlist perform const-simplification and check for pseudoconstant quals after it does adjust_appendrel_attrs(). It might take a bit of code refactoring to do that conveniently, though.

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Show method of index
On Tuesday 12 May 2009 08:36:20 Khee Chin wrote:

postgres=# \di idx_foo_bt_fooi;
                                   List of relations
 Schema |      Name       | Type  | Owner | Table | Method |          Definition
--------+-----------------+-------+-------+-------+--------+--------------------------------
 public | idx_foo_bt_fooi | index | rubik | foo   | btree  | md5((a)::text), md5((a || b))
(1 row)

postgres=# \d idx_foo_bt_fooi;
      Index "public.idx_foo_bt_fooi"
     Column      | Type | Definition
-----------------+------+--------------
 pg_expression_1 | text | md5(a::text)
 pg_expression_2 | text | md5(a || b)
btree, for table "public.foo"

I have committed the second part, the additions to the \d output. I think the sentiment was against changing the \di output.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 17:02 +0100, Greg Stark wrote: On Mon, Jul 6, 2009 at 4:57 PM, Jeff Davispg...@j-davis.com wrote: Exactly, you already know my use case ;) My goal is a temporal key, where you can't have overlapping intervals of time, e.g. the constraint nobody can be two places at the same time. Incidentally to handle non-overlapping ranges you don't need GIST, you can actually use a plain btree. Since there are no overlapping ranges the ranges have a complete ordering and you can get that by just sorting by either endpoint. To enforce the constraint you only have to compare with the previous and following element in the btree. What if you have an entire index full of overlapping dead tuples, and a few live ones? How would search work? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 08:50 -0700, Jeff Davis wrote: On Mon, 2009-07-06 at 11:56 +0100, Simon Riggs wrote: I think it will be useful to separate the concepts of a constraint from the concept of an index. It seems possible to have a UNIQUE constraint that doesn't help at all in locating rows, just in proving that the rows are unique. That would be interesting. Do you have a use case? Checking the constraint would surely be slower in a lot of cases. I could imagine different constraint-checking schemes that could be fast against a heap. For instance, if it's greater than the max or less than the min value, that would be cheap to check. That might be an interesting way to handle the constraint for a sequence-generated column, or timestamp column that is always ascending. Yes. However, the problem is I don't see a lot of room for a practical use case. In the above situations, you'd almost certainly want indexes anyway: what's the point of a sequence number unless you're going to do lookups? And if you have an ascending timestamp column, I would think that you might do range lookups occasionally (which will be even better because the heap will be clustered). In many cases, people add unique indexes solely to allow replication to work correctly. The index itself may never be used, especially in high volume applications. How do you handle uniqueness within a stream? Presumably it is possible and useful to have a stream of data that can be guaranteed unique, yet a stream would never be uniquely targeted for lookups because of the volume of data involved. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, Jul 6, 2009 at 6:20 PM, Jeff Davispg...@j-davis.com wrote: On Mon, 2009-07-06 at 17:02 +0100, Greg Stark wrote: On Mon, Jul 6, 2009 at 4:57 PM, Jeff Davispg...@j-davis.com wrote: Exactly, you already know my use case ;) My goal is a temporal key, where you can't have overlapping intervals of time, e.g. the constraint nobody can be two places at the same time. Incidentally to handle non-overlapping ranges you don't need GIST, you can actually use a plain btree. Since there are no overlapping ranges the ranges have a complete ordering and you can get that by just sorting by either endpoint. To enforce the constraint you only have to compare with the previous and following element in the btree. What if you have an entire index full of overlapping dead tuples, and a few live ones? How would search work? I should clarify I didn't mean you could implement it in SQL using Postgres btrees. I just meant that a tree data structure was sufficient, you don't need the power of GIST. It's probably easier to implement it in GIST in Postgres since it's there though. So it would work just like regular btrees, you only consider it a conflict if there's a live value that conflicts. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO items: Alter view add column
Jaime Casanova wrote:

Hi, This one is still in the TODO (and marked as not done), but I think it is partially done (at least the last entry should be removed), right?

Improve ability to modify views via ALTER TABLE
* Re: idea: storing view source in system catalogs
* modifying views
* Re: patch: Add columns via CREATE OR REPLACE VIEW

I think we only completed this for 8.4:

* Allow CREATE OR REPLACE VIEW to add columns to the end of a view (Robert Haas)

so I think those links are all still relevant.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 8.4, One-Time Filter and subquery ( ... FROM function() union all ... )
Sergey Burladyan <eshkin...@gmail.com> writes:

PostgreSQL 8.4.0 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.3-13) 4.3.3, 32-bit

EXPLAIN ANALYZE select * from (select 1 as i, r from foo() r union all select 2, r from foo() r) as x where i = 3;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.53 rows=2 width=36) (actual time=10007.352..10007.352 rows=0 loops=1)
   ->  Append  (cost=0.00..0.53 rows=2 width=36) (actual time=10007.351..10007.351 rows=0 loops=1)
         ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=36) (actual time=5003.342..5003.342 rows=0 loops=1)
               Filter: (1 = 3)
         ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=36) (actual time=5004.004..5004.004 rows=0 loops=1)
               Filter: (2 = 3)
 Total runtime: 10007.464 ms

As of CVS HEAD you get:

                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=1)
   One-Time Filter: false
 Total runtime: 0.179 ms
(3 rows)

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] information_schema.columns changes needed for OLEDB
On Friday 22 May 2009 18:27:01 Konstantin Izmailov wrote:

3. character_octet_length should always be double character_maximum_length (due to the Unicode character size on Windows, which is 2).

I have the attached patch that would make character_octet_length the product of character_maximum_length and the maximum octet length of a single character in the selected server encoding. So for UTF-8, this would be a factor of 4. This doesn't exactly correspond to the behavior that you expect, but I think it's more correct overall anyway.

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 8e145d7..6460862 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -343,10 +343,10 @@
       <entry><type>cardinal_number</type></entry>
       <entry>
        If <literal>data_type</literal> identifies a character type,
-       the maximum possible length in octets (bytes) of a datum (this
-       should not be of concern to
-       <productname>PostgreSQL</productname> users); null for all
-       other data types.
+       the maximum possible length in octets (bytes) of a datum; null
+       for all other data types. The maximum octet length depends on
+       the declared character maximum length (see above) and the
+       server encoding.
       </entry>
      </row>
@@ -947,9 +947,10 @@
       <entry><type>cardinal_number</type></entry>
       <entry>
        If <literal>data_type</literal> identifies a character type,
-       the maximum possible length in octets (bytes) of a datum (this
-       should not be of concern to <productname>PostgreSQL</productname> users); null for all
-       other data types.
+       the maximum possible length in octets (bytes) of a datum; null
+       for all other data types. The maximum octet length depends on
+       the declared character maximum length (see above) and the
+       server encoding.
       </entry>
      </row>
@@ -1688,9 +1689,9 @@
       <entry><type>cardinal_number</type></entry>
       <entry>
        If the domain has a character type, the maximum possible length
-       in octets (bytes) of a datum (this should not be of concern to
-       <productname>PostgreSQL</productname> users); null for all
-       other data types.
+       in octets (bytes) of a datum; null for all other data types.
+       The maximum octet length depends on the declared character
+       maximum length (see above) and the server encoding.
       </entry>
      </row>
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index fe75322..cd6258b 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -102,11 +102,7 @@ CREATE FUNCTION _pg_char_octet_length(typid oid, typmod int4) RETURNS integer
     IMMUTABLE
     RETURNS NULL ON NULL INPUT
     AS
-$$SELECT
-  CASE WHEN $1 IN (25, 1042, 1043) /* text, char, varchar */
-       THEN CAST(2^30 AS integer)
-       ELSE null
-  END$$;
+$$SELECT information_schema._pg_char_max_length($1, $2) * pg_encoding_max_length((SELECT encoding FROM pg_database WHERE datname = current_database()))$$;

 CREATE FUNCTION _pg_numeric_precision(typid oid, typmod int4) RETURNS integer
     LANGUAGE sql
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 753c927..058493c 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -482,6 +482,17 @@ length_in_encoding(PG_FUNCTION_ARGS)
 }

+Datum
+pg_encoding_max_length_sql(PG_FUNCTION_ARGS)
+{
+	int		encoding = PG_GETARG_INT32(0);
+
+	if (PG_VALID_ENCODING(encoding))
+		return pg_wchar_table[encoding].maxmblen;
+	else
+		PG_RETURN_NULL();
+}
+
 /*
  * convert client encoding to server encoding.
  */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0285acd..e194d6a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2278,6 +2278,9 @@
 DESCR("convert encoding name to encoding id");
 DATA(insert OID = 1597 ( pg_encoding_to_char PGNSP PGUID 12 1 0 0 f f f t f s 1 0 19 "23" _null_ _null_ _null_ _null_ PG_encoding_to_char _null_ _null_ _null_ ));
 DESCR("convert encoding id to encoding name");

+DATA(insert OID = 2319 ( pg_encoding_max_length PGNSP PGUID 12 1 0 0 f f f t f i 1 0 23 "23" _null_ _null_ _null_ _null_ pg_encoding_max_length_sql _null_ _null_ _null_ ));
+DESCR("maximum octet length of a character in an encoding");
+
 DATA(insert OID = 1638 ( oidgt PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 "26 26" _null_ _null_ _null_ _null_ oidgt _null_ _null_ _null_ ));
 DESCR("greater-than");
 DATA(insert OID = 1639 ( oidge PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 "26 26" _null_ _null_ _null_ _null_ oidge _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index c1b9393..13fd41a 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -948,6 +948,7 @@
 extern Datum pg_convert(PG_FUNCTION_ARGS);
 extern
[remainder of patch truncated in the archive]
Re: [HACKERS] information_schema.columns changes needed for OLEDB
Peter Eisentraut pete...@gmx.net writes: I have the attached patch that would make character_octet_length the product of character_maximum_length and the maximum octet length of a single character in the selected server encoding. So for UTF-8, this would be factor 4. This doesn't exactly correspond to the behavior that you expect, but I think it's more correct overall anyway. +1, but that new query isn't very schema-safe ... I think it needs a few pg_catalog. qualifications. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
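For concreteness, the schema-safe spelling Tom is asking for would presumably look something like this sketch (not committed wording):

    $$SELECT information_schema._pg_char_max_length($1, $2)
             * pg_catalog.pg_encoding_max_length(
                 (SELECT encoding FROM pg_catalog.pg_database
                   WHERE datname = pg_catalog.current_database()))$$;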
[HACKERS] Maintenance Policy?
Howdy Hackers, Is there a published maintenance policy somewhere? Something that says for how long the project supports minor releases of PostgreSQL. For example, does 7.4 still get bug fixes and minor releases? If not, how does one know when support for a major version has been dropped? Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reduce the memcpy call from SearchCatCache
Atsushi Ogawa a_og...@hi-ho.ne.jp writes: The attached patch reduces the memcpy calls from SearchCatCache and SearchCatCacheList. The patch directly uses cache->cc_skey when looking up the hash table. How much did you test this patch? I'm fairly sure it will break things. There are cases where cache lookups happen recursively. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Small foreign key error message improvement
I recently had a puzzler, which involved this sort of accidental parser error:

CREATE TABLE foo (a int, b text, PRIMARY KEY (a, b));
CREATE TABLE bar (x int, y text, FOREIGN KEY (q, r) REFERENCES foo (m, n));
ERROR:  column "q" referenced in foreign key constraint does not exist

versus

CREATE TABLE bar (x int, y text, FOREIGN KEY (x, y) REFERENCES foo (m, n));
ERROR:  column "m" referenced in foreign key constraint does not exist

This example has been simplified for clarity, but the original case involved a bunch of "id" columns everywhere. What's confusing is that "q" is not actually referenced by the foreign key constraint, but referenced in the statement that attempts to define the foreign key constraint, so I was looking on the wrong side of the constraint there. Attached is a small patch that separates those error messages into:

ERROR:  column "q" specified as a constrained column in foreign key constraint does not exist
ERROR:  column "m" specified as a referenced column in foreign key constraint does not exist

Details may be debatable. Comments?

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7afe6e7..a253fd8 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -238,7 +238,7 @@ static void AlterSeqNamespaces(Relation classRel, Relation rel,
 				   Oid oldNspOid, Oid newNspOid, const char *newNspName);
 static int transformColumnNameList(Oid relId, List *colList,
-						int16 *attnums, Oid *atttypids);
+						int16 *attnums, Oid *atttypids, bool is_source);
 static int transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
 						  List **attnamelist,
 						  int16 *attnums, Oid *atttypids,
@@ -4641,7 +4641,7 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 	numfks = transformColumnNameList(RelationGetRelid(rel),
 									 fkconstraint->fk_attrs,
-									 fkattnum, fktypoid);
+									 fkattnum, fktypoid, true);
 
 	/*
 	 * If the attribute list for the referenced table was omitted, lookup the
@@ -4660,7 +4660,7 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
 	{
 		numpks = transformColumnNameList(RelationGetRelid(pkrel),
 										 fkconstraint->pk_attrs,
-										 pkattnum, pktypoid);
+										 pkattnum, pktypoid, false);
 
 		/* Look for an index matching the column list */
 		indexOid = transformFkeyCheckAttrs(pkrel, numpks, pkattnum,
 										   opclasses);
@@ -4855,7 +4855,7 @@
  */
 static int
 transformColumnNameList(Oid relId, List *colList,
-						int16 *attnums, Oid *atttypids)
+						int16 *attnums, Oid *atttypids, bool is_source)
 {
 	ListCell   *l;
 	int			attnum;
@@ -4868,10 +4868,18 @@ transformColumnNameList(Oid relId, List *colList,
 		atttuple = SearchSysCacheAttName(relId, attname);
 		if (!HeapTupleIsValid(atttuple))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_COLUMN),
-					 errmsg("column \"%s\" referenced in foreign key constraint does not exist",
-							attname)));
+		{
+			if (is_source)
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_COLUMN),
+						 errmsg("column \"%s\" specified as a constrained column in foreign key constraint does not exist",
+								attname)));
+			else
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_COLUMN),
+						 errmsg("column \"%s\" specified as a referenced column in foreign key constraint does not exist",
+								attname)));
+		}
 		if (attnum >= INDEX_MAX_KEYS)
 			ereport(ERROR,
 					(errcode(ERRCODE_TOO_MANY_COLUMNS),

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgxs and make check message
Isn't it a bad idea that this from pgxs.mk does not return a non-zero status?

check:
	@echo 'make check' is not supported.
	@echo Do 'make install', then 'make installcheck' instead.

Or is something relying on a nonexistent test suite passing successfully? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] please, actualize czech link on international sites list
Hello we changed url from www.pgsql.cz to www.postgres.cz. Please, actualise link on http://www.postgresql.org/community/international thank you Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] please, actualize czech link on international sites list
Pavel Stehule wrote: Hello we changed url from www.pgsql.cz to www.postgres.cz. Please, actualise link on http://www.postgresql.org/community/international done Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Small foreign key error message improvement
Peter Eisentraut pete...@gmx.net writes: I recently had a puzzler, which involved this sort of accidental parser error: CREATE TABLE foo (a int, b text, PRIMARY KEY (a, b)); CREATE TABLE bar (x int, y text, FOREIGN KEY (q, r) REFERENCES foo (m, n)); ERROR: column q referenced in foreign key constraint does not exist versus CREATE TABLE bar (x int, y text, FOREIGN KEY (x, y) REFERENCES foo (m, n)); ERROR: column m referenced in foreign key constraint does not exist This example has been simplified for clarity, but the original case involved a bunch of id columns everywhere. What's confusing is that q is not actually referenced by the foreign key constraint, but referenced in the statement that attempts to define the foreign key constraint, so I was looking on the wrong side of the constraint there. Attached is a small patch that separates those error messages into: It seems to me that the right fix here is not so much to tweak the message wording as to put in an error location cursor. In more complicated cases (eg, multiple FOREIGN KEY clauses) the suggested wording change wouldn't help much anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
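For illustration only (not output from any committed patch), an error cursor would let the second failure in Peter's example be reported along these lines, pointing directly at the offending side:

    ERROR:  column "m" referenced in foreign key constraint does not exist
    LINE 1: ...y text, FOREIGN KEY (x, y) REFERENCES foo (m, n));
                                                          ^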
Re: [HACKERS] TODO items: Alter view add column
--On Montag, Juli 06, 2009 13:51:36 -0400 Bruce Momjian br...@momjian.us wrote: I think we only completed this for 8.4: * Allow CREATE OR REPLACE VIEW to add columns to the end of a view (Robert Haas) Yes, this is done, but we're still not able to drop or change column names of a view. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] please, actualize czech link on international sites list
2009/7/6 Stefan Kaltenbrunner ste...@kaltenbrunner.cc: Pavel Stehule wrote: Hello we changed url from www.pgsql.cz to www.postgres.cz. Please, actualise link on http://www.postgresql.org/community/international done thank you Pavel Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Mon, 2009-07-06 at 18:27 +0100, Simon Riggs wrote: In many cases, people add unique indexes solely to allow replication to work correctly. The index itself may never be used, especially in high volume applications. Interesting. Maybe we should at least try to leave room for this feature to be added later. I agree that, from a theoretical perspective, requiring a UNIQUE constraint to use an index is wrong. For one thing, you can't ensure the uniqueness without defining some total order (although you can define an arbitrary total order for cases with no meaningful total order). How do you handle uniqueness within a stream? Presumably it is possible and useful to have a stream of data that can be guaranteed unique, yet a stream would never be uniquely targeted for lookups because of the volume of data involved. [ Simon is asking me because I work for Truviso, but my response is not officially from Truviso ] There are a few cases worth mentioning here. First, if you have a stream that's backed by a table, you can use a table constraint. Second, you might choose to have an in-order constraint (not necessary, the system can fix out-of-order data), which could be a unique constraint that's very cheap to test. Additionally, this is not strictly a constraint, but if you have downstream operators, like COUNT(DISTINCT...), that can be seen as being similar to a constraint. These will often be over a limited span of time, say, a minute or an hour, and we can keep the necessary state. If there are a huge number of distinct values there, then it's a challenge to avoid keeping a lot of state. There are a few other specialized methods that we can use for specific use-cases. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO items: Alter view add column
On Mon, Jul 6, 2009 at 3:15 PM, Bernd Helmlemaili...@oopsware.de wrote: --On Montag, Juli 06, 2009 13:51:36 -0400 Bruce Momjian br...@momjian.us wrote: I think we only completed this for 8.4: * Allow CREATE OR REPLACE VIEW to add columns to the end of a view (Robert Haas) Yes, this is done, but we're still not able to drop or change column names of a view. Then the 3rd point on that item should be removed or reworded... -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO items: Alter view add column
Jaime Casanova wrote: On Mon, Jul 6, 2009 at 3:15 PM, Bernd Helmlemaili...@oopsware.de wrote: --On Montag, Juli 06, 2009 13:51:36 -0400 Bruce Momjian br...@momjian.us wrote: I think we only completed this for 8.4: * Allow CREATE OR REPLACE VIEW to add columns to the end of a view (Robert Haas) Yes, this is done, but we're still not able to drop or change column names of a view. Then the 3rd point on that item should be removed or reworded... The problem is that the third item is an email subject, not text we can typically modify. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO items: Alter view add column
Jaime Casanova wrote: On Mon, Jul 6, 2009 at 3:15 PM, Bernd Helmlemaili...@oopsware.de wrote: --On Montag, Juli 06, 2009 13:51:36 -0400 Bruce Momjian br...@momjian.us wrote: I think we only completed this for 8.4: * Allow CREATE OR REPLACE VIEW to add columns to the end of a view (Robert Haas) Yes, this is done, but we're still not able to drop or change column names of a view. Then the 3rd point on that item should be removed or reworded... I think the main text of the item should be reworded or made more verbose, to make it more explicit what it is about. I think vague descriptions of TODO items are a bad idea in general because you can never close them. If an item is too big, perhaps it should be split into multiple, more manageable items. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO items: Alter view add column
Bruce Momjian br...@momjian.us wrote: The problem is that third item is an email subject, not text we can typically modify. Is it really more important that the line in the TODO list reflect the subject line of the referenced email than that it accurately describe the work we want done? If so, perhaps someone should start a new thread which references back to the old one from which work is already done, committed, and released. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join optimization for inheritance tables
Tom Lane wrote: Nedyalko Borisov nedya...@asterdata.com writes: In summary, we are making two suggestions: 1. Extend the optimizer to consider joins between child tables when hierarchies are joined together. We already handle this for the case where the join is nestloop with inner index scan, and I'm not convinced that there's any real gain to be had for other join types. From an OLTP perspective this proposal won't introduce any benefits, due to the fact that queries operate on small parts of the data, so we can add a flag that will disable/enable the inherited join. However, OLAP queries process a significant amount of data, and to leverage this fact DB admins partition the data. We think that the optimizer should take advantage of this partitioning and consider plans where the joins are performed on small parts of the data. For example, a typically observed scenario is that the optimizer chooses a Merge Join for two partitioned tables, like the plan below:

Merge Cond: (table1.id = table2.id)
  ->  Sort
        Sort Key: id
        ->  Append
              ->  Seq Scan on table1_part1
              ->  Seq Scan on table1_part2
              ...
              ->  Seq Scan on table1_partN
  ->  Sort
        Sort Key: id
        ->  Append
              ->  Seq Scan on table2_part1
              ->  Seq Scan on table2_part2
              ...
              ->  Seq Scan on table2_partM

This plan ignores the fact that there are indexes on the table2 partitions and that pairwise partition joins (index nested loop or hash join) will be faster than scanning all the partitions and sorting them. To see the effect of the pairwise joins we performed some experiments with the initial implementation. The experiments consisted of joining two partitioned tables where each of the tables has around 200 children and two int columns, id and count. We generated data of different sizes and measured the execution times; here are the results:

0.5 million records - regular plan  0.69s - modified plan 0.51s
1 million records   - regular plan  2.9s  - modified plan 1s
2.5 million records - regular plan  4.17s - modified plan 2.28s
5 million records   - regular plan 11.25s - modified plan 4.46s

Increasing the data size or adding more columns will increase the difference between the current plan that the database picks and the proposed modification of the plans. Thus, we think that it might be useful if the optimizer considers plans with inherited joins. 2. Add the Empty Check Constraint, which would enforce that a particular table is to remain empty. The trouble with that is that a constraint that doesn't propagate to its child tables is a weird beast that I'd just as soon not invent. We are currently thinking about inventing an explicit notion of partitioned tables. If we had that, it would be reasonable to have a special kind of parent table for a partitioned set and refuse to allow any data in that relation. But I'm not excited about contorting the general constraint mechanism in the way that would be necessary to express this as a constraint. OK, implementing a special abstract/parent table would make more sense. In this line of thought, could you elaborate on the explicit notion of partitioned tables or give us some references? Thanks, Nedyalko Borisov and Herodotos Herodotou regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
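For readers who want to reproduce the scenario, here is a cut-down sketch of the kind of schema being described, with two partitions instead of ~200 (all names and boundaries are illustrative):

    CREATE TABLE table1 (id int, count int);
    CREATE TABLE table1_part1 (CHECK (id < 100000)) INHERITS (table1);
    CREATE TABLE table1_part2 (CHECK (id >= 100000 AND id < 200000)) INHERITS (table1);

    CREATE TABLE table2 (id int, count int);
    CREATE TABLE table2_part1 (CHECK (id < 100000)) INHERITS (table2);
    CREATE TABLE table2_part2 (CHECK (id >= 100000 AND id < 200000)) INHERITS (table2);
    CREATE INDEX table2_part1_id_idx ON table2_part1 (id);
    CREATE INDEX table2_part2_id_idx ON table2_part2 (id);

    -- Produces the sort-and-merge plan quoted above rather than
    -- pairwise per-partition joins:
    EXPLAIN SELECT * FROM table1 JOIN table2 USING (id);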
Re: [HACKERS] Join optimization for inheritance tables
On Mon, Jul 6, 2009 at 10:23 PM, Nedyalko Borisovnedya...@asterdata.com wrote: For example, a typically observed scenario is that the optimizer chooses a Merge Join for two partitioned tables, like the plan below:

Merge Cond: (table1.id = table2.id)
  ->  Sort
        Sort Key: id
        ->  Append
              ->  Seq Scan on table1_part1
              ->  Seq Scan on table1_part2
              ...
              ->  Seq Scan on table1_partN
  ->  Sort
        Sort Key: id
        ->  Append
              ->  Seq Scan on table2_part1
              ->  Seq Scan on table2_part2
              ...
              ->  Seq Scan on table2_partM

This plan ignores the fact that there are indexes on the table2 partitions and that pairwise partition joins (index nested loop or hash join) will be faster than scanning all the partitions and sorting them.

To some degree my merge-append patch would mitigate this case. It would allow the use of indexes on some or all the partitions to avoid the sorts. However it would still force all the partitions to be appended on each side and then merged. If we could match up all the partitions then I think this plan would be faster with the Append on top and separate merge joins for each pair of partitions. Aside from skipping the cpu cost of the merge-append I think it would win for a few other reasons as well. Each join would be able to come up with much better statistics which would enable it to pick a better join when one is available. Even if the planner still picks a merge join it would be much more likely to finish early and skip the remainder of a partition on one side or the other.

OK, implementing a special abstract/parent table would make more sense.

I had in mind to tackle this in a bit of a roundabout way. If we mark the parent table read-only and then notice that all tuples (all 0 of them) in the table are frozen, then we can discard that table from the plans. Since changing the read-only attribute would have to be treated as a DDL operation, which would invalidate any cached plans, we can trust that it won't change as long as the plan lives, so no new tuples can be inserted. The reason I wanted to take such a roundabout route instead of having an abstract or empty property is that I wanted to generalize this. Once we know a table is read-only, there are lots of properties we could find useful in planning aside from emptiness. We could have statistics like the minimum and maximum value for a column, which the planner would be able to trust, and exclude partitions without having to explicitly declare constraints on every column. This is all just my musings, not any kind of consensus. Does it make sense to others or is it too baroque when a simple abstract flag would do? -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER SET DISTINCT vs. Oracle-like DBMS_STATS
Euler Taveira de Oliveira eu...@timbira.com wrote:

INSERT INTO dbms_stats.columns(starelid, staattnum, stadistinct)
    VALUES ('tablename'::regclass, 3, 100);

Why wouldn't you implement this through reloptions? Because it is column-based and not table-based? In this case, we need to store an array value like {attnum, stadistinct}. If it is not ugly in your POV, +1 for this approach.

Yes, column-based storage is needed. However, when we drop tables, dangling stat settings might remain. I want core support for the module, for example a TRIGGER ON DROP TABLE or some drop-relation hooks. Another approach might be to add pg_attribute.attoptions for generic column-based options, like pg_class.reloptions. Which approach is better, or something else? Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
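If pg_attribute.attoptions were added, usage might look something like the following sketch (purely hypothetical syntax and option name; nothing like this exists at this point):

    -- hypothetical: store a per-column n_distinct override in attoptions
    ALTER TABLE tablename ALTER COLUMN col SET (n_distinct = 100);

Settings stored this way would be dropped along with the column or table, avoiding the dangling-settings problem described above.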
Re: [HACKERS] Re: Synch Rep: direct transfer of WAL file from the primary to the standby
Hi, Thanks for the comment! On Tue, Jul 7, 2009 at 12:16 AM, Tom Lanet...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: In order for the primary server (ie. a normal backend) to read an archived file, restore_command needs to be specified in also postgresql.conf. In this case, how should we handle restore_command in recovery.conf? I confess to not having paid much attention to this thread so far, but ... what is the rationale for having such a capability at all?

If the XLOG files which are required for recovery exist only in the primary server, the standby server has to read them in some way. For example, when the latest XLOG file of the primary server is 09 and the standby server has only 01, the missing files (02-08) have to be read for recovery by the standby server. In this case, the XLOG records in 09 or later are shipped to the standby server in real time by the synchronous replication feature.

The problem I'd like to solve is how to make the standby server read the XLOG files (XLOG file, backup history file and timeline history file) which exist only in the primary server. In the previous patch, we had to manually copy those missing files to the archive of the standby server or use the warm-standby mechanism. This would decrease the usability of synchronous replication. So, I proposed one of the solutions which makes the standby server read those missing files automatically: introducing a new function pg_read_xlogfile() which reads the specified XLOG file. Is this solution in the right direction? Do you have another reasonable solution?

It seems to me to be exposing implementation details that we do not need to expose, as well as making assumptions that we shouldn't make (like there is exactly one archive and the primary server has read access to it).

You mean that one archive is shared between two servers? If so, no. I attached a picture of the environment that I assume. Please feel free to comment. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pg_read_xlogfile.pdf Description: Adobe PDF document -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
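To make the intended workflow concrete, usage on the standby might look like this sketch, based on the signature proposed earlier in the thread (the segment name is made up):

    -- Fetch a WAL segment that exists only on the primary,
    -- trying the primary's archive first (restore = true):
    SELECT pg_read_xlogfile('000000010000000000000002', true);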
Re: [HACKERS] Patch for automating partitions in PostgreSQL 8.4 Beta 2
On Mon, Jun 8, 2009 at 9:02 AM, Kedar Potdarkedar.pot...@gmail.com wrote: Hi, PFA patch, readme for automating partitions in PostgreSQL 8.4 Beta 2 and testcases. if you are still working on this, can you please update the patch to cvs head? -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] *_collapse_limit, geqo_threshold
I think we should try to do something about join_collapse_limit, from_collapse_limit, and geqo_threshold for 8.5. http://archives.postgresql.org/message-id/9134.1243289...@sss.pgh.pa.us http://archives.postgresql.org/message-id/603c8f070905251800g5b86d2dav26eca7f417d15...@mail.gmail.com I'm still of the opinion that join_collapse_limit is a loaded foot-gun, because I don't think that users will expect that a join specified this way:

SELECT ... FROM a JOIN b ON Pab JOIN c ON Pac JOIN d ON Pad ...

will behave differently than one specified this way:

SELECT ... FROM a, b, c, d WHERE Pab AND Pac AND Pad ...

The whole purpose of join_collapse_limit in the first instance is to prevent planning time from getting out of control, but I don't see how we can view it as a very effective safety valve when it depends so heavily on which syntax is used. If the planning time for an N-way join is excessive, then we're going to have a problem with excessive planning time whenever the second syntax is selected, and I don't see any reason to believe that users see the second syntax as dangerous in terms of planning time but the first syntax as safer.

One possibility would be to remove join_collapse_limit entirely, but that would eliminate one possibly-useful piece of functionality that it currently enables: namely, the ability to exactly specify the join order by setting join_collapse_limit to 1. So one possibility would be to rename the variable something like explicit_join_order and make it a Boolean; another possibility would be to change the default value to INT_MAX. The approach I've taken in the attached patch is to make 0 mean unlimited and make that the default value. I don't have a strong feeling about whether that's better than the other two options, although it seems cleaner to me or I'd not have written the patch that way. We could also consider adopting this same approach for from_collapse_limit, though for some reason that behavior seems marginally less pathological to me.

At any rate, regardless of whether this patch (or one of the other approaches mentioned above) is adopted for 8.5, I think we should raise the default values for whatever is left. The defaults basically haven't been modified since they were put in, and my experience is that even queries with 10 to 15 joins perform acceptably for OLTP workloads, which are exactly the workloads where query planning time is most likely to be an issue. So I would propose raising each of the limits by 4 (to 12 for from_collapse_limit and join_collapse_limit if we don't unlimit them entirely, and to 16 for geqo_threshold). I'm interested in hearing from anyone who has practical experience with tuning these variables, or any ideas on what we should test to get a better idea as to how to set them.

Thanks,

...Robert

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SELECT * FROM parent WHERE key = 2400;
*** 2288,2296 ****
       </para>
 
       <para>
!       By default, this variable is set the same as
!       <varname>from_collapse_limit</varname>, which is appropriate
!       for most uses.  Setting it to 1 prevents any reordering of explicit
        <literal>JOIN</>s.  Thus, the explicit join order specified in the
        query will be the actual order in which the relations are joined.
        The query planner does not always choose
--- 2288,2295 ----
       </para>
 
       <para>
!       By default, this variable is set to <literal>0</>, which always
!       allows rewriting.  Setting it to 1 prevents any reordering of explicit
        <literal>JOIN</>s.  Thus, the explicit join order specified in the
        query will be the actual order in which the relations are joined.
        The query planner does not always choose

*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
*** 477,483 ****
  		/* force the join order exactly at this node */
  		joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
  	}
! 	else if (list_length(leftjoinlist) + list_length(rightjoinlist) <=
  			 join_collapse_limit)
  	{
  		/* OK to combine subproblems */
--- 477,484 ----
  		/* force the join order exactly at this node */
  		joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
  	}
! 	else if (join_collapse_limit == 0
! 			 || list_length(leftjoinlist) + list_length(rightjoinlist) <=
  			 join_collapse_limit)
  	{
  		/* OK to combine subproblems */

*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_int ConfigureNamesInt[] =
*** 1275,1284 ****
  						 "constructs are not flattened."),
  			gettext_noop("The planner will flatten explicit JOIN "
  						 "constructs into lists of FROM items whenever a "
! 						 "list of no more than this many items would result.")
  		},
  		&join_collapse_limit,
! 		8, 1, INT_MAX,
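To see the asymmetry Robert describes in action, one can compare the two spellings directly (a sketch; the table names are illustrative):

    -- Explicit JOIN syntax: with join_collapse_limit = 1 the planner
    -- keeps the written join order a-b-c-d.
    SET join_collapse_limit = 1;
    EXPLAIN SELECT *
      FROM a JOIN b ON a.id = b.a_id
             JOIN c ON a.id = c.a_id
             JOIN d ON a.id = d.a_id;

    -- Comma-separated FROM syntax: join_collapse_limit does not apply;
    -- the search space is governed by from_collapse_limit/geqo_threshold.
    EXPLAIN SELECT *
      FROM a, b, c, d
     WHERE a.id = b.a_id AND a.id = c.a_id AND a.id = d.a_id;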