Re: [HACKERS] fillfactor using WITH syntax
Simon Riggs [EMAIL PROTECTED] wrote:
> Itagaki had suggested adding options to heaps also, so clearly we'd need to
> add that to pg_class, rather than pg_index in that case.

Yes, I want to add options to tables, not only to indexes. There is a pg_index
for indexes, but there is no pg_table for tables, so I added the options to
pg_class.

> Why not implement an array of option parameters on pg_class, so both heaps
> and indexes can be given additional parameters? That way you wouldn't need a
> specific relfillfactor attribute. That would allow us to keep CREATE TABLE
> free of additional keywords also.

OK, I'll add an options array to pg_class instead of the fixed field for
fillfactor, modeled on the aclitem array.

---
ITAGAKI Takahiro
NTT Open Source Software Center
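[For readers following the thread: a rough sketch of what the WITH-options
syntax under discussion might look like at the SQL level. The fillfactor
option name and the WITH (...) spelling are the proposal being debated, not a
committed feature, and the values are arbitrary:]

    -- hypothetical usage of the proposed per-relation options
    CREATE TABLE accounts (aid int PRIMARY KEY, balance int)
        WITH (fillfactor = 70);

    CREATE INDEX accounts_balance_idx ON accounts (balance)
        WITH (fillfactor = 90);

Storing these as a generic options array in pg_class, rather than a dedicated
relfillfactor column, would let both heaps and indexes carry arbitrary
parameters without new keywords.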
Re: [HACKERS] Compression and on-disk sorting
On Fri, May 26, 2006 at 09:21:44PM +0100, Simon Riggs wrote:
> On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote:
> > But the meat is:
> >
> >                                  -- work_mem --
> >                           Scale    2000    20000
> >   not compressed            150   805.7    797.7
> >   not compressed           3000   17820    17436
> >   compressed                150   371.4    400.1
> >   compressed               3000    8152     8537
> >   compressed, no headers   3000    7325     7876

Since Tom has committed the header-removing patch, we need to test
"not compressed, no headers" vs. "compressed, no headers":

                                 -- work_mem --
                          Scale    2000    20000
  not compressed            150   805.7    797.7
  not compressed           3000   17820    17436
  not compressed, no hdr   3000   14470    14507
  compressed                150   371.4    400.1
  compressed               3000    8152     8537
  compressed, no headers   3000    7325     7876

> There is a noticeable rise in sort time with increasing work_mem, but that
> needs to be offset against the benefit that in general comes from using a
> large heap for the sort. With the data you're using that always looks like
> a loss, but that isn't true with all input data orderings.

I thought that a change had been made to the on-disk sort specifically to
eliminate the problem of more work_mem making the sort take longer. I also
thought that there was something about that fix that was tunable.
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
Re: [HACKERS] Compression and on-disk sorting
On Wed, 2006-06-07 at 01:33 -0500, Jim C. Nasby wrote:
> On Fri, May 26, 2006 at 09:21:44PM +0100, Simon Riggs wrote:
> > On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote:
> > > But the meat is:
> > >
> > >                                  -- work_mem --
> > >                           Scale    2000    20000
> > >   not compressed            150   805.7    797.7
> > >   not compressed           3000   17820    17436
> > >   compressed                150   371.4    400.1
> > >   compressed               3000    8152     8537
> > >   compressed, no headers   3000    7325     7876
>
> Since Tom has committed the header-removing patch, we need to test
> "not compressed, no headers" vs. "compressed, no headers":
>
>                                  -- work_mem --
>                           Scale    2000    20000
>   not compressed            150   805.7    797.7
>   not compressed           3000   17820    17436
>   not compressed, no hdr   3000   14470    14507
>   compressed                150   371.4    400.1
>   compressed               3000    8152     8537
>   compressed, no headers   3000    7325     7876

That looks fairly conclusive. Can we try tests with data in reverse order, so
we use more tapes? We're still using a single tape, so the additional
overhead of compression doesn't cause any pain.

> > There is a noticeable rise in sort time with increasing work_mem, but
> > that needs to be offset against the benefit that in general comes from
> > using a large heap for the sort. With the data you're using that always
> > looks like a loss, but that isn't true with all input data orderings.
>
> I thought that a change had been made to the on-disk sort specifically to
> eliminate the problem of more work_mem making the sort take longer.

There was a severe non-optimal piece of code... but the general effect still
exists. As does the effect that having higher work_mem produces fewer runs,
which speeds up the final stages of the sort.

> I also thought that there was something about that fix that was tunable.

Increasing work_mem makes *this* test case take longer.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
[HACKERS] self-deadlock at FATAL exit of bootstrap process on read error
I encountered a situation where the server can't shut down when a bootstrap
process does ReadBuffer() but gets a read error. I guess the problem goes
like this: the bootstrap process fails to read at this line:

    smgrread(reln->rd_smgr, blockNum, (char *) bufBlock);

So it does a FATAL exit and shmem_exit() is called:

    while (--on_shmem_exit_index >= 0)
        (*on_shmem_exit_list[on_shmem_exit_index].function) (code,
                on_shmem_exit_list[on_shmem_exit_index].arg);

where

    on_shmem_exit_list[0] = DummyProcKill
    on_shmem_exit_list[1] = AtProcExit_Buffers

The callbacks are invoked in stack (LIFO) order, so AtProcExit_Buffers() will
call AbortBufferIO(), which then blocks on the io_in_progress_lock that the
process itself still holds. The comment there assumes that "since
LWLockReleaseAll has already been called, we're not holding the buffer's
io_in_progress_lock" -- but that is not the case here, because no transaction
abort has run. There may be other similar problems for the bootstrap process,
so I am not sure of the best fix for this...

Regards,
Qingqing
Re: [HACKERS] How to avoid transaction ID wrap
Koichi Suzuki wrote:
> I've once proposed a patch for a 64-bit transaction ID, but this causes
> some overhead for each tuple (XMIN and XMAX). Pgbench with 64-bit
> transaction IDs pays about a couple of percent in performance. If a 64-bit
> transaction ID is a reasonable fix, I've already posted this patch. Anyone
> can apply this to later versions.

Did you check performance on 32-bit and 64-bit systems, and with a 64-bit
binary build of PGSQL? I think that a 64-bit architecture is not a problem
today, and a 64-bit ID should increase the scalability of Postgres.

Zdenek
Re: [HACKERS] How to avoid transaction ID wrap
Koichi Suzuki wrote:
> I've once proposed a patch for a 64-bit transaction ID, but this causes
> some overhead for each tuple (XMIN and XMAX). Pgbench with 64-bit
> transaction IDs pays about a couple of percent in performance. If a 64-bit
> transaction ID is a reasonable fix, I've already posted this patch. Anyone
> can apply this to later versions.

Be careful: the pg_multixact stuff assumes that a MultiXactId is the same
size as a TransactionId, so you have to change that too. I don't recall
offhand if it was defined in a way that would make it just work
automatically. (You'd also have to be careful about it not overrunning the
SLRU files when it's close to the end of 'em.)

--
Alvaro Herrera      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Tue, 2006-06-06 at 16:50 -0400, Tom Lane wrote:
> I have a theory about this, and it's not pleasant at all. What I think is
> that we have a Heisenberg problem here: the act of invoking gettimeofday()
> actually changes what is measured. If this theory is correct, then the
> entire notion of EXPLAIN ANALYZE sampling has just crashed and burned. We
> can't ship a measurement tool that is accurate on some platforms and not
> others.

Regrettably, I would agree, and so conclude that we shouldn't pursue the
sampling idea further. Heisenbugs suck time like no other. Interesting,
though.

That leaves us with a number of possibilities:

0. Do nothing.
1. An option to skip timing altogether in EXPLAIN ANALYZE.
2. An option to produce a partial execution only, to locate problem areas.

Any others? Option 2 would be harder to interpret, but still useful --
originally discussed in a current thread on -perform. Option 1 wouldn't be as
useful as the original sampling idea, but if that's no longer on the table
I'd revert to Option 1 as the best choice for further work.

Do we agree the idea can't go further? What next?

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] How to avoid transaction ID wrap
On Wed, Jun 07, 2006 at 01:48:50PM +0200, Zdenek Kotala wrote:
> Koichi Suzuki wrote:
> > I've once proposed a patch for a 64-bit transaction ID, but this causes
> > some overhead for each tuple (XMIN and XMAX). Pgbench with 64-bit
> > transaction IDs pays about a couple of percent in performance. If a
> > 64-bit transaction ID is a reasonable fix, I've already posted this
> > patch. Anyone can apply this to later versions.
>
> Did you check performance on 32-bit and 64-bit systems, and with a 64-bit
> binary build of PGSQL? I think that a 64-bit architecture is not a problem
> today, and a 64-bit ID should increase the scalability of Postgres.

I doubt performance is the issue directly. Increasing the size of the
transaction counter would increase the size of narrow tables by maybe 30%:
widening xmin and xmax from 4 to 8 bytes each adds 8 bytes to every tuple,
which is a large fraction of a row that is only a few dozen bytes to begin
with. That's 30% more disk space and 30% more memory usage in some places.
Maybe at some point it'll be worth it, but right now I don't think those
commodities are cheap enough to use like this for fairly marginal benefits.
Besides, memory bandwidth hasn't grown anywhere near as fast as memory
capacity, so it's always a good idea to use as little memory as possible.

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to
litigate.
Re: [HACKERS] Duplicate rows sneaking in despite PRIMARY KEY / UNIQUE constraint
Travis,

We have used PostgreSQL 7.4, 8.0, and 8.1 with DSPAM and have never had a
single problem like you are describing. In the past on this mailing list,
these sorts of issues have in some cases been caused by hardware problems on
the DB server. Good luck with tracking it down.

Ken

On Tue, Jun 06, 2006 at 05:45:26AM -0400, Travis Cross wrote:
> I have a table that I am using to store email token data for DSPAM. I'm
> noticing that a handful (4-16) of rows with duplicate columns (uid, token)
> are sneaking into the table every day despite the primary key constraint.
> The server currently processes a few thousand emails per day, and this
> particular table currently has about 4.5 million rows in it.
>
> I feel as though I must be missing something here, because I have always
> strongly assumed that postgresql prevents this sort of chaos from
> happening by default.
>
> When these duplicate pairs make their way into the table, all havoc breaks
> loose ;) The rows with the duplicate pairs seem to become land mines. A
> postgresql process handling a query that subsequently runs across one of
> these rows dies, taking down the DSPAM daemon with it, and sometimes
> corrupting the postgresql shared memory enough that the postmaster has to
> shut down the other processes and restart everything anew [1].
>
> I am usually able to clean things up by running the following, but once or
> twice I've had to drop the unique constraint before postgresql would
> process the request without choking:
>
>     delete from dspam_token_data
>     where row(uid,token) in
>         (select uid,token from dspam_token_data
>          group by uid,token having count(*) > 1);
>
> (I don't worry about preserving one of the duplicates here.)
>
> I'm running postgresql-8.1.3. Here is the table in question:
>
>     CREATE TABLE dspam.dspam_token_data
>     (
>       uid int4 NOT NULL,
>       token int8 NOT NULL,
>       spam_hits int4,
>       innocent_hits int4,
>       last_hit date,
>       CONSTRAINT dspam_token_data_pkey PRIMARY KEY (uid, token)
>     )
>     WITHOUT OIDS; [2]
>
> What steps can I take to fix this? Is there any additional information I
> can provide? I've cleaned the table many times now. I then run VACUUM
> ANALYZE.
>
> My next step will probably be to hack the DSPAM sources to make the
> application more careful about not trying to insert rows that would
> violate the unique constraint. Even still, it seems that would only reduce
> the frequency of these occurrences, not eliminate them completely.
>
> Thanks! Cheers,
>
> -- Travis
>
> Notes:
> [1] A condensed log file showing exactly what happens here is attached.
> [2] Previously, the table lacked a primary key and instead used a unique
> constraint and index. This yielded the exact same results I am currently
> seeing with the two-column primary key, as above. The old table schema was:
>
>     CREATE TABLE dspam.dspam_token_data
>     (
>       uid int4,
>       token int8,
>       spam_hits int4,
>       innocent_hits int4,
>       last_hit date,
>       CONSTRAINT dspam_token_data_uid_key UNIQUE (uid, token)
>     )
>     WITHOUT OIDS;
>
> Jun 5 17:58:14 shadow postgres[28775]: [21-1] PANIC: right sibling's left-link doesn't match
> Jun 5 17:58:14 shadow postgres[28775]: [21-2] STATEMENT: PREPARE dspam_update_plan (bigint) AS UPDATE dspam_token_data SET last_hit = CURRENT_DATE, innocent_hits =
> Jun 5 17:58:14 shadow postgres[28775]: [21-3] innocent_hits + 1 WHERE uid = '1' AND token = $1; PREPARE dspam_insert_plan (bigint, int, int) AS INSERT INTO dspam_token_data
> Jun 5 17:58:14 shadow postgres[28775]: [21-4] (uid, token, spam_hits, innocent_hits, last_hit) VALUES (1, $1, $2, $3, CURRENT_DATE);
> Jun 5 17:58:14 shadow dspam[7780]: query error: (null): see sql.errors for more details
> Jun 5 17:58:14 shadow postgres[10566]: [21-1] LOG: server process (PID 28775) was terminated by signal 6
> Jun 5 17:58:14 shadow postgres[10566]: [22-1] LOG: terminating any other active server processes
> Jun 5 17:58:14 shadow postgres[7219]: [21-1] WARNING: terminating connection because of crash of another server process
> Jun 5 17:58:14 shadow postgres[7219]: [21-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
> Jun 5 17:58:14 shadow postgres[7219]: [21-3] process exited abnormally and possibly corrupted shared memory.
> Jun 5 17:58:14 shadow postgres[7525]: [24-1] LOG: database system was interrupted at 2006-06-05 17:58:01 UTC
> Jun 5 17:58:14 shadow postgres[7525]: [25-1] LOG: checkpoint record is at 16/DA1D6868
> Jun 5 17:58:14 shadow postgres[7525]: [26-1] LOG: redo record is at 16/DA0081D8; undo record is at 0/0; shutdown FALSE
> Jun 5 17:58:14 shadow postgres[7525]: [27-1] LOG: next transaction ID: 5691748; next OID: 148615
> Jun 5 17:58:14 shadow postgres[7525]: [28-1] LOG: next MultiXactId: 4; next MultiXactOffset: 10
> Jun 5 17:58:14 shadow postgres[7525]: [29-1] LOG: database
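[A minimal sketch of the application-side guard Travis describes, with
made-up literal values; since 8.1 has no built-in upsert, a concurrent client
can still slip in between the check and the insert, so this only reduces,
rather than eliminates, attempted unique-key violations:]

    -- insert a (uid, token) row only if it isn't already present
    INSERT INTO dspam_token_data (uid, token, spam_hits, innocent_hits, last_hit)
    SELECT 1, 12345, 0, 1, CURRENT_DATE
    WHERE NOT EXISTS (
        SELECT 1 FROM dspam_token_data
        WHERE uid = 1 AND token = 12345
    );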
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Simon Riggs [EMAIL PROTECTED] writes:
> Do we agree the idea can't go further? What next?

It still needs investigation; I'm no longer convinced that the TLB-flush
theory is correct. See rest of thread. We may well have to revert the current
patch, but I'd like to be sure we understand why.

If we do have to revert, I'd propose that we pursue the notion of
interrupt-driven sampling like gprof uses.

            regards, tom lane
Re: [HACKERS] Compression and on-disk sorting
On Wed, Jun 07, 2006 at 11:59:50AM +0100, Simon Riggs wrote:
> On Wed, 2006-06-07 at 01:33 -0500, Jim C. Nasby wrote:
> > Since Tom has committed the header-removing patch, we need to test
> > "not compressed, no headers" vs. "compressed, no headers":
> >
> >                                  -- work_mem --
> >                           Scale    2000    20000
> >   not compressed            150   805.7    797.7
> >   not compressed           3000   17820    17436
> >   not compressed, no hdr   3000   14470    14507
> >   compressed                150   371.4    400.1
> >   compressed               3000    8152     8537
> >   compressed, no headers   3000    7325     7876
>
> That looks fairly conclusive. Can we try tests with data in reverse order,
> so we use more tapes? We're still using a single tape, so the additional
> overhead of compression doesn't cause any pain.

Would simply changing the ORDER BY to DESC suffice for this? FWIW:

    bench=# select correlation from pg_stats
            where tablename='accounts' and attname='bid';
     correlation
    -------------
               1
    (1 row)

> > > There is a noticeable rise in sort time with increasing work_mem, but
> > > that needs to be offset against the benefit that in general comes from
> > > using a large heap for the sort. With the data you're using that
> > > always looks like a loss, but that isn't true with all input data
> > > orderings.
> >
> > I thought that a change had been made to the on-disk sort specifically
> > to eliminate the problem of more work_mem making the sort take longer.
>
> There was a severe non-optimal piece of code... but the general effect
> still exists. As does the effect that having higher work_mem produces
> fewer runs, which speeds up the final stages of the sort.
>
> > I also thought that there was something about that fix that was tunable.
>
> Increasing work_mem makes *this* test case take longer.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
Re: [HACKERS] Compression and on-disk sorting
On Wed, 2006-06-07 at 09:35 -0500, Jim C. Nasby wrote:
> Would simply changing the ORDER BY to DESC suffice for this? FWIW:

Try sorting on aid also, both ascending and descending. We need to try lots
of tests, not just one that's chosen to show the patch in the best light. I
want this, but we need to check.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
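[Something along these lines would exercise both orderings; this is only an
illustrative harness, since the exact queries behind the earlier numbers
aren't shown in the thread. The OFFSET 0 keeps the subquery, and thus the
sort, from being flattened away:]

    SELECT count(*) FROM (SELECT * FROM accounts ORDER BY aid ASC  OFFSET 0) s;
    SELECT count(*) FROM (SELECT * FROM accounts ORDER BY aid DESC OFFSET 0) s;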
Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2
> Since we are on the topic, is there a timeline or plan for OpenFTS being
> brought into core? If not, I'll continue my work on bringing it into
> Gentoo Portage.

OpenFTS never, but tsearch2 is possible. It requires a fair amount of work,
though, so I doubt that it will be done in 8.2...

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, Jun 07, 2006 at 09:53:32AM -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > Do we agree the idea can't go further? What next?
>
> It still needs investigation; I'm no longer convinced that the TLB-flush
> theory is correct. See rest of thread. We may well have to revert the
> current patch, but I'd like to be sure we understand why.

One thing I'm thinking of trying: instead of assuming we can estimate the
durations of all the nodes by taking the total time divided by the number of
samples, we assume that tuple X is similar in duration to tuple X+1, but not
necessarily the same as all other tuples. This moves the calculation from
EndLoop to StopInstr. Basically, in StopInstr you do these steps:

    if (sampling)
    {
        x = time for this tuple
        n = number of tuples skipped since the last sample
        cumulativetime += x * n
    }

This would mean that we wouldn't be assuming that tuples near the end take as
long as tuples near the beginning. Except we're now dealing with smaller
numbers, so I'm worried about error accumulation.

> If we do have to revert, I'd propose that we pursue the notion of
> interrupt-driven sampling like gprof uses.

How would that work? You could then estimate how much time was spent in each
node, but you no longer have any idea about when they were entered or left.

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to
litigate.
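[To make the accumulation idea concrete, a minimal sketch -- not the real
instrument.c code; the struct and function names are invented for
illustration. Each time a sampled tuple is timed, its duration is charged to
the unsampled iterations that preceded it:]

    typedef struct SampledInstr
    {
        double  cumulativetime;     /* seconds attributed so far */
        long    lastiter;           /* iteration number at the previous sample */
        long    curiter;            /* iteration number of this sample */
    } SampledInstr;

    static void
    StopInstrSampled(SampledInstr *instr, double tuple_seconds)
    {
        /* the skipped iterations plus the one we actually timed */
        long    n = instr->curiter - instr->lastiter;

        /* assume the skipped tuples behaved like the one we just timed */
        instr->cumulativetime += tuple_seconds * n;
        instr->lastiter = instr->curiter;
    }

[The trade-off Martijn notes is visible here: each sample now scales a small
per-tuple measurement by n, so per-sample noise is multiplied instead of
being averaged away over the whole loop.]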
Re: [HACKERS] self-deadlock at FATAL exit of bootstrap process on read error
Qingqing Zhou [EMAIL PROTECTED] writes:
> I encountered a situation where the server can't shut down when a
> bootstrap process does ReadBuffer() but gets a read error.

Hm, AtProcExit_Buffers is assuming that we've done AbortTransaction, but the
WAL-replay process doesn't do that because it's not running a transaction.
Seems like we need to stack another on-proc-exit function to do the
appropriate subset of AbortTransaction ... LWLockReleaseAll at least, not
sure what else.

Do you have a test case to reproduce this problem?

            regards, tom lane
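[A rough sketch of the kind of fix being suggested, not a tested patch:
register one more on_shmem_exit callback so locks get released before
AtProcExit_Buffers runs. The callback and registration-point names here are
invented; only on_shmem_exit and LWLockReleaseAll are the real APIs, and
since callbacks run in LIFO order the new one must be registered after
AtProcExit_Buffers:]

    /* minimal subset of AbortTransaction needed at FATAL exit */
    static void
    AtProcExit_ReleaseLocks(int code, Datum arg)
    {
        LWLockReleaseAll();
        /* ... possibly other per-process cleanup ... */
    }

    static void
    BootstrapInit(void)
    {
        /*
         * Registered after AtProcExit_Buffers, so shmem_exit() calls it
         * first; AbortBufferIO then finds io_in_progress_lock released.
         */
        on_shmem_exit(AtProcExit_ReleaseLocks, 0);
    }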
[HACKERS] Proposal to improve multicolumn GiST page split algorithm.
I'd like to base this on the paper "Generalizing Search in Generalized Search
Trees" by Paul Aoki, section 4.1 "Multiple Key Support"
(http://www.sai.msu.su/~megera/postgres/gist/papers/csd-97-950.pdf).

Proposed algorithm (without details about nulls etc.):

1)  n = 1 -- set the column number to the first one
2)  form the vector of keys of the n-th column
3)  set the left and right union keys to NULL (see below)
4)  call the user-defined pickSplit method for the n-th column
5)  if this was the last column then return -- the page is split
6)  try to find keys on the left page with zero penalty against the right
    union key, and keys on the right page with zero penalty against the left
    union. Note, we check only keys in the n-th column. Let M be the total
    number of keys with zero penalties against the opposite unions. So we
    have M keys/tuples which can be freely distributed between the left and
    right pages. Penalty is calculated by calling the user-defined Penalty()
    method for the n-th column.
7)  if M == 0 then return -- the page is split
8)  n++
9)  form the vector of keys of the n-th column, but use only the tuples
    determined in step 6
10) set the left and right union keys. They are formed from the tuples that
    can't be freely distributed between the pages, as determined in step 6.
11) go to step 4.

This algorithm requires a small change to the interface of the pickSplit()
method. It works with the GIST_SPLITVEC structure, but step 6 requires that
pickSplit() know whether the unions are already formed. If not, pickSplit()
should work as it does now. If yes, pickSplit() must take the formed left and
right unions into consideration, and it can't 'reduce' those unions.

I suggest adding one boolean field to GIST_SPLITVEC: isSubSplit (suggest an
exact name, please).

- If isSubSplit == false, the unions are not formed and pickSplit works as it
  does now.
- If isSubSplit == true, the unions are already formed and pickSplit should
  use the formed keys.

So, after the pickSplit() call, isSubSplit should always be 'false'; if it's
not, the GiST core should emit at least a warning about an unsupported
feature in the opclass. BTW, in this case GiST may compose the old
(previously formed) unions with the ones suggested by pickSplit() by
maximizing the penalty between the left and right keys.

Also, I plan to split GIST_SPLITVEC into two structures: one will be the
argument for pickSplit() and the other will be used internally in the GiST
core. Right now GIST_SPLITVEC contains a lot of fields that are needed only
by the core.

Thoughts, suggestions, objections?

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
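[For readers without gist.h handy, a sketch of the proposed interface change.
Only the fields relevant to this discussion are shown, and isSubSplit is the
new, not-yet-finally-named flag:]

    typedef struct GIST_SPLITVEC
    {
        OffsetNumber *spl_left;     /* array of entries that go left */
        int           spl_nleft;
        Datum         spl_ldatum;   /* union key for the left page */

        OffsetNumber *spl_right;    /* array of entries that go right */
        int           spl_nright;
        Datum         spl_rdatum;   /* union key for the right page */

        bool          isSubSplit;   /* NEW: true means spl_ldatum and
                                     * spl_rdatum are already formed from
                                     * the previous columns; pickSplit must
                                     * honor them (no 'reducing') and reset
                                     * this to false before returning */
    } GIST_SPLITVEC;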
[HACKERS] Snowball and ispell in tsearch2
We got a lot of requests about including stemmers and ispell dictionaries for
all available languages in tsearch2. I understand that tsearch2 would then be
closer to the end user. But the sources of the snowball stemmers are about
800 KB, and each ispell dictionary takes about 0.5-2 MB. All sizes are with
compression. I am afraid that is too big... What are the opinions?

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Martijn van Oosterhout kleptog@svana.org writes:
> This would mean that we wouldn't be assuming that tuples near the end take
> as long as tuples near the beginning. Except we're now dealing with
> smaller numbers, so I'm worried about error accumulation.

Hm, that would explain why Hash joins suffer from this especially. Even when
functioning properly, hashes get slower as the buckets fill up and there are
longer lists to traverse. Perhaps the hashes are suffering inordinately from
collisions, though. Some of the data types' hash functions looked kind of
suspect when I peeked at them a while back.

--
greg
Re: [HACKERS] Compression and on-disk sorting
On Wed, Jun 07, 2006 at 04:11:57PM +0100, Simon Riggs wrote:
> On Wed, 2006-06-07 at 09:35 -0500, Jim C. Nasby wrote:
> > Would simply changing the ORDER BY to DESC suffice for this? FWIW:
>
> Try sorting on aid also, both ascending and descending. We need to try
> lots of tests, not just one that's chosen to show the patch in the best
> light. I want this, but we need to check.

Well, the correlation on everything in that table is 1. At this point maybe
it makes more sense to just come up with a different test case, possibly
using generate_series and random(). Better yet would be if someone came up
with a patch that actually populated the filler field in pgbench. Better
still would be allowing the user to define how large they wanted the filler
field to be...

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
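[Short of patching pgbench, one stopgap -- assuming the standard pgbench
schema, where accounts.filler is char(84) and left unpopulated at init time --
would be to pad the column by hand after initialization:]

    UPDATE accounts SET filler = repeat('x', 84);
    VACUUM FULL ANALYZE accounts;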
Re: [HACKERS] Snowball and ispell in tsearch2
Hello Teodor,

I've just recently implemented an advanced full-text search function on top
of tsearch2. Searching through the manuals and websites to get the snowball
stemmer and compile my own module took me way too long. I'd rather go fetch a
cup of coffee during a 30-minute download...

That said, I don't necessarily mean that all stemmers must be included in CVS
or such. It should just be simpler for the database administrator to install
ispell or stemmer 'modules'. An ideal solution would be to provide packages
for each language (in Debian or Fedora, etc.). Perhaps we can put together
the source code for all available language modules and provide scripts to
fetch ispell data or to generate the snowball stemmers. A Debian package
maintainer would have to fetch all the data to generate all the language
packages; someone else might just want to download and compile a Norwegian
snowball stemmer.

I'd be willing to help with such a project. I have experience with tsearch2
as well as with Gentoo and Debian packaging. I can't help with rpm, though.

Regards
Markus

Teodor Sigaev wrote:
> We got a lot of requests about including stemmers and ispell dictionaries
> for all available languages in tsearch2. I understand that tsearch2 would
> then be closer to the end user. But the sources of the snowball stemmers
> are about 800 KB, and each ispell dictionary takes about 0.5-2 MB. All
> sizes are with compression. I am afraid that is too big... What are the
> opinions?
[HACKERS] ADD/DROP INHERITS
I've implemented most of ADD/DROP INHERITS, but it's my first significant
piece of code at this level. I would appreciate any feedback about it. In
particular I'm worried I may be on the wrong track about how some low-level
operations work, like memory management, syscache lookups, heap tuple
creation, etc. Also, I'm not at all clear what kind of locks are really
necessary for this operation. I may be taking excessively strong or weak
locks, or have deadlock risks.

The main thing remaining to be done is implementing default column
expressions. Those would require an ALTER TABLE pass-3 operation, I believe.
Also, I haven't looked at table constraints at all yet; I'm not clear what's
supposed to happen there.

I made some decisions on some semantic issues that I believe are correct but
could stand some double-checking. Specifically: if the parent has OIDs then
the child must have OIDs, and if a column in the parent is NOT NULL then the
column in the child must be NOT NULL as well.

I can send the actual patch to pgsql-patches; it includes some other changes
to refactor StoreCatalogInheritance and add the syntax to gram.y. But it's
still not quite finished because of default values.

static void
ATExecAddInherits(Relation rel, RangeVar *parent)
{
    Relation    relation,
                catalogRelation;
    SysScanDesc scan;
    ScanKeyData key;
    HeapTuple   inheritsTuple;
    int4        inhseqno = 0;
    ListCell   *child;
    List       *children;

    relation = heap_openrv(parent, AccessShareLock);    /* XXX is this enough locking? */

    if (relation->rd_rel->relkind != RELKIND_RELATION)
        ereport(ERROR,
                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                 errmsg("inherited relation \"%s\" is not a table",
                        parent->relname)));

    /* Permanent rels cannot inherit from temporary ones */
    if (!rel->rd_istemp && isTempNamespace(RelationGetNamespace(relation)))
        ereport(ERROR,
                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                 errmsg("cannot inherit from temporary relation \"%s\"",
                        parent->relname)));

    if (!pg_class_ownercheck(RelationGetRelid(relation), GetUserId()))
        aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
                       RelationGetRelationName(relation));

    /* If parent has OIDs then all children must have OIDs */
    if (relation->rd_rel->relhasoids && !rel->rd_rel->relhasoids)
        ereport(ERROR,
                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                 errmsg("table \"%s\" without OIDs cannot inherit from table \"%s\" with OIDs",
                        RelationGetRelationName(rel), parent->relname)));

    /*
     * Reject duplications in the list of parents. -- this is the same check
     * as when creating a table, but maybe we should check for the parent
     * anywhere higher in the inheritance structure?
     */
    catalogRelation = heap_open(InheritsRelationId, RowExclusiveLock);
    ScanKeyInit(&key,
                Anum_pg_inherits_inhrelid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(RelationGetRelid(rel)));
    scan = systable_beginscan(catalogRelation, InheritsRelidSeqnoIndexId,
                              true, SnapshotNow, 1, &key);
    while (HeapTupleIsValid(inheritsTuple = systable_getnext(scan)))
    {
        Form_pg_inherits inh = (Form_pg_inherits) GETSTRUCT(inheritsTuple);

        if (inh->inhparent == RelationGetRelid(relation))
            ereport(ERROR,
                    (errcode(ERRCODE_DUPLICATE_TABLE),
                     errmsg("inherited relation \"%s\" duplicated",
                            parent->relname)));
        if (inh->inhseqno > inhseqno)
            inhseqno = inh->inhseqno;
    }
    systable_endscan(scan);
    heap_close(catalogRelation, RowExclusiveLock);

    /*
     * Get children because we have to manually recurse and also because we
     * have to check for recursive inheritance graphs
     */
    /* this routine is actually in the planner */
    children = find_all_inheritors(RelationGetRelid(rel));

    if (list_member_oid(children, RelationGetRelid(relation)))
        ereport(ERROR,
                (errcode(ERRCODE_DUPLICATE_TABLE),
                 errmsg("Circular inheritance structure found")));

    foreach(child, children)
    {
        Oid         childrelid = lfirst_oid(child);
        Relation    childrel;

        childrel
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Greg Stark [EMAIL PROTECTED] writes:
> Martijn van Oosterhout kleptog@svana.org writes:
> > This would mean that we wouldn't be assuming that tuples near the end
> > take as long as tuples near the beginning. Except we're now dealing with
> > smaller numbers, so I'm worried about error accumulation.
>
> Hm, that would explain why Hash joins suffer from this especially. Even
> when functioning properly, hashes get slower as the buckets fill up and
> there are longer lists to traverse.

Nope, that is certainly not the explanation, because the hash table is loaded
in the (single) call of the Hash node at the start of the query. It is static
all through the sampled-and-not executions of the Hash Join node, which is
where our problem is.

I don't see that Martijn's idea responds to the problem anyway, if it is some
sort of TLB-related issue. The assumption we are making is not "tuples near
the end take as long as tuples near the beginning", it is "tuples we sample
take as long as tuples we don't" (both statements of course meaning "on the
average"). If the act of sampling incurs overhead beyond the gettimeofday()
call itself, then we are screwed, and playing around with which iterations we
sample and how we do the extrapolation won't make the slightest bit of
difference.

I'm unsure about the TLB-flush theory because I see no evidence of any such
overhead in the 8.1 timings; but on the other hand it's hard to see what else
could explain the apparent dependence on targetlist width.

            regards, tom lane
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Tom Lane [EMAIL PROTECTED] writes:
> Nope, that is certainly not the explanation, because the hash table is
> loaded in the (single) call of the Hash node at the start of the query. It
> is static all through the sampled-and-not executions of the Hash Join
> node, which is where our problem is.

At the risk of asking a stupid question, it's not perchance including that
hash build in the first sample it takes of the hash join node?

--
greg
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Greg Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
> > Nope, that is certainly not the explanation, because the hash table is
> > loaded in the (single) call of the Hash node at the start of the query.
> > It is static all through the sampled-and-not executions of the Hash Join
> > node, which is where our problem is.
>
> At the risk of asking a stupid question, it's not perchance including that
> hash build in the first sample it takes of the hash join node?

Sure. Which is one of the reasons why the first tuple is excluded from the
extrapolation...

            regards, tom lane
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark wrote:
> I can send the actual patch to pgsql-patches; it includes some other
> changes to refactor StoreCatalogInheritance and add the syntax to gram.y.
> But it's still not quite finished because of default values.

You can send what you've got, and note that it's not for application yet.
Post early and post often ;-)

There are a surprising number of things to be done when you play with the
syntax, as I found out not too long ago.

cheers

andrew
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes:
> I've implemented most of ADD/DROP INHERITS, but it's my first significant
> piece of code at this level. I would appreciate any feedback about it.

I thought we had agreed that the semantics of ADD INHERITS would be to reject
the command if the child wasn't already suitable to be a child of the parent
-- not to modify it by adding columns or constraints or whatever. For the
proposed uses of ADD INHERITS (in particular, linking and unlinking partition
tables) an incompatibility in schema almost certainly means you made a
mistake, and you don't really want the system helpfully "fixing" your table
to match the parent.

            regards, tom lane
Re: [HACKERS] Snowball and ispell in tsearch2
> ... the sources of the snowball stemmers are about 800 KB, and each ispell
> dictionary takes about 0.5-2 MB. All sizes are with compression.

Sorry, withOUT compression...

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Re: [HACKERS] Snowball and ispell in tsearch2
OpenFTS ebuild: http://bugs.gentoo.org/show_bug.cgi?id=135859

It has a USE flag for the snowball stemmer. I can take care of packaging for
Gentoo if it will free up time for you to work on other distros.

John

PS: upstream package size isn't, and shouldn't be, an issue; it should be
left to the packaging systems to discretely fetch what is needed.

On 6/7/06, Markus Schiltknecht [EMAIL PROTECTED] wrote:
> That said, I don't necessarily mean that all stemmers must be included in
> CVS or such. It should just be simpler for the database administrator to
> install ispell or stemmer 'modules'. An ideal solution would be to provide
> packages for each language (in Debian or Fedora, etc.).
>
> I'd be willing to help with such a project. I have experience with
> tsearch2 as well as with Gentoo and Debian packaging. I can't help with
> rpm, though.
>
> Teodor Sigaev wrote:
> > We got a lot of requests about including stemmers and ispell
> > dictionaries for all available languages in tsearch2. ... I am afraid
> > that is too big... What are the opinions?
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Ah-hah, I've sussed it. The faulty assumption can actually be stated as: "all
the executions, except maybe the first, will take approximately the same
amount of time". The failing test case I've been looking at is one where the
system decides to use a batched hash join, and in this plan type some
iterations take much longer than others. (The apparent dependence on
targetlist width is now easy to understand, because that affects the
estimated hashtable size and hence the decision whether batching is needed.
I'm not sure why I don't see the effect when running the identical case on my
other machine, but since the other one is a 64-bit machine its space
calculations are probably a bit different.)

I added some printf's to instrument.c to print out the actual time
measurements for each sample, as well as the calculations in InstrEndLoop.
Attached are the printouts that occurred for the HashJoin node. The thing
that is killing the extrapolation is of course the very large time for
iteration 2, which the extrapolation includes in its average. But there's
well over 10:1 variation in the later samples as well.

On reflection it's easy to imagine other cases where some iterations take
much longer than others in a not-very-predictable way. For instance, a join
where only a subset of the outer tuples have a match is going to act that
way. I don't think there's any good way that we can be sure we have a
representative sample of executions, and so I'm afraid this approach to
sampling EXPLAIN ANALYZE is a failure.

I propose we revert this patch and think about an interrupt-driven sampling
method instead.

            regards, tom lane

401489a0 1: 331616 usec in iter 1
401489a0 1: 110338 usec in iter 2
401489a0 1: 54 usec in iter 3
401489a0 1: 99 usec in iter 4
401489a0 1: 77 usec in iter 5
401489a0 1: 145 usec in iter 6
401489a0 1: 117 usec in iter 7
401489a0 1: 33 usec in iter 8
401489a0 1: 97 usec in iter 9
401489a0 1: 98 usec in iter 10
401489a0 1: 52 usec in iter 11
401489a0 1: 33 usec in iter 12
401489a0 1: 51 usec in iter 13
401489a0 1: 83 usec in iter 14
401489a0 1: 153 usec in iter 15
401489a0 1: 115 usec in iter 16
401489a0 1: 52 usec in iter 17
401489a0 1: 242 usec in iter 18
401489a0 1: 48 usec in iter 19
401489a0 1: 87 usec in iter 20
401489a0 1: 23 usec in iter 21
401489a0 1: 80 usec in iter 22
401489a0 1: 57 usec in iter 23
401489a0 1: 17 usec in iter 24
401489a0 1: 51 usec in iter 25
401489a0 1: 18 usec in iter 26
401489a0 1: 16 usec in iter 27
401489a0 1: 100 usec in iter 28
401489a0 1: 45 usec in iter 29
401489a0 1: 174 usec in iter 30
401489a0 1: 131 usec in iter 31
401489a0 1: 17 usec in iter 32
401489a0 1: 45 usec in iter 33
401489a0 1: 16 usec in iter 34
401489a0 1: 120 usec in iter 35
401489a0 1: 15 usec in iter 36
401489a0 1: 17 usec in iter 37
401489a0 1: 15 usec in iter 38
401489a0 1: 48 usec in iter 39
401489a0 1: 127 usec in iter 40
401489a0 1: 36 usec in iter 41
401489a0 1: 41 usec in iter 42
401489a0 1: 69 usec in iter 43
401489a0 1: 50 usec in iter 44
401489a0 1: 104 usec in iter 45
401489a0 1: 22 usec in iter 46
401489a0 1: 50 usec in iter 47
401489a0 1: 17 usec in iter 48
401489a0 1: 47 usec in iter 49
401489a0 1: 54 usec in iter 50
401489a0 1: 46 usec in iter 51
401489a0 1: 20 usec in iter 54
401489a0 1: 38 usec in iter 55
401489a0 1: 68 usec in iter 56
401489a0 1: 17 usec in iter 60
401489a0 1: 16 usec in iter 61
401489a0 1: 15 usec in iter 67
401489a0 1: 31 usec in iter 68
401489a0 1: 15 usec in iter 70
401489a0 1: 61 usec in iter 78
401489a0 1: 143 usec in iter 85
401489a0 1: 21 usec in iter 89
401489a0 1: 14 usec in iter 96
401489a0 1: 21 usec in iter 104
401489a0 1: 21 usec in iter 107
401489a0 1: 16 usec in iter 116
401489a0 1: 194 usec in iter 118
401489a0 1: 136 usec in iter 122
401489a0 1: 34 usec in iter 127
401489a0 1: 46 usec in iter 131
401489a0 1: 15 usec in iter 133
401489a0 1: 15 usec in iter 135
401489a0 1: 34 usec in iter 137
401489a0 1: 54 usec in iter 142
401489a0 1: 206 usec in iter 151
401489a0 1: 75 usec in iter 162
401489a0 1: 20 usec in iter 172
401489a0 1: 66 usec in iter 177
401489a0 1: 21 usec in iter 181
401489a0 1: 69 usec in iter 186
401489a0 1: 16 usec in iter 193
401489a0 1: 46 usec in iter 201
401489a0 1: 33 usec in iter 210
401489a0 1: 50 usec in iter 216
401489a0 1: 21 usec in iter 222
401489a0 1: 18 usec in iter 224
401489a0 1: 33 usec in iter 229
401489a0 1: 20 usec in iter 232
401489a0 1: 44 usec in iter 236
401489a0 1: 29 usec in iter 239
401489a0 1: 34 usec in iter 240
401489a0 1: 31 usec in iter 241
401489a0 1: 27 usec in iter 254
401489a0 1: 45 usec in iter 257
401489a0 1: 147 usec in iter 259
401489a0 1: 15 usec in iter 269
401489a0 1: 16 usec in iter 278
401489a0 1: 14 usec in iter 279
401489a0 1: 58 usec in iter 290
401489a0 1: 15 usec in iter 291
401489a0 1: 53 usec in iter 295
401489a0 1: 15 usec in iter 306
401489a0 1: 16 usec in iter 318
401489a0 1: 34 usec in iter 328
401489a0 1: 37 usec in iter 339
401489a0 1: 28
Re: [HACKERS] ADD/DROP INHERITS
Tom Lane [EMAIL PROTECTED] writes:
> I thought we had agreed that the semantics of ADD INHERITS would be to
> reject the command if the child wasn't already suitable to be a child of
> the parent. Not to modify it by adding columns or constraints or whatever.
> For the proposed uses of ADD INHERITS (in particular, linking and
> unlinking partition tables) an incompatibility in schema almost certainly
> means you made a mistake, and you don't really want the system helpfully
> "fixing" your table to match the parent.

I didn't see any discussion like that, and I find it pretty surprising.
Personally I would have agreed: for partitioned tables you certainly don't
want it to create new columns without warning you. But that's entirely
inconsistent with the way inherited tables work in general. It seems to go
against the grain of Postgres's general style to implement just the use case
that's useful for a particular application rather than keep the features
logically consistent with each other.

Perhaps there should be an option when issuing the ADD INHERITS to indicate
whether you want it to create new columns or only match existing columns.
That would also give me a convenient excuse to skip all those NOTICEs about
merging column definitions.

Actually, I think in the long term Postgres will have to implement a special
syntax for partitioned tables, just like Oracle and other databases. The user
doesn't really want to manually manage all the partitions as tables: that
imposes a lot of extra work to define the tables with the right syntax,
maintain the constraints properly, etc. For the user it would be better to
have a single property of the partitioned table that specified the partition
key. Then when adding a partition you would only have to specify the key
range it covers, not write an arbitrary constraint from scratch. Nor would
you have to create an empty table with the proper definition first and then
add it in.

--
greg
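[To illustrate the manual bookkeeping being described, roughly what
maintaining a range partition by hand looks like. The table, column, and
date range are invented for the example, and the ADD INHERITS spelling is
the syntax under discussion in this thread, not a committed command:]

    -- today: child created with its constraint and inheritance in one step
    CREATE TABLE sales_2006_06 (
        CHECK (sale_date >= '2006-06-01' AND sale_date < '2006-07-01')
    ) INHERITS (sales);

    -- with the proposed command: build the table first, attach it later
    CREATE TABLE sales_2006_07 (LIKE sales);
    ALTER TABLE sales_2006_07
        ADD CHECK (sale_date >= '2006-07-01' AND sale_date < '2006-08-01');
    ALTER TABLE sales_2006_07 ADD INHERITS sales;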
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
> > I thought we had agreed that the semantics of ADD INHERITS would be to
> > reject the command if the child wasn't already suitable to be a child of
> > the parent.
>
> I didn't see any discussion like that, and I find it pretty surprising.

I'm pretty sure it was mentioned somewhere along the line.

> But that's entirely inconsistent with the way inherited tables work in
> general.

I don't see any basis for that conclusion. The properties of a table are set
when it's created, and you need to do pretty explicit ALTERs to change them.
We do not, for example, automatically make a unique index for a table when
someone tries to reference a foreign key to a column set that doesn't already
have such an index.

In this situation, I think it's entirely reasonable to expect the user to do
any needed ALTER ADD COLUMN, CONSTRAINT, etc. commands before trying to
attach a child table to a parent. Having the system do it for you offers no
functionality gain, just a way to shoot yourself in the foot.

            regards, tom lane
Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2
How about those pg_ts* tables, which are specific to a database? Will they
serve the whole cluster?

2006/6/7, Teodor Sigaev [EMAIL PROTECTED]:
> OpenFTS never, but tsearch2 is possible. It requires a fair amount of
> work, though, so I doubt that it will be done in 8.2...

--
Rodrigo Hjort
http://icewall.org/~hjort
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, Jun 07, 2006 at 03:32:35PM -0400, Tom Lane wrote:
> On reflection it's easy to imagine other cases where some iterations take
> much longer than others in a not-very-predictable way. For instance, a
> join where only a subset of the outer tuples have a match is going to act
> that way. I don't think there's any good way that we can be sure we have a
> representative sample of executions, and so I'm afraid this approach to
> sampling EXPLAIN ANALYZE is a failure.

I don't think we ever assumed it would never be a problem. We just assumed
that the sampling would cancel the effect out and give a decent average.
Thing is, I never expected to get a three-orders-of-magnitude difference
between samples. That's just far too much to be corrected in any way. The
random sampling should counter most effects, and I didn't consider the one
tuple in a million that takes much longer to be a particularly common case.

Anyway, as a test, if you take the approach that the measurement at item X
applies only to the tuples immediately preceding it, then for the data you
posted you get a result of 0.681148 seconds. How long did the query run that
produced that data? (The bit of perl I used is:

    cat data | perl -lne 'BEGIN { $last=0; $sum=0 }
        /: (\d+) usec in iter (\d+)/ && do { $sum += ($2-$last)*$1; $last=$2 };
        END { print "$sum\n" }'

)

> I propose we revert this patch and think about an interrupt-driven
> sampling method instead.

That's another possibility, of course...

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to
litigate.
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Martijn van Oosterhout kleptog@svana.org writes:
> Anyway, as a test, if you take the approach that the measurement at item X
> applies only to the tuples immediately preceding it, then for the data you
> posted you get a result of 0.681148 seconds. How long did the query run
> that produced that data?

I didn't save the corresponding printout, unfortunately, but it was probably
pretty similar to this:

regression=# explain analyze select count(*) from
regression-#   (select * from tenk1 a join tenk1 b on a.unique1 = b.unique2 offset 0) ss;
                                   QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=2609.00..2609.01 rows=1 width=0) (actual time=869.395..869.399 rows=1 loops=1)
   ->  Limit  (cost=825.00..2484.00 rows=1 width=488) (actual time=248.640..3368.313 rows=1 loops=1)
         ->  Hash Join  (cost=825.00..2484.00 rows=1 width=488) (actual time=248.609..2983.528 rows=1 loops=1)
               Hash Cond: (a.unique1 = b.unique2)
               ->  Seq Scan on tenk1 a  (cost=0.00..458.00 rows=1 width=244) (actual time=0.084..21.525 rows=1 loops=1)
               ->  Hash  (cost=458.00..458.00 rows=1 width=244) (actual time=248.269..248.269 rows=1 loops=1)
                     ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244) (actual time=0.025..22.760 rows=1 loops=1)
 Total runtime: 877.265 ms
(8 rows)

Time: 888.469 ms
regression=#

The above idea won't fix it anyway; it would only move the failure cases
around.

            regards, tom lane
Re: [HACKERS] ADD/DROP INHERITS
Tom Lane [EMAIL PROTECTED] writes:
> Greg Stark [EMAIL PROTECTED] writes:
> > Tom Lane [EMAIL PROTECTED] writes:
> > > I thought we had agreed that the semantics of ADD INHERITS would be to
> > > reject the command if the child wasn't already suitable to be a child
> > > of the parent.
> >
> > I didn't see any discussion like that, and I find it pretty surprising.
>
> I'm pretty sure it was mentioned somewhere along the line.
>
> > But that's entirely inconsistent with the way inherited tables work in
> > general.
>
> I don't see any basis for that conclusion. The properties of a table are
> set when it's created, and you need to do pretty explicit ALTERs to change
> them.

It just seems weird for

    CREATE TABLE foo (x,y,z) INHERITS (bar)

to not be equivalent to

    CREATE TABLE foo (x,y,z);
    ALTER TABLE foo ADD INHERITS bar;

> We do not, for example, automatically make a unique index for a table when
> someone tries to reference a foreign key to a column set that doesn't
> already have such an index.

But that's not really the same thing. Whether you add the foreign key later
or when you initially create the table, it never creates that index. On the
other hand, if you add a column to the parent it doesn't complain that not
all the children already have that column -- it goes and adds it recursively.

> In this situation, I think it's entirely reasonable to expect the user to
> do any needed ALTER ADD COLUMN, CONSTRAINT, etc. commands before trying to
> attach a child table to a parent. Having the system do it for you offers
> no functionality gain, just a way to shoot yourself in the foot.

Well, if that's the consensus feeling then it certainly makes my life easier.

--
greg
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, Jun 07, 2006 at 11:34:30AM -0400, Tom Lane wrote:
> timer interrupt routine does this once every few milliseconds:

One issue is that on very fast queries, you'd get absolutely no data this
way. A possible solution would be to start with an extremely short timer
interval and ramp it up if the query runs longer. The downside is that you'd
need to somehow re-scale the iteration counts every time the timer interval
changed.

Of course, another option is to start off using the gettimeofday() method and
switch to sampling after X seconds, but it'd be nice if both code paths
weren't needed.

> The bubble-up of sample counts to parent nodes could perhaps be done while
> printing the results instead of on-the-fly as sketched above, but the
> above seems simpler.

It'd be nice if there were an option to not aggregate child runtimes into
their parents at all, since that would make spotting hot spots much easier.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
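[For concreteness, a minimal sketch of the interrupt-driven idea being
discussed. Nothing here is settled: PlanNodeInstr, CurrentSampledNode, and
the 5 ms period are invented for illustration; the executor would point
CurrentSampledNode at whichever plan node is currently running:]

    #include <signal.h>
    #include <sys/time.h>

    typedef struct PlanNodeInstr
    {
        long    sample_count;   /* timer ticks observed in this node */
    } PlanNodeInstr;

    static PlanNodeInstr *volatile CurrentSampledNode = NULL;

    static void
    sample_handler(int signo)
    {
        /* async-signal-safe: just one increment, no locks or allocation */
        if (CurrentSampledNode != NULL)
            CurrentSampledNode->sample_count++;
    }

    static void
    start_sampling(void)
    {
        struct itimerval iv;

        signal(SIGPROF, sample_handler);
        iv.it_interval.tv_sec = 0;
        iv.it_interval.tv_usec = 5000;      /* sample every ~5 ms */
        iv.it_value = iv.it_interval;
        setitimer(ITIMER_PROF, &iv, NULL);
    }

[Note the choice of timer matters: ITIMER_PROF ticks only while the process
consumes CPU, so time spent blocked in read() would be invisible -- the
undercounting concern raised later in this thread -- while ITIMER_REAL would
count wall-clock time instead.]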
Re: [HACKERS] How to avoid transaction ID wrap
On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
> > Koichi Suzuki wrote:
> > > I've once proposed a patch for a 64-bit transaction ID, but this
> > > causes some overhead for each tuple (XMIN and XMAX).
> >
> > Did you check performance on 32-bit and 64-bit systems and a 64-bit
> > binary build of PGSQL? I think that a 64-bit architecture is not a
> > problem today, and a 64-bit ID should increase the scalability of
> > Postgres.
>
> The percentage increase in I/O demand is the main reason the patch was
> rejected, not so much the arithmetic.

Before considering 64-bit XIDs, it'd be very helpful to know why Mark can't
vacuum frequently enough to handle rollover...

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
> Ah-hah, I've sussed it. ... so I'm afraid this approach to sampling
> EXPLAIN ANALYZE is a failure.

Hmmm, I thought we already got that bit... sorry to sound negative.

> I propose we revert this patch and think about an interrupt-driven
> sampling method instead.

I don't have much more faith in crazy scheme No. 2 either. (Mine or yours...)

Can we just have an option to avoid the timing altogether, please? I don't
want to have long discussions about instrumentation; I just want a reasonably
useful EXPLAIN ANALYZE in a reasonable amount of time -- one that we never,
ever have to doubt whether the sampling works correctly on a Miasmic-367 with
HyperKooling. You could lose a month on -perform going into the details of
this for everybody, and this was supposed to be a simple additional feature.

If you're set on the sampling, great, but can we have the option to avoid it
completely also?

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Simon Riggs [EMAIL PROTECTED] writes:
> On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
> > I propose we revert this patch and think about an interrupt-driven
> > sampling method instead.
>
> I don't have much more faith in crazy scheme No. 2 either. (Mine or
> yours...) Can we just have an option to avoid the timing altogether,
> please? I don't want to have long discussions about instrumentation; I
> just want a reasonably useful EXPLAIN ANALYZE in a reasonable amount of
> time -- one that we never, ever have to doubt whether the sampling works
> correctly on a Miasmic-367 with HyperKooling.

Frankly, I think the pre-existing version of EXPLAIN ANALYZE is fine. People
have been hyperventilating about the timing overhead, but I think that it's
perfectly acceptable as-is. Certainly the removal of timing is not going to
convert an intolerable EXPLAIN ANALYZE runtime into an acceptable one; what
it *is* likely to do is let you be misled about which part of the query is
the problem.

            regards, tom lane
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Jim C. Nasby [EMAIL PROTECTED] writes:
> On Wed, Jun 07, 2006 at 11:34:30AM -0400, Tom Lane wrote:
> > timer interrupt routine does this once every few milliseconds:
>
> One issue is that on very fast queries, you'd get absolutely no data this
> way.

Yeah. Of course, on very fast queries you don't get very good data from the
existing method either -- with a machine fast enough to have sub-microsecond
plan node execution times, the accuracy of gettimeofday really isn't good
enough.

The other thing that was bothering me was whether disk I/O might be
undercounted by an interrupt-driven method. I kinda doubt that any kernel
will save up N interrupts that occur while the process is blocked on a slow
read() ... you'll probably get only one. So the whole idea may be unworkable.

At the moment I think we need to revert the recent patch and go back to the
drawing board. Improving on the original implementation of EXPLAIN ANALYZE is
clearly not as easy as it looks.

            regards, tom lane
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, Jun 07, 2006 at 04:04:33PM -0400, Tom Lane wrote:
> Martijn van Oosterhout kleptog@svana.org writes:
> > Anyway, as a test, if you take the approach that the measurement at item
> > X applies only to the tuples immediately preceding it, then for the data
> > you posted you get a result of 0.681148 seconds. How long did the query
> > run that produced that data?
>
> The above idea won't fix it anyway; it would only move the failure cases
> around.

Well, if we're trying to make a system that never ever produces
strange-looking results, then yes, we have to drop it. No matter how you
sample, there's going to be a failure mode somewhere. If you are prepared to
live with a certain margin ("it will be within X% of the real value Y% of the
time") then I think that's an achievable goal. (I'm not saying that's
necessarily what we have now.)

You could always give people the choice of disabling sampling if it looks
weird, but that just gives people more knobs to twiddle and get upset about.

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to
litigate.
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
>> In this situation, I think it's entirely reasonable to expect the
>> user to do any needed ALTER ADD COLUMN, CONSTRAINT, etc commands
>> before trying to attach a child table to a parent. Having the system
>> do it for you offers no functionality gain, just a way to shoot
>> yourself in the foot.

> Well if that's the consensus feeling then it certainly makes my life
> easier.

Well, one reason for my position is exactly to make your life easier.
I think that making ADD INHERITS do all these other things
automagically is lily-gilding, or at least implementing features not
shown to be needed. Let's make it do the minimum needed for the
use-cases cited so far --- we can always add more functionality later,
*after* it's proven needed.

			regards, tom lane
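To make the agreed semantics concrete, here is a sketch of the expected
workflow. The table and constraint definitions are invented for
illustration, and the attach syntax itself was still being settled
later in this thread:

    -- a parent and a would-be child that does not yet match it
    CREATE TABLE parent (id int NOT NULL, val text, CHECK (id > 0));
    CREATE TABLE child  (id int NOT NULL);

    -- the user brings the child up to the parent's definition first...
    ALTER TABLE child ADD COLUMN val text;
    ALTER TABLE child ADD CHECK (id > 0);

    -- ...and only then attaches it; under the semantics Tom describes,
    -- any remaining mismatch would raise an error rather than be fixed
    ALTER TABLE child INHERIT parent;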
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, 2006-06-07 at 16:56 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
>> On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
>>> I propose we revert this patch and think about an interrupt-driven
>>> sampling method instead.
>> I don't have much more faith in crazy scheme No.2 either. (Mine or
>> yours...) Can we just have an option to avoid the timing altogether,
>> please? I don't want to have long discussions about instrumentation,
>> I just want a reasonably useful EXPLAIN ANALYZE in a reasonable
>> amount of time - one that we never, ever have to doubt whether the
>> sampling works correctly on a Miasmic-367 with HyperKooling.
> Frankly, I think the pre-existing version of EXPLAIN ANALYZE is fine.

I respect everybody's opinion, yours doubly so, as you know. But with
this current opinion, on this occasion, I believe you are in the
minority of those who have spent some time asking for EXPLAIN ANALYSEs
and have not received them because the run time is unacceptable.

> People have been hyperventilating about the timing overhead but I
> think that it's perfectly acceptable as-is.

The phrase "hyperventilating" is subjective. From my side, I could use
the phrase "explaining-the-obvious" and potentially cause rankle also.
We should be able to discuss things without that.

> Certainly the removal of timing is not going to convert an
> intolerable EXPLAIN ANALYZE runtime into an acceptable one;

I disagree, as have others.

> what it *is* likely to do is let you be misled about which part of
> the query is the problem.

A full EXPLAIN ANALYZE is always desirable - we agree on that. The
question is what we do when one is not available. Guessing leaves you
much more open to being misled.

I'd like an option that we can take when a full EXPLAIN ANALYZE is not
possible, one that doesn't give different results on different
CPUs/OSs. We need this to help people who are in difficulty. This isn't
a vanity feature or a they-do-it-so-we-should-also. It's for support.

-- 
Simon Riggs
EnterpriseDB    http://www.enterprisedb.com
Re: [HACKERS] ADD/DROP INHERITS
On Wed, Jun 07, 2006 at 03:33:54PM -0400, Greg Stark wrote:
> Perhaps there should be an option when issuing the ADD INHERITS to
> indicate whether you want it to create new columns or only match
> existing columns. That would also give me a convenient excuse to skip
> all those NOTICEs about merging column definitions.

+1, but I also agree with Tom that this doesn't need to be in the first
pass.

> Actually I think in the long term for partitioned tables Postgres
> will have to implement a special syntax just like Oracle and other
> databases. The user doesn't really want to have to manually manage
> all the partitions as tables. That imposes a lot of extra work to
> have to define the tables with the right syntax, maintain the
> constraints properly, etc.
>
> For the user it would be better to have a single property of the
> partitioned table that specified the partition key. Then when adding
> a partition you would only have to specify the key range it covers,
> not write an arbitrary constraint from scratch. Nor would you have to
> create an empty table with the proper definition first then add it
> in.

I think this is on the TODO list; it's just a matter of actually doing
it. A good first step would be creating an easy means to create an
inherited table that contains everything the parent does: constraints,
indexes, etc. After that's in place, it's easier to create a new
partition (constraints and all) with a single command.

Note that there's no reason this *has* to be in the backend; someone
could do it as a pgFoundry project. Of course, long-term it would be
best if it was included, but that's probably more involved, especially
for a newer coder.

-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
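As a point of reference for the "everything the parent did" idea:
creating a child with INHERITS already copies the parent's columns,
NOT NULL settings, defaults, and CHECK constraints, but not its
indexes, which is exactly the gap a helper command or pgFoundry script
would have to fill. A sketch, with invented table names:

    CREATE TABLE measurement (
        ts  timestamptz NOT NULL,
        val integer CHECK (val >= 0)
    );
    CREATE INDEX measurement_ts_idx ON measurement (ts);

    -- the child gets the columns and the CHECK constraint for free...
    CREATE TABLE measurement_y2006m06 () INHERITS (measurement);

    -- ...but the per-partition index must still be created by hand
    CREATE INDEX measurement_y2006m06_ts_idx
        ON measurement_y2006m06 (ts);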
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
Simon Riggs [EMAIL PROTECTED] writes:
> On Wed, 2006-06-07 at 16:56 -0400, Tom Lane wrote:
>> Certainly the removal of timing is not going to convert an
>> intolerable EXPLAIN ANALYZE runtime into an acceptable one;

> I disagree, as have others.

The overhead seems to be on the order of a couple tens of percent
usually. I don't see how that makes the difference between an EXPLAIN
ANALYZE you can run and one you can't.

> A full EXPLAIN ANALYZE is always desirable - we agree on that. The
> question is what we do when one is not available.

The least bad alternative I've heard is to let EXPLAIN ANALYZE print
out stats-so-far if the query is canceled by control-C or statement
timeout. The objection to this is you may mistake startup transients
for full query behavior ... but at least the numbers will be good as
far as they go.

			regards, tom lane
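Under that proposal - and it is only a proposal at this point; a
canceled EXPLAIN ANALYZE on current servers simply discards its
output - the user-visible workflow might look like the following
sketch. The timeout value and table name are arbitrary:

    -- 8.1-era syntax: statement_timeout is an integer in milliseconds
    SET statement_timeout = 60000;

    -- hypothetical long-running query; under the proposal, the cancel
    -- after 60 s would print per-node stats accumulated so far
    EXPLAIN ANALYZE SELECT * FROM some_huge_table ORDER BY some_col;

    SET statement_timeout = 0;    -- back to no limit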
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
On Wed, Jun 07, 2006 at 05:05:23PM -0400, Tom Lane wrote:
> The other thing that was bothering me was whether disk I/O might be
> undercounted by an interrupt-driven method. I kinda doubt that any
> kernel will save up N interrupts that occur while the process is
> blocked on a slow read() ... you'll probably get only one. So the
> whole idea may be unworkable.

True, but if you get to the point where you're waiting on I/O, I would
think you could afford a gettimeofday() call. Even if the block we need
is in the OS cache, it's possible that the overhead of getting it from
there is high enough that the gettimeofday() call won't matter.

FWIW, it looks like running a much larger setting for shared_buffers
(like 50% of memory) is a good way to boost performance, and that
configuration would make it less likely that you'd do a gettimeofday()
just to pull data out of the OS cache.

-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
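For the record, shared_buffers cannot be changed with SET; it is fixed
at server start from postgresql.conf. Purely as a sketch of what the
"50% of memory" configuration means, and not a tuning recommendation:

    -- in postgresql.conf (8.1-era units of 8 kB buffers), e.g. roughly
    -- half of a 4 GB machine would be:
    --     shared_buffers = 262144      # 262144 * 8 kB = 2 GB
    -- after a restart, verify with:
    SHOW shared_buffers;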
Re: [HACKERS] ADD/DROP INHERITS
On Wed, 2006-06-07 at 15:33, Greg Stark wrote:
> Tom Lane [EMAIL PROTECTED] writes:
>> I thought we had agreed that the semantics of ADD INHERITS would be
>> to reject the command if the child wasn't already suitable to be a
>> child of the parent. Not to modify it by adding columns or
>> constraints or whatever. For the proposed uses of ADD INHERITS (in
>> particular, linking and unlinking partition tables) an
>> incompatibility in schema almost certainly means you made a mistake,
>> and you don't really want the system helpfully "fixing" your table
>> to match the parent.

> I didn't see any discussion like that and I find it pretty
> surprising.

I'm pretty sure that what was discussed was just attaching/detaching
child tables into inheritance chains with no table alterations. Maybe
it was never mentioned explicitly, but that was how I understood the
discussion. Personally I would have agreed.

> For partitioned tables you certainly don't want it to create new
> columns without warning you.

Exactly!

> But that's entirely inconsistent with the way inherited tables work
> in general. It seems to go against the grain of Postgres's general
> style to implement just the use case that's useful for a particular
> application rather than keep the features logically consistent with
> each other.

There are too many conflicting definitions of "logically consistent",
so doing the bare minimum is the best way to avoid the whole problem.

> Perhaps there should be an option when issuing the ADD INHERITS to
> indicate whether you want it to create new columns or only match
> existing columns. That would also give me a convenient excuse to skip
> all those NOTICEs about merging column definitions.

No, no, no! The whole pg inheritance/partitioning thing is still quite
low-level, and ADD/DROP INHERITS is the wrong place to start fixing it.

> Actually I think in the long term for partitioned tables Postgres
> will have to implement a special syntax just like Oracle and other
> databases. The user doesn't really want to have to manually manage
> all the partitions as tables. That imposes a lot of extra work to
> have to define the tables with the right syntax, maintain the
> constraints properly, etc.

Yes. Maybe. But this is something that requires much more thought and
planning than adding the simplest possible ADD/DELETE INHERITS.

> For the user it would be better to have a single property of the
> partitioned table that specified the partition key. Then when adding
> a partition you would only have to specify the key range it covers,
> not write an arbitrary constraint from scratch. Nor would you have to
> create an empty table with the proper definition first then add it
> in.

Don't try to solve too many problems at once. Starting with just a
possibility to move suitable ready-made partitions in and out of an
inheritance chain solves a really big problem. No need to try to
obfuscate it with extra functionality, at least not initially.

-- 
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Re: [HACKERS] How to avoid transaction ID wrap
On Tue, 2006-06-06 at 13:53, Christopher Browne wrote:
>>>> We have triggers that fire if something interesting is found on
>>>> insert. We want this thing to run for a long time. From the
>>>> numbers, you can see the PostgreSQL database is VERY loaded.
>>>> Running VACUUM may not always be possible without losing data.
>>> Why? Just run it with very friendly delay settings.
> Friendly delay settings can have adverse effects; it is likely to
> make vacuum run on the order of 3x as long, which means that if you
> have a very large table that takes 12h to VACUUM, vacuum delay will
> increase that to 36h, which means you'll have a transaction open for
> 36h. That'll be very evil, to be sure...

Not always. I know that it is evil in a slony1 context, but often it
*is* possible to design your system in a way where a super-long
transaction is almost unnoticeable.

Long transactions are evil in case they cause some fast-changing table
to grow its storage size by several orders of magnitude, but if that is
not the case then they just run there in the background with no ill
effects, especially do-nothing transactions like vacuum.

-- 
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
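For readers wondering what "friendly delay settings" means in practice:
the cost-based vacuum delay parameters (available since 8.0) can be set
for a single session before running the vacuum. A sketch; the values
and the table name are illustrative, not recommendations:

    -- sleep 20 ms each time roughly vacuum_cost_limit cost units of
    -- work have been done, throttling vacuum's I/O appetite
    SET vacuum_cost_delay = 20;
    SET vacuum_cost_limit = 200;
    VACUUM ANALYZE some_big_table;    -- hypothetical table name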
Re: [HACKERS] How to avoid transaction ID wrap
On Thu, Jun 08, 2006 at 01:30:42AM +0300, Hannu Krosing wrote:
> Not always. I know that it is evil in a slony1 context, but often it
> *is* possible to design your system in a way where a super-long
> transaction is almost unnoticeable.
>
> Long transactions are evil in case they cause some fast-changing
> table to grow its storage size by several orders of magnitude, but if
> that is not the case then they just run there in the background with
> no ill effects, especially do-nothing transactions like vacuum.

Plus, if the only issue here is in fact the long-running transaction
for vacuum, there are other ways to address that which would be a lot
less intrusive than doing something like going to 64-bit XIDs. IIRC, in
8.2 vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra I/O).

-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
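A sketch of that workaround, with the caveat that the one-transaction-
per-maintenance_work_mem-fill behavior is as recalled ("IIRC") above,
not something verified here; the value and table name are illustrative:

    -- 8.1-era units: kilobytes; deliberately small so each vacuum
    -- pass over the indexes (and, per the claim above, each
    -- transaction) stays short, at the cost of more passes
    SET maintenance_work_mem = 16384;   -- 16 MB
    VACUUM some_big_table;              -- hypothetical table name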
Re: [HACKERS] How to avoid transaction ID wrap
On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Koichi Suzuki wrote:
>>> I've once proposed a patch for 64-bit transaction ID, but this
>>> causes some overhead to each tuple (XMIN and XMAX).
>> Did you check performance on 32-bit or 64-bit systems and a 64-bit
>> binary version of PGSQL? I think that today it is not a problem to
>> have a 64-bit architecture, and a 64-bit ID should increase the
>> scalability of Postgres.
> The percentage increase in I/O demand is the main reason the patch
> was rejected, not so much the arithmetic. Before considering 64-bit
> XIDs, it'd be very helpful to know why Mark can't vacuum frequently
> enough to handle rollover...

The system is under heavy load, and while there are tricks that can be
done, vacuum is a process which puts extra load on the system while it
is running. It is a sliding scale, as always: you may get the system to
the point where it can vacuum AND perform as needed, but the database
is growing constantly. Eventually you will get to the point where you
can't run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't "smooth": between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Now don't flame me, I really do love PostgreSQL; it is just that I bump
up against these issues from time to time, and it would be nice if
there were some way to work around them.
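For what it's worth, how close a database actually is to rollover can
be watched directly, which helps answer Tom's question of how often
vacuum truly has to run on a loaded system. A minimal sketch:

    -- age() reports how many transactions old each database's
    -- datfrozenxid is; trouble looms as it approaches ~2 billion
    SELECT datname, age(datfrozenxid) FROM pg_database;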
Re: [HACKERS] How to avoid transaction ID wrap
On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
> I guess what I am saying is that PostgreSQL isn't "smooth": between
> checkpoints and vacuum, it is near impossible to make a product that
> performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the I/O load so that it wasn't
an issue.

-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf      cell: 512-569-9461
Re: [HACKERS] ADD/DROP INHERITS
How does

    ALTER TABLE table INHERITS ADD parent
    ALTER TABLE table INHERITS DROP parent

sound? I'll admit it doesn't read very well, but it doesn't necessitate
complicating other rules in gram.y.

-- 
greg
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes:
> How does
>
>     ALTER TABLE table INHERITS ADD parent
>     ALTER TABLE table INHERITS DROP parent
>
> sound? I'll admit it doesn't read very well, but it doesn't
> necessitate complicating other rules in gram.y.

Or alternatively, if people want to keep the English-like SQL style of
grammar:

    ALTER TABLE table INHERIT parent
    ALTER TABLE table NO INHERIT parent

-- 
greg
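A usage sketch of the second spelling against the partitioning use case
discussed upthread; the partition names are invented, and this is a
syntax proposal under discussion here, not committed grammar:

    -- attach a ready-made, matching partition to the parent...
    ALTER TABLE sales_2006_06 INHERIT sales;

    -- ...and detach it again when it ages out of the active set
    ALTER TABLE sales_2006_06 NO INHERIT sales;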
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark said:
> Or alternatively, if people want to keep the English-like SQL style
> of grammar:
>
>     ALTER TABLE table INHERIT parent
>     ALTER TABLE table NO INHERIT parent

That could work ... or maybe UNINHERIT would read better than NO
INHERIT.

cheers

andrew
Re: [HACKERS] [PATCHES] drop if exists remainder
On Saturday 04 March 2006 22:24, David Fetter wrote:
> On Fri, Mar 03, 2006 at 03:35:24PM -0500, Andrew Dunstan wrote:
>> Bruce Momjian wrote:
>>> Christopher Kings-Lynne wrote:
>>>> What's the consensus on this?
>> Nobody else has chimed in, so I'm inclined to do no more on the
>> grounds of insufficient demand. Let's decide before too much bitrot
>> occurs, though.
> +1 :)

+1

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] [PATCHES] drop if exists remainder
Tom,

>>>> What's the consensus on this?
>>> Nobody else has chimed in, so I'm inclined to do no more on the
>>> grounds of insufficient demand. Let's decide before too much bitrot
>>> occurs, though.
>> +1 :)
> +1

We were talking about this on IRC, and I feel that if we're going to do
IF EXISTS for any objects, we should do it for all objects. Otherwise
we risk a considerable amount of user confusion.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
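For context, the behavior being extended is the one DROP TABLE grew in
this development cycle; the "remainder" under discussion is the same
option for the other object types. A sketch, with invented object
names:

    -- with IF EXISTS, a missing object raises a NOTICE rather than an
    -- ERROR, so cleanup scripts can run unconditionally
    DROP TABLE IF EXISTS scratch_results;

    -- the proposed extension: the same clause for other object types
    DROP FUNCTION IF EXISTS scratch_func(integer);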
Re: [HACKERS] ADD/DROP INHERITS
On Jun 8, 2006, at 9:13, Greg Stark wrote:
> Or alternatively, if people want to keep the English-like SQL style
> of grammar:
>
>     ALTER TABLE table INHERIT parent
>     ALTER TABLE table NO INHERIT parent

ALTER TABLE table DISOWN parent?

Michael Glaesemann
grzm seespotcode net
Re: [HACKERS] ADD/DROP INHERITS
Andrew Dunstan [EMAIL PROTECTED] writes:
> Greg Stark said:
>> Or alternatively, if people want to keep the English-like SQL style
>> of grammar:
>>
>>     ALTER TABLE table INHERIT parent
>>     ALTER TABLE table NO INHERIT parent

> That could work ... or maybe UNINHERIT would read better than NO
> INHERIT.

DISINHERIT maybe? While creating unreserved keywords isn't the end of
the world, it seems better to stick to the vocabulary already there if
possible. It makes it easier for the user to remember how to spell
commands. That's why I didn't suggest fixing the DROP INHERITS
ambiguity by inventing something like REMOVE INHERITS.

-- 
greg
Re: [HACKERS] ADD/DROP INHERITS
grzm,

> ALTER TABLE table DISOWN parent?

You can't disown your parents. ;-)

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] stable snapshot looks outdated
On Thursday 01 June 2006 14:29, Robert Treat wrote:
> Looking at http://www.postgresql.org/ftp/stable_snapshot/ surely we
> have achieved stability at least once since 2005-11-26.. :-) Can we
> get that fixed?

Bueller?

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] Snowball and ispell in tsearch2
We have got a lot of requests about including stemmers and ispell
dictionaries for all accessible languages in tsearch2. I understand
that tsearch2 would then be closer to the end user. But the sources of
the snowball stemmers are about 800 kB, and each ispell dictionary
takes about 0.5-2 MB. All sizes are with compression. I am afraid that
is too big... What are the opinions? Maybe putting it on pgFoundry?
Re: [HACKERS] Snowball and ispell in tsearch2
> Perhaps we can put together the source code for all language modules
> available and provide scripts to fetch ispell data or to generate the
> snowball stemmers. A debian package maintainer would have to fetch
> all the data to generate all language packages. Someone else might
> just want to download and compile a norwegian snowball stemmer.
>
> I'd be willing to help with such a project. I have experience with
> tsearch2 as well as with gentoo and debian packaging. I can't help
> with rpm, though.

I could help with a FreeBSD package, I suppose.
[HACKERS] sort_mem logging when exceeded ...
I thought we added something in 8.1 for doing this, but I can't seem to
find anything ... where it reports to the logs how much memory is/was
needed to do the sort ... Is that something for 8.2 that didn't get
into 8.1? Or am I looking at the wrong docs?

Thx

Marc G. Fournier       Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]      MSN . [EMAIL PROTECTED]
Yahoo . yscrappy      Skype: hub.org      ICQ . 7615664
Re: [HACKERS] Snowball and ispell in tsearch2
>> I'd be willing to help with such a project. I have experience with
>> tsearch2 as well as with gentoo and debian packaging. I can't help
>> with rpm, though.

> I could help with a FreeBSD package, I suppose.

Although I should probably finish up those damn GIN docs first :)
Re: [HACKERS] sort_mem logging when exceeded ...
Marc G. Fournier [EMAIL PROTECTED] writes:
> I thought we added something in 8.1 for doing this, but can't seem to
> find anything ... where it reports how much memory is/was needed to
> do the sort to the logs ...

trace_sort

			regards, tom lane
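For anyone else looking for it: trace_sort reports sort progress,
including whether the sort fit in memory and roughly how much it used,
at LOG level. A minimal sketch; the query is illustrative only:

    SET trace_sort = on;
    SET client_min_messages = log;   -- surface the LOG lines in psql
    SELECT g FROM generate_series(1, 100000) AS g ORDER BY g DESC;
    -- expect log output along the lines of
    -- "internal sort ended, ... KB used" (or external-sort messages
    -- if the sort spilled to tape)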
Re: [HACKERS] self-deadlock at FATAL exit of boostrap process on read error
Tom Lane [EMAIL PROTECTED] wrote:
> Do you have a test case to reproduce this problem?

According to the error message, the problem happens during reading
pg_database. I just tried to plug these lines into mdread():

    +   /* pretend there is an error reading pg_database */
    +   if (reln->smgr_rnode.relNode == 1262)
    +   {
    +       fprintf(stderr, "Ooops \n");
    +       return false;
    +   }

        v = _mdfd_getseg(reln, blocknum, false);

And it works.

Regards,
Qingqing
Re: [HACKERS] ADD/DROP INHERITS
Greg Stark [EMAIL PROTECTED] writes:
> While creating unreserved keywords isn't the end of the world, it
> seems better to stick to the vocabulary already there if possible. It
> makes it easier for the user to remember how to spell commands.

+1. Don't invent new keywords (even if unreserved) when there's no
strong reason to do so.

			regards, tom lane
[HACKERS] code cleanup for SearchSysCache
There are roughly 420 calls of SearchSysCache(), and 217 of them just
report "cache lookup failed". Shall we put the elog() in
SearchSysCache() itself? Notice that most of the searches are on the
Oid field -- which is *not* user-visible, so I think most of them can
safely let SearchSysCache() handle the failed search without reporting
any misleading information. Also, to support situations where we indeed
need to check the returned tuple, we can add a boolean parameter,
isComplain, to the argument list.

Regards,
Qingqing
Re: [HACKERS] How to avoid transaction ID wrap
Tom Lane wrote:
> Zdenek Kotala [EMAIL PROTECTED] writes:
>> Koichi Suzuki wrote:
>>> I've once proposed a patch for 64-bit transaction ID, but this
>>> causes some overhead to each tuple (XMIN and XMAX).
>> Did you check performance on 32-bit or 64-bit systems and a 64-bit
>> binary version of PGSQL? I think that today it is not a problem to
>> have a 64-bit architecture, and a 64-bit ID should increase the
>> scalability of Postgres.

I checked the performance on a 64-bit system with a 64-bit binary.

> The percentage increase in I/O demand is the main reason the patch
> was rejected, not so much the arithmetic.

That's right. I've also observed the I/O demand increase. I remember we
had to pay a three to five percent performance decrease in pgbench, so
I don't think we should apply this patch without further justification.
I'm looking for other reasons for a larger transaction ID.

-- 
Koichi Suzuki