[HACKERS] GIN - Generalized Inverted iNdex. Try 2.
We (me and Oleg) are glad to present GIN to PostgreSQL. If community will agree, we will commit it to HEAD branch. http://www.sigaev.ru/gin/gin.gz http://www.sigaev.ru/gin/README.txt Install: % cd pgsql % zcat gin.gz | patch -p0 make and initdb, install tsearch2 Changes from previous patch: * add support for tsearch2 * add 'fuzzy' limit * fixes README: Gin for PostgreSQL Gin was sponsored by jfg://networks (http://www.jfg-networks.com/) Gin stands for Generalized Inverted Index and should be considered as a genie, not a drink. Generalized means that index doesn't know what operation it accelerates, it works with strategies, defined for specific data type (read Index Method Strategies chapter in PostgreSQL documentation). In that sense, gin is similar to the GiST and differs from btree index, which has predefined comparison based operations. Inverted index is an index structure storing a (key,posting list) pairs, where posting list is a set of documents in which key occurs (document, usually, contains many keys). The primary goal of the Gin index is a support for scalable full text search in PostgreSQL. Gin is consists of a B-tree constructed over entries (ET, entries tree), where entry is an element of indexed value ( element of array, lexeme for tsvector) and where each tuple in the leaf pages is either pointer to B-tree over item pointers (PT, posting tree), or list of item pointers (PL, posting list) if tuple is small enough. Notes: There is no delete operation for ET. The reason for this, is that from our experience, a set of unique words of a whole collection changed very rare. This greatly simplify code and concurrency algorithm. Gin comes with built-in support for one dimensional arrays ( integer[], text[], no support for NULL elements) and following operations: * contains : value_array @ query_array * overlap : value_array query_array * contained: value_array ~ query_array Synopsis =# create index txt_idx on aa using gin(a); Features * Concurrency * WAL-logging (recoverable) * user-defined opclass, the scheme is similar to GiST * optimized index creation (make use of maintenance_work_mem to accumulate postings in memory) * tsearch2 opclass * soft upper limit of the returned results set using GUC variable gin_fuzzy_search_limit Gin fuzzy limit There are often situations, when full text search returns a very big set of results, which is very difficult to manage, since reading tuples from disk and their ordering could takes a lot of time, which is unacceptable for the production (notice, that search itself is very fast). Such queries are usually contain very frequent lexemes, so results are not very helpful. To facilitate execution of such queries we introduced a configurable soft upper limit of the size of the returned set - GUC variable gin_fuzzy_search_limit, which is 0 on default (no limitation). This subset randomly chosen from the whole result set. Soft means that the actual number of returned results could slightly differs from this limit, depending on the query and the quality of system random generator. From our experience, we found that value about several thousands (5000-2) is ok, i.e., gin fuzzy limit will have no effects for queries returning result set lesser than this number. Limitations * no support for multicolumn indices * Gin doesn't uses scan-kill_prior_tuple scan-ignore_killed_tuples * Gin searches entry only by equality matching, this may be improved in future * Gin doesn't supports full scan of index * Gin doesn't index NULL value Gin interface OpClass interface (pseudocode). Example for opclas is in ginarayproc.c. Datum* extractValue(Datum inputValue, uint32* nentries) Returns array of Datum of entries of value to be indexed, nentries should contains number of returning entries int compareEntry( Datum a, Datum b ) Compares two entries (not the indexing values!) Datum* extractQuery(Datum query, uint32* nentries, StrategyNumber n) Returns array of Datum of entries of query to be searched, n contains Strategy number of operation. bool consistent( bool[] check, StrategyNumber n, Datum query) The size of check array is the same as sizeof of array returned by extractQuery. Each element of check array is true if indexed value has corresponding entry in the query, i.e. if check[i] == TRUE then i-th entry of query is presented in indexed value. Function should returns true if indexed value matches by StrategyNumber and query. Open items We appreciate any comments, help and recommendations. * teach optimizer/executor, that GIN is intrinsically clustered, i.e., it always returns ItemPointer in ascending order. * tweak gincostestimate * GIN stores several ItemPointer to heap tuple, so vacuum full produces warning message: WARNING: index idx contains 88395 row versions, but table contains 51812 row versions HINT: Rebuild the index with REINDEX.
Re: [HACKERS] Avoiding redundant fetches of btree index metapages
On Tue, 2006-04-25 at 13:43 -0400, Tom Lane wrote: What I'm considering doing to fix that is require any change to a btree's metapage data to send out a relcache inval message for the index. That will force all backends to flush the stale cache item not later than the start of their next transaction, and thereby guarantee that they aren't using pointers that are too old to be safe against vacuuming. (There are other ways we could handle this, but that one seems like the simplest and least invasive.) Comments? Anyone see any flaws in the reasoning? Hmmm I'm slightly worried that frequently-churning small tables will be made even slower by this. What do you think? * Re-using rd_targblock was OK for a quick hack because we don't use it for indexes, but it's too klugy for real use, and it's not quite enough info anyway (we really ought to cache the root level as well as root block number). I'm planning to add a void * pointer to the Relation struct that the index AM is allowed to use as it sees fit, with the understanding that any pointed-to data lives in rd_indexcxt and the pointer will be reset to NULL on any relcache clear. btree would store a copy of the BTMetaPageData struct. The other AMs might have some other use for this. So we would be able to cache other items also? I'd also considered caching the rightmost page, for when the table grows via a monotonically increasing key. Similar benefits, same problems, so same-ish solution sounds the right way. For that case we'd save N block accesses to locate the rightmost leaf page. If the cache is wrong, we could seek again from the root at a cost of 1 additional access (or move right until we find it depending upon how stale we think the cache is, if we can find a heuristic to gauge that). We would only need to send an invalidation message when VACUUM removes a page, as well as for any insertion other than the rightmost page. Maybe do this as an index option, e.g. APPEND (MONOTONIC seems like a poor choice, even if it would be more correct)? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 2.
On Wed, 26 Apr 2006, Teodor Sigaev wrote: We (me and Oleg) are glad to present GIN to PostgreSQL. If community will agree, we will commit it to HEAD branch. http://www.sigaev.ru/gin/gin.gz http://www.sigaev.ru/gin/README.txt Install: % cd pgsql % zcat gin.gz | patch -p0 make and initdb, install tsearch2 I just built this, and noticed that the regression test for opr_sanity fails with your patch. I attached the regression.diffs. -- BOFH excuse #85: Windows 95 undocumented feature *** ./expected/opr_sanity.out Wed Jan 25 18:35:51 2006 --- ./results/opr_sanity.outWed Apr 26 08:31:13 2006 *** *** 778,785 WHERE p4.amopclaid = p2.oid AND p4.amopsubtype = p3.amopsubtype); oid | amname | oid | opcname | amopsubtype ! -++-+-+- ! (0 rows) -- Check that amopopr points at a reasonable-looking operator, ie a binary -- operator yielding boolean. --- 778,791 WHERE p4.amopclaid = p2.oid AND p4.amopsubtype = p3.amopsubtype); oid | amname | oid | opcname | amopsubtype ! --++--+---+- ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! (6 rows) -- Check that amopopr points at a reasonable-looking operator, ie a binary -- operator yielding boolean. *** *** 825,831 783 | 10 | | 783 | 11 | | 783 | 12 | | ! (24 rows) -- Check that all operators linked to by opclass entries have selectivity -- estimators. This is not absolutely required, but it seems a reasonable --- 831,840 783 | 10 | | 783 | 11 | | 783 | 12 | | ! 2742 |1 | ! 2742 |2 | @ ! 2742 |3 | ~ ! (27 rows) -- Check that all operators linked to by opclass entries have selectivity -- estimators. This is not absolutely required, but it seems a reasonable *** *** 847,854 WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND NOT binary_coercible(p3.opcintype, p2.oprleft); amopclaid | amopopr | oid | oprname | opcname ! ---+-+-+-+- ! (0 rows) SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3 --- 856,869 WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND NOT binary_coercible(p3.opcintype, p2.oprleft); amopclaid | amopopr | oid | oprname | opcname ! ---+-+--+-+--- ! 2746 |2750 | 2750 | | _text_ops ! 2745 |2750 | 2750 | | _int4_ops ! 2746 |2751 | 2751 | @ | _text_ops ! 2745 |2751 | 2751 | @ | _int4_ops ! 2746 |2752 | 2752 | ~ | _text_ops ! 2745 |2752 | 2752 | ~ | _int4_ops ! (6 rows) SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3 == ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 2.
I just built this, and noticed that the regression test for opr_sanity fails with your patch. I attached the regression.diffs. Sorry, this part isn't done yet, because we are waiting of community decision.. We don't add regression test yet. If community don't like to include GIN in core, we make a contrib/gin, but in this case GIN can't use WAL feature because of WAL interface can't call user-defined function. The reason for first diff is a hardcoded 'gist' index: -- We have to exclude GiST, unfortunately, since it hasn't got any fixed -- requirements about strategy operators. SELECT p1.oid, p1.amname, p2.oid, p2.opcname, p3.amopsubtype FROM pg_am AS p1, pg_opclass AS p2, pg_amop AS p3 WHERE p2.opcamid = p1.oid AND p3.amopclaid = p2.oid AND p1.amname != 'gist' AND p1.amstrategies != (SELECT count(*) FROM pg_amop AS p4 WHERE p4.amopclaid = p2.oid AND p4.amopsubtype = p3.amopsubtype); Second is right diff. For the thread one reason is that operations , ~, @ defined for anyarray, but used for particular types. *** ./expected/opr_sanity.out Wed Jan 25 18:35:51 2006 --- ./results/opr_sanity.outWed Apr 26 08:31:13 2006 *** *** 778,785 WHERE p4.amopclaid = p2.oid AND p4.amopsubtype = p3.amopsubtype); oid | amname | oid | opcname | amopsubtype ! -++-+-+- ! (0 rows) -- Check that amopopr points at a reasonable-looking operator, ie a binary -- operator yielding boolean. --- 778,791 WHERE p4.amopclaid = p2.oid AND p4.amopsubtype = p3.amopsubtype); oid | amname | oid | opcname | amopsubtype ! --++--+---+- ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2745 | _int4_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! 2742 | gin| 2746 | _text_ops | 0 ! (6 rows) -- Check that amopopr points at a reasonable-looking operator, ie a binary -- operator yielding boolean. *** *** 825,831 783 | 10 | | 783 | 11 | | 783 | 12 | | ! (24 rows) -- Check that all operators linked to by opclass entries have selectivity -- estimators. This is not absolutely required, but it seems a reasonable --- 831,840 783 | 10 | | 783 | 11 | | 783 | 12 | | ! 2742 |1 | ! 2742 |2 | @ ! 2742 |3 | ~ ! (27 rows) -- Check that all operators linked to by opclass entries have selectivity -- estimators. This is not absolutely required, but it seems a reasonable *** *** 847,854 WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND NOT binary_coercible(p3.opcintype, p2.oprleft); amopclaid | amopopr | oid | oprname | opcname ! ---+-+-+-+- ! (0 rows) SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3 --- 856,869 WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND NOT binary_coercible(p3.opcintype, p2.oprleft); amopclaid | amopopr | oid | oprname | opcname ! ---+-+--+-+--- ! 2746 |2750 | 2750 | | _text_ops ! 2745 |2750 | 2750 | | _int4_ops ! 2746 |2751 | 2751 | @ | _text_ops ! 2745 |2751 | 2751 | @ | _int4_ops ! 2746 |2752 | 2752 | ~ | _text_ops ! 2745 |2752 | 2752 | ~ | _int4_ops ! (6 rows) SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3 == ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
I would like to start a discussion regarding the TODO item “%Add a separate TRUNCATE permission” to gain more information. The new TRUNCATE permission: Is it meant to be a general truncating permission on all tables, schema's like: “I, the DBA give you the privilege to TRUNCATE” Or is this a per-table, per-schema truncate privilege. Could someone provide more information about this? Thank you. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Avoiding redundant fetches of btree index metapages
Simon Riggs [EMAIL PROTECTED] writes: Hmmm I'm slightly worried that frequently-churning small tables will be made even slower by this. What do you think? How so? So we would be able to cache other items also? Only to the extent that you can guarantee a stale cache entry isn't a problem. We've already done the analysis involved for the existing metapage entries, but anything else would require more thought. (And more cache flush events.) For that case we'd save N block accesses to locate the rightmost leaf page. Surely you mean log(N). regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Avoiding redundant fetches of btree index metapages
On Wed, 2006-04-26 at 12:53 -0400, Tom Lane wrote: So we would be able to cache other items also? Only to the extent that you can guarantee a stale cache entry isn't a problem. We've already done the analysis involved for the existing metapage entries, but anything else would require more thought. (And more cache flush events.) You mean performance tests! Will do. Methinks that cache flushing is the key to performance for that idea. For that case we'd save N block accesses to locate the rightmost leaf page. Surely you mean log(N). Depends what N is. I meant the level, not the number of rows. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
* Gevik Babakhani ([EMAIL PROTECTED]) wrote: The new TRUNCATE permission: Is it meant to be a general truncating permission on all tables, schema's like: ???I, the DBA give you the privilege to TRUNCATE??? Or is this a per-table, per-schema truncate privilege. Could someone provide more information about this? It would be a per-table, table-level privilege. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE
On Wed, 2006-04-26 at 13:31 -0400, Stephen Frost wrote: * Gevik Babakhani ([EMAIL PROTECTED]) wrote: The new TRUNCATE permission: Is it meant to be a general truncating permission on all tables, schema's like: ???I, the DBA give you the privilege to TRUNCATE??? Or is this a per-table, per-schema truncate privilege. It would be a per-table, table-level privilege. Would the privilege apply to the table depending on the table being truncated? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
* Gevik Babakhani ([EMAIL PROTECTED]) wrote: On Wed, 2006-04-26 at 13:31 -0400, Stephen Frost wrote: * Gevik Babakhani ([EMAIL PROTECTED]) wrote: The new TRUNCATE permission: Is it meant to be a general truncating permission on all tables, schema's like: ???I, the DBA give you the privilege to TRUNCATE??? Or is this a per-table, per-schema truncate privilege. It would be a per-table, table-level privilege. Would the privilege apply to the table depending on the table being truncated? eh? It's just like 'select', 'update', 'delete', etc. Either you have permission to truncate the table(s), or you don't. The main problem you'll run into here is not the implementation (it's trivial and I've already done it actually) for this specific permission but that we need to redesign the permission system to allow for more permission bits because otherwise we'll run out soon. My initial thought on how to do this was to split the permissions into use permissions and admin permissions. There's already a split along these lines built into the system (lower-order bits are use and higher-order bits are admin, or the other way around) but *alot* of things currently expect to be able to pass permissions around in 4 bytes. I'd be happy to look into this some more (and had planned to) but I've been rather busy lately (finals coming up). I think the use/admin split is the correct split because the admin permissions aren't checked very frequently (mainly by grants and people looking at the permission information). The use permissions are checked very frequently and so need to be kept fast. I don't think that would be very difficult to do though. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE
Gevik Babakhani [EMAIL PROTECTED] writes: Would the privilege apply to the table depending on the table being truncated? I think the idea is to require TRUNCATE privilege on all the tables being truncated in the command. This would substitute for the existing ownership check. I do have a concern here, which is that GRANT ALL on a table didn't use to convey TRUNCATE, but now it will. However, since GRANT ALL does confer the right to do DELETE FROM tab, maybe this isn't an issue. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
Stephen Frost [EMAIL PROTECTED] writes: we need to redesign the permission system to allow for more permission bits because otherwise we'll run out soon. Only if we keep inventing separate privileges for things as specific as TRUNCATE. I was just about to raise this point as a possible reason why not to invent a separate TRUNCATE bit. (There are other problems, eg both 't' and 'T' letters are already taken.) The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. In any case, I don't feel it necessary to panic about running out of permission bits when the space is only 75% used... with a little care it'll last us a long time yet, and I'm not eager to pay any performance price whatsoever just so we can invent the Joe Hacker Memorial Privilege Bit. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE
* Tom Lane ([EMAIL PROTECTED]) wrote: Gevik Babakhani [EMAIL PROTECTED] writes: Would the privilege apply to the table depending on the table being truncated? I think the idea is to require TRUNCATE privilege on all the tables being truncated in the command. This would substitute for the existing ownership check. Right, definitely agree about this. I do have a concern here, which is that GRANT ALL on a table didn't use to convey TRUNCATE, but now it will. However, since GRANT ALL does confer the right to do DELETE FROM tab, maybe this isn't an issue. Hmmm, I have to agree that this an interesting question. I don't tend to use GRANT ALL so I'm not really sure what people are thinking when they use it. It seems to me that it'd make sense to include TRUNCATE in 'GRANT ALL' (since it includes the abilities to create triggers and references, etc, which I wouldn't generally consider to be normal, where normal would be select/insert/update/delete). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
* Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost [EMAIL PROTECTED] writes: we need to redesign the permission system to allow for more permission bits because otherwise we'll run out soon. Only if we keep inventing separate privileges for things as specific as TRUNCATE. I was just about to raise this point as a possible reason why not to invent a separate TRUNCATE bit. (There are other problems, eg both 't' and 'T' letters are already taken.) Unfortunately the things which (I feel anyway) we should be allowing as grantable permissions really do fall into different categorizations (imv). TRUNCATE violates MVCC so is more than just DELETE (and I could definitely see where you might want to allow DELETE and *not* TRUNCATE). Additionally, I think you need more then SELECT for 'ANALYZE' or 'VACUUM'. I could maybe see associating ANALYZE/VACUUM privileges with privileges which can modify the table or with a new bit for both of them. I could also see TRUNCATE having that ability but I do believe that it'd be useful to be able to grant ANALYZE/VACUUM without granting TRUNCATE... The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. TRUNCATE doesn't follow MVCC... In any case, I don't feel it necessary to panic about running out of permission bits when the space is only 75% used... with a little care it'll last us a long time yet, and I'm not eager to pay any performance price whatsoever just so we can invent the Joe Hacker Memorial Privilege Bit. Splitting the privileges I don't think would incur any real performance hit at all but I'd rather use up the bits we have before changing things. I got the impression previously that the privilege system would need to be changed before adding more things to the current system would be allowed though. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
Tom Lane wrote: Stephen Frost [EMAIL PROTECTED] writes: we need to redesign the permission system to allow for more permission bits because otherwise we'll run out soon. Only if we keep inventing separate privileges for things as specific as TRUNCATE. I was just about to raise this point as a possible reason why not to invent a separate TRUNCATE bit. (There are other problems, eg both 't' and 'T' letters are already taken.) The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. TODO has: * %Add a separate TRUNCATE permission Currently only the owner can TRUNCATE a table because triggers are not called, and the table is locked in exclusive mode. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
Stephen Frost wrote: The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. TRUNCATE doesn't follow MVCC... We can certainly talk about fixing that. (And CLUSTER at the same time, I think.) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
* Alvaro Herrera ([EMAIL PROTECTED]) wrote: Stephen Frost wrote: The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. TRUNCATE doesn't follow MVCC... We can certainly talk about fixing that. (And CLUSTER at the same time, I think.) The issue is that it seems to be intractable to retain MVCC-ness *and* provide the performance savings TRUNCATE gives. If you can solve that problem then we could get rid of TRUNCATE and implement DELETE-without-WHERE using that magic. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE
On Wed, 2006-04-26 at 13:54 -0400, Tom Lane wrote: The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. This is actually a very good one I think. Perhaps we shouldn't code and overkill making things more complex. The only reason I can think of having a separate TRUNCATE permission is when one could not ROLLBACK a TRUNCATE. But this is not the case. We can rollback a TRUNCATE :) :) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
* Stephen Frost ([EMAIL PROTECTED]) wrote: * Alvaro Herrera ([EMAIL PROTECTED]) wrote: Stephen Frost wrote: The question that really ought to be answered before doing any of this is why DELETE privilege shouldn't be sufficient to allow TRUNCATE. TRUNCATE doesn't follow MVCC... We can certainly talk about fixing that. (And CLUSTER at the same time, I think.) The issue is that it seems to be intractable to retain MVCC-ness *and* provide the performance savings TRUNCATE gives. If you can solve that problem then we could get rid of TRUNCATE and implement DELETE-without-WHERE using that magic. Let me qualify that- in cases where there aren't row-level triggers or other things which would prevent it from being possible anyway. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
Stephen Frost wrote: * Alvaro Herrera ([EMAIL PROTECTED]) wrote: Stephen Frost wrote: TRUNCATE doesn't follow MVCC... We can certainly talk about fixing that. (And CLUSTER at the same time, I think.) The issue is that it seems to be intractable to retain MVCC-ness *and* provide the performance savings TRUNCATE gives. If you can solve that problem then we could get rid of TRUNCATE and implement DELETE-without-WHERE using that magic. Doh, sorry, I was thinking in CLUSTER :-( -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Regarding TODO item %Add a separate TRUNCATE permission
Bruce Momjian pgman@candle.pha.pa.us writes: Currently only the owner can TRUNCATE a table because triggers are not called, and the table is locked in exclusive mode. Doh. Of course the point about not calling ON DELETE triggers is why this has to be considered a special privilege. Never mind me, I've still got a bad head-cold :-( regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] ANSI-strict pointer aliasing rules
I ran afoul of these rules the other day when compiling pgsql 8.1 on AIX. The configure scripts are set up to look for xlc instead of cc, and that command invokes cc with -qalias=ansi, the ANSI-strict pointer aliasing mode. Specifically, in this mode, the compiler assumes that accesses via pointers of different types never alias. Unfortunately, this breaks all of the Value construction code, because we end up with (for example): Node *n = palloc0fast(...); n-tag = T_VALUE; Value *v = (Value *) n; v-tag = T_STRING; And aggressive compiler optimization can reorder these two tag assignments, resulting in the bizarre Unrecognized node type: 650 message. The fix is one of two things: 1. Change the tag element of structures to be a Node, and access the tag via it: Major code change, allows Node to change in the future for instrumentation et cetera. 2. Make the makeNode macro cast to the derived structure before assigning the tag: Minor code change, makes assumptions about derived structures. 3. Get configure to select cc instead of xlc: No code change, loses some performance. -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ANSI-strict pointer aliasing rules
Taral [EMAIL PROTECTED] writes: I ran afoul of these rules the other day when compiling pgsql 8.1 on AIX. The configure scripts are set up to look for xlc instead of cc, and that command invokes cc with -qalias=ansi, the ANSI-strict pointer aliasing mode. We've looked at this before. There's no way we are buying into the strict aliasing rules: it's not so much the code we know will break, as the fear of what we don't know about. gcc, at least, is utterly useless about warning about constructs that might change behavior under strict aliasing ... but without fairly reliable warnings of such problems, we have little hope of getting all the bugs out. You'll notice that configure already goes out of its way to force traditional aliasing for gcc, and I'd recommend doing the same for AIX's compiler. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ANSI-strict pointer aliasing rules
On Wed, Apr 26, 2006 at 03:45:00PM -0500, Taral wrote: I ran afoul of these rules the other day when compiling pgsql 8.1 on AIX. The configure scripts are set up to look for xlc instead of cc, and that command invokes cc with -qalias=ansi, the ANSI-strict pointer aliasing mode. PostgreSQL doesn't work with strict aliasing rules. That's why for GCC we pass -fno-strict-aliasing. What's the equivalent option for that compiler? The fix is one of two things: Well, there are a number of fixes, it's questionable whether it's worth the effort. In GCC you can mark structures that are aliased to avoid the problem (attribute((may_alias)) iirc), but we don't do that. 3. Get configure to select cc instead of xlc: No code change, loses some performance. 4. Find the option for disabling strict alias and get configure to add that. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] ANSI-strict pointer aliasing rules
On Wed, Apr 26, 2006 at 04:58:31PM -0400, Tom Lane wrote: We've looked at this before. There's no way we are buying into the strict aliasing rules: it's not so much the code we know will break, as the fear of what we don't know about. gcc, at least, is utterly useless about warning about constructs that might change behavior under strict aliasing ... but without fairly reliable warnings of such problems, we have little hope of getting all the bugs out. These warnings will never happen. Strict-aliasing could effect anything, all you'd get would be zillions of useless warnings. Take for example this innocuous looking code from libpq/hba.c: static void parse_hba_auth(ListCell **line_item, UserAuth *userauth_p, char **auth_arg_p, bool *error_p) { char *token; *auth_arg_p = NULL; if (!*line_item) { Strict aliasing says that 'line_item' and 'auth_arg_p' can't point to the same location because they are different types. Hence the store/load could be overlapped safely. *We* know that will never be a problem, but the compiler can't know, so it has to assume one way or the other. With no-strict-aliasing, it can't reorder and it might add wait-states instead, just in case (CPU dependant obviously). The compiler could warn you about this, but it'd be useless since you can't do anything about it anyway. In the grandparent's case I think the compiler is being really stupid because it can know the two pointers are the same. But I'm sure there are more subtle cases where it won't know. PostgreSQL typecasts pointers fairly freely. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
[HACKERS] todo items history
Please accept my apologies for this trivial question... I have been reading through the TODO items for the last couple of days and I couldn't stop wondering whether there is history of discussion kept or logged about the TODO items somewhere. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] todo items history
Gevik Babakhani wrote: Please accept my apologies for this trivial question... I have been reading through the TODO items for the last couple of days and I couldn't stop wondering whether there is history of discussion kept or logged about the TODO items somewhere. Not really, though the CVS history shows when the item was added. Many items represent complex threads of discussion, so only the general conclusion is in the TODO list. Is there an items that is unclear? -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] todo items history
On Wed, Apr 26, 2006 at 11:44:52PM +0200, Gevik Babakhani wrote: Please accept my apologies for this trivial question... I have been reading through the TODO items for the last couple of days and I couldn't stop wondering whether there is history of discussion kept or logged about the TODO items somewhere. You can sometimes find discussion about a TODO if you search the archives for the text of the TODO, but I'd say that's only about 60% sucessful. Bruce will sometimes capture relevant threads of discussion along with TODO items, though. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] todo items history
Not really, though the CVS history shows when the item was added. Many items represent complex threads of discussion, so only the general conclusion is in the TODO list. Is there an items that is unclear? The reason I asked this question is because I would like to contribute. In fact I proposed my first patch today (of course with Tom's and Alvaro's coaching). I thought maybe more is kept than general conclusion. I guess I have to search in the CVS history and annoy people by asking all kinds of questions. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] todo items history
Gevik Babakhani wrote: Not really, though the CVS history shows when the item was added. Many items represent complex threads of discussion, so only the general conclusion is in the TODO list. Is there an items that is unclear? The reason I asked this question is because I would like to contribute. In fact I proposed my first patch today (of course with Tom's and Alvaro's coaching). I thought maybe more is kept than general conclusion. I guess I have to search in the CVS history and annoy people by asking all kinds of questions. Usually we can clarify any item, and in fact the discussion details might have changed since the item was added, so by asking, you get current information. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] todo items history
Thank you :) On Wed, 2006-04-26 at 18:14 -0400, Bruce Momjian wrote: Gevik Babakhani wrote: Not really, though the CVS history shows when the item was added. Many items represent complex threads of discussion, so only the general conclusion is in the TODO list. Is there an items that is unclear? The reason I asked this question is because I would like to contribute. In fact I proposed my first patch today (of course with Tom's and Alvaro's coaching). I thought maybe more is kept than general conclusion. I guess I have to search in the CVS history and annoy people by asking all kinds of questions. Usually we can clarify any item, and in fact the discussion details might have changed since the item was added, so by asking, you get current information. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ANSI-strict pointer aliasing rules
On 4/26/06, Martijn van Oosterhout kleptog@svana.org wrote: Well, there are a number of fixes, it's questionable whether it's worth the effort. In GCC you can mark structures that are aliased to avoid the problem (attribute((may_alias)) iirc), but we don't do that. There's also C99's restrict. 4. Find the option for disabling strict alias and get configure to add that. You'll still lose performance, but the option is -qalias=noansi. -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On 4/25/06 12:24 PM, Tom Lane [EMAIL PROTECTED] wrote: I'm inclined to think that the right solution is to fix UpdateStats and setRelhasindex so that they don't use simple_heap_update, but call heap_update directly and cope with HeapTupleUpdated (by looping around and trying the update from scratch). Is there a verdict on what can/should/will be done for this? As far as I can tell from all this, there appears to be no workaround (even kludgy) other than to not build indexes in parallel - not an attractive option. If I'm only building two indexes simultaneously, what would happen if I tried to lock pg_class in the shorter index build transaction? Besides seeming like a bad idea... Wes ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On Wed, Apr 26, 2006 at 05:24:27PM -0500, Wes wrote: On 4/25/06 12:24 PM, Tom Lane [EMAIL PROTECTED] wrote: I'm inclined to think that the right solution is to fix UpdateStats and setRelhasindex so that they don't use simple_heap_update, but call heap_update directly and cope with HeapTupleUpdated (by looping around and trying the update from scratch). Is there a verdict on what can/should/will be done for this? As far as I can tell from all this, there appears to be no workaround (even kludgy) other than to not build indexes in parallel - not an attractive option. If I'm only building two indexes simultaneously, what would happen if I tried to lock pg_class in the shorter index build transaction? Besides seeming like a bad idea... Try running a first index build by itself and then running them in parallel. Hopefully once pg_class has an exact tuple count the conflicting update won't happen. If you actually have an exact tuple count you could also try updating pg_class manually beforehand, but that's not exactly a supported option... Another possibility would be to patch the code so that if the tuplecount found by CREATE INDEX is within X percent of what's already in pg_class it doesn't do the update. Since there's already code to check to see if the count is an exact match, this patch should be pretty simple, and the community might well accept it into the code as well. BTW, why are you limiting yourself to 2 indexes at once? I'd expect that for a table larger than memory you'd be better off building all the indexes at once so that everything runs off a single sequential scan. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
Jim C. Nasby [EMAIL PROTECTED] writes: Try running a first index build by itself and then running them in parallel. Yeah, this is probably the best workaround for now. I think we should look at making it fully concurrent-safe per upthread comments, but that won't be happening in existing release branches. Also, the only case where it's a problem is if the first two index builds finish at almost exactly the same time. It might be possible to overlap the first two index builds with reasonable safety so long as you choose indexes with very different sorting costs (eg, integer vs text columns, different numbers of columns, etc). regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pg_freespacemap question
Patch applied. Thanks. --- Mark Kirkwood wrote: Tom Lane wrote: Mark Kirkwood [EMAIL PROTECTED] writes: Good points! I had not noticed this test case. Probably NULL is better Would setting it to 'BLCKSZ - (fixed index header stuff)' be better, No, I don't think so, because that will just make it harder to recognize what's what (remember that BLCKSZ isn't really a constant, and the index overhead is not the same for all AMs either). The point here is that for indexes the FSM tracks whole-page availability, not the amount of free space within pages. So I think NULL is a reasonable representation of that. Using NULL will make it easy to filter the results if you want to see only heap-page data or only index-page data, whereas it will be very hard to do that if the view adopts an ultimately-artificial convention about the amount of available space on an index page. Right - after suggesting it I realized that coding the different index overhead for each possible AM would have been ... difficult :-). A patch is attached to implement the NULL free bytes and other recommendations: 1/ Index free bytes set to NULL 2/ Comment added to the README briefly mentioning the index business 3/ Columns reordered more logically 4/ 'Blockid' column removed 5/ Free bytes column renamed to just 'bytes' instead of 'blockfreebytes' Now 5/ was only hinted at, but seemed worth doing while I was there (hopefully I haven't made it too terse now). cheers Mark Index: pg_freespacemap.c === RCS file: /projects/cvsroot/pgsql/contrib/pg_freespacemap/pg_freespacemap.c,v retrieving revision 1.2 diff -c -r1.2 pg_freespacemap.c *** pg_freespacemap.c 14 Feb 2006 15:03:59 - 1.2 --- pg_freespacemap.c 9 Mar 2006 03:38:10 - *** *** 12,18 #include storage/freespace.h #include utils/relcache.h ! #define NUM_FREESPACE_PAGES_ELEM6 #if defined(WIN32) || defined(__CYGWIN__) /* Need DLLIMPORT for some things that are not so marked in main headers */ --- 12,18 #include storage/freespace.h #include utils/relcache.h ! #define NUM_FREESPACE_PAGES_ELEM5 #if defined(WIN32) || defined(__CYGWIN__) /* Need DLLIMPORT for some things that are not so marked in main headers */ *** *** 29,40 typedef struct { - uint32 blockid; - uint32 relfilenode; uint32 reltablespace; uint32 reldatabase; uint32 relblocknumber; ! uint32 blockfreebytes; } FreeSpacePagesRec; --- 29,40 typedef struct { uint32 reltablespace; uint32 reldatabase; + uint32 relfilenode; uint32 relblocknumber; ! uint32 bytes; ! boolisindex; } FreeSpacePagesRec; *** *** 91,107 /* Construct a tuple to return. */ tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, false); ! TupleDescInitEntry(tupledesc, (AttrNumber) 1, blockid, !INT4OID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 2, relfilenode, OIDOID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 3, reltablespace, OIDOID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 4, reldatabase, OIDOID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 5, relblocknumber, INT8OID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 6, blockfreebytes, INT4OID, -1, 0); /* Generate attribute metadata needed later to produce tuples */ --- 91,105 /* Construct a tuple to return. */ tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, false); ! TupleDescInitEntry(tupledesc, (AttrNumber) 1, reltablespace, OIDOID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 2, reldatabase, OIDOID, -1, 0); ! TupleDescInitEntry(tupledesc, (AttrNumber) 3, relfilenode, OIDOID, -1, 0); !
Re: [PATCHES] [HACKERS] pg_freespacemap question
Patch applied. Thanks. --- Mark Kirkwood wrote: Mark Kirkwood wrote: Tom Lane wrote: I do notice a rather serious shortcoming of pg_freespacemap in its current incarnation, which is that it *only* shows you the per-page free space data, and not any of the information that would let you determine what the FSM is doing to filter the raw data. The per-relation avgRequest and lastPageCount fields would be interesting for instance. Perhaps there should be a second view with one row per relation to carry the appropriate data. Ok - I did wonder about 2 views, but was unsure if the per-relation stuff was interesting. Given that it looks like it is interesting, I'll see about getting a second view going. This patch implements the second view for FSM relations. I have renamed the functions and views to be: pg_freespacemap_relations pg_freespacemap_pages This patch depends on the previous one (which was called simply 'pg_freespacemap.patch'). Cheers Mark [ application/gzip is not supported, skipping... ] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On Wed, Apr 26, 2006 at 06:42:53PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Try running a first index build by itself and then running them in parallel. Yeah, this is probably the best workaround for now. I think we should look at making it fully concurrent-safe per upthread comments, but that won't be happening in existing release branches. Also, the only case where it's a problem is if the first two index builds finish at almost exactly the same time. It might be possible to overlap the first two index builds with reasonable safety so long as you choose indexes with very different sorting costs (eg, integer vs text columns, different numbers of columns, etc). What about not updating if the tuplecount is within X percent? Would that be safe enough to back-port? I've been trying to think of a reason why disabling the current behavior of CREATE INDEX forcing reltuples to be 100% accurate but I can't think of one... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
Jim C. Nasby [EMAIL PROTECTED] writes: What about not updating if the tuplecount is within X percent? Would that be safe enough to back-port? Even if you got agreement that it was a good idea (I don't think so myself), it wouldn't help Wes, at least not for values of X smaller than 100. Presumably, that first CREATE INDEX is trying to update reltuples from zero to reality. Also, the first CREATE INDEX has to set relhasindex = true, and that's not fuzzy at all. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On 4/26/06 5:34 PM, Jim C. Nasby [EMAIL PROTECTED] wrote: Try running a first index build by itself and then running them in parallel. Hopefully once pg_class has an exact tuple count the conflicting update won't happen. If you actually have an exact tuple count you could also try updating pg_class manually beforehand, but that's not exactly a supported option... I thought about that. It would work well for the table with 3 indexes (1/2) either way, but would be an extra pass on the one with 4 (1/2/1 instead of 2/2). Another possibility would be to patch the code so that if the tuplecount found by CREATE INDEX is within X percent of what's already in pg_class it doesn't do the update. Since there's already code to check to see if the count is an exact match, this patch should be pretty simple, and the community might well accept it into the code as well. I don't think that would help here. I assume after the COPY, the tuple count is zero, and after the first index build, it is exact. Dumb question... Since COPY has to lock the table, why doesn't it take the current count in pg_class and increment it by the number of rows inserted? If you're doing a clean load of a table, that would result in an exact count. What about your idea of retrying the request if it detects a conflict? BTW, why are you limiting yourself to 2 indexes at once? I'd expect that for a table larger than memory you'd be better off building all the indexes at once so that everything runs off a single sequential scan. I don't know enough about the index build process. My presumption was that while you might get a gain during the read process, the head contention during the sort/write process would be a killer. I don't have enough spindles (yet) to separate out the different indexes. I think you'd only want to do one table at a time to avoid head contention during the read process. Wouldn't this tend to exacerbate the current problem, too? In this specific case, I guess I could do 1,2 and 1,3 parallel builds (for the 3 index/4 index tables). Right now I'm running with rather restricted hardware (1GB memory, two 2-disk RAID 0's and a single disk). If the pilot proves what I think is possible, and I can get real hardware (i.e. an intelligent caching array and some serious memory), things change a bit. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On Wed, Apr 26, 2006 at 07:13:08PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: What about not updating if the tuplecount is within X percent? Would that be safe enough to back-port? Even if you got agreement that it was a good idea (I don't think so myself), it wouldn't help Wes, at least not for values of X smaller than 100. Presumably, that first CREATE INDEX is trying to update reltuples from zero to reality. It may be, but an ANALYZE would eliminate that need and be far faster than waiting on one entire CREATE INDEX. I'm thinking that even being of by as much as 5% won't matter to the planner, and I can't think of any possible reason to need an exact tuplecount in pg_class... Also, the first CREATE INDEX has to set relhasindex = true, and that's not fuzzy at all. Oh, will each index build try and do that? Would changing that be non-invasive enough to backpatch (I'm guessing it's just an added conditional...) -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] ANSI-strict pointer aliasing rules
Martijn van Oosterhout kleptog@svana.org writes: In the grandparent's case I think the compiler is being really stupid because it can know the two pointers are the same. But I'm sure there are more subtle cases where it won't know. PostgreSQL typecasts pointers fairly freely. Actually, if xlc is behaving as Taral says then I'm pretty convinced it's *broken*; it is assuming far more than is allowed even by the ANSI aliasing rules. As I read the spec, ANSI aliasing says that a given value must always be accessed through equivalent (up to signedness) primitive types, ie, if you store through an int pointer and fetch through a long pointer the compiler is allowed to reorder those two references. In the example Taral gives, both field references are to fields of type NodeTag. I don't see anything in the spec that allows the compiler to assume they are distinct variables just because they are members of different struct types. The spec restriction is defined in terms of the lvalue type of the particular store or fetch access, not on what kind of structure it's part of. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 2.
What changed between Try 1 and Try 2? Teodor Sigaev wrote: We (me and Oleg) are glad to present GIN to PostgreSQL. If community will agree, we will commit it to HEAD branch. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 2.
Oh I can't read - ignore me :) Teodor Sigaev wrote: Changes from previous patch: * add support for tsearch2 * add 'fuzzy' limit * fixes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: Catalog Access (was: [HACKERS] [GENERAL] Concurrency problem
On 4/26/06 5:42 PM, Tom Lane [EMAIL PROTECTED] wrote: Yeah, this is probably the best workaround for now. I think we should look at making it fully concurrent-safe per upthread comments, but that won't be happening in existing release branches. I changed the index build script such that for each table it builds one index by itself, then builds the remaining indexes in parallel. This appears to be stable. I made several runs with no errors. I've got some more testing to do, then I'll try my big run. Thanks Wes ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] pg_dump
Anyone,Hello our server had postgresql 7.4 when updated to 8.1.2 everything is working fine, but the version of pg_dunp is 7.4How do i upgrade pg_dump so I can dump / backup the database?Thanks Bob
Re: [HACKERS] ANSI-strict pointer aliasing rules
Tom Lane [EMAIL PROTECTED] writes: As I read the spec, ANSI aliasing says that a given value must always be accessed through equivalent (up to signedness) primitive types It there a difference between C89 and C99 here? -- greg ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_dump
Moving to -general, where this belongs. How was PostgreSQL installed? There's no reason I know of why pg_dump shouldn't have been upgraded. On Thu, Apr 27, 2006 at 12:40:12PM +1000, Bob Shearer wrote: Anyone, Hello our server had postgresql 7.4 when updated to 8.1.2 everything is working fine, but the version of pg_dunp is 7.4 How do i upgrade pg_dump so I can dump / backup the database? Thanks Bob -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] ANSI-strict pointer aliasing rules
On Wed, Apr 26, 2006 at 11:42:11PM -0400, Greg Stark wrote: Tom Lane [EMAIL PROTECTED] writes: As I read the spec, ANSI aliasing says that a given value must always be accessed through equivalent (up to signedness) primitive types It there a difference between C89 and C99 here? I believe C89 makes no reference to aliasing restrictions. It only leaves certain operations, in certain contexts, implementation defined. That is, if you wish to maintain portable behaviour, you would choose not to make use of C features that have implementation defined semantics... :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] __ . . _ ._ . . .__. . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/|_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org