Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 07.11.2013 12:42, Dilip kumar wrote: This patch implements the following TODO item: Allow parallel cores to be used by vacuumdb http://www.postgresql.org/message-id/4f10a728.7090...@agliodbs.com Like parallel pg_dump, vacuumdb is provided with an option to run the vacuum of multiple tables in parallel. [ vacuumdb -j ]
1. One new option is provided with vacuumdb to give the number of workers.
2. All workers will be started at the beginning, and all will wait for a vacuum instruction from the master.
3. Now, if a table list is provided in the vacuumdb command using -t, then it will send the vacuum of one table to one idle worker, the next table to the next idle worker, and so on.
4. If vacuum is given for one DB, then it will execute a select on pg_class to get the table list, fetch the table names one by one, and likewise assign the vacuum responsibility to idle workers.
[...]
For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one? For the case where you have tables of varying size, this would lead to a reduced overall processing time, as it prevents large (read: long processing time) tables from being processed in the last step. Processing large tables first and filling up processing slots/jobs with smaller tables one after the other as slots get free would save overall execution time. Regards Jan -- professional: http://www.oscar-consult.de
[HACKERS] Improve code in tidbitmap.c
Hi, ISTM the code in tidbitmap.c should be improved. Patch attached. I think this patch increases the efficiency a bit. Thanks, Best regards, Etsuro Fujita

diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index 3ef0112..43628ac 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -361,7 +361,7 @@ tbm_union_page(TIDBitmap *a, const PagetableEntry *bpage)
 	if (bpage->ischunk)
 	{
 		/* Scan b's chunk, mark each indicated page lossy in a */
-		for (wordnum = 0; wordnum < WORDS_PER_PAGE; wordnum++)
+		for (wordnum = 0; wordnum < WORDS_PER_CHUNK; wordnum++)
 		{
 			bitmapword	w = bpage->words[wordnum];
@@ -473,7 +473,7 @@ tbm_intersect_page(TIDBitmap *a, PagetableEntry *apage, const TIDBitmap *b)
 			/* Scan each bit in chunk, try to clear */
 			bool		candelete = true;
-			for (wordnum = 0; wordnum < WORDS_PER_PAGE; wordnum++)
+			for (wordnum = 0; wordnum < WORDS_PER_CHUNK; wordnum++)
 			{
 				bitmapword	w = apage->words[wordnum];
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Heavily modified big table bloat even in auto vacuum is running
On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi haribabu.ko...@huawei.com wrote: On 07 November 2013 09:42 Amit Kapila wrote: I am not sure whether the same calculation as done for new_rel_tuples works for new_dead_tuples; you can check it once. I didn't find any way to calculate new_dead_tuples like new_rel_tuples. I will check it. I am thinking that if we have to do an estimation anyway, then wouldn't it be better to do it the way Tom had initially suggested ("Maybe we could have VACUUM copy the n_dead_tuples value as it exists when VACUUM starts, and then send that as the value to subtract when it's done?"). I think the reason you gave, that due to the tuple visibility check the number of dead tuples calculated by the above logic is not accurate, is right, but it will still make the value of dead tuples more appropriate than its current value. You can check whether there is a way to estimate dead tuples similar to new tuples; if it will be as solid as the current logic of vac_estimate_reltuples(), then it's okay, otherwise use the other solution (using the value of n_dead_tuples at the start of VACUUM) to solve the problem. Both approaches give only approximate values.
1. Taking a copy of n_dead_tuples before VACUUM starts and then subtracting it once it is done. This approach doesn't include the tuples which remain dead during the vacuum operation. Wouldn't the next or future vacuums make the estimate more appropriate?
2. The nkeep counter contains the tuples which are still visible to other transactions. This approach doesn't include tuples which are deleted on pages where the vacuum operation has already finished.
In my opinion the second approach gives a value nearer to the actual value, because it also includes some of the new dead tuples. Please correct me if anything is wrong in my analysis. I think the main problem with the nkeep logic is to come up with an estimation algorithm similar to the one for live tuples.
By the way, do you have a test case, or can you try to write one that shows this problem? Then, after the fix, you can verify that the problem is resolved. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 08 November 2013 03:22, Euler Taveira wrote: On 07-11-2013 09:42, Dilip kumar wrote: Dilip, this is on my TODO for 9.4. I've already had a half-baked patch for it. Let's see what I can come up with. OK, let me know if I can contribute to this. Is it required to move the common code for the parallel operation of pg_dump and vacuumdb to one place and reuse it? I'm not sure about that, because the pg_dump parallel code is tied to TOC entries. Also, dependencies matter for pg_dump, while in the scripts case a chosen order will be used. However, vacuumdb can share the parallel code with clusterdb and reindexdb (my patch does it). +1 Of course, a refactor to unify the parallel code (pg_dump and scripts) can be done in a separate patch. A prototype patch is attached to this mail; please provide your feedback/suggestions... I'll try to merge your patch with the one I have here before the next CF. Regards, Dilip
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On Fri, 08 November 2013 09:47 On Tue, Nov 5, 2013 at 3:20 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On execution of pg_resetxlog using the option -n:
1. It will display values in two sections.
2. The first section will be called "Current pg_control values" or "Guessed pg_control values".
3. In the first section, it will display all current (i.e. before change) values of the control file, or the guessed values.
4. The second section will be called "Values to be used after reset".
5. In the second section, it will display the new values of the parameters to be reset as per the user's request.
Please provide your opinion or expectations of this patch. Your approach in the patch seems to be in line with the TODO item. On a quick glance, I observed a few things which can make your patch better:
1. The purpose was to print pg_control values in one section and any other reset values in a different section, so in that regard, should we display the below in the new section, as here newXlogSegNo is not taken directly from pg_control?
PrintControlValues()
{
..
XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID, newXlogSegNo);
printf(_("First log segment after reset: %s\n"), fname);
}
Yes, we can print newXlogSegNo.
2. Why have an extra flag changedParam; can't we do without it? For example, we already use the set_xid value to check whether the user has provided an xid. You mean to say that we can print wherever the value of a control file parameter is getting assigned? If yes, then the problem is that every place will have to check the condition if (noupdate) and then print the corresponding value. E.g.
if (noupdate)
    printf(_("NextMultiXactId: %u\n"), ControlFile.checkPointCopy.nextMulti);
3.
+ static void
+ PrintNewControlValues(int changedParam)
+ {
+     if (changedParam)
+         printf(_("\n\nValues to be used after reset:\n\n"));
Even after the first if check fails, you still continue to check the other values; it is better if you can have the check if (changedParam) before calling this function. Yes, you are correct.
But now this check will not be required, because at least one value (i.e. that of newXlogSegNo) will always be printed. Please let me know if my understanding of all the comments is correct. After that I will update the patch. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 08 November 2013 13:38, Jan Lentfer wrote: For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one? For the case where you have tables of varying size, this would lead to a reduced overall processing time, as it prevents large (read: long processing time) tables from being processed in the last step. Processing large tables first and filling up processing slots/jobs with smaller tables one after the other as slots get free would save overall execution time. Good point, I have made the change and attached the modified patch. Regards, Dilip vacuumdb_parallel_v2.patch Description: vacuumdb_parallel_v2.patch
Re: [HACKERS] backup.sgml-cmd-v003.patch
On Thu, Nov 7, 2013 at 11:37 AM, Joshua D. Drake j...@commandprompt.com wrote: This isn't software, it is docs. It is ridiculous to suggest we break this up into 3-4 patches. This is a small doc patch to a single doc file (backup.sgml). I don't think it's ridiculous, but you can certainly disagree. superuser privileges; it's the selective-dump case where you can often get by without them. I've attached a proposed patch along these lines for your consideration. That's fair. Should I go ahead and apply that portion, then? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 08-11-2013 05:07, Jan Lentfer wrote: For the case where you have tables of varying size this would lead to a reduced overall processing time as it prevents large (read: long processing time) tables from being processed in the last step. Processing large tables first and filling up processing slots/jobs with smaller tables as they get free would save overall execution time. That is certainly a good strategy (not the optimal one [1] -- that is hard to achieve). Also, the strategy must: (i) consider the relation age before size (for vacuum); (ii) consider that you can't pick indexes of the same relation (for reindex). [1] http://www.postgresql.org/message-id/CA+TgmobwxqsagXKtyQ1S8+gMpqxF_MLXv=4350tfzvqawke...@mail.gmail.com -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consulting, Development, 24x7 Support and Training
Re: [HACKERS] ERROR during end-of-xact/FATAL
On Wed, Nov 6, 2013 at 9:40 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Noah Misch wrote: Incomplete list: - If smgrDoPendingDeletes() finds files to delete, mdunlink() and its callee relpathbackend() call palloc(); this is true in all supported branches. In 9.3, due to commit 279628a0, smgrDoPendingDeletes() itself calls palloc(). (In fact, it does so even when the pending list is empty -- this is the only palloc() during a trivial transaction commit.) palloc() failure there yields a PANIC during commit. I think we should fix this routine to avoid the palloc when not necessary. That initial palloc is pointless. Also, there have been previous discussions about having relpathbackend not use palloc at all. That was only because we wanted to use it in pg_xlogdump which didn't have palloc support at the time, so it's no longer as pressing; but perhaps it's still worthy of consideration. +1, but I'd like it better if we could find a way of avoiding the palloc in all cases. Panicking because we run out of memory at the wrong time isn't really very nice. Maybe the information needs to be maintained in the format in which it ultimately needs to be returned, so that we needn't rejigger it in the middle of a critical section. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
Tom Lane wrote: Albe Laurenz laurenz.a...@wien.gv.at writes: I have a question concerning the Foreign Data Wrapper API: I find no mention of this in the documentation, but I remember that you can only add a resjunk column that matches an existing attribute of the foreign table and not one with an arbitrary name or definition. Is that right? I don't recall such a limitation offhand. It's probably true that you can't create Vars that don't match some declared column of the table, but that doesn't restrict what you put into resjunk columns AFAIR. I am confused, probably due to lack of knowledge. What I would like to do is add a custom resjunk column (e.g. a bytea) in AddForeignUpdateTargets that carries a row identifier from the scan state to the modify state. Would that be possible? Can I have anything else than a Var in a resjunk column? Yours, Laurenz Albe
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
Albe Laurenz laurenz.a...@wien.gv.at writes: What I would like to do is add a custom resjunk column (e.g. a bytea) in AddForeignUpdateTargets that carries a row identifier from the scan state to the modify state. Would that be possible? Can I have anything else than a Var in a resjunk column? [ thinks for awhile... ] Hm. In principle you can put any expression you want into the tlist during AddForeignUpdateTargets. However, if it's not a Var then the planner won't understand that it's something that needs to be supplied by the table scan, so things won't work right in any but the most trivial cases (maybe not even then :-(). What I'd try is creating a Var that has the attno of ctid (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie bytea. This won't match what the catalogs say your table's ctid is, but I think that nothing will care much about that. It's definitely an area that could use more work. IIRC we'd discussed providing some way for an FDW to specify the type of the ctid column for its tables, but we didn't do anything about it in 9.3. regards, tom lane
Re: [HACKERS] stats for network traffic WIP
On Tue, Oct 29, 2013 at 11:26 AM, Nigel Heron nhe...@querymetrics.com wrote: So, for now, the counters only track sockets created from an inbound (client to server) connection. Here's v3 of the patch (rebase and cleanup). Hi, here's v4 of the patch. I added documentation and a new global view called pg_stat_socket (includes bytes_sent, bytes_received and stats_reset time). Thanks, -nigel.

*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 278,283 ****
--- 278,291 ----
      </row>

      <row>
+      <entry><structname>pg_stat_socket</><indexterm><primary>pg_stat_socket</primary></indexterm></entry>
+      <entry>One row only, showing statistics about the
+       cluster's communication socket activity. See
+       <xref linkend="pg-stat-socket-view"> for details.
+      </entry>
+     </row>
+
+     <row>
       <entry><structname>pg_stat_database</><indexterm><primary>pg_stat_database</primary></indexterm></entry>
       <entry>One row per database, showing database-wide statistics. See
        <xref linkend="pg-stat-database-view"> for details.

***************
*** 627,632 ****
--- 635,650 ----
         that was executed.
        </entry>
      </row>
+     <row>
+      <entry><structfield>bytes_sent</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes sent over this backend's communication socket</entry>
+     </row>
+     <row>
+      <entry><structfield>bytes_received</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes received over this backend's communication socket</entry>
+     </row>
     </tbody>
    </tgroup>
   </table>

***************
*** 735,740 ****
--- 753,801 ----
     single row, containing global data for the cluster.
    </para>

+  <table id="pg-stat-socket-view" xreflabel="pg_stat_socket">
+   <title><structname>pg_stat_socket</structname> View</title>
+
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry><structfield>bytes_sent</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>
+       Number of bytes sent over communication sockets.
+      </entry>
+     </row>
+     <row>
+      <entry><structfield>bytes_received</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>
+       Number of bytes received over communication sockets.
+      </entry>
+     </row>
+     <row>
+      <entry><structfield>stats_reset</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time at which these statistics were last reset</entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   The <structname>pg_stat_socket</structname> view will always have a
+   single row, containing global data for the cluster.
+   Only sockets created from inbound client connections are tracked (Unix sockets and TCP).
+   Streaming replication traffic is counted on the master, but not on the slave (see
+   <xref linkend="pg-stat-replication-view"> for details).
+  </para>
+
   <table id="pg-stat-database-view" xreflabel="pg_stat_database">
    <title><structname>pg_stat_database</structname> View</title>
    <tgroup cols="3">

***************
*** 859,864 ****
--- 920,935 ----
         in milliseconds</entry>
      </row>
      <row>
+      <entry><structfield>bytes_sent</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes sent over backend communication sockets in this database</entry>
+     </row>
+     <row>
+      <entry><structfield>bytes_received</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes received over backend communication sockets in this database</entry>
+     </row>
+     <row>
       <entry><structfield>stats_reset</></entry>
       <entry><type>timestamp with time zone</></entry>
       <entry>Time at which these statistics were last reset</entry>

***************
*** 1417,1422 ****
--- 1488,1503 ----
       </entry>
      </row>
      <row>
+      <entry><structfield>bytes_sent</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes sent over this WAL sender's communication socket</entry>
+     </row>
+     <row>
+      <entry><structfield>bytes_received</></entry>
+      <entry><type>bigint</></entry>
+      <entry>Number of bytes received over this WAL sender's communication socket</entry>
+     </row>
+     <row>
       <entry><structfield>state</></entry>
       <entry><type>text</></entry>
       <entry>Current WAL sender state</entry>

***************
*** 1613,1618 ****
--- 1694,1701 ----
     argument (requires superuser privileges).
     Calling <literal>pg_stat_reset_shared('bgwriter')</> will zero all the
     counters shown in the <structname>pg_stat_bgwriter</> view.
+    Calling <literal>pg_stat_reset_shared('socket')</> will zero all the
+    counters shown in the <structname>pg_stat_socket</> view.
Re: [HACKERS] Row-security writer-side checks proposal
On Wed, Nov 6, 2013 at 2:27 AM, Craig Ringer cr...@2ndquadrant.com wrote: Separate READ DELETE etc would only be interesting if we wanted to let someone DELETE rows they cannot SELECT. Since we have DELETE ... RETURNING, and since users can write a predicate function for DELETE that leaks the information even if we didn't, in practice if you give the user any READ right you've given them all of them. So I don't think we can support that (except maybe by column RLS down the track). Well, we could require SELECT privilege when a RETURNING clause is present... Except for the DELETE, this is actually just two policies, one for reads (session label = row label) and one for writes (session label = new row label). So this might be an acceptable constraint if necessary, but it'd be really good to support per-command rules, and we certainly need asymmetric read- and write- rules. OK. Support for automatically updatable security barrier views would take care of this issue, at which point I'd agree: RLS becomes mostly cosmetic syntactical sugar over existing capabilities. FKs would ignore RLS, much like they would if you use explicit SECURITY BARRIER views and have FKs between the base tables. One big difference still remains though: when you add an RLS policy on a table, all procedures and views referring to that table automatically use the transparent security barrier view over the table instead. That's *not* the case when you use views manually; you have to re-create views that point to the table so they instead point to a security barrier view over the table. Again it's nothing you can't do with updatable security barrier views, but it's automatic and transparent with RLS. That's true, but it's that automatic transparent part that also introduces a lot of pain, because what do you do when you need to really get at the real data (e.g. to back it up)? The ad-hoc rule "superusers are exempt" solves the problem at one level, but it doesn't do a lot for e.g. database owners.
I've looked at how some other vendors do it, and I can't say their approaches are pretty. Did you look at Trusted RUBIX? Both of these have a concept that Pg RLS doesn't seem to have: multiple RLS policies. I think that's actually quite important to consider, because we'll need that anyway to support RLS on a subset of columns. Both also have the concept of turning particular RLS policies on and off on a per-user basis or per-session using privileged on-login triggers, so that application A and application B can apply different RLS rules on the same data. I don't think it's important to cover these from the start, but it'd be a good idea not to foreclose these possibilities in whatever gets into Pg. I agree, and I'm not sure we're there yet. Frankly, switching from a single security policy per table to multiple policies per table doesn't sound like a good candidate for a follow-on commit; it's likely to have fundamental ramifications for the syntax, and I'm not eager to see us implement one syntax now only to overhaul it in the next release. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
On Tue, Nov 5, 2013 at 5:15 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/5/13, 1:04 AM, Arulappan, Arul Shaji wrote: Implements NCHAR/NVARCHAR as distinct data types, not as synonyms If, per SQL standard, NCHAR(x) is equivalent to CHAR(x) CHARACTER SET cs, then for some cs, NCHAR(x) must be the same as CHAR(x). Therefore, an implementation as separate data types is wrong. Interesting. Since the point doesn't seem to be getting through, let me try to be more clear: we're not going to accept any form of this patch. A patch that makes some progress toward actually coping with multiple encodings in the same database would be very much worth considering, but adding compatible syntax with incompatible semantics is not of interest to the PostgreSQL project. We have had this debate on many other topics in the past and will no doubt have it again in the future, but the outcome is always the same. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Gin page deletion bug
On 07.11.2013 22:49, Heikki Linnakangas wrote: Gin page deletion fails to take into account that there might be a search in-flight to the page that is deleted. If the page is reused for something else, the search can get very confused. ... The regular b-tree code solves this by stamping deleted pages with the current XID, and only allowing them to be reused once that XID becomes old enough (< RecentGlobalXmin). Another approach might be to grab a cleanup-strength lock on the left and parent pages when deleting a page, and requiring search to keep the pin on the page it's coming from, until it has locked the next page. I came up with the attached fix. In a nutshell, when walking along a right-link, the new page is locked before releasing the lock on the old one. Also, never delete the leftmost branch of a posting tree. I believe these changes are sufficient to fix the problem, because of the way the posting tree is searched: All searches on a posting tree are full searches, scanning the whole tree from left to right. Insertions do seek to the middle of the tree, but they are locked out during vacuum by holding a cleanup lock on the posting tree root (even without this patch). Thanks to this property, a search cannot step on a page that's being deleted by following a downlink, if we refrain from deleting the leftmost page on a level, as searches only descend down the leftmost branch. It would be nice to improve that in master - holding an exclusive lock on the root page is pretty horrible - but this is a nice back-patchable patch. I'm not worried about the loss of concurrency because we now have to hold the lock on the previous page when stepping to the next page. In searches, it's only a share-lock, and in insertions, it's very rare that you have to step right. The patch also adds some sanity checks to stepping right: the next page should be of the same type as the current page, e.g. stepping right should not go from leaf to non-leaf or vice versa.
- Heikki

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 2a6be4b..a738d80 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -112,10 +112,8 @@ ginFindLeafPage(GinBtree btree, GinBtreeStack *stack)
 			/* rightmost page */
 			break;
 
+		stack->buffer = ginStepRight(stack->buffer, btree->index, access);
 		stack->blkno = rightlink;
-		LockBuffer(stack->buffer, GIN_UNLOCK);
-		stack->buffer = ReleaseAndReadBuffer(stack->buffer, btree->index, stack->blkno);
-		LockBuffer(stack->buffer, access);
 		page = BufferGetPage(stack->buffer);
 	}
 
@@ -148,6 +146,41 @@ ginFindLeafPage(GinBtree btree, GinBtreeStack *stack)
 	}
 }
 
+/*
+ * Step right from current page.
+ *
+ * The next page is locked first, before releasing the current page. This is
+ * crucial to protect from concurrent page deletion (see comment in
+ * ginDeletePage).
+ */
+Buffer
+ginStepRight(Buffer buffer, Relation index, int lockmode)
+{
+	Buffer		nextbuffer;
+	Page		page = BufferGetPage(buffer);
+	bool		isLeaf = GinPageIsLeaf(page);
+	bool		isData = GinPageIsData(page);
+	BlockNumber	blkno = GinPageGetOpaque(page)->rightlink;
+
+	nextbuffer = ReadBuffer(index, blkno);
+	LockBuffer(nextbuffer, lockmode);
+	UnlockReleaseBuffer(buffer);
+
+	/* Sanity check that the page we stepped to is of similar kind. */
+	page = BufferGetPage(nextbuffer);
+	if (isLeaf != GinPageIsLeaf(page) || isData != GinPageIsData(page))
+		elog(ERROR, "right sibling of GIN page is of different type");
+
+	/*
+	 * Given the proper lock sequence above, we should never land on a
+	 * deleted page.
+	 */
+	if (GinPageIsDeleted(page))
+		elog(ERROR, "right sibling of GIN page was deleted");
+
+	return nextbuffer;
+}
+
 void
 freeGinBtreeStack(GinBtreeStack *stack)
 {
@@ -235,12 +268,12 @@ ginFindParents(GinBtree btree, GinBtreeStack *stack,
 	while ((offset = btree->findChildPtr(btree, page, stack->blkno, InvalidOffsetNumber)) == InvalidOffsetNumber)
 	{
 		blkno = GinPageGetOpaque(page)->rightlink;
-		LockBuffer(buffer, GIN_UNLOCK);
-		ReleaseBuffer(buffer);
 		if (blkno == InvalidBlockNumber)
+		{
+			UnlockReleaseBuffer(buffer);
 			break;
-		buffer = ReadBuffer(btree->index, blkno);
-		LockBuffer(buffer, GIN_EXCLUSIVE);
+		}
+		buffer = ginStepRight(buffer, btree->index, GIN_EXCLUSIVE);
 		page = BufferGetPage(buffer);
 	}
 
@@ -439,23 +472,21 @@ ginInsertValue(GinBtree btree, GinBtreeStack *stack, GinStatsData *buildStats)
 		{
 			BlockNumber rightlink = GinPageGetOpaque(page)->rightlink;
 
-			LockBuffer(parent->buffer, GIN_UNLOCK);
-
 			if (rightlink == InvalidBlockNumber)
 			{
 				/*
 				 * rightmost page, but we don't find parent, we should use
 				 * plain search...
 				 */
+				LockBuffer(parent->buffer, GIN_UNLOCK);
 				ginFindParents(btree, stack, rootBlkno);
 				parent = stack->parent;
 				Assert(parent != NULL);
 				break;
 			}
 
+			parent->buffer = ginStepRight(parent->buffer, btree->index, GIN_EXCLUSIVE);
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
Tom Lane wrote: Albe Laurenz laurenz.a...@wien.gv.at writes: What I would like to do is add a custom resjunk column (e.g. a bytea) in AddForeignUpdateTargets that carries a row identifier from the scan state to the modify state. Would that be possible? Can I have anything else than a Var in a resjunk column? [ thinks for awhile... ] Hm. In principle you can put any expression you want into the tlist during AddForeignUpdateTargets. However, if it's not a Var then the planner won't understand that it's something that needs to be supplied by the table scan, so things won't work right in any but the most trivial cases (maybe not even then :-(). What I'd try is creating a Var that has the attno of ctid (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie bytea. This won't match what the catalogs say your table's ctid is, but I think that nothing will care much about that. It's definitely an area that could use more work. IIRC we'd discussed providing some way for an FDW to specify the type of the ctid column for its tables, but we didn't do anything about it in 9.3. Thanks a lot, I will try that. Yours, Laurenz Albe
Re: [HACKERS] stats for network traffic WIP
Patch applies and builds against git HEAD (as of 6790e738031089d5). make check runs cleanly as well. The new features appear to work as advertised as far as I've been able to check. The code looks good as far as I can see. Documentation patches are included for the new features. Still to be tested: the counts for streaming replication (no replication setup here to test against yet).

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com | http://www.rrdonnelley.com

On Fri, Nov 8, 2013 at 9:01 AM, Nigel Heron nhe...@querymetrics.com wrote: On Tue, Oct 29, 2013 at 11:26 AM, Nigel Heron nhe...@querymetrics.com wrote: So, for now, the counters only track sockets created from an inbound (client to server) connection. here's v3 of the patch (rebase and cleanup). Hi, here's v4 of the patch. I added documentation and a new global view called pg_stat_socket (includes bytes_sent, bytes_received and stats_reset time) thanks, -nigel.
Re: [HACKERS] stats for network traffic WIP
Also still to be tested: performance impact.

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley

On Fri, Nov 8, 2013 at 9:33 AM, Mike Blackwell mike.blackw...@rrd.com wrote: Patch applies and builds against git HEAD (as of 6790e738031089d5). make check runs cleanly as well. The new features appear to work as advertised as far as I've been able to check.
Re: [HACKERS] Gin page deletion bug
Heikki Linnakangas hlinnakan...@vmware.com writes: I came up with the attached fix. In a nutshell, when walking along a right-link, the new page is locked before releasing the lock on the old one. Also, never delete the leftmost branch of a posting tree. I believe these changes are sufficient to fix the problem, because of the way the posting tree is searched: This seems reasonable, but I wonder whether the concurrency discussion shouldn't be in gist/README, rather than buried in a comment inside ginDeletePage (not exactly the first place one would think to look IMO). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
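The fix Heikki describes (lock the new page before releasing the lock on the old one while walking a right-link) is the classic hand-over-hand, or lock-coupling, pattern. A minimal illustrative sketch in Python, not the actual GIN code; `Node` and `walk_right` are invented names for illustration only:

```python
import threading

class Node:
    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()
        self.right = None  # right-link to the next page/node

def walk_right(start, target):
    """Hand-over-hand traversal: acquire the next node's lock *before*
    releasing the current one, so a concurrent deleter can never slip
    in between and recycle the node we are about to visit."""
    node = start
    node.lock.acquire()
    while node is not None:
        if node.value == target:
            node.lock.release()
            return True
        nxt = node.right
        if nxt is not None:
            nxt.lock.acquire()   # couple: take the next lock first...
        node.lock.release()      # ...then drop the old one
        node = nxt
    return False

# Build a small right-linked chain: 1 -> 2 -> 3
a, b, c = Node(1), Node(2), Node(3)
a.right, b.right = b, c
print(walk_right(a, 3))  # True
print(walk_right(a, 4))  # False
```

The invariant is that at every instant the walker holds at least one lock on the chain, which is what makes concurrent deletion of an already-passed node safe.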
Re: [HACKERS] logical changeset generation v6.5
On Tue, Nov 5, 2013 at 10:21 AM, Andres Freund and...@2ndquadrant.com wrote: Attached to this mail and in the xlog-decoding-rebasing-remapping branch in my git[1] repository you can find the next version of the patchset that: I have pushed patches #1 and #2 from this series as a single commit, after some editing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Changing pg_dump default file format
I don't want to hijack this thread any further, but Craig, thanks for your insight. -Harold On Thu, Nov 7, 2013 at 8:35 PM, Craig Ringer cr...@2ndquadrant.com wrote: On 11/08/2013 11:41 AM, Harold Giménez wrote: On Thu, Nov 7, 2013 at 7:01 PM, Craig Ringer cr...@2ndquadrant.com mailto:cr...@2ndquadrant.com wrote: (a) Lots of people only upgrade every two, three, or even more major versions. I'm dealing with clients on 8.3, and people still pop up on Stack Overflow with 8.1 sometimes! These people don't ever see the deprecated phase. Interesting that they'd never update their clients either. I guess if that's true there's a mentality of if it ain't broke don't fix it going on here. I've seen all sorts of combinations of client and server versions in the wild. Ancient JDBC, ODBC, etc drivers are also common. Would they read change logs before upgrading, or just cross their fingers? (b) People routinely ignore cron job output. Even important job output. They won't see the messages, and won't act on them if they do until something actually breaks. How common is this? I couldn't possibly say, I can only go by what I see in communication with clients, in private mail, in the mailing lists, and on Stack Overflow. I do see a couple of different groups: * People who upgrade casually without looking at release notes etc, then wonder why everything just broke. Typically people running PostgreSQL in development environments. I'm not too fussed about this group, they do it to themselves. On the other hand they're a *big* group (think Ruby on Rails developers, etc) and the more of them who whine about how PostgreSQL is painful to upgrade and always breaks, the worse it is for general uptake of Pg. * Those who don't upgrade because they don't know or care to. 
That box has been sitting there doing its thing for ages, and until they hit some key limitation, trigger an old bug, or finally need to do something different they don't realise that life would've been easier if they hadn't stayed on 7.1. These folks seem to be the most likely ones to unthinkingly upgrade from 7.something-low or 8.x straight to 9.3 without reading the release notes then wonder why things don't work, especially since they'll tend to do it in a hurry as a knee-jerk to try to fix a problem. * People who want to upgrade but are choked by bureaucracy and change management processes that move on a tectonic time scale. They're still on 8.something-low because it's just too painful to change. They're the ones who'll ask you to backport synchronous replication into 9.0 because they're not allowed to upgrade to 9.1/9.2, despite the obvious insanity of running custom-patched and rebuilt software instead of a well-tested public release. When they upgrade they do it in giant leaps despite the pain involved, just because it's so hard to upgrade at all. They're likely to plan carefully and do it right. * People who'd love to upgrade, but have an old database running a 24x7 mission critical system with enough data that they just can't get a dump/reload window, and they're on something older than 8.4 so they can't pg_upgrade. When they do upgrade they tend to plan it in detail, test carefully first, etc, so again they're pretty OK. In general, people are busy and the database isn't all they care about. They probably read the release notes about as well as you read the release notes on a new distro upgrade or something else that you care moderately about. If you're doing a major upgrade from an old Pg to a very new one there's a lot to take in and a lot of rel notes to read. 
One of the things that might help here is if we had (on the wiki, maybe) a single document that kept a running list of compatibility affecting changes and upgrades that need special action, along with a recommended upgrade path. That way people wouldn't have to read 6 major versions of release notes, get half way, and give up. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Changing pg_dump default file format
On 11/07/2013 08:35 PM, Craig Ringer wrote: I've seen all sorts of combinations of client and server versions in the wild. Ancient JDBC, ODBC, etc drivers are also common. There's a security compromise out there for the 8.1 JDBC driver which was discovered last year, and folks *keep* reporting it to pgsql-security. Even though the 8.1 driver had been officially deprecated for 2 years before the security issue was discovered. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Fix picksplit with nan values
Alexander Korotkov aekorot...@gmail.com writes: Thanks, Andrew! Good spot. I didn't examine order by operators for work with NaNs. I think this time the problem is in GiST itself rather than in the opclass. I'm going to fix it in a separate patch. Attached patch fixes knn GiST behaviour with NaN. It makes the RB-tree comparison function in GiST work like the float8 btree opclass comparison function. Hmm ... does that really work, or even do anything? I'd have thought that if either input is a NAN, the initial test if (sa->distances[i] != sb->distances[i]) would return false so we'd not enter the rest of it. regards, tom lane
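For readers following the comparison-semantics argument: under IEEE 754, every *ordered* comparison involving a NaN (and equality too) is false, while `!=` is true, since it is the negation of equality. This asymmetry is exactly what makes naive comparator code misbehave around NaNs. A quick sketch, illustrative only:

```python
nan = float("nan")

# Ordered comparisons and equality involving NaN are all false...
assert not (nan < 1.0)
assert not (nan > 1.0)
assert not (nan <= nan)
assert not (nan == nan)

# ...but inequality is true, because it negates the (false) equality.
assert nan != nan
assert nan != 1.0
print("NaN comparison semantics verified")
```

So any comparator built from plain `<`/`>`/`==` silently treats NaN as unordered relative to everything, including itself.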
Re: [HACKERS] backup.sgml-cmd-v003.patch
On 11/08/2013 06:04 AM, Robert Haas wrote: On Thu, Nov 7, 2013 at 11:37 AM, Joshua D. Drake j...@commandprompt.com wrote: This isn't software, it is docs. It is ridiculous to suggest we break this up into 3-4 patches. This is a small doc patch to a single doc file (backup.sgml). I don't think it's ridiculous, but you can certainly disagree. superuser privileges; it's the selective-dump case where you can often get by without them. I've attached a proposed patch along these lines for your consideration. That's fair. Should I go ahead and apply that portion, then? I am certainly not opposed. -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
Re: [HACKERS] Fix picksplit with nan values
Alexander Korotkov aekorot...@gmail.com writes: I wrote attached patch by following principles: 1) NaN coordinates shouldn't crash or hang GiST. 2) NaN coordinates should be processed in GiST index scan like in sequential scan. 3) NaN coordinates shouldn't lead to significant slowdown. I looked at this patch for awhile. It seems like there's still an awful lot of places in gistproc.c that are not worrying about NANs, and it's not clear to me that they don't need to. For instance, despite the changes in adjustBox(), it'll still be possible to have boxes with NAN boundaries if all the contained values are all-NAN boxes. It doesn't seem like gist_box_penalty will behave very sanely for that; it'll return a NAN penalty which seems unhelpful. The static box_penalty function doesn't work sanely for NANs either, and if it can return NAN then you also have to worry about NAN deltas in common_entry_cmp. And isn't it still possible for the Assert in gist_box_picksplit to fire? That could be illustrated on following test-case: create table test1 as (select point(random(), random()) as p from generate_series(1,100)); create index test1_idx on test1 using gist(p); create table temp as (select * from test1); insert into temp (select point('nan'::float8, 'nan'::float8) from generate_series(1,1000)); insert into temp (select point('nan'::float8, random()) from generate_series(1,1000)); insert into temp (select point(random(), 'nan'::float8) from generate_series(1,1000)); create table test2 as (select * from temp order by random()); create index test2_idx on test2 using gist(p); drop table temp; I think this test case is unlikely to generate any all-NAN index entry boxes, because almost certainly the initial entries will be non-NANs, and you've got it set up to keep incoming NANs from adjusting the boundaries of an existing box. You'd get better code coverage if you started by inserting some NANs into an empty index. 
Also, as a stylistic matter, I thought your previous solution of copying float8_cmp_internal was a better idea than manually inlining it (sans comments!) in multiple places as this version does. regards, tom lane
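The float8_cmp_internal behavior the thread keeps referring to is the btree float8 total order, in which NaN compares equal to NaN and sorts greater than every ordinary value. A hedged re-implementation sketch in Python (this is an illustration of the ordering rule, not the actual C code):

```python
import functools
import math

def float8_cmp(a: float, b: float) -> int:
    """Total-order comparison in the style of PostgreSQL's btree float8
    opclass: NaN == NaN, and NaN sorts above all ordinary values.
    (Illustrative sketch, not float8_cmp_internal itself.)"""
    a_nan, b_nan = math.isnan(a), math.isnan(b)
    if a_nan and b_nan:
        return 0
    if a_nan:
        return 1    # NaN is greater than any non-NaN
    if b_nan:
        return -1
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

values = [float("nan"), 2.0, float("nan"), -1.0, 0.5]
ordered = sorted(values, key=functools.cmp_to_key(float8_cmp))
print(ordered)  # ordinary values in order, NaNs grouped at the end
```

Because this comparator is a genuine total order, it can safely drive a sort or an RB-tree even when NaNs are present, which plain IEEE comparisons cannot.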
Re: [HACKERS] Protocol forced to V2 in low-memory conditions?
[ still catching up on old email ] I wrote: Andrew Dunstan and...@dunslane.net writes: On 09/11/2013 02:30 PM, Robert Haas wrote: On Tue, Sep 10, 2013 at 9:53 AM, Tom Lane t...@sss.pgh.pa.us wrote: Note that I was proposing removing libpq's support for V2 connections. Not the backend's. I vote against this. If we remove V2 support from libpq, then we'll have no easy way to test that the backend's support still works. How is it tested now, and who is doing the testing? Exactly. The current support in libpq is nigh useless for testing purposes, because there's no way to activate that code path on command. It only runs if libpq (thinks it) is connecting to a pre-7.4 backend. Actually ... there's another way we might deal with this. Suppose we invent a libpq connection option to specify whether to use v2 or v3 protocol, defaulting to the latter, and then just remove the protocol- fallback-during-connection code path. If there is anyone out there using a modern libpq to talk to a pre-7.4 server, they can be told to invoke the connection option. This gets rid of the unexpected-protocol-downgrade problem in a reliable manner, and it also gives us a way to test V2 code paths in both libpq and the backend, which Andrew is correct to finger as something that goes nearly totally untested right now. The main objections I can see to this are (1) it wouldn't provide a back-patchable fix, and (2) it'd be adding more legacy code instead of removing it. But the other approaches that we've talked about didn't sound very back-patchable either, so I think only (2) really holds much water. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protocol forced to V2 in low-memory conditions?
On Fri, Nov 8, 2013 at 2:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: [...] This approach sounds promising to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] logical changeset generation v6.5
On Fri, Nov 8, 2013 at 12:38 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 5, 2013 at 10:21 AM, Andres Freund and...@2ndquadrant.com wrote: Attached to this mail and in the xlog-decoding-rebasing-remapping branch in my git[1] repository you can find the next version of the patchset that: I have pushed patches #1 and #2 from this series as a single commit, after some editing. And I've also pushed patch #13, which is an almost-totally-unrelated improvement that has nothing to do with logical replication, but is useful all the same. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] backup.sgml-cmd-v003.patch
On Fri, Nov 8, 2013 at 1:33 PM, Joshua D. Drake j...@commandprompt.com wrote: superuser privileges; it's the selective-dump case where you can often get by without them. I've attached a proposed patch along these lines for your consideration. That's fair. Should I go ahead and apply that portion, then? I am certainly not opposed. OK, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] regclass error reports improperly downcased
On 11/7/13 6:41 PM, Tom Lane wrote: Jim Nasby jna...@enova.com writes: decibel@decina.cashnetusa=# SELECT 'Moo'::regclass; ERROR: relation "moo" does not exist at character 8 That's doing what it's supposed to. Compare regression=# select 'Moo'::regclass; ERROR: relation "moo" does not exist LINE 1: select 'Moo'::regclass; ^ regression=# select '"Moo"'::regclass; ERROR: relation "Moo" does not exist LINE 1: select '"Moo"'::regclass; ^ The regclass input converter applies the same case-folding rules as the SQL parser does, ie, fold unless double-quoted. Ahh, duh. Hrm... I ran across this because someone here got confused by this: SELECT pg_total_relation_size( schemaname || '.' || relname ) FROM pg_stat_all_tables ERROR: relation moo does not exist Obviously the problem is that they needed to use quote_ident(), but I was hoping to make the error less confusing to deal with. Perhaps we can add a hint? Something to the effect of "Do you need to use double-quotes or quote_ident()?" -- Jim Nasby, Lead Data Architect (512) 569-9461
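The fold-unless-double-quoted rule Tom describes can be sketched in a few lines. This is a simplified illustration of the case-folding behavior only (`fold_ident` is an invented name; real identifier parsing also handles embedded quotes, schema qualification, and non-ASCII folding rules):

```python
def fold_ident(ident: str) -> str:
    """Mimic PostgreSQL's identifier case folding as seen by the
    regclass input converter: a double-quoted identifier keeps its
    case; anything else is folded to lower case.  Simplified sketch."""
    if len(ident) >= 2 and ident[0] == '"' and ident[-1] == '"':
        return ident[1:-1]
    return ident.lower()

# 'Moo'::regclass looks up relation "moo"...
print(fold_ident('Moo'))    # moo
# ...while '"Moo"'::regclass looks up relation "Moo"
print(fold_ident('"Moo"'))  # Moo
```

This is why concatenating `schemaname || '.' || relname` fails for mixed-case relation names: the stored catalog name keeps its case, but the unquoted text gets folded before lookup, hence the advice to wrap the pieces in quote_ident().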
Re: [HACKERS] backup.sgml-cmd-v003.patch
On 11/08/2013 02:12:56 PM, Robert Haas wrote: On Fri, Nov 8, 2013 at 1:33 PM, Joshua D. Drake j...@commandprompt.com wrote: superuser privileges; it's the selective-dump case where you can often get by without them. I've attached a proposed patch along these lines for your consideration. That's fair. Should I go ahead and apply that portion, then? I am certainly not opposed. OK, done. So, Joshua/Ivan, do you want to send another version of the patch with the committed changes omitted for further review? Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein
Re: [HACKERS] regclass error reports improperly downcased
Jim Nasby jna...@enova.com writes: Ahh, duh. Hrm... I ran across this because someone here got confused by this: SELECT pg_total_relation_size( schemaname || '.' || relname ) FROM pg_stat_all_tables ERROR: relation moo does not exist Personally I'd do that like select pg_total_relation_size(oid) from pg_class where ... and avoid fooling with regclass conversion at all. regards, tom lane
Re: [HACKERS] Gin page deletion bug
On 08.11.2013 19:14, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: I came up with the attached fix. In a nutshell, when walking along a right-link, the new page is locked before releasing the lock on the old one. Also, never delete the leftmost branch of a posting tree. I believe these changes are sufficient to fix the problem, because of the way the posting tree is searched: This seems reasonable, but I wonder whether the concurrency discussion shouldn't be in gist/README, rather than buried in a comment inside ginDeletePage (not exactly the first place one would think to look IMO). Yeah, README is probably better. There's zero explanation of concurrency and locking issues there at the moment, so I added a new section there explaining why the page deletion is (now) safe. There's a lot more to say about locking in GIN, but I'll leave that for another day. - Heikki
Re: [HACKERS] Minmax indexes
Robert Haas escribió: On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Here's an updated version of this patch, with fixes to all the bugs reported so far. Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and Amit Kapila for the reports. I'm not very happy with the use of a separate relation fork for storing this data. I have been playing with having the revmap in the main fork of the index rather than a separate one. On the surface many things stay just what they are; I only had to add a layer beneath the revmap that maps its logical block numbers to physical block numbers. The problem with this is that it needs more disk access, because revmap block numbers cannot be hardcoded. After doing some quick math, what I ended up with was to keep an array of BlockNumbers in the metapage. Each element in this array points to array pages; each array page is, in turn, filled with more BlockNumbers, which this time correspond to the logical revmap pages we used to have in the revmap fork. (I initially feared that this design would not allow me to address enough revmap pages for the largest of tables; but fortunately this is sufficient unless you configure very small pages, say BLCKSZ < 2kB, use small page ranges, and use small datatypes, say char. I have no problem with saying that that scenario is not supported if you want to have minmax indexes on 32 TB tables. I mean, who uses BLCKSZ smaller than 8kB anyway?). The advantage of this design is that in order to find any particular logical revmap page, you always have to do a constant number of page accesses. You read the metapage, then read the array page, then read the revmap page; done. Another idea I considered was chaining revmap pages (so each would have a pointer-to-next), or chaining array pages; but this would have meant that to locate an individual page to the end of the revmap, you might need to do many accesses. Not good.
As an optimization for relatively small indexes, we hardcode the page number for the first revmap page: it's always the page right after the metapage (so BlockNumber 1). A revmap page can store, with the default page size, about 1350 item pointers; so with an index built for page ranges of 1000 pages per range, you can point to enough index entries for a ~10 GB table without having the need to examine the first array page. This seems pretty acceptable; people with larger tables can likely spare one extra page accessed every now and then. (For comparison, each regular minmax page can store about 500 index tuples, if it's built for a single 4-byte column; this means that the 10 GB table requires a 5-page index.) This is not complete yet; although I have a proof-of-concept working, I still need to write XLog support code and update the pageinspect code to match. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
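Alvaro's sizing claims can be checked with back-of-envelope arithmetic using only the numbers given in the post (1350 item pointers per revmap page, 1000 heap pages per range, 8 kB blocks, ~500 index tuples per regular page). A small sketch:

```python
BLCKSZ = 8192                  # default page size
PAGES_PER_RANGE = 1000
IPS_PER_REVMAP_PAGE = 1350     # item pointers per revmap page (approx., from the post)
TUPLES_PER_MINMAX_PAGE = 500   # index tuples per regular page, single 4-byte column

# Heap size addressable through the first (hardcoded) revmap page alone:
bytes_per_revmap_page = IPS_PER_REVMAP_PAGE * PAGES_PER_RANGE * BLCKSZ
print(bytes_per_revmap_page / 2**30)   # ~10.3 GiB, matching the "~10 GB table" claim

# Index pages needed for a 10 GiB heap:
heap_pages = 10 * 2**30 // BLCKSZ      # 1,310,720 heap pages
ranges = -(-heap_pages // PAGES_PER_RANGE)         # ceil: 1311 page ranges
data_pages = -(-ranges // TUPLES_PER_MINMAX_PAGE)  # ceil: 3 regular index pages
total = 1 + 1 + data_pages             # metapage + first revmap page + data pages
print(total)  # 5, matching the "5-page index" claim
```

So a single hardcoded revmap page really does cover roughly 10 GiB of heap before the first array page ever needs to be consulted.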
Re: [HACKERS] Minmax indexes
Alvaro Herrera escribió: I have been playing with having the revmap in the main fork of the index rather than a separate one. ... This is not complete yet; although I have a proof-of-concept working, I still need to write XLog support code and update the pageinspect code to match. Just to be clear: the v7 published elsewhere in this thread does not contain this revmap-in-main-fork code. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ERROR during end-of-xact/FATAL
On Wed, Nov 06, 2013 at 10:14:53AM +0530, Amit Kapila wrote: On Thu, Oct 31, 2013 at 8:22 PM, Noah Misch n...@leadboat.com wrote: If the original AbortTransaction() pertained to a FATAL, the situation is worse. errfinish() promotes the ERROR thrown from AbortTransaction() to another FATAL, doesn't errstart promote ERROR to FATAL? Right. When I tried the above scenario, I hit an Assert at a different place ... This means that in the situation when an ERROR occurs in AbortTransaction, which is called as a result of a FATAL error, there are many more possibilities of hitting an Assert. Agreed. About the unclean FATAL-then-ERROR scenario, one way to deal with it at a high level could be to treat such a case as a backend crash, in which case the postmaster reinitialises shared memory and other stuff. If we can't manage to free a shared memory resource like a lock or buffer pin, we really must PANIC. Can't we try to initialise the shared memory and other resources; wouldn't that resolve the problems that can occur due to the scenario explained by you? A PANIC will reinitialize everything relevant, largely resolving the problems around ERROR during FATAL. It's a heavy-handed solution, but it may well be the best solution. Efforts to harden CommitTransaction() and AbortTransaction() seem well-spent, but the additional effort to make FATAL exit cope where AbortTransaction() or another exit action could not cope seems to be slicing ever-smaller portions of additional robustness. I pondered a variant of that conclusion that distinguished critical cleanup needs from the rest. Each shared resource (heavyweight locks, buffer pins, LWLocks) would have an on_shmem_exit() callback that cleans up the resource under a critical section. (AtProcExit_Buffers() used to fill such a role, but resowner.c's work during AbortTransaction() has mostly supplanted it.) The ShutdownPostgres callback would not use a critical section, so lesser failures in AbortTransaction() would not upgrade to a PANIC.
But I'm leaning against such a complication on the grounds that it would add seldom-tested code paths posing as much a chance of eroding robustness as bolstering it. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
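The exit-callback scheme Noah sketches relies on callbacks registered per resource and run in reverse registration order at process exit, which is how on_shmem_exit() callbacks behave. A minimal stand-alone sketch of that LIFO discipline (illustrative only; this is not the PostgreSQL implementation, and the resource names are just labels):

```python
# LIFO exit-callback registry in the spirit of on_shmem_exit():
# each shared resource registers a cleanup callback, and callbacks
# run in reverse registration order at process exit.
exit_callbacks = []

def on_shmem_exit(func, arg):
    """Register a cleanup callback with its argument."""
    exit_callbacks.append((func, arg))

def proc_exit_run_callbacks():
    """Run all registered callbacks, last-registered first."""
    order = []
    while exit_callbacks:
        func, arg = exit_callbacks.pop()   # LIFO
        func(arg, order)
    return order

def release(name, log):
    log.append(name)  # stand-in for actually freeing the resource

on_shmem_exit(release, "heavyweight locks")
on_shmem_exit(release, "buffer pins")
on_shmem_exit(release, "LWLocks")
print(proc_exit_run_callbacks())
# ['LWLocks', 'buffer pins', 'heavyweight locks']
```

LIFO ordering matters because later-acquired resources often depend on earlier ones; releasing in reverse order mirrors the acquisition order, the same reason destructors and unwind handlers run last-in-first-out.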
Re: [HACKERS] backup.sgml-cmd-v003.patch
On 11/08/2013 12:18 PM, Karl O. Pinc wrote: On 11/08/2013 02:12:56 PM, Robert Haas wrote: On Fri, Nov 8, 2013 at 1:33 PM, Joshua D. Drake j...@commandprompt.com wrote: superuser privileges; it's the selective-dump case where you can often get by without them. I've attached a proposed patch along these lines for your consideration. That's fair. Should I go ahead and apply that portion, then? I am certainly not opposed. OK, done. So Joshua/Ivan Do you want to send another version of the patch with the committed changes omitted for further review? I have no problem having Ivan submit another patch, outside of the multiple patch requirement. If you guys will accept a single patch file with the agreed changes, then we are good. Joshua D. Drake
Re: [HACKERS] backup.sgml-cmd-v003.patch
On 11/08/2013 03:42:56 PM, Joshua D. Drake wrote: On 11/08/2013 12:18 PM, Karl O. Pinc wrote: On 11/08/2013 02:12:56 PM, Robert Haas wrote: On Fri, Nov 8, 2013 at 1:33 PM, Joshua D. Drake j...@commandprompt.com wrote: Should I go ahead and apply that portion, then? I am certainly not opposed. OK, done. So Joshua/Ivan Do you want to send another version of the patch with the committed changes omitted for further review? I have no problem having Ivan submit another patch, outside of the multiple patch requirement. If you guys will accept a single patch file with the agreed changes, then we are good. I can't say what will be accepted and haven't really kept track of what has been accepted. If there's more that you want to patch then send that and we'll work from there. Karl k...@meme.com
Re: [HACKERS] logical changeset generation v6.5
On 11/8/13, 3:03 PM, Robert Haas wrote: On Fri, Nov 8, 2013 at 12:38 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 5, 2013 at 10:21 AM, Andres Freund and...@2ndquadrant.com wrote: Attached to this mail and in the xlog-decoding-rebasing-remapping branch in my git[1] repository you can find the next version of the patchset that: I have pushed patches #1 and #2 from this series as a single commit, after some editing. And I've also pushed patch #13, which is an almost-totally-unrelated improvement that has nothing to do with logical replication, but is useful all the same. Please fix this new compiler warning: pg_regress_ecpg.c: In function ‘main’: pg_regress_ecpg.c:170:2: warning: passing argument 3 of ‘regression_main’ from incompatible pointer type [enabled by default] In file included from pg_regress_ecpg.c:19:0: ../../../../src/test/regress/pg_regress.h:55:5: note: expected ‘init_function’ but argument is of type ‘void (*)(void)’
Re: [HACKERS] Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
[ I'm so far behind ... ] Bruce Momjian br...@momjian.us writes: Applied. Thank you for all your suggestions. I thought the suggestion had been to issue a *warning*. How did that become an error? This patch seems likely to break applications that may have just been harmlessly sloppy about when they were issuing SETs and/or what flavor of SET they use. We don't for example throw an error for START TRANSACTION with an open transaction or COMMIT or ROLLBACK without one --- how can it possibly be argued that these operations are more dangerous than those cases? I'd personally have voted for using NOTICE. regards, tom lane
Re: [HACKERS] unaccent module - two params function should be immutable
Bruce Momjian br...@momjian.us writes: [ mark unaccent functions immutable ] Applied. This patch is flat out wrong and needs to be reverted. The functions were correctly marked (by you!) in commit c0577c92a84cc477a88fe6868c16c4a7e3348b11 on the basis of the discussion of bug #5781, http://www.postgresql.org/message-id/201012021544.ob2fitn1041...@wwwmaster.postgresql.org which was a request exactly like this one and was denied for good and sufficient reasons. There was absolutely no reasoning given in this thread that explained why we should ignore the previous objections. In particular, marking the single-argument version of unaccent() as immutable is the height of folly because its behavior depends on the setting of search_path. Changing the two-argument function is maybe a bit more debatable, but that's not what you did. If we were going to change the behavior, this patch would still be wrong because it fails to provide an upgrade path. The objections saying you needed a 1.1 migration script were completely correct. regards, tom lane
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On Fri, Nov 8, 2013 at 10:37 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On Fri, 08 November 2013 09:47 On Tue, Nov 5, 2013 at 3:20 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On execution of pg_resetxlog using the option -n Please provide your opinion or expectation out of this patch. Your approach in the patch seems to be in line with the TODO item. On a quick glance, I observed a few things which can make your patch better: 1. The purpose was to print pg_control values in one section and any other reset values in a different section, so in that regard, should we display the below in the new section, as here newXlogSegNo is not directly from pg_control? PrintControlValues() { .. XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID, newXlogSegNo); printf(_("First log segment after reset: %s\n"), fname); } Yes we can print newXlogSegNo. I think then your documentation also needs updates. 2. Why have extra flags for changedParam; can't we do without them? For example, we already use the set_xid value to check if the user has provided an xid. You mean to say that we can print wherever values of control file parameters are getting assigned. If yes, then the problem is that every place will have to check the condition if (noupdate) and then print the corresponding value. E.g. if (noupdate) printf(_("NextMultiXactId: %u\n"), ControlFile.checkPointCopy.nextMulti); No, not this way; maybe make set_xid a file-level parameter, so that you can use it later as well. Your current way also doesn't seem to be too unreasonable; however, it is better if you can do without it. One more thing: I think as per this patch a few parameters will be displayed twice, once in the "pg_control values ..." section and once in "Values to be used after reset:", so by doing this I guess you want to make it easier for the user to refer to both pg_control's original/guessed value and the new value after reset. Here I wonder: if someone wants to refer to the original values, can't he directly use pg_controldata?
Does anyone else have thoughts about how we can display these values in a way that improves the current situation for the user? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
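To illustrate the derived value under discussion, here is a minimal, hypothetical C sketch: the "First log segment after reset" name is computed from the timeline and the new segment number rather than read from pg_control, which is why the thread argues it belongs in the "values to be used after reset" section. The function name and the segments-per-xlog-id constant (256, the value implied by the default 16MB segment size) are assumptions modeled on PostgreSQL's XLogFileName() macro, not the actual pg_resetxlog code.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Default WAL layout: 0x100000000 bytes per "xlog id" / 16MB segments = 256. */
#define SEGMENTS_PER_XLOG_ID 256

/*
 * Format a WAL segment file name the way XLogFileName() does:
 * 8 hex digits each for the timeline, the high half of the segment
 * number, and the low half of the segment number.
 */
void
xlog_file_name(char *buf, size_t buflen, uint32_t tli, uint64_t segno)
{
    snprintf(buf, buflen, "%08X%08X%08X",
             tli,
             (uint32_t) (segno / SEGMENTS_PER_XLOG_ID),
             (uint32_t) (segno % SEGMENTS_PER_XLOG_ID));
}
```

For example, timeline 1 and segment number 5 would yield 000000010000000000000005, matching the file names seen in pg_xlog.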
[HACKERS] Fix pg_isolation_regress to work outside its build directory: compiler warning
Commit 9b4d52f2095be96ca238ce41f6963ec56376491f introduced a new compiler warning in the Windows Visual Studio build:

D:\Postgres\b\pgsql.sln (default target) (1) -
D:\Postgres\b\pg_regress_ecpg.vcxproj (default target) (88) -
(ClCompile target) -
src\interfaces\ecpg\test\pg_regress_ecpg.c(170): warning C4113: 'void (__cdecl *)(void)' differs in parameter lists from 'init_function' [D:\Postgres\b\pg_regress_ecpg.vcxproj]
1 Warning(s)

The attached fixes it.

Regards
David Rowley

windows_compiler_warning.patch
Description: Binary data
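Warning C4113 arises when a function pointer of a prototyped type is assigned a function whose declaration has a mismatched (e.g. empty, unprototyped) parameter list. A minimal, hypothetical sketch of the class of mismatch, using a callback type modeled on pg_regress's init_function; the names here are illustrative, not the actual pg_regress_ecpg.c code:

```c
#include <stddef.h>
#include <assert.h>

/* Callback type with a fully prototyped (void) parameter list. */
typedef void (*init_function) (void);

static int calls = 0;

/*
 * Declaring this as "void hook(void)" matches init_function exactly.
 * Under MSVC, declaring it instead as "void hook()" (an empty,
 * unprototyped parameter list) and assigning it to an init_function
 * pointer triggers warning C4113, because the two types differ in
 * their parameter lists.  The fix is simply to spell out (void).
 */
static void
hook(void)
{
    calls++;
}

/* Invoke the callback through the prototyped pointer type. */
int
run_init(init_function fn)
{
    if (fn != NULL)
        fn();
    return calls;
}
```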
Re: [HACKERS] patch to fix unused variable warning on windows build
On Thu, Nov 7, 2013 at 11:43 AM, David Rowley dgrowle...@gmail.com wrote: Attached is a small patch which fixes the unused variable warning in the Visual Studio build. It seems VS does not support __attribute__((unused)), but it looks like in all other places we assign to the variable.

I raised the same issue some time back; see the link below, where there is some discussion about it: http://www.postgresql.org/message-id/caa4ek1jeoa1hjgauwpcqhw8av7zapkdxjdwubwetjomi2f8...@mail.gmail.com

I think it would be good if one of the committers who has a Windows environment could look into it and commit or provide suggestions; else you can make a combined patch of this and the other warnings you saw on Windows and upload it to the next CF so that it doesn't get lost. I checked that you have already submitted a patch for this warning alone in the CF.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
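For context, a simplified sketch of the PG_USED_FOR_ASSERTS_ONLY idea being discussed (this is an assumption-laden reduction, not PostgreSQL's exact c.h definition): on GCC the macro can expand to __attribute__((unused)), while on compilers without that attribute, such as MSVC, it expands to nothing — which happens to be fine on MSVC because, as the thread notes, it does not warn about variables that are assigned but never read.

```c
#include <assert.h>

/* Simplified stand-in for PostgreSQL's PG_USED_FOR_ASSERTS_ONLY macro. */
#ifdef __GNUC__
#define PG_USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define PG_USED_FOR_ASSERTS_ONLY
#endif

/*
 * 'removed' is assigned on every iteration but only read inside assert(),
 * so with assertions compiled out (NDEBUG) it is set-but-unused.  The
 * macro keeps GCC quiet about that; MSVC never complains to begin with.
 */
int
clear_all(int *stack, int n)
{
    int removed PG_USED_FOR_ASSERTS_ONLY = 0;
    int i;

    for (i = 0; i < n; i++)
    {
        stack[i] = 0;
        removed++;
    }

    assert(removed == n);
    return n;
}
```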
[HACKERS] information schema parameter_default implementation
On 15 September 2013 01:35, Peter Eisentraut pete...@gmx.net wrote: Here is an updated patch which fixes the bug you have pointed out.

On Thu, 2013-01-31 at 18:59 +0500, Ali Dar wrote: I checked out your patch. There seems to be an issue when we have OUT parameters after the DEFAULT values. Fixed.

Some other minor observations:

1) Some variables are not aligned in pg_get_function_arg_default(). Are you referring to indentation issues? I think the indentation is good, but pgindent will fix it anyway. I find only the proc variable not aligned; adding one more tab for all the variables should help.

2) I found the following check a bit confusing; maybe you can make it better:

if (!argmodes || argmodes[i] == PROARGMODE_IN || argmodes[i] == PROARGMODE_INOUT || argmodes[i] == PROARGMODE_VARIADIC)

Factored that out into a separate helper function. The statement needs to be split into lines of 80 columns width.

2) inputargn can be assigned in the declaration. I'd prefer to initialize it close to where it is used. Me too.

3) A function-level comment for pg_get_function_arg_default() is missing. I think the purpose of the function is clear. Unless a reader goes through the definition, it is not obvious whether the second function argument represents the parameter position among the input parameters, or the parameter position among *all* the function parameters (i.e. proargtypes or proallargtypes). I think at least a one-line description of what this function does should be present.

4) You can also add comments inside the function, for example a comment for the line:

nth = inputargn - 1 - (proc->pronargs - proc->pronargdefaults);

Suggestion? Yes, it is difficult to understand the logic behind this calculation until we go read the documentation related to pg_proc.proargdefaults. I think this should be sufficient:

/*
 * proargdefaults elements always correspond to the last N input arguments,
 * where N = pronargdefaults. So calculate the nth_default accordingly.
 */

5) I think the line added in the documentation (information_schema.sgml) is very long. Consider revising. Maybe change from: "The default expression of the parameter, or null if none or if the function is not owned by a currently enabled role." to "The default expression of the parameter, or null if none was specified. It will also be null if the function is not owned by a currently enabled role." I don't know what exactly you mean by "function is not owned by a currently enabled role"? I think this style is used throughout the documentation of the information schema. We need to keep the descriptions reasonably compact, but I'm willing to entertain other opinions. This requires first an answer to my earlier question about why the role-related privilege check is needed.

---

There should be an initial check to see if nth_arg is passed a negative value. It is used as an index into the argmodes array, so a negative array index can cause a crash. Because pg_get_function_arg_default() can now be called by a user as well, we need to validate the input values. I am OK with returning an "invalid argument" error.

---

Instead of deparse_expression_pretty(node, NIL, false, false, 0, 0) you can use deparse_expression(node, NIL, false, false). We are not using pretty printing anyway.

---

Other than this, I have no more issues.

---

After the final review is over, catversion.h requires the appropriate date value.
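The index calculation discussed above can be sketched as a small, hypothetical helper (the function name and the -1 "no default" convention are illustrative, not PostgreSQL's actual code): proargdefaults holds defaults only for the last pronargdefaults input arguments, so the 1-based input-argument number inputargn maps to position inputargn - 1 - (pronargs - pronargdefaults) in that list.

```c
#include <assert.h>

/*
 * Map a 1-based input-argument number to its index in the
 * proargdefaults list, or return -1 if that argument has no default.
 * Defaults always correspond to the last pronargdefaults of the
 * pronargs input arguments.
 */
int
default_list_index(int inputargn, int pronargs, int pronargdefaults)
{
    int nth = inputargn - 1 - (pronargs - pronargdefaults);

    return (nth >= 0 && nth < pronargdefaults) ? nth : -1;
}
```

For a function with 4 input arguments of which the last 2 have defaults, argument 3 maps to defaults element 0 and argument 4 to element 1, while arguments 1 and 2 have none — which is also why a negative or out-of-range nth_arg must be rejected before indexing, as the review points out.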
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
From: Robert Haas robertmh...@gmail.com On Tue, Nov 5, 2013 at 5:15 PM, Peter Eisentraut pete...@gmx.net wrote: On 11/5/13, 1:04 AM, Arulappan, Arul Shaji wrote: Implements NCHAR/NVARCHAR as distinct data types, not as synonyms.

If, per SQL standard, NCHAR(x) is equivalent to CHAR(x) CHARACTER SET cs, then for some cs, NCHAR(x) must be the same as CHAR(x). Therefore, an implementation as separate data types is wrong.

Since the point doesn't seem to be getting through, let me try to be more clear: we're not going to accept any form of this patch. A patch that makes some progress toward actually coping with multiple encodings in the same database would be very much worth considering, but adding compatible syntax with incompatible semantics is not of interest to the PostgreSQL project. We have had this debate on many other topics in the past and will no doubt have it again in the future, but the outcome is always the same.

It doesn't seem that there are any semantics incompatible with the SQL standard, as follows:

- In the first step, cs is the database encoding, which is used for char/varchar/text.
- In the second (or final) step, where multiple encodings per database are supported, cs is the national character encoding which is specified with CREATE DATABASE ... NATIONAL CHARACTER ENCODING cs. If the NATIONAL CHARACTER ENCODING clause is omitted, cs is the database encoding, as in step 1.

Let me repeat myself: I think the biggest and most immediate issue is that PostgreSQL does not support national character types, at least officially. "Officially" means described in the manual. So I don't have a strong objection against the current (hidden) implementation of nchar types in PostgreSQL, which are just synonyms, as long as the official support is documented. Serious users don't want to depend on hidden features. However, doesn't the current synonym approach have any problems? Wouldn't it produce any trouble in the future?

If we treat nchar as char, we lose the fact that the user requested nchar. Can we lose that fact so easily and produce an irreversible result as below?

Maybe so. I guess the distinct type for NCHAR is for future extension and user friendliness. As one user, I expect to get "national character" instead of "char character set xxx" as the output of psql \d and pg_dump when I specified national character in the DDL. In addition, that makes it easy to use the pg_dump output for importing data to other DBMSs, should that be needed.

Regards
MauMau
Re: [HACKERS] patch to fix unused variable warning on windows build
On Sat, Nov 9, 2013 at 7:29 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Thu, Nov 7, 2013 at 11:43 AM, David Rowley dgrowle...@gmail.com wrote: Attached is a small patch which fixes the unused variable warning in the Visual Studio build. It seems VS does not support __attribute__((unused)), but it looks like in all other places we assign to the variable. I raised the same issue some time back; see the link below, where there is some discussion about it: http://www.postgresql.org/message-id/caa4ek1jeoa1hjgauwpcqhw8av7zapkdxjdwubwetjomi2f8...@mail.gmail.com

Thanks for the link. The reason that we don't see more warnings for this is that in all other places where we have used PG_USED_FOR_ASSERTS_ONLY, the variable is assigned to every time, though the variable is only checked when compiled in debug mode. It seems Microsoft decided to disable warnings for "assigned but not used" for pretty much this reason: http://stackoverflow.com/questions/10593547/why-is-no-warning-given-for-this-unused-variable Microsoft explain that it's because they had lots of complaints from people who were assigning variables purely so they could see what a method call returned during debugging, and found the warning irritating.

So I guess fixing up PG_USED_FOR_ASSERTS_ONLY to work with Visual Studio is not required, and my patch seems like the right fix for this unique case.

I think it would be good if one of the committers who has a Windows environment could look into it and commit or provide suggestions; else you can make a combined patch of this and the other warnings you saw on Windows and upload it to the next CF so that it doesn't get lost. I checked that you have already submitted a patch for this warning alone in the CF.

I was not quite sure what I should do for these tiny patches. Quite often, if a committer happens to read the post and agrees with the patch, then it might get committed pretty quickly even outside a commitfest; but if not, and I didn't add it to the commitfest, then it would likely get lost.

Regards
David Rowley

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
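The pattern the thread converges on can be sketched as follows. This is an illustrative example, not David's actual patch: the variable is assigned unconditionally and referenced only inside assert(), so GCC would warn "set but not used" without the unused attribute, while MSVC intentionally suppresses that warning. The (void) cast shown is a fully portable alternative that silences both compilers.

```c
#include <assert.h>

/*
 * 'old' is assigned on every call but read only in the assertion.
 * MSVC does not warn about assigned-but-unread locals; the (void)
 * cast makes the "intentionally unused" status explicit for other
 * compilers as well.
 */
int
advance(int *pos, int step)
{
    int old = *pos;          /* assigned every time, read only in assert */

    *pos += step;
    assert(*pos == old + step);
    (void) old;              /* portable "intentionally unused" marker */
    return *pos;
}
```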