Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
On Thu, Feb 2, 2017 at 1:26 AM, Fujii Masao wrote:
> I'm afraid that many WAL segments would start with a continuation record
> when there are the workload of short transactions (e.g., by pgbench), and
> which would make restart_lsn go behind very much. No?

I don't quite understand this argument. Even if there are many small transactions, that would cause restart_lsn to just be late by one segment, all the time.

> The discussion on this thread just makes me think that restart_lsn should
> indicate the replay location instead of flush location. This seems safer.

That would penalize WAL retention on the primary with standbys using recovery_min_apply_delay and a slot, for example...

We can attempt to address this problem in two ways. The proposed patch (ugly, by the way, and there are two typos!) does it in the WAL sender, by not making restart_lsn jump to the next segment if a continuation record is found. Or we could have the standby request the next segment instead if the record it wants to replay from is at a segment boundary and it locally has the beginning of the record, which it does because it already confirmed to the primary that it flushed up to the next segment. Not sure which fix is better though.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
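For illustration, the walsender-side idea could take roughly the following shape (a sketch only, not the attached patch; FirstPageOfSegIsContrecord() is a made-up placeholder for however the XLP_FIRST_IS_CONTRECORD flag of the segment's first page header would be obtained):

/*
 * Sketch: compute the oldest WAL position that must be retained for the
 * standby.  If the first page of the segment containing "flushed" starts
 * with the continuation of a record from the previous segment, keep the
 * previous segment as well, since the standby needs it to reassemble
 * that record after a restart.
 */
static XLogRecPtr
OldestWALToKeep(XLogRecPtr flushed)
{
	XLogSegNo	segno;
	XLogRecPtr	segstart;

	XLByteToSeg(flushed, segno);
	segstart = segno * XLOG_SEG_SIZE;	/* start of the flushed segment */

	if (FirstPageOfSegIsContrecord(segstart))	/* hypothetical helper */
		segstart -= XLOG_SEG_SIZE;	/* retain the previous segment too */

	return segstart;
}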
Re: [HACKERS] Speedup twophase transactions
>> Yeah. Was thinking about this yesterday. How about adding entries in >> TwoPhaseState itself (which become valid later)? Only if it does not >> cause a lot of code churn. > > That's possible as well, yes. PFA a patch which does the above. It re-uses the TwoPhaseState gxact entries to track 2PC PREPARE/COMMIT in shared memory. The advantage being that CheckPointTwoPhase() becomes the only place where the fsync of 2PC files happens. A minor annoyance in the patch is the duplication of the code to add the 2nd while loop to go through these shared memory entries in PrescanPreparedTransactions, RecoverPreparedTransactions and StandbyRecoverPreparedTransactions. Other than this, I ran TAP tests and they succeed as needed. Regards, Nikhils -- Nikhil Sontakke http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services twophase_recovery_shmem_020217.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
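To make the shape of that duplicated loop concrete, the second pass over the shared-memory entries is roughly the following (illustrative only; the "ondisk" flag and the helper are placeholders rather than the patch's actual names):

/*
 * After the usual scan of pg_twophase files on disk, also walk the gxact
 * entries whose state still lives only in shared memory.
 */
int		i;

for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
	GlobalTransaction gxact = TwoPhaseState->prepXacts[i];

	if (gxact->ondisk)			/* hypothetical flag: already handled above */
		continue;

	ProcessInMemoryPreparedXact(gxact);	/* per-caller work, e.g. collecting
										 * the XID during prescan */
}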
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
From: Michael Paquier [mailto:michael.paqu...@gmail.com]
> This has not been added yet to the next CF. As Ashutosh mentioned
> things tend to be easily forgotten. So I have added it here:
> https://commitfest.postgresql.org/13/982/

Thank you for adding this problem to the CF.

> I have noticed this typo on the way => s/requisted/requested/:
> --- a/src/interfaces/libpq/fe-connect.c
> +++ b/src/interfaces/libpq/fe-connect.c
> @@ -2908,7 +2908,7 @@ keep_going: /* We will come back to here until there is
>  }
>  /*
> -* If a read-write connection is requisted check for same.
> +* If a read-write connection is requested check for same.
>  */
>  if (conn->target_session_attrs != NULL &&
>  strcmp(conn->target_session_attrs, "read-write") == 0)

I added this fix to the updated patch. It is based on the patch Ashutosh updated.

Regards,
Daisuke, Higuchi

PQsendQuery_for_target_session_attrs_v4.patch
Description: PQsendQuery_for_target_session_attrs_v4.patch
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Provide list of subscriptions and publications in psql's completion
Hi all,

While testing the logical replication facility a bit, I have bumped on the fact that psql's completion does not show the list of things already created. Attached is a patch.

Thanks,
--
Michael

subs-psql-completion.patch
Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
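For context, psql's tab completion drives such lists with catalog query macros in tab-complete.c; a minimal sketch of what an entry for subscriptions could look like (illustrative only, not necessarily what the attached patch does) is:

/* pg_subscription is the catalog behind CREATE SUBSCRIPTION */
#define Query_for_list_of_subscriptions \
" SELECT pg_catalog.quote_ident(subname) "\
"   FROM pg_catalog.pg_subscription "\
"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"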
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
On Thu, Feb 2, 2017 at 2:13 PM, Tom Lane wrote:
> Kyotaro HORIGUCHI writes:
>> Then, the reason for the TRY-CATCH cluase is that I found that
>> some functions called from there can throw exceptions.
>
> Yes, but all LWLocks should be released by normal error recovery.
> It should not be necessary for this code to clean that up by hand.
> If it were necessary, there would be TRY-CATCH around every single
> LWLockAcquire in the backend, and we'd have an unreadable and
> unmaintainable system. Please don't add a TRY-CATCH unless it's
> *necessary* -- and you haven't explained why this one is.

Putting hands on the code and looking at the problem, I can see that dropping a subscription on a node makes it unresponsive in case of a stop. And that's just because calls to LWLockRelease are missing, as in the patch attached. A try/catch block should not be necessary.
--
Michael

drop-subs-locks.patch
Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
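A minimal sketch of the kind of fix being described, with the lock released on every path out of DropSubscription() (illustrative, not the attached patch):

LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE);

/* kill the apply worker, drop the replication origin, and so on */

if (!stmt->drop_slot)
{
	LWLockRelease(LogicalRepLauncherLock);	/* early-return path */
	heap_close(rel, NoLock);
	return;
}

/* ... drop the slot on the publisher over the walreceiver connection ... */

LWLockRelease(LogicalRepLauncherLock);		/* normal exit path */
heap_close(rel, NoLock);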
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com]
> Sorry, attached wrong patch. Here's the right one.
> The code expects that there will be two PQgetResult() calls required.
> One to fetch the result of SHOW command and the other to extract NULL.
> If we require more calls or unexpected results, we should throw and
> error. The patch just checks the first result and consumes the
> remaining without verifying them. Also, it looks like we can not clear
> result of PQgetResult() before using the values or copying them
> somewhere else [1]. Here's updated patch which tries to do that.
> Please let me know if this looks good to you.

Oh, I had a basic misunderstanding. Thank you for correcting me. There is no problem in the patch you attached. I agree with you.

Regards,
Daisuke, Higuchi
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Thu, Feb 2, 2017 at 3:49 AM, Tom Lane wrote: > Alvaro Herrera writes: > > Tom Lane wrote: > >> I think what we ought to do about this is invent additional API > >> functions, say > >> > >> Oid CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, > >> CatalogIndexState indstate); > >> void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, > >> HeapTuple tup, CatalogIndexState indstate); > >> > >> and use these in place of simple_heap_foo plus CatalogIndexInsert > >> in the places where this optimization had been applied. > > > This looks reasonable enough to me. > > Done. > > Thanks for taking care of this. Shame that I missed this because I'd specifically noted the special casing for large objects etc. But looks like while changing 180+ call sites, I forgot my notes. Thanks again, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
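As a usage note, the point of the *WithInfo variants is to let call sites that touch many catalog tuples open the index information once instead of per tuple; the intended pattern looks roughly like this (the "tuples" list is hypothetical):

CatalogIndexState indstate;
ListCell   *lc;

indstate = CatalogOpenIndexes(rel);

foreach(lc, tuples)				/* hypothetical List of HeapTuples */
	CatalogTupleInsertWithInfo(rel, (HeapTuple) lfirst(lc), indstate);

CatalogCloseIndexes(indstate);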
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
Kyotaro HORIGUCHI writes: > Then, the reason for the TRY-CATCH cluase is that I found that > some functions called from there can throw exceptions. Yes, but all LWLocks should be released by normal error recovery. It should not be necessary for this code to clean that up by hand. If it were necessary, there would be TRY-CATCH around every single LWLockAcquire in the backend, and we'd have an unreadable and unmaintainable system. Please don't add a TRY-CATCH unless it's *necessary* -- and you haven't explained why this one is. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
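To spell that out with a sketch (SomeLock and the called function are placeholders): straight-line code can rely on transaction abort to release the lock on error, so only the success path needs an explicit release.

LWLockAcquire(SomeLock, LW_EXCLUSIVE);

do_work_that_may_ereport();		/* if this throws, error recovery runs
								 * LWLockReleaseAll() for us */

LWLockRelease(SomeLock);		/* success path */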
Re: [HACKERS] Should `pg_upgrade --check` check relation filenodes are present?
On 1/31/17 4:57 PM, Craig de Stigter wrote: > However, this problem was not caught by the `--check` command. I'm > looking at the source code and it appears that pg_upgrade does not > attempt to verify relation filenodes actually exist before proceeding, > whether using --check or not. The purpose of --check is to see if there is anything in your database that pg_upgrade cannot upgrade. Its purpose is not to detect general damage in a database. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
On Thu, Feb 2, 2017 at 1:34 PM, Ashutosh Bapat wrote: > On Thu, Feb 2, 2017 at 10:04 AM, Ashutosh Bapat > wrote: >> [1] https://www.postgresql.org/docs/devel/static/libpq-exec.html: >> PQgetvalue(). > The code expects that there will be two PQgetResult() calls required. > One to fetch the result of SHOW command and the other to extract NULL. > If we require more calls or unexpected results, we should throw and > error. The patch just checks the first result and consumes the > remaining without verifying them. Also, it looks like we can not clear > result of PQgetResult() before using the values or copying them > somewhere else [1]. Here's updated patch which tries to do that. > Please let me know if this looks good to you. This has not been added yet to the next CF. As Ashutosh mentioned things tend to be easily forgotten. So I have added it here: https://commitfest.postgresql.org/13/982/ I have noticed this typo on the way => s/requisted/requested/: --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -2908,7 +2908,7 @@ keep_going: /* We will come back to here until there is } /* -* If a read-write connection is requisted check for same. +* If a read-write connection is requested check for same. */ if (conn->target_session_attrs != NULL && strcmp(conn->target_session_attrs, "read-write") == 0) Looking at the patch, I agree with Ashutosh. Any fix should be located in the code path of CONNECTION_CHECK_WRITABLE which is the one consuming the results. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On 1/22/17 10:29 PM, Michael Paquier wrote: > As now wal_level = replica has become the default for Postgres 10, > could we consider as well making replication connections enabled by > default in pg_hba.conf? This requires just uncommenting a couple of > lines in pg_hba.conf.sample. Yes, I think this makes sense, as one of the reason for these changes is to simplify the use of pg_basebackup. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel bitmap heap scan
On Wed, Feb 1, 2017 at 11:09 PM, Robert Haas wrote:

Hi Robert,

Thanks for the review.

> When Andres wrote in
> https://www.postgresql.org/message-id/20161017201558.cr34stc6zvxy3...@alap3.anarazel.de
> that you should try to share only the iteration arrays, I believe that
> he meant the results of tbm_begin_iterate() -- that is,
> iterator->spageptr, chunkptr, schunkbit, spages, and schunks. I'm
> perhaps putting words into his mouth, but what he said was that we
> could avoid sharing "the whole underlying hash". But the patch set
> you've implemented here doesn't behave that way. Instead, you've got
> the array containing the hash table elements shared (which is good)

Here is my analysis of why we selected this instead of just sharing the iterator: the problem is that each entry of the iteration array is just a pointer to a hash entry. So if we shared only the iterator, each iterator element would point into the local memory of another worker. I thought of one more option: during tbm_begin_iterate, instead of keeping a pointer to the hash entry, we could make a copy of it in the DSA area, but that would hold two copies of each hash table element for a short duration, which is not good. I think what we are doing currently is better than that.

> The work of tbm_begin_iterate() should be done before we
> begin the shared scan and the results of that work should be shared.
> What happens right now (it appears) is that every backend does that
> work based on the same hash table and we just assume they all get the
> same answer.

Actually, tbm_begin_iterate processes each element of the hash table and converts it into the chunk and page arrays; currently this is done by each worker and it's pretty cheap. Suppose I try to do this only in the first worker, then the implementation will look something like this:
1. First, we need to create a tbm->spages array and a tbm->schunks array, and both should be arrays of dsa_pointers.
2. Then each worker will process these arrays and convert them into arrays of its own local pointers.
3. With the current solution, where all hash elements are stored in one large DSA chunk, how are we going to divide them into multiple dsa_pointers?

I will work on the remaining comments.

> And we really do assume that, because
> pbms_parallel_iterate() assumes it can shuttle private state back and
> forth between iterator in different backends and nothing will break;
> but those backends aren't actually using the same iteration arrays.
> They are using different iteration arrays that should have the same
> contents because they were all derived from the same semi-shared hash
> table. That's pretty fragile, and a waste of CPU cycles if the hash
> table is large (every backend does its own sort).
>
> On a related note, I think it's unacceptable to make the internal
> details of TBMIterator public. You've moved it from tidbitmap.c to
> tidbitmap.h so that nodeBitmapHeapScan.c can tinker with the guts of
> the TBMIterator, but that's not OK. Those details need to remain
> private to tidbitmap.c. pbms_parallel_iterate() is a nasty kludge; we
> need some better solution. The knowledge of how a shared iterator
> should iterate needs to be private to tidbitmap.c, not off in the
> executor someplace. And I think the entries need to actually be
> updated in shared memory directly, not copied back and forth between a
> backend-private iterator and a shared iterator.
>
> Also, pbms_parallel_iterate() can't hold a spinlock around a call to
> tbm_iterate(). Note the coding rules for spinlocks mentioned in
> spin.h and src/backend/storage/lmgr/README. I think the right thing
> to do here is to use an LWLock in a new tranche (add an entry to
> BuiltinTrancheIds).
>
> In 0002, you have this:
>
> +tb->alloc = palloc(sizeof(SH_ALLOCATOR));
>
> This should be using MemoryContextAlloc(ctx, ...) rather than palloc.
> Otherwise the allocator object is in a different context from
> everything else in the hash table. But TBH, I don't really see why we
> want this to be a separate object. Why not just put
> HashAlloc/HashFree/args into SH_TYPE directly? That avoids some
> pointer chasing and doesn't really seem to cost anything (except that
> SH_CREATE will grow a slightly longer argument sequence).
>
> I think maybe we should rename the functions to element_allocate,
> element_free, and element_allocator_ctx, or something like that. The
> current names aren't capitalized consistently with other things in
> this module, and putting the word "element" in there would make it
> more clear what the purpose of this thing is.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
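A small sketch of the pointer-versus-offset issue described above (the struct and helper are illustrative, not the actual patch): raw pointers stored in a shared iteration array would only be meaningful in the backend that built them, so a shared layout has to keep dsa_pointer offsets and translate them on use.

/* requires "utils/dsa.h" */
typedef struct SharedIterArraySketch
{
	int			nentries;
	dsa_pointer	entries[FLEXIBLE_ARRAY_MEMBER];	/* DSA offsets, not pointers */
} SharedIterArraySketch;

/* each worker translates an offset through its own mapping of the area */
static void *
iter_entry(dsa_area *area, SharedIterArraySketch *arr, int i)
{
	return dsa_get_address(area, arr->entries[i]);
}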
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
On 2017-02-01 23:27:36 -0500, Tom Lane wrote: > I wrote: > > Andres Freund writes: > >> Tom, do you have an opinion? > > > Yes, it's broken. split_pathtarget_at_srfs seems to be doing the right > > thing, but then something later is recombining the last two steps. > > Ah, no, I take that back: split_pathtarget_at_srfs is doing the wrong > thing. Yea, that's what I thought. > Nothing's broken quite yet, but when we get to setrefs.c, it replaces > *both* occurrences of g(x) with Vars referencing the g(x) output from > the next level down. So now we have the tlist of the upper ProjectSet > node as "f(Var), Var" and ProjectSet complains that it has no SRF to > evaluate. But I'd missed part of that subtlety. > I think the appropriate fix is that, once split_pathtarget_at_srfs() has > computed a tentative list of SRFs it needs to evaluate, it has to make a > second pass to see if any of them match expressions that were assigned to > the next level down. > This is pretty annoying, but we'd only have to do it > if target_contains_srfs and context.nextlevel_contains_srfs are both true, > which will be a negligibly small fraction of queries in practice. Hm. Can't really come up with something better, but I'm kinda tired too... > Or we could take out that Assert in nodeProjectSet.c. But that seems > like a hack not a fix. Yea, I don't like that much. > I'm pretty tired but I'll work on a fix tomorrow. Cool. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commit fest 2017-01 will begin soon!
On Thu, Feb 2, 2017 at 1:23 PM, Amit Kapila wrote: > Many thanks to you for running the show. I think it might be okay if > one consolidated mail is sent for all the patches that are marked > "Returned with Feedback" or "Rejected" or "Moved to next CF". OTOH, > there is some value in sending a separate mail for each patch so that > people can argue on the decision by CFM for one particular patch, but > not sure if it worth. People tend to look at emails they are directly on in CC, that's why I prefer sending individual emails. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
Sorry, attached wrong patch. Here's the right one. On Thu, Feb 2, 2017 at 10:04 AM, Ashutosh Bapat wrote: > On Thu, Feb 2, 2017 at 8:11 AM, Higuchi, Daisuke > wrote: >> From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com] >>> Per the documentation [1], "PQgetResult must be called repeatedly >>> until it returns a null pointer, indicating that the command is >>> done.". The code in PQgetResult() under CONNECTION_CHECK_WRITABLE >>> case, violates that. The patch fixes it. The patch however jumps to >>> keep_going without changing conn->status, which means that it will end >>> up again in the same case. While that's fine, may be we should use a >>> local loop on PQgetResult() to keep the code readable. >> Thank you for reviewing the patch. >> I created it with reference to pqSetenvPoll() in >> interfaces/libpq/fe-protocol2.c, >> but I certainly thought that readability is not good. >> I updated the patch, so I will add this to the next commitfest. > > Thanks for the patch. > > The code expects that there will be two PQgetResult() calls required. > One to fetch the result of SHOW command and the other to extract NULL. > If we require more calls or unexpected results, we should throw and > error. The patch just checks the first result and consumes the > remaining without verifying them. Also, it looks like we can not clear > result of PQgetResult() before using the values or copying them > somewhere else [1]. Here's updated patch which tries to do that. > Please let me know if this looks good to you. > > > [1] https://www.postgresql.org/docs/devel/static/libpq-exec.html: > PQgetvalue(). > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company PQsendQuery_for_target_session_attrs_v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
On Thu, Feb 2, 2017 at 8:11 AM, Higuchi, Daisuke wrote: > From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com] >> Per the documentation [1], "PQgetResult must be called repeatedly >> until it returns a null pointer, indicating that the command is >> done.". The code in PQgetResult() under CONNECTION_CHECK_WRITABLE >> case, violates that. The patch fixes it. The patch however jumps to >> keep_going without changing conn->status, which means that it will end >> up again in the same case. While that's fine, may be we should use a >> local loop on PQgetResult() to keep the code readable. > Thank you for reviewing the patch. > I created it with reference to pqSetenvPoll() in > interfaces/libpq/fe-protocol2.c, > but I certainly thought that readability is not good. > I updated the patch, so I will add this to the next commitfest. Thanks for the patch. The code expects that there will be two PQgetResult() calls required. One to fetch the result of SHOW command and the other to extract NULL. If we require more calls or unexpected results, we should throw and error. The patch just checks the first result and consumes the remaining without verifying them. Also, it looks like we can not clear result of PQgetResult() before using the values or copying them somewhere else [1]. Here's updated patch which tries to do that. Please let me know if this looks good to you. [1] https://www.postgresql.org/docs/devel/static/libpq-exec.html: PQgetvalue(). -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company PQsendQuery_for_target_session_attrs_v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
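For reference, the draining pattern under discussion looks roughly like this (a sketch only; the destination buffer is illustrative and error handling is omitted):

PGresult   *res;
char		readonly[8];		/* hypothetical buffer for the SHOW value */

res = PQgetResult(conn);
if (res != NULL && PQresultStatus(res) == PGRES_TUPLES_OK)
	strlcpy(readonly, PQgetvalue(res, 0, 0), sizeof(readonly));

/* per the documentation, keep calling until a NULL comes back */
while (res != NULL)
{
	PQclear(res);
	res = PQgetResult(conn);
}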
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
I wrote: > Andres Freund writes: >> Tom, do you have an opinion? > Yes, it's broken. split_pathtarget_at_srfs seems to be doing the right > thing, but then something later is recombining the last two steps. Ah, no, I take that back: split_pathtarget_at_srfs is doing the wrong thing. It's generating the desired list of PathTargets, but it's mistakenly concluding that the last one contains_srfs, which leads to making a ProjectSet plan node for it instead of Result. The problem is specific to targetlists like f(g(x)), g(x) where g() is a SRF and f() can be any sort of non-set-returning expression. split_pathtarget_at_srfs() examines the f() node, sees that it's not a SRF, and calls split_pathtarget_walker which finds the g(x) subexpression and correctly assigns that to the next level down. But then split_pathtarget_at_srfs finds the top-level g(x) occurrence and concludes that this level contains_srfs. Nothing's broken quite yet, but when we get to setrefs.c, it replaces *both* occurrences of g(x) with Vars referencing the g(x) output from the next level down. So now we have the tlist of the upper ProjectSet node as "f(Var), Var" and ProjectSet complains that it has no SRF to evaluate. I think the appropriate fix is that, once split_pathtarget_at_srfs() has computed a tentative list of SRFs it needs to evaluate, it has to make a second pass to see if any of them match expressions that were assigned to the next level down. This is pretty annoying, but we'd only have to do it if target_contains_srfs and context.nextlevel_contains_srfs are both true, which will be a negligibly small fraction of queries in practice. Or we could take out that Assert in nodeProjectSet.c. But that seems like a hack not a fix. I'm pretty tired but I'll work on a fix tomorrow. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commit fest 2017-01 will begin soon!
On Thu, Feb 2, 2017 at 8:50 AM, Michael Paquier wrote: > On Tue, Jan 31, 2017 at 4:01 PM, Michael Paquier > wrote: >> The CF finishing soon, I have done a first pass on the patches >> remaining on it, moving patches to the next CF or updating them. There >> are still 48 patches still on the tracker when writing this email: >> Needs review: 32. >> Ready for Committer: 16. >> This requires a certain amount of energy to classify, so if you would >> like spare a bit of HP/MP to the CFM, feel free to cleanup your >> patches or the ones you are reviewing. >> >> @all: If I have updated your patch in a state that you think is not >> appropriate, feel free to complain on the thread of the patch or here. >> That's fine. > > After 3 days and 90 patches spamming everybody on this -hackers, the > CF is now *closed*. Here is the final score of this session: > Committed: 58. > Moved to next CF: 70. > Rejected: 4. > Returned with Feedback: 23. > Total: 155. > > It is nice to see that many people have registered as reviewers and > that much has been done. I have finished with the impression that > there has been more commitment in this area compared to the past. > Thanks to all the people involved! > Many thanks to you for running the show. I think it might be okay if one consolidated mail is sent for all the patches that are marked "Returned with Feedback" or "Rejected" or "Moved to next CF". OTOH, there is some value in sending a separate mail for each patch so that people can argue on the decision by CFM for one particular patch, but not sure if it worth. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
On Thu, Feb 2, 2017 at 4:57 PM, Rafia Sabih wrote: > Apart from the previously reported regression, there appear one more > issue in this set of patches. At times, running a query using parallel > hash it hangs up and all the workers including the master shows the > following backtrace, Ugh, thanks. Investigating. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
On Thu, Feb 2, 2017 at 1:19 AM, Thomas Munro wrote:
> On Thu, Feb 2, 2017 at 3:34 AM, Rafia Sabih wrote:
>> 9 | 62928.88 | 59077.909
>
> Thanks Rafia. At first glance this plan is using the Parallel Shared
> Hash in one place where it should pay off, that is loading the orders
> table, but the numbers are terrible. I noticed that it uses batch
> files and then has to increase the number of batch files, generating a
> bunch of extra work, even though it apparently overestimated the
> number of rows, though that's only ~9 seconds of ~60. I am
> investigating.

Hi Thomas,

Apart from the previously reported regression, there appears to be one more issue in this set of patches. At times, a query using parallel hash hangs, and all the workers, including the master, show the following backtrace:

#0 0x3fff880c7de8 in __epoll_wait_nocancel () from /lib64/power8/libc.so.6
#1 0x104e2718 in WaitEventSetWaitBlock (set=0x100157bde90, cur_timeout=-1, occurred_events=0x3fffdbe69698, nevents=1) at latch.c:998
#2 0x104e255c in WaitEventSetWait (set=0x100157bde90, timeout=-1, occurred_events=0x3fffdbe69698, nevents=1, wait_event_info=134217745) at latch.c:950
#3 0x10512970 in ConditionVariableSleep (cv=0x3ffd736e05a4, wait_event_info=134217745) at condition_variable.c:132
#4 0x104dbb1c in BarrierWaitSet (barrier=0x3ffd736e0594, new_phase=1, wait_event_info=134217745) at barrier.c:97
#5 0x104dbb9c in BarrierWait (barrier=0x3ffd736e0594, wait_event_info=134217745) at barrier.c:127
#6 0x103296a8 in ExecHashShrink (hashtable=0x3ffd73747dc0) at nodeHash.c:1075
#7 0x1032c46c in dense_alloc_shared (hashtable=0x3ffd73747dc0, size=40, shared=0x3fffdbe69eb8, respect_work_mem=1 '\001') at nodeHash.c:2618
#8 0x1032a2f0 in ExecHashTableInsert (hashtable=0x3ffd73747dc0, slot=0x100158f9e90, hashvalue=2389907270) at nodeHash.c:1476
#9 0x10327fd0 in MultiExecHash (node=0x100158f9800) at nodeHash.c:296
#10 0x10306730 in MultiExecProcNode (node=0x100158f9800) at execProcnode.c:577

The issue is not deterministic or straightforwardly reproducible; sometimes after a make clean, etc., queries run fine, and sometimes they hang again. I wanted to bring this to your notice hoping you might be faster than me in picking up the exact reason behind this anomaly.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallelize queries containing subplans
On Wed, Feb 1, 2017 at 10:04 PM, Robert Haas wrote: > On Tue, Jan 31, 2017 at 6:01 AM, Amit Kapila wrote: >> Moved this patch to next CF. > > So here is what seems to be the key hunk from this patch: > > /* > - * Since we don't have the ability to push subplans down to workers at > - * present, we treat subplan references as parallel-restricted. We need > - * not worry about examining their contents; if they are unsafe, we would > - * have found that out while examining the whole tree before reduction of > - * sublinks to subplans. (Really we should not see SubLink during a > - * max_interesting == restricted scan, but if we do, return true.) > + * We can push the subplans only if they don't contain any parallel-aware > + * node as we don't support multi-level parallelism (parallel workers > + * invoking another set of parallel workers). > */ > -else if (IsA(node, SubLink) || > - IsA(node, SubPlan) || > - IsA(node, AlternativeSubPlan)) > +else if (IsA(node, SubPlan)) > +return !((SubPlan *) node)->parallel_safe; > +else if (IsA(node, AlternativeSubPlan)) > { > -if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context)) > -return true; > +AlternativeSubPlan *asplan = (AlternativeSubPlan *) node; > +ListCell *lc; > + > +foreach(lc, asplan->subplans) > +{ > +SubPlan*splan = (SubPlan *) lfirst(lc); > + > +Assert(IsA(splan, SubPlan)); > + > +if (max_parallel_hazard_walker((Node *) splan, context)) > +return true; > +} > + > +return false; > } > > I don't see the reason for having this new code that makes > AlternativeSubPlan walk over the subplans; expression_tree_walker > already does that. > It is done this way mainly to avoid recursion/nested calls, but I think you prefer to handle it via expression_tree_walker so that code looks clean. Another probable reason could be that there is always a chance that we miss handling of some expression of a particular node (in this case AlternativeSubPlan), but if that is the case then there are other places in the code which do operate on individual subplan/'s in AlternativeSubPlan list. > On the flip side I don't see the reason for > removing the max_parallel_hazard_test() call for AlternativeSubPlan or > for SubLink. If we don't remove the current test of max_parallel_hazard_test() for AlternativeSubPlan, then AlternativeSubPlan node will be considered parallel restricted, so why do we want that after this patch. For Sublinks, I think they would have converted to Subplans by the time we reach this function for the parallel restricted check. Can you elaborate what you have in mind for the handling of AlternativeSubPlan and SubLink? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
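For reference, the reason the explicit loop may be redundant is that expression_tree_walker() already recurses into an AlternativeSubPlan's subplan list, so a walker with nothing special to do for that node type can simply fall through to the generic recursion (sketch of the tail of such a walker only, not a proposed resolution of the parallel-safety question):

/* ... node-specific cases above ... */

/*
 * No special case is needed just to visit AlternativeSubPlan's children:
 * the generic recursion walks the subplans list and calls this walker on
 * each contained SubPlan.
 */
return expression_tree_walker(node, max_parallel_hazard_walker,
							  (void *) context);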
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
On 2 February 2017 at 00:13, Heikki Linnakangas wrote:
> Ok, I'll drop the second patch for now. I committed the first patch after
> fixing the things you and Michael pointed out. Thanks for the review!

dbd69118 caused a small compiler warning for me. The attached patch fixes it.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

encrypt_password_warning_fix.patch
Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical Replication and Character encoding
On 01-02-2017 00:05, Kyotaro HORIGUCHI wrote:
> - Subscriber connects with client_encoding specification and the
> output plugin pgoutput decide whether it accepts the encoding
> or not. If the subscriber doesn't, pgoutput send data without
> conversion.
>
I don't think storing data without conversion is an acceptable approach. We should provide options to the user, such as ignoring the tuple or using NULL for the column(s) with a conversion problem. I wouldn't consider storing data without conversion because it silently shows incorrect data, and we historically haven't been flexible with conversion routines.
--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
Andres Freund writes: > On 2017-02-02 00:09:03 +0100, Erik Rijkers wrote: >> Something is broken in HEAD: > Hm. Indeed. > The issue is that we're generating ProjectSet nodes instead of Result > for the top-level nodes - but we currently assume that ProjectSet nodes > actually contain sets. I'm tentatively in favor of generating proper > Result nodes in this case, rather than allowing this in ProjectSet - on > the other hand, it works after removing that Assert(). > Tom, do you have an opinion? Yes, it's broken. split_pathtarget_at_srfs seems to be doing the right thing, but then something later is recombining the last two steps. I should be able to find it soon. Does it really work without the Assert? I'd think you'd get srf- where-not-supported errors if the SRF isn't at top level. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commit fest 2017-01 will begin soon!
On Tue, Jan 31, 2017 at 4:01 PM, Michael Paquier wrote: > The CF finishing soon, I have done a first pass on the patches > remaining on it, moving patches to the next CF or updating them. There > are still 48 patches still on the tracker when writing this email: > Needs review: 32. > Ready for Committer: 16. > This requires a certain amount of energy to classify, so if you would > like spare a bit of HP/MP to the CFM, feel free to cleanup your > patches or the ones you are reviewing. > > @all: If I have updated your patch in a state that you think is not > appropriate, feel free to complain on the thread of the patch or here. > That's fine. After 3 days and 90 patches spamming everybody on this -hackers, the CF is now *closed*. Here is the final score of this session: Committed: 58. Moved to next CF: 70. Rejected: 4. Returned with Feedback: 23. Total: 155. It is nice to see that many people have registered as reviewers and that much has been done. I have finished with the impression that there has been more commitment in this area compared to the past. Thanks to all the people involved! -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical Replication and Character encoding
Hello. This version makes subscriber automatically set its database encoding to clinet_encoding (if not explicitly set). And changed the behavior when pg_server_to_client returns NULL to ERROR from sending original string. At Wed, 1 Feb 2017 08:39:41 +, "Shinoda, Noriyoshi" wrote in > Thank you for creating patches. > I strongly hope that your patch will be merged into the new > version. Since all databases are not yet based on UTF - 8, I > think conversion of character codes is still necessary. Thanks. > -Original Message- > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp] > Sent: Wednesday, February 01, 2017 3:31 PM > To: Shinoda, Noriyoshi > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Logical Replication and Character encoding > > At Wed, 01 Feb 2017 12:13:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > wrote in > <20170201.121304.267734380.horiguchi.kyot...@lab.ntt.co.jp> > > > > I tried a committed Logical Replication environment. I found > > > > that replication between databases of different encodings did not > > > > convert encodings in character type columns. Is this behavior > > > > correct? > > > > > > The output plugin for subscription is pgoutput and it currently > > > doesn't consider encoding but would easiliy be added if desired > > > encoding is informed. > > > > > > The easiest (but somewhat seems fragile) way I can guess is, > > > > > > - Subscriber connects with client_encoding specification and the > > > output plugin pgoutput decide whether it accepts the encoding > > > or not. If the subscriber doesn't, pgoutput send data without > > > conversion. > > > > > > The attached small patch does this and works with the following > > > CREATE SUBSCRIPTION. > > > > Oops. It forgets to care conversion failure. It is amended in the > > attached patch. > > > > > CREATE SUBSCRIPTION sub1 CONNECTION 'host=/tmp port=5432 > > > dbname=postgres client_encoding=EUC_JP' PUBLICATION pub1; > > > > > > > > > Also we may have explicit negotiation on, for example, > > > CREATE_REPLICATION_SLOT. > > > > > > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput ENCODING EUC_JP' > > > > > > Or output plugin may take options. > > > > > > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput OPTIONS(encoding EUC_JP)' > > > > > > > > > Any opinions? > > This patch chokes replication when the publisher finds an inconvertible > character in a tuple to be sent. For the case, dropping-then-recreating > subscription is necessary to go forward. I think this behavior is right and inevitable in order to protect subscribers from broken strings. regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From e3af51822d1318743e554e24163390b74b254751 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 2 Feb 2017 09:05:20 +0900 Subject: [PATCH] Enable logical-replication between databases with different encodings Different from physical replication, logical replication may run between databases with different encodings. This patch makes subscriber set client_encoding to database encoding and publisher follow it. 
--- .../libpqwalreceiver/libpqwalreceiver.c| 35 +- src/backend/replication/logical/proto.c| 17 +++ 2 files changed, 51 insertions(+), 1 deletion(-) diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c index 44a89c7..ef38af7 100644 --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c @@ -24,6 +24,7 @@ #include "access/xlog.h" #include "miscadmin.h" #include "pgstat.h" +#include "mb/pg_wchar.h" #include "replication/logicalproto.h" #include "replication/walreceiver.h" #include "storage/proc.h" @@ -112,11 +113,43 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, char **err) { WalReceiverConn *conn; + const char *myconninfo = conninfo; const char *keys[5]; const char *vals[5]; int i = 0; /* + * For logical replication, set database encoding as client_encoding if + * not specified in conninfo + */ + if (logical) + { + PQconninfoOption *opts = NULL; + + opts = PQconninfoParse(conninfo, NULL); + + if (opts != NULL) + { + while (opts->keyword != NULL && + strcmp(opts->keyword, "client_encoding") != 0) +opts++; + + /* add client_encoding to conninfo if not set */ + if (opts->keyword == NULL || opts->val == NULL) + { +StringInfoData s; + +/* Assuming that the memory context here is properly set */ +initStringInfo(&s); +appendStringInfoString(&s, conninfo); +appendStringInfo(&s, " client_encoding=\"%s\"", + GetDatabaseEncodingName()); +myconninfo = s.data; + } + } + } + + /* * We use the expand_dbname parameter to process the connection string (or * URI), and pass some extra options. The deliber
Re: [HACKERS] Performance improvement for joins where outer side is unique
On Tue, Jan 31, 2017 at 9:13 AM, David Rowley wrote: > On 31 January 2017 at 13:10, David Rowley > wrote: >> I've attached a patch which implements this. > > Please disregards previous patch. (I forgot git commit before git diff > to make the patch) > > I've attached the correct patch. Moved to CF 2017-03. (You are the last one.) -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Index Scans
On Wed, Feb 1, 2017 at 10:20 PM, Amit Kapila wrote: > makes sense, so changed accordingly. I have moved this patch to CF 2017-03. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.
On Tue, Jan 24, 2017 at 8:36 PM, Kuntal Ghosh wrote: > Nothing else to add from my side. I'm marking this 'Ready for commiter'. Moved to CF 2017-03 with the same status. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add GUCs for predicate lock promotion thresholds
On Tue, Jan 24, 2017 at 3:32 AM, Kevin Grittner wrote: > I will need some time to consider that Moved to CF 2017-03 for now. The last patch sent still applies. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autonomous transactions
On Sat, Jan 7, 2017 at 5:30 PM, Andrey Borodin wrote: > The new status of this patch is: Ready for Committer I don't think that this thread has reached a conclusion yet. From what I can see the last patch does not apply, so I have moved the patch to next CF with "waiting on author". -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_background contrib module proposal
On Fri, Jan 27, 2017 at 11:38 PM, Robert Haas wrote: > On Fri, Jan 27, 2017 at 9:14 AM, Peter Eisentraut > wrote: >> On 1/19/17 12:47 PM, Andrey Borodin wrote: >>> 4. There is some controversy on where implemented feature shall be: in >>> separate extension (as in this patch), in db_link, in some PL API, in FDW >>> or somewhere else. I think that new extension is an appropriate place for >>> the feature. But I’m not certain. >> >> I suppose we should decide first whether we want pg_background as a >> separate extension or rather pursue extending dblink as proposed elsewhere. >> >> I don't know if pg_background allows any use case that dblink can't >> handle (yet). > > For the record, I have no big problem with extending dblink to allow > this instead of adding pg_background. But I think we should try to > get one or the other done in time for this release. Moved to CF 2017-03 as the discussion is not over yet. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
At Thu, 2 Feb 2017 08:46:11 +0900, Michael Paquier wrote in > On Thu, Feb 2, 2017 at 2:14 AM, Fujii Masao wrote: > > The lwlock would be released when an exception occurs, so I don't think > > that TRY-CATCH is necessary here. Or it's necessary for another reason? > > +PG_CATCH(); > +{ > +LWLockRelease(LogicalRepLauncherLock); > +PG_RE_THROW(); > +} > +PG_END_TRY(); > Just to do that, a TRY/CATCH block looks like an overkill to me. Why > not just call LWLockRelease in the ERROR and return code paths? I though the same first. The modification at the "if (wrconn ==" is the remains of that. It is reverted inthe attached patch. Then, the reason for the TRY-CATCH cluase is that I found that some functions called from there can throw exceptions. logicalrep_worker_stop and replorigin_drop have ereport in its path. load_library apparently can throw exception. (walrcv_(libpq_) functions don't seeem to.) regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From d0ca653bb2aa776742a2e7a697b02794b1ad66d9 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 2 Feb 2017 11:33:40 +0900 Subject: [PATCH] Fix DROP SUBSCRIPTION's lock leak. DROP SUBSCRIPTION acquires a lock on LogicalRepLauncherLock but never releases. This fixes it. --- src/backend/commands/subscriptioncmds.c | 86 ++--- 1 file changed, 48 insertions(+), 38 deletions(-) diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c index 5de..223eea4 100644 --- a/src/backend/commands/subscriptioncmds.c +++ b/src/backend/commands/subscriptioncmds.c @@ -511,52 +511,62 @@ DropSubscription(DropSubscriptionStmt *stmt) /* Protect against launcher restarting the worker. */ LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE); - /* Kill the apply worker so that the slot becomes accessible. */ - logicalrep_worker_stop(subid); - - /* Remove the origin tracking if exists. */ - snprintf(originname, sizeof(originname), "pg_%u", subid); - originid = replorigin_by_name(originname, true); - if (originid != InvalidRepOriginId) - replorigin_drop(originid); - - /* If the user asked to not drop the slot, we are done mow.*/ - if (!stmt->drop_slot) - { - heap_close(rel, NoLock); - return; - } - /* - * Otherwise drop the replication slot at the publisher node using - * the replication connection. + * Some functions called here can throw exceptions. we must release + * LogicalRepLauncherLock for the case. */ - load_file("libpqwalreceiver", false); + PG_TRY(); + { + /* Kill the apply worker so that the slot becomes accessible. */ + logicalrep_worker_stop(subid); - initStringInfo(&cmd); - appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname); + /* Remove the origin tracking if exists. */ + snprintf(originname, sizeof(originname), "pg_%u", subid); + originid = replorigin_by_name(originname, true); + if (originid != InvalidRepOriginId) + replorigin_drop(originid); - wrconn = walrcv_connect(conninfo, true, subname, &err); - if (wrconn == NULL) - ereport(ERROR, -(errmsg("could not connect to publisher when attempting to " - "drop the replication slot \"%s\"", slotname), - errdetail("The error was: %s", err))); + /* Do the follwoing only if the user asked to actually drop the slot */ + if (stmt->drop_slot) + { + /* + * Drop the replication slot at the publisher node using the + * replication connection. 
+ */ + load_file("libpqwalreceiver", false); - if (!walrcv_command(wrconn, cmd.data, &err)) - ereport(ERROR, -(errmsg("could not drop the replication slot \"%s\" on publisher", - slotname), - errdetail("The error was: %s", err))); - else - ereport(NOTICE, -(errmsg("dropped replication slot \"%s\" on publisher", - slotname))); + initStringInfo(&cmd); + appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname); + + wrconn = walrcv_connect(conninfo, true, subname, &err); + if (wrconn == NULL) +ereport(ERROR, + (errmsg("could not connect to publisher when attempting to " +"drop the replication slot \"%s\"", slotname), + errdetail("The error was: %s", err))); - walrcv_disconnect(wrconn); + else if (!walrcv_command(wrconn, cmd.data, &err)) +ereport(ERROR, + (errmsg("could not drop the replication slot \"%s\" on publisher", +slotname), + errdetail("The error was: %s", err))); + + ereport(NOTICE, + (errmsg("dropped replication slot \"%s\" on publisher", + slotname))); - pfree(cmd.data); + walrcv_disconnect(wrconn); + pfree(cmd.data); + } + } + PG_CATCH(); + { + LWLockRelease(LogicalRepLauncherLock); + PG_RE_THROW(); + } + PG_END_TRY(); + LWLockRelease(LogicalRepLauncherLock); heap_close(rel, NoLock); } -- 2.9.2 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sequence data type
On Wed, Feb 1, 2017 at 10:02 AM, Michael Paquier wrote: > On Wed, Feb 1, 2017 at 1:11 AM, Peter Eisentraut > wrote: >> And here is a rebased patch for the original feature. I think this >> addresses all raised concerns and suggestions now. > > Thanks for the new version. That looks good to me after an extra lookup. Moved to CF 2017-03. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2/1/17 4:27 PM, Andres Freund wrote:
> On 2017-02-02 09:22:46 +0900, Michael Paquier wrote:
>> On Thu, Feb 2, 2017 at 9:17 AM, Jim Nasby wrote:
>>> Speaking of which... I have a meeting in 15 minutes to discuss moving
>>> to a server with 4TB of memory. With current limits shared buffers
>>> maxes at 16TB, which isn't all that far in the future. While 16TB of
>>> shared buffers might not be a good idea, it's not going to be terribly
>>> long before we start getting questions about it.
>>
>> Time for int64 GUCs?
>
> I don't think the GUC bit is the hard part. We'd possibly need some
> trickery (like not storing bufferid in BufferDesc anymore) to avoid
> increasing memory usage.

Before doing that the first thing to look at would be why the limit is currently INT_MAX / 2 instead of INT_MAX.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com] > Per the documentation [1], "PQgetResult must be called repeatedly > until it returns a null pointer, indicating that the command is > done.". The code in PQgetResult() under CONNECTION_CHECK_WRITABLE > case, violates that. The patch fixes it. The patch however jumps to > keep_going without changing conn->status, which means that it will end > up again in the same case. While that's fine, may be we should use a > local loop on PQgetResult() to keep the code readable. Thank you for reviewing the patch. I created it with reference to pqSetenvPoll() in interfaces/libpq/fe-protocol2.c, but I certainly thought that readability is not good. I updated the patch, so I will add this to the next commitfest. Regards, Daisuke, Higuchi PQsendQuery_for_target_session_attrs_v2.patch Description: PQsendQuery_for_target_session_attrs_v2.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Thank you for the comment.

At Thu, 2 Feb 2017 01:26:03 +0900, Fujii Masao wrote in
> > The attached patch does that. Usually it reads page headers only
> > on segment boundaries, but once continuation record found (or
> > failed to read the next page header, that is, the first record on
> > the first page in the next segment has not been replicated), it
> > becomes to happen on every page boundary until non-continuation
> > page comes.
>
> I'm afraid that many WAL segments would start with a continuation record
> when there are the workload of short transactions (e.g., by pgbench), and
> which would make restart_lsn go behind very much. No?

Agreed. So it tries to release the lock at every page boundary, but restart_lsn goes behind a lot if many contiguous pages are CONTRECORD. I think the chance of that situation sticking for one or more segments is negligibly low, though. That being said, there *is* a possibility of false continuation, anyway.

> The discussion on this thread just makes me think that restart_lsn should
> indicate the replay location instead of flush location. This seems safer.

The standby restarts from minRecoveryPoint, which is a copy of XLogCtl->replayEndRecPtr and is updated by UpdateMinRecoveryPoint(). Meanwhile, applyPtr in reply messages is a copy of XLogCtl->lastReplayedEndRecptr, which is updated after the update of the on-disk minRecoveryPoint. It seems safe from that viewpoint.

On the other hand, apply is pausable. Records are copied and flushed on the standby, so the segments on the master that have already been sent can safely be removed even in that case. In spite of that, older segments on the master are kept from being removed during the pause. If applyPtr were used as restart_lsn, this could be another problem, and it is sure to happen.

I'm not sure how much possibility there is of several contiguous segments being full of contpages. But I think it's worse that an apply pause causes needless pg_wal flooding.

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL consistency check facility
On Tue, Jan 31, 2017 at 9:31 PM, Michael Paquier wrote: > On Wed, Feb 1, 2017 at 1:06 AM, Robert Haas wrote: >> On Thu, Jan 5, 2017 at 12:54 AM, Kuntal Ghosh >> wrote: >>> I've attached the patch with the modified changes. PFA. >> >> Can this patch check contrib/bloom? > > Only full pages are applied at redo by the generic WAL facility. So > you would finish by comparing a page with itself in generic_redo. generic_redo is more complicated than just restoring an FPI. I admit that generic_redo *probably* has no bugs, but I don't see why it would hurt to allow it to be checked. It's not IMPOSSIBLE that restoring the page delta could go wrong somehow. >> +/* >> + * For a speculative tuple, the content of t_ctid is conflicting >> + * between the backup page and current page. Hence, we set it >> + * to the current block number and current offset. >> + */ >> >> Why does it differ? Is that a bug? > > This has been discussed twice in this thread, once by me, once by Alvaro: > https://www.postgresql.org/message-id/CAM3SWZQC8nUgp8SjKDY3d74VLpdf9puHc7-n3zf4xcr_bghPzg%40mail.gmail.com > https://www.postgresql.org/message-id/CAB7nPqQTLGvn_XePjS07kZMMw46kS6S7LfsTocK%2BgLpTN0bcZw%40mail.gmail.com Sorry, I missed/forgot about that. I think this is evidence that the comment isn't really good enough. It says "hey, this might differ" ... with no real explanation of why it might differ or why that's OK. If there were an explanation there, I wouldn't have flagged it. >> +/* >> + * Leave if no masking functions defined, this is possible in the case >> + * resource managers generating just full page writes, comparing an >> + * image to itself has no meaning in those cases. >> + */ >> +if (RmgrTable[rmid].rm_mask == NULL) >> +return; >> >> ...and also... >> >> +/* >> + * This feature is enabled only for the resource managers where >> + * a masking function is defined. >> + */ >> +for (i = 0; i <= RM_MAX_ID; i++) >> +{ >> +if (RmgrTable[i].rm_mask != NULL) >> >> Why do we assume that the feature is only enabled for RMs that have a >> mask function? Why not instead assume that if there's no masking >> function, no masking is required? > > Not all RMs work on full pages. Tracking in smgr.h the list of RMs > that are no-ops when masking things is easier than having empty > routines declared all over the code base. Don't you think so> Sure, but I don't think that's what the code is doing. If an RM is a no-op when masking things, it must define an empty function. If it defines no function, checking is disabled. I think we want to be able to check any RM. No? >> +void(*rm_mask) (char *page, BlockNumber blkno); >> >> Could the page be passed as type "Page" rather than a "char *" to make >> things more convenient for the masking functions? If not, could those >> functions at least do something like "Page page = (Page) pagebytes;" >> rather than "Page page_norm = (Page) page;"? > > xlog_internal.h is used as well by frontends, this makes the header > dependencies cleaner. (I have looked at using Page when hacking this > stuff, but the header dependencies are not worth it, I don't recall > all the details though this was a couple of months back). OK. I still think page_norm is a lame variable name. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
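For readers not following the patch closely, the hook has the shape shown in the quoted hunk, and a masking callback is ordinary page-editing code run on both images before they are compared. A trivial illustrative masker (not one of the patch's real ones) might hide only the page LSN:

/* callback added to RmgrTable, per the quoted hunk */
void		(*rm_mask) (char *page, BlockNumber blkno);

/* illustrative example only */
static void
example_mask(char *page, BlockNumber blkno)
{
	/* LSNs legitimately differ between master and standby images */
	PageSetLSN(page, (XLogRecPtr) 0);
}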
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2017-02-01 20:38:58 -0500, Robert Haas wrote: > On Wed, Feb 1, 2017 at 8:35 PM, Andres Freund wrote: > > On 2017-02-01 20:30:30 -0500, Robert Haas wrote: > >> On Wed, Feb 1, 2017 at 7:28 PM, Andres Freund wrote: > >> > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: > >> >> With current limits, the most bgwriter can do (with 8k pages) is 1000 > >> >> pages > >> >> * 100 times/sec = 780MB/s. It's not hard to exceed that with modern > >> >> hardware. Should we increase the limit on bgwriter_lru_maxpages? > >> > > >> > FWIW, I think working on replacing bgwriter (e.g. by working on the > >> > patch I send with a POC replacement) wholesale is a better approach than > >> > spending time increasing limits. > >> > >> I'm happy to see it replaced, but increasing the limits is about three > >> orders of magnitude less work than replacing it, so let's not block > >> this on the theory that the other thing would be better. > > > > I seriously doubt you can meaningfully exceed 780MB/s with the current > > bgwriter. So it's not like the limits are all that relevant right now. > > Sigh. The patch is harmless and there are 4 or 5 votes in favor of > it, one of which clearly states that the person involved has hit seen > it be a problem in real workloads. Do we really have to argue about > this? I don't mind increasing the limit, it's harmless. I just seriously doubt it actually addresses any sort of problem. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Wed, Feb 1, 2017 at 8:35 PM, Andres Freund wrote: > On 2017-02-01 20:30:30 -0500, Robert Haas wrote: >> On Wed, Feb 1, 2017 at 7:28 PM, Andres Freund wrote: >> > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: >> >> With current limits, the most bgwriter can do (with 8k pages) is 1000 >> >> pages >> >> * 100 times/sec = 780MB/s. It's not hard to exceed that with modern >> >> hardware. Should we increase the limit on bgwriter_lru_maxpages? >> > >> > FWIW, I think working on replacing bgwriter (e.g. by working on the >> > patch I send with a POC replacement) wholesale is a better approach than >> > spending time increasing limits. >> >> I'm happy to see it replaced, but increasing the limits is about three >> orders of magnitude less work than replacing it, so let's not block >> this on the theory that the other thing would be better. > > I seriously doubt you can meaningfully exceed 780MB/s with the current > bgwriter. So it's not like the limits are all that relevant right now. Sigh. The patch is harmless and there are 4 or 5 votes in favor of it, one of which clearly states that the person involved has hit seen it be a problem in real workloads. Do we really have to argue about this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2017-02-01 20:30:30 -0500, Robert Haas wrote: > On Wed, Feb 1, 2017 at 7:28 PM, Andres Freund wrote: > > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: > >> With current limits, the most bgwriter can do (with 8k pages) is 1000 pages > >> * 100 times/sec = 780MB/s. It's not hard to exceed that with modern > >> hardware. Should we increase the limit on bgwriter_lru_maxpages? > > > > FWIW, I think working on replacing bgwriter (e.g. by working on the > > patch I send with a POC replacement) wholesale is a better approach than > > spending time increasing limits. > > I'm happy to see it replaced, but increasing the limits is about three > orders of magnitude less work than replacing it, so let's not block > this on the theory that the other thing would be better. I seriously doubt you can meaningfully exceed 780MB/s with the current bgwriter. So it's not like the limits are all that relevant right now. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Wed, Feb 1, 2017 at 7:28 PM, Andres Freund wrote: > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: >> With current limits, the most bgwriter can do (with 8k pages) is 1000 pages >> * 100 times/sec = 780MB/s. It's not hard to exceed that with modern >> hardware. Should we increase the limit on bgwriter_lru_maxpages? > > FWIW, I think working on replacing bgwriter (e.g. by working on the > patch I send with a POC replacement) wholesale is a better approach than > spending time increasing limits. I'm happy to see it replaced, but increasing the limits is about three orders of magnitude less work than replacing it, so let's not block this on the theory that the other thing would be better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: > With current limits, the most bgwriter can do (with 8k pages) is 1000 pages > * 100 times/sec = 780MB/s. It's not hard to exceed that with modern > hardware. Should we increase the limit on bgwriter_lru_maxpages? FWIW, I think working on replacing bgwriter (e.g. by working on the patch I send with a POC replacement) wholesale is a better approach than spending time increasing limits. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2017-02-02 09:22:46 +0900, Michael Paquier wrote: > On Thu, Feb 2, 2017 at 9:17 AM, Jim Nasby wrote: > > Speaking of which... I have a meeting in 15 minutes to discuss moving to a > > server with 4TB of memory. With current limits shared buffers maxes at 16TB, > > which isn't all that far in the future. While 16TB of shared buffers might > > not be a good idea, it's not going to be terribly long before we start > > getting questions about it. > > Time for int64 GUCs? I don't think the GUC bit is the hard part. We'd possibly need some trickery (like not storing bufferid in BufferDesc anymore) to avoid increasing memory usage. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Thu, Feb 2, 2017 at 9:17 AM, Jim Nasby wrote: > Speaking of which... I have a meeting in 15 minutes to discuss moving to a > server with 4TB of memory. With current limits shared buffers maxes at 16TB, > which isn't all that far in the future. While 16TB of shared buffers might > not be a good idea, it's not going to be terribly long before we start > getting questions about it. Time for int64 GUCs? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2/1/17 3:36 PM, Michael Paquier wrote: On Thu, Feb 2, 2017 at 7:01 AM, Jim Nasby wrote: On 2/1/17 10:27 AM, Robert Haas wrote: This looks fine to me. This could go without the comments, they are likely going to be forgotten if any updates happen in the future. I'm confused... I put the comments in there so if max shared buffers ever changed the other one would hopefully be updated as well. Speaking of which... I have a meeting in 15 minutes to discuss moving to a server with 4TB of memory. With current limits shared buffers maxes at 16TB, which isn't all that far in the future. While 16TB of shared buffers might not be a good idea, it's not going to be terribly long before we start getting questions about it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
On Thu, Feb 2, 2017 at 2:14 AM, Fujii Masao wrote: > The lwlock would be released when an exception occurs, so I don't think > that TRY-CATCH is necessary here. Or it's necessary for another reason? +PG_CATCH(); +{ +LWLockRelease(LogicalRepLauncherLock); +PG_RE_THROW(); +} +PG_END_TRY(); Just to do that, a TRY/CATCH block looks like overkill to me. Why not just call LWLockRelease in the ERROR and return code paths? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
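Here is a sketch of the alternative being suggested: release the lock explicitly on the error path instead of wrapping the code in PG_TRY/PG_CATCH. The function and its failure condition are invented for illustration; LogicalRepLauncherLock is taken from the hunk quoted above, and LWLockAcquire/LWLockRelease/ereport are the only real APIs used.

#include "postgres.h"
#include "storage/lwlock.h"

static void
stop_worker_example(bool worker_found)
{
    LWLockAcquire(LogicalRepLauncherLock, LW_SHARED);

    if (!worker_found)
    {
        /* release before erroring out, instead of relying on PG_CATCH */
        LWLockRelease(LogicalRepLauncherLock);
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_OBJECT),
                 errmsg("logical replication worker does not exist")));
    }

    /* ... do the work that needs the lock ... */

    LWLockRelease(LogicalRepLauncherLock);
}

As Fujii notes, lwlocks are released automatically when an error propagates, so the explicit release before ereport() is mainly about not holding the lock while error processing begins.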
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
Hi, On 2017-02-02 00:09:03 +0100, Erik Rijkers wrote: > Something is broken in HEAD: Hm. Indeed. > drop table if exists t; > create table t(c text); > insert into t (c) values ( 'abc' ) ; > > select > regexp_split_to_array( > regexp_split_to_table( > c > , chr(13) || chr(10) ) > , '","' ) > as a > , > regexp_split_to_table( > c > , chr(13) || chr(10) > ) > as rw > from t > ; > > TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180) > I realise the regexp* functions aren't doing anything particularly useful > anymore here; they did in the more complicated original (which I had used > for years). The issue is that we're generating ProjectSet nodes instead of Result for the top-level nodes - but we currently assume that ProjectSet nodes actually contain sets. I'm tentatively in favor of generating proper Result nodes in this case, rather than allowing this in ProjectSet - on the other hand, it works after removing that Assert(). Tom, do you have an opinion? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Thu, Feb 2, 2017 at 7:01 AM, Jim Nasby wrote: > On 2/1/17 10:27 AM, Robert Haas wrote: >> This looks fine to me. This could go without the comments, they are likely going to be forgotten if any updates happen in the future. > If someone wants to proactively commit this, the CF entry is > https://commitfest.postgresql.org/13/979/. > (BTW, the Jan. CF is still showing as in-progress...) WIP. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
Something is broken in HEAD: drop table if exists t; create table t(c text); insert into t (c) values ( 'abc' ) ; select regexp_split_to_array( regexp_split_to_table( c , chr(13) || chr(10) ) , '","' ) as a , regexp_split_to_table( c , chr(13) || chr(10) ) as rw from t ; TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180) I realise the regexp* functions aren't doing anything particularly useful anymore here; they did in the more complicated original (which I had used for years). thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cast jsonb to numeric, int, float, bool
On 02.02.2017 01:07, Jim Nasby wrote: On 2/1/17 8:26 AM, Nikita Glukhov wrote: Some comments about the code: I think it would be better to * add function for extraction of scalars from pseudo-arrays * iterate until WJB_DONE to pfree iterator I'm not sure what your intent here is, but if the idea is that a json array would magically cast to a bool or a number data type I think that's a bad idea. My idea, of course, is not about casting any json array to a scalar. It is only about a helper subroutine for the extraction of top-level jsonb scalars, which are always stored as one-element pseudo-arrays with the special flag F_SCALAR in the array header. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
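For readers unfamiliar with the layout being described, here is a rough sketch (not code from the posted patch) of such a helper: it pulls the single scalar out of a root-level jsonb, which is stored as a one-element raw-scalar pseudo-array, and keeps iterating to WJB_DONE so the iterator memory is released.

#include "postgres.h"
#include "utils/jsonb.h"

static bool
jsonb_extract_scalar(Jsonb *jb, JsonbValue *result)
{
    JsonbIterator *it;
    JsonbValue  tmp;
    JsonbIteratorToken tok;

    if (!JB_ROOT_IS_SCALAR(jb))
        return false;           /* not a top-level scalar */

    it = JsonbIteratorInit(&jb->root);

    tok = JsonbIteratorNext(&it, &tmp, true);
    Assert(tok == WJB_BEGIN_ARRAY && tmp.val.array.rawScalar);

    tok = JsonbIteratorNext(&it, result, true);
    Assert(tok == WJB_ELEM);

    /* consume WJB_END_ARRAY and WJB_DONE so the iterator is freed */
    while ((tok = JsonbIteratorNext(&it, &tmp, true)) != WJB_DONE)
        ;

    return true;
}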
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Wed, Feb 1, 2017 at 6:13 PM, Masahiko Sawada wrote: > On Wed, Feb 1, 2017 at 10:02 PM, Claudio Freire > wrote: >> On Wed, Feb 1, 2017 at 5:47 PM, Masahiko Sawada >> wrote: >>> Thank you for updating the patch. >>> >>> Whole patch looks good to me except for the following one comment. >>> This is the final comment from me. >>> >>> /* >>> * lazy_tid_reaped() -- is a particular tid deletable? >>> * >>> * This has the right signature to be an IndexBulkDeleteCallback. >>> * >>> * Assumes dead_tuples array is in sorted order. >>> */ >>> static bool >>> lazy_tid_reaped(ItemPointer itemptr, void *state) >>> { >>> LVRelStats *vacrelstats = (LVRelStats *) state; >>> >>> You might want to update the comment of lazy_tid_reaped() as well. >> >> I don't see the mismatch with reality there (if you consider >> "dead_tples array" in the proper context, that is, the multiarray). >> >> What in particular do you find out of sync there? > > The current lazy_tid_reaped just find a tid from a tid array using > bsearch but in your patch lazy_tid_reaped handles multiple tid arrays > and processing method become complicated. So I thought it's better to > add the description of this function. Alright, updated with some more remarks that seemed relevant From e37d29c26210a0f23cd2e9fe18a264312fecd383 Mon Sep 17 00:00:00 2001 From: Claudio Freire Date: Mon, 12 Sep 2016 23:36:42 -0300 Subject: [PATCH] Vacuum: allow using more than 1GB work mem Turn the dead_tuples array into a structure composed of several exponentially bigger arrays, to enable usage of more than 1GB of work mem during vacuum and thus reduce the number of full index scans necessary to remove all dead tids when the memory is available. --- src/backend/commands/vacuumlazy.c| 318 --- src/test/regress/expected/vacuum.out | 8 + src/test/regress/sql/vacuum.sql | 10 ++ 3 files changed, 271 insertions(+), 65 deletions(-) diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 005440e..4a6895c 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -12,11 +12,24 @@ * * We are willing to use at most maintenance_work_mem (or perhaps * autovacuum_work_mem) memory space to keep track of dead tuples. We - * initially allocate an array of TIDs of that size, with an upper limit that + * initially allocate an array of TIDs of 128MB, or an upper limit that * depends on table size (this limit ensures we don't allocate a huge area - * uselessly for vacuuming small tables). If the array threatens to overflow, - * we suspend the heap scan phase and perform a pass of index cleanup and page - * compaction, then resume the heap scan with an empty TID array. + * uselessly for vacuuming small tables). Additional arrays of increasingly + * large sizes are allocated as they become necessary. + * + * The TID array is thus represented as a list of multiple segments of + * varying size, beginning with the initial size of up to 128MB, and growing + * exponentially until the whole budget of (autovacuum_)maintenance_work_mem + * is used up. + * + * Lookup in that structure proceeds sequentially in the list of segments, + * and with a binary search within each segment. Since segment's size grows + * exponentially, this retains O(N log N) lookup complexity. 
+ * + * If the array threatens to overflow, we suspend the heap scan phase and + * perform a pass of index cleanup and page compaction, then resume the heap + * scan with an array of logically empty but already preallocated TID segments + * to be refilled with more dead tuple TIDs. * * If we're processing a table with no indexes, we can just vacuum each page * as we go; there's no need to save up multiple tuples to minimize the number @@ -93,6 +106,14 @@ #define LAZY_ALLOC_TUPLES MaxHeapTuplesPerPage /* + * Minimum (starting) size of the dead_tuples array segments. Will allocate + * space for 128MB worth of tid pointers in the first segment, further segments + * will grow in size exponentially. Don't make it too small or the segment list + * will grow bigger than the sweetspot for search efficiency on big vacuums. + */ +#define LAZY_MIN_TUPLES Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData)) + +/* * Before we consider skipping a page that's marked as clean in * visibility map, we must've seen at least this many clean pages. */ @@ -104,6 +125,27 @@ */ #define PREFETCH_SIZE ((BlockNumber) 32) +typedef struct DeadTuplesSegment +{ + int num_dead_tuples; /* # of entries in the segment */ + int max_dead_tuples; /* # of entries allocated in the segment */ + ItemPointerData last_dead_tuple; /* Copy of the last dead tuple (unset + * until the segment is fully + * populated) */ + unsigned short padding; + ItemPointer dt_tids; /* Array of dead tuples */ +} DeadTuplesSegment; + +typedef struct DeadTuplesMultiArray +{ + int num_entries; /* current
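To make the lookup side of the multiarray concrete, here is a sketch of the two-level search the design implies: walk the segments, pick the first one whose last_dead_tuple is not smaller than the searched TID, then bsearch inside it. This is not the patch's code; last_seg, last_dead_tuple, dt_tids and num_dead_tuples follow the diff above, while dt_segments and the comparator itemptr_cmp are assumed names. Because segment sizes grow exponentially, the per-lookup cost stays logarithmic, as the header comment claims.

#include "postgres.h"
#include "storage/itemptr.h"

/* bsearch()-compatible comparator over ItemPointerData */
static int
itemptr_cmp(const void *a, const void *b)
{
    return ItemPointerCompare((ItemPointer) a, (ItemPointer) b);
}

static bool
multiarray_tid_reaped(DeadTuplesMultiArray *dt, ItemPointer itemptr)
{
    int         i;

    for (i = 0; i <= dt->last_seg; i++)
    {
        DeadTuplesSegment *seg = &dt->dt_segments[i];   /* assumed field name */

        /* TIDs are globally sorted, so skip segments that end before us */
        if (ItemPointerCompare(itemptr, &seg->last_dead_tuple) > 0)
            continue;

        return bsearch(itemptr, seg->dt_tids, seg->num_dead_tuples,
                       sizeof(ItemPointerData), itemptr_cmp) != NULL;
    }

    return false;
}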
Re: [HACKERS] multivariate statistics (v19)
Still looking at 0002. pg_ndistinct_in disallows input, claiming that pg_node_tree does the same thing. But pg_node_tree does it for security reasons: you could crash the backend if you supplied a malicious value. I don't think that applies to pg_ndistinct_in. Perhaps it will be useful to inject fake stats at some point, so why not allow it? It shouldn't be complicated (though it does require writing some additional code, so perhaps that's one reason we don't want to allow input of these values). The comment on top of pg_ndistinct_out is missing the "_out"; also it talks about histograms, which is not what this is about. In the same function, a trivial point you don't need to pstrdup() the .data out of a stringinfo; it's already palloc'ed in the right context -- just PG_RETURN_CSTRING(str.data) and forget about "ret". Saves you one line. Nearby, some auxiliary functions such as n_choose_k and num_combinations are not documented. What it is that they do? I'd move these at the end of the file, keeping the important entry points at the top of the file. I see this patch has a estimate_ndistinct() which claims to be a re- implementation of code already in analyze.c, but it is actually a lot simpler than what analyze.c does. I've been wondering if it'd be a good idea to use some of this code so that some routines are moved out of analyze.c; good implementations of statistics-related functions would live in src/backend/statistics/ where they can be used both by analyze.c and your new mvstats stuff. (More generally I am beginning to wonder if the new directory should be just src/backend/statistics.) common.h does not belong in src/backend/utils/mvstats; IMO it should be called src/include/utils/mvstat.h. Also, it must not include postgres.h, and it probably doesn't need most of the #includes it has; those are better put into whatever include it. It definitely needs a guarding #ifdef MVSTATS_H around its whole content too. An include file is not just a way to avoid #includes in other files; it is supposed to be a minimally invasive way of exporting the structs and functions implemented in some file into other files. So it must be kept minimal. psql/tab-complete.c compares the wrong version number (9.6 instead of 10). Is it important to have a cast from pg_ndistinct to bytea? I think it's odd that outputting it as bytea yields something completely different than as text. (The bytea is not human readable and cannot be used for future input, so what is the point?) In another subthread you seem to have surrendered to the opinion that the new catalog should be called pg_statistics_ext, just in case in the future we come up with additional things to put on it. However, given its schema, with a "starelid / stakeys", is it sensible to think that we're going to get anything other than something that involves multiple variables? Maybe it should just be "pg_statistics_multivar" and if something else comes along we create another catalog with an appropriate schema. Heck, how does this catalog serve the purpose of cross-table statistics in the first place, given that it has room to record a single relid only? Are you thinking that in the future you'd change starelid into an oidvector column? The comment in gram.y about the CREATE STATISTICS is at odds with what is actually allowed by the grammar. I think the name of a statistics is only useful to DROP/ALTER it, right? I wonder why it's useful that statistics belongs in a schema. Perhaps it should be a global object? 
I suppose the name collisions would become bothersome if you have many mvstats. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
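As an aside for readers, helpers with those names usually amount to something like the sketch below: a binomial coefficient computed so the intermediate values stay small, and the number of attribute combinations of size 2..n. This is an illustrative guess, not the code from the patch under review.

#include "postgres.h"
#include <math.h>

/* C(n, k), computed in floating point to avoid integer overflow */
static double
n_choose_k(int n, int k)
{
    double      result = 1.0;
    int         i;

    if (k > n - k)
        k = n - k;              /* use the symmetry C(n,k) = C(n,n-k) */

    for (i = 1; i <= k; i++)
        result = result * (n - k + i) / i;

    return result;
}

/* number of combinations of at least two out of n attributes */
static double
num_combinations(int n)
{
    return pow(2.0, n) - n - 1;
}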
Re: [HACKERS] Cast jsonb to numeric, int, float, bool
On 2/1/17 8:26 AM, Nikita Glukhov wrote: If you find it useful, I can also add support of json and other types, such as smallint and bigint. Yes, I'd like to have support for other types and maybe for json. There's been previous discussion about something similar, which would be better supporting "casting an unknown to smallint". IIRC there's some non-trivial problem with trying to support that. Some comments about the code: I think it would be better to * add function for extraction of scalars from pseudo-arrays * iterate until WJB_DONE to pfree iterator I'm not sure what your intent here is, but if the idea is that a json array would magically cast to a bool or a number data type I think that's a bad idea. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Alvaro Herrera writes: > Tom Lane wrote: >> I think what we ought to do about this is invent additional API >> functions, say >> >> Oid CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, >> CatalogIndexState indstate); >> void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, >> HeapTuple tup, CatalogIndexState indstate); >> >> and use these in place of simple_heap_foo plus CatalogIndexInsert >> in the places where this optimization had been applied. > This looks reasonable enough to me. Done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2/1/17 10:27 AM, Robert Haas wrote: On Tue, Jan 31, 2017 at 5:07 PM, Jim Nasby wrote: On 11/29/16 9:58 AM, Jeff Janes wrote: Considering a single SSD can do 70% of that limit, I would say yes. Next question becomes... should there even be an upper limit? Where the contortions needed to prevent calculation overflow become annoying? I'm not a big fan of nannyism in general, but the limits on this parameter seem particularly pointless. You can't write out more buffers than exist in the dirty state, nor more than implied by bgwriter_lru_multiplier. So what is really the worse that can happen if you make it too high? Attached is a patch that ups the limit to INT_MAX / 2, which is the same as shared_buffers. This looks fine to me. If someone wants to proactively commit this, the CF entry is https://commitfest.postgresql.org/13/979/. (BTW, the Jan. CF is still showing as in-progress...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
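For reference, the change under discussion boils down to widening the allowed range of the bgwriter_lru_maxpages entry in guc.c, roughly as sketched below (illustrative; the entry format and the default of 100 are from the existing table, and the new upper bound follows the description above).

{
    {"bgwriter_lru_maxpages", PGC_SIGHUP, RESOURCES_BGWRITER,
        gettext_noop("Background writer maximum number of LRU pages to flush per round."),
        NULL
    },
    &bgwriter_lru_maxpages,
    100, 0, INT_MAX / 2,        /* previously capped at 1000 */
    NULL, NULL, NULL
},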
Re: [HACKERS] Refactoring of replication commands using printsimple
On Thu, Feb 2, 2017 at 4:05 AM, Robert Haas wrote: > On Tue, Jan 31, 2017 at 8:26 PM, Michael Paquier > wrote: >> pq_sendcountedtext() does some encoding conversion, which is why I >> haven't used because we deal only with integers in this patch... Now >> if you wish to switch to that I have really no arguments against. > > OK, I see. Well, I guess it's sensible either way, but I've committed > this version. Thanks. Thanks. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
On 2/1/17 12:03 PM, Fabien COELHO wrote: I'm unsure whether it is a good idea, I like terse interfaces, but this is just an opinion. I think the issue here is that the original case for this is a user accidentally getting into an \if and then having no clue what's going on. That's similar to what happens when you miss a quote or a semicolon. We handle those cases with %R, and I think %R needs to support if as well. Perhaps there's value to providing more info (active branch, etc), but ISTM trying to do that will just confuse the original (%R) case. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Make psql's \set display variables in alphabetical order.
On Wed, Feb 01, 2017 at 04:38:59PM -0500, Robert Haas wrote: > On Wed, Feb 1, 2017 at 1:08 PM, Andres Freund wrote: > > On 2017-02-01 12:59:36 -0500, Tom Lane wrote: > >> David Fetter writes: > >> > On Wed, Feb 01, 2017 at 04:25:25PM +, Tom Lane wrote: > >> >> Make psql's \set display variables in alphabetical order. > >> > >> > This is a substantial usability improvement which nevertheless is hard > >> > to imagine changes things that scripts relied on. I think it's worth > >> > back-patching. > >> > >> I'm not that excited about it personally, but I agree it would be unlikely > >> to break anything. Other votes? > > > > -0.5 - I see very little reason to backpatch this over dozes of other > > changes. > > I'll vote -1. I don't think this is a bug fix. I withdraw my suggestion. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
On 1/27/17 4:14 AM, Greg Stark wrote: On 25 January 2017 at 20:06, Jim Nasby wrote: GUCs support SET LOCAL, but that's not the same as local scoping because the setting stays in effect unless the substrans aborts. What I'd like is the ability to set a GUC in a plpgsql block *and have the setting revert on block exit*. I'm wondering which GUCs you have in mind to use this with. It's been quite some time since I messed with this; the only case I remember right now is wanting to do a temporary SET ROLE in an "exec" function: SELECT tools.exec( 'some sql;', role := 'superuser_role' ); To do that, exec has to remember what the current role is and then set it back (as well as remembering to do SET LOCAL in case an error happens. Because what you're describing is dynamic scoping and I'm wondering if what you're really looking for is lexical scoping. That would be more in line with how PL/PgSQL variables are scoped and with how #pragmas usually work. But it would probably not be easy to reconcile with how GUCs work. Right, because GUCs aren't even simply dynamically scoped; they're dynamically scoped with transaction support. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Make psql's \set display variables in alphabetical order.
On Wed, Feb 1, 2017 at 1:08 PM, Andres Freund wrote: > On 2017-02-01 12:59:36 -0500, Tom Lane wrote: >> David Fetter writes: >> > On Wed, Feb 01, 2017 at 04:25:25PM +, Tom Lane wrote: >> >> Make psql's \set display variables in alphabetical order. >> >> > This is a substantial usability improvement which nevertheless is hard >> > to imagine changes things that scripts relied on. I think it's worth >> > back-patching. >> >> I'm not that excited about it personally, but I agree it would be unlikely >> to break anything. Other votes? > > -0.5 - I see very little reason to backpatch this over dozes of other > changes. I'll vote -1. I don't think this is a bug fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding of two-phase transactions
On 2 Feb. 2017 08:32, "Tom Lane" wrote: Robert Haas writes: > Also, including the GID in the WAL for each COMMIT/ABORT PREPARED > doesn't seem inordinately expensive to me. I'm confused ... isn't it there already? If not, how do we handle reconstructing 2PC state from WAL at all? Right. Per my comments upthread I don't see why we need to add anything more to WAL here. Stas was concerned about what happens in logical decoding if we crash between PREPARE TRANSACTION and COMMIT PREPARED. But we'll always go back and decode the whole txn again anyway so it doesn't matter. We can just track it on ReorderBufferTxn when we see it at PREPARE TRANSACTION time.
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Tom Lane wrote: > The source of both of those problems is that in some places, we > did CatalogOpenIndexes and then used the CatalogIndexState for > multiple tuple inserts/updates before doing CatalogCloseIndexes. > The patch dealt with these either by not touching them, just > leaving the simple_heap_insert/update calls in place (thus failing > to create any abstraction), or by blithely ignoring the optimization > and doing s/simple_heap_insert/CatalogTupleInsert/ anyway. For example, > in inv_api.c we are now doing a CatalogOpenIndexes/CatalogCloseIndexes > cycle for each chunk of the large object ... and just to add insult to > injury, the now-useless open/close calls outside the loop are still there. Ouch. You're right, I missed that. > I think what we ought to do about this is invent additional API > functions, say > > Oid CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, >CatalogIndexState indstate); > void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, > HeapTuple tup, CatalogIndexState indstate); > > and use these in place of simple_heap_foo plus CatalogIndexInsert > in the places where this optimization had been applied. This looks reasonable enough to me. > An alternative but much more complicated fix would be to get rid of > the necessity for callers to worry about this at all, by caching > a CatalogIndexState in the catalog's relcache entry. That might be > worth doing eventually (because it would allow sharing index info > collection across unrelated operations) but I don't want to do it today. Hmm, interesting idea. No disagreement on postponing. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ICU integration
I wrote: > Evidently collateral damage from 2f5c9d9c9. But I'd suggest waiting > to fix it until you can also do s/simple_heap_delete/CatalogTupleDelete/ > as I proposed in > https://www.postgresql.org/message-id/462.1485902...@sss.pgh.pa.us > I'll go make that happen right now, now that I realize there are patch(es) > waiting on it. Done. There's some more loose ends but they won't affect very many call sites, so you should be able to rebase now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Wed, Feb 1, 2017 at 10:02 PM, Claudio Freire wrote: > On Wed, Feb 1, 2017 at 5:47 PM, Masahiko Sawada wrote: >> Thank you for updating the patch. >> >> Whole patch looks good to me except for the following one comment. >> This is the final comment from me. >> >> /* >> * lazy_tid_reaped() -- is a particular tid deletable? >> * >> * This has the right signature to be an IndexBulkDeleteCallback. >> * >> * Assumes dead_tuples array is in sorted order. >> */ >> static bool >> lazy_tid_reaped(ItemPointer itemptr, void *state) >> { >> LVRelStats *vacrelstats = (LVRelStats *) state; >> >> You might want to update the comment of lazy_tid_reaped() as well. > > I don't see the mismatch with reality there (if you consider > "dead_tples array" in the proper context, that is, the multiarray). > > What in particular do you find out of sync there? The current lazy_tid_reaped() just finds a tid in a single tid array using bsearch, but in your patch lazy_tid_reaped() handles multiple tid arrays and the processing becomes more complicated. So I thought it would be better to expand the description of this function. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
On Mon, Jan 2, 2017 at 7:32 AM, Ashutosh Bapat wrote: > PFA the patch (pg_dp_join_v6.patch) with some bugs fixed and rebased > on the latest code. Maybe not surprisingly given how fast things are moving around here these days, this needs a rebase. Apart from that, my overall comment on this patch is that it's huge: 37 files changed, 7993 insertions(+), 287 deletions(-) Now, more than half of that is regression test cases and their output, which you will certainly be asked to pare down in any version of this intended for commit. But even excluding those, it's still a fairly patch: 30 files changed, 2783 insertions(+), 272 deletions(-) I think the reason this is so large is because there's a fair amount of refactoring work that has been done as a precondition of the actual meat of the patch, and no attempt has been made to separate the refactoring work from the main body of the patch. I think that's something that needs to be done. If you look at the way Amit Langote submitted the partitioning patches and the follow-up bug fixes, he had a series of patches 0001-blah, 0002-quux, etc. generated using format-patch. Each patch had its own commit message written by him explaining the purpose of that patch, links to relevant discussion, etc. If you can separate this into more digestible chunks it will be easier to get committed. Other questions/comments: Why does find_partition_scheme need to copy the partition bound information instead of just pointing to it? Amit went to some trouble to make sure that this can't change under us while we hold a lock on the relation, and we'd better hold a lock on the relation if we're planning a query against it. I think the PartitionScheme stuff should live in the optimizer rather that src/backend/catalog/partition.c. Maybe plancat.c? Perhaps we eventually need a new file in the optimizer just for partitioning stuff, but I'm not sure about that yet. The fact that set_append_rel_size needs to reopen the relation to extract a few more bits of information is not desirable. You need to fish this information through in some other way; for example, you could have get_relation_info() stash the needed bits in the RelOptInfo. +* For two partitioned tables with the same partitioning scheme, it is +* assumed that the Oids of matching partitions from both the tables +* are placed at the same position in the array of partition oids in Rather than saying that we assume this, you should say why it has to be true. (If it doesn't have to be true, we shouldn't assume it.) +* join relations. Partition tables should have same layout as the +* parent table and hence should not need any translation. But rest of The same attributes have to be present with the same types, but they can be rearranged. This comment seems to imply the contrary. FRACTION_PARTS_TO_PLAN seems like it should be a GUC. + /* +* Add this relation to the list of samples ordered by the increasing +* number of rows at appropriate place. +*/ + foreach (lc, ordered_child_nos) + { + int child_no = lfirst_int(lc); + RelOptInfo *other_childrel = rel->part_rels[child_no]; + + /* +* Keep track of child with lowest number of rows but higher than the +* that of the child being inserted. Insert the child before a +* child with highest number of rows lesser than it. +*/ + if (child_rel->rows <= other_childrel->rows) + insert_after = lc; + else + break; + } Can we use quicksort instead of a hand-coded insertion sort? + if (bms_num_members(outer_relids) > 1) Seems like bms_get_singleton_member could be used. 
+* Partitioning scheme in join relation indicates a possibilty that the Spelling. There seems to be no reason for create_partition_plan to be separated from create_plan_recurse. You can just add another case for the new path type. Why does create_partition_join_path need to be separate from create_partition_join_path_with_pathkeys? Couldn't that be combined into a single function with a pathkeys argument that might sometimes be NIL? I assume most of the logic is common. >From a sort of theoretical standpoint, the biggest danger of this patch seems to be that by deferring path creation until a later stage than normal, we could miss some important processing. subquery_planner() does a lot of stuff after expand_inherited_tables(); if any of those things, especially the ones that happen AFTER path generation, have an effect on the paths, then this code needs to compensate for those changes somehow. It seems like having the planning of uns
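Here is a sketch of the qsort-based alternative Robert is suggesting for ordering the children by row count. It is illustrative only: part_rels appears in the quoted code, while nparts and the function names are assumptions.

#include "postgres.h"
#include "nodes/relation.h"

static int
compare_childrel_rows(const void *a, const void *b)
{
    const RelOptInfo *rel_a = *(RelOptInfo *const *) a;
    const RelOptInfo *rel_b = *(RelOptInfo *const *) b;

    if (rel_a->rows < rel_b->rows)
        return -1;
    if (rel_a->rows > rel_b->rows)
        return 1;
    return 0;
}

static void
sort_children_by_rows(RelOptInfo **part_rels, int nparts)
{
    qsort(part_rels, nparts, sizeof(RelOptInfo *), compare_childrel_rows);
}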
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Wed, Feb 1, 2017 at 5:47 PM, Masahiko Sawada wrote: > Thank you for updating the patch. > > Whole patch looks good to me except for the following one comment. > This is the final comment from me. > > /* > * lazy_tid_reaped() -- is a particular tid deletable? > * > * This has the right signature to be an IndexBulkDeleteCallback. > * > * Assumes dead_tuples array is in sorted order. > */ > static bool > lazy_tid_reaped(ItemPointer itemptr, void *state) > { > LVRelStats *vacrelstats = (LVRelStats *) state; > > You might want to update the comment of lazy_tid_reaped() as well. I don't see the mismatch with reality there (if you consider "dead_tples array" in the proper context, that is, the multiarray). What in particular do you find out of sync there? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
Hi 2017-01-24 21:33 GMT+01:00 Pavel Stehule : > > > >> >>> Perhaps that's as simple as renaming all the existing _ns_* functions to >>> _block_ and then adding support for pragmas... >>> >>> Since you're adding cursor_options to PLpgSQL_expr it should probably be >>> removed as an option to exec_*. >>> >> >> I have to recheck it. Some cursor options going from dynamic cursor >> variables and are related to dynamic query - not query that creates query >> string. >> > > hmm .. so current state is better due using options like > CURSOR_OPT_PARALLEL_OK > > if (expr->plan == NULL) > exec_prepare_plan(estate, expr, (parallelOK ? > CURSOR_OPT_PARALLEL_OK : 0) | > expr->cursor_options); > > This options is not permanent feature of expression - and then I cannot to > remove cursor_option argument from exec_* > > I did minor cleaning - remove cursor_options from plpgsql_var > > Regards > > Pavel > + basic doc Regards Pavel diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index d3272e1..97c59db 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -802,6 +802,32 @@ $$ LANGUAGE plpgsql; happen in a plain SQL command. + + + Block level PRAGMA + + +PRAGMA +in PL/pgSQL + + + +The block level PRAGMA allows to change some +PL/pgSQL compiler behave. Currently +only PRAGMA PLAN_CACHE is supported. + + +CREATE FUNCTION enforce_fresh_plan(_id text) RETURNS boolean AS $$ +DECLARE + PRAGMA PLAN_CACHE(force_custom_plan); +BEGIN + -- in this block every embedded query uses one shot plan + RETURN EXISTS(SELECT * FROM tab WHERE id = _id); +END; +$$ LANGUAGE plpgsql; + + + @@ -4649,6 +4675,43 @@ $$ LANGUAGE plpgsql; use of the now() function would still be a better idea. + + + PRAGMA PLAN_CACHE + + + The plan cache behave can be controlled with PRAGMA PLAN_CACHE. + This PRAGMA can be used on function or on block level (per + function, per block). The following options are possible: + DEFAULT - default PL/pgSQL + implementation - the system try to decide between custom plan and generic + plan after five query executions, FORCE_CUSTOM_PLAN + - the execution plan is one shot plan - it is specific for every set of + used paramaters, FORCE_GENERIC_PLAN - the query plan + is generic from start. + + + + + PRAGMA PLAN_CACHE + in PL/pgSQL + + The plan for INSERT is generic from begin. The + PRAGMA PLAN_CACHE is related to function - etc. every command + in this function will use generic plan. 
+ +CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$ +PRAGMA PLAN_CACHE(FORCE_GENERIC_PLAN); +DECLARE +curtime timestamp; +BEGIN +curtime := 'now'; +INSERT INTO logtable VALUES (logtxt, curtime); +END; +$$ LANGUAGE plpgsql; + + + diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index b25b3f1..304fc91 100644 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -51,6 +51,8 @@ bool plpgsql_check_syntax = false; PLpgSQL_function *plpgsql_curr_compile; +PLpgSQL_directives *plpgsql_directives; + /* A context appropriate for short-term allocs during compilation */ MemoryContext plpgsql_compile_tmp_cxt; @@ -83,6 +85,11 @@ static const ExceptionLabelMap exception_label_map[] = { {NULL, 0} }; +PLpgSQL_directives default_directives = { + NULL, + true, /* is_function_scope */ + 0 /* no special cursor option */ +}; /* -- * static prototypes @@ -374,6 +381,9 @@ do_compile(FunctionCallInfo fcinfo, plpgsql_DumpExecTree = false; plpgsql_start_datums(); + /* set default compile directives */ + plpgsql_directives = &default_directives; + switch (function->fn_is_trigger) { case PLPGSQL_NOT_TRIGGER: @@ -852,6 +862,9 @@ plpgsql_compile_inline(char *proc_source) plpgsql_DumpExecTree = false; plpgsql_start_datums(); + /* set default compile directives */ + plpgsql_directives = &default_directives; + /* Set up as though in a function returning VOID */ function->fn_rettype = VOIDOID; function->fn_retset = false; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 6fc3db0..0a01bbe 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -2335,7 +2335,7 @@ exec_stmt_forc(PLpgSQL_execstate *estate, PLpgSQL_stmt_forc *stmt) Assert(query); if (query->plan == NULL) - exec_prepare_plan(estate, query, curvar->cursor_options); + exec_prepare_plan(estate, query, query->cursor_options); /* * Set up short-lived ParamListInfo @@ -3625,7 +3625,7 @@ exec_stmt_execsql(PLpgSQL_execstate *estate, { ListCell *l; - exec_prepare_plan(estate, expr, 0); + exec_prepare_plan(estate, expr, expr->cursor_options); stmt->mod_stmt = false; foreach(l, SPI_plan_get_plan_sources(expr->plan)) {
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
On Tue, Jan 31, 2017 at 3:05 AM, Claudio Freire wrote: > On Mon, Jan 30, 2017 at 5:51 AM, Masahiko Sawada > wrote: >> >> * We are willing to use at most maintenance_work_mem (or perhaps >> * autovacuum_work_mem) memory space to keep track of dead tuples. We >> * initially allocate an array of TIDs of that size, with an upper limit that >> * depends on table size (this limit ensures we don't allocate a huge area >> * uselessly for vacuuming small tables). If the array threatens to >> overflow, >> >> I think that we need to update the above paragraph comment at top of >> vacuumlazy.c file. > > Indeed, I missed that one. Fixing. > >> >> >> + numtuples = Max(numtuples, >> MaxHeapTuplesPerPage); >> + numtuples = Min(numtuples, INT_MAX / 2); >> + numtuples = Min(numtuples, 2 * >> pseg->max_dead_tuples); >> + numtuples = Min(numtuples, >> MaxAllocSize / sizeof(ItemPointerData)); >> + seg->dt_tids = (ItemPointer) >> palloc(sizeof(ItemPointerData) * numtuples); >> >> Why numtuples is limited to "INT_MAX / 2" but not INT_MAX? > > I forgot to mention this one in the OP. > > Googling around, I found out some implemetations of bsearch break with > array sizes beyond INT_MAX/2 [1] (they'd overflow when computing the > midpoint). > > Before this patch, this bsearch call had no way of reaching that size. > An initial version of the patch (the one that allocated a big array > with huge allocation) could reach that point, though, so I reduced the > limit to play it safe. This latest version is back to the starting > point, since it cannot allocate segments bigger than 1GB, but I opted > to keep playing it safe and leave the reduced limit just in case. > Thanks, I understood. >> >> @@ -1376,35 +1411,43 @@ lazy_vacuum_heap(Relation onerel, LVRelStats >> *vacrelstats) >> pg_rusage_init(&ru0); >> npages = 0; >> >> - tupindex = 0; >> - while (tupindex < vacrelstats->num_dead_tuples) >> + segindex = 0; >> + tottuples = 0; >> + for (segindex = tupindex = 0; segindex <= >> vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++) >> { >> - BlockNumber tblk; >> - Buffer buf; >> - Pagepage; >> - Sizefreespace; >> >> This is a minute thing but tupindex can be define inside of for loop. > > Right, changing. > >> >> >> @@ -1129,10 +1159,13 @@ lazy_scan_heap(Relation onerel, int options, >> LVRelStats *vacrelstats, >> * instead of doing a second scan. >> */ >> if (nindexes == 0 && >> -vacrelstats->num_dead_tuples > 0) >> +vacrelstats->dead_tuples.num_entries > 0) >> { >> /* Remove tuples from heap */ >> -lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer); >> +Assert(vacrelstats->dead_tuples.last_seg == 0);/* >> Should not need more >> + * >> than one segment per >> + * page */ >> >> I'm not sure we need to add Assert() here but it seems to me that the >> comment and code is not properly correspond and the comment for >> Assert() should be wrote above of Assert() line. > > Well, that assert is the one that found the second bug in > lazy_clear_dead_tuples, so clearly it's not without merit. > > I'll rearrange the comments as you ask though. > > > Updated and rebased v7 attached. > > > [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776671 Thank you for updating the patch. Whole patch looks good to me except for the following one comment. This is the final comment from me. /* * lazy_tid_reaped() -- is a particular tid deletable? * * This has the right signature to be an IndexBulkDeleteCallback. * * Assumes dead_tuples array is in sorted order. 
*/ static bool lazy_tid_reaped(ItemPointer itemptr, void *state) { LVRelStats *vacrelstats = (LVRelStats *) state; You might want to update the comment of lazy_tid_reaped() as well. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Alvaro Herrera writes: > Tom Lane wrote: >> However, the patch misses an >> important part of such an abstraction layer by not also converting >> catalog-related simple_heap_delete() calls into some sort of >> CatalogTupleDelete() operation. It is certainly a peculiarity of >> PG heaps that deletions don't require any immediate index work --- most >> other storage engines would need that. >> I propose that we should finish the job by inventing CatalogTupleDelete(), >> which for the moment would be a trivial wrapper around >> simple_heap_delete(), maybe just a macro for it. >> >> If there's no objections I'll go make that happen in a day or two. > Sounds good. So while I was working on this I got quite unhappy with the already-committed patch: it's a leaky abstraction in more ways than this, and it's created a possibly-serious performance regression for large objects (and maybe other places). The source of both of those problems is that in some places, we did CatalogOpenIndexes and then used the CatalogIndexState for multiple tuple inserts/updates before doing CatalogCloseIndexes. The patch dealt with these either by not touching them, just leaving the simple_heap_insert/update calls in place (thus failing to create any abstraction), or by blithely ignoring the optimization and doing s/simple_heap_insert/CatalogTupleInsert/ anyway. For example, in inv_api.c we are now doing a CatalogOpenIndexes/CatalogCloseIndexes cycle for each chunk of the large object ... and just to add insult to injury, the now-useless open/close calls outside the loop are still there. I think what we ought to do about this is invent additional API functions, say Oid CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate); void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, CatalogIndexState indstate); and use these in place of simple_heap_foo plus CatalogIndexInsert in the places where this optimization had been applied. An alternative but much more complicated fix would be to get rid of the necessity for callers to worry about this at all, by caching a CatalogIndexState in the catalog's relcache entry. That might be worth doing eventually (because it would allow sharing index info collection across unrelated operations) but I don't want to do it today. Objections, better naming ideas? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
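To illustrate the proposed API, a caller that inserts many tuples into the same catalog (such as the large-object chunk loop) would keep one CatalogIndexState for the whole loop, roughly as below. The variable names are placeholders, and CatalogTupleInsertWithInfo is the function proposed above rather than existing code.

#include "postgres.h"
#include "catalog/indexing.h"

static void
insert_many(Relation catrel, HeapTuple *tuples, int ntuples)
{
    CatalogIndexState indstate;
    int         i;

    indstate = CatalogOpenIndexes(catrel);

    for (i = 0; i < ntuples; i++)
        CatalogTupleInsertWithInfo(catrel, tuples[i], indstate);

    CatalogCloseIndexes(indstate);
}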
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 02/01/2017 08:28 PM, Heikki Linnakangas wrote: > > But if there's no pressing reason to change it, let's leave it alone. It's > not related to the problem at hand, right? > Yes, I agree with you: we had better leave it as it is. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 02/01/2017 08:30 PM, REIX, Tony wrote: > > Hi Konstantin, > > Please run:*/opt/IBM/xlc/13.1.3/bin/xlc -qversion* so that I know your exact > XLC v13 version. > IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07) > I'm building on Power7 and not giving any architecture flag to XLC. > > I'm not using *-qalign=natural* . Thus, by default, XLC use -qalign=power, > which is close to natural, as explained at: > > https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html > Why are you using this flag ? > Because otherwise double type is aligned on 4 bytes. > Thanks for info about *pgbench*. PostgreSQL web-site contains a lot of old > information... > > If you could*share scripts or instructions about the tests you are doing with > pgbench*, I would reproduce here. > You do not need any script. Just two simple commands. One to initialize database: pgbench -i -s 1000 And another to run benchmark itself: pgbench -c 100 -j 20 -P 1 -T 10 > I have no "real" application. My job consists in porting OpenSource packages > on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL > RPMs as good as possible... within the limited amount of time I can give to > this package, before > moving to another one. > > About the *zombie* issue, I've discussed with my colleagues. Looks like the > process keeps zombie till the father looks at its status. However, though I > did that several times, I do not remember well the details. And that should > be not specific to AIX. > I'll discuss with another colleague, tomorrow, who should understand this > better than me. > 1. Process is not in zomby state (according to ps). It is in state... It is something AIX specific, I have not see processes in this state at Linux. 2. I have implemented simple test - forkbomb. It creates 1000 children and then wait for them. It is about ten times slower than at Intel/Linux, but still much faster than 100 seconds. So there is some difference between postgress backend and dummy process doing nothing - just immediately terminating after return from fork() > > *Patch for Large Files*: When building PostgreSQL, I found required to use > the following patch so that PostgreSQL works with large files. I do not > remember the details. Do you agree with such a patch ? 1rst version (new-...) > shows the exact places where > define _LARGE_FILES 1 is required. 2nd version (new2-...) is simpler. > > I'm now experimenting with your patch for dead lock. However, that should be > invisible with the "check-world" tests I guess. > > Regards, > > Tony > > > Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit : >> Hi Tony, >> >> On 01.02.2017 18:42, REIX, Tony wrote: >>> >>> Hi Konstantin >>> >>> *XLC.* >>> >>> I'm on AIX 7.1 for now. >>> >>> I'm using this version of *XL**C v13*: >>> >>> # xlc -qversion >>> IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07) >>> Version: 13.01.0003.0003 >>> >>> With this version, I have (at least, since I tested with "check" and not >>> "check-world" at that time) 2 failing tests: create_aggregate , aggregates . >>> >>> >>> With the following *XLC v12* version, I have NO test failure: >>> >>> # /usr/vac/bin/xlc -qversion >>> IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72) >>> Version: 12.01..0016 >>> >>> >>> So maybe you are not using XLC v13.1.3.3, rather another sub-version. >>> Unless you are using more options for the configure ? >>> >>> >>> *Configure*. >>> >>> What are the options that you give to the configure ? 
>>> >>> >> export CC="/opt/IBM/xlc/13.1.3/bin/xlc" >> export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 " >> export LDFLAGS="-Wl,-bbigtoc,-b64" >> export AR="/usr/bin/ar -X64" >> export LD="/usr/bin/ld -b64 " >> export NM="/usr/bin/nm -X64" >> ./configure --prefix="/opt/postgresql/xlc-debug/9.6" >> >> >>> *Hard load & 64 cores ?* OK. That clearly explains why I do not see this >>> issue. >>> >>> >>> *pgbench ?* I wanted to run it. However, I'm still looking where to get it >>> plus a guide for using it for testing. >>> >> >> pgbench is part of Postgres distributive (src/bin/pgbench) >> >> >>> I would add such tests when building my PostgreSQL RPMs on AIX. So any help >>> is welcome ! >>> >>> >>> *Performance*. >>> >>> - Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. >>> Any idea how I should proceed ? Any PostgreSQL performance benchmark that I >>> could find and use ? pgbench ? >>> >> pgbench is most widely used tool simulating OLTP workload. Certainly it is >> quite primitive and its results are rather artificial. TPC-C seems to be >> better choice. >> But the best case is to implement your own benchmark simulating actual >> workload of your real application. >> >>> - I'm interested in any information for improving the performance & quality >>> of my PostgreSQM RPMs on AIX./(As I already said, BullFreeware RPMs for AIX >>> are free and can be use
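For anyone wanting to repeat the fork-timing comparison mentioned above on AIX versus Linux, a minimal stand-alone version of such a test (my reconstruction, not Konstantin's actual program) could look like this:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    int         nchildren = 1000;
    int         i;

    for (i = 0; i < nchildren; i++)
    {
        pid_t       pid = fork();

        if (pid < 0)
        {
            perror("fork");
            exit(1);
        }
        if (pid == 0)
            _exit(0);           /* child exits immediately */
    }

    /* parent reaps all children */
    while (wait(NULL) > 0)
        ;

    return 0;
}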
Re: [HACKERS] logical decoding of two-phase transactions
On 02/01/2017 10:32 PM, Tom Lane wrote: > Robert Haas writes: >> Also, including the GID in the WAL for each COMMIT/ABORT PREPARED >> doesn't seem inordinately expensive to me. > I'm confused ... isn't it there already? If not, how do we handle > reconstructing 2PC state from WAL at all? > > regards, tom lane > > Right now logical decoding ignores PREPARE and takes into account only "commit prepared": /* * Currently decoding ignores PREPARE TRANSACTION and will just * decode the transaction when the COMMIT PREPARED is sent or * throw away the transaction's contents when a ROLLBACK PREPARED * is received. In the future we could add code to expose prepared * transactions in the changestream allowing for a kind of * distributed 2PC. */ For some scenarios it works well, but if we really need the prepared state on the replica (as in the multimaster case), then it is not enough. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Hello, We could just issue interactive-only warnings when: - A user types a branching condition command which sets the branch inactive - A user types a command or query when the current branch is inactive. The warnings could be specific about state, something like: psql session is now in an inactive \if branch. No queries will be executed and only branching commands (\if, \elif, \else, \endif) will be evaluated. [...] My 0.02€: it looks too verbose, should stay on one short line. Maybe: (in|)active (\if|\elif|\else), (execut|ignor)ing commands Although there is some redundancy... calvin=>\if true active \if: executing commands calvin=(t)> \echo ok ok calvin=(t)> \else inactive \else: skipping commands calvin=(f)> ... Maybe it could be controlled, say based on VERBOSITY setting (which really controls verbosity of error reports) or some other. I'm unsure whether it is a good idea, I like terse interfaces, but this is just an opinion. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
On Thu, Feb 2, 2017 at 3:34 AM, Rafia Sabih wrote: > 9 | 62928.88 | 59077.909 Thanks Rafia. At first glance this plan is using the Parallel Shared Hash in one place where it should pay off, that is loading the orders table, but the numbers are terrible. I noticed that it uses batch files and then has to increase the number of batch files, generating a bunch of extra work, even though it apparently overestimated the number of rows, though that's only ~9 seconds of ~60. I am investigating. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding of two-phase transactions
Robert Haas writes: > Also, including the GID in the WAL for each COMMIT/ABORT PREPARED > doesn't seem inordinately expensive to me. I'm confused ... isn't it there already? If not, how do we handle reconstructing 2PC state from WAL at all? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ICU integration
Pavel Stehule writes: > This patch is not possible to compile on today master > commands/collationcmds.o: In function `AlterCollation': > /home/pavel/src/postgresql/src/backend/commands/collationcmds.c:297: > undefined reference to `CatalogUpdateIndexes' Evidently collateral damage from 2f5c9d9c9. But I'd suggest waiting to fix it until you can also do s/simple_heap_delete/CatalogTupleDelete/ as I proposed in https://www.postgresql.org/message-id/462.1485902...@sss.pgh.pa.us I'll go make that happen right now, now that I realize there are patch(es) waiting on it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
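For anyone following along, the companion change Tom refers to would presumably be a mechanical substitution of this shape in the catalog-manipulating code (a sketch only, assuming CatalogTupleDelete() takes the same relation/TID pair as simple_heap_delete(); the variable names here are illustrative):

-	simple_heap_delete(rel, &tup->t_self);
+	CatalogTupleDelete(rel, &tup->t_self);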
Re: [HACKERS] multi-level partitions and partition-wise joins
On Fri, Dec 23, 2016 at 12:54 AM, Ashutosh Bapat wrote: > Another question: do we want to commit the code for creating > unflattened hierarchy separately or along with partition-wise join? Probably separately. I suggest posting a series of two (or perhaps more) patches on the same thread. 'git format-patch' is a useful way to produce a patch series for posting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Jan 30, 2017 at 2:30 AM, Masahiko Sawada wrote: > "txn" can be used for abbreviation of "Transaction", so for example > pg_fdw_txn_resolver? > I'm also fine to change the module and function name. If we're judging the relative clarity of various ways of abbreviating the word "transaction", "txn" surely beats "x". To repeat my usual refrain, is there any merit to abbreviating at all? Could we call it, say, "fdw_transaction_resolver" or "fdw_transaction_manager"? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ICU integration
Hi 2017-01-24 18:44 GMT+01:00 Peter Eisentraut < peter.eisentr...@2ndquadrant.com>: > On 1/15/17 5:53 AM, Pavel Stehule wrote: > > the regress test fails > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000', > > locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291 > > 5291return isalpha_l((unsigned char) c, locale->lt); > > Here is an updated patch that fixes this crash and is rebased on top of > recent changes. > This patch is not possible to compile on today master commands/collationcmds.o: In function `AlterCollation': /home/pavel/src/postgresql/src/backend/commands/collationcmds.c:297: undefined reference to `CatalogUpdateIndexes' collect2: error: ld returned 1 exit status Regards Pavel > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: [HACKERS] logical decoding of two-phase transactions
On Tue, Jan 31, 2017 at 9:05 PM, Michael Paquier wrote: >> Personally I don't think lack of access to the GID justifies blocking 2PC >> logical decoding. It can be added separately. But it'd be nice to have >> especially if it's cheap. > > I think it should be added reading this thread. +1. If on the logical replication master the user executes PREPARE TRANSACTION 'mumble', isn't it sensible to want the logical replica to prepare the same set of changes with the same GID? To me, that not only seems like *a* sensible thing to want to do but probably the *most* sensible thing to want to do. And then, when the eventual COMMIT PREPARED 'mumble' comes along, you want to have the replica run the same command. If you don't do that, then the alternative is that the replica has to make up new names based on the master's XID. But that kinda sucks, because now if replication stops due to a conflict or whatever and you have to disentangle things by hand, all the names on the replica are basically meaningless. Also, including the GID in the WAL for each COMMIT/ABORT PREPARED doesn't seem inordinately expensive to me. For that to really add up to a significant cost, wouldn't you need to be doing LOTS of 2PC transactions, each with very little work, so that the commit/abort prepared records weren't swamped by everything else? That seems like an unlikely scenario, but if it does happen, that's exactly when you'll be most grateful for the GID tracking. I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Refactoring of replication commands using printsimple
On Tue, Jan 31, 2017 at 8:26 PM, Michael Paquier wrote: > pq_sendcountedtext() does some encoding conversion, which is why I > haven't used because we deal only with integers in this patch... Now > if you wish to switch to that I have really no arguments against. OK, I see. Well, I guess it's sensible either way, but I've committed this version. Thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add tab completion for DEALLOCATE
ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > While playing with prepared statements in psql, I noticed that EXECUTE > tab-completes the list of active prepared statements, but DEALLOCATE > does not. > Attached is a patch to fix this. Good idea, but I think it would be better to give DEALLOCATE its own entry in the list, so it could be placed in alphabetical order. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
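In other words, tab-complete.c would get a stand-alone arm placed in alphabetical order among the other commands, reusing the completion query from the posted patch rather than piggy-backing on the EXECUTE match; an untested sketch:

/* DEALLOCATE */
	else if (Matches1("DEALLOCATE"))
		COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);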
Re: [HACKERS] Improvements in psql hooks for variables
I wrote: > Attached is a finished version that includes hooks for all the variables > for which they were clearly sensible. (FETCH_COUNT doesn't seem to really > need one, and I didn't do anything with HISTSIZE or IGNOREEOF either. > It might be worth bringing the latter two into the hooks paradigm, but > that seems like fit material for a separate patch.) I got more excited about doing that after noticing that not only would it clean up the behavior of those particular variables, but we could get rid of some code. First, we'd not need the rather warty GetVariableNum() anymore, and second, we'd then be almost at the point where every control variable has a hook, and therefore we could drop tab-complete.c's private list of known variable names. That was only ever needed to cover the possibility of important variables not being present in the variables list. So the attached proposed patch does these things: 1. Modify FETCH_COUNT to always have a defined value, like other control variables, mainly so it will always appear in "\set" output. 2. Add hooks to force HISTSIZE to be defined and require it to have an integer value. (I don't see any point in allowing it to be set to non-integral values.) 3. Add hooks to force IGNOREEOF to be defined and require it to have an integer value. Unlike the other cases, here we're trying to be bug-compatible with a rather bogus externally-defined behavior, so I think we need to continue to allow "\set IGNOREEOF whatever". What I propose is that the substitution hook silently replace non-numeric values with "10", so that the stored value always reflects what we're really doing. 4. Add a dummy assign hook for HISTFILE, just so it's always in variables.c's list. We can't require it to be defined always, because that would break the interaction with the PSQL_HISTORY environment variable, so there isn't any change in visible behavior here. 5. Remove tab-complete.c's private list of known variable names. Given the other changes, there are no control variables it won't show anyway. This does mean that if for some reason you've unset one of the status variables (DBNAME, HOST, etc), that variable would not appear in tab completion for \set. But I think that's fine, for at least two reasons: we shouldn't be encouraging people to use those variables as regular variables, and if someone does do so anyway, why shouldn't it act just like a regular variable? 6. Remove no-longer-used-anywhere GetVariableNum(). Any objections? regards, tom lane diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index b9c8fcc..ae58708 100644 *** a/doc/src/sgml/ref/psql-ref.sgml --- b/doc/src/sgml/ref/psql-ref.sgml *** bar *** 3247,3258 fail after having already displayed some rows. - - FETCH_COUNT is ignored if it is unset or does not - have a positive value. It cannot be set to a value that is not - syntactically an integer. - - Although you can use any output format with this feature, --- 3247,3252 *** bar *** 3316,3325 HISTSIZE ! The maximum number of commands to store in the command history. ! If unset, at most 500 commands are stored by default. ! If set to a value that is negative or not an integer, no limit is ! applied. --- 3310,3317 HISTSIZE ! The maximum number of commands to store in the command history ! (default 500). If set to a negative value, no limit is applied. *** bar *** 3345,3357 IGNOREEOF ! If unset, sending an EOF character (usually ControlD) to an interactive session of psql ! will terminate the application. If set to a numeric value, ! 
that many EOF characters are ignored before the ! application terminates. If the variable is set but not to a ! numeric value, the default is 10. --- 3337,3349 IGNOREEOF ! If set to 1 or less, sending an EOF character (usually ControlD) to an interactive session of psql ! will terminate the application. If set to a larger numeric value, ! that many consecutive EOF characters must be typed to ! make an interactive session terminate. If the variable is set to a ! non-numeric value, it is interpreted as 10. diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index 5365629..3e3cab4 100644 *** a/src/bin/psql/help.c --- b/src/bin/psql/help.c *** helpVariables(unsigned short int pager) *** 348,356 " (default: 0=unlimited)\n")); fprintf(outp
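To make point 3 concrete, the substitute hook would be roughly of the following shape, implementing the "silently replace non-numeric values with 10" rule. The name and signature here are illustrative rather than taken from the patch, and it assumes psql's usual pg_strdup() infrastructure:

static char *
ignoreeof_substitute_hook(char *newval)
{
	char	   *end;

	/* an unset variable gets the documented default, too */
	if (newval == NULL)
		return pg_strdup("10");

	(void) strtol(newval, &end, 10);
	if (*newval == '\0' || *end != '\0')
	{
		/* not syntactically an integer: silently substitute "10" */
		free(newval);
		newval = pg_strdup("10");
	}
	return newval;
}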
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Tue, Jan 31, 2017 at 5:07 PM, Jim Nasby wrote: > On 11/29/16 9:58 AM, Jeff Janes wrote: >> Considering a single SSD can do 70% of that limit, I would say >> yes. >> >> Next question becomes... should there even be an upper limit? >> >> >> Where the contortions needed to prevent calculation overflow become >> annoying? >> >> I'm not a big fan of nannyism in general, but the limits on this >> parameter seem particularly pointless. You can't write out more buffers >> than exist in the dirty state, nor more than implied >> by bgwriter_lru_multiplier. So what is really the worse that can happen >> if you make it too high? > > > Attached is a patch that ups the limit to INT_MAX / 2, which is the same as > shared_buffers. This looks fine to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
> > > You can also run "make check" directly in the "src/bin/psql" directory. Previously, that didn't do anything, but now that I've created a TAP test it does. It doesn't, however, run the psql regress tests. But at least I can use the two commands in combination and not have to run *all* TAP tests outside of psql.
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
On Wed, Feb 1, 2017 at 8:53 AM, Corey Huinker wrote: > >> Run make check-world (as opposed to just make check ) > > > I'll give that a shot. > That was it. Tests don't run if you don't invoke them. Thanks.
Re: [HACKERS] [COMMITTERS] pgsql: Make psql's \set display variables in alphabetical order.
On 2017-02-01 12:59:36 -0500, Tom Lane wrote: > David Fetter writes: > > On Wed, Feb 01, 2017 at 04:25:25PM +, Tom Lane wrote: > >> Make psql's \set display variables in alphabetical order. > > > This is a substantial usability improvement which nevertheless is hard > > to imagine changes things that scripts relied on. I think it's worth > > back-patching. > > I'm not that excited about it personally, but I agree it would be unlikely > to break anything. Other votes? -0.5 - I see very little reason to backpatch this over dozens of other changes. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect: Hash index support
On Sat, Jan 28, 2017 at 9:09 PM, Ashutosh Sharma wrote: > okay. Thanks. I have done changes on top of this patch. + ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info); + Assert(ptr <= uargs->page + BLCKSZ); I think this should be promoted to an ereport(); these functions can accept an arbitrary bytea. + if (opaque->hasho_flag & LH_BUCKET_PAGE) + stat->hasho_prevblkno = InvalidBlockNumber; + else + stat->hasho_prevblkno = opaque->hasho_prevblkno; I think we should return the raw value here. Mithun's patch to cache the metapage hasn't gone in yet, but even if it does, let's assume anyone using contrib/pageinspect wants to see the data that's physically present, not our gloss on it. Other than that, I don't think I have any other comments on this. The tests that got added look a little verbose to me (do we really need to check pages 1-4 separately in each case when they're all hash pages? if hash_bitmap_info rejects one, surely it will reject the others) but I'm not going to fight too hard if Peter wants it that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
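Concretely, the promotion from Assert() to a run-time check would look something like this (a sketch; the exact error code and message wording are the patch author's call):

	ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
	if (ptr > uargs->page + BLCKSZ)
		ereport(ERROR,
				(errcode(ERRCODE_INDEX_CORRUPTED),
				 errmsg("index item points past end of page")));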
Re: [HACKERS] [COMMITTERS] pgsql: Make psql's \set display variables in alphabetical order.
David Fetter writes: > On Wed, Feb 01, 2017 at 04:25:25PM +, Tom Lane wrote: >> Make psql's \set display variables in alphabetical order. > This is a substantial usability improvement which nevertheless is hard > to imagine changes things that scripts relied on. I think it's worth > back-patching. I'm not that excited about it personally, but I agree it would be unlikely to break anything. Other votes? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
> > However I would say yes, it should provide some feedback... This means > probably adding a new prompt substitution "%". In the worst > case, the prompt should reflect the current stack, or at least the top of > the task... > We could just issue interactive-only warnings when: - A user types a branching condition command which sets the branch inactive - A user types a command or query when the current branch is inactive. The warnings could be specific about state, something like: psql session is now in an inactive \if branch. No queries will be executed and only branching commands (\if, \elif, \else, \endif) will be evaluated. psql session is now in an inactive \elif branch. No queries will be executed and only branching commands (\if, \elif, \else, \endif) will be evaluated. psql session is now in an inactive \else branch. No queries will be executed and only branching commands (\if, \endif) will be evaluated. This could of course be done in addition to prompt changes, and is orthogonal to the CTRL-c option. I'd like more input before moving forward on either of those, as they have a good chance to clobber other expected behaviors.
[HACKERS] [PATCH] Add tab completion for DEALLOCATE
Hi hackers, While playing with prepared statements in psql, I noticed that EXECUTE tab-completes the list of active prepared statements, but DEALLOCATE does not. Attached is a patch to fix this. Cheers, Ilmari -- "I use RMS as a guide in the same way that a boat captain would use a lighthouse. It's good to know where it is, but you generally don't want to find yourself in the same spot." - Tollef Fog Heen >From ac727e99d6f06a82d6ba9a7927fed8ce179dac3f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= Date: Wed, 1 Feb 2017 17:39:43 + Subject: [PATCH] Add tab completion for DEALLOCATE EXECUTE already tab-completes the list of prepared statements, but DEALLOCATE did not. --- src/bin/psql/tab-complete.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index d6fffcf42f..4a7bfe844a 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -2575,8 +2575,8 @@ psql_completion(const char *text, int start, int end) else if (Matches5("DROP", "RULE", MatchAny, "ON", MatchAny)) COMPLETE_WITH_LIST2("CASCADE", "RESTRICT"); -/* EXECUTE */ - else if (Matches1("EXECUTE")) +/* EXECUTE && DEALLOCATE */ + else if (Matches1("EXECUTE|DEALLOCATE")) COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements); /* EXPLAIN */ -- 2.11.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel bitmap heap scan
On Tue, Jan 31, 2017 at 6:05 AM, Dilip Kumar wrote: >> 0002-hash-support-alloc-free-v14.patch: >> >> >> + if (tb->alloc) >> + { >> >> The memory for tb->alloc is allocated always, is the if check still >> required? > > In parallel case, only first worker will call SH_CREATE, other worker > will only do palloc for pagetable and copy the reference from main > worker, so they will not have allocator. When Andres wrote in https://www.postgresql.org/message-id/20161017201558.cr34stc6zvxy3...@alap3.anarazel.de that you should try to share only the iteration arrays, I believe that he meant the results of tbm_begin_iterate() -- that is, iterator->spageptr, chunkptr, schunkbit, spages, and schunks. I'm perhaps putting words into his mouth, but what he said was that we could avoid sharing "the whole underlying hash". But the patch set you've implemented here doesn't behave that way. Instead, you've got the array containing the hash table elements shared (which is good) plus there's a sort of dummy hash table in every worker which copies some but not all of the members of the original hash table, leading to the otherwise-unnecessary if-test that Haribabu is complaining about. So the hash table is kinda-shared-kinda-not-shared, which I don't *think* is really what Andres had in mind. In short, I think SH_COPY (implemented here as pagetable_copy) needs to go away. The work of tbm_begin_iterate() should be done before we begin the shared scan and the results of that work should be shared. What happens right now (it appears) is that every backend does that work based on the same hash table and we just assume they all get the same answer. And we really do assume that, because pbms_parallel_iterate() assumes it can shuttle private state back and forth between iterator in different backends and nothing will break; but those backends aren't actually using the same iteration arrays. They are using different iteration arrays that should have the same contents because they were all derived from the same semi-shared hash table. That's pretty fragile, and a waste of CPU cycles if the hash table is large (every backend does its own sort). On a related note, I think it's unacceptable to make the internal details of TBMIterator public. You've moved it from tidbitmap.c to tidbitmap.h so that nodeBitmapHeapScan.c can tinker with the guts of the TBMIterator, but that's not OK. Those details need to remain private to tidbitmap.c. pbms_parallel_iterate() is a nasty kludge; we need some better solution. The knowledge of how a shared iterator should iterate needs to be private to tidbitmap.c, not off in the executor someplace. And I think the entries need to actually be updated in shared memory directly, not copied back and forth between a backend-private iterator and a shared iterator. Also, pbms_parallel_iterate() can't hold a spinlock around a call to tbm_iterate(). Note the coding rules for spinlocks mentioned in spin.h and src/backend/storage/lmgr/README. I think the right thing to do here is to use an LWLock in a new tranche (add an entry to BuiltinTrancheIds). In 0002, you have this: +tb->alloc = palloc(sizeof(SH_ALLOCATOR)); This should be using MemoryContextAlloc(ctx, ...) rather than palloc. Otherwise the allocator object is in a different context from everything else in the hash table. But TBH, I don't really see why we want this to be a separate object. Why not just put HashAlloc/HashFree/args into SH_TYPE directly? 
That avoids some pointer chasing and doesn't really seem to cost anything (except that SH_CREATE will grow a slightly longer argument sequence). I think maybe we should rename the functions to element_allocate, element_free, and element_allocator_ctx, or something like that. The current names aren't capitalized consistently with other things in this module, and putting the word "element" in there would make it more clear what the purpose of this thing is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
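To spell out the locking suggestion: the shared iterator state would embed an LWLock from a new built-in tranche, and each advance of the shared iterator would happen under it. In the sketch below the tranche id, struct, and helper names are made up for illustration and are not from the patch; only LWLockInitialize/LWLockAcquire/LWLockRelease are the real APIs.

	/* once, while setting up the shared iterator state */
	LWLockInitialize(&shared_state->lock, LWTRANCHE_PARALLEL_TBM_ITERATOR);

	/*
	 * On every shared-iterate call.  tbm_iterate()-style work is fine here,
	 * unlike under a spinlock, because LWLocks permit blocking and palloc.
	 */
	LWLockAcquire(&shared_state->lock, LW_EXCLUSIVE);
	page = tbm_advance_shared_iterator(shared_state);
	LWLockRelease(&shared_state->lock);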
Re: [HACKERS] Deadlock in XLogInsert at AIX
Hi Konstantin, Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion so that I know your exact XLC v13 version. I'm building on Power7 and not giving any architecture flag to XLC. I'm not using -qalign=natural . Thus, by default, XLC use -qalign=power, which is close to natural, as explained at: https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html Why are you using this flag ? Thanks for info about pgbench. PostgreSQL web-site contains a lot of old information... If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce here. I have no "real" application. My job consists in porting OpenSource packages on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one. About the zombie issue, I've discussed with my colleagues. Looks like the process keeps zombie till the father looks at its status. However, though I did that several times, I do not remember well the details. And that should be not specific to AIX. I'll discuss with another colleague, tomorrow, who should understand this better than me. Patch for Large Files: When building PostgreSQL, I found required to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch ? 1rst version (new-...) shows the exact places where define _LARGE_FILES 1 is required. 2nd version (new2-...) is simpler. I'm now experimenting with your patch for dead lock. However, that should be invisible with the "check-world" tests I guess. Regards, Tony Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit : Hi Tony, On 01.02.2017 18:42, REIX, Tony wrote: Hi Konstantin XLC. I'm on AIX 7.1 for now. I'm using this version of XLC v13: # xlc -qversion IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07) Version: 13.01.0003.0003 With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate , aggregates . With the following XLC v12 version, I have NO test failure: # /usr/vac/bin/xlc -qversion IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72) Version: 12.01..0016 So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless you are using more options for the configure ? Configure. What are the options that you give to the configure ? export CC="/opt/IBM/xlc/13.1.3/bin/xlc" export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 " export LDFLAGS="-Wl,-bbigtoc,-b64" export AR="/usr/bin/ar -X64" export LD="/usr/bin/ld -b64 " export NM="/usr/bin/nm -X64" ./configure --prefix="/opt/postgresql/xlc-debug/9.6" Hard load & 64 cores ? OK. That clearly explains why I do not see this issue. pgbench ? I wanted to run it. However, I'm still looking where to get it plus a guide for using it for testing. pgbench is part of Postgres distributive (src/bin/pgbench) I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome ! Performance. - Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ? pgbench is most widely used tool simulating OLTP workload. Certainly it is quite primitive and its results are rather artificial. TPC-C seems to be better choice. 
But the best case is to implement your own benchmark simulating actual workload of your real application. - I'm interested in any information for improving the performance & quality of my PostgreSQM RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) sells IBM Power machines under the Escala brand since ages (25 years this year)). How to help ? How could I help for improving the quality and performance of PostgreSQL on AIX ? We still have one open issue at AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html It will be great if you can somehow help to fix this problem. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company --- src/include/postgres.h.ORIGIN 2017-02-01 07:32:04 -0600 +++ src/include/postgres.h 2017-02-01 07:32:29 -0600 @@ -44,6 +44,10 @@ #ifndef POSTGRES_H #define POSTGRES_H +#ifdef _AIX +#define _LARGE_FILES 1 +#endif + #include "c.h" #include "utils/elog.h" #include "utils/palloc.h" --- src/pl/plpython/plpy_cursorobject.c.ORIGIN 2017-02-01 02:59:08 -0600 +++ src/pl/plpython/plpy_cursorobject.c 2017-02-01 03:00:20 -0600 @@ -4,6 +4,10 @@ * src/pl/plpython/plpy_cursorobject.c */ +#ifdef _AIX +#define _LARGE_FILES 1 +#endif + #include "postgres.h" #include --- src/pl/plpython/plpy_elog.c.ORIGIN 2017-02-01 02:59:08 -0600 +++ src/
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 02/01/2017 04:12 PM, Konstantin Knizhnik wrote: On 01.02.2017 15:39, Heikki Linnakangas wrote: On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote: Attached please find my patch for XLC/AIX. The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl. The comment in this file says that: * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby * providing sequential consistency. This is undocumented. But it is not true any more (I checked generated assembler code in debugger). This is why I have added __sync() to this function. Now pgbench working normally. Seems like it was not so much undocumented, but an implementation detail that was not guaranteed after all.. Does __fetch_and_add emit a trailing isync there either? Seems odd if __compare_and_swap requires it, but __fetch_and_add does not. Unless we can find conclusive documentation on that, I think we should assume that an __isync() is required, too. There was a long thread on these things the last time this was changed: https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. I couldn't find an explanation there of why we thought that fetch_and_add implicitly performs sync and isync. Also there is mysterious disappearance of assembler section function with sync instruction from pg_atomic_compare_exchange_u32_impl. I have fixed it by using __sync() built-in function instead. __sync() seems more appropriate there, anyway. We're using intrinsics for all the other things in generic-xlc.h. But it sure is scary that the "asm" sections just disappeared. In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() and __lwsync() intrinsics? Those are an xlc compiler-specific thing, right? Or if they are expected to work on any ppc compiler, then we should probably use them always, instead of the asm sections. In summary, I came up with the attached. It's essentially your patch, with tweaks for the above-mentioned things. I don't have a powerpc system to test on, so there are probably some silly typos there. Why do you prefer to use _check_lock instead of __check_lock_mp ? First one is even not mentioned in XLC compiler manual: http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7 or http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm Googling around, it seems that they do more or less the same thing. I would guess that they actually produce the same assembly code, but I have no machine to test on. If I understand correctly, the difference is that __check_lock_mp() is an xlc compiler intrinsic, while _check_lock() is a libc function. The libc function presumably does __check_lock_mp() or __check_lock_up() depending on whether the system is a multi- or uni-processor system. So I think if we're going to change this, the use of __check_lock_mp() needs to be in an #ifdef block to check that you're on the XLC compiler, as it's a *compiler* intrinsic, while the current code that uses _check_lock() are in an "#ifdef _AIX" block, which is correct for _check_lock() because it's defined in libc, not by the compiler. But if there's no pressing reason to change it, let's leave it alone. It's not related to the problem at hand, right? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
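For clarity, the guarded arrangement Heikki describes for s_lock.h would be along these lines (a sketch, not a committed form), keeping the libc wrappers for non-IBM compilers:

#ifdef _AIX
#if defined(__IBMC__) || defined(__IBMCPP__)
/* XLC compiler intrinsics: always use the multi-processor variants */
#define TAS(lock)		__check_lock_mp((slock_t *) (lock), 0, 1)
#define S_UNLOCK(lock)	__clear_lock_mp((slock_t *) (lock), 0)
#else
/* other compilers: fall back to the libc functions */
#define TAS(lock)		_check_lock((slock_t *) (lock), 0, 1)
#define S_UNLOCK(lock)	_clear_lock((slock_t *) (lock), 0)
#endif
#endif   /* _AIX */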
[HACKERS] Re: [COMMITTERS] pgsql: Make psql's \set display variables in alphabetical order.
On Wed, Feb 01, 2017 at 04:25:25PM +, Tom Lane wrote: > Make psql's \set display variables in alphabetical order. This is a substantial usability improvement which nevertheless is hard to imagine changes things that scripts relied on. I think it's worth back-patching. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Hello Corey, There is a spurious empty line added at the very end of "mainloop.h": + #endif /* MAINLOOP_H */ Not in my diff, but that's been coming and going in your diff reviews. Strange. Maybe this is linked to the warning displayed with "git apply" when I apply the diff. I would suggest to add a short one line comment before each test to explain what is being tested, like "-- test \elif execution", "-- test \else execution"... Where are you suggesting this? In "regress/sql/psql.sql", in front of each group which starts a test. Debatable suggestion about "psql_branch_empty": The name isn't great. Maybe psql_branch_stack_empty()? Yep, maybe, or "empty_stack" or "stack_is_empty" or IDK... "psql_branch_end_state": it is a pop, it could be named "psql_branch_pop" Yeah, we either need to go fully with telling the programmer that it's a stack (push/pop/empty) or (begin_branch/end_branch/not_branching). I'm inclined to go full-stack, as it were. Anything consistent is ok. I'm fine with calling a stack a stack:-) -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
On Wed, Feb 1, 2017 at 5:36 PM, Kyotaro HORIGUCHI wrote: > Hello, while looking into another bug, I found that the standby cannot > shut down after DROP SUBSCRIPTION. > > standby=# CREATE SUBSCRIPTION sub1 ... > standby=# > standby=# DROP SUBSCRIPTION sub1; > > Ctrl-C to the standby fails to work. ApplyLauncherMain is waiting > on LogicalRepLauncherLock forever. > > The culprit is DropSubscription. It acquires > LogicalRepLauncherLock but never releases it. > > The attached patch fixes it. Most of the function is now > enclosed in PG_TRY-CATCH since some functions can throw > exceptions. The lwlock would be released when an exception occurs, so I don't think that TRY-CATCH is necessary here. Or is it necessary for another reason? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
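For readers skimming the thread, the shape of the reported leak is simply an acquire with no matching release on the success path, roughly (names per the description above, not the literal DropSubscription code):

	LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE);
	/* ... remove the pg_subscription entry, stop the apply worker ... */
	LWLockRelease(LogicalRepLauncherLock);	/* this release was missing */

Since LWLocks are released automatically during error cleanup, only the normal path needs the explicit release, which is the point of Fujii's question about whether the PG_TRY/PG_CATCH wrapper is needed at all.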
Re: [HACKERS] new autovacuum criterion for visible pages
On Sun, Jan 22, 2017 at 4:45 PM, Stephen Frost wrote: > Amit, > > * Amit Kapila (amit.kapil...@gmail.com) wrote: > > On Sun, Jan 22, 2017 at 3:27 AM, Stephen Frost > wrote: > > > * Simon Riggs (si...@2ndquadrant.com) wrote: > > >> On 12 August 2016 at 01:01, Tom Lane wrote: > > >> > Michael Paquier writes: > > >> >> In short, autovacuum will need to scan by itself the VM of each > > >> >> relation and decide based on that. > > >> > > > >> > That seems like a worthwhile approach to pursue. The VM is > supposed to be > > >> > small, and if you're worried it isn't, you could sample a few pages > of it. > > >> > I do not think any of the ideas proposed so far for tracking the > > >> > visibility percentage on-the-fly are very tenable. > > >> > > >> Sounds good, but we can't scan the VM for every table, every minute. > > >> We need to record something that will tell us how many VM bits have > > >> been cleared, which will then allow autovac to do a simple SELECT to > > >> decide what needs vacuuming. > > >> > > >> Vik's proposal to keep track of the rows inserted seems like the best > > >> approach to this issue. > > > > > > I tend to agree with Simon on this. I'm also worried that an approach > > > which was based off of a metric like "% of table not all-visible" might > > > result in VACUUM running over and over on a table because it isn't able > > > to actually make any progress towards improving that percentage. We'd > > > have to have some kind of "cool-off" period or something. > > > > > > Tracking INSERTs and then kicking off a VACUUM based on them seems to > > > address that in a natural way and also seems like something that users > > > would generally understand as it's very similar to what we do for > > > UPDATEs and DELETEs. > > > > > > Tracking the INSERTs as a reason to VACUUM is also very natural when > you > > > consider the need to update BRIN indexes. > > > > Another possible advantage of tracking INSERTs is for hash indexes > > where after split we need to remove tuples from buckets that underwent > > split recently. > > That's a good point also, and for clearing GIN pending lists too, now > that I think about it. > As much as I want my patch to go in, this is not an argument for it. The GIN pending lists will be handled by autoanalyze. > We really need to get this fixed for PG10. > I'm not sure what more I'm supposed to be doing, if anything. -- Vik Fearing +33 6 46 75 15 36http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 01.02.2017 15:39, Heikki Linnakangas wrote: In summary, I came up with the attached. It's essentially your patch, with tweaks for the above-mentioned things. I don't have a powerpc system to test on, so there are probably some silly typos there. Attached pleased find fixed version of your patch. I verified that it is correctly applied, build and postgres normally works with it. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h index ed1cd9d1b9..7cf8c8ef97 100644 --- a/src/include/port/atomics/arch-ppc.h +++ b/src/include/port/atomics/arch-ppc.h @@ -23,4 +23,11 @@ #define pg_memory_barrier_impl() __asm__ __volatile__ ("sync" : : : "memory") #define pg_read_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory") #define pg_write_barrier_impl() __asm__ __volatile__ ("lwsync" : : : "memory") + +#elif defined(__IBMC__) || defined(__IBMCPP__) + +#define pg_memory_barrier_impl()__sync() +#define pg_read_barrier_impl() __lwsync() +#define pg_write_barrier_impl() __lwsync() + #endif diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h index f854612d39..e1dd3310a5 100644 --- a/src/include/port/atomics/generic-xlc.h +++ b/src/include/port/atomics/generic-xlc.h @@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, * consistency only, do not use it here. GCC atomics observe the same * restriction; see its rs6000_pre_atomic_barrier(). */ - __asm__ __volatile__ (" sync \n" ::: "memory"); + __sync(); /* * XXX: __compare_and_swap is defined to take signed parameters, but that @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, static inline uint32 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) { + uint32 ret; + /* - * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby - * providing sequential consistency. This is undocumented. + * Use __sync() before and __isync() after, like in compare-exchange + * above. 
*/ - return __fetch_and_add((volatile int *)&ptr->value, add_); + __sync(); + + ret = __fetch_and_add((volatile int *)&ptr->value, add_); + + __isync(); + + return ret; } #ifdef PG_HAVE_ATOMIC_U64_SUPPORT @@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, { bool ret; - __asm__ __volatile__ (" sync \n" ::: "memory"); + __sync(); ret = __compare_and_swaplp((volatile long*)&ptr->value, (long *)expected, (long)newval); @@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, static inline uint64 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_) { - return __fetch_and_addlp((volatile long *)&ptr->value, add_); + uint64 ret; + + __sync(); + + ret = __fetch_and_addlp((volatile long *)&ptr->value, add_); + + __isync(); + + return ret; } #endif /* PG_HAVE_ATOMIC_U64_SUPPORT */ diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 7aad2de..c6ef114 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -832,9 +831,8 @@ typedef unsigned int slock_t; #include typedef int slock_t; - -#define TAS(lock) _check_lock((slock_t *) (lock), 0, 1) -#define S_UNLOCK(lock) _clear_lock((slock_t *) (lock), 0) +#define TAS(lock) __check_lock_mp((slock_t *) (lock), 0, 1) +#define S_UNLOCK(lock) __clear_lock_mp((slock_t *) (lock), 0) #endif /* _AIX */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers