Re: [HACKERS] Patch to support SEMI and ANTI join removal
On 2/13/15 8:52 AM, Michael Paquier wrote: On Sun, Nov 23, 2014 at 8:23 PM, David Rowley wrote: As the patch stands there's still a couple of FIXMEs in there, so there's still a bit of work to do yet. Comments are welcome Hm, if there is still work to do, we may as well mark this patch as rejected as-is, also because it has stood in this state for a couple of months. I didn't bring this up before, but I'm pretty sure this patch should be marked "returned with feedback". From what I've understood, "rejected" means "we don't want this thing, not in this form or any other". That doesn't seem to be the case for this patch, nor for a few others marked "rejected" in the currently in-progress commit fest. .m
Re: [HACKERS] Logical Decoding follows timelines
On Wed, Dec 17, 2014 at 5:35 PM, Simon Riggs wrote: > On 16 December 2014 at 21:17, Simon Riggs wrote: > > >>> This patch is a WIP version of doing that, but only currently attempts > > >> With the patch, XLogSendLogical uses the same logic to calculate SendRqstPtr > >> that XLogSendPhysical does. It would be good to refactor that into a common > >> function, rather than copy-paste. > > > > Some of the logic is similar, but not all. > > > >> SendRqstPtr isn't actually used for anything in XLogSendLogical. > > > > It exists to allow the call which resets TLI. > > > > I'll see if I can make it exactly identical; I didn't think so when I > > first looked, will look again. > > Yes, that works. New version attached > Moved the patch to CF 2015-02 to not lose track of it, also because it does not seem to have received a proper review. -- Michael
Re: [HACKERS] PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
On Thu, Jan 15, 2015 at 5:02 AM, Tomas Vondra wrote: > Maybe we can try again later, but there's no point in keeping this in the > current CF. > > Any objections? > None, marked as rejected. -- Michael
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Sun, Nov 23, 2014 at 8:23 PM, David Rowley wrote: > > As the patch stands there's still a couple of FIXMEs in there, so there's > still a bit of work to do yet. > Comments are welcome > Hm, if there is still work to do, we may as well mark this patch as rejected as-is, also because it has stood in this state for a couple of months. -- Michael
Re: [HACKERS] Logical Replication Helpers WIP for discussion
On Mon, Dec 22, 2014 at 10:26 PM, Robert Haas wrote: > On Fri, Dec 19, 2014 at 8:40 AM, Petr Jelinek > wrote: > > What I hope to get from this is agreement on the general approach and > > protocol so that we can have a common base which will both make it easier to > > create external logical replication solutions and also eventually lead to > > full logical replication inside core PostgreSQL. > > The protocol is a really important topic which deserves its own > discussion. Andres has mentioned some of the ideas he has in mind - > which I think are similar to what you did here - but there hasn't > really been a thread devoted to discussing that topic specifically. I > think that would be a good idea: lay out what you have in mind, and > why, and solicit feedback. > Looking at this patch, I don't see what we actually gain here except a decoder plugin that speaks a special protocol for a special background worker that has not been presented yet. What is the actual value of defining that as a contrib/ module in core? Note that we already have test_decoding to basically test the logical decoding facility, used at least at the SQL level to get logical changes decoded. Based on those reasons I am planning to mark this as rejected (it has no documentation either). So please speak up if you think the contrary, but it seems to me that this could live happily out of core. -- Michael
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Tue, Dec 23, 2014 at 9:43 AM, Peter Geoghegan wrote: > On Mon, Dec 22, 2014 at 4:34 PM, Peter Geoghegan wrote: > > To put it another way, creating a separate object obfuscates > > scanRTEForColumn(), since it's the only client of > > updateFuzzyAttrMatchState(). > > > Excuse me. I mean *not* creating a separate object -- having a unified > state representation for the entire range-table, rather than having > one per RTE and merging them one by one into an overall/final range > table object. > Patch moved to CF 2015-02. -- Michael
Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)
On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai wrote: > Jim, Thanks for reviewing the patch. > > The attached patch is a revised one according to your suggestions, > and also includes bug fixes I found. > > * Definitions of TIDOperator were moved to pg_operator.h > as other operators do. > * Support the case of "ctid (operator) Param" expression. > * Add checks if the commutator of TID was not found. > * Qualifiers get extracted at plan stage, not path stage. > * Adjust cost estimation logic to fit the SeqScan manner. > * Add some new test cases for the regression test. > > > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote: > > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai > > >>> > > >> wrote: > > I'm not certain whether we should have this functionality in > > contrib from the perspective of workload that it can help, but its > > major worth is as an example of the custom-scan interface. > > >>> worker_spi is now in src/test/modules. We may add it there as well, > > no? > > >>> > > >> Hmm, it makes sense to me. Does it also make sense to add a > > >> test-case to the core regression test cases? > > >> > > > The attached patch adds the ctidscan module at test/module instead of contrib. > > > The basic portion is not changed from the previous post, but file > > > locations and test cases in the regression test are changed. > > > > First, I'm glad for this. It will be VERY valuable for anyone trying to > > clean up the end of a majorly bloated table. > > > > Here's a partial review... > > > > +++ b/src/test/modules/ctidscan/ctidscan.c > > > > +/* missing declaration in pg_proc.h */ > > +#ifndef TIDGreaterOperator > > +#define TIDGreaterOperator 2800 > > ... > > If we're calling that out here, should we have a corresponding comment in > > pg_proc.h, in case these ever get renumbered? > > > It was a quick hack when I moved this module out of the tree. > Yep, I should revert it when I submit this patch again. > > > +CTidQualFromExpr(Node *expr, int varno) > > ... > > + if (IsCTIDVar(arg1, varno)) > > + other = arg2; > > + else if (IsCTIDVar(arg2, varno)) > > + other = arg1; > > + else > > + return NULL; > > + if (exprType(other) != TIDOID) > > + return NULL;/* should not happen */ > > + /* The other argument must be a pseudoconstant */ > > + if (!is_pseudo_constant_clause(other)) > > + return NULL; > > > > I think this needs some additional blank lines... > > > OK. And, I also noticed the coding style around this function is > different from other built-in plans, so I redefined the role of > this function to just check whether the supplied RestrictInfo is > an OpExpr that involves a TID inequality operator, or not. > > The expression node shall be extracted when the Plan node is created > from the Path node. > > > + if (IsCTIDVar(arg1, varno)) > > + other = arg2; > > + else if (IsCTIDVar(arg2, varno)) > > + other = arg1; > > + else > > + return NULL; > > + > > + if (exprType(other) != TIDOID) > > + return NULL;/* should not happen */ > > + > > + /* The other argument must be a pseudoconstant */ > > + if (!is_pseudo_constant_clause(other)) > > + return NULL; > > + * CTidEstimateCosts > > + * > > + * It estimates cost to scan the target relation according to the given > > + * restriction clauses. Its logic to scan relations are almost same as > > + * SeqScan doing, because it uses regular heap_getnext(), except for > > + * the number of tuples to be scanned if restriction clauses work well. > > > > That should read "same as what SeqScan is doing"... however, what actual function are you talking about? I couldn't find SeqScanEstimateCosts (or anything ending in EstimateCosts). > > > It is cost_seqscan(). But I don't put raw function names in the source > code comments elsewhere, because nobody can guarantee they stay up to > date in the future... > > > BTW, there's other grammar issues but it'd be best to handle those all at > > once after all the code stuff is done. > > > Yep. Help from a native English speaker is very welcome. > > > + opno = get_commutator(op->opno); > > What happens if there's no commutator? Perhaps best to Assert(opno != > > InvalidOid) at the end of that if block. > > > Usually, the commutator operator of TID is defined at initdb time; however, > nobody can guarantee a mad superuser doesn't drop it. > So, I added elog(ERROR,...) here. > > > > Though, it seems things will fall apart anyway if ctid_quals isn't > exactly > > what we're expecting; I don't know if that's OK or not. > > > No worry, it was already checked at planning stage. > > > + /* disk costs --- assume each tuple on a different page */ > > + run
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Thu, Jan 15, 2015 at 8:02 AM, Kouhei Kaigai wrote: > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai > wrote: > > > When a custom-scan node replaces a join-plan, it shall have at least two > > > child plan-nodes. The callback handler of PlanCustomPath needs to be > > > able to call create_plan_recurse() to transform the underlying > > > path-nodes to plan-nodes, because this custom-scan node may take other > > > built-in scan or sub-join nodes as its inner/outer input. > > > In case of FDW, it shall kick any underlying scan relations to the remote > > > side, thus we may not expect ForeignScan to have underlying plans... > > > > Do you have an example of this? > > > Yes, even though the full code set is too large for patch submission... > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880 > > This create_gpuhashjoin_plan() is the PlanCustomPath callback of GpuHashJoin. > It takes a GpuHashJoinPath inherited from CustomPath that has multiple > underlying scan/join paths. > Once it is called back from the backend, it also calls create_plan_recurse() > to make inner/outer plan nodes according to the paths. > > As a result, we can see the following query execution plan where CustomScan > takes underlying scan plans:
>
> postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
>                                    QUERY PLAN
> ------------------------------------------------------------------------------
>  Custom Scan (GpuHashJoin)  (cost=2968.00..140120.31 rows=3970922 width=143)
>    Hash clause 1: (aid = aid)
>    Hash clause 2: (bid = bid)
>    Bulkload: On
>    ->  Custom Scan (GpuScan) on t0  (cost=500.00..57643.00 rows=409 width=77)
>    ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
>          hash keys: aid
>          nBatches: 1  Buckets: 46000  Memory Usage: 99.99%
>          ->  Seq Scan on t1  (cost=0.00..734.00 rows=4 width=37)
>          ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
>                hash keys: bid
>                nBatches: 1  Buckets: 46000  Memory Usage: 49.99%
>                ->  Seq Scan on t2  (cost=0.00..734.00 rows=4 width=37)
> (13 rows)
Where are we on this? AFAIK, we now have a feature with no documentation and no in-core example to test those custom routine APIs, hence moved to the next CF. -- Michael
Re: [HACKERS] documentation update for doc/src/sgml/func.sgml
On Mon, Feb 2, 2015 at 10:42 PM, Robert Haas wrote: > On Tue, Jan 20, 2015 at 4:01 AM, Fabien COELHO > wrote: > >> I had a look at this patch. This patch adds some text below a table > >> of functions. Immediately above that table, there is this existing > >> language: > >> > >> The functions working with double precision data are mostly > >> implemented on top of the host system's C library; accuracy and behavior in > >> boundary cases can therefore vary depending on the host system. > >> > >> This seems to me to substantially duplicate the information added by the > >> patch. > > > > I would rather say that it makes the potential issues explicit. Taking that into > > account, maybe the part about floating point could be moved up after the > > above sentence, or the above sentence moved down as an introduction, with > > some pruning so that it fits in? > > Possibly. If anyone still cares about this patch, then they should > try revising it along those lines and submit an updated version. If > no one is excited enough about this to do that, we should just flag > this as rejected and move on. Since this patch has been kicking > around since August, my reading is nobody's very excited about it, but > maybe I'm misinterpreting the situation. Let's move on then, marked as rejected. -- Michael
Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat
On Sat, Jan 24, 2015 at 5:58 AM, Alvaro Herrera wrote: > Here's v0.5. (Why did you use that weird decimal versioning scheme? You > could just say "v4" and save a couple of keystrokes). This patch makes > perfect sense to me now. I was ready to commit, but I checked the > regression test you added and noticed that you're only reading results > for the last set of operations because they all use the same table and > so each new set clobbers the values for the previous one. So I modified > them to use one table for each set, and report the counters for all > tables. In doing this I noticed that the one for trunc_stats_test3 is > at odds with what your comment in the .sql file says; would you review > it please? Thanks. > > (I didn't update the expected file.) > > BTW you forgot to update expected/prepared_xact_1.out, for the case when > prep xacts are disabled. > > If some other committer decides to give this a go, please remember to > bump catversion before pushing. > Alex, this patch seems nicely baked. Could you review the changes of Alvaro? This thread has been waiting for your input for 3 weeks. -- Michael
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On Wed, Dec 17, 2014 at 5:39 PM, Simon Riggs wrote: > On 15 December 2014 at 20:26, Jeff Janes wrote: > > > I still get the compiler error in contrib: > > > > pgstattuple.c: In function 'pgstat_heap': > > pgstattuple.c:279: error: too few arguments to function > > 'heap_beginscan_strat' > > > > Should it pass false for the always_prune? > > Yes. > > New version attached > Moved patch to CF 2015-02 with same status "Needs review". It visibly needs more work, and numbers showing an increase in performance, since so far only cases showing a performance decrease have shown up. -- Michael
Re: [HACKERS] parallel mode and parallel contexts
On Thu, Feb 12, 2015 at 3:59 AM, Robert Haas wrote: > We're not seeing eye to eye here yet, since I don't accept your > example scenario and you don't accept mine. Let's keep discussing. > > Meanwhile, here's an updated patch. > A lot of cool activity is showing up here, so I moved the patch to CF 2015-02. Perhaps, Andres, you could add yourself as a reviewer? -- Michael
Re: [HACKERS] WALWriter active during recovery
On Thu, Dec 18, 2014 at 6:43 PM, Fujii Masao wrote: > On Tue, Dec 16, 2014 at 3:51 AM, Simon Riggs > wrote: > > Currently, WALReceiver writes and fsyncs data it receives. Clearly, > > while we are waiting for an fsync we aren't doing any other useful > > work. > > > > Following patch starts WALWriter during recovery and makes it > > responsible for fsyncing data, allowing WALReceiver to progress other > > useful actions. > > +1 > > > At present this is a WIP patch, for code comments only. Don't bother > > with anything other than code questions at this stage. > > > > Implementation questions are > > > > * How should we wake WALReceiver, since it waits on a poll(). Should > > we use SIGUSR1, which is already used for latch waits, or another > > signal? > > Probably we need to change libpqwalreceiver so that it uses the latch. > This is useful even for the startup process to report the replay location to > the walreceiver in real time. > > > * Should we introduce some pacing delays if the WALreceiver gets too > > far ahead of apply? > > I don't think so for now. Instead, we can support synchronous_commit = replay, > and the users can use that new mode if they are worried about the delay of > WAL replay. > > > * Other questions you may have? > > Who should wake the startup process so that it reads and replays the WAL data? > Currently, the walreceiver does. But if the walwriter is responsible for fsyncing WAL data, > probably the walwriter should do that, because the startup process should not replay > WAL data which has not been fsync'd yet. > Moved this patch to CF 2015-02 to not lose track of it and because it did not get any reviews. -- Michael
[HACKERS] question on Postgres smart shutdown mode
Hi All, I have a question on PG smart shutdown mode. When shutting down Postgres by issuing a *Smart Shutdown* mode (SIGTERM) request, is there a way for a client to be notified of this shutdown event? I tried PG_NOTIFY, but I cannot get any notification events when this happens. BTW, I am relatively new to Postgres. Thanks, Wei
Re: [HACKERS] [PATCH] Add transforms feature
On Mon, Dec 22, 2014 at 12:19 PM, Michael Paquier wrote: > On Mon, Dec 15, 2014 at 3:10 PM, Peter Eisentraut wrote: > > fixed > This patch needs a rebase; it does not apply correctly in a couple of > places on latest HEAD (699300a): > ./src/include/catalog/catversion.h.rej > ./src/include/catalog/pg_proc.h.rej > ./src/pl/plpython/plpy_procedure.c.rej > I moved this patch to 2015-02 to not lose track of it and because it did not receive many reviews... -- Michael
Re: [HACKERS] Final Patch for GROUPING SETS
On Wed, Jan 21, 2015 at 6:02 AM, Andrew Gierth wrote: > Updated patch (mostly just conflict resolution): > > - fix explain code to track changes to deparse context handling > > - tiny expansion of some comments (clarify in nodeAgg header >comment that aggcontexts are now EContexts rather than just >memory contexts) > > - declare support for features in sql_features.txt, which had been >previously overlooked > > Patch moved to CF 2015-02. -- Michael
Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
On Tue, Feb 10, 2015 at 10:32 PM, Michael Paquier wrote: > Patch updated is attached. > Patch moved to CF 2015-02 with same status. -- Michael
Re: [HACKERS] inherit support for foreign tables
On Thu, Jan 15, 2015 at 4:35 PM, Etsuro Fujita wrote: > As I said before, that seems to me like a good idea. So I'll update the > patch based on that if you're okay with it. Or have you found any problem > concerning the above idea? > Patch moved to CF 2015-02 with same status, "Ready for committer". -- Michael
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
On Thu, Feb 12, 2015 at 4:18 PM, Andres Freund wrote: > No need for debugging. It's plainly and simply a (cherry-pick) conflict I > resolved wrongly during backpatching. 9.3, 9.4 and master do not have > that problem. That whole fix was quite painful because every single > release had significantly different code :(. pg_basebackup/ is pretty > messy. > I'm not sure why my testsuite didn't trigger that problem. Possibly > because a retry makes things work :( > > Somewhat luckily it's 9.2 only (9.3, 9.4 and master look correct, earlier > releases don't have pg_receivexlog) and can quite easily be worked > around by creating the archive_status directory. The workaround works perfectly for me in this case; I'm going to upgrade it to 9.4 soon anyway. Thank you, Andres. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979 gray...@gmail.com
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Feb 12, 2015 at 8:08 PM, Syed, Rahila wrote: > Thank you for comments. Please find attached the updated patch. > >This patch fails to compile: > >xlogreader.c:1049:46: error: extraneous ')' after condition, expected a statement > >blk->with_hole && blk->hole_offset <= 0)) > This has been rectified. > >Note as well that at least clang does not much like how the sanity checks with with_hole are done. You should place parentheses around the '&&' expressions. Also, I would rather define with_hole == 0 or with_hole == 1 explicitly in those checks. > The expressions are modified accordingly. > >There is a typo: > >s/true,see/true, see/ > >[nitpicky]Be aware as well of the 80-character-per-line limit that comment blocks normally respect.[/] > Have corrected the typos and changed the comments as mentioned. Also, realigned certain lines to meet the 80-char limit.

Thanks for the updated patch.

+ /* leave if data cannot be compressed */
+ if (compressed_len == 0)
+ return false;
This should be < 0; pglz_compress returns -1 when compression fails.

+ if (pglz_decompress(block_image, bkpb->bkp_len, record->uncompressBuf,
+ bkpb->bkp_uncompress_len) == 0)
Similarly, this should be < 0.

Regarding the sanity checks that have been added recently: I think that they are useful, but I suspect as well that only a check on the record CRC is done because that's reliable enough, and not doing those checks speeds up replay a bit. So I am thinking that we should simply replace them with assertions.

I have as well re-run my small test case, with the following results (scripts and results attached):

=# select test, user_diff, system_diff,
   pg_size_pretty(pre_update - pre_insert),
   pg_size_pretty(post_update - pre_update) from results;
  test   | user_diff | system_diff | pg_size_pretty | pg_size_pretty
---------+-----------+-------------+----------------+----------------
 FPW on  | 46.134564 |    0.823306 | 429 MB         | 566 MB
 FPW on  | 16.307575 |    0.798591 | 171 MB         | 229 MB
 FPW on  |  8.325136 |    0.848390 | 86 MB          | 116 MB
 FPW off | 29.992383 |    1.100458 | 440 MB         | 746 MB
 FPW off | 12.237578 |    1.027076 | 171 MB         | 293 MB
 FPW off |  6.814926 |    0.931624 | 86 MB          | 148 MB
 HEAD    | 26.590816 |    1.159255 | 440 MB         | 746 MB
 HEAD    | 11.620359 |    0.990851 | 171 MB         | 293 MB
 HEAD    |  6.300401 |    0.904311 | 86 MB          | 148 MB
(9 rows)

The level of compression reached is the same as the previous measurement, 566MB for the case of fillfactor=50 (cab7npqsc97o-ue5paxfmukwcxe_jioyxo1m4a0pmnmyqane...@mail.gmail.com), with similar CPU usage. Once we get those small issues fixed, I think that it is worth having a committer look at this patch, presumably Fujii-san. Regards, -- Michael
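[To make the return-value convention concrete, here is a minimal sketch of the corrected checks. It assumes the in-core pglz API from src/common/pg_lzcompress.h (pglz_compress/pglz_decompress return a negative value on failure, as noted above); the wrapper function names are illustrative, not taken from the patch.]

#include "postgres.h"
#include "common/pg_lzcompress.h"

/*
 * Sketch: compression side.  pglz_compress() returns -1 on failure,
 * so the test must be "< 0", not "== 0".
 */
static bool
compress_block_image(const char *source, int32 orig_len,
                     char *dest, uint16 *compressed_len)
{
    int32   len = pglz_compress(source, orig_len, dest,
                                PGLZ_strategy_default);

    if (len < 0)
        return false;           /* leave the block image uncompressed */

    *compressed_len = (uint16) len;
    return true;
}

/* Sketch: decompression side, same "< 0" failure convention. */
static bool
restore_block_image(const char *block_image, int32 bkp_len,
                    char *uncompress_buf, int32 uncompress_len)
{
    return pglz_decompress(block_image, bkp_len,
                           uncompress_buf, uncompress_len) >= 0;
}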
Re: [HACKERS] Odd behavior of updatable security barrier views on foreign tables
On 2015/02/11 4:06, Stephen Frost wrote: * Etsuro Fujita (fujita.ets...@lab.ntt.co.jp) wrote: On 2015/02/10 7:23, Dean Rasheed wrote: Sorry, I didn't have time to look at this properly. My initial thought is that expand_security_qual() needs to request a lock on rows coming from the relation it pushes down into a subquery if that relation was the result relation, because otherwise it won't have any locks, since preprocess_rowmarks() only adds PlanRowMarks to non-target relations. That seems close to what I had in mind; expand_security_qual() needs to request a FOR UPDATE lock on rows coming from the relation it pushes down into a subquery only when that relation is the result relation and a *foreign table*. I had been trying to work out an FDW-specific way to address this, but I think Dean's right that this should be addressed in expand_security_qual(), which means it'll apply to all cases and not just these FDW calls. I don't think that's actually an issue though, and it'll match up to how SELECT FOR UPDATE is handled today. Sorry, my explanation was not accurate, but I also agree with Dean's idea. In the above, I just wanted to make it clear that such a lock request done by expand_security_qual() should be limited to the case where the relation that is a former result relation is a foreign table. If it's OK, I'll submit a patch for that, maybe early next week. Thank you for working on this issue! Best regards, Etsuro Fujita
[HACKERS] "multiple backends attempting to wait for pincount 1"
Two different CLOBBER_CACHE_ALWAYS critters recently reported exactly the same failure pattern on HEAD: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2015-02-06%2011%3A59%3A59 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tick&dt=2015-02-12%2010%3A22%3A57 I'd say we have a problem. I'd even go so far as to say that somebody has completely broken locking, because this looks like autovacuum and manual vacuuming are hitting the same table at the same time. regards, tom lane
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 07:40:12AM -0500, Robert Haas wrote: > On Thu, Feb 12, 2015 at 12:16 AM, Noah Misch wrote: > > That is a major mark against putting the check in simplify_function(), > > agreed. > > I do see one way to rescue that idea, which is this: put two flags, > parallelModeOK and parallelModeRequired, into PlannerGlobal. At the > beginning of planning, set parallelModeOK based on our best knowledge > at that time; as we preprocess expressions, update it to false if we > see something that's not parallel-safe. Emit paths for parallel plans > only if the flag is currently true. At the end of planning, when we > convert paths to plans, set parallelModeRequired if any parallel plan > elements are generated. If we started with parallelModeOK = true or > ended up with parallelModeRequired = false, then life is good. In the > unfortunate event that we end up with parallelModeOK = false and > parallelModeRequired = true, replan, this time with parallelModeOK = > false from the beginning. > So this would mostly be pretty cheap, but if you do hit the case where > a re-plan is required it would be pretty painful. > >> > Unless we want to rejigger this so that we do a > >> > complete eval_const_expressions() pass over the entire query tree > >> > (including all subqueries) FIRST, and then only after that go back and > >> > plan all of those subqueries, I don't see how to make this work; and > >> > I'm guessing that there are good reasons not to do that. > > > > I expect that would work fine, but I do think it is premature to venture that far out of your way to optimize this new tree examination. Given your wish to optimize, I recommend first investigating the earlier thought to issue eval_const_expressions() once per planner() instead of once per subquery_planner(). Compared to the parallelModeRequired/parallelModeOK idea, it would leave us with a more maintainable src/backend/optimizer. I won't object to either design, though. Your survey of parallel safety among built-in functions was on-target.
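[A sketch of the replanning control flow Robert describes above: only the parallelModeOK/parallelModeRequired flags come from the message, while plan_query_with_parallel_fallback(), plan_query_internal() and parallel_mode_possible() are made-up names, not planner APIs.]

/* Assumes the usual planner headers (nodes/relation.h, etc.). */
static PlannedStmt *
plan_query_with_parallel_fallback(Query *parse)
{
    PlannerGlobal glob = {0};
    PlannedStmt *result;

    /* First attempt: optimistically allow parallel paths. */
    glob.parallelModeOK = parallel_mode_possible(parse);
    glob.parallelModeRequired = false;
    result = plan_query_internal(parse, &glob);

    /*
     * Expression preprocessing may clear parallelModeOK after parallel
     * paths have already been emitted.  If any of those paths made it
     * into the final plan, replan with parallelism disabled throughout.
     */
    if (!glob.parallelModeOK && glob.parallelModeRequired)
    {
        glob.parallelModeOK = false;
        glob.parallelModeRequired = false;
        result = plan_query_internal(parse, &glob);
    }

    return result;
}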
[HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c
Hi all, When calling vacuum(), there is the following assertion using VACOPT_FREEZE:

Assert((vacstmt->options & VACOPT_VACUUM) ||
       !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));

I think that this should be changed to sanity checks based on the values of the freeze_* parameters in VacuumStmt, as we do not set VACOPT_FREEZE when VACUUM is used without options in parentheses, with something like this:

 Assert((vacstmt->options & VACOPT_VACUUM) ||
-       !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+       ((vacstmt->options & VACOPT_FULL) == 0 &&
+        vacstmt->freeze_min_age < 0 &&
+        vacstmt->freeze_table_age < 0 &&
+        vacstmt->multixact_freeze_min_age < 0 &&
+        vacstmt->multixact_freeze_table_age < 0));

This would also have the advantage of limiting the use of VACOPT_FREEZE to the query parser. A patch is attached. Thoughts? -- Michael

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2f3f79d..3ddc077 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -110,7 +110,11 @@ vacuum(VacuumStmt *vacstmt, Oid relid, bool do_toast,
 	/* sanity checks on options */
 	Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
 	Assert((vacstmt->options & VACOPT_VACUUM) ||
-		   !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+		   ((vacstmt->options & VACOPT_FULL) == 0 &&
+			vacstmt->freeze_min_age < 0 &&
+			vacstmt->freeze_table_age < 0 &&
+			vacstmt->multixact_freeze_min_age < 0 &&
+			vacstmt->multixact_freeze_table_age < 0));
 	Assert((vacstmt->options & VACOPT_ANALYZE) || vacstmt->va_cols == NIL);
 
 	stmttype = (vacstmt->options & VACOPT_VACUUM) ? "VACUUM" : "ANALYZE";
Re: [HACKERS] Table-level log_autovacuum_min_duration
On Fri, Feb 13, 2015 at 10:16 AM, Naoya Anzai wrote: >>> You mean that ... >>> log_autovacuum_min_duration assumes a role of VACOPT_VERBOSE. >>> Even if this parameter is currently never used for manual vacuum, >>> log_autovacuum_min_duration should be set to zero (always output) >>> when we execute "VACUUM (or ANALYZE) VERBOSE". >>> Is my understanding correct? If so, it sounds logical. >>> >> >>Yup, that's my opinion. Now I don't know if people would mind removing >>VACOPT_VERBOSE and replacing the control it does with log_min_duration in >>VacuumStmt. At least both things are overlapping, and log_min_duration >>offers more options than the plain VACOPT_VERBOSE. > > OK. I agree with you. > Please re-implement it as you are thinking. OK, will do that today. >>> If we can abolish the VERBOSE option, >>> I think it's ideal to prepare a new parameter like >>> log_min_duration_vacuum (and log_min_duration_analyze) which >>> integrates the "VERBOSE feature" and "log_autovacuum_min_duration". >>> >> >>What I think you are proposing here is a VERBOSE option that hypothetically >>gets activated if a manual VACUUM takes more than a certain amount >>specified by those parameters. I doubt this would be useful. In any case >>this is unrelated to this patch. > > I suspect manual vacuum is often executed as a "semi-auto vacuum" > by a job scheduler, cron, etc. in actual maintenance cases. > Whether auto or manual, I think it's important to output > their logs through the same mechanism. > > Sorry, I seem to have wandered from the subject. No problem. That's a constructive discussion :) -- Michael
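[For context on the feature under discussion, a minimal sketch of how the per-table setting would be used, assuming the patch exposes the GUC as a relation storage parameter; this is the proposal's shape, not a committed interface.]

-- Log every autovacuum/autoanalyze run on this table (0 = always log):
ALTER TABLE my_table SET (log_autovacuum_min_duration = 0);

-- Revert to the global log_autovacuum_min_duration setting:
ALTER TABLE my_table RESET (log_autovacuum_min_duration);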
Re: [HACKERS] Table-level log_autovacuum_min_duration
>> You mean that ... >> log_autovacuum_min_duration assumes a role of VACOPT_VERBOSE. >> Even if this parameter is currently never used for manual vacuum, >> log_autovacuum_min_duration should be set to zero (always output) >> when we execute "VACUUM (or ANALYZE) VERBOSE". >> Is my understanding correct? If so, it sounds logical. >> > >Yup, that's my opinion. Now I don't know if people would mind removing >VACOPT_VERBOSE and replacing the control it does with log_min_duration in >VacuumStmt. At least both things are overlapping, and log_min_duration >offers more options than the plain VACOPT_VERBOSE. OK. I agree with you. Please re-implement it as you are thinking. >> If we can abolish the VERBOSE option, >> I think it's ideal to prepare a new parameter like >> log_min_duration_vacuum (and log_min_duration_analyze) which >> integrates the "VERBOSE feature" and "log_autovacuum_min_duration". >> > >What I think you are proposing here is a VERBOSE option that hypothetically >gets activated if a manual VACUUM takes more than a certain amount >specified by those parameters. I doubt this would be useful. In any case >this is unrelated to this patch. I suspect manual vacuum is often executed as a "semi-auto vacuum" by a job scheduler, cron, etc. in actual maintenance cases. Whether auto or manual, I think it's important to output their logs through the same mechanism. Sorry, I seem to have wandered from the subject. Naoya
Re: [HACKERS] PATCH: Reducing lock strength of trigger and foreign key DDL
On Sun, Feb 8, 2015 at 10:05 AM, Andreas Karlsson wrote: > On 01/30/2015 07:48 AM, Michael Paquier wrote: >> >> Looking at the latest patch, it seems that in >> AlterTableGetLockLevel@tablecmds.c we ought to put AT_ReAddConstraint, >> AT_AddConstraintRecurse and AT_ProcessedConstraint under the same >> banner as AT_AddConstraint. Thoughts? > > > A new version of the patch is attached which treats them as the same for > locking. I think it is correct and improves readability to do so. Well then, let's switch it to "Ready for committer". I am also moving this entry to the next CF with the same status. -- Michael
Re: [HACKERS] Commit fest 2015-12 enters money time
On Thu, Jan 15, 2015 at 4:05 PM, Michael Paquier wrote: > We are soon entering the money time for this CF. The last month has > been mainly a vacation period, the progress being rather phantom-like on many > fronts, but there are still a couple of patches that are marked as ready > for committer: > - Foreign table inheritance, whose first patch has been committed > - speedup tidbitmap patch: cache page > - pg_basebackup vs. Windows and tablespaces (Extend base backup to > include symlink file used to restore symlinks) > - If pg_hba.conf has changed since last reload, emit a HINT to the > client and server log > - Add restore_command_retry_interval option to control timeout of > restore_command nonzero status code > - documentation update for doc/src/sgml/func.sgml > - Reducing lock strength of trigger and foreign key DDL > > I am going to make a first pass on all the patches to check their > status, and please note that the patches waiting for some input from > their authors will be marked as returned with feedback. In order to move on to the next CF, I am going to go through all the remaining patches and update their status accordingly. And sorry for slacking a bit regarding that. If you wish to complain, of course feel free to post messages on this thread or on the thread related to the patch itself. Regards, -- Michael
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
On 2015-02-12 11:44:05 -0800, Sergey Konoplev wrote: > On Thu, Feb 12, 2015 at 11:40 AM, Andres Freund > wrote: > > This obviously should not be the case. I'll have a look in a couple of > > hours. Until then you can likely just work around the problem by creating > > the archive_status directory. > > Thank you. Just let me know if you need some extra info or debugging. No need for debugging. It's plainly and simply a (cherry-pick) conflict I resolved wrongly during backpatching. 9.3, 9.4 and master do not have that problem. That whole fix was quite painful because every single release had significantly different code :(. pg_basebackup/ is pretty messy. I'm not sure why my testsuite didn't trigger that problem. Possibly because a retry makes things work :( Somewhat luckily it's 9.2 only (9.3, 9.4 and master look correct, earlier releases don't have pg_receivexlog) and can quite easily be worked around by creating the archive_status directory. If you want to fix it locally, you just need to replace

ReceiveXlogStream(conn, startpos, timeline, NULL, basedir, stop_streaming,
                  standby_message_timeout, false, true);

with

ReceiveXlogStream(conn, startpos, timeline, NULL, basedir, stop_streaming,
                  standby_message_timeout, false, false);

Yes, that and pretty much all other functions in that directory have too many parameters. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] gcc5: initdb produces gigabytes of _fsm files
On Thu, Feb 12, 2015 at 6:16 PM, Tom Lane wrote: > I wrote: >> Christoph Berg writes: >>> gcc5 is lurking in Debian experimental, and it's breaking initdb. > >> Yeah, I just heard the same about Red Hat as well: >> https://bugzilla.redhat.com/show_bug.cgi?id=1190978 >> Not clear if it's an outright compiler bug or they've just found some >> creative new way to make an optimization assumption we're violating. > > Apparently, it's the former. See > > https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3 > > I will be unamused if the gcc boys try to make an argument that they > did some valid optimization here. You're new here, aren't you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] gcc5: initdb produces gigabytes of _fsm files
I wrote: > Christoph Berg writes: >> gcc5 is lurking in Debian experimental, and it's breaking initdb. > Yeah, I just heard the same about Red Hat as well: > https://bugzilla.redhat.com/show_bug.cgi?id=1190978 > Not clear if it's an outright compiler bug or they've just found some > creative new way to make an optimization assumption we're violating. Apparently, it's the former. See https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3 I will be unamused if the gcc boys try to make an argument that they did some valid optimization here. regards, tom lane
Re: [HACKERS] How about to have relnamespace and relrole?
On 2/12/15 5:28 AM, Kyotaro HORIGUCHI wrote: Hello, I changed the subject. This mail is to address the point at hand, in preparation for registering it in this commitfest. 15 17:29:14 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20150204.172914.52110711.horiguchi.kyot...@lab.ntt.co.jp> Tue, 03 Feb 2015 10:12:12 -0500, Tom Lane wrote in <2540.1422976...@sss.pgh.pa.us> I'm not really excited about that. That line of thought would imply that we should have "reg*" types for every system catalog, which is surely overkill. Mmm. I suppose "for every OID usage", not "every system catalog", but I agree on the whole. There's no agreeable-by-all boundary. Perhaps it depends on how often the average DBA looks into catalogs which have OIDs pointing to another catalog that they want to see in human-readable form, without joins if possible. I very roughly counted how often the OIDs of catalogs are referred to from other catalogs. 1. Expected frequency of use ... For that reason, although the current policy for deciding whether to have reg* types seems to be whether they have schema-qualified names, I think regrole and regnamespace are valuable to have. Perhaps schema qualification was the original intent, but I think at this point everyone uses them for convenience. Why would I want to type out JOIN pg_namespace n ON n.oid = blah.???namespace when I could simply do ???namespace::regnamespace? 2. Anticipated un-optimizability Tom pointed out that these reg* types prevent the planner from optimizing the query, so we should refrain from having such machinery. It should have been a long-standing issue, but reg*s seem to be rather faster than joining the corresponding catalogs for a moderate number of result rows; this, however, raises another more annoying issue. 3. Potential breakage of MVCC The other issue Tom pointed out is potential breakage of MVCC by using these reg* types. The syscache is invalidated on updates, so this doesn't seem to be a problem in READ COMMITTED mode, but it breaks SERIALIZABLE mode. But IMHO it is not so serious a problem as long as such inconsistency occurs only on system catalogs and it is explicitly documented together with the un-optimizability issue. I'll add a patch for this later. The way I look at it, all these arguments went out the window when regclass was introduced. The reality is that reg* is *way* more convenient than manual joins. The arguments about optimization and MVCC presumably apply to all reg* casts, which means that ship has long since sailed. If we offered views that made it easier to get at this data then maybe this wouldn't matter so much. But DBAs don't want to join 3 catalogs together to try and answer a simple question. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
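[To illustrate the convenience argument above, a sketch assuming the proposed regrole/regnamespace types behave like the existing regclass; the proposed types are not yet committed.]

-- Today: a join against pg_namespace to get a readable schema name
SELECT c.relname, n.nspname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace;

-- With the proposed regnamespace type: a simple cast, as regclass already allows
SELECT relname, relnamespace::regnamespace AS nspname
FROM pg_class;

-- Likewise for role OIDs with the proposed regrole
SELECT relname, relowner::regrole AS owner
FROM pg_class;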
[HACKERS] Help me! Why did the slave stop suddenly?
Hi, dear pgsql-hackers

*1. environment*

*DB Master*
$ cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m
$ uname -av
Linux l-x1.xx.cnx 3.14.29-3.centos6.x86_64 #1 SMP Tue Jan 20 17:48:32 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
$ psql -U postgres
psql (9.3.5)
Type "help" for help.
postgres=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
(1 row)
$ pg_config
BINDIR = /opt/pg93/bin
DOCDIR = /opt/pg93/share/doc/postgresql
HTMLDIR = /opt/pg93/share/doc/postgresql
INCLUDEDIR = /opt/pg93/include
PKGINCLUDEDIR = /opt/pg93/include/postgresql
INCLUDEDIR-SERVER = /opt/pg93/include/postgresql/server
LIBDIR = /opt/pg93/lib
PKGLIBDIR = /opt/pg93/lib/postgresql
LOCALEDIR = /opt/pg93/share/locale
MANDIR = /opt/pg93/share/man
SHAREDIR = /opt/pg93/share/postgresql
SYSCONFDIR = /opt/pg93/etc/postgresql
PGXS = /opt/pg93/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/opt/pg93' '--with-perl' '--with-libxml' '--with-libxslt' '--with-ossp-uuid' 'CFLAGS= -march=core2 -O2 '
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -march=core2 -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -L../../../src/common -Wl,--as-needed -Wl,-rpath,'/opt/pg93/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgport -lpgcommon -lxslt -lxml2 -lz -lreadline -lcrypt -ldl -lm
VERSION = PostgreSQL 9.3.5

*DB Slave*
$ cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m
$ uname -av
Linux l-2.xx.cnx 3.14.31-3.centos6.x86_64 #1 SMP Mon Feb 2 15:26:04 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
$ psql -U postgres
psql (9.3.5)
Type "help" for help.
postgres=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
(1 row)
postgres=# show log_line_prefix ;
       log_line_prefix
------------------------------
 [%u %d %a %h %m %p %c %l %x]
(1 row)
$ pg_config
BINDIR = /opt/pg93/bin
DOCDIR = /opt/pg93/share/doc/postgresql
HTMLDIR = /opt/pg93/share/doc/postgresql
INCLUDEDIR = /opt/pg93/include
PKGINCLUDEDIR = /opt/pg93/include/postgresql
INCLUDEDIR-SERVER = /opt/pg93/include/postgresql/server
LIBDIR = /opt/pg93/lib
PKGLIBDIR = /opt/pg93/lib/postgresql
LOCALEDIR = /opt/pg93/share/locale
MANDIR = /opt/pg93/share/man
SHAREDIR = /opt/pg93/share/postgresql
SYSCONFDIR = /opt/pg93/etc/postgresql
PGXS = /opt/pg93/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/opt/pg93' '--with-perl' '--with-libxml' '--with-libxslt' '--with-ossp-uuid' 'CFLAGS= -march=core2 -O2 '
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -march=core2 -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -L../../../src/common -Wl,--as-needed -Wl,-rpath,'/opt/pg93/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgport -lpgcommon -lxslt -lxml2 -lz -lreadline -lcrypt -ldl -lm
VERSION = PostgreSQL 9.3.5

*2. the DB log in the Slave's log_directory*
[2015-02-05 15:38:51.406 CST 2328 54d08abc.918 6 0]WARNING: will not overwrite a used ItemId
[2015-02-05 15:38:51.406 CST 2328 54d08abc.918 7 0]CONTEXT: xlog redo insert: rel 38171461/16384/57220350; tid 1778398/9
[2015-02-05 15:38:51.406 CST 2328 54d08abc.918 8 0]PANIC: heap_insert_redo: failed to add tuple
[2015-02-05 15:38:51.406 CST 2328 54d08abc.918 9 0]CONTEXT: xlog redo insert: rel 38171461/16384/57220350; tid 1778398/9
[2015-02-05 15:38:51.765 CST 2320 54d08abb.910 6 0]LOG: startup process (PID 2328) was terminated by signal 6: Aborted
[2015-02-05 15:38:51.765 CST 2320 54d08abb.910 7 0]LOG: terminating any other active server processes
[DBusesr DBname [unknown] 192.168.xxx.x 2015-02-05 15:38:51.765 CST 61450 54d31d48.f00a 3 0]WARNING: terminating connection because of crash of another server process
[DBusesr DBname [unknown] 192.168.xxx.x 2015-02-05 15:38:51.765 CST 61450 54d31d48.f00a 4 0]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[DBusesr DBname [unknown] 192.168.xxx.x 2015-02-05 15:38:51.765 CST 61450 54d31d48.f00a 5 0]HINT: In a moment you should be able to reconnect to the database and repeat your command.
[DBusesr DBname [unknown] 192.168.xxx.x 2015-02-05 15:38:51.765 CST 51208 54d315b6.c808 7 0]WARNING: terminating connection because of crash of another server process
[DBuse
Re: [HACKERS] enabling nestedloop and disabling hashjon
On 12/2/15 18:29, Tom Lane wrote: > Ravi Kiran writes: >> I am sorry for the late reply, when I disabled the hash join command >> "enable_hashjoin=off" in the postgresql.conf file, it was not working. But >> I when I used the command "set enable_hashjoin=off" command in the back >> end. It worked. >> I am not able to understand why it did not get disabled when I changed it >> in the postgresql file. > > The two plausible explanations for that are (1) you didn't do a reload > or (2) you forgot to remove the '#' comment marker in the file's entry. Or you changed the wrong postgresql.conf file. Regards Rodrigo Gonzalez
Re: [HACKERS] enabling nestedloop and disabling hashjon
On 2/12/15 3:34 PM, Ravi Kiran wrote: sorry for the inconvenience if caused to anyone, but as David G johnston said, I was trying to change how the postgresql works and was not able to figure out how it should be done. I will make sure it will be clear from the next time. Thank you very much. And we're glad for the input. But we get a tremendous amount of email, so it's best if posts go to the right place. Just for future reference. @Tom lane Sir, I forgot to remove the # comment marker in the file's entry, Thank you. Glad you were able to fix the problem. :) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] enabling nestedloop and disabling hashjon
On 2/12/15 3:20 PM, David G Johnston wrote: >>Does "show enable_hashjoin" say it's off? If not, I think you must've >>fat-fingered the postgresql.conf change somehow. >For future reference, posts like this belong on pgsql-performance. >but postgres is still using the hash join algorithm even after modifying >the postgresql code To be fair given the original post, and some other recent posts by the same author, the question is not "why is my query performing slowly" but rather "I'm trying to change how PostgreSQL works and cannot figure out how to properly enable and disable algorithms". -hackers seems like the proper forum though the OP could give more context as to his overarching goals to make that more clear to readers. -hackers is for discussion directly related to developing Postgres itself. This email was a request for help, and nothing to do with actual development. I'm not trying to dismiss the importance of the request; it is important. But the proper forum for it was -general or -performance. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] enabling nestedloop and disabling hashjon
sorry for the inconvenience if caused to anyone, but as David G Johnston said, I was trying to change how PostgreSQL works and was not able to figure out how it should be done. I will make sure it is clear next time. Thank you very much. @Tom Lane Sir, I forgot to remove the # comment marker in the file's entry, Thank you. On Fri, Feb 13, 2015 at 2:50 AM, David G Johnston < david.g.johns...@gmail.com> wrote: > Jim Nasby-5 wrote > > On 2/10/15 9:29 AM, Tom Lane wrote: > >> Ravi Kiran < > > > ravi.kolanpaka@ > > > > writes: > >>> yes sir, I did try the pg_ctl reload command, but its still using the > >>> hash > >>> join algorithm and not the nested loop algorithm. I even restarted the > >>> server, even then its still using the hash join algorithm > >> > >> Does "show enable_hashjoin" say it's off? If not, I think you must've > >> fat-fingered the postgresql.conf change somehow. > > > > For future reference, posts like this belong on pgsql-performance. > > > > but postgres is still using the hash join algorithm even after modifying > > the postgresql code > > To be fair given the original post, and some other recent posts by the same > author, the question is not "why is my query performing slowly" but rather > "I'm trying to change how PostgreSQL works and cannot figure out how to > properly enable and disable algorithms". -hackers seems like the proper > forum though the OP could give more context as to his overarching goals to > make that more clear to readers. > > David J.
Re: [HACKERS] enabling nestedloop and disabling hashjon
Ravi Kiran writes: > I am sorry for the late reply, when I disabled the hash join command > "enable_hashjoin=off" in the postgresql.conf file, it was not working. But > I when I used the command "set enable_hashjoin=off" command in the back > end. It worked. > I am not able to understand why it did not get disabled when I changed it > in the postgresql file. The two plausible explanations for that are (1) you didn't do a reload or (2) you forgot to remove the '#' comment marker in the file's entry. regards, tom lane
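[For anyone hitting the same pitfall, a minimal sketch of the postgresql.conf route; pg_reload_conf() and SHOW are standard, and nothing here is specific to this thread.]

# postgresql.conf -- the entry only takes effect if it is uncommented:
#enable_hashjoin = on       <- still commented out, ignored
enable_hashjoin = off       <- active after a reload

-- then reload ("pg_ctl reload -D $PGDATA" also works):
SELECT pg_reload_conf();
-- and verify the running value:
SHOW enable_hashjoin;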
Re: [HACKERS] enabling nestedloop and disabling hashjon
Jim Nasby-5 wrote > On 2/10/15 9:29 AM, Tom Lane wrote: >> Ravi Kiran < > ravi.kolanpaka@ > > writes: >>> yes sir, I did try the pg_ctl reload command, but its still using the >>> hash >>> join algorithm and not the nested loop algorithm. I even restarted the >>> server, even then its still using the hash join algorithm >> >> Does "show enable_hashjoin" say it's off? If not, I think you must've >> fat-fingered the postgresql.conf change somehow. > > For future reference, posts like this belong on pgsql-performance. > but postgres is still using the hash join algorithm even after modifying > the postgresql code To be fair given the original post, and some other recent posts by the same author, the question is not "why is my query performing slowly" but rather "I'm trying to change how PostgreSQL works and cannot figure out how to properly enable and disable algorithms". -hackers seems like the proper forum though the OP could give more context as to his overarching goals to make that more clear to readers. David J.
Re: [HACKERS] enabling nestedloop and disabling hashjon
I am sorry for the late reply. When I disabled hash joins with "enable_hashjoin=off" in the postgresql.conf file, it was not working. But when I used the command "set enable_hashjoin=off" in the backend, it worked. I am not able to understand why it did not get disabled when I changed it in the postgresql.conf file. On Fri, Feb 13, 2015 at 2:34 AM, Jim Nasby wrote: > On 2/10/15 9:29 AM, Tom Lane wrote: > >> Ravi Kiran writes: > >> >>> yes sir, I did try the pg_ctl reload command, but its still using the >>> hash >>> join algorithm and not the nested loop algorithm. I even restarted the >>> server, even then its still using the hash join algorithm >>> >> >> Does "show enable_hashjoin" say it's off? If not, I think you must've >> fat-fingered the postgresql.conf change somehow. >> > > For future reference, posts like this belong on pgsql-performance. > > The other possibility is that the query estimates are so high that the > setting doesn't matter. When you set any of the enable_* settings to off, > all that really happens is the planner adds a cost of 10M to those nodes > when it's planning. Normally that's enough to toss those plans out, but in > extreme cases the cost estimates will still come up with the un-desired > plan. > > Can you post EXPLAIN ANALYZE output with the setting on and off? Or at > least plain EXPLAIN output. > -- > Jim Nasby, Data Architect, Blue Treble Consulting > Data in Trouble? Get it in Treble! http://BlueTreble.com >
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 3:52 PM, Amit Kapila wrote: >> Probably not, because many queries will scan multiple relations, and >> we want to do all of this work just once per query. > > By this, do you mean to say that if there is any parallel-unsafe > expression (function call) in query, then we won't parallelize any > part of query, if so why is that mandatory? Because of stuff like set_config() and txid_current(), which will fail outright in parallel mode. Also because the user may have defined a function that updates some other table in the database, which will also fail outright if called in parallel mode. Instead of failing, we want those kinds of things to fall back to a non-parallel plan. > Can't we parallelize scan on a particular relation if all the expressions > in which that relation is involved are parallel-safe No, because the execution of that node can be interleaved in arbitrary fashion with all the other nodes in the plan tree. Once we've got parallel mode active, all the related prohibitions apply to *everything* we do thereafter, not just that one node. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] enabling nestedloop and disabling hashjon
On 2/10/15 9:29 AM, Tom Lane wrote: Ravi Kiran writes: yes sir, I did try the pg_ctl reload command, but its still using the hash join algorithm and not the nested loop algorithm. I even restarted the server, even then its still using the hash join algorithm Does "show enable_hashjoin" say it's off? If not, I think you must've fat-fingered the postgresql.conf change somehow. For future reference, posts like this belong on pgsql-performance. The other possibility is that the query estimates are so high that the setting doesn't matter. When you set any of the enable_* settings to off, all that really happens is the planner adds a cost of 10M to those nodes when it's planning. Normally that's enough to toss those plans out, but in extreme cases the cost estimates will still come up with the un-desired plan. Can you post EXPLAIN ANALYZE output with the setting on and off? Or at least plain EXPLAIN output. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
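[A sketch of how to gather that comparison; the table names are placeholders for the real query.]

SET enable_hashjoin = on;
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 USING (id);

-- If the second plan still shows a Hash Join, its estimated cost should
-- include the large disable penalty described above.
SET enable_hashjoin = off;
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 USING (id);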
Re: [HACKERS] gcc5: initdb produces gigabytes of _fsm files
Christoph Berg writes: > gcc5 is lurking in Debian experimental, and it's breaking initdb. Yeah, I just heard the same about Red Hat as well: https://bugzilla.redhat.com/show_bug.cgi?id=1190978 Not clear if it's an outright compiler bug or they've just found some creative new way to make an optimization assumption we're violating. Either way it seems clear that the find-a-page-with-free-space code is getting into an infinite loop whereby it keeps extending the FSM till it runs out of disk space. There's a more detailed stack trace in the Red Hat report. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 10:07 PM, Robert Haas wrote: > > On Thu, Feb 12, 2015 at 6:40 AM, Amit Kapila wrote: > > If we have to go this way, then isn't it better to evaluate the same > > when we are trying to create parallel path (something like in the > > parallel_seq scan patch - create_parallelscan_paths())? > > Probably not, because many queries will scan multiple relations, and > we want to do all of this work just once per query. By this, do you mean to say that if there is any parallel-unsafe expression (function call) in the query, then we won't parallelize any part of the query; if so, why is that mandatory? Can't we parallelize a scan on a particular relation if all the expressions in which that relation is involved are parallel-safe? > Also, when > somebody adds another parallel node (e.g. parallel hash join) that > will need this same information. > I think we should be able to handle this even if we have per-relation information (something like: don't parallelize a node/path if any lower node is not parallel-safe) With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
[HACKERS] gcc5: initdb produces gigabytes of _fsm files
Hi, gcc5 is lurking in Debian experimental, and it's breaking initdb. There are bug reports for 9.1 and 9.4: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778070 (9.1) https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778071 (9.4) but I could reproduce it with 9.5devel (from last week) as well: gcc (Debian 5-20150205-1) 5.0.0 20150205 (experimental) [trunk revision 220455] make[2]: Leaving directory '/srv/projects/postgresql/pg/master/src/common' ../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule == removing existing temp installation == == creating temporary installation == == initializing database system == pg_regress: initdb failed Examine /srv/projects/postgresql/pg/master/src/test/regress/log/initdb.log for the reason. Command was: "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb" -D "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data" -L "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/install//usr/local/pgsql/share" --noclean --nosync > "/srv/projects/postgresql/pg/master/src/test/regress/log/initdb.log" 2>&1 GNUmakefile:138: recipe for target 'check' failed src/test/regress $ cat log/initdb.log Running in noclean mode. Mistakes will not be cleaned up. The files belonging to this database system will be owned by user "cbe". This user must also own the server process. The database cluster will be initialized with locales COLLATE: de_DE.utf8 CTYPE: de_DE.utf8 MESSAGES: C MONETARY: de_DE.utf8 NUMERIC: de_DE.utf8 TIME: de_DE.utf8 The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "german". Data page checksums are disabled. creating directory /srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... sysv creating configuration files ... ok creating template1 database in /srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data/base/1 ... FATAL: could not extend file "base/1/2617_fsm": wrote only 4096 of 8192 bytes at block 46197 HINT: Check free disk space.
PANIC: cannot abort transaction 1, it was already committed Aborted (core dumped) src/test/regress $ ls -al tmp_check/data/base/1/ total 34156376 drwx------ 2 cbe cbe 4096 Feb 12 20:04 ./ drwx------ 3 cbe cbe 4096 Feb 12 19:55 ../ -rw------- 1 cbe cbe 40960 Feb 12 20:04 1247 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.1 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.2 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.3 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.4 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.5 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.6 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1247_fsm.7 -rw------- 1 cbe cbe 59138048 Feb 12 20:04 1247_fsm.8 -rw------- 1 cbe cbe 49152 Feb 12 20:04 1249 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.1 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.2 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.3 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.4 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.5 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.6 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.7 -rw------- 1 cbe cbe 59138048 Feb 12 20:04 1249_fsm.8 -rw------- 1 cbe cbe 540672 Feb 12 20:03 1255 -rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm -rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.1 -rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.2 -rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.3 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.4 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.5 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.6 -rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.7 -rw------- 1 cbe cbe 59138048 Feb 12 20:03 1255_fsm.8 -rw------- 1 cbe cbe 16384 Feb 12 20:04 1259 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.1 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.2 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.3 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.4 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.5 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.6 -rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.7 -rw------- 1 cbe cbe 59138048 Feb 12 20:04 1259_fsm.8 -rw------- 1 cbe cbe 0 Feb 12 20:04 2604 -rw-------
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
On Thu, Feb 12, 2015 at 11:40 AM, Andres Freund wrote: > This obviously should not be the case. I'll have a look in a couple of hours. > Until then you can likely just work around the problem by creating the > archive_status directory. Thank you. Just let me know if you need some extra info or debugging. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
Hi. This obviously should not be the case. I'll have a look in a couple of hours. Until then you can likely just work around the problem by creating the archive_status directory. -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
On Thu, Feb 12, 2015 at 11:13 AM, Andres Freund wrote: >>I started getting these errors after upgrading from 9.2.8 to 9.2.10. >>Is it something critical that requires a version downgrade, or can I >>just ignore those errors? > > What errors are you getting in precisely which circumstances? You're using > pg_receivexlog? Errors like this one: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402AF00B7.done": No such file or directory pg_receivexlog: disconnected.. on Linux xyz 3.2.0-76-generic #111-Ubuntu SMP PostgreSQL 9.2.10 Yes, I use pg_receivexlog. I also use a wrapper/watchdog script around pg_receivexlog which tracks failures and restarts the latter. The WAL file times correlate with the pg_receivexlog failures. postgres@xyz:~$ ls -ltr /mnt/archive/wal/ | tail -rw------- 1 postgres postgres 16777216 Feb 12 10:58 000402B60011 -rw------- 1 postgres postgres 16777216 Feb 12 11:02 000402B60012 -rw------- 1 postgres postgres 16777216 Feb 12 11:06 000402B60013 -rw------- 1 postgres postgres 16777216 Feb 12 11:11 000402B60014 -rw------- 1 postgres postgres 16777216 Feb 12 11:15 000402B60015 -rw------- 1 postgres postgres 16777216 Feb 12 11:19 000402B60016 -rw------- 1 postgres postgres 16777216 Feb 12 11:23 000402B60017 -rw------- 1 postgres postgres 16777216 Feb 12 11:27 000402B60018 -rw------- 1 postgres postgres 16777216 Feb 12 11:30 000402B60019 -rw------- 1 postgres postgres 16777216 Feb 12 11:32 000402B6001A.partial postgres@xyz:~$ cat /var/log/pgcookbook/manage_pitr-wal.log | tail Thu Feb 12 11:15:18 PST 2015 ERROR manage_pitr.sh: Problem occured during WAL archiving: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402B60015.done": No such file or directory pg_receivexlog: disconnected.. Thu Feb 12 11:19:33 PST 2015 ERROR manage_pitr.sh: Problem occured during WAL archiving: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402B60016.done": No such file or directory pg_receivexlog: disconnected.. Thu Feb 12 11:23:38 PST 2015 ERROR manage_pitr.sh: Problem occured during WAL archiving: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402B60017.done": No such file or directory pg_receivexlog: disconnected.. Thu Feb 12 11:27:32 PST 2015 ERROR manage_pitr.sh: Problem occured during WAL archiving: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402B60018.done": No such file or directory pg_receivexlog: disconnected.. Thu Feb 12 11:30:34 PST 2015 ERROR manage_pitr.sh: Problem occured during WAL archiving: pg_receivexlog: could not create archive status file "/mnt/archive/wal/archive_status/000402B60019.done": No such file or directory pg_receivexlog: disconnected.. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] What exactly is our CRC algorithm?
On 02/11/2015 04:20 PM, Abhijit Menon-Sen wrote: At 2015-02-11 13:20:29 +0200, hlinnakan...@vmware.com wrote: I don't follow. I didn't change configure at all, compared to your patch. OK, I extrapolated a little too much. Your patch didn't actually include crc_instructions.h; Oh, I'm sorry. Here's the complete patch with crc_instructions.h - Heikki From bd4a90d339e21cd6ac517d077fe3a76abb5ef37d Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Tue, 10 Feb 2015 14:26:24 +0200 Subject: [PATCH 1/1] Use Intel SSE4.2 CRC instructions where available. On x86, perform a runtime check to see if we're running on a CPU that supports SSE 4.2. If we are, we can use the special crc32b and crc32q instructions for the CRC-32C calculations. That greatly speeds up CRC calculation. Abhijit Menon-Sen, reviewed by Andres Freund and me. --- configure | 2 +- configure.in| 2 +- src/common/pg_crc.c | 109 +--- src/include/common/pg_crc.h | 12 +++- src/include/pg_config.h.in | 3 + src/include/port/crc_instructions.h | 121 6 files changed, 235 insertions(+), 14 deletions(-) create mode 100644 src/include/port/crc_instructions.h diff --git a/configure b/configure index fa271fe..c352128 100755 --- a/configure +++ b/configure @@ -9204,7 +9204,7 @@ fi done -for ac_header in atomic.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h +for ac_header in atomic.h cpuid.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default" diff --git a/configure.in b/configure.in index e6a49d1..588d626 100644 --- a/configure.in +++ b/configure.in @@ -1032,7 +1032,7 @@ AC_SUBST(UUID_LIBS) ## dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES -AC_CHECK_HEADERS([atomic.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h]) +AC_CHECK_HEADERS([atomic.h cpuid.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h]) # On BSD, test for net/if.h will fail unless sys/socket.h # is included first. 
diff --git a/src/common/pg_crc.c b/src/common/pg_crc.c index eba32d3..b6db749 100644 --- a/src/common/pg_crc.c +++ b/src/common/pg_crc.c @@ -21,25 +21,113 @@ #include "common/pg_crc.h" -/* Accumulate one input byte */ -#ifdef WORDS_BIGENDIAN -#define CRC8(x) pg_crc32c_table[0][((crc >> 24) ^ (x)) & 0xFF] ^ (crc << 8) +#ifdef PG_HAVE_CRC32C_INSTRUCTIONS +static pg_crc32 pg_comp_crc32c_hw(pg_crc32 crc, const void *data, size_t len); +#endif + +#if !defined(PG_HAVE_CRC32C_INSTRUCTIONS) || defined(PG_CRC32C_INSTRUCTIONS_NEED_RUNTIME_CHECK) +static pg_crc32 pg_comp_crc32c_sb8(pg_crc32 crc, const void *data, size_t len); +static const uint32 pg_crc32c_table[8][256]; +#endif + +#ifdef PG_CRC32C_INSTRUCTIONS_NEED_RUNTIME_CHECK +/* + * When built with support for CRC instructions, but we need to perform a + * run-time check to determine whether we can actually use them, + * pg_comp_crc32c is a function pointer. It is initialized to + * pg_comp_crc32c_choose, which performs the runtime check, and changes the + * function pointer so that subsequent calls go directly to the hw-accelerated + * version, or the fallback slicing-by-8 version. + */ +static pg_crc32 +pg_comp_crc32c_choose(pg_crc32 crc, const void *data, size_t len) +{ + if (pg_crc32_instructions_runtime_check()) + pg_comp_crc32c = pg_comp_crc32c_hw; + else + pg_comp_crc32c = pg_comp_crc32c_sb8; + + return pg_comp_crc32c(crc, data, len); +} + +pg_crc32 (*pg_comp_crc32c)(pg_crc32 crc, const void *data, size_t len) = pg_comp_crc32c_choose; + #else -#define CRC8(x) pg_crc32c_table[0][(crc ^ (x)) & 0xFF] ^ (crc >> 8) +/* + * No need for a runtime check. Compile directly with the hw-accelerated + * or the slicing-by-8 version. (We trust that the compiler + * is smart enough to inline it here.) + */ +pg_crc32 +pg_comp_crc32c(pg_crc32 crc, const void *data, size_t len) +{ +#if
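The quoted patch is cut off before the new crc_instructions.h, so as a reader's aid, here is a hedged sketch of what an x86 runtime check such as pg_crc32_instructions_runtime_check() typically boils down to (a guess at the mechanism, not the file's actual contents): CPUID leaf 1 reports SSE 4.2 support in ECX bit 20, and GCC's cpuid.h wraps the instruction.

    #include <stdbool.h>
    #include <cpuid.h>

    /* Sketch: true if the CPU advertises SSE 4.2 (and thus crc32b/crc32q). */
    static bool
    have_sse42(void)
    {
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return false;
        return (ecx & (1 << 20)) != 0;  /* CPUID.01H:ECX bit 20 = SSE4.2 */
    }

The pg_comp_crc32c_choose() function-pointer trick shown above then only pays for this check on the very first CRC computation; every later call goes straight to the hardware or slicing-by-8 implementation.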
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
On February 12, 2015 8:11:05 PM CET, Sergey Konoplev wrote: >Hi, > >On Mon, Jan 5, 2015 at 4:34 AM, Andres Freund >wrote: >>> >> pg_receivexlog: could not create archive status file >>> >> "mmm/archive_status/00010003.done": No such file >or >>> >> directory >>> > >>> > Dang. Stupid typo. And my tests didn't catch it, because I had >>> > archive_directory in the target directory :( > >I started getting these errors after upgrading from 9.2.8 to 9.2.10. >Is it something critical that requires a version downgrade, or can I just >ignore those errors? What errors are you getting in precisely which circumstances? You're using pg_receivexlog? -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
Hi, On Mon, Jan 5, 2015 at 4:34 AM, Andres Freund wrote: >> >> pg_receivexlog: could not create archive status file >> >> "mmm/archive_status/00010003.done": No such file or >> >> directory >> > >> > Dang. Stupid typo. And my tests didn't catch it, because I had >> > archive_directory in the target directory :( I started getting these errors after upgrading from 9.2.8 to 9.2.10. Is it something critical that requires a version downgrade, or can I just ignore those errors? -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979 gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: binworld and install-binworld targets - was Re: [HACKERS] Release note bloat is getting out of hand
On 2/4/15 8:20 PM, Andrew Dunstan wrote: > > On 02/04/2015 06:53 PM, Tom Lane wrote: >>> Or maybe use a make variable, like NO_DOC. I think that's preferable to >>> adding more targets. >> Unless we can come up with a new target name that obviously means >> "world minus docs", the make-variable idea may be the best. > I'm not terribly keen on this. If you don't like "binworld", how about > "world-no-docs"? I think using options of some kind instead of top-level targets is preferable. If we add world-no-docs, should we also add install-world-no-docs, installdirs-world-no-docs, uninstall-world-no-docs, check-world-no-docs, installcheck-world-no-docs, clean-no-docs, distclean-no-docs, etc.? This would get out of hand. Also, it's harder to port things like that to other build systems, including the secondary ones we already have. We already have configure options to decide that we don't want to deal with part of the tree. (There is no make world-no-python.) We used to have support in configure to not build part of the docs. We could resurrect that if that's what people want. I'd actually prefer that even over a make variable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On 2/3/15 11:00 AM, Robert Haas wrote: > Crazy ideas: Could we make wal_level something other than > PGC_POSTMASTER? PGC_SIGHUP would be nice... Could we, maybe, even > make it a derived value rather than one that is explicitly configured? > Like, if you set max_wal_senders>0, you automatically get > wal_level=hot_standby? If you register a logical replication slot, > you automatically get wal_level=logical? We could probably make wal_level changeable at run-time if we somehow recorded the point at which it was changed, as you describe later (or even brute-force it by forcing a checkpoint every time it is changed, which is not worse than what we require now (or even just write out a warning that the setting is not effective until after a checkpoint)). But that still leaves max_wal_senders (and arguably max_replication_slots) requiring a restart before replication can start. I don't see a great plan for those on the horizon. To me, the restart requirement is the killer. That there are so many interwoven settings isn't great either, but there will always be more options, and all we can do is manage it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans for GiST.
On Thu, Feb 12, 2015 at 3:07 PM, Thom Brown wrote: > > On 12 February 2015 at 16:40, Heikki Linnakangas wrote: >> >> On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote: >>> >>> Thanks for the answer. >>> Now it seems to be applied correctly. >> >> >> * Documentation is missing. > > > Anastasia provided a documentation patch in the first email. > Yeah, but it's good practice to send all the patches together. It will make the life of the reviewers better ;-) Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Index-only scans for GiST.
On 12 February 2015 at 16:40, Heikki Linnakangas wrote: > On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote: > >> Thanks for the answer. >> Now it seems to be applied correctly. >> > > * Documentation is missing. > Anastasia provided a documentation patch in the first email. Thom
Re: [HACKERS] assessing parallel-safety
On Wed, Feb 11, 2015 at 3:21 PM, Robert Haas wrote: > I think we may want a dedicated parallel-safe property for functions > rather than piggybacking on provolatile ... I went through the current contents of pg_proc and tried to assess how much parallel-unsafe stuff we've got. I think there are actually three categories of things: (1) functions that can be called in parallel mode either in the worker or in the leader ("parallel safe"), (2) functions that can be called in parallel mode in the leader, but not in a worker ("parallel restricted"), and (3) functions that cannot be called in parallel mode at all ("parallel unsafe"). On a first read-through, the number of things that looked to be anything other than parallel-safe seemed fairly small; many of these could be made parallel-safe with more work, but it's unlikely to be worth the effort. current_query() - Restricted because debug_query_string is not copied. lo_open(), lo_close(), loread(), lowrite(), and other large object functions - Restricted because large object state is not shared. age(xid) - Restricted because it uses a transaction-lifespan cache which is not shared. now() - Restricted because transaction start timestamp is not copied. statement_timestamp() - Restricted because statement start timestamp is not copied. pg_conf_load_time() - Restricted because PgReloadTime is not copied. nextval(), currval() - Restricted because sequence-related state is not shared. setval() - Unsafe because no data can be written in parallel mode. random(), setseed() - Restricted because random seed state is not shared. (We could alternatively treat setseed() as unsafe and random() as restricted only in sessions where setseed() has never been called, and otherwise safe.) pg_stat_get_* - Restricted because there's no guarantee the value would be the same in the parallel worker as in the leader. pg_backend_pid() - Restricted because the worker has a different PID. set_config() - Unsafe because GUC state must remain synchronized. pg_my_temp_schema() - Restricted because temporary namespaces aren't shared with parallel workers. pg_export_snapshot() - Restricted because the worker will go away quickly. pg_prepared_statement(), pg_cursor() - Restricted because the prepared statements and cursors are not synchronized with the worker. pg_listening_channels() - Restricted because listening channels are not synchronized with the worker. pg*advisory*lock*() - Restricted because advisory lock state is not shared with workers - and even if it were, the semantics would be hard to reason about. txid_current() - Unsafe because it might attempt XID assignment. pg_logical_slot*() - Unsafe because they do all kinds of crazy stuff. That's not a lot, and very little of it is anything you'd care about parallelizing anyway. I expect that the incidence of user-written parallel-unsafe functions will be considerably higher. I'm not sure if this impacts the decision about how to design the facility for assessing parallel-safety or not, but I thought it was worth sharing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
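One obvious way to encode those three categories as the "dedicated parallel-safe property" mentioned at the top of this message would be a single char column in pg_proc, by analogy with provolatile's 'i'/'s'/'v'. Purely an illustrative sketch, with invented names, not a posted design:

    /* Hypothetical pg_proc markings, by analogy with provolatile. */
    #define PROPARALLEL_SAFE        's' /* OK anywhere in parallel mode */
    #define PROPARALLEL_RESTRICTED  'r' /* OK in the leader, not in workers */
    #define PROPARALLEL_UNSAFE      'u' /* not OK in parallel mode at all */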
Re: [HACKERS] Index-only scans for GiST.
On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote: Thanks for the answer. Now it seems to be applied correctly. Thanks, it would be great to get this completed. This still leaks memory with the same test query as earlier. The leak seems to be into a different memory context now; it used to be to the "GiST scan context", but now it's to the ExecutorState context. See attached patch which makes the leak evident. Looking at my previous comments from August: * What's the reason for turning GISTScanOpaqueData.pageData from an array to a List? * Documentation is missing. - Heikki diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c index 0925e56..3768c9c 100644 --- a/src/backend/access/gist/gistget.c +++ b/src/backend/access/gist/gistget.c @@ -526,6 +526,12 @@ gistgettuple(PG_FUNCTION_ARGS) * It's always head of so->pageData */ so->pageData = list_delete_first(so->pageData); +{ + static int lastreport = 0; + if ((lastreport++) % 1 == 0) + MemoryContextStats(CurrentMemoryContext); +} + PG_RETURN_BOOL(TRUE); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 6:40 AM, Amit Kapila wrote: > If we have to go this way, then isn't it better to evaluate the same > when we are trying to create parallel path (something like in the > parallel_seq scan patch - create_parallelscan_paths())? Probably not, because many queries will scan multiple relations, and we want to do all of this work just once per query. Also, when somebody adds another parallel node (e.g. parallel hash join) that will need this same information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans for GiST.
On 12 February 2015 at 10:40, Anastasia Lubennikova wrote: > Thanks for the answer. > Now it seems to be applied correctly. > (please avoid top-posting) Thanks for the updated patch. I can confirm that it now cleanly applies and compiles fine. I've run the tests in the SQL file you provided, and it's behaving as expected. Thom
Re: [HACKERS] Standby receiving part of missing WAL segment
On Thu, Feb 12, 2015 at 9:56 AM, Thom Brown wrote: > On 12 February 2015 at 13:56, Robert Haas wrote: >> >> On Wed, Feb 11, 2015 at 12:55 PM, Thom Brown wrote: >> > Today I witnessed a situation which appears to have gone down like this: >> > >> > - The primary server starting streaming WAL data from segment 00A8 to >> > the >> > standby >> > - The standby server started receiving that data >> > - Before 00A8 is finished, the wal sender process dies on the primary, >> > but >> > the archiver process continues, and 00A8 ends up being archived as usual >> > - The primary continues to generate WAL and cleans up old WAL files from >> > pg_xlog until 00A8 is gone. >> > - The primary is restarted and the wal sender process is back up and >> > running >> > - The standby says "waiting for 00A8", which it can no longer get from >> > the >> > primary >> > - 00A8 is in the standby's archive directory, but the standby is waiting >> > for >> > the rest of the segment from the primary via streaming replication, so >> > doesn't check the archive >> > - The standby is restarted >> > - The standby goes back into recovery and eventually replays 00A8 and >> > continues as normal. >> > >> > Should the standby be able to get feedback from the primary that the >> > requested segment is no longer available, and therefore know to check >> > its >> > archive? >> >> Last time I played around with this, if the standby requested a >> segment from the master that was no longer present there, the standby >> would immediately get an ERROR, which it seems like would get you out >> of trouble. I wonder why that didn't happen in your case. > > > Yeah, I've tried recreating this like so: > > - Primary streams to standby like usual > - Kill -9 primary then change its port and bring it back up > - Create traffic on primary until it no longer has the WAL file the standby > wants, but has archived it > - Change the port of the primary back to what the standby is trying to talk > to > > But before it gets to that 4th point, the standby has gone to the archive > for the rest of it: > > cp: cannot stat ‘/tmp/walarch/0001000600C8’: No such file or > directory > 2015-02-12 14:47:52 GMT [8280]: [1-1] user=,db=,client= FATAL: could not > connect to the primary server: could not connect to server: Connection > refused > Is the server running on host "127.0.0.1" and accepting > TCP/IP connections on port 5488? > > cp: cannot stat ‘/tmp/walarch/0001000600C8’: No such file or > directory > 2015-02-12 14:47:57 GMT [8283]: [1-1] user=,db=,client= FATAL: could not > connect to the primary server: could not connect to server: Connection > refused > Is the server running on host "127.0.0.1" and accepting > TCP/IP connections on port 5488? > > 2015-02-12 14:48:02 GMT [8202]: [6-1] user=,db=,client= LOG: restored log > file "0001000600C8" from archive > > I don't suppose this is something that was buggy in 9.3.1? I don't know. I don't remember a bug of that type, but I might have missed it. If you see the scenario again, it might help to pull a stack trace from the master and the standby. That might tell us whether the master or the standby is to blame, and where the offender is wedged. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
On Thu, Feb 12, 2015 at 9:50 AM, Tom Lane wrote: >> My first thought is that we should form some kind of TOAST-like >> backronym, like Serialization Avoidance Loading and Access Device >> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I >> don't think there is anything per se wrong with the terms >> serialization and deserialization; indeed, I used the same ones in the >> parallel-mode stuff. But they are fairly general terms, so it might >> be nice to have something more specific that applies just to this >> particular usage. > > Hm. I'm not against the concept, but those particular suggestions don't > grab me. Fair enough. I guess the core of my point is just that I suggest we invent a name for this thing. "Serialize" and "deserialize" describe what you are doing just fine, but the mechanism itself should be called something, I think. When you say "varlena header" or "TOAST pointer" that is a name for a very particular thing, not just a general category of things you might do. If we replaced every instance of "TOAST pointer" to "reference to where the full value is stored", it would be much less clear, and naming all of the related functions would be harder. >> This is a clever >> representation, but it's hard to wrap your head around, and I'm not >> sure "primary" and "secondary" are the best names, although I don't >> have an idea as to what would be better. I'm a bit confused, though: >> once you give out a secondary pointer, how is it safe to write the >> object through the primary pointer? > > It's no different from allowing plpgsql to update the values of variables > of pass-by-reference types even though it has previously given out Datums > that are pointers to them: by the time we're ready to execute an > assignment, any query execution that had such a pointer is over and done > with. (This implies that cursor parameters have to be physically copied > into the cursor's execution state, which is one of a depressingly large > number of reasons why datumCopy() has to physically copy a deserialized > value rather than just copying the pointer. But otherwise it works.) OK, I see. So giving out a secondary pointer doesn't necessarily preclude further changes via the primary pointer, but you'd better be sure that you don't try until such time as all of those secondary references are gone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC 2015 - mentors, students and admins.
Hi! On Mon, Feb 9, 2015 at 11:52 PM, Thom Brown wrote: > Google Summer of Code 2015 is approaching. I'm intending on registering > PostgreSQL again this year. > > Before I do that, I'd like to have an idea of how many people are > interested in being either a student or a mentor. > I'm ready to participate as mentor again! -- With best regards, Alexander Korotkov.
Re: [HACKERS] Standby receiving part of missing WAL segment
On 12 February 2015 at 13:56, Robert Haas wrote: > On Wed, Feb 11, 2015 at 12:55 PM, Thom Brown wrote: > > Today I witnessed a situation which appears to have gone down like this: > > > > - The primary server starting streaming WAL data from segment 00A8 to the > > standby > > - The standby server started receiving that data > > - Before 00A8 is finished, the wal sender process dies on the primary, > but > > the archiver process continues, and 00A8 ends up being archived as usual > > - The primary continues to generate WAL and cleans up old WAL files from > > pg_xlog until 00A8 is gone. > > - The primary is restarted and the wal sender process is back up and > running > > - The standby says "waiting for 00A8", which it can no longer get from > the > > primary > > - 00A8 is in the standby's archive directory, but the standby is waiting > for > > the rest of the segment from the primary via streaming replication, so > > doesn't check the archive > > - The standby is restarted > > - The standby goes back into recovery and eventually replays 00A8 and > > continues as normal. > > > > Should the standby be able to get feedback from the primary that the > > requested segment is no longer available, and therefore know to check its > > archive? > > Last time I played around with this, if the standby requested a > segment from the master that was no longer present there, the standby > would immediately get an ERROR, which it seems like would get you out > of trouble. I wonder why that didn't happen in your case. Yeah, I've tried recreating this like so: - Primary streams to standby like usual - Kill -9 primary then change its port and bring it back up - Create traffic on primary until it no longer has the WAL file the standby wants, but has archived it - Change the port of the primary back to what the standby is trying to talk to But before it gets to that 4th point, the standby has gone to the archive for the rest of it: cp: cannot stat ‘/tmp/walarch/0001000600C8’: No such file or directory 2015-02-12 14:47:52 GMT [8280]: [1-1] user=,db=,client= FATAL: could not connect to the primary server: could not connect to server: Connection refused Is the server running on host "127.0.0.1" and accepting TCP/IP connections on port 5488? cp: cannot stat ‘/tmp/walarch/0001000600C8’: No such file or directory 2015-02-12 14:47:57 GMT [8283]: [1-1] user=,db=,client= FATAL: could not connect to the primary server: could not connect to server: Connection refused Is the server running on host "127.0.0.1" and accepting TCP/IP connections on port 5488? 2015-02-12 14:48:02 GMT [8202]: [6-1] user=,db=,client= LOG: restored log file "0001000600C8" from archive I don't suppose this is something that was buggy in 9.3.1? Thom
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
Robert Haas writes: > On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane wrote: >> BTW, I'm not all that thrilled with the "deserialized object" terminology. >> I found myself repeatedly tripping up on which form was serialized and >> which de-. If anyone's got a better naming idea I'm willing to adopt it. > My first thought is that we should form some kind of TOAST-like > backronym, like Serialization Avoidance Loading and Access Device > (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I > don't think there is anything per se wrong with the terms > serialization and deserialization; indeed, I used the same ones in the > parallel-mode stuff. But they are fairly general terms, so it might > be nice to have something more specific that applies just to this > particular usage. Hm. I'm not against the concept, but those particular suggestions don't grab me. > I found the notion of "primary" and "secondary" TOAST pointers to be > quite confusing. I *think* what you are doing is storing two pointers > to the object in the object, and a pointer to the object is really a > pointer to one of those two pointers to the object. Depending on > which one it is, you can write the object, or not. There's more to it than that. (Writing more docs is one of the to-do items ;-).) We could alternatively have done that with two different va_tag values for "read write" and "read only", which indeed was my initial intention before I thought of this dodge. However, then you have to figure out where to store such pointers, which is problematic both for plpgsql variable assignment and for ExecMakeSlotContentsReadOnly, especially the latter which would have to put any freshly-made pointer in a long-lived context resulting in query-lifespan memory leaks. So I early decided that the read-write pointer should live right in the object's own context where it need not be copied when swinging the context ownership someplace else, and later realized that there should also be a permanent read-only pointer in there for the use of ExecMakeSlotContentsReadOnly, and then realized that they didn't need to have different va_tag values if we implemented the "is read-write pointer" test as it's done in the patch. Having only one va_tag value not two saves cycles, I think, because there are a lot of low-level tests that don't need to distinguish, eg VARTAG_SIZE(). However it does make it more expensive when you do need to distinguish, so I might reconsider that decision later. (Since these will never go to disk, we can whack the representation around pretty freely if needed.) Also, I have hopes of allowing deserialized-object pointers to be copied into tuples as pointers rather than by reserialization, if we can establish that the tuple is short-lived enough that the pointer will stay good, which would be true in a lot of cases during execution of queries by plpgsql. With the patch's design, a pointer so copied will automatically be considered read-only, which I *think* is the behavior we'd need. If it turns out that it's okay to propagate read-write-ness through such a copy step then that would argue in favor of using two va_tag values. It may be that this solution is overly cute and we should just use two tag values. But I wanted to be sure it was possible for copying of a pointer to automatically lose read-write-ness, in case we need to have such a guarantee. 
> This is a clever > representation, but it's hard to wrap your head around, and I'm not > sure "primary" and "secondary" are the best names, although I don't > have an idea as to what would be better. I'm a bit confused, though: > once you give out a secondary pointer, how is it safe to write the > object through the primary pointer? It's no different from allowing plpgsql to update the values of variables of pass-by-reference types even though it has previously given out Datums that are pointers to them: by the time we're ready to execute an assignment, any query execution that had such a pointer is over and done with. (This implies that cursor parameters have to be physically copied into the cursor's execution state, which is one of a depressingly large number of reasons why datumCopy() has to physically copy a deserialized value rather than just copying the pointer. But otherwise it works.) There is more work to do to figure out how we can safely give out a read/write pointer for cases like hstore_var := hstore_concat(hstore_var, ...); Aside from the question of whether hstore_concat guarantees not to trash the value on failure, we'd have to restrict this (I think) to expressions in which there is only one reference to the target variable and it's an argument of the topmost function/operator. But that's something I've not tried to implement yet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
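Restating the layout in code form may help — a minimal sketch with invented names, not the patch's actual structs. The object embeds both self-pointers, and a Datum referencing the object actually points at one of those two fields, so "read-write-ness" can be tested, and is automatically lost when a pointer is handed out through the read-only slot:

    #include <stdbool.h>

    typedef struct DeserializedHeader
    {
        struct DeserializedHeader *rw_self; /* "primary": grants write access */
        struct DeserializedHeader *ro_self; /* "secondary": read-only alias */
        /* ... object body follows ... */
    } DeserializedHeader;

    /* A datum is really a pointer to one of the two fields above. */
    static inline DeserializedHeader *
    object_of(void *datum_ptr)
    {
        return *(DeserializedHeader **) datum_ptr;
    }

    static inline bool
    is_read_write(void *datum_ptr)
    {
        DeserializedHeader *obj = object_of(datum_ptr);

        return datum_ptr == (void *) &obj->rw_self;
    }

On this reading, handing out &obj->ro_self yields a pointer that dereferences to the same object but fails the is_read_write() test — which is how a copied pointer "automatically loses read-write-ness," as described above, with a single va_tag value.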
Re: [HACKERS] Logical decoding document
>> Hi, I need help. >> >> In "46.6.4.5 Change Callback" >> >> Note: Only changes in user defined tables that are not unlogged >> (see UNLOGGED) and not temporary (see TEMPORARY or TEMP) can be >> extracted using logical decoding. >> >> I cannot parse the sentence above. Maybe logical decoding does not >> decode a table if it is an unlogged table or a temporary table? > > It basically is saying that you won't see changes made to temporary > and/or unlogged tables. But the begin/commit callbacks will still be > called for the relevant transaction. Oh, thanks. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_check_dir comments and implementation mismatch
On 02/02/15 21:48, Robert Haas wrote: > On Sat, Jan 31, 2015 at 8:28 AM, Marco Nenciarini > wrote: >> On 30/01/15 03:54, Michael Paquier wrote: >>> On Fri, Jan 30, 2015 at 2:49 AM, Tom Lane wrote: There is at least one other bug in that function now that I look at it: in event of a readdir() failure, it neglects to execute closedir(). Perhaps not too significant since all existing callers will just exit() anyway after a failure, but still ... >>> I would imagine that code scanners like coverity or similar would not >>> be happy about that. ISTM that it is better to closedir() >>> appropriately in all the code paths. >>> >> >> I've attached a new version of the patch fixing the missing closedir on >> readdir error. > > If readdir() fails and closedir() succeeds, the return will be -1 but > errno will be 0. > The attached patch should fix it. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it diff --git a/src/port/pgcheckdir.c b/src/port/pgcheckdir.c index 7061893..7102f2c 100644 *** a/src/port/pgcheckdir.c --- b/src/port/pgcheckdir.c *** *** 22,28 * Returns: * 0 if nonexistent * 1 if exists and empty ! * 2 if exists and not empty * -1 if trouble accessing directory (errno reflects the error) */ int --- 22,30 * Returns: * 0 if nonexistent * 1 if exists and empty ! * 2 if exists and contains _only_ dot files ! * 3 if exists and contains a mount point ! * 4 if exists and not empty * -1 if trouble accessing directory (errno reflects the error) */ int *** pg_check_dir(const char *dir) *** 32,37 --- 34,41 DIR *chkdir; struct dirent *file; bool dot_found = false; + bool mount_found = false; + int readdir_errno; chkdir = opendir(dir); if (chkdir == NULL) *** pg_check_dir(const char *dir) *** 51,60 { dot_found = true; } else if (strcmp("lost+found", file->d_name) == 0) { ! result = 3; /* not empty, mount point */ ! break; } #endif else --- 55,64 { dot_found = true; } + /* lost+found directory */ else if (strcmp("lost+found", file->d_name) == 0) { ! mount_found = true; } #endif else *** pg_check_dir(const char *dir) *** 64,72 } } ! if (errno || closedir(chkdir)) result = -1; /* some kind of I/O error? */ /* We report on dot-files if we _only_ find dot files */ if (result == 1 && dot_found) result = 2; --- 68,87 } } ! if (errno) result = -1; /* some kind of I/O error? */ + /* Close chkdir and avoid overwriting the readdir errno on success */ + readdir_errno = errno; + if (closedir(chkdir)) + result = -1; /* error executing closedir */ + else + errno = readdir_errno; + + /* We report on mount point if we find a lost+found directory */ + if (result == 1 && mount_found) + result = 3; + /* We report on dot-files if we _only_ find dot files */ if (result == 1 && dot_found) result = 2;
Re: [HACKERS] Simplify sleeping while reading/writing from client
On 02/06/2015 11:50 AM, Andres Freund wrote: On 2015-02-05 16:45:50 +0200, Heikki Linnakangas wrote: I propose the attached, which pulls all the wait-retry logic up to secure_read() and secure_write(). This makes the code a lot more understandable. Generally a good idea. Especially if we get more ssl implementations. But I think it'll make the already broken renegotiation handling even worse. Because now we're always entering do_handshake in nonblocking mode (essentially). Good point. In the other thread, we concluded that the SSL_do_handshake() call isn't really needed anyway, so attached are two patches: 1. remove the SSL_do_handshake() calls, and 2. the previous patch, to simplify the waiting logic. - Heikki From 30e5930c9da553d504339e00a05a2a8892182f63 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 12 Feb 2015 15:53:26 +0200 Subject: [PATCH 1/2] Simplify the way OpenSSL renegotiation is initiated in server. At least in all modern versions of OpenSSL, it is enough to call SSL_renegotiate() once, and then forget about it. Subsequent SSL_write() and SSL_read() calls will finish the handshake. The SSL_set_session_id_context() call is unnecessary too. We only have one SSL context, and the SSL session was created with that to begin with. --- src/backend/libpq/be-secure-openssl.c | 23 --- 1 file changed, 23 deletions(-) diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index d5f9712..d13ce33 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -624,33 +624,10 @@ be_tls_write(Port *port, void *ptr, size_t len) */ SSL_clear_num_renegotiations(port->ssl); - SSL_set_session_id_context(port->ssl, (void *) &SSL_context, - sizeof(SSL_context)); if (SSL_renegotiate(port->ssl) <= 0) ereport(COMMERROR, (errcode(ERRCODE_PROTOCOL_VIOLATION), errmsg("SSL failure during renegotiation start"))); - else - { - int retries; - - /* - * A handshake can fail, so be prepared to retry it, but only - * a few times. - */ - for (retries = 0;; retries++) - { -if (SSL_do_handshake(port->ssl) > 0) - break; /* done */ -ereport(COMMERROR, - (errcode(ERRCODE_PROTOCOL_VIOLATION), - errmsg("SSL handshake failure on renegotiation, retrying"))); -if (retries >= 20) - ereport(FATAL, - (errcode(ERRCODE_PROTOCOL_VIOLATION), - errmsg("could not complete SSL handshake on renegotiation, too many failures"))); - } - } } wloop: -- 2.1.4 From 6341bd76e8184c5c464c90a1ba3d99f02ef61bac Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 5 Feb 2015 16:44:13 +0200 Subject: [PATCH 2/2] Simplify waiting logic in reading from / writing to client. The client socket is always in non-blocking mode, and if we actually want blocking behaviour, we emulate it by sleeping and retrying. But we retry loops at different layers for reads and writes, which was confusing. To simplify, remove all the sleeping and retrying code from the lower levels, from be_tls_read and secure_raw_read and secure_raw_write, and put all the logic in secure_read() and secure_write(). 
--- src/backend/libpq/be-secure-openssl.c | 81 +-- src/backend/libpq/be-secure.c | 143 ++ src/backend/libpq/pqcomm.c| 3 +- src/include/libpq/libpq-be.h | 4 +- 4 files changed, 79 insertions(+), 152 deletions(-) diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index d13ce33..37af6e4 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -511,14 +511,11 @@ be_tls_close(Port *port) * Read data from a secure connection. */ ssize_t -be_tls_read(Port *port, void *ptr, size_t len) +be_tls_read(Port *port, void *ptr, size_t len, int *waitfor) { ssize_t n; int err; - int waitfor; - int latchret; -rloop: errno = 0; n = SSL_read(port->ssl, ptr, len); err = SSL_get_error(port->ssl, n); @@ -528,39 +525,15 @@ rloop: port->count += n; break; case SSL_ERROR_WANT_READ: + *waitfor = WL_SOCKET_READABLE; + errno = EWOULDBLOCK; + n = -1; + break; case SSL_ERROR_WANT_WRITE: - /* Don't retry if the socket is in nonblocking mode. */ - if (port->noblock) - { -errno = EWOULDBLOCK; -n = -1; -break; - } - - waitfor = WL_LATCH_SET; - - if (err == SSL_ERROR_WANT_READ) -waitfor |= WL_SOCKET_READABLE; - else -waitfor |= WL_SOCKET_WRITEABLE; - - latchret = WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0); - - /* - * We'll, among other situations, get here if the low level - * routine doing the actual recv() via the socket got interrupted - * by a signal. That's so we can handle interrupts once outside - * openssl, so we don't jump out from underneath its covers. We - * can check this both, when reading and writing,
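The second patch's overall shape, restated since its text above is truncated: after the change, the TLS layer merely reports EWOULDBLOCK plus which wait event it needs, and secure_read()/secure_write() own the single retry loop. A sketch in C (based on the hunks shown, not a verbatim excerpt; interrupt handling elided):

    #include "postgres.h"
    #include "libpq/libpq-be.h"
    #include "miscadmin.h"
    #include "storage/latch.h"

    /* Sketch of the retry loop the patch centralizes in secure_read(). */
    static ssize_t
    secure_read_sketch(Port *port, void *ptr, size_t len)
    {
        ssize_t     n;
        int         waitfor;

    retry:
        waitfor = 0;
        n = be_tls_read(port, ptr, len, &waitfor); /* sets waitfor on EWOULDBLOCK */
        if (n < 0 && errno == EWOULDBLOCK && !port->noblock)
        {
            WaitLatchOrSocket(MyLatch, WL_LATCH_SET | waitfor, port->sock, 0);
            /* latch/interrupt handling would go here, outside OpenSSL */
            goto retry;
        }
        return n;
    }

The benefit is that all the sleeping happens in one place, outside the SSL library's covers, rather than in parallel retry loops inside be_tls_read() and be_tls_write().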
Re: [HACKERS] Logical decoding document
Hi, On 2015-02-12 22:55:33 +0900, Tatsuo Ishii wrote: > Hi, I need help. > > In "46.6.4.5 Change Callback" > > Note: Only changes in user defined tables that are not unlogged > (see UNLOGGED) and not temporary (see TEMPORARY or TEMP) can be > extracted using logical decoding. > > I cannot parse the sentence above. Maybe logical decoding does not > decode a table if it is an unlogged table or a temporary table? It basically is saying that you won't see changes made to temporary and/or unlogged tables. But the begin/commit callbacks will still be called for the relevant transaction. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Standby receiving part of missing WAL segment
On Wed, Feb 11, 2015 at 12:55 PM, Thom Brown wrote: > Today I witnessed a situation which appears to have gone down like this: > > - The primary server starting streaming WAL data from segment 00A8 to the > standby > - The standby server started receiving that data > - Before 00A8 is finished, the wal sender process dies on the primary, but > the archiver process continues, and 00A8 ends up being archived as usual > - The primary continues to generate WAL and cleans up old WAL files from > pg_xlog until 00A8 is gone. > - The primary is restarted and the wal sender process is back up and running > - The standby says "waiting for 00A8", which it can no longer get from the > primary > - 00A8 is in the standby's archive directory, but the standby is waiting for > the rest of the segment from the primary via streaming replication, so > doesn't check the archive > - The standby is restarted > - The standby goes back into recovery and eventually replays 00A8 and > continues as normal. > > Should the standby be able to get feedback from the primary that the > requested segment is no longer available, and therefore know to check its > archive? Last time I played around with this, if the standby requested a segment from the master that was no longer present there, the standby would immediately get an ERROR, which it seems like would get you out of trouble. I wonder why that didn't happen in your case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Logical decoding document
Hi, I need help. In "46.6.4.5 Change Callback" Note: Only changes in user defined tables that are not unlogged (see UNLOGGED) and not temporary (see TEMPORARY or TEMP) can be extracted using logical decoding. I cannot parse the sentence above. Maybe logical decoding does not decode a table if it is an unlogged table or a temporary table? Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane wrote: > I've now taken this idea as far as building the required infrastructure > and revamping a couple of array operators to use it. There's a lot yet > to do, but I've done enough to get some preliminary ideas about > performance (see below). Very impressive. This is something that's been mentioned before and which I always thought would be great to have, but I didn't expect it would be this easy to cobble together a working implementation. Or maybe "easy" isn't the right term, but this isn't a very big patch. > BTW, I'm not all that thrilled with the "deserialized object" terminology. > I found myself repeatedly tripping up on which form was serialized and > which de-. If anyone's got a better naming idea I'm willing to adopt it. My first thought is that we should form some kind of TOAST-like backronym, like Serialization Avoidance Loading and Access Device (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I don't think there is anything per se wrong with the terms serialization and deserialization; indeed, I used the same ones in the parallel-mode stuff. But they are fairly general terms, so it might be nice to have something more specific that applies just to this particular usage. I found the notion of "primary" and "secondary" TOAST pointers to be quite confusing. I *think* what you are doing is storing two pointers to the object in the object, and a pointer to the object is really a pointer to one of those two pointers to the object. Depending on which one it is, you can write the object, or not. This is a clever representation, but it's hard to wrap your head around, and I'm not sure "primary" and "secondary" are the best names, although I don't have an idea as to what would be better. I'm a bit confused, though: once you give out a secondary pointer, how is it safe to write the object through the primary pointer? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File based Incremental backup v8
Hi, I've attached an updated version of the patch. This fixes the issue with checksum calculation for segments after the first one. To solve it I've added an optional uint32 *segno argument to parse_filename_for_nontemp_relation, so I can know the segment number and calculate the block number correctly. On 29/01/15 18:57, Robert Haas wrote: > On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini > wrote: >> The current implementation of copydir function is incompatible with LSN >> based incremental backups. The problem is that new files are created, >> but their blocks are still with the old LSN, so they will not be backed >> up because they are looking old enough. > > I think this is trying to pollute what's supposed to be a pure > fs-level operation ("copy a directory") into something that is aware > of specific details like the PostgreSQL page format. I really think > that nothing in storage/file should know about the page format. If we > need a function that copies a file while replacing the LSNs, I think > it should be a new function living somewhere else. > I've named it copydir_set_lsn and placed it as a static function in dbcommands.c. This leaves the copydir and copy_file functions in copydir.c untouched. The copydir function in copydir.c is now unused, while the copy_file function is still used during unlogged-table reinit. > A bigger problem is that you are proposing to stamp those files with > LSNs that are, for lack of a better word, fake. I would expect that > this would completely break if checksums are enabled. Also, unlogged > relations typically have an LSN of 0; this would change that in some > cases, and I don't know whether that's OK. > I've investigated a bit and I have not been able to find any problem here. > The issues here are similar to those in > http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de > - basically, I think we need to make CREATE DATABASE and ALTER > DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is > never going to work right. If we're not going to allow that, we need > to disallow hot backups while those operations are in progress. > As already said, the copydir-LSN patch should be treated as a "temporary" measure until proper WAL logging of CREATE DATABASE and ALTER DATABASE SET TABLESPACE is implemented. At that time we could probably get rid of the whole copydir.[ch] file, moving the copy_file function into reinit.c. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index afd9255..02b5fee 100644 *** a/src/backend/storage/file/reinit.c --- b/src/backend/storage/file/reinit.c *** 28,35 **** static void ResetUnloggedRelationsInTablespaceDir(const char *tsdirname, int op); static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op); - static bool parse_filename_for_nontemp_relation(const char *name, - int *oidchars, ForkNumber *fork); typedef struct { --- 28,33 ---- *** ResetUnloggedRelationsInDbspaceDir(const *** 388,446 fsync_fname((char *) dbspacedirname, true); } } - - /* - * Basic parsing of putative relation filenames. - * - * This function returns true if the file appears to be in the correct format - * for a non-temporary relation and false otherwise.
- * - * NB: If this function returns true, the caller is entitled to assume that - * *oidchars has been set to the a value no more than OIDCHARS, and thus - * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID - * portion of the filename. This is critical to protect against a possible - * buffer overrun. - */ - static bool - parse_filename_for_nontemp_relation(const char *name, int *oidchars, - ForkNumber *fork) - { - int pos; - - /* Look for a non-empty string of digits (that isn't too long). */ - for (pos = 0; isdigit((unsigned char) name[pos]); ++pos) - ; - if (pos == 0 || pos > OIDCHARS) - return false; - *oidchars = pos; - - /* Check for a fork name. */ - if (name[pos] != '_') - *fork = MAIN_FORKNUM; - else - { - int forkchar; - - forkchar = forkname_chars(&name[pos + 1], fork); - if (forkchar <= 0) - return false; - pos += forkchar + 1; - } - - /* Check for a segment number. */ - if (name[pos] == '.') - { - int segchar; - -
Re: [HACKERS] SSL renegotiation and other related woes
On 02/12/2015 01:33 AM, Andres Freund wrote:
> On 2015-02-11 14:54:03 +0200, Heikki Linnakangas wrote:
>> Thoughts?
>
> Can you reproduce any errors with this? I'm on battery right now, so I
> can't really test much. Did you test having an actual standby instead of
> pg_receivexlog? I saw some slightly different errors there.
>
> Does this patch alone work for you or did you test it together with the
> earlier one to fix the renegotiation handling server side when
> nonblocking? Because that frequently seemed to error out for me, at
> least over actual network. A latch loop + checking for the states seemed
> to fix that for me.

This patch alone worked for me.

> + * NB: This relies on the calling code to call pqsecure_read(), completing
> + * the renegotiation handshake, if pqsecure_write() returns 0. Otherwise
> + * we'll never make progress.
> + */
>
> I think this should a) refer to the fact that pqSendSome does that in
> blocking scenarios b) expand the comment around the pqReadData to
> reference that fact.
>
> How are we going to deal with callers using libpq in nonblocking mode? A
> client doing a large COPY FROM STDIN to the server using a nonblocking
> connection (not a stupid thing to do) could hit this IIUC.

Hmm, that's problematic even without SSL. If you do a large COPY FROM STDIN, and the server sends a lot of NOTICEs, the NOTICEs will be accumulated in the TCP send and receive buffers until they fill up. After that, the server will block on the send, and will stop reading the tuple data the client sends, and you get a deadlock. In blocking mode, pqSendSome calls pqReadData to avoid that.

I think pqSendSome should call pqReadData() after getting EWOULDBLOCK, even in non-blocking mode. It won't completely prevent the problem: it's possible that the receive buffer fills up, but the application doesn't call pqSendSome() until the socket becomes writeable, which won't happen until the client drains the incoming data and unblocks the server. But it greatly reduces the risk. In practice that will solve the problem for SSL renegotiations.

We should also document that the application should always wait for the socket readable condition, in addition to writeable, and call PQconsumeInput(). That fixes the issue for good.

> Could we perhaps call SSL_peek() when SSL_write() hits WANTS_READ? In
> combination with your fix I think that should fix the possibility of
> such a deadlock? Especially on the serverside where the socket most of
> the time is, although emulated, in blocking mode, that seems easier - we
> can just retry afterwards.

Hmm. Not sure what the semantics of SSL_peek() are. At least we'd need to call it with a large enough buffer that it would read all the incoming data from the OS buffer. I think it would be safer to just call SSL_read(). Both the server and libpq have an input buffer that you can easily read into at any time.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
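For illustration, here is a rough sketch of the application-side pattern being described: pushing COPY data on a non-blocking connection while waiting for both readable and writeable conditions and draining input as it goes. This is not from any patch in this thread; error handling is abbreviated, and PQsetnonblocking(conn, 1) is assumed to have been called beforehand.

#include <sys/select.h>
#include <libpq-fe.h>

/*
 * Queue one chunk of COPY data on a non-blocking connection.  Waiting on
 * both read- and write-readiness, and draining input with
 * PQconsumeInput(), is what avoids the deadlock described above: the
 * server may be blocked sending us NOTICEs (or SSL renegotiation
 * traffic) and won't resume reading our COPY data until we drain it.
 */
static int
send_copy_chunk(PGconn *conn, const char *buf, int len)
{
	for (;;)
	{
		int			rc = PQputCopyData(conn, buf, len);

		if (rc == 1)
			return 0;			/* queued */
		if (rc == -1)
			return -1;			/* hard error */

		/* rc == 0: buffers are full, wait for the socket */
		{
			int			sock = PQsocket(conn);
			fd_set		rfds,
						wfds;

			FD_ZERO(&rfds);
			FD_ZERO(&wfds);
			FD_SET(sock, &rfds);
			FD_SET(sock, &wfds);
			if (select(sock + 1, &rfds, &wfds, NULL, NULL) < 0)
				return -1;

			/* Drain whatever the server sent, so it can unblock. */
			if (FD_ISSET(sock, &rfds) && !PQconsumeInput(conn))
				return -1;
		}
	}
}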
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 12:16 AM, Noah Misch wrote: > That is a major mark against putting the check in simplify_function(), agreed. I do see one way to rescue that idea, which is this: put two flags, parallelModeOK and parallelModeRequired, into PlannerGlobal. At the beginning of planning, set parallelModeOK based on our best knowledge at that time; as we preprocess expressions, update it to false if we see something that's not parallel-safe. Emit paths for parallel plans only if the flag is currently true. At the end of planning, when we convert paths to plans, set parallelModeRequired if any parallel plan elements are generated. If we started with parallelModeOK = true or ended up with parallelModeRequired = false, then life is good. In the unfortunate event that we end up with parallelModeOK = false and parallelModeRequired = true, replan, this time with parallelModeOK = false from the beginning. If there are no sub-queries involved, this will work out fine - parallelModeOK will always be definitively set to the right value before we rely on it for anything. This includes cases where subqueries are eliminated by pulling them up. If there *are* subqueries, we'll still be OK if we happen to hit the parallel-unsafe construct before we hit the part - if any - that can benefit from parallelism. So this would mostly be pretty cheap, but if you do hit the case where a re-plan is required it would be pretty painful. >> > Unless we want to rejigger this so that we do a >> > complete eval_const_expressions() pass over the entire query tree >> > (including all subqueries) FIRST, and then only after that go back and >> > plan all of those subqueries, I don't see how to make this work; and >> > I'm guessing that there are good reasons not to do that. > > I expect that would work fine, but I do think it premature to venture that far > out of your way to optimize this new tree examination. The cost may just not > matter. Other parts of the planner use code like contain_volatile_functions() > and contain_nonstrict_functions(), which have the same kind of inefficiency > you're looking to avoid here. They do, but those are run on much smaller pieces of the query tree, never on the whole thing. I really wish Tom Lane would weigh in here, as an opinion from him could save me from spending time on a doomed approach. I expect criticism of this type: http://www.postgresql.org/message-id/22266.1269641...@sss.pgh.pa.us -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
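To spell out the control flow, here is a pseudo-C sketch of the replanning scheme described above. Every name in it is illustrative: PlannerGlobal does not carry these flags today, and do_planning_sketch is a stand-in for the planner proper.

#include <stdbool.h>

/* Illustrative stand-ins only; not existing planner APIs. */
typedef struct Query Query;
typedef struct PlannedStmt PlannedStmt;

typedef struct PlannerGlobalSketch
{
	bool		parallelModeOK;			/* may we emit parallel paths? */
	bool		parallelModeRequired;	/* did we emit parallel plan nodes? */
} PlannerGlobalSketch;

extern PlannedStmt *do_planning_sketch(Query *parse, PlannerGlobalSketch *glob);

PlannedStmt *
plan_with_parallel_fallback(Query *parse)
{
	bool		allow_parallel = true;

	for (;;)
	{
		PlannerGlobalSketch glob = {allow_parallel, false};
		PlannedStmt *stmt;

		/*
		 * Expression preprocessing may clear parallelModeOK when it sees a
		 * parallel-unsafe construct; turning parallel paths into plan
		 * nodes sets parallelModeRequired.
		 */
		stmt = do_planning_sketch(parse, &glob);

		if (glob.parallelModeOK || !glob.parallelModeRequired)
			return stmt;		/* the common, cheap outcome */

		/*
		 * Unlucky case: parallel plan elements were emitted before we
		 * discovered the unsafe construct.  Replan with parallelism
		 * disabled from the start.
		 */
		allow_parallel = false;
	}
}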
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Jan Urbański writes:
> Andres Freund writes:
>
>> On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
>>> That doesn't solve the problem of the Python deadlock, where you're not at
>>> leisure to call a C function at the beginning of your module.
>>
>> We could just never unload the hooks...
>
> That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it
> got changed after
> http://www.postgresql.org/message-id/48620925.6070...@pws.com.au
>
>>> > * If there's already callbacks set: Remember that fact and don't
>>> > overwrite. In the next major version: warn.
>>>
>>> So yeah, that was my initial approach - check if callbacks are set, don't do
>>> the dance if they are. It felt like a crutch, though, and racy at that. There's
>>> no atomic way to test-and-set those callbacks. The window for racyness is
>>> small, though.
>>
>> If you do that check during library initialization instead of every
>> connection it shouldn't be racy - if that part is run in a multithreaded
>> fashion you're doing something crazy.
>
> Yes, that's true. The problem is that there's no real libpq initialisation
> function. The docs say that:
>
> "If your application initializes libssl and/or libcrypto libraries and libpq is
> built with SSL support, you should call PQinitOpenSSL"
>
> So most apps will just not bother. The moment you know you'll need SSL is only
> when you get an 'S' message from the server...

For the sake of discussion, here's a patch to prevent stomping on previously-set callbacks, racy as it looks.

FWIW, it does fix the Python deadlock and doesn't cause the PHP segfault...

J

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..54312a5 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -710,6 +710,9 @@ verify_peer_name_matches_certificate(PGconn *conn)
  * Callback functions for OpenSSL internal locking
  */
 
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+/* OpenSSL 1.0.0 deprecates the CRYPTO_set_id_callback function and provides a
+ * default implementation, so there's no need for our own. */
 static unsigned long
 pq_threadidcallback(void)
 {
@@ -720,6 +723,7 @@ pq_threadidcallback(void)
 	 */
 	return (unsigned long) pthread_self();
 }
+#endif   /* OPENSSL_VERSION_NUMBER < 0x10000000L */
 
 static pthread_mutex_t *pq_lockarray;
 
@@ -806,9 +810,14 @@ pgtls_init(PGconn *conn)
 
 		if (ssl_open_connections++ == 0)
 		{
-			/* These are only required for threaded libcrypto applications */
-			CRYPTO_set_id_callback(pq_threadidcallback);
-			CRYPTO_set_locking_callback(pq_lockingcallback);
+			/* These are only required for threaded libcrypto applications, but
+			 * make sure we don't stomp on them if they're already set. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+			if (CRYPTO_get_id_callback() == NULL)
+				CRYPTO_set_id_callback(pq_threadidcallback);
+#endif
+			if (CRYPTO_get_locking_callback() == NULL)
+				CRYPTO_set_locking_callback(pq_lockingcallback);
 		}
 	}
 #endif   /* ENABLE_THREAD_SAFETY */
@@ -885,9 +894,14 @@ destroy_ssl_system(void)
 
 	if (pq_init_crypto_lib && ssl_open_connections == 0)
 	{
-		/* No connections left, unregister libcrypto callbacks */
-		CRYPTO_set_locking_callback(NULL);
-		CRYPTO_set_id_callback(NULL);
+		/* No connections left, unregister libcrypto callbacks, if no one
+		 * registered different ones in the meantime. */
+#if OPENSSL_VERSION_NUMBER < 0x10000000L
+		if (CRYPTO_get_id_callback() == pq_threadidcallback)
+			CRYPTO_set_id_callback(NULL);
+#endif
+		if (CRYPTO_get_locking_callback() == pq_lockingcallback)
+			CRYPTO_set_locking_callback(NULL);
 
 		/*
 		 * We don't free the lock array or the SSL_context. If we get another

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes: > On 2015-02-12 09:31:27 +0100, Jan Urbański wrote: >> That doesn't solve the problem of the Python deadlock, where you're not at >> leisure to call a C function at the beginning of your module. > > We could just never unload the hooks... That's what we did before 4e816286533dd34c10b368487d4079595a3e1418 :) And it got changed after http://www.postgresql.org/message-id/48620925.6070...@pws.com.au > >> > * If there's already callbacks set: Remember that fact and don't >> > overwrite. In the next major version: warn. >> >> So yeah, that was my initial approach - check if callbacks are set, don't do >> the dance if they are. It felt like a crutch, though, and racy at that. >> There's >> no atomic way to test-and-set those callbacks. The window for racyness is >> small, though. > > If you do that check during library initialization instead of every > connection it shouldn't be racy - if that part is run in a multithreaded > fashion you're doing something crazy. Yes, that's true. The problem is that there's no real libpq initialisation function. The docs say that: "If your application initializes libssl and/or libcrypto libraries and libpq is built with SSL support, you should call PQinitOpenSSL" So most apps will just not bother. The moment you know you'll need SSL is only when you get an 'S' message from the server... Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
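For what it's worth, the application-owns-the-callbacks arrangement that the docs assume would look roughly like the following sketch (pre-1.0.0-style OpenSSL threading API; this is not code from this thread's patches):

#include <stdlib.h>
#include <pthread.h>
#include <openssl/crypto.h>
#include <libpq-fe.h>

static pthread_mutex_t *locks;

/* The application owns the libcrypto locking callback... */
static void
locking_cb(int mode, int n, const char *file, int line)
{
	if (mode & CRYPTO_LOCK)
		pthread_mutex_lock(&locks[n]);
	else
		pthread_mutex_unlock(&locks[n]);
}

int
main(void)
{
	int			i,
				nlocks = CRYPTO_num_locks();

	locks = malloc(nlocks * sizeof(pthread_mutex_t));
	for (i = 0; i < nlocks; i++)
		pthread_mutex_init(&locks[i], NULL);
	CRYPTO_set_locking_callback(locking_cb);

	/*
	 * ... and tells libpq so: do_crypto = 0 stops libpq from registering
	 * (and, worse, later unregistering) its own callbacks.
	 */
	PQinitOpenSSL(1, 0);

	/* ... start threads, open PostgreSQL connections ... */
	return 0;
}

The difficulty being discussed is precisely that most applications, and especially scripts running inside a VM, never perform this startup step.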
Re: [HACKERS] assessing parallel-safety
On Thu, Feb 12, 2015 at 1:51 AM, Robert Haas wrote:
>
> I think we may want a dedicated parallel-safe property for functions
> rather than piggybacking on provolatile, but that will probably also
> be changeable via ALTER FUNCTION, and stored rules won't get
> miraculously updated. So this definitely can't be something we figure
> out at parse-time ... it's got to be determined later. But at the
> moment I see no way to do that without an extra pass over the whole
> rewritten query tree. :-(

If we have to go this way, then isn't it better to evaluate this at the point where we are trying to create parallel paths (something like create_parallelscan_paths() in the parallel seq scan patch)?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] How about to have relnamespace and relrole?
Hello, I changed the subject. This mail is to address the point at hand, preparing for registering this in the commitfest.

15 17:29:14 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20150204.172914.52110711.horiguchi.kyot...@lab.ntt.co.jp>
> Tue, 03 Feb 2015 10:12:12 -0500, Tom Lane wrote in
> <2540.1422976...@sss.pgh.pa.us>
> > I'm not really excited about that. That line of thought would imply
> > that we should have "reg*" types for every system catalog, which is
> > surely overkill.
>
> Mmm. I suppose "for every OID usage", not "every system catalog",
> but I agree on the whole. There's no boundary that everyone can agree
> on. Perhaps it depends on how often the average DBA looks at catalogs
> that have oids pointing to another catalog and wants to see them in
> human-readable form, without joins if possible.
>
> I very roughly counted how often the oids of catalogs are referred to
> from other catalogs.

1. Expected frequency of use

I counted how often the oids of system catalogs are referred to by other system catalogs/views. Among them, the pg_stat_* views are excluded since they have text representations for all oid references. The result is as follows. The full result of the counting is in an Excel file, but it's not at hand for now; I'll show it here if anyone wants to see it.

pg_class.oid:     27
pg_authid.oid:    33
pg_namespace.oid: 20
pg_proc.oid:      13
pg_type.oid:      15
pg_database.oid:   5
pg_operator.oid:   5
pg_am.oid:         4

Among these, pg_authid and pg_namespace are apparently referred to frequently but don't have their own reg* types, even though they are referred to from more places than pg_proc, pg_type, pg_operator, and pg_am.

# By the way, I don't understand where the "reg" comes from.
# REGistered? Or some other origin?

For that reason, although the current policy for deciding whether a catalog gets a reg* type seems to be whether its objects have schema-qualified names, I think regrole and regnamespace are valuable to have.

2. Anticipated un-optimizability

Tom pointed out that these reg* types prevent the planner from optimizing the query, so we should refrain from adding such machinery. This would be a long-standing issue, but reg*s seem to be rather faster than joining the corresponding catalogs for a moderate number of result rows. However, this raises another, more annoying issue.

3. Potential breakage of MVCC

The other issue Tom pointed out is potential breakage of MVCC by using these reg* types. The syscache is invalidated on updates, so this doesn't seem to be a problem in READ COMMITTED mode, but it breaks SERIALIZABLE mode. IMHO it is not so serious a problem as long as such inconsistency occurs only on system catalogs and it is explicitly documented together with the un-optimizability issue.

I'll add a patch for this later.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Table-level log_autovacuum_min_duration
On Thu, Feb 12, 2015 at 5:44 PM, Naoya Anzai wrote:
> Hi, Michael-san
>
>> An updated patch is attached,
>
> I'm sorry for confusing you.
>
> I think you don't have to implement this code to disable this
> feature using the value "-2", because this use case is a rare case,
> and there is a practical workaround using a huge value like "2e9".
> (You suggested "2e9" to me, didn't you? :) ) So, please remove this code.

I will clean up the code.

>> Well, I see your point but this is not completely true: we could as
>> well rely entirely on this parameter instead of VACOPT_VERBOSE to
>> determine if autovacuum, a vacuum or an analyze are in verbose mode,
>> and remove VACOPT_VERBOSE, but I can imagine people complaining if
>> VACOPT_VERBOSE is removed. So let's set it up unconditionally to -1 in
>> gram.y for now. However if people think that it is fine to remove
>> VACOPT_VERBOSE, we could use exclusively this parameter in VacuumStmt.
>> Or even add sanity checks at the top of vacuum() to ensure that
>> VACOPT_VERBOSE is set only when log_min_duration is positive.
>> Additional opinions on this matter are welcome.
>
> I understand your point at last. :)
>
> You mean that log_autovacuum_min_duration assumes the role of VACOPT_VERBOSE:
> even if this parameter is currently never used for manual vacuum,
> log_autovacuum_min_duration should be set to zero (always output)
> when we execute "VACUUM (or ANALYZE) VERBOSE".
> Is my understanding correct? If so, it sounds logical.

Yup, that's my opinion. Now I don't know if people would mind to remove VACOPT_VERBOSE and replace the control it does by log_min_duration in VacuumStmt. At least both things are overlapping, and log_min_duration offers more options than the plain VACOPT_VERBOSE.

> If we can abolish the VERBOSE option,
> I think it's ideal that we prepare a new parameter like
> log_min_duration_vacuum (and log_min_duration_analyze),
> integrating the "VERBOSE feature" and "log_autovacuum_min_duration".

What I think you are proposing here is a VERBOSE option that hypothetically gets activated if a manual VACUUM takes more than a certain amount of time specified by those parameters. I doubt this would be useful. In any case this is unrelated to this patch.
--
Michael
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Thank you for comments. Please find attached the updated patch.

> This patch fails to compile:
> xlogreader.c:1049:46: error: extraneous ')' after condition, expected a statement
> blk->with_hole && blk->hole_offset <= 0))

This has been rectified.

> Note as well that at least clang does not like much how the sanity checks with
> with_hole are done. You should place parentheses around the '&&' expressions.
> Also, I would rather define with_hole == 0 or with_hole == 1 explicitly in
> those checks.

The expressions are modified accordingly.

> There is a typo:
> s/true,see/true, see/
> [nitpicky]Be as well aware of the 80-character limit per line that is
> normally respected by comment blocks.[/]

Have corrected the typos and changed the comments as mentioned. Also, realigned certain lines to meet the 80-character limit.

Thank you,
Rahila Syed

Support-compression-for-full-page-writes-in-WAL_v18.patch
Description: Support-compression-for-full-page-writes-in-WAL_v18.patch

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Seq Scan
On Thu, Feb 12, 2015 at 2:19 AM, Robert Haas wrote:
>
> On Tue, Feb 10, 2015 at 3:56 PM, Andres Freund wrote:
> > On 2015-02-10 09:23:02 -0500, Robert Haas wrote:
> >> On Tue, Feb 10, 2015 at 9:08 AM, Andres Freund wrote:
> >
> > As pointed out above (moved there after reading the patch...) I don't
> > think a chunk size of 1 or any other constant size can make sense. I
> > don't even believe it'll necessarily be constant across an entire query
> > execution (big initially, small at the end). Now, we could move
> > determining that before the query execution into executor
> > initialization, but then we won't yet know how many workers we're going
> > to get. We could add a function setting that at runtime, but that'd mix
> > up responsibilities quite a bit.
>
> I still think this belongs in heapam.c somehow or other. If the logic
> is all in the executor, then it becomes impossible for any code that
> doesn't use the executor to do a parallel heap scan, and that's
> probably bad. It's not hard to imagine something like CLUSTER wanting
> to reuse that code, and that won't be possible if the logic is up in
> some higher layer. If the logic we want is to start with a large
> chunk size and then switch to a small chunk size when there's not much
> of the relation left to scan, there's still no reason that can't be
> encapsulated in heapam.c.

It seems to me that we need both ways of doing parallelism (making the heap or other lower layers aware of parallelism, and handling it at the executor level using callback_function and callback_state to make it work). TBH, I think for the matter of this patch we can go either way and then think more on it as we move ahead to parallelize other operations. So what I can do is to try using Robert's patch to make the heap aware of parallelism and then see how it comes out?

> > Btw, using a atomic uint32 you'd end up without the spinlock and just
> > about the same amount of code... Just do a atomic_fetch_add_until32(var,
> > 1, InvalidBlockNumber)... ;)
>
> I thought of that, but I think there's an overflow hazard.
>
> > Where the 'coordinated seqscan' scans a relation so that each tuple
> > eventually gets returned once across all nodes, but the nested loop (and
> > through it the index scan) will just run normally, without any
> > coordination and parallelism. But everything below --- would happen in
> > multiple nodes. If you agree, yes, then we're in violent agreement
> > ;). The "single node that gets copied" bit above makes me a bit unsure
> > whether we are though.
>
> Yeah, I think we're talking about the same thing.
>
> > To me, given the existing executor code, it seems easiest to achieve
> > that by having the ParallelismDrivingNode above having a dynamic number
> > of nestloop children in different backends and point the coordinated
> > seqscan to some shared state. As you point out, the number of these
> > children cannot be certainly known (just targeted for) at plan time;
> > that puts a certain limit on how independent they are. But since a
> > large number of them can be independent between workers it seems awkward
> > to generally treat them as being the same node across workers. But maybe
> > that's just an issue with my mental model.
>
> I think it makes sense to think of a set of tasks in which workers can
> assist. So you have a query tree which is just one query tree, with no
> copies of the nodes, and then there are certain places in that query
> tree where a worker can jump in and assist that node.
> To do that, it
> will have a copy of the node, but that doesn't mean that all of the
> stuff inside the node becomes shared data at the code level, because
> that would be stupid.

As per my understanding of the discussion related to this point, I think there are 3 somewhat related ways to achieve this:

1. Both master and worker run the same node (ParallelSeqScan), where the work done by the worker for this node (scan chunks of the heap) is a subset of what is done by the master (coordinate the data returned by workers + scan chunks of the heap). It seems to me Robert is advocating this approach.

2. Master and worker use different nodes to operate. The master runs a parallelism-driving node (ParallelSeqScan: coordinate the data returned by workers + scan chunks of the heap) and the worker runs some form of parallelism-driven node (PartialSeqScan: scan chunks of the heap). It seems to me Andres is proposing this approach.

3. Same as 2, but modify the existing SeqScan node to behave as a PartialSeqScan. This is what I have done in the patch.

Correct me or add here if I have misunderstood anything. I think going forward (for cases like aggregation) the work done in the master and worker nodes will have substantial differences, so it is better to do the work as part of different nodes in master and worker.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
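On the atomics aside quoted above, the "fetch-add capped at the number of blocks" idea can be sketched with a CAS loop. This is illustrative C11, not the patch's code; never advancing the counter past nblocks is what removes the uint32 overflow hazard Robert mentions.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ParallelScanShared
{
	uint32_t	nblocks;			/* blocks in the relation */
	_Atomic uint32_t next_block;	/* next block to hand out */
} ParallelScanShared;

/* Claim the next unscanned block; returns false once the heap is exhausted. */
static bool
claim_next_block(ParallelScanShared *shared, uint32_t *block)
{
	uint32_t	b = atomic_load(&shared->next_block);

	/*
	 * CAS loop implementing "fetch-add, but never past nblocks": the
	 * counter cannot wrap around no matter how many workers race here.
	 */
	while (b < shared->nblocks)
	{
		if (atomic_compare_exchange_weak(&shared->next_block, &b, b + 1))
		{
			*block = b;
			return true;
		}
		/* on failure, b was reloaded with the current value; retry */
	}
	return false;
}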
Re: [HACKERS] Index-only scans for GiST.
Thanks for the answer. Now it seems to apply correctly.

2015-02-12 3:12 GMT+04:00 Thom Brown:
> On 11 February 2015 at 22:50, Anastasia Lubennikova <lubennikov...@gmail.com> wrote:
>
>> Finally there is a new version of the patch (in attachments).
>> It provides multicolumn index-only scan for GiST indexes.
>>
>> - Memory leak is fixed.
>> - little code cleanup
>> - example of performance test in attachments
>> - function OIDs have debugging values (*) just to avoid merge
>> conflicts while testing the patch
>>
>> Wiki page of the project is
>> https://wiki.postgresql.org/wiki/Support_for_Index-only_scans_for_GIST_GSoC_2014
>>
>> Waiting for feedback.
>
> Hi Anastasia. Thanks for the updated patch. I've just tried applying it
> to head and it doesn't appear to apply cleanly.
>
> $ patch -p1 < ~/Downloads/indexonlyscan_gist_2.0.patch
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file src/backend/access/gist/gist.c
> Hunk #1 succeeded at 1404 (offset 9 lines).
> Hunk #2 succeeded at 1434 (offset 9 lines).
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file src/backend/access/gist/gistget.c
> Hunk #1 succeeded at 227 (offset 1 line).
> Hunk #2 succeeded at 243 (offset 1 line).
> Hunk #3 succeeded at 293 (offset -4 lines).
> Hunk #4 succeeded at 330 (offset -4 lines).
> Hunk #5 succeeded at 365 (offset -5 lines).
> Hunk #6 succeeded at 444 (offset -27 lines).
> Hunk #7 succeeded at 474 (offset -27 lines).
> Hunk #8 FAILED at 518.
> Hunk #9 succeeded at 507 (offset -28 lines).
> Hunk #10 succeeded at 549 with fuzz 1 (offset -28 lines).
> Hunk #11 FAILED at 601.
> 2 out of 11 hunks FAILED -- saving rejects to file
> src/backend/access/gist/gistget.c.rej
> ...
>
> --
> Thom

--
Best regards,
Lubennikova Anastasia

diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index db2a452..53750da 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1404,6 +1404,14 @@ initGISTstate(Relation index)
 		else
 			giststate->distanceFn[i].fn_oid = InvalidOid;
 
+		/* opclasses are not required to provide a Fetch method */
+		if (OidIsValid(index_getprocid(index, i + 1, GIST_FETCH_PROC)))
+			fmgr_info_copy(&(giststate->fetchFn[i]),
+						   index_getprocinfo(index, i + 1, GIST_FETCH_PROC),
+						   scanCxt);
+		else
+			giststate->fetchFn[i].fn_oid = InvalidOid;
+
 		/*
 		 * If the index column has a specified collation, we should honor that
 		 * while doing comparisons. However, we may have a collatable storage
@@ -1426,6 +1434,22 @@ initGISTstate(Relation index)
 	return giststate;
 }
 
+/*
+ * Gistcanreturn is supposed to be true if ANY FetchFn method is defined.
+ * If FetchFn exists it would be used in index-only scan.
+ * Thus the responsibility rests with the opclass developer.
+ */
+Datum
+gistcanreturn(PG_FUNCTION_ARGS)
+{
+	Relation	index = (Relation) PG_GETARG_POINTER(0);
+	int			i = PG_GETARG_INT32(1);
+
+	if (OidIsValid(index_getprocid(index, i + 1, GIST_FETCH_PROC)))
+		PG_RETURN_BOOL(true);
+	else
+		PG_RETURN_BOOL(false);
+}
+
 void
 freeGISTstate(GISTSTATE *giststate)
 {
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 717cb85..0925e56 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -227,8 +227,10 @@ gistindex_keytest(IndexScanDesc scan,
  * If tbm/ntids aren't NULL, we are doing an amgetbitmap scan, and heap
  * tuples should be reported directly into the bitmap. If they are NULL,
  * we're doing a plain or ordered indexscan. For a plain indexscan, heap
- * tuple TIDs are returned into so->pageData[]. For an ordered indexscan,
+ * tuple TIDs are returned into so->pageData. For an ordered indexscan,
  * heap tuple TIDs are pushed into individual search queue items.
+ * If index-only scan is possible, heap tuples themselves are returned
+ * into so->pageData or into search queue.
  *
  * If we detect that the index page has split since we saw its downlink
  * in the parent, we push its new right sibling onto the queue so the
@@ -241,6 +243,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 	GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
 	Buffer		buffer;
 	Page		page;
+	GISTSTATE  *giststate = so->giststate;
+	Relation	r = scan->indexRelation;
+	bool		isnull[INDEX_MAX_KEYS];
+	GISTSearchHeapItem *tmpListItem;
 	GISTPageOpaque opaque;
 	OffsetNumber maxoff;
 	OffsetNumber i;
@@ -287,8 +293,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 		MemoryContextSwitchTo(oldcxt);
 	}
 
-	so->nPageData = so->curPageData = 0;
-
 	/*
 	 * check all tuples on page
 	 */
@@ -326,11 +330,20 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 		else if (scan->numberOfOrderBys == 0 && GistPageIsLeaf(page))
 		{
 			/*
-			 * Non-ordered scan, so report heap tuples in so->pageData[]
+			 * Non-ordered scan, so rep
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
On 2015-02-12 09:31:27 +0100, Jan Urbański wrote:
>> Andres Freund writes:
>>> On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
>>>> First of all, the current behaviour is crazy. We're setting and unsetting the
>>>> locking callback every time a connection is made/closed, which is not how
>>>> OpenSSL is supposed to be used. The *application* using libpq should set a
>>>> callback before it starts threads, it's no business of the library's.
>>>
>>> I don't buy this argument at all. Delivering a libpq that's not
>>> threadsafe in some circumstances is a really bad idea.
>>
>> I knew this would be a hard sell :( What I know is that the current situation
>> is not good and leaving it as-is causes real grief.

It certainly causes less grief than just breaking working applications. While really shitty, the current situation works in a large number of scenarios. It's not that common to have several users of openssl in an application *and* unload libpq.

>>> I fail to see why it's any better to do it in the VM. That relies on
>>> either always loading the VM's openssl module, even if you don't
>>> actually need it because the library you use deals with openssl. It
>>> prevents other implementations of openssl in the VM.
>>
>> The callbacks are a thing that's owned by the application, so libraries have no
>> business in setting them up. The way OpenSSL's API is specified (very
>> unfortunately IMHO) is that before you use OpenSSL from threads, you need to
>> set the callbacks. If you don't control your application's startup (like when
>> you're a script executed through a VM), you assume the VM took care of it at
>> startup. If you're a library, you assume the user took care of it, as they
>> should.

That's a bogus argument. The VM cannot set up every library in the world in a correct way. It'd obviously be completely insane to load the ssl module in every library just because some part of some random application might need it. In fact, the ssl library for python does pretty much the same thing as libpq does (although it's slightly more careful). It's not the VM at all.

>>> What I think we should do is the following:
>>>
>>> * Emphasize the fact that it's important to use PQinitSSL(false) if you
>>> did things yourself.
>>
>> PQinitOpenSSL(true, false) if anything. The reason that function got introduced
>> is precisely because PQinitSSL(false) isn't exactly right, see
>> http://www.postgresql.org/message-id/49820d7d.7030...@esilo.com

Well, that really depends on what the application is actually using...

>> That doesn't solve the problem of the Python deadlock, where you're not at
>> leisure to call a C function at the beginning of your module.

We could just never unload the hooks...

>>> * If there's already callbacks set: Remember that fact and don't
>>> overwrite. In the next major version: warn.
>>
>> So yeah, that was my initial approach - check if callbacks are set, don't do
>> the dance if they are. It felt like a crutch, though, and racy at that. There's
>> no atomic way to test-and-set those callbacks. The window for racyness is
>> small, though.

If you do that check during library initialization instead of every connection it shouldn't be racy - if that part is run in a multithreaded fashion you're doing something crazy.

>>> * Assert or something if the callback when unregistering isn't the same
>>> as it was when registering. That's pretty much guaranteed to cause
>>> issues.
>>
>> So let me try another proposal and see if it doesn't set alarm bells off.
>>
>> * When opening a connection, if there's a callback set, don't overwrite it
>> (small race window).
>> * When closing a connection, if the callback set is not
>> pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window)

If we do the tests once during initialization there shouldn't be a race.

>> Asserts in frontend code are impossible, right? There's no way to warn, either.

You can write to stderr...

>> That would be a ~8 line patch. Does it feel back-patcheable?

I think we first need to find out what we all can agree on before deciding about that.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
Andres Freund writes:
> On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
>> First of all, the current behaviour is crazy. We're setting and unsetting the
>> locking callback every time a connection is made/closed, which is not how
>> OpenSSL is supposed to be used. The *application* using libpq should set a
>> callback before it starts threads, it's no business of the library's.
>
> I don't buy this argument at all. Delivering a libpq that's not
> threadsafe in some circumstances is a really bad idea.

I knew this would be a hard sell :( What I know is that the current situation is not good and leaving it as-is causes real grief.

>> The old behaviour was slightly less insane (set callbacks first time we're
>> engaging OpenSSL code, never touch them again). The real sane solution is to
>> leave it up to the application.
>
> The real solution would be for openssl to do it itself.

I think the OpenSSL API is the real culprit there: requiring threads to register a callback is what's causing all the issues. But I don't think this will ever get changed.

>> Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because
>> libpq was setting them on its own. If we remove the callback handling from
>> libpq, PHP will need to add them. By the way, the MySQL extension for PHP also
>> does not set those callbacks.
>
> I think this shows pretty clearly how insane it would be to change this
> in a minor release. Do you really expect people to update libpq and php
> in unison. After a minor release?

Well, I haven't found reports of threaded PHP+MySQL+SSL programs crashing, and the MySQL extension also doesn't care about the callbacks. I think it's both because it's a rare combination, and because almost everyone loads the cURL extension, which *does* set up callbacks. Like I said, it's not a good situation.

> Note that we *already* provide applications with the ability to set the
> callbacks themselves and prevent us from doing so: PQinitSSL(false).

Ah, I only now saw that with PQinitOpenSSL(true, false) you can work around the problem, if you set up your own callbacks first. That seems to at least provide a way to make libpq not do the callbacks dance in released versions. Thank you.

>> Let me reiterate: I now believe the callbacks should be set by the application,
>> libraries should not touch them, since they don't know what they will be
>> stomping on. If the application is run through a VM like Python or PHP, it's
>> the VM that should make sure the callbacks are set.
>
> I fail to see why it's any better to do it in the VM. That relies on
> either always loading the VM's openssl module, even if you don't
> actually need it because the library you use deals with openssl. It
> prevents other implementations of openssl in the VM.

The callbacks are a thing that's owned by the application, so libraries have no business in setting them up. The way OpenSSL's API is specified (very unfortunately IMHO) is that before you use OpenSSL from threads, you need to set the callbacks. If you don't control your application's startup (like when you're a script executed through a VM), you assume the VM took care of it at startup. If you're a library, you assume the user took care of it, as they should.

Now I know that a lot of apps get this wrong. Python almost does the right thing, but you're right, it only sets up callbacks if you load the 'ssl' module.
It still feels like setting them up in library code is stomping on things not owned by the library - like libperl setting up signal handlers, which caused problems in Postgres. They're resources not owned by the library!

> What I think we should do is the following:
>
> * Emphasize the fact that it's important to use PQinitSSL(false) if you
> did things yourself.

PQinitOpenSSL(true, false) if anything. The reason that function got introduced is precisely because PQinitSSL(false) isn't exactly right, see http://www.postgresql.org/message-id/49820d7d.7030...@esilo.com

That doesn't solve the problem of the Python deadlock, where you're not at leisure to call a C function at the beginning of your module.

> * If there's already callbacks set: Remember that fact and don't
> overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do the dance if they are. It felt like a crutch, though, and racy at that. There's no atomic way to test-and-set those callbacks. The window for racyness is small, though.

> * Assert or something if the callback when unregistering isn't the same
> as it was when registering. That's pretty much guaranteed to cause
> issues.

So let me try another proposal and see if it doesn't set alarm bells off.

* When opening a connection, if there's a callback set, don't overwrite it
(small race window).
* When closing a connection, if the callback set is not
pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window)

Asserts i
Re: [HACKERS] Table-level log_autovacuum_min_duration
Hi, Michael-san

> An updated patch is attached,

I'm sorry for confusing you.

I think you don't have to implement this code to disable this feature using the value "-2", because this use case is a rare case, and there is a practical workaround using a huge value like "2e9". (You suggested "2e9" to me, didn't you? :) ) So, please remove this code.

> Well, I see your point but this is not completely true: we could as
> well rely entirely on this parameter instead of VACOPT_VERBOSE to
> determine if autovacuum, a vacuum or an analyze are in verbose mode,
> and remove VACOPT_VERBOSE, but I can imagine people complaining if
> VACOPT_VERBOSE is removed. So let's set it up unconditionally to -1 in
> gram.y for now. However if people think that it is fine to remove
> VACOPT_VERBOSE, we could use exclusively this parameter in VacuumStmt.
> Or even add sanity checks at the top of vacuum() to ensure that
> VACOPT_VERBOSE is set only when log_min_duration is positive.
> Additional opinions on this matter are welcome.

I understand your point at last. :)

You mean that log_autovacuum_min_duration assumes the role of VACOPT_VERBOSE: even if this parameter is currently never used for manual vacuum, log_autovacuum_min_duration should be set to zero (always output) when we execute "VACUUM (or ANALYZE) VERBOSE". Is my understanding correct? If so, it sounds logical.

If we can abolish the VERBOSE option, I think it's ideal that we prepare a new parameter like log_min_duration_vacuum (and log_min_duration_analyze), integrating the "VERBOSE feature" and "log_autovacuum_min_duration".

Regards,
---
Naoya Anzai

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers