Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 08:45 PM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>
>> it does fail the "dependent options" test:
>> $ postgres -C "archive_mode"
>> on
>> $ postgres -C wal_level
>> minimal
>>
>> no errors, great, let's try it:
>> $ pg_ctl restart
>>
>> FATAL: WAL archival cannot be enabled when wal_level is "minimal"
>
> This complaint could be fixed if we had a --check-config that runs the
> check hook for every variable, I think. I think that closes all the
> syntax checks you want; then your tool only needs to worry about
> semantic checks, and can obtain the values by repeated application of -C
> (or, more conveniently, have a new -C mode that takes no args
> and prints the names and values for all parameters).

That sounds reasonable and convenient. It's important to have one tool
that covers everything, so I'd have to call "-C" and parse the errors,
but that seems possible.

If there were a flag which produced something more like the output of
the pg_settings view in a structured format, as well as the errors,
that would be ideal. And possibly useful for other tool building. Would
such a thing be contrary to the pg spirit?

Amir

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
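The call-"-C"-and-parse-the-errors approach above could look something like this minimal sketch. The driver function and message handling are my assumptions; only the LOG/HINT line shapes are taken from the examples later in the thread, and HINT lines are skipped for brevity.

```python
import re
import subprocess

# Severity-prefixed lines as emitted on stderr by `postgres -C <name>`.
LOG_RE = re.compile(r'^(LOG|WARNING|FATAL):\s+(.*)$')

def parse_check_output(text):
    """Turn postgres stderr text into a list of (severity, message) tuples."""
    findings = []
    for line in text.splitlines():
        m = LOG_RE.match(line.strip())
        if m:
            findings.append((m.group(1), m.group(2)))
    return findings

def check_config(datadir):
    # Hypothetical driver: querying any one parameter triggers validation
    # of the whole file, so parse whatever postgres complained about.
    proc = subprocess.run(
        ["postgres", "-C", "max_connections", "-D", datadir],
        capture_output=True, text=True)
    return parse_check_output(proc.stderr)
```

A structured-output mode on `-C` itself, as suggested above, would of course make the regex scraping unnecessary.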
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 07:41 PM, David Fetter wrote:
> On Wed, Oct 14, 2015 at 01:52:21AM +0300, Amir Rohan wrote:
>
>> I've considered "vendoring", but it seems like enough code surgery
>> would be involved to make this very dubious "reuse". The language is
>> simple enough that writing a parser from scratch isn't a big deal,
>> and there doesn't seem much room for divergent parsing either.
>
> Such room as there is seems worth eliminating if possible.

IMO, not if it means the tool dodges some minor potential for harmless
bugs, but becomes far more difficult to install in the process.

> There's even a formal name for this issue, which attackers can use,
> although the implications as a source of subtle bugs in the absence
> of an attacker seem more pertinent right now.
>
> https://www.google.com/?q=parse%20tree%20differential%20attack

Interesting, but the security implications of a "parser attack", or
indeed of different parsing behavior in some corner cases, are roughly
nil in this case.

>> So, the only question is whether reusing the existing parser brings
>> along some highly useful functionality beyond an AST and a
>> battle-tested validator for bools, etc. I'm not ruling anything out
>> yet, though.
>
> I suspect that having a single source parser, however painful now,
> will pay large dividends later.

I don't think anything is more important for a productivity tool than
being easily installed and easy to use. It's a fact of life that all
software has bugs. If they matter, you fix them.

Still, if there are more concrete benefits to adapting the parser, such
as batteries-included features I'm simply unaware of, I'd love to hear
about them.

Thanks,
Amir
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 05:55 PM, Andres Freund wrote:
> On 2015-10-14 17:46:25 +0300, Amir Rohan wrote:
>> On 10/14/2015 05:35 PM, Andres Freund wrote:
>>> Then your argument about the CF process doesn't seem to make sense.
>
>> Why? I ask again, what do you mean by "separate process"?
>
> Not going through the CF and normal release process.
>
>> either it's in core (and follows its processes) or it isn't. But you
>> can't say you don't want it in core but that you also don't
>> want it to follow a "separate process".
>
> Oh for crying out loud. You write:

Andres, I'm not here looking for ways to quibble with you. So, please
"assume good faith".

>> 4) You can't easily extend the checks performed, without forking
>> postgres or going through the (lengthy, rigorous) cf process.
>
> and
>
>>> I don't think we as a community want to do that without review
>>> mechanisms in place, and I personally don't think we want to add
>>> separate processes for this.
>
>> That's what "contribute" means in my book.
>
> I don't see how those two statements don't conflict.

Right. I was saying that "contribute" always implies review before
acceptance, responding to the first half of your sentence. The second
half assumes it makes sense to discuss "review process" as a separate
issue from inclusion in core. It does not make sense, and I said so.

If you have a bone to pick with my comment about CF review being
lengthy, the point was that an independent tool can move more quickly
to accept submissions because:

1. there's less at stake
2. if it's written in a higher-level language, enhancements are easier.

Those don't hold when adding another check involves changes to the
`postgres` binary. Fair?

Amir
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 05:35 PM, Andres Freund wrote:
> On 2015-10-14 16:50:41 +0300, Amir Rohan wrote:
>>> I don't think we as a community want to do that without review
>>> mechanisms in place, and I personally don't think we want to add
>>> separate processes for this.
>>
>> That's what "contribute" means in my book.
>
> Then your argument about the CF process doesn't seem to make sense.

Why? I ask again, what do you mean by "separate process"? Either it's
in core (and follows its processes) or it isn't. But you can't say you
don't want it in core, but also that you don't want it to follow a
"separate process".

I think you're simply saying you're -1 on the whole idea. Ok.

Regards,
Amir
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 04:24 PM, Andres Freund wrote:
> On 2015-10-14 16:17:55 +0300, Amir Rohan wrote:
>> it does fail the "dependent options" test:
>> $ postgres -C "archive_mode"
>> on
>> $ postgres -C wal_level
>> minimal
>
> Yea, because that's currently evaluated outside the config
> mechanism. Imo it'd be good to change that independent of this
> thread.

I agree.

>> 5) Because it checks syntax only, you don't get the benefits of
>> having an official place for the community to easily contribute
>> rules that warn you against config pitfalls, so that all users
>> benefit. See my OP for real-world examples and links about this
>> problem.
>
> I don't think we as a community want to do that without review
> mechanisms in place, and I personally don't think we want to add
> separate processes for this.

That's what "contribute" means in my book. I'm getting mixed signals
about what the "community" wants. I certainly think that if adding
rules involves modifying the postgres server code, that is far too high
a bar, and no one will.

I'm not sure what you mean by "separate process". My original pitch was
to have this live in core or contrib, and no one wanted it. If you
don't want it in core, but people think it's a good idea to have (with
review), what would you suggest?

Amir
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 01:30 PM, Andres Freund wrote:
> On 2015-10-14 01:54:46 +0300, Amir Rohan wrote:
>> Andres, please see upthread for quite a bit on what it doesn't do,
>> and why having it in the server is both an advantage and a
>> shortcoming.
>
> As far as I have skimmed the thread it's only talking about
> shortcomings in case it requires a running server. Which -C doesn't.

That's helpful; no one had suggested -C yet. This was new to me.
Here's what the usage says:

Usage:
  postgres [OPTION]...

Options:
  -C NAME    print value of run-time parameter, then exit

P1. The fact that -C will warn about bad values when you query for an
unrelated name (as it requires you to do) is cunningly and successfully
elided. I expected to have to check every one individually. How about a
no-argument version of "-C"?

P2. Next, that it will print the value of a "run-time" parameter
without an actual running process is another nice surprise, which I
wouldn't have guessed.

The error messages are all one could wish for:

LOG:  invalid value for parameter "wal_level": "archive2"
HINT:  Available values: minimal, archive, hot_standby, logical.
LOG:  parameter "fsync" requires a Boolean value

*and* it prints everything it finds wrong, not stopping at the first
one. Very nice.

It does fail the "dependent options" test:

$ postgres -C "archive_mode"
on
$ postgres -C wal_level
minimal

no errors, great, let's try it:

$ pg_ctl restart

FATAL: WAL archival cannot be enabled when wal_level is "minimal"

> I don't think there's any fundamental difference between some external
> binary parsing & validating the config file and the postgres binary
> doing it.

-C does a much better job than pg_file_settings for this task, I agree.
Still, it doesn't answer the already mentioned points:

1) You need a server (binary, not a running server) to check a config
   (reasonable), with a version matching what you'll run on (again,
   reasonable).
2) It doesn't catch more complicated problems like dependent options.
3) It doesn't catch bad ideas, only errors.
4) You can't easily extend the checks performed, without forking
   postgres or going through the (lengthy, rigorous) CF process.
5) Because it checks syntax only, you don't get the benefit of having
   an official place for the community to easily contribute rules that
   warn you against config pitfalls, so that all users benefit. See my
   OP for real-world examples and links about this problem.

> There *is* the question of the utility being able to process options
> from multiple major releases, but I don't think that's a particularly
> worthwhile goal here.

For one thing, I think it's been made clear that either few people know
about -C or few use it as part of their usual workflow. That in itself
is an issue. Any bloggers reading this?

Next, you may not agree that semantic checks/advice are useful. I only
agree it's easy to dismiss until it saves your (i.e. a user's) ass that
first time. I would *highly* recommend the talk mentioned in the OP:

"...Lag - What's Wrong with My Slave?" (Samantha Billington, 2015)
https://youtu.be/If22EwtCFKc

Not just for the concrete examples (*you* know those by heart), but to
really hear the frustration in a real production user's voice when they
deploy pg in production, hit major operational difficulties, spend lots
of time and effort trying to root-cause, and finally realize they
simply needed to "turn on FOO". Most of these problems can be caught
very easily with a conf linter.

Amir
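The dependent-options and extensibility points above (2 and 4) can be illustrated with a sketch of how a standalone linter might express cross-parameter rules: each rule is a plain function over the final settings, so a new check is a dropped-in file rather than a server patch. The rule below encodes the archive_mode/wal_level failure quoted in the email; the function names and rule-registry shape are my own illustration, not pg_confcheck's actual design.

```python
# One rule = one function taking the resolved settings dict and returning
# a list of human-readable findings (empty when the rule is satisfied).
def check_archive_mode(settings):
    if (settings.get("archive_mode") == "on"
            and settings.get("wal_level") == "minimal"):
        return ['WAL archival cannot be enabled when wal_level is "minimal"']
    return []

# A registry that contributed rule files could append to.
RULES = [check_archive_mode]

def run_rules(settings):
    findings = []
    for rule in RULES:
        findings.extend(rule(settings))
    return findings
```

Catching this combination before `pg_ctl restart` is exactly what `-C` alone cannot do today, since each parameter validates in isolation.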
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 01:16 AM, Andres Freund wrote:
> On October 13, 2015 11:14:19 PM GMT+02:00, Alvaro Herrera wrote:
>> Amir Rohan wrote:
>>
>>> I've been considering that. Reusing the parser would ensure no
>>> errors are introduced by having a different implementation, but on
>>> the other hand involving the pg build in installing what's intended
>>> as a lightweight, independent tool would hurt.
>>> Because it's dubious whether this will end up in core, I'd like
>>> "pip install pg_confcheck" to be all that is required.
>>
>> Maybe just compile a single file in a separate FRONTEND environment?
>
> Maybe I'm missing something here - but doesn't the posted binary do
> nearly all of this for you already? There's the option to display the
> value of a config option, and that checks the validity of the
> configuration. Might need to add an option to validate hba.conf et al
> as well and allow displaying all values. That should be pretty simple?

Andres, please see upthread for quite a bit on what it doesn't do, and
why having it in the server is both an advantage and a shortcoming.
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 01:12 AM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>> On 10/14/2015 12:14 AM, Alvaro Herrera wrote:
>>> Amir Rohan wrote:
>>>
>>>> I've been considering that. Reusing the parser would ensure no
>>>> errors are introduced by having a different implementation, but on
>>>> the other hand involving the pg build in installing what's
>>>> intended as a lightweight, independent tool would hurt.
>>>> Because it's dubious whether this will end up in core, I'd like
>>>> "pip install pg_confcheck" to be all that is required.
>>>
>>> Maybe just compile a single file in a separate FRONTEND environment?
>>
>> You mean refactoring postgres the way Robert Haas suggests? Could
>> you elaborate?
>>
>> I believe most people get pg as provided by their distro or PaaS,
>> and not by compiling it.
>
> I mean the utility would be built by using a file from the backend
> source, just like pg_xlogdump does. We have several such cases.
> I don't think this is impossible to do outside core, either.

I've considered "vendoring", but it seems like enough code surgery
would be involved to make this very dubious "reuse". The language is
simple enough that writing a parser from scratch isn't a big deal, and
there doesn't seem much room for divergent parsing either.

So, the only question is whether reusing the existing parser brings
along some highly useful functionality beyond an AST and a
battle-tested validator for bools, etc. I'm not ruling anything out
yet, though.

Amir
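To make the "writing a parser from scratch isn't a big deal" claim concrete, here is a deliberately minimal sketch: postgresql.conf is essentially `name [=] value` with `#` comments and optionally single-quoted values. This toy ignores real-world details the server's parser handles (escaped quotes `''`, dotted custom variable names, `include` directives, `#` inside quoted values), so it is a feasibility sketch, not a drop-in replacement.

```python
import re

# name, optional '=', then either a single-quoted value or a bare token.
LINE_RE = re.compile(r"^\s*(\w+)\s*=?\s*(?:'([^']*)'|(\S+))")

def parse_conf(text):
    """Parse postgresql.conf-style text into a {name: value} dict.
    Later assignments override earlier ones, as in the real file."""
    settings = {}
    for raw in text.splitlines():
        # Naive comment stripping: would break on '#' inside quotes.
        line = raw.split('#', 1)[0].strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m:
            value = m.group(2) if m.group(2) is not None else m.group(3)
            settings[m.group(1)] = value
    return settings
```

The caveats in the comments are exactly the "divergent parsing" corner cases discussed above; a real implementation would need to handle them, or be tested against the server's behavior.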
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/14/2015 12:14 AM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>
>> I've been considering that. Reusing the parser would ensure no errors
>> are introduced by having a different implementation, but on the other
>> hand involving the pg build in installing what's intended as a
>> lightweight, independent tool would hurt.
>> Because it's dubious whether this will end up in core, I'd like
>> "pip install pg_confcheck" to be all that is required.
>
> Maybe just compile a single file in a separate FRONTEND environment?

You mean refactoring postgres the way Robert Haas suggests? Could you
elaborate?

I believe most people get pg as provided by their distro or PaaS, and
not by compiling it. Involving a pg build environment is asking a whole
lot of someone who has pg all installed and who just wants to lint
their conf. It's also legitimate for someone to be configuring postgres
without having a clue how to compile it.

If this was in core, this wouldn't be an issue, because then packages
would also include the tool. Note I'm not actually pushing for that;
it's that this has implications on what can be assumed to be available.
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/13/2015 09:20 PM, Robert Haas wrote:
> On Mon, Oct 12, 2015 at 8:01 PM, Amir Rohan wrote:
>> That wasn't my intention. Perhaps I'm overreacting to a
>> long-standing "Tom Lane's bucket of cold water" tradition. I'm new
>> here. I understand your point and I was only reiterating what in
>> particular makes the conf checker distinctly useful IMO, and what it
>> could provide that pg_settings doesn't.
>>
>> I've looked at parts of the pg_settings implementation and indeed
>> some of that code (and legwork) could be reused so the mundane parts
>> of writing this will be less hassle. I might have missed that if Tom
>> and you hadn't pointed it out.
>>
>> So, Fair, and thanks.
>
> No worries. I'm not upset with you, and I see where you're coming
> from. But since I'm trying to be helpful I thought I would check
> whether it's working. Sounds like yes, which is nice.
>
> It would be spiffy if we could use the same config-file parser from
> outside postgres itself, but it seems hard. I assume the core lexer
> and parser could be adapted to work from libcommon with some
> non-enormous amount of effort, but check-functions can and do
> assume that they are running in a backend environment; one would lose
> a lot of sanity-checking if those couldn't be executed,

I've been considering that. Reusing the parser would ensure no errors
are introduced by having a different implementation, but on the other
hand involving the pg build in installing what's intended as a
lightweight, independent tool would hurt. Because it's dubious whether
this will end up in core, I'd like "pip install pg_confcheck" to be all
that is required.

Also, anything short of building the tool in tree with an unmodified
parser amounts to forking the parser code and maintaining it along with
upstream, per version, etc. I'm not eager to do that.

> and checking GUCs created by loadable modules seems impossible.
> Still, even a partial move toward making this code accessible in
> frontend code would be welcome from where I sit.

Dumping the output of the pg_settings view into json already provides
all the type/enum information needed to type-check values and flag
unrecognized GUCs. I don't really see that as involving a heroic amount
of work; the language seems extremely simple. There aren't that many
types, type-checking them isn't that much work, and once that's written
the same checks should apply to all GUCs registered with the server,
assuming the pg_settings view is exhaustive in that sense. Any modules
outside the main server can provide their own data in a similar format,
if it comes to that.

I doubt whether such a tool has that much appeal, especially if it
isn't bundled with the server. So, I'll think about this some more, and
write up a description of how I think it's going to look for some
feedback. Then I'll code something.

Regards,
Amir
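The type-checking-from-a-pg_settings-dump idea above can be sketched briefly. `CATALOG` stands in for a JSON dump of the view's `name`/`vartype`/`enumvals` columns (the two entries here match the thread's examples); the accepted boolean spellings and the function shape are my own simplification, not the server's exact GUC machinery.

```python
# Stand-in for `json.load()` of a pg_settings dump; only two GUCs shown.
CATALOG = {
    "fsync":     {"vartype": "bool", "enumvals": None},
    "wal_level": {"vartype": "enum",
                  "enumvals": ["minimal", "archive", "hot_standby", "logical"]},
}

# Simplified: the server also accepts unambiguous prefixes like "of".
BOOL_VALUES = {"on", "off", "true", "false", "yes", "no", "0", "1"}

def check_value(name, value):
    """Return an error string for a bad name/value pair, or None if OK."""
    meta = CATALOG.get(name)
    if meta is None:
        return 'unrecognized parameter "%s"' % name
    if meta["vartype"] == "bool" and value.lower() not in BOOL_VALUES:
        return 'parameter "%s" requires a Boolean value' % name
    if meta["vartype"] == "enum" and value not in meta["enumvals"]:
        return 'invalid value for parameter "%s": "%s"' % (name, value)
    return None
```

Per the email, GUCs from modules outside the main server would just contribute extra entries to the same catalog format.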
[HACKERS] Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
On 10/11/2015 03:20 AM, Peter Geoghegan wrote:
> On Thu, Sep 3, 2015 at 5:35 PM, David Rowley wrote:
>> My test cases are:
>
> Note that my text caching and unsigned integer comparison patches have
> moved the baseline down quite noticeably. I think that my mobile
> processor out-performs the Xeon you used for this, which seems a
> little odd even taking the change in baseline performance into
> account.

To add a caveat not yet mentioned, the idea behind prefetching is to
sacrifice spare memory bandwidth for performance. That can be a winning
bet on a quiet box (the one we benchmark on), but can backfire on a
production db, where the extra memory pressure can degrade all running
queries. Something to test for, or at least keep in mind.

>> set work_mem ='1GB';
>> create table t1 as select md5(random()::text) from
>> generate_series(1,1000);
>>
>> Times are in milliseconds. Median and average over 10 runs.
>>
>> Test 1

I am the reluctant owner of outmoded hardware, namely a core2 from
around 2007 on plain spinning metal. My results (linux 64bit):

-- Test 1 --
set work_mem ='1GB';
select count(distinct md5) from t1;

== Master ==
42771.040 ms  <- outlier?
41704.570 ms
41631.660 ms
41421.877 ms

== Patch ==
42469.911 ms  <- outlier?
41378.556 ms
41375.870 ms
41118.105 ms
41096.283 ms
41095.705 ms

-- Test 2 --
select sum(rn) from (select row_number() over (order by md5) rn from t1) a;

== Master ==
44186.775 ms
44137.154 ms
44111.271 ms
44061.896 ms
44109.122 ms

== Patch ==
44592.590 ms
44403.076 ms
44276.170 ms

A very slight difference in an ambiguous direction, but also no perf
catastrophe.

> It's worth considering that for some (possibly legitimate) reason, the
> built-in function call is ignored by your compiler, since GCC has
> license to do that. You might try this on both master and patched
> builds:
>
> ~/postgresql/src/backend/utils/sort$ gdb -batch -ex 'file tuplesort.o'
> -ex 'disassemble tuplesort_gettuple_common' > prefetch_disassembly.txt
>
> ...
>
> Notably, there is a prefetchnta instruction here.

I have verified the prefetch is emitted in the disassembly.

An added benefit of owning outmoded hardware is that the MSR for this
generation is public, and I can disable individual prefetcher units by
twiddling a bit. Disabling the "HW prefetch" or the "DCU prefetch"
units on a patched version gave results that look relatively unchanged,
which seems promising. Disabling them both at once on an unpatched
version shows a slowdown of 5-6% in test 1 (43347.181, 43898.705,
43399.428). That gives an indication of the maximum potential gains in
this direction, for this box at least.

Finally, I notice my results are 4x slower than everyone else's. That
can be very tough on a man's pride, let me tell you.

Amir
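For the record, the slowdown figure above can be roughly re-derived from the quoted Test 1 timings. The email doesn't state which baseline the 5-6% was measured against, so both readings are shown; medians damp the runs flagged as outliers. Against the master median it comes out near 4.2%, and against the patched median near 5.2%.

```python
from statistics import median

# Test 1 timings quoted above, in milliseconds.
master = [42771.040, 41704.570, 41631.660, 41421.877]
patched = [42469.911, 41378.556, 41375.870, 41118.105, 41096.283, 41095.705]
both_prefetchers_off = [43347.181, 43898.705, 43399.428]

def slowdown_pct(baseline_ms, slowed_ms):
    """Percent slowdown of the slowed runs' median over the baseline median."""
    return 100.0 * (median(slowed_ms) - median(baseline_ms)) / median(baseline_ms)
```

Either way, the recomputed figure is in the same ballpark as the reported 5-6%, which supports the "maximum potential gains" reading.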
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/13/2015 02:02 AM, Robert Haas wrote:
> On Fri, Oct 9, 2015 at 4:38 PM, Amir Rohan wrote:
>> It does catch bad syntax, but in most cases all you get is
>> "The setting could not be applied". That's not great for enums
>> or a float instead of an int. I guess a future version will fix that
>> (or not).
>
> I expect we would consider patches to improve the error messages if
> you (or someone else) wanted to propose such. But you don't have to
> want to do that.
>
>> You need a running server to run a check. You need to monkey
>> with said server's configuration in place to run a check. You must
>> be on 9.5+. The checking mechanism isn't extensible. Certainly not
>> as easily as dropping a new rule file somewhere. It doesn't check
>> (AFAICT) for bad combinations of values; for example it will tell
>> you that you can't change `archive_mode` without a restart (without
>> showing the source location, btw, bug?), but not that you had better
>> set `wal_level` *before* you restart. It doesn't do any semantic
>> checks. It won't warn you about things that are not actually an
>> error, just a bad idea.
>
> So, I'm not saying that a config checker has no value. In fact, I
> already said the opposite. You seem to be jumping all over me here
> when all I was trying to do is explain what I think Tom was getting
> at. I *do* think that pg_file_settings is a helpful feature that is
> certainly related to what you are trying to do, but I don't think that
> it means that a config checker is useless. Fair?

That wasn't my intention. Perhaps I'm overreacting to a long-standing
"Tom Lane's bucket of cold water" tradition. I'm new here. I understand
your point, and I was only reiterating what in particular makes the
conf checker distinctly useful IMO, and what it could provide that
pg_settings doesn't.

I've looked at parts of the pg_settings implementation, and indeed some
of that code (and legwork) could be reused, so the mundane parts of
writing this will be less hassle. I might have missed that if Tom and
you hadn't pointed it out.

So, Fair, and thanks.

Amir
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/11/2015 01:19 PM, Michael Paquier wrote:
> On Sun, Oct 11, 2015 at 4:44 PM, Amir Rohan wrote:
>> On 10/11/2015 02:47 AM, Michael Paquier wrote:
>> I apologize -- that didn't come out right.
>> What I meant to suggest was "open an issue" to track
>> any work that needs to be done. But I guess that's
>> not the PG way.
>
> No problem. I was not clear either. We could create a new item in the
> TODO list (https://wiki.postgresql.org/wiki/Todo) and link it to a
> dedicated page on the wiki where all the potential tests would be
> listed.

It couldn't hurt, but it may also be just a waste of your time. I'm
just realizing how central an issue tracker is to how I work, and how
much not having one irritates me. Tough luck for me, I guess.

Regards,
Amir
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/11/2015 02:47 AM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 10:52 PM, Amir Rohan wrote:
>> On 10/10/2015 04:32 PM, Michael Paquier wrote:
>> I was arguing that it's an on-going task that would do
>> better if it had a TODO list, instead of "ideas for tests"
>> being scattered across 50-100 messages spanning a year or
>> more in one thread or another. You may disagree.
>
> Let's be clear. I am fully in line with your point.

I apologize -- that didn't come out right. What I meant to suggest was
"open an issue" to track any work that needs to be done. But I guess
that's not the PG way.

Amir
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/10/2015 04:32 PM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 9:04 PM, Amir Rohan wrote:
>> Now that v9 fixes the problem, here's a summary from going over the
>> entire thread one last time:
>
> Thanks a lot for the summary of the events.
>
>> # Windows and TAP tests
>> Noah (2015-03) mentioned TAP doesn't work on windows, and hoped
>> this would include some work on that.
>> IIUC, the facilities and tests do run on windows, but the focus was
>> on those, not the preexisting TAP suite.
>
> They do work on Windows, see 13d856e.

Thanks, I did not know that.

>> # Test coverage (in the future)
>> Andres wanted a test for xid/multixid wraparound, which also raises
>> the question of the tests that will need to be written in the future.
>
> I recall that this would have needed extra functions on the backend...
>
>> The patch focuses on providing facilities, while providing new
>> coverage for several features. There should be a TODO list on the
>> wiki (bug tracker, actually) where the list of tests to be written
>> can be managed.
>> Some were mentioned in the thread (multi/xid wraparound,
>> hot_standby_feedback, max_standby_archive_delay and
>> max_standby_streaming_delay? recovery_target_action? some in your
>> original list?), but threads are precisely where these things fall
>> through the cracks.
>
> Sure, that's an on-going task.
>
>> # Directory structure
>> I suggested keeping backup/log/PGDATA per instance, rejected.
>
> I guess that I am still flexible on this one, the node information
> (own PGDATA, connection string, port, etc.) is logged as well so this
> is not a big deal to me...
>
>> # Parallel tests and port collisions
>> Lots about this. Final result is no port races are possible, because
>> dedicated dirs are used per test, per instance, and because tcp
>> isn't used for connections on any platform (can you confirm that's
>> true on windows as well? I'm not familiar with sspi and what OSI
>> layer it lives on)
>
> On Windows you remain with the problem that all nodes initialized
> using TestLib.pm will listen to 127.0.0.1, sspi being used to ensure
> that the connection at user level is secure (additional entries in
> pg_hba.conf are added).
>
>> # decouple cleanup from node shutdown
>> Added (in latest patches?)
>
> Yes this was added.
>
>> Michael, is there anything else to do here or shall I mark this for
>> committer review?
>
> I have nothing else. Thanks a lot!

Ok, marked for committer. I hope I'm following "correct" cf procedure.

Regards,
Amir
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/10/2015 04:32 PM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 9:04 PM, Amir Rohan wrote:
>> The patch focuses on providing facilities, while providing new
>> coverage for several features. There should be a TODO list on the
>> wiki (bug tracker, actually) where the list of tests to be written
>> can be managed.
>> Some were mentioned in the thread (multi/xid wraparound,
>> hot_standby_feedback, max_standby_archive_delay and
>> max_standby_streaming_delay? recovery_target_action? some in your
>> original list?), but threads are precisely where these things fall
>> through the cracks.
>
> Sure, that's an on-going task.

I was arguing that it's an on-going task that would do better if it had
a TODO list, instead of "ideas for tests" being scattered across 50-100
messages spanning a year or more in one thread or another. You may
disagree.

Amir
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/10/2015 02:43 PM, Michael Paquier wrote: > On Fri, Oct 9, 2015 at 8:53 PM, Michael Paquier > wrote: >> On Fri, Oct 9, 2015 at 8:47 PM, Amir Rohan wrote: >>> Ok, I've put myself down as reviewer in cfapp. I don't think I can >>> provide any more useful feedback that would actually result in changes >>> at this point, but I'll read through the entire discussion once last >>> time and write down final comments/notes. After that I have no problem >>> marking this for a committer to look at. >> >> OK. If you have any comments or remarks, please do not hesitate at all! > > So, to let everybody know the issue, Amir has reported me offlist a > bug in one of the tests that can be reproduced more easily on a slow > machine: > Yeah, I usually stick to the list for discussion, but I ran an earlier version without issues and thought this might be a problem with my system as I've changed things a bit this week. Now that v9 fixes the probkem, here's a summary from going over the entire thread one last time: # Windows and TAP sets Noah (2015-03) mentioned TAP doesn't work on windows, and hoped this would include some work on that. IIUC, the facilities and tests do run on windows, but focus was there and not the preexisting TAP suite. # Test coverage (in the future) Andres wanted a test for xid/multixid wraparound which also raises the question of the tests that will need to be written in the future. The patch focuses on providing facilities, while providing new coverage for several features. There should be a TODO list on the wiki (bug tracker, actually), where the list of tests to be written can be managed. Some were mentioned in the thread (multi/xid wraparound hot_standby_feedback, max_standby_archive_delay and max_standby_streaming_delay? recovery_target_action? some in your original list?), but threads are precisely where these things get lost in the cracks. # Interactive use vs. 
TAP tests Early on the goal was also to provide something for interactive use in order to test scenarios. The focus has since shifted to the TAP tests, and some of the choices in the API reflect that. Interactive use is possible, but wasn't a central requirement. # Directory structure I suggested keeping backup/log/PGDATA per instance, rejected. # Parallel tests and port collisions Lots about this. Final result is no port races are possible because dedicated dirs are used per test, per instance. And because tcp isn't used for connections on any platform (can you confirm that's true on windows as well? I'm not familiar with sspi and what OSI layer it lives on) # Allow test to specify shutdown mode Added # decouple cleanup from node shutdown Added (in latest patches?) # Conveniences for test writing vs. running My suggestions weren't picked up, but for one thing setting CLEANUP=0 in the lib (which means editing it...) can be useful for writers. # blocking until server ready pg_isready wrapper added. # Multiple masters back and forth, but supported in latest version. That's it. I've run the latest (v9) tests; they work and passed on my system (fedora 64bit) and also under docker with --cpu-quota=1, which simulates a slow machine. Michael, is there anything else to do here or shall I mark this for committer review? Regards, Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/09/2015 09:55 PM, Robert Haas wrote: > On Thu, Oct 8, 2015 at 9:06 AM, Amir Rohan wrote: >> On 10/08/2015 02:38 PM, Tom Lane wrote: >>> Amir Rohan writes: >>>> Comments? >>> >>> ISTM that all of the "functional" parts of this are superseded by >>> pg_file_settings; >> >> To use the new pg_file_settings view you need: >> 1) 9.5+ >> 2) a running server >> >> The feature is described as follows in a97e0c3354ace5d74 >> >>> The information returned includes the configuration file name, line >>> number in that file, sequence number indicating when the parameter is >>> loaded (useful to see if it is later masked by another definition of >>> the same parameter), parameter name, and what it is set to at that >>> point. This information is updated on reload of the server. >> >> None of which has much overlap in purpose with what I was suggesting, so >> I don't see why you think it actually makes it redundant. > > I think Tom's point is that if you want to see whether the file > reloads without errors, you can use this view for that. That would > catch stuff like wal_level=lgocial and > default_transaction_isolation='read_committed'. > It does catch bad syntax, but in most cases all you get is "The setting could not be applied". That's not great for enums or a float instead of an int. I guess a future version will fix that (or not). You need a running server to run a check. You need to monkey with said server's configuration in place to run a check. You must be on 9.5+. The checking mechanism isn't extensible. Certainly not as easily as dropping a new rule file somewhere. It doesn't check (AFAICT) for bad combinations of values; for example, it will tell you that you can't change `archive_mode` without a restart (without showing the source location, btw, bug?), but not that you had better set `wal_level` *before* you restart. It doesn't do any semantic checks. It won't warn you about things that are not actually an error, just a bad idea. 
Forgive me if any of the above betrays an ignorance of some of pg_file_settings's finer points. I have only read the documentation and tried it in a shell. There was also concern about the prohibitive maintenance burden such a tool would impose. ISTM all you actually need is a JSON file generated from the output of `select * from pg_settings` to make the really tedious bit completely automatic. The semantic checks are by their nature "best effort", and it's my impression (and only that) that they are more stable, in the sense that bad ideas remain so. That's why I disagree with Tom's suggestion that pg_file_settings obviates the need for the tool I proposed. Which isn't at all to say it doesn't solve another problem very well. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
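[Editorial note] To make the pg_settings-to-JSON idea above concrete, here is a minimal sketch. The sample rows and the pipe-separated layout are illustrative only — roughly what `psql -At -F'|' -c "select name, vartype, min_val, max_val, enumvals from pg_settings"` might emit — not a format any real tool has settled on, and the values shown are made up for the example.

```python
import json

# Hypothetical dump of a few pg_settings rows: name|vartype|min_val|max_val|enumvals
SAMPLE = """\
wal_level|enum|||{minimal,archive,hot_standby,logical}
max_wal_senders|integer|0|262143|
archive_mode|bool|||"""

def settings_to_schema(text):
    """Turn pg_settings dump lines into a JSON-able validation schema."""
    schema = {}
    for line in text.splitlines():
        name, vartype, min_val, max_val, enumvals = line.split("|")
        entry = {"type": vartype}
        if min_val:
            entry["min"] = int(min_val)
        if max_val:
            entry["max"] = int(max_val)
        if enumvals:
            entry["enum"] = enumvals.strip("{}").split(",")
        schema[name] = entry
    return schema

print(json.dumps(settings_to_schema(SAMPLE), indent=2))
```

Regenerating such a file once per release would keep the per-GUC type and range data in sync with the server without hand-maintaining it.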
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/09/2015 02:12 PM, Michael Paquier wrote: > On Fri, Oct 9, 2015 at 8:25 AM, Michael Paquier > wrote: >> On Thu, Oct 8, 2015 at 11:28 PM, Amir Rohan wrote: >>> Wouldn't this work? >>> 1) start timer >>> 2) Grab pg_stat_replication.sent_location from master >>> 3) pg_switch_xlog() # I /think/ we want this, could be wrong >> >> For a warm standby, you would want that, but this depends on two factors: >> - The moment master completes archiving of this segment >> - The moment standby restores it. >> On slow machines, those two things become by far the bottleneck, >> imagine a PI restricted on I/O with a low-class SD card in the worst >> case (I maintain one, with a good card, still the I/O is a >> bottleneck). >> >>> 4) Poll slave's pg_last_xlog_replay_location() until LSN shows up >>> 5) stop timer >> >> That's not really solid, there is an interval of time between the >> moment the LSN position is taken from the master and the standby. An >> accurate method is to log/store on master when a given WAL position >> has been flushed to disk, and do the same on slave at replay for this >> LSN position. In any case this is doing to flood badly the logs of >> both nodes, and as the backend cares about the performance of >> operations in this code path we won't want to do that anyway. >> >> To make it short, it seems to me that simply waiting until the LSN a >> test is waiting for has been replayed is just but fine for this set of >> tests to ensure their run consistency, let's not forget that this is >> the goal here. > > In terms of features, it seems that this patch has everything it needs > to allow one to design tests to work on both Linux and Windows, and it > is careful regarding CVE-2014-0067. Thoughts about moving that as > "Ready for committer"? > Ok, I've put myself down as reviewer in cfapp. 
I don't think I can provide any more useful feedback that would actually result in changes at this point, but I'll read through the entire discussion one last time and write down final comments/notes. After that I have no problem marking this for a committer to look at. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/08/2015 04:47 PM, Michael Paquier wrote: > On Thu, Oct 8, 2015 at 6:03 PM, Amir Rohan wrote: >> On 10/08/2015 10:39 AM, Michael Paquier wrote: >>>> Someone mentioned a daisy chain setup which sounds fun. Anything else in >>>> particular? Also, it would be nice to have some canned way to measure >>>> end-to-end replication latency for variable number of nodes. >>> >>> Hm. Do you mean comparing the LSN position between two nodes even if >>> both nodes are not connected to each other? What would you use it for? >>> >> >> In a cascading replication setup, the typical _time_ it takes for a >> COMMIT on master to reach the slave (assuming constant WAL generation >> rate) is an important operational metric. > > Hm. You mean the exact amount of time it gets to be sure that a given > WAL position has been flushed on a cascading standby, be it across > multiple layers. Er, that's a bit tough without patching the backend > where I guess we would need to keep a track of when a LSN position has > been flushed. And calls of gettimeofday are expensive, so that does > not sound like a plausible alternative here to me... > Wouldn't this work? 1) start timer 2) Grab pg_stat_replication.sent_location from master 3) pg_switch_xlog() # I /think/ we want this, could be wrong 4) Poll slave's pg_last_xlog_replay_location() until LSN shows up 5) stop timer Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
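[Editorial note] The five steps proposed above could be sketched roughly as follows. `master_query` and `standby_query` are hypothetical stand-ins for whatever runs SQL against each node (thin psql wrappers, a driver); the LSN comparison here is a naive string compare, which real code would replace with a numeric comparison.

```python
import time

def measure_replication_lag(master_query, standby_query,
                            poll_interval=0.1, timeout=30.0):
    """Rough end-to-end lag: grab the master's send position, then poll the
    standby until replay has reached it.  Returns elapsed seconds."""
    start = time.monotonic()                                  # step 1
    # step 2: current position on master
    target = master_query("SELECT sent_location FROM pg_stat_replication")
    # step 3 (maybe): force a segment switch so archive-based setups advance
    # master_query("SELECT pg_switch_xlog()")
    # step 4: poll the standby until the target LSN shows up as replayed
    while time.monotonic() - start < timeout:
        replayed = standby_query("SELECT pg_last_xlog_replay_location()")
        if replayed >= target:      # naive; compare LSNs numerically for real
            break
        time.sleep(poll_interval)
    return time.monotonic() - start                           # step 5
```

As noted in the quoted reply, the archive/restore round-trip, not the polling, dominates on slow machines, so the number is only an upper bound on replay lag.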
Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
On 10/08/2015 02:38 PM, Tom Lane wrote: > Amir Rohan writes: >> Comments? > > ISTM that all of the "functional" parts of this are superseded by > pg_file_settings; To use the new pg_file_settings view you need: 1) 9.5+ 2) a running server The feature is described as follows in a97e0c3354ace5d74 > The information returned includes the configuration file name, line > number in that file, sequence number indicating when the parameter is > loaded (useful to see if it is later masked by another definition of > the same parameter), parameter name, and what it is set to at that > point. This information is updated on reload of the server. None of which has much overlap in purpose with what I was suggesting, so I don't see why you think it actually makes it redundant. > or at least, if they aren't, you need to provide a > rationale that doesn't consist only of pointing to pre-9.5 discussions. The original rationale already does AFAICT. > The "advice" parts of it maybe are still useful, but a tool that's just > meant for that would look quite a bit different. Maybe what you're really > after is to update pgtune. > In the sense that pgtune processes .conf files and helps the user choose better settings, yes, there's commonality. But in that it looks at 5 GUCs or so and applies canned formulas to silently write out a new .conf without explanation, it's not the same tool. > Also, as a general comment, the sketch you provided seems to require > porting everything the server knows about > GUC file syntax, Yes, a parser, possibly sharing code with the server. > file locations, Specified via command line option, yes. > variable names and values misc/guc.c is eminently parseable in that sense. For some types, validation is trivial. For others, we do the best we can and improve incrementally. > into some other representation that apparently > is not C, nor Perl either. Well, it's data, it probably should have that anyway. 
You raise an interesting issue: having the schema for the configuration described in a more congenial format than C code would have made implementing this easier, and may actually be worthwhile in its own right. An aside on parsing: when starting a brand new configuration file format was suggested in caog9aphycpmtypaawfd3_v7svokbnecfivmrc1axhb40zbs...@mail.gmail.com/ just so JSON could be used for configuring that feature, I wrote, as a toy exercise, a parser that extends postgresql.conf to accept JSON objects as values. It's not rocket science. So, none of the above seems like a major hurdle to me. > I think that's likely to be impossible to keep accurate or up to date. If significant rot sets in a few releases down the line, there may be more false positives. But then again, if users found it useful enough that developers cared about keeping it up to date, it wouldn't get that way. Again, see the previous note about DRY and the lack of an explicit schema for GUCs except in code form. Also, this depends crucially on the GUC churn rate, which admittedly you are far better qualified to pronounce on than I am. > Just as a thought experiment, > ask yourself how you'd validate the TimeZone setting in external code, bearing in mind that > anytime you don't give exactly the same answer the server would, your tool > has failed to be helpful. A tool may be extremely useful for a hundred checks, and poor in a few others. That would make it imperfect, not useless. If TimeZone turns out to be an extremely challenging GUC to validate, I'd unceremoniously skip the semantic check for it. Kind Regards, Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files
Hi, Related SO question from '13: https://stackoverflow.com/questions/1619/how-to-syntax-check-postgresql-config-files Peter Eisentraut answered: > There is no way to do this that is similar to apache2ctl. If you reload > the configuration files and there is a syntax error, the PostgreSQL > server will complain in the log and refuse to load the new file. > <...> > (Of course, this won't guard you against writing semantically wrong > things, but apache2ctl won't do that either.) So, at least one person in the history of the universe (besides me) has wanted a tool that does this. In addition to a simple syntax check, there's a bunch of "config wisdom" tidbits I've encountered, which are scattered through talks, commit messages, mailing list discussions, and documentation notes (chapter 17, paragraph 12). These could be collected into a tool that: - Checks your configuration's syntax - Checks for semantically legal values ('read committed', not 'read_committed') - Warns of unrecognized keys ("'hot_standby' is not a valid GUC in v9.1"). - Is version-aware. - Serves as an "executable" form of past experience. - Describes the config problem in a structured way (as an expression, for example) - Includes a friendly, semi-verbose description of the problem. - Describes ways to fix the problem, *right there*. - Is independent of the server (but reuses the same parser), particularly of any life-cycle commands such as restart. Users who adopt the tool also seem likely to contribute back coverage for any new pitfalls they fall into, which can yield feedback for developers and drive improvements in documentation. Those inclined can use the tool in their ops repo's CI. Two talk videos have been particularly valuable sources of examples of configuration that can bite you: "...Lag - What's Wrong with My Slave?" 
(Samantha Billington, 2015) https://youtu.be/If22EwtCFKc "PostgreSQL Replication Tutorial" (Josh Berkus, 2015) https://youtu.be/GobQw9LMEaw What I've collected so far: - Quoting rules for recovery.conf and postgresql.conf are different - 'primary_conninfo' is visible to unprivileged users, so including a password is a security risk. - streaming replication + read slave should consider turning on "hot_standby_feedback". - replication_timeout < wal_receiver_status_interval is bad. - setting max_wal_senders too low, so that streaming replication leaves no free slots and blocks backup with pg_basebackup - max_wal_senders without wal_level - recently on "weekly news": Fujii Masao pushed: Document that max_worker_processes must be high enough in standby. The setting values of some parameters including max_worker_processes must be equal to or higher than the values on the master. However, previously max_worker_processes was not listed as such parameter in the document. So this commit adds it to that list. Back-patch to 9.4 where max_worker_processes was added. http://git.postgresql.org/pg/commitdiff/1ea5ce5c5f204918b8a9fa6eaa8f3f1374aa8aec - turning on replication with default max_wal_senders=0 - FATAL: WAL archival (archive_mode=on) requires wal_level "archive", "hot_standby", or "logical" There must be more; please mention any other checks you feel should be included. Note: The server _will_ explicitly complain about the last two, but a bad "wal_level", for example, is only checked once the server is already down: "LOG: parameter "wal_level" cannot be changed without restarting the server" Implementation: without getting too far ahead, I feel rules should be stored as independent files in a drop directory, in some lightweight, structured format (JSON, YAML, custom textual format), with some mandatory fields (version, severity, description, solution(s)). This makes it easy for people to add new checks without making any oblique language demands (Perl isn't as popular as it used to be) Comments? 
Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
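[Editorial note] The drop-directory rule format sketched in the proposal above could look something like this. All field names and the matching semantics are hypothetical — a sketch of the idea, not an agreed spec — using the archive_mode/wal_level dependency from the list as the example rule.

```python
import json

# A hypothetical rule file, e.g. rules/archive_mode_wal_level.json
RULE = json.loads("""
{
  "version": "9.4+",
  "severity": "fatal",
  "when": {"archive_mode": "on", "wal_level": "minimal"},
  "description": "WAL archival cannot be enabled when wal_level is minimal",
  "solution": "set wal_level to archive, hot_standby, or logical"
}
""")

def check(config, rule):
    """Fire the rule if every key/value in its 'when' clause matches config."""
    if all(config.get(k) == v for k, v in rule["when"].items()):
        return "%s: %s (fix: %s)" % (
            rule["severity"].upper(), rule["description"], rule["solution"])
    return None

print(check({"archive_mode": "on", "wal_level": "minimal"}, RULE))
```

A real checker would want richer predicates than exact equality (comparisons, version ranges), but even this much captures the "dependent options" class of mistakes before the server ever restarts.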
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/08/2015 10:39 AM, Michael Paquier wrote: > On Thu, Oct 8, 2015 at 3:59 PM, Amir Rohan wrote: >> On 10/08/2015 08:19 AM, Michael Paquier wrote: >>> On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote: >>>> On 10/07/2015 10:29 AM, Michael Paquier wrote: >>>>> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote: >>>>>> Also, the removal of poll_query_until from pg_rewind looks suspiciously >>>>>> like a copy-paste gone bad. Pardon if I'm missing something. >>>>> >>>>> Perhaps. Do you have a suggestion regarding that? It seems to me that >>>>> this is more useful in TestLib.pm as-is. >>>>> >>>> >>>> My mistake, the patch only shows some internal function being deleted >>>> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is >>>> a better place for it. >>> >>> OK. Here is a new patch version. I have removed the restriction >>> preventing to call make_master multiple times in the same script (one >>> may actually want to test some stuff related to logical decoding or >>> FDW for example, who knows...), forcing PGHOST to always use the same >>> value after it has been initialized. I have added a sanity check >>> though, it is not possible to create a node based on a base backup if >>> no master are defined. This looks like a cheap insurance... I also >>> refactored a bit the code, using the new init_node_info to fill in the >>> fields of a newly-initialized node, and I removed get_free_port, >>> init_node, init_node_from_backup, enable_restoring and >>> enable_streaming from the list of routines exposed to the users, those >>> can be used directly with make_master, make_warm_standby and >>> make_hot_standby. We could add them again if need be, somebody may >>> want to be able to get a free port, set up a node without those >>> generic routines, just that it does not seem necessary now. >>> Regards, >>> >> >> If you'd like, I can write up some tests for cascading replication which >> are currently missing. 
> > 001 is testing cascading, like that node1 -> node2 -> node3. > >> Someone mentioned a daisy chain setup which sounds fun. Anything else in >> particular? Also, it would be nice to have some canned way to measure >> end-to-end replication latency for variable number of nodes. > > Hm. Do you mean comparing the LSN position between two nodes even if > both nodes are not connected to each other? What would you use it for? > In a cascading replication setup, the typical _time_ it takes for a COMMIT on master to reach the slave (assuming constant WAL generation rate) is an important operational metric. It would be useful to catch future regressions for that metric, which may happen even when a patch doesn't outright break cascading replication. Just automating the measurement could be useful if there's no pg facility that tracks performance over time in a regimented fashion. I've seen multiple projects which consider a "benchmark suite" to be part of their testing strategy. As for the "daisy chain" thing, it was (IIRC) mentioned in a Josh Berkus talk I caught on youtube. It's possible to set up cascading replication, take down the master, and then reinsert it as a replicating slave, so that you end up with *all* servers replicating from the ancestor in the chain, and no master. I think it was more a fun hack than anything, but also an interesting corner case to investigate. >> What about going back through the commit log and writing some regression >> tests for the real stinkers, if someone cares to volunteer some candidate >> bugs > > I have drafted a list with a couple of items upthread: > http://www.postgresql.org/message-id/cab7npqsgffsphocrhfoasdanipvn6wsh2nykf1kayrm+9_m...@mail.gmail.com > So based on the existing patch the list becomes as follows: > - wal_retrieve_retry_interval with a high value, say setting to for > example 2/3s and loop until it is applied by checking it is it has > been received by the standby every second. 
> - recovery_target_action > - archive_cleanup_command > - recovery_end_command > - pg_xlog_replay_pause and pg_xlog_replay_resume > In the list of things that could have a test, I recall that we should > test as well 2PC with the recovery delay, look at a1105c3d. This could > be included in 005. a1105c3 Mar 23 Fix copy & paste error in 4f1b890b137. Andres Freund 4f1b890 Mar 15 Merge the various forms of transaction commit & abort records. Andres Freund Is that the right commit? > The advantage of implementing that now is
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/08/2015 08:19 AM, Michael Paquier wrote: > On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote: >> On 10/07/2015 10:29 AM, Michael Paquier wrote: >>> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote: >>>> Also, the removal of poll_query_until from pg_rewind looks suspiciously >>>> like a copy-paste gone bad. Pardon if I'm missing something. >>> >>> Perhaps. Do you have a suggestion regarding that? It seems to me that >>> this is more useful in TestLib.pm as-is. >>> >> >> My mistake, the patch only shows some internal function being deleted >> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is >> a better place for it. > > OK. Here is a new patch version. I have removed the restriction > preventing to call make_master multiple times in the same script (one > may actually want to test some stuff related to logical decoding or > FDW for example, who knows...), forcing PGHOST to always use the same > value after it has been initialized. I have added a sanity check > though, it is not possible to create a node based on a base backup if > no master are defined. This looks like a cheap insurance... I also > refactored a bit the code, using the new init_node_info to fill in the > fields of a newly-initialized node, and I removed get_free_port, > init_node, init_node_from_backup, enable_restoring and > enable_streaming from the list of routines exposed to the users, those > can be used directly with make_master, make_warm_standby and > make_hot_standby. We could add them again if need be, somebody may > want to be able to get a free port, set up a node without those > generic routines, just that it does not seem necessary now. > Regards, > If you'd like, I can write up some tests for cascading replication which are currently missing. Someone mentioned a daisy chain setup which sounds fun. Anything else in particular? Also, it would be nice to have some canned way to measure end-to-end replication latency for variable number of nodes. 
What about going back through the commit log and writing some regression tests for the real stinkers, if someone cares to volunteer some candidate bugs. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/07/2015 10:29 AM, Michael Paquier wrote: > On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote: >> On 10/07/2015 09:27 AM, Michael Paquier wrote: >>> On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier >>> wrote: >>>> On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier >>>> wrote: >>>>> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote: >>>>>> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote: >>>>>> It seems that these days 'make check' creates a directory in /tmp >>>>>> called /tmp/pg_regress-RANDOMSTUFF. Listening on TCP ports is >>>>>> disabled, and the socket goes inside this directory with a name like >>>>>> .s.PGSQL.PORT. You can connect with psql -h >>>>>> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP. This basically >>>>>> removes the risk of TCP port number collisions, as well as the risk of >>>>>> your temporary instance being hijacked by a malicious user on the same >>>>>> machine. >>>>> >>>>> Right, that's for example /var/folders/ on OSX, and this is defined >>>>> once per test run via $tempdir_short. PGHOST is set to that as well. >>>> >>>> Er, mistake here. That's actually once per standard_initdb, except >>>> that all the tests I have included in my patch run it just once to >>>> create a master node. It seems that it would be wiser to set one >>>> socket dir per node then, remove the port assignment stuff, and use >>>> tempdir_short as a key to define a node as well as in the connection >>>> string to this node. I'll update the patch later today... >>> >>> So, my conclusion regarding multiple calls of make_master is that we >>> should not allow to do it. On Unix/Linux we could have a separate unix >>> socket directory for each node, but not on Windows where >>> listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST >>> is set by the master node to a tempdir once and for all. Hence, to >>> make the code more consistent, I think that we should keep the port >>> lookup machinery here. An updated patch is attached. 
>>> >> If parallel tests are supported, get_free_port is still racy even >> with last_port_found because it's: >> 1) process-local. >> 2) even if it were shared, there's the race window between the >> available-check and the listen() I mentioned upthread. >> >> If parallel tests are explicitly disallowed, a comment to that >> effect (and a note on things known to break) might help someone >> down the road. > > Actually, no, port lookup will not map and parallel tests would work > fine thinking more about it, each set of tests uses its own PGHOST to > a private unix socket directory so even if multiple tests use the same > port number they won't interact with each other because they connect > to different socket paths. MinGW is a problem though, and an existing > one in the perl test scripts, I recall that it can use make -j and > that's on Windows where PGHOST is mapping to 127.0.0.1 only. > ah, the portnum is actually a real tcp port only on windows, and the race is limited to that case as you say. Note that in the tcp case, using psql to check is wrong: $ nc -l 8001 # listen on 8001 $ psql -X -h lo -p 8001 postgres < /dev/null psql: could not connect to server: Connection refused Is the server running on host "lo" (127.0.0.1) and accepting TCP/IP connections on port 8001? The port isn't free, but psql is really only checking if pg is there and reports that the port is available. That's a fairly mild issue, though. >> Also, the removal of poll_query_until from pg_rewind looks suspiciously >> like a copy-paste gone bad. Pardon if I'm missing something. > > Perhaps. Do you have a suggestion regarding that? It seems to me that > this is more useful in TestLib.pm as-is. > My mistake, the patch only shows some internal function being deleted but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is a better place for it. 
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
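[Editorial note] The nc/psql transcript above shows the trap: a psql-based probe reports "available" whenever the listener isn't a postgres server. A plain socket probe avoids that, since it asks whether *anything* is listening. A rough sketch of the idea (not the patch's actual Perl code):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """True if something -- postgres or not -- is already listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0  # 0 means connect succeeded

# Demo: occupy an ephemeral port ourselves (playing the role of nc -l),
# then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
print(port_in_use(port))   # True: the port is taken, even with no pg there
listener.close()
```

This still leaves the check-then-listen race discussed elsewhere in the thread; it only fixes what the check itself answers.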
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/07/2015 09:27 AM, Michael Paquier wrote: > On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier > wrote: >> On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier >> wrote: >>> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote: On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote: It seems that these days 'make check' creates a directory in /tmp called /tmp/pg_regress-RANDOMSTUFF. Listening on TCP ports is disabled, and the socket goes inside this directory with a name like .s.PGSQL.PORT. You can connect with psql -h /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP. This basically removes the risk of TCP port number collisions, as well as the risk of your temporary instance being hijacked by a malicious user on the same machine. >>> >>> Right, that's for example /var/folders/ on OSX, and this is defined >>> once per test run via $tempdir_short. PGHOST is set to that as well. >> >> Er, mistake here. That's actually once per standard_initdb, except >> that all the tests I have included in my patch run it just once to >> create a master node. It seems that it would be wiser to set one >> socket dir per node then, remove the port assignment stuff, and use >> tempdir_short as a key to define a node as well as in the connection >> string to this node. I'll update the patch later today... > > So, my conclusion regarding multiple calls of make_master is that we > should not allow to do it. On Unix/Linux we could have a separate unix > socket directory for each node, but not on Windows where > listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST > is set by the master node to a tempdir once and for all. Hence, to > make the code more consistent, I think that we should keep the port > lookup machinery here. An updated patch is attached. > If parallel tests are supported, get_free_port is still racy even with last_port_found because it's: 1) process-local. 
2) even if it were shared, there's the race window between the available-check and the listen() I mentioned upthread. If parallel tests are explicitly disallowed, a comment to that effect (and a note on things known to break) might help someone down the road. Also, the removal of poll_query_until from pg_rewind looks suspiciously like a copy-paste gone bad. Pardon if I'm missing something. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/04/2015 04:29 PM, Andres Freund wrote: > On October 4, 2015 3:27:00 PM GMT+02:00, Amir Rohan > wrote: > >> Perhaps it would help a little if you posted the latest patch here as >> well? So that at least the app picks it up again. > > You can add additional threads in the cf app. > Done, thank you. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/02/2015 03:33 PM, Michael Paquier wrote: > > Michael, I'm afraid my email bungling has damaged your thread. I didn't include an "In-reply-To" header when I posted: trinity-b4a8035d-59af-4c42-a37e-258f0f28e44a-1443795007012@3capp-mailcom-lxa08. And we subsequently had our discussion over there instead of here, where the commitfest app is tracking it. https://commitfest.postgresql.org/6/197/ Perhaps it would help a little if you posted the latest patch here as well? So that at least the app picks it up again. Apologies for my ML n00bness, Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/03/2015 03:50 PM, Amir Rohan wrote: > On 10/03/2015 02:38 PM, Michael Paquier wrote: >> On Fri, Oct 2, 2015 at 11:10 PM, Amir Rohan wrote: >>> On 10/02/2015 03:33 PM, Michael Paquier wrote: >>> >>> Granted, you have to try fairly hard to shoot yourself in the leg, >>> but since the solution is so simple, why not? If we never reuse ports >>> within a single test, this goes away. >> >> Well, you can reuse the same port number in a test. Simply teardown >> the existing node and then recreate a new one. I think that port >> number assignment to a node should be transparent to the caller, in >> our case the perl test script holding a scenario. >> > > What part of "Never assign the same port twice during one test" > makes this "not transparent to the user"? > > If you're thinking about parallel tests, I don't think you > need to worry. Availability checks take care of one part. Except, now that I think of it, that's definitely a race: Thread1: is_available(5432) -> True Thread2: is_available(5432) -> True Thread1: listen(5432) -> True Thread2: listen(5432) -> #$@#$&@#$^&$#@& I don't know if parallel tests are actually supported, though. If they are, you're right that this is a shared global resource wrt concurrency. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
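[Editorial note] The interleaving above is the classic check-then-act race. One way processes conventionally sidestep it is to let the kernel pick the port atomically by binding to port 0 and holding the socket; there is no window between "check" and "listen". It only helps the reserving process itself, though — handing the number to a separately started server reopens a small window — so this is a sketch of the idea, not a drop-in fix for the test harness:

```python
import socket

def reserve_port():
    """Atomically grab a free TCP port: the kernel assigns an unused port
    at bind time, and the port stays ours while the socket is held open --
    no is_available()/listen() gap for another thread to sneak into."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))   # port 0: kernel picks an unused port
    s.listen(1)
    return s, s.getsockname()[1]

sock1, port1 = reserve_port()
sock2, port2 = reserve_port()
assert port1 != port2          # cannot collide while both are held
sock1.close()
sock2.close()
```

A shared "global mapping file" of ports, as Michael suggests, attacks the same problem across processes; the two approaches aren't exclusive.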
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/03/2015 02:38 PM, Michael Paquier wrote: > On Fri, Oct 2, 2015 at 11:10 PM, Amir Rohan wrote: >> On 10/02/2015 03:33 PM, Michael Paquier wrote: >>> Any server instances created during the tests should never use a >>> user-defined port for portability. Hence using those ports as keys >>> just made sense. We could have for example custom names, that have >>> port values assigned to them, but that's actually an overkill and >>> complicates the whole facility. >>> >> >> Something like: >> >> global nPortsAssigned = 0; >> AssignPort() -> return is_ok(nPortsAssigned++) >> >> was what I used. > > Why do you need that? Creating a node is in the most basic way a > matter of calling make_master, make_*_standby where a port number, or > identifier gets uniquely assigned to a node. The main point of the > approach taken by this patch is to make port assignment transparent > for the caller. > See next. >> Granted, you have to try fairly hard to shoot yourself in the leg, >> but since the solution is so simple, why not? If we never reuse ports >> within a single test, this goes away. > > Well, you can reuse the same port number in a test. Simply teardown > the existing node and then recreate a new one. I think that port > number assignment to a node should be transparent to the caller, in > our case the perl test script holding a scenario. > What part of "Never assign the same port twice during one test" makes this "not transparent to the user"? If you're thinking about parallel tests, I don't think you need to worry. Availability checks take care of one part, and the portnum-as-map-key-is-test-local takes care of the other. But, see next. > >>>> 4) Port assignment relies on liveness checks on running servers. >>>> If a server is shut down and a new one instantiated, the port will get >>>> reused, data will get trashed, and various confusing things can happen. >>> >>> Right. 
The safest way to do that is to check in get_free_port if a >>> port number is used by a registered node, and continue to loop in if >>> that's the case. So done. >>> >> >> That eliminates the "sweet and gentle" variant of the scenario, but it's >> susceptible to the "ABA problem": >> https://en.wikipedia.org/wiki/ABA_problem >> https://youtu.be/CmxkPChOcvw?t=786 > > I learnt a new thing here. That's basically an existing problem even > with the existing perl test infrastructure relying on TestLib.pm when > tests are run in parallel. What we would need here is a global mapping > file storing all the port numbers used by all the nodes currently in > the tests. > Yeah, a poorman's way to ensure ports aren't reused (I wasn't very clear at the top of my post) is something like:

global nPortsAssigned = 0;
AssignPort():
    basePort = BASE_PORT;  # the lowest port we use
    while(!available(basePort + nPortsAssigned)):
        basePort++
    nPortsAssigned++
    return basePort;

It has its glaring faults, but would probably work ok. In any case, I'm sure you can do better. >> >> Granted, you have to try fairly hard to shoot yourself in the leg, >> but since the solution is so simple, why not? If we never reuse ports >> within a single test, this goes away. > > Well, you can reuse the same port number in a test. Simply teardown > the existing node and then recreate a new one. I think that port > number assignment to a node should be transparent to the caller, in > our case the perl test script holding a scenario. > My point was that you *never* want to reuse port numbers. That is, as long as the lib ensures we never reuse ports within one test, all kinds of corner cases are eliminated. >>>> 5) Servers are shutdown with -m 'immediate', which can lead to races >>>> in the script when archiving is turned on. That may be good for some >>>> tests, but there's no control over it. >>> >>> I hesitated with fast here actually. So changed this way. 
We would >>> want as well a teardown command to stop the node with immediate and >>> unregister the node from the active list. >>> >> >> In particular, I was shutting down an archiving node and the archiving >> was truncated. I *think* smart doesn't do this. But again, it's really >> that the test writer can't easily override, not that the default is wrong. > > Ah, OK. Then fast is just fine. It shuts down the node correctly. > "smart" would wait for all the current connections to finish but I am > wondering if it
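As a sanity check on the "never reuse ports" idea discussed above, here is a runnable rendition of the poorman's allocator sketched earlier in the thread (Python purely for illustration; the TAP infrastructure is Perl, and BASE_PORT's value is just a placeholder):

```python
import socket

BASE_PORT = 5432   # lowest port we will hand out (illustrative value)
_next_offset = 0   # how far past BASE_PORT we have already assigned

def _available(port):
    # "available" here means: nothing is currently accepting connections
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(("127.0.0.1", port)) != 0

def assign_port():
    """Hand out a free port, never repeating one within this run.

    Monotonic assignment sidesteps the ABA hazard: a port freed by a
    torn-down node is never recycled, so stale state left behind by a
    dead server can't be mistaken for the live one."""
    global _next_offset
    port = BASE_PORT + _next_offset
    while not _available(port):
        port += 1
    _next_offset = port - BASE_PORT + 1
    return port
```

It still has the liveness-probe race of any check-then-use scheme, but it guarantees the "no reuse within one test" invariant by construction.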
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 10/02/2015 03:33 PM, Michael Paquier wrote: > Any server instances created during the tests should never use a > user-defined port for portability. Hence using those ports as keys > just made sense. We could have for example custom names, that have > port values assigned to them, but that's actually an overkill and > complicates the whole facility. > Something like:

global nPortsAssigned = 0;
AssignPort() -> return is_ok(nPortsAssigned++)

was what I used. >> 2) Behaviour (paths in particular) is hardwired rather than overridable >> defaults. > > This is the case of all the TAP tests. We could always use the same > base directory for all the nodes and then embed a sub-directory whose > name is decided using the port number. But I am not really sure if > that's a win. > I understand, but it eliminates the kind of scenarios this convenience package lets you express... conveniently. >> This is exactly what I needed to test, problems: >> 3) Can't stop server without clearing its testing data (the maps holding >> paths and things). But that data might be specifically >> needed, in particular the backup shouldn't disappear when the >> server melts down or we have a very low-grade DBA on our hands. > > OK, you have a point here. You may indeed want routines to enable > and disable a node completely decoupled from start and stop, with > something like enable_node and disable_node that basically registers > or unregisters it from the list of active nodes. I have updated the > patch this way. > Excellent. >> 4) Port assignment relies on liveness checks on running servers. >> If a server is shut down and a new one instantiated, the port will get >> reused, data will get trashed, and various confusing things can happen. > > Right. The safest way to do that is to check in get_free_port if a > port number is used by a registered node, and continue to loop in if > that's the case. So done. 
> That eliminates the "sweet and gentle" variant of the scenario, but it's susceptible to the "ABA problem": https://en.wikipedia.org/wiki/ABA_problem https://youtu.be/CmxkPChOcvw?t=786 Granted, you have to try fairly hard to shoot yourself in the leg, but since the solution is so simple, why not? If we never reuse ports within a single test, this goes away. >> 5) Servers are shutdown with -m 'immediate', which can lead to races >> in the script when archiving is turned on. That may be good for some >> tests, but there's no control over it. > > I hesitated with fast here actually. So changed this way. We would > want as well a teardown command to stop the node with immediate and > unregister the node from the active list. > In particular, I was shutting down an archiving node and the archiving was truncated. I *think* smart doesn't do this. But again, it's really that the test writer can't easily override, not that the default is wrong. >> Other issues: >> 6. Directory structure, used one directory per thing but more logical >> to place all things related to an instance under a single directory, >> and name them according to role (57333_backup, and so on). > > Er, well. The first version of the patch did so, and then I switched > to an approach closer to what the existing TAP facility is doing. But > well let's simplify things a bit. > I know, I know, but:

1) An instance is a "thing" in your script, so having its associated paraphernalia in one place makes more sense (maybe only to me).
2) That's often how folks (well, how I) arrange things in deployment, at least with archive/backup as symlinks to the NAS.

Alternatively, naming the dirs with a prefix (srv_foo_HASH, backup_foo_backupname_HASH, etc') would work as well. >> 7. enable_restoring() uses "cp -i" 'archive_command', not a good fit >> for an automated test. > > This seems like a good default to me, and actually that's portable on > Windows easily. 
One could always append a custom archive_command in a > test when for example testing conflicting archiving when archive_mode > = always. > Ok, I wasn't sure about this, but specifically activating a switch that asks for input from the user during a test? hmm. >> 8. No canned way to output a pprinted overview of the running system >> (paths, ports, for manual checking). > > Hm. Why not... Are you suggesting something like print_current_conf > that goes through all the registered nodes and outputs that? How would > you use it? > For one thing, I could open a few terminals and `$(print_env_for_server 5437)`, so psql just worked. I wish PEG had that as well. >> 10. If a test passes/fails or dies due to a bug, everything is cleaned. >> Great for testing, bad for postmortem. > > That's something directly related to TestLib.pm where > File:Temp:tempdir creates a temporary path with CLEANUP = 1. We had > discussions regarding that actually... > >> 11. a canned "server is responding to queries" helper would be convenient. > > Do you mean a wrapper on pg_isready? Do you have use cases in mind f
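The `print_env_for_server` helper imagined above could be a few lines of shell. A sketch only: the function name comes from this discussion, not from the patch, and the exported variables are the standard libpq ones:

```shell
# Emit shell exports so psql connects to the test node on port $1.
# Usage:  eval "$(print_env_for_server 5437)"; psql
print_env_for_server() {
    local port="$1"
    echo "export PGHOST=127.0.0.1"
    echo "export PGPORT=${port}"
    echo "export PGDATABASE=postgres"
}
```

Because it only prints exports, it composes with anything: `eval` it in an interactive terminal, or capture it into an env file for later.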
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
> On 09/30/2015 03:27 PM, Tom Lane wrote: >> Josh Berkus writes: >>> On 09/30/2015 03:10 PM, Tom Lane wrote: I'd be feeling a lot more positive about this whole thread if any people had stepped up and said "yes, *I* will put in a lot of grunt-work to make something happen here". The lack of any volunteers suggests strongly that this thread is a waste of time, just as the several similar ones before it have been. >> >>> Hmmm? Frost volunteered to stand up debbugs. >> >> So he did, and did anyone volunteer to put data into it, or to do ongoing >> curation of said data? If we simply connect it up to the mailing lists, >> and then stand back and wait for magic to happen, we will not ever have >> anything that's any more useful than the existing mailing list archives. On 10/01/2015 01:49 AM, Alvaro Herrera wrote: > Josh Berkus wrote: > >> Well, it's hard for anyone to volunteer when we don't know what the >> actual volunteer tasks are. I certainly intend to do *something* to >> support the bug tracker system, but I don't know yet what that something is. > > I volunteer to do something too, as long as I don't have to click on > endless web forms. > I volunteer to help populate the new tracker from whatever form the data currently exists in, and will also put into action any reasonable scraping/augmentation ideas put forth to make it a more productive tool. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patch: Revised documentation on base backups
On 09/28/2015 06:33 AM, Michael Paquier wrote: > On Sun, Sep 27, 2015 at 11:27 AM, Amir Rohan wrote: >> Further editing. See attached V3. > > I think that you should consider adding this patch to the next commit > fest so as we do not lose track of it: > https://commitfest.postgresql.org/7/ > I've recently picked up the base backup section of the documentation: http://www.postgresql.org/docs/9.4/static/continuous-archiving.html (Section 24.3.2) and went through it like a tutorial. I'm a relative newcomer to postgres, so this was the real deal. The docs seem to be in need of some improvement, primarily because they have the feel of having been monkey-patched repeatedly in the past. It needs to have someone with more experience read through it and comment on technical accuracy and any knee-slapper errors I might have introduced. One pain point in rewriting is the definition of terms, or lack thereof, in the manual. I didn't find a section defining terms conclusively for consistent referencing throughout the documentation. For example, how should one refer to the directory housing "postgresql.conf"? Is it:

1. The data directory
2. The storage directory
3. The PGDATA directory
4. The cluster directory
5. The server root directory
6. The config file directory
7. etc'

This lack of clear guidelines and uniformly applied terms makes for some awkward verbal manoeuvring and compromises the quality of the documentation (but admittedly, not its copiousness). 
Amir >From f763259e6e91d69dd1c41f67169fa0666b1f82d9 Mon Sep 17 00:00:00 2001 From: root Date: Sat, 26 Sep 2015 11:50:43 +0300 Subject: [PATCH] Rewritten documentation on base backups V2 diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 7413666..e2727e1 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -427,8 +427,8 @@ tar -cf backup.tar /usr/local/pgsql/data If simultaneous snapshots are not possible, one option is to shut down the database server long enough to establish all the frozen snapshots. - Another option is to perform a continuous archiving base backup () because such backups are immune to file + Another option is to create a base backup using the continuous archiving feature + () because such backups are immune to file system changes during the backup. This requires enabling continuous archiving just during the backup process; restore is done using continuous archive recovery (). @@ -752,60 +752,58 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_xlog/ Making a Base Backup -The easiest way to perform a base backup is to use the - tool. It can create -a base backup either as regular files or as a tar archive. If more -flexibility than can provide is -required, you can also make a base backup using the low level API -(see ). +A base backup consists of one or more WAL files and a . +Together they are sufficient to recreate the database's state at some point in the past. +Once a base backup is made, the WAL files that precede its creation are no longer +necessary in order to recover the database to some later point in time. + + + +The interval between base backups should usually be +chosen based on how much storage you want to expend on archived WAL +files, since you must keep all the archived WAL files back to your +last base backup. 
+You should also consider how long you are prepared to spend +recovering, if recovery should be necessary — the system will have to +replay all those WAL segments, and that could take a while if it has +been a long time since the last base backup. -It is not necessary to be concerned about the amount of time it takes -to make a base backup. However, if you normally run the -server with full_page_writes disabled, you might notice a drop -in performance while the backup runs since full_page_writes is -effectively forced on during backup mode. +Creating a base backup may be a lengthy process if you have a lot of data. +Be aware that if you normally run the server with full_page_writes +disabled, you might notice a drop in performance while the backup runs since +full_page_writes is effectively forced on during backup mode. To make use of the backup, you will need to keep all the WAL segment files generated during and after the file system backup. -To aid you in doing this, the base backup process -creates a backup history file that is immediately -stored into the WAL archive area. This file is named after the first -WAL segment file that you need for the file system backup. -For example, if the starting WAL file is -0001123455CD the backup history file will be -named something like -0001123455CD.007C9330.backup. (The second -part of the fi
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 09/25/2015 01:47 PM, Michael Paquier wrote: > On Fri, Sep 25, 2015 at 5:57 PM, Amir Rohan wrote: >> > That's the kind of thing that each serious developer on this mailing > list already has in a rather different shape but with the same final > result: Oh, I guess I'll have to write one then. :) > offer a way to set up a cluster ready for hacking in a single > command, and being able to switch among them easily. I am not sure we > would really find a cross-platform script generic enough to satisfy > all the needs of folks here and integrate it directly in the tree :) > Yes, perl/bash seems to be the standard and both have their shortcomings, in either convenience or familiarity... a static binary + scripts/configuration would be the path of least resistance, and it doesn't actually have to live in the tree, but it needs to be good enough out of the box that someone new will prefer to use/improve it over rolling their own from scratch, and people need to know it's there. Coming back to the recovery testing package... I was investigating some behaviour having to do with recovery and tried your new library to write a repro case. This uncovers some implicit assumptions in the package that can make things difficult when violated. I had to rewrite nearly every function I used. Major pain points:

1) Lib stores metadata (ports, paths, etc') using ports as keys.
   -> Assumes ports aren't reused.
   -> Because it assumes servers keep running until teardown.
And
2) Behaviour (paths in particular) is hardwired rather than overridable defaults.

This is exactly what I needed to test. Problems:

3) Can't stop a server without clearing its testing data (the maps holding paths and things). But that data might be specifically needed; in particular the backup shouldn't disappear when the server melts down or we have a very low-grade DBA on our hands.
4) Port assignment relies on liveness checks on running servers. 
If a server is shut down and a new one instantiated, the port will get reused, data will get trashed, and various confusing things can happen.
5) Servers are shut down with -m 'immediate', which can lead to races in the script when archiving is turned on. That may be good for some tests, but there's no control over it.

Other issues:

6. Directory structure: one directory per thing is used, but it's more logical to place all things related to an instance under a single directory, and name them according to role (57333_backup, and so on).
7. enable_restoring() uses "cp -i" as 'archive_command', not a good fit for an automated test.

Aside from running the tests, the convenience of writing them needs to be considered. My perl is very weak, it's been at least a decade, but it was difficult to make progress because everything is geared toward a batch "pass/fail" run. Stdout is redirected, and the log files can't be 'tail --retry -f'-ed in another terminal, because they're clobbered at every run. Also:

8. No canned way to output a pretty-printed overview of the running system (paths, ports, for manual checking).
9. Finding things is difficult, see 6.
10. If a test passes/fails or dies due to a bug, everything is cleaned. Great for testing, bad for postmortem.
11. A canned "server is responding to queries" helper would be convenient.

It might be a good idea to:

1) Never reuse ports during a test. Liveness checking is used to avoid collisions, but should not determine order of assignment.
2) Decouple cleanup from server shutdown. Do the cleanup at the end of the test only, and allow the user to keep things around.
3) Adjust the directory structure to one top directory per server with (PGDATA, backup, archive) subdirs.
4) Instead of passing ports around as keys, have *explicit* functions which can be called directly by the user (I'd like the backup *HERE* please), with the current functions refactored to merely invoke them by interpolating in the values associated with the port they were given. 
4b) server shutdown should perhaps be "smart" by default, or segmented into calmly_bring_to_a_close(), pull_electric_plug() and drop_down_the_stairs_into_swimming_pool(). Regards, Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
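Points 2) and 4b) above could combine into an interface shaped roughly like this. A Python stand-in for the Perl package, with every name invented for illustration; `stop()` only records the shutdown (the real thing would shell out to `pg_ctl stop -m <mode>`):

```python
import os
import shutil
import tempfile

class TestNode:
    """Sketch: a node whose cleanup is decoupled from its shutdown."""

    SHUTDOWN_MODES = ("smart", "fast", "immediate")  # pg_ctl -m values

    def __init__(self, port):
        self.port = port
        # one top directory per server, with subdirs named by role
        self.basedir = tempfile.mkdtemp(prefix="srv_%d_" % port)
        for sub in ("pgdata", "backup", "archive"):
            os.mkdir(os.path.join(self.basedir, sub))
        self.running = True

    def stop(self, mode="smart"):
        # would run: pg_ctl stop -m <mode>; on-disk data is left alone
        assert mode in self.SHUTDOWN_MODES
        self.running = False

    def cleanup(self):
        # end-of-test only; skip this call to keep things for postmortem
        shutil.rmtree(self.basedir, ignore_errors=True)
```

The point of the split: a melted-down node's backup and archive directories survive `stop()`, and a failing test can simply not call `cleanup()`.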
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 09/25/2015 09:29 AM, Michael Paquier wrote: > > > On Fri, Sep 25, 2015 at 3:11 PM, Amir Rohan wrote: > > On 08/14/2015 06:32 AM, Michael Paquier wrote: > > I am rather happy with the shape of this patch now, so feel free > to review it... > > Regards, > > Michael, I've ran these and it worked fine for me. > See attached patch with a couple of minor fixes. > > > Thanks! I still think that we could improve a bit more the way > parametrization is done in postgresql.conf when a node is initialized by > appending a list of parameters or have a set of hardcoded behaviors > including a set of default parameters and their values... But well > feedback is welcome regarding that. I also arrived at the conclusion > that it would be better to place the new package file in src/test/perl > instead of src/test/recovery to allow any users of the TAP tests to have > it in their PERL5LIB path and to be able to call the new routines to > create and manipulate nodes. > -- > Michael Having a subcommand in Greg's PEG (http://github.com/gregs1104/peg), that allows you to create one of several "canned" clusters would be convenient as well, for manual testing and fooling around with features. Amir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On 08/14/2015 06:32 AM, Michael Paquier wrote: > On Fri, Aug 14, 2015 at 12:54 AM, Michael Paquier > wrote: >> On Mon, Jun 29, 2015 at 10:11 PM, Michael Paquier >> wrote: >>> On Wed, Mar 18, 2015 at 1:59 PM, Michael Paquier >>> wrote: Feedback is of course welcome, but note that I am not seriously expecting any until we get into 9.6 development cycle and I am adding this patch to the next CF. >>> I have moved this patch to CF 2015-09, as I have enough patches to >>> take care of for now... Let's focus on Windows support and improvement >>> of logging for TAP in the first round. That will be already a good >>> step forward. >> OK, attached is a new version of this patch, that I have largely >> reworked to have more user-friendly routines for the tests. The number >> of tests is still limited still it shows what this facility can do: >> that's on purpose as it does not make much sense to code a complete >> and complicated set of tests as long as the core routines are not >> stable, hence let's focus on that first. >> I have not done yet tests on Windows, I am expecting some tricks >> needed for the archive and recovery commands generated for the tests. > Attached is v3. I have tested and fixed the tests such as they can run > on Windows. archive_command and restore_command are using Windows' > copy when needed. There was also a bug with the use of a hot standby > instead of a warm one, causing test 002 to fail. > I am rather happy with the shape of this patch now, so feel free to review > it... > Regards, Michael, I've ran these and it worked fine for me. See attached patch with a couple of minor fixes. Amir diff --git a/src/test/recovery/RecoveryTest.pm b/src/test/recovery/RecoveryTest.pm index c015f3b..5cc977d 100644 --- a/src/test/recovery/RecoveryTest.pm +++ b/src/test/recovery/RecoveryTest.pm @@ -11,7 +11,7 @@ package RecoveryTest; # It is then up to the test to decide what to do with the newly-created # node. 
# -# Environmenment configuration of each node is available through a set +# Environment configuration of each node is available through a set # of global variables provided by this package, hashed depending on the # port number of a node: # - connstr_nodes connection string to connect to this node diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl index 757a639..2c59b73 100644 --- a/src/test/recovery/t/001_stream_rep.pl +++ b/src/test/recovery/t/001_stream_rep.pl @@ -31,9 +31,9 @@ psql $connstr_nodes{ $port_master }, my $current_lsn = psql $connstr_nodes{ $port_master }, "SELECT pg_current_xlog_location();"; my $caughtup_query = "SELECT '$current_lsn'::pg_lsn <= replay_location FROM pg_stat_replication"; -poll_query_until($caughtup_query, $connstr_nodes{ $port_master }) - or die "Timed out while waiting for standby 1 to catch up"; poll_query_until($caughtup_query, $connstr_nodes{ $port_standby_1 }) + or die "Timed out while waiting for standby 1 to catch up"; +poll_query_until($caughtup_query, $connstr_nodes{ $port_standby_2 }) or die "Timed out while waiting for standby 2 to catch up"; my $result = psql $connstr_nodes{ $port_standby_1 }, -- 2.4.3 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for N synchronous standby servers - take 2
> Sent: Thursday, September 24, 2015 at 3:11 AM > > From: "Tom Lane" > Robert Haas writes: > > Well, I think that if we create our own mini-language, it may well be > > possible to make the configuration for this compact enough to fit on > > one line. If we use JSON, I think there's zap chance of that. But... > > that's just what *I* think. >> I've implemented a parser that reads your mini-language and dumps a JSON equivalent. Once you start naming groups the line fills up quite quickly, and on the other hand the JSON is verbose and fiddly. But implementing a mechanism that can be used by other features in the future seems the deciding factor here, rather than the brevity of a bespoke mini-language. > > <...> we're best off avoiding the challenges of dealing with multi-line > postgresql.conf entries. > > And I'm really not much in favor of a separate file; if we go that way > then we're going to have to reinvent a huge amount of infrastructure > that already exists for GUCs. > > regards, tom lane Adding support for JSON objects (or some other kind of composite data type) to the .conf parser would negate the need for one, and would also solve the problem being discussed for future cases. I don't know whether that would break some tooling you care about, but if there's interest, I can probably do some of that work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
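For the curious, the mini-language-to-JSON exercise mentioned above is only a few dozen lines. The grammar below is a guess at the flavor being discussed — `N(member, ...)` meaning a quorum of N over the members, with nesting for named groups — not the actual proposed syntax:

```python
import re

# Assumed toy grammar: spec := NAME | INT '(' spec (',' spec)* ')'
# e.g. "3(A, 2(B, C), D)" = quorum of 3 over A, (quorum of 2 over B,C), D
TOKEN = re.compile(r"\s*([(),]|[^\s(),]+)")

def parse(spec):
    """Recursive-descent parse of the toy quorum spec into plain
    dicts/lists, i.e. something json.dumps can serialize directly."""
    tokens = TOKEN.findall(spec)
    pos = 0

    def node():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1                      # consume '('
            members = [node()]
            while tokens[pos] == ",":
                pos += 1
                members.append(node())
            assert tokens[pos] == ")", "expected ')'"
            pos += 1
            return {"quorum": int(tok), "nodes": members}
        return tok                        # a bare standby name

    return node()

# example: parse("3(A, 2(B, C), D)")
```

Even this toy shows the trade-off from the thread: the one-liner stays short until you nest groups, at which point the JSON form, verbose as it is, becomes easier to read and to manipulate from other tools.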
Re: [HACKERS] Support for N synchronous standby servers - take 2
>On 07/16/15, Robert Haas wrote: > >>> * Developers will immediately understand the format >> >>I doubt it. I think any format that we pick will have to be carefully >>documented. People may know what JSON looks like in general, but they >>will not immediately know what bells and whistles are available in >>this context. >> >>> * Easy to programmatically manipulate in a range of languages >> >> <...> I think it will be rare to need to parse the postgresql.conf string, >> manipulate it programatically, and then put it back. > >On Sun, Jul 19, 2015 at 4:16 PM, Tom Lane wrote: >> Josh Berkus writes: >>> On 07/17/2015 04:36 PM, Jim Nasby wrote: I'm guessing it'd be really ugly/hard to support at least this GUC being multi-line? >> >>> Mind you, multi-line GUCs would be useful otherwise, but we don't want >>> to hinge this feature on making that work. >> >> Do we really want such a global reduction in friendliness to make this >> feature easier? > >Maybe shoehorning this into the GUC mechanism is the wrong thing, and >what we really need is a new config file for this. The information >we're proposing to store seems complex enough to justify that. It seems like:

1) There's a need to support structured data in configuration for future needs as well; it isn't specific to this feature.
2) There should/must be a better way to validate configuration than restarting the server in search of syntax errors.

Creating a whole new configuration file for just one feature *and* in a different format seems suboptimal. What happens when the next 20 features need structured config data? Where does it go? Will there be additional JSON config files *and* perhaps new mini-language values in .conf as development continues? How many dedicated configuration files is too many? 
Now, about JSON (earlier upthread): On 07/01/15, Peter Eisentraut wrote: > On 6/26/15 2:53 PM, Josh Berkus wrote: > > I would also suggest that if I lose this battle and > > we decide to go with a single stringy GUC, that we at least use JSON > > instead of defining our own, proprietary, syntax? > > Does JSON have a natural syntax for a set without order? No. Nor timestamps. It doesn't even distinguish integer from float (though parsers do it for you in dynamic languages). It's all because of its unsightly JavaScript roots. The current patch is now forced by JSON to conflate sets and lists, so un/ordered semantics are no longer tied to type but to the specific configuration keys. So, if a feature ever needs a key where the difference between set and list matters and needs to support both, you'll need separate keys (both with lists, but meaning different things) or a separate "mode" key or something. Not terrible, just iffy. Others have found JSON unsatisfactory before. For example, the Clojure community has made (at least) two attempts at alternatives, complete with the meh adoption rates you'd expect despite being more capable formats: http://blog.cognitect.com/blog/2014/7/22/transit https://github.com/edn-format/edn There's also YAML, TOML, etc', none as universal as JSON. But to reiterate, JSON itself has lackluster type support (no sets, no timestamps), is verbose, is easy to malform when editing (missed a curly brace, shouldn't use a single quote), isn't extensible, and my personal pet peeve is that it doesn't allow non-string or bare-string keys in maps (a.k.a "death by double-quotes"). Python has the very natural {1,2,3} syntax for sets, but of course that's not part of JSON. If JSON wins out despite all this, one alternative not discussed is to extend the .conf parser to accept JSON dicts as a fundamental type. 
e.g.:

###
data_directory = 'ConfigDir'
port = 5432
work_mem = 4MB
hot_standby = off
client_min_messages = notice
log_error_verbosity = default
autovacuum_analyze_scale_factor = 0.1
synch_standby_config = {
    "sync_info": {
        "nodes": [
            { "priority": 1, "group": "cluster1" },
            "A"
        ],
        "quorum": 3
    },
    "groups": {
        "cluster1": [ "B", "C" ]
    }
}

This *will* break someone's perl I would guess. Ironically, those scripts wouldn't have broken if some structured format were in use for the configuration data when they were written... `postgres --describe-config` is also pretty much tied to a line-oriented configuration. Amir p.s. MIA configuration validation tool/switch should probably get a thread too.
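A sketch of what "accept JSON dicts as a fundamental type" could mean for the .conf reader. Python and illustrative only: the real GUC parser is C, and naive brace-counting like this would need to be taught about braces inside quoted strings before it could be trusted:

```python
import json

def parse_conf(text):
    """Parse 'key = value' lines; a value opening with '{' may span
    multiple lines and is decoded as JSON. Comments are stripped on
    key lines only; JSON continuation lines are taken verbatim."""
    settings = {}
    lines = iter(text.splitlines())
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if value.startswith("{"):
            # keep consuming lines until the braces balance
            while value.count("{") != value.count("}"):
                value += "\n" + next(lines)
            settings[key] = json.loads(value)
        else:
            settings[key] = value.strip("'")
    return settings
```

One parser, one file: scalar GUCs keep their line-oriented form, and only values that opt into '{' pay the multi-line cost.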