Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]

Greg Smith Sun, 10 Mar 2013 08:25:15 -0700

On 3/7/13 2:42 AM, Amit Kapila wrote:

I also think the tests added for regression may be more than required...
If you think above optimization's to reduce number of tests are okay, then I
will update the patch.

I was not trying to get you to remove regression tests. I was justpointing out to everyone that the patch seems longer than it really is,because you put so many of them in there. This is not a problem, and Ididn't bring it up to say you should remove some of them.

This feature is close to being functional enough to make a lot of peoplehappy. Most of it works the way I've wanted it to for a few years now,an end point several people have pushed toward in pieces for years now.There is a decent sized list of issues that I found under carefultesting though. Maybe all these can be resolved quickly and this moveson to commit candidate. I'd like to see this feature hit the code base,but it feels like a good amount of work is left to reach there to meright now.

I'm focusing on functional rather than code quality review here. Myopinion on how the implementation is coded isn't worth very muchanyway--I have a bad track record for correctly judging whethersomething is good quality when it starts messing with the GUC system.The patch has shrunk since original submission a bit, and it has manyregression tests it passes. It may still be objectionably long. Thereis a good amount of cleanup in comments and documentation needed, and Iwill be happy to work on that part myself. I wanted to get thefunctional issues outlined first though. I would be shocked if thisgoes in without someone else wanting more code cleanup orsimplification. I don't know who is going to do that other than one ofthe committers though.

If Amit can address the functional ones, I can tweak the text/level ofthe log messages myself during the cleanup round I do.


= Functional changes =

There are a number of individually small but serious blocker items Ithink need to be changed in how this works. I see 7 functional changesto be made before there's no surprising behavior here. I have a lot ofcommentary about how the log/error messages from this might be improvedin addition to that.

1) When you change a sighup or user setting, it writes the config fileout. But it does not signal for a reload. Example:


$ psql -c "show work_mem" -x
work_mem | 1MB
$ psql -c "set persistent work_mem='2MB'"
SET
$ psql -c "show work_mem" -x
work_mem | 1MB

That the above doesn't leave work_mem set to 2MB for new sessions issurprising. SET PERSISTENT can't do anything about postmasterparameters like shared_buffers. But it should be able to changework_mem or a sighup parameter like checkpoint_segments. The regressiontest examples call pg_reload_conf() in order to activate the changes.As a user, I would expect the config reload to be automatic afterexecuting SET PERSISTENT on parameters that can be changed dynamically.

2) Once automatic activation is happening, a hint is really needed incases where it can't happen. I'd like to see a warning about that.There's one like this in the code already:

LOG: parameter "shared_buffers" cannot be changed without restartingthe server


Something like this would be appropriate for SET PERSISTENT:

LOG: parameter "shared_buffers" cannot be changed without restartingthe server. The changed setting has been saved toconfig/postgresql.auto.conf.

3) The config/postgresql.auto.conf file (which I'd like to see named'config/persistent.auto.conf') is completely overwritten when a changeis made. When the file is written out, both at initdb time and whenchanges are made, it should have this as its first line:

# Do not edit this file manually! It is overwritten by the SETPERSISTENT command.

4) There is one bad problem case left if I try to make this not work.If I do this:


-rm -rf config/
-Remove "include_dir config"

-Restart the server. It gives me a warning that SET PERSISTENT won'twork because at least one of the multiple things it needs are missing.

-mkdir config
-psql -c "set persistent checkpoint_segments='32'"

That command happily writes out postgresql.auto.conf, but since Iremoved the include_dir it doesn't matter. It will never become active.

The check for the include_dir entry needs to happen each time SETPERSISTENT runs. This is more important than noticing it at serverstart time. In fact, I think if this problem case includes the WARNINGabout "include_dir config" not being in the postgresql.conf, the checkthat happens at server start (shown below) might even go away. Peoplewho object to it as log file garbage will be happier, and users will betold about the feature being broken if they try to use it.

5) An error message appears every time you reload configuration, if somepart of the SET PERSISTENT mechanism isn't functional. This is going tobe too much for some people. And it's confusing when it happens as partan interactive psql session. I have an example way down at the very endof this message.


6) Putting spaces into memory units results in broken config files:

$ psql -c "set persistent work_mem='2 MB'"
SET
$ cat $PGDATA/config/postgresql.auto.conf
work_mem = 2 MB

This is wrong and you'll get this on reload:

LOG: syntax error in file"/Users/gsmith/pgwork/data/autoconf/config/postgresql.auto.conf" line 1,near token "MB"LOG: configuration file"/Users/gsmith/pgwork/data/autoconf/postgresql.conf" contains errors; nochanges were applied


It needs to look like this:

work_mem = '2 MB'

A regression test for some cases with spaces in the middle should beadded too. This case is really weird, due to how the code is readingthe existing postgresql.auto.conf file. If I first manually fix thefile, and then run a change again, it stays fixed! Look at this weirdsequence:


$ psql -c "set persistent work_mem='2 MB'"
SET
$ cat $PGDATA/config/postgresql.auto.conf
work_mem = 2 MB
[Here this setting is broken]
$ vi $PGDATA/config/postgresql.auto.conf
[Add quotes around the value, now it works]
$ cat $PGDATA/config/postgresql.auto.conf
work_mem = '2 MB'
$ pg_ctl reload
server signaled
$ psql -c "set persistent work_mem='4 MB'"
SET
$ cat $PGDATA/config/postgresql.auto.conf
work_mem = '4 MB'

Below I'll rant some more about how looking at what's in the existingfile, rather than the in-memory GUCs, has some other surprising properties.

7) If I run SET PERSISTENT a lot concurrently, something happens thatslows down connecting to the server. Restarting the server makes it goaway. I have a pgbench test case demonstrating the problem below, inthe "Concurrency" section. I haven't tried to replicate it on anothersystem yet.


= Tested features that work fine =

Entries added here are tracked as you'd expect:

$ psql -c "set persistent checkpoint_segments='32'"
$ pg_ctl reload

$ psql -xc "select name,setting,sourcefile,sourceline from pg_settingswhere name='checkpoint_segments'"

name       | checkpoint_segments
setting    | 32
sourcefile | /Users/gsmith/pgwork/data/autoconf/config/postgresql.auto.conf
sourceline | 1

When the postgresql.auto.conf file is missing and you use SETPERSISTENT, it quietly creates it when writing out the new setting.


= Concurrency and performance slowdown =

I made two pgbench scripts that adjust a guc, one user and the othersighup, to a random value:


random-join-limit.sql:

\setrandom limit 1 8
set persistent join_collapse_limit=:limit;
select pg_reload_conf();

random-segments.sql:

\setrandom segments 3 16
set persistent checkpoint_segments=:segments;
select pg_reload_conf();

I then fired off a bunch of these in parallel:

pgbench -n -f random-join-limit.sql -f random-segments.sql -c 8 -T 60

This ran at 1547 TPS on my 8 core Mac, so that's not bad. No assertionsand the config file was still valid at the end, which is a good sign thelocking method isn't allowing utter chaos. Without the pg_reload_conf()in the test files, it also completed. With the reload happening in onefile but not the other, things were also fine.

However, one thing I saw is that the server got significantly slower themore I ran this test script. After a few minutes it was down to only400 TPS. The delay shows up between when I run psql and when I get theprompt back. Here's normal behavior:


$ time psql -c "select 1"
real    0m0.006s

And here's what I get after a single run of this pgbench hammering:

$ time psql -c "select 1"
real    0m0.800s

800ms? The slowdown is all for psql to start and connect, it's not inthe executor:


$ time psql -c "explain analyze select 1"
                                     QUERY PLAN
------------------------------------------------------------------------------------

Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001rows=1 loops=1)

 Total runtime: 0.039 ms
(2 rows)

real    0m0.807s

Stopping and restarting the server brings the performance back tonormal. Maybe this is one of those assertion build issues, but itsmells like there could be nasty memory leak somewhere instead.Clearing whatever weirdness is going on here is a blocking issue.


= Loss of in-memory changes =

In this example, I change two settings, corrupt the file, and thenchange one of them a second time:


$ psql -c "set persistent checkpoint_segments='32'"
$ psql -c "set persistent work_mem='2MB'"
$ cat postgresql.auto.conf
checkpoint_segments = 32
work_mem = 2MB
$ rm postgresql.auto.conf
$ psql -c "set persistent work_mem='2MB'"
SET
$ cat postgresql.auto.conf
work_mem = 2MB
$ psql -xc "show checkpoint_segments"
checkpoint_segments | 3

That this happens isn't unreasonable, and I can live with thelimitation. The change to work_mem is lost, but it didn't have to be.A user might expect that in this case the last SET PERSISTENT would havewritten out a file with both the settings still intact. The running GUCentries certainly knew that postgresql.auto.conf had two lines withthese entries at that point. All in-memory persistent changes could bedumped out to postgresql.auto.conf. That's what I had hoped waspossible in the implementation.

Amit may have this right though, because I think the code he's using issimpler and reliable than what I had in mind. I'm really not sure whichway is better. This one isn't a blocker, and if this gets committed itcould easily be nudged around later, as an internal change without acatversion bump.


= Error messages =

If you remove postgresql.auto.conf and restart the server, it gives awarning that SET PERSISTENT won't work until you put it back. The errorin this and several similar cases is pretty generic too:

WARNING: File "postgresql.auto.conf" is not accessible, either file"postgresql.auto.conf" or folder "config" doesn't exist. or default"config" is not present in postgresql.conf.

It would be nice if the error were more specific, individuallyidentifying which of these is the actual problem. I can rewrite thatlong text entry to be more readable, but I think it should be a seriesof smaller error checks with their own individual messages instead.

If you remove postgresql.auto.conf then exeute "pg_ctl reload", it givesthis error 6 times, which seems excessive. Reducing how often itappears in the reload case would be nice.


Deleting the whole config directory gives this:

LOG: could not open configuration directory"/Users/gsmith/pgwork/data/autoconf/config": No such file or directoryLOG: configuration file"/Users/gsmith/pgwork/data/autoconf/postgresql.conf" contains errors; nochanges were applied


If you now try to use the feature, the error message could be better.

$ psql -c "set persistent checkpoint_segments='32'"

ERROR: Failed to open auto conf temp file"/Users/gsmith/pgwork/data/autoconf/config/postgresql.auto.conf.temp":No such file or directory

It would be nice to complain about the config directory being missing,as a first check before the file is opened. Restarting the server inthis situation throws the correct error in your face though:

LOG: could not open configuration directory"/Users/gsmith/pgwork/data/autoconf/config": No such file or directoryFATAL: configuration file"/Users/gsmith/pgwork/data/autoconf/postgresql.conf" contains errors

If you render this whole system unavailable by removing the "include_dirconfig", at every server start you'll see this:

WARNING: File "postgresql.auto.conf" is not accessible, either file"postgresql.auto.conf" or folder "config" doesn't exist. or default"config" is not present in postgresql.conf.Configuration parameters changed before start/reload of server with SETPERSISTENT command will not be effective.

LOG:  database system was shut down at 2013-03-09 23:55:03 EST

This is a debatable design choice. Some people might not want the fileand remove it, but don't want to be nagged about it. If people want towipe out the file or directory and work the old way, without thisfeature available, that's fine and they can. To me, helping users whoaccidentally break this is more important than reducing harmless warningmessages for things people did intentionally. WARNING might not be theright level for this though. The existing checks like this I showedabove use LOG for this sort of thing.

The bigger problem is that this message shows up whenever you reload theconfig too. Watch this bizarre sequence:


gsmith=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

gsmith=#
gsmith=# select 1;

 ?column?
----------
        1

And as I commented above, shifting more of the checks to SET PERISTENTtime could eliminate this check from running at server start and reloadaltogether. I would be fine with them *also* happening at server start.But I could understand that other people might not like that. Andhaving this pop up on every reload, appearing to a client next toanother statement altogether, that isn't acceptable though.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]

Reply via email to