Re: Tracking cluster upgrade and configuration history

2020-11-16 Thread Mark Dilger



> On Nov 15, 2020, at 11:23 PM, Ian Lawrence Barwick  wrote:
> 
> 2020年11月16日(月) 15:48 Bharath Rupireddy 
> :
>> 
>> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
>>  wrote:
>>> 
>>> While supporting customers, it would frequently be useful to have more 
>>> information about the history of a cluster.  For example, which prior 
>>> versions were ever installed and running on the cluster?  Has the cluster 
>>> ever been run with fsync=off?  When did the server last enter recovery, if 
>>> ever?  Was a backup_label file present at that time?
>>> 
>> 
>> +1 for the idea. The information will be useful at times for debugging 
>> purposes.
> 
> It's certainly something which would be nice to have.

Thanks for the feedback.

>>> Would it make sense to alternately, or additionally, store some of this 
>>> information in a flat text file in pg_data, say a new file named 
>>> "cluster_history" or such?
>>> 
>> 
>> IMHO, this is also a good idea. We need to think of the APIs to
>> open/read/write/close that history file? How often and which processes
>> and what type of data they write? Is it that the postmaster alone will
>> write into that file? If multiple processes are allowed to write, how
>> to deal with concurrent writers? Will users have to open manually and
>> read that file? or Will we have some program similar to
>> pg_controldata? Will we have some maximum limit to the size of this
>> file?
> 
> pg_stat_statements might be worth looking at as one way of handling that kind
> of file.
> 
> However the problem with keeping a separate file which is not WAL-logged would
> mean it doesn't get propagated to standbys, and there's also the question
> of how it could be maintained across upgrades via pg_upgrade.

Hmmm.  I was not expecting the file to be propagated to standbys.  The 
information could legitimately be different for a primary and a standby.  As a 
very simple example, there may be a flag bit for whether the cluster has 
operated as a standby.  That does raise questions about what sort of 
information about a primary that a standby should track, in case they get 
promoted to primary and information about the old primary would be useful for 
troubleshooting.  Ideas welcome

> 
> FWIW I did once create a background worker extension [1] which logs
> configuration changes to a table, though it's not particularly maintained or
> recommended for production use.

I'm happy to change course if the consensus on the list favors using something 
larger, like log files or logging to a table, but for now I'm still thinking 
about this in terms of something smaller than that.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Tracking cluster upgrade and configuration history

2020-11-16 Thread Mark Dilger



> On Nov 15, 2020, at 10:47 PM, Bharath Rupireddy 
>  wrote:
> 
> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
>  wrote:
>> 
>> While supporting customers, it would frequently be useful to have more 
>> information about the history of a cluster.  For example, which prior 
>> versions were ever installed and running on the cluster?  Has the cluster 
>> ever been run with fsync=off?  When did the server last enter recovery, if 
>> ever?  Was a backup_label file present at that time?
>> 
> 
> +1 for the idea. The information will be useful at times for debugging 
> purposes.

Thanks for the feedback.

> 
>> 
>> Some of this type of information could strictly be fixed size, such as a 
>> fixed set of timestamps for the time at which a fixed set of things last 
>> occurred, or a fixed set of bits indicating whether a fixed set of things 
>> ever happened.
>> 
>> Some other types would be variable size, but hopefully short in practice, 
>> like a list of all postgres versions that have ever been run on the cluster.
>> 
>> Logging the information via the usual log mechanism seems insufficient, as 
>> log files may get rotated and this information lost.
>> 
> 
> True. Just a thought, can we use existing logging mechanism and APIs
> to write to a new file that never gets rotated by the syslogger(Of
> course, we need to think of the maximum file size that's allowed)? The
> idea is like this: we use elog/ereport and so on with a new debug
> level, when specified, instead of logging into the standard log files,
> we log it to the new file.

That's going in a very different direction from what I had in mind.  I was 
imagining something like a single binary or text file of either fixed size or 
something very short but not fixed.  The "very short but not fixed" part of 
that seems a bit too hand-waving on reflection.  Any variable length list, such 
as the list of postgres versions started on the cluster, could be made fixed 
length by only tracking the most recent N of them, perhaps with a flag bit to 
indicate if the list has overflowed.

Using elog/ereport with a new log level that gets directed into a different log 
file is an interesting idea, but it is not clear how to use elog/ereport in a 
principled way to write files that need never get too large.

>> Would it be acceptable to store some fixed set of flag bits and timestamps 
>> in pg_control?  Space there is at a premium.
>> 
> 
> Since we allocate ControlFileData in shared memory and also we may
> have some data with timestamps, variable texts and so on, having this
> included in pg_control data structure would not seem a good idea to
> me.

Variable length texts seem completely out of scope for this.  I would expect 
the data to be a collection of integer types and flag bits.  Fixed length text 
might also be possible, but I don't have any examples in mind of text that we'd 
want to track.

>> Would it make sense to alternately, or additionally, store some of this 
>> information in a flat text file in pg_data, say a new file named 
>> "cluster_history" or such?
>> 
> 
> IMHO, this is also a good idea. We need to think of the APIs to
> open/read/write/close that history file? How often and which processes
> and what type of data they write? Is it that the postmaster alone will
> write into that file? If multiple processes are allowed to write, how
> to deal with concurrent writers? Will users have to open manually and
> read that file? or Will we have some program similar to
> pg_controldata? Will we have some maximum limit to the size of this
> file?

This depends in part on feedback about which information others on this list 
would like to see included, but I was imagining something similar to how 
pg_control works, or using pg_control itself.  The maximum size for pg_control 
is 512 bytes, and on my system sizeof(ControlFileData) = 296, which leaves 216 
bytes free.  I didn't check how much that might change on systems with 
different alignments.  We could either use some of the ~200 bytes currently 
available in pg_control, or use another file, "pg_history" or such, following 
the design pattern already used for pg_control.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Tracking cluster upgrade and configuration history

2020-11-15 Thread Ian Lawrence Barwick
2020年11月16日(月) 15:48 Bharath Rupireddy :
>
> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
>  wrote:
> >
> > While supporting customers, it would frequently be useful to have more 
> > information about the history of a cluster.  For example, which prior 
> > versions were ever installed and running on the cluster?  Has the cluster 
> > ever been run with fsync=off?  When did the server last enter recovery, if 
> > ever?  Was a backup_label file present at that time?
> >
>
> +1 for the idea. The information will be useful at times for debugging 
> purposes.

It's certainly something which would be nice to have.

> > Would it make sense to alternately, or additionally, store some of this 
> > information in a flat text file in pg_data, say a new file named 
> > "cluster_history" or such?
> >
>
> IMHO, this is also a good idea. We need to think of the APIs to
> open/read/write/close that history file? How often and which processes
> and what type of data they write? Is it that the postmaster alone will
> write into that file? If multiple processes are allowed to write, how
> to deal with concurrent writers? Will users have to open manually and
> read that file? or Will we have some program similar to
> pg_controldata? Will we have some maximum limit to the size of this
> file?

pg_stat_statements might be worth looking at as one way of handling that kind
of file.

However the problem with keeping a separate file which is not WAL-logged would
mean it doesn't get propagated to standbys, and there's also the question
of how it could be maintained across upgrades via pg_upgrade.

FWIW I did once create a background worker extension [1] which logs
configuration changes to a table, though it's not particularly maintained or
recommended for production use.

[1] https://github.com/ibarwick/config_log


Regards

Ian Barwick
-- 
EnterpriseDB: https://www.enterprisedb.com




Re: Tracking cluster upgrade and configuration history

2020-11-15 Thread Bharath Rupireddy
On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
 wrote:
>
> While supporting customers, it would frequently be useful to have more 
> information about the history of a cluster.  For example, which prior 
> versions were ever installed and running on the cluster?  Has the cluster 
> ever been run with fsync=off?  When did the server last enter recovery, if 
> ever?  Was a backup_label file present at that time?
>

+1 for the idea. The information will be useful at times for debugging purposes.

>
> Some of this type of information could strictly be fixed size, such as a 
> fixed set of timestamps for the time at which a fixed set of things last 
> occurred, or a fixed set of bits indicating whether a fixed set of things 
> ever happened.
>
> Some other types would be variable size, but hopefully short in practice, 
> like a list of all postgres versions that have ever been run on the cluster.
>
> Logging the information via the usual log mechanism seems insufficient, as 
> log files may get rotated and this information lost.
>

True. Just a thought, can we use existing logging mechanism and APIs
to write to a new file that never gets rotated by the syslogger(Of
course, we need to think of the maximum file size that's allowed)? The
idea is like this: we use elog/ereport and so on with a new debug
level, when specified, instead of logging into the standard log files,
we log it to the new file.

>
> Would it be acceptable to store some fixed set of flag bits and timestamps in 
> pg_control?  Space there is at a premium.
>

Since we allocate ControlFileData in shared memory and also we may
have some data with timestamps, variable texts and so on, having this
included in pg_control data structure would not seem a good idea to
me.

>
> Would it make sense to alternately, or additionally, store some of this 
> information in a flat text file in pg_data, say a new file named 
> "cluster_history" or such?
>

IMHO, this is also a good idea. We need to think of the APIs to
open/read/write/close that history file? How often and which processes
and what type of data they write? Is it that the postmaster alone will
write into that file? If multiple processes are allowed to write, how
to deal with concurrent writers? Will users have to open manually and
read that file? or Will we have some program similar to
pg_controldata? Will we have some maximum limit to the size of this
file?

>
> I'm happy to put together a more concrete proposal, but solicit your opinions 
> first on the merits of the idea generally, and if you think the idea good, on 
> the specifics you'd like to see included.
>

Welcome to know more about this idea.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Tracking cluster upgrade and configuration history

2020-11-11 Thread Mark Dilger
Hello Hackers,

While supporting customers, it would frequently be useful to have more 
information about the history of a cluster.  For example, which prior versions 
were ever installed and running on the cluster?  Has the cluster ever been run 
with fsync=off?  When did the server last enter recovery, if ever?  Was a 
backup_label file present at that time?

Some of this type of information could strictly be fixed size, such as a fixed 
set of timestamps for the time at which a fixed set of things last occurred, or 
a fixed set of bits indicating whether a fixed set of things ever happened.

Some other types would be variable size, but hopefully short in practice, like 
a list of all postgres versions that have ever been run on the cluster.

Logging the information via the usual log mechanism seems insufficient, as log 
files may get rotated and this information lost.

Would it be acceptable to store some fixed set of flag bits and timestamps in 
pg_control?  Space there is at a premium.

Would it make sense to alternately, or additionally, store some of this 
information in a flat text file in pg_data, say a new file named 
"cluster_history" or such?

I'm happy to put together a more concrete proposal, but solicit your opinions 
first on the merits of the idea generally, and if you think the idea good, on 
the specifics you'd like to see included.

Thanks!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company