Simon Riggs wrote:
On Tue, 2010-01-12 at 15:11 -0500, Bruce Momjian wrote:
Stefan Kaltenbrunner wrote:
Simon Riggs wrote:
On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
Fujii Masao wrote:
On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith <g...@2ndquadrant.com> wrote:
I don't think anybody can deploy this feature without at least some very
basic monitoring here.  I like the basic proposal you made back in September
for adding a pg_standbys_xlog_location to replace what you have to get from
ps right now:
http://archives.postgresql.org/pgsql-hackers/2009-09/msg00889.php

That's basic, but enough that people could get by for a V1.
Yeah, I have no objection to adding such a simple capability, which monitors
the lag, to the first release. But I guess that, in addition to that,
Simon wanted the capability to collect statistical information about
replication activity (e.g., transfer time, write time, replay time).
So I'd like to postpone it.
Yeah, getting all of that would be nice and handy, but we have to remember that this is really our first cut at integrated replication. Being able to monitor lag is what is needed as a minimum; more advanced stuff can and will emerge once we get some actual feedback from the field.
Though there won't be any feedback from the field because there won't be
any numbers to discuss. Just "it appears to be working". Then we will go
into production and the problems will begin to be reported. We will be
able to do nothing to resolve them because we won't know how many people
are affected.
Field is also production usage in my POV, and I'm not sure how we would know how many people are affected by some imaginary issue just because there is a column that has some numbers in it. All of the large features we added in the past got fine-tuned and improved in the following releases, and I expect SR to be one of them that will see a lot of improvement in 8.5+n. Adding detailed monitoring of some random stuff (I don't think there was a clear proposal of what kind of stuff you would like to see) while we don't really know what the performance characteristics are might easily lead to us providing a ton of data and nothing relevant :( What I really think we should do for this first cut is to make it as foolproof and easy to set up as possible and add the minimum required monitoring knobs, without going overboard with too many stats.
I totally agree.  If SR isn't going to be useful without being
feature-complete, we might as well just drop it for 8.5 right now.
Let's get a reasonable feature set implemented and then come back in 8.6
to improve it.  For example, there is no need for a special
'replication' user (just use super-user), and monitoring should be
minimal until we have field experience of exactly what monitoring we
need.
The final commit-fest is in 5 days --- this is not the time for design
discussion and feature additions.  If we wait for SR to be feature
complete, with design discussions, etc, we will hopelessly delay 8.5 and
people will get frustrated.  I am not saying we can't talk about design,
but none of this should be a requirement for 8.5.

We can't add monitoring until we know what the performance
characteristics are. Hmmm. And how will we know what the performance
characteristics are, I wonder?

Well, I would say we do exactly what we have done in the past with other features: debug the stuff with low-level tools until we fully understand what it really is, and then we can always add more "accessible" stats.


Stefan
