Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
Jim Nasby wrote: > On Feb 5, 2007, at 8:19 AM, Tom Lane wrote: > >"Simon Riggs" <[EMAIL PROTECTED]> writes: > >>My suggestions would be > >>1. "Database system has completed recovery" and > >>2. "Database system is ready to accept connections" > > > >The second was in fact the wording I had in mind, sorry for not being > >clear. As to the first, the question is whether a log message at > >that specific point has any use. It's not "completion of recovery", > >exactly, since we go through that spot whether it's a normal startup > >or recovery (and there is another log message when we actually do > >any WAL replay). AFAICS it's a fairly randomly chosen place in a > >long sequence of Things That Must Happen. Part of the reason Markus > >is seeing a race condition is that this isn't the last thing done > >before the startup subprocess exits --- see BootstrapMain. So I'm > >for just getting rid of it. > > It is useful to know if the database had to do recovery, though, and > if it did do recovery, it would be useful to know how long it took if > the subsequent startup took a real amount of time. That's already logged elsewhere, so there's no loss of functionality. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
On Feb 5, 2007, at 8:19 AM, Tom Lane wrote: "Simon Riggs" <[EMAIL PROTECTED]> writes: My suggestions would be 1. "Database system has completed recovery" and 2. "Database system is ready to accept connections" The second was in fact the wording I had in mind, sorry for not being clear. As to the first, the question is whether a log message at that specific point has any use. It's not "completion of recovery", exactly, since we go through that spot whether it's a normal startup or recovery (and there is another log message when we actually do any WAL replay). AFAICS it's a fairly randomly chosen place in a long sequence of Things That Must Happen. Part of the reason Markus is seeing a race condition is that this isn't the last thing done before the startup subprocess exits --- see BootstrapMain. So I'm for just getting rid of it. It is useful to know if the database had to do recovery, though, and if it did do recovery, it would be useful to know how long it took if the subsequent startup took a real amount of time. BTW, this is a real problem I've seen on a database with 500k entries in pg_class... it takes several minutes to finish starting after the 'Postmaster is ready' message. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
"Simon Riggs" <[EMAIL PROTECTED]> writes: > My suggestions would be > 1. "Database system has completed recovery" and > 2. "Database system is ready to accept connections" The second was in fact the wording I had in mind, sorry for not being clear. As to the first, the question is whether a log message at that specific point has any use. It's not "completion of recovery", exactly, since we go through that spot whether it's a normal startup or recovery (and there is another log message when we actually do any WAL replay). AFAICS it's a fairly randomly chosen place in a long sequence of Things That Must Happen. Part of the reason Markus is seeing a race condition is that this isn't the last thing done before the startup subprocess exits --- see BootstrapMain. So I'm for just getting rid of it. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
On Sun, 2007-02-04 at 14:15 -0500, Tom Lane wrote: > Markus Schiltknecht <[EMAIL PROTECTED]> writes: > > is there a good reason to print the "database system is ready" message > > in StartupXLOG() in xact.c? It has a) nothing to do with xlog and b) > > opens a small race condition: the message gets printed, while it still > > take some CPU cycles until the postmaster really gets the SIGCHLD signal > > and sets StartupPID = 0. If you (or rather: an automated test program) > > try to connect within this timespan, you get a "database is starting up" > > error, which clearly contradicts the "is ready" message. > > I don't think there's any compelling reason for having that log message > in its current form. What about redefining it to mean "postmaster is > ready to accept connections" --- either with that wording, or keeping > the old wording? Then we could just put it in one place in postmaster.c > and be done. I think your proposed patch is overcomplicated by trying > to have it still come out in bootstrap/standalone cases. For a > standalone backend, getting a prompt is what tells you it's ready ;-) I'm OK with moving the message to be executed from another place, but I have some comments on the changed wording. Firstly, "Database system" is great general wording. "Postmaster" only means something if you know the architecture of PostgreSQL, which most people don't. If we did change the wording, I'd want to have separate messages for the two events of - database can now accept connections - recovery is complete One of the TODO items is to allow the dbserver to be available for read-only queries while still recovering, so any change to the wording should be made with that in mind, so we don't need to change it too often. My suggestions would be 1. "Database system has completed recovery" and 2. "Database system is ready to accept connections" Currently those messages would occur in that order and be issued by StartupXLOG() for (1) and postmaster for (2). In the future they may be issued in a different order. If we stick with only a single message, we should keep it the same as now, wherever the code and whatever the exact timing of its execution. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
Hi, Tom Lane wrote: I don't think there's any compelling reason for having that log message in its current form. What about redefining it to mean "postmaster is ready to accept connections" --- either with that wording, or keeping the old wording? Then we could just put it in one place in postmaster.c and be done. Sure, I'm fine with that wording. As long as it's in the reaper() and thus the postmaster is really ready then it says so, I'm happy. Markus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Fix "database is ready" race condition
Markus Schiltknecht <[EMAIL PROTECTED]> writes: > is there a good reason to print the "database system is ready" message > in StartupXLOG() in xact.c? It has a) nothing to do with xlog and b) > opens a small race condition: the message gets printed, while it still > take some CPU cycles until the postmaster really gets the SIGCHLD signal > and sets StartupPID = 0. If you (or rather: an automated test program) > try to connect within this timespan, you get a "database is starting up" > error, which clearly contradicts the "is ready" message. I don't think there's any compelling reason for having that log message in its current form. What about redefining it to mean "postmaster is ready to accept connections" --- either with that wording, or keeping the old wording? Then we could just put it in one place in postmaster.c and be done. I think your proposed patch is overcomplicated by trying to have it still come out in bootstrap/standalone cases. For a standalone backend, getting a prompt is what tells you it's ready ;-) regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate