Hi, No, it hasn't changed since the first time I looked at it.
root@atlassian:/home/tbadmin# ps ax | grep post 1364 ? Ss 0:00 /usr/lib/postfix/master 5198 pts/3 S 0:00 su postgres 5221 pts/3 S 0:00 /usr/lib/postgresql/9.3/bin/postgres -D /etc/postgresql/9.3/main 5222 ? Ss 0:10 postgres: startup process recovering 0000000100000005000000A3 11161 pts/4 S+ 0:00 grep --color=auto post The /var/log/postgres logs haven't got anything new in them. Also the screen session hasn't got anything new in them, except it adds a line every time I attempt to connect to it using psql. The last attempt below was my connection attempt for this email, a bit over 8 hours after my last connection attempt. Screen session: 2017-01-28 12:38:35 AEDT FATAL: the database system is starting up 2017-01-28 23:00:01 AEDT FATAL: the database system is starting up 2017-01-28 23:00:01 AEDT FATAL: the database system is starting up 2017-01-29 07:14:00 AEDT FATAL: the database system is starting up I also still can't connect. postgres@atlassian:/home/tbadmin$ psql psql: FATAL: the database system is starting up *Brian Mills* CTO *Mob: *0410660003 *Melbourne* 03 9012 3460 or 03 8376 6327 *|* *Sydney* 02 8064 3600 *|* *Brisbane* 07 3173 1570 Level 1 *|* 600 Chapel Street *|* South Yarra*|* VIC *|* 3141 *|* Australia <https://www.facebook.com/TryBooking/> <https://twitter.com/trybooking> <https://www.linkedin.com/company/trybooking-com> On 28 January 2017 at 16:08, Adrian Klaver <adrian.kla...@aklaver.com> wrote: > On 01/27/2017 05:40 PM, Brian Mills wrote: > >> First of all, Thank you for your time to assist me learning. I really >> appreciate it. >> >> root# ps ax | grep post >> 1364 ? Ss 0:00 /usr/lib/postfix/master >> 5198 pts/3 S 0:00 su postgres >> 5221 pts/3 S 0:00 /usr/lib/postgresql/9.3/bin/postgres -D >> /etc/postgresql/9.3/main >> 5222 ? Ss 0:10 postgres: startup process recovering >> 0000000100000005000000A3 >> 7930 pts/4 S+ 0:00 grep --color=auto post >> > > So if you check back does the recovering XXXXXXXX part change? > > If so Postgres is walking through the WAL files as it should. > > >> Its a single machine postgres database server, so I'm assuming there is >> no cluster log. If there is one, where would I look for it? >> The only log in /var/log/postgres is postgresql-9.3-main.log >> > > That would be it. A single Postgres instance has multiple databases in it, > by default it starts with template0, template1 and postgres databases. Then > add whatever databases you created and you have a cluster of databases. > > which shows (tail): >> >> 2017-01-27 20:27:01 AEDT LOG: database system was shut down at >> 2017-01-27 20:26:29 AEDT >> 2017-01-27 20:27:01 AEDT LOG: MultiXact member wraparound protections >> are now enabled >> 2017-01-27 20:27:01 AEDT LOG: autovacuum launcher started >> 2017-01-27 20:27:01 AEDT LOG: database system is ready to accept >> connections >> 2017-01-27 20:27:02 AEDT LOG: incomplete startup packet >> 2017-01-27 20:28:54 AEDT ERROR: unexpected chunk size 104 (expected >> 1996) in chunk 3 of 22 for toast value 48862 in pg_toast_20028 >> 2017-01-27 20:28:54 AEDT STATEMENT: COPY public.bodycontent >> (bodycontentid, body, contentid, bodytypeid) TO stdout; >> 2017-01-27 20:30:13 AEDT LOG: received fast shutdown request >> 2017-01-27 20:30:13 AEDT LOG: aborting any active transactions >> 2017-01-27 20:30:13 AEDT LOG: autovacuum launcher shutting down >> 2017-01-27 20:30:13 AEDT LOG: shutting down >> 2017-01-27 20:30:13 AEDT LOG: database system is shut down >> > > That would be your Attempt 1 log. > >> >> I ran the screen utility so I could leave the db started using >> the pg_ctl command. The later logs in that session have not progressed, >> its last entry is still >> 2017-01-27 23:00:01 AEDT FATAL: the database system is starting up >> Which is still later datetime than the /var/log/postgres... log. >> > > So it is just logging to stdout and not to the log file. > > >> Connection attempt shows the same log >> postgres@atlassian:/home/myuser$ psql >> psql: FATAL: the database system is starting up >> >> Nothing in the syslog that seems connected. >> >> *Brian Mills* >> CTO >> >> >> *Mob: *0410660003 <tel:0410660003> >> *Melbourne* 03 9012 3460 <tel:03%209012%203460> or 03 8376 6327 >> <tel:03%208376%206327> *|* * **Sydney* 02 8064 3600 >> <tel:02%208064%203600> *|* *Brisbane* 07 3173 1570 >> <tel:07%203173%201570> >> Level 1 *|* 600 Chapel Street *|* South >> Yarra*|* VIC *|* 3141 *|* Australia >> >> <https://www.facebook.com/TryBooking/> <https://twitter.com/trybooking> < >> https://www.linkedin.com/company/trybooking-com> >> >> On 28 January 2017 at 12:05, Adrian Klaver <adrian.kla...@aklaver.com >> <mailto:adrian.kla...@aklaver.com>> wrote: >> >> On 01/27/2017 01:31 PM, Brian Mills wrote: >> >> Hi, >> >> I have a Atlassian Confluence Wiki that depends on postgres, but I >> haven't much experience with postgres other than for this purpose. >> >> A few days ago, the hard disk filled, so all services stopped >> working. >> When the admin realised this he increased the disk size (its in >> a cloud, >> so that was easy to do) however I think the way it shutdown left >> Postgres in an inconsistent state for some reason. >> Now when I start it up I get this message in the log over and >> over again: >> "FATAL: the database system is starting up" >> >> I have a backup, which I have successfully recovered, but it is >> 24 hours >> old, the next backup was the cause of the disk filling. So I'm >> using >> this as exercise in learning a bit more about postgres. >> >> I did some research and found a number of options. I took a file >> level >> backup with the service not running then tried 2 things. I >> haven't found >> much else on what to do though. >> >> *Attempt 1 - Reset Log * >> >> It sounded like this shouldn't be my first option (it wasn't) >> but it did >> sound like what I needed to do. >> I ran this command >> ./pg_resetxlog /var/lib/postgresql/9.3/main -f >> It worked a treat, the database did startup ok. >> However when I tried to dump the DB: >> root@atlassian:/home/myuser# sudo -u postgres pg_dump >> confluencedb > >> $now-confluencedb.sql >> pg_dump: Dumping the contents of table "bodycontent" failed: >> PQgetResult() failed. >> pg_dump: Error message from server: ERROR: unexpected chunk >> size 104 >> (expected 1996) in chunk 3 of 22 for toast value 48862 in >> pg_toast_20028 >> pg_dump: The command was: COPY public.bodycontent >> (bodycontentid, body, >> contentid, bodytypeid) TO stdout; >> The dump failed, so I assume this did leave my database in an >> inconsistent state. >> >> >> *Attempt 2 - startup manually and let it try recovery* >> >> I restored my file level backup and started again. >> This time I tried to startup manually on the command line to see >> the >> output (I'd done it as a service startup a number of times to >> nearly the >> same effect too) >> >> postgres@atlassian:/usr/lib/postgresql/9.3/bin$ ./pg_ctl -D >> /etc/postgresql/9.3/main start >> server starting >> postgres@atlassian:/usr/lib/postgresql/9.3/bin$ 2017-01-27 >> 20:36:08 AEDT >> LOG: database system was interrupted while in recovery at >> 2017-01-27 >> 20:13:26 AEDT >> 2017-01-27 20:36:08 AEDT HINT: This probably means that some >> data is >> corrupted and you will have to use the last backup for recovery. >> 2017-01-27 20:36:08 AEDT LOG: database system was not properly >> shut >> down; automatic recovery in progress >> 2017-01-27 20:36:08 AEDT LOG: redo starts at 5/528B4558 >> 2017-01-27 20:36:18 AEDT LOG: redo done at 5/A3FFF9E8 >> 2017-01-27 20:36:18 AEDT LOG: last completed transaction was at >> log >> time 2017-01-24 02:08:00.023064+11 >> 2017-01-27 23:00:01 AEDT FATAL: the database system is starting >> up >> 2017-01-27 23:00:01 AEDT FATAL: the database system is starting >> up >> >> >> What does ps ax | grep post show? >> >> Is the cluster set up to log to a file, if so what does it show? >> >> Does the system log show anything relevant? >> >> >> I've left it that way for 12 hours, and its still not allowing >> connections. >> I assume its doing some kind of consistency check? >> >> >> What does it say when you attempt a connection? >> >> >> >> Does anyone have any suggestions on what I should be doing to >> try and >> restore this database? >> >> - The amount of change is minimal in the DB after 6pm it should be >> basically no change overnight. >> - The log above seems to suggest it has completed a redo ok, is >> that right? >> - The last completed transaction time 2017-01-24 >> 02:08:00.023064+11 the >> log mentions would be fine to use, so loosing even a few hours >> before >> that would be more than adequate. >> >> I'm just not clear what the database is doing now, or how I >> should be >> trying to recover it. >> >> Any help anyone can suggest would be great! I've given myself this >> weekend to attempt to recover more than the last backup, after >> that I >> need to restore the service for my team to use and suck up the >> lost last >> day of updates. >> >> Thanks, >> Brian >> >> >> >> -- >> Adrian Klaver >> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com> >> >> >> > > -- > Adrian Klaver > adrian.kla...@aklaver.com >