Hi listers,
Here is my problem. I am running a PITR restore on a machine remote from my production machine. I ship compressed logs over there, then uncompress them and copy them to pg_xlog. Everything works fine until a network outage creates a gap in my logs. The recovery terminates at log "0000000100000C28000000B1" and brings the database up because it can't find "0000000100000C28000000B2". Log "0000000100000C28000000B3" has been copied over, but I wish to restart recovery at B2. So I scp B2 over from my primary machine, from a folder that I created for just such an occasion.

Now I rename recovery.done to recovery.conf (copied here for your convenience):

'sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log'

(and copy.sh:)

REQ_FILE=$1
DEST=$2
LF="${REQ_FILE}.lock"
SUFFIX=${REQ_FILE##*.}
#####################################################################################
## check if file is transaction log or informational file
## if transaction log, cat from archlog and uncompress into unzipped folder
## if informational simply copy into unzipped folder (it came over uncompressed)
#####################################################################################
if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
    cat "/logs/var/backups/archlog/${REQ_FILE}" | gzip -dc > "/logs/var/backups/unzipped/${REQ_FILE}"
    if [ "$?" = "0" ] ; then
        echo 'successful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
    else
        echo 'unsuccessful uncompress of ' "/logs/var/backups/unzipped/${REQ_FILE}" >> /tmp/restore.mavmail.log
        echo 'the return code is ' "$?" >> /tmp/restore.mavmail.log
    fi
else
    cp "/logs/var/backups/archlog/${REQ_FILE}" "/logs/var/backups/unzipped/${REQ_FILE}"
fi
#####################################################################################
## check for size. If not a full size (16777216) trans log, the copy from
## cobra is still in progress. Don't copy this file. Stop recovery here.
#####################################################################################
SIZE=$(ls -gG1 "/logs/var/backups/unzipped/${REQ_FILE}" | awk '{ print $3}' )
echo "The size of the log to be restored is " "${SIZE}" >> /tmp/restore.mavmail.log
if [ "${SUFFIX}" != 'history' ] && [ "${SUFFIX}" != 'backup' ]; then
    if [ "${SIZE}" != '16777216' ]; then
        echo 'partially written log - not restored - finishing recovery' >> /tmp/restore.mavmail.log
        exit 0
    fi
fi
/usr/bin/lockfile "${LF}"
################################################################
## copy either full sized trans log or informational file
## into pg_xlog data cluster.
################################################################
cp "/logs/var/backups/unzipped/${REQ_FILE}" "${DEST}"
rm -f "${LF}"
rm "/logs/var/backups/unzipped/${REQ_FILE}"

(END)

Now when I try to restart, hoping to begin recovery with the B2 log, I get an invalid checkpoint error:

: LOG: starting archive recovery
Feb 25 10:08:10 ar-db3 postgres[32538]: [3-1] @: LOG: restore_command = "sh /usr/local/postgresql-8.2.5/bin/copy.sh %f %p 2>>/tmp/recovery.log"
Feb 25 10:08:11 ar-db3 postgres[32538]: [4-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
Feb 25 10:08:11 ar-db3 postgres[32538]: [5-1] @: LOG: invalid record length at C28/B1FFECA4
Feb 25 10:08:11 ar-db3 postgres[32538]: [6-1] @: LOG: invalid primary checkpoint record
Feb 25 10:08:12 ar-db3 postgres[32538]: [7-1] @: LOG: restored log file "0000000100000C28000000B1" from archive
Feb 25 10:08:12 ar-db3 postgres[32538]: [8-1] @: LOG: invalid record length at C28/B1FFEC5C
Feb 25 10:08:12 ar-db3 postgres[32538]: [9-1] @: LOG: invalid secondary checkpoint record
Feb 25 10:08:12 ar-db3 postgres[32538]: [10-1] @: PANIC: could not locate a valid checkpoint record
Feb 25 10:08:12 ar-db3 postgres[32537]: [1-1] @: LOG: startup process (PID 32538) was terminated by signal 6
Feb 25 10:08:12 ar-db3 postgres[32537]: [2-1] @: LOG: aborting startup due to startup process failure

I remove the recovery.conf file, successfully start the database and issue a checkpoint. I try the restore again and get the same error. So, is there a way that I can force the recovery to begin at B2, or am I dead in the water and need to bring in another full file copy and start from scratch?

Thanks for your time.

Mark Steben│Database Administrator│
@utoRevenue® "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)

@utoRevenue is a registered trademark and a division of Dominion Enterprises
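
For reference, the gap described in this post (B1 and B3 present, B2 missing) is the kind of condition that could be checked for in the archive folder before recovery is started. What follows is only a minimal sketch, not part of copy.sh or of PostgreSQL: the function names next_wal_segment and first_missing_segment are hypothetical, and it assumes the 8.2-era naming scheme of three 8-digit hex fields (timeline, log id, segment) in which segment numbers run from 00 to FE before the log id increments.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the WAL segment that follows $1.
# Assumption: names are TTTTTTTTLLLLLLLLSSSSSSSS (hex) and, as in 8.2,
# the segment field counts 00..FE and then rolls over into the log id.
next_wal_segment() {
    name=$1
    tli=$(printf '%s' "$name" | cut -c1-8)
    log=$(printf '%s' "$name" | cut -c9-16)
    seg=$(printf '%s' "$name" | cut -c17-24)
    log_n=$((0x$log))
    seg_n=$((0x$seg + 1))
    if [ "$seg_n" -gt 254 ]; then
        # past FE: start segment 00 of the next log id
        seg_n=0
        log_n=$((log_n + 1))
    fi
    printf '%s%08X%08X\n' "$tli" "$log_n" "$seg_n"
}

# Hypothetical helper: starting at segment $2, walk forward through
# directory $1 and print the first segment name that is not present.
first_missing_segment() {
    dir=$1
    seg=$2
    while [ -e "$dir/$seg" ]; do
        seg=$(next_wal_segment "$seg")
    done
    printf '%s\n' "$seg"
}
```

Under those assumptions, running first_missing_segment against /logs/var/backups/archlog with 0000000100000C28000000B1 as the starting segment would print the B2 name while that file is still absent, so the missing log could be fetched before the restore is attempted. The sketch only looks at file names; it does not check sizes or timeline history files.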