Re: [GENERAL] Recovery_target_time misinterpreted?

2013-08-02 Thread Klaus Ita
No, it's super frustrating. While I do the recovery, it says it reaches a
consistent recovery state, and i just cannot find a way how to convince pg
to stop at that state:



2013-08-02 09:23:25 GMT DEBUG:  postgres: PostmasterMain: initial
environment dump:
2013-08-02 09:23:25 GMT DEBUG:  -
2013-08-02 09:23:25 GMT DEBUG:  PG_GRANDPARENT_PID=9077
2013-08-02 09:23:25 GMT DEBUG:  PGLOCALEDIR=/usr/share/locale
2013-08-02 09:23:25 GMT DEBUG:  PGSYSCONFDIR=/etc/postgresql-common
2013-08-02 09:23:25 GMT DEBUG:  LANG=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:  PWD=/var/lib/postgresql
2013-08-02 09:23:25 GMT DEBUG:  PGDATA=/var/lib/postgresql/9.1/main
2013-08-02 09:23:25 GMT DEBUG:  LC_COLLATE=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:  LC_CTYPE=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:  LC_MESSAGES=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:  LC_MONETARY=C
2013-08-02 09:23:25 GMT DEBUG:  LC_NUMERIC=C
2013-08-02 09:23:25 GMT DEBUG:  LC_TIME=C
2013-08-02 09:23:25 GMT DEBUG:  -


2013-08-02 11:23:26 CEST DEBUG:  invoking IpcMemoryCreate(size=32399360)
2013-08-02 11:23:26 CEST DEBUG:  removing file "pg_notify/"
2013-08-02 11:23:26 CEST DEBUG:  max_safe_fds = 982, usable_fds = 1000,
already_open = 8
2013-08-02 11:23:26 CEST LOG:  database system was interrupted while in
recovery at log time 2013-07-29 11:45:24 CEST
2013-08-02 11:23:26 CEST HINT:  If this has occurred more than once some
data might be corrupted and you might need to choose an earlier recovery
target.
2013-08-02 11:23:26 CEST DEBUG:  restore_command = 'cp
/home/validad-pg-backups/pgmaster/wal_files/%f.gz /tmp/%f.gz && gunzip
/tmp/%f.gz && mv /tmp/%f %p'
2013-08-02 11:23:26 CEST DEBUG:  trigger_file =
'/var/lib/postgresql/9.1/main/stop_replication_trigger'
2013-08-02 11:23:26 CEST LOG:  starting archive recovery
2013-08-02 11:23:26 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/0001027A002C.gz
/tmp/0001027A002C.gz && gunzip /tmp/0001027A002C.gz
&& mv /tmp/0001027A002C pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:26 CEST DEBUG:  forked new backend, pid=9090 socket=9
2013-08-02 11:23:26 CEST LOG:  incomplete startup packet
2013-08-02 11:23:26 CEST DEBUG:  shmem_exit(0): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  proc_exit(0): 1 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  exit(0)
2013-08-02 11:23:26 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:26 CEST DEBUG:  server process (PID 9090) exited with exit
code 0
2013-08-02 11:23:26 CEST LOG:  restored log file "0001027A002C"
from archive
2013-08-02 11:23:26 CEST DEBUG:  got WAL segment from archive
2013-08-02 11:23:26 CEST DEBUG:  checkpoint record is at 27A/2CB77750
2013-08-02 11:23:26 CEST DEBUG:  redo record is at 27A/2CB77750; shutdown
TRUE
2013-08-02 11:23:26 CEST DEBUG:  next transaction ID: 0/381985248; next
OID: 1201662
2013-08-02 11:23:26 CEST DEBUG:  next MultiXactId: 130079; next
MultiXactOffset: 272843
2013-08-02 11:23:26 CEST DEBUG:  oldest unfrozen transaction ID: 197713560,
in database 331065
2013-08-02 11:23:26 CEST DEBUG:  transaction ID wrap limit is 2345197207,
limited by database with OID 331065
2013-08-02 11:23:26 CEST DEBUG:  resetting unlogged relations: cleanup 1
init 0
2013-08-02 11:23:26 CEST LOG:  redo starts at 27A/2CB777A8
2013-08-02 11:23:26 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/0001027A002D.gz
/tmp/0001027A002D.gz && gunzip /tmp/0001027A002D.gz
&& mv /tmp/0001027A002D pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:27 CEST DEBUG:  forked new backend, pid=9098 socket=9
2013-08-02 11:23:27 CEST FATAL:  the database system is starting up
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(1): 1 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  exit(1)
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:27 CEST DEBUG:  server process (PID 9098) exited with exit
code 1
2013-08-02 11:23:27 CEST LOG:  restored log file "0001027A002D"
from archive
2013-08-02 11:23:27 CEST DEBUG:  got WAL segment from archive
2013-08-02 11:23:27 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/0001027A002E.gz
/tmp/0001027A002E.gz && gunzip /tmp/0001027A002E.gz
&& mv /tmp/0001027A002E pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:27 CEST DEBUG:  forked new backend, pid=9105 socket=9
2013-08-02 

Re: [GENERAL] Recovery_target_time misinterpreted?

2013-08-02 Thread Albe Laurenz
Klaus Ita wrote:
>>> I have restored a Database Cluster with a recovery_target_time set to
>>>
>>> recovery_target_time =  '2013-07-27 21:20:17.127664+00'
>>> recovery_target_inclusive = false
>>>
>>>
>>>
>>> now it seems the restore rather restored to some point in time (rather the 
>>> 18th than the 27th). Is
>>> there an explanation for this huge gap? Is that the last 'consistent state'?
>> 
>> 
>> Maybe the log entries created during restore can answer the question.

> 2013-07-30 11:15:15 UTC <%> LOG:  starting point-in-time recovery to 
> 2013-07-27 21:20:17.127664+00
> 2013-07-30 11:15:15 UTC <%> LOG:  restored log file 
> "00010230005C" from archive
> 2013-07-30 11:15:15 UTC <%> LOG:  restored log file 
> "00010230005A" from archive
> 2013-07-30 11:15:15 UTC <%> LOG:  redo starts at 230/5ACD7CC0
> ...
> ...
> ...
> 2013-07-30 14:28:45 UTC <%> LOG:  restored log file 
> "000102640002" from archive
> 2013-07-30 14:28:45 UTC <%> LOG:  unexpected pageaddr 263/C706C000 in log 
> file 612, segment 2, offset
> 442368
> 2013-07-30 14:28:45 UTC <%> LOG:  redo done at 264/20698A8
> 2013-07-30 14:28:45 UTC <%> LOG:  last completed transaction was at log time 
> 2013-07-18
> 11:42:22.121512+00
> 2013-07-30 14:28:45 UTC <%> LOG:  restored log file 
> "000102640002" from archive
> cp: cannot stat 
> `/var/tmp/xlogs_recovered_2013-07-30/wal_files/0002.history*': No such 
> file or
> directory
> mv: cannot stat `/tmp/0002.history': No such file or directory
> 2013-07-30 14:28:45 UTC <%> LOG:  selected new timeline ID: 2
> cp: cannot stat 
> `/var/tmp/xlogs_recovered_2013-07-30/wal_files/0001.history*': No such 
> file or
> directory
> mv: cannot stat `/tmp/0001.history': No such file or directory
> 2013-07-30 14:28:45 UTC <%> LOG:  archive recovery complete
> 2013-07-30 14:29:09 UTC <%> LOG:  autovacuum launcher started
> 2013-07-30 14:29:09 UTC <%> LOG:  database system is ready to accept 
> connections
> 
> well, that does not indicate anything for me.

To me it indicates that log file 000102640002 might be corrupt.

PostgreSQL stops replaying WAL after it detects a corrupt WAL record.

Do you have a second copy of the WAL file?

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Recovery_target_time misinterpreted?

2013-07-31 Thread Klaus Ita
2013-07-30 11:15:15 UTC <%> LOG:  starting point-in-time recovery to
2013-07-27 21:20:17.127664+00
2013-07-30 11:15:15 UTC <%> LOG:  restored log file
"00010230005C" from archive
2013-07-30 11:15:15 UTC <%> LOG:  restored log file
"00010230005A" from archive
2013-07-30 11:15:15 UTC <%> LOG:  redo starts at 230/5ACD7CC0
...
...
...
2013-07-30 14:28:45 UTC <%> LOG:  restored log file
"000102640002" from archive
2013-07-30 14:28:45 UTC <%> LOG:  unexpected pageaddr 263/C706C000 in log
file 612, segment 2, offset 442368
2013-07-30 14:28:45 UTC <%> LOG:  redo done at 264/20698A8
2013-07-30 14:28:45 UTC <%> LOG:  last completed transaction was at log
time 2013-07-18 11:42:22.121512+00
2013-07-30 14:28:45 UTC <%> LOG:  restored log file
"000102640002" from archive
cp: cannot stat
`/var/tmp/xlogs_recovered_2013-07-30/wal_files/0002.history*': No such
file or directory
mv: cannot stat `/tmp/0002.history': No such file or directory
2013-07-30 14:28:45 UTC <%> LOG:  selected new timeline ID: 2
cp: cannot stat
`/var/tmp/xlogs_recovered_2013-07-30/wal_files/0001.history*': No such
file or directory
mv: cannot stat `/tmp/0001.history': No such file or directory
2013-07-30 14:28:45 UTC <%> LOG:  archive recovery complete
2013-07-30 14:29:09 UTC <%> LOG:  autovacuum launcher started
2013-07-30 14:29:09 UTC <%> LOG:  database system is ready to accept
connections


well, that does not indicate anything for me.



On Wed, Jul 31, 2013 at 9:37 AM, Albe Laurenz wrote:

> Klaus Ita wrote:
> > I have restored a Database Cluster with a recovery_target_time set to
> >
> > recovery_target_time =  '2013-07-27 21:20:17.127664+00'
> > recovery_target_inclusive = false
> >
> >
> >
> > now it seems the restore rather restored to some point in time (rather
> the 18th than the 27th). Is
> > there an explanation for this huge gap? Is that the last 'consistent
> state'?
>
> Maybe the log entries created during restore can answer the question.
>
> Yours,
> Laurenz Albe
>


Re: [GENERAL] Recovery_target_time misinterpreted?

2013-07-31 Thread Albe Laurenz
Klaus Ita wrote:
> I have restored a Database Cluster with a recovery_target_time set to
> 
> recovery_target_time =  '2013-07-27 21:20:17.127664+00'
> recovery_target_inclusive = false
> 
> 
> 
> now it seems the restore rather restored to some point in time (rather the 
> 18th than the 27th). Is
> there an explanation for this huge gap? Is that the last 'consistent state'?

Maybe the log entries created during restore can answer the question.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Recovery_target_time misinterpreted?

2013-07-31 Thread Klaus Ita
Hello PG Experts!

I have restored a Database Cluster with a recovery_target_time set to

recovery_target_time =  '2013-07-27 21:20:17.127664+00'
recovery_target_inclusive = false


now it seems the restore rather restored to some point in time (rather the
18th than the 27th). Is there an explanation for this huge gap? Is that the
last 'consistent state'?

pg version:
xaxos_ch=> select version();

version
--
 PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit
(1 row)


somewhat irrelevant, as my question is more general?


thank you for your help!

klaus