Re: [HACKERS] PITR - Some data is not recovered.

2004-08-10 Thread OKADA Satoshi
Tom Lane wrote:

>OKADA Satoshi <[EMAIL PROTECTED]> writes:
>  
>
>>I'm testing PITR using pgbench and postgresql ver.8.0bata. 
>>I think that result of recovery is wrong.
>>
>>
>
>Many thanks for this bug report.  I have developed a patch (attached)
>that seems to fix it for me.  Please test and see if you can still
>cause the problem.
>
I got correct results! Thanks for developing patch.

I will test PITR about various patterns. It will report, if there are
some failures.



Thanks,
Satoshi OKADA


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] PITR - Some data is not recovered.

2004-08-10 Thread Tom Lane
OKADA Satoshi <[EMAIL PROTECTED]> writes:
> I'm testing PITR using pgbench and postgresql ver.8.0bata. 
> I think that result of recovery is wrong.

Many thanks for this bug report.  I have developed a patch (attached)
that seems to fix it for me.  Please test and see if you can still
cause the problem.

BTW, I found that it was much easier to cause the bug to happen if you
force a long commit delay.  To do this you need to have
fsync = on
commit_siblings = 1
commit_delay = 10
and while the pgbench run is going, start a new psql and do
begin;
create table foo(f1 int);
(or any other command that modifies the database).  If you are watching
things with "top" you should notice a drastic decrease in the CPU
consumption of the backend connected to pgbench as soon as you do this.
Then proceed with taking the on-line backup.  You can end the extra psql
session after the backup is done, to allow pgbench to go back to
normal speed.

Doing it this way I got a failure just about every time (3 failures in 3
tries) whereas without, I had seen only one failure in four tries.

regards, tom lane


Index: src/backend/access/transam/xact.c
===
RCS file: /cvsroot/pgsql-server/src/backend/access/transam/xact.c,v
retrieving revision 1.177
diff -c -r1.177 xact.c
*** src/backend/access/transam/xact.c   3 Aug 2004 15:57:26 -   1.177
--- src/backend/access/transam/xact.c   10 Aug 2004 20:55:48 -
***
*** 574,586 
START_CRIT_SECTION();
  
/*
!* We only need to log the commit in XLOG if the transaction made
!* any transaction-controlled XLOG entries or will delete files.
 * (If it made no transaction-controlled XLOG entries, its XID
 * appears nowhere in permanent storage, so no one else will ever care
!* if it committed.)
 */
madeTCentries = (MyLastRecPtr.xrecoff != 0);
if (madeTCentries || nrels > 0)
{
XLogRecData rdata[3];
--- 574,601 
START_CRIT_SECTION();
  
/*
!* If our transaction made any transaction-controlled XLOG entries,
!* we need to lock out checkpoint start between writing our XLOG
!* record and updating pg_clog.  Otherwise it is possible for the
!* checkpoint to set REDO after the XLOG record but fail to flush the
!* pg_clog update to disk, leading to loss of the transaction commit
!* if we crash a little later.  Slightly klugy fix for problem
!* discovered 2004-08-10.
!*
 * (If it made no transaction-controlled XLOG entries, its XID
 * appears nowhere in permanent storage, so no one else will ever care
!* if it committed; so it doesn't matter if we lose the commit flag.)
!*
!* Note we only need a shared lock.
 */
madeTCentries = (MyLastRecPtr.xrecoff != 0);
+   if (madeTCentries)
+   LWLockAcquire(CheckpointStartLock, LW_SHARED);
+ 
+   /*
+* We only need to log the commit in XLOG if the transaction made
+* any transaction-controlled XLOG entries or will delete files.
+*/
if (madeTCentries || nrels > 0)
{
XLogRecData rdata[3];
***
*** 668,673 
--- 683,692 
TransactionIdCommitTree(nchildren, children);
}
  
+   /* Unlock checkpoint lock if we acquired it */
+   if (madeTCentries)
+   LWLockRelease(CheckpointStartLock);
+ 
END_CRIT_SECTION();
}
  
***
*** 850,855 
--- 869,876 
 *
 * We do not flush XLOG to disk unless deleting files, since the
 * default assumption after a crash would be that we aborted, anyway.
+* For the same reason, we don't need to worry about interlocking
+* against checkpoint start.
 */
if (MyLastRecPtr.xrecoff != 0 || nrels > 0)
{
Index: src/backend/access/transam/xlog.c
===
RCS file: /cvsroot/pgsql-server/src/backend/access/transam/xlog.c,v
retrieving revision 1.158
diff -c -r1.158 xlog.c
*** src/backend/access/transam/xlog.c   9 Aug 2004 16:26:01 -   1.158
--- src/backend/access/transam/xlog.c   10 Aug 2004 20:55:49 -
***
*** 4699,4704 
--- 4699,4713 
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.time = time(NULL);
  
+   /*
+ 

Re: [HACKERS] PITR - Some data is not recovered.

2004-08-09 Thread Tom Lane
> Okay, so theoretically it should work ... I'm trying it now to see
> if I can reproduce the problem locally.

It took several tries, but eventually I did reproduce it.  It seems
the triggering condition is for the REDO pointer to be before the
checkpoint record itself.  Not sure why, yet, but it looks like the
record(s) in between are not restored for some reason...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PITR - Some data is not recovered.

2004-08-09 Thread Tom Lane
OKADA Satoshi <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Is this actually the official beta1 version, or is it a snapshot from
>> last week sometime?  
>> 
> I got it from
> ftp.postgresql.org/pub/source/v8.0.0beta/postgresql-8.0.0beta1.tar.gz, and
> xlog.c revision number is 1.157.

Okay, so theoretically it should work ... I'm trying it now to see
if I can reproduce the problem locally.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] PITR - Some data is not recovered.

2004-08-09 Thread OKADA Satoshi
Tom Lane wrote:

>OKADA Satoshi <[EMAIL PROTECTED]> writes:
>
>>I'm testing PITR using pgbench and postgresql ver.8.0bata. 
>>
>
>Is this actually the official beta1 version, or is it a snapshot from
>last week sometime?  
>
I got it from
ftp.postgresql.org/pub/source/v8.0.0beta/postgresql-8.0.0beta1.tar.gz, and
xlog.c revision number is 1.157.

Thanks,
Satoshi OKADA


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] PITR - Some data is not recovered.

2004-08-09 Thread Tom Lane
OKADA Satoshi <[EMAIL PROTECTED]> writes:
> I'm testing PITR using pgbench and postgresql ver.8.0bata. 

Is this actually the official beta1 version, or is it a snapshot from
last week sometime?  In the first commit that had pg_start_backup(),
there wasn't any logic to ensure that the recovery replay would really
start from before the backup began.  If a checkpoint occurred after you
started the tar run but before tar got around to copying pg_control,
the recovery wouldn't work properly.  I can't be sure but your report
seems consistent with such a problem.

[ digs in CVS logs ]  The necessary additional code was committed in
revision 1.155 of src/backend/access/transam/xlog.c.  If you have 1.154
please update and try again.  If you do have 1.155 or later then we
need to look closer.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] PITR - Some data is not recovered.

2004-08-09 Thread OKADA Satoshi

I'm testing PITR using pgbench and postgresql ver.8.0bata. 
I think that result of recovery is wrong.  My test procedure 
is as follows:

I edited postgresql.conf for PITR and started the postmaster.
And I executed "pgbench -t 2".
  % pgbench -t 2

I did backup procedure before end of pgbench.
  % psql -c "SELECT pg_start_backup('label1')"
  % cd $PGDATA
  % tar cf /tmp/back.tar ./*
  % psql -c "SELECT pg_stop_backup()"

restore and recovery
  % pg_ctl stop
  % cp -R $PGDATA/pg_xlog /tmp/.
  % rm -rf $PGDATA/*
  % cd $PGDATA
  % tar xf /tmp/data.tar 

  create "recovery.conf" file

  % rm -rf $PGDATA/pg_xlog
  % cp -R /tmp/pg_xlog $PGDATA/.
  % pg_ctl start 

check data after recovery
  % psql -c "SELECT count(*) from history"
count 
   ---
1
   (1 row)

Number of records should be 2, but result is 1.
I found lack of data that was inserted near backup time, as a 
result of comparing original "history" table and  "history" table which
was recovered.
Is my backup procedure wrong?

Thanks, 
Satoshi OKADA


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings