On 06/07/16 17:41, Marco Nenciarini wrote:
> On 06/07/16 17:37, Marco Nenciarini wrote:
>> Hi,
>>
>> On 06/07/16 17:07, francesco.cano...@2ndquadrant.it wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference:      14230
>>> Logged by:          Francesco Canovai
>>> Email address:      francesco.cano...@2ndquadrant.it
>>> PostgreSQL version: 9.6beta2
>>> Operating system:   Linux
>>> Description:        
>>>
>>> I'm taking a concurrent backup from a standby in PostgreSQL 9.6beta2 and
>>> I get the wrong timeline from pg_stop_backup(false).
>>>
>>> This is what I'm doing:
>>>
>>> 1) I set up an environment with a primary server and a replica in streaming
>>> replication.
>>>
>>> 2) On the replica, I run
>>>
>>> postgres=# SELECT pg_start_backup('test_backup', true, false);
>>>  pg_start_backup 
>>> -----------------
>>>  0/3000A00
>>> (1 row)
>>>
>>> 3) When I run pg_stop_backup, the START WAL LOCATION it returns points to
>>> a WAL file whose name carries timeline 0.
>>>
>>> postgres=# SELECT pg_stop_backup(false);
>>>                               pg_stop_backup
>>> ---------------------------------------------------------------------------
>>>  (0/3000AE0,"START WAL LOCATION: 0/3000A00 (file 000000000000000000000003)+
>>>  CHECKPOINT LOCATION: 0/3000A38                                           +
>>>  BACKUP METHOD: streamed                                                  +
>>>  BACKUP FROM: standby                                                     +
>>>  START TIME: 2016-07-06 16:44:31 CEST                                     +
>>>  LABEL: test_backup                                                       +
>>>  ","")
>>> (1 row)
>>>
>>> The timeline returned is correct (1) when I run the same commands on the
>>> master.
>>>
>>> An incorrect backup label doesn't prevent PostgreSQL from starting up, but
>>> it affects the tools using that information.
>>>
>>>
>>
>> The issue here is that the do_pg_stop_backup function uses the
>> ThisTimeLineID variable that is not valid on standbys.
>>
>> I think that it should read it from
>> ControlFile->checkPointCopy.ThisTimeLineID as we do in do_pg_start_backup.
>>
> 
> No, that's not the solution.
> 
> The backup_label is generated during the do_pg_start_backup call, so
> the copy in ControlFile->checkPointCopy.ThisTimeLineID is also
> uninitialized.
> 

After further analysis, the issue is that do_pg_start_backup retrieves
starttli from the ControlFile structure, but still uses ThisTimeLineID,
which is not valid on a standby, when building the WAL file name that is
written to the backup label.

I've attached a very simple patch that fixes it.
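
To show why the label ends up naming a file on timeline 0, here is a
minimal standalone sketch of how the WAL file name is derived from a
timeline and a start location. It is not the PostgreSQL code: the
sketch_* names are hypothetical, and it assumes the default 16 MB WAL
segment size.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segments. All names here are hypothetical. */
#define SKETCH_WAL_SEG_SIZE     UINT64_C(0x1000000)
#define SKETCH_SEGS_PER_XLOGID  (UINT64_C(0x100000000) / SKETCH_WAL_SEG_SIZE)
#define SKETCH_FNAME_LEN        25	/* 24 hex digits plus the terminator */

/*
 * Build a WAL file name from a timeline ID and a 64-bit WAL location:
 * 8 hex digits of timeline, then the segment number split into two
 * 8-digit halves.
 */
static void
sketch_wal_file_name(char *buf, size_t buflen, uint32_t tli, uint64_t lsn)
{
	uint64_t	segno = lsn / SKETCH_WAL_SEG_SIZE;

	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 tli,
			 (uint32_t) (segno / SKETCH_SEGS_PER_XLOGID),
			 (uint32_t) (segno % SKETCH_SEGS_PER_XLOGID));
}
```

Called with the start location from the report (0/3000A00, that is
0x3000A00) and timeline 0, this produces 000000000000000000000003, the
name shown in the label above. With timeline 1 it produces
000000010000000000000003, which is what the label should contain.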

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e4645a3..aecede1 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** do_pg_start_backup(const char *backupids
*** 9974,9980 ****
  		} while (!gotUniqueStartpoint);
  
  		XLByteToSeg(startpoint, _logSegNo);
! 		XLogFileName(xlogfilename, ThisTimeLineID, _logSegNo);
  
  		/*
  		 * Construct tablespace_map file
--- 9974,9980 ----
  		} while (!gotUniqueStartpoint);
  
  		XLByteToSeg(startpoint, _logSegNo);
! 		XLogFileName(xlogfilename, starttli, _logSegNo);
  
  		/*
  		 * Construct tablespace_map file
