Re: block-level incremental backup

Ibrar Ahmed Wed, 17 Jul 2019 01:46:15 -0700

On Wed, Jul 17, 2019 at 10:22 AM Jeevan Chalke <
[email protected]> wrote:


>
>
> On Thu, Jul 11, 2019 at 5:00 PM Jeevan Chalke <
> [email protected]> wrote:
>
>> Hi Anastasia,
>>
>> On Wed, Jul 10, 2019 at 11:47 PM Anastasia Lubennikova <
>> [email protected]> wrote:
>>
>>> 23.04.2019 14:08, Anastasia Lubennikova wrote:
>>> > I'm volunteering to write a draft patch or, more likely, set of
>>> > patches, which
>>> > will allow us to discuss the subject in more detail.
>>> > And to do that I wish we agree on the API and data format (at least
>>> > broadly).
>>> > Looking forward to hearing your thoughts.
>>>
>>> Though the previous discussion stalled,
>>> I still hope that we could agree on basic points such as a map file
>>> format and protocol extension,
>>> which is necessary to start implementing the feature.
>>>
>>
>> It's great that you too have come up with a PoC patch. I haven't looked at
>> your changes in much detail, but we at EnterpriseDB are also working on this
>> feature and have started implementing it.
>>
>> Attached is the series of patches I have so far (they still need further
>> optimization and adjustments, though).
>>
>> Here is the overall design (as proposed by Robert) we are trying to
>> implement:
>>
>> 1. Extend the BASE_BACKUP command that can be used with replication
>> connections. Add a new [ LSN 'lsn' ] option.
>>
>> 2. Extend pg_basebackup with a new --lsn=LSN option that causes it to
>> send the option added to the server in #1.
>>
>> Here are the implementation details for the case where we have a valid LSN:
>>
>> sendFile() in basebackup.c is the function that does most of the work
>> for us. If the filename looks like a relation file, then we'll need to
>> consider sending only a partial file. The way to do that is probably:
>>
>> A. Read the whole file into memory.
>>
>> B. Check the LSN of each block. Build a bitmap indicating which blocks
>> have an LSN greater than or equal to the threshold LSN.
>>
>> C. If more than 90% of the bits in the bitmap are set, send the whole
>> file just as if this were a full backup. This 90% is a constant now; we
>> might make it a GUC later.
>>
>> D. Otherwise, send a file with .partial added to the name. The .partial
>> file begins with an indication of which blocks were changed, followed by
>> the data blocks. It also includes a checksum/CRC.
>> Currently, the .partial file format looks like this:
>>  - start with a 4-byte magic number
>>  - then store a 4-byte CRC covering the header
>>  - then a 4-byte count of the number of blocks included in the file
>>  - then the block numbers, each as a 4-byte quantity
>>  - then the data blocks
>>
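
Steps B and C above can be sketched roughly as follows. This is only an
illustrative Python sketch, not the patch itself: it assumes the default 8 kB
block size, assumes the page LSN sits in the first 8 bytes of each page as two
little-endian 32-bit words (high half first), and the helper names
page_lsn(), changed_blocks() and send_whole_file() are hypothetical.

```python
import struct

BLCKSZ = 8192          # assumed: PostgreSQL's default block size
WHOLE_FILE_PCT = 90    # the 90% constant from step C


def page_lsn(page: bytes) -> int:
    """Return the LSN stored in the first 8 bytes of a page.

    Assumption: two little-endian 32-bit words, high half first.
    """
    hi, lo = struct.unpack_from("<II", page, 0)
    return (hi << 32) | lo


def changed_blocks(data: bytes, threshold_lsn: int) -> list[int]:
    """Step B: block numbers whose LSN is >= the threshold LSN."""
    nblocks = len(data) // BLCKSZ
    return [
        blkno
        for blkno in range(nblocks)
        if page_lsn(data[blkno * BLCKSZ:(blkno + 1) * BLCKSZ]) >= threshold_lsn
    ]


def send_whole_file(nchanged: int, nblocks: int) -> bool:
    """Step C: send the whole file when more than 90% of blocks changed."""
    return nchanged * 100 > WHOLE_FILE_PCT * nblocks
```

Integer arithmetic is used for the 90% test so that the boundary case
(exactly 90% changed) is decided without floating-point rounding.
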
>>
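
The .partial layout listed in step D could be encoded and parsed along these
lines. Again a sketch under stated assumptions: the magic value is made up
(the mail does not give one), fields are assumed little-endian, the CRC is
assumed to cover the magic, block count and block numbers but not the CRC
field itself, and zlib.crc32 stands in for whatever CRC the patch really
uses. encode_partial() and decode_partial_header() are hypothetical names.

```python
import struct
import zlib

BLCKSZ = 8192               # assumed: PostgreSQL's default block size
PARTIAL_MAGIC = 0x50415254  # hypothetical magic number, for illustration only


def encode_partial(blocks: dict[int, bytes]) -> bytes:
    """Serialize {block number: page bytes} in the layout from step D."""
    blknos = sorted(blocks)
    magic = struct.pack("<I", PARTIAL_MAGIC)
    count_and_blknos = struct.pack(f"<I{len(blknos)}I", len(blknos), *blknos)
    # Assumed CRC coverage: every header field except the CRC itself.
    crc = struct.pack("<I", zlib.crc32(magic + count_and_blknos))
    data = b"".join(blocks[blkno] for blkno in blknos)
    return magic + crc + count_and_blknos + data


def decode_partial_header(buf: bytes) -> dict[int, int]:
    """Read only the header; return {block number: byte offset of its data}."""
    magic, crc, count = struct.unpack_from("<III", buf, 0)
    if magic != PARTIAL_MAGIC:
        raise ValueError("not a .partial file")
    header_len = 12 + 4 * count
    if zlib.crc32(buf[0:4] + buf[8:header_len]) != crc:
        raise ValueError("header CRC mismatch")
    blknos = struct.unpack_from(f"<{count}I", buf, 12)
    return {blkno: header_len + i * BLCKSZ for i, blkno in enumerate(blknos)}
```

Because the header alone yields the offset of every block, a reader never
has to scan the data blocks to locate one of them.
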
>> We are also working on combining these incremental backups with the full
>> backup, and for that we are planning to add a new utility called
>> pg_combinebackup. I will post the details on that later, once we are on the
>> same page about taking the backup.
>>
>
> For combining a full backup with one or more incremental backups, we are
> adding a new utility called pg_combinebackup in src/bin.
>
> Here is the overall design as proposed by Robert.
>
> pg_combinebackup starts from the LAST backup specified and works backward.
> It must NOT start with the full backup and work forward. This is important
> both for reasons of efficiency and of correctness. For example, if you start
> by copying over the full backup and then later apply the incremental backups
> on top of it, then you'll copy data and later end up overwriting it or
> removing it. Any files that are left over at the end that aren't in the final
> incremental backup, even as .partial files, need to be removed, or the result
> is wrong. We should aim for a system where every block in the output
> directory is written exactly once and nothing ever has to be created and then
> removed.
>
> To make that work, we should start by examining the final incremental
> backup. We should proceed with one file at a time. For each file:
>
> 1. If the complete file is present in the incremental backup, then just
> copy it to the output directory - and move on to the next file.
>
> 2. Otherwise, we have a .partial file. Work backward through the backup
> chain until we find a complete version of the file. That might happen when
> we get back to the full backup at the start of the chain, but it might also
> happen sooner - at which point we do not need to and should not look at
> earlier backups for that file. During this phase, we should read only the
> HEADER of each .partial file, building a map of which blocks we're
> ultimately going to need to read from each backup. We can also compute the
> offset within each file where that block is stored at this stage, again
> using the header information.
>
> 3. Now, we can write the output file - reading each block in turn from the
> correct backup and writing it to the output file, using the map we
> constructed in the previous step. We should probably keep all of the input
> files open over steps 2 and 3 and then close them at the end, because
> repeatedly closing and opening them is going to be expensive. When that's
> done, go on to the next file and start over at step 1.
>
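
For a single file, the backward walk in steps 1-3 might look like the sketch
below. It is an in-memory illustration only: each backup's copy of the file
is modelled as a hypothetical ("full", bytes) or ("partial", {blkno: bytes})
tuple, the chain is listed newest first, and file truncation between backups
is ignored. A real tool would read the .partial headers from disk and keep
the input files open, as described above.

```python
BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def plan_blocks(chain):
    """Step 2: map each block number to the index of the backup to read it from.

    chain is ordered newest first. The newest backup containing a block wins,
    and the walk stops at the first complete copy of the file.
    """
    source = {}
    for idx, (kind, payload) in enumerate(chain):
        if kind == "partial":
            for blkno in payload:
                source.setdefault(blkno, idx)
        else:
            for blkno in range(len(payload) // BLCKSZ):
                source.setdefault(blkno, idx)
            return source  # earlier backups must not be consulted
    raise ValueError("no complete copy of the file in the backup chain")


def combine_file(chain) -> bytes:
    """Step 3: emit every block exactly once, read from the planned backup."""
    source = plan_blocks(chain)
    nblocks = max(source) + 1 if source else 0
    out = bytearray()
    for blkno in range(nblocks):
        kind, payload = chain[source[blkno]]  # KeyError would mean a gap
        if kind == "full":
            out += payload[blkno * BLCKSZ:(blkno + 1) * BLCKSZ]
        else:
            out += payload[blkno]
    return bytes(out)
```

Step 1 falls out as a special case: if the newest backup already holds a
complete copy, plan_blocks() returns after the first entry and every block
is read from that backup.
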
>
At what stage will you apply the WAL generated between the START and STOP
of the backup?


> We have already started working on this design.
>
> --
> Jeevan Chalke
> Technical Architect, Product Development
> EnterpriseDB Corporation
>
>

-- 
Ibrar Ahmed
