On Thu, Oct 24, 2019 at 3:21 PM Ibrar Ahmed <ibrar.ah...@gmail.com> wrote:
> On Fri, Oct 18, 2019 at 4:12 PM Jeevan Chalke <jeevan.cha...@enterprisedb.com> wrote:
>> On Thu, Oct 17, 2019 at 10:51 AM Asif Rehman <asifr.reh...@gmail.com> wrote:
>>> Attached are the updated patches.
>>
>> I had a quick look over these changes and they look good overall.
>> However, here are a few review comments I caught while glancing through
>> patches 0002 and 0003.
>>
>> --- 0002 patch
>>
>> 1.
>> Can the lsn option be renamed to start-wal-location? That will be clearer.
>>
>> 2.
>> +typedef struct
>> +{
>> +    char        name[MAXPGPATH];
>> +    char        type;
>> +    int32       size;
>> +    time_t      mtime;
>> +} BackupFile;
>>
>> I think it will be good if we keep this structure in a common place so
>> that the client can also use it.
>>
>> 3.
>> + SEND_FILE_LIST,
>> + SEND_FILES_CONTENT,
>> Can the above two commands be renamed to SEND_BACKUP_MANIFEST and
>> SEND_BACKUP_FILE respectively?
>> The reason behind the first name change is that we are not getting only a
>> file list here; we are getting a few more details along with it. The
>> second will then be in line with
>> START_BACKUP/STOP_BACKUP/SEND_BACKUP_MANIFEST.
>>
>> 4.
>> Typos:
>> non-exlusive => non-exclusive
>> retured => returned
>> optionaly => optionally
>> nessery => necessary
>> totoal => total
>>
>> --- 0003 patch
>>
>> 1.
>> +static int
>> +simple_list_length(SimpleStringList *list)
>> +{
>> +    int         len = 0;
>> +    SimpleStringListCell *cell;
>> +
>> +    for (cell = list->head; cell; cell = cell->next, len++)
>> +        ;
>> +
>> +    return len;
>> +}
>>
>> I think it will be good if this goes into simple_list.c. That will help
>> other usages as well.
>>
>> 2.
>> Please revert these unnecessary changes:
>>
>> @@ -1475,6 +1575,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
>>           */
>>          snprintf(filename, sizeof(filename), "%s/%s", current_path, copybuf);
>> +
>>          if (filename[strlen(filename) - 1] == '/')
>>          {
>>              /*
>>
>> @@ -1528,8 +1622,8 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
>>           * can map them too.)
>>           */
>>          filename[strlen(filename) - 1] = '\0';  /* Remove trailing slash */
>> -
>>          mapped_tblspc_path = get_tablespace_mapping(&copybuf[157]);
>> +
>>          if (symlink(mapped_tblspc_path, filename) != 0)
>>          {
>>              pg_log_error("could not create symbolic link from \"%s\" to \"%s\": %m",
>>
>> 3.
>> Typos:
>> retrive => retrieve
>> takecare => take care
>> tablespae => tablespace
>>
>> 4.
>> The ParallelBackupEnd() function does not do anything for parallelism.
>> Would it be better to just rename it EndBackup()?
>>
>> 5.
>> To pass a tablespace path to the server in SEND_FILES_CONTENT, you are
>> reusing the LABEL option, which seems odd. How about adding a new option
>> for that?
>>
>> 6.
>> It would be good to have a comment in each function's prologue explaining
>> what the function actually does. For functions like:
>> GetBackupFilesList()
>> ReceiveFiles()
>> create_workers_and_fetch()
>>
>> Thanks
>>
>>> Thanks,
>>>
>>> --
>>> Asif Rehman
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>
>> --
>> Jeevan Chalke
>> Associate Database Architect & Team Lead, Product Development
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>
> I had a detailed discussion with Robert Haas at PostgresConf Europe about
> parallel backup. We discussed the current state of the patch and what needs
> to be done to get it committed.
>
> - The current patch uses processes to implement parallelism. There are many
>   reasons we need to use threads instead of processes.
>   To start with, as this is a client utility, it makes more sense to use
>   threads. The data needs to be shared among the different threads and the
>   main process; handling that is simpler than interprocess communication.

Yes, I agree. I have already converted the code to use threads instead of
processes. With a single-file fetching strategy, the competing
threads/processes have to coordinate over which file each one fetches next;
in a multiprocess application that coordination requires IPC, whereas
threads can simply share state. The thread-based approach avoids that
overhead.

> - Fetching a single file or multiple files at a time was also discussed. We
>   concluded that we need benchmarks to see whether disk I/O is a bottleneck
>   and whether parallel writing gives us any benefit. These benchmarks need
>   to be run on different hardware and different networks to identify the
>   real bottlenecks. In general, we agreed that we could start with fetching
>   one file at a time, but that will be revisited after the benchmarks are
>   done.

I'll share the updated patch in the next couple of days. After that, I'll
work on benchmarking it in the different environments I have available.

> - There is also an ongoing debate in this thread about whether we should
>   have one single tar file for all files or one tar file per thread. I
>   really want a single tar file, because the main purpose of the tar format
>   is to reduce the management of multiple files, and with one file per
>   thread we end up with many tar files. Therefore we need one master thread
>   that is responsible for writing the tar file, while all the other threads
>   receive the data from the network and stream it to the master thread.
>   This also supports the idea of a thread-based model rather than a
>   process-based approach, because the latter would require too much data
>   sharing between processes.
>   If we cannot achieve this, then we can disable the tar option for
>   parallel backup in the first version.

I am in favour of disabling the tar format for the first version of parallel
backup.

> - In the case of data sharing, we need to try to avoid unnecessary locking;
>   a more suitable algorithm for solving the reader-writer problem is
>   required.
>
> --
> Ibrar Ahmed

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca