On Mon, May 27, 2019 at 9:39 PM Paul Guo <p...@pivotal.io> wrote: > > > On Tue, May 14, 2019 at 11:06 AM Kyotaro HORIGUCHI < > horiguchi.kyot...@lab.ntt.co.jp> wrote: > >> Hello. >> >> At Mon, 13 May 2019 17:37:50 +0800, Paul Guo <p...@pivotal.io> wrote in < >> caeet0zf9yn4daxyuflzocayyxuff1ms_oqwea+rwv3gha5q...@mail.gmail.com> >> > Thanks for the reply. >> > >> > On Tue, May 7, 2019 at 2:47 PM Kyotaro HORIGUCHI < >> > horiguchi.kyot...@lab.ntt.co.jp> wrote: >> > >> > > >> > > + if (!(stat(parent_path, &st) == 0 && S_ISDIR(st.st_mode))) >> > > + { >> > > >> > > This patch is allowing missing source and destination directory >> > > even in consistent state. I don't think it is safe. >> > > >> > >> > I do not understand this. Can you elaborate? >> >> Suppose we were recoverying based on a backup at LSN1 targeting >> to LSN3 then it crashed at LSN2, where LSN1 < LSN2 <= LSN3. LSN2 >> is called as "consistency point", before where the database is >> not consistent. It's because we are applying WAL recored older >> than those that were already applied in the second trial. The >> same can be said for crash recovery, where LSN1 is the latest >> checkpoint ('s redo LSN) and LSN2=LSN3 is the crashed LSN. >> >> Creation of an existing directory or dropping of a non-existent >> directory are apparently inconsistent or "broken" so we should >> stop recovery when seeing such WAL records while database is in >> consistent state. >> > > This seems to be hard to detect. I thought using invalid_page mechanism > long ago, > but it seems to be hard to fully detect a dropped tablespace. > > > > > 2) Fixed dbase_desc(). Now the xlog output looks correct. >> > > > >> > > > rmgr: Database len (rec/tot): 42/ 42, tx: 486, lsn: >> > > > 0/016386A8, prev 0/01638630, desc: CREATE copy dir base/1 to >> > > > pg_tblspc/16384/PG_12_201904281/16386 >> > > > >> > > > rmgr: Database len (rec/tot): 34/ 34, tx: 487, lsn: >> > > > 0/01638EB8, prev 0/01638E40, desc: DROP dir >> > > > pg_tblspc/16384/PG_12_201904281/16386 >> > > >> > > WAL records don't convey such information. The previous >> > > description seems right to me. >> > > >> > >> > 2019-04-17 14:52:14.951 CST [23030] CONTEXT: WAL redo at 0/3011650 for >> > Database/CREATE: copy dir 1663/1 to 65546/65547 >> > The directories are definitely wrong and misleading. >> >> The original description is right in the light of how the server >> recognizes. The record exactly says that "copy dir 1663/1 to >> 65546/65547" and the latter path is converted in filesystem layer >> via a symlink. >> > > In either $PG_DATA/pg_tblspc or symlinked real tablespace directory, > there is an additional directory like PG_12_201905221 between > tablespace oid and database oid. See the directory layout as below, > so the directory info in xlog dump output was not correct. > > $ ls -lh data/pg_tblspc/ > > > total 0 > > > lrwxrwxrwx. 1 gpadmin gpadmin 6 May 27 17:23 16384 -> /tmp/2 > > > $ ls -lh /tmp/2 > > > total 0 > > > drwx------. 3 gpadmin gpadmin 18 May 27 17:24 PG_12_201905221 > >> >> >> > > > > Also I'd suggest we should use pg_mkdir_p() in >> > > TablespaceCreateDbspace() >> > > > > to replace >> > > > > the code block includes a lot of >> > > > > get_parent_directory(), MakePGDirectory(), etc even it >> > > > > is not fixing a bug since pg_mkdir_p() code change seems to be >> more >> > > > > graceful and simpler. >> > > >> > > But I don't agree to this. pg_mkdir_p goes above two-parents up, >> > > which would be unwanted here. >> > > >> > > I do not understand this also. pg_mkdir_p() is similar to 'mkdir -p'. >> > This change just makes the code concise. Though in theory the change is >> not >> > needed. >> >> We don't want to create tablespace direcotory after concurrent >> DROPing, as the comment just above is saying: >> >> | * Acquire TablespaceCreateLock to ensure that no DROP TABLESPACE >> | * or TablespaceCreateDbspace is running concurrently. >> >> If the concurrent DROP TABLESPACE destroyed the grand parent >> directory, we mustn't create it again. >> > > Yes, this is a good reason to keep the original code. Thanks. > > By the way, based on your previous test patch I added another test which > could easily detect > the missing src directory case. > >
I updated the patch to v3. In this version, we skip the error if copydir fails due to missing src/dst directory, but to make sure the ignoring is legal, I add a simple log/forget mechanism (Using List) similar to the xlog invalid page checking mechanism. Two tap tests are included. One is actually from a previous patch by Kyotaro in this email thread and another is added by me. In addition, dbase_desc() is fixed to make the message accurate. Thanks.
v3-0001-skip-copydir-if-either-src-directory-or-dst-direc.patch
Description: Binary data