Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
On Sat, 13 Aug 2011, Guillem Jover wrote: > None of the other file types really need the deferred renames, and I > actually rather not have deferred renames at all! but oh well. Also > because we need to have the immediate rename code path in any case for > the directories I don't see the point in unneedingly deferring renames > for the other file types. IMO this point of view hold when you implemented the initial change but now that we have real files/symlinks/hardlinks deffered, the other 3 (fifo/block/char dev) are like 0,01% of the affected files and this discrepancy just makes the code harder to understand. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-dpkg-bugs-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
On Fri, 2011-08-12 at 22:09:55 +0200, Raphael Hertzog wrote: > On Fri, 12 Aug 2011, Jonathan Nieder wrote: > > --- a/src/archives.c > > +++ b/src/archives.c > > @@ -851,7 +851,8 @@ tarobject(void *ctx, struct tar_entry *ti) > > * in .dpkg-new. > > */ > > > > - if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink) { > > + if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink || > > + ti->type == tar_filetype_hardlink) { > > nifd->namenode->flags |= fnnf_deferred_rename; > > > > debug(dbg_eachfiledetail, "tarobject done and installation deferred"); That's what I was referring to with my comment of the proper way to fix this. Which I've pushed now to the master and squeeze branches. > But from a design point of view, it just seems wrong to not have all files > treated the same. IMO we should be using deferred rename for all files except > directories. None of the other file types really need the deferred renames, and I actually rather not have deferred renames at all! but oh well. Also because we need to have the immediate rename code path in any case for the directories I don't see the point in unneedingly deferring renames for the other file types. > Then one can wonder whether we ought to do something to ensure other types > of files are written on the disk too... and whether adding a fsync() on > hardlink brings anything. Since the hardlinks all point to the same inode, > I guess a fsync() on the hardlink doesn't add anything, Right, doing fsync() on anything other than a tar file or tar directory is pointless. regards, guillem -- To UNSUBSCRIBE, email to debian-dpkg-bugs-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
Hi, Raphael Hertzog wrote: > Instead of making yet another exception for hardlinks, it's time > to decide that deferred rename is the norm and only directories > should be exempted from this. Makes perfect sense. Thanks! Here's the remainder of the train of thought I sent before. I don't see any reason to apply it unless we have some information (e.g., perf data) that suggests it helps. -- >8 -- Subject: dpkg: try opening files for reading when fsync-ing fsync() is not supposed to affect the file descriptor but the underlying inode, so on most modern operating systems it does not care whether it is passed a file descriptor open for reading or for writing. Opening for reading can be slightly more lightweight --- for example, on Linux it avoids having to allocate structures related to quota. However, some OSes (e.g., AIX) do require a descriptor open for writing. Therefore open files for reading, not writing, before fsyncing them and only fall back to opening for writing if the system requires it. Signed-off-by: Jonathan Nieder --- debian/changelog |4 src/archives.c | 51 ++- 2 files changed, 46 insertions(+), 9 deletions(-) diff --git a/debian/changelog b/debian/changelog index 7c6343d3..2b255171 100644 --- a/debian/changelog +++ b/debian/changelog @@ -186,6 +186,10 @@ dpkg (1.16.1) UNRELEASED; urgency=low * Add new --raw-extract option to dpkg-deb combining --control and --extract. Closes: #552123 + [ Jonathan Nieder ] + * Open extracted files for reading, not writing, before fsyncing them. +Only fall back to opening for writing if fsync fails with errno == EBADF. + [ Updated dpkg translations ] * German (Sven Joachim). Closes: #620312 * Swedish (Peter Krefting). diff --git a/src/archives.c b/src/archives.c index c8d80aad..c0dac081 100644 --- a/src/archives.c +++ b/src/archives.c @@ -911,6 +911,47 @@ tar_writeback_barrier(struct fileinlist *files, struct pkginfo *pkg) } #endif +static int +open_checked(const char *path, int flags) +{ + int fd = open(path, flags); + if (fd < 0) +ohshite(_("unable to open '%.255s'"), path); + return fd; +} + +static void +close_checked(int fd, const char *path) +{ + if (close(fd)) +ohshite(_("error closing/writing `%.255s'"), path); +} + +static void +fsync_path(const char *path) +{ + int fd; + + fd = open_checked(path, O_RDONLY); + if (fsync(fd) == 0) { +close_checked(fd, path); +return; + } + if (errno != EBADF) +ohshite(_("unable to sync file '%.255s'"), path); + + /* + * fsync() on some Unixen (e.g., AIX) requires "fd" to be + * open for writing, failing with EBADF when otherwise. + * Try again with O_WRONLY for their sake. + */ + close(fd); + fd = open_checked(path, O_WRONLY); + if (fsync(fd)) +ohshite(_("unable to sync file '%.255s'"), path); + close_checked(fd, path); +} + void tar_deferred_extract(struct fileinlist *files, struct pkginfo *pkg) { @@ -935,15 +976,7 @@ tar_deferred_extract(struct fileinlist *files, struct pkginfo *pkg) int fd; debug(dbg_eachfiledetail, "deferred extract needs fsync"); - - fd = open(fnamenewvb.buf, O_WRONLY); - if (fd < 0) -ohshite(_("unable to open '%.255s'"), fnamenewvb.buf); - if (fsync(fd)) -ohshite(_("unable to sync file '%.255s'"), fnamenewvb.buf); - if (close(fd)) -ohshite(_("error closing/writing `%.255s'"), fnamenewvb.buf); - + fsync_path(fnamenewvb.buf); cfile->namenode->flags &= ~fnnf_deferred_fsync; } -- 1.7.6 -- To UNSUBSCRIBE, email to debian-dpkg-bugs-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Processed: Re: Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
Processing commands for cont...@bugs.debian.org: > tag 635683 + patch Bug #635683 [dpkg] dpkg --unpack: »/usr/bin/perl5.12.4.dpkg-new« cannot be opened: The program cannot be executed or altered/changed (busy) Added tag(s) patch. > thanks Stopping processing here. Please contact me if you need assistance. -- 635683: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=635683 Debian Bug Tracking System Contact ow...@bugs.debian.org with problems -- To UNSUBSCRIBE, email to debian-dpkg-bugs-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
tag 635683 + patch thanks Hi, On Fri, 12 Aug 2011, Jonathan Nieder wrote: > Ok, here's a patch with better behavior. On Linux and similar platforms, > it just opens the file for reading. If fsync fails with errno == EBADF, > it falls back to opening for writing, and if that fails with errno == > ETXTBSY, it skips the fsync under the assumption that the file has already > been fsynced. I thought the same at first but all this is just wrong. If you look carefully, you'll notice that only real files are fsync()ed and hardlinks are not (look where fnnf_deferred_fsync is added to the flags). In the tar archive, the first file in a set of hardlinked files appears as a normal file and the others are then marked as hardlinks of the first file. What broke dpkg here is the the fact that hardlinks do not use deferred renames and are thus put into places before the "target file" is fsync()ed and renamed. Thus the minimal fix is this one: > --- a/src/archives.c > +++ b/src/archives.c > @@ -851,7 +851,8 @@ tarobject(void *ctx, struct tar_entry *ti) > * in .dpkg-new. > */ > > - if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink) { > + if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink || > + ti->type == tar_filetype_hardlink) { > nifd->namenode->flags |= fnnf_deferred_rename; > > debug(dbg_eachfiledetail, "tarobject done and installation deferred"); But from a design point of view, it just seems wrong to not have all files treated the same. IMO we should be using deferred rename for all files except directories. Then one can wonder whether we ought to do something to ensure other types of files are written on the disk too... and whether adding a fsync() on hardlink brings anything. Since the hardlinks all point to the same inode, I guess a fsync() on the hardlink doesn't add anything, the supplementary information is just the availability of the same file on a different path and to ensure persistency of this it's the fsync() on the directory that we would need. I have attached my suggested patch. It's tested and I confirm that it fixes this issue. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) commit 8f91f1da8defc4d8464fda8b100f966f655dceeb Author: Raphaël Hertzog Date: Fri Aug 12 21:23:58 2011 +0200 dpkg: use deferred rename for all files except directories The decision to defer fsync/rename only of normal files and to keep renaming the other directly types during the unpack (commit 9cd41fdda1c27169c52d73b3b3ce71991d724994) has been wrong first for the symlinks (commit a766f501f6da46aca070c315e6429e163d188202) and we discover now that it's also wrong for hardlinks: due to this change hardlinks are put into place before the "target" file. This opens up the possibility that the file gets executed (via the hardlink) before the deferred fsync/rename which then fails with ETXTBSY. Instead of making yet another exception for hardlinks, it's time to decide that deferred rename is the norm and only directories should be exempted from this. This change also means that when we create hardlinks we can assume the target file is a .dpkg-new one and we can get rid of the check on fnnf_deferred_rename to compute the correct path. Reported-by: Niko Tyni diff --git a/debian/changelog b/debian/changelog index e875ca6..7c6343d 100644 --- a/debian/changelog +++ b/debian/changelog @@ -98,6 +98,9 @@ dpkg (1.16.1) UNRELEASED; urgency=low * Fix possible segfault of dpkg in findbreakcycle(). LP: #733414 * dpkg-source now properly cleans up the temporary tarball generated for native formats in case of unexpected interruption. Closes: #631494 + * Fix dpkg to defer rename of hardlinks just like normal files. Otherwise +we might end up trying to fsync() a file that is already used and +fail with ETXTBSY. Closes: #635683 [ Guillem Jover ] * Install deb-src-control(5) man pages in dpkg-dev. Closes: #620520 diff --git a/src/archives.c b/src/archives.c index f060e70..c8d80aa 100644 --- a/src/archives.c +++ b/src/archives.c @@ -764,8 +764,7 @@ tarobject(void *ctx, struct tar_entry *ti) varbuf_add_char(&hardlinkfn, '/'); linknode = findnamenode(ti->linkname, 0); varbuf_add_str(&hardlinkfn, namenodetouse(linknode, tc->pkg)->name); -if (linknode->flags & fnnf_deferred_rename) - varbuf_add_str(&hardlinkfn, DPKGNEWEXT); +varbuf_add_str(&hardlinkfn, DPKGNEWEXT); varbuf_end_str(&hardlinkfn); if (link(hardlinkfn.buf,fnamenewvb.buf)) ohshite(_("error creating hard link `%.255s'"), ti->name); @@ -851,7 +850,7 @@ tarobject(void *ctx, struct tar_entry *ti) * in .dpkg-new. */ - if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink) { + if (ti->type != tar_filetype_dir) { ni
Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
Jonathan Nieder wrote: > To fix that, we should fsync() each file with multiple hard links only > once and then rename all links, so the logic to skip fsync would be no > longer needed. ... and here's a patch on top implementing that. --- debian/changelog | 13 ++--- src/archives.c | 38 -- 2 files changed, 14 insertions(+), 37 deletions(-) diff --git a/debian/changelog b/debian/changelog index e5de69d4..24d5df9f 100644 --- a/debian/changelog +++ b/debian/changelog @@ -189,13 +189,12 @@ dpkg (1.16.1~jrn) local; urgency=low [ Jonathan Nieder ] * Defer installing hard links with their final name until the file has been fsynced. - * Open extracted files for reading, not writing, in order to fsync them. -Otherwise the open can error out when preparing to rename a binary into -place that has a hard link already in use. Regression introduced in -1.15.6.1. Closes: #635683 - * If fsync fails with EBADF, as it is documented to on older platforms -when fd is not opened for writing, reopen the file for writing and try -again. If that fails with ETXTBSY, just skip the fsync. + * Suppress the deferred fsync before renaming hard links to files that +have already been fsynced and linked before. This prevents errors from +opening binaries for writing when another link is already in use. +Regression introduced in 1.15.6.1. Closes: #635683 + * When possible, open extracted files for reading, not writing, in order +to fsync them. [ Updated dpkg translations ] * German (Sven Joachim). Closes: #620312 diff --git a/src/archives.c b/src/archives.c index c221f388..61eee5be 100644 --- a/src/archives.c +++ b/src/archives.c @@ -851,6 +851,13 @@ tarobject(void *ctx, struct tar_entry *ti) * in .dpkg-new. */ + /* + * Files with multiple links only need one fsync: before the first link + * is renamed into place. + */ + if (ti->type == tar_filetype_hardlink) +nifd->namenode->flags &= ~fnnf_deferred_fsync; + if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink || ti->type == tar_filetype_hardlink) { nifd->namenode->flags |= fnnf_deferred_rename; @@ -920,32 +927,6 @@ deferred_extract_fsync(const char *path) * The documentation for some Unixen (e.g., AIX) says fsync * requires a file descriptor open for writing and will * return EBADF otherwise. - * - * Unfortunately, opening a file for writing fails if that - * file happens to be a binary whose text is mapped. That - * can happen in two cases: - * - * A) We are installing a hard link to a binary that was - * already installed into place and run. In that case, - * we've already done the fsync, so it's safe to skip. - * - * B) Some obnoxious user decided to run the .dpkg-new - * file directly. - * - * For now let's assume we are in case A and skip the fsync - * when errno == ETXTBSY. XXX: It would be better to prevent (A) - * higher in the call stack and fall back to sync() here. - * - * On Linux, we just open the file for reading, fsync does not - * return EBADF, and the behavior is as simple as - * - *fd = open(...); - *if (fd < 0) - * ohshite(...); - *if (fsync(...)) - * ohshite(...); - *if (close(...)) - * ohshite(...); */ int fd; @@ -961,11 +942,8 @@ deferred_extract_fsync(const char *path) /* Try again with an fd open for writing. */ close(fd); fd = open(path, O_WRONLY); - if (fd < 0) { -if (errno == ETXTBSY) - goto done; + if (fd < 0) ohshite(_("unable to open '%.255s'"), path); - } if (fsync(fd)) ohshite(_("unable to sync file '%.255s'"), path); -- 1.7.6 -- To UNSUBSCRIBE, email to debian-dpkg-bugs-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#635683: [PATCH/RFC v3] Re: sid perl update problem with 5.12.4-2: text file busy
Guillem Jover wrote: > This is misleading on Linux, but it's not a safe portable assumption > to make on POSIX in general as that behaviour is not specified and as > such is implementation specific Ok, here's a patch with better behavior. On Linux and similar platforms, it just opens the file for reading. If fsync fails with errno == EBADF, it falls back to opening for writing, and if that fails with errno == ETXTBSY, it skips the fsync under the assumption that the file has already been fsynced. One problem: on the affected platforms (i.e., not Debian), a prankster could go around executing .dpkg-new files to prevent the fsyncs, trigger a crash, and then we have filesystem corruption again. Unpleasant. To fix that, we should fsync() each file with multiple hard links only once and then rename all links, so the logic to skip fsync would be no longer needed. Untested. --- debian/changelog | 11 src/archives.c | 74 ++--- 2 files changed, 75 insertions(+), 10 deletions(-) diff --git a/debian/changelog b/debian/changelog index 20c296d5..e5de69d4 100644 --- a/debian/changelog +++ b/debian/changelog @@ -186,6 +186,17 @@ dpkg (1.16.1~jrn) local; urgency=low * Add new --raw-extract option to dpkg-deb combining --control and --extract. Closes: #552123 + [ Jonathan Nieder ] + * Defer installing hard links with their final name until the file has +been fsynced. + * Open extracted files for reading, not writing, in order to fsync them. +Otherwise the open can error out when preparing to rename a binary into +place that has a hard link already in use. Regression introduced in +1.15.6.1. Closes: #635683 + * If fsync fails with EBADF, as it is documented to on older platforms +when fd is not opened for writing, reopen the file for writing and try +again. If that fails with ETXTBSY, just skip the fsync. + [ Updated dpkg translations ] * German (Sven Joachim). Closes: #620312 * Swedish (Peter Krefting). diff --git a/src/archives.c b/src/archives.c index f060e706..c221f388 100644 --- a/src/archives.c +++ b/src/archives.c @@ -851,7 +851,8 @@ tarobject(void *ctx, struct tar_entry *ti) * in .dpkg-new. */ - if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink) { + if (ti->type == tar_filetype_file || ti->type == tar_filetype_symlink || + ti->type == tar_filetype_hardlink) { nifd->namenode->flags |= fnnf_deferred_rename; debug(dbg_eachfiledetail, "tarobject done and installation deferred"); @@ -912,6 +913,67 @@ tar_writeback_barrier(struct fileinlist *files, struct pkginfo *pkg) } #endif +static void +deferred_extract_fsync(const char *path) +{ + /* + * The documentation for some Unixen (e.g., AIX) says fsync + * requires a file descriptor open for writing and will + * return EBADF otherwise. + * + * Unfortunately, opening a file for writing fails if that + * file happens to be a binary whose text is mapped. That + * can happen in two cases: + * + * A) We are installing a hard link to a binary that was + * already installed into place and run. In that case, + * we've already done the fsync, so it's safe to skip. + * + * B) Some obnoxious user decided to run the .dpkg-new + * file directly. + * + * For now let's assume we are in case A and skip the fsync + * when errno == ETXTBSY. XXX: It would be better to prevent (A) + * higher in the call stack and fall back to sync() here. + * + * On Linux, we just open the file for reading, fsync does not + * return EBADF, and the behavior is as simple as + * + *fd = open(...); + *if (fd < 0) + * ohshite(...); + *if (fsync(...)) + * ohshite(...); + *if (close(...)) + * ohshite(...); + */ + + int fd; + + fd = open(path, O_RDONLY); + if (fd < 0) +ohshite(_("unable to open '%.255s'"), path); + if (fsync(fd) == 0) +goto done; + else if (errno != EBADF) +ohshite(_("unable to sync file '%.255s'"), path); + + /* Try again with an fd open for writing. */ + close(fd); + fd = open(path, O_WRONLY); + if (fd < 0) { +if (errno == ETXTBSY) + goto done; +ohshite(_("unable to open '%.255s'"), path); + } + if (fsync(fd)) +ohshite(_("unable to sync file '%.255s'"), path); + +done: + if (close(fd)) +ohshite(_("error closing/writing `%.255s'"), path); +} + void tar_deferred_extract(struct fileinlist *files, struct pkginfo *pkg) { @@ -936,15 +998,7 @@ tar_deferred_extract(struct fileinlist *files, struct pkginfo *pkg) int fd; debug(dbg_eachfiledetail, "deferred extract needs fsync"); - - fd = open(fnamenewvb.buf, O_WRONLY); - if (fd < 0) -ohshite(_("unable to open '%.255s'"), fnamenewvb.buf); - if (fsync(fd)) -ohshite(_("unable to sync file '%.255s'"), fnamenewvb.buf); - if (close(fd)) -ohshite(_("error closing/writing `%.255s'"), f