Hi Ben,

Ben Hutchings wrote:
> On Mon, 2010-11-15 at 19:31 +0100, Philipp Kern wrote:
>> and I don't suppose we could make that the default? Is there anything
>> else the dpkg developers can try to be portable and still not be
>> sacrificing performance?
>
> I'm coming to this late. It sounds like dpkg has changed its behaviour
> several times recently. Please can you summarise dpkg's current and
> proposed use of fsync() vs sync(), and the reasons for this.
>
> Also do I understand correctly that fsync() is more expensive when ext4
> delayed allocation is in use?

Here's a try, based on "git log --grep=sync".

Some reports[1] indicated that dpkg was truncating files to zero
length on ext4 (and ubifs) filesystems with delayed allocation
enabled. This happened whenever a system crash occurred during or
shortly after an upgrade, which is not really acceptable, especially
considering that upgrades are a time when a person is likely to be
trying things out that might crash the system.

 I. So some patches were written and applied to fsync() each new
    file as it is written (before the rename()). These patches are
    part of dpkg 1.15.6.

The result was very slow[2], especially on ext4 but also on ext3.
Colin Watson noticed that a single sync() is a lot faster.
Unfortunately, sync() is not portably synchronous (e.g., on BSD it
returns right away, before files have been committed to disk), so the
attempted fix was

 II. Write all .dpkg-tmp files. fsync storm to make sure all
     .dpkg-tmp files have well-defined content, then rename storm to
     put them in place.

This appeared to improve performance quite a bit, but the apparent
improvement was due to a bug[3]. After fixing that bug, the slowdown
was still present[4] (as Mike Hommey had predicted[5]), at least on
ext4. There seems to be a per-fsync cost.

So that leaves us with sync():

 III. Write all .dpkg-tmp files. sync(). Rename storm to put the
      files in place.

which is quite fast, really: its cost about cancels out the gain from
the new optimization of using FIEMAP to read /var/lib/dpkg/*, so the
net result is about the same speed as lenny.
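
To make the ordering in II concrete, here is a rough sketch in C. It is not dpkg's actual code: the function names (tmp_name, write_tmp_synced, install_batch), the 0644 mode and the fixed-size path buffer are all made up for illustration, and error handling is kept to the bare minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build "<path>.dpkg-tmp" in buf. 0 on success. */
int tmp_name(char *buf, size_t size, const char *path)
{
    int n = snprintf(buf, size, "%s.dpkg-tmp", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Write data to "<path>.dpkg-tmp" and fsync() it, so the temporary
 * file has well-defined content before any rename() happens. */
int write_tmp_synced(const char *path, const char *data, size_t len)
{
    char tmp[4096];

    assert(path != NULL && data != NULL);
    if (tmp_name(tmp, sizeof(tmp), path) < 0)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        data += n;
        len -= (size_t)n;
    }

    /* This is the per-file cost discussed above. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Approach II: write and fsync every temporary file first ("fsync
 * storm"), then rename them all into place ("rename storm"). */
int install_batch(const char *const paths[],
                  const char *const contents[], size_t count)
{
    char tmp[4096];
    size_t i;

    for (i = 0; i < count; i++)
        if (write_tmp_synced(paths[i], contents[i],
                             strlen(contents[i])) < 0)
            return -1;

    for (i = 0; i < count; i++) {
        if (tmp_name(tmp, sizeof(tmp), paths[i]) < 0)
            return -1;
        if (rename(tmp, paths[i]) < 0)
            return -1;
    }
    return 0;
}
```

Approach I corresponds to calling write_tmp_synced() and then rename() one file at a time. Approach III would drop the fsync() from the helper and issue a single sync() between the two loops instead.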

Unfortunately, in addition to not being portably synchronous, sync()
does not have the right semantics. In particular, when building with
pbuilder on tmpfs, sync() syncs _all_ filesystems, including whatever
slow thumb drive happens to be mounted at the same time.

Hope that helps,
Jonathan

[1] See https://bugzilla.kernel.org/show_bug.cgi?id=15910 for example.
[2] http://lists.debian.org/debian-dpkg/2010/03/threads.html#00029
[3] http://bugs.debian.org/577756
[4] http://bugs.debian.org/578635
[5] http://lists.debian.org/debian-dpkg/2010/03/msg00036.html