Re: Speeding up dpkg, a proposal
On Sat, 19 Mar 2011, Goswin von Brederlow wrote:
> > I recently had a situation where I was doing a backup to a USB flash
> > device and I decided to install some Debian packages. The sync()
> > didn't complete until the backup completed because the write-back
> > buffers were never empty!
>
> Which is odd because I've used sync() while copying large amounts of
> data and the sync() completes while the copying is going on steadily.
> It only waits for the currently dirty buffers to be written, not for any
> buffers that get dirtied later. At least it used to.

Maybe. Of course if I happened to have 500M of dirty buffers on the flash
device, that could take a long time to be written and give a result that's
pretty close to my first impression.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/201103190034.04057.russ...@coker.com.au
Re: Speeding up dpkg, a proposal
Russell Coker writes:
> I recently had a situation where I was doing a backup to a USB flash device
> and I decided to install some Debian packages. The sync() didn't complete
> until the backup completed because the write-back buffers were never empty!

Which is odd because I've used sync() while copying large amounts of data
and the sync() completes while the copying is going on steadily. It only
waits for the currently dirty buffers to be written, not for any buffers
that get dirtied later. At least it used to.

MfG
        Goswin

Archive: http://lists.debian.org/87r5a4tvcx.fsf@frosties.localnet
Re: Speeding up dpkg, a proposal
On Fri, 18 Mar 2011, Russell Coker wrote:
> On Fri, 18 Mar 2011, Goswin von Brederlow wrote:
> > > On a machine with lots of RAM (== disk cache...) and high I/O load, you
> > > don't want to do a (global!) sync(). This can totally kill the machine
> > > for 20min or more and is a big no go.
> > >
> > > -- vbi
> >
> > Then don't use the option. It should definitely be an option:
>
> It's a pity that there is no kernel support for synching one filesystem (or
> maybe a few filesystems).

It is being implemented right now, actually... Maybe it will be in 2.6.39.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond where
  the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

Archive: http://lists.debian.org/20110318013625.ga6...@khazad-dum.debian.net
Re: Speeding up dpkg, a proposal
On Fri, Mar 18, 2011 at 12:11 AM, Russell Coker wrote:
>> Then don't use the option. It should definitely be an option:
>
> It's a pity that there is no kernel support for synching one filesystem (or
> maybe a few filesystems).

That'd be only a partial workaround. Even with a single fs, one big sync
can be bad.

> I recently had a situation where I was doing a backup to a USB flash device
> and I decided to install some Debian packages. The sync() didn't complete
> until the backup completed because the write-back buffers were never empty!

Didn't dpkg stop using sync?

Olaf

Archive: http://lists.debian.org/AANLkTi=sjgv2n_otcrgymw8zvqo0pgd9zz_1+dquu...@mail.gmail.com
Re: Speeding up dpkg, a proposal
On Fri, 18 Mar 2011, Goswin von Brederlow wrote:
> > On a machine with lots of RAM (== disk cache...) and high I/O load, you
> > don't want to do a (global!) sync(). This can totally kill the machine
> > for 20min or more and is a big no go.
> >
> > -- vbi
>
> Then don't use the option. It should definitely be an option:

It's a pity that there is no kernel support for synching one filesystem
(or maybe a few filesystems).

I recently had a situation where I was doing a backup to a USB flash
device and I decided to install some Debian packages. The sync() didn't
complete until the backup completed because the write-back buffers were
never empty!

If dpkg had only sync'd the filesystems used for Debian files (i.e. ones
on the hard drive) then the package install would have taken a fraction
of the time and I could have used the programs in question while the
backup was running.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

Archive: http://lists.debian.org/201103181011.08741.russ...@coker.com.au
Re: Speeding up dpkg, a proposal
Adrian von Bidder writes:
> On Wednesday 02 March 2011 17.02:11 Marius Vollmer wrote:
>> - Instead, we move all packages that are to be unpacked into
>> half-installed / reinstreq before touching the first one, and put a
>> big sync() right before carefully writing /var/lib/dpkg/status.
>
> You don't want to do this. While production systems usually are upgraded
> in downtime windows (with less load), it is sometimes necessary to
> install some package (tcpdump or whatever to diagnose problems...) while
> the system is under high load. Especially when you're trying to find out
> why the machine has a load of 20 and you can't afford to kill it...
>
> On a machine with lots of RAM (== disk cache...) and high I/O load, you
> don't want to do a (global!) sync(). This can totally kill the machine
> for 20min or more and is a big no go.
>
> -- vbi

Then don't use the option. It should definitely be an option:

sync / fs sync / fsync / sync only metadata / single sync at end / no sync at all

MfG
        Goswin

Archive: http://lists.debian.org/871v25a36z.fsf@frosties.localnet
Re: Speeding up dpkg, a proposal
Marius Vollmer writes:
> ext Chow Loong Jin writes:
>
>> Could we somehow avoid using sync()? sync() syncs all mounted
>> filesystems, which isn't exactly very friendly when you have a few
>> slow-syncing filesystems like btrfs (or even NFS) mounted.
>
> Hmm, right. We could keep a list of all files that need fsyncing, and
> then fsync them all just before writing the checkpoint.
>
> Half of that is already done (for the content of the packages), we would
> need to add it for the files in /var/lib/dpkg/, or we could just fsync
> the whole directory.
>
> But then again, I would argue that the sync() is actually necessary
> always, for correct semantics: You also want to sync everything that the
> postinst script has done before recording that a package is fully
> installed.

Except for chroots, the throw-away-after-use kind, this really doesn't
matter. If the system crashes at any point before the chroot is thrown
away then it just gets thrown away after boot and the whole operation is
restarted from scratch.

MfG
        Goswin

Archive: http://lists.debian.org/8762rha3ax.fsf@frosties.localnet
Re: Speeding up dpkg, a proposal
On 3/3/2011 1:32 PM, Phillip Susi wrote:
> Don't you mean it MAY be initiated, if the cache decides there is enough
> memory pressure? I don't know of any other call besides fsync and
> friends to force the writeback, so before that is called, it could (and
> likely is, if you have plenty of memory) still be sitting in the cache
> with the disk queue idle.

Never mind, I figured out what you meant: the new
writeback_init/barrier() code that uses sync_file_range. I hadn't seen
that before. That should do quite well for an individual archive, so I
guess the next thing to do is try to get the calls to
tar_deferred_extract() delayed until after multiple archives are
unpacked.

Archive: http://lists.debian.org/4d6fe262.3050...@cfl.rr.com
Re: Speeding up dpkg, a proposal
On 3/3/2011 1:30 PM, Guillem Jover wrote:
> Actually, this was discarded early on, as Linux does not implement
> aio_fsync() for any file system. Also the interface is quite cumbersome,
> as it requires keeping state for each aio operation and using SA_SIGINFO
> (which is not yet available everywhere).

I was wondering why I couldn't find the fs implementation. How annoying.
The POSIX and libaio wrapper interfaces are cumbersome, but directly
calling io_submit() would be perfect to queue up all of the writes at
once and then wait for their completion. If the kernel actually
implemented it. Sigh.

Archive: http://lists.debian.org/4d6fe195.8030...@cfl.rr.com
Re: Speeding up dpkg, a proposal
On Thu, Mar 3, 2011 at 7:32 PM, Phillip Susi wrote:
>> And we use some linux specific ioctl to avoid that fragmentation.
>
> Could you be more specific?

sync_file_range(fd.a, 0, 0, SYNC_FILE_RANGE_WRITE);
sync_file_range(fd.a, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE);

Olaf

Archive: http://lists.debian.org/AANLkTimBZq=piivirm+o+rhzt5d-trmcbn6kncy2m...@mail.gmail.com
Re: Speeding up dpkg, a proposal
On 3/3/2011 12:49 PM, Raphael Hertzog wrote:
> That's wrong. The writeback is initiated before the fsync() so the
> filesystem can order the write how it wants.

Don't you mean it MAY be initiated, if the cache decides there is enough
memory pressure? I don't know of any other call besides fsync and
friends to force the writeback, so before that is called, it could (and
likely is, if you have plenty of memory) still be sitting in the cache
with the disk queue idle.

> And we use some linux specific ioctl to avoid that fragmentation.

Could you be more specific?

Archive: http://lists.debian.org/4d6fdeb8.5010...@cfl.rr.com
Re: Speeding up dpkg, a proposal
On Thu, 2011-03-03 at 18:49:44 +0100, Raphael Hertzog wrote:
> On Thu, 03 Mar 2011, Phillip Susi wrote:
> > It would be much better to use aio to queue up all of the syncs at once,
> > so that the elevator can coalesce and reorder them for optimal writing.
>
> I'm not convinced it would help. You're welcome to try and provide a
> patch if it works.
>
> I'm not even convinced it's possible with the existing interfaces (but I
> have no experience with AIO). aio_fsync() is only usable with aio_write()
> and it's not possible to use lio_listio() to batch a bunch of aio_fsync().

Actually, this was discarded early on, as Linux does not implement
aio_fsync() for any file system. Also the interface is quite cumbersome,
as it requires keeping state for each aio operation and using SA_SIGINFO
(which is not yet available everywhere).

regards,
guillem

Archive: http://lists.debian.org/20110303183039.ga4...@gaara.hadrons.org
Re: Speeding up dpkg, a proposal
Hi,

On Thu, 03 Mar 2011, Phillip Susi wrote:
> I have another proposal. It looks like right now dpkg extracts all of
> the files in the archive, then for each one, calls fsync() then
> rename(). Because this is done serially for each file in the archive,
> it forces small, out of order writes that cause extra seeking and queue
> plugging.

That's wrong. The writeback is initiated before the fsync() so the
filesystem can order the writes how it wants. And we use a Linux-specific
ioctl to avoid that fragmentation.

> It would be much better to use aio to queue up all of the syncs at once,
> so that the elevator can coalesce and reorder them for optimal writing.

I'm not convinced it would help. You're welcome to try and provide a
patch if it works.

I'm not even convinced it's possible with the existing interfaces (but I
have no experience with AIO). aio_fsync() is only usable with aio_write()
and it's not possible to use lio_listio() to batch a bunch of aio_fsync().

Cheers,
--
Raphaël Hertzog ◈ Debian Developer

Follow my Debian News ▶ http://RaphaelHertzog.com (English)
                      ▶ http://RaphaelHertzog.fr (Français)

Archive: http://lists.debian.org/20110303174944.gb13...@rivendell.home.ouaza.com
Re: Speeding up dpkg, a proposal
I have another proposal. It looks like right now dpkg extracts all of
the files in the archive, then for each one calls fsync() then rename().
Because this is done serially for each file in the archive, it forces
small, out-of-order writes that cause extra seeking and queue plugging.

It would be much better to use aio to queue up all of the syncs at once,
so that the elevator can coalesce and reorder them for optimal writing.

Further, this processing is done on a per-archive basis. It would be
even better if multiple archives could be extracted at once, and all of
the fsyncs from all of the archives batched up.

Archive: http://lists.debian.org/4d6fbf93.1030...@cfl.rr.com
Re: Speeding up dpkg, a proposal
On Thu, 03 Mar 2011, Marius Vollmer wrote:
> ext Raphael Hertzog writes:
>
> > On Wed, 02 Mar 2011, Marius Vollmer wrote:
> >> - Instead, we move all packages that are to be unpacked into
> >> half-installed / reinstreq before touching the first one, and put a
> >> big sync() right before carefully writing /var/lib/dpkg/status.
> >
> > The big sync() doesn't work. It means dpkg never finishes its work on
> > systems with lots of unrelated I/O.
>
> Ok, understood. It's now clear to me that the big sync should be
> replaced with deferred fsyncs. (I would defer the fsync of the content
> of all packages until modstatdb_checkpoint, not just until
> tar_deferred_extract.)

This is assuming you don't use --force-unsafe-io. Otherwise you don't
sync package contents at all.

> With that change, do you think the approach is sound?

It looks like it could work in principle. But it might have unexpected
complications in case of interruptions. You said it yourself: "it leaves
its database behind in a correct but quite outdated and not so friendly
state".

The "reinstreq" flag is usually present on a single package only, and we
know that this single package is (likely) broken. So we reinstall it and
we can go ahead. Now with your scheme, we have many packages in that
state and we don't know which ones are really broken. At least the one
which was being processed at the time of the interruption (as in power
loss). Are we sure there is no case where this brokenness leads to
failures in the preinst of some of the other packages to be reinstalled?
How is the package manager supposed to order the reinstallations?

> To understand our troubles, you need to know that we have around 2500
> packages with just a single file in them. For those packages, dpkg
> spends the largest part of its time in writing the nine journal entries
> to /var/lib/dpkg/updates.

Nine? I haven't reviewed the code but that's quite a lot indeed. Maybe
there's room for optimization here.

A quick review indeed reveals this sequence (for an upgrade):
- half_installed + reinstreq
- unpacked + reinstreq
- half_installed + reinstreq
- unpacked + reinstreq
- unpacked
- unpacked (again at start of configure, don't know why)
- half_configured
- installed
- the final installation in the status file

Indeed, your scenario is very particular. Usually you have many files,
and thus the fsync() of all the files is what takes the most time
(compared to the 9 fsync() for the status information), and there
--force-unsafe-io shows a sizable improvement.

> We will reduce the number of our packages, so this issue might solve
> itself that way, but I had good success in reducing the per-package
> overhead of dpkg, and if it is correct and works for us, why not use the
> 'reckless' option as well?

I don't think we're interested in adding more options that make it even
more difficult to understand what dpkg does. Either there's a better way
of doing it and we use it all the time, or we keep it like it is.

For instance, I wonder if we could not get rid of two modstatdb_note
calls in the above list:
- the first "unpacked + reinstreq" could be directly brought back to
  "half_installed + reinstreq" with minimal consequences (the only
  difference comes when one of the conflictor/to-be-deconfigured
  packages fails to be deconfigured);
- the other one at the start of the configure process.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer

Archive: http://lists.debian.org/20110303133441.gc11...@rivendell.home.ouaza.com
Re: Speeding up dpkg, a proposal
On Thu, Mar 3, 2011 at 8:33 AM, Marius Vollmer wrote:
> And in the big picture, all we need is some guarantee that renames are
> committed in order, and after the content of the file that is being
> renamed. I have the impression that all reasonable filesystems give
> that guarantee now, no?

No, they took shortcuts in the implementation and commit the rename
before the source file.

> Hmm, right. We could keep a list of all files that need fsyncing, and
> then fsync them all just before writing the checkpoint.

Instead of fsync you might want to use the async Linux-specific sync
options.

--
Olaf

Archive: http://lists.debian.org/AANLkTinW=kxn2trssg+b327prv5m66-zpdxc0s10l...@mail.gmail.com
Re: Speeding up dpkg, a proposal
ext Raphael Hertzog writes:
> On Wed, 02 Mar 2011, Marius Vollmer wrote:
>> - Instead, we move all packages that are to be unpacked into
>> half-installed / reinstreq before touching the first one, and put a
>> big sync() right before carefully writing /var/lib/dpkg/status.
>
> The big sync() doesn't work. It means dpkg never finishes its work on
> systems with lots of unrelated I/O.

Ok, understood. It's now clear to me that the big sync should be
replaced with deferred fsyncs. (I would defer the fsync of the content
of all packages until modstatdb_checkpoint, not just until
tar_deferred_extract.)

With that change, do you think the approach is sound?

> We've seen reports of poor performance with btrfs (and that's what you
> use for Meego IIRC) so you might want to investigate why btrfs is coping
> so badly with a few fsync() just on the status files.

This is about Harmattan, which uses ext4.

To understand our troubles, you need to know that we have around 2500
packages with just a single file in them. For those packages, dpkg
spends the largest part of its time in writing the nine journal entries
to /var/lib/dpkg/updates.

We will reduce the number of our packages, so this issue might solve
itself that way, but I had good success in reducing the per-package
overhead of dpkg, and if it is correct and works for us, why not use the
'reckless' option as well?

Archive: http://lists.debian.org/87mxlc8vem@big.research.nokia.com
Re: Speeding up dpkg, a proposal
Hi Marius,

no need to CC Guillem privately, the dpkg maintainers are reachable at
debian-d...@lists.debian.org. :)

On Wed, 02 Mar 2011, Marius Vollmer wrote:
> - Instead, we move all packages that are to be unpacked into
> half-installed / reinstreq before touching the first one, and put a
> big sync() right before carefully writing /var/lib/dpkg/status.

The big sync() doesn't work. It means dpkg never finishes its work on
systems with lots of unrelated I/O. We have used that in the past and we
had to revert, see commit 5ee4e4e0458088cde1625ddb5a3d736f31a335d3 and
the associated bug reports:

http://bugs.debian.org/588339
http://bugs.debian.org/595927
http://bugs.debian.org/600075

Later we completely removed the USE_SYNC_SYNC codepath due to this. Even
if you're not using the sync() in the same way that we did, its mere
usage makes the whole solution a no-go.

Other people have already mentioned --force-unsafe-io as a way to
recover some of the performance lost. It does still sync the updates to
the internal database (but not the files installed by the packages).

We've seen reports of poor performance with btrfs (and that's what you
use for Meego IIRC) so you might want to investigate why btrfs is coping
so badly with a few fsync() just on the status files.

That's my 2 cents on this discussion.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer

Archive: http://lists.debian.org/20110303080011.gg6...@rivendell.home.ouaza.com
Re: Speeding up dpkg, a proposal
ext Chow Loong Jin writes:
> Could we somehow avoid using sync()? sync() syncs all mounted
> filesystems, which isn't exactly very friendly when you have a few
> slow-syncing filesystems like btrfs (or even NFS) mounted.

Hmm, right. We could keep a list of all files that need fsyncing, and
then fsync them all just before writing the checkpoint.

Half of that is already done (for the content of the packages); we would
need to add it for the files in /var/lib/dpkg/, or we could just fsync
the whole directory.

But then again, I would argue that the sync() is actually necessary
always, for correct semantics: you also want to sync everything that the
postinst script has done before recording that a package is fully
installed.

Archive: http://lists.debian.org/87y64w8zy1@big.research.nokia.com
Re: Speeding up dpkg, a proposal
ext Chow Loong Jin writes:
> I remember seeing there being some list of files to be fsynced in one of
> the older dpkgs. It's probably that which led to the ext4 slowdown [...]

Hmm, performance is the ultimate reason for doing all this, but right
now I am mostly interested in whether my changes are correct. I know
that they improve performance, but I am not totally convinced that they
are actually correct in the way that they change the status of packages,
etc.

I am only proposing to add this as an option to dpkg, not to make it the
default. We might enable it in Harmattan, if I have the balls and it
does in fact speed things up enough, but nothing of that is certain
right now. We might get the improvement we need just from reducing our
number of packages to something reasonable.

>> But then again, I would argue that the sync() is actually necessary
>> always, for correct semantics: You also want to sync everything that the
>> postinst script has done before recording that a package is fully
>> installed.
>
> Yes, you're right. I completely forgot about that. I don't think most
> postinst scripts sync when done. I suppose the best that can be done is
> to batch the stuff as well as possible to reduce the number of sync()s
> needed.

On the other hand, it _is_ the job of the maintainer scripts to sync if
that is necessary for correctness, and maybe we don't want to take that
responsibility away from them.

And in the big picture, all we need is some guarantee that renames are
committed in order, and after the content of the file that is being
renamed. I have the impression that all reasonable filesystems give that
guarantee now, no?

Archive: http://lists.debian.org/87r5ao8xqi@big.research.nokia.com
Re: Speeding up dpkg, a proposal
On Thursday 03,March,2011 02:45 PM, Marius Vollmer wrote:
> ext Chow Loong Jin writes:
>
>> Could we somehow avoid using sync()? sync() syncs all mounted
>> filesystems, which isn't exactly very friendly when you have a few
>> slow-syncing filesystems like btrfs (or even NFS) mounted.
>
> Hmm, right. We could keep a list of all files that need fsyncing, and
> then fsync them all just before writing the checkpoint.

I remember seeing there being some list of files to be fsynced in one of
the older dpkgs. It's probably that which led to the ext4 slowdown --
you'd get the same effect of one sync() per file on systems with an ext4
root. If you had a process doing heavy I/O in the background, each of
those fsync()s would take a considerable amount of time.

> Half of that is already done (for the content of the packages), we would
> need to add it for the files in /var/lib/dpkg/, or we could just fsync
> the whole directory.
>
> But then again, I would argue that the sync() is actually necessary
> always, for correct semantics: You also want to sync everything that the
> postinst script has done before recording that a package is fully
> installed.

Yes, you're right. I completely forgot about that. I don't think most
postinst scripts sync when done. I suppose the best that can be done is
to batch the stuff as well as possible to reduce the number of sync()s
needed.

--
Kind regards,
Loong Jin
Re: Speeding up dpkg, a proposal
Yodel again!

On Wednesday 02 March 2011 17.02:11 Marius Vollmer wrote:
> It shows a speed up between factor six and two in our environment (ext4
> on a slowish flash drive). I am not sure whether messing with the
> fundamentals of dpkg is worth a factor of two in performance

To not be all negative: read the recent discussion about fsync() and
other stuff in dpkg (I'm not sure where the discussion happened exactly;
it was about dpkg becoming extremely slow in some use cases on modern
filesystems like btrfs, and was a short time before the release). Since
then, there is an option for dpkg:

    unsafe-io: Do not perform safe I/O operations when unpacking.
    Currently this implies not performing file system syncs before
    file renames, which is known to cause substantial performance
    degradation on some file systems, unfortunately the ones that
    require the safe I/O in the first place due to their unreliable
    behaviour, causing zero-length files on abrupt system crashes.

    Note: For ext4, the main offender, consider using instead the
    mount option nodelalloc, which will fix both the performance
    degradation and the data safety issues, the latter by making
    the file system not produce zero-length files on abrupt system
    crashes with any software not doing syncs before atomic renames.

    Warning: Using this option might improve performance at the
    cost of losing data, use with care.

So you should compare against dpkg with unsafe-io.

Very slightly pre-dating this: I often (on btrfs, and when I'm inside
development chroots that don't matter much) end up wrapping
aptitude/apt-get/dpkg with eatmydata and get the same benefits. But the
tool really deserves its name: a hard reboot directly after a package
installation will leave a mess behind... (I could so far avoid this; I
recently had a zero-length initrd after a kernel upgrade that *might*
have been related. OTOH I did a clean shutdown there, so it shouldn't
have happened...)

Yet another point, for the future: I *think* that btrfs is building an
interface so applications can directly access btrfs transactions. That
would allow doing package upgrades in a btrfs transaction, and since
(again, I *think* so, I'm not sure) one can even be ended without
forcing an immediate sync (the result is more like a barrier than a
sync), this would be a fine way to deal with the situation. At the cost
that it's btrfs specific.

cheers
-- vbi

--
BOFH excuse #233: TCP/IP UDP alarm threshold is set too low.
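[The unsafe-io behaviour quoted above can be selected per invocation or made persistent; a few illustrative commands (the package name/version and the config file name are examples, not from the thread):]

```shell
# One-off: trade crash safety for speed on a single install
dpkg --force-unsafe-io -i tcpdump_4.1.1-1_amd64.deb

# Persistent, e.g. inside a throw-away chroot
echo force-unsafe-io > /etc/dpkg/dpkg.cfg.d/unsafe-io

# Broader hammer: eatmydata turns fsync()/sync() into no-ops
# for the wrapped command and everything it spawns
eatmydata apt-get install tcpdump
```

The eatmydata route is strictly less safe than --force-unsafe-io: the latter still syncs dpkg's own database updates, the former disables syncs entirely.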
Re: Speeding up dpkg, a proposal
On Wednesday 02 March 2011 17.02:11 Marius Vollmer wrote:
> - Instead, we move all packages that are to be unpacked into
> half-installed / reinstreq before touching the first one, and put a
> big sync() right before carefully writing /var/lib/dpkg/status.

You don't want to do this. While production systems usually are upgraded
in downtime windows (with less load), it is sometimes necessary to
install some package (tcpdump or whatever to diagnose problems...) while
the system is under high load. Especially when you're trying to find out
why the machine has a load of 20 and you can't afford to kill it...

On a machine with lots of RAM (== disk cache...) and high I/O load, you
don't want to do a (global!) sync(). This can totally kill the machine
for 20min or more and is a big no-go.

-- vbi

--
featured link: http://www.pool.ntp.org
Re: Speeding up dpkg, a proposal
On Wed, Mar 02, 2011 at 08:13:06PM +, Roger Leigh wrote:
> Btrfs is quite simply awful in chroots at present, and it seems
> --force-unsafe-io doesn't really seem to help massively either.
> It's dog slow -- it's quicker to untar a chroot onto ext3 than to
> bother with Btrfs.

Because unsafe-io does not apply to the handling of the info directory.

Bastian

--
Those who hate and fight must stop themselves -- otherwise it is not stopped.
                -- Spock, "Day of the Dove", stardate unknown

Archive: http://lists.debian.org/20110302231220.ga31...@wavehammer.waldi.eu.org
Re: Speeding up dpkg, a proposal
On Wed, Mar 02, 2011 at 08:13:06PM +, Roger Leigh wrote:
> On Thu, Mar 03, 2011 at 01:51:35AM +0800, Chow Loong Jin wrote:
> > Hi,
> >
> > On Thursday 03,March,2011 12:02 AM, Marius Vollmer wrote:
> > > [...]
> > > - Instead, we move all packages that are to be unpacked into
> > > half-installed / reinstreq before touching the first one, and put a
> > > big sync() right before carefully writing /var/lib/dpkg/status.
> >
> > Could we somehow avoid using sync()? sync() syncs all mounted
> > filesystems, which isn't exactly very friendly when you have a few
> > slow-syncing filesystems like btrfs (or even NFS) mounted. I recall my
> > schroots that ran on tmpfs unpacking exceptionally slowly due to this
> > issue until I stuck libeatmydata (or a variant of it) onto the
> > schroots' dpkgs.
>
> Btrfs is quite simply awful in chroots at present, and it seems
> --force-unsafe-io doesn't really seem to help massively either.
> It's dog slow -- it's quicker to untar a chroot onto ext3 than to
> bother with Btrfs.

eatmydata works great for my chroots on btrfs.

Mike

Archive: http://lists.debian.org/20110302202917.ga21...@glandium.org
Re: Speeding up dpkg, a proposal
On Thu, Mar 03, 2011 at 01:51:35AM +0800, Chow Loong Jin wrote:
> Hi,
>
> On Thursday 03,March,2011 12:02 AM, Marius Vollmer wrote:
> > [...]
> > - Instead, we move all packages that are to be unpacked into
> >   half-installed / reinstreq before touching the first one, and put a
> >   big sync() right before carefully writing /var/lib/dpkg/status.
>
> Could we somehow avoid using sync()? sync() syncs all mounted
> filesystems, which isn't exactly very friendly when you have a few
> slow-syncing filesystems like btrfs (or even NFS) mounted. I recall my
> schroots that ran on tmpfs unpacking exceptionally slowly due to this
> issue until I stuck libeatmydata (or a variant of it) onto the
> schroots' dpkgs.

Btrfs is quite simply awful in chroots at present, and --force-unsafe-io
doesn't really seem to help massively either. It's dog slow--it's
quicker to untar a chroot onto ext3 than to bother with Btrfs.

This is a shame, because Btrfs snapshots are the fastest and most
reliable out there at the moment (LVM snapshots are fast, but not /that/
fast, and LVM appears to have locking issues which need addressing to
make it robust enough to handle simultaneous creation and removal of
many snapshots).

It would be great if there was a solution to this problem; is anyone
running Btrfs as a root filesystem who has any suggestions?

Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux            http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?      http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848  Please GPG sign your mail.
Re: Speeding up dpkg, a proposal
Chow Loong Jin wrote the following on 02.03.2011 18:51:
> Hi,
>
> On Thursday 03,March,2011 12:02 AM, Marius Vollmer wrote:
>> [...]
>> - Instead, we move all packages that are to be unpacked into
>>   half-installed / reinstreq before touching the first one, and put a
>>   big sync() right before carefully writing /var/lib/dpkg/status.
>
> Could we somehow avoid using sync()? sync() syncs all mounted
> filesystems, which isn't exactly very friendly when you have a few
> slow-syncing filesystems like btrfs (or even NFS) mounted. I recall my
> schroots that ran on tmpfs unpacking exceptionally slowly due to this
> issue until I stuck libeatmydata (or a variant of it) onto the
> schroots' dpkgs.
>
> I actually recall there being some things mentioned about FS_IOC_SYNCFS
> (lkml thread[1], dpkg-devel thread[2]) for faster per-filesystem
> syncing, but that seems to have died of natural causes last year.
>
>> [...]
>
> [1] http://thread.gmane.org/gmane.linux.file-systems/44628
> [2] http://lists.debian.org/debian-dpkg/2010/11/msg00069.html

Just for reference: AFAIK there was a discussion about this topic last
year. If you are interested in why dpkg does all those syncs, read:

http://thread.gmane.org/gmane.linux.debian.devel.bugs.general/770841

-- 
bye Thilo

4096R/0xC70B1A8F
721B 1BA0 095C 1ABA 3FC6 7C18 89A4 A2A0 C70B 1A8F

Archive: http://lists.debian.org/ikm75b$s3t$1...@dough.gmane.org
Re: Speeding up dpkg, a proposal
Hi,

On Thursday 03,March,2011 12:02 AM, Marius Vollmer wrote:
> [...]
> - Instead, we move all packages that are to be unpacked into
>   half-installed / reinstreq before touching the first one, and put a
>   big sync() right before carefully writing /var/lib/dpkg/status.

Could we somehow avoid using sync()? sync() syncs all mounted
filesystems, which isn't exactly very friendly when you have a few
slow-syncing filesystems like btrfs (or even NFS) mounted. I recall my
schroots that ran on tmpfs unpacking exceptionally slowly due to this
issue until I stuck libeatmydata (or a variant of it) onto the schroots'
dpkgs.

I actually recall there being some things mentioned about FS_IOC_SYNCFS
(lkml thread[1], dpkg-devel thread[2]) for faster per-filesystem
syncing, but that seems to have died of natural causes last year.

> [...]

[1] http://thread.gmane.org/gmane.linux.file-systems/44628
[2] http://lists.debian.org/debian-dpkg/2010/11/msg00069.html

-- 
Kind regards,
Loong Jin