Re: [gentoo-user] Speed up `du'
On 25 May 2008, at 03:56, Hemmann, Volker Armin wrote:
> ... reiserfs and xfs your barriers by default.

This sentence no parse.

Stroller.
-- gentoo-user@lists.gentoo.org mailing list
[gentoo-user] Speed up `du'
Is there any way to speed up the du command? I mean short of having cron run it on target directories and store the results (not really speeding it up, but at least not having to wait for a result).

I've seen various mentions of du being slow but don't recall any mention of how to speed it up.

I use Reiserfs with default sizes. In some situations, like a large cache of nntp messages of several GB, I might wait 5-10 minutes or more for du to get the size of the directory.

Are there other file systems that can return a result of `du' faster?

I'm curious how `df' computes sizes so much quicker. Even after rm'ing a large amount of data... `df' sees the difference right away.

Or maybe there is some other tool or technique that can quickly tell me the size of a directory or set of directories.
Re: [gentoo-user] Speed up `du'
At Sat, 24 May 2008 16:49:09 -0500 [EMAIL PROTECTED] wrote:
> Is there any way to speed up the du command? I mean short of having cron run it on target directories and store the results (not really speeding it up, but at least not having to wait for a result).
>
> I've seen various mentions of du being slow but don't recall any mention of how to speed it up.
>
> I use Reiserfs with default sizes. In some situations, like a large cache of nntp messages of several GB, I might wait 5-10 minutes or more for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker. Even after rm'ing a large amount of data... `df' sees the difference right away.

I can't help with speeding up du, but I can explain df's speed. The totals df reports are kept in the superblock. Each operation that changes the space in use updates the superblock, and df just reads the result. (In a sense it is like your cron solution above for du :-) .)

> Or maybe there is some other tool or technique that can quickly tell me the size of a directory or set of directories.

allan
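To make the df point concrete, here is a minimal Python sketch, assuming a POSIX system where `os.statvfs` is available. It asks the kernel for the filesystem-wide block counters in a single call, which is why the answer comes back immediately however many files exist. The function name `filesystem_usage` is made up for this example and is not part of any tool mentioned in the thread.

```python
import os


def filesystem_usage(path="/"):
    """Return (total, used, available) in bytes for the filesystem holding path.

    One statvfs call reads counters the filesystem already maintains;
    no files or directories are visited.
    """
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize      # size of the whole filesystem
    free = st.f_bfree * st.f_frsize        # free blocks, including root-reserved ones
    available = st.f_bavail * st.f_frsize  # free blocks an unprivileged user may use
    used = total - free
    return total, used, available


if __name__ == "__main__":
    total, used, available = filesystem_usage("/")
    print(f"total={total} used={used} available={available}")
```

The cost of the call does not depend on how much data the filesystem holds, so removing a large tree shows up in the counters as soon as the blocks are freed.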
Re: [gentoo-user] Speed up `du'
On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover [EMAIL PROTECTED] squawked:
> Is there any way to speed up the du command? I mean short of having cron run it on target directories and store the results (not really speeding it up, but at least not having to wait for a result).
>
> I've seen various mentions of du being slow but don't recall any mention of how to speed it up.
>
> I use Reiserfs with default sizes. In some situations, like a large cache of nntp messages of several GB, I might wait 5-10 minutes or more for du to get the size of the directory.
>
> Are there other file systems that can return a result of `du' faster?
>
> I'm curious how `df' computes sizes so much quicker. Even after rm'ing a large amount of data... `df' sees the difference right away.
>
> Or maybe there is some other tool or technique that can quickly tell me the size of a directory or set of directories.

I am pretty sure the problem with du is that it actually looks, recursively, at every single file and computes the size that way. So the time you have to wait is mostly due to disk IO (and caching would also explain why, if you run du twice in a row, the second answer returns much more quickly).

So, if you know what the bottleneck directory is (for example, the directory of nntp messages), the tricks in http://gentoo-wiki.com/TIP_Speeding_up_portage should probably work just as well.

HTH,

W
--
You're very sure of your facts, he said at last, I couldn't trust the thinking of a man who takes the Universe - if there is one - for granted.
Sortir en Pantoufles: up 533 days, 21:55
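As a rough illustration of the recursive walk described above, here is a Python sketch of a du-like total. It is not du's actual implementation, only the same idea under stated assumptions: `st_blocks` is taken to be in 512-byte units (as on Linux), and the function name `disk_usage` is hypothetical. The second return value counts the lstat calls made, which is the work that grows with the number of files.

```python
import os


def disk_usage(root):
    """Return (allocated_bytes, entries_examined) for the tree under root.

    Every file and directory costs one lstat call, so the run time
    scales with the number of entries rather than with their size.
    """
    seen = set()  # (device, inode) pairs, so hard links are counted once
    st = os.lstat(root)
    seen.add((st.st_dev, st.st_ino))
    total = st.st_blocks * 512  # assumption: st_blocks uses 512-byte units
    entries = 1
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            entries += 1
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512
    return total, entries
```

Calling `disk_usage` on a news spool with hundreds of thousands of small messages means hundreds of thousands of inode lookups, most of which hit the disk on a cold cache. A second run is faster only because the kernel has cached those inodes.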
Re: [gentoo-user] Speed up `du'
On 25 May 2008, at 00:24, Willie Wong wrote:
> On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover [EMAIL PROTECTED] squawked:
> > ... I use Reiserfs with default sizes. In some situations, like a large cache of nntp messages of several GB, I might wait 5-10 minutes or more for du to get the size of the directory.
>
> I am pretty sure the problem with du is that it actually looks, recursively, at every single file and computes the size that way.

What he said.

> > Or maybe there is some other tool or technique that can quickly tell me the size of a directory or set of directories.

Keep all the files in a honkin' big tarball. :P

If you need to read these files on the fly then I'm afraid you'll have to write a kernel filesystem extension (or find one?) that will read them out of the tar file, slowing all read/write actions down. But, hey, `du` on the tarball will complete in no time at all!! ;)

In seriousness, another thing to do is keep these files on a separate partition, if you can. Basically a user's ~ which includes both .maildir and "My HiDef Videos" is non-optimal.

> > Are there other file systems that can return a result of `du' faster?

All filesystems have their advantages and disadvantages.
http://www.debian-administration.org/articles/388

Reading the above I _think_ the test most similar in function to running `du` on many small files is the "Directory listing and file search into the previous file tree" test, at which ReiserFS is fastest.

I need to look into this myself soon, to try to get the best speed on a 3gig corpus of email. I was expecting EXT3 to be best - when you create the filesystem you can specify the blocksize. It's possible that the author of the filesystems comparison chose options when formatting his EXT3 disk that affected the speed of the results - a journal would make writes slower, for instance (not sure about reads).

Stroller.
Re: [gentoo-user] Speed up `du'
On Sunday, 25 May 2008, Stroller wrote:
> On 25 May 2008, at 00:24, Willie Wong wrote:
> > On Sat, May 24, 2008 at 04:49:09PM -0500, Penguin Lover [EMAIL PROTECTED] squawked:
> > > ... I use Reiserfs with default sizes. In some situations, like a large cache of nntp messages of several GB, I might wait 5-10 minutes or more for du to get the size of the directory.
> >
> > I am pretty sure the problem with du is that it actually looks, recursively, at every single file and computes the size that way.
>
> What he said.
>
> > > Or maybe there is some other tool or technique that can quickly tell me the size of a directory or set of directories.
>
> Keep all the files in a honkin' big tarball. :P
>
> If you need to read these files on the fly then I'm afraid you'll have to write a kernel filesystem extension (or find one?) that will read them out of the tar file, slowing all read/write actions down. But, hey, `du` on the tarball will complete in no time at all!! ;)
>
> In seriousness, another thing to do is keep these files on a separate partition, if you can. Basically a user's ~ which includes both .maildir and "My HiDef Videos" is non-optimal.
>
> > > Are there other file systems that can return a result of `du' faster?
>
> All filesystems have their advantages and disadvantages.
> http://www.debian-administration.org/articles/388

One thing the article does not mention: reiserfs and xfs your barriers by default. ext3 does not. And if you turn on barriers (as a mount option) you lose 30% of its speed. Of course, if you care about data integrity, LVM is ruled out too - for the same reason.

So if you care about data integrity and speed at the same time, ext3 is ruled out. XFS is broken on a monthly basis (just search the lkml archives for xfs. It is sickening). That leaves reiserfs as the only sane choice.