Re: Work (really slow directory access on ext4)
On Aug 06 2014, Theodore Ts'o wrote:
>
> I don't subscribe to kernelnewbies, but I came across this thread in
> the mail archive while researching an unrelated issue.
>
> Valdis' observations are on the mark here. It's almost certain that
> you are getting overwhelmed with other disk traffic, because your
> directory isn't *that* big.

Thank you very much. As the user in question, I'm afraid this one
turns out to be a clear case of "user is an idiot."

I made a dumb mistake in the way I was measuring things. The situation
on this server is not as bad as it looked.

> That being said, there are certainly issues with really really big
> directories, and solving this is certainly not going to be a newbie
> project (if it was easy to solve, it would have been addressed a long
> time ago). See:
>
> http://en.it-usenet.org/thread/11916/10367/

However, this response is precious. Suddenly a whole bunch of things
make sense from that posting alone. Last time I looked seriously at
file system code, it was the Berkeley Fast File System, also known as
UFS. I've never had time and inclination to look at a modern file
system. That article managed to straighten out multiple misconceptions
for me, and point me in good directions.

> for the background. It's a little bit dated, in that we do use a
> 64-bit hash on 64-bit systems, but the fundamental issues are still
> there.

And that's in addition to what you covered here - which includes what
might be a useful workaround for the application which may or may not
be hitting a problem that the ls test was intended to simplify. I'm
passing that on to the app. developer.

Many, many thanks.

> If you sort the readdir files by inode order, this can help
> significantly. Some userspace programs, such as mutt, do this.
> Unfortunately "ls" does not. (That might be a good newbie project,
> since it's a userspace-only project. However, I'm pretty sure the
> coreutils maintainers will also react negatively if they are sent
> patches which don't compile. :-)
>
> A proof of concept of how this can be a win can be found here:
>
> http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/contrib/spd_readdir.c
>
> LD_PRELOAD hacks aren't guaranteed to work on all programs, so this is
> much more of a hack than something I'd recommend for extended
> production use. But it shows that if you have a readdir+stat workload,
> sorting by inode makes a huge difference.
>
> As far as getting traces to better understand problems, I strongly
> suggest that you try things like vmstat, iostat, and blktrace; system
> call traces like strace aren't going to get you very far. (See
> http://brooker.co.za/blog/2013/07/14/io-performance.html for a nice
> introduction to blktrace). Use the scientific method; collect
> baseline statistics using vmstat, iostat, and sar before you run your
> test workload, so you know how much I/O is going on before you start
> your test. If you can run your test on a quiescent system, that's a
> really good idea. Then collect statistics as you run your workload,
> and then only tweak one variable at a time, and record everything in a
> systematic way.

Another tool I didn't know about. Thank you very much.

>
> Finally, if you have more problems of a technical nature with respect
> to ext4, there is the ext3-us...@redhat.com list, or the
> developer's list at linux-e...@vger.kernel.org. It would be nice if
> you tried the ext3-users or the kernel-newbies or tried googling to
> see if anyone else has come across the problem and figured out the
> solution already, but if you can't figure things out any other way, do
> feel free to ask the linux-ext4 list. We won't bite. :-)

Thank you. I'll make sure to do my homework properly in future - and
never never believe things senior members of my team tell me without
verifying them first, at least not if I'm going to post about them :-(

>
> Cheers,
>
> - Ted
>
> P.S. If you have a large number of directories which are much larger
> than you expect, and you don't want to do the
> "mkdir foo.new; mv foo/* foo.new ; rmdir foo; mv foo.new foo"
> trick on a large number of directories, you can also schedule
> downtime and, while the file system is unmounted, use "e2fsck -fD".
> See the man page for more details. It won't solve all of your
> problems, and it might not solve any of your problems, but it will
> probably make the performance of large directories somewhat better.

Another hint of substantially more value than everything I posted
about this topic.

Thank you again.

--
Arlie

(Arlie Stephens ar...@worldash.org)

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
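The baseline-then-measure advice quoted above can be sketched in a few lines. This is only an illustration, not something from the thread: it assumes the standard Linux /proc/diskstats layout (device name in the third column, then reads completed, reads merged, sectors read, and so on), and the helper names `parse_diskstats`, `diskstats_delta` and `read_snapshot` as well as the sample counter values are made up for the example. Tools such as iostat and sar report the same counters with far more polish.

```python
def parse_diskstats(text):
    """Map device name -> selected counters from /proc/diskstats-style text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        stats[fields[2]] = {
            "reads": int(fields[3]),            # reads completed
            "sectors_read": int(fields[5]),
            "writes": int(fields[7]),           # writes completed
            "sectors_written": int(fields[9]),
        }
    return stats


def diskstats_delta(before, after):
    """Per-device counter differences between two snapshots."""
    return {
        dev: {key: after[dev][key] - before[dev][key] for key in after[dev]}
        for dev in after
        if dev in before
    }


def read_snapshot(path="/proc/diskstats"):
    """Take a live snapshot (Linux only); not used in the demo below."""
    with open(path) as f:
        return parse_diskstats(f.read())


if __name__ == "__main__":
    # Hypothetical counter values, standing in for two live snapshots
    # taken before and after the test workload.
    before = parse_diskstats("8 0 sda 100 0 800 50 200 0 1600 80 0 100 130\n")
    after = parse_diskstats("8 0 sda 130 0 1040 70 260 0 2080 95 0 140 165\n")
    for dev, delta in diskstats_delta(before, after).items():
        print(dev, delta)
```

Taking one snapshot pair while the system is otherwise idle and another around the test workload gives a rough idea of how much of the I/O belongs to the test and how much was already there.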
Re: Work (really slow directory access on ext4)
I don't subscribe to kernelnewbies, but I came across this thread in
the mail archive while researching an unrelated issue.

Valdis' observations are on the mark here. It's almost certain that
you are getting overwhelmed with other disk traffic, because your
directory isn't *that* big.

That being said, there are certainly issues with really really big
directories, and solving this is certainly not going to be a newbie
project (if it was easy to solve, it would have been addressed a long
time ago). See:

http://en.it-usenet.org/thread/11916/10367/

for the background. It's a little bit dated, in that we do use a
64-bit hash on 64-bit systems, but the fundamental issues are still
there.

If you sort the readdir files by inode order, this can help
significantly. Some userspace programs, such as mutt, do this.
Unfortunately "ls" does not. (That might be a good newbie project,
since it's a userspace-only project. However, I'm pretty sure the
coreutils maintainers will also react negatively if they are sent
patches which don't compile. :-)

A proof of concept of how this can be a win can be found here:

http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/contrib/spd_readdir.c

LD_PRELOAD hacks aren't guaranteed to work on all programs, so this is
much more of a hack than something I'd recommend for extended
production use. But it shows that if you have a readdir+stat workload,
sorting by inode makes a huge difference.

As far as getting traces to better understand problems, I strongly
suggest that you try things like vmstat, iostat, and blktrace; system
call traces like strace aren't going to get you very far. (See
http://brooker.co.za/blog/2013/07/14/io-performance.html for a nice
introduction to blktrace). Use the scientific method; collect
baseline statistics using vmstat, iostat, and sar before you run your
test workload, so you know how much I/O is going on before you start
your test. If you can run your test on a quiescent system, that's a
really good idea. Then collect statistics as you run your workload,
and then only tweak one variable at a time, and record everything in a
systematic way.

Finally, if you have more problems of a technical nature with respect
to ext4, there is the ext3-us...@redhat.com list, or the
developer's list at linux-e...@vger.kernel.org. It would be nice if
you tried the ext3-users or the kernel-newbies or tried googling to
see if anyone else has come across the problem and figured out the
solution already, but if you can't figure things out any other way, do
feel free to ask the linux-ext4 list. We won't bite. :-)

Cheers,

- Ted

P.S. If you have a large number of directories which are much larger
than you expect, and you don't want to do the
"mkdir foo.new; mv foo/* foo.new ; rmdir foo; mv foo.new foo"
trick on a large number of directories, you can also schedule
downtime and, while the file system is unmounted, use "e2fsck -fD".
See the man page for more details. It won't solve all of your
problems, and it might not solve any of your problems, but it will
probably make the performance of large directories somewhat better.
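The inode-ordering idea above can be illustrated in userspace without any LD_PRELOAD tricks. The sketch below is not taken from spd_readdir.c; it is a minimal Python illustration, and the function name `stat_in_inode_order` is made up for the example. It relies on os.scandir() exposing the inode number from the directory entry, so on Linux the names can be sorted before any inode is read.

```python
import os
import tempfile


def stat_in_inode_order(path):
    """Return (name, stat_result) pairs for `path`, stat()ing in inode order."""
    # Collect (inode, name) pairs from the directory entries themselves;
    # on Linux this needs no per-file stat() call.
    with os.scandir(path) as it:
        entries = [(entry.inode(), entry.name) for entry in it]

    # Sort by inode number so the stat() calls below visit the inode
    # table in ascending order instead of hash (readdir) order.
    entries.sort()

    results = []
    for _inode, name in entries:
        results.append((name, os.lstat(os.path.join(path, name))))
    return results


if __name__ == "__main__":
    # Demo on a throwaway directory; a real test needs a large, cache-cold one.
    with tempfile.TemporaryDirectory() as demo_dir:
        for i in range(5):
            open(os.path.join(demo_dir, f"file{i}"), "w").close()
        for name, st in stat_in_inode_order(demo_dir):
            print(st.st_ino, name)
```

Any benefit should only be visible on a large directory whose inodes are not already cached; on a small or cache-hot directory the ordering makes no measurable difference.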
Re: Work (really slow directory access on ext4)
On Thu, Jul 31, 2014 at 7:41 PM, Henry Hallam wrote:
> Try redirecting the ls output to /dev/null or a file, thus disabling
> its color highlighting and thus removing a bunch of syscalls. See if
> it's now the same no matter what choice of 'time'.
>
> On Thu, Jul 31, 2014 at 4:36 PM, Arlie Stephens wrote:
>> Hi Nick,
>>
>> [Context - directory ls taking 4-15 seconds; directory large, with
>> long filenames, but nowhere near as huge as Valdis' mail directory.]
>>
>> I've now discovered a really bizarre pattern, and I'm inclined to stop
>> blaming the file system until some clarity develops. If I ever get it
>> to the point where I can produce a high quality bug report - with or
>> without patch - I will do so - but what I have now is anything but
>> clear and high quality.
>>
>> On Jul 30 2014, Nick Krause wrote:
>>> On Wed, Jul 30, 2014 at 3:48 PM, wrote:
>>> > On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:
>>> >
>>> >> On the good side, Valdis' observations of his mail directory have been
>>> >> a great help.
>>> >
>>> > And remember, that's on a single laptop-class hard drive, no fancy raid or
>>> > anything. (Though it *is* a hybrid, with 32G of flash cache on the front
>>> > end).
>>> >
>>> > You throw some *real* hardware at it, it of course would go even faster.
>>>
>>> Just send me the logs and anything else you think may help me.
>>> Please also cc the ext4 mailing list, as this will let the other
>>> ext4 developers and maintainers know about your problem.
>>> Cheers Nick
>>
>> I'm now in a state of complete bafflement.
>>
>> It turns out we have a whole collection of misbehaving directories,
>> making this testable without waiting for caches to clear.
>>
>> I have a couple of straces of fast ls runs, and a function ftrace that
>> captured about half of a 7 second ls. (The latter is huge, and
>> probably not suitable for posting.)
>>
>> I also have a really bizarre observation, the kind that makes you
>> wonder whether you are actually dreaming. It appears that the
>> misbehaviour is strongly influenced by the choice of "time" function.
>> The problem only occurs when using the shell built-in. /usr/bin/time
>> always produces a fast response.
>>
>> Stranger still - flat out impossible, I'd have said before seeing it -
>> a "fast" ls, run with /usr/bin/time can be followed *immediately*
>> by a slow "ls", run with bash's time. It's as if the first one doesn't
>> warm the cache, which is completely absurd - except I've been able to
>> make this happen 5 times in a row, first with strace and then
>> without.
>>
>> # with /usr/bin/time the ls is fast
>> $ time -p ls bad_dir
>> ...
>> real 0.21
>> user 0.00
>> sys 0.00
>>
>>
>> # with the builtin time, right *after* the strace run, the time can be
>> # horrible.
>> $ time -p ls bad_dir
>> ...
>> real 5.60
>> user 0.00
>> sys 0.17
>>
>> # run it again, and the directory is in cache as expected.
>> $ time -p ls bad_dir
>> ...
>> real 0.11
>> user 0.00
>> sys 0.02
>>
>>
>> This is not an artefact of one or other time reporting incorrectly -
>> I'm noticing a long pause before output occurs, but only on the middle
>> test of the three.
>>
>> I can't imagine any sane way for this to be happening, short of
>> coincidence or user error - and I've now seen this sequence 5 times in
>> a row, on 5 different directories created and populated by the same
>> app. (Three times with strace, twice without.)
>>
>>
>> --
>> Arlie
>>
>> (Arlie Stephens ar...@worldash.org)

I agree with Henry; it seems right to send me the output in a file to
read, to see if this actually is a bug with ext4.

Regards
Nick
Re: Work (really slow directory access on ext4)
Try redirecting the ls output to /dev/null or a file, thus disabling
its color highlighting and thus removing a bunch of syscalls. See if
it's now the same no matter what choice of 'time'.

On Thu, Jul 31, 2014 at 4:36 PM, Arlie Stephens wrote:
> Hi Nick,
>
> [Context - directory ls taking 4-15 seconds; directory large, with
> long filenames, but nowhere near as huge as Valdis' mail directory.]
>
> I've now discovered a really bizarre pattern, and I'm inclined to stop
> blaming the file system until some clarity develops. If I ever get it
> to the point where I can produce a high quality bug report - with or
> without patch - I will do so - but what I have now is anything but
> clear and high quality.
>
> On Jul 30 2014, Nick Krause wrote:
>> On Wed, Jul 30, 2014 at 3:48 PM, wrote:
>> > On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:
>> >
>> >> On the good side, Valdis' observations of his mail directory have been
>> >> a great help.
>> >
>> > And remember, that's on a single laptop-class hard drive, no fancy raid or
>> > anything. (Though it *is* a hybrid, with 32G of flash cache on the front
>> > end).
>> >
>> > You throw some *real* hardware at it, it of course would go even faster.
>>
>> Just send me the logs and anything else you think may help me.
>> Please also cc the ext4 mailing list, as this will let the other
>> ext4 developers and maintainers know about your problem.
>> Cheers Nick
>
> I'm now in a state of complete bafflement.
>
> It turns out we have a whole collection of misbehaving directories,
> making this testable without waiting for caches to clear.
>
> I have a couple of straces of fast ls runs, and a function ftrace that
> captured about half of a 7 second ls. (The latter is huge, and
> probably not suitable for posting.)
>
> I also have a really bizarre observation, the kind that makes you
> wonder whether you are actually dreaming. It appears that the
> misbehaviour is strongly influenced by the choice of "time" function.
> The problem only occurs when using the shell built-in. /usr/bin/time
> always produces a fast response.
>
> Stranger still - flat out impossible, I'd have said before seeing it -
> a "fast" ls, run with /usr/bin/time can be followed *immediately*
> by a slow "ls", run with bash's time. It's as if the first one doesn't
> warm the cache, which is completely absurd - except I've been able to
> make this happen 5 times in a row, first with strace and then
> without.
>
> # with /usr/bin/time the ls is fast
> $ time -p ls bad_dir
> ...
> real 0.21
> user 0.00
> sys 0.00
>
>
> # with the builtin time, right *after* the strace run, the time can be
> # horrible.
> $ time -p ls bad_dir
> ...
> real 5.60
> user 0.00
> sys 0.17
>
> # run it again, and the directory is in cache as expected.
> $ time -p ls bad_dir
> ...
> real 0.11
> user 0.00
> sys 0.02
>
>
> This is not an artefact of one or other time reporting incorrectly -
> I'm noticing a long pause before output occurs, but only on the middle
> test of the three.
>
> I can't imagine any sane way for this to be happening, short of
> coincidence or user error - and I've now seen this sequence 5 times in
> a row, on 5 different directories created and populated by the same
> app. (Three times with strace, twice without.)
>
>
> --
> Arlie
>
> (Arlie Stephens ar...@worldash.org)
Re: Work (really slow directory access on ext4)
Hi Nick,

[Context - directory ls taking 4-15 seconds; directory large, with
long filenames, but nowhere near as huge as Valdis' mail directory.]

I've now discovered a really bizarre pattern, and I'm inclined to stop
blaming the file system until some clarity develops. If I ever get it
to the point where I can produce a high quality bug report - with or
without patch - I will do so - but what I have now is anything but
clear and high quality.

On Jul 30 2014, Nick Krause wrote:
> On Wed, Jul 30, 2014 at 3:48 PM, wrote:
> > On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:
> >
> >> On the good side, Valdis' observations of his mail directory have been
> >> a great help.
> >
> > And remember, that's on a single laptop-class hard drive, no fancy raid or
> > anything. (Though it *is* a hybrid, with 32G of flash cache on the front
> > end).
> >
> > You throw some *real* hardware at it, it of course would go even faster.
>
> Just send me the logs and anything else you think may help me.
> Please also cc the ext4 mailing list, as this will let the other
> ext4 developers and maintainers know about your problem.
> Cheers Nick

I'm now in a state of complete bafflement.

It turns out we have a whole collection of misbehaving directories,
making this testable without waiting for caches to clear.

I have a couple of straces of fast ls runs, and a function ftrace that
captured about half of a 7 second ls. (The latter is huge, and
probably not suitable for posting.)

I also have a really bizarre observation, the kind that makes you
wonder whether you are actually dreaming. It appears that the
misbehaviour is strongly influenced by the choice of "time" function.
The problem only occurs when using the shell built-in. /usr/bin/time
always produces a fast response.

Stranger still - flat out impossible, I'd have said before seeing it -
a "fast" ls, run with /usr/bin/time can be followed *immediately*
by a slow "ls", run with bash's time. It's as if the first one doesn't
warm the cache, which is completely absurd - except I've been able to
make this happen 5 times in a row, first with strace and then
without.

# with /usr/bin/time the ls is fast
$ time -p ls bad_dir
...
real 0.21
user 0.00
sys 0.00


# with the builtin time, right *after* the strace run, the time can be
# horrible.
$ time -p ls bad_dir
...
real 5.60
user 0.00
sys 0.17

# run it again, and the directory is in cache as expected.
$ time -p ls bad_dir
...
real 0.11
user 0.00
sys 0.02


This is not an artefact of one or other time reporting incorrectly -
I'm noticing a long pause before output occurs, but only on the middle
test of the three.

I can't imagine any sane way for this to be happening, short of
coincidence or user error - and I've now seen this sequence 5 times in
a row, on 5 different directories created and populated by the same
app. (Three times with strace, twice without.)


--
Arlie

(Arlie Stephens ar...@worldash.org)
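One way to take the choice of "time" out of the picture entirely is to time the directory read from inside a single process. The following is only a sketch under that assumption; `time_listings` is a hypothetical helper name, and it measures a bare name-only listing rather than everything ls does.

```python
import os
import tempfile
import time


def time_listings(path, runs=3):
    """Time `runs` back-to-back name-only listings of `path`.

    Returns a list of (elapsed_seconds, entry_count) tuples, one per run.
    """
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        names = os.listdir(path)  # readdir only, no stat() per entry
        elapsed = time.perf_counter() - start
        results.append((elapsed, len(names)))
    return results


if __name__ == "__main__":
    # Demo on a throwaway directory; point it at the slow directory instead.
    with tempfile.TemporaryDirectory() as demo_dir:
        for i in range(100):
            open(os.path.join(demo_dir, f"entry{i:03d}"), "w").close()
        for run, (elapsed, count) in enumerate(time_listings(demo_dir), 1):
            print(f"run {run}: {count} entries in {elapsed:.6f}s")
```

If the first run is slow and the following ones are fast, the difference is most likely cache warmth rather than anything to do with how the measurement was wrapped.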
Re: Work (really slow directory access on ext4)
On Wed, Jul 30, 2014 at 3:48 PM, wrote:
> On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:
>
>> On the good side, Valdis' observations of his mail directory have been
>> a great help.
>
> And remember, that's on a single laptop-class hard drive, no fancy raid or
> anything. (Though it *is* a hybrid, with 32G of flash cache on the front end).
>
> You throw some *real* hardware at it, it of course would go even faster.

Just send me the logs and anything else you think may help me.
Please also cc the ext4 mailing list, as this will let the other
ext4 developers and maintainers know about your problem.

Cheers
Nick
Re: Work (really slow directory access on ext4)
On Wed, 30 Jul 2014 10:38:13 -0700, Arlie Stephens said:

> On the good side, Valdis' observations of his mail directory have been
> a great help.

And remember, that's on a single laptop-class hard drive, no fancy raid or
anything. (Though it *is* a hybrid, with 32G of flash cache on the front end).

You throw some *real* hardware at it, it of course would go even faster.
Re: Work (really slow directory access on ext4)
Hi Nick,

On Jul 29 2014, Nick Krause wrote:
> >> I was doing a vanilla ls. So was the original reporter, unless he has
> >> some really strange aliases.
> >>
> >>
> >> I'm afraid I'll be rather unpopular if I drop the caches on the system
> >> in question, creating a burst of poor performance, so my best bet is
> >> probably to see what I can do with ftrace on Monday, or perhaps
> >> partway through the weekend.
> >>
> >> There is normally a fair amount of disk activity going on - much of it
> >> writes. So I can expect cached blocks to age out in a reasonable time.
> >>
> > Arlie,
> > Whenever you get around to it is fine.
> > Just send me a log.
> > Cheers Nick
>
> Arlie,
> just a friendly reminder: can you try to send me the log this week?
> Regards Nick

I was just going to post an apology for going dark on you. I made one
attempt to capture the data yesterday, and messed up - no useful data
saved. And then half the world invaded my workspace with higher
priority tasks ;-)

I'm going to make another attempt at it this morning.

On the good side, Valdis' observations of his mail directory have been
a great help. Now I know that simply being a large ext4 directory is
not the problem ;-) I.e. ext4 really isn't as brain damaged as I
feared. (We had someone here who was initially sure that was it, and
he has more experience in Linux server space than I do, so I took his
initial opinion at face value.)

More soon, I hope.

--
Arlie

(Arlie Stephens ar...@worldash.org)
Re: Work (really slow directory access on ext4)
On Fri, Jul 25, 2014 at 9:22 PM, Nick Krause wrote:
> On Fri, Jul 25, 2014 at 9:08 PM, Arlie Stephens wrote:
>> On Jul 25 2014, valdis.kletni...@vt.edu wrote:
>>> On Fri, 25 Jul 2014 15:23:42 -0700, Arlie Stephens said:
>>>
>>> > If you want an annoying problem, explain and/or fix directory
>>> > performance on ext4. I've got a server where an ls of a directory took
>>> > 5 seconds, according to "time", even though it only has 295 entries at
>>> > present.
>>>
>>> I don't suppose you could get a trace of where that ls is spending its
>>> time with the kernel's trace facilities, or even just getting a stack trace
>>> of where that ls is in the kernel?
>>
>> These are all very good questions.
>>
>> To my amazement, I found that no one had yet fixed the problem by
>> deleting and recreating the directory, and I do have sudo access.
>> This time it was only 4 seconds...
>> real 0m3.992s
>> user 0m0.005s
>> sys 0m0.052s
>>
>>> I'll go out on a limb and ask if a *second* ls of the same directory runs
>>> quickly because it's now cache-hot. If so, I'd start looking at whether
>>> there's large amounts of *other* disk activity going on, and the reads of
>>> the directory are getting hung in the I/O queue behind other disk
>>> read/writes.
>>
>> Sure enough, the cache saved me on a second read -
>> real 0m0.010s
>> user 0m0.000s
>> sys 0m0.010s
>>
>>> Also, are you doing an 'ls' (which just requires reading the name/inode#
>>> pairs), or an 'ls -l' (which in addition requires a stat() call to read in
>>> the inode itself)? That makes a lot of difference. Cache-cold on my laptop,
>>> and a *huge* Mail/linux-kernel directory (yes, it really *is* an 11M
>>> directory, it's got a half-million entries in it):
>>
>> I was doing a vanilla ls. So was the original reporter, unless he has
>> some really strange aliases.
>>
>>
>> I'm afraid I'll be rather unpopular if I drop the caches on the system
>> in question, creating a burst of poor performance, so my best bet is
>> probably to see what I can do with ftrace on Monday, or perhaps
>> partway through the weekend.
>>
>> There is normally a fair amount of disk activity going on - much of it
>> writes. So I can expect cached blocks to age out in a reasonable time.
>>
>>
>>> [~] echo 3 >| /proc/sys/vm/drop_caches
>>> [~] cd Mail
>>> [~/Mail] time ls linux-kernel/ | wc -l
>>> 478401
>>>
>>> real 0m2.387s
>>> user 0m0.500s
>>> sys 0m0.433s
>>> [~/Mail] ls -ld linux-kernel/
>>> drwxr-xr-x. 2 valdis valdis 11005952 Jul 25 19:30 linux-kernel/
>>
>> Compared to your directory, mine is microscopic
>>
>> $ ls -ld
>> drwxr-xr-x 2 yyy yyy 36864 Jul 25 12:19
>>
>>
>>> [~/Mail] time ls -l linux-kernel/ | wc -l
>>> 478402
>>>
>>> real 0m32.915s
>>> user 0m2.483s
>>> sys 0m20.787s
>>
>> --
>> Arlie
>>
>> (Arlie Stephens ar...@worldash.org)
>
>
> Arlie,
> Whenever you get around to it is fine.
> Just send me a log.
> Cheers Nick

Arlie,
just a friendly reminder: can you try to send me the log this week?

Regards
Nick
Re: Work (really slow directory access on ext4)
On Fri, Jul 25, 2014 at 9:08 PM, Arlie Stephens wrote:
> On Jul 25 2014, valdis.kletni...@vt.edu wrote:
>> On Fri, 25 Jul 2014 15:23:42 -0700, Arlie Stephens said:
>>
>> > If you want an annoying problem, explain and/or fix directory
>> > performance on ext4. I've got a server where an ls of a directory took
>> > 5 seconds, according to "time", even though it only has 295 entries at
>> > present.
>>
>> I don't suppose you could get a trace of where that ls is spending its
>> time with the kernel's trace facilities, or even just getting a stack trace
>> of where that ls is in the kernel?
>
> These are all very good questions.
>
> To my amazement, I found that no one had yet fixed the problem by
> deleting and recreating the directory, and I do have sudo access.
> This time it was only 4 seconds...
> real 0m3.992s
> user 0m0.005s
> sys 0m0.052s
>
>> I'll go out on a limb and ask if a *second* ls of the same directory runs
>> quickly because it's now cache-hot. If so, I'd start looking at whether
>> there's large amounts of *other* disk activity going on, and the reads of the
>> directory are getting hung in the I/O queue behind other disk
>> read/writes.
>
> Sure enough, the cache saved me on a second read -
> real 0m0.010s
> user 0m0.000s
> sys 0m0.010s
>
>> Also, are you doing an 'ls' (which just requires reading the name/inode#
>> pairs), or an 'ls -l' (which in addition requires a stat() call to read in
>> the inode itself)? That makes a lot of difference. Cache-cold on my laptop,
>> and a *huge* Mail/linux-kernel directory (yes, it really *is* an 11M
>> directory, it's got a half-million entries in it):
>
> I was doing a vanilla ls. So was the original reporter, unless he has
> some really strange aliases.
>
>
> I'm afraid I'll be rather unpopular if I drop the caches on the system
> in question, creating a burst of poor performance, so my best bet is
> probably to see what I can do with ftrace on Monday, or perhaps
> partway through the weekend.
>
> There is normally a fair amount of disk activity going on - much of it
> writes. So I can expect cached blocks to age out in a reasonable time.
>
>
>> [~] echo 3 >| /proc/sys/vm/drop_caches
>> [~] cd Mail
>> [~/Mail] time ls linux-kernel/ | wc -l
>> 478401
>>
>> real 0m2.387s
>> user 0m0.500s
>> sys 0m0.433s
>> [~/Mail] ls -ld linux-kernel/
>> drwxr-xr-x. 2 valdis valdis 11005952 Jul 25 19:30 linux-kernel/
>
> Compared to your directory, mine is microscopic
>
> $ ls -ld
> drwxr-xr-x 2 yyy yyy 36864 Jul 25 12:19
>
>
>> [~/Mail] time ls -l linux-kernel/ | wc -l
>> 478402
>>
>> real 0m32.915s
>> user 0m2.483s
>> sys 0m20.787s
>
> --
> Arlie
>
> (Arlie Stephens ar...@worldash.org)

Arlie,
Whenever you get around to it is fine.
Just send me a log.

Cheers
Nick
Re: Work (really slow directory access on ext4)
On Jul 25 2014, valdis.kletni...@vt.edu wrote:
> On Fri, 25 Jul 2014 15:23:42 -0700, Arlie Stephens said:
>
> > If you want an annoying problem, explain and/or fix directory
> > performance on ext4. I've got a server where an ls of a directory took
> > 5 seconds, according to "time", even though it only has 295 entries at
> > present.
>
> I don't suppose you could get a trace of where that ls is spending its
> time with the kernel's trace facilities, or even just getting a stack trace
> of where that ls is in the kernel?

These are all very good questions.

To my amazement, I found that no one had yet fixed the problem by
deleting and recreating the directory, and I do have sudo access.
This time it was only 4 seconds...
real 0m3.992s
user 0m0.005s
sys 0m0.052s

> I'll go out on a limb and ask if a *second* ls of the same directory runs
> quickly because it's now cache-hot. If so, I'd start looking at whether
> there's large amounts of *other* disk activity going on, and the reads of the
> directory are getting hung in the I/O queue behind other disk
> read/writes.

Sure enough, the cache saved me on a second read -
real 0m0.010s
user 0m0.000s
sys 0m0.010s

> Also, are you doing an 'ls' (which just requires reading the name/inode#
> pairs), or an 'ls -l' (which in addition requires a stat() call to read in
> the inode itself)? That makes a lot of difference. Cache-cold on my laptop,
> and a *huge* Mail/linux-kernel directory (yes, it really *is* an 11M
> directory, it's got a half-million entries in it):

I was doing a vanilla ls. So was the original reporter, unless he has
some really strange aliases.

I'm afraid I'll be rather unpopular if I drop the caches on the system
in question, creating a burst of poor performance, so my best bet is
probably to see what I can do with ftrace on Monday, or perhaps
partway through the weekend.

There is normally a fair amount of disk activity going on - much of it
writes. So I can expect cached blocks to age out in a reasonable time.

> [~] echo 3 >| /proc/sys/vm/drop_caches
> [~] cd Mail
> [~/Mail] time ls linux-kernel/ | wc -l
> 478401
>
> real 0m2.387s
> user 0m0.500s
> sys 0m0.433s
> [~/Mail] ls -ld linux-kernel/
> drwxr-xr-x. 2 valdis valdis 11005952 Jul 25 19:30 linux-kernel/

Compared to your directory, mine is microscopic

$ ls -ld
drwxr-xr-x 2 yyy yyy 36864 Jul 25 12:19

> [~/Mail] time ls -l linux-kernel/ | wc -l
> 478402
>
> real 0m32.915s
> user 0m2.483s
> sys 0m20.787s

--
Arlie

(Arlie Stephens ar...@worldash.org)
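The 'ls' versus 'ls -l' distinction quoted above (names only versus names plus one stat() per entry) can be reproduced in a small sketch. This is an illustration only; `compare_listing_costs` is a hypothetical helper name, and the numbers it prints on a tiny, cache-hot demo directory say nothing about a cold half-million-entry one.

```python
import os
import tempfile
import time


def compare_listing_costs(path):
    """Time a name-only pass and a stat()-per-entry pass over `path`."""
    start = time.perf_counter()
    names = os.listdir(path)  # roughly what a plain `ls` needs
    names_seconds = time.perf_counter() - start

    start = time.perf_counter()
    sizes = [os.lstat(os.path.join(path, name)).st_size for name in names]
    stat_seconds = time.perf_counter() - start  # the extra work `ls -l` adds

    return {
        "entries": len(names),
        "names_seconds": names_seconds,
        "stat_seconds": stat_seconds,
        "total_bytes": sum(sizes),
    }


if __name__ == "__main__":
    # Demo on a throwaway directory with a few small files.
    with tempfile.TemporaryDirectory() as demo_dir:
        for i in range(50):
            with open(os.path.join(demo_dir, f"msg{i}"), "w") as f:
                f.write("x" * 10)
        print(compare_listing_costs(demo_dir))
```

On a cache-cold directory the stat pass is where most of the time would be expected to go, since each entry's inode has to be read from disk.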