On Wednesday 17 November 2010 21:55:28 Alan McKinnon wrote:
> Apparently, though unproven, at 23:00 on Wednesday 17 November 2010, Paul
> Hartman did opine thusly:
> > On Wed, Nov 17, 2010 at 2:35 PM, Mick <michaelkintz...@gmail.com> wrote:
> > > Why is the second time so much faster? The size of the derived db was
> > > the same on both occasions.
> >
> > I guess caching like Volker said too. What happens if you do something
> > like this twice:
> >
> > sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
>
> Now I'm intrigued. I did some quick and nasty tests.
>
> First, mlocate's updatedb. No measures taken to invalidate caches etc:
>
> # time updatedb
> real 0m39.265s
> user 0m2.245s
> sys 0m0.228s
>
>
> Then unmerge mlocate, emerge slocate, delete all dbs, run slocate's
> updatedb twice:
>
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m35.365s
> user 0m5.941s
> sys 0m0.383s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m34.929s
> user 0m5.925s
> sys 0m0.377s
>
> slocate seems quicker than the few tests I'd already done with mlocate and
> has no optimizations to re-use existing correct data in the db. Now
> unmerge slocate, merge mlocate, do not delete dbs and run mlocate's
> updatedb twice:
>
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m50.574s
> user 0m7.277s
> sys 0m0.361s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m5.830s
> user 0m2.088s
> sys 0m0.173s
>
> Second run definitely quicker as it only has to read the fs, not write the
> entire index as well. But that initial run ... The old slocate db was still
> around, possibly affecting the first run, so delete both dbs and run
> mlocate's updatedb twice:
>
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m51.592s
> user 0m7.249s
> sys 0m0.350s
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m7.662s
> user 0m1.997s
> sys 0m0.159s
>
> Almost identical to the prior test, so the presence of slocate's db has no
> effect on mlocate. Then I realized I hadn't measured how long they took to
> reindex a largely cached fs so I tried that with both, deleting the dbs
> at each test:
>
> slocate:
> # rm /var/lib/[ms]locate/*db
> rm: cannot remove `/var/lib/[ms]locate/*db': No such file or directory
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 1m34.341s
> user 0m5.929s
> sys 0m0.397s
> # time updatedb
> real 0m2.454s
> user 0m0.855s
> sys 0m1.569s
>
> mlocate:
> # rm /var/lib/[ms]locate/*db
> # sync; sh -c "echo 3 > /proc/sys/vm/drop_caches"; time updatedb
> real 3m54.792s
> user 0m7.215s
> sys 0m0.350s
> # time updatedb
> real 0m0.538s
> user 0m0.302s
> sys 0m0.232s
>
> 0.5 seconds vs 2.5 seconds. Wow.
>
> Conclusions:
>
> 1. mlocate is slow at building its db from scratch - about 250% as long as
> slocate on the same task.
> 2. mlocate is faster at reindexing a largely-unchanged fs - it does it in
> about 66% of the time slocate took.
> 3. mlocate is insanely quick at reindexing a db that is in cache.
>
> #1 is rare - most systems will only do it once
> #3 is silly and does not represent anything close to reality
> #2 is pretty realistic and a 33% performance boost is significant
>
> I have no idea where the speed increase in #3 comes from. This is an ext4
> fs - does ext4 keep an in-memory hash of inodes it reads? It seems to me
> that would be a very clever and very useful thing for an fs to do.
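
As a quick cross-check of the ratios in conclusions #1 and #2, here is a rough sketch that turns the "real" values quoted above into percentages. The helper names to_seconds and percent_of are made up for illustration, and the choice of which runs to compare (the from-scratch builds for #1, the second cold-cache runs for #2) is an assumption about which figures were meant:

```shell
#!/bin/sh
# Convert a `time`-style value such as 3m51.592s into seconds.
# (to_seconds is a hypothetical helper, not part of mlocate or slocate.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how long run A took as a whole-number percentage of run B.
# (percent_of is likewise a hypothetical helper.)
percent_of() {
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.0f\n", a / b * 100 }'
}

# #1: mlocate vs slocate, db built from scratch, caches dropped
percent_of 3m51.592s 1m35.365s

# #2: mlocate (db already present) vs slocate, caches dropped
percent_of 1m5.830s 1m34.929s
```

With the numbers from the quoted runs this gives roughly 243% and 69%, which is in line with the "about 250%" and "about 66%" above.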

No. 3 is what made me send my first post. I was almost convinced that I had
done something wrong, because no sooner had I hit return than it completed.

I've deleted the database and rebooted. This is what I'm getting now on the
first run:

# time updatedb
real 2m30.729s
user 0m0.723s
sys 0m9.070s

My database is small; this is a relatively slim installation:

# ls -la /var/lib/mlocate/mlocate.db
-rw-r----- 1 root locate 9326688 Nov 17 22:14 /var/lib/mlocate/mlocate.db

-- 
Regards,
Mick