On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:

I'm working on scanning filesystems to build a file search engine and
came over something interesting.

I can walk through 300 000 folders in ~19.5 seconds with this command:
ls -Ra | grep -e "./.*:" | sed "s/://"

With find, it surprisingly takes ~50.5 seconds:
find . -type d

This is because 'find' with '-type' lstats all the files.  It doesn't
use DT_DIR from dirent for some reason.  ls can be slowed down similarly
using -F.
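
For comparison, here is a minimal sketch (not find's actual code) of the
kind of walk that trusts d_type from readdir(3) and only falls back to
lstat(2) on file systems that return DT_UNKNOWN:

/*
 * Sketch only: print directories under a tree without lstat()ing
 * every entry, by using dirent's d_type where the file system
 * fills it in.
 */
#include <sys/stat.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

static void
walk(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *dp;
	char child[PATH_MAX];

	if (dir == NULL)
		return;			/* unreadable; just skip it */
	while ((dp = readdir(dir)) != NULL) {
		if (strcmp(dp->d_name, ".") == 0 ||
		    strcmp(dp->d_name, "..") == 0)
			continue;
		snprintf(child, sizeof(child), "%s/%s", path, dp->d_name);
		if (dp->d_type == DT_DIR) {
			printf("%s\n", child);
			walk(child);
		} else if (dp->d_type == DT_UNKNOWN) {
			/* Some file systems don't fill in d_type. */
			struct stat sb;

			if (lstat(child, &sb) == 0 && S_ISDIR(sb.st_mode)) {
				printf("%s\n", child);
				walk(child);
			}
		}
	}
	closedir(dir);
}

int
main(int argc, char **argv)
{
	walk(argc > 1 ? argv[1] : ".");
	return (0);
}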

My results are based on five runs of each command to warm up the disk cache.
I've tried this with both UFS and ZFS, and both filesystems show
the same speed difference.

I get almost exactly the same ratio of speeds on an old version of FreeBSD.
All the data was cached, and there were only 7 symlinks.  The file system
was mounted with -noatime, so the cache actually worked.

On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
slightly faster than find (about 15-20%).

Apparently lstat() is relatively much slower on FreeBSD.  It only takes
5 usec here, but that is a lot for converting cached data (getpid()
takes 0.2 usec).  A file system mounted with -atime might be much
slower, since it has to write directory timestamps (the sync of the
timestamps is delayed, but it is a very heavyweight operation).
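
A crude way to measure per-call costs like these is something along the
following lines (a sketch; the iteration count and the path being
stat'ed are arbitrary):

/*
 * Sketch of a micro-benchmark: time N calls of lstat() on a cached
 * path and N calls of getpid(), and report the per-call cost in usec.
 */
#include <sys/stat.h>
#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>

#define	N	1000000

int
main(void)
{
	struct timeval t0, t1;
	struct stat sb;
	int i;

	gettimeofday(&t0, NULL);
	for (i = 0; i < N; i++)
		lstat(".", &sb);
	gettimeofday(&t1, NULL);
	printf("lstat:  %.2f usec/call\n",
	    ((t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec)) / N);

	gettimeofday(&t0, NULL);
	for (i = 0; i < N; i++)
		getpid();
	gettimeofday(&t1, NULL);
	printf("getpid: %.2f usec/call\n",
	    ((t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec)) / N);
	return (0);
}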

Is there a faster way to walk folders on FreeBSD? Are there some
options (sysctl) I could tune to improve the performance?

Nothing much faster than find without -type, i.e., whatever fts(3) gives.
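
For example, something along these lines with fts(3) (a sketch, not
find's source); FTS_NOSTAT lets fts(3) skip stat(2) on non-directories
wherever the file system fills in d_type:

/*
 * Sketch of a directory-only walk with fts(3), roughly what
 * "find . -type d" could do if it didn't stat every entry.
 */
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
	char *root[] = { argc > 1 ? argv[1] : ".", NULL };
	FTS *fts;
	FTSENT *p;

	fts = fts_open(root, FTS_PHYSICAL | FTS_NOSTAT | FTS_NOCHDIR, NULL);
	if (fts == NULL)
		return (1);
	while ((p = fts_read(fts)) != NULL)
		if (p->fts_info == FTS_D)	/* directory, preorder */
			printf("%s\n", p->fts_path);
	fts_close(fts);
	return (0);
}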

Bruce