On Sun, 24 Jun 2012, Sascha Silbe <sascha-...@silbe.org> wrote:
> All the time I thought what makes "notmuch new" so abysmally slow is the
> stat() for each maildir. But as it continued to be slow even after I
> moved most mails out of 'new' (into 'new-20120624'), I strace'd notmuch
> and noticed it listed even unchanged directories, thereby listing and
> iterating over each and every single of the 900k mails in my mail store.
>
> There's still quite some room for further improvements as it continues
> to take several minutes to scan < 100 new mails in changed directories
> containing < 1000 mails in total. Even the rsync run that fetches the
> new mails is faster.

I haven't looked over your patches yet, but this result surprises me.
Could you explain your setup a little more? How much mail do you have,
and across how many directories? What file system are you using?

I'm also surprised that your new approach helps. The directory listing
has to be read off disk one way or the other, but listing directories is
the bread and butter of file systems, whereas I would think that Xapian
would require more I/O to accomplish the same effect. Does your patch
win because you can specifically list subdirectories out of Xapian,
making the I/O proportional to the number of subdirectories instead of
the number of subdirectories and files (even though the constant factors
probably favor reading from the file system)?

I like the idea of these patches; I just want to make sure I have a firm
grip on what's being optimized and why it wins.
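
To make the hypothesis above concrete, here is a rough Python sketch of
the two scan strategies being compared. It is not notmuch's code and not
the patch under discussion: the function names are made up for
illustration, and two plain dicts (stored_mtimes, stored_subdirs) stand
in for whatever the Xapian database records. It only counts how many
directory entries each strategy has to read.

```python
import os


def build_index(root):
    """Record each directory's mtime and its subdirectories (the stand-in 'database')."""
    mtimes, subdirs = {}, {}
    stack = [root]
    while stack:
        path = stack.pop()
        with os.scandir(path) as it:
            children = [e.path for e in it if e.is_dir(follow_symlinks=False)]
        mtimes[path] = os.stat(path).st_mtime_ns
        subdirs[path] = children
        stack.extend(children)
    return mtimes, subdirs


def scan_via_filesystem(root, stored_mtimes):
    """List every directory, changed or not, just to find its subdirectories.

    Entries read: proportional to subdirectories plus files.
    """
    changed, entries_read = [], 0
    stack = [root]
    while stack:
        path = stack.pop()
        if os.stat(path).st_mtime_ns != stored_mtimes.get(path):
            changed.append(path)
        with os.scandir(path) as it:
            for entry in it:
                entries_read += 1
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return changed, entries_read


def scan_via_index(root, stored_mtimes, stored_subdirs):
    """Take subdirectories of unchanged directories from the stored index.

    Only directories whose mtime differs from the stored one are listed.
    """
    changed, entries_read = [], 0
    stack = [root]
    while stack:
        path = stack.pop()
        if os.stat(path).st_mtime_ns != stored_mtimes.get(path):
            changed.append(path)
            children = []
            with os.scandir(path) as it:
                for entry in it:
                    entries_read += 1
                    if entry.is_dir(follow_symlinks=False):
                        children.append(entry.path)
        else:
            # Unchanged: trust the index, read nothing from the directory.
            children = stored_subdirs.get(path, [])
        stack.extend(children)
    return changed, entries_read
```

Both variants still stat() every directory; the difference is only
whether an unchanged directory's entries are read to discover its
subdirectories. The sketch says nothing about the cost of the database
lookup itself, which is exactly the constant factor in question.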