https://bugs.kde.org/show_bug.cgi?id=334218

Bug ID: 334218
Summary: synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
Classification: Unclassified
Product: Akonadi
Version: GIT (master)
Platform: unspecified
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: NOR
Component: Maildir Resource
Assignee: kdepim-bugs@kde.org
Reporter: mar...@lichtvoll.de

Even after working around

[Bug 332684] New: [Maildir] lots of stats calls to /etc/localtime on synchronizing folders

by setting a TZ environment variable, synchronizing large folders with filesystem contents hogs one CPU core for minutes.

Reproducible: Always

Steps to Reproduce:
1. Have a large maildir folder.
2. Synchronize it.

Actual Results:
akonadi_maildir_resource hogs one Sandybridge core for minutes. The SSDs are underutilized. MySQL is barely visible.

Expected Results:
Synchronizing large folders is faster.

Akonadi stats every file. Is that necessary? For a folder with 250000 mails, that is 250000 calls to stat().

Meanwhile, just listing the folder contents with

martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory> /usr/bin/time find kernel-ml | wc -l
0.21user 0.35system 0:00.68elapsed 82%CPU (0avgtext+0avgdata 59316maxresident)k
13648inputs+0outputs (1major+17920minor)pagefaults 0swaps
250167

is blazingly fast here. We have high CPU usage here as well… but I bet that's due to Linux caching the directory entries and inodes:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
454480 453722  99%    0,98K  28405       16    454480K btrfs_inode
434616 418562  96%    0,19K  20696       21     82784K dentry

So, wouldn't it be sufficient to only stat() the files that are new or have updated timestamps?

martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory> /usr/bin/time find kernel-ml -ls | wc -l
0.70user 0.36system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 59536maxresident)k
32inputs+0outputs (0major+18010minor)pagefaults 0swaps
250167

indicates that the timestamps can also be provided quickly.

So I'd:

1) list the fs folder contents for filenames and timestamps (mtime).
2) compare with database.
3) only stat() the files that are new or have been updated meanwhile.

Result: Blazingly fast folder sync?

For part of the CPU time used, I see no activity of the akonadi maildir resource in strace. The rest of the time it is stat()-ing files like this:

[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R234.merkaba", {st_mode=S_IFREG|0644, st_size=4079, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R322.merkaba", {st_mode=S_IFREG|0644, st_size=8056, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R342.merkaba", {st_mode=S_IFREG|0644, st_size=2771, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R608.merkaba", {st_mode=S_IFREG|0644, st_size=4492, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R665.merkaba", {st_mode=S_IFREG|0644, st_size=13036, ...}) = 0
[pid 4137] stat("/home/martin/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory/kernel-ml/new/1376733031.R738.merkaba", ^C{st_mode=S_IFREG|0644, st_size=6870, ...}) = 0

Related observations also indicate that Akonadi is doing this work needlessly:

Bug 334209 - synchronizes folder contents during runtime needlessly
Bug 334216 - synchronizes folder with filesystem after downloading and filtering mails needlessly

Again, this is on a blazingly fast
ThinkPad T520 with Sandybridge and dual SSD BTRFS RAID 1 setup.
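
The three-step idea from the report (list the folder, compare with what is already known, stat() only what differs) can be sketched roughly as below. This is an illustration in Python, not the maildir resource's actual code. The function name diff_maildir_folder and the dict `known` (file name -> mtime) are hypothetical stand-ins for whatever the database records. One caveat: on Linux a plain directory listing returns names but no timestamps, so the sketch compares names only and assumes the usual Maildir convention that delivered messages are not rewritten in place (flag changes show up as renames). Catching in-place modifications would still need one stat() per file.

```python
import os


def diff_maildir_folder(path, known):
    """Compare a directory listing against previously recorded state.

    `known` maps file name -> mtime (ns); it is a hypothetical stand-in
    for what the database already has.
    Returns (added, removed): `added` maps new file names to their mtime,
    `removed` is the set of recorded names no longer on disk.
    """
    # Step 1: list names only. Reading entry.name does not require a
    # per-file stat() call.
    with os.scandir(path) as entries:
        on_disk = {entry.name for entry in entries}

    # Step 2: compare with the recorded state.
    known_names = set(known)
    new_names = on_disk - known_names
    removed = known_names - on_disk

    # Step 3: stat() only the files that were not known before.
    added = {}
    for name in new_names:
        try:
            added[name] = os.stat(os.path.join(path, name)).st_mtime_ns
        except FileNotFoundError:
            # The file vanished between listing and stat(); skip it.
            continue
    return added, removed
```

With this shape, a resync of an unchanged folder with 250000 mails costs one directory listing and two set differences, and the number of stat() calls scales with the number of new files rather than with the folder size.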