[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #9 from Martin Steigerwald mar...@lichtvoll.de --- Sergio, I tried: 107 QStringList listNew() const 108 { 109 static int count = 0; count++; 110 static int elapsed = 0; 111 112 QDir d( path + QString::fromLatin1( /new ) ); 113 d.setSorting(QDir::NoSort); 114 115 QTime t; 116 t.start(); 117 QStringList list = d.entryList( QDir::Files ); 118 elapsed += t.elapsed(); 119 kDebug() PERF: listNew NoSort count elapsed \n; 120 return list; 121 122 //return d.entryList( QDir::Files ); 123 } 124 125 QStringList listCurrent() const 126 { 127 static int count = 0; count++; 128 static int elapsed = 0; 129 130 QDir d( path + QString::fromLatin1( /cur ) ); 131 d.setSorting(QDir::NoSort); 132 133 QTime t; 134 t.start(); 135 QStringList list = d.entryList( QDir::Files ); 136 elapsed += t.elapsed(); 137 kDebug() PERF: listCurrent NoSort count elapsed \n; 138 return list; 139 140 //return d.entryList( QDir::Files ); 141 } as well as your -std::cout PERF: listNew NoSort count elapsed \n; +kDebug() PERF: listNew NoSort count elapsed \n; I see no output on konsole or in ~/.xsession-errors even with full debug enabled in kdebugdialog before starting akonadi again. Additionally I look at the callgrind files again and I still think they are quite different. The nodirsorting callgrind-vcard.26128.1 has: - QDir::entryList - QAlgorithmsPrivate::qSortHelper - QAlgorithmsPrivate::qSortHelper - QDirSortItemComparator::operator - QString::compare as first 5 out of first 6 entries. And none of it has a very long bar. Yet the old callgrind has - QAlgorithmsPrivat::qSortHelper, lr 145684.95, lr per call as the first of the entries and it has a very long outstanding bar. This very long outstanding bar is basically gone in the new callgrind run. Here as directly from callgrind file: So somewhere QDir may still sort something, but not at the original place anymore. At least thats my bet on looking at the data. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #10 from Martin Steigerwald mar...@lichtvoll.de --- Sergio, I tried: 107 QStringList listNew() const 108 { 109 static int count = 0; count++; 110 static int elapsed = 0; 111 112 QDir d( path + QString::fromLatin1( /new ) ); 113 d.setSorting(QDir::NoSort); 114 115 QTime t; 116 t.start(); 117 QStringList list = d.entryList( QDir::Files ); 118 elapsed += t.elapsed(); 119 kDebug() PERF: listNew NoSort count elapsed \n; 120 return list; 121 122 //return d.entryList( QDir::Files ); 123 } 124 125 QStringList listCurrent() const 126 { 127 static int count = 0; count++; 128 static int elapsed = 0; 129 130 QDir d( path + QString::fromLatin1( /cur ) ); 131 d.setSorting(QDir::NoSort); 132 133 QTime t; 134 t.start(); 135 QStringList list = d.entryList( QDir::Files ); 136 elapsed += t.elapsed(); 137 kDebug() PERF: listCurrent NoSort count elapsed \n; 138 return list; 139 140 //return d.entryList( QDir::Files ); 141 } as well as your -std::cout PERF: listNew NoSort count elapsed \n; +kDebug() PERF: listNew NoSort count elapsed \n; I see no output on konsole or in ~/.xsession-errors even with full debug enabled in kdebugdialog before starting akonadi again. Additionally I look at the callgrind files again and I still think they are quite different. The nodirsorting callgrind-vcard.26128.1 has: - QDir::entryList - QAlgorithmsPrivate::qSortHelper - QAlgorithmsPrivate::qSortHelper - QDirSortItemComparator::operator - QString::compare as first 5 out of first 6 entries. And none of it has a very long bar. Yet the old callgrind-vcard.28282.1 has - QAlgorithmsPrivat::qSortHelper, lr 145684.95, lr per call as the first of the entries and it has a very long outstanding bar. This very long outstanding bar is basically gone in the new callgrind run. Just look at each Just look at these two files with kcachegrind to me they look quite different in the old the sort stuff is dominating everything else by a factor of 10 at least, in the new it looks way more balanced. So somewhere QDir may still sort something, but not at the original place anymore. At least thats my bet on looking at the data. I wonder whether something else is using QDir as well and did not disable sorting and we see this now. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #11 from Martin Steigerwald mar...@lichtvoll.de --- I don´t know whether this is called and what that does, but setting nosort here as well for testing: diff --git a/resources/maildir/libmaildir/keycache.cpp b/resources/maildir/libmaildir/keycache.cpp index f0af9c4..814ce6c 100644 --- a/resources/maildir/libmaildir/keycache.cpp +++ b/resources/maildir/libmaildir/keycache.cpp @@ -75,12 +75,14 @@ bool KeyCache::isNewKey( const QString dir, const QString key ) const QSet QString KeyCache::listNew( const QString dir ) const { QDir d( dir + QString::fromLatin1( /new ) ); +d.setSorting(QDir::NoSort); return d.entryList( QDir::Files ).toSet(); } QSet QString KeyCache::listCurrent( const QString dir ) const { QDir d( dir + QString::fromLatin1( /cur ) ); +d.setSorting(QDir::NoSort); return d.entryList( QDir::Files ).toSet(); } -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #12 from Martin Steigerwald mar...@lichtvoll.de --- Created attachment 86428 -- https://bugs.kde.org/attachment.cgi?id=86428action=edit callgrind with nodirsorting also in libmaildir/keycache.cpp Well this is it. KMail is usable even with make -j4 on kdepim-runtime on a hyperthreaded Sandybridge mobile dual core with that. Akonadi maildir resource still appears with some CPU usage, but I didn´t see it at 100% for even just one 10 second measurement period in atop. callgrind logs show a different bottleneck now: MaildirResource::listRecursive To me these two nosorting patches have such a huge impact on Maildir resource performance that using KMail is a completely different experience now. KMail hardly ever blocks on folder changes anymore for example, although that seems just the case due to synchronisation being much faster now. So this still is good to *fix*. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #1 from Martin Steigerwald mar...@lichtvoll.de --- I bet that stat()ing files isn´t the only issue here, as find -ls also uses a variant of the stat() call on each file and as demonstrated it way faster: Warm caches: martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory strace -c find kernel-ml -ls 21 /dev/null % time seconds usecs/call callserrors syscall -- --- --- - - 90.934.199983 17250167 newfstatat 6.340.292629 770 380 getdents 1.970.091028 14 6437 write 0.600.027740 51 539 brk 0.040.001891 345618 open 0.030.001159 2743 mmap 0.020.000866 1750 close 0.010.000650 1738 fstat 0.010.000492 492 1 execve 0.010.000458 2916 mprotect 0.010.000334 42 8 8 access 0.010.000323 36 9 munmap 0.010.000311 2413 read 0.000.000180 45 4 4 connect 0.000.000176 1017 fchdir 0.000.000150 21 7 openat 0.000.000121 30 4 socket 0.000.95 19 5 lseek 0.000.51 17 3 2 ioctl 0.000.27 27 1 uname 0.000.14 14 1 arch_prctl -- --- --- - - 100.004.61867825779932 total martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory /usr/bin/time find kernel-ml -ls 21 /dev/null 0.69user 0.35system 0:01.04elapsed 100%CPU (0avgtext+0avgdata 59536maxresident)k 0inputs+0outputs (0major+18011minor)pagefaults 0swaps After echo 3 /proc/sys/vm/drop_caches: martin@merkaba:~/.local/share/local-mail/.Lichtvoll.directory/.Linux.directory /usr/bin/time find kernel-ml -ls 21 /dev/null 3.40user 9.67system 0:21.56elapsed 60%CPU (0avgtext+0avgdata 59532maxresident)k 780736inputs+0outputs (8major+18001minor)pagefaults 0swaps (still just 20 seconds and well in case of KMail / Akonadi caches are warm, machine has 8 GB of RAM) -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #2 from Martin Steigerwald mar...@lichtvoll.de --- Compared to this strace of martin@merkaba:~#16 ps aux | head -1 ; ps aux | grep akonadi_maildir_resource | grep -v grep USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND martin 26830 12.3 4.4 660948 356240 ? Rl 12:49 5:47 /usr/local/bin/akonadi_agent_launcher akonadi_maildir_resource akonadi_maildir_resource_0 after downloading about 50-100 new mails and filtering them. The tasks used up one sandybridge core for several minutes even in this case – OK, granted strace will make it way slower, so what is 5-10 minutes might be one minute without strace maybe. Here strace -c for the whole time of maildir resource occupying the CPU core: merkaba:~ strace -c -f -p 26830 Process 26830 attached with 2 threads - interrupt to quit ^CProcess 26830 detached Process 26831 detached % time seconds usecs/call callserrors syscall -- --- --- - - 84.29 10.324846 26395164 111 stat 5.870.719352 448 1607 getdents 2.850.349666 5 74548 74264 recvmsg 2.770.339268 9 37233 459 read 1.700.208472 6 37003 poll 1.230.150676 4 35434 ioctl 0.390.048249 11 4521 write 0.240.029460 44666 rename 0.170.021152 18 1178 sendmsg 0.140.017625 10 1744 access 0.090.010768 21 518 brk 0.080.009760 38 259 open 0.030.003285 11 309 close 0.030.003128 7 442 fstat 0.030.003128 456966 lstat 0.020.002616 3087 munmap 0.020.002154 4350 statfs 0.010.001812 2187 mmap 0.010.001712 7 259 fcntl 0.010.000826 1750 openat 0.000.000411 759 lseek 0.000.000262 558 uname 0.000.78 78 1 restart_syscall -- --- --- - - 100.00 12.248706590746 74900 total Note during this time I saw not any notable activity of MySQL server in atop. Just the maildir resource hogging up the CPU. Also Akonadi server itself is idling. Except for the end where MySQL and Akonadi server show a short burst of activity. After this analysis I get more and more the impression that the time Akonadi maildir resource takes to sort 50-100 mails exceeds the actually necessary effort by a tremendous factor. Maildir resource is doing needless work here. For me there is no other explaination to it. Filtering mails to folders IMHO consists of the following work: 1) Running filters on the mails. These load the mail headers and check various criteria. Should be done in a second or two. 2) Actually moving the files to their respective folders which involves: - changing folder path in database. - mv the file - okay in case of Akonadi this may go through file_db_data? Anyway I cannot see why this ridicolously amount of system calls and CPU usage – as I demontrated in my analysis here – is even remotely necessary for the job. After this analysis I intend to look at the source again. Maybe with reading API docs I understand it a bit better than last time and find something. Akonadi is git 159bfbdf8667bb01b1f4c85ea144ca128ccb3195 -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #3 from Martin Steigerwald mar...@lichtvoll.de --- Created attachment 86406 -- https://bugs.kde.org/attachment.cgi?id=86406action=edit callgrind data of 5 minutes cpu busy maildir resource About 5 minute callgrind data with akonadi maildir resource being 100% cpu busy after Akonadi restart. As its still busy I can add another dump later on :) Done as: martin@merkaba:~ env | grep AKONADI AKONADI_VALGRIND=akonadi_maildir_resource AKONADI_VALGRIND_SKIN=callgrind AKONADI_VALGRIND_OPTIONS=--instr-atstart=no --callgrind-out-file=/home/martin/callgrind-vcard.%p martin@merkaba:~ callgrind_control -i on PID 28282: /usr/local/bin/akonadi_agent_launcher akonadi_maildir_resource akonadi_maildir_resource_0 sending command instrumentation on to pid 28282 OK. martin@merkaba:~ date Fr 2. Mai 14:06:14 CEST 2014 martin@merkaba:~ date Fr 2. Mai 14:11:14 CEST 2014 martin@merkaba:~ callgrind_control -d PID 28282: /usr/local/bin/akonadi_agent_launcher akonadi_maildir_resource akonadi_maildir_resource_0 -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 Martin Steigerwald mar...@lichtvoll.de changed: What|Removed |Added Attachment #86406|0 |1 is obsolete|| --- Comment #4 from Martin Steigerwald mar...@lichtvoll.de --- Created attachment 86408 -- https://bugs.kde.org/attachment.cgi?id=86408action=edit complete callgrind run over a longer time with three dumps one after 5 minutes, another one after 15 minutes (i.e. ten minutes later) and a third one before stopping the thing -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #5 from Martin Steigerwald mar...@lichtvoll.de --- Testing with: diff --git a/resources/maildir/libmaildir/maildir.cpp b/resources/maildir/libmaildir/maildir.cpp index 9bd3802..36166ed 100644 --- a/resources/maildir/libmaildir/maildir.cpp +++ b/resources/maildir/libmaildir/maildir.cpp @@ -106,12 +106,14 @@ public: QStringList listNew() const { QDir d( path + QString::fromLatin1( /new ) ); +d.setSorting(QDir::NoSort); return d.entryList( QDir::Files ); } QStringList listCurrent() const { QDir d( path + QString::fromLatin1( /cur ) ); +d.setSorting(QDir::NoSort); return d.entryList( QDir::Files ); } which disables QDir sorting which was so prominent in callgrind traces. I get the impression it helps the bit, but it does not seem to fix all of the performance issue. Need to take more time to check more thoroughly. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #6 from Martin Steigerwald mar...@lichtvoll.de --- Suggestion was from David Faure and Sergio. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #7 from Martin Steigerwald mar...@lichtvoll.de --- Created attachment 86418 -- https://bugs.kde.org/attachment.cgi?id=86418action=edit second callgrind of about well at least half an hour, after nodirsorting patch applied This one looks a quite bit different. The resource still hogs a core, partly for minutes still, and KMail. But KMail does respond more often in between. I think KMail reacts after each complete folder synchronisation for a short time. So do you think about this callgrind? -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs
[Akonadi] [Bug 334218] synchronizations of large folders with filesystem contents hogs a Sandybridge core for minutes stat()ing every file in it
https://bugs.kde.org/show_bug.cgi?id=334218 --- Comment #8 from Martin Steigerwald mar...@lichtvoll.de --- During this callgrind run KMail did respond in between, so I really think folder sync got faster by nodir sorting patch as during last callgrind of also about half an hour or so it didn´t respond once. Yet… no hard numbers yet. I don´t know how to make repeatable tests so far. -- You are receiving this mail because: You are the assignee for the bug. ___ Kdepim-bugs mailing list Kdepim-bugs@kde.org https://mail.kde.org/mailman/listinfo/kdepim-bugs