-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/104052/
-----------------------------------------------------------
(Updated Feb. 25, 2012, 8:45 p.m.)


Review request for kdelibs, David Faure and Michael Pyne.


Changes
-------

Greatly changed the diff. Added:
- gzip compression using KFilterDev
- prevent image insertion when the current image in the cache is exactly the same as the one we want to insert.
- heavily improved speed (changed the main description for this)


Description (updated)
-------

I was running KWin through callgrind to see where possible bottlenecks are. I wasn't expecting much since it improved greatly during the 4.8 dev cycle, but one stood out. Saving PNG images was taking about 1/5th of the time in KWin that I could see directly. That looked like something I might be able to optimize.

What this patch does is store the actual image bits, so that no PNG image has to be saved to the mmapped cache. That was a hot code path in terms of time (cycles), not number of calls. I've also reduced the number of memory copies to a bare minimum by adding a rawFind function to KSharedDataCache, which fills a QByteArray created with QByteArray::fromRawData, thus avoiding an expensive memory copy. rawFind is used for looking up an image and fetching its data without copying it. That is done because QImage seems to make a copy itself internally.

I don't have proper benchmark numbers; however, under callgrind my KWin test used ~5.000.000.000 cycles before this patch and 1.370.000.000 after it. I don't have raw performance numbers to see if the cache itself is actually faster, but it has certainly become a lot cheaper to use the cache. Logically, I would say creating a QImage from the cached data should be much faster now since no image decoding step is involved anymore. Storing is certainly an order of magnitude faster.

-- update --
After spending a lot more time trying to get compression into the mix, "someone" came with the suggestion to use KFilterDev. I had never heard of it, but I gave it a shot anyway. It turned out to be the bull's-eye in this case.
Data is now greatly compressed using bzip (any other compression available in KFilterDev makes it a lot slower) with no speed loss compared to my previous benchmark results. The opposite is true: speedups!

New benchmarks compared to the stock 4.8.0 KImageCache:
Read: ~8x faster (was 5x in my previous patch)
Write: ~5x faster (my previous patch was equally fast compared to stock, no speedup there)

There are still a few major speedups possible here, but they are a lot more complicated to do.
- use a faster compression method; that will probably greatly speed up insertion and retrieval! (I keep saying it: LZ4!)
- somehow store the bits separately so that only the bits need to be fetched for insertion checking. I actually did do this already, but the added overhead of more QIODevice objects and more QBuffers makes it not beneficial at all. It's even a little slower.

As for the issues marked below for race conditions: I need help in that area. I can't do what's suggested (I lack the knowledge for it).

Special thanks go to David Faure for helping me a great deal with this.


Diffs (updated)
-----

  kdecore/util/kshareddatacache.h 339cecc
  kdecore/util/kshareddatacache.cpp 9fe3995
  kdeui/tests/CMakeLists.txt c8b8c85
  kdeui/tests/kimagecachetests.h PRE-CREATION
  kdeui/tests/kimagecachetests.cpp PRE-CREATION
  kdeui/util/kimagecache.cpp a5bbbe1

Diff: http://git.reviewboard.kde.org/r/104052/diff/


Testing
-------

I've also written a bunch of test cases (greatly improved by David Faure) to check that I didn't break anything. According to the tests (which also compare the actual image bits), everything passes just fine.


Thanks,

Mark Gaiser
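

For illustration only: the copy-free lookup and the "skip insertion when the cached image is identical" check described in this request could be sketched roughly as below. This is a minimal sketch using only the C++ standard library, not the code from the diff. ToyByteCache, insertIfChanged and writes are hypothetical names, and rawFind here only mimics the idea of the function added to KSharedDataCache; its real signature is not shown in this mail.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a byte cache; NOT the KSharedDataCache API.
class ToyByteCache {
public:
    // Non-owning lookup: returns a pointer to the stored bytes instead of
    // a copy, or nullptr if the key is unknown. The pointer is only valid
    // until the entry is overwritten or the cache is destroyed.
    const std::string *rawFind(const std::string &key) const {
        auto it = entries_.find(key);
        return it == entries_.end() ? nullptr : &it->second;
    }

    // Stores the bytes unless an identical entry is already cached.
    // Returns true if the cache was written, false if the write was skipped.
    bool insertIfChanged(const std::string &key, const std::string &bytes) {
        const std::string *existing = rawFind(key);
        if (existing != nullptr && *existing == bytes)
            return false;  // same image bits already cached: skip the write
        entries_[key] = bytes;
        ++writes_;
        return true;
    }

    // Number of writes that actually reached the cache.
    std::size_t writes() const { return writes_; }

private:
    std::unordered_map<std::string, std::string> entries_;
    std::size_t writes_ = 0;
};
```

Inserting the same bytes twice under one key performs only one write in this sketch. The comparison reads the cached bytes in place, which is the same reason a copy-free lookup helps the insertion check.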