Re: Review Request: KImageCache optimization

Mark Gaiser Sun, 26 Feb 2012 01:40:26 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/104052/
-----------------------------------------------------------


(Updated Feb. 25, 2012, 8:45 p.m.)


Review request for kdelibs, David Faure and Michael Pyne.


Changes
-------

Greatly changed the diff.
Added:
- gzip compression using KFilterDev
- prevent image insertion when the current image from the cache is exactly the 
same as the one we want to insert.
- heavily improved speed
(changed main description for this)
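
The "skip insertion when identical" change can be illustrated with a minimal
sketch. It is written in plain standard C++, not against the real KImageCache
or KSharedDataCache API; TinyCache, Bytes and insertIfChanged are hypothetical
names used only for illustration, and an in-process map stands in for the
shared cache.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for cached image data: just the raw bytes.
using Bytes = std::vector<std::uint8_t>;

// Hypothetical miniature cache; the real cache is shared memory, not a map.
class TinyCache {
public:
    // Returns true if the entry was written, false if an identical entry
    // was already present and the write was skipped.
    bool insertIfChanged(const std::string &key, const Bytes &data) {
        assert(!key.empty());
        auto it = m_entries.find(key);
        if (it != m_entries.end() && it->second == data) {
            return false; // same bytes already cached, nothing to do
        }
        m_entries[key] = data;
        ++m_writes;
        return true;
    }

    // Number of writes that actually reached the cache.
    int writes() const { return m_writes; }

private:
    std::unordered_map<std::string, Bytes> m_entries;
    int m_writes = 0;
};
```

The comparison costs one read of the existing entry, which only pays off when
reads are cheaper than writes, as the benchmarks in the description suggest.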


Description (updated)
-------

I was running KWin through callgrind to see where possible bottlenecks are. I
wasn't expecting much since it improved greatly during the 4.8 dev cycle;
however, one stood out. Saving PNG images was taking about 1/5th of the
time in KWin that I could see directly. That looked like something I might be
able to optimize.

What this patch does is store the actual image bits, so that no PNG image has
to be saved to the mmapped cache. That was a hot code path in time (cycles),
though not in number of calls. I've also reduced the number of memory copies
to a bare minimum by adding a rawFind function to KSharedDataCache which fills
a QByteArray::fromRawData, thus preventing an expensive memory copy. The
rawFind is used for looking up an image and fetching its data without copying
it. That is done because QImage seems to make a copy itself internally. I
don't have any performance measurements; however, prior to this patch my kwin
test was using up ~5.000.000.000 cycles. After this patch it's using up
1.370.000.000. I don't have raw performance numbers to see if the cache itself
is actually faster, but it has certainly become a lot cheaper to use the
cache. Logically, I would say creating a QImage from the cached data should be
way faster now since there is no longer a step involved in decoding the image.
Storing is certainly an order of magnitude faster.
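
The idea of storing raw bits and reading them back without a copy can be
sketched as follows. This is an illustration in plain standard C++ under
stated assumptions, not the code from the diff: ImageHeader, ImageView,
packImage and viewImage are hypothetical names, and the header layout is
invented for the example. In the actual patch, rawFind and
QByteArray::fromRawData play the role of the non-owning view.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size header stored in front of the pixel data.
struct ImageHeader {
    std::int32_t width;
    std::int32_t height;
    std::int32_t bytesPerLine;
};

// Non-owning view into a cache entry: no pixel data is copied.
struct ImageView {
    ImageHeader header;
    const std::uint8_t *bits;
    std::size_t size;
};

// Store path: header followed by the raw bits, with no PNG encoding step.
std::vector<std::uint8_t> packImage(const ImageHeader &h,
                                    const std::uint8_t *bits) {
    assert(h.height >= 0 && h.bytesPerLine >= 0);
    const std::size_t n = static_cast<std::size_t>(h.height) *
                          static_cast<std::size_t>(h.bytesPerLine);
    std::vector<std::uint8_t> out(sizeof(ImageHeader) + n);
    std::memcpy(out.data(), &h, sizeof(ImageHeader));
    if (n > 0) {
        std::memcpy(out.data() + sizeof(ImageHeader), bits, n);
    }
    return out;
}

// Lookup path: copy only the small header, point at the bits in place.
// The view is valid only as long as `entry` stays alive and unchanged.
ImageView viewImage(const std::vector<std::uint8_t> &entry) {
    assert(entry.size() >= sizeof(ImageHeader));
    ImageView v{};
    std::memcpy(&v.header, entry.data(), sizeof(ImageHeader));
    v.bits = entry.data() + sizeof(ImageHeader);
    v.size = entry.size() - sizeof(ImageHeader);
    return v;
}
```

The lifetime caveat in viewImage is the same one that applies to any
non-owning view of cached memory: whoever consumes the bits must either finish
before the entry changes or take its own copy.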

-- update --
After spending a lot more time trying to get compression into the mix,
"someone" came up with the suggestion to use KFilterDev. I had never heard of
it, but I gave it a shot anyway. It turned out to be the bull's-eye in this
case. Data is now greatly compressed using bzip (any other compression
available in KFilterDev makes it a lot slower) with no speed loss compared to
my previous benchmark results; on the contrary, there are speedups!

New benchmarks compared to the stock 4.8.0 KImageCache:
Read: ~8x faster (was 5x in my previous patch)
Write: ~5x faster (my previous patch was as fast as stock here, with no
speedup)

There are still a few major speedups to be had here, but they are a lot more
complicated to do.
- use a faster compression method; that will probably greatly speed up
insertion and retrieval! (I keep saying it: LZ4!)
- somehow store the bits separately so that only the bits can be fetched for
insertion checking. I actually did do this already, but the added overhead of
more QIODevice objects and more QBuffers is causing it to not be beneficial at
all. It's even a little slower.

As for the issues marked below regarding race conditions: I need help in that
area. I can't do what's suggested (I lack the knowledge for it).

Special thanks go to David Faure for helping me a great deal with this.


Diffs (updated)
-----

  kdecore/util/kshareddatacache.h 339cecc 
  kdecore/util/kshareddatacache.cpp 9fe3995 
  kdeui/tests/CMakeLists.txt c8b8c85 
  kdeui/tests/kimagecachetests.h PRE-CREATION 
  kdeui/tests/kimagecachetests.cpp PRE-CREATION 
  kdeui/util/kimagecache.cpp a5bbbe1 

Diff: http://git.reviewboard.kde.org/r/104052/diff/


Testing
-------

I've also written a bunch of test cases (greatly improved by David Faure) to
check that I didn't break anything. According to the tests (which also compare
the actual image bits), everything passes.


Thanks,

Mark Gaiser
