Caching for X11DRV_SetImageBits?
Let me start off with an introduction. I'm a veteran C programmer, but I'm unfamiliar with programming X11 and unfamiliar with wine, although I've been looking into both recently to correct a problem I'm having with wine. I posted about the same problem in August on the same mailing list, about making function parameters const. This was a simplistic solution, but I suggested it simply because I didn't feel I had the time to familiarize myself with all the inner-workings of wine. Now, I believe I do have the time. To repeat the situation, I found that in several of the games I like to run, the same problem is causing wine to be slow: using oprofile, I find that nearly all of wine's CPU usage lies in x11drv, and 98-99% of the cpu usage in x11drv is taken up by calls to some of the convert functions in dib_convert.c. For example, in Civilization 3, some 98% of the CPU usage in x11drv is taken up by two functions: convert_555_to_565_asis, and convert_565_to_555_asis. This summer I had a similar problem on my laptop, where fce ultra was using the function convert_888_to_0888_asis for most of its cpu. Essentially, it seems that most of the work wine is doing when running these programs is just for converting between one pixel format and back, and I hope to find a good solution to this speed problem, preferably through caching. I'm writing to this list in hope of other more experienced wine hackers giving their advice, telling me what is possible, and what the best solution is, before I go too far on my own with this. I'm looking at a group of related functions in wine, in dlls/x11drv/dib.c. It came to my attention that in X11DRV_SetDIBits and X11DRV_SetDIBitsToDevice, wine always seems to end up calling X11DRV_DIB_SetImageBits, which calls (in my case, probably because my desktop is 16-bit color) X11DRV_DIB_SetImageBits_16, which in every case calls a convert function of one type or another (all of which are CPU hogs). These are the core functions that seem to be related to the problem. Here is my early analysis of the problem: X11DRV_DIB_SetImageBits always creates and destroys an XImage during the life of the function, and calls X11DRV_DIB_SetImageBits_16 (or another bit depth), which ends up converting the bitmap to the proper color format / bit depth. It seems to me the best solution is to use some kind of caching to save the converted bitmap in its cached form, preferably as an XImage. Inside X_PHYSBITMAP the HBITMAP is stored, windows' unique handle for that particular bitmap. So, what if we stored the appropriate converted XImage per-HBITMAP, per-bpp, so that it doesn't need to be created, converted, and destroyed each time, and we could delete all cached XImages corresponding to the HBITMAP from the cache when DeleteObject() is called for that HBITMAP? I'm doing a lot of guesswork on the inner-workings of wine, X, and the windows GDI here. Please tell me, am I on the right track for eliminating these unnecessary conversions? If not, where should I be looking? Again I'm new to wine, X, and somewhat new to GDI programming - but I would like to learn. Please, GDI/X11 experts, give me some advice on the best solution to this problem!
Re: Const Function Parameters?
On 7/26/05, Troy Rollo [EMAIL PROTECTED] wrote: On Wed, 27 Jul 2005 09:25, Felix Nawothnig wrote: There is no need to make anything except the pointers const - I don't think I've ever seen that in real world code. In theory this would give the compiler slightly more information... but if the optimizer is unable to figure out that the parameter is never used as an lvalue by himself it sucks so badly that it probably won't do much better with that hints anyway. :-) In fact I don't know that making these const would make any difference - I'd like to see the assembly code generated by the two versions for comparison. I agree, the compiler should realise the parameters aren't used as an l-value in the function anywhere - but it apparently doesn't, or marking the parameters const is having some other effect on the binary - with gcc 4.0.1 (at least on my system, Debian sid), the compiled size of winex11.drv.so changes with my patch (it increases by 368 bytes from 5335409 - 5335777). Just changing the parameters to const for the function in question (and none of the other functions in the file) increases the size from the original size by 64 bytes. I won't pretend to know why my patch increases the size of the binary, but it shows that even without looking at the assembly code, this patch is making some kind of difference (whether good or bad) in the compiled code. I would look further into it, but I don't know assembly. The 1.5% appears to be within the noise. A few of us analysed those routines to death on IRC a couple of months back and decided the real problem was that they were being called too frequently, not that there was a problem with the routines themselves. The routines could probably be made faster by rewriting them in assembly language, but bigger benefits could be gained by figuring out why there are so many calls and reducing the number of calls. I agree it would be good to have these routines called less in the future. However, until we reduce the number of times that routine is called (whenever that may be), it would be nice to know we're doing everything we can to make the routine as fast as possible. Anyway, I'm the amateur wine-hacker here, so if you guys still say we shouldn't change it, I won't argue the matter further. I just want to point out that this patch is certainly doing SOMETHING to the resulting binary, and it's possibly good.
Const Function Parameters?
I've been doing some oprofille tests with wine running fce ultra, the 8-bit Nintendo emulator. I found that when running a rom for 60 seconds, more than 99% of the CPU utilization for winex11drv (which uses the most of all components of wine in this case) is in the same function : convert_888_to_0888_asis (in dlls/x11drv/dib_convert.c). I've also found that marking a couple of parameters of that function const (that I believe should be marked const anyways), CPU usage in that function drops measurably with oprofile. As far as I know, parameters that aren't modified in a function should be marked const anyways, to send the right hint to the compiler. Since it actually turns out to be faster too, it seems worth it to me. Here are the results from oprofile before the change: 2198299.2012 convert_888_to_0888_asis 2207899.1735 convert_888_to_0888_asis 2220799.1605 convert_888_to_0888_asis 2216199.1544 convert_888_to_0888_asis 2215899.2253 convert_888_to_0888_asis and after: 2182899.1731 convert_888_to_0888_asis 2176999.2296 convert_888_to_0888_asis 2183599.3313 convert_888_to_0888_asis 2186899.1027 convert_888_to_0888_asis 2160199.1508 convert_888_to_0888_asis On average, it went from 22117 (context switches I believe?) to 21780, about a 1.5% improvement. The same test was run with a script each time (run fceu with the same rom, kill it after 60 seconds). I'm assuming the percentages (the second column in this chart) don't matter much, because this function we're examining takes up an overwhelming amount of the function's CPU ( 99%), so its percentage depends largely on how many context switches other functions took up. I've taken the liberty of providing a diff that patches all the functions in this file to use const parameters where appropriate. I don't get any compiler warnings with this compiling with gcc 4.0.1. Should I submit this to wine-patches? This would be my first patch to wine (or any open-source project, period). I'd appreciate some feedback before I try for it. Thanks in advance! ? patch.diff Index: dlls/x11drv/dib_convert.c === RCS file: /home/wine/wine/dlls/x11drv/dib_convert.c,v retrieving revision 1.3 diff -u -p -r1.3 dib_convert.c --- dlls/x11drv/dib_convert.c 30 Nov 2004 21:38:58 - 1.3 +++ dlls/x11drv/dib_convert.c 26 Jul 2005 15:03:45 - @@ -69,9 +69,9 @@ * 15 bit conversions */ -static void convert_5x5_asis(int width, int height, - const void* srcbits, int srclinebytes, - void* dstbits, int dstlinebytes) +static void convert_5x5_asis(const int width, const int height, + const void* srcbits, const int srclinebytes, + void* dstbits, const int dstlinebytes) { int y; @@ -83,9 +83,9 @@ static void convert_5x5_asis(int width, } -static void convert_555_reverse(int width, int height, -const void* srcbits, int srclinebytes, -void* dstbits, int dstlinebytes) +static void convert_555_reverse(const int width, const int height, +const void* srcbits, const int srclinebytes, +void* dstbits, const int dstlinebytes) { const DWORD* srcpixel; DWORD* dstpixel; @@ -115,9 +115,9 @@ static void convert_555_reverse(int widt } } -static void convert_555_to_565_asis(int width, int height, -const void* srcbits, int srclinebytes, -void* dstbits, int dstlinebytes) +static void convert_555_to_565_asis(const int width, const int height, +const void* srcbits, const int srclinebytes, +void* dstbits, const int dstlinebytes) { const DWORD* srcpixel; DWORD* dstpixel; @@ -147,9 +147,9 @@ static void convert_555_to_565_asis(int } } -static void convert_555_to_565_reverse(int width, int height, - const void* srcbits, int srclinebytes, - void* dstbits, int dstlinebytes) +static void convert_555_to_565_reverse(const int width, const int height, + const void* srcbits, const int srclinebytes, + void* dstbits, const int dstlinebytes) { const DWORD* srcpixel; DWORD* dstpixel; @@ -181,9 +181,9 @@ static void convert_555_to_565_reverse(i } } -static void convert_555_to_888_asis(int width, int height, -const void* srcbits, int srclinebytes, -void* dstbits, int dstlinebytes) +static void convert_555_to_888_asis(const int width, const int height, +const void* srcbits, const int srclinebytes, +