Caching for X11DRV_SetImageBits?

2005-10-22 Thread Michael Carlson
Let me start off with an introduction. I'm a veteran C programmer, but
I'm unfamiliar with programming X11 and unfamiliar with wine, although
I've been looking into both recently to correct a problem I'm having
with wine. I posted about the same problem in August on the same
mailing list, about making function parameters const. This was a
simplistic solution, but I suggested it simply because I didn't feel I
had the time to familiarize myself with all the inner-workings of wine.
Now, I believe I do have the time.

To repeat the situation, I found that in several of the games I like to
run, the same problem is causing wine to be slow: using oprofile, I
find that nearly all of wine's CPU usage lies in x11drv, and 98-99% of
the cpu usage in x11drv is taken up by calls to some of the convert
functions in dib_convert.c. For example, in Civilization 3, some 98% of
the CPU usage in x11drv is taken up by two functions:
convert_555_to_565_asis, and convert_565_to_555_asis. This summer I had
a similar problem on my laptop, where fce ultra was using the function
convert_888_to_0888_asis for most of its cpu.

Essentially, it seems that most of the work wine is doing when running
these programs is just for converting between one pixel format and
back, and I hope to find a good solution to this speed problem,
preferably through caching. I'm writing to this list in hope of other
more experienced wine hackers giving their advice, telling me what is
possible, and what the best solution is, before I go too far on my own
with this.

I'm looking at a group of related functions in wine, in
dlls/x11drv/dib.c. It came to my attention that in X11DRV_SetDIBits and
X11DRV_SetDIBitsToDevice, wine always seems to end up calling
X11DRV_DIB_SetImageBits, which calls (in my case, probably because my
desktop is 16-bit color) X11DRV_DIB_SetImageBits_16, which in every
case calls a convert function of one type or another (all of which are
CPU hogs). These are the core functions that seem to be related to the
problem.

Here is my early analysis of the problem: X11DRV_DIB_SetImageBits
always creates and destroys an XImage during the life of the function,
and calls X11DRV_DIB_SetImageBits_16 (or another bit depth), which ends
up converting the bitmap to the proper color format / bit depth. It
seems to me the best solution is to use some kind of caching to save
the converted bitmap in its cached form, preferably as an XImage.
Inside X_PHYSBITMAP the HBITMAP is stored, windows' unique handle for
that particular bitmap. So, what if we stored the appropriate converted
XImage per-HBITMAP, per-bpp, so that it doesn't need to be created,
converted, and destroyed each time, and we could delete all cached
XImages corresponding to the HBITMAP from the cache when DeleteObject()
is called for that HBITMAP?

I'm doing a lot of guesswork on the inner-workings of wine, X, and the
windows GDI here. Please tell me, am I on the right track for
eliminating these unnecessary conversions? If not, where should I be
looking? Again I'm new to wine, X, and somewhat new to GDI programming
- but I would like to learn. Please, GDI/X11 experts, give me some
advice on the best solution to this problem!



Re: Const Function Parameters?

2005-07-27 Thread Michael Carlson
On 7/26/05, Troy Rollo [EMAIL PROTECTED] wrote:
 On Wed, 27 Jul 2005 09:25, Felix Nawothnig wrote:
  There is no need to make anything except the pointers const - I don't
  think I've ever seen that in real world code. In theory this would give
  the compiler slightly more information... but if the optimizer is unable
  to figure out that the parameter is never used as an lvalue by himself
  it sucks so badly that it probably won't do much better with that hints
  anyway. :-)
 
 In fact I don't know that making these const would make any difference - I'd
 like to see the assembly code generated by the two versions for comparison.

I agree, the compiler should realise the parameters aren't used as an
l-value in the function anywhere - but it apparently doesn't, or
marking the parameters const is having some other effect on the binary
- with gcc 4.0.1 (at least on my system, Debian sid), the compiled
size of winex11.drv.so changes with my patch (it increases by 368
bytes from 5335409 - 5335777). Just changing the parameters to const
for the function in question (and none of the other functions in the
file) increases the size from the original size by 64 bytes.

I won't pretend to know why my patch increases the size of the binary,
but it shows that even without looking at the assembly code, this
patch is making some kind of difference (whether good or bad) in the
compiled code. I would look further into it, but I don't know
assembly.

 
 The 1.5% appears to be within the noise.
 
 A few of us analysed those routines to death on IRC a couple of months back
 and decided the real problem was that they were being called too frequently,
 not that there was a problem with the routines themselves. The routines could
 probably be made faster by rewriting them in assembly language, but bigger
 benefits could be gained by figuring out why there are so many calls and
 reducing the number of calls.

 
I agree it would be good to have these routines called less in the
future. However, until we reduce the number of times that routine is
called (whenever that may be), it would be nice to know we're doing
everything we can to make the routine as fast as possible.

Anyway, I'm the amateur wine-hacker here, so if you guys still say we
shouldn't change it, I won't argue the matter further. I just want to
point out that this patch is certainly doing SOMETHING to the
resulting binary, and it's possibly good.




Const Function Parameters?

2005-07-26 Thread Michael Carlson
I've been doing some oprofille tests with wine running fce ultra, the
8-bit Nintendo emulator. I found that when running a rom for 60
seconds, more than 99% of the CPU utilization for winex11drv (which
uses the most of all components of wine in this case) is in the same
function : convert_888_to_0888_asis (in dlls/x11drv/dib_convert.c).

I've also found that marking a couple of parameters of that function
const (that I believe should be marked const anyways), CPU usage in
that function drops measurably with oprofile. As far as I know,
parameters that aren't modified in a function should be marked const
anyways, to send the right hint to the compiler. Since it actually
turns out to be faster too, it seems worth it to me.

Here are the results from oprofile before the change:

2198299.2012  convert_888_to_0888_asis
2207899.1735  convert_888_to_0888_asis
2220799.1605  convert_888_to_0888_asis
2216199.1544  convert_888_to_0888_asis
2215899.2253  convert_888_to_0888_asis

and after:

2182899.1731  convert_888_to_0888_asis
2176999.2296  convert_888_to_0888_asis
2183599.3313  convert_888_to_0888_asis
2186899.1027  convert_888_to_0888_asis
2160199.1508  convert_888_to_0888_asis

On average, it went from 22117 (context switches I believe?) to 21780,
about a 1.5% improvement. The same test was run with a script each
time (run fceu with the same rom, kill it after 60 seconds). I'm
assuming the percentages (the second column in this chart) don't
matter much, because this function we're examining takes up an
overwhelming amount of the function's CPU ( 99%), so its percentage
depends largely on how many context switches other functions took up.

I've taken the liberty of providing a diff that patches all the
functions in this file to use const parameters where appropriate. I
don't get any compiler warnings with this compiling with gcc 4.0.1.
Should I submit this to wine-patches? This would be my first patch to
wine (or any open-source project, period). I'd appreciate some
feedback before I try for it.

Thanks in advance!
? patch.diff
Index: dlls/x11drv/dib_convert.c
===
RCS file: /home/wine/wine/dlls/x11drv/dib_convert.c,v
retrieving revision 1.3
diff -u -p -r1.3 dib_convert.c
--- dlls/x11drv/dib_convert.c	30 Nov 2004 21:38:58 -	1.3
+++ dlls/x11drv/dib_convert.c	26 Jul 2005 15:03:45 -
@@ -69,9 +69,9 @@
  * 15 bit conversions
  */
 
-static void convert_5x5_asis(int width, int height,
- const void* srcbits, int srclinebytes,
- void* dstbits, int dstlinebytes)
+static void convert_5x5_asis(const int width, const int height,
+ const void* srcbits, const int srclinebytes,
+ void* dstbits, const int dstlinebytes)
 {
 int y;
 
@@ -83,9 +83,9 @@ static void convert_5x5_asis(int width, 
 }
 
 
-static void convert_555_reverse(int width, int height,
-const void* srcbits, int srclinebytes,
-void* dstbits, int dstlinebytes)
+static void convert_555_reverse(const int width, const int height,
+const void* srcbits, const int srclinebytes,
+void* dstbits, const int dstlinebytes)
 {
 const DWORD* srcpixel;
 DWORD* dstpixel;
@@ -115,9 +115,9 @@ static void convert_555_reverse(int widt
 }
 }
 
-static void convert_555_to_565_asis(int width, int height,
-const void* srcbits, int srclinebytes,
-void* dstbits, int dstlinebytes)
+static void convert_555_to_565_asis(const int width, const int height,
+const void* srcbits, const int srclinebytes,
+void* dstbits, const int dstlinebytes)
 {
 const DWORD* srcpixel;
 DWORD* dstpixel;
@@ -147,9 +147,9 @@ static void convert_555_to_565_asis(int 
 }
 }
 
-static void convert_555_to_565_reverse(int width, int height,
-   const void* srcbits, int srclinebytes,
-   void* dstbits, int dstlinebytes)
+static void convert_555_to_565_reverse(const int width, const int height,
+   const void* srcbits, const int srclinebytes,
+   void* dstbits, const int dstlinebytes)
 {
 const DWORD* srcpixel;
 DWORD* dstpixel;
@@ -181,9 +181,9 @@ static void convert_555_to_565_reverse(i
 }
 }
 
-static void convert_555_to_888_asis(int width, int height,
-const void* srcbits, int srclinebytes,
-void* dstbits, int dstlinebytes)
+static void convert_555_to_888_asis(const int width, const int height,
+const void* srcbits, const int srclinebytes,
+