>  * Is optimizing the code based on these results really that good for
> real life performance for 16bit applications. 

One thing that is important to note is that Win16 is slower than Win32
because the Win16 API almost always converts its parameters and calls
the corresponding Win32 API.

This should probably not be optimized, for maintenance reasons.

> For instance, is
> optimizing bit blits that important? Well, I guess this depends on the
> benchmark's weights for each test and how good this benchmark is (was)
> in the first place.

One important function (ScrollDC) that uses blits is quite
fast, so it might not be that critical.
 
>  * This is a 16bit benchmark but nowadays most applications are 32bits.
> To what extent do these results reflect the performance we would get
> with a 32bit application?

To a very large extent they are the same, I think, especially for slow
functions. The Win16 <=> Win32 call overhead is not that large.
 
>  * In any case this benchmark it only tests graphical so it only
> presents a small view into Wine's performance. 

True.
 
>  * I was surprised (and still am) by the result we got on MoveTo
> (MoveTo16 really). According to the test we are 11 times slower than
> Windows? Yet MoveTo does not seem to be an API that does much. Of course
> MoveTo is a very fast API in any case so this probably does not affect
> our performance much in real life. What I wondered is how much of it
> could be explained by general 'setup costs' (DC_GetDCUpdate,
> GDI_ReleaseObj, system call?)

Calling DC_GetDCUpdate is unnecessary, since it mainly updates
the visible region, which MoveTo doesn't use.

However, the cost of the normal DC_GetDCPtr is probably not much
lower, so I don't know.

Anyway, I have tried to optimize MoveTo{,Ex}{,16}, looking at the assembler
output from both GNU C and Solaris C in order to make it as fast as
I'm able. I will probably submit it to wine-patches later today.

It will be a little faster, but probably not by much.
I fear that DC_GetDCPtr is the main culprit.

> Ratio      Win           Wine          Test
[snip]
>    61.33           0.4           0.0   Pixel, Get

Not surprised. The implementation of this is horrible.

>    23.82     5240000.0      220000.0   GetNearestColor

This uses an O(n) algorithm, but I think it can be implemented
with an O(log(n)) one.
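
For reference, an O(n) lookup is essentially a linear scan over the
palette, roughly like the sketch below. This is not Wine's actual code:
the type and function names are made up, and squared RGB distance is my
assumption about the metric. Getting to O(log(n)) would presumably need
some pre-built index over the palette in place of this loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical palette entry; not Wine's PALETTEENTRY. */
typedef struct { uint8_t r, g, b; } rgb_sketch;

/* Squared Euclidean distance in RGB space (assumed metric). */
static uint32_t rgb_dist2(rgb_sketch a, rgb_sketch b)
{
    int32_t dr = (int32_t)a.r - b.r;
    int32_t dg = (int32_t)a.g - b.g;
    int32_t db = (int32_t)a.b - b.b;
    return (uint32_t)(dr * dr + dg * dg + db * db);
}

/* Linear scan: visits every entry once, so each lookup costs O(n).
 * Returns the index of the closest entry; ties go to the lowest index. */
size_t nearest_color_linear(const rgb_sketch *pal, size_t count, rgb_sketch want)
{
    size_t best = 0;
    uint32_t best_d;

    assert(pal != NULL && count > 0);
    best_d = rgb_dist2(pal[0], want);
    for (size_t i = 1; i < count; i++) {
        uint32_t d = rgb_dist2(pal[i], want);
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;
}
```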

>    21.52          24.1           1.1   PolyLine

Hmm. Polyline16 uses HeapAlloc when converting
Win16 to Win32. Perhaps it should use alloca or
something instead.
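
Something like the following is what I have in mind. It is only a
sketch: all names are hypothetical, malloc stands in for HeapAlloc,
the 32-bit worker is a dummy that just records the last point, and I
use a fixed-size stack buffer with a heap fallback rather than alloca
so that a huge point count cannot blow the stack. The threshold of 64
is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for POINT16 and POINT. */
typedef struct { int16_t x, y; } pt16_sketch;
typedef struct { int32_t x, y; } pt32_sketch;

#define STACK_PTS 64  /* arbitrary threshold, chosen for illustration */

/* Records the final point seen, so the sketch has an observable effect. */
pt32_sketch last_pt;

/* Dummy 32-bit worker: needs at least two points to draw anything. */
static int polyline32_sketch(const pt32_sketch *pts, int count)
{
    if (count < 2) return 0;
    last_pt = pts[count - 1];
    return 1;
}

/* 16-bit wrapper: widen the points into a stack buffer when they fit,
 * and only go to the heap for large arrays. */
int polyline16_sketch(const pt16_sketch *pts, int count)
{
    pt32_sketch stack_buf[STACK_PTS];
    pt32_sketch *buf = stack_buf;
    int ret;

    assert(pts != NULL && count >= 0);
    if (count > STACK_PTS) {
        buf = malloc((size_t)count * sizeof(*buf));
        if (!buf) return 0;
    }
    for (int i = 0; i < count; i++) {
        buf[i].x = pts[i].x;   /* sign-extending 16 -> 32 bit */
        buf[i].y = pts[i].y;
    }
    ret = polyline32_sketch(buf, count);
    if (buf != stack_buf) free(buf);
    return ret;
}
```

With this shape the common case of a short polyline never touches the
heap at all.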

>    11.06     7810000.0      706000.0   MoveTo
[snip]
>     8.78     6290000.0      716000.0   MoveToEx

I have slightly optimized it, as I said above.
MoveTo16 now calls MoveToEx (the 32-bit one)
directly instead of going through MoveToEx16.
This might make some difference if the gap
is this large in the benchmark.
