On Fri, 2002-01-18 at 02:59, Denis Oliver Kropp wrote:
>
> The problem with the i810 is the ioctl for each command, maybe we can gather
> commands in the buffer and flush it after a specified threshold and by calling
> EngineSync.
>
> The software is pretty fast on your machine because the video memory is the same
> physical memory as the system memory on your machine.
>

Hi

Thanks for the response. I have already done what you suggested. Before
that, FillTriangle was doing 50 MPixels/sec, which shows how much overhead
an ioctl per command adds. So I borrowed your DDA algorithm and modified it:
instead of issuing a FillRect per horizontal line, I accumulate the
instructions for every horizontal line and send them in a single burst. The
result is the current figure of 150-160 MPixels/sec, which is still too slow
compared to software (200 MPixels/sec).

I also made the DMA transfers asynchronous, so I added performance
monitoring similar to the other gfxdrivers. Here are the results for
FillRect, which requires only a few instructions per draw. I increased the
buffer size so its effect will be limited.

Benchmarking with 256x256 in 16bit mode... (16bit)
Fill Rectangles 3.02 secs ( 520.30 MPixel/sec)
(-) DirectFB/Media: PNG Provider Construct '/usr/local/share/directfb-examples/meter.png'
(-) DirectFB/Core: shutting down!
(-) DirectFB/I810: DMA Buffer Performance Monitoring:
(-) DirectFB/I810: results may not be valid for DMA size < 32K
(-) DirectFB/I810: 256 DMA buffer size in KB
(-) DirectFB/I810: 16 i810_wait_for_blit_idle calls
(-) DirectFB/I810: 48083 i810_wait_for_space calls
(-) DirectFB/I810: 1345856 BUFFER transfers (i810_wait_for_space sum)
(-) DirectFB/I810: 5 BUFFER wait cycles (depends on CPU)
(-) DirectFB/I810: 2705 IDLE wait cycles (depends on CPU)
(-) DirectFB/I810: 48077 BUFFER space cache hits(depends on CPU)
(-) DirectFB/I810: 0 BUFFER timeout sum (possible hardware crash)
(-) DirectFB/I810: 0 IDLE timeout sum (possible hardware crash)
(-) DirectFB/I810: Conclusion:
(-) DirectFB/I810: Average buffer transfers per i810_wait_for_space call: 27.99
(-) DirectFB/I810: Average wait cycles per i810_wait_for_space call: 0.00
(-) DirectFB/I810: Average wait cycles per i810_wait_for_blit_idle call: 169.06
(-) DirectFB/I810: Average buffer space cache hits: 99%

Not bad. I get 99% cache hits, and it does not spend too many CPU cycles
waiting.

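For readers following along, here is a rough sketch of the per-scanline
batching idea described above. It is not the actual i810 driver code:
`struct batch`, `batch_flush`, `batch_span`, `fill_flat_bottom`, the buffer
size, and the four-word span encoding are all hypothetical names and values
chosen for illustration, and the flush only counts submissions where a real
driver would hand the buffer to the hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_DWORDS 1024  /* hypothetical batch capacity, in 32-bit words */
#define SPAN_DWORDS  4     /* hypothetical size of one span command */

struct batch {
    uint32_t      cmd[BATCH_DWORDS];
    size_t        used;     /* words currently queued */
    unsigned long flushes;  /* submissions so far (one ioctl each in a real driver) */
};

/* Submit everything queued so far. The submission is only counted here;
 * a real driver would pass cmd[0..used) to the hardware at this point. */
static void batch_flush(struct batch *b)
{
    if (b->used == 0)
        return;
    b->flushes++;
    b->used = 0;
}

/* Queue one horizontal span, flushing first if the batch is full.
 * The x/y/w/color layout is a placeholder, not a real blitter format. */
static void batch_span(struct batch *b, int x, int y, int w, uint32_t color)
{
    if (w <= 0)
        return;
    if (b->used + SPAN_DWORDS > BATCH_DWORDS)
        batch_flush(b);
    b->cmd[b->used++] = (uint32_t)x;
    b->cmd[b->used++] = (uint32_t)y;
    b->cmd[b->used++] = (uint32_t)w;
    b->cmd[b->used++] = color;
}

/* Fill a flat-bottomed triangle with apex (ax, ay) and base from (bx0, by)
 * to (bx1, by), assuming bx0 <= bx1 and by > ay. Both edges are stepped
 * DDA-style, one span is queued per scanline, and the batch is submitted
 * once at the end instead of once per scanline. */
static void fill_flat_bottom(struct batch *b, int ax, int ay,
                             int bx0, int bx1, int by, uint32_t color)
{
    int    h = by - ay;
    double xl = ax, xr = ax;
    double dxl, dxr;
    int    y;

    if (h <= 0 || bx0 > bx1)
        return;
    dxl = (double)(bx0 - ax) / h;
    dxr = (double)(bx1 - ax) / h;
    for (y = ay; y <= by; y++) {
        int left = (int)xl, right = (int)xr;
        batch_span(b, left, y, right - left + 1, color);
        xl += dxl;
        xr += dxr;
    }
    batch_flush(b);
}
```

With these assumed sizes, a triangle covering 101 scanlines is submitted in
one flush rather than 101 separate ones, which is the saving the batching is
after. A general triangle would be split into flat-bottomed and flat-topped
halves first.
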
Now here are the results for the modified FillTriangle function.

Fill Triangles 3.02 secs ( 147.37 MPixel/sec)
(-) DirectFB/Media: PNG Provider Construct '/usr/local/share/directfb-examples/meter.png'
(-) DirectFB/Core: shutting down!
(-) DirectFB/I810: DMA Buffer Performance Monitoring:
(-) DirectFB/I810: results may not be valid for DMA size < 32K
(-) DirectFB/I810: 256 DMA buffer size in KB
(-) DirectFB/I810: 17 i810_wait_for_blit_idle calls
(-) DirectFB/I810: 40885 i810_wait_for_space calls
(-) DirectFB/I810: 42814696 BUFFER transfers (i810_wait_for_space sum)
(-) DirectFB/I810: 510734 BUFFER wait cycles (depends on CPU)
(-) DirectFB/I810: 20143 IDLE wait cycles (depends on CPU)
(-) DirectFB/I810: 34776 BUFFER space cache hits(depends on CPU)
(-) DirectFB/I810: 0 BUFFER timeout sum (possible hardware crash)
(-) DirectFB/I810: 0 IDLE timeout sum (possible hardware crash)
(-) DirectFB/I810: Conclusion:
(-) DirectFB/I810: Average buffer transfers per i810_wait_for_space call: 1047.20
(-) DirectFB/I810: Average wait cycles per i810_wait_for_space call: 12.49
(-) DirectFB/I810: Average wait cycles per i810_wait_for_blit_idle call: 1184.88
(-) DirectFB/I810: Average buffer space cache hits: 85%

There are just too many instructions to transfer, so DirectFB is waiting
most of the time. I can still optimize the driver further, but it would
require hacks. Basically, I have to say that pure software is faster than
pseudo-accelerated functions on the i810.

I'm no expert in graphics programming and I would appreciate any
suggestions to improve the driver.

Thanks.

Tony
