Currently: 1) Blorp color clears and resolves emit unconditional render target flush + command stream after every clear/resolve (including regular non-fast clears).
2) Blorp color clears, resolves and blits emit texture and constant cache resolves even in case only destination is dirty. This is because brw_render_cache_set_check_flush() does both render target flush as well as the top-of-pipe read cache flushes. 3) Similarly to item 2, 3D and compute paths also flush texture and constant caches even if none of the texture surfaces are dirty. 4) In case of multiple surfaces needing resolves, all render paths (blorp, 3D and compute) emit render target, texture and constant cache flushes after each resolve instead of just once after all resolves. This series addresses all four cases. Good news are that even though the current setup isn't optimal, it doesn't actually get any better in most cases performance wise. There is modest gain in OglDrvRes which does heavy blorp blitting. I'm expecting this series also to make blorp tex uploads and blorp mipmap generation more competitive. Bad news are in the final patch - it looks that current unconditional flushing/stalling has been hiding bugs elsewhere. There are cases which rely on the flushes after non-fast clears. Hunting the real cause is, however, difficult. I only saw them in CI system within full runs and was not able to reproduce them myself. As first steps the series introduces end-of-pipe synchronization. This is a flush combined with stall and post-sync operation of writing a double word (32 bits). Until now this wasn't really needed as there was in many cases double flushing which in turn looks to take long enough to hide the need for the sync. I also noticed that one needs to be rather careful with it - performance gets decreased noticeably when used unneeded. I don't really know if we want to go this way myself even. Current logic - while not ideal - is rather simple. Topi Pohjolainen (16): i965/miptree: Tell if anything got resolved i965/gen6+: Implement end-of-pipe sync i965: Hook end-of-pipe-sync after texture resolves i965: Hook end-of-pipe-sync after image resolves i965: Hook end-of-pipe-sync after framebuffer resolves i965: Consider layered rt resolves along with other i965: Add color resolve end-of-pipe-sync before switch to blit ring i965/dri2: Add end-of-pipe-sync after color resolves i965/miptree: Add color resolve end-of-pipe-sync before sharing i965: Add end-of-pipe sync before non-gpu read of color resolves i965/blorp: Do more fine grained flushing/syncing i965/blorp/blit: Refactor hiz/ccs prep for blits i965/blorp: Use conditional end-of-pipe-sync i965: Consider surface resolves and sync after blorp ops i965: Check if fast color clear state transition needs sync i965/blorp: Drop unnecessary flushes after clear/resolve src/mesa/drivers/dri/i965/brw_blorp.c | 187 ++++++++++---- src/mesa/drivers/dri/i965/brw_compute.c | 2 + src/mesa/drivers/dri/i965/brw_context.c | 333 +++++++++++++++++++------ src/mesa/drivers/dri/i965/brw_context.h | 3 + src/mesa/drivers/dri/i965/brw_draw.c | 36 +-- src/mesa/drivers/dri/i965/brw_pipe_control.c | 91 +++++++ src/mesa/drivers/dri/i965/genX_blorp_exec.c | 11 - src/mesa/drivers/dri/i965/intel_blit.c | 16 +- src/mesa/drivers/dri/i965/intel_copy_image.c | 10 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 25 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +- src/mesa/drivers/dri/i965/intel_pixel.c | 4 + src/mesa/drivers/dri/i965/intel_pixel_bitmap.c | 5 +- src/mesa/drivers/dri/i965/intel_pixel_read.c | 7 +- src/mesa/drivers/dri/i965/intel_tex_image.c | 10 +- src/mesa/drivers/dri/i965/intel_tex_subimage.c | 11 +- 16 files changed, 557 insertions(+), 196 deletions(-) -- 2.5.5 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev