Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.
On 18/04/2011 16:37, Benjamin BELLEC wrote:
> Hello,
>
> I just tested your branch; it breaks the 3 apps I tested:
> - glxgears: I do not see the gears. I can see them briefly by moving the
>   window to the border of my screen.
> - nexuiz-sdl: the menu is broken (not tested in-game).
> - etqw: the menu is broken too. In game, I get phantom frames. This is
>   hard to describe, and I have no screenshots to show you exactly. To sum
>   up, I have the impression that frames are mixed together.
>
> I will test your future patches (if any).
>
> Benjamin

All is now fixed.

Also, I get a 15.50% framerate improvement in nexuiz-sdl on my system (x86, RV770, 1680x1050, no HDR, no sound).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.
On Mon, Apr 18, 2011 at 11:00 AM, Dave Airlie airl...@gmail.com wrote:
> From: Dave Airlie airl...@redhat.com
>
> This is a first step to decreasing the CPU usage, by decreasing how much
> stuff we pass to the GPU and hence to the kernel CS checker.
>
> This adds a check to see if the values we need to write are actually
> dirty, and avoids writing them if they are not. However, certain
> registers need to always be written, so we add a new flag to say which
> ones should always be written if used.
>
> (Note this could probably be done more cleanly with a larger refactoring,
> since I think the CONST_BUFFER_SIZE_PS/VS and CONST_CACHE_PS/VS might be
> better off as a special state.)
>
> It also moves need_bo to be a flag on the register now.
>
> With this, a frame of gears goes from emitting 3k dwords to emitting 2k
> dwords, and I'm sure it could get a lot smaller.
>
> TODO:
> - Currently we flush if we have a BO; this could probably be improved.
> - Drop the special flush flag and move the buffer size ps/vs to a
>   special state.

I've pushed a v2 of this to the r600g-dirty branch in my repo, with another couple of patches on top. The v2 just fixes the evergreen paths. The other patches cause regressions, but they further decrease the number of dwords per frame, which should decrease time in the kernel parser. I just have to figure out the regressions.

Dave.
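For readers following the thread, the dirty check described in the patch description can be sketched roughly as below. This is a minimal illustration, not the code from the patch: `sketch_block`, `SKETCH_FLAG_ALWAYS_EMIT`, `sketch_block_update` and `sketch_block_emit` are hypothetical names, and the real driver builds PM4 packets rather than copying raw values. The idea is only that a block is marked dirty when a value actually changes, or when it carries an "always write" flag, and that clean blocks add no dwords to the command stream.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: the register block must be re-emitted on every update. */
#define SKETCH_FLAG_ALWAYS_EMIT (1u << 0)
#define SKETCH_MAX_REGS 8

/* Hypothetical, simplified stand-in for a register block. */
struct sketch_block {
    unsigned flags;                    /* SKETCH_FLAG_* bits */
    unsigned nregs;                    /* number of registers in the block */
    int      dirty;                    /* set when the block must be emitted */
    int      valid[SKETCH_MAX_REGS];   /* has a value been recorded yet? */
    uint32_t shadow[SKETCH_MAX_REGS];  /* last value recorded per register */
};

/* Record new values. Returns 1 if the block was marked dirty, else 0. */
int sketch_block_update(struct sketch_block *b, const uint32_t *values)
{
    int changed = (b->flags & SKETCH_FLAG_ALWAYS_EMIT) != 0;

    assert(b->nregs <= SKETCH_MAX_REGS);
    for (unsigned i = 0; i < b->nregs; i++) {
        if (!b->valid[i] || b->shadow[i] != values[i]) {
            b->shadow[i] = values[i];
            b->valid[i] = 1;
            changed = 1;
        }
    }
    if (changed)
        b->dirty = 1;
    return changed;
}

/* Copy the block into the command stream only if it is dirty.
 * Returns the number of dwords written. */
unsigned sketch_block_emit(struct sketch_block *b, uint32_t *cs, unsigned space)
{
    if (!b->dirty)
        return 0;
    assert(b->nregs <= space);
    for (unsigned i = 0; i < b->nregs; i++)
        cs[i] = b->shadow[i];
    b->dirty = 0;
    return b->nregs;
}
```

With this shape, setting the same state twice emits the block once, while a block flagged `SKETCH_FLAG_ALWAYS_EMIT` is emitted after every update, which is the trade-off the patch description makes for registers that must always be written.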
Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.
On 18/04/2011 10:20, Dave Airlie wrote:
> I've pushed a v2 of this to the r600g-dirty branch in my repo with
> another couple of patches on top the v2 just fixes the evergreen paths.
>
> The other patches cause regressions, but decrease further the amount of
> dwords per frame which should decrease time in the kernel parser, just
> have to figure out the regressions.
>
> Dave.

Hello,

I just tested your branch; it breaks the 3 apps I tested:
- glxgears: I do not see the gears. I can see them briefly by moving the window to the border of my screen.
- nexuiz-sdl: the menu is broken (not tested in-game).
- etqw: the menu is broken too. In game, I get phantom frames. This is hard to describe, and I have no screenshots to show you exactly. To sum up, I have the impression that frames are mixed together.

I will test your future patches (if any).

Benjamin
[Mesa-dev] [PATCH] r600g: track dirty registers better.
From: Dave Airlie airl...@redhat.com

This is a first step to decreasing the CPU usage, by decreasing how much
stuff we pass to the GPU and hence to the kernel CS checker.

This adds a check to see if the values we need to write are actually
dirty, and avoids writing them if they are not. However, certain
registers need to always be written, so we add a new flag to say which
ones should always be written if used.

(Note this could probably be done more cleanly with a larger refactoring,
since I think the CONST_BUFFER_SIZE_PS/VS and CONST_CACHE_PS/VS might be
better off as a special state.)

It also moves need_bo to be a flag on the register now.

With this, a frame of gears goes from emitting 3k dwords to emitting 2k
dwords, and I'm sure it could get a lot smaller.

TODO:
- Currently we flush if we have a BO; this could probably be improved.
- Drop the special flush flag and move the buffer size ps/vs to a
  special state.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/gallium/drivers/r600/r600.h                    |    1 +
 src/gallium/winsys/r600/drm/evergreen_hw_context.c |  133 ++
 src/gallium/winsys/r600/drm/r600_hw_context.c      |  145
 src/gallium/winsys/r600/drm/r600_priv.h            |   10 +-
 4 files changed, 168 insertions(+), 121 deletions(-)

diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
index 4256a7e..d605000 100644
--- a/src/gallium/drivers/r600/r600.h
+++ b/src/gallium/drivers/r600/r600.h
@@ -179,6 +179,7 @@ struct r600_block_reloc {
 struct r600_block {
     struct list_head    list;
     unsigned            status;
+    unsigned            flags;
     unsigned            start_offset;
     unsigned            pm4_ndwords;
     unsigned            pm4_flush_ndwords;
diff --git a/src/gallium/winsys/r600/drm/evergreen_hw_context.c b/src/gallium/winsys/r600/drm/evergreen_hw_context.c
index d914836..1c164fe 100644
--- a/src/gallium/winsys/r600/drm/evergreen_hw_context.c
+++ b/src/gallium/winsys/r600/drm/evergreen_hw_context.c
@@ -69,29 +69,29 @@ static const struct r600_reg evergreen_context_reg_list[] = {
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02800C_DB_RENDER_OVERRIDE, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028010_DB_RENDER_OVERRIDE2, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028014_DB_HTILE_DATA_BASE, 1, 0, 0},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028014_DB_HTILE_DATA_BASE, REG_FLAG_NEED_BO, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028028_DB_STENCIL_CLEAR, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02802C_DB_DEPTH_CLEAR, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028030_PA_SC_SCREEN_SCISSOR_TL, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028034_PA_SC_SCREEN_SCISSOR_BR, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028040_DB_Z_INFO, 1, 0, 0x},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028040_DB_Z_INFO, REG_FLAG_NEED_BO, 0, 0x},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028044_DB_STENCIL_INFO, 0, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028048_DB_Z_READ_BASE, 1, 0, 0},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028048_DB_Z_READ_BASE, REG_FLAG_NEED_BO, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02804C_DB_STENCIL_READ_BASE, 1, 0, 0},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02804C_DB_STENCIL_READ_BASE, REG_FLAG_NEED_BO, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028050_DB_Z_WRITE_BASE, 1, 0, 0},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028050_DB_Z_WRITE_BASE, REG_FLAG_NEED_BO, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028054_DB_STENCIL_WRITE_BASE, 1, 0, 0},
+    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028054_DB_STENCIL_WRITE_BASE, REG_FLAG_NEED_BO, 0, 0},
     {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET,