Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.

2011-04-19 Thread Benjamin BELLEC
On 18/04/2011 16:37, Benjamin BELLEC wrote:
 On 18/04/2011 10:20, Dave Airlie wrote:
 I've pushed a v2 of this to the r600g-dirty branch in my repo, with
 another couple of patches on top.

 The v2 just fixes the evergreen paths.

 The other patches cause regressions, but they further reduce the number
 of dwords per frame, which should reduce time in the kernel parser;
 I just have to figure out the regressions.

 Dave.
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
 Hello,
 
 I just tested your branch; it breaks all 3 apps I tried:
 
 - glxgears: I do not see the gears. I can see them briefly by moving the
 window to the edge of my screen.
 - nexuiz-sdl: the menu is broken (not tested in-game).
 - etqw: the menu is broken too. In-game, I get phantom frames. This is
 hard to describe, and I have no screenshots to show you exactly. To sum
 up, I have the impression that frames are being mixed together.
 
 I will test your future patches (if any).
 
 Benjamin

Everything is now fixed.
Also, I get a 15.50% framerate improvement in nexuiz-sdl on my
system (x86, RV770, 1680x1050, no HDR, no sound).


Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.

2011-04-18 Thread Dave Airlie
I've pushed a v2 of this to the r600g-dirty branch in my repo, with
another couple of patches on top.

The v2 just fixes the evergreen paths.

The other patches cause regressions, but they further reduce the number
of dwords per frame, which should reduce time in the kernel parser;
I just have to figure out the regressions.

Dave.


Re: [Mesa-dev] [PATCH] r600g: track dirty registers better.

2011-04-18 Thread Benjamin BELLEC
On 18/04/2011 10:20, Dave Airlie wrote:
 I've pushed a v2 of this to the r600g-dirty branch in my repo, with
 another couple of patches on top.
 
 The v2 just fixes the evergreen paths.
 
 The other patches cause regressions, but they further reduce the number
 of dwords per frame, which should reduce time in the kernel parser;
 I just have to figure out the regressions.
 
 Dave.

Hello,

I just tested your branch; it breaks all 3 apps I tried:

- glxgears: I do not see the gears. I can see them briefly by moving the
window to the edge of my screen.
- nexuiz-sdl: the menu is broken (not tested in-game).
- etqw: the menu is broken too. In-game, I get phantom frames. This is
hard to describe, and I have no screenshots to show you exactly. To sum
up, I have the impression that frames are being mixed together.

I will test your future patches (if any).

Benjamin


[Mesa-dev] [PATCH] r600g: track dirty registers better.

2011-04-17 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

This is a first step to decreasing the CPU usage, by decreasing how much
stuff we pass to the GPU and hence to the kernel CS checker.

This adds a check to see if the values we need to write are actually dirty,
and avoids writing them if they are not. However, certain registers need to
always be written, so we add a new flag to say which ones should always be
written if used. (Note this could probably be done more cleanly with a larger
refactoring, since I think the CONST_BUFFER_SIZE_PS/VS and CONST_CACHE_PS/VS
might be better off as a special state.)

It also moves need_bo to be a flag on the register now.

With this, a frame of gears goes from emitting 3k dwords to emitting 2k dwords,
and I'm sure it could get a lot smaller.

TODO:
Currently we flush if we have a BO; this could probably be improved.
Drop the special flush flag and move the buffer size ps/vs to a special state.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/gallium/drivers/r600/r600.h                    |    1 +
 src/gallium/winsys/r600/drm/evergreen_hw_context.c |  133 ++
 src/gallium/winsys/r600/drm/r600_hw_context.c      |  145 
 src/gallium/winsys/r600/drm/r600_priv.h            |   10 +-
 4 files changed, 168 insertions(+), 121 deletions(-)
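
For readers who want the idea without walking the full diff, here is a minimal standalone sketch in C of the dirty check described in the commit message above. It is not the code in this patch: the sketch_* names, the flag values, and the name of the always-write flag are assumptions made for illustration (the diff shown here only names REG_FLAG_NEED_BO), and BO relocation and flush handling are left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags for illustration; the real values are not shown in this mail. */
#define SKETCH_FLAG_NEED_BO      (1u << 0) /* register carries a buffer object */
#define SKETCH_FLAG_ALWAYS_WRITE (1u << 1) /* assumed name: always emit if used */

#define SKETCH_MAX_REGS 8

/* Simplified stand-in for a register block; not the real struct r600_block. */
struct sketch_block {
    unsigned flags;                 /* per-block flags, as added by the patch */
    unsigned nreg;                  /* number of registers in use (<= SKETCH_MAX_REGS) */
    unsigned dirty;                 /* nonzero when the block must be emitted */
    uint32_t reg[SKETCH_MAX_REGS];  /* last values written */
};

/* Record a register value. The block is marked dirty only when the value
 * actually changes, or when the block is flagged as always-write.
 * Returns 1 if the block was marked dirty by this call, 0 otherwise. */
static int sketch_set_reg(struct sketch_block *b, unsigned idx, uint32_t value)
{
    assert(idx < b->nreg);
    if (b->reg[idx] != value || (b->flags & SKETCH_FLAG_ALWAYS_WRITE)) {
        b->reg[idx] = value;
        b->dirty = 1;
        return 1;
    }
    return 0;
}

/* Copy a dirty block into the command stream and clear its dirty bit.
 * Returns the number of dwords emitted; a clean block emits nothing. */
static unsigned sketch_emit(struct sketch_block *b, uint32_t *cs, unsigned cs_space)
{
    unsigned n = 0;

    if (!b->dirty)
        return 0;
    assert(b->nreg <= cs_space);
    for (unsigned i = 0; i < b->nreg; i++)
        cs[n++] = b->reg[i];
    b->dirty = 0;
    return n;
}
```

Setting the same value twice on an unflagged block makes the second sketch_emit() return 0, which is where the dword savings come from; with SKETCH_FLAG_ALWAYS_WRITE set, the block is emitted again even though nothing changed.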

diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
index 4256a7e..d605000 100644
--- a/src/gallium/drivers/r600/r600.h
+++ b/src/gallium/drivers/r600/r600.h
@@ -179,6 +179,7 @@ struct r600_block_reloc {
 struct r600_block {
    struct list_head        list;
    unsigned                status;
+   unsigned                flags;
    unsigned                start_offset;
    unsigned                pm4_ndwords;
    unsigned                pm4_flush_ndwords;
diff --git a/src/gallium/winsys/r600/drm/evergreen_hw_context.c b/src/gallium/winsys/r600/drm/evergreen_hw_context.c
index d914836..1c164fe 100644
--- a/src/gallium/winsys/r600/drm/evergreen_hw_context.c
+++ b/src/gallium/winsys/r600/drm/evergreen_hw_context.c
@@ -69,29 +69,29 @@ static const struct r600_reg evergreen_context_reg_list[] = {
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02800C_DB_RENDER_OVERRIDE, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028010_DB_RENDER_OVERRIDE2, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028014_DB_HTILE_DATA_BASE, 1, 0, 0},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028014_DB_HTILE_DATA_BASE, REG_FLAG_NEED_BO, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028028_DB_STENCIL_CLEAR, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02802C_DB_DEPTH_CLEAR, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028030_PA_SC_SCREEN_SCISSOR_TL, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028034_PA_SC_SCREEN_SCISSOR_BR, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028040_DB_Z_INFO, 1, 0, 0x},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028040_DB_Z_INFO, REG_FLAG_NEED_BO, 0, 0x},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028044_DB_STENCIL_INFO, 0, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028048_DB_Z_READ_BASE, 1, 0, 0},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028048_DB_Z_READ_BASE, REG_FLAG_NEED_BO, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02804C_DB_STENCIL_READ_BASE, 1, 0, 0},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_02804C_DB_STENCIL_READ_BASE, REG_FLAG_NEED_BO, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028050_DB_Z_WRITE_BASE, 1, 0, 0},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028050_DB_Z_WRITE_BASE, REG_FLAG_NEED_BO, 0, 0},
    {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, GROUP_FORCE_NEW_BLOCK, 0, 0, 0},
-   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028054_DB_STENCIL_WRITE_BASE, 1, 0, 0},
+   {PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET, R_028054_DB_STENCIL_WRITE_BASE, REG_FLAG_NEED_BO, 0, 0},
{PKT3_SET_CONTEXT_REG, EVERGREEN_CONTEXT_REG_OFFSET,