[Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread j . glisse

From: Jerome Glisse 

To avoid GPU lockups, registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so that the order in
which atoms are emitted relative to each other is always the same. We
don't have any information on what the correct order is, so the order
will need to be inferred from the fglrx command stream.

v2: add comment warning that atom order should not be taken lightly

Signed-off-by: Jerome Glisse 
---
 src/gallium/drivers/r600/evergreen_compute.c |  2 +-
 src/gallium/drivers/r600/evergreen_state.c   | 63 +++-
 src/gallium/drivers/r600/r600_hw_context.c   | 10 +++--
 src/gallium/drivers/r600/r600_pipe.c |  1 -
 src/gallium/drivers/r600/r600_pipe.h | 33 ---
 src/gallium/drivers/r600/r600_state.c| 53 ---
 src/gallium/drivers/r600/r600_state_common.c | 36 +---
 7 files changed, 116 insertions(+), 82 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index acf91ba..3533312 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -583,7 +583,7 @@ void evergreen_init_atom_start_compute_cs(struct r600_context *ctx)
/* since all required registers are initialised in the
 * start_compute_cs_cmd atom, we can EMIT_EARLY here.
 */
-   r600_init_command_buffer(cb, 256, EMIT_EARLY);
+   r600_init_command_buffer(ctx, cb, 1, 256);
cb->pkt_flags = RADEON_CP_PACKET3_COMPUTE_MODE;
 
switch (ctx->family) {
diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index bda8ed5..a8ec745 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2161,27 +2161,50 @@ static void cayman_emit_sample_mask(struct r600_context *rctx, struct r600_atom
 
 void evergreen_init_state_functions(struct r600_context *rctx)
 {
-   r600_init_atom(&rctx->cb_misc_state.atom, evergreen_emit_cb_misc_state, 0, 0);
-   r600_atom_dirty(rctx, &rctx->cb_misc_state.atom);
-   r600_init_atom(&rctx->db_misc_state.atom, evergreen_emit_db_misc_state, 7, 0);
-   r600_atom_dirty(rctx, &rctx->db_misc_state.atom);
-   r600_init_atom(&rctx->vertex_buffer_state.atom, evergreen_fs_emit_vertex_buffers, 0, 0);
-   r600_init_atom(&rctx->cs_vertex_buffer_state.atom, evergreen_cs_emit_vertex_buffers, 0, 0);
-   r600_init_atom(&rctx->vs_constbuf_state.atom, evergreen_emit_vs_constant_buffers, 0, 0);
-   r600_init_atom(&rctx->ps_constbuf_state.atom, evergreen_emit_ps_constant_buffers, 0, 0);
-   r600_init_atom(&rctx->vs_samplers.views.atom, evergreen_emit_vs_sampler_views, 0, 0);
-   r600_init_atom(&rctx->ps_samplers.views.atom, evergreen_emit_ps_sampler_views, 0, 0);
-   r600_init_atom(&rctx->cs_shader_state.atom, evergreen_emit_cs_shader, 0, 0);
-   r600_init_atom(&rctx->vs_samplers.atom_sampler, evergreen_emit_vs_sampler, 0, 0);
-   r600_init_atom(&rctx->ps_samplers.atom_sampler, evergreen_emit_ps_sampler, 0, 0);
-
-   if (rctx->chip_class == EVERGREEN)
-       r600_init_atom(&rctx->sample_mask.atom, evergreen_emit_sample_mask, 3, 0);
-   else
-       r600_init_atom(&rctx->sample_mask.atom, cayman_emit_sample_mask, 4, 0);
+   unsigned id = 4;
+
+   /* !!!
+    * To avoid GPU lockups, registers must be emitted in a specific order
+    * (no kidding ...). The order below is important and has been
+    * partially inferred from analyzing the fglrx command stream.
+    *
+    * Don't reorder atoms without carefully checking the effect (GPU lockup
+    * or piglit regression).
+    * !!!
+    */
+
+   /* shader const */
+   r600_init_atom(rctx, &rctx->vs_constbuf_state.atom, id++, evergreen_emit_vs_constant_buffers, 0);
+   r600_init_atom(rctx, &rctx->ps_constbuf_state.atom, id++, evergreen_emit_ps_constant_buffers, 0);
+   /* shader program */
+   r600_init_atom(rctx, &rctx->cs_shader_state.atom, id++, evergreen_emit_cs_shader, 0);
+   /* sampler */
+   r600_init_atom(rctx, &rctx->vs_samplers.atom_sampler, id++, evergreen_emit_vs_sampler, 0);
+   r600_init_atom(rctx, &rctx->ps_samplers.atom_sampler, id++, evergreen_emit_ps_sampler, 0);
+   /* resources */
+   r600_init_atom(rctx, &rctx->vertex_buffer_state.atom, id++, evergreen_fs_emit_vertex_buffers, 0);
+   r600_init_atom(rctx, &rctx->cs_vertex_buffer_state.atom, id++, evergreen_cs_emit_vertex_buffers, 0);
+   r600_init_atom(rctx, &rctx->vs_samplers.views.atom, id++, evergreen_emit_vs_sampler_views, 0);
+   r600_init_atom(rctx, &rctx->ps_samplers.views.atom, id++, evergreen_emit_ps_sampler_views, 0);
+
+   if (rctx->chip_class == EVERGREEN) {
+       r600_init_atom(rctx, &rctx->sample_mask.atom, id++, evergreen_emit_sample_mask, 3);
+   }

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Marek Olšák

This looks good to me. It's funny to see the r300g architecture being
re-implemented in r600g. :)

There's one optimization that r300g has that this patch doesn't. r300g
keeps the index of the first and the last dirty atom and the loops
over the list of atoms look like this:
    for (i = first_dirty; i <= last_dirty; i++)

And after emission:
    first_dirty = some large number;
    last_dirty = 0;

The atoms should be ordered according to how frequently they are
updated (except when the ordering is required by the hw). But most
importantly, if there are no state changes, the loops are trivially
skipped.
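
A minimal sketch of the first/last dirty index idea described above. It is
not taken from r300g or r600g: the type and function names (`sketch_ctx`,
`sketch_mark_dirty`, `sketch_emit_dirty`), the 32-slot array and the
`emitted` log are all hypothetical, chosen only to make the loop bounds
visible.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_MAX_ATOMS 32 /* assumed upper bound, for illustration only */

struct sketch_ctx {
    bool dirty[SKETCH_MAX_ATOMS];
    unsigned first_dirty;               /* lowest dirty index, UINT_MAX when clean */
    unsigned last_dirty;                /* highest dirty index, 0 when clean */
    unsigned emitted[SKETCH_MAX_ATOMS]; /* ids emitted by the last call */
    unsigned num_emitted;
};

static void sketch_init(struct sketch_ctx *ctx)
{
    for (unsigned i = 0; i < SKETCH_MAX_ATOMS; i++)
        ctx->dirty[i] = false;
    ctx->first_dirty = UINT_MAX; /* "some large number" */
    ctx->last_dirty = 0;
    ctx->num_emitted = 0;
}

static void sketch_mark_dirty(struct sketch_ctx *ctx, unsigned id)
{
    assert(id < SKETCH_MAX_ATOMS);
    ctx->dirty[id] = true;
    if (id < ctx->first_dirty)
        ctx->first_dirty = id;
    if (id > ctx->last_dirty)
        ctx->last_dirty = id;
}

/* Emits dirty atoms in index order and returns how many slots were visited. */
static unsigned sketch_emit_dirty(struct sketch_ctx *ctx)
{
    unsigned visited = 0;

    ctx->num_emitted = 0;
    /* With no dirty atoms, first_dirty > last_dirty and the loop is skipped. */
    for (unsigned i = ctx->first_dirty; i <= ctx->last_dirty; i++) {
        visited++;
        if (ctx->dirty[i]) {
            ctx->emitted[ctx->num_emitted++] = i; /* stand-in for the real emit */
            ctx->dirty[i] = false;
        }
    }
    /* Reset the range so the next draw without state changes does no work. */
    ctx->first_dirty = UINT_MAX;
    ctx->last_dirty = 0;
    return visited;
}
```

In this sketch, marking atoms 9 and 5 dirty makes the loop visit slots 5
through 9 only, and a second call without new state changes visits nothing.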

Marek


Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Jerome Glisse

On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák  wrote:
> This looks good to me. It's funny to see the r300g architecture being
> re-implemented in r600g. :)
>
> There's one optimization that r300g has that this patch doesn't. r300g
> keeps the index of the first and the last dirty atom and the loops
> over the list of atoms look like this:
> for (i = first_dirty; i <= last_dirty; i++)
>
> And after emission:
> first_dirty = some large number;
> last_dirty= 0;
>
> The atoms should be ordered according to how frequently they are
> updated (except when the ordering is required by the hw). But most
> importantly, if there are no state changes, the loops are trivially
> skipped.
>
> Marek

I don't think this optimization is worth it. There won't be many more
than 32 atoms in the end, and they definitely can't be ordered from most
frequent to least frequent, because some of the state needs to be emitted
last and is updated frequently (the primitive type, for instance).

Cheers,
Jerome
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Marek Olšák

On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse  wrote:
> I don't think this optimization is worth it. There won't be many more
> than 32 atoms in the end, and they definitely can't be ordered from most
> frequent to least frequent, because some of the state needs to be emitted
> last and is updated frequently (the primitive type, for instance).

I didn't say all atoms *must* be sorted. I meant that some (most?)
atoms can be sorted, i.e. you can have some atoms at fixed positions
(like the primitive type or the seamless cubemap state), but you
always have at least *some* freedom where you put the rest. The ordering I
had in mind was actually from the least frequent to the most frequent,
in other words, from the framebuffer (least frequent) to shaders to
textures to constant buffers to vertex buffers (most frequent).

Of course, the code should document which atoms must have fixed
positions along with an explanation. The comment that all atom
positions must not be changed isn't enough, because it's not true.
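
One way such documentation could live next to the code is a table that
records, per atom, whether its position is fixed and why. The sketch below
is hypothetical: the struct, the names and the placement of the seamless
cubemap entry are assumptions for illustration, and the reason strings are
placeholders rather than known hardware facts. Only the free-atom order
(framebuffer to vertex buffers) and the primitive type being emitted last
come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_atom_slot {
    const char *name;
    bool fixed;         /* true if the position must not change */
    const char *reason; /* why it is fixed; NULL for freely ordered atoms */
};

/* Free atoms go from least to most frequently updated; fixed atoms carry
 * a reason. The reasons here are placeholders to be filled in. */
static const struct sketch_atom_slot sketch_order[] = {
    { "framebuffer",      false, NULL },
    { "shaders",          false, NULL },
    { "textures",         false, NULL },
    { "constant_buffers", false, NULL },
    { "vertex_buffers",   false, NULL },
    { "seamless_cubemap", true,  "placeholder: position assumed, reason to be documented" },
    { "primitive_type",   true,  "placeholder: said in the thread to be emitted last" },
};

#define SKETCH_ORDER_LEN (sizeof(sketch_order) / sizeof(sketch_order[0]))

/* Returns true only if every fixed-position atom has a documented reason. */
static bool sketch_order_is_documented(const struct sketch_atom_slot *slots,
                                       size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (slots[i].fixed && slots[i].reason == NULL)
            return false;
    }
    return true;
}
```

A check like `sketch_order_is_documented()` could run in a debug build so
that a fixed position without an explanation is caught early.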

Marek

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Jerome Glisse

On Thu, Sep 6, 2012 at 4:10 PM, Marek Olšák  wrote:
> I didn't say all atoms *must* be sorted. I meant that some (most?)
> atoms can be sorted, i.e. you can have some atoms at fixed positions
> (like the primitive type or the seamless cubemap state), but you
> always have at least *some* freedom where you put the rest. The ordering I
> had in mind was actually from the least frequent to the most frequent,
> in other words, from the framebuffer (least frequent) to shaders to
> textures to constant buffers to vertex buffers (most frequent).
>
> Of course, the code should document which atoms must have fixed
> positions along with an explanation. The comment that all atom
> positions must not be changed isn't enough, because it's not true.
>
> Marek

I won't try to find which atoms can have a completely floating position. I
am just grouping together registers that are always emitted together in
fglrx, and then I position these groups relative to each other according
to their position in fglrx. That means all atoms are always emitted in a
specific order, so there won't be any freedom. The only freedom I can
think of is between two position-forced atoms, and that makes the sorting
completely useless and complicated.
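
A minimal sketch of the fixed-order scheme described above, loosely modeled
on the `id++` registration in the patch. It is not the r600g code: the
types, the function names and the 16-slot table are hypothetical, and the
"emit" step only records the atom name instead of writing registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_ATOMS 16 /* assumed table size, for illustration only */

struct sketch_atom {
    const char *name;
    unsigned id; /* position in the fixed emission order */
    bool dirty;
};

struct sketch_ctx {
    struct sketch_atom *atoms[SKETCH_NUM_ATOMS]; /* indexed by atom id */
};

/* Registers an atom at a fixed slot; each slot may be used only once. */
static void sketch_init_atom(struct sketch_ctx *ctx, struct sketch_atom *atom,
                             unsigned id, const char *name)
{
    assert(id < SKETCH_NUM_ATOMS);
    assert(ctx->atoms[id] == NULL);
    atom->name = name;
    atom->id = id;
    atom->dirty = false;
    ctx->atoms[id] = atom;
}

/* Emits dirty atoms in id order, independent of the order they were dirtied.
 * Writes the emitted names to out and returns how many were emitted. */
static unsigned sketch_emit_all(struct sketch_ctx *ctx, const char **out,
                                unsigned max_out)
{
    unsigned n = 0;

    for (unsigned i = 0; i < SKETCH_NUM_ATOMS; i++) {
        struct sketch_atom *atom = ctx->atoms[i];

        if (atom == NULL || !atom->dirty)
            continue;
        assert(n < max_out);
        out[n++] = atom->name; /* stand-in for the real register writes */
        atom->dirty = false;
    }
    return n;
}
```

Because emission walks the table by id, dirtying the sampler atom before the
constant buffer atom still emits the constant buffer first in this sketch.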

Cheers,
Jerome

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Marek Olšák

On Fri, Sep 7, 2012 at 12:05 AM, Jerome Glisse  wrote:
> I won't try to find which atoms can have a completely floating position. I
> am just grouping together registers that are always emitted together in
> fglrx, and then I position these groups relative to each other according
> to their position in fglrx. That means all atoms are always emitted in a
> specific order, so there won't be any freedom. The only freedom I can
> think of is between two position-forced atoms, and that makes the sorting
> completely useless and complicated.

I'll add the optimization anyway (without sorting). Draw operations
without state changes or with only one state update are quite common.

Anyway, it was said in the v1 thread that the hardware doesn't need
any specific ordering for proper functioning. While it may be
beneficial to emit one or two registers earlier than the others,
insisting on a fixed ordering of all of them is not only limiting, it
seems useless and a waste of time as well. What I don't understand: Why
do you blindly copy everything fglrx *seems* to be doing without any
real reason? It does not fix any bug, it does not improve performance,
it does not clean up the code... so why? I am all ears.

Marek

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Dave Airlie

On Fri, Sep 7, 2012 at 10:03 AM, Marek Olšák  wrote:
> I'll add the optimization anyway (without sorting). Draw operations
> without state changes or with only one state update are quite common.
>
> Anyway, it was said in the v1 thread that the hardware doesn't need
> any specific ordering for proper functioning. While it may be
> beneficial to emit one or two registers earlier than the others,
> insisting on a fixed ordering of all of them is not only limiting, it
> seems useless and a waste of time as well. What I don't understand: Why
> do you blindly copy everything fglrx *seems* to be doing without any
> real reason? It does not fix any bug, it does not improve performance,
> it does not clean up the code... so why? I am all ears.

At the very least, please document a list of lockups this avoids. Less
magic more text.

Dave.

Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Jerome Glisse

On Thu, Sep 6, 2012 at 8:32 PM, Dave Airlie  wrote:
> At the very least, please document a list of lockups this avoids. Less
> magic more text.
>
> Dave.

I am doing all this for HyperZ. So if it only fixes HyperZ, that makes me
happy enough.

Cheers,
Jerome
