"Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes:

> On Mon, Mar 09, 2015 at 12:43:08PM +0200, Francisco Jerez wrote:
>> "Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes:
>> 
>> > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote:
>> >> Topi Pohjolainen <topi.pohjolai...@intel.com> writes:
>> >> 
>> >> > The original patch from Curro was based on something that is not
>> >> > present in the master yet. This patch tries to mimick the logic on
>> >> > top master.
>> >> > I wanted to see if could separate the building of the descriptor
>> >> > instruction from building of the send instruction. This logic now
>> >> > allows the caller to construct any kind of sequence of instructions
>> >> > filling in the descriptor before giving it to the send instruction
>> >> > builder.
>> >> >
>> >> > This is only compile tested. Curro, how would you feel about this
>> >> > sort of approach? I apologise for muddying the waters again but I
>> >> > wasn't entirely comfortable with the logic and wanted to try to
>> >> > something else.
>> >> >
>> >> > I believe patch number five should go nicely on top of this as
>> >> > the descriptor instruction could be followed by (or preceeeded by)
>> >> > any additional instructions modifying the descriptor register
>> >> > before the actual send instruction.
>> >> >
>> >> 
>> >> Topi, the goal I had in mind with PATCH 01 was to refactor a commonly
>> >> recurring pattern.  In terms of the functions defined in this patch my
>> >> example from yesterday [1] would now look like:
>> >> 
>> >> |   if (index.file == BRW_IMMEDIATE_VALUE) {
>> >> |
>> >> |      uint32_t surf_index = index.dw1.ud;
>> >> |
>> >> |      brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
>> >> |      brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
>> >> |      brw_set_src0(p, send, offset);
>> >> |      brw_set_sampler_message(p, send,
>> >> |                              surf_index,
>> >> |                              0, /* LD message ignores sampler unit */
>> >> |                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >> |                              rlen,
>> >> |                              mlen,
>> >> |                              false, /* no header */
>> >> |                              simd_mode,
>> >> |                              0);
>> >> |
>> >> |      brw_mark_surface_used(prog_data, surf_index);
>> >> |
>> >> |   } else {
>> >> |
>> >> |      struct brw_reg addr = vec1(retype(brw_address_reg(0), 
>> >> BRW_REGISTER_TYPE_UD));
>> >> |
>> >> |      brw_push_insn_state(p);
>> >> |      brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>> >> |      brw_set_default_access_mode(p, BRW_ALIGN_1);
>> >> |
>> >> |      /* a0.0 = surf_index & 0xff */
>> >> |      brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
>> >> |      brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
>> >> |      brw_set_dest(p, insn_and, addr);
>> >> |      brw_set_src0(p, insn_and, vec1(retype(index, 
>> >> BRW_REGISTER_TYPE_UD)));
>> >> |      brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
>> >> |
>> >> |
>> >> |      /* a0.0 |= <descriptor> */
>> >> |      brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, 
>> >> addr);
>> >> |      brw_set_sampler_message(p, descr_inst,
>> >> |                              0 /* surface */,
>> >> |                              0 /* sampler */,
>> >> |                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >> |                              rlen /* rlen */,
>> >> |                              mlen /* mlen */,
>> >> |                              false /* header */,
>> >> |                              simd_mode,
>> >> |                              0);
>> >> |
>> >> |      /* dst = send(offset, a0.0) */
>> >> |      brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, addr);
>> >> |
>> >> |      brw_pop_insn_state(p);
>> >> |
>> >> |      /* visitor knows more than we do about the surface limit required,
>> >> |       * so has already done marking.
>> >> |       */
>> >> |   }
>> >
>> > Which I think could also be written as follows. Or am I missing something
>> > again?
>> >
>> > static brw_inst *
>> > brw_build_surface_index_descr(struct brw_compile *p,
>> >                               struct brw_reg dst, index)
>> > {
>> >    brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>> >    brw_set_default_access_mode(p, BRW_ALIGN_1);
>> >
>> >    /* a0.0 = surf_index & 0xff */
>> >    brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
>> >    brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
>> >    brw_set_dest(p, insn_and, addr);
>> >    brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
>> >    brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
>> >
>> >    /* a0.0 |= <descriptor> */
>> >    brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr);
>> > }
>> >
>> > ...
>> >    brw_inst *descr_inst;
>> >    if (index.file == BRW_IMMEDIATE_VALUE) {
>> >       descr = brw_next_insn(p, BRW_OPCODE_SEND);
>> >       brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
>> >       brw_set_src0(p, send, offset);
>> >
>> >       brw_mark_surface_used(prog_data, surf_index);
>> >    } else {
>> >       struct brw_reg addr = vec1(retype(brw_address_reg(0),
>> >                                  BRW_REGISTER_TYPE_UD));
>> >       brw_push_insn_state(p);
>> >
>> >       brw_build_surface_index_descr(p, addr, index);
>> >       /* dst = send(offset, a0.0) */
>> >       descr_inst = brw_send_indirect_message(p, BRW_SFID_SAMPLER,
>> >                                              dst, offset, addr);
>> >       brw_pop_insn_state(p);
>> >    }
>> >
>> >    uint32_t surf_index = index.file == BRW_IMMEDIATE_VALUE ? index.dw1.ud 
>> > : 0;
>> >    brw_set_sampler_message(p, descr_inst,
>> >                            surf_index,
>> >                            0, /* LD message ignores sampler unit */
>> >                            GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >                            rlen,
>> >                            mlen,
>> >                            false, /* no header */
>> >                            simd_mode,
>> >                            0);
>> > ...
>> 
>> Yeah, something like that.  In any case you need logic to handle both
>> the immediate and indirect cases explicitly by every caller, which seems
>> like plenty of duplication to me, not sure what you are trying to
>> achieve with this.
>
> Well, to me it seems that cases where address is an immediate and 
> non-immediate already have quite fixed paths. I'm uneasy about changing the
> final kick-off of the message sending to have two alternatives for these
> - in each path only one final branch is really possible but this isn't that
> obvious to the reader anymore as one needs to go and analyze the callers:
>

No, that's not right, *all* users of brw_send_indirect_message() or
brw_send_indirect_surface_message() will eventually have to handle
*both* immediate and indirect descriptor cases, and the necessary logic
is identical for *all* of them.

> Patch 1:
>   - introduces brw_send_indirect_message() considering if address is
>     immediate or not.
>   - called by generate_tex() against address that is never an immediate

This is only temporary, I have a (not yet sent for review) patch that
gets rid of explicit handling of immediate vs. indirect descriptors in
the texturing and pull constant code completely.

>   - called by generate_uniform_pull_constant_load_gen7() in a branch which
>     explicitly uses an immediate and another which explicitly uses
>     non-immediate
>
I don't think that any immediate-only branch of
generate_uniform_pull_constant_load_gen7() is calling
brw_send_indirect_message() at this point.

> Patch 5:
>   - introduces brw_send_indirect_surface_message() considering if address is
>     immediate or not.
>   - called by brw_untyped_atomic() and brw_untyped_surface_read() which
>     both are only called against immediates
>
> Patch 11:
>   - introduces brw_untyped_surface_write() which is only called against
>     immediate
>
> Patch 12:
>   - introduces brw_typed_atomic() which is only called against immediate
>   - introduces brw_typed_surface_read() which is only called against immediate
>   - introduces brw_typed_surface_write() which is only called against 
> immediate
>
Typed and untyped surface opcodes are *all* called with both immediate
and indirect surfaces.

>
>
> Then, in general I was hoping to have design pattern of this sort. Caller:
>
>    create_descriptor();
>    add_property_1_to_descriptor();
>    add_property_2_to_descriptor();
>    ...
>    send()
>

I guess I find this acceptable usually, but sticking to this model in
this case prevents abstraction, because the shared logic is only in the
first and last statements of your example.

>
> Instead of:
>
>   create_descriptor_and_send()
>   {
>      create();
>      send();
>   }
>
>   create_descriptor_with_1_and_send()
>   {
>      create_descriptor_and_send();
>      add_property_1();
>   }
>

If create_descriptor_with_1_and_send() is pseudocode for
brw_send_indirect_surface_message(), the only reason why it's there is
to emit the AND instruction that masks out invalid bits from the surface
index, which could otherwise lead to hangs.  I'm not planning on adding
more nesting levels.

>   create_descriptor_with_1_and_2_and_send()
>   {
>      create_descriptor_with_1_and_send()
>      add_property_2();
>   }
>
>   create_descriptor_with_1_and_2_and_3_and_send()
>   {
>      create_descriptor_with_2_and_send()
>      add_property_3();
>   }

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to