Chad Versace <chad.vers...@linux.intel.com> writes:

> +void
> +vec4_visitor::emit_unpack_half_2x16(dst_reg dst, src_reg src0)
> +{
> +   if (intel->gen < 7)
> +      assert(!"ir_unop_unpack_half_2x16 should be lowered");
> +
> +   assert(dst.type == BRW_REGISTER_TYPE_F);
> +   assert(src0.type == BRW_REGISTER_TYPE_UD);
> +
> +   /* From the Ivybridge PRM, Vol4, Part3, Section 6.26 f32to16:
> +    *
> +    *   Because this instruction does not have a 16-bit floating-point type,
> +    *   the source data type must be Word (W). The destination type must be
> +    *   F (Float).
> +    *
> +    * To use W as the source data type, we must adjust horizontal strides,
> +    * which is only possible in align1 mode. All my [chadv] attempts at
> +    * emitting align1 instructions for unpackHalf2x16 failed to pass the
> +    * Piglit tests, so I gave up.
> +    *
> +    * I've verified that, on gen7, it is safe to emit f16to32 in align16 mode
> +    * with UD as source data type.
> +    */

Have you tested this on something like:

in uvec4 v;
vec2 result = unpackHalf2x16(v.w);

Those kinds of "the type must be X and the stride must be Y"
restrictions have sometimes meant that it's just hardcoded and they
don't look at what you program, so I'm concerned that some of your
regioning (swizzle/abs/neg/uniformness) will just get thrown out by the
hardware. But if it's passing on your tests with uniforms, it's
probably OK.

> +   dst_reg tmp_dst(this, glsl_type::uvec2_type);
> +   src_reg tmp_src(tmp_dst);
> +
> +   /* tmp.x = src0 & 0xffffu; */
> +   tmp_dst.writemask = WRITEMASK_X;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_AND,
> +                                      tmp_dst, src0, src_reg(0xffffu)));

These ought to use the helper functions for simplicity:

"emit(AND(tmp_dst, src0, src_reg(0xffffu)));"

Check out the ALU1 macro for how to set up one of those to have a
similar helper for F16TO32 if you want to match up the style.

> +
> +   /* tmp.y = src0 >> 16u; */
> +   tmp_dst.writemask = WRITEMASK_Y;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_SHR,
> +                                      tmp_dst, src0, src_reg(16u)));
> +
> +   /* dst.xy = f16to32(tmp); */
> +   dst.writemask = WRITEMASK_XY;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_F16TO32,
> +                                      dst, tmp_src));
> +}
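
For checking results from a test like the one above, here is a minimal CPU-side sketch of what the three emitted instructions (AND, SHR, F16TO32) are meant to compute. It assumes F16TO32 follows plain IEEE 754 binary16 conversion, including subnormals. The names half_to_float, unpack_half_2x16 and Vec2 are hypothetical and are not Mesa functions.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: convert one IEEE 754 binary16 bit pattern to float.
float half_to_float(uint16_t h)
{
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;

    if (exp == 0x1fu) {
        // Inf or NaN: all-ones exponent, mantissa bits carried over.
        bits = sign | 0x7f800000u | (mant << 13);
    } else if (exp != 0) {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant != 0) {
        // Subnormal half: shift until the implicit leading one appears.
        exp = 113u; // 127 - 15 + 1
        while (!(mant & 0x400u)) {
            mant <<= 1;
            --exp;
        }
        bits = sign | (exp << 23) | ((mant & 0x3ffu) << 13);
    } else {
        // Signed zero.
        bits = sign;
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

struct Vec2 {
    float x, y;
};

// Mirrors the patch: tmp.x = src0 & 0xffffu; tmp.y = src0 >> 16u;
// dst.xy = f16to32(tmp).
Vec2 unpack_half_2x16(uint32_t src0)
{
    const uint16_t tmp_x = static_cast<uint16_t>(src0 & 0xffffu);
    const uint16_t tmp_y = static_cast<uint16_t>(src0 >> 16);
    return Vec2{half_to_float(tmp_x), half_to_float(tmp_y)};
}
```

Feeding the same value into the shader through v.w and into unpack_half_2x16 on the CPU gives a reference to compare against, which is one way to notice if the hardware ignores the swizzle.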
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev