On 23/04/16 00:17, Jason Ekstrand wrote:
> On Fri, Apr 22, 2016 at 3:13 PM, Jason Ekstrand <ja...@jlekstrand.net>
> wrote:
>
>>
>>
>> On Tue, Apr 12, 2016 at 1:05 AM, Samuel Iglesias Gonsálvez <
>> sigles...@igalia.com> wrote:
>>
>>> From: Iago Toral Quiroga <ito...@igalia.com>
>>>
>>> At least i965 hardware does not have native support for floor on doubles.
>>> ---
>>>  src/compiler/nir/nir.h                  |  1 +
>>>  src/compiler/nir/nir_lower_double_ops.c | 29
>>> +++++++++++++++++++++++++++++
>>>  2 files changed, 30 insertions(+)
>>>
>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>> index f83b2e0..b7231a7 100644
>>> --- a/src/compiler/nir/nir.h
>>> +++ b/src/compiler/nir/nir.h
>>> @@ -2287,6 +2287,7 @@ typedef enum {
>>>     nir_lower_dsqrt = (1 << 1),
>>>     nir_lower_drsq = (1 << 2),
>>>     nir_lower_dtrunc = (1 << 3),
>>> +   nir_lower_dfloor = (1 << 4),
>>>  } nir_lower_doubles_options;
>>>
>>>  void nir_lower_doubles(nir_shader *shader, nir_lower_doubles_options
>>> options);
>>> diff --git a/src/compiler/nir/nir_lower_double_ops.c
>>> b/src/compiler/nir/nir_lower_double_ops.c
>>> index 9eec858..e1ec6da 100644
>>> --- a/src/compiler/nir/nir_lower_double_ops.c
>>> +++ b/src/compiler/nir/nir_lower_double_ops.c
>>> @@ -377,6 +377,27 @@ lower_trunc(nir_builder *b, nir_ssa_def *src)
>>>     return nir_pack_double_2x32_split(b, new_src_lo, new_src_hi);
>>>  }
>>>
>>> +static nir_ssa_def *
>>> +lower_floor(nir_builder *b, nir_ssa_def *src)
>>> +{
>>> +   /*
>>> +    * For x >= 0, floor(x) = trunc(x)
>>> +    * For x < 0,
>>> +    *    - if x is integer, floor(x) = x
>>> +    *    - otherwise, floor(x) = trunc(x) - 1
>>> +    */
>>> +   nir_ssa_def *tr = nir_ftrunc(b, src);
>>> +   return nir_bcsel(b,
>>> +                    nir_fge(b, src, nir_imm_double(b, 0.0)),
>>> +                    tr,
>>> +                    nir_bcsel(b,
>>> +                              nir_fne(b,
>>> +                                      nir_fsub(b, src, tr),
>>> +                                      nir_imm_double(b, 0.0)),
>>>
>>
>> As an aside, you can just as easily check "x is integer" by "x ==
>> trunc(x)". That might be simpler. Same goes for ceil().
>>
>
> One more thought (Sorry for all the e-mails): It might be better to
> implement this as
>
> floor(x) = (x >= 0 || x == trunc(x)) ? trunc(x) : trunc(x) - 1;
>
> That way you only have one bcsel and fewer 64-bit values floating around.
> It *might* reduce register pressure (not sure if it actually will).
> --Jason
>
Yeah, good idea. I will make this change for floor() and ceil() and check
whether it reduces register pressure. Either way, it saves one bcsel, which
is great.

Thanks!

Sam

>
>> +                                      nir_fsub(b, tr, nir_imm_double(b, 1.0)),
>>> +                                      src));
>>> +}
>>> +
>>>  static void
>>>  lower_doubles_instr(nir_alu_instr *instr, nir_lower_doubles_options
>>> options)
>>>  {
>>> @@ -405,6 +426,11 @@ lower_doubles_instr(nir_alu_instr *instr,
>>> nir_lower_doubles_options options)
>>>           return;
>>>        break;
>>>
>>> +   case nir_op_ffloor:
>>> +      if (!(options & nir_lower_dfloor))
>>> +         return;
>>> +      break;
>>> +
>>>     default:
>>>        return;
>>>     }
>>> @@ -431,6 +457,9 @@ lower_doubles_instr(nir_alu_instr *instr,
>>> nir_lower_doubles_options options)
>>>     case nir_op_ftrunc:
>>>        result = lower_trunc(&bld, src);
>>>        break;
>>> +   case nir_op_ffloor:
>>> +      result = lower_floor(&bld, src);
>>> +      break;
>>>     default:
>>>        unreachable("unhandled opcode");
>>>     }
>>> --
>>> 2.5.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>
>>
>