> On 20.08.2019 at 13:54, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Ilya Leoshkevich <i...@linux.ibm.com> writes:
>>> On 20.08.2019 at 12:13, Richard Sandiford
>>> <richard.sandif...@arm.com> wrote:
>>>
>>> Ilya Leoshkevich <i...@linux.ibm.com> writes:
>>>> z13 supports only non-signaling vector comparisons.  This means we
>>>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
>>>> However, we cannot express this restriction today: the code only checks
>>>> whether the vcond$a$b optab exists, which does not contain information
>>>> about the operation.
>>>>
>>>> Introduce a hook that tells whether target supports certain vector
>>>> comparison operations with certain modes.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2019-08-09  Ilya Leoshkevich  <i...@linux.ibm.com>
>>>>
>>>>    * doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
>>>>    * doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
>>>>    * optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
>>>>    in addition to get_vcond_icode.
>>>>    * target.def (vcond_supported_p): New hook.
>>>>    * targhooks.c (default_vcond_supported_p): Likewise.
>>>>    * targhooks.h (default_vcond_supported_p): Likewise.
>>>
>>> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
>>> the predicate for operand 3 to test whether a particular comparison
>>> is supported.  I guess this would require a cached rtx to avoid
>>> generating too much garbage rtl though (via GTY((cache))).
>>
>> How can I implement such a predicate?  Would calling maybe_gen_insn with
>> a fake rtx be reasonable?  In this case, what would be the best way to
>> generate fake input operands?  The existing code that calls
>> maybe_gen_insn gets the corresponding rtxes from upper layers.
>
> I was thinking of something like optabs.c:can_compare_p, but with
> some caching to reduce the overhead, and comparing registers rather
> than constants.  E.g.:
>
> static rtx cached_binop GTY ((cached));
>
> rtx
> get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode)
> {
>   ...create or modify cached_binop, with register operands...;
>   return cached_binop;
> }
Thanks, I got it working! I still need to run the regtest, but the result looks promising.

I have a question about caching. If I define just `static rtx cached_binop`, as you suggest, I would expect to see cache misses constantly, since a single slot can hold only one (code, mode, op_mode) combination at a time. Shouldn't it be something like `hash_map<tuple<enum rtx_code, machine_mode, machine_mode>, rtx>`?
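To make the question concrete, here is a standalone sketch of the two shapes being compared. It is not GCC code: `fake_code`, `fake_mode`, `fake_rtx`, `single_slot_cache` and `keyed_cache` are hypothetical stand-ins for `rtx_code`, `machine_mode`, `rtx` and the two caching strategies, and it uses `std::map` rather than GCC's `hash_map`. It only counts how often each strategy has to (re)build its entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>

// Hypothetical stand-ins for GCC's rtx_code, machine_mode and rtx.
enum fake_code { FAKE_LT, FAKE_GE, FAKE_EQ };
enum fake_mode { FAKE_V4SF, FAKE_V2DF, FAKE_V4SI };
struct fake_rtx { fake_code code; fake_mode mode; fake_mode op_mode; };

// One reusable slot: any query that differs from the previous one
// overwrites the slot, which is counted here as a miss.
struct single_slot_cache {
  bool valid = false;
  fake_rtx slot{};
  std::size_t misses = 0;

  const fake_rtx &get(fake_code code, fake_mode mode, fake_mode op_mode) {
    if (!valid || slot.code != code || slot.mode != mode
        || slot.op_mode != op_mode) {
      slot = fake_rtx{code, mode, op_mode};  // modify the slot in place
      valid = true;
      ++misses;
    }
    return slot;
  }
};

// One entry per (code, mode, op_mode) key: a miss happens only the
// first time a given combination is requested.
using binop_key = std::tuple<fake_code, fake_mode, fake_mode>;

struct keyed_cache {
  std::map<binop_key, fake_rtx> entries;
  std::size_t misses = 0;

  const fake_rtx &get(fake_code code, fake_mode mode, fake_mode op_mode) {
    binop_key key{code, mode, op_mode};
    auto it = entries.find(key);
    if (it == entries.end()) {
      it = entries.emplace(key, fake_rtx{code, mode, op_mode}).first;
      ++misses;
    }
    return it->second;
  }
};
```

Alternating between two different queries makes the single slot miss on every call, while the keyed cache misses once per distinct key. Whether that matters in practice depends on how cheap it is to modify the cached rtx in place; if the overwrite is only a couple of field updates and produces no new garbage, the single slot may well be sufficient.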