Hi, On Sat, Aug 24 2019, Richard Sandiford wrote: > Martin Jambor <mjam...@suse.cz> writes:
... >> >> 2. direct_internal_fn_supported_p on which replacement_internal_fn >> depends to expand built-ins as internal functions cannot handle >> conversion optabs... and narrowing is a kind of conversion and the >> optab is added as such with OPTAB_CD. >> >> Actually, the second statement is not entirely true because somehow it >> can handle optab while_ult which is a conversion optab but a) the way it >> is handled, if I can understand it at all, seems to be a big hack and >> would be even worse if we decided to copy that for all narrowing math >> functions > > Think "big hack" is a bit unfair. The way that the internal function > maps argument types to the optab modes, and the way it expands calls > into rtl, depends on the "optab type" argument (the final argument to > DEF_INTERNAL_OPTAB_FN). This is relatively flexible in that it can use > a single-mode "direct" optab or a dual-mode "conversion" optab, with the > modes coming from whichever arguments are appropriate. New optab types > can be added as needed. My apologies. I guess I should have been more careful with my choice of words when perhaps I did not understand all aspects but when I saw: #define direct_while_optab_supported_p convert_optab_supported_p (and when I saw expand_while_optab_fn defined normally while all(?) other were constructed in an elaborate macro), I thought that I did not want to replicate the mechanism, not for a number of functions. > > FWIW, several other DEF_INTERNAL_OPTAB_FNs are conversion optabs too > (e.g. IFN_LOAD_LANES, IFN_STORE_LANES, IFN_MASK_LOAD, etc.). > > But... > >> and b) it gets both modes from argument types whereas we need one from >> the result type and so we would have to rewrite >> replacement_internal_fn anyway. > > ...yeah, I agree this breaks the current model. The reason IFN_WHILE_ULT > doesn't rely on the return type is that if you have: > > _2 = .WHILE_ULT (_0, _1) // returning a vector of 4 booleans > _3 = .WHILE_ULT (_0, _1) // returning a vector of 8 booleans > > then the calls look equivalent. So instead we pass an extra argument > indicating the required boolean vector "shape". > > The same "problem" could in principle apply to FADD if we ever needed > to support double+double->_Float16 for example. Right. I hope not going through an internal function is acceptable. If not, we'll have to teach this builtin->internal_funcxtion->optab mechanism about conversions. Thanks, Martin > >> Therefore, at least for now (GSoC deadline is kind of looming), I >> decided that the best way forward would be to not rely on internal >> functions but plug into expand_builtin() and I wrote the following, >> lightly tested patch - which of course misses testcases and stuff - but >> I'd be curious about any feedback now anyway. When I proposed a very >> similar approach for the roundeven x86_64 expansion, Uros actually then >> opted for a solution based on internal functions, so I am curious >> whether there are simple alternatives I do not see. >> >> Tejas, of course cases for other fadd variants should at least be added >> to expand_builtin. >> >> Thanks, >> >> Martin >> >> >> 2019-08-23 Tejas Joshi <tejasjoshi9...@gmail.com> >> Martin Jambor <mjam...@suse.cz> >> >> * builtins.c (expand_builtin_binary_conversion): New function. >> (expand_builtin): Call it. >> * config/rs6000/rs6000.md (unspec): Add UNSPEC_ADD_NARROWING. >> (add_truncdfsf3): New define_insn. >> * optabs.def (fadd_optab): New. >>