Hi Thomas,
Very many thanks for your help investigating this problem.

> > This patch addresses the "increased register pressure" regression
> > on nvptx-none caused by my change to transition the backend to
> > a STORE_FLAG_VALUE = 1 target.
>
> Yes, "addresses", but unfortunately doesn't "resolve". ;-|

Doh!

> I'm confirming the improved code generation (less registers used, less
> instructions emitted) in cases where it triggers -- but unfortunately it
> doesn't in the PR104345 'libgomp.oacc-c-c++-common/reduction-cplx-dbl.c'
> scenario.

Looking over the nvptx code currently generated for reduction-cplx-dbl.c,
it appears nearly optimal, and it's difficult to see what could have
regressed. [It makes almost no use of Boolean types, so it is relatively
unaffected by a STORE_FLAG_VALUE change].

One remaining possibility is that the "register usage" regression is not
in reduction-cplx-dbl.c itself but in __muldc3 in libgcc.a. [I believe
kernel resource usage is computed including all called functions].
Might this be easy to test on your configuration, by moving libgcc.a from
one build to another?

If it is __muldc3 that is regressing, then the other nvptx patches
mentioned previously, and perhaps even improvements to isnan and isinf,
may help.

Again, apologies that the 'using nvptx set.?32 for "cond ? -1 : 0"' patch,
which catches many of the issues observed in your initial PR analysis,
isn't actually the root cause of this particular case.

Thanks again,
Roger
--