Hi,

On Fri, Jan 15, 2016 at 04:01:49PM +0100, Jakub Jelinek wrote:
> On Fri, Jan 15, 2016 at 03:53:23PM +0100, Martin Jambor wrote:
> > @@ -317,7 +319,7 @@ public:
> >  bool
> >  pass_ipa_hsa::gate (function *)
> >  {
> > -  return hsa_gen_requested_p () || in_lto_p;
> > +  return hsa_gen_requested_p ();
> >  }
> >
> >  } // anon namespace
>
> I actually didn't mean this, I mean more of:
>   return (hsa_gen_requested_p ()
> #ifdef ENABLE_HSA
>           || in_lto_p
> #endif
>          );
> or so.  Unless you arrange in lto-wrapper or where that if
> HSA is enabled in any LTO input source, then it is enabled also in
> lto1.  If you do that, your change is fine.
>

This pass only creates HSA-specific clones of ungridified target and
parallel regions and of functions marked with declare target.  Whether
or not any HSAIL is emitted is then controlled by the gate of the
hsa-gen pass.  The in_lto_p part was in fact a relic of a previous
implementation.

So while I agree that making such a change to lto-wrapper would be
beneficial (although we should then limit the pass's activity to only
those nodes which come from HSA-enabled units), the change above does
not make the current situation worse.  I will make sure to look into
lto-wrapper, but meanwhile I still prefer the new condition.

We have tested the new change by LTO-compiling code with HSA enabled
and then LTO-linking it with HSA disabled:

1) if there was no gridified loop, the result was as if HSA had been
   disabled from the start;

2) if there was a gridified kernel, the compiler compiled the kernel
   for the host but did not register it with libgomp, so it ended up
   as an unreachable function.

How do other accelerators cope with the situation when half of the
application is compiled with the accelerator disabled?  (Would some of
their calls to GOMP_target_ext lead to an abort?)

Martin