[PATCH] D69878: Consoldiate internal denormal flushing controls

Cameron McInally via Phabricator via cfe-commits Fri, 10 Jan 2020 11:49:15 -0800

cameron.mcinally added inline comments.


================
Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+
----------------
arsenm wrote:
> andrew.w.kaylor wrote:
> > cameron.mcinally wrote:
> > > andrew.w.kaylor wrote:
> > > > arsenm wrote:
> > > > > andrew.w.kaylor wrote:
> > > > > > arsenm wrote:
> > > > > > > andrew.w.kaylor wrote:
> > > > > > > > Can you document which targets do support the option? What 
> > > > > > > > happens if I try to use the option on a target where it is not 
> > > > > > > > supported?
> > > > > > > I'm not sure where to document this, or if/how/where to diagnose 
> > > > > > > it. I don't think the high level LangRef description is the right 
> > > > > > > place to discuss specific target handling.
> > > > > > > 
> > > > > > > Currently it won't error or anything. Code checking the denorm 
> > > > > > > mode will see the f32 specific mode, even if the target in the 
> > > > > > > end isn't really going to respect this.
> > > > > > > 
> > > > > > > One problem is this potentially does require coordination with 
> > > > > > > other toolchain components. For AMDGPU, the compiler can directly 
> > > > > > > tell the driver what FP mode to set on each entry point, but for 
> > > > > > > x86 it requires linking in crtfastmath to set the default mode 
> > > > > > > bits. If another target had a similar runtime environment 
> > > > > > > requirement, I don't think we can be sure the attribute is 
> > > > > > > correct or not.
> > > > > > There is precedent for describing target-specific behavior in 
> > > > > > LangRef. It just doesn't seem useful to say that not all targets 
> > > > > > support the attribute without saying which ones do. We should also 
> > > > > > say what is expected if a target doesn't support the attribute. It 
> > > > > > seems reasonable for the function attribute to be silently ignored.
> > > > > > 
> > > > > > > One problem is this potentially does require coordination with 
> > > > > > > other toolchain components. For AMDGPU, the compiler can directly 
> > > > > > > tell the driver what FP mode to set on each entry point, but for 
> > > > > > > x86 it requires linking in crtfastmath to set the default mode 
> > > > > > > bits.
> > > > > > 
> > > > > > This is a point I'm interested in. I don't like the current 
> > > > > > crtfastmath.o handling. It feels almost accidental when FTZ works 
> > > > > > as expected. My understanding is we link crtfastmath.o if we find 
> > > > > > it but if not everything just goes about its business. The Intel 
> > > > > > compiler injects code into main() to explicitly set the FTZ/DAZ 
> > > > > > control modes. That obviously has problems too, but it's at least 
> > > > > > consistent and predictable. As I understand it, crtfastmath.o sets 
> > > > > > these modes from a static initializer, but I'm not sure anything is 
> > > > > > done to determine the order of that initializer relative to others.
> > > > > > 
> > > > > > How does the compiler identify entry points for AMDGPU? And does it 
> > > > > > emit code to set FTZ based on the function attribute here?
> > > > > The entry points are a specific calling convention. There's no real 
> > > > > concept of main. Each kernel has an associated blob of metadata the 
> > > > > driver uses to set up various config registers on dispatch.
> > > > > 
> > > > > I don't think specially recognizing main in the compiler is 
> > > > > fundamentally different than having it done in a static constructor. 
> > > > > It's still a construct not associated with any particular function or 
> > > > > anything.
> > > > The problem with having it done in a static constructor is that you 
> > > > have no certainty of when it will be done relative to other static 
> > > > constructors. If it's in main you can at least say that it's after all 
> > > > the static constructors (assuming main is your entry point).
> > > Yes and no. The linker should honor static constructor priorities. But, 
> > > yeah, there's no guarantee that this constructor will run before other 
> > > priority 101 constructors.
> > > 
> > > The performance penalty for setting denormal flushing in main could be 
> > > significant (think C++). Also, there's precedent for using static 
> > > constructors, like GCC's crtfastmath.o.
> > Fair enough. I don't necessarily like how icc handles this. I don't have a 
> > problem with how gcc handles it. I just really don't like how LLVM does it. 
> > If we want to take the static constructor approach we should define our 
> > own, not depend on whether or not the GNU object file happens to be around.
> > 
> > Static initialization doesn't help for AMDGPU, and I suppose that's likely 
> > to be the case for any offload execution model. Since this patch is moving 
> > us toward a more consistent implementation I'm wondering if we can define 
> > some general rules for how this is supposed to work. Like when the function 
> > attribute will result in injected instructions setting the control flags 
> > and when it won't.
> I think the most we can expect of this attribute as informing codegen of the 
> expected FP denormal handling mode, and not something responsible for 
> ensuring the mode will really be set. AMDGPU conceptually could have a 
> separate set of attributes for setting the denormal FP mode, but since it 
> would look identical, this gets a bonus usage for setting it for kernels. 
> This doesn't protect you from calling functions in modules compiled with 
> different attributes, so similar problems outside the view of the compiler 
> still exist
> If we want to take the static constructor approach we should define our own, 
> not depend on whether or not the GNU object file happens to be around.

That's a good idea. There's subtle differences between targets in the GNU 
implementation. It would be good to standardize them.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69878: Consoldiate internal denormal flushing controls

Reply via email to