jlebar added a comment.

> AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as 
> I can tell the same is true for NVPTX (based on the attribute name).

I may be wrong, but I believe NVPTX only supports ftz for f32. From the PTX ISA documentation:

> Double-precision instructions support subnormal inputs and results. 
> Single-precision instructions support subnormal inputs and results by default 
> for sm_20 and subsequent targets, and flush subnormal inputs and results to 
> sign-preserving zero for sm_1x targets. The optional .ftz modifier on 
> single-precision instructions provides backward compatibility with sm_1x 
> targets by flushing subnormal inputs and results to sign-preserving zero 
> regardless of the target architecture.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions
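
To make "flushing subnormal inputs and results to sign-preserving zero" concrete, here is a rough host-side C++ sketch of the f32 behavior described in the quote. It is only an illustration under stated assumptions: the helper names `flush_subnormal_f32` and `add_ftz_f32` are hypothetical, this is a software model and not how the hardware or LLVM implements `.ftz`, and it assumes the host compiler itself is not running in a flush-to-zero or fast-math mode.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: if x is an f32 subnormal, return a zero carrying
// the same sign as x; otherwise return x unchanged.
static float flush_subnormal_f32(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    return std::copysign(0.0f, x);
  return x;
}

// Hypothetical software model of an f32 add with .ftz semantics:
// flush both inputs, do the add, then flush the result.
static float add_ftz_f32(float a, float b) {
  float r = flush_subnormal_f32(a) + flush_subnormal_f32(b);
  return flush_subnormal_f32(r);
}
```

In this model, the difference of two normal values such as `1.5f * FLT_MIN` and `FLT_MIN` is a subnormal and therefore comes back as zero, while a negative subnormal input becomes negative zero. There is deliberately no f64 counterpart, matching the quoted statement that double-precision instructions support subnormals.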



https://reviews.llvm.org/D69878


