jlebar added a comment.

> AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as
> I can tell the same is true for NVPTX (based on the attribute name).

I may be corrected, but I believe nvptx only supports ftz for f32.

> Double-precision instructions support subnormal inputs and results.
> Single-precision instructions support subnormal inputs and results by default
> for sm_20 and subsequent targets, and flush subnormal inputs and results to
> sign-preserving zero for sm_1x targets. The optional .ftz modifier on
> single-precision instructions provides backward compatibility with sm_1x
> targets by flushing subnormal inputs and results to sign-preserving zero
> regardless of the target architecture.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
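For readers unfamiliar with the phrase "flushing subnormal inputs and results to sign-preserving zero" in the PTX text quoted above, here is a minimal host-side sketch of that behavior for f32. It is only a software emulation for illustration, not how NVPTX or the hardware implements .ftz, and the helper names flush_subnormal_f32 and add_ftz_f32 are hypothetical (they are not LLVM, clang, or CUDA APIs). It assumes the host compiler uses default IEEE semantics (no fast-math, no flush-to-zero mode).

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: map a subnormal f32 to a zero of the same sign,
// and leave every other value (normal, zero, inf, NaN) untouched.
float flush_subnormal_f32(float x) {
    if (std::fpclassify(x) == FP_SUBNORMAL)
        return std::copysign(0.0f, x);  // sign-preserving zero
    return x;
}

// Hypothetical emulation of an f32 add with .ftz-like semantics:
// flush both inputs, perform the add, then flush the result.
float add_ftz_f32(float a, float b) {
    float sum = flush_subnormal_f32(a) + flush_subnormal_f32(b);
    return flush_subnormal_f32(sum);
}
```

Under these assumptions, flush_subnormal_f32 applied to the negated smallest subnormal (std::numeric_limits<float>::denorm_min()) yields -0.0f, while normal values such as 1.0f pass through unchanged. Per the quoted documentation, double-precision instructions are not flushed this way, which is why no f64 counterpart is sketched.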