[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

2022-05-31 Stephan Herhut via Phabricator via cfe-commits
herhut accepted this revision.
herhut added a comment.
This revision is now accepted and ready to land.
Separate pass works for me.
Comment at: mlir/include/mlir/Dialect/LLVMIR/Transforms/Passes.td:19
+def NVVMOptimize : Pass<"nvvm-optimize"> {
+ let summary = "Optimize NVVM

[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

2022-05-25 Stephan Herhut via Phabricator via cfe-commits
herhut added inline comments.
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:158
+// by the same divisor.
+struct ExpandDivF16 : public ConvertOpToLLVMPattern {
+ using ConvertOpToLLVMPattern::ConvertOpToLLVMPattern;
This pattern is a bit mis

[PATCH] D82574: Merge TableGen files used for clang options

2020-07-10 Stephan Herhut via Phabricator via cfe-commits
herhut added a comment.
Could you add the normalization back? This is in line with the comment to make sure the old and new files align.
Comment at: clang/include/clang/Driver/Options.td:3455
+ HelpText<"Specify target triple (e.g. i686-apple-darwin9)">,
+ MarshallingInfoStr