george.burgess.iv added a comment.

For a more direct comparison, I offer https://godbolt.org/z/fqAhUC . The lack of optimization in the latter case is because we're forced to mark the call to `__builtin_memcpy` in the inline `memcpy` as `nobuiltin`. If we instead rename things, this issue doesn't happen: https://godbolt.org/z/FKNTWo .
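
For readers without the godbolt links at hand, the general shape of an inline fortified copy with its size check written out inline might look roughly like the sketch below. This is an illustration under stated assumptions, not the kernel's source: the names `inline_checked_memcpy` and `inline_copy_ok` are hypothetical, and `abort()` stands in for whatever failure path a real implementation uses. Because the wrapper is not itself named `memcpy`, it corresponds to the renamed variant mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical wrapper (not the kernel's code): the size check is
 * written inline, and the actual copy goes through __builtin_memcpy.
 */
static inline __attribute__((always_inline))
void *inline_checked_memcpy(void *dst, const void *src, size_t n)
{
    /* Evaluates to (size_t)-1 when the object size is not known. */
    size_t dst_size = __builtin_object_size(dst, 0);

    /* Inline size check; abort() is an assumed failure path. */
    if (dst_size < n)
        abort();

    return __builtin_memcpy(dst, src, n);
}

/* In-bounds use: returns 1 when the copied bytes match the source. */
int inline_copy_ok(void)
{
    char buf[8];
    void *ret = inline_checked_memcpy(buf, "hello", 6);

    assert(ret == buf);
    return strcmp(buf, "hello") == 0;
}
```

Here the copy is 6 bytes into an 8-byte buffer, so the check passes whether or not the compiler resolves the object size.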
All other FORTIFY implementations I know of defer to `_chk` functions for any checking that isn't guarded by `__builtin_constant_p`, rather than performing the size checks inline as the kernel does. Both GCC and Clang have special knowledge of these intrinsics, so they can optimize them well: https://godbolt.org/z/L7rVHp . It'd be nice if the kernel's FORTIFY were more like the other existing ones in this way. Deferring to `_chk` builtins has the side benefit that the `inline` `memcpy` is often smaller, which increases the inlinability of any functions that `memcpy` gets inlined into(*). The downside is that the kernel then needs to carry definitions for `__memcpy_chk` et al.

(*) -- not in an underhanded way. It's just that any condition that depends on `__builtin_constant_p` or `__builtin_object_size(obj)` is guaranteed to be folded away at compile-time; not representing them in IR is more "honest" to anything that's trying to determine the inlinability of a function.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71082/new/

https://reviews.llvm.org/D71082
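
The `_chk`-deferring style described in the comment can be sketched as follows. This is a minimal illustration, not any particular libc's or the kernel's implementation: the names `chk_deferring_memcpy` and `chk_copy_ok` are hypothetical, while `__builtin___memcpy_chk` and `__builtin_object_size` are the GCC/Clang builtins. The wrapper contains no explicit branch; the compiler can lower the builtin to a plain `memcpy` when it can prove the copy fits or when the object size is unknown, and otherwise emits a call to `__memcpy_chk`, which the environment has to provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: no inline size check. The destination size is
 * passed to the _chk builtin and the compiler decides how to lower it.
 */
static inline __attribute__((always_inline))
void *chk_deferring_memcpy(void *dst, const void *src, size_t n)
{
    /* Object size is (size_t)-1 when it cannot be determined. */
    return __builtin___memcpy_chk(dst, src, n,
                                  __builtin_object_size(dst, 0));
}

/* In-bounds use: returns 1 when the copied bytes match the source. */
int chk_copy_ok(void)
{
    char buf[8];
    void *ret = chk_deferring_memcpy(buf, "hello", 6);

    assert(ret == buf);
    return strcmp(buf, "hello") == 0;
}
```

The wrapper body is a single call, which is the "smaller inline `memcpy`" effect the comment refers to.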