On Wed, Aug 2, 2017 at 8:26 AM, Martin Liška <mli...@suse.cz> wrote:
> On 08/02/2017 09:16 AM, Jakub Jelinek wrote:
>> On Wed, Aug 02, 2017 at 09:13:40AM +0200, Martin Liška wrote:
>>> On 08/01/2017 09:50 PM, Jakub Jelinek wrote:
>>>> On Thu, Jul 20, 2017 at 08:59:29AM +0200, Martin Liška wrote:
>>>>> Hello.
>>>>>
>>>>> Following patch does sharing of expansion for mem{p,}cpy and also strpcy
>>>>> (with a known constant as source) so that we use same type of expansion
>>>>> (direct insns emission, direct emission with a loop instruction and
>>>>> library call). As mentioned in the PR, glibc does not provide an
>>>>> optimized version for majority of targets.
>>>>>
>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>>
>>>> This broke e.g.
>>>> FAIL: gcc.dg/20050503-1.c scan-assembler-not call
>>>> on i686-linux, the result is significantly worse.
>>>> Also, while perhaps majority of targets don't provide optimized version,
>>>> some targets do, including i?86/x86_64, and if the memcpy would be expanded
>>>> as a call, it is much better to just emit mempcpy call instead.
>>>> Just look at the testcase, because of this misoptimization we suddenly
>>>> can't use a tail call.
>>>>
>>>> Jakub
>>>>
>>>
>>> I see. That said, should I introduce some target hook that will tell
>>> whether to expand to
>>> 'return memcpy(dst, src,l) + dst;' or call library mempcpy routine?
>>
>> If some targets aren't willing to provide fast mempcpy in libc, then yes I
>> guess. And, for -Os you should never do the former, that isn't going to be
>> shorter (at least unless the memcpy is expanded inline and is shorter than
>> the call + addition).
>
> Good, I will work on that.
>
>>
>> BTW, do we have folding of mempcpy to memcpy if the result is ignored (no
>> lhs)?
>
> Yes, we do it, I've just verified that.
>
> Martin

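For readers following the quoted exchange, here is a minimal sketch of the "memcpy plus offset" form of mempcpy that the thread is weighing against a library call. mempcpy returns a pointer to the byte just past the last one written, so the quoted 'memcpy(dst, src,l) + dst' is presumably shorthand for adding the length to the memcpy result. The names sketch_mempcpy and sketch_join3 are hypothetical and exist only for this illustration; this is not GCC's expansion code and not a reduced testcase for any failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mempcpy written as memcpy plus an offset.
   Returns a pointer one past the last byte written to DST.  */
static void *
sketch_mempcpy (void *dst, const void *src, size_t len)
{
  return (char *) memcpy (dst, src, len) + len;
}

/* Hypothetical helper: concatenate three strings into BUF by chaining
   the calls, each one starting where the previous one ended.
   Returns the length of the result, excluding the terminating NUL.  */
static size_t
sketch_join3 (char *buf, size_t cap,
              const char *a, const char *b, const char *c)
{
  size_t la = strlen (a), lb = strlen (b), lc = strlen (c);

  /* The caller must supply room for all three pieces plus the NUL.  */
  assert (la + lb + lc + 1 <= cap);

  char *end = sketch_mempcpy (sketch_mempcpy (sketch_mempcpy (buf, a, la),
                                              b, lb),
                              c, lc);
  *end = '\0';
  return (size_t) (end - buf);
}
```

When the copy ends up as a call anyway, the memcpy form costs an extra addition after the call returns, which is why a function whose last action is mempcpy can no longer tail-call it, as Jakub points out above.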
Hi Martin,

With r250771, GCC failed to build glibc for arm/aarch64 linux cross toolchain:

during RTL pass: expand
loadlocale.c: In function ‘_nl_load_locale’:
loadlocale.c:199:7: internal compiler error: in emit_move_insn, at expr.c:3704
       __mempcpy (__mempcpy (__mempcpy (newp, file->filename, filenamelen),
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x80902b emit_move_insn(rtx_def*, rtx_def*)
        /test/source/gcc/gcc/expr.c:3703
0x6d2271 expand_builtin_memory_copy_args
        /test/source/gcc/gcc/builtins.c:3514
0x6d48d7 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
        /test/source/gcc/gcc/builtins.c:6847
0x80454c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        /test/source/gcc/gcc/expr.c:10848
0x6f8a9c expand_expr
        /test/source/gcc/gcc/expr.h:276
0x6f8a9c expand_call_stmt
        /test/source/gcc/gcc/cfgexpand.c:2664
0x6f8a9c expand_gimple_stmt_1
        /test/source/gcc/gcc/cfgexpand.c:3583
0x6f8a9c expand_gimple_stmt
        /test/source/gcc/gcc/cfgexpand.c:3749
0x6f9c1a expand_gimple_basic_block
        /test/source/gcc/gcc/cfgexpand.c:5751
0x6ff986 execute
        /test/source/gcc/gcc/cfgexpand.c:6358
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

I filed PR81666 for tracking.

Thanks,
bin