[PATCH,gcc/MIPS] Make loongson3a use fused madd.d
Hi, Loongson3a has 4 operand fused madd instrcution. This patch set loongson3a use fused madd.d. ChangeLog : *** gcc/ChangeLog *** 2016-11-03 Chenghua Xu config/mips/ * mips.h: Set loongson3a use fused madd.d. Tested on loongson3a. PS: I will soon submit some patches, how can i get a copyright assignment. diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 81862a9..5076a2b 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -1056,11 +1056,11 @@ struct mips_cpu_info { /* ISA has 4 operand fused madd instructions of the form 'd = [+-] (a * b [+-] c)'. */ -#define ISA_HAS_FUSED_MADD4 TARGET_MIPS8000 +#define ISA_HAS_FUSED_MADD4 (TARGET_MIPS8000 || TARGET_LOONGSON_3A) /* ISA has 4 operand unfused madd instructions of the form 'd = [+-] (a * b [+-] c)'. */ -#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000) +#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000 && !TARGET_LOONGSON_3A) /* ISA has 3 operand r6 fused madd instructions of the form 'c = c [+-] (a * b)'. */
Re: [C++ Patch/RFC] PR 67980 ("left shift count is negative [-Wshift-count-negative] generated for unreachable code")
.. I'm still looking for some directions about the best way to handle this issue: anyway, in case it wasn't clear, the second patch I posted passes testing. Thanks, Paolo.
Re: [openacc] adjust default num_gangs
On Wed, Nov 02, 2016 at 03:11:52PM -0700, Cesar Philippidis wrote: >if (seen_zero) > { > + /* See if the user provided GOMP_OPENACC_DIM environment > + variable to specify runtime defaults. */ > + static int default_dims[GOMP_DIM_MAX]; > + > + pthread_mutex_lock (&ptx_dev_lock); I think pthread_once would be better here, with the default_dims initialization outlined into a separate helper function. Jakub
Re: [openacc] adjust default num_gangs
On 11/02/2016 12:50 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote: >> @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void >> **hostaddrs, void **devaddrs, >> >>if (seen_zero) >> { >> + /* See if the user provided GOMP_OPENACC_DIM environment >> + variable to specify runtime defaults. */ >> + static int default_dims[GOMP_DIM_MAX]; >> + >> + if (!default_dims[0]) >> +{ > > Is this guarded by some lock, or is it just racy if multiple > nvptx_execs are done at the same time? > >> + /* We only read the environment variable once. You can't >> + change it in the middle of execution. The sytntax is > > syntax > >> + the same as for the -fopenacc-dim compilation option. */ >> + const char *env_var = getenv ("GOMP_OPENACC_DIM"); > >> + >> + if (CUDA_SUCCESS == cuDeviceGetAttribute >> + (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev) >> + && CUDA_SUCCESS == cuDeviceGetAttribute >> + (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev) >> + && CUDA_SUCCESS == cuDeviceGetAttribute >> + (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev) >> + && CUDA_SUCCESS == cuDeviceGetAttribute >> + (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, >> dev)) > > The formatting is wrong. 1) you should use the call should be on lhs of ==, > not rhs 2) ( should be after cuDeviceGetAttribute, not on the next line > 3) still the lines are too long. > > if (cuDeviceGetAttribute (&block_size, > CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, > dev) == CUDA_SUCCESS > && cuDeviceGetAttribute (... > > CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR > is still way too long, perhaps initialize a temporary const var > to that, or use some macro like > DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR) > where > #define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x > > Otherwise LGTM. Thanks. I've applied this version to trunk. Cesar 2016-11-02 Cesar Philippidis Nathan Sidwell gcc/ * config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero. libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes to determine default geometry. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang dimension. diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 80fa9ae..782bbde 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -4174,7 +4174,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget), /* Define dimension sizes for known hardware. */ #define PTX_VECTOR_LENGTH 32 #define PTX_WORKER_LENGTH 32 -#define PTX_GANG_DEFAULT 32 +#define PTX_GANG_DEFAULT 0 /* Defer to runtime. */ /* Validate compute dimensions of an OpenACC offload or routine, fill in non-unity defaults. FN_LEVEL indicates the level at which a diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 327500c..5ee350d 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -45,6 +45,7 @@ #include #include #include +#include static const char * cuda_error (CUresult r) @@ -932,9 +933,88 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, if (seen_zero) { + /* See if the user provided GOMP_OPENACC_DIM environment + variable to specify runtime defaults. */ + static int default_dims[GOMP_DIM_MAX]; + + pthread_mutex_lock (&ptx_dev_lock); + if (!default_dims[0]) + { + /* We only read the environment variable once. You can't + change it in the middle of execution. The syntax is + the same as for the -fopenacc-dim compilation option. */ + const char *env_var = getenv ("GOMP_OPENACC_DIM"); + if (env_var) + { + const char *pos = env_var; + + for (i = 0; *pos && i != GOMP_DIM_MAX; i++) + { + if (i && *pos++ != ':') + break; + if (*pos != ':') + { + const char *eptr; + + errno = 0; + long val = strtol (pos, (char **)&eptr, 10); + if (errno || val < 0 || (unsigned)val != val) + break; + default_dims[i] = (int)val; + pos = eptr; + } + } + } + + int warp_size, block_size, dev_size, cpu_size; + CUdevice dev = nvptx_thread()->ptx_dev->dev; + /* 32 is the default for known hardware. */ + int gang = 0, worker = 32, vector = 32; + CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm; + + cu_tpb = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK; + cu_ws = CU_DEVICE_ATTRIBUTE_WARP_SIZE; + cu_mpc = CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT; + cu_tpm = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR; + + if (cuDeviceGetAttribute (&block_size, cu_tpb, dev) == CUDA_SUCCESS + && cuDeviceGetAttribute (&warp_size, cu_ws, dev) == CUDA_SUCCESS + && cuDeviceGetAttribute (&
Re: [PATCH 1/2, libgcc]: Implement _divmoddi4
On Wed, Nov 2, 2016 at 12:19 PM, Uros Bizjak wrote: > On Wed, Nov 2, 2016 at 2:27 PM, Ian Lance Taylor wrote: >> On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak wrote: >>> This function will be used in a follow-up patch to implement >>> TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call >>> this function, so IMO it should be part of a generic library. >>> >>> 2016-10-31 Uros Bizjak >>> >>> * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4. >>> * libgcc2.c (__divmoddi4): New function. >>> * libgcc2.h (__divmoddi4): Declare. >>> * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4 >>> and __PFX_divmodti4. >> >> You aren't defining divmodti4 anywhere, so it seems premature to add >> it to libgcc-std.ver.in. > > It is created magically for x86_64, in the same way e.g. divti3 is created. Ah, OK. Ian
Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)
On 11/02/16 18:51, Jason Merrill wrote: > On 11/02/2016 02:11 AM, Bernd Edlinger wrote: >> On 11/01/16 19:15, Bernd Edlinger wrote: >>> On 11/01/16 18:11, Jason Merrill wrote: On Tue, Nov 1, 2016 at 11:45 AM, Bernd Edlinger wrote: > On 11/01/16 16:20, Jason Merrill wrote: >> On 10/17/2016 03:18 PM, Bernd Edlinger wrote: >> I'm not even sure we need a new warning. Can we combine this warning >> with the block that currently follows? > > After 20 years of not having a warning on that, > an implicitly enabled warning would at least break lots of bogus > test cases. Would it, though? Which test cases still break with the current patch? >>> >>> Less than before, but there are still at least a few of them. >>> >>> I can make a list and send it tomorrow. >> >> FAIL: g++.dg/cpp1y/lambda-generic-udt.C -std=gnu++14 (test for excess >> errors) >> FAIL: g++.dg/cpp1y/lambda-generic-xudt.C -std=gnu++14 (test for excess >> errors) >> FAIL: g++.dg/init/new15.C -std=c++11 (test for excess errors) >> FAIL: g++.dg/init/new15.C -std=c++14 (test for excess errors) >> FAIL: g++.dg/init/new15.C -std=c++98 (test for excess errors) >> FAIL: g++.dg/ipa/inline-1.C -std=gnu++11 (test for excess errors) >> FAIL: g++.dg/ipa/inline-1.C -std=gnu++14 (test for excess errors) >> FAIL: g++.dg/ipa/inline-1.C -std=gnu++98 (test for excess errors) >> FAIL: g++.dg/ipa/inline-2.C -std=gnu++11 (test for excess errors) >> FAIL: g++.dg/ipa/inline-2.C -std=gnu++14 (test for excess errors) >> FAIL: g++.dg/ipa/inline-2.C -std=gnu++98 (test for excess errors) >> FAIL: g++.dg/tc1/dr20.C -std=c++11 (test for excess errors) >> FAIL: g++.dg/tc1/dr20.C -std=c++14 (test for excess errors) >> FAIL: g++.dg/tc1/dr20.C -std=c++98 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++11 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++14 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++98 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++11 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++14 (test for excess errors) >> FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++98 (test for excess errors) >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto >> -flto-partition=1to1 -fno-use-linker-plugin >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto >> -flto-partition=none -fuse-linker-plugin >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto >> -fuse-linker-plugin -fno-fat-lto-objects >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto >> -flto-partition=1to1 -fno-use-linker-plugin >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto >> -flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects >> FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto >> -fuse-linker-plugin >> FAIL: g++.dg/lto/pr68811 cp_lto_pr68811_0.o assemble, -O2 >> FAIL: g++.old-deja/g++.law/except1.C -std=gnu++11 (test for excess >> errors) >> FAIL: g++.old-deja/g++.law/except1.C -std=gnu++14 (test for excess >> errors) >> FAIL: g++.old-deja/g++.law/except1.C -std=gnu++98 (test for excess >> errors) >> FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++11 (test for excess errors) >> FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++14 (test for excess errors) >> FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++98 (test for excess errors) >> FAIL: g++.old-deja/g++.other/builtins10.C -std=c++11 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/builtins10.C -std=c++14 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/realloc.C -std=c++11 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/realloc.C -std=c++14 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/realloc.C -std=c++98 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/vbase5.C -std=c++11 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/vbase5.C -std=c++14 (test for excess >> errors) >> FAIL: g++.old-deja/g++.other/vbase5.C -std=c++98 (test for excess >> errors) >> >> >> The lto test case does emit the warning when assembling, but >> it still produces an executable and even executes it. >> >> Also g++.dg/cpp1y/lambda-generic-udt.C, g++.dg/tc1/dr20.C >> and g++.old-deja/g++.other/vbase5.C are execution tests. >> >> So I was wrong to assume these were all compile-only tests. >> >> I think that list should be fixable, if we decide to enable >> the warning by default. > > Yes, either by fixing the prototypes or disabling the warning. > Yes, I am inclined to enable the warning by default now. Most of the test cases are fixable in a fairly obvious way, see attachment. But I am unsure about g++.old-deja/g++.other/builtins10.C: // { dg-do assemble } // Test that built-in functions don't warn when prototyped without arguments. // Origin: PR c++/9367 // Copyright (C) 2003 Free Software Foundation.
Re: [openacc] adjust default num_gangs
On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote: > @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, > void **devaddrs, > >if (seen_zero) > { > + /* See if the user provided GOMP_OPENACC_DIM environment > + variable to specify runtime defaults. */ > + static int default_dims[GOMP_DIM_MAX]; > + > + if (!default_dims[0]) > + { Is this guarded by some lock, or is it just racy if multiple nvptx_execs are done at the same time? > + /* We only read the environment variable once. You can't > + change it in the middle of execution. The sytntax is syntax > + the same as for the -fopenacc-dim compilation option. */ > + const char *env_var = getenv ("GOMP_OPENACC_DIM"); > + > + if (CUDA_SUCCESS == cuDeviceGetAttribute > + (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev) > + && CUDA_SUCCESS == cuDeviceGetAttribute > + (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev) > + && CUDA_SUCCESS == cuDeviceGetAttribute > + (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev) > + && CUDA_SUCCESS == cuDeviceGetAttribute > + (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, > dev)) The formatting is wrong. 1) you should use the call should be on lhs of ==, not rhs 2) ( should be after cuDeviceGetAttribute, not on the next line 3) still the lines are too long. if (cuDeviceGetAttribute (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev) == CUDA_SUCCESS && cuDeviceGetAttribute (... CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR is still way too long, perhaps initialize a temporary const var to that, or use some macro like DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR) where #define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x Otherwise LGTM. Jakub
[openacc] adjust default num_gangs
This patch teaches the libgomp runtime how to probe the CUDA driver to extract the number of Stream Multiprocessors that are available on the graphics hardware and use that as the default value for num_gangs. Without that patch, libgomp used to have num_gangs default to 32, which was chosen arbitrarily. At least this value maps onto a hardware value. More details regarding this patch can be found here: https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02064.html https://gcc.gnu.org/ml/gcc-patches/2016-08/msg02084.html Is this patch OK for trunk? Cesar 2016-11-02 Cesar Philippidis Nathan Sidwell gcc/ * config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero. libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes to determine default geometry. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang dimension. diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 80fa9ae..782bbde 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -4174,7 +4174,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget), /* Define dimension sizes for known hardware. */ #define PTX_VECTOR_LENGTH 32 #define PTX_WORKER_LENGTH 32 -#define PTX_GANG_DEFAULT 32 +#define PTX_GANG_DEFAULT 0 /* Defer to runtime. */ /* Validate compute dimensions of an OpenACC offload or routine, fill in non-unity defaults. FN_LEVEL indicates the level at which a diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 327500c..91c1386 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -45,6 +45,7 @@ #include #include #include +#include static const char * cuda_error (CUresult r) @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, if (seen_zero) { + /* See if the user provided GOMP_OPENACC_DIM environment + variable to specify runtime defaults. */ + static int default_dims[GOMP_DIM_MAX]; + + if (!default_dims[0]) + { + /* We only read the environment variable once. You can't + change it in the middle of execution. The sytntax is + the same as for the -fopenacc-dim compilation option. */ + const char *env_var = getenv ("GOMP_OPENACC_DIM"); + if (env_var) + { + const char *pos = env_var; + + for (i = 0; *pos && i != GOMP_DIM_MAX; i++) + { + if (i && *pos++ != ':') + break; + if (*pos != ':') + { + const char *eptr; + + errno = 0; + long val = strtol (pos, (char **)&eptr, 10); + if (errno || val < 0 || (unsigned)val != val) + break; + default_dims[i] = (int)val; + pos = eptr; + } + } + } + + int warp_size, block_size, dev_size, cpu_size; + CUdevice dev = nvptx_thread()->ptx_dev->dev; + /* 32 is the default for known hardware. */ + int gang = 0, worker = 32, vector = 32; + + if (CUDA_SUCCESS == cuDeviceGetAttribute + (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev) + && CUDA_SUCCESS == cuDeviceGetAttribute + (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev) + && CUDA_SUCCESS == cuDeviceGetAttribute + (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev) + && CUDA_SUCCESS == cuDeviceGetAttribute + (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev)) + { + GOMP_PLUGIN_debug (0, " warp_size=%d, block_size=%d," + " dev_size=%d, cpu_size=%d\n", + warp_size, block_size, dev_size, cpu_size); + gang = (cpu_size / block_size) * dev_size; + worker = block_size / warp_size; + vector = warp_size; + } + + /* There is no upper bound on the gang size. The best size + matches the hardware configuration. Logical gangs are + scheduled onto physical hardware. To maximize usage, we + should guess a large number. */ + if (default_dims[GOMP_DIM_GANG] < 1) + default_dims[GOMP_DIM_GANG] = gang ? gang : 1024; + /* The worker size must not exceed the hardware. */ + if (default_dims[GOMP_DIM_WORKER] < 1 + || (default_dims[GOMP_DIM_WORKER] > worker && gang)) + default_dims[GOMP_DIM_WORKER] = worker; + /* The vector size must exactly match the hardware. */ + if (default_dims[GOMP_DIM_VECTOR] < 1 + || (default_dims[GOMP_DIM_VECTOR] != vector && gang)) + default_dims[GOMP_DIM_VECTOR] = vector; + + GOMP_PLUGIN_debug (0, " default dimensions [%d,%d,%d]\n", + default_dims[GOMP_DIM_GANG], + default_dims[GOMP_DIM_WORKER], + default_dims[GOMP_DIM_VECTOR]); + } + for (i = 0; i != GOMP_DIM_MAX; i++) - if (!dims[i]) - dims[i] = /* TODO */ 32; + if (!dims[i]) + dims[i] = default_dims[i]; } /* This reserves a chunk of a pre-allocated page of memory mapped on both @@ -954,8 +1030,8 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, mapnum * sizeof (void *)); GOMP_PLUGIN_debug (
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
On Wed, Nov 02, 2016 at 10:55:23AM -0600, Martin Sebor wrote: > >That's an unfair assertion in light of the numbers above. > > > >>If you want a warning for suspicious calls, sure, but > >>1) it has to be clearly worded significantly differently from how do you > >> word it, so that users really understand you are warning about > >> suspicious code (though, I really believe it is very common and there > >> is really nothing the users can do about it) > >>2) it really shouldn't be enabled in -Wall, and I think not even in > >>-Wextra > > > >As you have argued yourself recently in our discussion of > >-Wimplicit-fallthrough, warnings that aren't enabled by either > >of these options don't generally benefit users because very few > >turn them on explicitly. I agree with that argument although I > >would be in favor of rolling out a new warning gradually over > >two or more releases if it were known to be prone to high rates > >of false positive. The -Wstringop-overflow warning clearly isn't > >in that category so there's no need for it. My offer to do > >additional testing is still good if you'd like to see more data. But obviously not all levels of the warning can/should be enabled with -Wall/-Werror. There are cases which are worth warning by default (the case where we want to inform the user if you reach this stmt, you'll get your program killed (will call __chk_fail)) is something that ought like before be enabled by default; can have a warning switch users can disable. Then there is the case where there is a sure buffer overflow (not using -D_FORTIFY_SOURCE, but still __bos (, 0) tells the buffer is too short, and it is unconditional (no tricks with PHIs where one path has short and another part has long size). This is something that is useful in -Wall. The rest I'm very doubtful about even for -Wextra. One thing is that for useful warnings users have to be able to do something to quiet them. For -Wmisleading-indentation you indent your code properly, for -Wimplicit-fallthrough you add attributes or comments, etc. But I really don't know what you want people to do to tell the compiler the warning is a false positive. Add asm ("" : "+g" (ptr)); to hide everything from the compiler? Too ugly, too unportable, pessimizing correct code. Another thing is that __builtin_object_size (x, 1) is very fragile thing especially since the introduction of MEM_REF, but also various foldings etc. Just look up the numerous open PRs about it. We try to mitigate it in some early optimization passes, and had to introduce an early objsz pass copy to deal with that. But further down the optimization passes the distinctions that __bos (x, 1) needs to do are very often lost, optimizers fold (char *)&s + offsetof (struct S, fld) into &s.fld and vice versa, MEM_REF just doesn't tell you anything about what field is used, etc. It is unreliable in both directions. So trying to use it for anything very late in the GIMPLE opts is just a bad idea. Jakub
Re: [RFA] Fix false positive out of bounds array index warning in LRA
On 11/02/2016 01:20 PM, Bernd Schmidt wrote: On 10/29/2016 06:21 PM, Jeff Law wrote: On a small number of ports, we only have 2 defined register classes. NO_REGS and ALL_REGS. Examples would include nvptx and vax. So let's look at check_and_process_move from lra-constraints.c: sclass = dclass = NO_REGS; if (REG_P (dreg)) dclass = get_reg_class (REGNO (dreg)); if (dclass == ALL_REGS) /* ALL_REGS is used for new pseudos created by transformations like reload of SUBREG_REG (see function simplify_operand_subreg). We don't know their class yet. We should figure out the class from processing the insn constraints not in this fast path function. Even if ALL_REGS were a right class for the pseudo, secondary_... hooks usually are not define for ALL_REGS. */ return false; [ ... ] /* Set up hard register for a reload pseudo for hook secondary_reload because some targets just ignore unassigned pseudos in the hook. */ if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0) { dregno = REGNO (dreg); reg_renumber[dregno] = ira_class_hard_regs[dclass][0]; } The reference to ira_class_hard_regs is flagged by VRP as having an out-of-bounds array reference. WTF!? Of course it's pretty obvious once you look at the dumps... On most targets dclass in this code will have a VR_VARYING range and as a result no warning will be issued. But on the ptx and vax ports VRP is able to give us the range ~[NO_REGS,ALL_REGS] aka ~[0,1] The out-of-range array index is now obvious. So I tried to look up the rules for enum values and it seems like we can't prove that the code in the if statement is dead. However, can't we at least prove that it is "dead enough" for warning purposes? For both C & C++ you can shove in a value outside the range of the enum. There is new strongly typed enum in C++11, but I didn't figure converting to those would be well received :-) So for the given code the range really is ~[0,1] and we can overflow the array bounds. Ok for the trunk? Hmm, seems like a deficiency in the warning code TBH, but I guess this patch can't hurt. Maybe open a PR for improving the warning. I guess we could enhance the warning on the assumption that getting a value that is not one of the enumeration constants. That's assuming we've got the right type in the right place. I'll take a look and see before going forward. jeff
Re: [PATCH, Fortran] Extension: Support legacy PARAMETER statements with -std=legacy (or -fdec?)
On 11/02/2016 04:47 PM, Fritz Reese wrote: All, Another quirk of legacy compilers is their syntax for PARAMETER statements. Such statements are similar to standard PARAMETER statements but lack parentheses following the PARAMETER keyword. There is a good reason the standard doesn't support this - because the statement becomes ambiguous with assignment statements in fixed form. Consider the following: parameter pi = 3.14d0 In fixed form, spaces are ignored so in standard Fortran the above looks like an assignment to the variable "parameterpi". In legacy compilers, the above is instead interpreted as parameter (pi = 3.14d0) which of course declares the variable 'pi' to be a parameter with the value 3.14. Please test this with a program that has a *variable* that's named PARAMETER. I have encountered Fortran compilers in the past that couldn't cope with it (I won't name names). Thanks and kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: [RFA] Fix false positive out of bounds array index warning in LRA
On 10/29/2016 06:21 PM, Jeff Law wrote: On a small number of ports, we only have 2 defined register classes. NO_REGS and ALL_REGS. Examples would include nvptx and vax. So let's look at check_and_process_move from lra-constraints.c: sclass = dclass = NO_REGS; if (REG_P (dreg)) dclass = get_reg_class (REGNO (dreg)); if (dclass == ALL_REGS) /* ALL_REGS is used for new pseudos created by transformations like reload of SUBREG_REG (see function simplify_operand_subreg). We don't know their class yet. We should figure out the class from processing the insn constraints not in this fast path function. Even if ALL_REGS were a right class for the pseudo, secondary_... hooks usually are not define for ALL_REGS. */ return false; [ ... ] /* Set up hard register for a reload pseudo for hook secondary_reload because some targets just ignore unassigned pseudos in the hook. */ if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0) { dregno = REGNO (dreg); reg_renumber[dregno] = ira_class_hard_regs[dclass][0]; } The reference to ira_class_hard_regs is flagged by VRP as having an out-of-bounds array reference. WTF!? Of course it's pretty obvious once you look at the dumps... On most targets dclass in this code will have a VR_VARYING range and as a result no warning will be issued. But on the ptx and vax ports VRP is able to give us the range ~[NO_REGS,ALL_REGS] aka ~[0,1] The out-of-range array index is now obvious. So I tried to look up the rules for enum values and it seems like we can't prove that the code in the if statement is dead. However, can't we at least prove that it is "dead enough" for warning purposes? Ok for the trunk? Hmm, seems like a deficiency in the warning code TBH, but I guess this patch can't hurt. Maybe open a PR for improving the warning. Bernd
Re: [PATCH 1/2, libgcc]: Implement _divmoddi4
On Wed, Nov 2, 2016 at 2:27 PM, Ian Lance Taylor wrote: > On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak wrote: >> This function will be used in a follow-up patch to implement >> TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call >> this function, so IMO it should be part of a generic library. >> >> 2016-10-31 Uros Bizjak >> >> * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4. >> * libgcc2.c (__divmoddi4): New function. >> * libgcc2.h (__divmoddi4): Declare. >> * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4 >> and __PFX_divmodti4. > > You aren't defining divmodti4 anywhere, so it seems premature to add > it to libgcc-std.ver.in. It is created magically for x86_64, in the same way e.g. divti3 is created. objdump of x86_64 libgcc.a: --cut here-- ... _divmoddi4.o: file format elf64-x86-64 Disassembly of section .text: <__divmodti4>: 0:41 57push %r15 2:41 56push %r14 4:49 89 f9 mov%rdi,%r9 ... --cut here-- Uros.
Re: RFD: Buffer handling for ASM_GENERATE_INTERNAL_LABEL
On 10/29/2016 04:17 PM, Trevor Saunders wrote: So actually a thing I've wanted to do for a while is add a long string building class to generate asm into that would use a set of buffers adding new ones when the space is needed. So that would look something like I'd do something a little simpler and just wrap an obstack in a class. Bernd
Re: [PATCH] xtensa: don't xfail gcc.c-torture/compile/20001226-1.c
On Wed, Nov 2, 2016 at 10:22 AM, augustine.sterl...@gmail.com wrote: > On Tue, Nov 1, 2016 at 12:45 PM, Max Filippov wrote: >> With jump trampolines implemented in binutils since 2.25 and enabled by >> default this test no longer fails on xtensa. >> >> 2016-11-01 Max Filippov >> gcc/testsuite/ >> * gcc.c-torture/compile/20001226-1.c: Don't xfail on xtensa. > > Approved. Applied to trunk. Thank you! -- Max
Re: [PATCH] xtensa: fix ICE on pr59037.c test
On Wed, Nov 2, 2016 at 10:23 AM, augustine.sterl...@gmail.com wrote: > On Tue, Nov 1, 2016 at 12:11 PM, Max Filippov wrote: >> xtensa gcc gets ICE on pr59037.c test because its xtensa_output_literal >> function cannot handle integer literals of sizes other than 4 and 8, >> whereas the test uses 16-byte int vector. >> Split integer literal formatting into the recursive function >> xtensa_output_integer_literal_parts capable of handling literals of any >> power of 2 size not less than 4. >> >> 2016-11-01 Max Filippov >> gcc/ >> * config/xtensa/xtensa.c (xtensa_output_integer_literal_parts): >> New function. >> (xtensa_output_literal): Use xtensa_output_integer_literal_parts >> to format MODE_INT and MODE_PARTIAL_INT literals. > > Approved. Applied to trunk. Thank you! -- Max
Re: [Patch ARM 4/4] Enable _Float16
On Wed, 2 Nov 2016, James Greenhalgh wrote: > Done in this revision, though this makes no difference to what gets tested > as these options are not added when compiling the probe functions > (check_effective_target_float16). These would need a patch like: > > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index abc4ac0..36573cc 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -2485,7 +2485,7 @@ proc check_effective_target_has_q_floating_suffix { } { > proc check_effective_target_float16 {} { > return [check_no_compiler_messages_nocache float16 object { > _Float16 x; > -}] > +} [add_options_for_float16 {}]] > } > > proc check_effective_target_float32 {} { > > To actually enable the tests. Then I think such a change should be made (for all these functions, probably separately from the ARM patch) so we have proper test coverage for _Float16 on ARM. (I noted when adding the dg-add-options to these tests that I hadn't tested this properly enabled the _Float64x / _Float128 tests on powerpc64le and that that could do with powerpc maintainer testing.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 0/3] use rtx_insn * more
On Wed, Nov 02, 2016 at 05:35:40PM +0100, Bernd Schmidt wrote: > On 11/02/2016 03:55 PM, David Malcolm wrote: > > > Did you mean this patch: > > https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html > > That one is ok after the test, sorry for not being more clear. ah, no worries :) and sorry about the wrong link. Trev > > > Bernd >
Re: [PATCH] Generate reproducible output independently of the build-path
Ximin Luo: > Joseph Myers: >> On Wed, 2 Nov 2016, Ximin Luo wrote: >> >>> This patch series adds a new environment variable SOURCE_PREFIX_MAP. When >>> this >>> is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value" >>> command-line argument. This makes the final binary output reproducible, and >> >> Only one argument? Sometimes you may want multiple -fdebug-prefix-map >> options (for both source and build directories, for example). Perhaps the >> value should be split on spaces to provide multiple such mappings? >> > > We wanted to keep this environment variable simple, and allowing this would > make it more costly for other projects to adopt. > > Whichever separator character we choose, we would have to either (a) disallow > it within the oldprefix/newprefix strings like what PATH does or (b) think of > an escaping scheme. Neither choice is great - with (a) since we want to map > *arbitrary* paths the restriction is less acceptable here than in PATH, and > with (b) it adds complexity to the spec, for uncommon use-cases. > If we picked '\n' as the separator character, we could perhaps live with the (a) restriction. I'll wait a while for others to comment some more. > In the case of an out-of-tree build, it is still possible with the current > proposal. Instead of: > > SOURCE_PREFIX_MAP="srcprefix=srcname:buildprefix=buildname" > > One could arrange the tree like: > > commonprefix/srcname > commonprefix/buildname > > then set SOURCE_PREFIX_MAP="commonprefix=." or > SOURCE_PREFIX_MAP="commonprefix/=". > -- GPG: ed25519/56034877E1F87C35 GPG: rsa4096/1318EFAC5FBBDBCE https://github.com/infinity0/pubkeys.git
Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)
On 11/02/2016 02:11 AM, Bernd Edlinger wrote: On 11/01/16 19:15, Bernd Edlinger wrote: On 11/01/16 18:11, Jason Merrill wrote: On Tue, Nov 1, 2016 at 11:45 AM, Bernd Edlinger wrote: On 11/01/16 16:20, Jason Merrill wrote: On 10/17/2016 03:18 PM, Bernd Edlinger wrote: I'm not even sure we need a new warning. Can we combine this warning with the block that currently follows? After 20 years of not having a warning on that, an implicitly enabled warning would at least break lots of bogus test cases. Would it, though? Which test cases still break with the current patch? Less than before, but there are still at least a few of them. I can make a list and send it tomorrow. FAIL: g++.dg/cpp1y/lambda-generic-udt.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/cpp1y/lambda-generic-xudt.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/init/new15.C -std=c++11 (test for excess errors) FAIL: g++.dg/init/new15.C -std=c++14 (test for excess errors) FAIL: g++.dg/init/new15.C -std=c++98 (test for excess errors) FAIL: g++.dg/ipa/inline-1.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/ipa/inline-1.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/ipa/inline-1.C -std=gnu++98 (test for excess errors) FAIL: g++.dg/ipa/inline-2.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/ipa/inline-2.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/ipa/inline-2.C -std=gnu++98 (test for excess errors) FAIL: g++.dg/tc1/dr20.C -std=c++11 (test for excess errors) FAIL: g++.dg/tc1/dr20.C -std=c++14 (test for excess errors) FAIL: g++.dg/tc1/dr20.C -std=c++98 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-1.C -std=gnu++98 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/tree-ssa/inline-2.C -std=gnu++98 (test for excess errors) FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto -flto-partition=1to1 -fno-use-linker-plugin FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto -flto-partition=none -fuse-linker-plugin FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O0 -flto -fuse-linker-plugin -fno-fat-lto-objects FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto -flto-partition=1to1 -fno-use-linker-plugin FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto -flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects FAIL: g++.dg/lto/20080908-1 cp_lto_20080908-1_0.o assemble, -O2 -flto -fuse-linker-plugin FAIL: g++.dg/lto/pr68811 cp_lto_pr68811_0.o assemble, -O2 FAIL: g++.old-deja/g++.law/except1.C -std=gnu++11 (test for excess errors) FAIL: g++.old-deja/g++.law/except1.C -std=gnu++14 (test for excess errors) FAIL: g++.old-deja/g++.law/except1.C -std=gnu++98 (test for excess errors) FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++11 (test for excess errors) FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++14 (test for excess errors) FAIL: g++.old-deja/g++.mike/p700.C -std=gnu++98 (test for excess errors) FAIL: g++.old-deja/g++.other/builtins10.C -std=c++11 (test for excess errors) FAIL: g++.old-deja/g++.other/builtins10.C -std=c++14 (test for excess errors) FAIL: g++.old-deja/g++.other/realloc.C -std=c++11 (test for excess errors) FAIL: g++.old-deja/g++.other/realloc.C -std=c++14 (test for excess errors) FAIL: g++.old-deja/g++.other/realloc.C -std=c++98 (test for excess errors) FAIL: g++.old-deja/g++.other/vbase5.C -std=c++11 (test for excess errors) FAIL: g++.old-deja/g++.other/vbase5.C -std=c++14 (test for excess errors) FAIL: g++.old-deja/g++.other/vbase5.C -std=c++98 (test for excess errors) The lto test case does emit the warning when assembling, but it still produces an executable and even executes it. Also g++.dg/cpp1y/lambda-generic-udt.C, g++.dg/tc1/dr20.C and g++.old-deja/g++.other/vbase5.C are execution tests. So I was wrong to assume these were all compile-only tests. I think that list should be fixable, if we decide to enable the warning by default. Yes, either by fixing the prototypes or disabling the warning. If we remove the DECL_ANTICIPATED check, I see some failures in builtin* tests due to missing extern "C". That seems appropriate at file scope, but I'm not sure it's right for namespace std. Interesting, what have you done? The attached. But now that you've found that the existing warning deals with people doing silly things like redeclaring __builtin_* I guess let's leave it alone, and add the warning to the DECL_ANTICIPATED block the way you proposed. Jason commit d93d421435f8c38b2019526c7645a59e79a92cc5 Author: Jason Merrill Date: Tue Nov 1 17:16:12 2016 -0400 rem-ant diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index ecf4d14..963087d 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/
Re: [ping * 4] PR35503 - warn for restrict
On 2 November 2016 at 23:07, Jason Merrill wrote: > On Wed, Nov 2, 2016 at 1:08 PM, Prathamesh Kulkarni > wrote: >> On 2 November 2016 at 18:29, Jason Merrill wrote: >>> Then I'll approve the whole patch. >> Thanks! >> Trying the patch on kernel build (allmodconfig) reveals the following >> couple of warnings: >> http://pastebin.com/Sv2HFDUv >> >> I think warning for str_error_r() is correct > > It's accurate, but unhelpful; snprintf isn't going to use the contents > of buf via the variadic argument, so this warning is just noise. Ah, indeed, it's just printing address of buf, not using the contents. > >> however I am not sure if >> warning for pager_preexec() is legit or a false positive: >> >> pager.c: In function 'pager_preexec': >> pager.c:35:12: warning: passing argument 2 to restrict-qualified >> parameter aliases with argument 4 [-Wrestrict] >> select(1, &in, NULL, &in, NULL); >> ^~~~~~ >> Is the warning correct for the above call to select() syscall ? > > The warning looks correct based on the prototype > > extern int select (int __nfds, fd_set *__restrict __readfds, >fd_set *__restrict __writefds, >fd_set *__restrict __exceptfds, >struct timeval *__restrict __timeout); > > But passing the same fd_set to both readfds and exceptfds seems > reasonable to me, so this also seems like a false positive. > > Looking at C11, I see this example: > > EXAMPLE 3 The function parameter declarations > void h(int n, int * restrict p, int * restrict q, int * restrict r) > { > int i; > for (i = 0; i < n; i++) > p[i] = q[i] + r[i]; > } > > illustrate how an unmodified object can be aliased through two > restricted pointers. In particular, if a and b > are disjoint arrays, a call of the form h(100, a, b, b) has defined > behavior, because array b is not > modified within function h. > > This is is another example of well-defined code that your warning will > complain about. Yes, that's a limitation of the patch, it just looks at the prototype, and not how the arguments are used in the function. > >> Should we instead keep it in Wextra, or continue keeping it in Wall ? > > It seems that it doesn't belong in -Wall. I don't feel strongly about > -Wextra. Should I commit the patch by keeping Wrestrict "standalone", ie, not including it in either Wall or Wextra ? Thanks, Prathamesh > > Jason
Re: [Patch ARM 4/4] Enable _Float16
On Mon, Oct 24, 2016 at 10:28:42PM +, Joseph Myers wrote: > On Mon, 24 Oct 2016, James Greenhalgh wrote: > > > Hi, > > > > Finally, having added support for single-step DFmode to HFmode conversions, > > this patch adds support for _Float16 to the ARM back-end. > > Given the need for -mfp16-format=ieee (on some processors), you should be > updating target-supports.exp:add_options_for_float16 to add that option in > the ARM case, and make sure that the tests run in cases where the option > is needed to enable support for this format. Done in this revision, though this makes no difference to what gets tested as these options are not added when compiling the probe functions (check_effective_target_float16). These would need a patch like: diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index abc4ac0..36573cc 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -2485,7 +2485,7 @@ proc check_effective_target_has_q_floating_suffix { } { proc check_effective_target_float16 {} { return [check_no_compiler_messages_nocache float16 object { _Float16 x; -}] +} [add_options_for_float16 {}]] } proc check_effective_target_float32 {} { To actually enable the tests. > > That means making sure that only __fp16 promotes and adding similar hooks to > > those used in the AArch64 port giving the excess precision rules, and > > marking HFmode as supported in libgcc. > > Does "supported in libgcc" mean this patch completes fixing bug 63250? >From my reading of the bug. Yes. I'll tag the gcc ChangeLog as a fix, so it is recorded against the PR. Thanks, James --- gcc/ 2016-11-02 James Greenhalgh PR target/63250 * config/arm/arm-builtins.c (arm_simd_floatHF_type_node): Rename to... (arm_fp16_type_node): ...This, make visibile. (arm_simd_builtin_std_type): Rename arm_simd_floatHF_type_node to arm_fp16_type_node. (arm_init_simd_builtin_types): Likewise. (arm_init_fp16_builtins): Likewise. * config/arm/arm.c (arm_excess_precision): New. (arm_floatn_mode): Likewise. (TARGET_C_EXCESS_PRECISION): Likewise. (TARGET_FLOATN_MODE): Likewise. (arm_promoted_type): Only promote arm_fp16_type_node. * config/arm/arm.h (arm_fp16_type_node): Declare. gcc/testsuite/ 2016-11-02 James Greenhalgh * lib/target-supports.exp (add_options_for_float16): Add -mfp16-format=ieee when testign arm*-*-*. diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index e73043d..5ed38d1 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -652,7 +652,8 @@ static struct arm_simd_type_info arm_simd_types [] = { }; #undef ENTRY -static tree arm_simd_floatHF_type_node = NULL_TREE; +/* The user-visible __fp16 type. */ +tree arm_fp16_type_node = NULL_TREE; static tree arm_simd_intOI_type_node = NULL_TREE; static tree arm_simd_intEI_type_node = NULL_TREE; static tree arm_simd_intCI_type_node = NULL_TREE; @@ -739,7 +740,7 @@ arm_simd_builtin_std_type (enum machine_mode mode, case XImode: return arm_simd_intXI_type_node; case HFmode: - return arm_simd_floatHF_type_node; + return arm_fp16_type_node; case SFmode: return float_type_node; case DFmode: @@ -840,8 +841,8 @@ arm_init_simd_builtin_types (void) /* Continue with standard types. */ /* The __builtin_simd{64,128}_float16 types are kept private unless we have a scalar __fp16 type. */ - arm_simd_types[Float16x4_t].eltype = arm_simd_floatHF_type_node; - arm_simd_types[Float16x8_t].eltype = arm_simd_floatHF_type_node; + arm_simd_types[Float16x4_t].eltype = arm_fp16_type_node; + arm_simd_types[Float16x8_t].eltype = arm_fp16_type_node; arm_simd_types[Float32x2_t].eltype = float_type_node; arm_simd_types[Float32x4_t].eltype = float_type_node; @@ -1754,11 +1755,11 @@ arm_init_iwmmxt_builtins (void) static void arm_init_fp16_builtins (void) { - arm_simd_floatHF_type_node = make_node (REAL_TYPE); - TYPE_PRECISION (arm_simd_floatHF_type_node) = GET_MODE_PRECISION (HFmode); - layout_type (arm_simd_floatHF_type_node); + arm_fp16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (arm_fp16_type_node) = GET_MODE_PRECISION (HFmode); + layout_type (arm_fp16_type_node); if (arm_fp16_format) -(*lang_hooks.types.register_builtin_type) (arm_simd_floatHF_type_node, +(*lang_hooks.types.register_builtin_type) (arm_fp16_type_node, "__fp16"); } diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 524c474..5695548 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -272,6 +272,7 @@ static bool arm_builtin_support_vector_misalignment (machine_mode mode, int misalignment, bool is_packed); static void arm_conditional_register_usage (void); +static enum flt_eval_method arm_excess_precision (enum e
Re: [Patch 3/4] Half to double precision conversions
On Mon, Oct 24, 2016 at 10:24:27PM +, Joseph Myers wrote: > On Mon, 24 Oct 2016, James Greenhalgh wrote: > > > Conversions from double precision floats to the ARM __fp16 are required > > to round only once. > > I'd expect that when fixing this you need to update > gcc.target/arm/fp16-rounding-ieee-1.c (and fp16-rounding-alt-1.c, I > suppose) to expect rounding once. But I don't see such a testsuite change > anywhere in this patch series. Right, I went back and checked my ARM test environment. Sure enough it was producing nonsense results. I've fixed my build system issue, which causes these tests to start failing as you predicted. I've now fixed them in this patch revision to use the value they would return if rounding happened directly rather than first through SFmode. Thanks, James --- gcc/ 2016-11-02 James Greenhalgh * config/arm/arm.c (arm_convert_to_type): Delete. (TARGET_CONVERT_TO_TYPE): Delete. (arm_init_libfuncs): Enable trunc_optab from DFmode to HFmode. (arm_libcall_uses_aapcs_base): Add trunc_optab from DF- to HFmode. * config/arm/arm.h (TARGET_FP16_TO_DOUBLE): New. * config/arm/arm.md (truncdfhf2): Only convert through SFmode if we are in fast math mode, and have no single step hardware instruction. (extendhfdf2): Only expand through SFmode if we don't have a single-step hardware instruction. * config/arm/vfp.md (*truncdfhf2): New. (extendhfdf2): Likewise. gcc/testsuite/ 2016-11-02 James Greenhalgh * gcc.target/arm/fp16-rounding-alt-1.c (ROUNDED): Change expected result. * gcc.target/arm/fp16-rounding-ieee-1.c (ROUNDED): Change expected result. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 8ce4e4e..524c474 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -251,7 +251,6 @@ static bool arm_output_addr_const_extra (FILE *, rtx); static bool arm_allocate_stack_slots_for_args (void); static bool arm_warn_func_return (tree); static tree arm_promoted_type (const_tree t); -static tree arm_convert_to_type (tree type, tree expr); static bool arm_scalar_mode_supported_p (machine_mode); static bool arm_frame_pointer_required (void); static bool arm_can_eliminate (const int, const int); @@ -660,9 +659,6 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_PROMOTED_TYPE #define TARGET_PROMOTED_TYPE arm_promoted_type -#undef TARGET_CONVERT_TO_TYPE -#define TARGET_CONVERT_TO_TYPE arm_convert_to_type - #undef TARGET_SCALAR_MODE_SUPPORTED_P #define TARGET_SCALAR_MODE_SUPPORTED_P arm_scalar_mode_supported_p @@ -2565,6 +2561,11 @@ arm_init_libfuncs (void) ? "__gnu_h2f_ieee" : "__gnu_h2f_alternative")); + set_conv_libfunc (trunc_optab, HFmode, DFmode, + (arm_fp16_format == ARM_FP16_FORMAT_IEEE + ? "__gnu_d2h_ieee" + : "__gnu_d2h_alternative")); + /* Arithmetic. */ set_optab_libfunc (add_optab, HFmode, NULL); set_optab_libfunc (sdiv_optab, HFmode, NULL); @@ -5314,6 +5315,8 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) SFmode)); add_libcall (libcall_htab, convert_optab_libfunc (trunc_optab, SFmode, DFmode)); + add_libcall (libcall_htab, + convert_optab_libfunc (trunc_optab, HFmode, DFmode)); } return libcall && libcall_htab->find (libcall) != NULL; @@ -23742,23 +23745,6 @@ arm_promoted_type (const_tree t) return NULL_TREE; } -/* Implement TARGET_CONVERT_TO_TYPE. - Specifically, this hook implements the peculiarity of the ARM - half-precision floating-point C semantics that requires conversions between - __fp16 to or from double to do an intermediate conversion to float. */ - -static tree -arm_convert_to_type (tree type, tree expr) -{ - tree fromtype = TREE_TYPE (expr); - if (!SCALAR_FLOAT_TYPE_P (fromtype) || !SCALAR_FLOAT_TYPE_P (type)) -return NULL_TREE; - if ((TYPE_PRECISION (fromtype) == 16 && TYPE_PRECISION (type) > 32) - || (TYPE_PRECISION (type) == 16 && TYPE_PRECISION (fromtype) > 32)) -return convert (type, convert (float_type_node, expr)); - return NULL_TREE; -} - /* Implement TARGET_SCALAR_MODE_SUPPORTED_P. This simply adds HFmode as a supported mode; even though we don't implement arithmetic on this type directly, it's supported by diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index fa8e84c..0f9a679 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -194,6 +194,11 @@ extern void (*arm_lang_output_object_attributes_hook)(void); #define TARGET_FP16 \ (ARM_FPU_FSET_HAS (TARGET_FPU_FEATURES, FPU_FL_FP16)) +/* FPU supports converting between HFmode and DFmode in a single hardware + step. */ +#define TARGET_FP16_TO_DOUBLE \ + (TARGET_HARD_FLOAT && (TARGET_FP16 && TARGET_VFP5)) + /* FPU supports fused-multiply-add operations. */ #define TARGET_FMA (TARGET_FPU_REV >= 4) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)
On Wed, Nov 2, 2016 at 12:37 PM, Joseph Myers wrote: > On Wed, 2 Nov 2016, Jason Merrill wrote: > >> It seems to me that the general principle is that we should consider >> the location where the thing we're warning about is happening. In >> >>float_var = LLONG_MIN; >> >> The relevant location is that of the assignment, not the constant on >> the RHS. In your ?: example, a simple answer would be to warn based > > I'm not sure we track locations well enough to handle that yet. > > Say you have an implicit conversion of a function argument to the type > from the prototype and something about that conversion should be warned > about. Then you should obviously warn if the argument is a macro from a > system header but the call is outside a system header. But say the > function call itself comes from a macro defined in a system header - you > should still warn if the user passed an argument of the wrong type, even > if that argument is a macro from a system header. > > That is: > > /* System header. */ > int __foo (int); > /* This sort of macro to avoid accidental interposition issues has been >discussed lately on libc-alpha, so it's a realistic example. */ > #define foo(x) (0 + __foo (x)) > /* User code. */ > v = foo (NULL); > > should warn because the call to __foo logically results from user code > even though both NULL and foo are macros defined in system headers. I'm > not sure what the locations look like here. Sure, the problem here comes from the user combining the two macros. I suppose in this case you could notice that the macro expansions of 'foo' and 'NULL' are not nested. Jason
Re: [Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION
On Fri, Oct 28, 2016 at 09:09:55PM +, Joseph Myers wrote: > On Fri, 14 Oct 2016, James Greenhalgh wrote: > > > +/* If the join of the implicit precision in which the target will compute > > + floating-point values and the standard precision in which the target > > will > > + compute values is not equal to the standard precision, then the target > > + is either unpredictable, or is a broken configuration in which it claims > > + standards compliance, but doesn't honor that. > > + > > + Effective predictability for __GCC_IEC_559 in flag_iso_mode, means that > > + the implicit precision is not wider, or less predictable than the > > + standard precision. > > + > > + Return TRUE if we have been asked to compile with > > + -fexcess-precision=standard, and following the rules above we are able > > + to guarantee the standards mode. */ > > + > > I'm not convinced by the logic you have here. At least, it seems > different from what we have at present, where -std=c11 > -fexcess-precision=fast is not considered unpredictable if the target > doesn't have any implicit excess precision. > > That is: I think the right question is whether the combination (front-end > excess precision, implicit back-end excess precision) does the same thing > as just front-end excess precision, regardless of the -fexcess-precision= > option. OK, I've reworked the patch along those lines. I noticed that the original logic looked for && TARGET_FLT_EVAL_METHOD != 0 And I no longer make that check. Is that something I need to reinstate? I didn't find any reference to excess precision in Annex F, so I'd guess not? Thanks, James --- gcc/ 2016-11-02 James Greenhalgh * toplev.c (init_excess_precision): Delete most logic. * tree.c (excess_precision_type): Rewrite to use TARGET_EXCESS_PRECISION. * doc/invoke.texi (-fexcess-precision): Document behaviour in a more generic fashion. * ginclude/float.h: Wrap definition of FLT_EVAL_METHOD in __STDC_WANT_IEC_60559_TYPES_EXT__. gcc/c-family/ 2016-11-02 James Greenhalgh * c-common.c (excess_precision_mode_join): New. (c_ts18661_flt_eval_method): New. (c_c11_flt_eval_method): Likewise. (c_flt_eval_method): Likewise. * c-common.h (excess_precision_mode_join): New. (c_flt_eval_method): Likewise. * c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New. (cpp_iec_559_value): Call it. (c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__, call c_flt_eval_method to set __FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_TS_18661_3__. gcc/testsuite/ 2016-11-02 James Greenhalgh * gcc.dg/fpermitted-flt-eval-methods_3.c: New. * gcc.dg/fpermitted-flt-eval-methods_4.c: Likewise. diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 307862b..9f0b4a6 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -7944,4 +7944,86 @@ cb_get_suggestion (cpp_reader *, const char *goal, return bm.get_best_meaningful_candidate (); } +/* Return the latice point which is the wider of the two FLT_EVAL_METHOD + modes X, Y. This isn't just >, as the FLT_EVAL_METHOD values added + by C TS 18661-3 for interchange types that are computed in their + native precision are larger than the C11 values for evaluating in the + precision of float/double/long double. If either mode is + FLT_EVAL_METHOD_UNPREDICTABLE, return that. */ + +enum flt_eval_method +excess_precision_mode_join (enum flt_eval_method x, + enum flt_eval_method y) +{ + if (x == FLT_EVAL_METHOD_UNPREDICTABLE + || y == FLT_EVAL_METHOD_UNPREDICTABLE) +return FLT_EVAL_METHOD_UNPREDICTABLE; + + /* GCC only supports one interchange type right now, _Float16. If + we're evaluating _Float16 in 16-bit precision, then flt_eval_method + will be FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16. */ + if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16) +return y; + if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16) +return x; + + /* Other values for flt_eval_method are directly comparable, and we want + the maximum. */ + return MAX (x, y); +} + +/* Return the value that should be set for FLT_EVAL_METHOD in the + context of ISO/IEC TS 18861-3. + + This relates to the effective excess precision seen by the user, + which is the join point of the precision the target requests for + -fexcess-precision={standard,fast} and the implicit excess precision + the target uses. */ + +static enum flt_eval_method +c_ts18661_flt_eval_method (void) +{ + enum flt_eval_method implicit += targetm.c.excess_precision (EXCESS_PRECISION_TYPE_IMPLICIT); + + enum excess_precision_type flag_type += (flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD + ? EXCESS_PRECISION_TYPE_STANDARD + : EXCESS_PRECISION_TYPE_FAST); + + enum flt_eval_method requested += targetm.c.excess_precision (flag_type); + + retu
Re: [ping * 4] PR35503 - warn for restrict
On Wed, Nov 2, 2016 at 1:08 PM, Prathamesh Kulkarni wrote: > On 2 November 2016 at 18:29, Jason Merrill wrote: >> Then I'll approve the whole patch. > Thanks! > Trying the patch on kernel build (allmodconfig) reveals the following > couple of warnings: > http://pastebin.com/Sv2HFDUv > > I think warning for str_error_r() is correct It's accurate, but unhelpful; snprintf isn't going to use the contents of buf via the variadic argument, so this warning is just noise. > however I am not sure if > warning for pager_preexec() is legit or a false positive: > > pager.c: In function 'pager_preexec': > pager.c:35:12: warning: passing argument 2 to restrict-qualified > parameter aliases with argument 4 [-Wrestrict] > select(1, &in, NULL, &in, NULL); > ^~~~~~ > Is the warning correct for the above call to select() syscall ? The warning looks correct based on the prototype extern int select (int __nfds, fd_set *__restrict __readfds, fd_set *__restrict __writefds, fd_set *__restrict __exceptfds, struct timeval *__restrict __timeout); But passing the same fd_set to both readfds and exceptfds seems reasonable to me, so this also seems like a false positive. Looking at C11, I see this example: EXAMPLE 3 The function parameter declarations void h(int n, int * restrict p, int * restrict q, int * restrict r) { int i; for (i = 0; i < n; i++) p[i] = q[i] + r[i]; } illustrate how an unmodified object can be aliased through two restricted pointers. In particular, if a and b are disjoint arrays, a call of the form h(100, a, b, b) has defined behavior, because array b is not modified within function h. This is is another example of well-defined code that your warning will complain about. > Should we instead keep it in Wextra, or continue keeping it in Wall ? It seems that it doesn't belong in -Wall. I don't feel strongly about -Wextra. Jason
Re: [Patch 1/11] Add a new target hook for describing excess precision intentions
On Fri, Oct 28, 2016 at 09:12:11PM +, Joseph Myers wrote: > On Fri, 14 Oct 2016, James Greenhalgh wrote: > > > + value set for @code{-fexcess-precision=[standard|fast]}.", > > I think the correct markup for the option here is: > > @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}} > > (that is, using @option not @code, and with the [ | ] not in a fixed-width > font because they aren't part of the option name). > Thanks, I've updated that line in this revision of the patch following your suggestion. James --- gcc/ 2016-11-02 James Greenhalgh * target.def (excess_precision): New hook. * target.h (flt_eval_method): New. (excess_precision_type): Likewise. * targhooks.c (default_excess_precision): New. * targhooks.h (default_excess_precision): New. * doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): New. * doc/tm.texi: Regenerate. diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 869f858..09f8213 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -331,6 +331,24 @@ enum symbol_visibility VISIBILITY_INTERNAL }; +/* enums used by the targetm.excess_precision hook. */ + +enum flt_eval_method +{ + FLT_EVAL_METHOD_UNPREDICTABLE = -1, + FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0, + FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1, + FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2, + FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16 +}; + +enum excess_precision_type +{ + EXCESS_PRECISION_TYPE_IMPLICIT, + EXCESS_PRECISION_TYPE_STANDARD, + EXCESS_PRECISION_TYPE_FAST +}; + /* Support for user-provided GGC and PCH markers. The first parameter is a pointer to a pointer, the second a cookie. */ typedef void (*gt_pointer_operator) (void *, void *); diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 0aed3d4..d1d1e76 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -947,6 +947,10 @@ sign-extend the result to 64 bits. On such machines, set Do not define this macro if it would never modify @var{m}. @end defmac +@deftypefn {Target Hook} {enum flt_eval_method} TARGET_C_EXCESS_PRECISION (enum excess_precision_type @var{type}) +Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that describes which excess precision should be applied. @var{type} is either @code{EXCESS_PRECISION_TYPE_IMPLICIT}, @code{EXCESS_PRECISION_TYPE_FAST}, or @code{EXCESS_PRECISION_TYPE_STANDARD}. For @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which precision and range operations will be implictly evaluated in regardless of the excess precision explicitly added. For @code{EXCESS_PRECISION_TYPE_STANDARD} and @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the explicit excess precision that should be added depending on the value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}. +@end deftypefn + @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return}) Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or function return values. The target hook should return the new mode diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 9e5b456..12f6bc0 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -921,6 +921,8 @@ sign-extend the result to 64 bits. On such machines, set Do not define this macro if it would never modify @var{m}. @end defmac +@hook TARGET_C_EXCESS_PRECISION + @hook TARGET_PROMOTE_FUNCTION_MODE @defmac PARM_BOUNDARY diff --git a/gcc/target.def b/gcc/target.def index c3c87b0..9fc31b3 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5430,6 +5430,23 @@ DEFHOOK_UNDOC machine_mode, (char c), default_mode_for_suffix) +DEFHOOK +(excess_precision, + "Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that\ + describes which excess precision should be applied. @var{type} is\ + either @code{EXCESS_PRECISION_TYPE_IMPLICIT},\ + @code{EXCESS_PRECISION_TYPE_FAST}, or\ + @code{EXCESS_PRECISION_TYPE_STANDARD}. For\ + @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which\ + precision and range operations will be implictly evaluated in regardless\ + of the excess precision explicitly added. For\ + @code{EXCESS_PRECISION_TYPE_STANDARD} and\ + @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the\ + explicit excess precision that should be added depending on the\ + value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.", + enum flt_eval_method, (enum excess_precision_type type), + default_excess_precision) + HOOK_VECTOR_END (c) /* Functions specific to the C++ frontend. */ diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 866747a..73e1c25 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -2135,4 +2135,12 @@ default_min_arithmetic_precision (void) return WORD_REGISTER_OPERATIONS ? BITS_PER_WORD : BITS_PER_UNIT; } +/* Default implementation of TARGET_C_EXCESS_PRECISION. */ +
Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those
On Wed, Nov 2, 2016 at 12:33 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 04:44:05PM +0100, Jakub Jelinek wrote: >> which means if gen_type_die or gen_type_die_with_usage doesn't >> use the langhook, then we'd emit a completely useless { __pfn; __delta } >> struct into debug info first, and then in modified_type_die used >> the langhook, get OFFSET_TYPE and probably create the >> DW_TAG_ptr_to_member_type. So I think we really need that. >> >> > > 2) it is used for something Ada-ish I really don't know how to test etc. >> > >to be able to find out if it is safe to call it in >> > >gen_type_die_with_usage too >> > >> > You could find an Ada test that uses the code and verify that the >> > output stays the same? >> >> I can try to find the patch that introduced it and if it contained any >> testcases. > > I couldn't find any unfortunately. Pierre, could you please test if the > following patch doesn't regress anything in the Ada debug info area? > > Here is updated patch I'm going to bootstrap/regtest; it generates the > same debuginfo on ref-3.C testcase. Looks good. Jason
Re: [PATCH] xtensa: fix ICE on pr59037.c test
On Tue, Nov 1, 2016 at 12:11 PM, Max Filippov wrote: > xtensa gcc gets ICE on pr59037.c test because its xtensa_output_literal > function cannot handle integer literals of sizes other than 4 and 8, > whereas the test uses 16-byte int vector. > Split integer literal formatting into the recursive function > xtensa_output_integer_literal_parts capable of handling literals of any > power of 2 size not less than 4. > > 2016-11-01 Max Filippov > gcc/ > * config/xtensa/xtensa.c (xtensa_output_integer_literal_parts): > New function. > (xtensa_output_literal): Use xtensa_output_integer_literal_parts > to format MODE_INT and MODE_PARTIAL_INT literals. Approved.
Re: [PATCH] xtensa: don't xfail gcc.c-torture/compile/20001226-1.c
On Tue, Nov 1, 2016 at 12:45 PM, Max Filippov wrote: > With jump trampolines implemented in binutils since 2.25 and enabled by > default this test no longer fails on xtensa. > > 2016-11-01 Max Filippov > gcc/testsuite/ > * gcc.c-torture/compile/20001226-1.c: Don't xfail on xtensa. Approved.
PR61409: -Wmaybe-uninitialized false-positive with -O2
Hi Jeff. As discussed in the PR, here is a patch exploring your idea of ignoring unguarded uses if we can prove that the guards for such uses are invalidated by the uninitialized operand paths being executed. This is an updated patch from my suggestion in the PR. It bootstraps with no regression on x86-64 Linux, and fixes the PR in question. As the "NOTE:" in the code states, we could be much smarter when invalidating predicates, but for now let's do straight negation which works for the simple case. We could expand on this in the future. OK for trunk? commit 8375d7e28c1a798dd0cc0f487d7fa1068d9eb124 Author: Aldy Hernandez Date: Thu Aug 25 10:44:29 2016 -0400 PR middle-end/61409 * tree-ssa-uninit.c (use_pred_not_overlap_with_undef_path_pred): Remove reference to missing NUM_PREDS in function comment. (can_one_predicate_be_invalidated_p): New. (can_chain_union_be_invalidated_p): New. (flatten_out_predicate_chains): New. (uninit_ops_invalidate_phi_use): New. (is_use_properly_guarded): Call uninit_ops_invalidate_phi_use. diff --git a/gcc/testsuite/gcc.dg/uninit-pr61409.c b/gcc/testsuite/gcc.dg/uninit-pr61409.c new file mode 100644 index 000..c27a67b --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-pr61409.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -Wuninitialized" } */ + +int *rw; +int line_height; +int pixel_width; +int text_cols; +int width1, width2, width3; + +void *pointer; + +void f (int i, int j) +{ + void *ptr; + if (i) +{ + if (j) + return; + ptr = pointer; +} + pixel_width = 1234 + width1 + 2 * width2 + 2 * width3; + *rw = text_cols + line_height; + if (i) +rw=ptr; /* { dg-bogus "uninitialized" "bogus warning" } */ +} diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 1344854..37c99a7 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -1184,11 +1184,10 @@ prune_uninit_phi_opnds (gphi *phi, unsigned uninit_opnds, gphi *flag_def, transformation which eliminates the merge point thus makes path sensitive analysis unnecessary.) - NUM_PREDS is the number is the number predicate chains, PREDS is - the array of chains, PHI is the phi node whose incoming (undefined) - paths need to be pruned, and UNINIT_OPNDS is the bitmap holding - uninit operand positions. VISITED_PHIS is the pointer set of phi - stmts being checked. */ + PHI is the phi node whose incoming (undefined) paths need to be + pruned, and UNINIT_OPNDS is the bitmap holding uninit operand + positions. VISITED_PHIS is the pointer set of phi stmts being + checked. */ static bool use_pred_not_overlap_with_undef_path_pred (pred_chain_union preds, @@ -2140,6 +2139,158 @@ normalize_preds (pred_chain_union preds, gimple *use_or_def, bool is_use) return norm_preds; } +/* Return TRUE if PREDICATE can be invalidated by any individual + predicate in WORKLIST. */ + +static bool +can_one_predicate_be_invalidated_p (pred_info predicate, + vec worklist) +{ + for (size_t i = 0; i < worklist.length (); ++i) +{ + pred_info *p = worklist[i]; + + /* NOTE: This is a very simple check, and only understands an +exact opposite. So, [i == 0] is currently only invalidated +by [.NOT. i == 0] or [i != 0]. Ideally we should also +invalidate with say [i > 5] or [i == 8]. There is certainly +room for improvement here. */ + if (pred_neg_p (predicate, *p)) + return true; +} + return false; +} + +/* Return TRUE if all USE_PREDS can be invalidated by some predicate + in WORKLIST. */ + +static bool +can_chain_union_be_invalidated_p (pred_chain_union use_preds, + vec worklist) +{ + /* Remember: + PRED_CHAIN_UNION = PRED_CHAIN1 || PRED_CHAIN2 || PRED_CHAIN3 + PRED_CHAIN = PRED_INFO1 && PRED_INFO2 && PRED_INFO3, etc. + + We need to invalidate the entire PRED_CHAIN_UNION, which means, + invalidating every PRED_CHAIN in this union. But to invalidate + an individual PRED_CHAIN, all we need to invalidate is _any_ one + PRED_INFO, by boolean algebra !PRED_INFO1 || !PRED_INFO2... */ + for (size_t i = 0; i < use_preds.length (); ++i) +{ + pred_chain c = use_preds[i]; + bool entire_pred_chain_invalidated = false; + for (size_t j = 0; j < c.length (); ++j) + if (can_one_predicate_be_invalidated_p (c[i], worklist)) + { + entire_pred_chain_invalidated = true; + break; + } + if (!entire_pred_chain_invalidated) + return false; +} + return true; +} + +/* Flatten out all the factors in all the pred_chain_union's in PREDS + into a WORKLIST of individual PRED_INFO's. + + N is the number of pred_chain_union's in PREDS. + + Since we are interested in the inverse of the PRED_CHAIN's, by + boolean algebra, an inve
Re: [PATCH][AArch64] Improve SHA1 scheduling
On Tue, Oct 25, 2016 at 10:08 AM, Wilco Dijkstra wrote: > SHA1H instructions may be scheduled after a SHA1C instruction > that uses the same input register. However SHA1C updates its input, > so if SHA1H is scheduled after it, it requires an extra move. > Increase the priority of SHA1H to ensure it gets scheduled > earlier, avoiding the move. > > Is this something the generic scheduler could do automatically for > instructions with RMW operands? I was thinking that but there are many of those on x86 so it might not make sense at all. Also the SIMD instruction FMLA has the similar problem most likely though I don't know if it make sense to do a similar thing for that one. Maybe it makes sense to mark the instructions in the .md file with some attribute and then just check for that attribute here instead of special casing SHA1C. Something like (not a full patch and don't claim this is correct): (define_attr "rmw" "yes,no" (const_string "no")) ... "sha1\\t%q0, %s2, %3.4s" [(set_attr "type" "crypto_sha1_slow")] [(set_attr "rmw" "yes")] ... static int aarch64_sched_adjust_priority (rtx_insn *insn, int priority) { if (get_attr_rmw (insn) == yes) return MAX(priority - 10, 0); return priority; } Thanks, Andrew > > Passes bootstrap & regress. OK for commit? > > ChangeLog: > 2016-10-25 Wilco Dijkstra > > * config/aarch64/aarch64.c (aarch64_sched_adjust_priority) > New function. > (TARGET_SCHED_ADJUST_PRIORITY): Define target hook. > -- > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index > 9b2f9cb19343828dc39e9950ebbefe941521942a..2b25bd1bdd6f4e7737f8e04c3b3684cdff6c4b80 > 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -13668,6 +13668,26 @@ aarch64_sched_fusion_priority (rtx_insn *insn, int > max_pri, >return; > } > > +/* Implement the TARGET_SCHED_ADJUST_PRIORITY hook. > + Adjust priority of sha1h instructions so they are scheduled before > + other SHA1 instructions. */ > + > +static int > +aarch64_sched_adjust_priority (rtx_insn *insn, int priority) > +{ > + rtx x = PATTERN (insn); > + > + if (GET_CODE (x) == SET) > +{ > + x = SET_SRC (x); > + > + if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_SHA1H) > + return priority + 10; > +} > + > + return priority; > +} > + > /* Given OPERANDS of consecutive load/store, check if we can merge > them into ldp/stp. LOAD is true if they are load instructions. > MODE is the mode of memory operands. */ > @@ -14431,6 +14451,9 @@ aarch64_optab_supported_p (int op, machine_mode > mode1, machine_mode, > #undef TARGET_CAN_USE_DOLOOP_P > #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost > > +#undef TARGET_SCHED_ADJUST_PRIORITY > +#define TARGET_SCHED_ADJUST_PRIORITY aarch64_sched_adjust_priority > + > #undef TARGET_SCHED_MACRO_FUSION_P > #define TARGET_SCHED_MACRO_FUSION_P aarch64_macro_fusion_p > >
Re: [ping * 4] PR35503 - warn for restrict
On 2 November 2016 at 18:29, Jason Merrill wrote: > Then I'll approve the whole patch. Thanks! Trying the patch on kernel build (allmodconfig) reveals the following couple of warnings: http://pastebin.com/Sv2HFDUv I think warning for str_error_r() is correct, however I am not sure if warning for pager_preexec() is legit or a false positive: pager.c: In function 'pager_preexec': pager.c:35:12: warning: passing argument 2 to restrict-qualified parameter aliases with argument 4 [-Wrestrict] select(1, &in, NULL, &in, NULL); ^~~~~~ Is the warning correct for the above call to select() syscall ? I am a bit anxious about keeping the warning in Wall because it's breaking the kernel. Should we instead keep it in Wextra, or continue keeping it in Wall ? Also is there a way to gracefully disable Werror for kernel builds ? I tried: make allmodconfig make all KCFLAGS="-Wno-error=restrict" CFLAGS="-Wno-error=restrict" -j8 but that didn't work. I managed to workaround by manually modifying Makefiles to not pass Werror, but I hope there's a better way. Thanks, Prathamesh > > On Wed, Nov 2, 2016 at 8:42 AM, Joseph Myers wrote: >> The format-checking parts of the patch are OK. >> >> -- >> Joseph S. Myers >> jos...@codesourcery.com
[PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr
Hi, When saving registers, function thumb1_expand_prologue () aims at minimizing the number of push instructions. One of the optimization it does is to push lr alongside high register(s) (after having moved them to low register(s)) when there is no low register to save. The way this is implemented is to add lr to the list of registers that can be pushed just before the push happens. This would then push lr and allows it to be used for further push if there was not enough registers to push all high registers to be pushed. However, the logic that decides what register to move high registers to before being pushed only looks at low registers (see for loop initialization). This means not only that lr is not used for pushing high registers but also that lr is not removed from the list of registers to be pushed when it's not used. This extra lr push is not poped in epilogue leading in stack corruption. This patch changes the loop to iterate over register r0 to lr so as to both fix the stack corruption and reuse lr to push some high register when possible. ChangeLog entry are as follow: *** gcc/ChangeLog *** 2016-11-01 Thomas Preud'homme PR target/77933 * config/arm/arm.c (thumb1_expand_prologue): Also check for lr being a pushable register. *** gcc/testsuite/ChangeLog *** 2016-11-01 Thomas Preud'homme PR target/77933 * gcc.target/arm/pr77933.c: New test. Testing: no regression on arm-none-eabi GCC cross-compiler targeting Cortex-M0 Is this ok for trunk? Best regards, Thomas diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index dd8d5e5db8ca50daab648e58df290969aa794862..22a20caf42389748fc64ee0a3f880c6bea4c8f49 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -24990,7 +24990,7 @@ thumb1_expand_prologue (void) { unsigned long real_regs_mask = 0; - for (regno = LAST_LO_REGNUM; regno >= 0; regno --) + for (regno = LR_REGNUM; regno >= 0; regno --) { if (pushable_regs & (1 << regno)) { diff --git a/gcc/testsuite/gcc.target/arm/pr77933.c b/gcc/testsuite/gcc.target/arm/pr77933.c new file mode 100644 index ..cba7e9bb9aa57c6755f79b5ec2ea1a8744e23599 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr77933.c @@ -0,0 +1,9 @@ +/* { dg-do run } */ +/* { dg-options "-O1" } */ + +int +main (void) +{ + __asm__ volatile ("" : : : "r8", "r9", "lr"); + return 0; +}
Re: [PATCH] Generate reproducible output independently of the build-path
Joseph Myers: > On Wed, 2 Nov 2016, Ximin Luo wrote: > >> This patch series adds a new environment variable SOURCE_PREFIX_MAP. When >> this >> is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value" >> command-line argument. This makes the final binary output reproducible, and > > Only one argument? Sometimes you may want multiple -fdebug-prefix-map > options (for both source and build directories, for example). Perhaps the > value should be split on spaces to provide multiple such mappings? > We wanted to keep this environment variable simple, and allowing this would make it more costly for other projects to adopt. Whichever separator character we choose, we would have to either (a) disallow it within the oldprefix/newprefix strings like what PATH does or (b) think of an escaping scheme. Neither choice is great - with (a) since we want to map *arbitrary* paths the restriction is less acceptable here than in PATH, and with (b) it adds complexity to the spec, for uncommon use-cases. In the case of an out-of-tree build, it is still possible with the current proposal. Instead of: SOURCE_PREFIX_MAP="srcprefix=srcname:buildprefix=buildname" One could arrange the tree like: commonprefix/srcname commonprefix/buildname then set SOURCE_PREFIX_MAP="commonprefix=." or SOURCE_PREFIX_MAP="commonprefix/=". X -- GPG: ed25519/56034877E1F87C35 GPG: rsa4096/1318EFAC5FBBDBCE https://github.com/infinity0/pubkeys.git
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
Sure, they might and in that case the warning would be a false positive. It wouldn't be the first such warning that wasn't 100% free of them. But my testing with Binutils, GCC, and the Linux kernel has exposed only 10 instances of new warnings and I don't think I saw this idiom among them. But even if some were, or if all of them were false positives I think that would be well within the acceptable rates. I've gone through all the new warnings. With one exception where it's unclear what exactly is going on they all appear to be due to the idiom of writing into a struct member to clear or copy data into the member and those that follow to the end of the struct, like so: struct S { int a, b, c } *s; memset (&s->b, 0, sizeof *s - offsetof (struct S, b)); It should be doable to recognize this idiom and suppress the warning in this case. Let me work on enhancing the patch to do that. Here are the numbers of warnings for Binutils and the kernel: 113 -Wimplicit-fallthrough 38 -Wformat-length= 12 -Wunused-const-variable= 10 -Wstringop-overflow 2 -Wdangling-else 2 -Wframe-address 2 -Wint-in-bool-context 1 -Wbool-operation Using some header structure at the beginning and then conditionally on fields in that structure various payloads occurs in many projects, starting with glibc, gcc, Linux kernel, ... The warning really must not be detached from reality. That's an unfair assertion in light of the numbers above. If you want a warning for suspicious calls, sure, but 1) it has to be clearly worded significantly differently from how do you word it, so that users really understand you are warning about suspicious code (though, I really believe it is very common and there is really nothing the users can do about it) 2) it really shouldn't be enabled in -Wall, and I think not even in -Wextra As you have argued yourself recently in our discussion of -Wimplicit-fallthrough, warnings that aren't enabled by either of these options don't generally benefit users because very few turn them on explicitly. I agree with that argument although I would be in favor of rolling out a new warning gradually over two or more releases if it were known to be prone to high rates of false positive. The -Wstringop-overflow warning clearly isn't in that category so there's no need for it. My offer to do additional testing is still good if you'd like to see more data. As for the wording, I welcome suggestions for improvements. Martin
[PATCH] rs6000: Disable shrink-wrap-separate for abi=spe (PR78168)
With the SPE ABI, if we wrap GPRs we need to handle the upper half of the extended 64-bit registers as well, which we cannot easily do. So, this patch disables separate shrink-wrapping for the SPE ABI. Tested on powerpc64-linux {-m32,-m64}; also tested on the testcase (I finally managed to build a config that fails). Committing to trunk. Segher 2016-11-02 Segher Boessenkool PR target/78168 * config/r6000/rs6000.c (rs6000_get_separate_components): Return NULL if TARGET_SPE_ABI. --- gcc/config/rs6000/rs6000.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index f9e4739..e5a6166 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -27633,6 +27633,9 @@ rs6000_get_separate_components (void) if (WORLD_SAVE_P (info)) return NULL; + if (TARGET_SPE_ABI) +return NULL; + sbitmap components = sbitmap_alloc (32); bitmap_clear (components); -- 1.9.3
Re: [PATCH][AArch64] Improve SHA1 scheduling
ping From: Wilco Dijkstra Sent: 25 October 2016 18:08 To: GCC Patches Cc: nd Subject: [PATCH][AArch64] Improve SHA1 scheduling SHA1H instructions may be scheduled after a SHA1C instruction that uses the same input register. However SHA1C updates its input, so if SHA1H is scheduled after it, it requires an extra move. Increase the priority of SHA1H to ensure it gets scheduled earlier, avoiding the move. Is this something the generic scheduler could do automatically for instructions with RMW operands? Passes bootstrap & regress. OK for commit? ChangeLog: 2016-10-25 Wilco Dijkstra * config/aarch64/aarch64.c (aarch64_sched_adjust_priority) New function. (TARGET_SCHED_ADJUST_PRIORITY): Define target hook. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 9b2f9cb19343828dc39e9950ebbefe941521942a..2b25bd1bdd6f4e7737f8e04c3b3684cdff6c4b80 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -13668,6 +13668,26 @@ aarch64_sched_fusion_priority (rtx_insn *insn, int max_pri, return; } +/* Implement the TARGET_SCHED_ADJUST_PRIORITY hook. + Adjust priority of sha1h instructions so they are scheduled before + other SHA1 instructions. */ + +static int +aarch64_sched_adjust_priority (rtx_insn *insn, int priority) +{ + rtx x = PATTERN (insn); + + if (GET_CODE (x) == SET) + { + x = SET_SRC (x); + + if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_SHA1H) + return priority + 10; + } + + return priority; +} + /* Given OPERANDS of consecutive load/store, check if we can merge them into ldp/stp. LOAD is true if they are load instructions. MODE is the mode of memory operands. */ @@ -14431,6 +14451,9 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode, #undef TARGET_CAN_USE_DOLOOP_P #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost +#undef TARGET_SCHED_ADJUST_PRIORITY +#define TARGET_SCHED_ADJUST_PRIORITY aarch64_sched_adjust_priority + #undef TARGET_SCHED_MACRO_FUSION_P #define TARGET_SCHED_MACRO_FUSION_P aarch64_macro_fusion_p
Re: [PATCH][AArch64 - v3] Simplify eh_return implementation
ping From: Wilco Dijkstra Sent: 02 September 2016 12:31 To: Ramana Radhakrishnan; GCC Patches Cc: nd Subject: Re: [PATCH][AArch64 - v3] Simplify eh_return implementation Ramana Radhakrishnan wrote: > Can you please file a PR for this and add some testcases ? This sounds like > a serious enough problem that needs to be looked at probably going back since > the dawn of time. I've created PR77455. Updated patch below: This patch simplifies the handling of the EH return value. We force the use of the frame pointer so the return location is always at FP + 8. This means we can emit a simple volatile access in EH_RETURN_HANDLER_RTX without needing md patterns, splitters and frame offset calculations. The new implementation also fixes various bugs in aarch64_final_eh_return_addr, which does not work with -fomit-frame-pointer, alloca or outgoing arguments. Bootstrap OK, GCC Regression OK, OK for trunk? Would it be useful to backport this to GCC6.x? ChangeLog: 2016-09-02 Wilco Dijkstra PR77455 gcc/ * config/aarch64/aarch64.md (eh_return): Remove pattern and splitter. * config/aarch64/aarch64.h (AARCH64_EH_STACKADJ_REGNUM): Remove. (EH_RETURN_HANDLER_RTX): New define. * config/aarch64/aarch64.c (aarch64_frame_pointer_required): Force frame pointer in EH return functions. (aarch64_expand_epilogue): Add barrier for eh_return. (aarch64_final_eh_return_addr): Remove. (aarch64_eh_return_handler_rtx): New function. * config/aarch64/aarch64-protos.h (aarch64_final_eh_return_addr): Remove. (aarch64_eh_return_handler_rtx): New prototype. testsuite/ * gcc.target/aarch64/eh_return.c: New test. -- diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 3cdd69b8af1089a839e5d45cda94bc70a15cd777..327c0a97f6f687604afef249b79ac22628418070 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -358,7 +358,7 @@ int aarch64_hard_regno_mode_ok (unsigned, machine_mode); int aarch64_hard_regno_nregs (unsigned, machine_mode); int aarch64_uxt_size (int, HOST_WIDE_INT); int aarch64_vec_fpconst_pow_of_2 (rtx); -rtx aarch64_final_eh_return_addr (void); +rtx aarch64_eh_return_handler_rtx (void); rtx aarch64_mask_from_zextract_ops (rtx, rtx); const char *aarch64_output_move_struct (rtx *operands); rtx aarch64_return_addr (int, rtx); diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 003fec87e41db618570663f28cc2387a87e8252a..fa81e4b853daf08842955288861ec7e7acca 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -400,9 +400,9 @@ extern unsigned aarch64_architecture_version; #define ASM_DECLARE_FUNCTION_NAME(STR, NAME, DECL) \ aarch64_declare_function_name (STR, NAME, DECL) -/* The register that holds the return address in exception handlers. */ -#define AARCH64_EH_STACKADJ_REGNUM (R0_REGNUM + 4) -#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, AARCH64_EH_STACKADJ_REGNUM) +/* For EH returns X4 contains the stack adjustment. */ +#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, R4_REGNUM) +#define EH_RETURN_HANDLER_RTX aarch64_eh_return_handler_rtx () /* Don't use __builtin_setjmp until we've defined it. */ #undef DONT_USE_BUILTIN_SETJMP diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index e742c19d76e6c62117aa62a990b9c2945aa06b74..f07d771ea343803e054e03f59c8c1efb698bf474 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2739,6 +2739,10 @@ aarch64_frame_pointer_required (void) && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM))) return true; + /* Force a frame pointer for EH returns so the return address is at FP+8. */ + if (crtl->calls_eh_return) + return true; + return false; } @@ -3298,7 +3302,8 @@ aarch64_expand_epilogue (bool for_sibcall) + cfun->machine->frame.saved_varargs_size) != 0; /* Emit a barrier to prevent loads from a deallocated stack. */ - if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca) + if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca + || crtl->calls_eh_return) { emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx)); need_barrier_p = false; @@ -3366,52 +3371,15 @@ aarch64_expand_epilogue (bool for_sibcall) emit_jump_insn (ret_rtx); } -/* Return the place to copy the exception unwinding return address to. - This will probably be a stack slot, but could (in theory be the - return register). */ +/* Implement EH_RETURN_HANDLER_RTX. The return address is stored at FP + 8. + The access needs to be volatile to prevent it from being removed. */ rtx -aarch64_final_eh_return_addr (void) +aarch64_eh_return_handler_rtx (void) { - HOST_WIDE_INT fp_offset; - - aarch64_layout_frame (); - - fp_offset = cfun->machine->frame.frame_si
Re: [PATCH v2][AArch64] Fix symbol offset limit
ping From: Wilco Dijkstra Sent: 12 September 2016 15:50 To: Richard Earnshaw; GCC Patches Cc: nd Subject: Re: [PATCH v2][AArch64] Fix symbol offset limit Wilco wrote: > The original example is from GCC itself, the fixed_regs array is small but > due to > optimization we can end up with &fixed_regs + 0x. We could also check the bounds of each symbol if they exist, like the patch below. In aarch64_classify_symbol symbols are allowed full-range offsets on relocations. This means the offset can use all of the +/-4GB offset, leaving no offset available for the symbol itself. This results in relocation overflow and link-time errors for simple expressions like &global_char + 0xff00. To avoid this, limit the offset to +/-1GB so that the symbol needs to be within a 3GB offset from its references. For the tiny code model use a 64KB offset, allowing most of the 1MB range for code/data between the symbol and its references. For symbols with a defined size, limit the offset to be within the size of the symbol. ChangeLog: 2016-09-12 Wilco Dijkstra gcc/ * config/aarch64/aarch64.c (aarch64_classify_symbol): Apply reasonable limit to symbol offsets. testsuite/ * gcc.target/aarch64/symbol-range.c (foo): Set new limit. * gcc.target/aarch64/symbol-range-tiny.c (foo): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 385bd560fb12cd5d404e6ddb2f01edf1fe72d729..275a828ac9e6e9b8187380c1b602ffb1b2bcfb21 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9351,6 +9351,8 @@ aarch64_classify_symbol (rtx x, rtx offset) if (aarch64_tls_symbol_p (x)) return aarch64_classify_tls_symbol (x); + const_tree decl = SYMBOL_REF_DECL (x); + switch (aarch64_cmodel) { case AARCH64_CMODEL_TINY: @@ -9359,25 +9361,45 @@ aarch64_classify_symbol (rtx x, rtx offset) we have no way of knowing the address of symbol at compile time so we can't accurately say if the distance between the PC and symbol + offset is outside the addressible range of +/-1M in the - TINY code model. So we rely on images not being greater than - 1M and cap the offset at 1M and anything beyond 1M will have to - be loaded using an alternative mechanism. Furthermore if the - symbol is a weak reference to something that isn't known to - resolve to a symbol in this module, then force to memory. */ + TINY code model. So we limit the maximum offset to +/-64KB and + assume the offset to the symbol is not larger than +/-(1M - 64KB). + Furthermore force to memory if the symbol is a weak reference to + something that doesn't resolve to a symbol in this module. */ if ((SYMBOL_REF_WEAK (x) && !aarch64_symbol_binds_local_p (x)) - || INTVAL (offset) < -1048575 || INTVAL (offset) > 1048575) + || !IN_RANGE (INTVAL (offset), -0x1, 0x1)) return SYMBOL_FORCE_TO_MEM; + + /* Limit offset to within the size of a declaration if available. */ + if (decl && DECL_P (decl)) + { + const_tree decl_size = DECL_SIZE (decl); + + if (decl_size + && !IN_RANGE (INTVAL (offset), 0, tree_to_shwi (decl_size))) + return SYMBOL_FORCE_TO_MEM; + } + return SYMBOL_TINY_ABSOLUTE; case AARCH64_CMODEL_SMALL: /* Same reasoning as the tiny code model, but the offset cap here is - 4G. */ + 1G, allowing +/-3G for the offset to the symbol. */ if ((SYMBOL_REF_WEAK (x) && !aarch64_symbol_binds_local_p (x)) - || !IN_RANGE (INTVAL (offset), HOST_WIDE_INT_C (-4294967263), - HOST_WIDE_INT_C (4294967264))) + || !IN_RANGE (INTVAL (offset), -0x4000, 0x4000)) return SYMBOL_FORCE_TO_MEM; + + /* Limit offset to within the size of a declaration if available. */ + if (decl && DECL_P (decl)) + { + const_tree decl_size = DECL_SIZE (decl); + + if (decl_size + && !IN_RANGE (INTVAL (offset), 0, tree_to_shwi (decl_size))) + return SYMBOL_FORCE_TO_MEM; + } + return SYMBOL_SMALL_ABSOLUTE; case AARCH64_CMODEL_TINY_PIC: diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c index d7e46b059e41f2672b3a1da5506fa8944e752e01..d399a3637ed834ddc4bb429594c4ec229b5c2ea8 100644 --- a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c +++ b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c @@ -1,12 +1,12 @@ -/* { dg-do compile } */ +/* { dg-do link } */ /* { dg-options "-O3 -save-temps -mcmodel=tiny" } */ -int fixed_regs[
Re: [PATCH v2] AIX visibility
On Wed, Nov 02, 2016 at 11:28:40AM -0500, Segher Boessenkool wrote: > On Wed, Nov 02, 2016 at 11:41:32AM -0400, David Edelsohn wrote: > > Any comments on ASM_WEAKEN_DECL change? > > It no longer checks RS6000_WEAK, is that always on now? Oh never mind, I can't read (it actually scrolled off my screen, what a great excuse). So, looks just fine to me. Segher
Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those
On Wed, Nov 02, 2016 at 04:44:05PM +0100, Jakub Jelinek wrote: > which means if gen_type_die or gen_type_die_with_usage doesn't > use the langhook, then we'd emit a completely useless { __pfn; __delta } > struct into debug info first, and then in modified_type_die used > the langhook, get OFFSET_TYPE and probably create the > DW_TAG_ptr_to_member_type. So I think we really need that. > > > > 2) it is used for something Ada-ish I really don't know how to test etc. > > >to be able to find out if it is safe to call it in > > >gen_type_die_with_usage too > > > > You could find an Ada test that uses the code and verify that the > > output stays the same? > > I can try to find the patch that introduced it and if it contained any > testcases. I couldn't find any unfortunately. Pierre, could you please test if the following patch doesn't regress anything in the Ada debug info area? Here is updated patch I'm going to bootstrap/regtest; it generates the same debuginfo on ref-3.C testcase. 2016-11-02 Jakub Jelinek Alexandre Oliva Jason Merrill PR debug/28767 PR debug/56974 * langhooks.h (struct lang_hooks_for_types): Document type_hash_eq being also called on METHOD_TYPEs. Add type_dwarf_attribute langhook. * langhooks.c (lhd_type_dwarf_attribute): New function. * langhooks-def.h (lhd_type_dwarf_attribute): Declare. (LANG_HOOKS_TYPE_DWARF_ATTRIBUTE): Define. (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add LANG_HOOKS_TYPE_DWARF_ATTRIBUTE. * tree.h (check_lang_type): Declare. * tree.c (check_lang_type): New function. (check_qualified_type, check_aligned_type): Call it. * dwarf2out.c (modified_type_die): Don't use type_main_variant for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with check_base_type and check_lang_type. (gen_ptr_to_mbr_type_die): If lookup_type_die is already non-NULL, return early. For pointer-to-data-member add DW_AT_use_location attribute. (gen_subroutine_type_die): Add DW_AT_{,rvalue_}reference attribute if needed. (gen_type_die_with_usage): Don't use type_main_variant for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with check_base_type and check_lang_type. Formatting fixes. Call get_debug_type langhook. cp/ * tree.c (cp_check_qualified_type): Use check_base_type and TYPE_QUALS comparison instead of check_qualified_type. (cxx_type_hash_eq): Return false if type_memfn_rqual don't match. * cp-objcp-common.c (cp_get_debug_type): New function. (cp_decl_dwarf_attribute): Don't handle types here. (cp_type_dwarf_attribute): New function. * cp-objcp-common.h (cp_get_debug_type, cp_type_dwarf_attribute): Declare. (LANG_HOOKS_GET_DEBUG_TYPE, LANG_HOOKS_TYPE_DWARF_ATTRIBUTE): Define. testsuite/ * g++.dg/debug/dwarf2/ptrdmem-1.C: New test. * g++.dg/debug/dwarf2/ref-3.C: New test. * g++.dg/debug/dwarf2/refqual-1.C: New test. * g++.dg/debug/dwarf2/refqual-2.C: New test. --- gcc/langhooks.h.jj 2016-11-02 15:55:59.243607249 +0100 +++ gcc/langhooks.h 2016-11-02 17:06:23.818730222 +0100 @@ -120,7 +120,7 @@ struct lang_hooks_for_types /* Return TRUE if TYPE1 and TYPE2 are identical for type hashing purposes. Called only after doing all language independent checks. At present, this function is only called when both TYPE1 and TYPE2 are - FUNCTION_TYPEs. */ + FUNCTION_TYPE or METHOD_TYPE. */ bool (*type_hash_eq) (const_tree, const_tree); /* Return TRUE if TYPE uses a hidden descriptor and fills in information @@ -162,6 +162,10 @@ struct lang_hooks_for_types for the debugger about scale factor, etc. */ bool (*get_fixed_point_type_info) (const_tree, struct fixed_point_type_info *); + + /* Returns -1 if dwarf ATTR shouldn't be added for TYPE, or the attribute + value otherwise. */ + int (*type_dwarf_attribute) (const_tree, int); }; /* Language hooks related to decls and the symbol table. */ --- gcc/langhooks.c.jj 2016-11-02 15:55:59.232607389 +0100 +++ gcc/langhooks.c 2016-11-02 16:35:39.938889496 +0100 @@ -702,6 +702,15 @@ lhd_decl_dwarf_attribute (const_tree, in return -1; } +/* Default implementation of LANG_HOOKS_TYPE_DWARF_ATTRIBUTE. Don't add + any attributes. */ + +int +lhd_type_dwarf_attribute (const_tree, int) +{ + return -1; +} + /* Returns true if the current lang_hooks represents the GNU C frontend. */ bool --- gcc/langhooks-def.h.jj 2016-11-02 15:55:59.151608423 +0100 +++ gcc/langhooks-def.h 2016-11-02 17:05:54.296095837 +0100 @@ -84,6 +84,7 @@ extern bool lhd_omp_mappable_type (tree) extern const char *lhd_get_substring_location (const substring_loc &, location_t *out_loc);
Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)
On Wed, 2 Nov 2016, Jason Merrill wrote: > It seems to me that the general principle is that we should consider > the location where the thing we're warning about is happening. In > >float_var = LLONG_MIN; > > The relevant location is that of the assignment, not the constant on > the RHS. In your ?: example, a simple answer would be to warn based I'm not sure we track locations well enough to handle that yet. Say you have an implicit conversion of a function argument to the type from the prototype and something about that conversion should be warned about. Then you should obviously warn if the argument is a macro from a system header but the call is outside a system header. But say the function call itself comes from a macro defined in a system header - you should still warn if the user passed an argument of the wrong type, even if that argument is a macro from a system header. That is: /* System header. */ int __foo (int); /* This sort of macro to avoid accidental interposition issues has been discussed lately on libc-alpha, so it's a realistic example. */ #define foo(x) (0 + __foo (x)) /* User code. */ v = foo (NULL); should warn because the call to __foo logically results from user code even though both NULL and foo are macros defined in system headers. I'm not sure what the locations look like here. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Generate reproducible output independently of the build-path
On Wed, 2 Nov 2016, Ximin Luo wrote: > This patch series adds a new environment variable SOURCE_PREFIX_MAP. When this > is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value" > command-line argument. This makes the final binary output reproducible, and Only one argument? Sometimes you may want multiple -fdebug-prefix-map options (for both source and build directories, for example). Perhaps the value should be split on spaces to provide multiple such mappings? -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 0/3] use rtx_insn * more
On 11/02/2016 03:55 PM, David Malcolm wrote: Did you mean this patch: https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html That one is ok after the test, sorry for not being more clear. Bernd
Re: [PATCH v2] AIX visibility
On Wed, Nov 02, 2016 at 11:41:32AM -0400, David Edelsohn wrote: > Any comments on ASM_WEAKEN_DECL change? It no longer checks RS6000_WEAK, is that always on now? Otherwise looks fine to me. Segher
Re: [PATCH 1/4] Use SOURCE_PREFIX_MAP envvar as an implicit debug-prefix-map
On Wed, 2 Nov 2016, Ximin Luo wrote: > +@item SOURCE_PREFIX_MAP If this variable is set, it specifies a mapping The text should start on a separate line from the @item. > +that is used to transform filepaths that are output in debug symbols. > +This helps the embedded paths become reproducible, without having the > +unreproducible value be visible in other input sources - such as GCC Use a TeX em dash, ---, instead of " - ". -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)
On Wed, Nov 2, 2016 at 11:51 AM, Marek Polacek wrote: > One of the pending issues that we should address before we release GCC 7 is > that sometimes we don't issue a warning if the location points to a macro > defined in a system header, unless -Wsystem-headers. Consider e.g. > > enum { e1 = LLONG_MIN }; > > or > > float_var = LLONG_MIN; > > Here, LLONG_MIN comes from a system header and so the compiler doesn't print > any warnings even though it should--the problem is not in the macro itself, > but in how it's used. This has happened before, e.g. with NULL, and the fix > was to call expansion_point_location_if_in_system_header, but this strategy > is not tenable; there are too many such issues. After having spent days on > this it seems that we should always use the expansion location except for > certain pedwarns/warnings--the challenge is of course how to distinguish > which ones. This patch introduces expand_macros_sentinel that can be used > to suppress expanding macros, removes various > expansion_point_location_if_in_system_header calls and fixes testsuite > fallout. I have *not* gone over all the warnings/pedwarns yet, because this > is touch-and-go whether this'll be considered a reasonable approach. > > The general principle should be: is it the macro or its user that is > responsible for the warning?, but in practice it was often less clear to > me what to do. So e.g. imagine > > #define C : > #define F i ?: 3 > > in a system header and then > > i = i ? C 5; // 1 > return F; // 2 > > in user code -- I believe we should NOT warn for 2 (so don't expand the > location), > but that also means we won't warn for 1. > > Thoughts? It seems to me that the general principle is that we should consider the location where the thing we're warning about is happening. In float_var = LLONG_MIN; The relevant location is that of the assignment, not the constant on the RHS. In your ?: example, a simple answer would be to warn based on the location of the ?. The ?C example seems not worth worrying about either way. Jason
Re: [PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM
On 2 November 2016 at 16:38, Tamar Christina wrote: > Hi all, > > This fixes the ARM failures in the testsuite. > Previously these tests were gated on if ARMv8-a > support was available in the compiler, not if the hardware > was an ARMv8-a hardware. > > This thus resulted in correct code, but wouldn't run on > any other hardware. > > Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf. > > Ok for trunk? > It seems OK to me (without actually testing, though) Thanks Christophe > Thanks, > Tamar > > gcc/testsuite/ > > 2016-11-01 Tamar Christina > > * gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target): > Check for arm_v8_neon_hw. > * gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target): > Likewise. > * gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target): > Likewise. > * gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target): > Likewise.
RFC: Warnings silenced when macros from system headers are used (PR c/78000, c/71613)
One of the pending issues that we should address before we release GCC 7 is that sometimes we don't issue a warning if the location points to a macro defined in a system header, unless -Wsystem-headers. Consider e.g. enum { e1 = LLONG_MIN }; or float_var = LLONG_MIN; Here, LLONG_MIN comes from a system header and so the compiler doesn't print any warnings even though it should--the problem is not in the macro itself, but in how it's used. This has happened before, e.g. with NULL, and the fix was to call expansion_point_location_if_in_system_header, but this strategy is not tenable; there are too many such issues. After having spent days on this it seems that we should always use the expansion location except for certain pedwarns/warnings--the challenge is of course how to distinguish which ones. This patch introduces expand_macros_sentinel that can be used to suppress expanding macros, removes various expansion_point_location_if_in_system_header calls and fixes testsuite fallout. I have *not* gone over all the warnings/pedwarns yet, because this is touch-and-go whether this'll be considered a reasonable approach. The general principle should be: is it the macro or its user that is responsible for the warning?, but in practice it was often less clear to me what to do. So e.g. imagine #define C : #define F i ?: 3 in a system header and then i = i ? C 5; // 1 return F; // 2 in user code -- I believe we should NOT warn for 2 (so don't expand the location), but that also means we won't warn for 1. Thoughts? Bootstrapped/regtested on x86_64-linux and ppc64-linux. 2016-11-02 Marek Polacek PR c/78000 PR c/71613 gcc/c-family/ * c-common.c (unsafe_conversion_p): Don't call expansion_point_location_if_in_system_header. (c_cpp_error): Add a paramater and expand locations depending on that. * c-common.h (c_cpp_error): Update declaration. * c-warn.c (warnings_for_convert_and_check): Don't call expansion_point_location_if_in_system_header. gcc/c/ * c-decl.c (declspecs_add_type): Use expand_macros_sentinel to suppress expanding macro locations. * c-parser.c (c_parser_postfix_expression) [RID_FUNCTION_NAME, RID_PRETTY_FUNCTION_NAME, RID_C99_FUNCTION_NAME]: Likewise. * c-typeck.c (convert_arguments): Don't call expansion_point_location_if_in_system_header. (pedwarn_init): Likewise. (warning_init): Likewise. (convert_for_assignment): Don't call expansion_point_location_if_in_system_header. (c_finish_return): Likewise. gcc/cp/ * call.c (conversion_null_warnings): Don't call expansion_point_location_if_in_system_header. * cvt.c (build_expr_type_conversion): Likewise. * parser.c (set_and_check_decl_spec_loc): Use expand_macros_sentinel to suppress expanding macro locations. * typeck.c (cp_build_binary_op): Don't call expansion_point_location_if_in_system_header. gcc/ * diagnostic.c (diagnostic_initialize): Initialize dc_expand_locations. (diagnostic_report_warnings_p): New function. * diagnostic.h (struct diagnostic_context): Add dc_expand_locations. (diagnostic_report_warnings_p): Remove. (struct expand_macros_sentinel): New sentinel. (diagnostic_report_warnings_p): Declare. * genmatch.c (error_cb): Add bool parameter. (fatal_at): Adjust error_cb call. (warning_at): Likewise. * input.c (on_error): Add bool parameter. gcc/fortran/ * cpp.c (cb_cpp_error): Add a paramater and expand locations depending on that. gcc/testsuite/ * gcc.dg/pr71613.c: New. * gcc.dg/pr78000.c: New. * gcc.dg/pr78000.h: New. libcpp/ * charset.c (noop_error_cb): Add bool parameter. (saved_error_handler): Likewise. (cpp_interpret_string_ranges): * errors.c (cpp_diagnostic_at_richloc): Adjust cb.error call. (cpp_diagnostic_at): Likewise. (cpp_diagnostic_with_line): Add bool parameter and pass it down to cb.error. (cpp_error_with_line_noexpand): New. (cpp_warning_with_line_noexpand): New. (cpp_pedwarning_with_line_noexpand): New. (cpp_warning_with_line_syshdr): Pass true to cpp_diagnostic_with_line. * expr.c (cpp_classify_number): Call *_noexpand variants. * include/cpplib.h (error): Adjust declaration. (cpp_error_with_line_noexpand): New. (cpp_warning_with_line_noexpand): New. (cpp_pedwarning_with_line_noexpand): New. diff --git gcc/gcc/c-family/c-common.c gcc/gcc/c-family/c-common.c index 307862b..b0a1b5c 100644 --- gcc/gcc/c-family/c-common.c +++ gcc/gcc/c-family/c-common.c @@ -1230,7 +1230,6 @@ unsafe_conversion_p (location_t loc, tree type, tree expr, bool produce_warns) { enum conversion_safety give_warning = SAFE_CONVERSION; /* is 0 or false */ tree expr_
Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those
On Wed, Nov 02, 2016 at 11:31:25AM -0400, Jason Merrill wrote: > On Wed, Nov 2, 2016 at 10:31 AM, Jakub Jelinek wrote: > > It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook. I've tried > > to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html > > and we even have such a langhook now, modified_type_die > > uses lang_hooks.types.get_debug_type, but > > 1) it is just called in modified_type_die and not in > >gen_type_die_with_usage, that looks weird > > How much of a problem is that? modified_type_die calls gen_type_die, > does that not cover the cases needed here? If e.g. on the ref-3.C testcase from the patch I put breakpoint on both gen_ptr_to_mbr_type_die and modified_type_die (the latter only for type->base.code == RECORD_TYPE), then I see first: #0 gen_ptr_to_mbr_type_die (type=, context_die=>, class_type=, member_type=) at ../../gcc/dwarf2out.c:23128 #1 0x00c3a7e5 in gen_type_die_with_usage (type=, context_die=>, usage=DINFO_USAGE_DIR_USE) at ../../gcc/dwarf2out.c:24428 #2 0x00c3ab92 in gen_type_die (type=, context_die=>) at ../../gcc/dwarf2out.c:24491 #3 0x00c3caed in gen_decl_die (decl=, origin=, ctx=0x0, context_die=>) at ../../gcc/dwarf2out.c:25117 #4 0x00c3b13e in process_scope_var (stmt=, decl=, origin=, context_die=>) at ../../gcc/dwarf2out.c:24620 #5 0x00c3b1bb in decls_for_scope (stmt=, context_die=>) at ../../gcc/dwarf2out.c:24645 and only afterwards: #0 modified_type_die (type=, cv_quals=0, reverse=false, context_die=>) at ../../gcc/dwarf2out.c:12328 #1 0x00c2f03b in add_type_attribute (object_die=>, type=, cv_quals=0, reverse=false, context_die=>) at ../../gcc/dwarf2out.c:20346 #2 0x00c354f4 in gen_variable_die (decl=, origin=, context_die=>) at ../../gcc/dwarf2out.c:22688 #3 0x00c3cb8e in gen_decl_die (decl=, origin=, ctx=0x0, context_die=>) at ../../gcc/dwarf2out.c:25138 #4 0x00c3b13e in process_scope_var (stmt=, decl=, origin=, context_die=>) at ../../gcc/dwarf2out.c:24620 #5 0x00c3b1bb in decls_for_scope (stmt=, context_die=>) at ../../gcc/dwarf2out.c:24645 which means if gen_type_die or gen_type_die_with_usage doesn't use the langhook, then we'd emit a completely useless { __pfn; __delta } struct into debug info first, and then in modified_type_die used the langhook, get OFFSET_TYPE and probably create the DW_TAG_ptr_to_member_type. So I think we really need that. > > 2) it is used for something Ada-ish I really don't know how to test etc. > >to be able to find out if it is safe to call it in > >gen_type_die_with_usage too > > You could find an Ada test that uses the code and verify that the > output stays the same? I can try to find the patch that introduced it and if it contained any testcases. > > 3) most importantly, if the C++ version of this langhook would create > >OFFSET_TYPE on the fly, I don't know how to ensure effective sharing > >of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type > >and DW_AT_containing_type; unless the C++ langhook adds some extra > >hash table that caches already created OFFSET_TYPEs or something similar, > >it would create a new OFFSET_TYPE each time it is called > > build_offset_type already uses a hash table. Ah, ok. > > Also, I really don't know how well does GDB (especially older releases) > > handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently > > with if (dwarf_version >= 5). Quick grep revealed that GDB has code to > > handle the __pfn/__delta fields. So, can I ask somebody from the GDB > > team to test this patch with that if (dwarf_version >= 5) replaced > > with if (1) and see if it works properly with current GDB as well as say > > 4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)? If yes, we > > should emit it unconditionally. > > This all makes sense to me. > > > + if (dwarf_version >= 5) > > + { > > + tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0); > > + if (class_type != NULL_TREE) > > This can be > > if (dwarf_version >= 5) > if (tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0)) Ok. Jakub
[PATCH, Fortran] Extension: Support legacy PARAMETER statements with -std=legacy (or -fdec?)
All, Another quirk of legacy compilers is their syntax for PARAMETER statements. Such statements are similar to standard PARAMETER statements but lack parentheses following the PARAMETER keyword. There is a good reason the standard doesn't support this - because the statement becomes ambiguous with assignment statements in fixed form. Consider the following: parameter pi = 3.14d0 In fixed form, spaces are ignored so in standard Fortran the above looks like an assignment to the variable "parameterpi". In legacy compilers, the above is instead interpreted as parameter (pi = 3.14d0) which of course declares the variable 'pi' to be a parameter with the value 3.14. Attached is a patch for the GNU Fortran front-end which allows the compiler to interpret these legacy PARAMETER statements. The patch in its current form does this without warning through -std=legacy (warning for -std=gnu, error for -std=f*). Bootstraps and regtests on x86_64-redhat-linux. However, note that this would change by default the compiler's interpretation of fixed-form variables starting with the string "parameter", if any such cases existed in real code. IMO fixed form code is isomorphic to legacy code, so I imagine most users writing fixed-form/legacy code would intend for a legacy PARAMETER statement, rather than assignment to variable PARAMETERPI, when writing such a statement. Beyond compatibility, one motivation for acceptance/recognizance of these statements in GNU Fortran is the counterintuitive compile errors currently seen when compiling legacy code which uses this type of statement. Because the legacy PARAMETER statement is currently interpreted as an assignment in fixed-form, and parameter statements often precede other specification statements in real code, GNU Fortran complains not about the parameter statement itself but often the following line with "Unexpected data declaration" or similar. This serves to confuse the user. An example of this can be seen by compiling the attached dec_parameter_1.f in fixed form with the current build of GNU Fortran. I am amenable to only enabling support for legacy PARAMETER statements through -fdec, so that the default (-std=gnu) behavior is unchanged for such cases in fixed form. But this would leave the esoteric "Unexpected data declaration statement" errors for legacy code without -fdec. The case would be difficult to detect and work around specifically since the entire parameter "assignment" statement is eaten by the time the error comes about. What do you think? --- Fritz Reese Author: Fritz O. Reese Date: Tue Nov 1 12:26:57 2016 -0400 Support legacy PARAMETER statements with -std=legacy. gcc/fortran/ * decl.c (gfc_match_parameter): Allow omitted '()' with -std=legacy. * parse.c (decode_statement): Match "parmeter" before assignments. * gfortran.texi: Document. gcc/testsuite/gfortran.dg/ * dec_parameter_1.f: New test. * dec_parameter_2.f90: Likewise. * dec_parameter_3.f90: Likewise. * dec_parameter_4.f90: Likewise. gcc/fortran/decl.c| 10 +++- gcc/fortran/gfortran.texi | 16 ++ gcc/fortran/parse.c |4 +- gcc/testsuite/gfortran.dg/dec_parameter_1.f | 64 + gcc/testsuite/gfortran.dg/dec_parameter_2.f90 | 63 gcc/testsuite/gfortran.dg/dec_parameter_3.f90 | 13 + gcc/testsuite/gfortran.dg/dec_parameter_4.f90 | 13 + 7 files changed, 180 insertions(+), 3 deletions(-) diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c index f18eb41..0120ceb 100644 --- a/gcc/fortran/decl.c +++ b/gcc/fortran/decl.c @@ -7821,10 +7821,16 @@ cleanup: match gfc_match_parameter (void) { + const char *term = " )%t"; match m; if (gfc_match_char ('(') == MATCH_NO) -return MATCH_NO; +{ + /* With legacy PARAMETER statements, don't expect a terminating ')'. */ + if (!gfc_notify_std (GFC_STD_LEGACY, "PARAMETER without '()' at %C")) + return MATCH_NO; + term = " %t"; +} for (;;) { @@ -7832,7 +7838,7 @@ gfc_match_parameter (void) if (m != MATCH_YES) break; - if (gfc_match (" )%t") == MATCH_YES) + if (gfc_match (term) == MATCH_YES) break; if (gfc_match_char (',') != MATCH_YES) diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi index 85ab31b..1d60551 100644 --- a/gcc/fortran/gfortran.texi +++ b/gcc/fortran/gfortran.texi @@ -1472,6 +1472,7 @@ compatibility extensions along with those enabled by @option{-std=legacy}. * Bitwise logical operators:: * Extended I/O specifiers:: * Default exponents:: +* Legacy PARAMETER statements:: @end menu @node Old-style kind specifications @@ -2705,6 +2706,21 @@ For compatibility, GNU Fortran supports a default exponent of zero in real constants with @option{-fdec}. For example, @code{9e} would be interpret
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
On 11/02/2016 01:37 AM, Jakub Jelinek wrote: On Tue, Nov 01, 2016 at 08:55:03PM -0600, Martin Sebor wrote: struct S { int a, b, c, d; }; #define bos(p, t) __builtin_object_size (p, t) #define memset0(p, i, n) __builtin___memset_chk (p, i, n, bos (p, 0)) #define memset1(p, i, n) __builtin___memset_chk (p, i, n, bos (p, 1)) void f0 (struct S *s) { memset0 (&s->d, 0, 1024); // no warning here (bos 0) } But we do not want the warning here, there is nothing wrong on it. The caller may be void bar (void) { struct T { struct S header; char payload[1024 - offsetof (struct S, d)]; } t; initialize (&t); f0 (&t.header); } and the callee might rely on that. Sure, they might and in that case the warning would be a false positive. It wouldn't be the first such warning that wasn't 100% free of them. But my testing with Binutils, GCC, and the Linux kernel has exposed only 10 instances of new warnings and I don't think I saw this idiom among them. But even if some were, or if all of them were false positives I think that would be well within the acceptable rates. Here are the numbers of warnings for Binutils and the kernel: 113 -Wimplicit-fallthrough 38 -Wformat-length= 12 -Wunused-const-variable= 10 -Wstringop-overflow 2 -Wdangling-else 2 -Wframe-address 2 -Wint-in-bool-context 1 -Wbool-operation Using some header structure at the beginning and then conditionally on fields in that structure various payloads occurs in many projects, starting with glibc, gcc, Linux kernel, ... The warning really must not be detached from reality. That's an unfair assertion in light of the numbers above. If you want a warning for suspicious calls, sure, but 1) it has to be clearly worded significantly differently from how do you word it, so that users really understand you are warning about suspicious code (though, I really believe it is very common and there is really nothing the users can do about it) 2) it really shouldn't be enabled in -Wall, and I think not even in -Wextra As you have argued yourself recently in our discussion of -Wimplicit-fallthrough, warnings that aren't enabled by either of these options don't generally benefit users because very few turn them on explicitly. I agree with that argument although I would be in favor of rolling out a new warning gradually over two or more releases if it were known to be prone to high rates of false positive. The -Wstringop-overflow warning clearly isn't in that category so there's no need for it. My offer to do additional testing is still good if you'd like to see more data. As for the wording, I welcome suggestions for improvements. Martin
Re: [PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM
On 02/11/16 15:38, Tamar Christina wrote: Hi all, This fixes the ARM failures in the testsuite. Previously these tests were gated on if ARMv8-a support was available in the compiler, not if the hardware was an ARMv8-a hardware. This thus resulted in correct code, but wouldn't run on any other hardware. Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf. Ok for trunk? Looks ok to me too. Kyrill Thanks, Tamar gcc/testsuite/ 2016-11-01 Tamar Christina * gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target): Check for arm_v8_neon_hw. * gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target): Likewise. * gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target): Likewise. * gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target): Likewise.
[PATCH v2] AIX visibility
This revised patch makes two changes: 1) Fix typo in configure.ac 2) Add AIX visibility support for ASM_WEAKEN_DECL, which does touch the same code as Linux. The AIX "weak" support fixes a large number of C++ visibility testcases. Bootstrapped on powerpc-ibm-aix7.2.0.0. * configure.ac (.hidden): Change to conftest_s string. Provide string for AIX assembler. (gcc_cv_ld_hidden): Yes for AIX. * configure: Regenerate. * dwarf2asm.c (USE_LINKONCE_INDIRECT): Don't set for AIX (XCOFF). * config/rs6000/rs6000-protos.h (rs6000_asm_weaken_decl): Declare (rs6000_xcoff_asm_output_aligned_decl_common): Declare. * config/rs6000/xcoff.h (TARGET_ASM_GLOBALIZE_DECL_NAME): Define. (ASM_OUTPUT_ALIGNED_DECL_COMMON): Define. (ASM_OUTPUT_ALIGNED_COMMON): Delete. * config/rs6000/rs6000.c (rs6000_init_builtins): Change clog rename from #if to if. (rs6000_xcoff_visibility): New. (rs6000_xcoff_declare_function_name): Add visibility support. (rs6000_xcoff_asm_globalize_decl_name): New. (rs6000_xcoff_asm_output_aligned_decl_common): New. (rs6000_asm_weaken_decl): New. (rs6000_code_end): Disable HIDDEN_LINKONCE on XCOFF. config/rs6000/rs6000.h (ASM_WEAKEN_DECL): Change definition to reference function. dwarf2asm.c okay? Any comments on ASM_WEAKEN_DECL change? Thanks, David ZZ Description: Binary data
Re: [PATCH] combine: Improve change_zero_ext (fixes PR71847)
On Tue, Oct 25, 2016 at 11:40 AM, Segher Boessenkool wrote: > This improves a few things in change_zero_ext. Firstly, it should use > the passed in pattern in recog_for_combine, not the pattern of the insn > (they are not the same if the whole pattern was replaced). Secondly, > it handled zero_ext of a subreg, but with hard registers we do not get > a subreg, instead the mode of the reg is changed. So this handles that. > Thirdly, after changing a zero_ext to an AND, the resulting RTL may become > non-canonical, like (ior (ashift ..) (and ..)); the AND should be first, > it is commutative. And lastly, zero_extract as a set_dest wasn't handled > at all, but now it is. > > This fixes the testcase in PR71847, and improves code generation in some > other edge cases too. > > Tested on powerpc64-linux {-m32,-m64}; I'll also test it on x86 and > will build lots of crosses before committing. Hi Segher, This causes failure @ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78186 It includes how to report it, could you please have a look? Thanks. Thanks, Bin
[PATCH][GCC][testsuite] Fix failing vminnm/vmaxnm tests for ARM
Hi all, This fixes the ARM failures in the testsuite. Previously these tests were gated on if ARMv8-a support was available in the compiler, not if the hardware was an ARMv8-a hardware. This thus resulted in correct code, but wouldn't run on any other hardware. Ran regression tests on aarch64-none-linux-gnu, arm-none-linux-gnueabihf. Ok for trunk? Thanks, Tamar gcc/testsuite/ 2016-11-01 Tamar Christina * gcc.target/arm/simd/vmaxnm_f32_1.c (dg-require-effective-target): Check for arm_v8_neon_hw. * gcc.target/arm/simd/vmaxnmq_f32_1.c (dg-require-effective-target): Likewise. * gcc.target/arm/simd/vminnm_f32_1.c (dg-require-effective-target): Likewise. * gcc.target/arm/simd/vminnmq_f32_1.c(dg-require-effective-target): Likewise. gcc.arm-fix-vminnm-vmaxnm-tests Description: gcc.arm-fix-vminnm-vmaxnm-tests
Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module
OK. On Wed, Nov 2, 2016 at 10:18 AM, Jiong Wang wrote: > On 02/11/16 13:42, Jakub Jelinek wrote: >> >> On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote: >>> >>> -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION >>> note. */ >>> +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION >>> note. */ >> >> >> Too long line. > > > Hmm, it shows 80 columns under my editor. I guess '+' is counted in? > >> >>> +/* RTL sequences inside PARALLEL are raw expression representation. >>> + >>> + mem_loc_descriptor can be used to build generic DWARF expressions >>> for >>> + DW_CFA_expression and DW_CFA_val_expression where the expression >>> may can >>> + not be represented using normal RTL sequences. In this case, >>> group all >>> + expression operations (DW_OP_*) inside a PARALLEL. For those >>> DW_OP which >>> + doesn't have RTL mapping, wrap it using UNSPEC. The logic for >>> parsing >>> + PARALLEL sequences is: >>> + >>> + foreach elem inside PARALLEL >>> + if (elem is UNSPEC) >>> + dw_op = XINT (elem, 1) (DWARF operation is kept as UNSPEC >>> number) >>> + oprnd1 = XVECEXP (elem, 0, 0) >>> + oprnd2 = XVECEXP (elem, 0, 1) >>> + else >>> + call mem_loc_descriptor */ >> >> >> Not sure if it is a good idea to document in weirdly formatted >> pseudo-language what the code actually does a few lines below. IMHO >> either >> express it in words, or don't express it at all. > > > OK, fixed. I replaced these comments as some brief words. > >> >>> + exp_result = >>> + new_loc_descr ((enum dwarf_location_atom) dw_op, >>> oprnd1, >>> +oprnd2); >> >> >> Wrong formatting, = should be on the next line. >> >>> + } >>> + else >>> + exp_result = >>> + mem_loc_descriptor (elem, mode, mem_mode, >>> + VAR_INIT_STATUS_INITIALIZED); >> >> >> Likewise. > > > Both fixed. Patch updated, please review. > > > Thanks. > > gcc/ > 2016-11-02 Jiong Wang > > * reg-notes.def (CFA_VAL_EXPRESSION): New entry. > * dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New > function. > (dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION. > (output_cfa_loc): Support DW_CFA_val_expression. > (output_cfa_loc_raw): Likewise. > (output_cfi): Likewise. > (output_cfi_directive): Likewise. > * dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression. > (dw_cfi_oprnd2_desc): Likewise. > (mem_loc_descriptor): Recognize new pattern generated for value > expression. >
Re: [PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those
On Wed, Nov 2, 2016 at 10:31 AM, Jakub Jelinek wrote: > It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook. I've tried > to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html > and we even have such a langhook now, modified_type_die > uses lang_hooks.types.get_debug_type, but > 1) it is just called in modified_type_die and not in >gen_type_die_with_usage, that looks weird How much of a problem is that? modified_type_die calls gen_type_die, does that not cover the cases needed here? > 2) it is used for something Ada-ish I really don't know how to test etc. >to be able to find out if it is safe to call it in >gen_type_die_with_usage too You could find an Ada test that uses the code and verify that the output stays the same? > 3) most importantly, if the C++ version of this langhook would create >OFFSET_TYPE on the fly, I don't know how to ensure effective sharing >of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type >and DW_AT_containing_type; unless the C++ langhook adds some extra >hash table that caches already created OFFSET_TYPEs or something similar, >it would create a new OFFSET_TYPE each time it is called build_offset_type already uses a hash table. > Also, I really don't know how well does GDB (especially older releases) > handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently > with if (dwarf_version >= 5). Quick grep revealed that GDB has code to > handle the __pfn/__delta fields. So, can I ask somebody from the GDB > team to test this patch with that if (dwarf_version >= 5) replaced > with if (1) and see if it works properly with current GDB as well as say > 4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)? If yes, we > should emit it unconditionally. This all makes sense to me. > + if (dwarf_version >= 5) > + { > + tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0); > + if (class_type != NULL_TREE) This can be if (dwarf_version >= 5) if (tree class_type = lang_hooks.types.get_ptrmemfn_type (type, 0)) Jason
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Wed, Nov 02, 2016 at 02:51:41PM +0100, Richard Biener wrote: > >> I don't believe it needs a cleanup on every iteration. One cleanup at > >> the end should work fine. > > > > But as the comment there says: > > > > /* Merge the duplicated blocks into predecessors, when possible. */ > > cleanup_cfg (0); > > > > (this is not a new comment), and without merging blocks this whole > > patch does zilch. > > > > There is no need to create new basic blocks at all (can replace the > > final branch in a block with a copy of the whole block it jumps to, > > instead); and then it is painfully obvious that switching to layout > > mode here is pointless, too. > > > > Do you want me to do that? > > > > Btw, this isn't quadratic anyway; it is a constant number (the maximal > > length allowed of the block to copy) linear. Unless there are unboundedly > > many forwarder blocks, which there shouldn't be, but I cannot prove that. > > Well, you iterate calling functions (find candidates, merge, cleanup-cfg) that > walk the whole function. So unless the number of iterations is bound > with a constant I call this quadratic (in function size). It is bounded (with that caveat above): new candidates will be bigger than the block merged into it, so there won't be more than PARAM_MAX_GOTO_DUPLICATION_INSNS passes. But I can make it all work simpler, in non-layout mode, if you prefer. Segher
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On 11/02/2016 03:51 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 03:38:25PM +0100, Martin Liška wrote: >> it converts: >> foo () >> { >> char a; >> char * p; >> char _1; >> int _2; >> int _8; >> int _9; >> >> : >> ASAN_MARK (2, &a, 1); >> a = 0; >> p_6 = &a; >> ASAN_MARK (1, &a, 1); >> _1 = *p_6; > > You shouldn't convert if a is addressable (when ignoring &a in ASAN_MARK > calls). Only if there is &a just in ASAN_MARK and MEM_REF, you can convert. Sure, which should be done in execute_update_addresses_taken via gimple_ior_addresses_taken. > >> to: >> >> foo () >> { >> char a; >> char * p; >> char _1; >> int _2; >> >> : >> a_10 = 0; >> a_12 = ASAN_POISON (); >> _1 = a_12; >> if (_1 != 0) >> goto ; >> else >> goto ; >> >> : >> >> : >> # _2 = PHI <1(2), 0(3)> >> return _2; >> >> } >> >> and probably the last goal is to convert the newly added internal fn to a >> runtime call. >> Hope sanopt pass is the right place where to it? > > If ASAN_POISON is ECF_CONST and has any uses during sanopt, perhaps best > would be to add an artificial variable you give the same name as the > underlying var of the SSA_NAME (and alignment, locus etc.) and poison it > right away (keep unpoisoning only to the function epilogue) and then > ASAN_CHECK replace all uses of that SSA_NAME with ASAN_CHECK + use of > (D) SSA_NAME. When I create an ASAN_POISON call in execute_update_addresses_taken, there would not be any ASAN_CHECK generated as it's going to be rewritten to SSA form (like the previous sample I sent). I like the idea of having a parallel variable, which can be poisoned at the very beginning of a function. Whenever we have a use of the SSA_NAME (like a_12 = ASAN_POISON ()), we can simply insert BUILT_IN_ASAN_REPORT_LOADx(¶llel_variable) statement. No change would be necessary for ASAN runtime in such case. Will it work? Thanks, Martin > > Jakub >
Re: [PATCH] Fix for big-endian gcc.c-torture/execute/pr55750.c
On Wed, 2 Nov 2016, Kyrill Tkachov wrote: > Hi all, > > I noticed that my patch for PR tree-optimization/78170 broke > execute.exp=pr55750.c on big-endian. > The problem with that patch is that we should not forget to clear the padding > bits in the temporary > buffer when merging values even when they are less than a byte. > > Tested on aarch64_be-none-elf (the test failure goes away) > Bootstrap and test on x86_64 ongoing. > Ok for trunk if successful? Ok. Thanks, Richard. > David, does this patch allow AIX bootstrap to proceed by any chance? Crossing fingers ;) > Thanks, > Kyrill > > 2016-11-02 Kyrylo Tkachov > > * gimple-ssa-store-merging.c (encode_tree_to_bitpos): Don't forget to > clear padding bits even when they're less than a byte. >
[PATCH] Fix PR78185
The following fixes PR78185 by properly honoring possibly infinite child loops when computing what blocks are always executed during loop invariant motion. Such loops behave as if the loop would exit at this point. Both GIMPLE and RTL level passes have that very same issue and the following fixes the GIMPLE one and simply disables hoisting of possibly trapping or faulting instructions in the RTL pass ... Eric seems to remember this might regress gzip so I'm going to put this on one of our SPEC testers for tonights run as well. Maybe somebody else wants to check the performance impact (I'm interested in both, GIMPLE and RTL change fallout for performance). Bootstrap and regtest running on x86_64-unknown-linux-gnu. If all goes well I'll followup with the obvious removal of no longer needed stuff in loop-invariant.c. Another variant for RTL would be to simply treat all edges entering a child loop (not only those entering possibly infinitely looping ones) as an exit. I think finite_loop_p has no RTL level variant (yet). Richard. 2016-11-02 Richard Biener PR middle-end/78185 * loop-invariant.c (find_invariant_insn): Never hoist trapping or faulting instructions. * tree-ssa-loop-im.c: Include tree-ssa-loop-niter.h. (fill_always_executed_in_1): Honor infinite child loops. * gcc.dg/pr78185.c: New testcase. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 551103f..deb5be6 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1076,7 +1076,7 @@ pre_check_invariant_p (bool simple, rtx dest) unless the program ends due to a function call. */ static void -find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed) +find_invariant_insn (rtx_insn *insn, bool, bool always_executed) { df_ref ref; struct def *def; @@ -1108,8 +1108,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed) if (can_throw_internal (insn)) return; - /* We cannot make trapping insn executed, unless it was executed before. */ - if (may_trap_or_fault_p (PATTERN (insn)) && !always_reached) + /* We cannot make trapping insn executed. */ + if (may_trap_or_fault_p (PATTERN (insn))) return; depends_on = BITMAP_ALLOC (NULL); diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 463db04..0524e57 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see #include "trans-mem.h" #include "gimple-fold.h" #include "tree-scalar-evolution.h" +#include "tree-ssa-loop-niter.h" /* TODO: Support for predicated code motion. I.e. @@ -2369,8 +2370,16 @@ fill_always_executed_in_1 (struct loop *loop, sbitmap contains_call) break; FOR_EACH_EDGE (e, ei, bb->succs) - if (!flow_bb_inside_loop_p (loop, e->dest)) - break; + { + /* If there is an exit from this BB. */ + if (!flow_bb_inside_loop_p (loop, e->dest)) + break; + /* Or we enter a possibly non-finite loop. */ + if (flow_loop_nested_p (bb->loop_father, + e->dest->loop_father) + && ! finite_loop_p (e->dest->loop_father)) + break; + } if (e) break; Index: trunk/gcc/testsuite/gcc.dg/pr78185.c === --- trunk/gcc/testsuite/gcc.dg/pr78185.c(revision 0) +++ trunk/gcc/testsuite/gcc.dg/pr78185.c(revision 0) @@ -0,0 +1,28 @@ +/* { dg-do run { target *-*-linux* *-*-gnu* } } */ +/* { dg-options "-O" } */ + +#include +#include +#include + +static char var1 = 0L; +static char *var2 = &var1; + +void do_exit (int i) +{ + exit (0); +} + +int main(void) +{ + struct sigaction s; + sigemptyset (&s.sa_mask); + s.sa_handler = do_exit; + s.sa_flags = 0; + sigaction (SIGALRM, &s, NULL); + alarm (1); + /* The following loop is infinite, the division by zero should not + be hoisted out of it. */ + for (; (var1 == 0 ? 0 : (100 / var1)) == *var2; ); + return 0; +}
[PATCH] Fix for big-endian gcc.c-torture/execute/pr55750.c
Hi all, I noticed that my patch for PR tree-optimization/78170 broke execute.exp=pr55750.c on big-endian. The problem with that patch is that we should not forget to clear the padding bits in the temporary buffer when merging values even when they are less than a byte. Tested on aarch64_be-none-elf (the test failure goes away) Bootstrap and test on x86_64 ongoing. Ok for trunk if successful? David, does this patch allow AIX bootstrap to proceed by any chance? Thanks, Kyrill 2016-11-02 Kyrylo Tkachov * gimple-ssa-store-merging.c (encode_tree_to_bitpos): Don't forget to clear padding bits even when they're less than a byte. diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c index 081620e50f603e2de8ed962aec6c619890ce1e33..db3c3c14a5b8938024db8bed80145c77d29396ac 100644 --- a/gcc/gimple-ssa-store-merging.c +++ b/gcc/gimple-ssa-store-merging.c @@ -430,7 +430,8 @@ encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos, contain a sign bit due to sign-extension). */ unsigned int padding = byte_size - ROUND_UP (bitlen, BITS_PER_UNIT) / BITS_PER_UNIT - 1; - if (padding != 0) + if (padding != 0 + || bitlen % BITS_PER_UNIT != 0) { /* On big-endian the padding is at the 'front' so just skip the initial bytes. */
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 02, 2016 at 03:38:25PM +0100, Martin Liška wrote: > it converts: > foo () > { > char a; > char * p; > char _1; > int _2; > int _8; > int _9; > > : > ASAN_MARK (2, &a, 1); > a = 0; > p_6 = &a; > ASAN_MARK (1, &a, 1); > _1 = *p_6; You shouldn't convert if a is addressable (when ignoring &a in ASAN_MARK calls). Only if there is &a just in ASAN_MARK and MEM_REF, you can convert. > to: > > foo () > { > char a; > char * p; > char _1; > int _2; > > : > a_10 = 0; > a_12 = ASAN_POISON (); > _1 = a_12; > if (_1 != 0) > goto ; > else > goto ; > > : > > : > # _2 = PHI <1(2), 0(3)> > return _2; > > } > > and probably the last goal is to convert the newly added internal fn to a > runtime call. > Hope sanopt pass is the right place where to it? If ASAN_POISON is ECF_CONST and has any uses during sanopt, perhaps best would be to add an artificial variable you give the same name as the underlying var of the SSA_NAME (and alignment, locus etc.) and poison it right away (keep unpoisoning only to the function epilogue) and then ASAN_CHECK replace all uses of that SSA_NAME with ASAN_CHECK + use of (D) SSA_NAME. Jakub
Re: [PATCH 0/3] use rtx_insn * more
On Wed, 2016-11-02 at 00:05 -0400, Trevor Saunders wrote: > On Mon, Oct 31, 2016 at 07:37:54AM -0600, Jeff Law wrote: > > On 10/28/2016 01:13 PM, tbsaunde+...@tbsaunde.org wrote: > > > From: Trevor Saunders > > > > > > HI, > > > > > > This series changes various variables type from rtx to rtx_insn * > > > so that the > > > remaining patches in this series > > > http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01353.html can be > > > applied. > > > > > > patches bootstrapped and regtested on x86_64-linux-gnu, and run > > > through config-list.mk, ok? > > > > > > Thanks! > > > > > > Trev > > > > > > Trevor Saunders (3): > > > use rtx_insn * in various places where it is obvious > > > split up the trial variable in reorg.c:relax_delay_slots to use > > > rtx_insn * more > > > split up some variables to use rtx_insn * more > > All 3 patches in this series are fine. > > Thanks for the reviews. > > Can I trouble you to look at > http://gcc.gnu.org/ml/gcc-patches/2016-10/ > now that it is tested on aarch64? I think you're missing the full URL here; the above URL is the index for all of October. Did you mean this patch: https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01358.html ?
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On 11/02/2016 02:16 PM, Richard Biener wrote: > On Wed, Nov 2, 2016 at 2:06 PM, Jakub Jelinek wrote: >> On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote: Yeah, that is what I meant. The issue is how to report uses of such SSA_NAME when there is no memory. So, either we'd need a special runtime library entrypoint that would report uses after scope even when there is no underlying memory, or we'd need to force it at asan pass time into memory again. >>> >>> Well, there can't be any uses outside the scope -- there are no (memory) >>> uses >>> left if we rewrite the thing into SSA. That is, the address can no >>> longer "escape". >>> >>> Of course there could have been invalid uses before the rewrite into SSA. >>> But >>> those can be diagnosed either immediately before or after re-writing into >>> SSA >>> at compile-time (may be in dead code regions of course). >> >> Sure, we can warn on those at compile time, but we really should arrange to >> error on those at runtime if they are ever executed, the UB happens only at >> runtime, so in dead code isn't fatal. > > Then we can replace those uses with a call into the asan runtime diagnosing > the > issue instead? > > Richard. > >> Jakub OK, thanks for the clarification, it's more clear to me. So we want to consider for SSA transformation of ASAN_MARK only is_gimple_reg_types. I'm having a test-case where it converts: foo () { char a; char * p; char _1; int _2; int _8; int _9; : ASAN_MARK (2, &a, 1); a = 0; p_6 = &a; ASAN_MARK (1, &a, 1); _1 = *p_6; if (_1 != 0) goto ; else goto ; : _9 = 1; goto ; : _8 = 0; : # _2 = PHI <_9(3), _8(4)> return _2; } to: foo () { char a; char * p; char _1; int _2; : a_10 = 0; a_12 = ASAN_POISON (); _1 = a_12; if (_1 != 0) goto ; else goto ; : : # _2 = PHI <1(2), 0(3)> return _2; } and probably the last goal is to convert the newly added internal fn to a runtime call. Hope sanopt pass is the right place where to it? Thanks, Martin
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 02, 2016 at 03:27:42PM +0100, Martin Liška wrote: > > So is there anything I should do wrt -Wswitch-unreachable? > > > > Marek > > > > Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper > place > in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive > regression > tests. Please do that only for -fsanitize-use-after-scope, it will likely affect at least for -O0 the debugging experience. For -O0 -fsanitize=address -fsanitize-use-after-scope perhaps we could arrange for some extra stmt to have the locus of the switch (where we still don't want the vars to appear in scope) and then have no locus on the ASAN_MARK and actual GIMPLE_SWITCH or something similar. Jakub
[PATCH] DW_TAG_ptr_to_member_type for PMF and DW_AT_{,rvalue_}reference for those
Hi! This is a merge of my patch from yesterday, Jason's incremental patch to that and parts of Alex' patch from Oct 19. It uses Alex' LANG_HOOKS_GET_PTRMEMFN_TYPE langhook. I've tried to think about https://gcc.gnu.org/ml/gcc-patches/2011-05/msg00227.html and we even have such a langhook now, modified_type_die uses lang_hooks.types.get_debug_type, but 1) it is just called in modified_type_die and not in gen_type_die_with_usage, that looks weird 2) it is used for something Ada-ish I really don't know how to test etc. to be able to find out if it is safe to call it in gen_type_die_with_usage too 3) most importantly, if the C++ version of this langhook would create OFFSET_TYPE on the fly, I don't know how to ensure effective sharing of DW_TAG_ptr_to_member_type nodes with the same DW_AT_type and DW_AT_containing_type; unless the C++ langhook adds some extra hash table that caches already created OFFSET_TYPEs or something similar, it would create a new OFFSET_TYPE each time it is called Also, I really don't know how well does GDB (especially older releases) handle DW_TAG_ptr_to_member_type for PMF, so the patch wraps that currently with if (dwarf_version >= 5). Quick grep revealed that GDB has code to handle the __pfn/__delta fields. So, can I ask somebody from the GDB team to test this patch with that if (dwarf_version >= 5) replaced with if (1) and see if it works properly with current GDB as well as say 4-5 years old one (e.g. with -gdwarf-2 or -gdwarf-3)? If yes, we should emit it unconditionally. Bootstrapped/regtested on x86_64-linux and i686-linux, on the new ref-3.C testcase emits the number of DIEs I really expect to. Ok for trunk? 2016-11-02 Jakub Jelinek Alexandre Oliva Jason Merrill PR debug/28767 PR debug/56974 * langhooks.h (struct lang_hooks_for_types): Document type_hash_eq being also called on METHOD_TYPEs. Add type_dwarf_attribute and get_ptrmemfn_type langhooks. * langhooks.c (lhd_type_dwarf_attribute): New function. * langhooks-def.h (lhd_type_dwarf_attribute): Declare. (LANG_HOOKS_TYPE_DWARF_ATTRIBUTE, LANG_HOOKS_GET_PTRMEMFN_TYPE): Define. (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add LANG_HOOKS_TYPE_DWARF_ATTRIBUTE and LANG_HOOKS_GET_PTRMEMFN_TYPE. * hooks.h (hook_tree_const_tree_int_null): Declare. * hooks.c (hook_tree_const_tree_int_null): New function. * tree.h (check_lang_type): Declare. * tree.c (check_lang_type): New function. (check_qualified_type, check_aligned_type): Call it. * dwarf2out.c (modified_type_die): Don't use type_main_variant for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with check_base_type and check_lang_type. (gen_ptr_to_mbr_type_die): Add class_type and member_type arguments, allow type to be something other than OFFSET_TYPE, if lookup_type_die is already non-NULL, return early. For OFFSET_TYPE add DW_AT_use_location attribute. (gen_subroutine_type_die): Add DW_AT_{,rvalue_}reference attribute if needed. (gen_type_die_with_usage): Don't use type_main_variant for FUNCTION_TYPE or METHOD_TYPE, instead walk over variants with check_base_type and check_lang_type. Formatting fixes. Adjust gen_ptr_to_mbr_type_die caller. Handle PMF. cp/ * tree.c (cp_check_qualified_type): Use check_base_type and TYPE_QUALS comparison instead of check_qualified_type. (cxx_type_hash_eq): Return false if type_memfn_rqual don't match. * cp-objcp-common.c (cp_get_ptrmemfn_type): New function. (cp_decl_dwarf_attribute): Don't handle types here. (cp_type_dwarf_attribute): New function. * cp-objcp-common.h (cp_get_ptrmemfn_type, cp_type_dwarf_attribute): Declare. (LANG_HOOKS_GET_PTRMEMFN_TYPE, LANG_HOOKS_TYPE_DWARF_ATTRIBUTE): Define. testsuite/ * g++.dg/debug/dwarf2/ptrdmem-1.C: New test. * g++.dg/debug/dwarf2/ref-3.C: New test. * g++.dg/debug/dwarf2/refqual-1.C: New test. * g++.dg/debug/dwarf2/refqual-2.C: New test. --- gcc/langhooks.h.jj 2016-11-01 15:18:44.994506161 +0100 +++ gcc/langhooks.h 2016-11-02 11:43:51.905046755 +0100 @@ -120,7 +120,7 @@ struct lang_hooks_for_types /* Return TRUE if TYPE1 and TYPE2 are identical for type hashing purposes. Called only after doing all language independent checks. At present, this function is only called when both TYPE1 and TYPE2 are - FUNCTION_TYPEs. */ + FUNCTION_TYPE or METHOD_TYPE. */ bool (*type_hash_eq) (const_tree, const_tree); /* Return TRUE if TYPE uses a hidden descriptor and fills in information @@ -162,6 +162,15 @@ struct lang_hooks_for_types for the debugger about scale factor, etc. */ bool (*get_fixed_point_type_info) (const_tree, struct
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On 11/02/2016 03:20 PM, Marek Polacek wrote: > On Wed, Nov 02, 2016 at 11:10:53AM +0100, Jakub Jelinek wrote: >> On Wed, Nov 02, 2016 at 10:59:26AM +0100, Jakub Jelinek wrote: Which is gimplified as: int * ptr; switch (argc) , case 1: > { int a; try { ASAN_MARK (2, &a, 4); : goto ; : ptr = &a; goto ; } finally { ASAN_MARK (1, &a, 4); } >> >>> Shouldn't there be also ASAN_MARK on the implicit gotos from the switch >>> statement? Otherwise, consider this being done in a loop, after the first >>> iteration you ASAN_MARK (1, &a, 4) (i.e. poison), then you iterate say with >>> args 1 and have in case 1: a = 0;, won't that trigger runtime error? >> >> Wonder if for the variables declared inside of switch body, because we don't >> care about uses before scope, it wouldn't be more efficient to arrange for >> int *p, *q, *r; >> switch (x) >> { >> int a; >> case 1: >> p = &a; >> a = 5; >> break; >> int b; >> case 2: >> int c; >> q = &b; >> r = &c; >> b = 3; >> c = 4; >> break; >> } >> to effectively ASAN_MARK (2, &a, 4); ASAN_MARK (2, &b, 4); ASAN_MARK (2, &c, >> 4); >> before the GIMPLE_SWITCH, instead of unpoisoning them on every case label >> where they might be in scope. Though, of course, at least until lower pass >> that is quite ugly, because it would refer to "not yet declared" variables. >> Perhaps we'd need to move the ASAN_MARK and GIMPLE_SWITCH (but of course not >> the expression evaluation of the switch control expression) inside of the >> switches' GIMPLE_BIND. > > So is there anything I should do wrt -Wswitch-unreachable? > > Marek > Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper place in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive regression tests. Thanks, Martin
[PATCH] Generate reproducible output independently of the build-path
(Please keep me on CC, I am not subscribed) Background == We are on a long journey to make build processes be able to reproduce the build outputs independently of which filesystem path the build is being executed from - e.g. if the executing user doesn't have root access to be able to execute it under a standard path such as /build. This currently is making about 2k-3k [1] packages in Debian unreproducible when build-paths are varied across builds. [1] https://tests.reproducible-builds.org/debian/issues/unstable/captures_build_path_issue.html Previous attempts have involved using -fdebug-prefix-map to strip away the prefix of an absolute path, leaving behind the part relative to the top-level directory of the source code, which is reproducible. But this flag was itself stored in DW_AT_producer, which makes the final output unreproducible. This was pointed out in bug 68848 and fixed in r231835. However, through more testing, we have found out that the fix just mentioned is not enough to achieve reproducibility in practice. The main issue is that many different packages like to store CFLAGS et. al. in arbitrary ways. So if we add an explicit -fdebug-prefix-map flag to the system-level CFLAGS etc, this will often propagate into the build result, making it again dependent on the build-path, and not reproducible. For example: Some packages embed compiler flags into their pkg-config files (or equivalent): https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/curl.html https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/perl.html https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/qt4-x11.html https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/fflas-ffpack.html https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/sip4.html Other packages embed compiler flags directly into the binary: https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/singular.html https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/mutt.html etc etc. We think it's not appropriate to patch all (3k+) of these packages to strip out -fdebug-prefix-map flags. This would involve adding quite complex logic to everyone's build scripts, and we have to adapt this logic every single time to that particular package. Also, in general CFLAGS is *supposed* to affect the compiler output, and saving it unconditionally is quite a reasonable thing for packages to do. If we tried to patch all of these packages, we would be turning "reproducible builds" in to a costly opt-in feature, rather than on-by-default that everyone can easily benefit from. So, we believe it is better to patch GCC centrally. Our proposed solution is similar to (a) the SOURCE_DATE_EPOCH environment variable which was previously accepted into GCC and was used to successfully make 400+ packages reproducible, and (b) the -fdebug-prefix-map mechanism that already exists in GCC and which nearly but not quite, achieves at-scale build-path-independent reproducibility. Proposal This patch series adds a new environment variable SOURCE_PREFIX_MAP. When this is set, GCC will treat this as an implicit "-fdebug-prefix-map=$value" command-line argument. This makes the final binary output reproducible, and also hides the unreproducible value (the build path prefix) from CFLAGS et. al. which everyone is (understandably) embedding as-is into their build output. This environment variable also acts on the __FILE__ macro, mapping it in the same way that debug-prefix-map works for debug symbols. We have seen that __FILE__ is also a very large source of unreproducibility, and is represented quite heavily in the 3k+ figure given above. Finally, we tweak the __TIMESTAMP__ macro so it honours SOURCE_DATE_EPOCH, in a similar way to how __DATE__ and __TIME__ do so already. More details are given in the headers of the patch files themselves. Testing === I've tested these patches on a Debian testing/unstable x86_64-linux-gnu system. So far I've only run the new tests that this patch adds, on a disable-bootstrap build. I will do a full bootstrap and run the full testsuite over the next few days, both with and without this patch, and report back. Copyright disclaimer I dedicate these patches to the public domain by waiving all of my rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text. Please let me know if the above is insufficient and I will be happy to sign any relevant forms.
[PATCH 3/4] Use SOURCE_PREFIX_MAP envvar to transform __FILE__
Honour the SOURCE_PREFIX_MAP environment variable when expanding the __FILE__ macro, in the same way that debug-prefix-map works for debugging symbol paths. This patch follows similar lines to the earlier patch for SOURCE_DATE_EPOCH. Specifically, we read the environment variable not in libcpp but via a hook which has an implementation defined in gcc/c-family. However, to achieve this is more complex than the earlier patch: we need to share the prefix_map data structure and associated functions between libcpp and c-family. Therefore, we need to move these to libiberty. (For comparison, the SOURCE_DATE_EPOCH patch did not need this because time_t et. al. are in the standard C library.) Acknowledgements Dhole who wrote the earlier patch for SOURCE_DATE_EPOCH which saved me a lot of time on figuring out what to edit. ChangeLogs -- include/ChangeLog: 2016-11-01 Ximin Luo * prefix-map.h: New file, mostly derived from /gcc/final.c. libiberty/ChangeLog: 2016-11-01 Ximin Luo * prefix-map.c: New file, mostly derived from /gcc/final.c. * Makefile.in: Update for new files. gcc/ChangeLog: 2016-11-01 Ximin Luo * final.c: Generalise and refactor code related to debug_prefix_map. Move some of it to /libiberty/prefix-map.c, /include/prefix-map.h and refactor the remaining code to use the moved-out things. * doc/invoke.texi (Environment Variables): Update SOURCE_PREFIX_MAP to describe how it affects __FILE__ expansion. gcc/c-family/ChangeLog: 2016-11-01 Ximin Luo * c-common.c (cb_get_source_prefix_map): Define new call target. * c-common.h (cb_get_source_prefix_map): Declare call target. * c-lex.c (init_c_lex): Set the get_source_prefix_map callback. libcpp/ChangeLog: 2016-11-01 Ximin Luo * include/cpplib.h (cpp_callbacks): Add get_source_prefix_map callback. * init.c (cpp_create_reader): Initialise source_prefix_map field. * internal.h (cpp_reader): Add new field source_prefix_map. * macro.c (_cpp_builtin_macro_text): Set the source_prefix_map field if unset and apply it to the __FILE__ macro. gcc/testsuite/ChangeLog: 2016-11-01 Ximin Luo * gcc.dg/cpp/source_prefix_map-1.c: New test. * gcc.dg/cpp/source_prefix_map-2.c: New test. Index: gcc-7-20161030/include/prefix-map.h === --- /dev/null +++ gcc-7-20161030/include/prefix-map.h @@ -0,0 +1,71 @@ +/* Declarations for manipulating filename prefixes. + + Copyright (C) 2016 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software Foundation, + Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. */ + +#ifndef _PREFIX_MAP_H +#define _PREFIX_MAP_H + +#ifdef __cplusplus +extern "C" { +#endif + +#ifdef HAVE_STDLIB_H +#include +#endif + +/* Linked-list of mappings from old prefixes to new prefixes. */ + +struct prefix_map +{ + const char *old_prefix; + const char *new_prefix; + size_t old_len; + size_t new_len; + struct prefix_map *next; +}; + +/* Parse a single prefix-map. + + The string `arg' is split at the final '=' character. The part before + it is used to set `map->old_prefix' and `map->old_len', and the part + after it is used to set `map->new_prefix' and `map->new_len'. + + If `arg' does not contain a '=' then 0 is returned. Otherwise, a + non-zero value is returned. + */ + +extern int parse_prefix_map (const char *arg, struct prefix_map *map); + +/* Perform mapping of filename prefixes. + + Return the filename corresponding to `old_name'. The return value is + equal to `old_name' if no transformation occurred, else it is equal + to `new_name' where the new filename is stored. + + On entry into this function, `new_name' must be able to hold at least + `(old_name - map->old_len + map->old_len + 1)' characters, where + `map' is the mapping that will be selected and performed. + */ + +extern const char *apply_prefix_map (const char *old_name, char *new_name, +struct prefix_map *map_head); + +#ifdef __cplusplus +} +#endif + +#endif /* _PREFIX_MAP_H */ Index: gcc-7-20161030/libiberty/prefix-map.c === --- /dev/null +++ gcc-7-20161030/libiberty/prefix-map.c @
[PATCH 1/4] Use SOURCE_PREFIX_MAP envvar as an implicit debug-prefix-map
Define the SOURCE_PREFIX_MAP environment variable, and treat it as an implicit CLI -fdebug-prefix-map option specified before any explicit such options. Acknowledgements Daniel Kahn Gillmor who wrote the patch for r231835, which saved me a lot of time figuring out what to edit. HW42 for discussion on the details of the proposal, and for suggesting that we retain the ability to map the prefix to something other than ".". ChangeLogs -- gcc/ChangeLog: 2016-11-01 Ximin Luo * opts-global.c (add_debug_prefix_map_from_envvar): Add the value of SOURCE_PREFIX_MAP as a debug_prefix_map if the former is set. (handle_common_deferred_options): Call add_debug_prefix_map_from_envvar before processing options. * final.c: (add_debug_prefix_map): Be less specific in the error message, since it can also be triggered by the SOURCE_PREFIX_MAP variable. * doc/invoke.texi (Environment Variables): Document form and behaviour of SOURCE_PREFIX_MAP. gcc/testsuite/ChangeLog: 2016-11-01 Ximin Luo * gcc.dg/debug/dwarf2/source_prefix_map-1.c: New test. * gcc.dg/debug/dwarf2/source_prefix_map-2.c: New test. Index: gcc-7-20161030/gcc/opts-global.c === --- gcc-7-20161030.orig/gcc/opts-global.c +++ gcc-7-20161030/gcc/opts-global.c @@ -310,6 +310,21 @@ decode_options (struct gcc_options *opts finish_options (opts, opts_set, loc); } +/* Add a debug-prefix-map using the SOURCE_PREFIX_MAP environment variable if + it is set. */ + +static void +add_debug_prefix_map_from_envvar () +{ + char *prefix_map; + + prefix_map = getenv ("SOURCE_PREFIX_MAP"); + if (!prefix_map) +return; + + add_debug_prefix_map (prefix_map); +} + /* Hold command-line options associated with stack limitation. */ const char *opt_fstack_limit_symbol_arg = NULL; int opt_fstack_limit_register_no = -1; @@ -335,6 +350,8 @@ handle_common_deferred_options (void) if (flag_opt_info) opt_info_switch_p (NULL); + add_debug_prefix_map_from_envvar (); + FOR_EACH_VEC_ELT (v, i, opt) { switch (opt->opt_index) Index: gcc-7-20161030/gcc/final.c === --- gcc-7-20161030.orig/gcc/final.c +++ gcc-7-20161030/gcc/final.c @@ -1531,7 +1531,7 @@ add_debug_prefix_map (const char *arg) p = strchr (arg, '='); if (!p) { - error ("invalid argument %qs to -fdebug-prefix-map", arg); + error ("invalid value %qs for debug-prefix-map", arg); return; } map = XNEW (debug_prefix_map); Index: gcc-7-20161030/gcc/doc/invoke.texi === --- gcc-7-20161030.orig/gcc/doc/invoke.texi +++ gcc-7-20161030/gcc/doc/invoke.texi @@ -26222,6 +26222,23 @@ Recognize EUCJP characters. If @env{LANG} is not defined, or if it has some other value, then the compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale to recognize and translate multibyte characters. + +@item SOURCE_PREFIX_MAP If this variable is set, it specifies a mapping +that is used to transform filepaths that are output in debug symbols. +This helps the embedded paths become reproducible, without having the +unreproducible value be visible in other input sources - such as GCC +command-line flags or standardised build-time environment variables like +@code{CFLAGS} - that are commonly legitimately-embedded in the build +output by higher-level build processes. + +The form and behaviour is similar to @option{-fdebug-prefix-map}. That +is, the value of @env{SOURCE_PREFIX_MAP} must be of the form +@samp{@var{old}=@var{new}}. The split occurs on the first @code{=} +character, so that @var{old} cannot itself contain a @code{=}. + +Whenever an absolute source- or build-related filepath is to be emitted +in a final end-result output, GCC will replace @var{old} with @var{new} +if that filepath starts with @var{old}. @end table @noindent Index: gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-1.c === --- /dev/null +++ gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-1.c @@ -0,0 +1,9 @@ +/* DW_AT_comp_dir should be relative if SOURCE_PREFIX_MAP is a prefix of it. */ +/* { dg-do compile } */ +/* { dg-options "-gdwarf -dA" } */ +/* { dg-set-compiler-env-var SOURCE_PREFIX_MAP "[file dirname [pwd]]=DWARF2TEST" } */ +/* { dg-final { scan-assembler-dem "DW_AT_comp_dir: \"DWARF2TEST/gcc" } } */ + +void func (void) +{ +} Index: gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-2.c === --- /dev/null +++ gcc-7-20161030/gcc/testsuite/gcc.dg/debug/dwarf2/source_prefix_map-2.c @@ -0,0 +1,8 @@ +/* DW_AT_comp_dir should be absolute if SOURCE_PREFIX_MAP is not set. */ +/* { dg-do compile } */ +/* { dg-op
[PATCH 4/4] Use SOURCE_DATE_EPOCH in place of __TIMESTAMP__ if the latter is newer.
This brings the behaviour in line with the __DATE__ and __TIME__ macros. Note though the minor difference: __TIMESTAMP__ is defined as the file-modification date and not the "current" date or time like the other two macros. Therefore, we do a clamping behaviour (similar to tar --clamp-mtime). Acknowledgements Reiner Herrmann suggested the clamping behaviour. ChangeLogs -- libcpp/ChangeLog: 2016-11-01 Ximin Luo * macro.c (_cpp_builtin_macro_text): Use SOURCE_DATE_EPOCH in place of __TIMESTAMP__ if the latter is newer than the former. gcc/ChangeLog: 2016-11-01 Ximin Luo * doc/cppenv.texi (Environment Variables): Update SOURCE_DATE_EPOCH to describe the new effect on __TIMESTAMP__. gcc/testsuite/ChangeLog: 2016-11-01 Ximin Luo * gcc.dg/cpp/source_date_epoch-4.c: New test. * gcc.dg/cpp/source_date_epoch-5.c: New test. Index: gcc-7-20161030/libcpp/macro.c === --- gcc-7-20161030.orig/libcpp/macro.c +++ gcc-7-20161030/libcpp/macro.c @@ -264,7 +264,30 @@ _cpp_builtin_macro_text (cpp_reader *pfi struct tm *tb = NULL; struct stat *st = _cpp_get_file_stat (file); if (st) - tb = localtime (&st->st_mtime); + { + /* Set a reproducible timestamp for __DATE__ and __TIME__ macro + if SOURCE_DATE_EPOCH is defined. */ + if (pfile->source_date_epoch == (time_t) -2 + && pfile->cb.get_source_date_epoch != NULL) + pfile->source_date_epoch = pfile->cb.get_source_date_epoch (pfile); + + if (pfile->source_date_epoch >= (time_t) 0) + { + /* If SOURCE_DATE_EPOCH is set, we want reproducible + timestamps so use gmtime not localtime. */ + if (st->st_mtime >= pfile->source_date_epoch) + tb = gmtime (&pfile->source_date_epoch); + else + /* Don't use SOURCE_DATE_EPOCH if the timestamp is +older, since that means it was probably already +set in a reproducible way (e.g. via source tarball +extraction or some other method). */ + tb = gmtime (&st->st_mtime); + } + else + tb = localtime (&st->st_mtime); + } + if (tb) { char *str = asctime (tb); Index: gcc-7-20161030/gcc/doc/cppenv.texi === --- gcc-7-20161030.orig/gcc/doc/cppenv.texi +++ gcc-7-20161030/gcc/doc/cppenv.texi @@ -83,8 +83,9 @@ main input file is omitted. @item SOURCE_DATE_EPOCH If this variable is set, its value specifies a UNIX timestamp to be used in replacement of the current date and time in the @code{__DATE__} -and @code{__TIME__} macros, so that the embedded timestamps become -reproducible. +and @code{__TIME__} macros, and in replacement of the file modification +date in the @code{__TIMESTAMP__} macro if the latter is newer than the +former, so that the embedded timestamps become reproducible. The value of @env{SOURCE_DATE_EPOCH} must be a UNIX timestamp, defined as the number of seconds (excluding leap seconds) since Index: gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-4.c === --- /dev/null +++ gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-4.c @@ -0,0 +1,13 @@ +/* __TIMESTAMP__ should use SOURCE_DATE_EPOCH if the latter is older. */ +/* { dg-do run } */ +/* { dg-set-compiler-env-var TZ "UTC" } */ +/* { dg-set-compiler-env-var LC_ALL "C" } */ +/* { dg-set-compiler-env-var SOURCE_DATE_EPOCH "0" } */ + +int +main () +{ + if (__builtin_strcmp (__TIMESTAMP__, "Thu Jan 1 00:00:00 1970") != 0) +__builtin_abort (); + return 0; +} Index: gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-5.c === --- /dev/null +++ gcc-7-20161030/gcc/testsuite/gcc.dg/cpp/source_date_epoch-5.c @@ -0,0 +1,13 @@ +/* __TIMESTAMP__ should not use SOURCE_DATE_EPOCH if the latter is newer. */ +/* { dg-do run } */ +/* { dg-set-compiler-env-var TZ "UTC" } */ +/* { dg-set-compiler-env-var LC_ALL "C" } */ +/* { dg-set-compiler-env-var SOURCE_DATE_EPOCH "253402300799" } */ + +int +main () +{ + if (__builtin_strcmp (__TIMESTAMP__, "Fri Dec 31 23:59:59 UTC ") == 0) +__builtin_abort (); + return 0; +}
[PATCH 2/4] Split prefix-map values on the last '=' character, not the first
We are planning to ask other tools to support SOURCE_PREFIX_MAP, in the same way that we have already done this for SOURCE_DATE_EPOCH. So, this will last for some time and have quite wide reach. Consequently, we believe it is better to split on the final '=', since it is much less likely to result in problems. For example, with the previous behaviour (splitting on the first '=') it would not be possible to map paths like the following: C:\Documents and Settings\Betrand Russell\Proofs of 1+1=2\Automated Proofs\Source Code\ /srv/net/distributed-hash-table/address/VaIWP8YIWDChR2O76yiufXBsbw5g2skB_kp9VP-qVLvydovdGw==/projects/gcc-6/ These are contrived examples, but hopefully you can agree that they're not *so* unrealistic - future software or users might plausibly wish to run reproducible build processes underneath paths similar to these. On the other hand, I can think of much fewer cases where the new-prefix *must* include a '=' character. I can't think of any software project that includes it, and I'd imagine that any such (or future) projects that might exist would already have standardised "ASCII-only" versions of its name. ChangeLogs -- gcc/ChangeLog: 2016-11-01 Ximin Luo * final.c: (add_debug_prefix_map): Split on the last and not first '='. * doc/invoke.texi (Environment Variables): Update SOURCE_PREFIX_MAP to describe new parsing. Index: gcc-7-20161030/gcc/final.c === --- gcc-7-20161030.orig/gcc/final.c +++ gcc-7-20161030/gcc/final.c @@ -1528,7 +1528,7 @@ add_debug_prefix_map (const char *arg) debug_prefix_map *map; const char *p; - p = strchr (arg, '='); + p = strrchr (arg, '='); if (!p) { error ("invalid value %qs for debug-prefix-map", arg); Index: gcc-7-20161030/gcc/doc/invoke.texi === --- gcc-7-20161030.orig/gcc/doc/invoke.texi +++ gcc-7-20161030/gcc/doc/invoke.texi @@ -26233,8 +26233,8 @@ output by higher-level build processes. The form and behaviour is similar to @option{-fdebug-prefix-map}. That is, the value of @env{SOURCE_PREFIX_MAP} must be of the form -@samp{@var{old}=@var{new}}. The split occurs on the first @code{=} -character, so that @var{old} cannot itself contain a @code{=}. +@samp{@var{old}=@var{new}}. The split occurs on the last @code{=} +character, so that @var{new} cannot itself contain a @code{=}. Whenever an absolute source- or build-related filepath is to be emitted in a final end-result output, GCC will replace @var{old} with @var{new}
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 02, 2016 at 11:10:53AM +0100, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 10:59:26AM +0100, Jakub Jelinek wrote: > > > Which is gimplified as: > > > > > > int * ptr; > > > > > > switch (argc) , case 1: > > > > { > > > int a; > > > > > > try > > > { > > > ASAN_MARK (2, &a, 4); > > > : > > > goto ; > > > : > > > ptr = &a; > > > goto ; > > > } > > > finally > > > { > > > ASAN_MARK (1, &a, 4); > > > } > > > Shouldn't there be also ASAN_MARK on the implicit gotos from the switch > > statement? Otherwise, consider this being done in a loop, after the first > > iteration you ASAN_MARK (1, &a, 4) (i.e. poison), then you iterate say with > > args 1 and have in case 1: a = 0;, won't that trigger runtime error? > > Wonder if for the variables declared inside of switch body, because we don't > care about uses before scope, it wouldn't be more efficient to arrange for > int *p, *q, *r; > switch (x) > { > int a; > case 1: > p = &a; > a = 5; > break; > int b; > case 2: > int c; > q = &b; > r = &c; > b = 3; > c = 4; > break; > } > to effectively ASAN_MARK (2, &a, 4); ASAN_MARK (2, &b, 4); ASAN_MARK (2, &c, > 4); > before the GIMPLE_SWITCH, instead of unpoisoning them on every case label > where they might be in scope. Though, of course, at least until lower pass > that is quite ugly, because it would refer to "not yet declared" variables. > Perhaps we'd need to move the ASAN_MARK and GIMPLE_SWITCH (but of course not > the expression evaluation of the switch control expression) inside of the > switches' GIMPLE_BIND. So is there anything I should do wrt -Wswitch-unreachable? Marek
Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module
On 02/11/16 13:42, Jakub Jelinek wrote: On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote: -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ Too long line. Hmm, it shows 80 columns under my editor. I guess '+' is counted in? +/* RTL sequences inside PARALLEL are raw expression representation. + + mem_loc_descriptor can be used to build generic DWARF expressions for + DW_CFA_expression and DW_CFA_val_expression where the expression may can + not be represented using normal RTL sequences. In this case, group all + expression operations (DW_OP_*) inside a PARALLEL. For those DW_OP which + doesn't have RTL mapping, wrap it using UNSPEC. The logic for parsing + PARALLEL sequences is: + + foreach elem inside PARALLEL + if (elem is UNSPEC) + dw_op = XINT (elem, 1) (DWARF operation is kept as UNSPEC number) + oprnd1 = XVECEXP (elem, 0, 0) + oprnd2 = XVECEXP (elem, 0, 1) + else + call mem_loc_descriptor */ Not sure if it is a good idea to document in weirdly formatted pseudo-language what the code actually does a few lines below. IMHO either express it in words, or don't express it at all. OK, fixed. I replaced these comments as some brief words. + exp_result = + new_loc_descr ((enum dwarf_location_atom) dw_op, oprnd1, +oprnd2); Wrong formatting, = should be on the next line. + } + else + exp_result = + mem_loc_descriptor (elem, mode, mem_mode, + VAR_INIT_STATUS_INITIALIZED); Likewise. Both fixed. Patch updated, please review. Thanks. gcc/ 2016-11-02 Jiong Wang * reg-notes.def (CFA_VAL_EXPRESSION): New entry. * dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New function. (dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION. (output_cfa_loc): Support DW_CFA_val_expression. (output_cfa_loc_raw): Likewise. (output_cfi): Likewise. (output_cfi_directive): Likewise. * dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression. (dw_cfi_oprnd2_desc): Likewise. (mem_loc_descriptor): Recognize new pattern generated for value expression. commit 36de0173c17efcc30c56ef10304377e71313e8bc Author: Jiong Wang Date: Wed Oct 19 15:42:04 2016 +0100 dwarf val expression diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c index 6491d5a..b8c88fb 100644 --- a/gcc/dwarf2cfi.c +++ b/gcc/dwarf2cfi.c @@ -1235,7 +1235,7 @@ dwarf2out_frame_debug_cfa_register (rtx set) reg_save (sregno, dregno, 0); } -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ static void dwarf2out_frame_debug_cfa_expression (rtx set) @@ -1267,6 +1267,29 @@ dwarf2out_frame_debug_cfa_expression (rtx set) update_row_reg_save (cur_row, regno, cfi); } +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_VAL_EXPRESSION + note. */ + +static void +dwarf2out_frame_debug_cfa_val_expression (rtx set) +{ + rtx dest = SET_DEST (set); + gcc_assert (REG_P (dest)); + + rtx span = targetm.dwarf_register_span (dest); + gcc_assert (!span); + + rtx src = SET_SRC (set); + dw_cfi_ref cfi = new_cfi (); + cfi->dw_cfi_opc = DW_CFA_val_expression; + cfi->dw_cfi_oprnd1.dw_cfi_reg_num = dwf_regno (dest); + cfi->dw_cfi_oprnd2.dw_cfi_loc += mem_loc_descriptor (src, GET_MODE (src), + GET_MODE (dest), VAR_INIT_STATUS_INITIALIZED); + add_cfi (cfi); + update_row_reg_save (cur_row, dwf_regno (dest), cfi); +} + /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note. */ static void @@ -2033,10 +2056,16 @@ dwarf2out_frame_debug (rtx_insn *insn) break; case REG_CFA_EXPRESSION: + case REG_CFA_VAL_EXPRESSION: n = XEXP (note, 0); if (n == NULL) n = single_set (insn); - dwarf2out_frame_debug_cfa_expression (n); + + if (REG_NOTE_KIND (note) == REG_CFA_EXPRESSION) + dwarf2out_frame_debug_cfa_expression (n); + else + dwarf2out_frame_debug_cfa_val_expression (n); + handled_one = true; break; @@ -3015,7 +3044,8 @@ output_cfa_loc (dw_cfi_ref cfi, int for_eh) dw_loc_descr_ref loc; unsigned long size; - if (cfi->dw_cfi_opc == DW_CFA_expression) + if (cfi->dw_cfi_opc == DW_CFA_expression + || cfi->dw_cfi_opc == DW_CFA_val_expression) { unsigned r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh); @@ -3041,7 +3071,8 @@ output_cfa_loc_raw (dw_cfi_ref cfi) dw_loc_descr_ref loc; unsigned long size; - if (cfi->dw_cfi_opc == DW_CFA_expression) + if (cfi->dw_cfi_opc == DW_CFA_expression + || cfi->dw_cfi_opc == DW_CFA_val_expression) { u
Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
On Wed, Nov 2, 2016 at 2:43 PM, Wilco Dijkstra wrote: > Richard Biener wrote: > On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra > wrote: > >> > If bswap is false no byte swap is needed, so we found a native endian load >> > and it will always perform the optimization by inserting an unaligned load. >> >> Yes, the general agreement is that the expander can do best and thus we >> should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS. >> The expander will generate the canonical best code (hopefully...). > > Right, but there are cases where you have to choose between unaligned or > aligned > accesses and you need to know whether the unaligned access is fast. > > A good example is memcpy expansion, if you have fast unaligned accesses then > you > should use them to deal with the last few bytes, but if they get expanded, > using several > aligned accesses is much faster than a single unaligned access. Yes. That's RTL expansion at which point you of course have to look at SLOW_UNALIGNED_ACCESS. >> > This apparently works on all targets, and doesn't cause alignment traps or >> > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS. >> > So I'm at a loss what these macros are supposed to mean and how I can query >> > whether a backend supports fast unaligned access for a particular mode. >> > >> > What I actually want to write is something like: >> > >> > if (!FAST_UNALIGNED_LOAD (mode, align)) return false; >> > >> > And know that it only accepts unaligned accesses that are efficient on the >> > target. >> > Maybe we need a new hook like this and get rid of the old one? >> >> No, we don't need to other hook. >> >> Note there is another similar user in gimple-fold.c when folding small >> memcpy/memmove >> to single load/store pairs (patch posted but not applied by me -- I've >> asked for strict-align >> target maintainer feedback but got none). > > I didn't find it, do you have a link? https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00598.html >> Now - for bswap I'm only 99% sure that unaligned load + bswap is >> better than piecewise loads plus manual swap. > > It depends on whether unaligned loads and bswap are expanded or not. Even if > we > assume the expansion is at least as efficient as doing it explicitly > (definitely true > for modes larger than the native integer size - as we found out in PR77308!), > if both the unaligned load and bswap are expanded it seems better not to make > the > transformation for modes up to the word size. But there is no way to find out > as > SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGN is true. The case I was thinking about is availability of a bswap load operating only on aligned memory and "regular" register bswap being "fake" provided by first spilling to an aligned stack slot and then loading from that. Maybe a bit far-fetched. >> But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS / >> STRICT_ALIGNMENT checks from the GIMPLE side of the compiler. > > I sort of agree because the purpose of these macros is unclear - the > documentation > is insufficient and out of date. I do believe however we need an accurate way > to find out > whether a target supports fast unaligned accesses as that is required to > generate good > target code. I believe the target macros are solely for RTL expansion and say that it has to avoid unaligned ops as those would trap. Richard. > Wilco
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Wed, Nov 2, 2016 at 2:40 PM, Segher Boessenkool wrote: > On Wed, Nov 02, 2016 at 11:39:20AM +0100, Steven Bosscher wrote: >> On Wed, Nov 2, 2016 at 10:02 AM, Richard Biener >> wrote: >> > On Mon, Oct 31, 2016 at 4:35 PM, Segher Boessenkool wrote: >> >> On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote: >> >>> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote: >> >>> > + cfg_layout_finalize (); >> >>> >> >>> I don't think you have to go into/out-of cfglayout mode for each >> >>> iteration. >> >> >> >> Yeah probably. I was too lazy :-) It needs the cleanup_cfg every >> >> iteration though, I was afraid that interacts. >> > >> > Ick. Why would it need a cfg-cleanup every iteration? >> >> I don't believe it needs a cleanup on every iteration. One cleanup at >> the end should work fine. > > But as the comment there says: > > /* Merge the duplicated blocks into predecessors, when possible. */ > cleanup_cfg (0); > > (this is not a new comment), and without merging blocks this whole > patch does zilch. > > There is no need to create new basic blocks at all (can replace the > final branch in a block with a copy of the whole block it jumps to, > instead); and then it is painfully obvious that switching to layout > mode here is pointless, too. > > Do you want me to do that? > > Btw, this isn't quadratic anyway; it is a constant number (the maximal > length allowed of the block to copy) linear. Unless there are unboundedly > many forwarder blocks, which there shouldn't be, but I cannot prove that. Well, you iterate calling functions (find candidates, merge, cleanup-cfg) that walk the whole function. So unless the number of iterations is bound with a constant I call this quadratic (in function size). > And on a testcase with 2000 cases (instead of the 4 in the testcase in > the PR) this pass takes less than 1% of the compiler runtime; and in > the normal cases (no computed gotos to unfactor) it does the same as > before. > > > Segher
Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
Richard Biener wrote: On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra wrote: > > If bswap is false no byte swap is needed, so we found a native endian load > > and it will always perform the optimization by inserting an unaligned load. > > Yes, the general agreement is that the expander can do best and thus we > should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS. > The expander will generate the canonical best code (hopefully...). Right, but there are cases where you have to choose between unaligned or aligned accesses and you need to know whether the unaligned access is fast. A good example is memcpy expansion, if you have fast unaligned accesses then you should use them to deal with the last few bytes, but if they get expanded, using several aligned accesses is much faster than a single unaligned access. > > This apparently works on all targets, and doesn't cause alignment traps or > > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS. > > So I'm at a loss what these macros are supposed to mean and how I can query > > whether a backend supports fast unaligned access for a particular mode. > > > > What I actually want to write is something like: > > > > if (!FAST_UNALIGNED_LOAD (mode, align)) return false; > > > > And know that it only accepts unaligned accesses that are efficient on the > > target. > > Maybe we need a new hook like this and get rid of the old one? > > No, we don't need to other hook. > > Note there is another similar user in gimple-fold.c when folding small > memcpy/memmove > to single load/store pairs (patch posted but not applied by me -- I've > asked for strict-align > target maintainer feedback but got none). I didn't find it, do you have a link? > Now - for bswap I'm only 99% sure that unaligned load + bswap is > better than piecewise loads plus manual swap. It depends on whether unaligned loads and bswap are expanded or not. Even if we assume the expansion is at least as efficient as doing it explicitly (definitely true for modes larger than the native integer size - as we found out in PR77308!), if both the unaligned load and bswap are expanded it seems better not to make the transformation for modes up to the word size. But there is no way to find out as SLOW_UNALIGNED_ACCESS must be true whenever STRICT_ALIGN is true. > But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS / > STRICT_ALIGNMENT checks from the GIMPLE side of the compiler. I sort of agree because the purpose of these macros is unclear - the documentation is insufficient and out of date. I do believe however we need an accurate way to find out whether a target supports fast unaligned accesses as that is required to generate good target code. Wilco
Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module
On Wed, Nov 02, 2016 at 01:26:48PM +, Jiong Wang wrote: > -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. > */ > +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. > */ Too long line. > +/* RTL sequences inside PARALLEL are raw expression representation. > + > + mem_loc_descriptor can be used to build generic DWARF expressions for > + DW_CFA_expression and DW_CFA_val_expression where the expression may > can > + not be represented using normal RTL sequences. In this case, group > all > + expression operations (DW_OP_*) inside a PARALLEL. For those DW_OP > which > + doesn't have RTL mapping, wrap it using UNSPEC. The logic for parsing > + PARALLEL sequences is: > + > + foreach elem inside PARALLEL > + if (elem is UNSPEC) > + dw_op = XINT (elem, 1) (DWARF operation is kept as UNSPEC number) > + oprnd1 = XVECEXP (elem, 0, 0) > + oprnd2 = XVECEXP (elem, 0, 1) > + else > + call mem_loc_descriptor */ Not sure if it is a good idea to document in weirdly formatted pseudo-language what the code actually does a few lines below. IMHO either express it in words, or don't express it at all. > + exp_result = > + new_loc_descr ((enum dwarf_location_atom) dw_op, oprnd1, > + oprnd2); Wrong formatting, = should be on the next line. > + } > + else > + exp_result = > + mem_loc_descriptor (elem, mode, mem_mode, > + VAR_INIT_STATUS_INITIALIZED); Likewise. Jakub
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Wed, Nov 02, 2016 at 11:39:20AM +0100, Steven Bosscher wrote: > On Wed, Nov 2, 2016 at 10:02 AM, Richard Biener > wrote: > > On Mon, Oct 31, 2016 at 4:35 PM, Segher Boessenkool wrote: > >> On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote: > >>> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote: > >>> > + cfg_layout_finalize (); > >>> > >>> I don't think you have to go into/out-of cfglayout mode for each > >>> iteration. > >> > >> Yeah probably. I was too lazy :-) It needs the cleanup_cfg every > >> iteration though, I was afraid that interacts. > > > > Ick. Why would it need a cfg-cleanup every iteration? > > I don't believe it needs a cleanup on every iteration. One cleanup at > the end should work fine. But as the comment there says: /* Merge the duplicated blocks into predecessors, when possible. */ cleanup_cfg (0); (this is not a new comment), and without merging blocks this whole patch does zilch. There is no need to create new basic blocks at all (can replace the final branch in a block with a copy of the whole block it jumps to, instead); and then it is painfully obvious that switching to layout mode here is pointless, too. Do you want me to do that? Btw, this isn't quadratic anyway; it is a constant number (the maximal length allowed of the block to copy) linear. Unless there are unboundedly many forwarder blocks, which there shouldn't be, but I cannot prove that. And on a testcase with 2000 cases (instead of the 4 in the testcase in the PR) this pass takes less than 1% of the compiler runtime; and in the normal cases (no computed gotos to unfactor) it does the same as before. Segher
Re: [RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant
On Wed, 2016-11-02 at 13:41 +0100, Bernd Schmidt wrote: > On 10/27/2016 03:14 AM, Aaron Sawdey wrote: > > > > I'm currently working on a builtin expansion of strncmp for powerpc > > similar to the one for memcmp I checked recently. One thing I > > encountered is that the code in expand_strn_compare will not > > attempt to > > expand the cmpstrnsi pattern at all if neither string parameter is > > a > > constant string. This doesn't make a lot of sense in light of the > > fact > > that expand_str_compare starts with an attempt to expand cmpstrsi > > and > > then if that does not work, looks at the string args to see if one > > is > > constant so it can use cmpstrnsi with the known length. > > > > The attached patch is my attempt to add this fallback path to > > expand_strn_compare, i.e. if neither length is known, just attempt > > expansion of cmpstrnsi using the given 3 arguments. > > > > It looks like in addition to rs6000, there are 3 other targets that > > have cmpstrnsi patterns: i386, sh, and rx. > > > > Is this a reasonable approach to take with this? If so I'll > > bootstrap/regtest on i386 as rs6000 does not as yet have an > > expansion > > for cmpstrsi or cmpstrnsi. > > Just to be sure, this is superseded by your later patch series, > correct? > > (After I saw this one I had been trying to remember what exactly the > i386 issue was but it looks like you found it yourself.) > > > Bernd Yes, the later patch series replaces this preliminary patch. And yes, the i386 issue took some headscratching and careful reading of the arch manual on what repz cmpsb actually does. Thanks, Aaron -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain
Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
On 2 November 2016 at 12:22, Bin.Cheng wrote: > On Tue, Nov 1, 2016 at 12:21 PM, Kyrill Tkachov > wrote: >> Hi Tamar, >> >> >> On 26/10/16 16:01, Tamar Christina wrote: >>> >>> Hi Christophe, >>> >>> Here's the updated patch. >>> >>> Cheers, >>> Tamar >>> >>> From: Christophe Lyon >>> Sent: Wednesday, October 19, 2016 11:23:56 AM >>> To: Tamar Christina >>> Cc: GCC Patches; Kyrylo Tkachov; nd >>> Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and >>> vminnmQ_ST intrinsincs. >>> >>> On 19 October 2016 at 11:36, Tamar Christina >>> wrote: Hi All, This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The current builtin registration code is deficient since it can't access standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map directly. Thus, to enable the vectoriser to have access to these intrinsics, we implement them using builtin functions, which we expand to the proper standard pattern using a define_expand. This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro, which is defined when __ARM_ARCH >= 8, and which enables the intrinsics. Regression tested on arm-none-eabi and no regressions. This patch is a rework of a previous patch: https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html OK for trunk? > These cases failed on arm-none-linux-gnueabihf as below: > FAIL: gcc.target/arm/simd/vmaxnm_f32_1.c execution test > FAIL: gcc.target/arm/simd/vmaxnmq_f32_1.c execution test > FAIL: gcc.target/arm/simd/vminnm_f32_1.c execution test > FAIL: gcc.target/arm/simd/vminnmq_f32_1.c execution test > > For such changes, I would suggest reg test for both bare-metal and > linux toolchains, plus a bootstrap for linux toolchain. > Hi, I confirm some tests are failing: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241736/report-build-info.html Sorry I couldn't answer/test before you committed, I was on holidays. Christophe > Thanks, > bin > >> >> >> Ok. >> Thanks, >> Kyrill >> >> Thanks, Tamar --- gcc/ 2016-10-19 Bilyan Borisov Tamar Christina * config/arm/arm-c.c (arm_cpu_builtins): New macro definition. * config/arm/arm_neon.h (vmaxnm_f32): New intrinsinc. (vmaxnmq_f32): Likewise. (vminnm_f32): Likewise. (vminnmq_f32): Likewise. * config/arm/arm_neon_builtins.def (vmaxnm): New builtin. (vminnm): Likewise. * config/arm/neon.md (neon_, VCVTF): New expander. gcc/testsuite/ 2016-10-19 Bilyan Borisov * gcc.target/arm/simd/vmaxnm_f32_1.c: New. * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise. * gcc.target/arm/simd/vminnm_f32_1.c: Likewise. * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise. >>> I think you forgot to attach the new tests. >>> >>> Christophe >>> >>
Re: [PATCH 1/2, libgcc]: Implement _divmoddi4
On Mon, Oct 31, 2016 at 7:46 AM, Uros Bizjak wrote: > This function will be used in a follow-up patch to implement > TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call > this function, so IMO it should be part of a generic library. > > 2016-10-31 Uros Bizjak > > * Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4. > * libgcc2.c (__divmoddi4): New function. > * libgcc2.h (__divmoddi4): Declare. > * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4 > and __PFX_divmodti4. You aren't defining divmodti4 anywhere, so it seems premature to add it to libgcc-std.ver.in. Other than that the patch is OK. Thanks. Ian
Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module
On 01/11/16 16:48, Jason Merrill wrote: > It seems to me that a CFA_*expression note would never use target > UNSPEC codes, and a DWARF UNSPEC would never appear outside of such a > note, so we don't need to worry about conflicts. Indeed. DWARF UNSPEC is deeper inside DW_CFA_*expression note. My worry of conflict makes no sense. I updated the patch to put DWARF operation in to UNSPEC number field. x86-64 bootstrap OK, no regression on gcc/g++. Please review. Thanks. gcc/ 2016-11-02 Jiong Wang * reg-notes.def (CFA_VAL_EXPRESSION): New entry. * dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New function. (dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION. (output_cfa_loc): Support DW_CFA_val_expression. (output_cfa_loc_raw): Likewise. (output_cfi): Likewise. (output_cfi_directive): Likewise. * dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression. (dw_cfi_oprnd2_desc): Likewise. (mem_loc_descriptor): Recognize new pattern generated for value expression. diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c index 6491d5aaf4c4a21241cc718bfff1016f6d149951..b8c88fbae1df80a2664a414d8ae016a5343bf435 100644 --- a/gcc/dwarf2cfi.c +++ b/gcc/dwarf2cfi.c @@ -1235,7 +1235,7 @@ dwarf2out_frame_debug_cfa_register (rtx set) reg_save (sregno, dregno, 0); } -/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_EXPRESSION note. */ static void dwarf2out_frame_debug_cfa_expression (rtx set) @@ -1267,6 +1267,29 @@ dwarf2out_frame_debug_cfa_expression (rtx set) update_row_reg_save (cur_row, regno, cfi); } +/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_VAL_EXPRESSION + note. */ + +static void +dwarf2out_frame_debug_cfa_val_expression (rtx set) +{ + rtx dest = SET_DEST (set); + gcc_assert (REG_P (dest)); + + rtx span = targetm.dwarf_register_span (dest); + gcc_assert (!span); + + rtx src = SET_SRC (set); + dw_cfi_ref cfi = new_cfi (); + cfi->dw_cfi_opc = DW_CFA_val_expression; + cfi->dw_cfi_oprnd1.dw_cfi_reg_num = dwf_regno (dest); + cfi->dw_cfi_oprnd2.dw_cfi_loc += mem_loc_descriptor (src, GET_MODE (src), + GET_MODE (dest), VAR_INIT_STATUS_INITIALIZED); + add_cfi (cfi); + update_row_reg_save (cur_row, dwf_regno (dest), cfi); +} + /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note. */ static void @@ -2033,10 +2056,16 @@ dwarf2out_frame_debug (rtx_insn *insn) break; case REG_CFA_EXPRESSION: + case REG_CFA_VAL_EXPRESSION: n = XEXP (note, 0); if (n == NULL) n = single_set (insn); - dwarf2out_frame_debug_cfa_expression (n); + + if (REG_NOTE_KIND (note) == REG_CFA_EXPRESSION) + dwarf2out_frame_debug_cfa_expression (n); + else + dwarf2out_frame_debug_cfa_val_expression (n); + handled_one = true; break; @@ -3015,7 +3044,8 @@ output_cfa_loc (dw_cfi_ref cfi, int for_eh) dw_loc_descr_ref loc; unsigned long size; - if (cfi->dw_cfi_opc == DW_CFA_expression) + if (cfi->dw_cfi_opc == DW_CFA_expression + || cfi->dw_cfi_opc == DW_CFA_val_expression) { unsigned r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh); @@ -3041,7 +3071,8 @@ output_cfa_loc_raw (dw_cfi_ref cfi) dw_loc_descr_ref loc; unsigned long size; - if (cfi->dw_cfi_opc == DW_CFA_expression) + if (cfi->dw_cfi_opc == DW_CFA_expression + || cfi->dw_cfi_opc == DW_CFA_val_expression) { unsigned r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, 1); @@ -3188,6 +3219,7 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref fde, int for_eh) case DW_CFA_def_cfa_expression: case DW_CFA_expression: + case DW_CFA_val_expression: output_cfa_loc (cfi, for_eh); break; @@ -3302,16 +3334,13 @@ output_cfi_directive (FILE *f, dw_cfi_ref cfi) break; case DW_CFA_def_cfa_expression: - if (f != asm_out_file) - { - fprintf (f, "\t.cfi_def_cfa_expression ...\n"); - break; - } - /* FALLTHRU */ case DW_CFA_expression: +case DW_CFA_val_expression: if (f != asm_out_file) { - fprintf (f, "\t.cfi_cfa_expression ...\n"); + fprintf (f, "\t.cfi_%scfa_%sexpression ...\n", + cfi->dw_cfi_opc == DW_CFA_def_cfa_expression ? "def_" : "", + cfi->dw_cfi_opc == DW_CFA_val_expression ? "val_" : ""); break; } fprintf (f, "\t.cfi_escape %#x,", cfi->dw_cfi_opc); diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 4a3df339df2c6a6816ac8b8dbdb2466a7492c592..7dac70d7392f2c457ffd3f677e07decb1ba488a1 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -518,6 +518,7 @@ dw_cfi_oprnd1_desc (enum dwarf_call_frame_info cfi) case DW_CFA_def_cfa_register: case DW_CFA_register: case DW_CFA_expression: +case DW_CFA_val_expression: return dw_cfi_oprnd_reg_num; case DW_CFA_def_cfa_offset: @@ -551,6 +552,7 @@ dw_cfi_oprnd2_desc (enum dwarf_
Re: [PATCH] libiberty: Fix -Wimplicit-fallthrough warnings.
On Wed, Nov 02, 2016 at 02:19:33PM +0100, Mark Wielaard wrote: > Adjust some comments, add some explicit fall through comments or explicit > returns where necessary to not get implicit-fallthrough warnings. > > All fall throughs were deliberate. In one case I added an explicit return > false for clarity instead of falling through a default case (that also > would return false). > > libiberty/ChangeLog: > >* cplus-dem.c (demangle_signature): Move fall through comment. >(demangle_fund_type): Add fall through comment between 'G' and 'I'. >* hashtab.c (iterative_hash): Add fall through comments. >* regex.c (regex_compile): Add Fall through comment after '+'/'?'. >(byte_re_match_2_internal): Add Fall through comment after jump_n. >Change "Note fall through" to "Fall through". >(common_op_match_null_string_p): Return false after set_number_at >instead of fall through. LGTM, except for: > --- a/libiberty/cplus-dem.c > +++ b/libiberty/cplus-dem.c > @@ -1658,8 +1658,8 @@ demangle_signature (struct work_stuff *work, > break; > } > else > - /* fall through */ > {;} > + /* fall through */ I think you should just remove the else and {;} and just have fallthrough comment indented where else used to be. Jakub
[PATCH] libiberty: Fix -Wimplicit-fallthrough warnings.
Adjust some comments, add some explicit fall through comments or explicit returns where necessary to not get implicit-fallthrough warnings. All fall throughs were deliberate. In one case I added an explicit return false for clarity instead of falling through a default case (that also would return false). libiberty/ChangeLog: * cplus-dem.c (demangle_signature): Move fall through comment. (demangle_fund_type): Add fall through comment between 'G' and 'I'. * hashtab.c (iterative_hash): Add fall through comments. * regex.c (regex_compile): Add Fall through comment after '+'/'?'. (byte_re_match_2_internal): Add Fall through comment after jump_n. Change "Note fall through" to "Fall through". (common_op_match_null_string_p): Return false after set_number_at instead of fall through. --- diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c index 7f63397..810e971 100644 --- a/libiberty/cplus-dem.c +++ b/libiberty/cplus-dem.c @@ -1658,8 +1658,8 @@ demangle_signature (struct work_stuff *work, break; } else - /* fall through */ {;} + /* fall through */ default: if (AUTO_DEMANGLING || GNU_DEMANGLING) @@ -4024,6 +4024,7 @@ demangle_fund_type (struct work_stuff *work, success = 0; break; } + /* fall through */ case 'I': (*mangled)++; if (**mangled == '_') diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c index 04607ea..99381b1 100644 --- a/libiberty/hashtab.c +++ b/libiberty/hashtab.c @@ -962,17 +962,17 @@ iterative_hash (const PTR k_in /* the key */, c += length; switch(len) /* all the case statements fall through */ { -case 11: c+=((hashval_t)k[10]<<24); -case 10: c+=((hashval_t)k[9]<<16); -case 9 : c+=((hashval_t)k[8]<<8); +case 11: c+=((hashval_t)k[10]<<24);/* fall through */ +case 10: c+=((hashval_t)k[9]<<16); /* fall through */ +case 9 : c+=((hashval_t)k[8]<<8); /* fall through */ /* the first byte of c is reserved for the length */ -case 8 : b+=((hashval_t)k[7]<<24); -case 7 : b+=((hashval_t)k[6]<<16); -case 6 : b+=((hashval_t)k[5]<<8); -case 5 : b+=k[4]; -case 4 : a+=((hashval_t)k[3]<<24); -case 3 : a+=((hashval_t)k[2]<<16); -case 2 : a+=((hashval_t)k[1]<<8); +case 8 : b+=((hashval_t)k[7]<<24); /* fall through */ +case 7 : b+=((hashval_t)k[6]<<16); /* fall through */ +case 6 : b+=((hashval_t)k[5]<<8); /* fall through */ +case 5 : b+=k[4]; /* fall through */ +case 4 : a+=((hashval_t)k[3]<<24); /* fall through */ +case 3 : a+=((hashval_t)k[2]<<16); /* fall through */ +case 2 : a+=((hashval_t)k[1]<<8); /* fall through */ case 1 : a+=k[0]; /* case 0: nothing left to add */ } diff --git a/libiberty/regex.c b/libiberty/regex.c index 9ffc3f4..6854e3b 100644 --- a/libiberty/regex.c +++ b/libiberty/regex.c @@ -2493,6 +2493,7 @@ PREFIX(regex_compile) (const char *ARG_PREFIX(pattern), if ((syntax & RE_BK_PLUS_QM) || (syntax & RE_LIMITED_OPS)) goto normal_char; + /* Fall through. */ handle_plus: case '*': /* If there is no previous pattern... */ @@ -6697,6 +6698,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp, { case jump_n: is_a_jump_n = true; + /* Fall through. */ case pop_failure_jump: case maybe_pop_jump: case jump: @@ -7125,7 +7127,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp, DEBUG_PRINT1 (" Match => jump.\n"); goto unconditional_jump; } -/* Note fall through. */ +/* Fall through. */ /* The end of a simple repeat has a pop_failure_jump back to @@ -7150,7 +7152,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp, dummy_low_reg, dummy_high_reg, reg_dummy, reg_dummy, reg_info_dummy); } - /* Note fall through. */ + /* Fall through. */ unconditional_jump: #ifdef _LIBC @@ -7453,6 +7455,7 @@ byte_re_match_2_internal (struct re_pattern_buffer *bufp, { case jump_n: is_a_jump_n = true; + /* Fall through. */ case maybe_pop_jump: case pop_failure_jump: case jump: @@ -7718,6 +7721,7 @@ PREFIX(common_op_match_null_string_p) (UCHAR_T **p, UCHAR_T *end, case set_number_at: p1 += 2 * OFFSET_ADDRESS_SIZE; + return false; default: /* All other opcodes mean we cannot match the empty string. */ -- 1.8.3.1
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 2, 2016 at 2:06 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote: >> > Yeah, that is what I meant. The issue is how to report uses of such >> > SSA_NAME when there is no memory. So, either we'd need a special runtime >> > library entrypoint that would report uses after scope even when there is no >> > underlying memory, or we'd need to force it at asan pass time into memory >> > again. >> >> Well, there can't be any uses outside the scope -- there are no (memory) uses >> left if we rewrite the thing into SSA. That is, the address can no >> longer "escape". >> >> Of course there could have been invalid uses before the rewrite into SSA. >> But >> those can be diagnosed either immediately before or after re-writing into SSA >> at compile-time (may be in dead code regions of course). > > Sure, we can warn on those at compile time, but we really should arrange to > error on those at runtime if they are ever executed, the UB happens only at > runtime, so in dead code isn't fatal. Then we can replace those uses with a call into the asan runtime diagnosing the issue instead? Richard. > Jakub
[PATCH] Allow non-NULL offset for store-merging bases
The following teaches store-merging to handle non-NULL offset if the base is already addressable (otherwise introducing new pointers to a non-addressable base invalidates points-to information, see a comment in the patch how we could avoid this in theory). Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2016-11-02 Richard Biener * gimple-ssa-store-merging.c: Include gimplify-me.h. (imm_store_chain_info::output_merged_stores): Force base_addr to be proper GIMPLE for a MEM_REF address. (pass_store_merging::execute): Restrict negative bitpos handling to non-MEM_REF bases. Remove TREE_THIS_VOLATILE check. Take into account non-NULL_TREE offset if the base is already addressable. * gcc.dg/store_merging_8.c: New testcase. Index: gcc/gimple-ssa-store-merging.c === --- gcc/gimple-ssa-store-merging.c (revision 241789) +++ gcc/gimple-ssa-store-merging.c (working copy) @@ -125,6 +125,7 @@ #include "tree-cfg.h" #include "tree-eh.h" #include "target.h" +#include "gimplify-me.h" /* The maximum size (in bits) of the stores this pass should generate. */ #define MAX_STORE_BITSIZE (BITS_PER_WORD) @@ -1127,6 +1128,8 @@ imm_store_chain_info::output_merged_stor unsigned int i; bool fail = false; + tree addr = force_gimple_operand_1 (unshare_expr (base_addr), &seq, + is_gimple_mem_ref_addr, NULL_TREE); FOR_EACH_VEC_ELT (split_stores, i, split_store) { unsigned HOST_WIDE_INT try_size = split_store->size; @@ -1137,7 +1140,7 @@ imm_store_chain_info::output_merged_stor tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED); int_type = build_aligned_type (int_type, align); - tree dest = fold_build2 (MEM_REF, int_type, base_addr, + tree dest = fold_build2 (MEM_REF, int_type, addr, build_int_cst (offset_type, try_pos)); tree src = native_interpret_expr (int_type, @@ -1366,15 +1369,10 @@ pass_store_merging::execute (function *f &unsignedp, &reversep, &volatilep); /* As a future enhancement we could handle stores with the same base and offset. */ - bool invalid = offset || reversep || bitpos < 0 + bool invalid = reversep || ((bitsize > MAX_BITSIZE_MODE_ANY_INT) && (TREE_CODE (rhs) != INTEGER_CST)) -|| !rhs_valid_for_store_merging_p (rhs) - /* An access may not be volatile itself but base_addr may be - a volatile decl i.e. MEM[&volatile-decl]. The hashing for - tree_operand_hash won't consider such stores equal to each - other so we can't track chains on them. */ -|| TREE_THIS_VOLATILE (base_addr); +|| !rhs_valid_for_store_merging_p (rhs); /* We do not want to rewrite TARGET_MEM_REFs. */ if (TREE_CODE (base_addr) == TARGET_MEM_REF) @@ -1398,7 +1396,32 @@ pass_store_merging::execute (function *f /* get_inner_reference returns the base object, get at its address now. */ else - base_addr = build_fold_addr_expr (base_addr); + { + if (bitpos < 0) + invalid = true; + base_addr = build_fold_addr_expr (base_addr); + } + + if (! invalid + && offset != NULL_TREE) + { + /* If the access is variable offset then a base +decl has to be address-taken to be able to +emit pointer-based stores to it. +??? We might be able to get away with +re-using the original base up to the first +variable part and then wrapping that inside +a BIT_FIELD_REF. */ + tree base = get_base_address (base_addr); + if (! base + || (DECL_P (base) + && ! TREE_ADDRESSABLE (base))) + invalid = true; + else + base_addr = build2 (POINTER_PLUS_EXPR, + TREE_TYPE (base_addr), + base_addr, offset); + } struct imm_store_chain_info **chain_info = m_stores.get (base_addr); Index: gcc/testsuite/gcc.dg/store_merging_8.c === --- gcc/testsuite/gcc.dg/store_merging_8.c (revision 0) +++ gcc/testsuite/gcc.dg/store_merging_8.c (working copy) @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target non_stric
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 02, 2016 at 01:59:00PM +0100, Richard Biener wrote: > > Yeah, that is what I meant. The issue is how to report uses of such > > SSA_NAME when there is no memory. So, either we'd need a special runtime > > library entrypoint that would report uses after scope even when there is no > > underlying memory, or we'd need to force it at asan pass time into memory > > again. > > Well, there can't be any uses outside the scope -- there are no (memory) uses > left if we rewrite the thing into SSA. That is, the address can no > longer "escape". > > Of course there could have been invalid uses before the rewrite into SSA. But > those can be diagnosed either immediately before or after re-writing into SSA > at compile-time (may be in dead code regions of course). Sure, we can warn on those at compile time, but we really should arrange to error on those at runtime if they are ever executed, the UB happens only at runtime, so in dead code isn't fatal. Jakub
Re: [ping * 4] PR35503 - warn for restrict
Then I'll approve the whole patch. On Wed, Nov 2, 2016 at 8:42 AM, Joseph Myers wrote: > The format-checking parts of the patch are OK. > > -- > Joseph S. Myers > jos...@codesourcery.com
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 2, 2016 at 1:56 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 01:36:31PM +0100, Richard Biener wrote: >> > Unless I'm missing something we either need to perform further analysis >> > during the addressable subpass - this variable could be made >> > non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA >> > form there be any SSA uses of the poisoning ASAN_MARK? If yes, keep it >> > addressable, otherwise rewrite into SSA. >> > Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into >> > some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any >> > uses of those, rewrite it back into addressable immediately or later or >> > something. >> >> Or just give up optimizing (asan has a penalty anyway)? Or > > Well, asan has a penalty and -fsanitize-use-after-scope even bigger penalty, > but the point is to make that penalty bearable. > >> re-structure ASAN_POISON () >> similar to clobbers: >> >> var = ASAN_POISION (); >> >> that avoids the address-taking and thus should be less heavy-weight on >> optimizations. > > Yeah, that is what I meant. The issue is how to report uses of such > SSA_NAME when there is no memory. So, either we'd need a special runtime > library entrypoint that would report uses after scope even when there is no > underlying memory, or we'd need to force it at asan pass time into memory > again. Well, there can't be any uses outside the scope -- there are no (memory) uses left if we rewrite the thing into SSA. That is, the address can no longer "escape". Of course there could have been invalid uses before the rewrite into SSA. But those can be diagnosed either immediately before or after re-writing into SSA at compile-time (may be in dead code regions of course). Richard. > Jakub
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 02, 2016 at 01:36:31PM +0100, Richard Biener wrote: > > Unless I'm missing something we either need to perform further analysis > > during the addressable subpass - this variable could be made > > non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA > > form there be any SSA uses of the poisoning ASAN_MARK? If yes, keep it > > addressable, otherwise rewrite into SSA. > > Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into > > some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any > > uses of those, rewrite it back into addressable immediately or later or > > something. > > Or just give up optimizing (asan has a penalty anyway)? Or Well, asan has a penalty and -fsanitize-use-after-scope even bigger penalty, but the point is to make that penalty bearable. > re-structure ASAN_POISON () > similar to clobbers: > > var = ASAN_POISION (); > > that avoids the address-taking and thus should be less heavy-weight on > optimizations. Yeah, that is what I meant. The issue is how to report uses of such SSA_NAME when there is no memory. So, either we'd need a special runtime library entrypoint that would report uses after scope even when there is no underlying memory, or we'd need to force it at asan pass time into memory again. Jakub
Re: [ping * 4] PR35503 - warn for restrict
The format-checking parts of the patch are OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [RFC][PATCH] Remove a bad use of SLOW_UNALIGNED_ACCESS
On Tue, Nov 1, 2016 at 10:39 PM, Wilco Dijkstra wrote: > Jeff Law wrote: > >> I think you'll need to look at bz61320 before this could go in. > > I had a look, but there is nothing there that is related - eventually > a latent alignment bug was fixed in IVOpt. Note that the bswap phase > currently inserts unaligned accesses irrespectively of STRICT_ALIGNMENT > or SLOW_UNALIGNED_ACCESS: > > - if (bswap > - && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) > - && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) > - return false; > > If bswap is false no byte swap is needed, so we found a native endian load > and it will always perform the optimization by inserting an unaligned load. Yes, the general agreement is that the expander can do best and thus we should canonicalize accesses to larger ones even for SLOW_UNALIGNED_ACCESS. The expander will generate the canonical best code (hopefully...). > This apparently works on all targets, and doesn't cause alignment traps or > huge slowdowns via trap emulation claimed by SLOW_UNALIGNED_ACCESS. > So I'm at a loss what these macros are supposed to mean and how I can query > whether a backend supports fast unaligned access for a particular mode. > > What I actually want to write is something like: > > if (!FAST_UNALIGNED_LOAD (mode, align)) return false; > > And know that it only accepts unaligned accesses that are efficient on the > target. > Maybe we need a new hook like this and get rid of the old one? No, we don't need to other hook. Note there is another similar user in gimple-fold.c when folding small memcpy/memmove to single load/store pairs (patch posted but not applied by me -- I've asked for strict-align target maintainer feedback but got none). Now - for bswap I'm only 99% sure that unaligned load + bswap is better than piecewise loads plus manual swap. But generally I'm always in favor of removing SLOW_UNALIGNED_ACCESS / STRICT_ALIGNMENT checks from the GIMPLE side of the compiler. Richard. > Wilco >
Re: [RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant
On 10/27/2016 03:14 AM, Aaron Sawdey wrote: I'm currently working on a builtin expansion of strncmp for powerpc similar to the one for memcmp I checked recently. One thing I encountered is that the code in expand_strn_compare will not attempt to expand the cmpstrnsi pattern at all if neither string parameter is a constant string. This doesn't make a lot of sense in light of the fact that expand_str_compare starts with an attempt to expand cmpstrsi and then if that does not work, looks at the string args to see if one is constant so it can use cmpstrnsi with the known length. The attached patch is my attempt to add this fallback path to expand_strn_compare, i.e. if neither length is known, just attempt expansion of cmpstrnsi using the given 3 arguments. It looks like in addition to rs6000, there are 3 other targets that have cmpstrnsi patterns: i386, sh, and rx. Is this a reasonable approach to take with this? If so I'll bootstrap/regtest on i386 as rs6000 does not as yet have an expansion for cmpstrsi or cmpstrnsi. Just to be sure, this is superseded by your later patch series, correct? (After I saw this one I had been trying to remember what exactly the i386 issue was but it looks like you found it yourself.) Bernd
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On Wed, Nov 2, 2016 at 10:52 AM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 10:40:35AM +0100, Richard Biener wrote: >> > I wonder if the sanitize_asan_mark wouldn't better be some PROP_* property >> > set during the asan pass and kept on until end of compilation of that >> > function. That means even if a var only addressable because of ASAN_MARK >> > is >> > discovered after this pass we'd still be able to rewrite it into SSA. >> >> Note that being TREE_ADDRESSABLE also has effects on alias analysis >> (didn't follow the patches to see whether you handle ASAN_MARK specially >> in points-to analysis and/or alias analysis). >> >> Generally in update-address-taken you can handle ASAN_MARK similar to >> MEM_REF (and drop it in the rewrite phase?). > > That is the intent, but we can't do that before the asan pass, because > otherwise as Martin explained we don't diagnose at runtime bugs where > a variable is used outside of its scope only through a MEM_REF. > So we need to wait for asan pass to actually add a real builtin call that > takes the address in that case. Except now I really don't see how that > can work for the case where the var is used only properly when it is in the > scope, because the asan pass will still see those being addressable. > > Unless I'm missing something we either need to perform further analysis > during the addressable subpass - this variable could be made > non-addressable, but is used in ASAN_MARK, would if we rewrote it into SSA > form there be any SSA uses of the poisoning ASAN_MARK? If yes, keep it > addressable, otherwise rewrite into SSA. > Or, just rewrite it into SSA always, but turn the poisoning ASAN_MARK into > some new internal ECF_CONST call var_5 = ASAN_POISON (); and if we have any > uses of those, rewrite it back into addressable immediately or later or > something. Or just give up optimizing (asan has a penalty anyway)? Or re-structure ASAN_POISON () similar to clobbers: var = ASAN_POISION (); that avoids the address-taking and thus should be less heavy-weight on optimizations. Richard. > > Jakub
Re: [PATCH, vec-tails] Support loop epilogue vectorization
On Tue, 1 Nov 2016, Yuri Rumyantsev wrote: > Hi All, > > I re-send all patches sent by Ilya earlier for review which support > vectorization of loop epilogues and loops with low trip count. We > assume that the only patch - vec-tails-07-combine-tail.patch - was not > approved by Jeff. > > I did re-base of all patches and performed bootstrapping and > regression testing that did not show any new failures. Also all > changes related to new vect_do_peeling algorithm have been changed > accordingly. > > Is it OK for trunk? I would have prefered that the series up to -03-nomask-tails would _only_ contain epilogue loop vectorization changes but unfortunately the patchset is oddly separated. I have a comment on that part nevertheless: @@ -1608,7 +1614,10 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) /* Check if we can possibly peel the loop. */ if (!vect_can_advance_ivs_p (loop_vinfo) || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)) - || loop->inner) + || loop->inner + /* Required peeling was performed in prologue and +is not required for epilogue. */ + || LOOP_VINFO_EPILOGUE_P (loop_vinfo)) do_peeling = false; if (do_peeling @@ -1888,7 +1897,10 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) do_versioning = optimize_loop_nest_for_speed_p (loop) - && (!loop->inner); /* FORNOW */ + && (!loop->inner) /* FORNOW */ +/* Required versioning was performed for the + original loop and is not required for epilogue. */ + && !LOOP_VINFO_EPILOGUE_P (loop_vinfo); if (do_versioning) { please do that check in the single caller of this function. Otherwise I still dislike the new ->aux use and I believe that simply passing down info from the processed parent would be _much_ cleaner. That is, here (and avoid the FOR_EACH_LOOP change): @@ -580,12 +586,21 @@ vectorize_loops (void) && dump_enabled_p ()) dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location, "loop vectorized\n"); - vect_transform_loop (loop_vinfo); + new_loop = vect_transform_loop (loop_vinfo); num_vectorized_loops++; /* Now that the loop has been vectorized, allow it to be unrolled etc. */ loop->force_vectorize = false; + /* Add new loop to a processing queue. To make it easier + to match loop and its epilogue vectorization in dumps + put new loop as the next loop to process. */ + if (new_loop) + { + loops.safe_insert (i + 1, new_loop->num); + vect_loops_num = number_of_loops (cfun); + } simply dispatch to a vectorize_epilogue (loop_vinfo, new_loop) function which will set up stuff properly (and also perform the if-conversion of the epilogue there). That said, if we can get in non-masked epilogue vectorization separately that would be great. I'm still torn about all the rest of the stuff and question its usability (esp. merging the epilogue with the main vector loop). But it has already been approved ... oh well. Thanks, Richard.
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
On Nov 2, 2016, at 4:28 AM, Jakub Jelinek wrote: > > On Wed, Nov 02, 2016 at 10:19:26AM +0100, Richard Biener wrote: >> On Tue, Nov 1, 2016 at 1:09 AM, Jakub Jelinek wrote: >>> On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote: The PowerPC back end loses performance on vector intrinsics, because currently all of them are treated as calls throughout the middle-end phases and only expanded when they reach RTL. Our version of altivec.h currently defines the public names of overloaded functions (like vec_add) to be #defines for hidden functions (like __builtin_vec_add), which are recognized in the parser as requiring special back-end support. Tables in rs6000-c.c handle dispatch of the overloaded functions to specific function calls appropriate to the argument types. >>> >>> This doesn't look very nice. If all you care is that the builtins like >>> __builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold >>> into generic vector operations under certain conditions, just fold those >>> into whatever you want in targetm.gimple_fold_builtin (gsi). >> >> Note that traditionally "overloading" with GCC "builtins" is done by >> using varargs >> and the "type generic" attribute. That doesn't scale to return type >> overloading >> though for which we usually added direct support to the parser (for example >> for __builtin_shuffle). > > My understanding is that rs6000 already does that, it hooks into > resolve_overloaded_builtin which already handles the fully type generic > builtins where not just the arguments, but also the return type can be > picked up. But it resolves the overloaded builtins into calls to other > builtins that are not type-generic. > > So, either that function instead of returning the specific md builtin calls > in some cases already returns trees with the generic behavior of the > builtin, or it returns what it does now and then in the gimple fold builtin > target hook (note, the normal fold builtin target hook is not right for > that, because it is mostly used for folding builtins into constant - callers > will usually throw away other results) fold those specific md builtins > into whatever GIMPLE you want. If we want to decrease amount of folding in > the FEs, the gimple fold builtin hook is probably better. > > Jakub Thanks, all. Using the gimple_fold_builtin target hook works very well and is exactly what I'm looking for. I've reworked the patch to the much simpler https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00104.html. Much obliged for the help! Bill