Hi Dave, I made some more progress on moving tests from gcc.dg/analyzer/ to c-c++-common/analyzer. I'll only detail the most noteworthy tests I encountered.
gcc.dg/analyzer/pr103892.c troubled me with an Excess error when compiled with g++ analysis bailed out early (91 'after-snode' enodes; 314 enodes) [-Wanalyzer-too-complex] pr103892.c is compiled with optimization level -O2. Analysis bails out when the number of "after-SN" enodes explodes, i.e. exceeds a certain proportion of the total number of SG nodes. limit(after-SN) = m_sg.num_nodes () * param-bb-explosion-factor. The reason why C and C++ differs is simply due to what '-O2' does to each of them. Under -O0 or -O1, there is no failure whatsoever, under -O2 only C++ fails, whereas with -O3 both gcc and g++ emits -Wanalyzer-too-complex. With -O2, although GCC produces a greater total number of "after-SN" enodes than G++, their proportion barely stays under limit(after-SN) as the total number of SN is also bigger, hence no warning is emitted. All in all, I don't believe there is a significant difference here between C and C++, nor is there much we can do about this. Therefore I'll simply add a { dg-warning "analysis bailed out early" "" { target c++ } } An interesting divergence between GCC and G++ was in the handling of built-ins. Tests gcc.dg/analyzer/{pr104062.c,sprintf-2.c} are failing when compiling with G++. FAIL: c-c++-common/analyzer/pr104062.c leak of ap7 at line 13 (test for warnings, line 12) The reason is neither realloc nor sprintf are considered by G++ as a built-in, contrary to GCC. C built-ins are mapped within the analyzer to known_functions using their 'enum combined_fn' code (See kf.cc:register_known_functions), not their function name. Therefore, we find a built-in known function by checking tree.h:fndecl_built_in_p returns true and calling known_function_manager::get_normal_builtin. The former returns false for 'realloc' and 'sprintf' when using G++. To fix that, I've derived builtin_known_function from known_function. /* Subclass of known_function for builtin functions. */ class builtin_known_function : public known_function { public: virtual enum built_in_function builtin_code () const = 0; tree builtin_decl () const { gcc_assert (builtin_code () < END_BUILTINS); return builtin_info[builtin_code ()].decl; } virtual const builtin_known_function * dyn_cast_builtin_kf () const { return this; } virtual builtin_known_function * dyn_cast_builtin_kf () { return this; } }; And 'kfm.add (BUILT_IN_REALLOC, make_unique<kf_realloc> ());' becomes kfm.add ("realloc", make_unique<kf_realloc> ()); kfm.add ("__builtin_realloc", make_unique<kf_realloc> ()); Unfortunately we have to register the built-ins using their function name which is more apt to bugs, and the double call to kfm.add is quite messy. Having two instances of kf_realloc however is not that troubling, as builtin_known_function objects are lightweight. That GCC built-ins are not recognized by G++ doesn't only impede their detection, but also how we process them, and what information we have of them. In gcc.dg/analyzer/sprintf-2.c, sprintf signature follows the manual and cppreference. Yet we expect sm-malloc to warn about use of NULL when either of the argument is NULL, although there is no attribute(nonnull) specified. In fact, sm-malloc isn't using the signature of sprintf as specified in the test case, but rather the one provided by GCC in builtins.def /* See e.g. https://en.cppreference.com/w/c/io/fprintf and https://www.man7.org/linux/man-pages/man3/sprintf.3.html */ extern int sprintf(char* dst, const char* fmt, ...) __attribute__((__nothrow__)); // No attribute(nonnull) int test_null_dst (void) { return sprintf (NULL, "hello world"); /* { dg-warning "use of NULL where non-null expected" } */ } int test_null_fmt (char *dst) { return sprintf (dst, NULL); /* { dg-warning "use of NULL where non-null expected" } */ } Signature in builtins.def: DEF_LIB_BUILTIN (BUILT_IN_SPRINTF, "sprintf", BT_FN_INT_STRING_CONST_STRING_VAR, ATTR_NOTHROW_NONNULL_1_FORMAT_PRINTF_2_3) My question is then : Should G++ behave as GCC and ignore the user's signatures of built-ins, instead using the attributes specified by GCC ? At the moment I went with a "yes". If the function is a builtin, I'm operating on its builtin_known_function::builtin_decl () (see above) rather than the callee_fndecl deduced from a gimple call, therefore overriding the user's signature. Doing so led to tests sprintf-2.c and realloc-1.c to pass both in GCC and G++. Last test I'd like to discuss is analyzer/pr99193-1.c /* { dg-additional-options "-Wno-analyzer-too-complex" } */ /* Verify absence of false positive from -Wanalyzer-mismatching-deallocation on realloc(3). Based on https://github.com/libguestfs/libguestfs/blob/f19fd566f6387ce7e4d82409528c9dde374d25e0/daemon/command.c#L115 which is GPLv2 or later. */ typedef __SIZE_TYPE__ size_t; typedef __builtin_va_list va_list; #ifdef __cplusplus #if __cplusplus >= 201103L #define NULL nullptr #else #define NULL 0 #endif #else #define NULL ((void *)0) #endif extern void *malloc (size_t __size) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__malloc__)) __attribute__ ((__alloc_size__ (1))); extern void perror (const char *__s); extern void *realloc (void *__ptr, size_t __size) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__warn_unused_result__)) __attribute__ ((__alloc_size__ (2))); extern void guestfs_int_cleanup_free (void *ptr); extern int commandrvf (char **stdoutput, char **stderror, unsigned flags, char const* const *argv); #define CLEANUP_FREE __attribute__((cleanup(guestfs_int_cleanup_free))) int commandrf (char **stdoutput, char **stderror, unsigned flags, const char *name, ...) { va_list args; CLEANUP_FREE const char **argv = NULL; char *s; int i, r; /* Collect the command line arguments into an array. */ i = 2; argv = (const char **) malloc (sizeof (char *) * i); if (argv == NULL) { perror ("malloc"); return -1; } argv[0] = (char *) name; argv[1] = NULL; __builtin_va_start (args, name); while ((s = __builtin_va_arg (args, char *)) != NULL) { const char **p = (const char **) realloc (argv, sizeof (char *) * (++i)); /* { dg-bogus "'free'" } */ if (p == NULL) { perror ("realloc"); __builtin_va_end (args); // (*) return -1; } argv = p; argv[i-2] = s; argv[i-1] = NULL; } __builtin_va_end (args); r = commandrvf (stdoutput, stderror, flags, argv); return r; } // G++ emits Excess error here Even without the prior changes to the processing of built-ins, va_end and __builtin_va_end are recognized by G++. Yet, G++ fails with an Excess error "warning: missing call to 'va_end' [-Wanalyzer-va-list-leak]". /home/vultkayn/Desktop/gcc_sources_perso/gcc/gcc/testsuite/c-c++-common/analyzer/pr99193-1.c:75:1: warning: missing call to ‘va_end’ [-Wanalyzer-va-list-leak] 75 | } | ^ ‘int commandrf(char**, char**, unsigned int, const char*, ...)’: events 1-3 | | 49 | if (argv == NULL) { | | ^~ | | | | | (1) following ‘false’ branch... |...... | 53 | argv[0] = (char *) name; | | ~ | | | | | (2) ...to here |...... | 56 | __builtin_va_start (args, name); | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (3) ‘va_start’ called here (state of ‘’: ‘start’ -> ‘started’, NULL origin, meaning: {verb: ‘acquire’, noun: ‘resource’}) | ‘int commandrf(char**, char**, unsigned int, const char*, ...)’: events 4-9 | | 58 | while ((s = __builtin_va_arg (args, char *)) != NULL) { | | ^ | | | | | (4) following ‘true’ branch... | 59 | const char **p = (const char **) realloc (argv, sizeof (char *) * (++i)); /* { dg-bogus "'free'" } */ | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | | | (5) ...to here | | (6) when ‘realloc’ fails | 60 | if (p == NULL) { | | ~~ | | | | | (7) following ‘true’ branch (when ‘p’ is NULL)... | 61 | perror ("realloc"); | | ~~~~~~~~~~~~~~~~~~ | | | | | (8) ...to here |...... | 75 | } | | ~ | | | | | (9) missing call to ‘va_end’ to match ‘va_start’ at (3) (in global state ‘started’) | The IPA doesn't differ much between C and C++ at this sec // gcc -fdump-ipa-analyzer <bb 5> : i_44 = i_22 + 1; _8 = (long unsigned int) i_44; _9 = _8 * 8; argv.3_10 = argv; p_46 = realloc (argv.3_10, _9); if (p_46 == 0B) goto <bb 6>; [INV] else goto <bb 7>; [INV] <bb 6> : perror ("realloc"); __builtin_va_end (&args); _52 = -1; // predicted unlikely by early return (on trees) predictor. goto <bb 10>; [INV] // g++ -fdump-analyzer-ipa <bb 6> : i_48 = i_22 + 1; _8 = (long unsigned int) i_48; _9 = _8 * 8; argv.3_10 = argv; p_50 = realloc (argv.3_10, _9); if (p_50 == 0B) // if p == NULL goto <bb 7>; [INV] else goto <bb 9>; [INV] <bb 7> : perror ("realloc"); <bb 8> : __builtin_va_end (&args); _56 = -1; // predicted unlikely by early return (on trees) predictor. goto <bb 13>; [INV] If I comment out the call to perror("realloc"), then G++ behaves as GCC. If I replace perror ("realloc") by any other call, to a fully-defined function or whatever, then the warning reappears. The above SSA differ by an extra basic block in G++ that splits GCC's <bb 6> in two. However, that doesn't account for much, as changing if (p == NULL) { perror ("realloc"); __builtin_va_end (args); // (*) return -1; } to if (p == NULL) { int x; perror("realloc"); if (x > 7) x = 12; __builtin_va_end (args); return -1; } generates a similar basic block around __builtin_va_end, yet G++ keeps emitting the false positive whereas GCC doesn't. // GCC modified SSA <bb 6> : perror ("realloc"); if (x_51(D) > 7) goto <bb 7>; [INV] else goto <bb 8>; [INV] <bb 7> : x_52 = 12; <bb 8> : __builtin_va_end (&args); _54 = -1; // predicted unlikely by early return (on trees) predictor. goto <bb 12>; [INV] // G++ modified SSA <bb 7> : perror ("realloc"); <bb 8> : if (x_55(D) > 7) goto <bb 9>; [INV] else goto <bb 10>; [INV] <bb 9> : x_56 = 12; <bb 10> : __builtin_va_end (&args); _58 = -1; // predicted unlikely by early return (on trees) predictor. goto <bb 15>; [INV] Debugging and glaring at the exploded graph gave me no clue towards fixing this test. If you have any hint, I welcome it. I hope the above isn't too indigest. Cheers, Benjamin.