This patch moves handling of Wnormalized= to c.opt. There were two
quirks when doing this:

1) I cannot use the cpplib.h type 'enum cpp_normalize_level' as Type()
because this will require including cpplib.h into options.h, which in
turn causes a lot of problems, thus I needed to use Type(int).
Similarly, I cannot use this type in the CPP option warn_normalize,
because C++ does not allow conversions from int to this enum.

2) The code in c_common_handle_option seems to say that -Wnormalized=
is equivalent to -Wnormalized=nfkc. However, -Wnormalized= was already
rejected as not valid. Moreover, it emits a note for
-Werror=normalized= saying that it is equivalent to -Wnormalized=nfc,
however, this note is never actually emitted since the code never
reaches that condition, so I chose to not even try to replicate the
note or allow -Wnormalized=.

Surprisingly, -Werror=normalized= was already equivalent to what
-Werror=normalized would do, but -Werror=normalized was rejected
because -Wnormalized did not exist. Thus, I added -Wnormalized to
handle this corner case.

In summary, after the patch the only behavior changes are that
-Werror=normalized, -Wnormalized and -Wno-normalized work.

Bootstrapped and regression tested on x86_64-linux-gnu

OK?

gcc/ChangeLog:

2014-09-05  Manuel López-Ibáñez  <m...@gcc.gnu.org>

	* doc/invoke.texi (Wnormalized=): Update.

libcpp/ChangeLog:

2014-09-05  Manuel López-Ibáñez  <m...@gcc.gnu.org>

	* include/cpplib.h (struct cpp_options): Declare warn_normalize as
	int instead of enum.

gcc/c-family/ChangeLog:

2014-09-05  Manuel López-Ibáñez  <m...@gcc.gnu.org>

	* c.opt (Wnormalized): New.
	(Wnormalized=): Use Enum and Reject Negative.
	* c-opts.c (c_common_handle_option): Do not handle Wnormalized here.

gcc/testsuite/ChangeLog:

2014-09-05  Manuel López-Ibáñez  <m...@gcc.gnu.org>

	* gcc.dg/cpp/warn-normalized-3.c: Delete useless dg-prune-output.
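
For readers who want to see the two quirks in isolation, here is a small standalone sketch. It is not part of the patch. The enumerator names are the ones used in the patch, but their order and values here are assumed, and `options_sketch` and `parse_normalize_level` are hypothetical stand-ins for struct cpp_options and for the Enum/EnumValue records in c.opt. Matching with strcmp (case-sensitive) is also an assumption of the sketch, not something the patch states.

```cpp
#include <cstring>
#include <type_traits>

// Enumerator names come from the patch; the order and values are
// assumed for illustration and may not match cpplib.h.
enum cpp_normalize_level {
  normalized_KC,
  normalized_C,
  normalized_identifier_C,
  normalized_none
};

// Quirk 1: C++ converts enum -> int implicitly, but not int -> enum.
// An option variable of Type(int) therefore cannot be assigned to a
// field of the enum type without a cast.
static_assert(std::is_convertible<cpp_normalize_level, int>::value,
              "enum to int is implicit");
static_assert(!std::is_convertible<int, cpp_normalize_level>::value,
              "int to enum needs an explicit cast");

// Hypothetical stand-in for the relevant field of struct cpp_options.
struct options_sketch {
  int warn_normalize;  // int, so an int-typed option value can be stored
};

// Hypothetical helper mirroring the four EnumValue records.
// Returns true and stores the level if ARG is one of the known strings;
// returns false otherwise (the UnknownError case), leaving *LEVEL alone.
bool parse_normalize_level(const char *arg, int *level) {
  static const struct { const char *name; int value; } table[] = {
    {"none", normalized_none},
    {"nfkc", normalized_KC},
    {"id", normalized_identifier_C},
    {"nfc", normalized_C},
  };
  for (const auto &entry : table)
    if (std::strcmp(arg, entry.name) == 0) {
      *level = entry.value;
      return true;
    }
  return false;
}
```

In this sketch an empty argument is rejected like any other unknown string, which matches quirk 2: -Wnormalized= with nothing after the '=' stays invalid, and the plain -Wnormalized / -Wno-normalized forms are handled by the Alias to nfc / none instead.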

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 214904)
+++ gcc/doc/invoke.texi (working copy)
@@ -260,11 +260,12 @@ Objective-C and Objective-C++ Dialects}.
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} -Wunsafe-loop-optimizations @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args -Wmissing-braces @gol
 -Wmissing-field-initializers -Wmissing-include-dirs @gol
--Wno-multichar -Wnonnull -Wodr -Wno-overflow -Wopenmp-simd @gol
+-Wno-multichar -Wnonnull -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
+ -Wodr -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings -Wpacked -Wpacked-bitfield-compat -Wpadded @gol
 -Wparentheses -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith -Wno-pointer-to-int-cast @gol
 -Wredundant-decls -Wno-return-local-addr @gol
 -Wreturn-type -Wsequence-point -Wshadow -Wno-shadow-ivar @gol
@@ -4918,12 +4919,14 @@ warnings without this one, use @option{-
 @opindex Wmultichar
 Do not warn if a multicharacter constant (@samp{'FOOF'}) is used.
 Usually they indicate a typo in the user's code, as they have
 implementation-defined values, and should not be used in portable code.
 
-@item -Wnormalized=<none|id|nfc|nfkc>
+@item -Wnormalized@r{[}=@r{<}none@r{|}id@r{|}nfc@r{|}nfkc@r{>]}
 @opindex Wnormalized=
+@opindex Wnormalized
+@opindex Wno-normalized
 @cindex NFC
 @cindex NFKC
 @cindex character set, input normalization
 In ISO C and ISO C++, two identifiers are different if they are
 different sequences of characters. However, sometimes when characters
@@ -4935,24 +4938,26 @@ the same sequence. GCC can warn you if
 have not been normalized; this option controls that warning.
 
 There are four levels of warning supported by GCC@. The default is
 @option{-Wnormalized=nfc}, which warns about any identifier that is
 not in the ISO 10646 ``C'' normalized form, @dfn{NFC}. NFC is the
-recommended form for most uses.
+recommended form for most uses. It is equivalent to
+@option{-Wnormalized}.
 
 Unfortunately, there are some characters allowed in identifiers by
 ISO C and ISO C++ that, when turned into NFC, are not allowed in
 identifiers. That is, there's no way to use these symbols in portable
 ISO C or C++ and have all your identifiers in NFC@.
 @option{-Wnormalized=id} suppresses the warning for these characters.
 It is hoped that future versions of the standards involved will correct
 this, which is why this option is not the default.
 
 You can switch the warning off for all characters by writing
-@option{-Wnormalized=none}. You should only do this if you
-are using some other normalization scheme (like ``D''), because
-otherwise you can easily create bugs that are literally impossible to see.
+@option{-Wnormalized=none} or @option{-Wno-normalized}. You should
+only do this if you are using some other normalization scheme (like
+``D''), because otherwise you can easily create bugs that are
+literally impossible to see.
 
 Some characters in ISO 10646 have distinct meanings but look identical
 in some fonts or display methodologies, especially once formatting has
 been applied. For instance @code{\u207F}, ``SUPERSCRIPT LATIN SMALL
 LETTER N'', displays just like a regular @code{n} that has been
Index: gcc/c-family/c.opt
===================================================================
--- gcc/c-family/c.opt (revision 214904)
+++ gcc/c-family/c.opt (working copy)
@@ -631,13 +639,36 @@ Warn about NULL being passed to argument
 
 Wnonnull
 C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
 ;
 
+Wnormalized
+C ObjC C++ ObjC++ Alias(Wnormalized=,nfc,none)
+;
+
 Wnormalized=
-C ObjC C++ ObjC++ Joined Warning
--Wnormalized=<id|nfc|nfkc>	Warn about non-normalised Unicode strings
+C ObjC C++ ObjC++ RejectNegative Joined Warning CPP(warn_normalize) Init(normalized_C) Var(cpp_warn_normalize) Enum(cpp_normalize_level)
+-Wnormalized=<none|id|nfc|nfkc>	Warn about non-normalised Unicode strings
+
+; Required for these enum values.
+SourceInclude
+cpplib.h
+
+Enum
+Name(cpp_normalize_level) Type(int) UnknownError(argument %qs to %<-Wnormalized%> not recognized)
+
+EnumValue
+Enum(cpp_normalize_level) String(none) Value(normalized_none)
+
+EnumValue
+Enum(cpp_normalize_level) String(nfkc) Value(normalized_KC)
+
+EnumValue
+Enum(cpp_normalize_level) String(id) Value(normalized_identifier_C)
+
+EnumValue
+Enum(cpp_normalize_level) String(nfc) Value(normalized_C)
 
 Wold-style-cast
 C++ ObjC++ Var(warn_old_style_cast) Warning
 Warn if a C-style cast is used in a program
 
Index: gcc/c-family/c-opts.c
===================================================================
--- gcc/c-family/c-opts.c (revision 214904)
+++ gcc/c-family/c-opts.c (working copy)
@@ -382,33 +382,10 @@ c_common_handle_option (size_t scode, co
       /* ??? Don't add new options here. Use LangEnabledBy in c.opt. */
       cpp_opts->warn_num_sign_change = value;
       break;
 
-    case OPT_Wnormalized_:
-      /* FIXME: Move all this to c.opt. */
-      if (kind == DK_ERROR)
-        {
-          gcc_assert (!arg);
-          inform (input_location, "-Werror=normalized=: set -Wnormalized=nfc");
-          cpp_opts->warn_normalize = normalized_C;
-        }
-      else
-        {
-          if (!value || (arg && strcasecmp (arg, "none") == 0))
-            cpp_opts->warn_normalize = normalized_none;
-          else if (!arg || strcasecmp (arg, "nfkc") == 0)
-            cpp_opts->warn_normalize = normalized_KC;
-          else if (strcasecmp (arg, "id") == 0)
-            cpp_opts->warn_normalize = normalized_identifier_C;
-          else if (strcasecmp (arg, "nfc") == 0)
-            cpp_opts->warn_normalize = normalized_C;
-          else
-            error ("argument %qs to %<-Wnormalized%> not recognized", arg);
-          break;
-        }
-
     case OPT_Wunknown_pragmas:
       /* Set to greater than 1, so that even unknown pragmas in
          system headers will be warned about. */
       /* ??? There is no way to handle this automatically for now. */
       warn_unknown_pragmas = value * 2;
Index: gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c
===================================================================
--- gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c (revision 214904)
+++ gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c (working copy)
@@ -1,5 +1,4 @@
 // { dg-do preprocess }
 // { dg-options "-std=gnu99 -fdiagnostics-show-option -fextended-identifiers -Werror=normalized=" }
 /* { dg-message "some warnings being treated as errors" "" {target "*-*-*"} 0 } */
- // { dg-prune-output ".*-Werror=normalized=: set -Wnormalized=nfc.*" }
 \u0F43 // { dg-error "`.U00000f43' is not in NFC .-Werror=normalized=." }
Index: libcpp/include/cpplib.h
===================================================================
--- libcpp/include/cpplib.h (revision 214904)
+++ libcpp/include/cpplib.h (working copy)
@@ -455,12 +455,12 @@ struct cpp_options
   /* Holds the name of the input character set. */
   const char *input_charset;
 
   /* The minimum permitted level of normalization before a warning
-     is generated. */
-  enum cpp_normalize_level warn_normalize;
+     is generated. See enum cpp_normalize_level. */
+  int warn_normalize;
 
   /* True to warn about precompiled header files we couldn't use. */
   bool warn_invalid_pch;
 
   /* True if dependencies should be restored from a precompiled header. */