aaron.ballman added a comment. It looks like precommit CI caught some issues:
Failed Tests (2): Clangd :: utf8.test Clangd Unit Tests :: ./ClangdTests.exe/CompletionStringTest/GetDeclCommentBadUTF8 ================ Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:116-117 "source file is not valid UTF-8">; +def warn_invalid_utf8_in_comment : Warning< + "invalid UTF-8 in comment">, InGroup<DiagGroup<"invalid-utf8">>, DefaultIgnore; def err_character_not_allowed : Error< ---------------- tahonermann wrote: > aaron.ballman wrote: > > What would it take to enable the diagnostic by default? We don't typically > > add off-by-default diagnostics because there's evidence that not many folks > > enable them. > > > > Alternatively, would this make sense as a pedantic warning? It's not really > > an extension for us to accept this stuff, but it seems like we can still > > pedantically diagnose the code? > I agree with enabling this by default in at least some warning mode. Aaron's > suggestion of making it a pedantic warning seems reasonable to me. Great, now that it's marked `Extension` it will show up when someone enables `-pedantic` (but is otherwise off by default). ================ Comment at: clang/lib/Lex/Lexer.cpp:2405-2406 // Skip over characters in the fast loop. - while (C != 0 && // Potentially EOF. - C != '\n' && C != '\r') // Newline or DOS-style newline. + // Warn on invalid UTF-8 if the corresponding warning is enabled, emitting a + // diagnostic only once per sequence that cannot be decoded. + while ((!WarnOnInvalidUtf8 || isASCII(C)) && C != 0 && // Potentially EOF. ---------------- tahonermann wrote: > I think it would be helpful to include a link to [[ > http://unicode.org/review/pr-121.html | Unicode PR issue 121 ]] here and note > that the diagnostic follows policy option 1. Likewise for handling of '//' > comment styles below. Alternatively, provide the link and a note in the > release notes. +1 -- I hadn't realized there were other options. However, ultimately I think this should be committed when WG21 has adopted the paper and then it would be even better to have a comment citing which stable name and paragraph number the code is implementing from P2295. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D128059/new/ https://reviews.llvm.org/D128059 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits