aaron.ballman added a comment.

It looks like precommit CI caught some issues:

  Failed Tests (2):
    Clangd :: utf8.test
    Clangd Unit Tests :: 
./ClangdTests.exe/CompletionStringTest/GetDeclCommentBadUTF8



================
Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:116-117
   "source file is not valid UTF-8">;
+def warn_invalid_utf8_in_comment : Warning<
+  "invalid UTF-8 in comment">, InGroup<DiagGroup<"invalid-utf8">>, 
DefaultIgnore;
 def err_character_not_allowed : Error<
----------------
tahonermann wrote:
> aaron.ballman wrote:
> > What would it take to enable the diagnostic by default? We don't typically 
> > add off-by-default diagnostics because there's evidence that not many folks 
> > enable them.
> > 
> > Alternatively, would this make sense as a pedantic warning? It's not really 
> > an extension for us to accept this stuff, but it seems like we can still 
> > pedantically diagnose the code?
> I agree with enabling this by default in at least some warning mode. Aaron's 
> suggestion of making it a pedantic warning seems reasonable to me.
Great, now that it's marked `Extension` it will show up when someone enables 
`-pedantic` (but is otherwise off by default).


================
Comment at: clang/lib/Lex/Lexer.cpp:2405-2406
     // Skip over characters in the fast loop.
-    while (C != 0 &&                // Potentially EOF.
-           C != '\n' && C != '\r')  // Newline or DOS-style newline.
+    // Warn on invalid UTF-8 if the corresponding warning is enabled, emitting 
a
+    // diagnostic only once per sequence that cannot be decoded.
+    while ((!WarnOnInvalidUtf8 || isASCII(C)) && C != 0 && // Potentially EOF.
----------------
tahonermann wrote:
> I think it would be helpful to include a link to [[ 
> http://unicode.org/review/pr-121.html | Unicode PR issue 121 ]] here and note 
> that the diagnostic follows policy option 1. Likewise for handling of '//' 
> comment styles below. Alternatively, provide the link and a note in the 
> release notes.
+1 -- I hadn't realized there were other options. However, ultimately I think 
this should be committed when WG21 has adopted the paper and then it would be 
even better to have a comment citing which stable name and paragraph number the 
code is implementing from P2295.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128059/new/

https://reviews.llvm.org/D128059

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to