[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-16 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

I see. Thank you. I fixed the issue using its workaround.

https://reviews.llvm.org/D125727


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-16 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added a comment.

In D124726#3516896 , @ken-matsui 
wrote:

> @nathanchance
>
> Oops, thank you for your comment!
> I would like to work on a hotfix patch for this issue.
>
> @aaron.ballman
>
> Should we entirely opt-out of this suggestion on `.S` files? I think there 
> are other approaches here, such as decreasing the max edit distance to avoid 
> suggesting this case in `.S` files, but this approach is avoiding just this 
> edge case so that it would possibly bring a new issue like this. Conversely, 
> opting out of this suggestion on the whole `.S` files will not be able to 
> catch any typoes. Considering possible edge cases (`#` directives are also 
> treated as comments), I think opting out would be the only way, 
> unfortunately. What do you think?
>
> For now, I will create a patch opting out of this suggestion on `.S` files.

I believe the `if (getLangOpts().AsmPreprocessor)` condition is appropriately 
accurate here.  I don't think changing the edit distance would be a good 
behavior for anyone, and I don't see any other appropriate heuristics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-16 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

@nathanchance

Oops, thank you for your comment!
I would like to work on a hotfix patch for this issue.

@aaron.ballman

Should we entirely opt-out of this suggestion on `.S` files? I think there are 
other approaches here, such as decreasing the max edit distance to avoid 
suggesting this case in `.S` files, but this approach is avoiding just this 
edge case so that it would possibly bring a new issue like this. Conversely, 
opting out of this suggestion on the whole `.S` files will not be able to catch 
any typoes. Considering possible edge cases (`#` directives are also treated as 
comments), I think opting out would be the only way, unfortunately. What do you 
think?

For now, I will create a patch opting out of this suggestion on `.S` files.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-16 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D124726#3516346 , @nathanchance 
wrote:

> I see an instance of this warning in the Linux kernel due to the "Now, for 
> unknown directives inside a skipped conditional block, we diagnose the 
> unknown directive as a warning if it is sufficiently similar to a directive 
> specific to preprocessor conditional blocks" part of this change:
>
>   arch/x86/crypto/aesni-intel_asm.S:532:8: warning: invalid preprocessing 
> directive, did you mean '#if'? [-Wunknown-directives]
>   # in in order to perform
> ^
>   arch/x86/crypto/aesni-intel_asm.S:547:8: warning: invalid preprocessing 
> directive, did you mean '#if'? [-Wunknown-directives]
>   # in in order to perform
> ^
>   2 warnings generated.

Oh wow, that's a really neat one!

> This is a comment within an assembler file that will be preprocessed (this is 
> a 32-bit x86 build and the block is guarded by `#ifdef __x86_64__` on line 
> 500):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/crypto/aesni-intel_asm.S?h=v5.18-rc7#n532
>
> Is there anything that can be done to improve this heuristic for this case? I 
> can try to push a patch to change the comment style for this one instance but 
> I always worry that a warning of this nature will appear in the future and 
> result in the kernel disabling this warning entirely.

Ah, and we don't get an error for those because of special logic: 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPDirectives.cpp#L1243
 and it looks like we may need similar logic before issuing the warnings as 
well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-16 Thread Nathan Chancellor via Phabricator via cfe-commits
nathanchance added subscribers: nickdesaulniers, nathanchance.
nathanchance added a comment.

I see an instance of this warning in the Linux kernel due to the "Now, for 
unknown directives inside a skipped conditional block, we diagnose the unknown 
directive as a warning if it is sufficiently similar to a directive specific to 
preprocessor conditional blocks" part of this change:

  arch/x86/crypto/aesni-intel_asm.S:532:8: warning: invalid preprocessing 
directive, did you mean '#if'? [-Wunknown-directives]
  # in in order to perform
^
  arch/x86/crypto/aesni-intel_asm.S:547:8: warning: invalid preprocessing 
directive, did you mean '#if'? [-Wunknown-directives]
  # in in order to perform
^
  2 warnings generated.

This is a comment within an assembler file that will be preprocessed (this is a 
32-bit x86 build and the block is guarded by `#ifdef __x86_64__` on line 500):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/crypto/aesni-intel_asm.S?h=v5.18-rc7#n532

Is there anything that can be done to improve this heuristic for this case? I 
can try to push a patch to change the comment style for this one instance but I 
always worry that a warning of this nature will appear in the future and result 
in the kernel disabling this warning entirely.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-13 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

Thank you for your support!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-13 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

LGTM, thank you for this! I adjusted the release note and commit message 
somewhat when landing to make sure they were detailed enough.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-13 Thread Aaron Ballman via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs 
Review".
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa247ba9d1563: Suggest typo corrections for preprocessor 
directives (authored by ken-matsui, committed by aaron.ballman).

Changed prior to commit:
  https://reviews.llvm.org/D124726?vs=429005&id=429221#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c

Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=pre-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar string found.
+SimilarStr = {C, CurDist};
+  } else if (CurDist < SimilarStr->second) {
+// More similar string found.
+SimilarStr = {C, CurDist};
+  }
+}
+  }
+
+  if (SimilarStr.hasValue()) {
+return SimilarStr-

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-13 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

@aaron.ballman

I prepared a new public email address: `v...@kmatsui.me` instead of the 
auto-generated one.
So, could you please use the new one for future commit?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-13 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

@aaron.ballman

I prepared a new public email address: `g...@kmatsui.me` instead of the 
auto-generated one.
So, could you please use the new one for future commit?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 429005.
ken-matsui added a comment.

Revert the changes for errored directives


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c

Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=pre-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar string found.
+SimilarStr = {C, CurDist};
+  } else if (CurDist < SimilarStr->second) {
+// More similar string found.
+SimilarStr = {C, CurDist};
+  }
+}
+  }
+
+  if (SimilarStr.hasValue()) {
+return SimilarStr->first;
+  } else {
+return None;
+  }
+}
+
 bool Preprocessor::CheckMacroName(Token &MacroNameTok, MacroUse isDefineUndef,
   bool *ShadowFlag) {
   // Missing macro name?
@@ -433,6 +478,25 @@
   return BytesToSkip - LengthDiff;
 }
 
+void Preprocessor::SuggestTypoedDirective(cons

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/lib/Lex/PPDirectives.cpp:1257
   // If we reached here, the preprocessing token is not valid!
-  Diag(Result, diag::err_pp_invalid_directive);
+  Diag(Result, diag::err_pp_invalid_directive) << 0;
 

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > I think we should be attempting to suggest a typo for the error case as 
> > > well e.g.,
> > > ```
> > > #fi WHATEVER
> > > #endif
> > > ```
> > > we currently give no suggestion for that typo, just the error. However, 
> > > this may require a fair amount of changes because of the various edge 
> > > cases where we give better diagnostics than "unknown directive". e.g.,
> > > ```
> > > #if WHATEVER // error: unterminated conditional directive
> > > #endfi // no diagnostic
> > > ```
> > > so if it looks like covering error cases is going to be involved, I'm 
> > > fine doing it in a follow-up if you'd prefer.
> > The former can be implemented easily, but the latter seems not easy.
> > So what about doing the latter in another patch?
> I'm fine doing the error cases entirely in another patch. The current 
> approach here is a problem because it issues a warning and an error for the 
> same line of code. So I'd go back to the way the code was before, and in a 
> follow-up you can handle errors.
> 
> I think one approach to consider for that is having an argument to 
> `SuggestTypoedDirective()` as to whether to err or warn so that we can reuse 
> the same logic.
OK, reverting :+1:

I'll work on it. Thank you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/lib/Lex/PPDirectives.cpp:1257
   // If we reached here, the preprocessing token is not valid!
-  Diag(Result, diag::err_pp_invalid_directive);
+  Diag(Result, diag::err_pp_invalid_directive) << 0;
 

ken-matsui wrote:
> aaron.ballman wrote:
> > I think we should be attempting to suggest a typo for the error case as 
> > well e.g.,
> > ```
> > #fi WHATEVER
> > #endif
> > ```
> > we currently give no suggestion for that typo, just the error. However, 
> > this may require a fair amount of changes because of the various edge cases 
> > where we give better diagnostics than "unknown directive". e.g.,
> > ```
> > #if WHATEVER // error: unterminated conditional directive
> > #endfi // no diagnostic
> > ```
> > so if it looks like covering error cases is going to be involved, I'm fine 
> > doing it in a follow-up if you'd prefer.
> The former can be implemented easily, but the latter seems not easy.
> So what about doing the latter in another patch?
I'm fine doing the error cases entirely in another patch. The current approach 
here is a problem because it issues a warning and an error for the same line of 
code. So I'd go back to the way the code was before, and in a follow-up you can 
handle errors.

I think one approach to consider for that is having an argument to 
`SuggestTypoedDirective()` as to whether to err or warn so that we can reuse 
the same logic.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428985.
ken-matsui added a comment.

Add test for errored directive but no suggestion


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c

Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,55 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=pre-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
+
+// pre-c2x-cpp2b-error@+1 {{invalid preprocessing directive, did you mean '#if'?}}
+#id UNDEFINED
+// c2x-cpp2b-error@-1 {{invalid preprocessing directive, did you mean '#if'?}}
+
+// pre-c2x-cpp2b-error@+1 {{invalid preprocessing directive}}
+#invalid
+// c2x-cpp2b-error@-1 {{invalid preprocessing directive}}
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar string found.
+SimilarStr = {C, CurDist};
+  } else if (CurDist < SimilarStr->second) {
+// More similar string found.
+SimilarStr = {C, CurDist};
+  }
+}
+  }
+
+  if (SimilarStr.hasValue()) {
+return

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428983.
ken-matsui added a comment.

Update the code as reviewed and add a release note


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c

Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,54 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=pre-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  pre-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
+
+// pre-c2x-cpp2b-error@+1 {{invalid preprocessing directive, did you mean '#if'?}}
+#id UNDEFINED
+// c2x-cpp2b-error@-1 {{invalid preprocessing directive, did you mean '#if'?}}
+// pre-c2x-cpp2b-error@+1 {{#endif without #if}}
+#endif
+// c2x-cpp2b-error@-1 {{#endif without #if}}
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar string found.
+SimilarStr = {C, CurDist};
+  } else if (CurDist < SimilarStr->second) {
+// More similar string found.
+SimilarStr = {C, CurDist};
+  }
+}
+  }
+
+  if (SimilarStr.hasValue()) {
+return SimilarStr->first;
+  } els

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

Thank you for your review :)




Comment at: clang/lib/Lex/PPDirectives.cpp:1257
   // If we reached here, the preprocessing token is not valid!
-  Diag(Result, diag::err_pp_invalid_directive);
+  Diag(Result, diag::err_pp_invalid_directive) << 0;
 

aaron.ballman wrote:
> I think we should be attempting to suggest a typo for the error case as well 
> e.g.,
> ```
> #fi WHATEVER
> #endif
> ```
> we currently give no suggestion for that typo, just the error. However, this 
> may require a fair amount of changes because of the various edge cases where 
> we give better diagnostics than "unknown directive". e.g.,
> ```
> #if WHATEVER // error: unterminated conditional directive
> #endfi // no diagnostic
> ```
> so if it looks like covering error cases is going to be involved, I'm fine 
> doing it in a follow-up if you'd prefer.
The former can be implemented easily, but the latter seems not easy.
So what about doing the latter in another patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-12 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

You should also come up with a release note for the changes.




Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:431-432
+def err_pp_invalid_directive : Error<
+  "invalid preprocessing directive%select{|, did you mean '#%1'?}0"
+>;
+def warn_pp_invalid_directive : Warning<





Comment at: clang/lib/Lex/PPDirectives.cpp:488-490
+  if (LangOpts.C2x || LangOpts.CPlusPlus2b) {
+Candidates.insert(Candidates.end(), {"elifdef", "elifndef"});
+  }





Comment at: clang/lib/Lex/PPDirectives.cpp:1257
   // If we reached here, the preprocessing token is not valid!
-  Diag(Result, diag::err_pp_invalid_directive);
+  Diag(Result, diag::err_pp_invalid_directive) << 0;
 

I think we should be attempting to suggest a typo for the error case as well 
e.g.,
```
#fi WHATEVER
#endif
```
we currently give no suggestion for that typo, just the error. However, this 
may require a fair amount of changes because of the various edge cases where we 
give better diagnostics than "unknown directive". e.g.,
```
#if WHATEVER // error: unterminated conditional directive
#endfi // no diagnostic
```
so if it looks like covering error cases is going to be involved, I'm fine 
doing it in a follow-up if you'd prefer.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:36
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifndef'?}}

ken-matsui wrote:
> Here, `#elfindef` is suggested to `#elifdef`, not `#elifndef`, but I believe 
> it will help many users to realize that they have typo'd `#elifndef`, or 
> maybe they might have meant actually `#elifdef`.
+1


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:36
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you 
mean '#elifndef'?}}

Here, `#elfindef` is suggested to `#elifdef`, not `#elifndef`, but I believe it 
will help many users to realize that they have typo'd `#elifndef`, or maybe 
they might have meant actually `#elifdef`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428827.
ken-matsui added a comment.

Remove unused includes in `StringRef.h`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c

Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=not-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar string found.
+SimilarStr = {C, CurDist};
+  } else if (CurDist < SimilarStr->second) {
+// More similar string found.
+SimilarStr = {C, CurDist};
+  }
+}
+  }
+
+  if (SimilarStr.hasValue()) {
+return SimilarStr->first;
+  } else {
+return None;
+  }
+}
+
 bool Preprocessor::CheckMacroName(Token &MacroNameTok, MacroUse isDefineUndef,
   bool *ShadowFlag) {
   // Missing macro name?
@@ -433,6 +478,27 @@
   return BytesToSkip - LengthDiff;
 }
 
+/// SuggestTypoedDirective - Provide a suggestion for a typoed directive. If
+

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

Thank you for your support!

I updated the code, so could you please review this patch again?




Comment at: llvm/lib/Support/StringRef.cpp:102
+// Find a similar string in `Candidates`.
+Optional StringRef::find_similar_str(const std::vector 
&Candidates, size_t Dist) const {
+  // We need to check if `rng` has the exact case-insensitive string because 
the Levenshtein distance match does not

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > I am less convinced that we want this as a general interface on 
> > > `StringRef`. I think callers need to decide whether something is similar 
> > > enough or not. For example, case sensitivity may not matter for this case 
> > > but it might matter to another case. Others may wish to limit the max 
> > > edit_distance for each of the candidates for performance reasons, others 
> > > may not. We already saw there were questions about whether to allow 
> > > replacements or not. That sort of thing.
> > > 
> > > I think this functionality should be moved to PPDirectives.cpp as a 
> > > static helper function that's specific to this use. Also, because this 
> > > returns one of the candidates passed in, and those are a `StringRef`, I 
> > > think this function should return a `StringRef` -- this removes 
> > > allocation costs. I'd also drop the `Dist` parameter as the function is 
> > > no longer intended to be a general purpose one.
> > I am also in favor of it, but where should I put tests for this 
> > functionality?
> > Is it possible to write unit tests for static functions implemented in 
> > `.cpp` files?
> This is something we wouldn't usually add tests for directly but instead test 
> the behavior of through lit -- the tests you already have exercise this code 
> path and would show if there's a situation where we expect a viable candidate 
> and don't find any. So I'd not worry about test coverage for it in this case.
I see. Thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428826.
ken-matsui added a comment.

Update the code as reviewed


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h

Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_strlen intrinsic for MSVC so it can be used in
 // constexpr context.
Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=not-c2x-cpp2b %s
+// RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=c2x-cpp2b %s
+// RUN: %clang_cc1 -x c++ -std=c++2b -fsyntax-only -verify=c2x-cpp2b %s
+
+// id:not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   not suggested to '#elifdef'
+// elfindef:  not suggested to '#elifdef'
+// elfinndef: not suggested to '#elifndef'
+// els:   not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  not-c2x-cpp2b-warning@+12 {{invalid preprocessing directive, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elf
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elfinndef
+#els
+#endi
+#endif
+// id:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifd:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#if'?}}
+// ifde:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#ifdef'?}}
+// elf:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elsif: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elseif:c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elif'?}}
+// elfidef:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfindef:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifdef'?}}
+// elfinndef: c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#elifndef'?}}
+// els:   c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#else'?}}
+// endi:  c2x-cpp2b-warning@-12 {{invalid preprocessing directive, did you mean '#endif'?}}
+
+#ifdef UNDEFINED
+#i // no diagnostic
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostic
+#endif
Index: clang/lib/Lex/PPDirectives.cpp
===
--- clang/lib/Lex/PPDirectives.cpp
+++ clang/lib/Lex/PPDirectives.cpp
@@ -266,6 +266,51 @@
 .Default(false);
 }
 
+/// Find a similar string in `Candidates`.
+///
+/// \param LHS a string for a similar string in `Candidates`
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \returns a similar string if exists. If no similar string exists,
+/// returns None.
+static Optional findSimilarStr(
+StringRef LHS, const std::vector &Candidates) {
+  // We need to check if `Candidates` has the exact case-insensitive string
+  // because the Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (LHS.equals_insensitive(C)) {
+  return C;
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If the LHS size is less than 3, use the LHS size minus 1 and if not,
+  // use the LHS size divided by 3.
+  size_t Length = LHS.size();
+  size_t MaxDist = Length < 3 ? Length - 1 : Length / 3;
+
+  Optional> SimilarStr = None;
+  for (StringRef C : Candidates) {
+size_t CurDist = LHS.edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  if (!SimilarStr.hasValue()) {
+// The first similar st

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: llvm/lib/Support/StringRef.cpp:102
+// Find a similar string in `Candidates`.
+Optional StringRef::find_similar_str(const std::vector 
&Candidates, size_t Dist) const {
+  // We need to check if `rng` has the exact case-insensitive string because 
the Levenshtein distance match does not

ken-matsui wrote:
> aaron.ballman wrote:
> > I am less convinced that we want this as a general interface on 
> > `StringRef`. I think callers need to decide whether something is similar 
> > enough or not. For example, case sensitivity may not matter for this case 
> > but it might matter to another case. Others may wish to limit the max 
> > edit_distance for each of the candidates for performance reasons, others 
> > may not. We already saw there were questions about whether to allow 
> > replacements or not. That sort of thing.
> > 
> > I think this functionality should be moved to PPDirectives.cpp as a static 
> > helper function that's specific to this use. Also, because this returns one 
> > of the candidates passed in, and those are a `StringRef`, I think this 
> > function should return a `StringRef` -- this removes allocation costs. I'd 
> > also drop the `Dist` parameter as the function is no longer intended to be 
> > a general purpose one.
> I am also in favor of it, but where should I put tests for this functionality?
> Is it possible to write unit tests for static functions implemented in `.cpp` 
> files?
This is something we wouldn't usually add tests for directly but instead test 
the behavior of through lit -- the tests you already have exercise this code 
path and would show if there's a situation where we expect a viable candidate 
and don't find any. So I'd not worry about test coverage for it in this case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > erichkeane wrote:
> > > > aaron.ballman wrote:
> > > > > ken-matsui wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > ken-matsui wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > ken-matsui wrote:
> > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > It's interesting that this one suggested `#elifdef` 
> > > > > > > > > > > instead of `#elifndef` -- is there anything that can be 
> > > > > > > > > > > done for that?
> > > > > > > > > > > 
> > > > > > > > > > > Also, one somewhat interesting question is whether we 
> > > > > > > > > > > want to recommend `#elifdef` and `#elifndef` outside of 
> > > > > > > > > > > C2x/C++2b mode. Those directives only exist in the latest 
> > > > > > > > > > > language standard, but Clang supports them as a 
> > > > > > > > > > > conforming extension in all language modes. Given that 
> > > > > > > > > > > this diagnostic is about typos, I think I'm okay 
> > > > > > > > > > > suggesting the directives even in older language modes. 
> > > > > > > > > > > That's as likely to be a correct suggestion as not, IMO.
> > > > > > > > > > > It's interesting that this one suggested `#elifdef` 
> > > > > > > > > > > instead of `#elifndef` -- is there anything that can be 
> > > > > > > > > > > done for that?
> > > > > > > > > > 
> > > > > > > > > > I found I have to use `std::min_element` instead of 
> > > > > > > > > > `std::max_element` because we are finding the nearest (most 
> > > > > > > > > > minimum distance) string. Meanwhile, `#elfindef` has 2 
> > > > > > > > > > distance with both `#elifdef` and `#elifndef`. After 
> > > > > > > > > > replacing `std::max_element` with `std::min_element`, I 
> > > > > > > > > > could suggest `#elifndef` from `#elfinndef`.
> > > > > > > > > > 
> > > > > > > > > > > Also, one somewhat interesting question is whether we 
> > > > > > > > > > > want to recommend `#elifdef` and `#elifndef` outside of 
> > > > > > > > > > > C2x/C++2b mode. Those directives only exist in the latest 
> > > > > > > > > > > language standard, but Clang supports them as a 
> > > > > > > > > > > conforming extension in all language modes. Given that 
> > > > > > > > > > > this diagnostic is about typos, I think I'm okay 
> > > > > > > > > > > suggesting the directives even in older language modes. 
> > > > > > > > > > > That's as likely to be a correct suggestion as not, IMO.
> > > > > > > > > > 
> > > > > > > > > > I agree with you because Clang implements those directives, 
> > > > > > > > > > and the suggested code will also be compilable. I 
> > > > > > > > > > personally think only not compilable suggestions should be 
> > > > > > > > > > avoided. (Or, we might place additional info for outside of 
> > > > > > > > > > C2x/C++2b mode like `this is a C2x/C++2b feature but 
> > > > > > > > > > compilable on Clang`?)
> > > > > > > > > > 
> > > > > > > > > > ---
> > > > > > > > > > 
> > > > > > > > > > According to the algorithm of `std::min_element`, we only 
> > > > > > > > > > get an iterator of the first element even if there is 
> > > > > > > > > > another element that has the same distance. So, `#elfindef` 
> > > > > > > > > > only suggests `#elifdef` in accordance with the order of 
> > > > > > > > > > `Candidates`, and I don't think it is beautiful to depend 
> > > > > > > > > > on the order of candidates. I would say that we can suggest 
> > > > > > > > > > all the same distance like the following, but I'm not sure 
> > > > > > > > > > this is preferable:
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > #elfindef // diag: unknown directive, did you mean #elifdef 
> > > > > > > > > > or #elifndef?
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > I agree with you because Clang implements those directives, 
> > > > > > > > > > and the suggested code will also be compilable. I 
> > > > > > > > > > personally think only not compilable suggestions should be 
> > > > > > > > > > avoided. (Or, we might place additional info for outside of 
> > > > > > > > > > C2x/C++2b mode like this is a C2x/C++2b feature but 
> > > > > > > > > > compilable on Clang?)
> > > > > > > > > 
> > > > > > > > > I may be changing my mind on this a bit. I now see we don't 
> > > > > > > > > issue an extension warning when using `#elifdef` or 
> > > > > > > > > `#elifndef` in older language modes. That means suggesting 
> > > > > > > > > those will be correct but only for Clang, and the user won't 
> > > > > > > > > have any other diagnostics to tell them about the portability 
> > > 

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-11 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

ken-matsui wrote:
> aaron.ballman wrote:
> > erichkeane wrote:
> > > aaron.ballman wrote:
> > > > ken-matsui wrote:
> > > > > aaron.ballman wrote:
> > > > > > ken-matsui wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > ken-matsui wrote:
> > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > It's interesting that this one suggested `#elifdef` instead 
> > > > > > > > > > of `#elifndef` -- is there anything that can be done for 
> > > > > > > > > > that?
> > > > > > > > > > 
> > > > > > > > > > Also, one somewhat interesting question is whether we want 
> > > > > > > > > > to recommend `#elifdef` and `#elifndef` outside of 
> > > > > > > > > > C2x/C++2b mode. Those directives only exist in the latest 
> > > > > > > > > > language standard, but Clang supports them as a conforming 
> > > > > > > > > > extension in all language modes. Given that this diagnostic 
> > > > > > > > > > is about typos, I think I'm okay suggesting the directives 
> > > > > > > > > > even in older language modes. That's as likely to be a 
> > > > > > > > > > correct suggestion as not, IMO.
> > > > > > > > > > It's interesting that this one suggested `#elifdef` instead 
> > > > > > > > > > of `#elifndef` -- is there anything that can be done for 
> > > > > > > > > > that?
> > > > > > > > > 
> > > > > > > > > I found I have to use `std::min_element` instead of 
> > > > > > > > > `std::max_element` because we are finding the nearest (most 
> > > > > > > > > minimum distance) string. Meanwhile, `#elfindef` has 2 
> > > > > > > > > distance with both `#elifdef` and `#elifndef`. After 
> > > > > > > > > replacing `std::max_element` with `std::min_element`, I could 
> > > > > > > > > suggest `#elifndef` from `#elfinndef`.
> > > > > > > > > 
> > > > > > > > > > Also, one somewhat interesting question is whether we want 
> > > > > > > > > > to recommend `#elifdef` and `#elifndef` outside of 
> > > > > > > > > > C2x/C++2b mode. Those directives only exist in the latest 
> > > > > > > > > > language standard, but Clang supports them as a conforming 
> > > > > > > > > > extension in all language modes. Given that this diagnostic 
> > > > > > > > > > is about typos, I think I'm okay suggesting the directives 
> > > > > > > > > > even in older language modes. That's as likely to be a 
> > > > > > > > > > correct suggestion as not, IMO.
> > > > > > > > > 
> > > > > > > > > I agree with you because Clang implements those directives, 
> > > > > > > > > and the suggested code will also be compilable. I personally 
> > > > > > > > > think only not compilable suggestions should be avoided. (Or, 
> > > > > > > > > we might place additional info for outside of C2x/C++2b mode 
> > > > > > > > > like `this is a C2x/C++2b feature but compilable on Clang`?)
> > > > > > > > > 
> > > > > > > > > ---
> > > > > > > > > 
> > > > > > > > > According to the algorithm of `std::min_element`, we only get 
> > > > > > > > > an iterator of the first element even if there is another 
> > > > > > > > > element that has the same distance. So, `#elfindef` only 
> > > > > > > > > suggests `#elifdef` in accordance with the order of 
> > > > > > > > > `Candidates`, and I don't think it is beautiful to depend on 
> > > > > > > > > the order of candidates. I would say that we can suggest all 
> > > > > > > > > the same distance like the following, but I'm not sure this 
> > > > > > > > > is preferable:
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > #elfindef // diag: unknown directive, did you mean #elifdef 
> > > > > > > > > or #elifndef?
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > I agree with you because Clang implements those directives, 
> > > > > > > > > and the suggested code will also be compilable. I personally 
> > > > > > > > > think only not compilable suggestions should be avoided. (Or, 
> > > > > > > > > we might place additional info for outside of C2x/C++2b mode 
> > > > > > > > > like this is a C2x/C++2b feature but compilable on Clang?)
> > > > > > > > 
> > > > > > > > I may be changing my mind on this a bit. I now see we don't 
> > > > > > > > issue an extension warning when using `#elifdef` or `#elifndef` 
> > > > > > > > in older language modes. That means suggesting those will be 
> > > > > > > > correct but only for Clang, and the user won't have any other 
> > > > > > > > diagnostics to tell them about the portability issue. And those 
> > > > > > > > particular macros are reasonably likely to be used in a header 
> > > > > > > > where the user is trying to aim for portability. So I'm 
> > > > > > > > starting to think we should on

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

aaron.ballman wrote:
> erichkeane wrote:
> > aaron.ballman wrote:
> > > ken-matsui wrote:
> > > > aaron.ballman wrote:
> > > > > ken-matsui wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > ken-matsui wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > It's interesting that this one suggested `#elifdef` instead 
> > > > > > > > > of `#elifndef` -- is there anything that can be done for that?
> > > > > > > > > 
> > > > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b 
> > > > > > > > > mode. Those directives only exist in the latest language 
> > > > > > > > > standard, but Clang supports them as a conforming extension 
> > > > > > > > > in all language modes. Given that this diagnostic is about 
> > > > > > > > > typos, I think I'm okay suggesting the directives even in 
> > > > > > > > > older language modes. That's as likely to be a correct 
> > > > > > > > > suggestion as not, IMO.
> > > > > > > > > It's interesting that this one suggested `#elifdef` instead 
> > > > > > > > > of `#elifndef` -- is there anything that can be done for that?
> > > > > > > > 
> > > > > > > > I found I have to use `std::min_element` instead of 
> > > > > > > > `std::max_element` because we are finding the nearest (most 
> > > > > > > > minimum distance) string. Meanwhile, `#elfindef` has 2 distance 
> > > > > > > > with both `#elifdef` and `#elifndef`. After replacing 
> > > > > > > > `std::max_element` with `std::min_element`, I could suggest 
> > > > > > > > `#elifndef` from `#elfinndef`.
> > > > > > > > 
> > > > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b 
> > > > > > > > > mode. Those directives only exist in the latest language 
> > > > > > > > > standard, but Clang supports them as a conforming extension 
> > > > > > > > > in all language modes. Given that this diagnostic is about 
> > > > > > > > > typos, I think I'm okay suggesting the directives even in 
> > > > > > > > > older language modes. That's as likely to be a correct 
> > > > > > > > > suggestion as not, IMO.
> > > > > > > > 
> > > > > > > > I agree with you because Clang implements those directives, and 
> > > > > > > > the suggested code will also be compilable. I personally think 
> > > > > > > > only not compilable suggestions should be avoided. (Or, we 
> > > > > > > > might place additional info for outside of C2x/C++2b mode like 
> > > > > > > > `this is a C2x/C++2b feature but compilable on Clang`?)
> > > > > > > > 
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > According to the algorithm of `std::min_element`, we only get 
> > > > > > > > an iterator of the first element even if there is another 
> > > > > > > > element that has the same distance. So, `#elfindef` only 
> > > > > > > > suggests `#elifdef` in accordance with the order of 
> > > > > > > > `Candidates`, and I don't think it is beautiful to depend on 
> > > > > > > > the order of candidates. I would say that we can suggest all 
> > > > > > > > the same distance like the following, but I'm not sure this is 
> > > > > > > > preferable:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > #elfindef // diag: unknown directive, did you mean #elifdef or 
> > > > > > > > #elifndef?
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > I agree with you because Clang implements those directives, and 
> > > > > > > > the suggested code will also be compilable. I personally think 
> > > > > > > > only not compilable suggestions should be avoided. (Or, we 
> > > > > > > > might place additional info for outside of C2x/C++2b mode like 
> > > > > > > > this is a C2x/C++2b feature but compilable on Clang?)
> > > > > > > 
> > > > > > > I may be changing my mind on this a bit. I now see we don't issue 
> > > > > > > an extension warning when using `#elifdef` or `#elifndef` in 
> > > > > > > older language modes. That means suggesting those will be correct 
> > > > > > > but only for Clang, and the user won't have any other diagnostics 
> > > > > > > to tell them about the portability issue. And those particular 
> > > > > > > macros are reasonably likely to be used in a header where the 
> > > > > > > user is trying to aim for portability. So I'm starting to think 
> > > > > > > we should only suggest those two in C2x mode (and we should 
> > > > > > > probably add a portability warning for #elifdef and #elifndef, so 
> > > > > > > I filed: https://github.com/llvm/llvm-project/issues/55306)
> > > > >

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428471.
ken-matsui added a comment.

Set `AllowReplacements` to `true`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h
  llvm/lib/Support/StringRef.cpp
  llvm/unittests/ADT/StringRefTest.cpp

Index: llvm/unittests/ADT/StringRefTest.cpp
===
--- llvm/unittests/ADT/StringRefTest.cpp
+++ llvm/unittests/ADT/StringRefTest.cpp
@@ -566,6 +566,13 @@
 }
 
 TEST(StringRefTest, EditDistance) {
+  EXPECT_EQ(0U, StringRef("").edit_distance(""));
+  EXPECT_EQ(1U, StringRef("").edit_distance("aaab"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("aacb"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("abca"));
+  EXPECT_EQ(3U, StringRef("aabc").edit_distance("adef"));
+  EXPECT_EQ(4U, StringRef("abcd").edit_distance("efgh"));
+
   StringRef Hello("hello");
   EXPECT_EQ(2U, Hello.edit_distance("hill"));
 
@@ -584,6 +591,40 @@
"people soiled our green "));
 }
 
+TEST(StringRefTest, FindSimilarStr) {
+  {
+std::vector Candidates{"aaab", "aaac"};
+EXPECT_EQ(std::string("aaab"), StringRef("").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"aab", "aac"};
+EXPECT_EQ(std::string("aab"), StringRef("aaa").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"ab", "ac"};
+EXPECT_EQ(std::string("ab"), StringRef("aa").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"b", "c"};
+EXPECT_EQ(None, StringRef("a").find_similar_str(Candidates));
+  }
+  { // macros
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elfidef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(None, StringRef("special_compiler_directive").find_similar_str(Candidates));
+  }
+  { // case-insensitive
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("ELIFDEF").find_similar_str(Candidates));
+  }
+}
+
 TEST(StringRefTest, Misc) {
   std::string Storage;
   raw_string_ostream OS(Storage);
Index: llvm/lib/Support/StringRef.cpp
===
--- llvm/lib/Support/StringRef.cpp
+++ llvm/lib/Support/StringRef.cpp
@@ -98,6 +98,43 @@
   AllowReplacements, MaxEditDistance);
 }
 
+// Find a similar string in `Candidates`.
+Optional StringRef::find_similar_str(const std::vector &Candidates, size_t Dist) const {
+  // We need to check if `rng` has the exact case-insensitive string because the Levenshtein distance match does not
+  // care about it.
+  for (StringRef C : Candidates) {
+if (equals_insensitive(C)) {
+  return C.str();
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If dist is given, use the dist for maxDist; otherwise, if the LHS size is less than 3, use the LHS size minus 1
+  // and if not, use the LHS size divided by 3.
+  size_t MaxDist = Dist != 0? Dist
+   : Length < 3 ? Length - 1
+: Length / 3;
+
+  std::vector> Cand;
+  for (StringRef C : Candidates) {
+size_t CurDist = edit_distance(C, true);
+if (CurDist <= MaxDist) {
+  Cand.emplace_back(C, CurDist);
+}
+  }
+
+  if (Cand.empty()) {
+return None;
+  } else if (Cand.size() == 1) {
+return Cand[0].first;
+  } else {
+auto SimilarStr = std::min_element(
+Cand.cbegin(), Cand.cend(),
+[](const auto &A, const auto &B) { return A.second < B.second; });
+return SimilarStr->first;
+  }
+}
+
 //===--===//
 // String Operations
 //===--===//
Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_strlen intrinsic for MSVC so it can be used in
 // constexpr context.
@@ -240,6 +242,23 @@
 uns

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:3
+
+// id:   not suggested to '#if'
+// ifd:  expected-warning@+11 {{invalid preprocessing directive, did you 
mean '#if'?}}

erichkeane wrote:
> aaron.ballman wrote:
> > This still suggests that something's wrong as I would imagine this would 
> > have an edit distance of 1. Oh, interesting... setting the replacement 
> > option to `false` may have made things better for the `elfindef` case but 
> > worse for the `id` case?
> > 
> > This is tricky because we want to identify things that are most likely 
> > simple typos but exclude things that may reasonably not be a typo but a 
> > custom preprocessor directive. Based on that, I *think* setting the 
> > replacement option to `true` gives the more conservative answer (it treats 
> > a replacement as 1 edit rather than 2). @erichkeane -- do you have thoughts?
> Not particularly.  I don't have a good hold of how much we want to suggest 
> with the 'did you mean'.  Line 9 and line 10 here are unfortunate, I would 
> hope those would happen?  Its unfortunate we don't have a way to figure out 
> these common typos.
Yeah, ideally I want this to "do what I'm thinking" and make all three 
situations work. :-D



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

erichkeane wrote:
> aaron.ballman wrote:
> > ken-matsui wrote:
> > > aaron.ballman wrote:
> > > > ken-matsui wrote:
> > > > > aaron.ballman wrote:
> > > > > > ken-matsui wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > > > > 
> > > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > > > Those directives only exist in the latest language standard, 
> > > > > > > > but Clang supports them as a conforming extension in all 
> > > > > > > > language modes. Given that this diagnostic is about typos, I 
> > > > > > > > think I'm okay suggesting the directives even in older language 
> > > > > > > > modes. That's as likely to be a correct suggestion as not, IMO.
> > > > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > > > 
> > > > > > > I found I have to use `std::min_element` instead of 
> > > > > > > `std::max_element` because we are finding the nearest (most 
> > > > > > > minimum distance) string. Meanwhile, `#elfindef` has 2 distance 
> > > > > > > with both `#elifdef` and `#elifndef`. After replacing 
> > > > > > > `std::max_element` with `std::min_element`, I could suggest 
> > > > > > > `#elifndef` from `#elfinndef`.
> > > > > > > 
> > > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > > > Those directives only exist in the latest language standard, 
> > > > > > > > but Clang supports them as a conforming extension in all 
> > > > > > > > language modes. Given that this diagnostic is about typos, I 
> > > > > > > > think I'm okay suggesting the directives even in older language 
> > > > > > > > modes. That's as likely to be a correct suggestion as not, IMO.
> > > > > > > 
> > > > > > > I agree with you because Clang implements those directives, and 
> > > > > > > the suggested code will also be compilable. I personally think 
> > > > > > > only not compilable suggestions should be avoided. (Or, we might 
> > > > > > > place additional info for outside of C2x/C++2b mode like `this is 
> > > > > > > a C2x/C++2b feature but compilable on Clang`?)
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > According to the algorithm of `std::min_element`, we only get an 
> > > > > > > iterator of the first element even if there is another element 
> > > > > > > that has the same distance. So, `#elfindef` only suggests 
> > > > > > > `#elifdef` in accordance with the order of `Candidates`, and I 
> > > > > > > don't think it is beautiful to depend on the order of candidates. 
> > > > > > > I would say that we can suggest all the same distance like the 
> > > > > > > following, but I'm not sure this is preferable:
> > > > > > > 
> > > > > > > ```
> > > > > > > #elfindef // diag: unknown directive, did you mean #elifdef or 
> > > > > > > #elifndef?
> > > > > > > ```
> > > > > > > 
> > > > > > > I agree with you because Clang implements those directives, and 
> > > > > > > the sugges

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:3
+
+// id:   not suggested to '#if'
+// ifd:  expected-warning@+11 {{invalid preprocessing directive, did you 
mean '#if'?}}

aaron.ballman wrote:
> This still suggests that something's wrong as I would imagine this would have 
> an edit distance of 1. Oh, interesting... setting the replacement option to 
> `false` may have made things better for the `elfindef` case but worse for the 
> `id` case?
> 
> This is tricky because we want to identify things that are most likely simple 
> typos but exclude things that may reasonably not be a typo but a custom 
> preprocessor directive. Based on that, I *think* setting the replacement 
> option to `true` gives the more conservative answer (it treats a replacement 
> as 1 edit rather than 2). @erichkeane -- do you have thoughts?
Not particularly.  I don't have a good hold of how much we want to suggest with 
the 'did you mean'.  Line 9 and line 10 here are unfortunate, I would hope 
those would happen?  Its unfortunate we don't have a way to figure out these 
common typos.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > ken-matsui wrote:
> > > > aaron.ballman wrote:
> > > > > ken-matsui wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > > > 
> > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > > Those directives only exist in the latest language standard, but 
> > > > > > > Clang supports them as a conforming extension in all language 
> > > > > > > modes. Given that this diagnostic is about typos, I think I'm 
> > > > > > > okay suggesting the directives even in older language modes. 
> > > > > > > That's as likely to be a correct suggestion as not, IMO.
> > > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > > 
> > > > > > I found I have to use `std::min_element` instead of 
> > > > > > `std::max_element` because we are finding the nearest (most minimum 
> > > > > > distance) string. Meanwhile, `#elfindef` has 2 distance with both 
> > > > > > `#elifdef` and `#elifndef`. After replacing `std::max_element` with 
> > > > > > `std::min_element`, I could suggest `#elifndef` from `#elfinndef`.
> > > > > > 
> > > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > > Those directives only exist in the latest language standard, but 
> > > > > > > Clang supports them as a conforming extension in all language 
> > > > > > > modes. Given that this diagnostic is about typos, I think I'm 
> > > > > > > okay suggesting the directives even in older language modes. 
> > > > > > > That's as likely to be a correct suggestion as not, IMO.
> > > > > > 
> > > > > > I agree with you because Clang implements those directives, and the 
> > > > > > suggested code will also be compilable. I personally think only not 
> > > > > > compilable suggestions should be avoided. (Or, we might place 
> > > > > > additional info for outside of C2x/C++2b mode like `this is a 
> > > > > > C2x/C++2b feature but compilable on Clang`?)
> > > > > > 
> > > > > > ---
> > > > > > 
> > > > > > According to the algorithm of `std::min_element`, we only get an 
> > > > > > iterator of the first element even if there is another element that 
> > > > > > has the same distance. So, `#elfindef` only suggests `#elifdef` in 
> > > > > > accordance with the order of `Candidates`, and I don't think it is 
> > > > > > beautiful to depend on the order of candidates. I would say that we 
> > > > > > can suggest all the same distance like the following, but I'm not 
> > > > > > sure this is preferable:
> > > > > > 
> > > > > > ```
> > > > > > #elfindef // diag: unknown directive, did you mean #elifdef or 
> > > > > > #elifndef?
> > > > > > ```
> > > > > > 
> > > > > > I agree with you because Clang implements those directives, and the 
> > > > > > suggested code will also be compilable. I personally think only not 
> > > > > > compilable suggestions should be avoided. (Or, we might place 
> > > > > > additional info for outside of C2x/C++2b mode like this is a 
> > > > > > C2x/C++2b feature but compilable on Clang?)
> > > > > 
> > > > > I 

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:3
+
+// id:   not suggested to '#if'
+// ifd:  expected-warning@+11 {{invalid preprocessing directive, did you 
mean '#if'?}}

This still suggests that something's wrong as I would imagine this would have 
an edit distance of 1. Oh, interesting... setting the replacement option to 
`false` may have made things better for the `elfindef` case but worse for the 
`id` case?

This is tricky because we want to identify things that are most likely simple 
typos but exclude things that may reasonably not be a typo but a custom 
preprocessor directive. Based on that, I *think* setting the replacement option 
to `true` gives the more conservative answer (it treats a replacement as 1 edit 
rather than 2). @erichkeane -- do you have thoughts?



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

ken-matsui wrote:
> aaron.ballman wrote:
> > ken-matsui wrote:
> > > aaron.ballman wrote:
> > > > ken-matsui wrote:
> > > > > aaron.ballman wrote:
> > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > > 
> > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > Those directives only exist in the latest language standard, but 
> > > > > > Clang supports them as a conforming extension in all language 
> > > > > > modes. Given that this diagnostic is about typos, I think I'm okay 
> > > > > > suggesting the directives even in older language modes. That's as 
> > > > > > likely to be a correct suggestion as not, IMO.
> > > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > 
> > > > > I found I have to use `std::min_element` instead of 
> > > > > `std::max_element` because we are finding the nearest (most minimum 
> > > > > distance) string. Meanwhile, `#elfindef` has 2 distance with both 
> > > > > `#elifdef` and `#elifndef`. After replacing `std::max_element` with 
> > > > > `std::min_element`, I could suggest `#elifndef` from `#elfinndef`.
> > > > > 
> > > > > > Also, one somewhat interesting question is whether we want to 
> > > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. 
> > > > > > Those directives only exist in the latest language standard, but 
> > > > > > Clang supports them as a conforming extension in all language 
> > > > > > modes. Given that this diagnostic is about typos, I think I'm okay 
> > > > > > suggesting the directives even in older language modes. That's as 
> > > > > > likely to be a correct suggestion as not, IMO.
> > > > > 
> > > > > I agree with you because Clang implements those directives, and the 
> > > > > suggested code will also be compilable. I personally think only not 
> > > > > compilable suggestions should be avoided. (Or, we might place 
> > > > > additional info for outside of C2x/C++2b mode like `this is a 
> > > > > C2x/C++2b feature but compilable on Clang`?)
> > > > > 
> > > > > ---
> > > > > 
> > > > > According to the algorithm of `std::min_element`, we only get an 
> > > > > iterator of the first element even if there is another element that 
> > > > > has the same distance. So, `#elfindef` only suggests `#elifdef` in 
> > > > > accordance with the order of `Candidates`, and I don't think it is 
> > > > > beautiful to depend on the order of candidates. I would say that we 
> > > > > can suggest all the same distance like the following, but I'm not 
> > > > > sure this is preferable:
> > > > > 
> > > > > ```
> > > > > #elfindef // diag: unknown directive, did you mean #elifdef or 
> > > > > #elifndef?
> > > > > ```
> > > > > 
> > > > > I agree with you because Clang implements those directives, and the 
> > > > > suggested code will also be compilable. I personally think only not 
> > > > > compilable suggestions should be avoided. (Or, we might place 
> > > > > additional info for outside of C2x/C++2b mode like this is a 
> > > > > C2x/C++2b feature but compilable on Clang?)
> > > > 
> > > > I may be changing my mind on this a bit. I now see we don't issue an 
> > > > extension warning when using `#elifdef` or `#elifndef` in older 
> > > > language modes. That means suggesting those will be correct but only 
> > > > for Clang, and the user won't have any other diagnostics to tell them 
> > > > about the portability issue. And those particular macros are reasonably 
> > > > likely to be used in a header where the use

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428395.
ken-matsui added a comment.

Fix the test for FindSimilarStr


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive-c2x-cpp2b.c
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h
  llvm/lib/Support/StringRef.cpp
  llvm/unittests/ADT/StringRefTest.cpp

Index: llvm/unittests/ADT/StringRefTest.cpp
===
--- llvm/unittests/ADT/StringRefTest.cpp
+++ llvm/unittests/ADT/StringRefTest.cpp
@@ -566,6 +566,13 @@
 }
 
 TEST(StringRefTest, EditDistance) {
+  EXPECT_EQ(0U, StringRef("").edit_distance(""));
+  EXPECT_EQ(1U, StringRef("").edit_distance("aaab"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("aacb"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("abca"));
+  EXPECT_EQ(3U, StringRef("aabc").edit_distance("adef"));
+  EXPECT_EQ(4U, StringRef("abcd").edit_distance("efgh"));
+
   StringRef Hello("hello");
   EXPECT_EQ(2U, Hello.edit_distance("hill"));
 
@@ -584,6 +591,28 @@
"people soiled our green "));
 }
 
+TEST(StringRefTest, FindSimilarStr) {
+  {
+std::vector Candidates{"b", "c"};
+EXPECT_EQ(None, StringRef("a").find_similar_str(Candidates));
+  }
+  { // macros
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elfidef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(None, StringRef("special_compiler_directive").find_similar_str(Candidates));
+  }
+  { // case-insensitive
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("ELIFDEF").find_similar_str(Candidates));
+  }
+}
+
 TEST(StringRefTest, Misc) {
   std::string Storage;
   raw_string_ostream OS(Storage);
Index: llvm/lib/Support/StringRef.cpp
===
--- llvm/lib/Support/StringRef.cpp
+++ llvm/lib/Support/StringRef.cpp
@@ -98,6 +98,43 @@
   AllowReplacements, MaxEditDistance);
 }
 
+// Find a similar string in `Candidates`.
+Optional StringRef::find_similar_str(const std::vector &Candidates, size_t Dist) const {
+  // We need to check if `rng` has the exact case-insensitive string because the Levenshtein distance match does not
+  // care about it.
+  for (StringRef C : Candidates) {
+if (equals_insensitive(C)) {
+  return C.str();
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If dist is given, use the dist for maxDist; otherwise, if the LHS size is less than 3, use the LHS size minus 1
+  // and if not, use the LHS size divided by 3.
+  size_t MaxDist = Dist != 0? Dist
+   : Length < 3 ? Length - 1
+: Length / 3;
+
+  std::vector> Cand;
+  for (StringRef C : Candidates) {
+size_t CurDist = edit_distance(C, false);
+if (CurDist <= MaxDist) {
+  Cand.emplace_back(C, CurDist);
+}
+  }
+
+  if (Cand.empty()) {
+return None;
+  } else if (Cand.size() == 1) {
+return Cand[0].first;
+  } else {
+auto SimilarStr = std::min_element(
+Cand.cbegin(), Cand.cend(),
+[](const auto &A, const auto &B) { return A.second < B.second; });
+return SimilarStr->first;
+  }
+}
+
 //===--===//
 // String Operations
 //===--===//
Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_strlen intrinsic for MSVC so it can be used in
 // constexpr context.
@@ -240,6 +242,23 @@
 unsigned edit_distance(StringRef Other, bool AllowReplacements = true,
unsigned MaxEditDistance = 0) const;
 
+/// Find a similar string in `Candidates`.
+///
+/// \param Candidates the candidates to find a similar string.
+///
+/// \param Dist Maximum distance to elect as similar strings. Eventually

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/lib/Lex/PPDirectives.cpp:444
+
+  if (auto Sugg = Directive.find_similar_str(Candidates)) {
+CharSourceRange DirectiveRange =

erichkeane wrote:
> I don't believe this meets our requirements for 'auto'. 
Thank you. I replaced it with the actual type.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > ken-matsui wrote:
> > > > aaron.ballman wrote:
> > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > `#elifndef` -- is there anything that can be done for that?
> > > > > 
> > > > > Also, one somewhat interesting question is whether we want to 
> > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those 
> > > > > directives only exist in the latest language standard, but Clang 
> > > > > supports them as a conforming extension in all language modes. Given 
> > > > > that this diagnostic is about typos, I think I'm okay suggesting the 
> > > > > directives even in older language modes. That's as likely to be a 
> > > > > correct suggestion as not, IMO.
> > > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > > `#elifndef` -- is there anything that can be done for that?
> > > > 
> > > > I found I have to use `std::min_element` instead of `std::max_element` 
> > > > because we are finding the nearest (most minimum distance) string. 
> > > > Meanwhile, `#elfindef` has 2 distance with both `#elifdef` and 
> > > > `#elifndef`. After replacing `std::max_element` with 
> > > > `std::min_element`, I could suggest `#elifndef` from `#elfinndef`.
> > > > 
> > > > > Also, one somewhat interesting question is whether we want to 
> > > > > recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those 
> > > > > directives only exist in the latest language standard, but Clang 
> > > > > supports them as a conforming extension in all language modes. Given 
> > > > > that this diagnostic is about typos, I think I'm okay suggesting the 
> > > > > directives even in older language modes. That's as likely to be a 
> > > > > correct suggestion as not, IMO.
> > > > 
> > > > I agree with you because Clang implements those directives, and the 
> > > > suggested code will also be compilable. I personally think only not 
> > > > compilable suggestions should be avoided. (Or, we might place 
> > > > additional info for outside of C2x/C++2b mode like `this is a C2x/C++2b 
> > > > feature but compilable on Clang`?)
> > > > 
> > > > ---
> > > > 
> > > > According to the algorithm of `std::min_element`, we only get an 
> > > > iterator of the first element even if there is another element that has 
> > > > the same distance. So, `#elfindef` only suggests `#elifdef` in 
> > > > accordance with the order of `Candidates`, and I don't think it is 
> > > > beautiful to depend on the order of candidates. I would say that we can 
> > > > suggest all the same distance like the following, but I'm not sure this 
> > > > is preferable:
> > > > 
> > > > ```
> > > > #elfindef // diag: unknown directive, did you mean #elifdef or 
> > > > #elifndef?
> > > > ```
> > > > 
> > > > I agree with you because Clang implements those directives, and the 
> > > > suggested code will also be compilable. I personally think only not 
> > > > compilable suggestions should be avoided. (Or, we might place 
> > > > additional info for outside of C2x/C++2b mode like this is a C2x/C++2b 
> > > > feature but compilable on Clang?)
> > > 
> > > I may be changing my mind on this a bit. I now see we don't issue an 
> > > extension warning when using `#elifdef` or `#elifndef` in older language 
> > > modes. That means suggesting those will be correct but only for Clang, 
> > > and the user won't have any other diagnostics to tell them about the 
> > > portability issue. And those particular macros are reasonably likely to 
> > > be used in a header where the user is trying to aim for portability. So 
> > > I'm starting to think we should only suggest those two in C2x mode (and 
> > > we should probably add a portability warning for #elifdef and #elifndef, 
> > > so I filed: https://github.com/llvm/llvm-project/issues/55306)
> > > 
> > > > I would say that we can suggest all the same distance like the 
> > > > following, but I'm not sure this is preferable:
> > > 
> > > The way we typically handle this is to attach FixIt hints to a note 
> > > instead of to the diagnostic. This way, automatic fixes aren't applied 
> > > (because there are multiple choices to pick from) but the user is still 
> > > able to apply whichever fix they want in a

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-10 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 428335.
ken-matsui added a comment.
Herald added a subscriber: hiraditya.

Updated the code as reviewed


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive-c2x-cpp2b.c
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h
  llvm/lib/Support/StringRef.cpp
  llvm/unittests/ADT/StringRefTest.cpp

Index: llvm/unittests/ADT/StringRefTest.cpp
===
--- llvm/unittests/ADT/StringRefTest.cpp
+++ llvm/unittests/ADT/StringRefTest.cpp
@@ -566,6 +566,13 @@
 }
 
 TEST(StringRefTest, EditDistance) {
+  EXPECT_EQ(0U, StringRef("").edit_distance(""));
+  EXPECT_EQ(1U, StringRef("").edit_distance("aaab"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("aacb"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("abca"));
+  EXPECT_EQ(3U, StringRef("aabc").edit_distance("adef"));
+  EXPECT_EQ(4U, StringRef("abcd").edit_distance("efgh"));
+
   StringRef Hello("hello");
   EXPECT_EQ(2U, Hello.edit_distance("hill"));
 
@@ -584,6 +591,40 @@
"people soiled our green "));
 }
 
+TEST(StringRefTest, FindSimilarStr) {
+  {
+std::vector Candidates{"aaab", "aaabc"};
+EXPECT_EQ(std::string("aaab"), StringRef("").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"aab", "abc"};
+EXPECT_EQ(std::string("aab"), StringRef("aaa").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"ab", "bc"};
+EXPECT_EQ(std::string("ab"), StringRef("aa").find_similar_str(Candidates));
+  }
+  {
+std::vector Candidates{"b", "c"};
+EXPECT_EQ(None, StringRef("a").find_similar_str(Candidates));
+  }
+  { // macros
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elfidef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(None, StringRef("special_compiler_directive").find_similar_str(Candidates));
+  }
+  { // case-insensitive
+std::vector Candidates{
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("ELIFDEF").find_similar_str(Candidates));
+  }
+}
+
 TEST(StringRefTest, Misc) {
   std::string Storage;
   raw_string_ostream OS(Storage);
Index: llvm/lib/Support/StringRef.cpp
===
--- llvm/lib/Support/StringRef.cpp
+++ llvm/lib/Support/StringRef.cpp
@@ -98,6 +98,43 @@
   AllowReplacements, MaxEditDistance);
 }
 
+// Find a similar string in `Candidates`.
+Optional StringRef::find_similar_str(const std::vector &Candidates, size_t Dist) const {
+  // We need to check if `rng` has the exact case-insensitive string because the Levenshtein distance match does not
+  // care about it.
+  for (StringRef C : Candidates) {
+if (equals_insensitive(C)) {
+  return C.str();
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If dist is given, use the dist for maxDist; otherwise, if the LHS size is less than 3, use the LHS size minus 1
+  // and if not, use the LHS size divided by 3.
+  size_t MaxDist = Dist != 0? Dist
+   : Length < 3 ? Length - 1
+: Length / 3;
+
+  std::vector> Cand;
+  for (StringRef C : Candidates) {
+size_t CurDist = edit_distance(C, false);
+if (CurDist <= MaxDist) {
+  Cand.emplace_back(C, CurDist);
+}
+  }
+
+  if (Cand.empty()) {
+return None;
+  } else if (Cand.size() == 1) {
+return Cand[0].first;
+  } else {
+auto SimilarStr = std::min_element(
+Cand.cbegin(), Cand.cend(),
+[](const auto &A, const auto &B) { return A.second < B.second; });
+return SimilarStr->first;
+  }
+}
+
 //===--===//
 // String Operations
 //===--===//
Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_s

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-09 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/lib/Lex/PPDirectives.cpp:441
+  constexpr StringRef Candidates[] = {
+  "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+  };

erichkeane wrote:
> Should this be dependent on C modes?  it would be kind of silly to suggest 
> "invalid preprocessing token elifndef, did you mean elifndef"?
Both C and C++ have the new directives and we enable them as an extension in 
all language modes, so no need to care about it being in C mode specifically. 
However, I think we do need to care about *which* C and C++ mode we're in so 
that we only suggest the new directives in C2x and C++2b mode.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-09 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added inline comments.



Comment at: clang/lib/Lex/PPDirectives.cpp:441
+  constexpr StringRef Candidates[] = {
+  "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+  };

Should this be dependent on C modes?  it would be kind of silly to suggest 
"invalid preprocessing token elifndef, did you mean elifndef"?



Comment at: clang/lib/Lex/PPDirectives.cpp:444
+
+  if (auto Sugg = Directive.find_similar_str(Candidates)) {
+CharSourceRange DirectiveRange =

I don't believe this meets our requirements for 'auto'. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-09 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a subscriber: erichkeane.
aaron.ballman added inline comments.



Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}

ken-matsui wrote:
> aaron.ballman wrote:
> > ken-matsui wrote:
> > > aaron.ballman wrote:
> > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > `#elifndef` -- is there anything that can be done for that?
> > > > 
> > > > Also, one somewhat interesting question is whether we want to recommend 
> > > > `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives 
> > > > only exist in the latest language standard, but Clang supports them as 
> > > > a conforming extension in all language modes. Given that this 
> > > > diagnostic is about typos, I think I'm okay suggesting the directives 
> > > > even in older language modes. That's as likely to be a correct 
> > > > suggestion as not, IMO.
> > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > `#elifndef` -- is there anything that can be done for that?
> > > 
> > > I found I have to use `std::min_element` instead of `std::max_element` 
> > > because we are finding the nearest (most minimum distance) string. 
> > > Meanwhile, `#elfindef` has 2 distance with both `#elifdef` and 
> > > `#elifndef`. After replacing `std::max_element` with `std::min_element`, 
> > > I could suggest `#elifndef` from `#elfinndef`.
> > > 
> > > > Also, one somewhat interesting question is whether we want to recommend 
> > > > `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives 
> > > > only exist in the latest language standard, but Clang supports them as 
> > > > a conforming extension in all language modes. Given that this 
> > > > diagnostic is about typos, I think I'm okay suggesting the directives 
> > > > even in older language modes. That's as likely to be a correct 
> > > > suggestion as not, IMO.
> > > 
> > > I agree with you because Clang implements those directives, and the 
> > > suggested code will also be compilable. I personally think only not 
> > > compilable suggestions should be avoided. (Or, we might place additional 
> > > info for outside of C2x/C++2b mode like `this is a C2x/C++2b feature but 
> > > compilable on Clang`?)
> > > 
> > > ---
> > > 
> > > According to the algorithm of `std::min_element`, we only get an iterator 
> > > of the first element even if there is another element that has the same 
> > > distance. So, `#elfindef` only suggests `#elifdef` in accordance with the 
> > > order of `Candidates`, and I don't think it is beautiful to depend on the 
> > > order of candidates. I would say that we can suggest all the same 
> > > distance like the following, but I'm not sure this is preferable:
> > > 
> > > ```
> > > #elfindef // diag: unknown directive, did you mean #elifdef or #elifndef?
> > > ```
> > > 
> > > I agree with you because Clang implements those directives, and the 
> > > suggested code will also be compilable. I personally think only not 
> > > compilable suggestions should be avoided. (Or, we might place additional 
> > > info for outside of C2x/C++2b mode like this is a C2x/C++2b feature but 
> > > compilable on Clang?)
> > 
> > I may be changing my mind on this a bit. I now see we don't issue an 
> > extension warning when using `#elifdef` or `#elifndef` in older language 
> > modes. That means suggesting those will be correct but only for Clang, and 
> > the user won't have any other diagnostics to tell them about the 
> > portability issue. And those particular macros are reasonably likely to be 
> > used in a header where the user is trying to aim for portability. So I'm 
> > starting to think we should only suggest those two in C2x mode (and we 
> > should probably add a portability warning for #elifdef and #elifndef, so I 
> > filed: https://github.com/llvm/llvm-project/issues/55306)
> > 
> > > I would say that we can suggest all the same distance like the following, 
> > > but I'm not sure this is preferable:
> > 
> > The way we typically handle this is to attach FixIt hints to a note instead 
> > of to the diagnostic. This way, automatic fixes aren't applied (because 
> > there are multiple choices to pick from) but the user is still able to 
> > apply whichever fix they want in an IDE or other tool. It might be worth 
> > trying that approach (e.g., if there's only one candidate, attach it to the 
> > warning, and if there are two or more, emit a warning without a "did you 
> > mean" in it and use a new note for the fixit. e.g.,
> > ```
> > #elfindef // expected-warning {{invalid preprocessing directive}} \
> >  expected-note {{did you mean '#elifdef'?}} \
> >  expected-note {{did you mean '

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-07 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:362-365
 
+def warn_pp_typo_directive : Warning<
+  "'#%0' directive not found, did you mean '#%1'?">,
+  InGroup;

aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > Rather than adding a warning over the top of an existing error, I think 
> > > we should modify `err_pp_invalid_directive` to have an optional "did you 
> > > mean?" when we find a plausible typo to correct.
> > > 
> > > However, we do not issue that diagnostic when it's inside of a skipped 
> > > conditional block, and that's what the crux of 
> > > https://github.com/llvm/llvm-project/issues/51598 is about. As @rsmith 
> > > pointed out in that issue (and I agree), compilers are free to support 
> > > custom directives and those will validly appear inside of a conditional 
> > > block that is skipped. We need to be careful not to diagnose those kinds 
> > > of situations as an error. However, we still want to diagnose when the 
> > > unknown directive is "sufficiently close" to another one which can 
> > > control the conditional chain. e.g.,
> > > ```
> > > #fi FOO // error unknown directive, did you mean #if?
> > > #endfi // error unknown directive, did you mean #endif?
> > > 
> > > #if FOO
> > > #esle // diag: unknown directive, did you mean #else?
> > > #elfi // diag: unknown directive, did you mean #elif?
> > > #elfidef // diag: unknown directive, did you mean #elifdef
> > > #elinfdef // diag: unknown directive, did you mean #elifndef
> > > 
> > > #frobble // No diagnostic, not close enough to a conditional directive to 
> > > warrant diagnosing
> > > #eerp // No diagnostic, not close enough to a conditional directive to 
> > > warrant diagnosing
> > > 
> > > #endif
> > > ```
> > > Today, if `FOO` is defined to a nonzero value, you'll get diagnostics for 
> > > all of those, but if `FOO` is not defined or is defined to 0, then 
> > > there's no diagnostics. I think we want to consider directives that are 
> > > *very likely* to be a typo (edit distance of 1, maybe 2?) for a 
> > > conditional directive as a special case.
> > > 
> > > Currently, we only diagnose unknown directives as an error. I think for 
> > > these special cased conditional directive diagnostics, we'll want to use 
> > > a warning rather than an error in this circumstance (just in case it 
> > > turns out to be a valid directive in a skipped conditional block). If we 
> > > do go that route and make it a warning, I think the warning group should 
> > > be `-Wunknown-directives` to mirror `-Wunknown-pragmas`, 
> > > `-Wunknown-attributes`, etc and it should be defined to have the same 
> > > text as the error case. e.g., 
> > > ```
> > > def err_pp_invalid_directive : Error<
> > >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > > >;
> > > def warn_pp_invalid_directive : Warning<
> > >   err_pp_invalid_directive.Text>, 
> > > InGroup>;
> > > ```
> > > WDYT?
> > > 
> > > (These were my thoughts before seeing the rest of the patch. After 
> > > reading the patch, it looks like we have pretty similar ideas here, which 
> > > is great, but leaving the comment anyway in case you have further 
> > > opinions.)
> > > Currently, we only diagnose unknown directives as an error. I think for 
> > > these special cased conditional directive diagnostics, we'll want to use 
> > > a warning rather than an error in this circumstance (just in case it 
> > > turns out to be a valid directive in a skipped conditional block). If we 
> > > do go that route and make it a warning, I think the warning group should 
> > > be `-Wunknown-directives` to mirror `-Wunknown-pragmas`, 
> > > `-Wunknown-attributes`, etc and it should be defined to have the same 
> > > text as the error case. e.g., 
> > > ```
> > > def err_pp_invalid_directive : Error<
> > >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > > >;
> > > def warn_pp_invalid_directive : Warning<
> > >   err_pp_invalid_directive.Text>, 
> > > InGroup>;
> > > ```
> > > WDYT?
> > > 
> > > (These were my thoughts before seeing the rest of the patch. After 
> > > reading the patch, it looks like we have pretty similar ideas here, which 
> > > is great, but leaving the comment anyway in case you have further 
> > > opinions.)
> > 
> > For now, I totally agree with deriving a new warning from 
> > `err_pp_invalid_directive`.
> > 
> > However, for future scalability, I think it would be great if we could 
> > split those diagnostics into Error & Warning and Help, for example. Rustc 
> > does split the diagnostics like the following, and I think this is quite 
> > clear. So, a bit apart from this patch, I speculate creating a diagnostic 
> > system that can split them would bring Clang diagnostics much more readable.
> > 
> > https://github.com/rust-lang/rust/blob/598d89bf142823b5d84e2eb0f0f9e418ee966a4b/src/test/ui/suggestions/suggest-trait-items.stderr
>

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-06 Thread Christopher Di Bella via Phabricator via cfe-commits
cjdb added a comment.

Thanks for working on this!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-06 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a subscriber: cjdb.
aaron.ballman added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:362-365
 
+def warn_pp_typo_directive : Warning<
+  "'#%0' directive not found, did you mean '#%1'?">,
+  InGroup;

ken-matsui wrote:
> aaron.ballman wrote:
> > Rather than adding a warning over the top of an existing error, I think we 
> > should modify `err_pp_invalid_directive` to have an optional "did you 
> > mean?" when we find a plausible typo to correct.
> > 
> > However, we do not issue that diagnostic when it's inside of a skipped 
> > conditional block, and that's what the crux of 
> > https://github.com/llvm/llvm-project/issues/51598 is about. As @rsmith 
> > pointed out in that issue (and I agree), compilers are free to support 
> > custom directives and those will validly appear inside of a conditional 
> > block that is skipped. We need to be careful not to diagnose those kinds of 
> > situations as an error. However, we still want to diagnose when the unknown 
> > directive is "sufficiently close" to another one which can control the 
> > conditional chain. e.g.,
> > ```
> > #fi FOO // error unknown directive, did you mean #if?
> > #endfi // error unknown directive, did you mean #endif?
> > 
> > #if FOO
> > #esle // diag: unknown directive, did you mean #else?
> > #elfi // diag: unknown directive, did you mean #elif?
> > #elfidef // diag: unknown directive, did you mean #elifdef
> > #elinfdef // diag: unknown directive, did you mean #elifndef
> > 
> > #frobble // No diagnostic, not close enough to a conditional directive to 
> > warrant diagnosing
> > #eerp // No diagnostic, not close enough to a conditional directive to 
> > warrant diagnosing
> > 
> > #endif
> > ```
> > Today, if `FOO` is defined to a nonzero value, you'll get diagnostics for 
> > all of those, but if `FOO` is not defined or is defined to 0, then there's 
> > no diagnostics. I think we want to consider directives that are *very 
> > likely* to be a typo (edit distance of 1, maybe 2?) for a conditional 
> > directive as a special case.
> > 
> > Currently, we only diagnose unknown directives as an error. I think for 
> > these special cased conditional directive diagnostics, we'll want to use a 
> > warning rather than an error in this circumstance (just in case it turns 
> > out to be a valid directive in a skipped conditional block). If we do go 
> > that route and make it a warning, I think the warning group should be 
> > `-Wunknown-directives` to mirror `-Wunknown-pragmas`, 
> > `-Wunknown-attributes`, etc and it should be defined to have the same text 
> > as the error case. e.g., 
> > ```
> > def err_pp_invalid_directive : Error<
> >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > >;
> > def warn_pp_invalid_directive : Warning<
> >   err_pp_invalid_directive.Text>, InGroup>;
> > ```
> > WDYT?
> > 
> > (These were my thoughts before seeing the rest of the patch. After reading 
> > the patch, it looks like we have pretty similar ideas here, which is great, 
> > but leaving the comment anyway in case you have further opinions.)
> > Currently, we only diagnose unknown directives as an error. I think for 
> > these special cased conditional directive diagnostics, we'll want to use a 
> > warning rather than an error in this circumstance (just in case it turns 
> > out to be a valid directive in a skipped conditional block). If we do go 
> > that route and make it a warning, I think the warning group should be 
> > `-Wunknown-directives` to mirror `-Wunknown-pragmas`, 
> > `-Wunknown-attributes`, etc and it should be defined to have the same text 
> > as the error case. e.g., 
> > ```
> > def err_pp_invalid_directive : Error<
> >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > >;
> > def warn_pp_invalid_directive : Warning<
> >   err_pp_invalid_directive.Text>, InGroup>;
> > ```
> > WDYT?
> > 
> > (These were my thoughts before seeing the rest of the patch. After reading 
> > the patch, it looks like we have pretty similar ideas here, which is great, 
> > but leaving the comment anyway in case you have further opinions.)
> 
> For now, I totally agree with deriving a new warning from 
> `err_pp_invalid_directive`.
> 
> However, for future scalability, I think it would be great if we could split 
> those diagnostics into Error & Warning and Help, for example. Rustc does 
> split the diagnostics like the following, and I think this is quite clear. 
> So, a bit apart from this patch, I speculate creating a diagnostic system 
> that can split them would bring Clang diagnostics much more readable.
> 
> https://github.com/rust-lang/rust/blob/598d89bf142823b5d84e2eb0f0f9e418ee966a4b/src/test/ui/suggestions/suggest-trait-items.stderr
> 
> However, for future scalability, I think it would be great if we could split 
> those diagnostics into Error & Warning and Help, for example. Rustc does 
> split the diagnostics li

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-05 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 427263.
ken-matsui added a comment.

Remove unnecessary includes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h
  llvm/unittests/ADT/StringRefTest.cpp

Index: llvm/unittests/ADT/StringRefTest.cpp
===
--- llvm/unittests/ADT/StringRefTest.cpp
+++ llvm/unittests/ADT/StringRefTest.cpp
@@ -566,6 +566,13 @@
 }
 
 TEST(StringRefTest, EditDistance) {
+  EXPECT_EQ(0U, StringRef("").edit_distance(""));
+  EXPECT_EQ(1U, StringRef("").edit_distance("aaab"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("aacb"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("abca"));
+  EXPECT_EQ(3U, StringRef("aabc").edit_distance("adef"));
+  EXPECT_EQ(4U, StringRef("abcd").edit_distance("efgh"));
+
   StringRef Hello("hello");
   EXPECT_EQ(2U, Hello.edit_distance("hill"));
 
@@ -584,6 +591,40 @@
"people soiled our green "));
 }
 
+TEST(StringRefTest, FindSimilarStr) {
+  {
+constexpr StringRef Candidates[] = {"aaab", "aaabc"};
+EXPECT_EQ(std::string("aaab"), StringRef("").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"aab", "abc"};
+EXPECT_EQ(std::string("aab"), StringRef("aaa").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"ab", "bc"};
+EXPECT_EQ(std::string("ab"), StringRef("aa").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"b", "c"};
+EXPECT_EQ(None, StringRef("a").find_similar_str(Candidates));
+  }
+  { // macros
+constexpr StringRef Candidates[] = {
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elfidef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(None, StringRef("special_compiler_directive").find_similar_str(Candidates));
+  }
+  { // case-insensitive
+constexpr StringRef Candidates[] = {
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("ELIFDEF").find_similar_str(Candidates));
+  }
+}
+
 TEST(StringRefTest, Misc) {
   std::string Storage;
   raw_string_ostream OS(Storage);
Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_strlen intrinsic for MSVC so it can be used in
 // constexpr context.
@@ -240,6 +242,46 @@
 unsigned edit_distance(StringRef Other, bool AllowReplacements = true,
unsigned MaxEditDistance = 0) const;
 
+/// Find a similar string in `Candidates`.
+template 
+Optional find_similar_str(const StringRef (&Candidates)[Size],
+   size_t Dist = 0) const {
+  // We need to check if `rng` has the exact case-insensitive string because the
+  // Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (equals_insensitive(C)) {
+  return C.str();
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If dist is given, use the dist for maxDist; otherwise, if word size is
+  // less than 3, use the word size minus 1 and if not, use the word size
+  // divided by 3.
+  size_t MaxDist = Dist != 0? Dist
+   : Length < 3 ? Length - 1
+: Length / 3;
+
+  std::vector> Cand;
+  for (StringRef C : Candidates) {
+size_t CurDist = edit_distance(C, false);
+if (CurDist <= MaxDist) {
+  Cand.emplace_back(C, CurDist);
+}
+  }
+
+  if (Cand.empty()) {
+return None;
+  } else if (Cand.size() == 1) {
+return Cand[0].first;
+  } else {
+auto SimilarStr = std::min_element(
+Cand.cbegin(), Cand.cend(),
+[](const auto &A, const auto &B) { return A.second < B.second; });
+return SimilarStr->first;
+  }
+}
+
 /// str - Get the contents as an std::string.
   

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-05 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui updated this revision to Diff 427262.
ken-matsui added a comment.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Update codes as reviewed


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/PPDirectives.cpp
  clang/test/Preprocessor/suggest-typoed-directive.c
  llvm/include/llvm/ADT/StringRef.h
  llvm/unittests/ADT/StringRefTest.cpp

Index: llvm/unittests/ADT/StringRefTest.cpp
===
--- llvm/unittests/ADT/StringRefTest.cpp
+++ llvm/unittests/ADT/StringRefTest.cpp
@@ -566,6 +566,13 @@
 }
 
 TEST(StringRefTest, EditDistance) {
+  EXPECT_EQ(0U, StringRef("").edit_distance(""));
+  EXPECT_EQ(1U, StringRef("").edit_distance("aaab"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("aacb"));
+  EXPECT_EQ(2U, StringRef("aabc").edit_distance("abca"));
+  EXPECT_EQ(3U, StringRef("aabc").edit_distance("adef"));
+  EXPECT_EQ(4U, StringRef("abcd").edit_distance("efgh"));
+
   StringRef Hello("hello");
   EXPECT_EQ(2U, Hello.edit_distance("hill"));
 
@@ -584,6 +591,40 @@
"people soiled our green "));
 }
 
+TEST(StringRefTest, FindSimilarStr) {
+  {
+constexpr StringRef Candidates[] = {"aaab", "aaabc"};
+EXPECT_EQ(std::string("aaab"), StringRef("").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"aab", "abc"};
+EXPECT_EQ(std::string("aab"), StringRef("aaa").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"ab", "bc"};
+EXPECT_EQ(std::string("ab"), StringRef("aa").find_similar_str(Candidates));
+  }
+  {
+constexpr StringRef Candidates[] = {"b", "c"};
+EXPECT_EQ(None, StringRef("a").find_similar_str(Candidates));
+  }
+  { // macros
+constexpr StringRef Candidates[] = {
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elfidef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(None, StringRef("special_compiler_directive").find_similar_str(Candidates));
+  }
+  { // case-insensitive
+constexpr StringRef Candidates[] = {
+"if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"
+};
+EXPECT_EQ(std::string("elifdef"), StringRef("elifdef").find_similar_str(Candidates));
+EXPECT_EQ(std::string("elifdef"), StringRef("ELIFDEF").find_similar_str(Candidates));
+  }
+}
+
 TEST(StringRefTest, Misc) {
   std::string Storage;
   raw_string_ostream OS(Storage);
Index: llvm/include/llvm/ADT/StringRef.h
===
--- llvm/include/llvm/ADT/StringRef.h
+++ llvm/include/llvm/ADT/StringRef.h
@@ -10,6 +10,7 @@
 #define LLVM_ADT_STRINGREF_H
 
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Compiler.h"
@@ -24,6 +25,7 @@
 #endif
 #include 
 #include 
+#include 
 
 // Declare the __builtin_strlen intrinsic for MSVC so it can be used in
 // constexpr context.
@@ -240,6 +242,46 @@
 unsigned edit_distance(StringRef Other, bool AllowReplacements = true,
unsigned MaxEditDistance = 0) const;
 
+/// Find a similar string in `Candidates`.
+template 
+Optional find_similar_str(const StringRef (&Candidates)[Size],
+   size_t Dist = 0) const {
+  // We need to check if `rng` has the exact case-insensitive string because the
+  // Levenshtein distance match does not care about it.
+  for (StringRef C : Candidates) {
+if (equals_insensitive(C)) {
+  return C.str();
+}
+  }
+
+  // Keep going with the Levenshtein distance match.
+  // If dist is given, use the dist for maxDist; otherwise, if word size is
+  // less than 3, use the word size minus 1 and if not, use the word size
+  // divided by 3.
+  size_t MaxDist = Dist != 0? Dist
+   : Length < 3 ? Length - 1
+: Length / 3;
+
+  std::vector> Cand;
+  for (StringRef C : Candidates) {
+size_t CurDist = edit_distance(C, false);
+if (CurDist <= MaxDist) {
+  Cand.emplace_back(C, CurDist);
+}
+  }
+
+  if (Cand.empty()) {
+return None;
+  } else if (Cand.size() == 1) {
+return Cand[0].first;
+  } else {
+auto SimilarStr = std::min_element(
+Cand.cbegin(), Cand.cend(),
+[](const auto &A, const auto &B) { return A.second < B.second; });
+return SimilarStr->first;
+   

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-05 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/include/clang/Tooling/LevDistance.h:1
+//===--- LevDistance.h --*- C++ 
-*-===//
+//

ken-matsui wrote:
> aaron.ballman wrote:
> > You shouldn't need this file or the implementation file -- we have this 
> > functionality already in: 
> > https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/edit_distance.h
> >  and 
> > https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/StringRef.h#L240.
> I totally missed the implementations! Sorry.
> 
> But where should I put both `findSimilarStr` & `findSimilarStr`?
> It seems that their same implementations aren't there.
@aaron.ballman 

The computation results are not matched with my implementation and another 
language implementation.

So, several directives that could be suggested in my implementation are no 
longer possible to be suggested, such as `#id` -> `#if`, `#elid` -> `#elif`, 
and `#elsi` -> `#else`, when using `llvm::ComputeEditDistance`.

The implementation of `llvm::ComputeEditDistance` might be also correct, but 
some distances seem to be calculated longer, which causes fewer suggestions.

Should I keep going with this implementation or not?

---

[llvm::ComputeEditDistance](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/edit_distance.h):

```
size_t Dist = Str1.edit_distance(Str2, false);
```

```
Str1: id, Str2: if, Dist: 2 # <-- here
Str1: id, Str2: ifdef, Dist: 3
Str1: id, Str2: ifndef, Dist: 4
Str1: ifd, Str2: if, Dist: 1
Str1: ifd, Str2: ifdef, Dist: 2
Str1: ifd, Str2: ifndef, Dist: 3
```

[My implementation](https://wandbox.org/permlink/zRjT41alOHdwcf00):

```
Str1: id, Str2: if, Dist: 1 # <-- here
Str1: id, Str2: ifdef, Dist: 3
Str1: id, Str2: ifndef, Dist: 4
Str1: ifd, Str2: if, Dist: 1
Str1: ifd, Str2: ifdef, Dist: 2
Str1: ifd, Str2: ifndef, Dist: 3
```

[Rustc implementation](https://wandbox.org/permlink/08SB08JFF5G4awx4):

```
Str1: id, Str2: if, Dist: 1 # <-- here
Str1: id, Str2: ifdef, Dist: 3
Str1: id, Str2: ifndef, Dist: 4
Str1: ifd, Str2: if, Dist: 1
Str1: ifd, Str2: ifdef, Dist: 2
Str1: ifd, Str2: ifndef, Dist: 3
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-04 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added a comment.

Thank you so much for your clear review!




Comment at: clang/include/clang/Basic/DiagnosticGroups.td:1023-1024
 
+// Typoed directive warnings
+def TypoedDirective : DiagGroup<"typoed-directive">;
+

aaron.ballman wrote:
> We don't typically use "typo" in the user-facing part of diagnostics and this 
> group doesn't seem likely to be reused, so I would remove it (another comment 
> on this elsewhere).
Ah, I see. Thank you!



Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:362-365
 
+def warn_pp_typo_directive : Warning<
+  "'#%0' directive not found, did you mean '#%1'?">,
+  InGroup;

aaron.ballman wrote:
> Rather than adding a warning over the top of an existing error, I think we 
> should modify `err_pp_invalid_directive` to have an optional "did you mean?" 
> when we find a plausible typo to correct.
> 
> However, we do not issue that diagnostic when it's inside of a skipped 
> conditional block, and that's what the crux of 
> https://github.com/llvm/llvm-project/issues/51598 is about. As @rsmith 
> pointed out in that issue (and I agree), compilers are free to support custom 
> directives and those will validly appear inside of a conditional block that 
> is skipped. We need to be careful not to diagnose those kinds of situations 
> as an error. However, we still want to diagnose when the unknown directive is 
> "sufficiently close" to another one which can control the conditional chain. 
> e.g.,
> ```
> #fi FOO // error unknown directive, did you mean #if?
> #endfi // error unknown directive, did you mean #endif?
> 
> #if FOO
> #esle // diag: unknown directive, did you mean #else?
> #elfi // diag: unknown directive, did you mean #elif?
> #elfidef // diag: unknown directive, did you mean #elifdef
> #elinfdef // diag: unknown directive, did you mean #elifndef
> 
> #frobble // No diagnostic, not close enough to a conditional directive to 
> warrant diagnosing
> #eerp // No diagnostic, not close enough to a conditional directive to 
> warrant diagnosing
> 
> #endif
> ```
> Today, if `FOO` is defined to a nonzero value, you'll get diagnostics for all 
> of those, but if `FOO` is not defined or is defined to 0, then there's no 
> diagnostics. I think we want to consider directives that are *very likely* to 
> be a typo (edit distance of 1, maybe 2?) for a conditional directive as a 
> special case.
> 
> Currently, we only diagnose unknown directives as an error. I think for these 
> special cased conditional directive diagnostics, we'll want to use a warning 
> rather than an error in this circumstance (just in case it turns out to be a 
> valid directive in a skipped conditional block). If we do go that route and 
> make it a warning, I think the warning group should be `-Wunknown-directives` 
> to mirror `-Wunknown-pragmas`, `-Wunknown-attributes`, etc and it should be 
> defined to have the same text as the error case. e.g., 
> ```
> def err_pp_invalid_directive : Error<
>   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> >;
> def warn_pp_invalid_directive : Warning<
>   err_pp_invalid_directive.Text>, InGroup>;
> ```
> WDYT?
> 
> (These were my thoughts before seeing the rest of the patch. After reading 
> the patch, it looks like we have pretty similar ideas here, which is great, 
> but leaving the comment anyway in case you have further opinions.)
> Currently, we only diagnose unknown directives as an error. I think for these 
> special cased conditional directive diagnostics, we'll want to use a warning 
> rather than an error in this circumstance (just in case it turns out to be a 
> valid directive in a skipped conditional block). If we do go that route and 
> make it a warning, I think the warning group should be `-Wunknown-directives` 
> to mirror `-Wunknown-pragmas`, `-Wunknown-attributes`, etc and it should be 
> defined to have the same text as the error case. e.g., 
> ```
> def err_pp_invalid_directive : Error<
>   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> >;
> def warn_pp_invalid_directive : Warning<
>   err_pp_invalid_directive.Text>, InGroup>;
> ```
> WDYT?
> 
> (These were my thoughts before seeing the rest of the patch. After reading 
> the patch, it looks like we have pretty similar ideas here, which is great, 
> but leaving the comment anyway in case you have further opinions.)

For now, I totally agree with deriving a new warning from 
`err_pp_invalid_directive`.

However, for future scalability, I think it would be great if we could split 
those diagnostics into Error & Warning and Help, for example. Rustc does split 
the diagnostics like the following, and I think this is quite clear. So, a bit 
apart from this patch, I speculate creating a diagnostic system that can split 
them would bring Clang diagnostics much more readable.

https://github.com/rust-lang/rust/blob/598d89bf142823b5d84e2eb0f0f9e418ee966a4b/src/test/ui/suggestions

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-05-03 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a subscriber: rsmith.
aaron.ballman added a comment.

Thank you for working on this!




Comment at: clang/include/clang/Basic/DiagnosticGroups.td:1023-1024
 
+// Typoed directive warnings
+def TypoedDirective : DiagGroup<"typoed-directive">;
+

We don't typically use "typo" in the user-facing part of diagnostics and this 
group doesn't seem likely to be reused, so I would remove it (another comment 
on this elsewhere).



Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:362-365
 
+def warn_pp_typo_directive : Warning<
+  "'#%0' directive not found, did you mean '#%1'?">,
+  InGroup;

Rather than adding a warning over the top of an existing error, I think we 
should modify `err_pp_invalid_directive` to have an optional "did you mean?" 
when we find a plausible typo to correct.

However, we do not issue that diagnostic when it's inside of a skipped 
conditional block, and that's what the crux of 
https://github.com/llvm/llvm-project/issues/51598 is about. As @rsmith pointed 
out in that issue (and I agree), compilers are free to support custom 
directives and those will validly appear inside of a conditional block that is 
skipped. We need to be careful not to diagnose those kinds of situations as an 
error. However, we still want to diagnose when the unknown directive is 
"sufficiently close" to another one which can control the conditional chain. 
e.g.,
```
#fi FOO // error unknown directive, did you mean #if?
#endfi // error unknown directive, did you mean #endif?

#if FOO
#esle // diag: unknown directive, did you mean #else?
#elfi // diag: unknown directive, did you mean #elif?
#elfidef // diag: unknown directive, did you mean #elifdef
#elinfdef // diag: unknown directive, did you mean #elifndef

#frobble // No diagnostic, not close enough to a conditional directive to 
warrant diagnosing
#eerp // No diagnostic, not close enough to a conditional directive to warrant 
diagnosing

#endif
```
Today, if `FOO` is defined to a nonzero value, you'll get diagnostics for all 
of those, but if `FOO` is not defined or is defined to 0, then there's no 
diagnostics. I think we want to consider directives that are *very likely* to 
be a typo (edit distance of 1, maybe 2?) for a conditional directive as a 
special case.

Currently, we only diagnose unknown directives as an error. I think for these 
special cased conditional directive diagnostics, we'll want to use a warning 
rather than an error in this circumstance (just in case it turns out to be a 
valid directive in a skipped conditional block). If we do go that route and 
make it a warning, I think the warning group should be `-Wunknown-directives` 
to mirror `-Wunknown-pragmas`, `-Wunknown-attributes`, etc and it should be 
defined to have the same text as the error case. e.g., 
```
def err_pp_invalid_directive : Error<
  "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
>;
def warn_pp_invalid_directive : Warning<
  err_pp_invalid_directive.Text>, InGroup>;
```
WDYT?

(These were my thoughts before seeing the rest of the patch. After reading the 
patch, it looks like we have pretty similar ideas here, which is great, but 
leaving the comment anyway in case you have further opinions.)



Comment at: clang/include/clang/Tooling/LevDistance.h:1
+//===--- LevDistance.h --*- C++ 
-*-===//
+//

You shouldn't need this file or the implementation file -- we have this 
functionality already in: 
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/edit_distance.h
 and 
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/StringRef.h#L240.



Comment at: clang/lib/Lex/PPDirectives.cpp:441-449
+  const SourceLocation &endLoc) {
+  const std::array candidates{
+  "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"};
+
+  if (const auto sugg =
+  tooling::levdistance::findSimilarStr(candidates, Directive)) {
+CharSourceRange DirectiveRange =

Mostly just cleaning up for coding conventions, but also, no need to use a 
`std::array` and we typically don't use local top-level `const` qualification.



Comment at: clang/test/OpenMP/predefined_macro.c:7
 // RUN: %clang_cc1 -fopenmp-simd -fopenmp-version=50 -verify -o - %s
-// expected-no-diagnostics
+// expected-warning@+5 {{'#elsif' directive not found, did you mean '#elif'?}}
 #ifdef FOPENMP

ken-matsui wrote:
> I am not sure if this typo was intended.
> 
> When I renamed `elsif` to `elif`, `#error "_OPENMP has incorrect value"` on 
> line `13` was evaluated.
> 
> Therefore, if this typo was intended, just suppressing the warning with 
> `expected-warning` would be better. However, if this typo was NOT intended, I 
> think I should make `_OPENMP` equal to `201107`. It is also poss

[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-04-30 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui added inline comments.



Comment at: clang/test/OpenMP/predefined_macro.c:7
 // RUN: %clang_cc1 -fopenmp-simd -fopenmp-version=50 -verify -o - %s
-// expected-no-diagnostics
+// expected-warning@+5 {{'#elsif' directive not found, did you mean '#elif'?}}
 #ifdef FOPENMP

I am not sure if this typo was intended.

When I renamed `elsif` to `elif`, `#error "_OPENMP has incorrect value"` on 
line `13` was evaluated.

Therefore, if this typo was intended, just suppressing the warning with 
`expected-warning` would be better. However, if this typo was NOT intended, I 
think I should make `_OPENMP` equal to `201107`. It is also possible that this 
test was mistakenly written.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

2022-04-30 Thread Ken Matsui via Phabricator via cfe-commits
ken-matsui created this revision.
ken-matsui added a reviewer: aaron.ballman.
Herald added a subscriber: mgorny.
Herald added a project: All.
ken-matsui requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch implements suggestion for typoed directives in preprocessor 
conditionals. The related issue is: 
https://github.com/llvm/llvm-project/issues/51598.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D124726

Files:
  clang/include/clang/Basic/DiagnosticGroups.td
  clang/include/clang/Basic/DiagnosticLexKinds.td
  clang/include/clang/Lex/Preprocessor.h
  clang/include/clang/Tooling/LevDistance.h
  clang/lib/Lex/CMakeLists.txt
  clang/lib/Lex/PPDirectives.cpp
  clang/lib/Tooling/CMakeLists.txt
  clang/lib/Tooling/LevDistance.cpp
  clang/test/OpenMP/predefined_macro.c
  clang/test/Preprocessor/suggest-typoed-directive.c
  clang/unittests/Tooling/CMakeLists.txt
  clang/unittests/Tooling/LevDistanceTest.cpp

Index: clang/unittests/Tooling/LevDistanceTest.cpp
===
--- /dev/null
+++ clang/unittests/Tooling/LevDistanceTest.cpp
@@ -0,0 +1,60 @@
+//===- unittest/Tooling/LevDistanceTest.cpp ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/LevDistance.h"
+#include "gtest/gtest.h"
+#include 
+
+using namespace clang;
+
+using tooling::levdistance::findSimilarStr;
+using tooling::levdistance::levDistance;
+
+namespace {
+
+TEST(LevDistanceTest, levDistance) {
+  EXPECT_EQ(0ul, levDistance("", ""));
+  EXPECT_EQ(1ul, levDistance("", "aaab"));
+  EXPECT_EQ(2ul, levDistance("aabc", "aacb"));
+  EXPECT_EQ(2ul, levDistance("aabc", "abca"));
+  EXPECT_EQ(3ul, levDistance("aabc", "adef"));
+  EXPECT_EQ(4ul, levDistance("abcd", "efgh"));
+}
+
+TEST(LevDistanceTest, findSimilarStr) {
+  std::array candidates{"aaab", "aaabc"};
+  EXPECT_EQ(std::string("aaab"), findSimilarStr(candidates, ""));
+
+  candidates = {"aab", "abc"};
+  EXPECT_EQ(std::string("aab"), findSimilarStr(candidates, "aaa"));
+
+  candidates = {"ab", "bc"};
+  EXPECT_EQ(std::string("ab"), findSimilarStr(candidates, "aa"));
+
+  candidates = {"b", "c"};
+  EXPECT_EQ(None, findSimilarStr(candidates, "a"));
+}
+
+TEST(LevDistanceTest, findSimilarStrToMacros) {
+  const std::array candidates{
+  "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"};
+
+  EXPECT_EQ(std::string("elifdef"), findSimilarStr(candidates, "elfidef"));
+  EXPECT_EQ(std::string("elifdef"), findSimilarStr(candidates, "elifdef"));
+  EXPECT_EQ(None, findSimilarStr(candidates, "special_compiler_directive"));
+}
+
+TEST(LevDistanceTest, findSimilarStrInCaseInsensitive) {
+  const std::array candidates{
+  "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"};
+
+  EXPECT_EQ(std::string("elifdef"), findSimilarStr(candidates, "elifdef"));
+  EXPECT_EQ(std::string("elifdef"), findSimilarStr(candidates, "ELIFDEF"));
+}
+
+} // end anonymous namespace
Index: clang/unittests/Tooling/CMakeLists.txt
===
--- clang/unittests/Tooling/CMakeLists.txt
+++ clang/unittests/Tooling/CMakeLists.txt
@@ -18,6 +18,7 @@
   FixItTest.cpp
   HeaderIncludesTest.cpp
   StandardLibraryTest.cpp
+  LevDistanceTest.cpp
   LexicallyOrderedRecursiveASTVisitorTest.cpp
   LookupTest.cpp
   QualTypeNamesTest.cpp
Index: clang/test/Preprocessor/suggest-typoed-directive.c
===
--- /dev/null
+++ clang/test/Preprocessor/suggest-typoed-directive.c
@@ -0,0 +1,30 @@
+// RUN: %clang_cc1 -std=c99 -fsyntax-only -verify -pedantic %s
+
+// expected-warning@+11 {{'#id' directive not found, did you mean '#if'?}}
+// expected-warning@+11 {{'#ifd' directive not found, did you mean '#if'?}}
+// expected-warning@+11 {{'#ifde' directive not found, did you mean '#ifdef'?}}
+// expected-warning@+11 {{'#elid' directive not found, did you mean '#elif'?}}
+// expected-warning@+11 {{'#elsif' directive not found, did you mean '#elif'?}}
+// expected-warning@+11 {{'#elseif' directive not found, did you mean '#elif'?}}
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean '#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean '#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}
+// expected-warning@+11 {{'#endi' directive not found, did you mean '#endif'?}}
+#ifdef UNDEFINED
+#id
+#ifd
+#ifde
+#elid
+#elsif
+#elseif
+#elfidef
+#elfindef
+#elsi
+#endi
+#endif
+
+#if special_compiler
+#special_compiler_directive // no diagnostics
+#endif
+
+