salman-javed-nz marked an inline comment as not done. salman-javed-nz added inline comments.
================ Comment at: clang-tools-extra/clang-tidy/utils/HeaderGuard.cpp:28 +/// Replace invalid characters in the identifier with '_'. +static std::string replaceInvalidChars(StringRef Identifier) { + std::string Fixed{Identifier}; ---------------- salman-javed-nz wrote: > aaron.ballman wrote: > > salman-javed-nz wrote: > > > aaron.ballman wrote: > > > > This fixes some issues but I think it introduces new issues too. The > > > > old code would accept Unicode characters and pass them along as (valid) > > > > identifier characters. The new code will replace all the Unicode > > > > characters with `_` which means some people's header guards may go from > > > > useful to `_______________`. We should definitely add test cases for > > > > Unicode file names. > > > > > > > > Ultimately, I think this functionality will want to use > > > > `UnicodeCharSets.h` to implement something along the lines of what's > > > > done in `isAllowedInitiallyIDChar()` and `isAllowedIDChar()` from > > > > Lexer.cpp. > > > `isAllowed*IDChar()`does exactly what I want, but is a static function in > > > Lexer.cpp. > > > Should I be changing its declaration so that it's exposed via Lexer.h? > > > I don't know what the bar is to warrant changing core public Clang API. > > > > > > If I can just call the function directly from llvm-header-guard, I can > > > avoid code duplication. > > > > > > > > We could expose the interface from `Lexer` if we feel that's the right > > approach. I was going back and forth on whether it was the right approach > > and eventually thought that we don't want to expose it directly because I > > don't think we want clang-tidy to care about things like > > `LangOpts.AsmPreprocessor` or even `LangOpts.DollarIdents`. > > > > So my recommendation is to put this into LexerUtils.h/.cpp in clang-tidy > > rather than changing the Clang Lexer interface. WDYT? > //(For brevity, when I say `isAllowedIDChar`, I mean both `isAllowedIDChar` > and `isAllowedInitiallyIDChar`)// > > I see two possible directions: > > If I expose the interface from Lexer.h, then in Clang-Tidy, it's only a > matter of calling > `isAllowedIDChar(C, Context->getLangOpts())`. Clang-Tidy doesn't need to care > about the contents of `LangOpts` - it just needs to pass them on as-is to the > `isAllowedIDChar()` function. > `getLangOpts()` is already used all over the Clang-Tidy code base. > > The other way is to create a copy of `isAllowedIDChar()` in the Clang-Tidy > project and take out the parts to do with `LangOpts.AsmPreprocessor` and > `LangOpts.DollarIdents`. Now we have an implementation tailored to just the > things Clang-Tidy cares about. I will need to add `clangLex` to > CMakeLists.txt to gain visibility of clang/lib/Lex/UnicodeCharSets.h's > character tables. I need to make a correction to my previous comment (I swear Phabricator had an "edit comment" button but I just don't see it at the moment). > The other way is to create a copy of `isAllowedIDChar()` in the Clang-Tidy > project and take out the parts to do with `LangOpts.AsmPreprocessor` and > `LangOpts.DollarIdents`. Now we have an implementation tailored to just the > things Clang-Tidy cares about. I will need to add `clangLex` to > CMakeLists.txt to gain visibility of clang/lib/Lex/UnicodeCharSets.h's > character tables. Of course clangTidyUtils already depends on clangLex, so we don't need another `add_clang_library` line in CMakeLists.txt, contrary to what I said earlier. However, UnicodeCharSets.h is still doesn't come up in Clang-Tidy's header search path because it doesn't sit in the include/clang folder. The file will need moving. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D114149/new/ https://reviews.llvm.org/D114149 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits