salman-javed-nz marked an inline comment as not done. salman-javed-nz added inline comments.
================ Comment at: clang-tools-extra/clang-tidy/utils/HeaderGuard.cpp:28 +/// Replace invalid characters in the identifier with '_'. +static std::string replaceInvalidChars(StringRef Identifier) { + std::string Fixed{Identifier}; ---------------- aaron.ballman wrote: > salman-javed-nz wrote: > > aaron.ballman wrote: > > > This fixes some issues but I think it introduces new issues too. The old > > > code would accept Unicode characters and pass them along as (valid) > > > identifier characters. The new code will replace all the Unicode > > > characters with `_` which means some people's header guards may go from > > > useful to `_______________`. We should definitely add test cases for > > > Unicode file names. > > > > > > Ultimately, I think this functionality will want to use > > > `UnicodeCharSets.h` to implement something along the lines of what's done > > > in `isAllowedInitiallyIDChar()` and `isAllowedIDChar()` from Lexer.cpp. > > `isAllowed*IDChar()`does exactly what I want, but is a static function in > > Lexer.cpp. > > Should I be changing its declaration so that it's exposed via Lexer.h? > > I don't know what the bar is to warrant changing core public Clang API. > > > > If I can just call the function directly from llvm-header-guard, I can > > avoid code duplication. > > > > > We could expose the interface from `Lexer` if we feel that's the right > approach. I was going back and forth on whether it was the right approach and > eventually thought that we don't want to expose it directly because I don't > think we want clang-tidy to care about things like `LangOpts.AsmPreprocessor` > or even `LangOpts.DollarIdents`. > > So my recommendation is to put this into LexerUtils.h/.cpp in clang-tidy > rather than changing the Clang Lexer interface. WDYT? //(For brevity, when I say `isAllowedIDChar`, I mean both `isAllowedIDChar` and `isAllowedInitiallyIDChar`)// I see two possible directions: If I expose the interface from Lexer.h, then in Clang-Tidy, it's only a matter of calling `isAllowedIDChar(C, Context->getLangOpts())`. Clang-Tidy doesn't need to care about the contents of `LangOpts` - it just needs to pass them on as-is to the `isAllowedIDChar()` function. `getLangOpts()` is already used all over the Clang-Tidy code base. The other way is to create a copy of `isAllowedIDChar()` in the Clang-Tidy project and take out the parts to do with `LangOpts.AsmPreprocessor` and `LangOpts.DollarIdents`. Now we have an implementation tailored to just the things Clang-Tidy cares about. I will need to add `clangLex` to CMakeLists.txt to gain visibility of clang/lib/Lex/UnicodeCharSets.h's character tables. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D114149/new/ https://reviews.llvm.org/D114149 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits