| Field | Value |
|---|---|
| Issue | 91311 |
| Summary | clang::SourceRange of clang::RawComment |
| Labels | clang |
| Assignees | |
| Reporter | T-Gruber |

While working with the LibTooling library (llvm-project release 17.x), I noticed unexpected behaviour in the SourceRange of RawComments. I have written a small clang-based tool that detects all comments in a given C file and removes them. To do this, I take the SourceRange of every RawComment and remove it using the Rewriter.
```cpp
#include <iostream>
#include <map>

#include "clang/AST/ASTContext.h"
#include "clang/AST/RawCommentList.h"
#include "clang/Lex/Lexer.h"
#include "clang/Rewrite/Core/Rewriter.h"

inline void removeAllComments(clang::ASTContext &Context, clang::Rewriter &R) {
  const clang::SourceManager &SrcMgr = Context.getSourceManager();
  // Walk every comment recorded for the main file.
  if (const std::map<unsigned int, clang::RawComment *> *CommentMap =
          Context.Comments.getCommentsInFile(SrcMgr.getMainFileID()))
    for (auto [LineInfo, Comment] : *CommentMap) {
      // Remove the comment, then print its range text and its raw text.
      R.RemoveText(Comment->getSourceRange());
      std::cout << "SourceRange via Lexer: "
                << clang::Lexer::getSourceText(
                       clang::CharSourceRange::getTokenRange(
                           Comment->getSourceRange()),
                       SrcMgr, Context.getLangOpts(), nullptr).str()
                << "\nRawTest: " << Comment->getRawText(SrcMgr).str() << "\n\n";
    }
}
```
To run the standalone tool I use the following command:
```
~/llvm-project/build$ bin/comment_tool test.c --extra-arg=-fparse-all-comments --
```
I tested a short code snippet:
```C
int a = 1 /*comment*/;
extern int /*comment*/ b;
```
And got the following rewritten result:
```C
int a = 1
extern int b;
```
As you can see, both comments are removed, but the semicolon of the first VarDecl is deleted as well. The terminal output shows that the SourceRange of the first comment, when read through the Lexer, includes the semicolon, whereas the text returned by getRawText (printed as RawTest) matches my expectation:
```
SourceRange via Lexer: /*comment*/;
RawTest: /*comment*/
SourceRange via Lexer: /*comment*/
RawTest: /*comment*/
```
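
One possible explanation, which is an assumption on my part and not confirmed against the clang sources: the end of a comment's SourceRange may point one past the last character of the comment, while a token range treats its end as the start of the last token and includes that whole token. The self-contained toy model below illustrates that difference on plain strings. It does not use clang at all; `OffsetRange`, `measureToken`, `textAsCharRange`, `textAsTokenRange` and `findComment` are hypothetical names invented for this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a source range: two offsets into a text buffer.
struct OffsetRange {
  std::size_t Begin;
  std::size_t End;
};

// Toy token measurement: a run of identifier/number characters counts as one
// token, any other non-whitespace character is a one-character token, and
// whitespace (or end of buffer) measures as zero in this model.
std::size_t measureToken(const std::string &Buf, std::size_t Pos) {
  auto IsWord = [](char C) {
    return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
  };
  if (Pos >= Buf.size() || std::isspace(static_cast<unsigned char>(Buf[Pos])))
    return 0;
  if (!IsWord(Buf[Pos]))
    return 1;
  std::size_t Len = 0;
  while (Pos + Len < Buf.size() && IsWord(Buf[Pos + Len]))
    ++Len;
  return Len;
}

// Char-range reading: End is one past the last character of the range.
std::string textAsCharRange(const std::string &Buf, OffsetRange R) {
  return Buf.substr(R.Begin, R.End - R.Begin);
}

// Token-range reading: End is the start of the last token, so the token
// found at End is included in the result.
std::string textAsTokenRange(const std::string &Buf, OffsetRange R) {
  return Buf.substr(R.Begin, R.End + measureToken(Buf, R.End) - R.Begin);
}

// Offsets of the first /*...*/ comment, with End one past the closing "*/".
// Assumes the buffer contains a complete block comment.
OffsetRange findComment(const std::string &Buf) {
  std::size_t Begin = Buf.find("/*");
  std::size_t End = Buf.find("*/", Begin + 2) + 2;
  return {Begin, End};
}
```

In this model, reading the comment range of `int a = 1 /*comment*/;` as a token range picks up the `;` that directly follows the comment, while in `extern int /*comment*/ b;` the comment is followed by a space and nothing extra is included, which mirrors the output above. If the assumption holds for clang as well, passing `clang::CharSourceRange::getCharRange(Comment->getSourceRange())` to `R.RemoveText` and to `clang::Lexer::getSourceText` might avoid the problem; I have not verified this.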
Is this intentional behaviour? I would be grateful for any advice!