[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: Superseded by #78092. Thanks for putting together this patch, @robozati. While we didn't go with the fix approach taken in this patch, it did prod me to look at the issue in more detail :laughing: https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/HighCommander4 closed https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: > What do you think about this as an alternative: if `spelledForExpanded` > receives as input an expanded token range `[firstExpanded, lastExpanded]` > where `lastExpanded` is the `eof` token, return the spelled tokens > corresponding to `[firstExpanded, lastExpanded - 1]` instead? (In the case > where the `eof` token is the only one in the range, we could return nullopt.) > > This would have the advantage of gracefully handling AST nodes like this one > whose end location is the `eof`, and return all the spelled tokens actually > making up the node. I wrote this up at https://github.com/llvm/llvm-project/pull/78092 https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: > I am slightly suspicious of source locations pointing at `eof` in the AST in > the first place, even in invalid code. I wonder if we would be better off > just having an invalid source location there instead of pointing at `eof`. I think in this case it's deliberate. If you look at the compiler error for the missing brace: ```c++ test.cpp:1:13: error: expected '}' 1 | int main() { | ^ test.cpp:1:12: note: to match this '{' 1 | int main() { |^ ``` you can see that the diagnostic location points to one character past the opening brace, which is the `eof` token. I'm not sure how else the diagnostic location could end up there. > Asserting that `tok::eof` should not be passed to `spelledForExpandedToken` > and returning `nullopt` in `spelledForExpanded` looks like a reasonable > workaround until we fix this. (The semantics looks reasonable as we don't > really have a spelled token we can map `eof` back to). What do you think about this as an alternative: if `spelledForExpanded` receives as input an expanded token range `[firstExpanded, lastExpanded]` where `lastExpanded` is the `eof` token, return the spelled tokens corresponding to `[firstExpanded, lastExpanded - 1]` instead? (In the case where the `eof` token is the only one in the range, we could return nullopt.) This would have the advantage of gracefully handling AST nodes like this one whose end location is the `eof`, and return all the spelled tokens actually making up the node. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
ilya-biryukov wrote: I second everything that @HighCommander4 said. It would be really useful to add an assertion that we never get out-of-bounds here, but this indicates a logic error somewhere in the code that should be fixed. I am slightly suspicious of source locations pointing at `eof` in the AST in the first place, even in invalid code. I wonder if we would be better off just having an invalid source location there instead of pointing at `eof`. Asserting that `tok::eof` should not be passed to `spelledForExpandedToken` and returning `nullopt` in `spelledForExpanded` looks like a reasonable workaround until we fix this. (The semantics looks reasonable as we don't really have a spelled token we can map `eof` back to). https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: Anyways, in a case like this, this preferable behaviour would be for `spelledForExpanded()` to return the spelled tokens covering `int main() {` (not including the `eof` token but including everything else), **not** nullopt. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: I poked at this a bit more. A few more notes: * Since the test case currently in the patch does not trigger the crash (without the fix), and it was not obvious to me what sort of call to `spelledForExpanded()` to formulate to trigger the crash, I instead formulated a test case in `DumpASTTests.cpp` that calls `dumpAST()` on the `TranslationUnitDecl` of a file containing the code `int main() {`. This does trigger the crash. * The crash occurs when dumping the `FunctionDecl` node for `main`. When dumping a node, `dumpAST()` [calls](https://searchfox.org/llvm/rev/fd1c156e5d6584684ce58c0536dca96cedcc41f0/clang-tools-extra/clangd/DumpAST.cpp#93) `Tokens.spelledForExpanded(Tokens.expandedTokens(SR))` where `SR` is the node's source range. * For this `FunctionDecl` node, the end location of its source range points to a `tok::eof` token. * This comes from the end location of the function's `CompoundStmt` body, which is what the parser assigns in [Parser::ParseCompoundStatementBody()](https://searchfox.org/llvm/rev/fd1c156e5d6584684ce58c0536dca96cedcc41f0/clang/lib/Parse/ParseStmt.cpp#1292) in this situation. * The range of expanded tokens returned by `Tok.expandedTokens(SR)` thus includes as its last element the `tok::eof` token. * The [comment](https://searchfox.org/llvm/rev/fd1c156e5d6584684ce58c0536dca96cedcc41f0/clang/include/clang/Tooling/Syntax/Tokens.h#162) above `TokenBuffer` says "the expanded token stream has a `tok::eof` token at the end, the spelled tokens never store a 'eof' token". I haven't dug into the implementation of `spelledForExpanded()` yet, but from the above it's looking to me like `spelledForExpanded()` is not expecting an input expanded token range that includes the `eof` token, but we are passing it one (and such a token range can legitimately arise from the source range of the AST node in cases of invalid code like this). https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
HighCommander4 wrote: I spent a bit more time looking at this. I'd say I'm still quite far from having a complete understanding of this code, but two things occur to me: 1. `TokenBuffer` operates at the level of the preprocessor. "spelledForExpanded" is basically an operation which takes as input a range of tokens in the preprocessed token stream ("expanded tokens"), and maps them back to a range of tokens in the un-preprocessed token stream ("spelled tokens"). * The preprocessor has no notion of matching or balancing braces. Braces only become relevant later, during parsing. * So, I don't understand why an input token stream that contains unbalanced braces would pose a problem for the "spelledForExpanded" operation. * I'm hesitant to approve a patch that adds an early exit to "spelledForExpanded" to prevent an out-of-bounds array access, without understanding in more detail **how** the out-of-bounds array access arises / how it's connected to the unbalanced braces. It's possible that the out-of-bounds access is indicative of a deeper logic error which we'd be covering up. 2. Since we've observed the problem arise during the `dumpAST` operation (frame 10 of the stack trace [here](https://github.com/clangd/clangd/issues/1559#issue-1646293852)), it would be good to (also) add a test that exercises the `dumpAST` operation as a whole. * We have such tests in [DumpASTTests.cpp](https://searchfox.org/llvm/source/clang-tools-extra/clangd/unittests/DumpASTTests.cpp). * I think it would be preferable if the AST dump for an input with unbalanced braces, such as `int main() {`, was **not** empty. (I suspect the current patch may make it empty.) `clang -Xclang -ast-dump` manages to print AST nodes in this case, I think so should we. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
robozati wrote: @HighCommander4, thank you for reviewing this PR! Unfortunately, I’m currently in my finals and will only have time to look at this next week. So, I’ll update this later. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
@@ -585,6 +585,17 @@ TEST_F(TokenCollectorTest, DelayedParsing) { EXPECT_THAT(collectAndDump(Code), StartsWith(ExpectedTokens)); } +TEST_F(TokenCollectorTest, UnclosedToken) { HighCommander4 wrote: Indeed, it looks like `TokenBuffer::spelledForExpanded()` (which is the only call site of `spelledForExpandedToken()`) is not called by anything currently in this test. To exercise `spelledForExpanded()`, we will need to make an explicit call to it, the way a number of other tests in this file do. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
@@ -585,6 +585,17 @@ TEST_F(TokenCollectorTest, DelayedParsing) { EXPECT_THAT(collectAndDump(Code), StartsWith(ExpectedTokens)); } +TEST_F(TokenCollectorTest, UnclosedToken) { HighCommander4 wrote: I started looking at this to try and make some progress towards getting this landed. However, when running this test locally, I'm not seeing either of the branches added to `spelledForExpandedToken()` being taken. Am I missing something, or is this test not exercising the intended codepath? https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
robozati wrote: Rewrote the test and now it passes. Sorry for pinging you @HighCommander4, but I don't know who to ping and you participated in that issue. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/robozati ready_for_review https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/robozati updated https://github.com/llvm/llvm-project/pull/69849 >From 239302d66dea74333e27288acbceffdf3f2d3b6b Mon Sep 17 00:00:00 2001 From: Gabriel Pezati Date: Sat, 21 Oct 2023 19:28:57 -0300 Subject: [PATCH] Fix UB on unclosed parentheses, brackets or braces in Tokens --- clang/lib/Tooling/Syntax/Tokens.cpp | 25 ++- clang/unittests/Tooling/Syntax/TokensTest.cpp | 11 2 files changed, 30 insertions(+), 6 deletions(-) diff --git a/clang/lib/Tooling/Syntax/Tokens.cpp b/clang/lib/Tooling/Syntax/Tokens.cpp index 2f28b9cf286a638..97f794e09cb5590 100644 --- a/clang/lib/Tooling/Syntax/Tokens.cpp +++ b/clang/lib/Tooling/Syntax/Tokens.cpp @@ -286,9 +286,13 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { }); // Our token could only be produced by the previous mapping. if (It == File.Mappings.begin()) { -// No previous mapping, no need to modify offsets. -return {[ExpandedIndex - File.BeginExpanded], -/*Mapping=*/nullptr}; +const auto offset = ExpandedIndex - File.BeginExpanded; +// Bounds check so parsing an unclosed parenthesis, brackets or braces do +// not result in UB. +if (offset >= File.SpelledTokens.size()) { + return {nullptr, nullptr}; +} +return {[offset], /*Mapping=*/nullptr}; } --It; // 'It' now points to last mapping that started before our token. @@ -298,9 +302,12 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { // Not part of the mapping, use the index from previous mapping to compute the // corresponding spelled token. - return { - [It->EndSpelled + (ExpandedIndex - It->EndExpanded)], - /*Mapping=*/nullptr}; + const auto offset = It->EndSpelled + (ExpandedIndex - It->EndExpanded); + // This index can also result in UB when parsing unclosed tokens. + if (offset >= File.SpelledTokens.size()) { +return {nullptr, nullptr}; + } + return {[offset], /*Mapping=*/nullptr}; } const TokenBuffer::Mapping * @@ -410,6 +417,12 @@ TokenBuffer::spelledForExpanded(llvm::ArrayRef Expanded) const { auto [FirstSpelled, FirstMapping] = spelledForExpandedToken(First); auto [LastSpelled, LastMapping] = spelledForExpandedToken(Last); + // If there was an unclosed token, FirstSpelled or LastSpelled is null and the + // operation shouldn't continue. + if (FirstSpelled == nullptr || LastSpelled == nullptr) { +return std::nullopt; + }; + FileID FID = SourceMgr->getFileID(FirstSpelled->location()); // FIXME: Handle multi-file changes by trying to map onto a common root. if (FID != SourceMgr->getFileID(LastSpelled->location())) diff --git a/clang/unittests/Tooling/Syntax/TokensTest.cpp b/clang/unittests/Tooling/Syntax/TokensTest.cpp index 0c08318a637c0b6..e1b053ce45b841b 100644 --- a/clang/unittests/Tooling/Syntax/TokensTest.cpp +++ b/clang/unittests/Tooling/Syntax/TokensTest.cpp @@ -585,6 +585,17 @@ TEST_F(TokenCollectorTest, DelayedParsing) { EXPECT_THAT(collectAndDump(Code), StartsWith(ExpectedTokens)); } +TEST_F(TokenCollectorTest, UnclosedToken) { + llvm::StringLiteral Code = R"cpp( +int main() { +// this should not result in a segfault or UB. + )cpp"; + std::string ExpectedTokens = + "expanded tokens:\n int main ( ) {\nfile './input.cpp'\n spelled " + "tokens:\nint main ( ) {\n no mappings.\n"; + EXPECT_THAT(collectAndDump(Code), StartsWith(ExpectedTokens)); +} + TEST_F(TokenCollectorTest, MultiFile) { addFile("./foo.h", R"cpp( #define ADD(X, Y) X+Y ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/robozati converted_to_draft https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
robozati wrote: Tests are failing because of the `findExpanded` function, but it didn’t result in UB. Will fix the test later. https://github.com/llvm/llvm-project/pull/69849 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/robozati updated https://github.com/llvm/llvm-project/pull/69849 >From e57657cf296e1e93fa17f021bb8ce591071b2176 Mon Sep 17 00:00:00 2001 From: robozati <139823421+roboz...@users.noreply.github.com> Date: Sat, 21 Oct 2023 12:04:14 -0300 Subject: [PATCH 1/2] Fix UB on unclosed parentheses, brackets or braces in Tokens --- clang/lib/Tooling/Syntax/Tokens.cpp | 23 ++- clang/unittests/Tooling/Syntax/TokensTest.cpp | 6 + 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/clang/lib/Tooling/Syntax/Tokens.cpp b/clang/lib/Tooling/Syntax/Tokens.cpp index 2f28b9cf286a638..e4c24aa1ef3fbf0 100644 --- a/clang/lib/Tooling/Syntax/Tokens.cpp +++ b/clang/lib/Tooling/Syntax/Tokens.cpp @@ -286,9 +286,13 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { }); // Our token could only be produced by the previous mapping. if (It == File.Mappings.begin()) { -// No previous mapping, no need to modify offsets. -return {[ExpandedIndex - File.BeginExpanded], -/*Mapping=*/nullptr}; +const auto offset = ExpandedIndex - File.BeginExpanded; +// Bounds check so parsing an unclosed parenthesis, brackets or braces do +// not result in UB. +if (offset >= File.SpelledTokens.size()) { + return {nullptr, nullptr}; +} +return {[offset], /*Mapping=*/nullptr}; } --It; // 'It' now points to last mapping that started before our token. @@ -298,9 +302,12 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { // Not part of the mapping, use the index from previous mapping to compute the // corresponding spelled token. - return { - [It->EndSpelled + (ExpandedIndex - It->EndExpanded)], - /*Mapping=*/nullptr}; + const auto offset = It->EndSpelled + (ExpandedIndex - It->EndExpanded); + // This index can also result in UB when parsing unclosed tokens. + if (offset >= File.SpelledTokens.size()) { +return {nullptr, nullptr}; + } + return {[offset], /*Mapping=*/nullptr}; } const TokenBuffer::Mapping * @@ -410,6 +417,10 @@ TokenBuffer::spelledForExpanded(llvm::ArrayRef Expanded) const { auto [FirstSpelled, FirstMapping] = spelledForExpandedToken(First); auto [LastSpelled, LastMapping] = spelledForExpandedToken(Last); + if (FirstSpelled == nullptr || LastSpelled == nullptr) { +return std::nullopt; + }; + FileID FID = SourceMgr->getFileID(FirstSpelled->location()); // FIXME: Handle multi-file changes by trying to map onto a common root. if (FID != SourceMgr->getFileID(LastSpelled->location())) diff --git a/clang/unittests/Tooling/Syntax/TokensTest.cpp b/clang/unittests/Tooling/Syntax/TokensTest.cpp index 0c08318a637c0b6..44f0c57b3bad875 100644 --- a/clang/unittests/Tooling/Syntax/TokensTest.cpp +++ b/clang/unittests/Tooling/Syntax/TokensTest.cpp @@ -814,6 +814,12 @@ TEST_F(TokenBufferTest, SpelledByExpanded) { EXPECT_THAT(Buffer.spelledForExpanded(findExpanded("good")), ValueIs(SameRange(findSpelled("good"; EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("prev good")), std::nullopt); + + // Unclosed parentheses, brackets or braces do not result in a UB + recordTokens(R"cpp( +foo(x + )cpp"); + EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("x")), std::nullopt); } TEST_F(TokenBufferTest, ExpandedTokensForRange) { >From 474e43fdf6682f83e4a7839eabe2f4c2f5fc1b92 Mon Sep 17 00:00:00 2001 From: robozati <139823421+roboz...@users.noreply.github.com> Date: Sat, 21 Oct 2023 12:58:43 -0300 Subject: [PATCH 2/2] Fix failing test --- clang/unittests/Tooling/Syntax/TokensTest.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/unittests/Tooling/Syntax/TokensTest.cpp b/clang/unittests/Tooling/Syntax/TokensTest.cpp index 44f0c57b3bad875..2b7b1474228c4d0 100644 --- a/clang/unittests/Tooling/Syntax/TokensTest.cpp +++ b/clang/unittests/Tooling/Syntax/TokensTest.cpp @@ -819,7 +819,7 @@ TEST_F(TokenBufferTest, SpelledByExpanded) { recordTokens(R"cpp( foo(x )cpp"); - EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("x")), std::nullopt); + EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("foo(x)")), std::nullopt); } TEST_F(TokenBufferTest, ExpandedTokensForRange) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Bounds checking on unclosed parentheses, brackets or braces in Expanded Tokens (PR #69849)
https://github.com/robozati created https://github.com/llvm/llvm-project/pull/69849 When getting the AST from clangd, if there is an unclosed token, a segfault or an UB can happen when accessing `File.SpelledTokens` because the supposed closing token is not present, so its index is greater than the vec's size. This fixes the issue by returning a `nullptr` on the function that caused the UB and then checking if the spelled token is null, returning `nullopt` for the whole expanded token. Fixes https://github.com/clangd/clangd/issues/1559. >From e57657cf296e1e93fa17f021bb8ce591071b2176 Mon Sep 17 00:00:00 2001 From: robozati <139823421+roboz...@users.noreply.github.com> Date: Sat, 21 Oct 2023 12:04:14 -0300 Subject: [PATCH] Fix UB on unclosed parentheses, brackets or braces in Tokens --- clang/lib/Tooling/Syntax/Tokens.cpp | 23 ++- clang/unittests/Tooling/Syntax/TokensTest.cpp | 6 + 2 files changed, 23 insertions(+), 6 deletions(-) diff --git a/clang/lib/Tooling/Syntax/Tokens.cpp b/clang/lib/Tooling/Syntax/Tokens.cpp index 2f28b9cf286a638..e4c24aa1ef3fbf0 100644 --- a/clang/lib/Tooling/Syntax/Tokens.cpp +++ b/clang/lib/Tooling/Syntax/Tokens.cpp @@ -286,9 +286,13 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { }); // Our token could only be produced by the previous mapping. if (It == File.Mappings.begin()) { -// No previous mapping, no need to modify offsets. -return {[ExpandedIndex - File.BeginExpanded], -/*Mapping=*/nullptr}; +const auto offset = ExpandedIndex - File.BeginExpanded; +// Bounds check so parsing an unclosed parenthesis, brackets or braces do +// not result in UB. +if (offset >= File.SpelledTokens.size()) { + return {nullptr, nullptr}; +} +return {[offset], /*Mapping=*/nullptr}; } --It; // 'It' now points to last mapping that started before our token. @@ -298,9 +302,12 @@ TokenBuffer::spelledForExpandedToken(const syntax::Token *Expanded) const { // Not part of the mapping, use the index from previous mapping to compute the // corresponding spelled token. - return { - [It->EndSpelled + (ExpandedIndex - It->EndExpanded)], - /*Mapping=*/nullptr}; + const auto offset = It->EndSpelled + (ExpandedIndex - It->EndExpanded); + // This index can also result in UB when parsing unclosed tokens. + if (offset >= File.SpelledTokens.size()) { +return {nullptr, nullptr}; + } + return {[offset], /*Mapping=*/nullptr}; } const TokenBuffer::Mapping * @@ -410,6 +417,10 @@ TokenBuffer::spelledForExpanded(llvm::ArrayRef Expanded) const { auto [FirstSpelled, FirstMapping] = spelledForExpandedToken(First); auto [LastSpelled, LastMapping] = spelledForExpandedToken(Last); + if (FirstSpelled == nullptr || LastSpelled == nullptr) { +return std::nullopt; + }; + FileID FID = SourceMgr->getFileID(FirstSpelled->location()); // FIXME: Handle multi-file changes by trying to map onto a common root. if (FID != SourceMgr->getFileID(LastSpelled->location())) diff --git a/clang/unittests/Tooling/Syntax/TokensTest.cpp b/clang/unittests/Tooling/Syntax/TokensTest.cpp index 0c08318a637c0b6..44f0c57b3bad875 100644 --- a/clang/unittests/Tooling/Syntax/TokensTest.cpp +++ b/clang/unittests/Tooling/Syntax/TokensTest.cpp @@ -814,6 +814,12 @@ TEST_F(TokenBufferTest, SpelledByExpanded) { EXPECT_THAT(Buffer.spelledForExpanded(findExpanded("good")), ValueIs(SameRange(findSpelled("good"; EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("prev good")), std::nullopt); + + // Unclosed parentheses, brackets or braces do not result in a UB + recordTokens(R"cpp( +foo(x + )cpp"); + EXPECT_EQ(Buffer.spelledForExpanded(findExpanded("x")), std::nullopt); } TEST_F(TokenBufferTest, ExpandedTokensForRange) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits