[llvm-branch-commits] [llvm] [AArch64][PAC] Reduce the size of synchronous CFI (PR #96377)
https://github.com/igorkudrin created https://github.com/llvm/llvm-project/pull/96377

For synchronous unwind tables, the call frame information can be slightly reduced by bundling the `.cfi_negate_ra_state` instruction with other CFI instructions in the prolog, saving 1 byte per function used for `DW_CFA_advance_loc`. This was suggested in [D156428](https://reviews.llvm.org/D156428#4554317).

From 4880bc9fca58a185f70acf00a8c31891184272cd Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Thu, 20 Jun 2024 18:53:45 -0700
Subject: [PATCH] [AArch64][PAC] Reduce the size of synchronous CFI

For synchronous unwind tables, the call frame information can be slightly reduced by bundling the `.cfi_negate_ra_state` instruction with other CFI instructions in the prolog, saving 1 byte per function used for `DW_CFA_advance_loc`. This was suggested in [D156428](https://reviews.llvm.org/D156428#4554317).
---
 .../lib/Target/AArch64/AArch64PointerAuth.cpp | 13 +
 .../machine-outliner-retaddr-sign-cfi.ll | 3 +-
 ...tliner-retaddr-sign-diff-scope-same-key.ll | 6 ++--
 .../machine-outliner-retaddr-sign-non-leaf.ll | 9 --
 .../machine-outliner-retaddr-sign-regsave.mir | 3 +-
 ...tliner-retaddr-sign-same-scope-diff-key.ll | 9 --
 ...machine-outliner-retaddr-sign-subtarget.ll | 9 --
 .../machine-outliner-retaddr-sign-thunk.ll | 12 +---
 .../AArch64/pacbti-llvm-generated-funcs-2.ll | 9 --
 ...sign-return-address-cfi-negate-ra-state.ll | 13 +
 .../AArch64/sign-return-address-pauth-lr.ll | 28 +--
 .../CodeGen/AArch64/sign-return-address.ll | 18 ++--
 12 files changed, 84 insertions(+), 48 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index e900f6881620f..eb0ff73200407 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -100,6 +100,7 @@ void AArch64PointerAuth::signLR(MachineFunction &MF,
   auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();
   bool UseBKey = MFnI.shouldSignWithBKey();
   bool EmitCFI = MFnI.needsDwarfUnwindInfo(MF);
+  bool EmitAsyncCFI = MFnI.needsAsyncDwarfUnwindInfo(MF);
   bool NeedsWinCFI = MF.hasWinCFI();

   MachineBasicBlock &MBB = *MBBI->getParent();
@@ -137,6 +138,18 @@ void AArch64PointerAuth::signLR(MachineFunction &MF,
   }

   if (EmitCFI) {
+    if (!EmitAsyncCFI) {
+      // Reduce the size of the generated call frame information for synchronous
+      // CFI by bundling the new CFI instruction with others in the prolog, so
+      // that no additional DW_CFA_advance_loc is needed.
+      for (auto I = MBBI; I != MBB.end(); ++I) {
+        if (I->getOpcode() == TargetOpcode::CFI_INSTRUCTION &&
+            I->getFlag(MachineInstr::FrameSetup)) {
+          MBBI = I;
+          break;
+        }
+      }
+    }
     unsigned CFIIndex =
         MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
     BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-cfi.ll b/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-cfi.ll
index 4bbbe40176313..c64b3842aa5ba 100644
--- a/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-cfi.ll
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-cfi.ll
@@ -11,7 +11,8 @@ define void @a() "sign-return-address"="all" "sign-return-address-key"="b_key" {
 ; CHECK-NEXT: .cfi_b_key_frame
 ; V8A-NEXT: hint #27
 ; V83A-NEXT: pacibsp
-; CHECK-NEXT: .cfi_negate_ra_state
+; CHECK: .cfi_negate_ra_state
+; CHECK-NEXT: .cfi_def_cfa_offset
   %1 = alloca i32, align 4
   %2 = alloca i32, align 4
   %3 = alloca i32, align 4
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-diff-scope-same-key.ll b/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-diff-scope-same-key.ll
index f4e9c0a4c2204..3221815da33c5 100644
--- a/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-diff-scope-same-key.ll
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-diff-scope-same-key.ll
@@ -7,7 +7,8 @@ define void @a() "sign-return-address"="all" {
 ; CHECK-LABEL: a: // @a
 ; V8A: hint #25
 ; V83A: paciasp
-; CHECK-NEXT: .cfi_negate_ra_state
+; CHECK: .cfi_negate_ra_state
+; CHECK-NEXT: .cfi_def_cfa_offset
   %1 = alloca i32, align 4
   %2 = alloca i32, align 4
   %3 = alloca i32, align 4
@@ -54,7 +55,8 @@ define void @c() "sign-return-address"="all" {
 ; CHECK-LABEL: c: // @c
 ; V8A: hint #25
 ; V83A: paciasp
-; CHECK-NEXT: .cfi_negate_ra_state
+; CHECK: .cfi_negate_ra_state
+; CHECK-NEXT: .cfi_def_cfa_offset
   %1 = alloca i32, align 4
   %2 = alloca i32, align 4
   %3 = alloca i32, align 4
diff --git
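The one-byte saving described for PR #96377 can be illustrated with a small size model. This is only a sketch under stated assumptions, not LLVM code: `CfiDirective`, `encodedCfiSize`, `asyncPrologSize`, and `syncPrologSize` are hypothetical names, the code offsets (4 and 8) are made up for a two-instruction prolog, and every `DW_CFA_advance_loc` is assumed to use its one-byte form.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of one CFI directive: the code offset it is
// attached to and the size of its own encoding in bytes.
struct CfiDirective {
  unsigned CodeOffset;
  std::size_t EncodedSize;
};

// Returns the total encoded size, charging one byte for a DW_CFA_advance_loc
// each time the code offset changes (assumption: deltas are small enough for
// the one-byte form).
std::size_t encodedCfiSize(const std::vector<CfiDirective> &Directives) {
  std::size_t Size = 0;
  unsigned CurrentOffset = 0;
  for (const CfiDirective &D : Directives) {
    assert(D.CodeOffset >= CurrentOffset && "directives must be sorted");
    if (D.CodeOffset != CurrentOffset) {
      Size += 1; // DW_CFA_advance_loc
      CurrentOffset = D.CodeOffset;
    }
    Size += D.EncodedSize;
  }
  return Size;
}

// Asynchronous layout: .cfi_negate_ra_state (1 byte) sits right after the
// signing instruction, .cfi_def_cfa_offset (2 bytes) after the next one.
std::size_t asyncPrologSize() { return encodedCfiSize({{4, 1}, {8, 2}}); }

// Synchronous layout after bundling: both directives share one code offset,
// so only a single DW_CFA_advance_loc is needed.
std::size_t syncPrologSize() { return encodedCfiSize({{8, 1}, {8, 2}}); }
```

In this model the bundled layout is exactly one byte smaller, which matches the saving claimed in the patch description.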
[llvm-branch-commits] [llvm] [AArch64][PAC] Fix creating check instructions for BBs without an epilog (PR #92508)
https://github.com/igorkudrin created https://github.com/llvm/llvm-project/pull/92508

`AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing the tail call instruction to add check instructions, assuming at least one more instruction before the call. This assumption is incorrect in cases where some execution paths lead to the termination block without creating the stack frame. This patch rearranges the creation of the checks so that the prior splitting is not required.

From a3039508f7bf9eeacbb4739460468cb3e71ba133 Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Thu, 16 May 2024 22:26:32 -0700
Subject: [PATCH 1/2] test

---
 .../AArch64/sign-return-address-tailcall.ll | 32 +++
 1 file changed, 32 insertions(+)

diff --git a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
index cf033cb8208cc..0cc707298e458 100644
--- a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
+++ b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
@@ -129,4 +129,36 @@ define i32 @tailcall_ib_key() "sign-return-address"="all" "sign-return-address-k
   ret i32 %call
 }

+define i32 @tailcall_two_branches(i1 %0) "sign-return-address"="all" {
+; COMMON-LABEL: tailcall_two_branches:
+; COMMON: tbz w0, #0, .[[ELSE:LBB[_0-9]+]]
+; COMMON: str x30, [sp, #-16]!
+; COMMON: bl callee2
+; COMMON: ldr x30, [sp], #16
+; COMMON-NEXT: [[AUTIASP]]
+; COMMON-NEXT: .[[ELSE]]:
+
+; LDR-NEXT: ldr w16, [x30]
+;
+; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1
+; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
+;
+; XPAC-NEXT: mov x16, x30
+; XPAC-NEXT: [[XPACLRI]]
+; XPAC-NEXT: cmp x16, x30
+; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
+;
+; COMMON-NEXT: b callee
+; BRK-NEXT: .[[FAIL]]:
+; BRK-NEXT: brk #0xc470
+  br i1 %0, label %2, label %3
+2:
+  call void @callee2()
+  br label %3
+3:
+  %call = tail call i32 @callee()
+  ret i32 %call
+}
+
 declare i32 @callee()
+declare void @callee2()

From 2641fe82837455b422d6c8229cc2f3d3736de4da Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Thu, 16 May 2024 22:26:40 -0700
Subject: [PATCH 2/2] [AArch64][PAC] Fix creating check instructions for BBs without an epilog

`AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing the tail call instruction to add check instructions, assuming at least one more instruction before the call. This assumption is incorrect in cases where some execution paths lead to the termination block without creating the stack frame. This patch rearranges the creation of the checks so that the prior splitting is not required.
---
 .../lib/Target/AArch64/AArch64PointerAuth.cpp | 23 ++-
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index 90bf089dbebf7..60d3d533d9c10 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -257,21 +257,12 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(

   // Control flow has to be changed, so arrange new MBBs.

-  // At now, at least an AUT* instruction is expected before MBBI
-  assert(MBBI != MBB.begin() &&
-         "Cannot insert the check at the very beginning of MBB");
-  // The block to insert check into.
-  MachineBasicBlock *CheckBlock = &MBB;
-  // The remaining part of the original MBB that is executed on success.
-  MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI));
-
   // The block that explicitly generates a break-point exception on failure.
   MachineBasicBlock *BreakBlock =
       MF.CreateMachineBasicBlock(MBB.getBasicBlock());
   MF.push_back(BreakBlock);
-  MBB.splitSuccessor(SuccessBlock, BreakBlock);
+  MBB.addSuccessor(BreakBlock);

-  assert(CheckBlock->getFallThrough() == SuccessBlock);
   BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm);

   switch (Method) {
@@ -279,11 +270,11 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
   case AuthCheckMethod::DummyLoad:
     llvm_unreachable("Should be handled above");
   case AuthCheckMethod::HighBitsNoTBI:
-    BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg)
+    BuildMI(MBB, MBBI, DL, TII->get(AArch64::EORXrs), TmpReg)
         .addReg(AuthenticatedReg)
         .addReg(AuthenticatedReg)
         .addImm(1);
-    BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX))
+    BuildMI(MBB, MBBI, DL, TII->get(AArch64::TBNZX))
         .addReg(TmpReg)
         .addImm(62)
         .addMBB(BreakBlock);
@@ -292,16 +283,16 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
     assert(AuthenticatedReg == AArch64::LR &&
            "XPACHint mode is only compatible with checking the LR register");
     assert(UseIKey && "XPACHint mode is only compatible with I-keys");
-
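The `HighBitsNoTBI` check sequence that the patch above now emits directly before the tail call (`eor x16, x30, x30, lsl #1` followed by `tbnz x16, #62, ...`) can be modeled in plain C++. This is a sketch of the bit test only; `highBitsCheckFails` and `demoHighBitsCheck` are hypothetical names, and the sample pointer values are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models the two-instruction check: the branch to the BRK block is taken
// exactly when bits 62 and 61 of the checked pointer differ.
bool highBitsCheckFails(std::uint64_t Ptr) {
  const std::uint64_t Tmp = Ptr ^ (Ptr << 1); // eor x16, x30, x30, lsl #1
  return ((Tmp >> 62) & 1) != 0;              // tbnz x16, #62, <fail block>
}

// Usage example with made-up addresses.
void demoHighBitsCheck() {
  assert(!highBitsCheckFails(0x00007ffffffff000ULL)); // bits 62 and 61 clear
  assert(!highBitsCheckFails(0xffff800000001000ULL)); // bits 62 and 61 set
  assert(highBitsCheckFails(0x2000000000001000ULL));  // only bit 61 set
  assert(highBitsCheckFails(0x4000000000001000ULL));  // only bit 62 set
}
```

Because the test needs only the authenticated register and one scratch register, it can be inserted at any position in a block, which is consistent with the patch no longer requiring an instruction before the insertion point.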
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
https://github.com/igorkudrin updated https://github.com/llvm/llvm-project/pull/70898

From f38dc24c2dd940e18eb424746d13cd99e3ffdd91 Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Tue, 7 Nov 2023 18:42:02 -0800
Subject: [PATCH] [YAMLParser] Unfold multi-line scalar values

Long scalar values can be split into multiple lines to improve readability. The rules are described in Section 6.5. "Line Folding", https://yaml.org/spec/1.2.2/#65-line-folding. In addition, for flow scalar styles, the Spec states that "All leading and trailing white space characters on each line are excluded from the content", https://yaml.org/spec/1.2.2/#73-flow-scalar-styles.

The patch implements these unfolding rules for double-quoted, single-quoted, and plain scalars.
---
 llvm/include/llvm/Support/YAMLParser.h | 9 +-
 llvm/lib/Support/YAMLParser.cpp | 373 --
 llvm/test/YAMLParser/spec-05-13.test | 2 +-
 llvm/test/YAMLParser/spec-05-14.test | 2 +-
 llvm/test/YAMLParser/spec-09-01.test | 4 +-
 llvm/test/YAMLParser/spec-09-02.test | 18 +-
 llvm/test/YAMLParser/spec-09-03.test | 6 +-
 llvm/test/YAMLParser/spec-09-04.test | 2 +-
 llvm/test/YAMLParser/spec-09-05.test | 6 +-
 llvm/test/YAMLParser/spec-09-07.test | 4 +-
 llvm/test/YAMLParser/spec-09-08.test | 8 +-
 llvm/test/YAMLParser/spec-09-09.test | 6 +-
 llvm/test/YAMLParser/spec-09-10.test | 2 +-
 llvm/test/YAMLParser/spec-09-11.test | 4 +-
 llvm/test/YAMLParser/spec-09-13.test | 4 +-
 llvm/test/YAMLParser/spec-09-16.test | 8 +-
 llvm/test/YAMLParser/spec-09-17.test | 2 +-
 llvm/test/YAMLParser/spec-10-02.test | 6 +-
 llvm/test/YAMLParser/spec1.2-07-05.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-06.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-09.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-12.test | 2 +-
 llvm/unittests/Support/YAMLParserTest.cpp | 102 ++
 23 files changed, 376 insertions(+), 200 deletions(-)

diff --git a/llvm/include/llvm/Support/YAMLParser.h b/llvm/include/llvm/Support/YAMLParser.h
index f4767641647c217..9d95a1e13a0dff4 100644
--- a/llvm/include/llvm/Support/YAMLParser.h
+++ b/llvm/include/llvm/Support/YAMLParser.h
@@ -240,9 +240,14 @@ class ScalarNode final : public Node {
 private:
   StringRef Value;

-  StringRef unescapeDoubleQuoted(StringRef UnquotedValue,
-                                 StringRef::size_type Start,
+  StringRef getDoubleQuotedValue(StringRef UnquotedValue,
                                  SmallVectorImpl<char> &Storage) const;
+
+  static StringRef getSingleQuotedValue(StringRef RawValue,
+                                        SmallVectorImpl<char> &Storage);
+
+  static StringRef getPlainValue(StringRef RawValue,
+                                 SmallVectorImpl<char> &Storage);
 };

 /// A block scalar node is an opaque datum that can be presented as a
diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index b47cb3ae3b44a75..fdd0ed6e682eb5e 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2030,184 +2030,229 @@ bool Node::failed() const {
 }

 StringRef ScalarNode::getValue(SmallVectorImpl<char> &Storage) const {
-  // TODO: Handle newlines properly. We need to remove leading whitespace.
-  if (Value[0] == '"') { // Double quoted.
-    // Pull off the leading and trailing "s.
-    StringRef UnquotedValue = Value.substr(1, Value.size() - 2);
-    // Search for characters that would require unescaping the value.
-    StringRef::size_type i = UnquotedValue.find_first_of("\\\r\n");
-    if (i != StringRef::npos)
-      return unescapeDoubleQuoted(UnquotedValue, i, Storage);
+  if (Value[0] == '"')
+    return getDoubleQuotedValue(Value, Storage);
+  if (Value[0] == '\'')
+    return getSingleQuotedValue(Value, Storage);
+  return getPlainValue(Value, Storage);
+}
+
+/// parseScalarValue - A common parsing routine for all flow scalar styles.
+/// It handles line break characters by itself, adds regular content characters
+/// to the result, and forwards escaped sequences to the provided routine for
+/// the style-specific processing.
+///
+/// \param UnquotedValue - An input value without quotation marks.
+/// \param Storage - A storage for the result if the input value is multiline or
+/// contains escaped characters.
+/// \param LookupChars - A set of special characters to search in the input
+/// string. Should include line break characters and the escape character
+/// specific for the processing scalar style, if any.
+/// \param UnescapeCallback - This is called when the escape character is found
+/// in the input.
+/// \returns - The unfolded and unescaped value.
+static StringRef
+parseScalarValue(StringRef UnquotedValue, SmallVectorImpl<char> &Storage,
+                 StringRef LookupChars,
+                 std::function<StringRef(StringRef, SmallVectorImpl<char> &)>
+                     UnescapeCallback) {
+  size_t I = UnquotedValue.find_first_of(LookupChars);
+  if (I == StringRef::npos)
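The line-folding rules that the PR #70898 description refers to (a single line break becomes a space, each additional empty line contributes a line feed, and white space around the breaks is dropped) can be sketched in standalone C++. This is not the LLVM implementation and does not use `StringRef`; `unfoldFlowScalar` and `isBlank` are hypothetical names, and the sketch assumes the input has no surrounding quotes and no escape sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true for the white space characters trimmed around breaks.
static bool isBlank(char C) { return C == ' ' || C == '\t'; }

// Unfolds a flow scalar body following the rules summarized above
// (YAML 1.2.2, section 6.5). Quotes and escapes are out of scope here.
std::string unfoldFlowScalar(const std::string &In) {
  std::string Out;
  std::size_t Pos = 0;
  while (true) {
    const std::size_t Break = In.find_first_of("\r\n", Pos);
    if (Break == std::string::npos) {
      Out.append(In, Pos, std::string::npos); // Last line: copy as is.
      return Out;
    }
    // Copy the line without the white space that precedes the break.
    std::size_t End = Break;
    while (End > Pos && isBlank(In[End - 1]))
      --End;
    Out.append(In, Pos, End - Pos);
    // Consume the break and any following empty lines, counting the breaks.
    std::size_t BreakCount = 0;
    Pos = Break;
    while (Pos < In.size() && (In[Pos] == '\r' || In[Pos] == '\n')) {
      // Treat "\r\n" as a single line break.
      if (In[Pos] == '\r' && Pos + 1 < In.size() && In[Pos + 1] == '\n')
        ++Pos;
      ++Pos;
      ++BreakCount;
      // Skip the leading white space of the next line.
      while (Pos < In.size() && isBlank(In[Pos]))
        ++Pos;
    }
    assert(BreakCount >= 1 && "Pos started at a line break");
    if (BreakCount == 1)
      Out.push_back(' '); // A lone break folds into a space.
    else
      Out.append(BreakCount - 1, '\n'); // Empty lines become line feeds.
  }
}
```

For example, under these assumptions `unfoldFlowScalar("a\n b")` yields `a b`, while an input with one empty line between `a` and `b` yields the two joined by a single line feed.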
[llvm-branch-commits] [llvm] b9b9c49 - [YAMLParser] Fix handling escaped line breaks in double-quoted scalars
Author: Igor Kudrin
Date: 2023-11-09T16:12:49-08:00
New Revision: b9b9c49c018a28c46d7709ed3b8c8fcb53036f8f

URL: https://github.com/llvm/llvm-project/commit/b9b9c49c018a28c46d7709ed3b8c8fcb53036f8f
DIFF: https://github.com/llvm/llvm-project/commit/b9b9c49c018a28c46d7709ed3b8c8fcb53036f8f.diff

LOG: [YAMLParser] Fix handling escaped line breaks in double-quoted scalars

Leading white spaces on the line following an escaped line break should be excluded from the content. See https://yaml.org/spec/1.2.2/#731-double-quoted-style.

Added:

Modified:
    llvm/lib/Support/YAMLParser.cpp
    llvm/test/YAMLParser/spec-09-02.test
    llvm/test/YAMLParser/spec-09-04.test
    llvm/test/YAMLParser/spec1.2-07-05.test

Removed:

diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index 17d727b6cc07da8..b47cb3ae3b44a75 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2107,14 +2107,13 @@ StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue
         return "";
       }
     case '\r':
+      // Shrink the Windows-style EOL.
+      if (UnquotedValue.size() >= 2 && UnquotedValue[1] == '\n')
+        UnquotedValue = UnquotedValue.drop_front(1);
+      [[fallthrough]];
     case '\n':
-      // Remove the new line.
-      if (   UnquotedValue.size() > 1
-          && (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n'))
-        UnquotedValue = UnquotedValue.substr(1);
-      // If this was just a single byte newline, it will get skipped
-      // below.
-      break;
+      UnquotedValue = UnquotedValue.drop_front(1).ltrim(" \t");
+      continue;
     case '0':
       Storage.push_back(0x00);
       break;
diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test
index 6b68a00e3fc3e6f..51ea61dd23273d3 100644
--- a/llvm/test/YAMLParser/spec-09-02.test
+++ b/llvm/test/YAMLParser/spec-09-02.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace
-# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none"
+# CHECK: "as space\n trimmed \n specific\L\n escaped\t\n none"

 ## Note: The example was originally taken from Spec 1.1, but the parsing rules
 ## have been changed since then.
diff --git a/llvm/test/YAMLParser/spec-09-04.test b/llvm/test/YAMLParser/spec-09-04.test
index 1e904eaa70992e5..e4f77ea83c7ac5f 100644
--- a/llvm/test/YAMLParser/spec-09-04.test
+++ b/llvm/test/YAMLParser/spec-09-04.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "first\n \tinner 1\t\n inner 2 last"
+# CHECK: "first\n \tinner 1\t\n inner 2 last"

 "first
 inner 1
diff --git a/llvm/test/YAMLParser/spec1.2-07-05.test b/llvm/test/YAMLParser/spec1.2-07-05.test
index 3ea0e5aa37743e4..f923f68d04295f9 100644
--- a/llvm/test/YAMLParser/spec1.2-07-05.test
+++ b/llvm/test/YAMLParser/spec1.2-07-05.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"
+# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"

 "folded
 to a space,
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
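The behavior of the patched `case '\r'` / `case '\n'` branches in the commit above can be mirrored with `std::string` to make the effect easier to see. This is a sketch under stated assumptions rather than the LLVM code: `skipEscapedLineBreak` is a hypothetical name, and its argument is assumed to start at the line break that directly follows the backslash.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the patched switch cases. Returns what remains of
// Rest after dropping the escaped line break and the leading white space of
// the following line.
std::string skipEscapedLineBreak(const std::string &Rest) {
  assert(!Rest.empty() && (Rest[0] == '\r' || Rest[0] == '\n') &&
         "expected a line break right after the backslash");
  std::size_t Pos = 0;
  // Shrink the Windows-style EOL: "\r\n" counts as one break.
  if (Rest[0] == '\r' && Rest.size() >= 2 && Rest[1] == '\n')
    ++Pos;
  ++Pos; // Drop the break itself.
  // Counterpart of ltrim(" \t") in the patch.
  Pos = Rest.find_first_not_of(" \t", Pos);
  return Pos == std::string::npos ? std::string() : Rest.substr(Pos);
}
```

Only one break is consumed and only spaces and tabs are trimmed, so a second line break that follows immediately is left in place for the regular folding logic.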
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
https://github.com/igorkudrin updated https://github.com/llvm/llvm-project/pull/70898

From 37ab3fff62b1a3aa373fd513745b1c2b91b1b865 Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Tue, 7 Nov 2023 18:42:02 -0800
Subject: [PATCH] [YAMLParser] Unfold multi-line scalar values

Long scalar values can be split into multiple lines to improve readability. The rules are described in Section 6.5. "Line Folding", https://yaml.org/spec/1.2.2/#65-line-folding. In addition, for flow scalar styles, the Spec states that "All leading and trailing white space characters on each line are excluded from the content", https://yaml.org/spec/1.2.2/#73-flow-scalar-styles.

The patch implements these unfolding rules for double-quoted, single-quoted, and plain scalars.
---
 llvm/include/llvm/Support/YAMLParser.h | 9 +-
 llvm/lib/Support/YAMLParser.cpp | 373 --
 llvm/test/YAMLParser/spec-05-13.test | 2 +-
 llvm/test/YAMLParser/spec-05-14.test | 2 +-
 llvm/test/YAMLParser/spec-09-01.test | 4 +-
 llvm/test/YAMLParser/spec-09-02.test | 18 +-
 llvm/test/YAMLParser/spec-09-03.test | 6 +-
 llvm/test/YAMLParser/spec-09-04.test | 2 +-
 llvm/test/YAMLParser/spec-09-05.test | 6 +-
 llvm/test/YAMLParser/spec-09-07.test | 4 +-
 llvm/test/YAMLParser/spec-09-08.test | 8 +-
 llvm/test/YAMLParser/spec-09-09.test | 6 +-
 llvm/test/YAMLParser/spec-09-10.test | 2 +-
 llvm/test/YAMLParser/spec-09-11.test | 4 +-
 llvm/test/YAMLParser/spec-09-13.test | 4 +-
 llvm/test/YAMLParser/spec-09-16.test | 8 +-
 llvm/test/YAMLParser/spec-09-17.test | 2 +-
 llvm/test/YAMLParser/spec-10-02.test | 6 +-
 llvm/test/YAMLParser/spec1.2-07-05.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-06.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-09.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-12.test | 2 +-
 llvm/unittests/Support/YAMLParserTest.cpp | 102 ++
 23 files changed, 376 insertions(+), 200 deletions(-)

diff --git a/llvm/include/llvm/Support/YAMLParser.h b/llvm/include/llvm/Support/YAMLParser.h
index f4767641647c217..9d95a1e13a0dff4 100644
--- a/llvm/include/llvm/Support/YAMLParser.h
+++ b/llvm/include/llvm/Support/YAMLParser.h
@@ -240,9 +240,14 @@ class ScalarNode final : public Node {
 private:
   StringRef Value;

-  StringRef unescapeDoubleQuoted(StringRef UnquotedValue,
-                                 StringRef::size_type Start,
+  StringRef getDoubleQuotedValue(StringRef UnquotedValue,
                                  SmallVectorImpl<char> &Storage) const;
+
+  static StringRef getSingleQuotedValue(StringRef RawValue,
+                                        SmallVectorImpl<char> &Storage);
+
+  static StringRef getPlainValue(StringRef RawValue,
+                                 SmallVectorImpl<char> &Storage);
 };

 /// A block scalar node is an opaque datum that can be presented as a
diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index b47cb3ae3b44a75..fdd0ed6e682eb5e 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2030,184 +2030,229 @@ bool Node::failed() const {
 }

 StringRef ScalarNode::getValue(SmallVectorImpl<char> &Storage) const {
-  // TODO: Handle newlines properly. We need to remove leading whitespace.
-  if (Value[0] == '"') { // Double quoted.
-    // Pull off the leading and trailing "s.
-    StringRef UnquotedValue = Value.substr(1, Value.size() - 2);
-    // Search for characters that would require unescaping the value.
-    StringRef::size_type i = UnquotedValue.find_first_of("\\\r\n");
-    if (i != StringRef::npos)
-      return unescapeDoubleQuoted(UnquotedValue, i, Storage);
+  if (Value[0] == '"')
+    return getDoubleQuotedValue(Value, Storage);
+  if (Value[0] == '\'')
+    return getSingleQuotedValue(Value, Storage);
+  return getPlainValue(Value, Storage);
+}
+
+/// parseScalarValue - A common parsing routine for all flow scalar styles.
+/// It handles line break characters by itself, adds regular content characters
+/// to the result, and forwards escaped sequences to the provided routine for
+/// the style-specific processing.
+///
+/// \param UnquotedValue - An input value without quotation marks.
+/// \param Storage - A storage for the result if the input value is multiline or
+/// contains escaped characters.
+/// \param LookupChars - A set of special characters to search in the input
+/// string. Should include line break characters and the escape character
+/// specific for the processing scalar style, if any.
+/// \param UnescapeCallback - This is called when the escape character is found
+/// in the input.
+/// \returns - The unfolded and unescaped value.
+static StringRef
+parseScalarValue(StringRef UnquotedValue, SmallVectorImpl<char> &Storage,
+                 StringRef LookupChars,
+                 std::function<StringRef(StringRef, SmallVectorImpl<char> &)>
+                     UnescapeCallback) {
+  size_t I = UnquotedValue.find_first_of(LookupChars);
+  if (I == StringRef::npos)
[llvm-branch-commits] [llvm] b4e19d2 - [YAMLParser] Fix handling escaped line breaks in double-quoted scalars
Author: Igor Kudrin
Date: 2023-11-09T13:51:04-08:00
New Revision: b4e19d2f0531c99167e3391f3742729c731d9c34

URL: https://github.com/llvm/llvm-project/commit/b4e19d2f0531c99167e3391f3742729c731d9c34
DIFF: https://github.com/llvm/llvm-project/commit/b4e19d2f0531c99167e3391f3742729c731d9c34.diff

LOG: [YAMLParser] Fix handling escaped line breaks in double-quoted scalars

Leading white spaces on the line following an escaped line break should be excluded from the content. See https://yaml.org/spec/1.2.2/#731-double-quoted-style.

Added:

Modified:
    llvm/lib/Support/YAMLParser.cpp
    llvm/test/YAMLParser/spec-09-02.test
    llvm/test/YAMLParser/spec-09-04.test
    llvm/test/YAMLParser/spec1.2-07-05.test

Removed:

diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index 17d727b6cc07da8..b47cb3ae3b44a75 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2107,14 +2107,13 @@ StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue
         return "";
       }
     case '\r':
+      // Shrink the Windows-style EOL.
+      if (UnquotedValue.size() >= 2 && UnquotedValue[1] == '\n')
+        UnquotedValue = UnquotedValue.drop_front(1);
+      [[fallthrough]];
     case '\n':
-      // Remove the new line.
-      if (   UnquotedValue.size() > 1
-          && (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n'))
-        UnquotedValue = UnquotedValue.substr(1);
-      // If this was just a single byte newline, it will get skipped
-      // below.
-      break;
+      UnquotedValue = UnquotedValue.drop_front(1).ltrim(" \t");
+      continue;
     case '0':
       Storage.push_back(0x00);
       break;
diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test
index 6b68a00e3fc3e6f..51ea61dd23273d3 100644
--- a/llvm/test/YAMLParser/spec-09-02.test
+++ b/llvm/test/YAMLParser/spec-09-02.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace
-# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none"
+# CHECK: "as space\n trimmed \n specific\L\n escaped\t\n none"

 ## Note: The example was originally taken from Spec 1.1, but the parsing rules
 ## have been changed since then.
diff --git a/llvm/test/YAMLParser/spec-09-04.test b/llvm/test/YAMLParser/spec-09-04.test
index 1e904eaa70992e5..e4f77ea83c7ac5f 100644
--- a/llvm/test/YAMLParser/spec-09-04.test
+++ b/llvm/test/YAMLParser/spec-09-04.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "first\n \tinner 1\t\n inner 2 last"
+# CHECK: "first\n \tinner 1\t\n inner 2 last"

 "first
 inner 1
diff --git a/llvm/test/YAMLParser/spec1.2-07-05.test b/llvm/test/YAMLParser/spec1.2-07-05.test
index 3ea0e5aa37743e4..f923f68d04295f9 100644
--- a/llvm/test/YAMLParser/spec1.2-07-05.test
+++ b/llvm/test/YAMLParser/spec1.2-07-05.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"
+# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"

 "folded
 to a space,
[llvm-branch-commits] [llvm] [YAMLParser] Fix handling escaped line breaks in double-quoted scalars (PR #71775)
https://github.com/igorkudrin updated https://github.com/llvm/llvm-project/pull/71775

From b4e19d2f0531c99167e3391f3742729c731d9c34 Mon Sep 17 00:00:00 2001
From: Igor Kudrin
Date: Wed, 8 Nov 2023 20:48:49 -0800
Subject: [PATCH] [YAMLParser] Fix handling escaped line breaks in double-quoted scalars

Leading white spaces on the line following an escaped line break should be excluded from the content. See https://yaml.org/spec/1.2.2/#731-double-quoted-style.
---
 llvm/lib/Support/YAMLParser.cpp | 13 ++---
 llvm/test/YAMLParser/spec-09-02.test | 2 +-
 llvm/test/YAMLParser/spec-09-04.test | 2 +-
 llvm/test/YAMLParser/spec1.2-07-05.test | 2 +-
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index 17d727b6cc07da8..b47cb3ae3b44a75 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -2107,14 +2107,13 @@ StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue
         return "";
       }
     case '\r':
+      // Shrink the Windows-style EOL.
+      if (UnquotedValue.size() >= 2 && UnquotedValue[1] == '\n')
+        UnquotedValue = UnquotedValue.drop_front(1);
+      [[fallthrough]];
     case '\n':
-      // Remove the new line.
-      if (   UnquotedValue.size() > 1
-          && (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n'))
-        UnquotedValue = UnquotedValue.substr(1);
-      // If this was just a single byte newline, it will get skipped
-      // below.
-      break;
+      UnquotedValue = UnquotedValue.drop_front(1).ltrim(" \t");
+      continue;
     case '0':
       Storage.push_back(0x00);
       break;
diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test
index 6b68a00e3fc3e6f..51ea61dd23273d3 100644
--- a/llvm/test/YAMLParser/spec-09-02.test
+++ b/llvm/test/YAMLParser/spec-09-02.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace
-# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none"
+# CHECK: "as space\n trimmed \n specific\L\n escaped\t\n none"

 ## Note: The example was originally taken from Spec 1.1, but the parsing rules
 ## have been changed since then.
diff --git a/llvm/test/YAMLParser/spec-09-04.test b/llvm/test/YAMLParser/spec-09-04.test
index 1e904eaa70992e5..e4f77ea83c7ac5f 100644
--- a/llvm/test/YAMLParser/spec-09-04.test
+++ b/llvm/test/YAMLParser/spec-09-04.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "first\n \tinner 1\t\n inner 2 last"
+# CHECK: "first\n \tinner 1\t\n inner 2 last"

 "first
 inner 1
diff --git a/llvm/test/YAMLParser/spec1.2-07-05.test b/llvm/test/YAMLParser/spec1.2-07-05.test
index 3ea0e5aa37743e4..f923f68d04295f9 100644
--- a/llvm/test/YAMLParser/spec1.2-07-05.test
+++ b/llvm/test/YAMLParser/spec1.2-07-05.test
@@ -1,5 +1,5 @@
 # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace
-# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"
+# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content"

 "folded
 to a space,
[llvm-branch-commits] [llvm] 9a6f97c - [YAMLParser] Enable tests for flow scalar styles. NFC
Author: Igor Kudrin Date: 2023-11-09T13:48:06-08:00 New Revision: 9a6f97c327be5a5380c29295a6f73a1ec81ca41d URL: https://github.com/llvm/llvm-project/commit/9a6f97c327be5a5380c29295a6f73a1ec81ca41d DIFF: https://github.com/llvm/llvm-project/commit/9a6f97c327be5a5380c29295a6f73a1ec81ca41d.diff LOG: [YAMLParser] Enable tests for flow scalar styles. NFC This is a preparing commit for #70898 and #71775. It activates checks in tests for single-quoted, double-quoted, and plain values and demonstrates how they are handled currently. Added: llvm/test/YAMLParser/spec1.2-07-05.test llvm/test/YAMLParser/spec1.2-07-06.test llvm/test/YAMLParser/spec1.2-07-09.test llvm/test/YAMLParser/spec1.2-07-12.test Modified: llvm/test/YAMLParser/spec-02-17.test llvm/test/YAMLParser/spec-05-13.test llvm/test/YAMLParser/spec-05-14.test llvm/test/YAMLParser/spec-09-01.test llvm/test/YAMLParser/spec-09-02.test llvm/test/YAMLParser/spec-09-03.test llvm/test/YAMLParser/spec-09-04.test llvm/test/YAMLParser/spec-09-05.test llvm/test/YAMLParser/spec-09-06.test llvm/test/YAMLParser/spec-09-07.test llvm/test/YAMLParser/spec-09-08.test llvm/test/YAMLParser/spec-09-09.test llvm/test/YAMLParser/spec-09-10.test llvm/test/YAMLParser/spec-09-11.test llvm/test/YAMLParser/spec-09-13.test llvm/test/YAMLParser/spec-09-16.test llvm/test/YAMLParser/spec-09-17.test llvm/test/YAMLParser/spec-10-02.test Removed: diff --git a/llvm/test/YAMLParser/spec-02-17.test b/llvm/test/YAMLParser/spec-02-17.test index 2bcb60c8d933bd8..e7b0147a0fcd89f 100644 --- a/llvm/test/YAMLParser/spec-02-17.test +++ b/llvm/test/YAMLParser/spec-02-17.test @@ -1,4 +1,4 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s unicode: "Sosa did fine.\u263A" control: "\b1998\t1999\t2000\n" diff --git a/llvm/test/YAMLParser/spec-05-13.test b/llvm/test/YAMLParser/spec-05-13.test index db62e866a755a32..e7ec42a4aaa80d7 100644 --- a/llvm/test/YAMLParser/spec-05-13.test +++ b/llvm/test/YAMLParser/spec-05-13.test @@ -1,4 +1,5 
@@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace +# CHECK: "Text containing \n both space and\t\n \ttab\tcharacters" "Text containing both space and diff --git a/llvm/test/YAMLParser/spec-05-14.test b/llvm/test/YAMLParser/spec-05-14.test index 65451651b69e96b..984f3721312ab63 100644 --- a/llvm/test/YAMLParser/spec-05-14.test +++ b/llvm/test/YAMLParser/spec-05-14.test @@ -1,4 +1,4 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace "Fun with \\ \" \a \b \e \f \ diff --git a/llvm/test/YAMLParser/spec-09-01.test b/llvm/test/YAMLParser/spec-09-01.test index 8999b4961626470..2b5a6f31166ddf1 100644 --- a/llvm/test/YAMLParser/spec-09-01.test +++ b/llvm/test/YAMLParser/spec-09-01.test @@ -1,4 +1,13 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace +# CHECK: !!map { +# CHECK-NEXT: ? !!str "simple key" +# CHECK-NEXT: : !!map { +# CHECK-NEXT: ? !!str "also simple" +# CHECK-NEXT: : !!str "value", +# CHECK-NEXT: ? !!str "not a\n simple key" +# CHECK-NEXT: : !!str "any\n value", +# CHECK-NEXT: }, +# CHECK-NEXT: } "simple key" : { "also simple" : value, diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test index 3f8e49a8bd31079..6b68a00e3fc3e6f 100644 --- a/llvm/test/YAMLParser/spec-09-02.test +++ b/llvm/test/YAMLParser/spec-09-02.test @@ -1,14 +1,17 @@ -# RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s +# RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace +# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none" - "as space - trimmed +## Note: The example was originally taken from Spec 1.1, but the parsing rules +## have been changed since then. +## * The paragraph-separator character '\u2029' is excluded from line-break +## characters, so the original sequence "escaped\t\\\u2029" is no longer +## considered valid. 
This is replaced by "escaped\t\\\n" in the test source. +## See https://yaml.org/spec/1.2.2/ext/changes/ for details. - specific + "as space + trimmed + specific
 escaped \ + none" - -# FIXME: The string below should actually be -# "as space trimmed\nspecific\nescaped\tnone", but the parser currently has -# a bug when parsing multiline quoted strings. -# CHECK: !!str "as space\n trimmed\n specific\n escaped\t none" diff --git a/llvm/test/YAMLParser/spec-09-03.test b/llvm/test/YAMLParser/spec-09-03.test index 3fb0d8b184abb16..c656058b7ff8b3e 100644 --- a/llvm/test/YAMLParser/spec-09-03.test +++ b/llvm/test/YAMLParser/spec-09-03.test @@ -1,4 +1,9 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
igorkudrin wrote:

> I don't mean to make existing debt your problem, but if it isn't too much
> work could you post a pre-patch that just adds the `FileCheck`s to the
> existing tests where the behavior changes, so the test diff is more
> self-documenting?

* Added #71774 for the tests
* Also added a unittest `YAMLParser.UnfoldsScalarValue` to check various combinations of line breaks and other characters. It seems like a `gtest`-based test suits better than a lit one.

https://github.com/llvm/llvm-project/pull/70898
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
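For readers unfamiliar with what "unfolding" a multi-line scalar means, the following is a minimal standalone sketch of flow line folding as described in the YAML 1.2.2 spec: a single line break becomes a space, each additional empty line becomes a line feed, and white space around the break is dropped. It is not the code from the PR; `unfoldFlowScalar` is a hypothetical name, the input is assumed to be the scalar text with the quotes already removed, and escape sequences are deliberately ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of LLVM. Folds the line breaks of a flow
// scalar whose surrounding quotes have already been stripped. Escape
// sequences are not interpreted in this sketch.
std::string unfoldFlowScalar(const std::string &Raw) {
  const std::string White = " \t\r";

  // Split on '\n'. A '\r' left over from "\r\n" is trimmed as white space.
  std::vector<std::string> Lines;
  std::size_t Start = 0;
  for (std::size_t Pos; (Pos = Raw.find('\n', Start)) != std::string::npos;
       Start = Pos + 1)
    Lines.push_back(Raw.substr(Start, Pos - Start));
  Lines.push_back(Raw.substr(Start));

  // Trim leading white space on continuation lines and trailing white space
  // on every line that is followed by a line break.
  for (std::size_t I = 0; I < Lines.size(); ++I) {
    std::string &Line = Lines[I];
    if (I > 0) {
      std::size_t First = Line.find_first_not_of(White);
      Line.erase(0, First == std::string::npos ? Line.size() : First);
    }
    if (I + 1 < Lines.size()) {
      std::size_t Last = Line.find_last_not_of(White);
      Line.erase(Last == std::string::npos ? 0 : Last + 1);
    }
  }

  // Join: one break folds to a space, N empty lines fold to N line feeds.
  std::string Result = Lines[0];
  std::size_t PendingEmpty = 0;
  for (std::size_t I = 1; I < Lines.size(); ++I) {
    if (Lines[I].empty() && I + 1 < Lines.size()) {
      ++PendingEmpty;
      continue;
    }
    if (PendingEmpty == 0)
      Result.push_back(' ');
    else
      Result.append(PendingEmpty, '\n');
    Result += Lines[I];
    PendingEmpty = 0;
  }
  return Result;
}
```

Applied to the text of spec example 7.6 (`" 1st non-empty\n\n 2nd non-empty \n\t3rd non-empty "`), this sketch keeps the leading space of the first line and the trailing space of the last line, folds the empty line into a line feed, and folds the remaining break into a single space.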
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
@@ -2030,187 +2030,219 @@ bool Node::failed() const { } StringRef ScalarNode::getValue(SmallVectorImpl ) const { - // TODO: Handle newlines properly. We need to remove leading whitespace. - if (Value[0] == '"') { // Double quoted. -// Pull off the leading and trailing "s. -StringRef UnquotedValue = Value.substr(1, Value.size() - 2); -// Search for characters that would require unescaping the value. -StringRef::size_type i = UnquotedValue.find_first_of("\\\r\n"); -if (i != StringRef::npos) - return unescapeDoubleQuoted(UnquotedValue, i, Storage); + if (Value[0] == '"') +return getDoubleQuotedValue(Value, Storage); + if (Value[0] == '\'') +return getSingleQuotedValue(Value, Storage); + return getPlainValue(Value, Storage); +} + +static StringRef +parseScalarValue(StringRef UnquotedValue, SmallVectorImpl , + StringRef LookupChars, igorkudrin wrote: Added a description for the function and its arguments. https://github.com/llvm/llvm-project/pull/70898 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
@@ -2030,187 +2030,219 @@ bool Node::failed() const { } StringRef ScalarNode::getValue(SmallVectorImpl ) const { - // TODO: Handle newlines properly. We need to remove leading whitespace. - if (Value[0] == '"') { // Double quoted. -// Pull off the leading and trailing "s. -StringRef UnquotedValue = Value.substr(1, Value.size() - 2); -// Search for characters that would require unescaping the value. -StringRef::size_type i = UnquotedValue.find_first_of("\\\r\n"); -if (i != StringRef::npos) - return unescapeDoubleQuoted(UnquotedValue, i, Storage); + if (Value[0] == '"') +return getDoubleQuotedValue(Value, Storage); + if (Value[0] == '\'') +return getSingleQuotedValue(Value, Storage); + return getPlainValue(Value, Storage); +} + +static StringRef +parseScalarValue(StringRef UnquotedValue, SmallVectorImpl , + StringRef LookupChars, + std::function &)> + UnescapeCallback) { + size_t I = UnquotedValue.find_first_of(LookupChars); + if (I == StringRef::npos) return UnquotedValue; - } else if (Value[0] == '\'') { // Single quoted. -// Pull off the leading and trailing 's. -StringRef UnquotedValue = Value.substr(1, Value.size() - 2); -StringRef::size_type i = UnquotedValue.find('\''); -if (i != StringRef::npos) { - // We're going to need Storage. - Storage.clear(); - Storage.reserve(UnquotedValue.size()); - for (; i != StringRef::npos; i = UnquotedValue.find('\'')) { -StringRef Valid(UnquotedValue.begin(), i); -llvm::append_range(Storage, Valid); -Storage.push_back('\''); -UnquotedValue = UnquotedValue.substr(i + 2); - } - llvm::append_range(Storage, UnquotedValue); - return StringRef(Storage.begin(), Storage.size()); -} -return UnquotedValue; - } - // Plain. - // Trim whitespace ('b-char' and 's-white'). - // NOTE: Alternatively we could change the scanner to not include whitespace - // here in the first place. 
- return Value.rtrim("\x0A\x0D\x20\x09"); -} -StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue - , StringRef::size_type i - , SmallVectorImpl ) - const { - // Use Storage to build proper value. Storage.clear(); Storage.reserve(UnquotedValue.size()); - for (; i != StringRef::npos; i = UnquotedValue.find_first_of("\\\r\n")) { -// Insert all previous chars into Storage. -StringRef Valid(UnquotedValue.begin(), i); -llvm::append_range(Storage, Valid); -// Chop off inserted chars. -UnquotedValue = UnquotedValue.substr(i); - -assert(!UnquotedValue.empty() && "Can't be empty!"); - -// Parse escape or line break. -switch (UnquotedValue[0]) { -case '\r': -case '\n': - Storage.push_back('\n'); - if ( UnquotedValue.size() > 1 - && (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n')) -UnquotedValue = UnquotedValue.substr(1); - UnquotedValue = UnquotedValue.substr(1); - break; -default: - if (UnquotedValue.size() == 1) { -Token T; -T.Range = StringRef(UnquotedValue.begin(), 1); -setError("Unrecognized escape code", T); -return ""; - } - UnquotedValue = UnquotedValue.substr(1); - switch (UnquotedValue[0]) { - default: { - Token T; - T.Range = StringRef(UnquotedValue.begin(), 1); - setError("Unrecognized escape code", T); - return ""; -} - case '\r': - case '\n': -// Remove the new line. -if ( UnquotedValue.size() > 1 -&& (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n')) - UnquotedValue = UnquotedValue.substr(1); -// If this was just a single byte newline, it will get skipped -// below. 
-break; - case '0': -Storage.push_back(0x00); -break; - case 'a': -Storage.push_back(0x07); -break; - case 'b': -Storage.push_back(0x08); -break; - case 't': - case 0x09: -Storage.push_back(0x09); -break; - case 'n': -Storage.push_back(0x0A); -break; - case 'v': -Storage.push_back(0x0B); -break; - case 'f': -Storage.push_back(0x0C); -break; - case 'r': -Storage.push_back(0x0D); -break; - case 'e': -Storage.push_back(0x1B); -break; + char LastNewLineAddedAs = '\0'; + for (; I != StringRef::npos; I = UnquotedValue.find_first_of(LookupChars)) { +if (UnquotedValue[I] != '\x0D' && UnquotedValue[I] != '\x0A') { igorkudrin wrote: It was an idea to be a bit closer to the spec, where all special characters are defined by their value. I changed them back to mnemonics in the last update.
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
@@ -2030,187 +2030,219 @@ bool Node::failed() const { } StringRef ScalarNode::getValue(SmallVectorImpl ) const { - // TODO: Handle newlines properly. We need to remove leading whitespace. - if (Value[0] == '"') { // Double quoted. -// Pull off the leading and trailing "s. -StringRef UnquotedValue = Value.substr(1, Value.size() - 2); -// Search for characters that would require unescaping the value. -StringRef::size_type i = UnquotedValue.find_first_of("\\\r\n"); -if (i != StringRef::npos) - return unescapeDoubleQuoted(UnquotedValue, i, Storage); + if (Value[0] == '"') +return getDoubleQuotedValue(Value, Storage); + if (Value[0] == '\'') +return getSingleQuotedValue(Value, Storage); + return getPlainValue(Value, Storage); +} + +static StringRef +parseScalarValue(StringRef UnquotedValue, SmallVectorImpl , + StringRef LookupChars, + std::function &)> + UnescapeCallback) { + size_t I = UnquotedValue.find_first_of(LookupChars); + if (I == StringRef::npos) return UnquotedValue; - } else if (Value[0] == '\'') { // Single quoted. -// Pull off the leading and trailing 's. -StringRef UnquotedValue = Value.substr(1, Value.size() - 2); -StringRef::size_type i = UnquotedValue.find('\''); -if (i != StringRef::npos) { - // We're going to need Storage. - Storage.clear(); - Storage.reserve(UnquotedValue.size()); - for (; i != StringRef::npos; i = UnquotedValue.find('\'')) { -StringRef Valid(UnquotedValue.begin(), i); -llvm::append_range(Storage, Valid); -Storage.push_back('\''); -UnquotedValue = UnquotedValue.substr(i + 2); - } - llvm::append_range(Storage, UnquotedValue); - return StringRef(Storage.begin(), Storage.size()); -} -return UnquotedValue; - } - // Plain. - // Trim whitespace ('b-char' and 's-white'). - // NOTE: Alternatively we could change the scanner to not include whitespace - // here in the first place. 
- return Value.rtrim("\x0A\x0D\x20\x09"); -} -StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue - , StringRef::size_type i - , SmallVectorImpl ) - const { - // Use Storage to build proper value. Storage.clear(); Storage.reserve(UnquotedValue.size()); - for (; i != StringRef::npos; i = UnquotedValue.find_first_of("\\\r\n")) { -// Insert all previous chars into Storage. -StringRef Valid(UnquotedValue.begin(), i); -llvm::append_range(Storage, Valid); -// Chop off inserted chars. -UnquotedValue = UnquotedValue.substr(i); - -assert(!UnquotedValue.empty() && "Can't be empty!"); - -// Parse escape or line break. -switch (UnquotedValue[0]) { -case '\r': -case '\n': - Storage.push_back('\n'); - if ( UnquotedValue.size() > 1 - && (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n')) -UnquotedValue = UnquotedValue.substr(1); - UnquotedValue = UnquotedValue.substr(1); - break; -default: - if (UnquotedValue.size() == 1) { -Token T; -T.Range = StringRef(UnquotedValue.begin(), 1); -setError("Unrecognized escape code", T); -return ""; - } - UnquotedValue = UnquotedValue.substr(1); - switch (UnquotedValue[0]) { - default: { - Token T; - T.Range = StringRef(UnquotedValue.begin(), 1); - setError("Unrecognized escape code", T); - return ""; -} - case '\r': - case '\n': -// Remove the new line. -if ( UnquotedValue.size() > 1 -&& (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n')) - UnquotedValue = UnquotedValue.substr(1); -// If this was just a single byte newline, it will get skipped -// below. 
-break; - case '0': -Storage.push_back(0x00); -break; - case 'a': -Storage.push_back(0x07); -break; - case 'b': -Storage.push_back(0x08); -break; - case 't': - case 0x09: -Storage.push_back(0x09); -break; - case 'n': -Storage.push_back(0x0A); -break; - case 'v': -Storage.push_back(0x0B); -break; - case 'f': -Storage.push_back(0x0C); -break; - case 'r': -Storage.push_back(0x0D); -break; - case 'e': -Storage.push_back(0x1B); -break; + char LastNewLineAddedAs = '\0'; + for (; I != StringRef::npos; I = UnquotedValue.find_first_of(LookupChars)) { +if (UnquotedValue[I] != '\x0D' && UnquotedValue[I] != '\x0A') { + llvm::append_range(Storage, UnquotedValue.take_front(I)); + UnquotedValue = UnescapeCallback(UnquotedValue.drop_front(I), Storage); + LastNewLineAddedAs = '\0'; + continue; +} +if
[llvm-branch-commits] [llvm] [YAMLParser] Fix handling escaped line breaks in double-quoted scalars (PR #71775)
https://github.com/igorkudrin edited https://github.com/llvm/llvm-project/pull/71775
[llvm-branch-commits] [llvm] [YAMLParser] Unfold multi-line scalar values (PR #70898)
https://github.com/igorkudrin edited https://github.com/llvm/llvm-project/pull/70898
[llvm-branch-commits] [llvm] 69bd4da - [YAMLParser] Fix handling escaped line breaks in double-quoted scalars
Author: Igor Kudrin Date: 2023-11-08T21:02:13-08:00 New Revision: 69bd4da46c438ce23ec0773f1d38abee800e6ed4 URL: https://github.com/llvm/llvm-project/commit/69bd4da46c438ce23ec0773f1d38abee800e6ed4 DIFF: https://github.com/llvm/llvm-project/commit/69bd4da46c438ce23ec0773f1d38abee800e6ed4.diff LOG: [YAMLParser] Fix handling escaped line breaks in double-quoted scalars Leading white spaces on the line following an escaped line break should be excluded from the content. See https://yaml.org/spec/1.2.2/#731-double-quoted-style. Added: Modified: llvm/lib/Support/YAMLParser.cpp llvm/test/YAMLParser/spec-09-02.test llvm/test/YAMLParser/spec-09-04.test llvm/test/YAMLParser/spec1.2-07-05.test Removed: diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp index 17d727b6cc07da8..b47cb3ae3b44a75 100644 --- a/llvm/lib/Support/YAMLParser.cpp +++ b/llvm/lib/Support/YAMLParser.cpp @@ -2107,14 +2107,13 @@ StringRef ScalarNode::unescapeDoubleQuoted( StringRef UnquotedValue return ""; } case '\r': +// Shrink the Windows-style EOL. +if (UnquotedValue.size() >= 2 && UnquotedValue[1] == '\n') + UnquotedValue = UnquotedValue.drop_front(1); +[[fallthrough]]; case '\n': -// Remove the new line. -if ( UnquotedValue.size() > 1 -&& (UnquotedValue[1] == '\r' || UnquotedValue[1] == '\n')) - UnquotedValue = UnquotedValue.substr(1); -// If this was just a single byte newline, it will get skipped -// below. 
-break; +UnquotedValue = UnquotedValue.drop_front(1).ltrim(" \t"); +continue; case '0': Storage.push_back(0x00); break; diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test index 6b68a00e3fc3e6f..51ea61dd23273d3 100644 --- a/llvm/test/YAMLParser/spec-09-02.test +++ b/llvm/test/YAMLParser/spec-09-02.test @@ -1,5 +1,5 @@ # RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace -# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none" +# CHECK: "as space\n trimmed \n specific\L\n escaped\t\n none" ## Note: The example was originally taken from Spec 1.1, but the parsing rules ## have been changed since then. diff --git a/llvm/test/YAMLParser/spec-09-04.test b/llvm/test/YAMLParser/spec-09-04.test index 1e904eaa70992e5..e4f77ea83c7ac5f 100644 --- a/llvm/test/YAMLParser/spec-09-04.test +++ b/llvm/test/YAMLParser/spec-09-04.test @@ -1,5 +1,5 @@ # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace -# CHECK: "first\n \tinner 1\t\n inner 2 last" +# CHECK: "first\n \tinner 1\t\n inner 2 last" "first inner 1 diff --git a/llvm/test/YAMLParser/spec1.2-07-05.test b/llvm/test/YAMLParser/spec1.2-07-05.test index 3ea0e5aa37743e4..f923f68d04295f9 100644 --- a/llvm/test/YAMLParser/spec1.2-07-05.test +++ b/llvm/test/YAMLParser/spec1.2-07-05.test @@ -1,5 +1,5 @@ # RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace -# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content" +# CHECK: "folded \nto a space,\t\n \nto a line feed, or \t \tnon-content" "folded to a space, ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 5784f20 - [YAMLParser] Enable tests for flow scalar styles
Author: Igor Kudrin Date: 2023-11-08T19:20:14-08:00 New Revision: 5784f2014981cdd16095e737d1d128a2995a3dbd URL: https://github.com/llvm/llvm-project/commit/5784f2014981cdd16095e737d1d128a2995a3dbd DIFF: https://github.com/llvm/llvm-project/commit/5784f2014981cdd16095e737d1d128a2995a3dbd.diff LOG: [YAMLParser] Enable tests for flow scalar styles This is a preparing commit for #70898. It activates checks in tests for single-quoted, double-quoted, and plain values and demonstrates how they are handled currently. Added: llvm/test/YAMLParser/spec1.2-07-05.test llvm/test/YAMLParser/spec1.2-07-06.test llvm/test/YAMLParser/spec1.2-07-09.test llvm/test/YAMLParser/spec1.2-07-12.test Modified: llvm/test/YAMLParser/spec-02-17.test llvm/test/YAMLParser/spec-05-13.test llvm/test/YAMLParser/spec-05-14.test llvm/test/YAMLParser/spec-09-01.test llvm/test/YAMLParser/spec-09-02.test llvm/test/YAMLParser/spec-09-03.test llvm/test/YAMLParser/spec-09-04.test llvm/test/YAMLParser/spec-09-05.test llvm/test/YAMLParser/spec-09-06.test llvm/test/YAMLParser/spec-09-07.test llvm/test/YAMLParser/spec-09-08.test llvm/test/YAMLParser/spec-09-09.test llvm/test/YAMLParser/spec-09-10.test llvm/test/YAMLParser/spec-09-11.test llvm/test/YAMLParser/spec-09-13.test llvm/test/YAMLParser/spec-09-16.test llvm/test/YAMLParser/spec-09-17.test llvm/test/YAMLParser/spec-10-02.test Removed: diff --git a/llvm/test/YAMLParser/spec-02-17.test b/llvm/test/YAMLParser/spec-02-17.test index 2bcb60c8d933bd8..e7b0147a0fcd89f 100644 --- a/llvm/test/YAMLParser/spec-02-17.test +++ b/llvm/test/YAMLParser/spec-02-17.test @@ -1,4 +1,4 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s unicode: "Sosa did fine.\u263A" control: "\b1998\t1999\t2000\n" diff --git a/llvm/test/YAMLParser/spec-05-13.test b/llvm/test/YAMLParser/spec-05-13.test index db62e866a755a32..e7ec42a4aaa80d7 100644 --- a/llvm/test/YAMLParser/spec-05-13.test +++ b/llvm/test/YAMLParser/spec-05-13.test @@ -1,4 +1,5 @@ -# RUN: 
yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace +# CHECK: "Text containing \n both space and\t\n \ttab\tcharacters" "Text containing both space and diff --git a/llvm/test/YAMLParser/spec-05-14.test b/llvm/test/YAMLParser/spec-05-14.test index 65451651b69e96b..984f3721312ab63 100644 --- a/llvm/test/YAMLParser/spec-05-14.test +++ b/llvm/test/YAMLParser/spec-05-14.test @@ -1,4 +1,4 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace "Fun with \\ \" \a \b \e \f \ diff --git a/llvm/test/YAMLParser/spec-09-01.test b/llvm/test/YAMLParser/spec-09-01.test index 8999b4961626470..2b5a6f31166ddf1 100644 --- a/llvm/test/YAMLParser/spec-09-01.test +++ b/llvm/test/YAMLParser/spec-09-01.test @@ -1,4 +1,13 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace +# CHECK: !!map { +# CHECK-NEXT: ? !!str "simple key" +# CHECK-NEXT: : !!map { +# CHECK-NEXT: ? !!str "also simple" +# CHECK-NEXT: : !!str "value", +# CHECK-NEXT: ? !!str "not a\n simple key" +# CHECK-NEXT: : !!str "any\n value", +# CHECK-NEXT: }, +# CHECK-NEXT: } "simple key" : { "also simple" : value, diff --git a/llvm/test/YAMLParser/spec-09-02.test b/llvm/test/YAMLParser/spec-09-02.test index 3f8e49a8bd31079..6b68a00e3fc3e6f 100644 --- a/llvm/test/YAMLParser/spec-09-02.test +++ b/llvm/test/YAMLParser/spec-09-02.test @@ -1,14 +1,17 @@ -# RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s +# RUN: yaml-bench -canonical %s 2>&1 | FileCheck %s --strict-whitespace +# CHECK: "as space\n trimmed \n specific\L\n escaped\t \n none" - "as space - trimmed +## Note: The example was originally taken from Spec 1.1, but the parsing rules +## have been changed since then. +## * The paragraph-separator character '\u2029' is excluded from line-break +## characters, so the original sequence "escaped\t\\\u2029" is no longer +## considered valid. This is replaced by "escaped\t\\\n" in the test source. 
+## See https://yaml.org/spec/1.2.2/ext/changes/ for details. - specific + "as space + trimmed + specific
 escaped \ + none" - -# FIXME: The string below should actually be -# "as space trimmed\nspecific\nescaped\tnone", but the parser currently has -# a bug when parsing multiline quoted strings. -# CHECK: !!str "as space\n trimmed\n specific\n escaped\t none" diff --git a/llvm/test/YAMLParser/spec-09-03.test b/llvm/test/YAMLParser/spec-09-03.test index 3fb0d8b184abb16..c656058b7ff8b3e 100644 --- a/llvm/test/YAMLParser/spec-09-03.test +++ b/llvm/test/YAMLParser/spec-09-03.test @@ -1,4 +1,9 @@ -# RUN: yaml-bench -canonical %s +# RUN: yaml-bench -canonical %s | FileCheck %s --strict-whitespace +#
[llvm-branch-commits] [libcxx] 7803636 - [libcxx testing] Fix UB in tests for std::lock_guard
Author: Igor Kudrin Date: 2021-01-15T16:11:45+07:00 New Revision: 78036360573c35ea9e6a697d2eed92db893b4850 URL: https://github.com/llvm/llvm-project/commit/78036360573c35ea9e6a697d2eed92db893b4850 DIFF: https://github.com/llvm/llvm-project/commit/78036360573c35ea9e6a697d2eed92db893b4850.diff LOG: [libcxx testing] Fix UB in tests for std::lock_guard If mutex::try_lock() is called in a thread that already owns the mutex, the behavior is undefined. The patch fixes the issue by creating another thread, where the call is allowed. Differential Revision: https://reviews.llvm.org/D94656 Added: Modified: libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/adopt_lock.pass.cpp libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/mutex.pass.cpp Removed: diff --git a/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/adopt_lock.pass.cpp b/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/adopt_lock.pass.cpp index 5135dbcef816..db6a2e35f9c5 100644 --- a/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/adopt_lock.pass.cpp +++ b/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/adopt_lock.pass.cpp @@ -18,15 +18,21 @@ #include #include +#include "make_test_thread.h" #include "test_macros.h" std::mutex m; +void do_try_lock() { + assert(m.try_lock() == false); +} + int main(int, char**) { { m.lock(); std::lock_guard lg(m, std::adopt_lock); -assert(m.try_lock() == false); +std::thread t = support::make_test_thread(do_try_lock); +t.join(); } m.lock(); diff --git a/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/mutex.pass.cpp b/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/mutex.pass.cpp index 0e096eabe4b6..5dcecd344c36 100644 --- a/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/mutex.pass.cpp +++ b/libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.guard/mutex.pass.cpp @@ -21,14 +21,20 @@ #include #include +#include 
"make_test_thread.h" #include "test_macros.h" std::mutex m; +void do_try_lock() { + assert(m.try_lock() == false); +} + int main(int, char**) { { std::lock_guard lg(m); -assert(m.try_lock() == false); +std::thread t = support::make_test_thread(do_try_lock); +t.join(); } m.lock(); ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits