[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-23 Thread Pavel Labath via lldb-commits

https://github.com/labath closed https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-23 Thread Pavel Labath via lldb-commits

https://github.com/labath updated 
https://github.com/llvm/llvm-project/pull/93006

>From 03ab6febc0f40d0f39f533a5b4fd7c33a6728ae1 Mon Sep 17 00:00:00 2001
From: Pavel Labath 
Date: Tue, 21 May 2024 15:31:23 +
Subject: [PATCH 1/2] [lldb/DWARF] Make sure bad abbreviation codes do not
 crash lldb

We currently cannot represent abbreviation codes with more than 16 bits,
and we were lldb-asserting if we ever ran into one. While I haven't seen
any real DWARF with these kinds of abbreviations, it is possible to hit
this with handcrafted evil dwarf, due some sort of corruptions, or just
bugs (the addition of PeekDIEName makes these bugs more likely, as the
function blindly dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns
sure that large abbreviations into an error as well, and adds a test for
both cases.
---
 .../SymbolFile/DWARF/DWARFDebugInfoEntry.cpp  | 34 +++---
 .../DWARF/x86/invalid_abbreviation.s  | 47 +++
 2 files changed, 63 insertions(+), 18 deletions(-)
 create mode 100644 lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp 
b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
index 1b0fefedf9836..4357ccb2f5c9f 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 #include "llvm/Support/LEB128.h"
@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   const DWARFUnit *cu,
   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
+"[{0:x16}]: {1}, please file a bug and "
+"attach the file at the start of this error message",
+static_cast(m_offset), llvm::formatv(fmt, vals...));
+*offset_ptr = std::numeric_limits::max();
+return false;
+  };
+
   m_parent_idx = 0;
   m_sibling_idx = 0;
   const uint64_t abbr_idx = data.GetULEB128(offset_ptr);
-  lldbassert(abbr_idx <= UINT16_MAX);
+  if (abbr_idx > std::numeric_limits::max())
+return report_error("abbreviation code {0} too big", abbr_idx);
   m_abbr_idx = abbr_idx;
 
   if (m_abbr_idx == 0) {
@@ -57,16 +68,9 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   }
 
   const auto *abbrevDecl = GetAbbreviationDeclarationPtr(cu);
-  if (abbrevDecl == nullptr) {
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: invalid abbreviation code {1}, "
-"please file a bug and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)abbr_idx);
-// WE can't parse anymore if the DWARF is borked...
-*offset_ptr = UINT32_MAX;
-return false;
-  }
+  if (abbrevDecl == nullptr)
+return report_error("invalid abbreviation code {0}", abbr_idx);
+
   m_tag = abbrevDecl->getTag();
   m_has_children = abbrevDecl->hasChildren();
   // Skip all data in the .debug_info or .debug_types for the attributes
@@ -74,13 +78,7 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
 if (DWARFFormValue::SkipValue(attribute.Form, data, offset_ptr, cu))
   continue;
 
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: Unsupported DW_FORM_{1:x}, please file a bug "
-"and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)attribute.Form);
-*offset_ptr = m_offset;
-return false;
+return report_error("Unsupported DW_FORM_{1:x}", attribute.Form);
   }
   return true;
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s 
b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
new file mode 100644
index 0..3f32c037aeb20
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
+# RUN: %lldb %t \
+# RUN:   -o exit 2>&1 | FileCheck %s
+
+# CHECK-DAG: error: {{.*}} [0x0022]: abbreviation code 65536 too 
big, please file a bug and attach the file at the start of this error message
+# CHECK-DAG: error: {{.*}} [0x0048]: invalid abbreviation code 47, 
please file a bug and attach the file at the start of this error message
+
+
+.section.debug_abbrev,"",@progbits
+.uleb128 65535  # Largest representable Abbreviation 
Code
+.byte   17  # DW_TAG_compile_unit
+.byte   1   # DW_CHILDREN_yes
+.byte   37  # DW_AT_producer
+

[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-23 Thread Pavel Labath via lldb-commits

labath wrote:

> We have to back out the PeekDieName() patch locally at Meta because it was 
> crashing us due to this assertion due to .debug_names tables having incorrect 
> values for type units. Only one type unit will appear in the final file and 
> type units can have differing contents for the same type. This means 
> accelerator table entries from .o file that had a type unit, but its type 
> unit didn't end up in the final output file, can be bogus and not point to 
> valid DIE offsets which can cause PeekDieName to parse at random offsets in a 
> type unit and crash or report an error now. So we might need an extra bool to 
> be passed to the `DWARFDebugInfoEntry::Extract(...)` function that says to 
> report an error and have PeekDieName call this with "report_errors = false". 
> Right now many of these entries will cause a large numver of errors to be 
> reported. This is being fixed by verifying that type unit accelerator table 
> entries are matched to the right type unit, but that PR isn't in yet.

I think that is orthogonal to this patch, but it is troubling nonetheless 
(thank you for bringing it to my attention). If you want to go down this path, 
you might want to consider something like call_once (or call_once_every_N_sec) 
to make sure there is some record of the issue somewhere. If not, I might get 
around to this at some point.

https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-23 Thread Pavel Labath via lldb-commits


@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   const DWARFUnit *cu,
   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(

labath wrote:

All of the callers are in DWARFUnit (passing `this`), but I'm not going to pass 
up an opportunity to add more references. :P

https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Greg Clayton via lldb-commits

https://github.com/clayborg commented:

We have to back out the PeekDieName() patch locally at Meta because it was 
crashing us due to this assertion due to .debug_names tables having incorrect 
values for type units. Only one type unit will appear in the final file and 
type units can have differing contents for the same type. This means 
accelerator table entries from .o file that had a type unit, but its type unit 
didn't end up in the final output file, can be bogus and not point to valid DIE 
offsets which can cause PeekDieName to parse at random  offsets in a type unit 
and crash or report an error now. So we might need an extra bool to be passed 
to the `DWARFDebugInfoEntry::Extract(...)` function that says to report an 
error and have PeekDieName call this with "report_errors = false". Right now 
many of these entries will cause a large numver of errors to be reported. This 
is being fixed by verifying that type unit accelerator table entries are 
matched to the right type unit, but that PR isn't in yet.

https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Jonas Devlieghere via lldb-commits

https://github.com/JDevlieghere edited 
https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Jonas Devlieghere via lldb-commits


@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   const DWARFUnit *cu,
   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(

JDevlieghere wrote:

Can we assume `cu` is always a valid pointer? If so, should the signature take 
it by const reference instead? 

AFAIK only `GetAbbreviationDeclarationPtr` checks that it's not NULL but we 
have a code path that calls the lambda before as well as when `abbrevDecl` is 
NULL which would happen if CU is NULL. 

https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Alex Langford via lldb-commits

https://github.com/bulbazord approved this pull request.


https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Felipe de Azevedo Piovezan via lldb-commits

https://github.com/felipepiovezan approved this pull request.

LGTM! Thanks for catching this

https://github.com/llvm/llvm-project/pull/93006
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread via lldb-commits

llvmbot wrote:




@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)


Changes

We currently cannot represent abbreviation codes with more than 16 bits, and we 
were lldb-asserting if we ever ran into one. While I haven't seen any real 
DWARF with these kinds of abbreviations, it is possible to hit this with 
handcrafted evil dwarf, due some sort of corruptions, or just bugs (the 
addition of PeekDIEName makes these bugs more likely, as the function blindly 
dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns sure 
that large abbreviations into an error as well, and adds a test for both cases.

---
Full diff: https://github.com/llvm/llvm-project/pull/93006.diff


2 Files Affected:

- (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp 
(+16-18) 
- (added) lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s (+47) 


``diff
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp 
b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
index 1b0fefedf9836..4357ccb2f5c9f 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 #include "llvm/Support/LEB128.h"
@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   const DWARFUnit *cu,
   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
+"[{0:x16}]: {1}, please file a bug and "
+"attach the file at the start of this error message",
+static_cast(m_offset), llvm::formatv(fmt, vals...));
+*offset_ptr = std::numeric_limits::max();
+return false;
+  };
+
   m_parent_idx = 0;
   m_sibling_idx = 0;
   const uint64_t abbr_idx = data.GetULEB128(offset_ptr);
-  lldbassert(abbr_idx <= UINT16_MAX);
+  if (abbr_idx > std::numeric_limits::max())
+return report_error("abbreviation code {0} too big", abbr_idx);
   m_abbr_idx = abbr_idx;
 
   if (m_abbr_idx == 0) {
@@ -57,16 +68,9 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   }
 
   const auto *abbrevDecl = GetAbbreviationDeclarationPtr(cu);
-  if (abbrevDecl == nullptr) {
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: invalid abbreviation code {1}, "
-"please file a bug and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)abbr_idx);
-// WE can't parse anymore if the DWARF is borked...
-*offset_ptr = UINT32_MAX;
-return false;
-  }
+  if (abbrevDecl == nullptr)
+return report_error("invalid abbreviation code {0}", abbr_idx);
+
   m_tag = abbrevDecl->getTag();
   m_has_children = abbrevDecl->hasChildren();
   // Skip all data in the .debug_info or .debug_types for the attributes
@@ -74,13 +78,7 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
 if (DWARFFormValue::SkipValue(attribute.Form, data, offset_ptr, cu))
   continue;
 
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: Unsupported DW_FORM_{1:x}, please file a bug "
-"and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)attribute.Form);
-*offset_ptr = m_offset;
-return false;
+return report_error("Unsupported DW_FORM_{1:x}", attribute.Form);
   }
   return true;
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s 
b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
new file mode 100644
index 0..3f32c037aeb20
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
+# RUN: %lldb %t \
+# RUN:   -o exit 2>&1 | FileCheck %s
+
+# CHECK-DAG: error: {{.*}} [0x0022]: abbreviation code 65536 too 
big, please file a bug and attach the file at the start of this error message
+# CHECK-DAG: error: {{.*}} [0x0048]: invalid abbreviation code 47, 
please file a bug and attach the file at the start of this error message
+
+
+.section.debug_abbrev,"",@progbits
+.uleb128 65535  # Largest representable Abbreviation 
Code
+.byte   17  # DW_TAG_compile_unit
+.byte   1   # DW_CHILDREN_yes
+.byte   37  # DW_AT_producer
+.byte   8   # DW_FORM_string
+.byte   0   # EOM(1)
+.byte   0   # EOM(2)
+.byte   0   # EOM(3)
+
+

[Lldb-commits] [lldb] [lldb/DWARF] Make sure bad abbreviation codes do not crash lldb (PR #93006)

2024-05-22 Thread Pavel Labath via lldb-commits

https://github.com/labath created 
https://github.com/llvm/llvm-project/pull/93006

We currently cannot represent abbreviation codes with more than 16 bits, and we 
were lldb-asserting if we ever ran into one. While I haven't seen any real 
DWARF with these kinds of abbreviations, it is possible to hit this with 
handcrafted evil dwarf, due some sort of corruptions, or just bugs (the 
addition of PeekDIEName makes these bugs more likely, as the function blindly 
dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns sure 
that large abbreviations into an error as well, and adds a test for both cases.

>From 03ab6febc0f40d0f39f533a5b4fd7c33a6728ae1 Mon Sep 17 00:00:00 2001
From: Pavel Labath 
Date: Tue, 21 May 2024 15:31:23 +
Subject: [PATCH] [lldb/DWARF] Make sure bad abbreviation codes do not crash
 lldb

We currently cannot represent abbreviation codes with more than 16 bits,
and we were lldb-asserting if we ever ran into one. While I haven't seen
any real DWARF with these kinds of abbreviations, it is possible to hit
this with handcrafted evil dwarf, due some sort of corruptions, or just
bugs (the addition of PeekDIEName makes these bugs more likely, as the
function blindly dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns
sure that large abbreviations into an error as well, and adds a test for
both cases.
---
 .../SymbolFile/DWARF/DWARFDebugInfoEntry.cpp  | 34 +++---
 .../DWARF/x86/invalid_abbreviation.s  | 47 +++
 2 files changed, 63 insertions(+), 18 deletions(-)
 create mode 100644 lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp 
b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
index 1b0fefedf9836..4357ccb2f5c9f 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 #include "llvm/Support/LEB128.h"
@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   const DWARFUnit *cu,
   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
+"[{0:x16}]: {1}, please file a bug and "
+"attach the file at the start of this error message",
+static_cast(m_offset), llvm::formatv(fmt, vals...));
+*offset_ptr = std::numeric_limits::max();
+return false;
+  };
+
   m_parent_idx = 0;
   m_sibling_idx = 0;
   const uint64_t abbr_idx = data.GetULEB128(offset_ptr);
-  lldbassert(abbr_idx <= UINT16_MAX);
+  if (abbr_idx > std::numeric_limits::max())
+return report_error("abbreviation code {0} too big", abbr_idx);
   m_abbr_idx = abbr_idx;
 
   if (m_abbr_idx == 0) {
@@ -57,16 +68,9 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
   }
 
   const auto *abbrevDecl = GetAbbreviationDeclarationPtr(cu);
-  if (abbrevDecl == nullptr) {
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: invalid abbreviation code {1}, "
-"please file a bug and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)abbr_idx);
-// WE can't parse anymore if the DWARF is borked...
-*offset_ptr = UINT32_MAX;
-return false;
-  }
+  if (abbrevDecl == nullptr)
+return report_error("invalid abbreviation code {0}", abbr_idx);
+
   m_tag = abbrevDecl->getTag();
   m_has_children = abbrevDecl->hasChildren();
   // Skip all data in the .debug_info or .debug_types for the attributes
@@ -74,13 +78,7 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor 
,
 if (DWARFFormValue::SkipValue(attribute.Form, data, offset_ptr, cu))
   continue;
 
-cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-"[{0:x16}]: Unsupported DW_FORM_{1:x}, please file a bug "
-"and "
-"attach the file at the start of this error message",
-(uint64_t)m_offset, (unsigned)attribute.Form);
-*offset_ptr = m_offset;
-return false;
+return report_error("Unsupported DW_FORM_{1:x}", attribute.Form);
   }
   return true;
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s 
b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
new file mode 100644
index 0..3f32c037aeb20
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
+# RUN: %lldb %t \
+# RUN:   -o exit 2>&1 | FileCheck %s
+
+# CHECK-DAG: error: {{.*}} [0x0022]: