[Lldb-commits] [lldb] [lldb] Change the implementation of Status to store an llvm::Error (NFC) (PR #106774)
@@ -174,33 +233,91 @@ const char *Status::AsCString(const char *default_error_str) const { // Clear the error and any cached error string that it might contain. void Status::Clear() { - m_code = 0; - m_type = eErrorTypeInvalid; - m_string.clear(); + if (m_error) +LLDB_LOG_ERRORV(GetLog(LLDBLog::API), std::move(m_error), +"dropping error {0}"); + m_error = llvm::Error::success(); + llvm::consumeError(std::move(m_error)); } // Access the error value. -Status::ValueType Status::GetError() const { return m_code; } +Status::ValueType Status::GetError() const { + Status::ValueType result = 0; + llvm::visitErrors(m_error, [&](const llvm::ErrorInfoBase &error) { +std::error_code ec = error.convertToErrorCode(); +if (ec.category() == std::generic_category() || +ec.category() == generic_category() || +ec.category() == expression_category()) + result = ec.value(); labath wrote: How about just returning `ec.value()` regardless of the category? https://github.com/llvm/llvm-project/pull/106774 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Change the implementation of Status to store an llvm::Error (NFC) (PR #106774)
@@ -96,16 +124,44 @@ Status Status::FromErrorStringWithFormat(const char *format, ...) { return Status(string); } -llvm::Error Status::ToError() const { - if (Success()) -return llvm::Error::success(); - if (m_type == ErrorType::eErrorTypePOSIX) -return llvm::errorCodeToError( -std::error_code(m_code, std::generic_category())); - return llvm::createStringError(AsCString()); +Status Status::FromExpressionError(lldb::ExpressionResults result, + std::string msg) { + return Status(llvm::make_error( + std::error_code(result, expression_category()), msg)); } -Status::~Status() = default; +/// Creates a deep copy of all known errors and converts all other +/// errors to a new llvm::StringError. +static llvm::Error cloneError(llvm::Error &error) { + llvm::Error result = llvm::Error::success(); + llvm::consumeError(std::move(result)); + llvm::visitErrors(error, [&](const llvm::ErrorInfoBase &error) { +if (error.isA()) + result = llvm::make_error(error.convertToErrorCode()); +else if (error.isA()) + result = llvm::make_error(error.convertToErrorCode()); +else if (error.isA()) + result = llvm::make_error(error.convertToErrorCode(), + error.message()); +else + result = + llvm::createStringError(error.convertToErrorCode(), error.message()); + }); + return result; +} + +Status Status::FromError(llvm::Error &&error) { + // Existing clients assume that the conversion to Status consumes + // and destroys the error. Use cloneError to convert all unnown + // errors to strings. + llvm::Error clone = cloneError(error); + llvm::consumeError(std::move(error)); labath wrote: I'm still confused by this. Why isn't this just `return Status(std::move(error))` ? https://github.com/llvm/llvm-project/pull/106774 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Change the implementation of Status to store an llvm::Error (NFC) (PR #106774)
@@ -174,33 +233,91 @@ const char *Status::AsCString(const char *default_error_str) const { // Clear the error and any cached error string that it might contain. void Status::Clear() { - m_code = 0; - m_type = eErrorTypeInvalid; - m_string.clear(); + if (m_error) +LLDB_LOG_ERRORV(GetLog(LLDBLog::API), std::move(m_error), +"dropping error {0}"); + m_error = llvm::Error::success(); + llvm::consumeError(std::move(m_error)); } // Access the error value. -Status::ValueType Status::GetError() const { return m_code; } +Status::ValueType Status::GetError() const { + Status::ValueType result = 0; + llvm::visitErrors(m_error, [&](const llvm::ErrorInfoBase &error) { +std::error_code ec = error.convertToErrorCode(); +if (ec.category() == std::generic_category() || +ec.category() == generic_category() || +ec.category() == expression_category()) + result = ec.value(); +else + result = 0xff; labath wrote: A more insteresting case is what should this return in case of multiple errors? Right now, it will return the last code, but maybe this is a better case for a special error code? https://github.com/llvm/llvm-project/pull/106774 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Change the implementation of Status to store an llvm::Error (NFC) (PR #106774)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/106774 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][AIX] Taking Linux Host Info header's base for AIX (PR #106910)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/106910 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][AIX] Taking Linux Host Info header's base for AIX (PR #106910)
https://github.com/labath commented: So, how different is AIX from linux? Should we be treating it as a flavour of linux. E.g., have HostInfoAIX derive from HostInfoLinux instead of copying it ? https://github.com/llvm/llvm-project/pull/106910 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][AIX] Taking Linux Host Info header's base for AIX (PR #106910)
@@ -38,6 +38,12 @@ endif() include(LLDBConfig) include(AddLLDB) +# This has been added to keep the AIX build isolated for now. +# It will need to be modified later. +if (UNIX AND ${CMAKE_SYSTEM_NAME} MATCHES "AIX") + add_definitions("-D__AIX__") labath wrote: We shouldn't be defining a reserved identifier. Also, isn't there an existing macro that we could test use to check if we're on AIX? https://github.com/llvm/llvm-project/pull/106910 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB] Finish implementing support for DW_FORM_data16 (PR #106799)
labath wrote: > Sadly, I haven't figured out a good way to test this. `test/Shell/SymbolFile/DWARF/x86/DW_AT_const_value.s` tests various encodings of DW_AT_const_value. It doesn't do DW_FORM_data16, most likely because it was not working at the time. You can add another case to the file to exercise this code path. https://github.com/llvm/llvm-project/pull/106799 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Better matching of types in anonymous namespaces (PR #102111)
https://github.com/labath closed https://github.com/llvm/llvm-project/pull/102111 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Better matching of types in anonymous namespaces (PR #102111)
labath wrote: No worries, I was out a large part of August anyway. https://github.com/llvm/llvm-project/pull/102111 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Status::Detail to store expression evaluator diagnostics [… (PR #106442)
labath wrote: Not really (although this might work as well). What I meant was that, AIUI you're doing this so that you can plumb structured error information from one place (clang expression parser?) to another place (the thing which generates the expression errors -- I guess somewhere CommandObjectExpression?). That could be done by creating a (custom) llvm::Error in the first place, and consuming it in the second one *and* by making sure that all of the functions along that path don't lose this information by converting the error to a lldb_private::Status. I haven't looked at how hard much changes would that require, but I'm hoping that will be a single relatively well defined path that can be ported with reasonable effort. Even if there's e.g. a virtual function somewhere along the path, we don't necessarily have to port all implementations of that function immediately. We could just port the ones we need, and provide a shim so that the other implementations can remain unchanged... https://github.com/llvm/llvm-project/pull/106442 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,119 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clang-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir) +return os.path.join(llvm_tools_dir, util_name + exe_ext) + +def getToolchainUtil(util_name): +return cc_dir / (cc_prefix + util_name + cc_ext) + +cxx = getToolchainUtil(cxx_type) + +util_names = { +"OBJCOPY": "objcopy", +"STRIP": "objcopy", labath wrote: Can we get rid of this variable? I don't think it's very helpful. Even though `objcopy` supports (I think) all of the functionality of `strip`, their args are different, so I think this could be pretty confusing. https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
https://github.com/labath commented: Looks pretty good now, but I still have some questions. https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -213,7 +229,7 @@ endif LIMIT_DEBUG_INFO_FLAGS = NO_LIMIT_DEBUG_INFO_FLAGS = MODULE_DEBUG_INFO_FLAGS = -ifneq (,$(findstring clang,$(CC))) +ifeq ($(CCC), clang) labath wrote: So the last `C` stands for "type" ? :) Can we call it CC_TYPE/KIND (or similar) instead? It's not like we're using it that often... https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,119 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: labath wrote: ```suggestion if not cc: return [] ``` .. and then unindent the rest of the code https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,119 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clang-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir) +return os.path.join(llvm_tools_dir, util_name + exe_ext) + +def getToolchainUtil(util_name): +return cc_dir / (cc_prefix + util_name + cc_ext) + +cxx = getToolchainUtil(cxx_type) + +util_names = { +"OBJCOPY": "objcopy", +"STRIP": "objcopy", +"ARCHIVER": "ar", +"DWP": "dwp", +} +utils = [] + +# Note: LLVM_AR is currently required by API TestBSDArchives.py tests. +# Assembly a full path to llvm-ar for given LLVM_TOOLS_DIR; labath wrote: ```suggestion # Assemble a full path to llvm-ar for given LLVM_TOOLS_DIR; ``` https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,120 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clamg-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir.as_posix()) +return llvm_tools_dir + "/" + util_name + exe_ext + +def getToolchainUtil(util_name): +return (cc_dir / (cc_prefix + util_name + cc_ext)).as_posix() + +cxx = getToolchainUtil(cxx_type) + +util_names = { +"OBJCOPY": "objcopy", +"STRIP": "objcopy", +"ARCHIVER": "ar", +"DWP": "dwp", +} +utils = [] + +# Note: LLVM_AR is currently required by API TestBSDArchives.py tests. +# Assembly a full path to llvm-ar for given LLVM_TOOLS_DIR; +# otherwise assume we have llvm-ar at the same place where is CC specified. +if not os.getenv("LLVM_AR"): +llvm_ar = getLlvmUtil("llvm-ar") +utils.extend(['LLVM_AR="%s"' % llvm_ar]) + +if not lldbplatformutil.platformIsDarwin(): +if os.getenv("USE_LLVM_TOOLS"): labath wrote: I'm still puzzled by this.. https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Better matching of types in anonymous namespaces (PR #102111)
labath wrote: Any more thoughts on this, @clayborg ? https://github.com/llvm/llvm-project/pull/102111 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
labath wrote: The linux read fix is in #106532. https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb/linux] Make truncated reads work (PR #106532)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/106532 >From 610757ced98139d3d10451dcca195692b0c2d832 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Thu, 29 Aug 2024 12:26:01 +0200 Subject: [PATCH 1/2] [lldb/linux] Make truncated reads work Previously, we were returning an error if we couldn't read the whole region. This doesn't matter most of the time, because lldb caches memory reads, and in that process it aligns them to cache line boundaries. As (LLDB) cache lines are smaller than pages, the reads are unlikely to cross page boundaries. Nonetheless, this can cause a problem for large reads (which bypass the cache), where we're unable to read anything even if just a single byte of the memory is unreadable. This patch fixes the lldb-server to do that, and also changes the linux implementation, to reuse any partial results it got from the process_vm_readv call (to avoid having to re-read everything again using ptrace, only to find that it stopped at the same place). This matches debugserver behavior. It is also consistent with the gdb remote protocol documentation, but -- notably -- not with actual gdbserver behavior (which returns errors instead of partial results). We filed a [clarification bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several years ago. Though we did not really reach a conclusion there, I think this is the most logical behavior. The associated test does not currently pass on windows, because the windows memory read APIs don't support partial reads (I have a WIP patch to work around that). --- .../Process/Linux/NativeProcessLinux.cpp | 41 - .../GDBRemoteCommunicationServerLLGS.cpp | 20 + .../API/functionalities/memory/holes/Makefile | 3 + .../memory/holes/TestMemoryHoles.py | 63 ++ .../API/functionalities/memory/holes/main.cpp | 87 +++ 5 files changed, 176 insertions(+), 38 deletions(-) create mode 100644 lldb/test/API/functionalities/memory/holes/Makefile create mode 100644 lldb/test/API/functionalities/memory/holes/TestMemoryHoles.py create mode 100644 lldb/test/API/functionalities/memory/holes/main.cpp diff --git a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp index cb95da9fc72363..1e2e3a80b18bf6 100644 --- a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp +++ b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp @@ -1641,6 +1641,10 @@ NativeProcessLinux::GetSoftwareBreakpointTrapOpcode(size_t size_hint) { Status NativeProcessLinux::ReadMemory(lldb::addr_t addr, void *buf, size_t size, size_t &bytes_read) { + Log *log = GetLog(POSIXLog::Memory); + LLDB_LOG(log, "addr = {0}, buf = {1}, size = {2}", addr, buf, size); + + bytes_read = 0; if (ProcessVmReadvSupported()) { // The process_vm_readv path is about 50 times faster than ptrace api. We // want to use this syscall if it is supported. @@ -1651,32 +1655,29 @@ Status NativeProcessLinux::ReadMemory(lldb::addr_t addr, void *buf, size_t size, remote_iov.iov_base = reinterpret_cast(addr); remote_iov.iov_len = size; -bytes_read = process_vm_readv(GetCurrentThreadID(), &local_iov, 1, - &remote_iov, 1, 0); -const bool success = bytes_read == size; +ssize_t read_result = process_vm_readv(GetCurrentThreadID(), &local_iov, 1, + &remote_iov, 1, 0); +int error = 0; +if (read_result < 0) + error = errno; +else + bytes_read = read_result; -Log *log = GetLog(POSIXLog::Process); LLDB_LOG(log, - "using process_vm_readv to read {0} bytes from inferior " - "address {1:x}: {2}", - size, addr, success ? "Success" : llvm::sys::StrError(errno)); - -if (success) - return Status(); -// else the call failed for some reason, let's retry the read using ptrace -// api. + "process_vm_readv({0}, [iovec({1}, {2})], [iovec({3:x}, {2})], 1, " + "0) => {4} ({5})", + GetCurrentThreadID(), buf, size, addr, read_result, + error > 0 ? llvm::sys::StrError(errno) : "sucesss"); } unsigned char *dst = static_cast(buf); size_t remainder; long data; - Log *log = GetLog(POSIXLog::Memory); - LLDB_LOG(log, "addr = {0}, buf = {1}, size = {2}", addr, buf, size); - - for (bytes_read = 0; bytes_read < size; bytes_read += remainder) { + for (; bytes_read < size; bytes_read += remainder) { Status error = NativeProcessLinux::PtraceWrapper( -PTRACE_PEEKDATA, GetCurrentThreadID(), (void *)addr, nullptr, 0, &data); +PTRACE_PEEKDATA, GetCurrentThreadID(), +reinterpret_cast(addr + bytes_read), nullptr, 0, &data); if (error.Fail()) return error; @@ -1684,11 +1685,7 @@ Status NativeProcessLinux::ReadMemory(lldb::addr_t addr, void *buf,
[Lldb-commits] [lldb] [lldb/linux] Make truncated reads work (PR #106532)
@@ -2526,28 +2526,16 @@ GDBRemoteCommunicationServerLLGS::Handle_memory_read( size_t bytes_read = 0; Status error = m_current_process->ReadMemoryWithoutTrap( read_addr, &buf[0], byte_count, bytes_read); - if (error.Fail()) { -LLDB_LOGF(log, - "GDBRemoteCommunicationServerLLGS::%s pid %" PRIu64 - " mem 0x%" PRIx64 ": failed to read. Error: %s", - __FUNCTION__, m_current_process->GetID(), read_addr, - error.AsCString()); + LLDB_LOG(log, "ReadMemoryWithoutTrap({0}) read {1}/{2} bytes (error: {3})", + read_addr, byte_count, bytes_read, error); + if (bytes_read == 0) labath wrote: This was changed to not assume that an error means nothing was read. https://github.com/llvm/llvm-project/pull/106532 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb/linux] Make truncated reads work (PR #106532)
@@ -0,0 +1,63 @@ +""" +Test the memory commands operating on memory regions with holes +""" + +import lldb +from lldbsuite.test.lldbtest import * +import lldbsuite.test.lldbutil as lldbutil +from lldbsuite.test.decorators import * + + +class MemoryHolesTestCase(TestBase): +NO_DEBUG_INFO_TESTCASE = True + +def setUp(self): +super().setUp() +# Find the line number to break inside main(). +self.line = line_number("main.cpp", "// break here") + +def _prepare_inferior(self): +self.build() +exe = self.getBuildArtifact("a.out") +self.runCmd("file " + exe, CURRENT_EXECUTABLE_SET) + +# Break in main() after the variables are assigned values. +lldbutil.run_break_set_by_file_and_line( +self, "main.cpp", self.line, num_expected_locations=1, loc_exact=True +) + +self.runCmd("run", RUN_SUCCEEDED) + +# The stop reason of the thread should be breakpoint. +self.expect( +"thread list", +STOPPED_DUE_TO_BREAKPOINT, +substrs=["stopped", "stop reason = breakpoint"], +) + +# The breakpoint should have a hit count of 1. +lldbutil.check_breakpoint(self, bpno=1, expected_hit_count=1) + +# Avoid the expression evaluator, as it can allocate allocate memory +# inside the holes we've deliberately left empty. +self.memory = self.frame().FindVariable("mem_with_holes").GetValueAsUnsigned() +self.pagesize = self.frame().FindVariable("pagesize").GetValueAsUnsigned() +positions = self.frame().FindVariable("positions") +self.positions = [ +positions.GetChildAtIndex(i).GetValueAsUnsigned() +for i in range(positions.GetNumChildren()) +] +self.assertEqual(len(self.positions), 5) + +@expectedFailureWindows +def test_memory_read(self): +self._prepare_inferior() + +error = lldb.SBError() +content = self.process().ReadMemory(self.memory, 2 * self.pagesize, error) +self.assertEqual(len(content), self.pagesize) +self.assertEqual(content[0:7], b"needle\0") +self.assertTrue(error.Fail()) +self.assertEqual( +f"memory read failed for {self.memory+self.pagesize:#x}", error.GetCString() +) labath wrote: I guess this kind of error behavior can be useful in some situations (you get the memory that can be read *and* the error for the part that cannot), but I've found it very counterintuitive to use. I'd definitely be open to changing that (returning an error only if we didn't read anything). https://github.com/llvm/llvm-project/pull/106532 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb/linux] Make truncated reads work (PR #106532)
https://github.com/labath created https://github.com/llvm/llvm-project/pull/106532 Previously, we were returning an error if we couldn't read the whole region. This doesn't matter most of the time, because lldb caches memory reads, and in that process it aligns them to cache line boundaries. As (LLDB) cache lines are smaller than pages, the reads are unlikely to cross page boundaries. Nonetheless, this can cause a problem for large reads (which bypass the cache), where we're unable to read anything even if just a single byte of the memory is unreadable. This patch fixes the lldb-server to do that, and also changes the linux implementation, to reuse any partial results it got from the process_vm_readv call (to avoid having to re-read everything again using ptrace, only to find that it stopped at the same place). This matches debugserver behavior. It is also consistent with the gdb remote protocol documentation, but -- notably -- not with actual gdbserver behavior (which returns errors instead of partial results). We filed a [clarification bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several years ago. Though we did not really reach a conclusion there, I think this is the most logical behavior. The associated test does not currently pass on windows, because the windows memory read APIs don't support partial reads (I have a WIP patch to work around that). >From 610757ced98139d3d10451dcca195692b0c2d832 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Thu, 29 Aug 2024 12:26:01 +0200 Subject: [PATCH] [lldb/linux] Make truncated reads work Previously, we were returning an error if we couldn't read the whole region. This doesn't matter most of the time, because lldb caches memory reads, and in that process it aligns them to cache line boundaries. As (LLDB) cache lines are smaller than pages, the reads are unlikely to cross page boundaries. Nonetheless, this can cause a problem for large reads (which bypass the cache), where we're unable to read anything even if just a single byte of the memory is unreadable. This patch fixes the lldb-server to do that, and also changes the linux implementation, to reuse any partial results it got from the process_vm_readv call (to avoid having to re-read everything again using ptrace, only to find that it stopped at the same place). This matches debugserver behavior. It is also consistent with the gdb remote protocol documentation, but -- notably -- not with actual gdbserver behavior (which returns errors instead of partial results). We filed a [clarification bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several years ago. Though we did not really reach a conclusion there, I think this is the most logical behavior. The associated test does not currently pass on windows, because the windows memory read APIs don't support partial reads (I have a WIP patch to work around that). --- .../Process/Linux/NativeProcessLinux.cpp | 41 - .../GDBRemoteCommunicationServerLLGS.cpp | 20 + .../API/functionalities/memory/holes/Makefile | 3 + .../memory/holes/TestMemoryHoles.py | 63 ++ .../API/functionalities/memory/holes/main.cpp | 87 +++ 5 files changed, 176 insertions(+), 38 deletions(-) create mode 100644 lldb/test/API/functionalities/memory/holes/Makefile create mode 100644 lldb/test/API/functionalities/memory/holes/TestMemoryHoles.py create mode 100644 lldb/test/API/functionalities/memory/holes/main.cpp diff --git a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp index cb95da9fc72363..1e2e3a80b18bf6 100644 --- a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp +++ b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp @@ -1641,6 +1641,10 @@ NativeProcessLinux::GetSoftwareBreakpointTrapOpcode(size_t size_hint) { Status NativeProcessLinux::ReadMemory(lldb::addr_t addr, void *buf, size_t size, size_t &bytes_read) { + Log *log = GetLog(POSIXLog::Memory); + LLDB_LOG(log, "addr = {0}, buf = {1}, size = {2}", addr, buf, size); + + bytes_read = 0; if (ProcessVmReadvSupported()) { // The process_vm_readv path is about 50 times faster than ptrace api. We // want to use this syscall if it is supported. @@ -1651,32 +1655,29 @@ Status NativeProcessLinux::ReadMemory(lldb::addr_t addr, void *buf, size_t size, remote_iov.iov_base = reinterpret_cast(addr); remote_iov.iov_len = size; -bytes_read = process_vm_readv(GetCurrentThreadID(), &local_iov, 1, - &remote_iov, 1, 0); -const bool success = bytes_read == size; +ssize_t read_result = process_vm_readv(GetCurrentThreadID(), &local_iov, 1, + &remote_iov, 1, 0); +int error = 0; +if (read_result < 0) + error = errno; +else + bytes_read = read_result; -Log *log = GetLog(POSIX
[Lldb-commits] [lldb] [lldb] Updated TCPSocket to listen multiple ports on the same single thread (PR #104797)
labath wrote: > @labath > > > Having a single socket listen on multiple ports sounds like a bad idea to > > me. > > Note that currently the class TCPSocket already contains a list of > NativeSocket `m_listen_sockets`. I am aware of that, and I'm not entirely happy with how the class implements listening. I think it would be better to have a separate class for a listening socket, because those need a completely different APIs. But, even ignoring that, I think there's a difference between listening on two addresses with the same port (I believe the only way to reach that state is if a hostname resolves to multiple addresses, in which case one really could argue that it's the same "logical" address), and two addresses with completely unrelated ports. > We do not need 2 TCPSocket instances with 2 separated lists of native sockets > even with a common MainLoop. > > We have 2 options: > > * A) 2 threads - first one calls TCPSocket::Accept() for the platform > connection, second calls TCPSocket::Accept() for the gdb connection > > * B) 1 thread - a common TCPSocket::Accept() can accept platform and gdb > connections from both ports We have (at least) three options, the third one being what I outlined in the previous message. I'm sorry this work of yours is going to waste. I could've been more explicit about what I wanted to do on the first PR, but I was more focused on what I don't want to do. > > > I have implemented the option B. It was easy because the class TCPSocket > already contains `m_listen_sockets`. > > > Then we could create two sockets and have them listen on the same main loop > > instance. > > It is already implemented inside TCPSocket but for different addresses. That's true, but I draw a different conclusion from that. Instead of saying "this should be extended to support multiple ports", my take is that "this wasn't a good place for multiplexing to begin with". > I just added different ports. > > The changes are minimal really. Yes, but that's because you extended a stringly typed api to do that you wanted. That function is used from other places as well, and this means that other places in lldb (including user-facing code, I believe) can accept these multi-port connection strings. I don't think we want to support that. https://github.com/llvm/llvm-project/pull/104797 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From 7b8b8b699902d2365ea43e5ae015546c4d20fac8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH 1/5] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 30 ++- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 136 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..72acfb3d600701 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/fin
[Lldb-commits] [lldb] lldb: get lldb API tests working with newer Android NDKs (PR #106443)
@@ -1,81 +1,55 @@ NDK_ROOT := $(shell dirname $(CC))/../../../../.. -ifeq "$(findstring 64, $(ARCH))" "64" - # lowest 64-bit API level - API_LEVEL := 21 -else ifeq "$(ARCH)" "i386" - # clone(2) declaration is present only since this api level - API_LEVEL := 17 +ifeq "$(HOST_OS)" "Linux" + HOST_TAG := linux-x86_64 +else ifeq "$(HOST_OS)" "Darwin" + HOST_TAG := darwin-x86_64 else - # lowest supported 32-bit API level - API_LEVEL := 16 + HOST_TAG := windows-x86_64 +endif + +TOOLCHAIN_SYSROOT := $(NDK_ROOT)/toolchains/llvm/prebuilt/$(HOST_TAG)/sysroot + +ifeq "$(wildcard $(TOOLCHAIN_SYSROOT)/.)" "" +# Compiling test inferiors for Android requires an NDK with the unified +# toolchain introduced in version r19. +$(error "No unified toolchain sysroot found in $(NDK_ROOT). NDK must be r19 or later.") endif ifeq "$(ARCH)" "arm" - SYSROOT_ARCH := arm - STL_ARCH := armeabi-v7a TRIPLE := armv7-none-linux-androideabi ARCH_CFLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -marm else ifeq "$(ARCH)" "aarch64" - SYSROOT_ARCH := arm64 - STL_ARCH := arm64-v8a TRIPLE := aarch64-none-linux-android else ifeq "$(ARCH)" "i386" - SYSROOT_ARCH := x86 - STL_ARCH := x86 TRIPLE := i686-none-linux-android else - SYSROOT_ARCH := $(ARCH) - STL_ARCH := $(ARCH) TRIPLE := $(ARCH)-none-linux-android endif -ifeq "$(findstring 86,$(ARCH))" "86" - TOOLCHAIN_DIR := $(STL_ARCH)-4.9 -else ifeq "$(ARCH)" "arm" - TOOLCHAIN_DIR := arm-linux-androideabi-4.9 -else - TOOLCHAIN_DIR := $(subst -none,,$(TRIPLE))-4.9 -endif +# lowest 64-bit API level +API_LEVEL := 21 ifeq "$(ARCH)" "arm" - TOOL_PREFIX := arm-linux-androideabi -else - TOOL_PREFIX := $(subst -none,,$(TRIPLE)) -endif - -ifeq "$(HOST_OS)" "Linux" - HOST_TAG := linux-x86_64 -else ifeq "$(HOST_OS)" "Darwin" - HOST_TAG := darwin-x86_64 + ARCH_DIR := arm-linux-androideabi else - HOST_TAG := windows-x86_64 -endif - -GCC_TOOLCHAIN = $(NDK_ROOT)/toolchains/$(TOOLCHAIN_DIR)/prebuilt/$(HOST_TAG) - -OBJCOPY ?= $(GCC_TOOLCHAIN)/bin/$(TOOL_PREFIX)-objcopy -ARCHIVER ?= $(GCC_TOOLCHAIN)/bin/$(TOOL_PREFIX)-ar - -ifeq "$(findstring clang,$(CC))" "clang" - ARCH_CFLAGS += -target $(TRIPLE) --gcc-toolchain=$(GCC_TOOLCHAIN) - ARCH_LDFLAGS += -target $(TRIPLE) --gcc-toolchain=$(GCC_TOOLCHAIN) + ARCH_DIR := $(subst -none,,$(TRIPLE)) endif -ARCH_CFLAGS += --sysroot=$(NDK_ROOT)/sysroot \ - -isystem $(NDK_ROOT)/sysroot/usr/include/$(TOOL_PREFIX) \ - -D__ANDROID_API__=$(API_LEVEL) \ - -isystem $(NDK_ROOT)/platforms/android-$(API_LEVEL)/arch-$(SYSROOT_ARCH)/usr/include - -ARCH_LDFLAGS += --sysroot=$(NDK_ROOT)/platforms/android-$(API_LEVEL)/arch-$(SYSROOT_ARCH) -lm +ARCH_CFLAGS += \ + --target=$(TRIPLE) \ + --sysroot=$(TOOLCHAIN_SYSROOT) \ + -D __ANDROID_API__=$(API_LEVEL) \ ARCH_CXXFLAGS += \ - -isystem $(NDK_ROOT)/sources/cxx-stl/llvm-libc++/include \ - -isystem $(NDK_ROOT)/sources/android/support/include \ - -isystem $(NDK_ROOT)/sources/cxx-stl/llvm-libc++abi/include + -isystem $(TOOLCHAIN_SYSROOT)/usr/include/c++/v1 \ ARCH_LDFLAGS += \ - -L$(NDK_ROOT)/sources/cxx-stl/llvm-libc++/libs/$(STL_ARCH) \ - $(NDK_ROOT)/sources/cxx-stl/llvm-libc++/libs/$(STL_ARCH)/libc++_static.a \ + --target=$(TRIPLE) \ + --sysroot=$(TOOLCHAIN_SYSROOT) \ + --prefix=$(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/$(API_LEVEL) \ + --library-directory=$(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/$(API_LEVEL) \ labath wrote: ```suggestion -L$(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/$(API_LEVEL) \ ``` I haven't seen anyone use the long version of this option (I had to look it up to be sure of what it does), so I think it'd be better to stick to `-L`. `--prefix` is probably ok, since its short form (`-B`) is also not very common. https://github.com/llvm/llvm-project/pull/106443 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] lldb: get lldb API tests working with newer Android NDKs (PR #106443)
@@ -1,81 +1,55 @@ NDK_ROOT := $(shell dirname $(CC))/../../../../.. -ifeq "$(findstring 64, $(ARCH))" "64" - # lowest 64-bit API level - API_LEVEL := 21 -else ifeq "$(ARCH)" "i386" - # clone(2) declaration is present only since this api level - API_LEVEL := 17 +ifeq "$(HOST_OS)" "Linux" + HOST_TAG := linux-x86_64 +else ifeq "$(HOST_OS)" "Darwin" + HOST_TAG := darwin-x86_64 else - # lowest supported 32-bit API level - API_LEVEL := 16 + HOST_TAG := windows-x86_64 +endif + +TOOLCHAIN_SYSROOT := $(NDK_ROOT)/toolchains/llvm/prebuilt/$(HOST_TAG)/sysroot + +ifeq "$(wildcard $(TOOLCHAIN_SYSROOT)/.)" "" +# Compiling test inferiors for Android requires an NDK with the unified +# toolchain introduced in version r19. +$(error "No unified toolchain sysroot found in $(NDK_ROOT). NDK must be r19 or later.") endif ifeq "$(ARCH)" "arm" - SYSROOT_ARCH := arm - STL_ARCH := armeabi-v7a TRIPLE := armv7-none-linux-androideabi ARCH_CFLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -marm else ifeq "$(ARCH)" "aarch64" - SYSROOT_ARCH := arm64 - STL_ARCH := arm64-v8a TRIPLE := aarch64-none-linux-android else ifeq "$(ARCH)" "i386" - SYSROOT_ARCH := x86 - STL_ARCH := x86 TRIPLE := i686-none-linux-android else - SYSROOT_ARCH := $(ARCH) - STL_ARCH := $(ARCH) TRIPLE := $(ARCH)-none-linux-android endif -ifeq "$(findstring 86,$(ARCH))" "86" - TOOLCHAIN_DIR := $(STL_ARCH)-4.9 -else ifeq "$(ARCH)" "arm" - TOOLCHAIN_DIR := arm-linux-androideabi-4.9 -else - TOOLCHAIN_DIR := $(subst -none,,$(TRIPLE))-4.9 -endif +# lowest 64-bit API level +API_LEVEL := 21 ifeq "$(ARCH)" "arm" - TOOL_PREFIX := arm-linux-androideabi -else - TOOL_PREFIX := $(subst -none,,$(TRIPLE)) -endif - -ifeq "$(HOST_OS)" "Linux" - HOST_TAG := linux-x86_64 -else ifeq "$(HOST_OS)" "Darwin" - HOST_TAG := darwin-x86_64 + ARCH_DIR := arm-linux-androideabi else - HOST_TAG := windows-x86_64 -endif - -GCC_TOOLCHAIN = $(NDK_ROOT)/toolchains/$(TOOLCHAIN_DIR)/prebuilt/$(HOST_TAG) - -OBJCOPY ?= $(GCC_TOOLCHAIN)/bin/$(TOOL_PREFIX)-objcopy -ARCHIVER ?= $(GCC_TOOLCHAIN)/bin/$(TOOL_PREFIX)-ar - -ifeq "$(findstring clang,$(CC))" "clang" - ARCH_CFLAGS += -target $(TRIPLE) --gcc-toolchain=$(GCC_TOOLCHAIN) - ARCH_LDFLAGS += -target $(TRIPLE) --gcc-toolchain=$(GCC_TOOLCHAIN) + ARCH_DIR := $(subst -none,,$(TRIPLE)) endif -ARCH_CFLAGS += --sysroot=$(NDK_ROOT)/sysroot \ - -isystem $(NDK_ROOT)/sysroot/usr/include/$(TOOL_PREFIX) \ - -D__ANDROID_API__=$(API_LEVEL) \ - -isystem $(NDK_ROOT)/platforms/android-$(API_LEVEL)/arch-$(SYSROOT_ARCH)/usr/include - -ARCH_LDFLAGS += --sysroot=$(NDK_ROOT)/platforms/android-$(API_LEVEL)/arch-$(SYSROOT_ARCH) -lm +ARCH_CFLAGS += \ + --target=$(TRIPLE) \ + --sysroot=$(TOOLCHAIN_SYSROOT) \ + -D __ANDROID_API__=$(API_LEVEL) \ ARCH_CXXFLAGS += \ - -isystem $(NDK_ROOT)/sources/cxx-stl/llvm-libc++/include \ - -isystem $(NDK_ROOT)/sources/android/support/include \ - -isystem $(NDK_ROOT)/sources/cxx-stl/llvm-libc++abi/include + -isystem $(TOOLCHAIN_SYSROOT)/usr/include/c++/v1 \ ARCH_LDFLAGS += \ - -L$(NDK_ROOT)/sources/cxx-stl/llvm-libc++/libs/$(STL_ARCH) \ - $(NDK_ROOT)/sources/cxx-stl/llvm-libc++/libs/$(STL_ARCH)/libc++_static.a \ + --target=$(TRIPLE) \ + --sysroot=$(TOOLCHAIN_SYSROOT) \ + --prefix=$(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/$(API_LEVEL) \ + --library-directory=$(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/$(API_LEVEL) \ + $(TOOLCHAIN_SYSROOT)/usr/lib/$(ARCH_DIR)/libc++_static.a \ + -lm \ -lc++abi \ - -nostdlib++ + -nostdlib++ \ labath wrote: I get the diff benefits of this, but a backslash continuation at the end of a file looks very suspicious. I'd just remove all of the final backslashes you have here. https://github.com/llvm/llvm-project/pull/106443 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] lldb: get lldb API tests working with newer Android NDKs (PR #106443)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/106443 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] lldb: get lldb API tests working with newer Android NDKs (PR #106443)
https://github.com/labath commented: This looks great. I don't think anyone will object to removing old ndk support, but let's leave this PR open for a while to give people a chance to notice. https://github.com/llvm/llvm-project/pull/106443 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Status::Detail to store expression evaluator diagnostics [… (PR #106442)
labath wrote: The llvm::Error class already supports this kind of metadata (and in a much better way IMO). Given that moving to llvm::Error is our long term goal, would it be possible to port enough APIs so that it's possible to pass an unmodified Error object from the producer to the place that needs to consume it? https://github.com/llvm/llvm-project/pull/106442 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Inline expression evaluator error visualization (PR #106470)
labath wrote: This seems like it could be problematic for IDEs, if they don't print the error in the same window as the expression being evaluated. The arrows could end up pointing to nowhere, or to the wrong place in the expression (if we don't get the right offset). I think this could be made simpler and more robust by just printing a reproduction of the expression as the first line of the error message. It saving that one line output worth it? Also, how will this behave for multiline expressions? Or with errors that refer to multiple source locations (e.g inside macro expansions)? ``` (lldb) expr Enter expressions, then terminate with an empty line to evaluate: 1: #define FOO(x) foo+x 2: FOO(bar) error: :2:1: use of undeclared identifier 'foo' 2 | FOO(bar) | ^ :1:16: expanded from macro 'FOO' 1 | #define FOO(x) foo+x |^ error: :2:5: use of undeclared identifier 'bar' 2 | FOO(bar) | ^ ``` https://github.com/llvm/llvm-project/pull/106470 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
labath wrote: I figured out the problem, and it's kind of hilarious. Turns out (linux?) lldb-server does not actually return partial reads (i.e., it behaves like windows), but the reason that the test succeeds (for me) is that the holes in the allocated memory get filled my memory allocated by lldb. For whatever reason (different kernel) this does not happen on the bot. I guess I'll need to fix the memory read issues first.. https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); labath wrote: Correct. mem_pos is updated when changing the `mem` array, which we're not doing here. That will happen on the next iteration of the loop when we realize that the mem buffer does not hold the thing we're interested in. https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -1,9 +1,76 @@ -#include -#include - -int main (int argc, char const *argv[]) -{ -const char* stringdata = "hello world; I like to write text in const char pointers"; -uint8_t bytedata[] = {0xAA,0xBB,0xCC,0xDD,0xEE,0xFF,0x00,0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99}; -return 0; // break here +#include +#include +#include +#include +#include +#include + +#ifdef _WIN32 +#include "Windows.h" + +int getpagesize() { + SYSTEM_INFO system_info; + GetSystemInfo(&system_info); + return system_info.dwPageSize; +} + +char *allocate_memory_with_holes() { + int pagesize = getpagesize(); + void *mem = VirtualAlloc(nullptr, 5 * pagesize, MEM_RESERVE, PAGE_NOACCESS); + if (!mem) { +std::cerr << std::system_category().message(GetLastError()) << std::endl; +exit(1); + } + char *bytes = static_cast(mem); + for (int page : {0, 2, 4}) { +if (!VirtualAlloc(bytes + page * pagesize, pagesize, MEM_COMMIT, labath wrote: Added a comment at the top. Definitely not a Windows expert here, but my understanding is that the main difference is due to the MEM_RESERVE/MEM_COMMIT. MEM_RESERVE does not actually allocate any memory. It just marks the range as in use ("reserved") so that it is won't be taken by other allocation calls. MEM_COMMIT is what actually allocates the page. https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { labath wrote: How about cur_addr? :) (`s` was a leftover from the original implementation, but I've now completely changed it anyway). https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { labath wrote: good idea https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -79,3 +83,25 @@ def test_memory_find(self): 'memory find -s "nothere" `stringdata` `stringdata+10`', substrs=["data not found within the range."], ) + +@expectedFailureWindows +def test_memory_find_with_holes(self): +self._prepare_inferior() + +self.runCmd("log enable gdb-remote packets") +self.runCmd("log enable lldb process target") + +pagesize = self.frame().FindVariable("pagesize").GetValueAsUnsigned() +mem_with_holes = ( +self.frame().FindVariable("mem_with_holes").GetValueAsUnsigned() +) +matches_var = self.frame().FindVariable("matches") +self.assertEqual(matches_var.GetNumChildren(), 4) +matches = [ +f"data found at location: {matches_var.GetChildAtIndex(i).GetValueAsUnsigned():#x}" +for i in range(4) +] +self.expect( +'memory find -c 5 -s "needle" `mem_with_holes` `mem_with_holes+5*pagesize`', labath wrote: Done (the first part). https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From 7b8b8b699902d2365ea43e5ae015546c4d20fac8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH 1/4] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 30 ++- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 136 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..72acfb3d600701 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/fin
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
@@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; labath wrote: I'm using the SmallVector for the `resize_for_overwrite` functionality (allocating memory without initializing it). https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
labath wrote: > > It also reads the memory in bulk (up to 1MB) > > ``` > // Maximum number of bytes read (and buffered). We need to read at least > // `size` bytes for a successful match. > const size_t max_read_size = std::max(size, 0x1); > ``` > > It seems the minimal chunk is 64KB and the maximal chunk may be very long > (more than 1MB). Depends on how you look at it, I guess. The actual read size will be the minimum of this number and the "remainder of memory to be searched" (`std::min(mem.size(), high - s)`), so `max_read_size` is really a cap on the read size. It's true that the value can be bigger than 64K (or even 1MB). However that can only happen if the string being searched is larger than that value. This is necessary because the search algorithm matches from the end of the pattern. I wanted to keep the algorithm simple and keep the entire (potential) match in memory at once, as trying to read in the middle of matching would get messy. I don't think anyone searches for strings this long, but if we do want to (better) support that use case, I think that the best solution would be to switch to an algorithm (e.g. [KMP](https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm)) which scans the search space in a strictly linear fashion -- as that will make it easy to read the memory in arbitrary increments. > > Something is wrong with the first memory block in tests > > ``` > Got output: > data found at location: 0x7f6bdf3eb000 > 0x7f6bdf3eb000: 6e 65 65 64 6c 65 00 00 00 00 00 00 00 00 00 00 > needle.. > 0x7f6bdf3eb010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > data found at location: 0x7f6bdf3eb800 > 0x7f6bdf3eb800: 6e 65 65 64 6c 65 00 00 00 00 00 00 00 00 00 00 > needle.. > 0x7f6bdf3eb810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > data found at location: 0x7f6bdf3edff9 > 0x7f6bdf3edff9: 6e 65 65 64 6c 65 00 50 e0 3e df 6b 7f 00 00 6d > needle.P.>.k...m > 0x7f6bdf3ee009: dd 41 df 6b 7f 00 00 00 00 00 00 00 00 00 00 40 > .A.k...@ > no more matches within the range. > > Expecting sub string: "data found at location: 0x7f6bdf3e9000" (was not found) > ``` > Interesting. It does work on my machine. I'll try to debug this. > Note the capacity=0 is a special case here `llvm::SmallVector > mem;` I'd recommend > > ``` > const size_t min_read_size = 0x1; > const size_t max_read_size = std::max(size, min_read_size); > llvm::SmallVector mem; > ``` That would allocate the 64k on the stack, which isn't good practice (I doubt it will make things faster, and it can cause us to blow the stack). 64k is 1000x larger than the [preffered small vector size](https://github.com/llvm/llvm-project/blob/16910a21ee0fabab2df291e4e5bc18289bd5762d/llvm/include/llvm/ADT/SmallVector.h#L1150C27-L1150C54). In practice there will only ever be one heap allocation here -- first `resize` call on the vector will allocate a ~64k heap block, and all subsequent resizes will just move the end pointer around. https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From 7b8b8b699902d2365ea43e5ae015546c4d20fac8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH 1/3] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 30 ++- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 136 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..72acfb3d600701 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/fin
[Lldb-commits] [lldb] [lldb] Turn lldb_private::Status into a value type. (PR #106163)
labath wrote: > > * since this seems like a perfect opportunity to bikesh^Wdiscuss the names, > > I'm going to ask if there's any appetite for shortening some of the new > > factory functions. `Status::FromErrorStringWithFormatv` is a bit of a > > mouthful, so I was thinking if we could use something shorter instead > > (`Status::FromFormatv` or even `Status::Formatv`) ? > > I picked these names, because they are in line with the old names, which made > the regex replacement feasible. Renaming them afterwards is going to be > easier. My 2 cents on the naming: I had `Status::FromFormatv` in a previous > iteration of this patch and changed my mind, because it doesn't indicate that > this is going to be an error. What do you think about the slightly shorter > `Status::ErrorFromFromatv()`? I think it's better (because its shorter), though it's still a bit long for my taste. FWIW, I don't find the short name at all confusing, because my mental model of these error types is (and all types I know behave that way) is that they have only one "success" value, and everything else represents some kind of an error/failure. So, when I see anything more complicated than `Status()` (or whatever the final API is for constructing success values), I immediately expect an error status. I guess one difference is that I'm used to working with `absl::Status` (which I believe was zturner's inspiration for this name), so "status" feels just like a synonym for "error" to me. If you think having the word "error" in the name helps, I'd also be fine with `Status::Errorf` and `Status::Errorv` (for printf vs formatv styles) or something like that. https://github.com/llvm/llvm-project/pull/106163 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From 7b8b8b699902d2365ea43e5ae015546c4d20fac8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH 1/2] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 30 ++- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 136 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..72acfb3d600701 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/fin
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath ready_for_review https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From 7b8b8b699902d2365ea43e5ae015546c4d20fac8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 30 ++- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 136 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..72acfb3d600701 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/find/Te
[Lldb-commits] [lldb] [lldb] Fix and speedup the `memory find` command (PR #104193)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][WIP] memory find speedup+bugfix (PR #104193)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104193 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][WIP] memory find speedup+bugfix (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From b46820a91f1ad8d6db46210e3135abdb1ab475e8 Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH] [lldb] Fix and speedup the `memory find` command This patch fixes an issue where the `memory find` command would effectively stop searching after encountering a memory read error (which could happen due to unreadable memory), without giving any indication that it has done so (it would just print it could not find the pattern). To make matters worse, it would not terminate after encountering this error, but rather proceed to slowly increment the address pointer, which meant that searching a large region could take a very long time (and give the appearance that lldb is actually searching for the thing). The patch fixes this first problem (*) by detecting read errors and skipping over (using GetMemoryRegionInfo) the unreadable parts of memory and resuming the search after them. It also reads the memory in bulk (up to 1MB), which speeds up the search significantly (up to 6x for live processes, 18x for core files). (*) The fix does not work on windows yet, because the ReadMemory API does not return partial results (like it does for other systems). I'm preparing a separate patch to deal with that. --- lldb/source/Target/Process.cpp| 68 +++ .../memory/find/TestMemoryFind.py | 25 +- .../API/functionalities/memory/find/main.cpp | 83 +-- 3 files changed, 131 insertions(+), 45 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index b2a0f13b9a1549..5d066d264bd3f5 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3367,21 +3340,48 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, if (region_size < size) return LLDB_INVALID_ADDRESS; + // See "Boyer-Moore string search algorithm". std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + // Memory we're currently searching through. + llvm::SmallVector mem; + // Position of the memory buffer. + addr_t mem_pos = low; + // Maximum number of bytes read (and buffered). We need to read at least + // `size` bytes for a successful match. + const size_t max_read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size > mem.size() + mem_pos) { + // We need to read more data. We don't attempt to reuse the data we've + // already read (up to `size-1` bytes from `s` to `mem_pos+mem.size()`). + // This is fine for patterns much smaller than max_read_size. For very + // long patterns we may need to do something more elaborate. + mem.resize_for_overwrite(max_read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail() || size > mem.size()) { +// We didn't read enough data. Skip to the next memory region. +MemoryRegionInfo info; +error = GetMemoryRegionInfo(mem_pos + mem.size(), info); +if (error.Fail()) + break; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; // We have a match. +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..88fbd0c192a58e 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/find/Tes
[Lldb-commits] [lldb] [lldb] Updated TCPSocket to listen multiple ports on the same single thread (PR #104797)
labath wrote: Having a single socket listen on multiple ports sounds like a bad idea to me. The socket class is complicated enough as it is, and it's not the way the MainLoop class was meant to be used (the idea being for it to be created at the topmost level possible, so that any number of users could register their handlers). I think that a better and more versatile API would be something like: ``` std::vector /*or sth like that*/ TCPSocket::Accept(MainLoop &mainloop, std::function accepted_socket)> /*or similar*/ socket_cb); ``` Then we could create two sockets and have them listen on the same main loop instance. It should be possible to reimplement the existing TCPSocket::Accept function on top of this API. https://github.com/llvm/llvm-project/pull/104797 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Turn lldb_private::Status into a value type. (PR #106163)
labath wrote: I like this. I have just two remarks: - it might be better to split this into three steps (add new APIs, port to new APIs, remove old APIs), as that will make reverts easier/less disruptive (I don't know how much we can trust pre-commit CI these days, but I wouldn't be surprised if this breaks some platform-specific code). - since this seems like a perfect opportunity to bikesh^Wdiscuss the names, I'm going to ask if there's any appetite for shortening some of the new factory functions. `Status::FromErrorStringWithFormatv` is a bit of a mouthful, so I was thinking if we could use something shorter instead (`Status::FromFormatv` or even `Status::Formatv`) ? https://github.com/llvm/llvm-project/pull/106163 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add 'FindInMemory()' overload for PostMortemProcess. (PR #102536)
labath wrote: > Let me know if you guys don't want this patch in. I will closed it and apply > it to our local branch. @jasonmolenda @labath Without claiming official authority to make this decision, I'm going to say that *I* don't think this is a good idea, because the patch is very (and IMO, unnecessarily) specific to post mortem processes. My (currently WIP) patch (#104193) gets you most of the speed benefits, works with live processes as well and it does that without introducing any new APIs. What you do downstream is up to you, but I would encourage you to wait for me to finish up that patch, as this doesn't look like something that would be a good idea to carry downstream (I certainly wouldn't want to do it). https://github.com/llvm/llvm-project/pull/102536 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] unique_ptr-ify some GetUserExpression APIs. (PR #106034)
https://github.com/labath approved this pull request. https://github.com/llvm/llvm-project/pull/106034 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,120 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clamg-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir.as_posix()) +return llvm_tools_dir + "/" + util_name + exe_ext labath wrote: ```suggestion return os.path.join(llvm_tools_dir, util_name + exe_ext) ``` https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -102,15 +102,33 @@ endif # If you change the defaults of CC, be sure to also change it in the file # test/builders/builder_base.py, which provides a Python way to return the # value of the make variable CC -- getCompiler(). -# -# See also these functions: -# o cxx_compiler -# o cxx_linker #-- ifeq "$(CC)" "" $(error "C compiler is not specified. Please run tests through lldb-dotest or lit") endif +# Remove all " and ' characters from the path. Also remove surrounding space chars. +cstrip = $(strip $(subst ",,$(subst ',,$(1 +override CC := $(call cstrip,$(CC)) +override CCC := $(call cstrip,$(CCC)) +override CXX := $(call cstrip,$(CXX)) +# Always override the linker. Assign already normalized CC. +override LD := $(call cstrip,$(CC)) +# A kind of linker. It always gets retrieved from CC. +override LDC := $(call cstrip,$(CCC)) +override OBJCOPY := $(call cstrip,$(OBJCOPY)) +override STRIP := $(call cstrip,$(STRIP)) +override ARCHIVER := $(call cstrip,$(ARCHIVER)) +override AR := $(call cstrip,$(AR)) +override DWP := $(call cstrip,$(DWP)) +override LLVM_AR := $(call cstrip,$(LLVM_AR)) labath wrote: Is this really necessary? Could we get rid of it by *not* passing the extra quotes in the python cmd? https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,120 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clamg-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir.as_posix()) +return llvm_tools_dir + "/" + util_name + exe_ext + +def getToolchainUtil(util_name): +return (cc_dir / (cc_prefix + util_name + cc_ext)).as_posix() + +cxx = getToolchainUtil(cxx_type) + +util_names = { +"OBJCOPY": "objcopy", +"STRIP": "objcopy", +"ARCHIVER": "ar", +"DWP": "dwp", +} +utils = [] + +# Note: LLVM_AR is currently required by API TestBSDArchives.py tests. +# Assembly a full path to llvm-ar for given LLVM_TOOLS_DIR; +# otherwise assume we have llvm-ar at the same place where is CC specified. +if not os.getenv("LLVM_AR"): +llvm_ar = getLlvmUtil("llvm-ar") +utils.extend(['LLVM_AR="%s"' % llvm_ar]) + +if not lldbplatformutil.platformIsDarwin(): +if os.getenv("USE_LLVM_TOOLS"): +# Use the llvm project tool instead of the system defaults. +for var, name in util_names.items(): +utils.append('%s="%s"' % (var, getLlvmUtil("llvm-" + name))) +utils.extend(['AR="%s"' % llvm_ar]) +elif cc_type in ["clang", "cc", "gcc"]: +util_paths = {} +# Assembly a toolchain side tool cmd based on passed CC. +for var, name in util_names.items(): +# Do not override explicity specified tool from the cmd line. +if not os.getenv(var): +util_paths[var] = getToolchainUtil(name) +else: +util_paths[var] = os.getenv(var) +utils.extend(['AR="%s"' % util_paths["ARCHIVER"]]) + +# Look for llvm-dwp or gnu dwp +if not lldbutil.which(util_paths["DWP"]): +util_paths["DWP"] = getToolchainUtil("llvm-dwp") +if not lldbutil.which(util_paths["DWP"]): +util_paths["DWP"] = lldbutil.which("llvm-dwp") +if not util_paths["DWP"]: +util_paths["DWP"] = lldbutil.which("dwp") +if not util_paths["DWP"]: +del util_paths["DWP"] + +for var, path in util_paths.items(): +
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -213,7 +229,7 @@ endif LIMIT_DEBUG_INFO_FLAGS = NO_LIMIT_DEBUG_INFO_FLAGS = MODULE_DEBUG_INFO_FLAGS = -ifneq (,$(findstring clang,$(CC))) +ifeq ($(CCC), clang) labath wrote: What does CCC stand for? https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,120 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clamg-cl.exe'?) +# -[.] - such as 'armv7-linux-gnueabi-gcc' +# /[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' +# /-[.]- such as '/usr/bin/clang', 'c:\path\to\compiler\clang,exe' + +cc_ext = cc_path.suffix +# Compiler name without extension +cc_name = cc_path.stem.split(" ")[-1] + +# A kind of compiler (canonical name): clang, gcc, cc & etc. +cc_type = cc_name +# A triple prefix of compiler name: gcc +cc_prefix = "" +if not "clang-cl" in cc_name and not "llvm-gcc" in cc_name: +cc_name_parts = cc_name.split("-") +cc_type = cc_name_parts[-1] +if len(cc_name_parts) > 1: +cc_prefix = "-".join(cc_name_parts[:-1]) + "-" + +# A kind of C++ compiler. +cxx_types = { +"icc": "icpc", +"llvm-gcc": "llvm-g++", +"gcc": "g++", +"cc": "c++", +"clang": "clang++", +} +cxx_type = cxx_types.get(cc_type, cc_type) + +cc_dir = cc_path.parent + +def getLlvmUtil(util_name): +llvm_tools_dir = os.getenv("LLVM_TOOLS_DIR", cc_dir.as_posix()) +return llvm_tools_dir + "/" + util_name + exe_ext + +def getToolchainUtil(util_name): +return (cc_dir / (cc_prefix + util_name + cc_ext)).as_posix() + +cxx = getToolchainUtil(cxx_type) + +util_names = { +"OBJCOPY": "objcopy", +"STRIP": "objcopy", +"ARCHIVER": "ar", +"DWP": "dwp", +} +utils = [] + +# Note: LLVM_AR is currently required by API TestBSDArchives.py tests. +# Assembly a full path to llvm-ar for given LLVM_TOOLS_DIR; +# otherwise assume we have llvm-ar at the same place where is CC specified. +if not os.getenv("LLVM_AR"): +llvm_ar = getLlvmUtil("llvm-ar") +utils.extend(['LLVM_AR="%s"' % llvm_ar]) + +if not lldbplatformutil.platformIsDarwin(): +if os.getenv("USE_LLVM_TOOLS"): labath wrote: I don't see any reference to this var in the llvm tree. Who sets this? (and why) https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,120 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() labath wrote: What's up with the posix thing? Could we stick to native paths? https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][test] Improve toolchain detection in Makefile.rules (PR #102185)
@@ -96,16 +98,107 @@ def getArchSpec(self, architecture): """ return ["ARCH=" + architecture] if architecture else [] -def getCCSpec(self, compiler): +def getToolchainSpec(self, compiler): """ -Helper function to return the key-value string to specify the compiler +Helper function to return the key-value strings to specify the toolchain used for the make system. """ cc = compiler if compiler else None if not cc and configuration.compiler: cc = configuration.compiler + if cc: -return ['CC="%s"' % cc] +exe_ext = "" +if lldbplatformutil.getHostPlatform() == "windows": +exe_ext = ".exe" + +cc = cc.strip() +cc_path = pathlib.Path(cc) +cc = cc_path.as_posix() + +# We can get CC compiler string in the following formats: +# [] - such as 'xrun clang', 'xrun /usr/bin/clang' & etc +# +# Where could contain the following parts: +# [.] - sucn as 'clang', 'clang.exe' ('clamg-cl.exe'?) labath wrote: ```suggestion # [.] - sucn as 'clang', 'clang.exe' ('clang-cl.exe'?) ``` https://github.com/llvm/llvm-project/pull/102185 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Fix windows debug build after 9d07f43 (PR #104896)
labath wrote: Thanks. https://github.com/llvm/llvm-project/pull/104896 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][NFC] Moved the SharedSocket class to Socket.* (PR #104787)
https://github.com/labath approved this pull request. https://github.com/llvm/llvm-project/pull/104787 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][dotest] Add an arg to add DLL search paths. (PR #104514)
labath wrote: I think that having fewer knobs that one needs to remember to set to correct values is better. Sometimes that's unavoidable, but I wanted to make sure this is one of those cases. Embedding this information into the swig bindings sounds like a good idea. https://github.com/llvm/llvm-project/pull/104514 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -666,7 +756,23 @@ ConnectionStatus ConnectionFileDescriptor::ConnectFD(llvm::StringRef s, socket_id_callback_type socket_id_callback, Status *error_ptr) { -#if LLDB_ENABLE_POSIX +#ifdef _WIN32 + int64_t fd = -1; + if (!s.getAsInteger(0, fd)) { +// Assume we own fd. +std::unique_ptr tcp_socket = +std::make_unique((NativeSocket)fd, true, false); +m_io_sp = std::move(tcp_socket); +m_uri = s.str(); +return eConnectionStatusSuccess; + } labath wrote: I do. I don't think it's a good state if half of our "fd" refer to file descriptors and others to handles. (In reality, it's not half/half, but more like 90/10, and this is the minority case/) https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -20,6 +20,10 @@ #include #include +#ifdef SendMessage +#undef SendMessage +#endif + labath wrote: If it's just due to header moves that's fine. It's definitely possible to isolate the windows headers. Lllvm does that by making sure it only includes the system headers in implementation files (of the library wrapping them). It'd be nice to have lldb do the same thing, but we're pretty far off. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -601,15 +570,24 @@ int main_platform(int argc, char *argv[]) { // If not running as a server, this process will not accept // connections while a connection is active. acceptor_up.reset(); - - // When not running in server mode, use all available ports - platform.SetPortMap(std::move(gdbserver_portmap)); } platform.SetConnection(std::unique_ptr(conn)); client_handle(platform, inferior_arguments); } while (g_server); + // FIXME: Make TCPSocket::CloseListenSockets() public and implement + // Acceptor::Close(). + /* + if (acceptor_gdb && gdb_thread.IsJoinable()) { +g_listen_gdb = false; +static_cast(acceptor_gdb->m_listener_socket_up.get()) +->CloseListenSockets(); +lldb::thread_result_t result; +gdb_thread.Join(&result); + } + */ + labath wrote: It's not just linux, that kind of pattern is racy on all posix operating systems. It's possible that other OSs will not hang in this situation, but that doesn't mean that usage is safe. Even on windows, that can cause you to call (WSA)accept on a dangling socket descriptor -- since you can never be sure whether the other thread has already called the function, or is it merely about to do that. It's less likely to blow up in your face because windows does not recycle FDs like posix systems do, but I wouldn't recommend that anywhere. Exiting the main thread may be convenient, but I definitely wouldn't call it "safe", as that means the other thread may be running, executing random code and memory, while the main thread is shutting everything down and deleting that memory. It may not matter often, as the process is done anyway, but if anyone was observing the exit code, he may be suprised to see a SEGV instead of a clean exit. Reasoning about a singlethreaded application is much easier than for a multithreaded one. We should already have all of the pieces to do this the right way, so let's just do that. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -47,111 +48,10 @@ using namespace llvm; // option descriptors for getopt_long_only() -#ifdef _WIN32 -typedef pipe_t shared_fd_t; -const shared_fd_t kInvalidSharedFD = LLDB_INVALID_PIPE; -#else -typedef NativeSocket shared_fd_t; -const shared_fd_t kInvalidSharedFD = Socket::kInvalidSocketValue; -#endif - -class SharedSocket { -public: - SharedSocket(Connection *conn, Status &error) { -m_fd = kInvalidSharedFD; - -const Socket *socket = -static_cast(conn->GetReadObject().get()); -if (socket == nullptr) { - error = Status("invalid conn socket"); - return; -} - -#ifdef _WIN32 -m_socket = socket->GetNativeSocket(); - -// Create a pipe to transfer WSAPROTOCOL_INFO to the child process. -error = m_socket_pipe.CreateNew(true); -if (error.Fail()) - return; - -m_fd = m_socket_pipe.GetReadPipe(); -#else -m_fd = socket->GetNativeSocket(); -error = Status(); -#endif - } - - shared_fd_t GetSendableFD() { return m_fd; } - - Status CompleteSending(lldb::pid_t child_pid) { -#ifdef _WIN32 -// Transfer WSAPROTOCOL_INFO to the child process. -m_socket_pipe.CloseReadFileDescriptor(); - -WSAPROTOCOL_INFO protocol_info; -if (::WSADuplicateSocket(m_socket, child_pid, &protocol_info) == -SOCKET_ERROR) { - int last_error = ::WSAGetLastError(); - return Status("WSADuplicateSocket() failed, error: %d", last_error); -} - -size_t num_bytes; -Status error = -m_socket_pipe.WriteWithTimeout(&protocol_info, sizeof(protocol_info), - std::chrono::seconds(10), num_bytes); -if (error.Fail()) - return error; -if (num_bytes != sizeof(protocol_info)) - return Status("WriteWithTimeout(WSAPROTOCOL_INFO) failed: %d bytes", -num_bytes); -#endif -return Status(); - } - - static Status GetNativeSocket(shared_fd_t fd, NativeSocket &socket) { -#ifdef _WIN32 -socket = Socket::kInvalidSocketValue; -// Read WSAPROTOCOL_INFO from the parent process and create NativeSocket. -WSAPROTOCOL_INFO protocol_info; -{ - Pipe socket_pipe(fd, LLDB_INVALID_PIPE); - size_t num_bytes; - Status error = - socket_pipe.ReadWithTimeout(&protocol_info, sizeof(protocol_info), - std::chrono::seconds(10), num_bytes); - if (error.Fail()) -return error; - if (num_bytes != sizeof(protocol_info)) { -return Status( -"socket_pipe.ReadWithTimeout(WSAPROTOCOL_INFO) failed: % d bytes", -num_bytes); - } -} -socket = ::WSASocket(FROM_PROTOCOL_INFO, FROM_PROTOCOL_INFO, - FROM_PROTOCOL_INFO, &protocol_info, 0, 0); -if (socket == INVALID_SOCKET) { - return Status("WSASocket(FROM_PROTOCOL_INFO) failed: error %d", -::WSAGetLastError()); -} -return Status(); -#else -socket = fd; -return Status(); -#endif - } - -private: -#ifdef _WIN32 - Pipe m_socket_pipe; - NativeSocket m_socket; -#endif - shared_fd_t m_fd; -}; - static int g_debug = 0; static int g_verbose = 0; static int g_server = 0; +static volatile bool g_listen_gdb = true; labath wrote: yeah, but volatile still isn't the right choice for that. It would have to be an atomic var. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][NFC] Moved FindSchemeByProtocol() from Acceptor to Socket (PR #104439)
https://github.com/labath approved this pull request. https://github.com/llvm/llvm-project/pull/104439 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
https://github.com/labath approved this pull request. https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb][dotest] Add an arg to add DLL search paths. (PR #104514)
labath wrote: Can those swift libraries be found automatically? For example if they're at some known path relative to lldb or the swift compiler? https://github.com/llvm/llvm-project/pull/104514 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. (PR #101086)
@@ -457,34 +456,24 @@ MinidumpParser::FindMemoryRange(lldb::addr_t addr) { } } - // Some Minidumps have a Memory64ListStream that captures all the heap memory - // (full-memory Minidumps). We can't exactly use the same loop as above, - // because the Minidump uses slightly different data structures to describe - // those - - if (!data64.empty()) { -llvm::ArrayRef memory64_list; -uint64_t base_rva; -std::tie(memory64_list, base_rva) = -MinidumpMemoryDescriptor64::ParseMemory64List(data64); - -if (memory64_list.empty()) - return std::nullopt; - -for (const auto &memory_desc64 : memory64_list) { - const lldb::addr_t range_start = memory_desc64.start_of_memory_range; - const size_t range_size = memory_desc64.data_size; - - if (base_rva + range_size > GetData().size()) -return std::nullopt; - - if (range_start <= addr && addr < range_start + range_size) { -return minidump::Range(range_start, - GetData().slice(base_rva, range_size)); + if (!GetStream(StreamType::Memory64List).empty()) { +llvm::Error err = llvm::Error::success(); +for (const auto &memory_desc : GetMinidumpFile().getMemory64List(err)) { + // Explicit error check so we can return from within labath wrote: Are you sure that's necessary? If I read [this](https://github.com/llvm/llvm-project/blob/4c77cc634d49782aceff77f7ec4e6183ec223020/llvm/include/llvm/ADT/fallible_iterator.h#L66) right, it shouldn't be. https://github.com/llvm/llvm-project/pull/101086 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. (PR #101086)
@@ -553,53 +547,49 @@ static bool CreateRegionsCacheFromMemoryList(MinidumpParser &parser, std::vector ®ions) { Log *log = GetLog(LLDBLog::Modules); + // Cache the expected memory32 into an optional + // because double checking the expected triggers the unchecked warning. + std::optional> memory32_list; auto ExpectedMemory = parser.GetMinidumpFile().getMemoryList(); if (!ExpectedMemory) { LLDB_LOG_ERROR(log, ExpectedMemory.takeError(), "Failed to read memory list: {0}"); labath wrote: I don't get the double-checked comment. Something like ``` if (auto ExpectedMemory = parser.GetMinidumpFile().getMemoryList(); !ExpectedMemory) { LLDB_LOG_ERROR(log, ExpectedMemory.takeError(), "Failed to read memory list: {0}"); } else { for (const MemoryDescriptor &memory_desc : *ExpectedMemory) { ... ``` ought to work just fine (other ways to structure this are also possible, but I don't know if you can do an early exit or whatever...). Once you check the success state of the expected object and find that it succeeded, the check requirement is satisfied. No further checking is needed. That would be different if you tried to copy the value into some other (expected) object, as then you would need to check that one as well, but the solution to that is simple -- just don't do that. :) https://github.com/llvm/llvm-project/pull/101086 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. (PR #101086)
https://github.com/labath commented: A couple of small things, but I think it's fine overall. I'm going to be OOO next week, no need to wait for me. https://github.com/llvm/llvm-project/pull/101086 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. (PR #101086)
@@ -457,34 +456,24 @@ MinidumpParser::FindMemoryRange(lldb::addr_t addr) { } } - // Some Minidumps have a Memory64ListStream that captures all the heap memory - // (full-memory Minidumps). We can't exactly use the same loop as above, - // because the Minidump uses slightly different data structures to describe - // those - - if (!data64.empty()) { -llvm::ArrayRef memory64_list; -uint64_t base_rva; -std::tie(memory64_list, base_rva) = -MinidumpMemoryDescriptor64::ParseMemory64List(data64); - -if (memory64_list.empty()) - return std::nullopt; - -for (const auto &memory_desc64 : memory64_list) { - const lldb::addr_t range_start = memory_desc64.start_of_memory_range; - const size_t range_size = memory_desc64.data_size; - - if (base_rva + range_size > GetData().size()) -return std::nullopt; - - if (range_start <= addr && addr < range_start + range_size) { -return minidump::Range(range_start, - GetData().slice(base_rva, range_size)); + if (!GetStream(StreamType::Memory64List).empty()) { +llvm::Error err = llvm::Error::success(); +for (const auto &memory_desc : GetMinidumpFile().getMemory64List(err)) { + // Explicit error check so we can return from within + if (memory_desc.first.StartOfMemoryRange <= addr + && addr < memory_desc.first.StartOfMemoryRange + memory_desc.first.DataSize + && !err) { +return minidump::Range(memory_desc.first.StartOfMemoryRange, memory_desc.second); } - base_rva += range_size; } + +if (err) + // Without std::move(err) fails with + // error: call to deleted constructor of '::llvm::Error' labath wrote: That's just the general property of llvm::Error objects (that they can't be copied) and the LLDB_LOG_ERROR macro (that it consumes the error). I don't think it needs a special comment. https://github.com/llvm/llvm-project/pull/101086 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [LLDB][Minidump] Fix ProcessMinidump::GetMemoryRegions to include 64b regions when /proc/pid maps are missing. (PR #101086)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/101086 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] Disable ThreadPlanSingleThreadTimeout during step over breakpoint (PR #104532)
labath wrote: Thanks for looking into this. I'll also defer to Jim, but I'll note two things: - stepping over a breakpoint can still block -- if the breakpoint is on a syscall instruction, and the syscall blocks. However, problem with missed breakpoints is also real, and probably more important than blocked steps. As I don't see a way to resume other threads while not risking them missing breakpoints, this may still be the right thing to do. - I was amused by the hedging in "potential race condition". The race is real, not potential. The fact that you win the race most of the time doesn't mean it doesn't exist. :P https://github.com/llvm/llvm-project/pull/104532 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] Reland "[lldb] Reland 2402b3213c2f with `/H` to debug the windows build issue (PR #101672)
labath wrote: Okay, that sort of makes sense. However, unless you actually want to be able to disable each of these plugins independently (which it sounds like you don't), then this is a very.. baroque way to achieve the desired effect (to have multiple (plugin) classes registered with the plugin manager). All of that cmake plugin logic is just a very elaborate way to invoke `SomeClass::Initialize()`. Plugins are registered through `PluginManager::RegisterPlugin`. While the usual case is that the `Initialize` function performs a single `RegisterPlugin` call (and pretty much nothing else), nothing actually depends or enforces that, and we do have "plugins" where a single plugin library registers more than plugin class (e.g. `lldbPluginPlatformMacOSX`). I think it would be perfectly reasonable to register all of these classes from `ScriptInterpreterPythonInterfaces::Initialize` (and maybe even from `ScriptInterpreterPython::Initialize`). https://github.com/llvm/llvm-project/pull/101672 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [WIP] memory find speedup+bugfix (PR #104193)
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/104193 >From a537a48c444e9dec3a85241d9726d6f3187a43cf Mon Sep 17 00:00:00 2001 From: Pavel Labath Date: Wed, 14 Aug 2024 19:58:27 +0200 Subject: [PATCH] [WIP] memory find speedup+bugfix --- lldb/source/Target/Process.cpp| 58 ++--- .../memory/find/TestMemoryFind.py | 10 +++ .../API/functionalities/memory/find/main.cpp | 82 +-- 3 files changed, 108 insertions(+), 42 deletions(-) diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp index e3c4f2ee398cc4..4e3d7651a066ec 100644 --- a/lldb/source/Target/Process.cpp +++ b/lldb/source/Target/Process.cpp @@ -114,33 +114,6 @@ class ProcessOptionValueProperties } }; -class ProcessMemoryIterator { -public: - ProcessMemoryIterator(Process &process, lldb::addr_t base) - : m_process(process), m_base_addr(base) {} - - bool IsValid() { return m_is_valid; } - - uint8_t operator[](lldb::addr_t offset) { -if (!IsValid()) - return 0; - -uint8_t retval = 0; -Status error; -if (0 == m_process.ReadMemory(m_base_addr + offset, &retval, 1, error)) { - m_is_valid = false; - return 0; -} - -return retval; - } - -private: - Process &m_process; - const lldb::addr_t m_base_addr; - bool m_is_valid = true; -}; - static constexpr OptionEnumValueElement g_follow_fork_mode_values[] = { { eFollowParent, @@ -3368,20 +3341,37 @@ lldb::addr_t Process::FindInMemory(lldb::addr_t low, lldb::addr_t high, return LLDB_INVALID_ADDRESS; std::vector bad_char_heuristic(256, size); - ProcessMemoryIterator iterator(*this, low); - for (size_t idx = 0; idx < size - 1; idx++) { decltype(bad_char_heuristic)::size_type bcu_idx = buf[idx]; bad_char_heuristic[bcu_idx] = size - idx - 1; } - for (size_t s = 0; s <= (region_size - size);) { + + llvm::SmallVector mem; + addr_t mem_pos = low; + const size_t read_size = std::max(size, 0x1); + + for (addr_t s = low; s <= (high - size);) { +if (s + size >= mem.size() + mem_pos) { + mem.resize_for_overwrite(read_size); + Status error; + mem.resize( + ReadMemory(s, mem.data(), std::min(mem.size(), high - s), error)); + mem_pos = s; + if (error.Fail()) { +MemoryRegionInfo info; +error = GetMemoryRegionInfo(s, info); +if (error.Fail()) + return LLDB_INVALID_ADDRESS; +s = info.GetRange().GetRangeEnd(); +continue; + } +} int64_t j = size - 1; -while (j >= 0 && buf[j] == iterator[s + j]) +while (j >= 0 && buf[j] == mem[s + j - mem_pos]) j--; if (j < 0) - return low + s; -else - s += bad_char_heuristic[iterator[s + size - 1]]; + return s; +s += bad_char_heuristic[mem[s + size - 1 - mem_pos]]; } return LLDB_INVALID_ADDRESS; diff --git a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py index 09611cc808777d..a606899498b722 100644 --- a/lldb/test/API/functionalities/memory/find/TestMemoryFind.py +++ b/lldb/test/API/functionalities/memory/find/TestMemoryFind.py @@ -79,3 +79,13 @@ def test_memory_find(self): 'memory find -s "nothere" `stringdata` `stringdata+10`', substrs=["data not found within the range."], ) + +pagesize = self.frame().FindVariable("pagesize").GetValueAsUnsigned() +mem_with_holes = self.frame().FindVariable("mem_with_holes").GetValueAsUnsigned() +matches_var = self.frame().FindVariable("matches") +self.assertEqual(matches_var.GetNumChildren(), 4) +matches = [f'data found at location: {matches_var.GetChildAtIndex(i).GetValueAsUnsigned():#x}' for i in range(4)] +self.expect( +'memory find -c 5 -s "needle" `mem_with_holes` `mem_with_holes+5*pagesize`', +substrs=matches + ["no more matches within the range"], +) diff --git a/lldb/test/API/functionalities/memory/find/main.cpp b/lldb/test/API/functionalities/memory/find/main.cpp index e3dcfc762ee0f9..e5525e3ca1f73f 100644 --- a/lldb/test/API/functionalities/memory/find/main.cpp +++ b/lldb/test/API/functionalities/memory/find/main.cpp @@ -1,9 +1,75 @@ -#include -#include - -int main (int argc, char const *argv[]) -{ -const char* stringdata = "hello world; I like to write text in const char pointers"; -uint8_t bytedata[] = {0xAA,0xBB,0xCC,0xDD,0xEE,0xFF,0x00,0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99}; -return 0; // break here +#include +#include +#include +#include +#include + +#ifdef _WIN32 +#include "Windows.h" + +int getpagesize() { + SYSTEM_INFO system_info; + GetSystemInfo(&system_info); + return system_info.dwPageSize; +} + +void *allocate_memory_with_holes() { + int pagesize = getpagesize(); + void *mem = VirtualAlloc(nullptr, 5 * pagesize, MEM_RESERVE, PAGE_NOACCESS); + if (!mem) { +
[Lldb-commits] [lldb] [lldb] Add 'FindInMemory()' overload for PostMortemProcess. (PR #102536)
labath wrote: Thanks for chiming in Jason. I'm glad this is getting attention, as I think that our memory reading APIs are in need of an overhaul -- and it gives me a chance to share my idea. ;) I don't think that we need to change all of the APIs to return their memory. Sometimes reading the memory into a caller-provided buffer makes perfect sense. What I would like to do is to regularize all the APIs, and avoid the need to implement the logic of e.g. "reading a c string twice". The approach I'd like is to define a `MemoryReader` interface, which would contain all of the ways one may want to read a memory (as a string, integer, etc..). It would also contain a default implementation of the APIs so that an implementation would only need to implement one or two methods and have everything fall out naturally. Then, anything that wants to support that interface, can just inherit from it and implement the required method. If something wants to provide two different methods for reading the memory (I believe Process supports cached and uncached reads), it can just do two implementations, and provide them via different accessors (e.g. `process->CachedMemory()->ReadInt(addr)` and process->DirectMemory()->ReadInt(addr)`). Another nice part about this is that if we have some code, which only needs to read memory, then it can take a MemoryReader reference (instead of Process or whatever), and then it can be unit tested by just passing it a mock memory reader. https://github.com/llvm/llvm-project/pull/102536 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Realpath symlinks for breakpoints (PR #102223)
https://github.com/labath commented: The deps seem fine. I like the approach you've chosen. I'll leave the rest to other reviewers. https://github.com/llvm/llvm-project/pull/102223 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] Reland "[lldb] Reland 2402b3213c2f with `/H` to debug the windows build issue (PR #101672)
labath wrote: `lldbPluginScriptInterpreterPythonScriptedThreadPlanPythonInterface` quite a mouthful for just a single cpp file. Maybe all of the interfaces could be a part of a single library? That would end up with something like: ``` D:\build-lldb\tools\lldb\source\Plugins\ScriptInterpreter\Python\Interfaces\CMakeFiles\lldbPluginScriptInterpreterPythonInterfaces.dir\lldbPluginScriptInterpreterPythonInterfaces.pdb ``` .. which is 80 chars shorter? https://github.com/llvm/llvm-project/pull/101672 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -47,111 +48,10 @@ using namespace llvm; // option descriptors for getopt_long_only() -#ifdef _WIN32 -typedef pipe_t shared_fd_t; -const shared_fd_t kInvalidSharedFD = LLDB_INVALID_PIPE; -#else -typedef NativeSocket shared_fd_t; -const shared_fd_t kInvalidSharedFD = Socket::kInvalidSocketValue; -#endif - -class SharedSocket { -public: - SharedSocket(Connection *conn, Status &error) { -m_fd = kInvalidSharedFD; - -const Socket *socket = -static_cast(conn->GetReadObject().get()); -if (socket == nullptr) { - error = Status("invalid conn socket"); - return; -} - -#ifdef _WIN32 -m_socket = socket->GetNativeSocket(); - -// Create a pipe to transfer WSAPROTOCOL_INFO to the child process. -error = m_socket_pipe.CreateNew(true); -if (error.Fail()) - return; - -m_fd = m_socket_pipe.GetReadPipe(); -#else -m_fd = socket->GetNativeSocket(); -error = Status(); -#endif - } - - shared_fd_t GetSendableFD() { return m_fd; } - - Status CompleteSending(lldb::pid_t child_pid) { -#ifdef _WIN32 -// Transfer WSAPROTOCOL_INFO to the child process. -m_socket_pipe.CloseReadFileDescriptor(); - -WSAPROTOCOL_INFO protocol_info; -if (::WSADuplicateSocket(m_socket, child_pid, &protocol_info) == -SOCKET_ERROR) { - int last_error = ::WSAGetLastError(); - return Status("WSADuplicateSocket() failed, error: %d", last_error); -} - -size_t num_bytes; -Status error = -m_socket_pipe.WriteWithTimeout(&protocol_info, sizeof(protocol_info), - std::chrono::seconds(10), num_bytes); -if (error.Fail()) - return error; -if (num_bytes != sizeof(protocol_info)) - return Status("WriteWithTimeout(WSAPROTOCOL_INFO) failed: %d bytes", -num_bytes); -#endif -return Status(); - } - - static Status GetNativeSocket(shared_fd_t fd, NativeSocket &socket) { -#ifdef _WIN32 -socket = Socket::kInvalidSocketValue; -// Read WSAPROTOCOL_INFO from the parent process and create NativeSocket. -WSAPROTOCOL_INFO protocol_info; -{ - Pipe socket_pipe(fd, LLDB_INVALID_PIPE); - size_t num_bytes; - Status error = - socket_pipe.ReadWithTimeout(&protocol_info, sizeof(protocol_info), - std::chrono::seconds(10), num_bytes); - if (error.Fail()) -return error; - if (num_bytes != sizeof(protocol_info)) { -return Status( -"socket_pipe.ReadWithTimeout(WSAPROTOCOL_INFO) failed: % d bytes", -num_bytes); - } -} -socket = ::WSASocket(FROM_PROTOCOL_INFO, FROM_PROTOCOL_INFO, - FROM_PROTOCOL_INFO, &protocol_info, 0, 0); -if (socket == INVALID_SOCKET) { - return Status("WSASocket(FROM_PROTOCOL_INFO) failed: error %d", -::WSAGetLastError()); -} -return Status(); -#else -socket = fd; -return Status(); -#endif - } - -private: -#ifdef _WIN32 - Pipe m_socket_pipe; - NativeSocket m_socket; -#endif - shared_fd_t m_fd; -}; - static int g_debug = 0; static int g_verbose = 0; static int g_server = 0; +static volatile bool g_listen_gdb = true; labath wrote: volatile? really? https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -879,20 +879,12 @@ lldb::thread_result_t GDBRemoteCommunication::ListenThread() { return {}; } -Status GDBRemoteCommunication::StartDebugserverProcess( -const char *url, Platform *platform, ProcessLaunchInfo &launch_info, -uint16_t *port, const Args *inferior_args, int pass_comm_fd) { +bool GDBRemoteCommunication::GetDebugserverPath( +Platform *platform, FileSpec &debugserver_file_spec) { labath wrote: ```suggestion FileSpec GDBRemoteCommunication::GetDebugserverPath( Platform *platform) { ``` just return an invalid FileSpec for error. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -666,7 +756,23 @@ ConnectionStatus ConnectionFileDescriptor::ConnectFD(llvm::StringRef s, socket_id_callback_type socket_id_callback, Status *error_ptr) { -#if LLDB_ENABLE_POSIX +#ifdef _WIN32 + int64_t fd = -1; + if (!s.getAsInteger(0, fd)) { +// Assume we own fd. +std::unique_ptr tcp_socket = +std::make_unique((NativeSocket)fd, true, false); +m_io_sp = std::move(tcp_socket); +m_uri = s.str(); +return eConnectionStatusSuccess; + } labath wrote: I don't exactly have an alternative in mind yet, but I'm not sure this is a good idea. File descriptors are a thing on windows as well (though, IIUC, they're more of like an in-process thing rather than a kernel entity), and we do have code which treats them as such, so I think it could be confusing that this functions considers an "fd" to be a socket handle. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -40,83 +40,20 @@ #include "lldb/Utility/StringExtractorGDBRemote.h" +#include "../tools/lldb-server/Acceptor.h" labath wrote: We will need to move this someplace else (probably `Host`) https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -601,15 +570,24 @@ int main_platform(int argc, char *argv[]) { // If not running as a server, this process will not accept // connections while a connection is active. acceptor_up.reset(); - - // When not running in server mode, use all available ports - platform.SetPortMap(std::move(gdbserver_portmap)); } platform.SetConnection(std::unique_ptr(conn)); client_handle(platform, inferior_arguments); } while (g_server); + // FIXME: Make TCPSocket::CloseListenSockets() public and implement + // Acceptor::Close(). + /* + if (acceptor_gdb && gdb_thread.IsJoinable()) { +g_listen_gdb = false; +static_cast(acceptor_gdb->m_listener_socket_up.get()) +->CloseListenSockets(); +lldb::thread_result_t result; +gdb_thread.Join(&result); + } + */ + labath wrote: This won't work. This is exactly the pattern from that kernel bug we were arguing over. It's just not possible to safely close an fd while another thread may be operating on it. You need to use an out-of-band mechanism (e.g. a pipe) to notify the thread that it's time to terminate. However, since that will involve some form of multiplexing (e.g. `select(2)`). I would strongly prefer to just multiplex over the two sockets we are accepting. We should be able to use the `MainLoop` class for that -- just register the two sockets with the class, and let it call the right callback when needed. Basically, the code should look similar to [this](https://github.com/llvm/llvm-project/blob/f71b63865140cf3c286baf3a77ba3e467f929504/lldb/source/Host/common/TCPSocket.cpp#L267), except that it will be listening to two unrelated sockets. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
@@ -20,6 +20,10 @@ #include #include +#ifdef SendMessage +#undef SendMessage +#endif + labath wrote: I guess that means we're including a windows header somewhere. Any chance to avoid that? Llvm is pretty strict about that. Lldb is not, though I think it's a good practice. https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Removed gdbserver ports map from lldb-server (PR #104238)
https://github.com/labath commented: The termination part definitely needs to be done differently, but overall, I think this is a pretty good start. (PSA: I will be OOO all of next week.) https://github.com/llvm/llvm-project/pull/104238 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] Reapply "[lldb] Tolerate multiple compile units with the same DWO ID (#100577)" (PR #104041)
https://github.com/labath closed https://github.com/llvm/llvm-project/pull/104041 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] Fix single thread stepping timeout race condition (PR #104195)
https://github.com/labath approved this pull request. This indeed fixes the race condition. I can confirm that the test no longer hangs after with this patch. I'm glad we were able to figure this out. *However*, this doesn't appear to be the only problem here. With the hang out of the way, I can now see another failure mode, which appears with a lower (say ~0.1%) frequency. Judging by the [logs](https://paste.debian.net/hidden/30235a5c/), the problem is triggered when the interrupt packet (^C) arrives too soon -- before the inferior has had a chance to execute anything. I haven't looked at the code, but my guess would be that the step-over thread plan erroneously decides that it's done (because it's back at the function it wanted to be in, but it does not realize that's because the PC hasn't been moved). If you want to reproduce this yourself, I've found that inserting a 10ms sleep [here](https://github.com/llvm/llvm-project/blob/7898866065f6c9b72b5fa3e45e565baf8a5e7609/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp#L1005) makes the failure reproduce in about 50% of cases (as it increases the likelyhood that the interrupt will arrive "too soon" -- you could probably achieve the same thing by decreasing the single thread timeout) https://github.com/llvm/llvm-project/pull/104195 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
@@ -0,0 +1,162 @@ +//===-- ThreadElfCoreTest.cpp *- C++ +//-*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +#include "Plugins/Process/elf-core/ThreadElfCore.h" +#include "Plugins/Platform/Linux/PlatformLinux.h" +#include "TestingSupport/TestUtilities.h" +#include "lldb/Core/Debugger.h" +#include "lldb/Host/FileSystem.h" +#include "lldb/Host/HostInfo.h" +#include "lldb/Target/Process.h" +#include "lldb/Target/Target.h" +#include "lldb/Target/Thread.h" +#include "lldb/Utility/Listener.h" +#include "llvm/TargetParser/Triple.h" +#include "gtest/gtest.h" + +#include +#include +#include +#include + +using namespace lldb_private; + +namespace { + +struct ElfCoreTest : public testing::Test { + static void SetUpTestCase() { +FileSystem::Initialize(); +HostInfo::Initialize(); +platform_linux::PlatformLinux::Initialize(); +std::call_once(TestUtilities::g_debugger_initialize_flag, + []() { Debugger::Initialize(nullptr); }); + } + static void TearDownTestCase() { +platform_linux::PlatformLinux::Terminate(); +HostInfo::Terminate(); +FileSystem::Terminate(); + } +}; + +struct DummyProcess : public Process { + DummyProcess(lldb::TargetSP target_sp, lldb::ListenerSP listener_sp) + : Process(target_sp, listener_sp) { +SetID(getpid()); + } + + bool CanDebug(lldb::TargetSP target, bool plugin_specified_by_name) override { +return true; + } + Status DoDestroy() override { return {}; } + void RefreshStateAfterStop() override {} + size_t DoReadMemory(lldb::addr_t vm_addr, void *buf, size_t size, + Status &error) override { +return 0; + } + bool DoUpdateThreadList(ThreadList &old_thread_list, + ThreadList &new_thread_list) override { +return false; + } + llvm::StringRef GetPluginName() override { return "Dummy"; } +}; + +struct DummyThread : public Thread { + using Thread::Thread; + + ~DummyThread() override { DestroyThread(); } + + void RefreshStateAfterStop() override {} + + lldb::RegisterContextSP GetRegisterContext() override { return nullptr; } + + lldb::RegisterContextSP + CreateRegisterContextForFrame(StackFrame *frame) override { +return nullptr; + } + + bool CalculateStopInfo() override { return false; } +}; + +lldb::TargetSP CreateTarget(lldb::DebuggerSP &debugger_sp, ArchSpec &arch) { + lldb::PlatformSP platform_sp; + lldb::TargetSP target_sp; + debugger_sp->GetTargetList().CreateTarget( + *debugger_sp, "", arch, eLoadDependentsNo, platform_sp, target_sp); + return target_sp; +} + +lldb::ThreadSP CreateThread(lldb::ProcessSP &process_sp) { + lldb::ThreadSP thread_sp = + std::make_shared(*process_sp.get(), gettid()); + if (thread_sp == nullptr) { +return nullptr; + } + process_sp->GetThreadList().AddThread(thread_sp); + + return thread_sp; +} + +} // namespace + +TEST_F(ElfCoreTest, PopulatePrpsInfoTest) { + ArchSpec arch{HostInfo::GetTargetTriple()}; + lldb::DebuggerSP debugger_sp = Debugger::CreateInstance(); + ASSERT_TRUE(debugger_sp); + + lldb::TargetSP target_sp = CreateTarget(debugger_sp, arch); + ASSERT_TRUE(target_sp); + + lldb::ListenerSP listener_sp(Listener::MakeListener("dummy")); + lldb::ProcessSP process_sp = + std::make_shared(target_sp, listener_sp); + ASSERT_TRUE(process_sp); + auto prpsinfo_opt = ELFLinuxPrPsInfo::Populate(process_sp); labath wrote: This would be easier to test if the API was like `ELFLinuxPrPsInfo::Populate(ProcessInfo, State)`, as then you could call the function with values fully under your control. That way you could e.g. test boundary conditions on the arguments string (the most complicated part of the function). If you want you can still have an overload which takes a process (and calls this function), but given that it will likely only ever have a single caller, it could just be inlined into it. https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
@@ -394,6 +420,114 @@ Status ELFLinuxPrPsInfo::Parse(const DataExtractor &data, return error; } +std::optional +ELFLinuxPrPsInfo::Populate(const lldb::ProcessSP &process_sp) { + ELFLinuxPrPsInfo prpsinfo{}; + prpsinfo.pr_pid = process_sp->GetID(); + ProcessInstanceInfo info; + if (!process_sp->GetProcessInfo(info)) { +return std::nullopt; + } + prpsinfo.pr_nice = info.GetPriorityValue().value_or(0); + prpsinfo.pr_zomb = 0; + if (auto zombie_opt = info.IsZombie(); zombie_opt.value_or(false)) { +prpsinfo.pr_zomb = 1; + } + /** + * In the linux kernel this comes from: + * state = READ_ONCE(p->__state); + * i = state ? ffz(~state) + 1 : 0; + * psinfo->pr_sname = (i > 5) ? '.' : "RSDTZW"[i]; + * + * So we replicate that here. From proc_pid_stats(5) + * R = Running + * S = Sleeping on uninterrutible wait + * D = Waiting on uninterruptable disk sleep + * T = Tracing stop + * Z = Zombie + * W = Paging + */ + lldb::StateType process_state = process_sp->GetState(); + switch (process_state) { + case lldb::StateType::eStateSuspended: +prpsinfo.pr_sname = 'S'; +prpsinfo.pr_state = 1; +break; + case lldb::StateType::eStateStopped: +[[fallthrough]]; + case lldb::StateType::eStateStepping: +prpsinfo.pr_sname = 'T'; +prpsinfo.pr_state = 3; +break; + case lldb::StateType::eStateUnloaded: +[[fallthrough]]; + case lldb::StateType::eStateRunning: +prpsinfo.pr_sname = 'R'; +prpsinfo.pr_state = 0; +break; + default: +break; + } + + /** + * pr_flags is left as 0. The values (in linux) are specific + * to the kernel. We recover them from the proc filesystem + * but don't put them in ProcessInfo because it would really + * become very linux specific and the utility here seems pretty + * dubious + */ + + if (info.EffectiveUserIDIsValid()) { +prpsinfo.pr_uid = info.GetUserID(); + } + if (info.EffectiveGroupIDIsValid()) { +prpsinfo.pr_gid = info.GetGroupID(); + } + + if (info.ParentProcessIDIsValid()) { +prpsinfo.pr_ppid = info.GetParentProcessID(); + } + + if (info.ProcessGroupIDIsValid()) { +prpsinfo.pr_pgrp = info.GetProcessGroupID(); + } + + if (info.ProcessSessionIDIsValid()) { +prpsinfo.pr_sid = info.GetProcessSessionID(); + } + constexpr size_t fname_len = std::extent_v; + static_assert(fname_len > 0, "This should always be non zero"); + const llvm::StringRef fname = info.GetNameAsStringRef(); + auto fname_begin = fname.begin(); + std::copy(fname_begin, +std::next(fname_begin, std::min(fname_len, fname.size())), +prpsinfo.pr_fname); + prpsinfo.pr_fname[fname_len - 1] = '\0'; + auto args = info.GetArguments(); + if (!args.empty()) { labath wrote: The code does the right thing even without this check. https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
@@ -394,6 +420,114 @@ Status ELFLinuxPrPsInfo::Parse(const DataExtractor &data, return error; } +std::optional +ELFLinuxPrPsInfo::Populate(const lldb::ProcessSP &process_sp) { + ELFLinuxPrPsInfo prpsinfo{}; + prpsinfo.pr_pid = process_sp->GetID(); + ProcessInstanceInfo info; + if (!process_sp->GetProcessInfo(info)) { +return std::nullopt; + } + prpsinfo.pr_nice = info.GetPriorityValue().value_or(0); + prpsinfo.pr_zomb = 0; + if (auto zombie_opt = info.IsZombie(); zombie_opt.value_or(false)) { +prpsinfo.pr_zomb = 1; + } + /** + * In the linux kernel this comes from: + * state = READ_ONCE(p->__state); + * i = state ? ffz(~state) + 1 : 0; + * psinfo->pr_sname = (i > 5) ? '.' : "RSDTZW"[i]; + * + * So we replicate that here. From proc_pid_stats(5) + * R = Running + * S = Sleeping on uninterrutible wait + * D = Waiting on uninterruptable disk sleep + * T = Tracing stop + * Z = Zombie + * W = Paging + */ + lldb::StateType process_state = process_sp->GetState(); + switch (process_state) { + case lldb::StateType::eStateSuspended: +prpsinfo.pr_sname = 'S'; +prpsinfo.pr_state = 1; +break; + case lldb::StateType::eStateStopped: +[[fallthrough]]; + case lldb::StateType::eStateStepping: +prpsinfo.pr_sname = 'T'; +prpsinfo.pr_state = 3; +break; + case lldb::StateType::eStateUnloaded: +[[fallthrough]]; + case lldb::StateType::eStateRunning: +prpsinfo.pr_sname = 'R'; +prpsinfo.pr_state = 0; +break; + default: +break; + } + + /** + * pr_flags is left as 0. The values (in linux) are specific + * to the kernel. We recover them from the proc filesystem + * but don't put them in ProcessInfo because it would really + * become very linux specific and the utility here seems pretty + * dubious + */ + + if (info.EffectiveUserIDIsValid()) { +prpsinfo.pr_uid = info.GetUserID(); + } + if (info.EffectiveGroupIDIsValid()) { +prpsinfo.pr_gid = info.GetGroupID(); + } + + if (info.ParentProcessIDIsValid()) { +prpsinfo.pr_ppid = info.GetParentProcessID(); + } + + if (info.ProcessGroupIDIsValid()) { +prpsinfo.pr_pgrp = info.GetProcessGroupID(); + } + + if (info.ProcessSessionIDIsValid()) { +prpsinfo.pr_sid = info.GetProcessSessionID(); + } + constexpr size_t fname_len = std::extent_v; + static_assert(fname_len > 0, "This should always be non zero"); + const llvm::StringRef fname = info.GetNameAsStringRef(); + auto fname_begin = fname.begin(); + std::copy(fname_begin, +std::next(fname_begin, std::min(fname_len, fname.size())), +prpsinfo.pr_fname); + prpsinfo.pr_fname[fname_len - 1] = '\0'; + auto args = info.GetArguments(); + if (!args.empty()) { +constexpr size_t psargs_len = std::extent_v; +static_assert(psargs_len > 0, "This should always be non zero"); +size_t i = psargs_len; +auto argentry_iterator = std::begin(args); +char *psargs = prpsinfo.pr_psargs; +while (i > 0 && argentry_iterator != args.end()) { + llvm::StringRef argentry = argentry_iterator->ref(); + size_t len = std::min(i, argentry.size()); + auto arg_iterator = std::begin(argentry); + std::copy(arg_iterator, std::next(arg_iterator, len), psargs); + i -= len; + psargs += len; + if (i > 0) { +*(psargs++) = ' '; +--i; + } + ++argentry_iterator; +} labath wrote: ```suggestion auto argentry_iterator = std::begin(args); char *psargs = prpsinfo.pr_psargs; char *psargs_end = std::end(prpsinfo.pr_psargs); while (psargs < psargs_end && argentry_iterator != args.end()) { llvm::StringRef argentry = argentry_iterator->ref(); size_t len = std::min(std::distance(psargs, psargs_end), argentry.size()); auto arg_iterator = std::begin(argentry); psargs = std::copy_n(arg_iterator, len, psargs); if (psargs != psargs_end) *(psargs++) = ' '; ++argentry_iterator; } ``` I noticed you like iterators, so I've put in more of them. :) https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
@@ -394,6 +420,114 @@ Status ELFLinuxPrPsInfo::Parse(const DataExtractor &data, return error; } +std::optional +ELFLinuxPrPsInfo::Populate(const lldb::ProcessSP &process_sp) { + ELFLinuxPrPsInfo prpsinfo{}; + prpsinfo.pr_pid = process_sp->GetID(); + ProcessInstanceInfo info; + if (!process_sp->GetProcessInfo(info)) { +return std::nullopt; + } + prpsinfo.pr_nice = info.GetPriorityValue().value_or(0); + prpsinfo.pr_zomb = 0; + if (auto zombie_opt = info.IsZombie(); zombie_opt.value_or(false)) { +prpsinfo.pr_zomb = 1; + } + /** + * In the linux kernel this comes from: + * state = READ_ONCE(p->__state); + * i = state ? ffz(~state) + 1 : 0; + * psinfo->pr_sname = (i > 5) ? '.' : "RSDTZW"[i]; + * + * So we replicate that here. From proc_pid_stats(5) + * R = Running + * S = Sleeping on uninterrutible wait + * D = Waiting on uninterruptable disk sleep + * T = Tracing stop + * Z = Zombie + * W = Paging + */ + lldb::StateType process_state = process_sp->GetState(); + switch (process_state) { + case lldb::StateType::eStateSuspended: +prpsinfo.pr_sname = 'S'; +prpsinfo.pr_state = 1; +break; + case lldb::StateType::eStateStopped: +[[fallthrough]]; + case lldb::StateType::eStateStepping: +prpsinfo.pr_sname = 'T'; +prpsinfo.pr_state = 3; +break; + case lldb::StateType::eStateUnloaded: +[[fallthrough]]; + case lldb::StateType::eStateRunning: +prpsinfo.pr_sname = 'R'; +prpsinfo.pr_state = 0; +break; + default: +break; + } + + /** + * pr_flags is left as 0. The values (in linux) are specific + * to the kernel. We recover them from the proc filesystem + * but don't put them in ProcessInfo because it would really + * become very linux specific and the utility here seems pretty + * dubious + */ + + if (info.EffectiveUserIDIsValid()) { +prpsinfo.pr_uid = info.GetUserID(); + } + if (info.EffectiveGroupIDIsValid()) { +prpsinfo.pr_gid = info.GetGroupID(); + } + + if (info.ParentProcessIDIsValid()) { +prpsinfo.pr_ppid = info.GetParentProcessID(); + } + + if (info.ProcessGroupIDIsValid()) { +prpsinfo.pr_pgrp = info.GetProcessGroupID(); + } + + if (info.ProcessSessionIDIsValid()) { +prpsinfo.pr_sid = info.GetProcessSessionID(); + } + constexpr size_t fname_len = std::extent_v; + static_assert(fname_len > 0, "This should always be non zero"); + const llvm::StringRef fname = info.GetNameAsStringRef(); + auto fname_begin = fname.begin(); + std::copy(fname_begin, +std::next(fname_begin, std::min(fname_len, fname.size())), +prpsinfo.pr_fname); labath wrote: ```suggestion std::copy_n(fname_begin, std::min(fname_len, fname.size()), prpsinfo.pr_fname); ``` https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
https://github.com/labath commented: I like how you've made a separate patch out of these functions. Just a couple of random improvements I noticed. https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [lldb] Add Populate Methods for ELFLinuxPrPsInfo and ELFLinuxPrStatus (PR #104109)
https://github.com/labath edited https://github.com/llvm/llvm-project/pull/104109 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits