Re: [PATCH] Refactor rust-demangle to be independent of C++ demangling.

Eduard-Mihai Burtescu Wed, 23 Oct 2019 06:37:16 -0700

On Tue, Oct 22, 2019, at 9:39 PM, Ian Lance Taylor wrote:
> I have to assume that C++ demangling is still quite a bit more common
> than Rust demangling, so it's troubling that it looks like we're going
> to do extra work for each symbol that starts with _ZN, which is not a
> particularly uncommon prefix for a C++ mangled name.  Is there some
> way we can quickly separate out Rust symbols?  Or should we try C++
> demangling first?
> 
> Ian
>


I definitely agree; I don't want to make demangling plain C++ symbols
significantly slower. The old code also did extra work, at least
in AUTO_DEMANGLING mode, but less than the parse_ident
loop in this patch.

I've come up with an extra quick check that regular C++ symbols
won't pass most of the time, and placed it before the parse_ident
loop. That should make the cost comparable to the old implementation,
and tests pass just fine with the extra check.
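
For illustration, the check boils down to roughly the following
standalone sketch. The function name and its (sym, sym_len) parameters
are hypothetical stand-ins for the rdm fields, and I'm assuming the
"_ZN" prefix and the trailing "E" have already been stripped from the
symbol by this point. Like the patch, it only looks for the "17h"
marker and does not validate the 16 hex digits themselves.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libiberty: returns 1 if SYM (of
   length SYM_LEN, with "_ZN" and the final "E" already removed, which
   is an assumption of this sketch) could end in a legacy Rust hash
   segment "17h" followed by 16 characters, and 0 otherwise.  */
static int
has_legacy_rust_hash_suffix (const char *sym, size_t sym_len)
{
  /* "17h" plus 16 hex digits is 19 characters; require at least one
     more character so some path segment precedes the hash.  */
  if (sym_len <= 19)
    return 0;

  /* Only the 3-character marker is compared; this is a cheap filter,
     not a full validation of the hash.  */
  return strncmp (&sym[sym_len - 19], "17h", 3) == 0;
}
```

A typical C++ symbol fails this after a single length comparison and
at most a 3-byte compare, so the parse_ident loop is never entered
for it.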

The diff is below, but if you want me to send a combined patch,
or anything else for that matter, please let me know.

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index da707dbab9b..4cb189c4019 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -384,6 +384,14 @@ rust_demangle_callback (const char *mangled, int options,
         return 0;
       rdm.sym_len--;
 
+      /* Legacy Rust symbols also always end with a path segment
+         that encodes a 16 hex digit hash, i.e. '17h[a-f0-9]{16}'.
+         This early check, before any parse_ident calls, should
+         quickly filter out most C++ symbols unrelated to Rust. */
+      if (!(rdm.sym_len > 19
+            && !strncmp (&rdm.sym[rdm.sym_len - 19], "17h", 3)))
+        return 0;
+
       do
         {
           ident = parse_ident (&rdm);
