Hello,

I am trying to make systemtap work with gentoo-kernel (or ideally all
dist kernels), and I got a few steps closer with the kernel-build.eclass
modification I sent this week [0].  However, one issue remains: the
build-id of the running kernel does not match the installed vmlinux
file:

# stap mba_sc.stp
WARNING: Build-id mismatch [man warning::buildid]:
"/usr/src/linux-5.17.13-gentoo-dist/vmlinux" pid 0 address
0xffffffff8a7b572c, expected c43e775aad5e11755bf5cf1329d2240b519e7518
actual 3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

I also noticed that when kernel-build.eclass installs the vmlinux file,
something (I presume portage) also creates vmlinux.debug using objcopy
--only-keep-debug --compress-debug-sections.

So now I am in a situation where I have these relevant files on the
system:

- /usr/src/linux-5.17.13-gentoo-dist/vmlinux
- /usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug
  (symlink to the first file)
- /usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug and
- /boot/vmlinuz-5.17.13-gentoo-dist
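
For reference, the .build-id path in the list above appears to be
derived from the build id itself: the first two hex digits name the
directory and the remaining digits name the file.  A minimal sketch of
that mapping follows; build_id_debug_path is a helper name made up for
illustration, and /usr/lib/debug is assumed as the default only because
that is where the files live on this system:

```shell
# Map a build id to the path of its .build-id debug link.
# build_id_debug_path is a hypothetical helper, not an existing tool.
# $1: build id as a hex string
# $2: debug root directory (assumption: defaults to /usr/lib/debug)
build_id_debug_path() {
    id=$1
    dir=${2:-/usr/lib/debug}
    prefix=$(printf '%s' "$id" | cut -c1-2)   # first two hex digits
    rest=$(printf '%s' "$id" | cut -c3-)      # remaining hex digits
    printf '%s/.build-id/%s/%s.debug\n' "$dir" "$prefix" "$rest"
}

# Usage (commented out so that sourcing this file has no side effects):
# readlink -f "$(build_id_debug_path c43e775aad5e11755bf5cf1329d2240b519e7518)"
```

Resolving the result with readlink -f should show which file a given
build id actually points to.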


When I check the build ids (using readelf -n or just "file") of the
first three files I get:

/usr/src/linux-5.17.13-gentoo-dist/vmlinux:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, not stripped

/usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
not stripped

/usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
not stripped

which looks great except:

1) the first file does not say it is "with debug_info",

2) there is no reason to keep the original vmlinux in place since there
   is a smaller file that works as a substitute, but I'm not sure what's
   a clean way to not install it, and most importantly

3) the fact that the running kernel has a different build id.

The last point is the main issue here.  I tried to find a way to query
the build id of the running kernel, but have not found one that goes
through a kernel API, so instead I checked
/boot/vmlinuz-5.17.13-gentoo-dist like this:

~/dev/linux/scripts/extract-vmlinux /boot/vmlinuz-5.17.13-gentoo-dist \
    > vmlinux.extracted

and for good measure also tried what objcopy does to it:

objcopy --only-keep-debug vmlinux.extracted vmlinux.extracted.debug
objcopy --only-keep-debug --compress-debug-sections vmlinux.extracted \
    vmlinux.extracted.compressed

Now when I check, the build id differs from that of the first three
files, but it is left unchanged by objcopy and is the same one that
systemtap reports for the running kernel:

vmlinux.extracted:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped

vmlinux.extracted.compressed:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped

vmlinux.extracted.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
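
The comparison above can be scripted roughly as follows.  This is only
a sketch: both function names are made up for illustration, and it
assumes that "file -b" prints the build id in the same
BuildID[sha1]=... form shown in the output above.

```shell
# Extract the hex build id from `file` output read on stdin.
# Prints nothing if the input carries no BuildID[sha1]= field.
buildid_from_file_output() {
    sed -n 's/.*BuildID\[sha1]=\([0-9a-f]\{40\}\).*/\1/p'
}

# Compare the build id of each given file against an expected value.
# $1: expected build id; remaining arguments: files to check.
# Returns non-zero if any file does not match.
check_build_ids() {
    expected=$1
    shift
    status=0
    for f in "$@"; do
        actual=$(file -b "$f" | buildid_from_file_output)
        if [ "$actual" = "$expected" ]; then
            printf 'match    %s\n' "$f"
        else
            printf 'MISMATCH %s (got %s)\n' "$f" "${actual:-none}"
            status=1
        fi
    done
    return "$status"
}

# Usage (commented out so that sourcing this file has no side effects):
# check_build_ids 3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c \
#     /usr/src/linux-5.17.13-gentoo-dist/vmlinux vmlinux.extracted
```

Passing the id that systemtap reports as "actual" as the first argument
makes it easy to see which files on disk correspond to the running
kernel.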


At this point I got stuck, not knowing when and how the build-id
changes, or where to extract the debug symbols from.  I would also like
to clean up the change I made.  So I came here with my question(s) and
a rather lengthy explanation.  Does anyone know what would be the best
way to deal with this?  Or even where to continue looking?  I would
really like to make systemtap "just work" on Gentoo with the
distribution kernels, but I have already spent a lot of time on it, so I
figured I would rather ask here, since I am not that proficient with the
intricacies of the build system.

Thanks a lot for any pointers and have a great day,
Martin

[0] https://github.com/gentoo/gentoo/pull/25789
