Hi Bhupesh and Dave (and everybody CC'ed here), I'm Guilherme Piccoli
and I'm working on the same issue observed in RH bugzilla 1758323 [0] -
or at least, it seems to be the same heh

The reported issue in my case was that the 2nd kexec fails on Nitro
instances, and indeed it's reproducible. More than this, it shows up as
initrd corruption. I've found 2 workarounds: using the "new" kexec
syscall (by doing kexec -s -l), and keeping the initrd memory
"un-freed" by using the kernel parameter "retain_initrd".
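
For anyone who wants to try these, here is a minimal sketch of how the
two workarounds could be invoked. It only prints the commands rather
than running them. The kernel/initrd paths and the helper function
names are placeholders I made up for illustration, and which kernel in
the kexec chain needs "retain_initrd" is something to check on your
own setup:

```shell
#!/bin/sh
# Sketch only: builds and prints kexec command lines, does not run them.
# Function names and the example paths below are hypothetical.

# Workaround 1: load through the file-based ("new") kexec syscall (-s).
# $1 = kernel image, $2 = initrd image
build_kexec_file_load() {
    printf 'kexec -s -l %s --initrd=%s --reuse-cmdline\n' "$1" "$2"
}

# Workaround 2: regular load, with retain_initrd added to the command
# line so the next kernel does not free the initrd memory.
# $1 = kernel image, $2 = initrd image, $3 = base kernel command line
build_kexec_retain_initrd() {
    printf 'kexec -l %s --initrd=%s --append="%s retain_initrd"\n' \
        "$1" "$2" "$3"
}

# Example with placeholder paths (adjust to the actual system):
build_kexec_file_load /boot/vmlinuz /boot/initrd.img
build_kexec_retain_initrd /boot/vmlinuz /boot/initrd.img "root=/dev/xvda1 ro"
```

After reviewing the printed command, run it as root and then trigger
the jump with "kexec -e".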

I've noticed that your interesting investigation in the BZ led to
SWIOTLB as a potential culprit, but trying with "swiotlb=noforce" or
even "iommu=off" didn't help me.
Also, it's worth noting a weird behavior: it seems Amazon Linux 2 (based
on kernel 4.14) sometimes works, or rather, it works on some instances.
I have 2x t3.large instances, and on one of them I can make Amazon
Linux work (and to isolate potential out-of-tree patches, I've used the
Amazon Linux 2 config file and built a mainline 4.14, which also works
on that particular instance).

The reason for this email is to ask if you managed to figure out the
root cause of the issue, or have some leads. I'm continuing to debug
here, but it's a bit difficult without access to the AWS hypervisor
(and it seems like a hypervisor issue to me). The fact that preserving
the initrd memory prevents the problem seems to indicate that after
such high-address memory is freed, the hypervisor somehow manages to
use it regardless of whether some other code is using it, ending up
corrupting the initrd.

I've also looped in the kexec list in order to grow the audience; maybe
somebody has already faced this kind of issue and has some ideas.
Collaboration on this debug would be greatly appreciated; it's quite
an interesting issue and I'm looking forward to understanding what's
going on.

Thanks in advance,


Guilherme


[0] https://bugzilla.redhat.com/show_bug.cgi?id=1758323

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
