Reza Arbab's on July 5, 2019 12:50 pm:
> On Thu, Jul 04, 2019 at 12:36:18PM +1000, Nicholas Piggin wrote:
>>Reza Arbab's on July 4, 2019 3:20 am:
>>> Since the notifier chain is actually part of the decision between (1)
>>> and (2), it's a hard limitation then that callbacks be in real address
>>> space. Is there any way to structure things so that's not the case?
>>
>>If we tested for KVM guest first, and went through and marked (maybe
>>in a paca flag) everywhere else that put the MMU into a bad / non-host
>>state, and had the notifiers use the machine check stack, then it
>>would be possible to enable MMU here.
>>
>>Hmm, testing for IR|DR after testing for KVM guest might actually be
>>enough without requiring changes outside the machine check handler...
>>Actually no that may not quite work because the handler could take a
>>SLB miss and it might have been triggered inside the SLB miss handler.
>>
>>All in all I'm pretty against turning on MMU in the MCE handler
>>anywhere.
>
> Hey, fair enough. Just making sure there really isn't any room to make
> things work the way I was trying.

Understood.

>
>>> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we
>>> have a couple of other potential users of the notifier from external
>>> modules (so their callbacks would require virtual mode).
>>
>>What users are there? Do they do any significant amount of logic that
>>cannot be moved to vmlinux?
>
> One I had in mind was the NVIDIA driver. When taking a UE from defective
> GPU memory, it could use the notifier to save the bad address to a
> blacklist in their nvram. Not so much recovering the machine check, just
> logging before the system reboots.
>
> The other user is a prototype driver for the IBM Research project we had
> a talk about offline a while back.

Okay. It might be possible to save the address in the kernel and then
notify the driver afterward. For user-mode and any non-atomic user copy,
AFAIK the irq_work should practically run synchronously after the
machine check returns, so it might be enough to have a notifier in the
irq work processing.

> We can make this patchset work for memcpy_mcsafe(), but I think it's
> back to the drawing board for the others.

For the first stage that would be preferable.

Thanks,
Nick
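
The "save the address in the kernel, notify the driver afterward" idea
above can be sketched as a two-stage pattern. This is only a userspace
illustration under stated assumptions, not kernel code and not taken
from the patchset: every name here (mce_addr_log, mce_log_save,
mce_log_drain, blacklist_add) is hypothetical, the fixed-size array
stands in for storage that a real-mode handler could safely touch, and
mce_log_drain() stands in for whatever would run from irq_work
processing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MCE_LOG_SLOTS 8 /* hypothetical capacity, power of two */

/* Hypothetical fixed-size log: no allocation, no pointers to chase. */
struct mce_addr_log {
    uint64_t addr[MCE_LOG_SLOTS];
    size_t head;    /* total entries written */
    size_t tail;    /* total entries consumed */
    size_t dropped; /* entries lost because the log was full */
};

typedef void (*mce_notify_fn)(uint64_t bad_addr, void *ctx);

/*
 * Stage 1: stands in for the machine check handler. It only records
 * the address; no callbacks are invoked from this context.
 */
static int mce_log_save(struct mce_addr_log *log, uint64_t bad_addr)
{
    if (log->head - log->tail == MCE_LOG_SLOTS) {
        log->dropped++;
        return -1; /* full: count the loss rather than overwrite */
    }
    log->addr[log->head % MCE_LOG_SLOTS] = bad_addr;
    log->head++;
    return 0;
}

/*
 * Stage 2: stands in for the deferred work. It drains the log and
 * calls the driver's callback, which may now run in virtual mode.
 * Returns the number of addresses delivered.
 */
static size_t mce_log_drain(struct mce_addr_log *log,
                            mce_notify_fn fn, void *ctx)
{
    size_t delivered = 0;

    assert(fn != NULL);
    while (log->tail != log->head) {
        fn(log->addr[log->tail % MCE_LOG_SLOTS], ctx);
        log->tail++;
        delivered++;
    }
    return delivered;
}

/* Hypothetical driver side: remember bad addresses in a blacklist. */
struct blacklist {
    uint64_t addr[MCE_LOG_SLOTS];
    size_t count;
};

static void blacklist_add(uint64_t bad_addr, void *ctx)
{
    struct blacklist *bl = ctx;

    if (bl->count < MCE_LOG_SLOTS)
        bl->addr[bl->count++] = bad_addr;
}
```

The point of the split is that the first stage never calls into module
code, so the real-mode restriction applies only to mce_log_save(). A
real implementation would also need to consider per-CPU storage and
ordering between the handler and the deferred work, which this sketch
ignores.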