On 11/19/20 8:30 PM, Suravee Suthikulpanit wrote: > Will, > > To answer your questions from v1 thread. > > On 11/18/20 5:57 AM, Will Deacon wrote: > > On 11/5/20 9:58 PM, Suravee Suthikulpanit wrote: > >> AMD IOMMU requires 4k-aligned pages for the event log, the PPR log, > >> and the completion wait write-back regions. However, when allocating > >> the pages, they could be part of large mapping (e.g. 2M) page. > >> This causes #PF due to the SNP RMP hardware enforces the check based > >> on the page level for these data structures. > > > > Please could you include an example backtrace here? > > Unfortunately, we don't actually have the backtrace available here. > This information is based on the SEV-SNP specification. > > >> So, fix by calling set_memory_4k() on the allocated pages. > > > > I think I'm missing something here. set_memory_4k() will break the > kernel > > linear mapping up into page granular mappings, but the IOMMU isn't > using > > that mapping, right? > > That's correct. This does not affect the IOMMU, but it affects the PSP > FW. > > > It's just using the physical address returned by > iommu_virt_to_phys(), so why does it matter? > > > > Just be nice to capture some of this rationale in the log, > especially as > > I'm not familiar with this device. > > According to the AMD SEV-SNP white paper > (https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf), > the Reverse Map Table (RMP) contains one entry for every 4K page of > DRAM that may be used by the VM. In this case, the pages allocated by > the IOMMU driver are added as 4K entries in the RMP table by the > SEV-SNP FW. > > During the page table walk, the RMP checks if the page is owned by the > hypervisor. Without calling set_memory_4k() to break the mapping up > into 4K pages, pages could end up being part of large mapping (e.g. 2M > page), in which the page access would be denied and result in #PF.
Since the page is added as a 4K page in the RMP table by the SEV-SNP FW, so we need to split the physmap to ensure that this page will be access with a 4K mapping from the x86. If the page is part of large page then write access will cause a RMP violation (i.e #PF), this is because SNP hardware enforce that the CPU page level walk must match with page-level programmed in the RMP table. > > >> Fixes: commit c69d89aff393 ("iommu/amd: Use 4K page for completion > wait write-back semaphore") > > > > I couldn't figure out how that commit could cause this problem. > Please can > > you explain that to me? > > Hope this helps clarify. If so, I'll update the commit log and send > out V3. > > Thanks, > Suravee