Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu
On 11/10/16 at 12:52pm, Joerg Roedel wrote:
> Hi Baoquan,
>
> thanks for working on this, really appreciated!
>
> On Thu, Oct 20, 2016 at 07:37:11PM +0800, Baoquan He wrote:
> > This is the v6 post.
> >
> > The principle of the fix is similar to intel iommu: just defer the
> > assignment of device to domain to device driver init. But there is a
> > difference from intel iommu. AMD iommu creates the protection domain and
> > assigns the device to the domain in the iommu driver init stage. So in
> > this patchset I just allow the assignment of device to domain at the
> > software level, but defer updating the domain info, especially the
> > pte_root, to the dev table entry until the device driver init stage.
>
> I recently talked with the IOMMU guys from AMD about whether it is safe
> to update the device-table pointer while the iommu is enabled. It turns
> out that device-table pointer update is split up into two 32bit writes
> in the IOMMU hardware. So updating it while the IOMMU is enabled could
> have some nasty side effects.
>
> The only way to work around this is to allocate the device-table
> below 4GB, but that needs more low-mem then in the kdump kernel. So some
> adjustments are needed there too. Anyway, can you add that to your
> patch-set?

Yes, sure. It seems this is the only way to work around the risk of the
64bit address being split up into two 32bit writes in the IOMMU
hardware. I guess we need to add a GFP_DMA32 flag when allocating pages
for amd_iommu_dev_table in the kdump kernel. And we had better add a
note in kdump.txt. I have been told that on some big advanced servers
they don't need low mem reserved at all with the help of the hardware
iommu. Now servers with amd iommu hardware have to be the exception.

Thanks
Baoquan

Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu
Hi Baoquan,

thanks for working on this, really appreciated!

On Thu, Oct 20, 2016 at 07:37:11PM +0800, Baoquan He wrote:
> This is the v6 post.
>
> The principle of the fix is similar to intel iommu: just defer the
> assignment of device to domain to device driver init. But there is a
> difference from intel iommu. AMD iommu creates the protection domain and
> assigns the device to the domain in the iommu driver init stage. So in
> this patchset I just allow the assignment of device to domain at the
> software level, but defer updating the domain info, especially the
> pte_root, to the dev table entry until the device driver init stage.

I recently talked with the IOMMU guys from AMD about whether it is safe
to update the device-table pointer while the iommu is enabled. It turns
out that device-table pointer update is split up into two 32bit writes
in the IOMMU hardware. So updating it while the IOMMU is enabled could
have some nasty side effects.

The only way to work around this is to allocate the device-table
below 4GB, but that needs more low-mem then in the kdump kernel. So some
adjustments are needed there too. Anyway, can you add that to your
patch-set?

	Joerg

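To make the reasoning about the split write concrete, here is a minimal userspace sketch, not kernel code and not part of the patchset. It models a 64bit register that is updated as two 32bit halves and checks what value is visible between the two writes. The names split_reg, torn_value and update_is_safe are hypothetical, and writing the low half first is an assumption, since the thread does not say which half the hardware updates first.

```c
#include <stdint.h>

/* Hypothetical model of a 64bit register updated as two 32bit halves. */
struct split_reg {
	uint32_t lo;
	uint32_t hi;
};

/* Combine the two halves into the 64bit value a reader would observe. */
uint64_t reg_read(const struct split_reg *r)
{
	return ((uint64_t)r->hi << 32) | r->lo;
}

/*
 * Return the value visible between the two half-writes when the register
 * goes from old_val to new_val. Low half first is an assumption made for
 * this sketch only.
 */
uint64_t torn_value(uint64_t old_val, uint64_t new_val)
{
	struct split_reg r = { (uint32_t)old_val, (uint32_t)(old_val >> 32) };

	r.lo = (uint32_t)new_val;	/* first 32bit write */
	return reg_read(&r);		/* state seen before the second write */
}

/* The update is harmless if the intermediate state is the old or new value. */
int update_is_safe(uint64_t old_val, uint64_t new_val)
{
	uint64_t mid = torn_value(old_val, new_val);

	return mid == old_val || mid == new_val;
}
```

When both the old and the new table address are below 4GB, the upper half is zero before and after, so the intermediate state already equals the new value. As soon as one of the addresses is above 4GB, the intermediate state can be an address that belongs to neither table, which is the nasty side effect mentioned above.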
Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu
On 11/04/16 at 01:14pm, Baoquan He wrote:
> Hi Joerg,
>
> Ping!
>
> About the v6 post, do you have any suggestions?
>
> Because of the GCR3 special handling in patch 9/9, I spent several days
> studying the background and changing the code. Then when I tried to
> post, the virtual interrupt remapping feature caused a kernel hang with
> this patchset applied. So it took me days to study the spec and find it
> out. In the end it was very late to post.
>
> Could it be possible that we review and merge patches 1/9 to 8/9, and
> leave patch 9/9, which includes the GCR3 special handling, as a second
> step? Then I can backport patches 1/9 to 8/9 to our distro, since this
> bug has been discussed for such a long time, and currently almost all
> systems are deployed with amd iommu v1 hardware. It would be great if
  ~~~
Here I meant that in our Red Hat lab almost all systems are deployed
with only amd iommu v1 support.

> they can be accepted into the 4.9 or 4.10-rc phase.
>
> About patch 9/9, its code is a little complicated and has not been
> reviewed. I am not sure if I understand your suggestion and the GCR3
> code well. What's your opinion?
>
> Thanks
> Baoquan
>
>
> On 10/20/16 at 07:37pm, Baoquan He wrote:
> > This is the v6 post.
> >
> > The principle of the fix is similar to intel iommu: just defer the
> > assignment of device to domain to device driver init. But there is a
> > difference from intel iommu. AMD iommu creates the protection domain and
> > assigns the device to the domain in the iommu driver init stage. So in
> > this patchset I just allow the assignment of device to domain at the
> > software level, but defer updating the domain info, especially the
> > pte_root, to the dev table entry until the device driver init stage.
> >
> > v5:
> >     bnx2 NIC can't reset itself during driver init. Post patch to reset
> >     it during driver init. IO_PAGE_FAULT can't be seen anymore.
> >
> >     Below is the link to the v5 post.
> >     https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html
> >
> > v5->v6:
> >     According to Joerg's comments, made the main changes below:
> >     - Add sanity check when copying old dev tables.
> >
> >     - Discard the old patch 6/8.
> >
> >     - If a device is set up with guest translations (DTE.GV=1), then
> >       don't copy that information but move the device over to an empty
> >       guest-cr3 table and handle the faults in the PPR log (which just
> >       answer them with INVALID).
> >
> > Issues that need to be discussed:
> >     - Joerg suggested hooking the behaviour that updates domain info
> >       into the dte entry into the set_dma_mask call-back. I tried, but
> >       on my local machine with amd iommu v2, an ohci pci device doesn't
> >       call set_dma_mask. Then IO_PAGE_FAULT printing flooded.
> >
> >       00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
> >
> >     - About the GCR3 root pointer copying issue, I don't know how to set
> >       up the test environment and haven't tested yet. Hope Joerg or
> >       Zongshun can tell what steps should be taken to test it, or help
> >       take a test in your test environment.
> >
> > Baoquan He (9):
> >   iommu/amd: Detect pre enabled translation
> >   iommu/amd: add several helper function
> >   iommu/amd: Define bit fields for DTE particularly
> >   iommu/amd: Add function copy_dev_tables
> >   iommu/amd: copy old trans table from old kernel
> >   iommu/amd: Don't update domain info to dte entry at iommu init stage
> >   iommu/amd: Update domain into to dte entry during device driver init
> >   iommu/amd: Add sanity check of irq remap information of old dev table
> >     entry
> >   iommu/amd: Don't copy GCR3 table root pointer
> >
> >  drivers/iommu/amd_iommu.c       |  93 +---
> >  drivers/iommu/amd_iommu_init.c  | 189 +---
> >  drivers/iommu/amd_iommu_proto.h |   2 +
> >  drivers/iommu/amd_iommu_types.h |  53 ++-
> >  drivers/iommu/amd_iommu_v2.c    |  18 +++-
> >  5 files changed, 307 insertions(+), 48 deletions(-)
> >
> > --
> > 2.5.5
> >

Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu
Hi Joerg,

Ping!

About the v6 post, do you have any suggestions?

Because of the GCR3 special handling in patch 9/9, I spent several days
studying the background and changing the code. Then when I tried to
post, the virtual interrupt remapping feature caused a kernel hang with
this patchset applied. So it took me days to study the spec and find it
out. In the end it was very late to post.

Could it be possible that we review and merge patches 1/9 to 8/9, and
leave patch 9/9, which includes the GCR3 special handling, as a second
step? Then I can backport patches 1/9 to 8/9 to our distro, since this
bug has been discussed for such a long time, and currently almost all
systems are deployed with amd iommu v1 hardware. It would be great if
they can be accepted into the 4.9 or 4.10-rc phase.

About patch 9/9, its code is a little complicated and has not been
reviewed. I am not sure if I understand your suggestion and the GCR3
code well. What's your opinion?

Thanks
Baoquan

On 10/20/16 at 07:37pm, Baoquan He wrote:
> This is the v6 post.
>
> The principle of the fix is similar to intel iommu: just defer the
> assignment of device to domain to device driver init. But there is a
> difference from intel iommu. AMD iommu creates the protection domain and
> assigns the device to the domain in the iommu driver init stage. So in
> this patchset I just allow the assignment of device to domain at the
> software level, but defer updating the domain info, especially the
> pte_root, to the dev table entry until the device driver init stage.
>
> v5:
>     bnx2 NIC can't reset itself during driver init. Post patch to reset
>     it during driver init. IO_PAGE_FAULT can't be seen anymore.
>
>     Below is the link to the v5 post.
>     https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html
>
> v5->v6:
>     According to Joerg's comments, made the main changes below:
>     - Add sanity check when copying old dev tables.
>
>     - Discard the old patch 6/8.
>
>     - If a device is set up with guest translations (DTE.GV=1), then
>       don't copy that information but move the device over to an empty
>       guest-cr3 table and handle the faults in the PPR log (which just
>       answer them with INVALID).
>
> Issues that need to be discussed:
>     - Joerg suggested hooking the behaviour that updates domain info
>       into the dte entry into the set_dma_mask call-back. I tried, but
>       on my local machine with amd iommu v2, an ohci pci device doesn't
>       call set_dma_mask. Then IO_PAGE_FAULT printing flooded.
>
>       00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
>
>     - About the GCR3 root pointer copying issue, I don't know how to set
>       up the test environment and haven't tested yet. Hope Joerg or
>       Zongshun can tell what steps should be taken to test it, or help
>       take a test in your test environment.
>
> Baoquan He (9):
>   iommu/amd: Detect pre enabled translation
>   iommu/amd: add several helper function
>   iommu/amd: Define bit fields for DTE particularly
>   iommu/amd: Add function copy_dev_tables
>   iommu/amd: copy old trans table from old kernel
>   iommu/amd: Don't update domain info to dte entry at iommu init stage
>   iommu/amd: Update domain into to dte entry during device driver init
>   iommu/amd: Add sanity check of irq remap information of old dev table
>     entry
>   iommu/amd: Don't copy GCR3 table root pointer
>
>  drivers/iommu/amd_iommu.c       |  93 +---
>  drivers/iommu/amd_iommu_init.c  | 189 +---
>  drivers/iommu/amd_iommu_proto.h |   2 +
>  drivers/iommu/amd_iommu_types.h |  53 ++-
>  drivers/iommu/amd_iommu_v2.c    |  18 +++-
>  5 files changed, 307 insertions(+), 48 deletions(-)
>
> --
> 2.5.5
>

[PATCH v6 0/9] Fix kdump faults on system with amd iommu
This is the v6 post.

The principle of the fix is similar to intel iommu: just defer the
assignment of device to domain to device driver init. But there is a
difference from intel iommu. AMD iommu creates the protection domain and
assigns the device to the domain in the iommu driver init stage. So in
this patchset I just allow the assignment of device to domain at the
software level, but defer updating the domain info, especially the
pte_root, to the dev table entry until the device driver init stage.

v5:
    bnx2 NIC can't reset itself during driver init. Post patch to reset
    it during driver init. IO_PAGE_FAULT can't be seen anymore.

    Below is the link to the v5 post.
    https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html

v5->v6:
    According to Joerg's comments, made the main changes below:
    - Add sanity check when copying old dev tables.

    - Discard the old patch 6/8.

    - If a device is set up with guest translations (DTE.GV=1), then
      don't copy that information but move the device over to an empty
      guest-cr3 table and handle the faults in the PPR log (which just
      answer them with INVALID).

Issues that need to be discussed:
    - Joerg suggested hooking the behaviour that updates domain info
      into the dte entry into the set_dma_mask call-back. I tried, but
      on my local machine with amd iommu v2, an ohci pci device doesn't
      call set_dma_mask. Then IO_PAGE_FAULT printing flooded.

      00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)

    - About the GCR3 root pointer copying issue, I don't know how to set
      up the test environment and haven't tested yet. Hope Joerg or
      Zongshun can tell what steps should be taken to test it, or help
      take a test in your test environment.

Baoquan He (9):
  iommu/amd: Detect pre enabled translation
  iommu/amd: add several helper function
  iommu/amd: Define bit fields for DTE particularly
  iommu/amd: Add function copy_dev_tables
  iommu/amd: copy old trans table from old kernel
  iommu/amd: Don't update domain info to dte entry at iommu init stage
  iommu/amd: Update domain into to dte entry during device driver init
  iommu/amd: Add sanity check of irq remap information of old dev table
    entry
  iommu/amd: Don't copy GCR3 table root pointer

 drivers/iommu/amd_iommu.c       |  93 +---
 drivers/iommu/amd_iommu_init.c  | 189 +---
 drivers/iommu/amd_iommu_proto.h |   2 +
 drivers/iommu/amd_iommu_types.h |  53 ++-
 drivers/iommu/amd_iommu_v2.c    |  18 +++-
 5 files changed, 307 insertions(+), 48 deletions(-)

--
2.5.5