Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu

2016-11-12 Thread Baoquan He
On 11/10/16 at 12:52pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> thanks for working on this, really appreciated!
> 
> On Thu, Oct 20, 2016 at 07:37:11PM +0800, Baoquan He wrote:
> > This is v6 post. 
> > 
> > The principle of the fix is similar to intel iommu. Just defer the assignment
> > of device to domain to device driver init. But there's difference than
> > intel iommu. AMD iommu create protection domain and assign device to
> > domain in iommu driver init stage. So in this patchset I just allow the
> > assignment of device to domain in software level, but defer updating the
> > domain info, especially the pte_root to dev table entry to device driver
> > init stage.
> 
> I recently talked with the IOMMU guys from AMD about whether it is safe
> to update the device-table pointer while the iommu is enabled. It turns
> out that device-table pointer update is split up into two 32bit writes
> in the IOMMU hardware. So updating it while the IOMMU is enabled could
> have some nasty side effects.
> 
> The only way to work around this is to allocate the device-table
> below 4GB, but that then needs more low-mem in the kdump kernel. So some
> adjustments are needed there too. Anyway, can you add that to your
> patch-set?

Yes, sure. It seems this is the only way to work around the risk of the
64-bit address being split into two 32-bit writes in the IOMMU hardware.

I guess we need to add the GFP_DMA32 flag when allocating pages for
amd_iommu_dev_table in the kdump kernel. It would also be good to add a
note to kdump.txt. I have been told that some big, advanced servers don't
need any low memory reserved at all, thanks to the hardware IOMMU. Servers
with AMD IOMMU hardware will now have to be an exception.
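
To illustrate why keeping the table below 4GB sidesteps the split-write
problem, here is a minimal userspace sketch. It is not kernel code and does
not model the real register layout: struct split_reg and all helper names
are hypothetical, and it assumes the low half lands before the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit pointer register that the hardware
 * latches as two independent 32-bit halves. */
struct split_reg {
    uint32_t lo;
    uint32_t hi;
};

uint64_t split_reg_read(const struct split_reg *r)
{
    return ((uint64_t)r->hi << 32) | r->lo;
}

/* First half of an update: only the low 32 bits have landed. */
void split_reg_write_lo(struct split_reg *r, uint64_t val)
{
    r->lo = (uint32_t)val;
}

/* Second half of an update: the high 32 bits land. */
void split_reg_write_hi(struct split_reg *r, uint64_t val)
{
    r->hi = (uint32_t)(val >> 32);
}

/* Value an observer could see between the two half-writes
 * (assumption: low half is written first). */
uint64_t torn_value(uint64_t old_val, uint64_t new_val)
{
    struct split_reg r = { (uint32_t)old_val, (uint32_t)(old_val >> 32) };

    split_reg_write_lo(&r, new_val);
    return split_reg_read(&r);
}

/* Small self-check that can be called from main(). */
void split_reg_selfcheck(void)
{
    /* Above 4GB: the intermediate value is neither old nor new. */
    assert(torn_value(0x0000000123456000ULL, 0x0000000298765000ULL)
           == 0x0000000198765000ULL);
    /* Below 4GB: the high half is zero before and after, so the
     * intermediate value already equals the new pointer. */
    assert(torn_value(0x23456000ULL, 0x98765000ULL) == 0x98765000ULL);
}
```

With both the old and the new table below 4GB, the high half never changes,
so under this model there is no window in which a bogus address is visible.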

Thanks
Baoquan


Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu

2016-11-10 Thread Joerg Roedel
Hi Baoquan,

thanks for working on this, really appreciated!

On Thu, Oct 20, 2016 at 07:37:11PM +0800, Baoquan He wrote:
> This is v6 post. 
> 
> The principle of the fix is similar to intel iommu. Just defer the assignment
> of device to domain to device driver init. But there's difference than
> intel iommu. AMD iommu create protection domain and assign device to
> domain in iommu driver init stage. So in this patchset I just allow the
> assignment of device to domain in software level, but defer updating the
> domain info, especially the pte_root to dev table entry to device driver
> init stage.

I recently talked with the IOMMU guys from AMD about whether it is safe
to update the device-table pointer while the iommu is enabled. It turns
out that device-table pointer update is split up into two 32bit writes
in the IOMMU hardware. So updating it while the IOMMU is enabled could
have some nasty side effects.

The only way to work around this is to allocate the device-table
below 4GB, but that then needs more low-mem in the kdump kernel. So some
adjustments are needed there too. Anyway, can you add that to your
patch-set?


Joerg



Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu

2016-11-03 Thread Baoquan He
On 11/04/16 at 01:14pm, Baoquan He wrote:
> Hi Joerg,
> 
> Ping!
> 
> About the v6 post, do you have any suggestions?
> 
> Because of the GCR3 special handling in patch 9/9, I spent several days
> studying the background and changing the code. Then, when I tried to post,
> the virtual interrupt remapping feature caused a kernel hang with this
> patchset applied, so it took me more days to study the spec and track it
> down. That is why this post is so late.
> 
> Would it be possible to review and merge patches 1~8, and leave patch
> 9/9, which includes the GCR3 special handling, as a second step? Then I
> could backport patches 1~8 to our distro. This bug has been under
> discussion for a long time, and currently almost all systems are deployed
> with amd iommu v1 hardware. It would be great if they could be accepted
 ~~~ Here I meant that in our Red Hat lab almost all
systems are deployed with only amd iommu v1 support.

> into 4.9 or 4.10-rc phase.
> 
> As for patch 9/9, its code is a little complicated and has not been
> reviewed yet, and I am not sure I understand your suggestion and the GCR3
> code well. What's your opinion?
> 
> Thanks
> Baoquan
> 
> 
> On 10/20/16 at 07:37pm, Baoquan He wrote:
> > This is v6 post. 
> > 
> > The principle of the fix is similar to intel iommu. Just defer the assignment
> > of device to domain to device driver init. But there's difference than
> > intel iommu. AMD iommu create protection domain and assign device to
> > domain in iommu driver init stage. So in this patchset I just allow the
> > assignment of device to domain in software level, but defer updating the
> > domain info, especially the pte_root to dev table entry to device driver
> > init stage.
> > 
> > v5: 
> > bnx2 NIC can't reset itself during driver init. Post patch to reset
> > it during driver init. IO_PAGE_FAULT can't be seen anymore.
> > 
> > Below is link of v5 post.
> > 
> > https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html
> > 
> > v5->v6:
> > Following Joerg's comments, made the main changes below:
> > - Add sanity check when copying old dev tables.
> > 
> > - Discard the old patch 6/8.
> > 
> > - If a device is set up with guest translations (DTE.GV=1), then don't
> >   copy that information but move the device over to an empty guest-cr3
> >   table and handle the faults in the PPR log (which just answer them
> >   with INVALID).
> > 
> > Issues that need discussion:
> > - Joerg suggested hooking the update of the domain info into the dte
> >   entry into the set_dma_mask call-back. I tried, but on my local machine
> >   with amd iommu v2, an ohci pci device doesn't call set_dma_mask, and
> >   the log was then flooded with IO_PAGE_FAULT messages.
> > 
> >   00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
> > 
> > - About the GCR3 root pointer copying issue, I don't know how to set up
> >   the test environment and haven't tested it yet. I hope Joerg or Zongshun
> >   can tell me what steps should be taken to test it, or help run a test in
> >   your test environment.
> >  
> > Baoquan He (9):
> >   iommu/amd: Detect pre enabled translation
> >   iommu/amd: add several helper function
> >   iommu/amd: Define bit fields for DTE particularly
> >   iommu/amd: Add function copy_dev_tables
> >   iommu/amd: copy old trans table from old kernel
> >   iommu/amd: Don't update domain info to dte entry at iommu init stage
> >   iommu/amd: Update domain into to dte entry during device driver init
> >   iommu/amd: Add sanity check of irq remap information of old dev table entry
> >   iommu/amd: Don't copy GCR3 table root pointer
> > 
> >  drivers/iommu/amd_iommu.c   |  93 +---
> >  drivers/iommu/amd_iommu_init.c  | 189 +---
> >  drivers/iommu/amd_iommu_proto.h |   2 +
> >  drivers/iommu/amd_iommu_types.h |  53 ++-
> >  drivers/iommu/amd_iommu_v2.c|  18 +++-
> >  5 files changed, 307 insertions(+), 48 deletions(-)
> > 
> > -- 
> > 2.5.5
> > 


Re: [PATCH v6 0/9] Fix kdump faults on system with amd iommu

2016-11-03 Thread Baoquan He
Hi Joerg,

Ping!

About the v6 post, do you have any suggestions?

Because of the GCR3 special handling in patch 9/9, I spent several days
studying the background and changing the code. Then, when I tried to post,
the virtual interrupt remapping feature caused a kernel hang with this
patchset applied, so it took me more days to study the spec and track it
down. That is why this post is so late.

Would it be possible to review and merge patches 1~8, and leave patch
9/9, which includes the GCR3 special handling, as a second step? Then I
could backport patches 1~8 to our distro. This bug has been under
discussion for a long time, and currently almost all systems are deployed
with amd iommu v1 hardware. It would be great if they could be accepted
into 4.9 or the 4.10-rc phase.

As for patch 9/9, its code is a little complicated and has not been
reviewed yet, and I am not sure I understand your suggestion and the GCR3
code well. What's your opinion?

Thanks
Baoquan


On 10/20/16 at 07:37pm, Baoquan He wrote:
> This is v6 post. 
> 
> The principle of the fix is similar to intel iommu. Just defer the assignment
> of device to domain to device driver init. But there's difference than
> intel iommu. AMD iommu create protection domain and assign device to
> domain in iommu driver init stage. So in this patchset I just allow the
> assignment of device to domain in software level, but defer updating the
> domain info, especially the pte_root to dev table entry to device driver
> init stage.
> 
> v5: 
> bnx2 NIC can't reset itself during driver init. Post patch to reset
> it during driver init. IO_PAGE_FAULT can't be seen anymore.
> 
> Below is link of v5 post.
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html
> 
> v5->v6:
> Following Joerg's comments, made the main changes below:
> - Add sanity check when copying old dev tables.
> 
> - Discard the old patch 6/8.
> 
> - If a device is set up with guest translations (DTE.GV=1), then don't
>   copy that information but move the device over to an empty guest-cr3
>   table and handle the faults in the PPR log (which just answer them
>   with INVALID).
> 
> Issues that need discussion:
> - Joerg suggested hooking the update of the domain info into the dte
>   entry into the set_dma_mask call-back. I tried, but on my local machine
>   with amd iommu v2, an ohci pci device doesn't call set_dma_mask, and
>   the log was then flooded with IO_PAGE_FAULT messages.
> 
>   00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
> 
> - About the GCR3 root pointer copying issue, I don't know how to set up
>   the test environment and haven't tested it yet. I hope Joerg or Zongshun
>   can tell me what steps should be taken to test it, or help run a test in
>   your test environment.
>  
> Baoquan He (9):
>   iommu/amd: Detect pre enabled translation
>   iommu/amd: add several helper function
>   iommu/amd: Define bit fields for DTE particularly
>   iommu/amd: Add function copy_dev_tables
>   iommu/amd: copy old trans table from old kernel
>   iommu/amd: Don't update domain info to dte entry at iommu init stage
>   iommu/amd: Update domain into to dte entry during device driver init
>   iommu/amd: Add sanity check of irq remap information of old dev table entry
>   iommu/amd: Don't copy GCR3 table root pointer
> 
>  drivers/iommu/amd_iommu.c   |  93 +---
>  drivers/iommu/amd_iommu_init.c  | 189 +---
>  drivers/iommu/amd_iommu_proto.h |   2 +
>  drivers/iommu/amd_iommu_types.h |  53 ++-
>  drivers/iommu/amd_iommu_v2.c|  18 +++-
>  5 files changed, 307 insertions(+), 48 deletions(-)
> 
> -- 
> 2.5.5
> 


[PATCH v6 0/9] Fix kdump faults on system with amd iommu

2016-10-20 Thread Baoquan He
This is the v6 post.

The principle of the fix is similar to the one used for intel iommu: defer
the assignment of device to domain until device driver init. But there is a
difference from intel iommu. AMD iommu creates the protection domain and
assigns the device to the domain in the iommu driver init stage. So in this
patchset I still allow the assignment of device to domain at the software
level, but defer writing the domain info, especially the pte_root, into the
dev table entry until the device driver init stage.
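
The two-stage idea above can be sketched in plain userspace C. This is only
an illustrative model, not the code in the patches: the types, the tiny
dev_table array and the names attach_device_sw() and sync_dte() are all
hypothetical, and a real dev table entry holds far more than a pte_root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_DEVS 4                 /* hypothetical tiny device table */

struct domain {
    uint64_t pte_root;             /* page table root of the new domain */
};

struct dev_state {
    struct domain *domain;         /* software-level assignment */
    bool dte_synced;               /* has the hardware entry been updated? */
};

uint64_t dev_table[NUM_DEVS];      /* stands in for the hardware DTEs */
struct dev_state devs[NUM_DEVS];

/* IOMMU driver init stage: record the assignment in software only and
 * leave the entry inherited from the old kernel untouched. */
void attach_device_sw(unsigned int devid, struct domain *dom)
{
    assert(devid < NUM_DEVS);
    devs[devid].domain = dom;
    devs[devid].dte_synced = false;
}

/* Device driver init stage: the device has been reset by its driver, so
 * it is now safe to publish the new domain's pte_root. */
void sync_dte(unsigned int devid)
{
    assert(devid < NUM_DEVS);
    if (devs[devid].domain && !devs[devid].dte_synced) {
        dev_table[devid] = devs[devid].domain->pte_root;
        devs[devid].dte_synced = true;
    }
}
```

Until sync_dte() runs for a device, in-flight DMA set up by the old kernel
still sees the old entry, which is the point of deferring the update.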

v5:
The bnx2 NIC can't reset itself during driver init. Posted a patch to reset
it during driver init; IO_PAGE_FAULT is no longer seen.

Below is the link to the v5 post.
https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html

v5->v6:
Following Joerg's comments, made the main changes below:
- Add sanity check when copying old dev tables.

- Discard the old patch 6/8.

- If a device is set up with guest translations (DTE.GV=1), then don't
  copy that information but move the device over to an empty guest-cr3
  table and handle the faults in the PPR log (which just answer them
  with INVALID).

Issues that need discussion:
- Joerg suggested hooking the update of the domain info into the dte
  entry into the set_dma_mask call-back. I tried, but on my local machine
  with amd iommu v2, an ohci pci device doesn't call set_dma_mask, and
  the log was then flooded with IO_PAGE_FAULT messages.

  00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)

- About the GCR3 root pointer copying issue, I don't know how to set up
  the test environment and haven't tested it yet. I hope Joerg or Zongshun
  can tell me what steps should be taken to test it, or help run a test in
  your test environment.
 
Baoquan He (9):
  iommu/amd: Detect pre enabled translation
  iommu/amd: add several helper function
  iommu/amd: Define bit fields for DTE particularly
  iommu/amd: Add function copy_dev_tables
  iommu/amd: copy old trans table from old kernel
  iommu/amd: Don't update domain info to dte entry at iommu init stage
  iommu/amd: Update domain into to dte entry during device driver init
  iommu/amd: Add sanity check of irq remap information of old dev table entry
  iommu/amd: Don't copy GCR3 table root pointer

 drivers/iommu/amd_iommu.c   |  93 +---
 drivers/iommu/amd_iommu_init.c  | 189 +---
 drivers/iommu/amd_iommu_proto.h |   2 +
 drivers/iommu/amd_iommu_types.h |  53 ++-
 drivers/iommu/amd_iommu_v2.c|  18 +++-
 5 files changed, 307 insertions(+), 48 deletions(-)

-- 
2.5.5


