Re: [PATCH 00/19] Fix Intel IOMMU breakage in kdump kernel

2015-06-24 Thread Li, ZhenHua

Hi all,
I see Joerg has backported it to sles12 in the commoent of novel
bugzilla 856382, so I will only backport it to redhat el.

Thanks
Zhenhua

On 06/13/2015 02:47 PM, Joerg Roedel wrote:

Hi,

as David Woodhouse pointed out, my fixes and cleanups for
the original patch-set turned out to be a complete rewrite.
So to have a cleaner history of the feature and to make
backporting easier, here is a rewrite of my changes based on
v4.1-rc7.

Some additional issues have been fixed by this rewrite, like
a kdump-kernel boot panic with 'iommu=pt' and support for
copying the extended root-entry and context table formats
has been added.

I plan to rebuild the x86/vt-d branch of the iommu tree with
these patches.

Regards,

Joerg

Joerg Roedel (19):
   iommu/vt-d: Cleanup log messages
   iommu/vt-d: Init QI before root entry is allocated
   iommu/vt-d: Make root entry visible for hardware right after
 allocation
   iommu/vt-d: Detect pre enabled translation
   iommu/vt-d: Copy translation tables from old kernel
   iommu/vt-d: Do not re-use domain-ids from the old kernel
   iommu/vt-d: Mark copied context entries
   iommu/vt-d: Allocate si_domain in init_dmars()
   iommu/vt-d: Don't do early domain assignment if kdump kernel
   iommu/vt-d: Don't copy translation tables if RTT bit needs to be
 changed
   iommu/vt-d: Don't disable translation prior to OS handover
   iommu/vt-d: Enable Translation only if it was previously disabled
   iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
   iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
   iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
   iommu/vt-d: Set IRTA in intel_setup_irq_remapping
   iommu/vt-d: Copy IR table from old kernel when in kdump mode
   iommu/vt-d: Make sure copied over IR entries are not reused
   iommu/vt-d: Don't disable IR when it was previously enabled

  drivers/iommu/dmar.c|  28 +-
  drivers/iommu/intel-iommu.c | 495 
  drivers/iommu/intel_irq_remapping.c | 255 ---
  include/linux/intel-iommu.h |   5 +
  4 files changed, 573 insertions(+), 210 deletions(-)



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/19] Fix Intel IOMMU breakage in kdump kernel

2015-06-24 Thread Li, ZhenHua

On 06/23/2015 10:38 PM, David Woodhouse wrote:

On Tue, 2015-06-23 at 16:06 +0200, Joerg Roedel wrote:

On Tue, Jun 23, 2015 at 02:31:30PM +0100, David Woodhouse wrote:

However, it's still fairly gratuitous for all non-broken hardware, and
will tend to hide hardware and driver bugs during testing of new
hardware.

I'd much rather see this limited to a blacklist of known-broken
devices, an accompanied by a kernel message along the lines of

  'Preserving VT-d page tables for broken HP device :'

For *any* device which isn't so broken that it craps itself on taking
a DMA fault and cannot be reset, this page table copy shouldn't be
needed, right?


In theory yes, but as it came to my mind recently, there is this BIOS
"value-add" called APEI (ACPI Platform Error Interface) which has a
'Firmware first' mode.

So when this is active the firmware handles any errors happening in the
system and reports them to the OS with a severity it can decide on its
own.

Such errors could be DMA target aborts, for example. And I have seen
systems where at least rejected interrupt requests were reported to the
OS as fatal errors, causing a kernel panic in Linux. But the firmware is
also free to report ordinary DMA failures as fatal errors, who knows...


Yay for BIOS value subtract.

The thing is, this would be utterly broken. The IOMMU is supposed to
protect us from rogue devices. In this hypothetical scenario, a device
can bring the entire system down and we have no chance to isolate it
and recover. It means that assigning devices to guests should be
*disallowed* because it can't be done securely.

On this kind of system, we might as well turn off the IOMMU entirely as
in a lot of respects, it's only making things *worse*.


So while you are right that these changes might hide hardware and driver
bugs, I think it is still the best to try avoiding such faults at all
costs in the kdump kernel to actually get a dump, even if the device
would actually be able to recover from the master abort.


How about an *option* to do it for all devices (which in turn can
perhaps be triggered by a system-level blacklist for things like APEI,
or perhaps just a system DMI match on "HP").


Hi David,
It is a bad idea to check the DMI match on "HP". Though I have not see
any similar problems on other systems, I believe there are. Also not
all HP systems have such problem.
I agree with a blacklist for devices.

Thanks
Zhenhua

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-06-10 Thread Li, ZhenHua

On 06/10/2015 05:21 PM, Joerg Roedel wrote:

On Tue, Jun 09, 2015 at 01:55:50PM +0100, David Woodhouse wrote:

On Mon, 2015-06-08 at 18:13 +0200, Joerg Roedel wrote:

So I think we need to read out that bit when we find translation enabled
and if it is different from what we would set it to, we bail out of any
copying, disable translation and proceed as in a normal boot.


Given that this is only for kdump and not the general case of kexec,
that's probably tolerable. Of course we do still need to make it *not*
broken for the case where DMA_RTADDR_RTT is set, as it is at the
moment.


Yes, I just sent a patch for this and will include it into my x86/vt-d
branch if not objections come in.


And I suspect if we're doing that, it might be simple enough to make it
convert to/from the extended page tables. I don't think we want to
preserve PASID tables; only the "second level" (i.e. traditional)
translation. So we could happily pull the page table pointer out of
either kind of context entry, and install it into either kind. I think
there's a simple mapping of translation types too. I need to sort out
the translation types when adding the real PASID support (imminently!)
anyway.


What happens when we take away the PASID tables from a device? Can it
also go into some failure state?
When doing this, we need at least setup the page request queue before we
copy over anything and change the root-entry. Then we can handle any
faults that are caused by this and tell the device to not try further.


Joerg

Is PASID part of new specs? Is there any plan to upgrade the driver to 
support the latest vt-d specs?


Zhenhua

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/17] Fixes and Cleanups for the Intel VT-d driver

2015-06-08 Thread Li, ZhenHua

My pleasure, thanks.

Zhenhua

On 06/08/2015 04:23 PM, Joerg Roedel wrote:

On Mon, Jun 08, 2015 at 04:06:45PM +0800, Li, ZhenHua wrote:

Finished testing on my HP huge system, SuperDome X.
It works well.


Thanks for testing this, Zhen-Hua. Might I add your Tested-by to the
patches?


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/17] Fixes and Cleanups for the Intel VT-d driver

2015-06-08 Thread Li, ZhenHua

Finished testing on my HP huge system, SuperDome X.
It works well.

Thanks
Zhenhua

On 06/05/2015 10:10 PM, Joerg Roedel wrote:

Hey,

here are a couple of patches to fix the fall-out from the
patch set fixing the kdump faults with VT-d enabled. A few
important issues have been fixed:

* Moved sleeping functions out of atomic sections,
  mainly iremap_cache/iounmap.

* Mark domain ids used in the old kernel as reserved
  in the kdump kernel, so that any new domains don't
  interfere with domains from the old kernel

* Mark all IRT entries from the old kernel as
  reservered in the new kernel, so that we don't
  overwrite an entry which might still be in use by
  a device

* Only try to load anything from the old kernel when
  we are in a kdump kernel. If we find the iommu
  hardware enabled and we are not in a kdump kernel,
  just disable the iommu and proceed with
  initialization.

* Fix a compile error

Besides that I also cleaned up the code to simplify it:

* Move necessary initialization steps before the
  root entry table and the irq remapping table is
  programmed into hardware. This makes sure QI is
  initialized so that we can flush the iommu caches
  when hardware got the new tables.

* Make the new root entry table and irq remapping
  table visible to hardware immediatly after they
  are created and loaded with the data from the old
  kernel. This allows to remove the code to update
  both, the old and the new version of the table at
  the same time.

* Clean up log messages from the VT-d driver to have
  a common prefix that can be grepped for.

* Remove unused code

The changes have been tested by me an Baoquan He. I'll plan
to put them into the x86/vt-d branch for v4.2.

Regards,

Joerg

Joerg Roedel (17):
   iommu/vt-d: Fix compile error when CONFIG_INTEL_IOMMU=n
   iommu/vt-d: Remove __iommu_save_to_oldmem() function
   iommu/vt-d: Make two functions static
   iommu/vt-d: Load old data structures only in kdump kernel
   iommu/vt-d: Mark root-entry present in set_root_entry
   iommu/vt-d: Rework loading of old root-entry table
   iommu/vt-d: Copy context-tables outside of spin_lock
   iommu/vt-d: Don't reuse domain-ids from old kernel
   iommu/vt-d: Clean up log messages in intel-iommu.c
   iommu/vt-d: Allocate irq remapping table bitmap with GFP_KERNEL
   iommu/vt-d: Move QI initialization to intel_setup_irq_remapping
   iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
   iommu/vt-d: Split up iommu_set_irq_remapping
   iommu/vt-d: Move kdump pointer intialization to __iommu_load_old_irte
   iommu/vt-d: Mark irt entries from old kernel as allocated
   iommu/vt-d: Copy old irte in intel_setup_irq_remapping
   iommu/vt-d: Remove update code for old ir-table

  drivers/iommu/dmar.c|  26 +-
  drivers/iommu/intel-iommu.c | 463 ++--
  drivers/iommu/intel_irq_remapping.c | 263 +---
  include/linux/intel-iommu.h |  17 +-
  4 files changed, 299 insertions(+), 470 deletions(-)



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch] iommu/vt-d: unlock on error in intel_iommu_load_translation_tables()

2015-06-02 Thread Li, ZhenHua

Dan,

Do not need to call __iommu_free_mapped_mem() for err_unlock. for
no new elements are added into __iommu_remapped_mem when drops to
err_unlock.


Thanks
Zhenhua

On 06/02/2015 10:06 PM, Dan Carpenter wrote:

On Tue, Jun 02, 2015 at 03:45:18PM +0200, Joerg Roedel wrote:

On Tue, Jun 02, 2015 at 01:09:58PM +0300, Dan Carpenter wrote:

Smatch found some error paths that don't unlock.  Also we can return
-ENOMEM instead of -1 if we don't have an old root entry.

Fixes: 5908f10af4b9 ('iommu/vt-d: datatypes and functions used for kdump')
Signed-off-by: Dan Carpenter 


Applied, thanks.


Releasing the lock is good, but we might need to do other error handling
as well.  Presumably this function always succeeds anyway?  It seems
like it might be essential for booting.


What do you mean by 'other error handling'? In error case a negative
value is returned and the caller checks that.


I was just worried we should call __iommu_free_mapped_mem() or
something.  I don't know this code very well is what I'm saying...

regards,
dan carpenter



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 08/10] iommu/vt-d: assign new page table for dma_map

2015-05-21 Thread Li, ZhenHua

Hi Baoquan,
During driver being loaded and initialized, when there is a new dma
request, the function
   __get_valid_domain_for_dev
is called, and then new page is mapped.

Please check this:
struct dma_map_ops intel_dma_ops = {
.alloc = intel_alloc_coherent,
.free = intel_free_coherent,
.map_sg = intel_map_sg,
.unmap_sg = intel_unmap_sg,
.map_page = intel_map_page,
.unmap_page = intel_unmap_page,
.mapping_error = intel_mapping_error,
};

You can also add dump_stack() in __get_valid_domain_for_dev to debug.

Thanks
Zhenhua

On 05/21/2015 02:54 PM, Baoquan He wrote:

On 05/21/15 at 09:27am, Li, ZhenHua wrote:

Hi Baoquan,

In the early version of this patchset, old page tables are used by new
kernel.  But as discussed, we need to make kernel use new pages when
there is a new dma request , so we need to unmap the pages which were
mapped in old kernel, and this is what this patch does.


OK, just a new page table allocated in init_domain(), right? I thought a
specific empty page-table is allocated for these new domains in kdump
kernel.



Thanks
Zhenhua

On 05/21/2015 07:52 AM, Baoquan He wrote:

On 05/11/15 at 05:52pm, Li, Zhen-Hua wrote:

When a device driver issues the first dma_map command for a device, we
assign a new and empty page-table, thus removing all mappings from the
old kernel for the device.


Hi Zhenhua,

 From your patch I got it will remove all mappings, assign a new
page-table. But I didn't got why you stress an empty page-table. Did I
miss anything?

Thanks
Baoquan



Signed-off-by: Li, Zhen-Hua 
--

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 08/10] iommu/vt-d: assign new page table for dma_map

2015-05-20 Thread Li, ZhenHua

Hi Baoquan,

In the early version of this patchset, old page tables are used by new
kernel.  But as discussed, we need to make kernel use new pages when
there is a new dma request , so we need to unmap the pages which were
mapped in old kernel, and this is what this patch does.

Thanks
Zhenhua

On 05/21/2015 07:52 AM, Baoquan He wrote:

On 05/11/15 at 05:52pm, Li, Zhen-Hua wrote:

When a device driver issues the first dma_map command for a device, we
assign a new and empty page-table, thus removing all mappings from the
old kernel for the device.


Hi Zhenhua,

 From your patch I got it will remove all mappings, assign a new
page-table. But I didn't got why you stress an empty page-table. Did I
miss anything?

Thanks
Baoquan



Signed-off-by: Li, Zhen-Hua 
---
  drivers/iommu/intel-iommu.c | 58 ++---
  1 file changed, 50 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 91545bf..3cc1027 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -396,6 +396,9 @@ static int copy_root_entry_table(struct intel_iommu *iommu);

  static int intel_iommu_load_translation_tables(struct intel_iommu *iommu);

+static void unmap_device_dma(struct dmar_domain *domain,
+   struct device *dev,
+   struct intel_iommu *iommu);
  static void iommu_check_pre_te_status(struct intel_iommu *iommu);
  static u8 g_translation_pre_enabled;

@@ -3115,6 +3118,7 @@ static struct iova *intel_alloc_iova(struct device *dev,
  static struct dmar_domain *__get_valid_domain_for_dev(struct device *dev)
  {
struct dmar_domain *domain;
+   struct intel_iommu *iommu;
int ret;

domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
@@ -3124,14 +3128,30 @@ static struct dmar_domain 
*__get_valid_domain_for_dev(struct device *dev)
return NULL;
}

-   /* make sure context mapping is ok */
-   if (unlikely(!domain_context_mapped(dev))) {
-   ret = domain_context_mapping(domain, dev, 
CONTEXT_TT_MULTI_LEVEL);
-   if (ret) {
-   printk(KERN_ERR "Domain context map for %s failed",
-  dev_name(dev));
-   return NULL;
-   }
+   /* if in kdump kernel, we need to unmap the mapped dma pages,
+* detach this device first.
+*/
+   if (likely(domain_context_mapped(dev))) {
+   iommu = domain_get_iommu(domain);
+   if (iommu->pre_enabled_trans) {
+   unmap_device_dma(domain, dev, iommu);
+
+   domain = get_domain_for_dev(dev,
+   DEFAULT_DOMAIN_ADDRESS_WIDTH);
+   if (!domain) {
+   pr_err("Allocating domain for %s failed",
+  dev_name(dev));
+   return NULL;
+   }
+   } else
+   return domain;
+   }
+
+   ret = domain_context_mapping(domain, dev, CONTEXT_TT_MULTI_LEVEL);
+   if (ret) {
+   pr_err("Domain context map for %s failed",
+  dev_name(dev));
+   return NULL;
}

return domain;
@@ -5168,6 +5188,28 @@ static int intel_iommu_load_translation_tables(struct 
intel_iommu *iommu)
return ret;
  }

+static void unmap_device_dma(struct dmar_domain *domain,
+   struct device *dev,
+   struct intel_iommu *iommu)
+{
+   struct context_entry *ce;
+   struct iova *iova;
+   phys_addr_t phys_addr;
+   dma_addr_t dev_addr;
+   struct pci_dev *pdev;
+
+   pdev = to_pci_dev(dev);
+   ce = iommu_context_addr(iommu, pdev->bus->number, pdev->devfn, 1);
+   phys_addr = context_address_root(ce) << VTD_PAGE_SHIFT;
+   dev_addr = phys_to_dma(dev, phys_addr);
+
+   iova = find_iova(&domain->iovad, IOVA_PFN(dev_addr));
+   if (iova)
+   intel_unmap(dev, dev_addr);
+
+   domain_remove_one_dev_info(domain, dev);
+}
+
  static void iommu_check_pre_te_status(struct intel_iommu *iommu)
  {
u32 sts;
--
2.0.0-rc0



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-19 Thread Li, ZhenHua

Hi Dave,

Don't worry :).
Of cause I have read your comments, but most of them contains no actual
code change request, so I did not reply them one by one.
When we are sure there is no actual code change needed, I will update
the comments and other format problems if necessary.

Regards
Zhenhua

On 05/19/2015 09:13 AM, Dave Young wrote:

Hi,

On 05/18/15 at 06:05pm, Li, ZhenHua wrote:

Hi Joerg,

Testing from HP: passed.
Testing from He Baoquan(Redhat): passed

The problem that dmar fault came again when running v10 with latest
kernel is fixed.
And I think there is no need to update the code to a new version now.

So please tell me if there are still any code need to be changed.


Have you reviewed all my comments? Although it is only iommu driver fixes,
The patches are sent to lkml, no? If you do not want other people's comments
why will you send to us? I would say do not assume it is a closed circle for
only iommu driver.

It is not easy to review patches actually, frankly I feel kernel code review
is not so strict like the early days, maybe one reason is people are getting
tired to argue.

Thanks
Dave



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-18 Thread Li, ZhenHua

Hi Joerg,

Testing from HP: passed.
Testing from He Baoquan(Redhat): passed

The problem that dmar fault came again when running v10 with latest
kernel is fixed.
And I think there is no need to update the code to a new version now.

So please tell me if there are still any code need to be changed.

Thanks
Zhenhua

On 05/13/2015 09:54 AM, Li, ZhenHua wrote:

One thing must be pointed out:
There is a known issue that hpsa driver cannot work well in kdump
kernel. And this patchset is not intended to fix this problem.

So this patchset cannot work with HP smart array devices which need hpsa
driver.

On 05/11/2015 05:52 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix
for:
If a kernel boots with intel_iommu=on on a system that supports intel
vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is
clear

On some system, the interrupt remapping fault will also happen even if
the
intel_iommu is not set to on, because the interrupt remapping will be
enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version,
and the
INTR-Remap fault is caused by a similar reason. In short, the
initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table, copy data from the old ones
that
used by the old kernel.
5. Keep using the old page tables before driver is loaded.
6. After device driver is loaded, when it issues the first dma_map
command,
free the dmar_domain structure for this device, and generate a new
one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its
address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA
register.
4. When ioapic entry is setup, the interrupt remapping table is
changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by
the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply
left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from
the memory
of old kernel.
4. Functions to malloc new context entry table and copy the data from
the old
ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump
kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type,
not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53
 https://lkml.org/lkml/2015/1/6/1166
 https://lkml.org/lkml/2015/1/12/35
 https://lkml.org/lkml/2015/3/19/33
 https://lkml.org/lkml/2015/4/10/135

Changelog[v11]:
 1. Fix some conflicts with the latest upstream kernel.
Add flush in iommu_context_addr

Changelog[v10]:
 1. Do not use CONFIG_CRASH_DUMP and is_kdump_kernel().
Use one flag which stores the te and ir status in last kernel:
iommu->pre_enabled_trans
iommu->pre_enabled_

Re: [PATCH v11 04/10] iommu/vt-d: functions to copy data from old mem

2015-05-13 Thread Li, ZhenHua

Hi Baoquan,
I am using a list here to store all the mapped addresses, and unmap them 
out of iounmap.


About the reason, please check the old mails. I cannot remember the 
detailed reasons.


Thanks
Zhenhua

On 05/13/2015 05:00 PM, Baoquan He wrote:

On 05/11/15 at 05:52pm, Li, Zhen-Hua wrote:

Add some functions to copy the data from old kernel.
These functions are used to copy context tables and page tables.

To avoid calling iounmap between spin_lock_irqsave and spin_unlock_irqrestore,
use a link here, store the pointers , and then use iounmap to free them in
another place.


Hi Zhenhua,

I remember you mentioned iounmap will cause error inside
spin_lock_irqsave. Do you know why it happened now? And could you also
describe why avoid calling iounmap between
spin_lock_irqsave/unlock_irqsave  is needed here and what's the status
now?

I think other reviewer may want to know it too.

Thanks
Baoquan



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 06/10] iommu/vt-d: datatypes and functions used for kdump

2015-05-13 Thread Li, ZhenHua

Hi Dave,

iommu->root_entry_old_virt is used to store the mapped old rta.
iommu->root_entry_old_phys is used to store the physical address stored 
in registers.

So we must not free/unmap iommu->root_entry_old_phys .

Zhenhua
On 05/13/2015 04:56 PM, Baoquan He wrote:

+
+   iommu->root_entry_old_phys = q & VTD_PAGE_MASK;
+   if (!iommu->root_entry_old_phys) {
+   pr_err("Could not read old root entry address.");
+   return -1;
+   }
+


I didn't find where you call iounmap to free mapping of
iommu->root_entry_old_phys. Am I missing anything?


+   iommu->root_entry_old_virt = ioremap_cache(iommu->root_entry_old_phys,
+   VTD_PAGE_SIZE);
+   if (!iommu->root_entry_old_virt) {
+   pr_err("Could not map the old root entry.");
+   return -ENOMEM;
+   }


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 07/10] iommu/vt-d: enable kdump support in iommu module

2015-05-12 Thread Li, ZhenHua

+static u8 g_translation_pre_enabled;

Hi Zhenhua,

I haven't checked patch one by one, am going through the code flow.

About g_translation_pre_enabled, I don't think it's necessary to define
it as a global variable. Both its assignment and judgement are in
init_dmars(). In this situation a local variable translation_pre_enabled
in init_dmars() is enough.

You can assign value to it here:

iommu_check_pre_te_status(iommu);
if (iommu->pre_enabled_trans) {
translation_pre_enabled = 1;
...
}

Thanks
Baoquan


Hi Baoquan,
This variable is only be used in this file, for it is defined as static.
Till now, I think both global and local variable are fine, got the same
thing.
But I believe global is better, because if other functions want to know 
whether translation is enabled, this global variable is a good choice.


Thanks
Zhenhua





___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-12 Thread Li, ZhenHua

One thing must be pointed out:
There is a known issue that hpsa driver cannot work well in kdump 
kernel. And this patchset is not intended to fix this problem.


So this patchset cannot work with HP smart array devices which need hpsa 
driver.


On 05/11/2015 05:52 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table, copy data from the old ones that
used by the old kernel.
5. Keep using the old page tables before driver is loaded.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and copy the data from the old
ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53
 https://lkml.org/lkml/2015/1/6/1166
 https://lkml.org/lkml/2015/1/12/35
 https://lkml.org/lkml/2015/3/19/33
 https://lkml.org/lkml/2015/4/10/135

Changelog[v11]:
 1. Fix some conflicts with the latest upstream kernel.
Add flush in iommu_context_addr

Changelog[v10]:
 1. Do not use CONFIG_CRASH_DUMP and is_kdump_kernel().
Use one flag which stores the te and ir status in last kernel:
iommu->pre_enabled_trans
iommu->pre_enabled_ir

Changelog[v9]:
 1. Add new function iommu_attach_domain_with_id.
 2. Do not copy old page tables, keep using the old ones.
 3. Remove functions:
intel_iommu_did_to_domain_values_entry
intel_iommu_get_dids_from_old_kernel
device_to_domain_id
copy_page_addr
copy_page_table
copy_context_entry
cop

Re: [PATCH v11 05/10] iommu/vt-d: Add functions to load and save old re

2015-05-12 Thread Li, ZhenHua

On 05/12/2015 04:37 PM, Dave Young wrote:

Seems the subject was truncated? Maybe "re" means root entry? Then please fix it

On 05/11/15 at 05:52pm, Li, Zhen-Hua wrote:

Add functions to load root entry table from old kernel, and to save updated
root entry table.
Add two member in struct intel_iommu, to store the RTA in old kernel, and
the mapped virt address of it.


Please explain a bit what is "RTA" in patch log, it is unclear to most of people
who do not know iommu details.





If people want to understand this patchset, I assume they have read the 
vt-d specs. RTA is defined clearly in this spec.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 02/10] iommu/vt-d: Items required for kdump

2015-05-12 Thread Li, ZhenHua

On 05/12/2015 04:17 PM, Dave Young wrote:

On 05/11/15 at 05:52pm, Li, Zhen-Hua wrote:

Add context entry functions needed for kdump.
+/*
+ * Fix Crashdump failure caused by leftover DMA through a hardware IOMMU
+ *
+ * Fixes the crashdump kernel to deal with an active iommu and legacy
+ * DMA from the (old) panicked kernel in a manner similar to how legacy
+ * DMA is handled when no hardware iommu was in use by the old kernel --
+ * allow the legacy DMA to continue into its current buffers.
+ *
+ * In the crashdump kernel, this code:
+ * 1. skips disabling the IOMMU's translating.
+ * 2. Do not re-enable IOMMU's translating.
+ * 3. In kdump kernel, use the old root entry table.
+ * 4. Allocate pages for new context entry, copy data from old context entries
+ *in the old kernel to the new ones.
+ *
+ * In other kinds of kernel, for example, a kernel started by kexec,
+ * do the same thing as crashdump kernel.
+ */
+
+


Above comments should come along with the code changes instead of putting it
in this patch.

BTW, there's one more blank line at the end..

Code change is in 00/10, the cover letter.
And the blank does not matter, I checked the patch with 
scripts/checkpatch.pl,  no warnings, no errors.



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-07 Thread Li, ZhenHua

Hi Joerg,
This problem is caused by the latest updates in iommu module, and we are 
trying to fix it.

When it is fixed, we will send out a new version of the patchset.

Thanks
Zhenhua

On 05/08/2015 01:32 AM, Joerg Roedel wrote:

Hi Baoquan, ZhenHua,

On Mon, May 04, 2015 at 11:17:49AM +0800, Baoquan He wrote:

On 05/04/15 at 11:06am, Li, ZhenHua wrote:

Hi baoquan,
Could you paste the kernel log of the first kernel ?


Please let me know when you have worked this issue out.


Thanks,

Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 04/10] iommu/vt-d: functions to copy data from old mem

2015-05-07 Thread Li, ZhenHua

It is called in
static int copy_root_entry_table(struct intel_iommu *iommu);


On 05/07/2015 03:49 PM, Baoquan He wrote:

On 04/10/15 at 04:42pm, Li, Zhen-Hua wrote:

Add some functions to copy the data from old kernel.
These functions are used to copy context tables and page tables.

To avoid calling iounmap between spin_lock_irqsave and spin_unlock_irqrestore,
use a link here, store the pointers , and then use iounmap to free them in
another place.

Li, Zhen-hua:
 The functions and logics.

Takao Indoh:
 Check if pfn is ram:
 if (page_is_ram(pfn))

Signed-off-by: Li, Zhen-Hua 
Signed-off-by: Takao Indoh 
---
  drivers/iommu/intel-iommu.c | 102 
  include/linux/intel-iommu.h |   6 +++
  2 files changed, 108 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ff5ac04..5ba403a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -373,6 +373,17 @@ static struct context_entry 
*device_to_existing_context_entry(
struct intel_iommu *iommu,
u8 bus, u8 devfn);

+/*
+ * A structure used to store the address allocated by ioremap();
+ * The we need to call iounmap() to free them out of spin_lock_irqsave/unlock;
+ */
+struct iommu_remapped_entry {
+   struct list_head list;
+   void __iomem *mem;
+};
+static LIST_HEAD(__iommu_remapped_mem);
+static DEFINE_MUTEX(__iommu_mem_list_lock);
+

  /*
   * This domain is a statically identity mapping domain.
@@ -4817,3 +4828,94 @@ static struct context_entry 
*device_to_existing_context_entry(
return ret;
  }

+/*
+ * Copy memory from a physically-addressed area into a virtually-addressed area
+ */


I don't find where __iommu_load_from_oldmem is called. Obsolete code of
this patch?


+int __iommu_load_from_oldmem(void *to, unsigned long from, unsigned long size)
+{
+   unsigned long pfn;  /* Page Frame Number */
+   size_t csize = (size_t)size;/* Num(bytes to copy) */
+   unsigned long offset;   /* Lower 12 bits of to */
+   void __iomem *virt_mem;
+   struct iommu_remapped_entry *mapped;
+
+   pfn = from >> VTD_PAGE_SHIFT;
+   offset = from & (~VTD_PAGE_MASK);
+
+   if (page_is_ram(pfn)) {
+   memcpy(to, pfn_to_kaddr(pfn) + offset, csize);
+   } else{
+
+   mapped = kzalloc(sizeof(struct iommu_remapped_entry),
+   GFP_KERNEL);
+   if (!mapped)
+   return -ENOMEM;
+
+   virt_mem = ioremap_cache((unsigned long)from, size);
+   if (!virt_mem) {
+   kfree(mapped);
+   return -ENOMEM;
+   }
+   memcpy(to, virt_mem, size);
+
+   mutex_lock(&__iommu_mem_list_lock);
+   mapped->mem = virt_mem;
+   list_add_tail(&mapped->list, &__iommu_remapped_mem);
+   mutex_unlock(&__iommu_mem_list_lock);
+   }
+   return size;
+}
+
+/*
+ * Copy memory from a virtually-addressed area into a physically-addressed area
+ */
+int __iommu_save_to_oldmem(unsigned long to, void *from, unsigned long size)
+{
+   unsigned long pfn;  /* Page Frame Number */
+   size_t csize = (size_t)size;/* Num(bytes to copy) */
+   unsigned long offset;   /* Lower 12 bits of to */
+   void __iomem *virt_mem;
+   struct iommu_remapped_entry *mapped;
+
+   pfn = to >> VTD_PAGE_SHIFT;
+   offset = to & (~VTD_PAGE_MASK);
+
+   if (page_is_ram(pfn)) {
+   memcpy(pfn_to_kaddr(pfn) + offset, from, csize);
+   } else{
+   mapped = kzalloc(sizeof(struct iommu_remapped_entry),
+   GFP_KERNEL);
+   if (!mapped)
+   return -ENOMEM;
+
+   virt_mem = ioremap_cache((unsigned long)to, size);
+   if (!virt_mem) {
+   kfree(mapped);
+   return -ENOMEM;
+   }
+   memcpy(virt_mem, from, size);
+   mutex_lock(&__iommu_mem_list_lock);
+   mapped->mem = virt_mem;
+   list_add_tail(&mapped->list, &__iommu_remapped_mem);
+   mutex_unlock(&__iommu_mem_list_lock);
+   }
+   return size;
+}
+
+/*
+ * Free the mapped memory for ioremap;
+ */
+int __iommu_free_mapped_mem(void)
+{
+   struct iommu_remapped_entry *mem_entry, *tmp;
+
+   mutex_lock(&__iommu_mem_list_lock);
+   list_for_each_entry_safe(mem_entry, tmp, &__iommu_remapped_mem, list) {
+   iounmap(mem_entry->mem);
+   list_del(&mem_entry->list);
+   kfree(mem_entry);
+   }
+   mutex_unlock(&__iommu_mem_list_lock);
+   return 0;
+}
+
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index a65208a..4bca7b5 100644
--- a/include/linux/intel-iommu.h

Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-05 Thread Li, ZhenHua

Dave,
This patchset will only write root tables in old kernel,  if it is 
corrupted, faults will also happen in old kernel, and hardware would 
mark it. So things will not go worse..


Thanks
Zhenhua
On 05/06/2015 09:51 AM, Dave Young wrote:

On 05/05/15 at 05:31pm, Joerg Roedel wrote:

On Tue, May 05, 2015 at 02:14:23PM +0800, Dave Young wrote:

The failure is nothing different, but as I said in another reply the
difference is we could use corrupted data to possiblly cause more failure.


I still fail to see how things can get more worse than they already are
by reusing the old data (we just reuse it, we do not modify anything


DMA write will modify system ram, if the old data is corrupted  it is possible
that DMA operation modify wrong ram regions because of wrong mapping.
Am I missing something and is it not possible?

Thanks
Dave



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-03 Thread Li, ZhenHua

Hi baoquan,
Could you paste the kernel log of the first kernel ?

Thanks
Zhenhua
On 05/03/2015 04:55 PM, Baoquan He wrote:

On 04/29/15 at 07:20pm, Baoquan He wrote:

Bad news, I rebuilt a kernel with your patchset on 4.0.0+ (this commit
f614c81). Now dmar fault is  seen again.

The lspci log and kdump log are attached, please check:


I found the lspci log previously attached is emtyp, resend it again.




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-28 Thread Li, ZhenHua

Hi Baoquan,

If old tables are corrupted, we will see the DMAR faults or INTR faults
(which we have seen), or some other error messages. Most of these
messages are from hardware. This means, hardware will do some check when 
running. But I don't think hardware will completely check the

tables.

Till now, I do not have a good idea to do the check in kdump kernel.


Thanks
Zhenhua


On 04/28/2015 04:54 PM, Baoquan He wrote:

On 04/24/15 at 04:49pm, Dave Young wrote:

On 04/24/15 at 04:35pm, Baoquan He wrote:

On 04/24/15 at 04:25pm, Dave Young wrote:

Hi, Baoquan


I support this patchset.

We should not fear oldmem since reserved crashkernel region is similar.
No one can guarantee that any crazy code won't step into crashkernel
region just because 1st kernel says it's reversed for kdump kernel. Here
the root table and context tables are also not built to allow legal code
to danamge. Both of them has the risk to be corrupted, for trying our
best to get a dumped vmcore the risk is worth being taken.


old mem is mapped in 1st kernel so compare with the reserved crashkernel
they are more likely to be corrupted. they are totally different.


Could you tell how and why they are different? Wrong code will choose
root tables and context tables to danamge when they totally lose
control?


iommu will map io address to system ram, right? not to reserved ram, but
yes I'm assuming the page table is right, but I was worrying they are corrupted
while kernel panic is happening.


OK, I think we may need to think more about the old context tables
reuse. Currently dmar faults will cause error or warning message,
occasionally will cause system with iommu hang in kdump kernel. I don't
know what will happen if old root tables or context tables are corrupted
by evil code. For kdump kernel which use the similar mechanism there's a
verification. When load kdump kernel into reserved crashkernel region a
sha256 sum is calculated, then verify it when jump into kdump kernel
after panic. If corrupted context tables will bring worse result, then
we need consider giving it up and change back to the old way and try
to dump though there's error message.

Hi Zhenhua,

I don't know what's your plan about verification whether old root tables
or old context tables are corrupted. Or have you experimented that what
will happen if old tables are corrupted on purpose.

I am fine if you just put this in a TODO list since that's truly in a
rare case. But it maybe necessary to tell it in patch log.

Thanks
Baoquan



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-20 Thread Li, ZhenHua

Hi Dave,
I found the old mail:
http://lkml.iu.edu/hypermail/linux/kernel/1410.2/03584.html

Please check this and you will find the discussion.

Regards
Zhenhua

On 04/15/2015 02:48 PM, Dave Young wrote:

On 04/15/15 at 01:47pm, Li, ZhenHua wrote:

On 04/15/2015 08:57 AM, Dave Young wrote:

Again, I think it is bad to use old page table, below issues need consider:
1) make sure old page table are reliable across crash
2) do not allow writing oldmem after crash

Please correct me if I'm wrong, or if above is not doable I think I will vote 
for
resetting pci bus.

Thanks
Dave


Hi Dave,

When updating the context tables, we have to write their address to root
tables, this will cause writing to old mem.

Resetting the pci bus has been discussed, please check this:
http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
https://lkml.org/lkml/2014/10/21/890


I know one reason to use old pgtable is this looks better because it fixes the
real problem, but it is not a good way if it introduce more problems because of
it have to use oldmem. I will be glad if this is not a problem but I have not
been convinced.

OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
their own fixes, so it looks not that elegant.

For pci reset, it is not perfect, but it has another advantage, the patch is
simpler. The problem I see from the old discusssion is, reset bus in 2nd kernel
is acceptable but it does not fix things on sparc platform. AFAIK current 
reported
problems are intel and amd iommu, at least pci reset stuff does not make it 
worse.

Thanks
Dave



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-14 Thread Li, ZhenHua

On 04/15/2015 08:57 AM, Dave Young wrote:

Again, I think it is bad to use old page table, below issues need consider:
1) make sure old page table are reliable across crash
2) do not allow writing oldmem after crash

Please correct me if I'm wrong, or if above is not doable I think I will vote 
for
resetting pci bus.

Thanks
Dave


Hi Dave,

When updating the context tables, we have to write their address to root 
tables, this will cause writing to old mem.


Resetting the pci bus has been discussed, please check this:
http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
https://lkml.org/lkml/2014/10/21/890

Thanks
Zhenhua



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-07 Thread Li, ZhenHua

On 04/07/2015 05:08 PM, Dave Young wrote:

On 04/07/15 at 11:46am, Dave Young wrote:

On 04/05/15 at 09:54am, Baoquan He wrote:

On 04/03/15 at 05:21pm, Dave Young wrote:

On 04/03/15 at 05:01pm, Li, ZhenHua wrote:

Hi Dave,

There may be some possibilities that the old iommu data is corrupted by
some other modules. Currently we do not have a better solution for the
dmar faults.

But I think when this happens, we need to fix the module that corrupted
the old iommu data. I once met a similar problem in normal kernel, the
queue used by the qi_* functions was written again by another module.
The fix was in that module, not in iommu module.


It is too late, there will be no chance to save vmcore then.

Also if it is possible to continue corrupt other area of oldmem because
of using old iommu tables then it will cause more problems.

So I think the tables at least need some verifycation before being used.



Yes, it's a good thinking anout this and verification is also an
interesting idea. kexec/kdump do a sha256 calculation on loaded kernel
and then verify this again when panic happens in purgatory. This checks
whether any code stomps into region reserved for kexec/kernel and corrupt
the loaded kernel.

If this is decided to do it should be an enhancement to current
patchset but not a approach change. Since this patchset is going very
close to point as maintainers expected maybe this can be merged firstly,
then think about enhancement. After all without this patchset vt-d often
raised error message, hung.


It does not convince me, we should do it right at the beginning instead of
introduce something wrong.

I wonder why the old dma can not be remap to a specific page in kdump kernel
so that it will not corrupt more memory. But I may missed something, I will
looking for old threads and catch up.


I have read the old discussion, above way was dropped because it could corrupt
filesystem. Apologize about late commenting.

But current solution sounds bad to me because of using old memory which is not
reliable.

Thanks
Dave

Seems we do not have a better solution for the dmar faults.  But I 
believe we can find out how to verify the iommu data which is located in 
old memory.


Thanks
Zhenhua

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-03 Thread Li, ZhenHua

Hi Dave,

There may be some possibilities that the old iommu data is corrupted by
some other modules. Currently we do not have a better solution for the
dmar faults.

But I think when this happens, we need to fix the module that corrupted
the old iommu data. I once met a similar problem in normal kernel, the
queue used by the qi_* functions was written again by another module.
The fix was in that module, not in iommu module.


Thanks
Zhenhua

On 04/03/2015 04:40 PM, Dave Young wrote:

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table, copy data from the old ones that
used by the old kernel.


Have not read all the patches, but I have a question, not sure this has been
answered before. Old memory is not reliable, what if the old memory get 
corrupted
before panic? Is it safe to continue using it in 2nd kernel, I worry that it 
will
cause problems.

Hope I'm wrong though.

Thanks
Dave




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-03 Thread Li, ZhenHua

On 04/03/2015 04:28 PM, Dave Young wrote:

On 03/19/15 at 01:36pm, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

Zhenhua,

I will review the patchset recently, sorry for jumping in late.

Thanks
Dave


Hi Dave,
Thanks for your review. And please also take a look at the plan I sent 
in another mail:


Change use of
 if (is_kdump_kernel()) { }

To:
if (iommu_enabled_in_last_kernel) { }

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [v4 0/8] Add VT-d Posted-Interrupts support - IOMMU part

2015-04-03 Thread Li, ZhenHua

Hi Feng Wu,
In my patchset, I created a new member ir_table->base_old_phys; In the 
normal kernel, everything is the same. In kdump kernel, ir_table->base 
is used for a buffer, and  ir_table->base_old_phys is the physical 
address of the tables used by the old kernel, also being used by the 
current kernel.


I did this in modify_irte():

set_64bit(&irte->high, irte_modified->high);
+
+#ifdef CONFIG_CRASH_DUMP
+   if (is_kdump_kernel())
+   __iommu_update_old_irte(iommu, index);
+#endif
__iommu_flush_cache(iommu, irte, sizeof(*irte));


Here the irte tables are stored in two places:
iommu->ir_table->base : It is a buffer in kdump kernel, which is the 
running kernel;

iommu->ir_table->base_old_phys : It is the irte used by the old kernel;

And function __iommu_update_old_irte is used to save the content of 
iommu->ir_table->base  to iommu->ir_table->base_old_phys. Because in 
kdump kernel, the vt-d is using ir_table->base_old_phys, not 
ir_table->base, so we need to copy the updated ir_table->base to 
ir_table->base_old_phys .



Thanks
Zhenhua

On 04/02/2015 07:28 PM, Joerg Roedel wrote:

On Mon, Feb 02, 2015 at 04:06:56PM +0800, Feng Wu wrote:

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

You can find the VT-d Posted-Interrtups Spec. in the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series was part of http://thread.gmane.org/gmane.linux.kernel.iommu/7708. 
To make things clear, send out IOMMU part here.


Besides the modify_irte() changes I asked for the patch-set looks good.
I just have some concerns what these changes mean for the VT-d kdump
improvements Zhen-Hua Li is working on. Can you please discuss the
implications of having both patch-sets applied with him and make sure
they work together? I think in its current form your patch-set breaks
the kdump support patches. I added Zhen-Hua to Cc.

Thanks,

Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-03 Thread Li, ZhenHua

Hi Joerg,
This is quite strange. I checked the patches from patch 01 to 10 using 
./scripts/checkpatch.pl under the kernel source directory, but got 0 
errors and 0 warning.  Only some white spaces in the cover letter 00, 
but is could not be checked by this script.


But I checked the intel-iommu.c using "checkpatch.pl -f", found too many 
warnings and errors. Maybe we need a new patch to fix them.


Thanks
Zhenhua

On 04/02/2015 07:11 PM, Joerg Roedel wrote:

Hi Zhen-Hua,

On Thu, Mar 19, 2015 at 01:36:18PM +0800, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:


I reviewed this patch-set and it is getting closer to a point where it
could be merged. I found a few white-space errors in the review, please
do a check_patch run on the next round and fix these. Besides that and
given some third-party testing and reviews I think we can look forward
to merge it early after the merge-window for v4.1, to give it enough
testing in -next too.


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 07/10] iommu/vt-d: enable kdump support in iommu module

2015-04-03 Thread Li, ZhenHua

Hi Joerg,
Thinking about it carefully, I think you suggestions are very helpful, 
and the checks should be:
* All these things should be done in the second kernel, not only the 
kdump kernel. Sometimes user may use kexec manually start a new kernel.
* Copying those tables should only happen in the second kernel, only in 
this situation: iommu is enabled in both first and second kernel.


So, I think we can do this:
1. Use a new variable iommu_enabled_in_last_kernel;
2. When iommu module is loaded, it means the iommu is enabled in current 
kernel.  Then we check DMA_GSTS_TES:
   * if DMA_GSTS_TES is set, then it should be the second kernel(kexec 
starts this kernel manually, or it is kdump kernel), and the first 
kernel used iommu tables.  Here we need to copy these tables; then set 
iommu_enabled_in_last_kernel to 1.
   * if DMA_GSTS_TES is NOT set, then we do not need to care about 
these tables.


3. Replace all
#ifdef CONFIG_CRASH_DUMP
if (is_kdump_kernel()) {
// Do some thing
}
#endif

To:
if (iommu_enabled_in_last_kernel) {
 // Do some thing
}

Do you agree with these?

Thanks
Zhenhua

On 04/02/2015 07:06 PM, Joerg Roedel wrote:

On Thu, Mar 19, 2015 at 01:36:25PM +0800, Li, Zhen-Hua wrote:

+#ifdef CONFIG_CRASH_DUMP
+   if (is_kdump_kernel())
+   __iommu_update_old_root_entry(iommu, bus);
+#endif


All the is_kdump_kernel checks in this patch (and maybe in other patches
too) should really be checks whether translation on the IOMMU was
enabled or not when the kernel booted. You might also boot from a kernel
that had translation disabled into a kdump kernel that wants to enable
it. In this case these checks would break.

Speaking of booting from kernels with translation disabled, there is a
valid use of is_kdump_enabled(), to omit iommu initialization in the
kdump kernel when translation was disabled before. But the other checks
should depend on the state the iommu had when booting the kdump kernel.


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 06/10] iommu/vt-d: datatypes and functions used for kdump

2015-01-14 Thread Li, ZhenHua

Hi Baoquan,
Thank you very much for your review. But according to the latest
discussion, the page tables will not be copied from old kernel. We keep
using the old page tables before driver is loaded. So there are many
changes in next version;

See my comments.

On 01/15/2015 11:28 AM, Baoquan He wrote:

On 01/12/15 at 03:06pm, Li, Zhen-Hua wrote:

+/*
+ * Interface to the "copy translation tables" set of functions
+ * from mainline code.
+ */
+static int intel_iommu_load_translation_tables(struct dmar_drhd_unit *drhd,
+   int g_num_of_iommus)


The argument g_num_of_iommus is the same as the global variable, it's better
to rename it as num_of_iommus. And even it can be removed since you can
just use the global variable g_num_of_iommus in this function.

Argument drhd can be intel_iommu because no other member variable in
drhd is needed.


This function is no longer used. So forget the parameters.




+{
+   struct intel_iommu *iommu;  /* Virt(iommu hardware registers) */
+   unsigned long long q;   /* quadword scratch */
+   int ret = 0;/* Integer return code */
+   int i = 0;  /* Loop index */
+   unsigned long flags;
+
+   /* Structure so copy_page_addr() can accumulate things
+* over multiple calls and returns
+*/
+   struct copy_page_addr_parms ppa_parms = copy_page_addr_parms_init;
+   struct copy_page_addr_parms *ppap = &ppa_parms;
+
+
+   iommu = drhd->iommu;
+   q = dmar_readq(iommu->reg + DMAR_RTADDR_REG);
+   if (!q)
+   return -1;
+
+   /* If (list needs initializing) do it here */


This initializing should not be here, because it's not only for this
drhd. It should be done in init_dmars().


Yes you are right. Though the variable domain_values_list will not be
used in next version, I think I need to check if there are any other
similar problems.


+   if (!domain_values_list) {
+   domain_values_list =
+kcalloc(g_num_of_iommus, sizeof(struct list_head),
+   GFP_KERNEL);
+
+   if (!domain_values_list) {
+   pr_err("Allocation failed for domain_values_list 
array\n");
+   return -ENOMEM;
+   }
+   for (i = 0; i < g_num_of_iommus; i++)
+   INIT_LIST_HEAD(&domain_values_list[i]);
+   }
+
+   spin_lock_irqsave(&iommu->lock, flags);
+
+   /* Load the root-entry table from the old kernel
+* foreach context_entry_table in root_entry
+*foreach context_entry in context_entry_table
+*   foreach level-1 page_table_entry in context_entry
+*  foreach level-2 page_table_entry in level 1 page_table_entry
+* Above pattern continues up to 6 levels of page tables
+*Sanity-check the entry
+*Process the bus, devfn, page_address, page_size
+*/
+   if (!iommu->root_entry) {
+   iommu->root_entry =
+   (struct root_entry *)alloc_pgtable_page(iommu->node);
+   if (!iommu->root_entry) {
+   spin_unlock_irqrestore(&iommu->lock, flags);
+   return -ENOMEM;
+   }
+   }
+
+   iommu->root_entry_old_phys = q & VTD_PAGE_MASK;
+   if (!iommu->root_entry_old_phys) {
+   pr_err("Could not read old root entry address.");
+   return -1;
+   }
+
+   iommu->root_entry_old_virt = ioremap_cache(iommu->root_entry_old_phys,
+   VTD_PAGE_SIZE);
+   if (!iommu->root_entry_old_virt) {
+   pr_err("Could not map the old root entry.");
+   return -ENOMEM;
+   }
+
+   __iommu_load_old_root_entry(iommu);
+   ret = copy_root_entry_table(iommu, ppap);
+   __iommu_flush_cache(iommu, iommu->root_entry, PAGE_SIZE);
+   __iommu_update_old_root_entry(iommu, -1);
+
+   spin_unlock_irqrestore(&iommu->lock, flags);
+
+   __iommu_free_mapped_mem();
+
+   if (ret)
+   return ret;
+
+   ppa_parms.last = 1;
+   copy_page_addr(0, 0, 0, 0, 0, NULL, ppap);
+
+   return 0;
+}
+
  #endif /* CONFIG_CRASH_DUMP */
--
2.0.0-rc0


___
kexec mailing list
ke...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 02/10] iommu/vt-d: Items required for kdump

2015-01-13 Thread Li, ZhenHua

On 01/12/2015 11:22 PM, Joerg Roedel wrote:

On Mon, Jan 12, 2015 at 03:06:20PM +0800, Li, Zhen-Hua wrote:

+
+#ifdef CONFIG_CRASH_DUMP
+
+/*
+ * Fix Crashdump failure caused by leftover DMA through a hardware IOMMU
+ *
+ * Fixes the crashdump kernel to deal with an active iommu and legacy
+ * DMA from the (old) panicked kernel in a manner similar to how legacy
+ * DMA is handled when no hardware iommu was in use by the old kernel --
+ * allow the legacy DMA to continue into its current buffers.
+ *
+ * In the crashdump kernel, this code:
+ * 1. skips disabling the IOMMU's translating of IO Virtual Addresses (IOVA).
+ * 2. Do not re-enable IOMMU's translating.
+ * 3. In kdump kernel, use the old root entry table.
+ * 4. Leaves the current translations in-place so that legacy DMA will
+ *continue to use its current buffers.
+ * 5. Allocates to the device drivers in the crashdump kernel
+ *portions of the iova address ranges that are different
+ *from the iova address ranges that were being used by the old kernel
+ *at the time of the panic.
+ *
+ */


It looks like you are still copying the io-page-tables from the old
kernel into the kdump kernel, is that right? With the approach that was
proposed you only need to copy over the context entries 1-1. They are
still pointing to the page-tables in the old kernels memory (which is
just fine).

The root-entry of the old kernel is also re-used, and when the kdump
kernel starts to use a device, its context entry is updated to point to
a newly allocated page-table.



Current status:
1. Use old root entry table,
2. Copy old context entry tables.
3. Copy old page tables.
4. Allocate new page table when device driver is loading.

I remember the progress we discussed was this. Checking the old mails,
founding we did not clearly discuss about the old page tables, only
agrred with allocating new page tables when device driver is loading.

To me, both of these two options are fine, copying old page tables, or
re-use the old ones. My debugging shows both of them works fine.

If we use the old page tables, the patchset will use less code.
Functions copy_page_addr, copy_page_table are no longer needed, and
functions copy_context_entry and copy_context_entry_table will be
reduced to several lines. The patchset will be much simpler.

Zhenhua


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 01/10] iommu/vt-d: Update iommu_attach_domain() and its callers

2015-01-12 Thread Li, ZhenHua

On 01/12/2015 11:18 PM, Joerg Roedel wrote:

On Mon, Jan 12, 2015 at 03:06:19PM +0800, Li, Zhen-Hua wrote:

Allow specification of the domain-id for the new domain.
This patch only adds the 'did' parameter to iommu_attach_domain()
and modifies all of its callers to specify the default value of -1
which says "no did specified, allocate a new one".


I think its better to keep the old iommu_attach_domain() interface in
place and introduce a new function (like iommu_attach_domain_with_id()
or something) which has the additional parameter. Then you can rewrite
iommu_attach_domain():

iommu_attach_domai(...)
{
return iommu_attach_domain_with_id(..., -1);
}

This way you don't have to update all the callers of
iommu_attach_domain() and the interface is more readable.


Joerg



That's a good way. I will do this in next version.

Thanks
Zhenhua


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-12 Thread Li, ZhenHua

On 01/12/2015 05:07 PM, Baoquan He wrote:

On 01/12/15 at 04:00pm, Li, ZhenHua wrote:

Comparing to v7, this version adds only a few lines code:

In function copy_page_table,

+   __iommu_flush_cache(iommu, phys_to_virt(dma_pte_next),
+   VTD_PAGE_SIZE);


So this adding fixs the reported dmar fault on Takao's system, right?

I am not sure whether it can fix the dmar fault on Takao's system, but
I hope it can fix.






On 01/12/2015 03:06 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53
 https://lkml.org/lkml/2015/1/6/1166

Changelog[v8]:
 1. Add a missing __iommu_flush_cache in function copy_page_table.

Changelog[v7]:
 1. Use __iommu_flush_cache to flush the data to hardware.

Changelog[v6]:
 1. Use "unsigned long" as type of physical address.
 2. Use new function unmap_device_dma to unmap the old dma.
 3. Some small incorrect bits order for aw shift.

Changelog[v5]:
 1. Do not disable and re-enable traslation and inter

Re: [PATCH v8 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-12 Thread Li, ZhenHua

Comparing to v7, this version adds only a few lines code:

In function copy_page_table,

+   __iommu_flush_cache(iommu, phys_to_virt(dma_pte_next),
+   VTD_PAGE_SIZE);


On 01/12/2015 03:06 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53
 https://lkml.org/lkml/2015/1/6/1166

Changelog[v8]:
 1. Add a missing __iommu_flush_cache in function copy_page_table.

Changelog[v7]:
 1. Use __iommu_flush_cache to flush the data to hardware.

Changelog[v6]:
 1. Use "unsigned long" as type of physical address.
 2. Use new function unmap_device_dma to unmap the old dma.
 3. Some small incorrect bits order for aw shift.

Changelog[v5]:
 1. Do not disable and re-enable traslation and interrupt remapping.
 2. Use old root entry table.
 3. Use old interrupt remapping table.
 4. New functions to copy data from old kernel, and save to old kernel mem.
 5. New functions to save updated root entry table and irte table.
 6. Use intel_unmap to unmap the old dma;
 7. Allocate ne

Re: [PATCH v7 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-07 Thread Li, ZhenHua

Well, that's quite good news.
Looking forward Takao's testing on his system.

Regards
Zhenhua
On 01/07/2015 04:28 PM, Baoquan He wrote:

On 01/07/15 at 01:25pm, Li, ZhenHua wrote:

It is same as the last one I send to you yesterday.

The continuous memory that needed for data in this patchset:
RE: PAGE_SIZE, 4096 Bytes;
IRTE: 65536 * 16 ; 1M Bytes;

It should use same memory as the old versions of this patchset. The
changes for the last version do not need more memory.


Hi Zhenhua,

It was my mistake because I didn't strip the debug info of modules, then
initramfs is bloated very big. Just now I tested the latest version, it
works well and dump is successful. No dmar fault and intr-remap fault
seen any more, good job!

Thanks
Baoquan




Regards
Zhenhua

On 01/07/2015 01:02 PM, Baoquan He wrote:

On 01/07/15 at 12:11pm, Li, ZhenHua wrote:

Many thanks to Takao Indoh and Baoquan He, for your testing on more
different systems.

The calling of flush functions are added to this version.

The usage of __iommu_flush_cache function :
1. Fixes a dump on Takao's system.
2. Reduces the count of faults on Baoquan's system.


I am testing the version you sent to me yesterday afternoon. Is that
different with this patchset? I found your patchset man reserve a big
contiguous memory region under 896M, this will cause the crashkernel
reservation failed when I set crashkernel=320M. The reason I increase
the crashkerenl reservation to 320M is 256M is not enough and cause OOM
when that patchset is tested.

I am checking what happened.


Thanks
Baoquan



Regards
Zhenhua

On 01/07/2015 12:04 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to 

Re: [PATCH v7 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-06 Thread Li, ZhenHua

It is same as the last one I send to you yesterday.

The continuous memory that needed for data in this patchset:
RE: PAGE_SIZE, 4096 Bytes;
IRTE: 65536 * 16 ; 1M Bytes;

It should use same memory as the old versions of this patchset. The 
changes for the last version do not need more memory.


Regards
Zhenhua

On 01/07/2015 01:02 PM, Baoquan He wrote:

On 01/07/15 at 12:11pm, Li, ZhenHua wrote:

Many thanks to Takao Indoh and Baoquan He, for your testing on more
different systems.

The calling of flush functions are added to this version.

The usage of __iommu_flush_cache function :
1. Fixes a dump on Takao's system.
2. Reduces the count of faults on Baoquan's system.


I am testing the version you sent to me yesterday afternoon. Is that
different with this patchset? I found your patchset man reserve a big
contiguous memory region under 896M, this will cause the crashkernel
reservation failed when I set crashkernel=320M. The reason I increase
the crashkerenl reservation to 320M is 256M is not enough and cause OOM
when that patchset is tested.

I am checking what happened.


Thanks
Baoquan



Regards
Zhenhua

On 01/07/2015 12:04 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 ht

Re: [PATCH v7 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-06 Thread Li, ZhenHua

Many thanks to Takao Indoh and Baoquan He, for your testing on more
different systems.

The calling of flush functions are added to this version.

The usage of __iommu_flush_cache function :
1. Fixes a dump on Takao's system.
2. Reduces the count of faults on Baoquan's system.

Regards
Zhenhua

On 01/07/2015 12:04 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's updates:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121
 https://lkml.org/lkml/2014/12/22/53

Changelog[v7]:
 1. Use __iommu_flush_cache to flush the data to hardware.

Changelog[v6]:
 1. Use "unsigned long" as type of physical address.
 2. Use new function unmap_device_dma to unmap the old dma.
 3. Some small incorrect bits order for aw shift.

Changelog[v5]:
 1. Do not disable and re-enable traslation and interrupt remapping.
 2. Use old root entry table.
 3. Use old interrupt remapping table.
 4. New functions to copy data from old kernel, and save to old kernel mem.
 5. New functions to save updated root entry table and irte table.
 6. Use intel_unmap to unmap the old dma;
 7. Allocate new pages while driver is being loaded.

Cha

Re: [PATCH 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-05 Thread Li, ZhenHua
According to Takao Indoh's testing, seems adding flush after loading old 
irte and updating old irte does not fix the dmar faults.


According to Takao Indoh's log and your log, the faults happens while 
and after driver is loaded.  Maybe I am using incorrect code in 08/10.



On 01/06/2015 02:37 PM, Baoquan He wrote:

Hi Zhenhua,

I just tested your patchset based on 3.19.0-rc2+, and found several dmar
fault and intr-remap fault, it seems not the same as Takao's, please
check the attachment.

Thanks
Baoquan



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-01-05 Thread Li, ZhenHua
Thank you very much for your help.

I have found there are several places need flush, and I will send a new
version
of this patchset with the flush functions.

Regards
Zhenhua

On 01/06/2015 08:18 AM, Takao Indoh wrote:
> On 2014/12/29 12:15, Li, ZhenHua wrote:
>> Hi Takao Indoh,
>>
>> Happy New Year, and thank you very much for you help.  The flush is quite
> Happy new year!
>
>>   a problem,  as there are several places the flush function should be 
>> called,
>> I think the flush should be placed in functions like __iommu_update_old_*.
>> Created a small patch for this, it is attached.
>>
>>
>>
>> As I cannot reproduce your problems on my system, so could you please try
>> these steps?
>> 1. Apply the latest patchset, including 9/10 and 10/10, and then apply the
>> attached patch_for_flush.patch.  And then test the kernel.
> No inter-remap fault, but there is still DMAR fault message.
>
>> 2.  If 1 does not fix the DMAR fault  problems, then it might be caused by
>> 7/10, so please *unpatch* it from the kernel (others and the  attached one
>> should be patched), and then test the kernel.
> DMAR fault still occurs. I'll dig iommu driver code to find out the
> reason.
>
> Thanks,
> Takao Indoh
>
>> Regards
>> Zhenhua
>>
>> On 12/26/2014 03:27 PM, Takao Indoh wrote:
>>> On 2014/12/26 15:46, Li, ZhenHua wrote:
>>>> Hi Takao Indoh,
>>>>
>>>> Thank you very much for your testing. I will add your update in next
>>>> version.
>>>> Also I think a flush for __iommu_update_old_root_entry is also necessary.
>>>>
>>>> Currently I have no idea about your fault, does it happen before or
>>>> during its loading? Could you send me your full kernel log as an
>>>> attachment?
>>> Sure, see attached file.
>>>
>>> I removed 9/10 and 10/10 patches from my kernel to avoid panic problem I
>>> reported in previous mail, and then tested kdump. So please ignore
>>> intr-remap fault message in log file. Also please ignore stack trace
>>> starting with the following message, it's a problem of my box.
>>>
>>>Flags mismatch irq 0. 0080 (i801_smbus) vs. 00015a00 (timer)
>>>
>>> Thanks,
>>> Takao Indoh
>>>
>>>> Regards and Merry Christmas.
>>>> Zhenhua
>>>>
>>>> On 12/26/2014 01:13 PM, Takao Indoh wrote:
>>>>> Hi Zhen-Hua,
>>>>>
>>>>> I tested your patch and found two problems.
>>>>>
>>>>> [1]
>>>>> Kenel panic occurs during 2nd kernel boot.
>>>>>
>>>>> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>>> Kernel panic - not syncing: timer doesn't work through Interrupt-remapped 
>>>>> IO-APIC
>>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0 #25
>>>>> Hardware name: FUJITSU-SV PRIMERGY BX920 S2/D3030, BIOS 080015 
>>>>> Rev.3D81.3030 02/10/2012
>>>>>0002 880036167d08 815b1c6a 
>>>>>817f7670 880036167d88 815b19f1 0008
>>>>>880036167d98 880036167d38 810a5d2f 880036167d98
>>>>> Call Trace:
>>>>>[] dump_stack+0x48/0x5e
>>>>>[] panic+0xbb/0x1fa
>>>>>[] ? vprintk_default+0x1f/0x30
>>>>>[] panic_if_irq_remap+0x1c/0x20
>>>>>[] check_timer+0x1e7/0x5ed
>>>>>[] ? radix_tree_lookup+0xd/0x10
>>>>>[] setup_IO_APIC+0x261/0x292
>>>>>[] native_smp_prepare_cpus+0x214/0x25d
>>>>>[] kernel_init_freeable+0x1dc/0x28c
>>>>>[] ? rest_init+0x80/0x80
>>>>>[] kernel_init+0xe/0xf0
>>>>>[] ret_from_fork+0x7c/0xb0
>>>>>[] ? rest_init+0x80/0x80
>>>>> ---[ end Kernel panic - not syncing: timer doesn't work through 
>>>>> Interrupt-remapped IO-APIC
>>>>>
>>>>>
>>>>> This panic seems to be related to unflushed cache. I confirmed this
>>>>> problem was fixed by the following patch.
>>>>>
>>>>> --- a/drivers/iommu/intel_irq_remapping.c
>>>>> +++ b/drivers/iommu/intel_irq_remapping.c
>>>>> @@ -200,8 +200,13 @@ static int modify_irte(int irq, struct irte 
>>>>> *irte_modified)
>>>>>   set_64bit(&irte->high, irte_modified-&

Re: [PATCH 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2014-12-28 Thread Li, ZhenHua
Hi Takao Indoh,

Happy New Year, and thank you very much for you help.  The flush is quite
 a problem,  as there are several places the flush function should be called, 
I think the flush should be placed in functions like __iommu_update_old_*.  
Created a small patch for this, it is attached.



As I cannot reproduce your problems on my system, so could you please try 
these steps?
1. Apply the latest patchset, including 9/10 and 10/10, and then apply the 
attached patch_for_flush.patch.  And then test the kernel.

2.  If 1 does not fix the DMAR fault  problems, then it might be caused by 
7/10, so please *unpatch* it from the kernel (others and the  attached one
should be patched), and then test the kernel.

Regards
Zhenhua

On 12/26/2014 03:27 PM, Takao Indoh wrote:
> On 2014/12/26 15:46, Li, ZhenHua wrote:
>> Hi Takao Indoh,
>>
>> Thank you very much for your testing. I will add your update in next
>> version.
>> Also I think a flush for __iommu_update_old_root_entry is also necessary.
>>
>> Currently I have no idea about your fault, does it happen before or
>> during its loading? Could you send me your full kernel log as an
>> attachment?
> Sure, see attached file.
>
> I removed 9/10 and 10/10 patches from my kernel to avoid panic problem I
> reported in previous mail, and then tested kdump. So please ignore
> intr-remap fault message in log file. Also please ignore stack trace
> starting with the following message, it's a problem of my box.
>
>   Flags mismatch irq 0. 0080 (i801_smbus) vs. 00015a00 (timer)
>
> Thanks,
> Takao Indoh
>
>> Regards and Merry Christmas.
>> Zhenhua
>>
>> On 12/26/2014 01:13 PM, Takao Indoh wrote:
>>> Hi Zhen-Hua,
>>>
>>> I tested your patch and found two problems.
>>>
>>> [1]
>>> Kenel panic occurs during 2nd kernel boot.
>>>
>>> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>> Kernel panic - not syncing: timer doesn't work through Interrupt-remapped 
>>> IO-APIC
>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0 #25
>>> Hardware name: FUJITSU-SV PRIMERGY BX920 S2/D3030, BIOS 080015 
>>> Rev.3D81.3030 02/10/2012
>>>   0002 880036167d08 815b1c6a 
>>>   817f7670 880036167d88 815b19f1 0008
>>>   880036167d98 880036167d38 810a5d2f 880036167d98
>>> Call Trace:
>>>   [] dump_stack+0x48/0x5e
>>>   [] panic+0xbb/0x1fa
>>>   [] ? vprintk_default+0x1f/0x30
>>>   [] panic_if_irq_remap+0x1c/0x20
>>>   [] check_timer+0x1e7/0x5ed
>>>   [] ? radix_tree_lookup+0xd/0x10
>>>   [] setup_IO_APIC+0x261/0x292
>>>   [] native_smp_prepare_cpus+0x214/0x25d
>>>   [] kernel_init_freeable+0x1dc/0x28c
>>>   [] ? rest_init+0x80/0x80
>>>   [] kernel_init+0xe/0xf0
>>>   [] ret_from_fork+0x7c/0xb0
>>>   [] ? rest_init+0x80/0x80
>>> ---[ end Kernel panic - not syncing: timer doesn't work through 
>>> Interrupt-remapped IO-APIC
>>>
>>>
>>> This panic seems to be related to unflushed cache. I confirmed this
>>> problem was fixed by the following patch.
>>>
>>> --- a/drivers/iommu/intel_irq_remapping.c
>>> +++ b/drivers/iommu/intel_irq_remapping.c
>>> @@ -200,8 +200,13 @@ static int modify_irte(int irq, struct irte 
>>> *irte_modified)
>>> set_64bit(&irte->high, irte_modified->high);
>>>   
>>>   #ifdef CONFIG_CRASH_DUMP
>>> -   if (is_kdump_kernel())
>>> +   if (is_kdump_kernel()) {
>>> __iommu_update_old_irte(iommu, index);
>>> +   __iommu_flush_cache(iommu,
>>> +   iommu->ir_table->base_old_virt +
>>> +   index * sizeof(struct irte),
>>> +   sizeof(struct irte));
>>> +   }
>>>   #endif
>>> __iommu_flush_cache(iommu, irte, sizeof(*irte));
>>>   
>>>
>>> [2]
>>> Some DMAR error messages are still found in 2nd kernel boot.
>>>
>>> dmar: DRHD: handling fault status reg 2
>>> dmar: DMAR:[DMA Write] Request device [01:00.0] fault addr ffded000
>>> DMAR:[fault reason 01] Present bit in root entry is clear
>>>
>>> I confiremd your commit 1a2262 was already applied. Any idea?
>>>
>>> Thanks,
>>> Takao Indoh
>>>
>>>
>>> On 2014/12/22 18:15, Li, Zhen-Hua wrote:
>>>> This patchset is an update of Bill Sumner's patchset, implements a fix for:
>>>> 

Re: [PATCH 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2014-12-25 Thread Li, ZhenHua
Hi Takao Indoh,

Thank you very much for your testing. I will add your update in next
version.
Also I think a flush for __iommu_update_old_root_entry is also necessary.

Currently I have no idea about your fault, does it happen before or
during its loading? Could you send me your full kernel log as an
attachment?

Regards and Merry Christmas.
Zhenhua

On 12/26/2014 01:13 PM, Takao Indoh wrote:
> Hi Zhen-Hua,
>
> I tested your patch and found two problems.
>
> [1]
> Kenel panic occurs during 2nd kernel boot.
>
> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> Kernel panic - not syncing: timer doesn't work through Interrupt-remapped 
> IO-APIC
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0 #25
> Hardware name: FUJITSU-SV PRIMERGY BX920 S2/D3030, BIOS 080015 Rev.3D81.3030 
> 02/10/2012
>  0002 880036167d08 815b1c6a 
>  817f7670 880036167d88 815b19f1 0008
>  880036167d98 880036167d38 810a5d2f 880036167d98
> Call Trace:
>  [] dump_stack+0x48/0x5e
>  [] panic+0xbb/0x1fa
>  [] ? vprintk_default+0x1f/0x30
>  [] panic_if_irq_remap+0x1c/0x20
>  [] check_timer+0x1e7/0x5ed
>  [] ? radix_tree_lookup+0xd/0x10
>  [] setup_IO_APIC+0x261/0x292
>  [] native_smp_prepare_cpus+0x214/0x25d
>  [] kernel_init_freeable+0x1dc/0x28c
>  [] ? rest_init+0x80/0x80
>  [] kernel_init+0xe/0xf0
>  [] ret_from_fork+0x7c/0xb0
>  [] ? rest_init+0x80/0x80
> ---[ end Kernel panic - not syncing: timer doesn't work through 
> Interrupt-remapped IO-APIC
>
>
> This panic seems to be related to unflushed cache. I confirmed this
> problem was fixed by the following patch.
>
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -200,8 +200,13 @@ static int modify_irte(int irq, struct irte 
> *irte_modified)
>   set_64bit(&irte->high, irte_modified->high);
>  
>  #ifdef CONFIG_CRASH_DUMP
> - if (is_kdump_kernel())
> + if (is_kdump_kernel()) {
>   __iommu_update_old_irte(iommu, index);
> + __iommu_flush_cache(iommu,
> + iommu->ir_table->base_old_virt +
> + index * sizeof(struct irte),
> + sizeof(struct irte));
> + }
>  #endif
>   __iommu_flush_cache(iommu, irte, sizeof(*irte));
>  
>
> [2]
> Some DMAR error messages are still found in 2nd kernel boot.
>
> dmar: DRHD: handling fault status reg 2
> dmar: DMAR:[DMA Write] Request device [01:00.0] fault addr ffded000
> DMAR:[fault reason 01] Present bit in root entry is clear
>
> I confiremd your commit 1a2262 was already applied. Any idea?
>
> Thanks,
> Takao Indoh
>
>
> On 2014/12/22 18:15, Li, Zhen-Hua wrote:
>> This patchset is an update of Bill Sumner's patchset, implements a fix for:
>> If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
>> when a panic happens, the kdump kernel will boot with these faults:
>>
>>  dmar: DRHD: handling fault status reg 102
>>  dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
>>  DMAR:[fault reason 01] Present bit in root entry is clear
>>
>>  dmar: DRHD: handling fault status reg 2
>>  dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
>>  INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
>>
>> On some system, the interrupt remapping fault will also happen even if the
>> intel_iommu is not set to on, because the interrupt remapping will be enabled
>> when x2apic is needed by the system.
>>
>> The cause of the DMA fault is described in Bill's original version, and the
>> INTR-Remap fault is caused by a similar reason. In short, the initialization
>> of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
>> response.
>>
>> To fix this problem, we modifies the behaviors of the intel vt-d in the
>> crashdump kernel:
>>
>> For DMA Remapping:
>> 1. To accept the vt-d hardware in an active state,
>> 2. Do not disable and re-enable the translation, keep it enabled.
>> 3. Use the old root entry table, do not rewrite the RTA register.
>> 4. Malloc and use new context entry table and page table, copy data from the
>> old ones that used by the old kernel.
>> 5. to use different portions of the iova address ranges for the device 
>> drivers
>> in the crashdump kernel than the iova ranges that were in-use at the time
>> of the panic.
>> 6. After device driver is loaded, when it issues the first dma_map command,
>> free the dmar_domain structure for this device, and generate a new one, 
>> so
>> that the device can be assigned a new and empty page table.
>> 7. When a new context entry table is generated, we also save its address to
>> the old root entry table.
>>
>> For Interrupt Remapping:
>> 1. To accept the vt-d hardware in an active state,
>> 2. Do not disable and re-enable the interrupt remapping, keep it enabled.
>> 3. Use the old interrupt remapping table, do not rewrite the IRTA register.
>> 4. When i

Re: [PATCH 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2014-12-22 Thread Li, ZhenHua

This version works for 3.19.0-rc1.

Zhenhua
On 12/22/2014 05:15 PM, Li, Zhen-Hua wrote:

This patchset is an update of Bill Sumner's patchset, implements a fix for:
If a kernel boots with intel_iommu=on on a system that supports intel vt-d,
when a panic happens, the kdump kernel will boot with these faults:

 dmar: DRHD: handling fault status reg 102
 dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
 DMAR:[fault reason 01] Present bit in root entry is clear

 dmar: DRHD: handling fault status reg 2
 dmar: INTR-REMAP: Request device [[61:00.0] fault index 42
 INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

On some system, the interrupt remapping fault will also happen even if the
intel_iommu is not set to on, because the interrupt remapping will be enabled
when x2apic is needed by the system.

The cause of the DMA fault is described in Bill's original version, and the
INTR-Remap fault is caused by a similar reason. In short, the initialization
of vt-d drivers causes the in-flight DMA and interrupt requests get wrong
response.

To fix this problem, we modifies the behaviors of the intel vt-d in the
crashdump kernel:

For DMA Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the translation, keep it enabled.
3. Use the old root entry table, do not rewrite the RTA register.
4. Malloc and use new context entry table and page table, copy data from the
old ones that used by the old kernel.
5. to use different portions of the iova address ranges for the device drivers
in the crashdump kernel than the iova ranges that were in-use at the time
of the panic.
6. After device driver is loaded, when it issues the first dma_map command,
free the dmar_domain structure for this device, and generate a new one, so
that the device can be assigned a new and empty page table.
7. When a new context entry table is generated, we also save its address to
the old root entry table.

For Interrupt Remapping:
1. To accept the vt-d hardware in an active state,
2. Do not disable and re-enable the interrupt remapping, keep it enabled.
3. Use the old interrupt remapping table, do not rewrite the IRTA register.
4. When ioapic entry is setup, the interrupt remapping table is changed, and
the updated data will be stored to the old interrupt remapping table.

Advantages of this approach:
1. All manipulation of the IO-device is done by the Linux device-driver
for that device.
2. This approach behaves in a manner very similar to operation without an
active iommu.
3. Any activity between the IO-device and its RMRR areas is handled by the
device-driver in the same manner as during a non-kdump boot.
4. If an IO-device has no driver in the kdump kernel, it is simply left alone.
This supports the practice of creating a special kdump kernel without
drivers for any devices that are not required for taking a crashdump.
5. Minimal code-changes among the existing mainline intel vt-d code.

Summary of changes in this patch set:
1. Added some useful function for root entry table in code intel-iommu.c
2. Added new members to struct root_entry and struct irte;
3. Functions to load old root entry table to iommu->root_entry from the memory
of old kernel.
4. Functions to malloc new context entry table and page table and copy the data
from the old ones to the malloced new ones.
5. Functions to enable support for DMA remapping in kdump kernel.
6. Functions to load old irte data from the old kernel to the kdump kernel.
7. Some code changes that support other behaviours that have been listed.
8. In the new functions, use physical address as "unsigned long" type, not
pointers.

Original version by Bill Sumner:
 https://lkml.org/lkml/2014/1/10/518
 https://lkml.org/lkml/2014/4/15/716
 https://lkml.org/lkml/2014/4/24/836

Zhenhua's last of Bill's patchset:
 https://lkml.org/lkml/2014/10/21/134
 https://lkml.org/lkml/2014/12/15/121

Changed in this version:
1. Do not disable and re-enable traslation and interrupt remapping.
2. Use old root entry table.
3. Use old interrupt remapping table.
4. Use "unsigned long" as physical address.
5. Use intel_unmap to unmap the old dma;

This patchset should be applied with this one together:
 https://lkml.org/lkml/2014/11/5/43
 x86/iommu: fix incorrect bit operations in setting values

Bill Sumner (5):
   iommu/vt-d: Update iommu_attach_domain() and its callers
   iommu/vt-d: Items required for kdump
   iommu/vt-d: data types and functions used for kdump
   iommu/vt-d: Add domain-id functions
   iommu/vt-d: enable kdump support in iommu module

Li, Zhen-Hua (10):
   iommu/vt-d: Update iommu_attach_domain() and its callers
   iommu/vt-d: Items required for kdump
   iommu/vt-d: Add domain-id functions
   iommu/vt-d: functions to copy data from old mem
   iommu/vt-d: Add functions to load and save old re
   iommu/vt-d: datatypes and functions

Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-12-11 Thread Li, ZhenHua

Sorry I have no plan yet.
Could you send me your logs on your AMD system?

Thanks
Zhenhua
On 12/10/2014 04:46 PM, Baoquan He wrote:

Hi Joerg, ZhenHua,

This issue happens on AMD iommu too, do you have any plans or
thoughts on that?

Thanks
Baoquan

On 11/17/14 at 02:38pm, Joerg Roedel wrote:

On Fri, Nov 14, 2014 at 02:27:44PM +0800, Li, ZhenHua wrote:

I am working following  your directions:

1.  If the VT-d driver finds the IOMMU enabled, it reuses its root entry
table, and do NOT disable-enable iommu. Other data will be copied.

2. When a device driver issues the first dma_map command for a
device, we assign a new and empty page-table, thus removing all
mappings from the old kernel for the device.

Please let me know if I get something wrong.


Yes, this sounds right. Happily waiting for patches :)


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-11-30 Thread Li, ZhenHua

Joerg,

After I implement these two steps, there comes a new fault:

[1.594890] dmar: DRHD: handling fault status reg 2
[1.594894] dmar: INTR-REMAP: Request device [[41:00.0] fault index 4d
[1.594894] INTR-REMAP:[fault reason 34] Present field in the IRTE entry
is clear

It is caused by similar reason, so I will fix it like fixing the DMAR
faults: Do NOT disable and re-enable the interrupt remapping, try to
use data from old kernel.


Thanks
Zhenhua

On 11/17/2014 09:38 PM, Joerg Roedel wrote:

On Fri, Nov 14, 2014 at 02:27:44PM +0800, Li, ZhenHua wrote:

I am working following  your directions:

1.  If the VT-d driver finds the IOMMU enabled, it reuses its root entry
table, and do NOT disable-enable iommu. Other data will be copied.

2. When a device driver issues the first dma_map command for a
device, we assign a new and empty page-table, thus removing all
mappings from the old kernel for the device.

Please let me know if I get something wrong.


Yes, this sounds right. Happily waiting for patches :)


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] x86/iommu: fix incorrect bit operations in setting values

2014-11-14 Thread Li, ZhenHua



1st step shows we should NOT disable the iommu when it is already
enabled. But current code does disable-enable. So there is still works
to do.



The original kernel does a disable and re-enable , Bill's patchset 
removed the disable operation.



I think step 2 is necessary, because when the driver initializes, the
device need a new map, and the old data from the old kernel can not be
used for the new driver.

Now I am trying to implement these ideas.


Thanks

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-11-13 Thread Li, ZhenHua
Hi Takao Indoh,
Your update for the patchset works fine. Thanks.

 Joerg,
I am working following  your directions:

1.  If the VT-d driver finds the IOMMU enabled, it reuses its root entry
table, and do NOT disable-enable iommu. Other data will be copied.

2. When a device driver issues the first dma_map command for a
device, we assign a new and empty page-table, thus removing all
mappings from the old kernel for the device.

Please let me know if I get something wrong.

Thanks
Zhenhua
On 11/06/2014 04:06 PM, Li, ZhenHua wrote:
> Thank you very much for your testing and fix. I will also test it on my
> system.
> I will let you know when I get a result.
> 
> Regards
> Zhenhua
> 
> On 11/06/2014 03:51 PM, Takao Indoh wrote:
>> (2014/11/06 11:11), Li, ZhenHua wrote:
>>> This patch does the same thing as you said in your mail.
>>> It should work, I have tested on my HP huge system.
>>
>> Yep, I also confirmed it worked.
>>
>> BTW, I found another problem. When I tested your patches with 3.17
>> kernel, iommu initialization failed with the following message.
>>
>> IOMMU intel_iommu_in_crashdump = true
>> IOMMU Skip disabling iommu hardware translations
>> IOMMU Copying translate tables from panicked kernel
>> IOMMU: Copy translate tables failed
>> IOMMU: dmar init failed
>>
>>
>> I found that oldcopy() from physical address 0x15000 was failed.
>> oldcopy() copies data using ioremap, and ioremap for the memory region
>> around 0x15000 does not work because it is already mapped to virtual
>> space.
>>
>> 
>> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>   unsigned long size, unsigned long prot_val, void *caller)
>> {
>> (snip)
>>   /*
>>* Don't allow anybody to remap normal RAM that we're using..
>>*/
>>   pfn  = phys_addr >> PAGE_SHIFT;
>>   last_pfn = last_addr >> PAGE_SHIFT;
>>   if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
>> __ioremap_check_ram) == 1)
>>   return NULL;
>>  
>>
>>
>> I'm not sure how we should handle this, but as far as I tested the
>> following fix works.
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 3a9e7b8..8d2bd23 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -4871,7 +4871,12 @@ static int oldcopy(void *to, void *from, int size)
>>
>>   pfn = ((unsigned long) from) >> VTD_PAGE_SHIFT;
>>   offset = ((unsigned long) from) & (~VTD_PAGE_MASK);
>> -   ret = copy_oldmem_page(pfn, buf, csize, offset, userbuf);
>> +
>> +   if (page_is_ram(pfn)) {
>> +   memcpy(buf, pfn_to_kaddr(pfn) + offset, csize);
>> +   ret = size;
>> +   } else
>> +   ret = copy_oldmem_page(pfn, buf, csize, offset, userbuf);
>>
>>   return (int) ret;
>>}
>>
>>
>> Thanks,
>> Takao Indoh
>>
>>
>>>
>>> On 11/06/2014 09:48 AM, Takao Indoh wrote:
>>>> (2014/11/06 10:35), Li, ZhenHua wrote:
>>>>> Yes, that's it. The function context_set_address_root does not set the
>>>>> address root correctly.
>>>>>
>>>>> I have created another patch for it, see
>>>>>   https://lkml.org/lkml/2014/11/5/43
>>>>
>>>> Oh, ok. I'll try again with this patch, thank you.
>>>>
>>>> Thanks,
>>>> Takao Indoh
>>>>
>>>>>
>>>>> Thanks
>>>>> Zhenhua
>>>>>
>>>>> On 11/06/2014 09:31 AM, Takao Indoh wrote:
>>>>>> Hi Zhenhua, Baoquan,
>>>>>>
>>>>>> (2014/10/22 19:05), Baoquan He wrote:
>>>>>>> Hi Zhenhua,
>>>>>>>
>>>>>>> I tested your latest patch on 3.18.0-rc1+, there are still some dmar
>>>>>>> errors. I remember it worked well with Bill's original patchset.
>>>>>>
>>>>>> This should be a problem in copy_context_entry().
>>>>>>
>>>>>>> +static int copy_context_entry(struct intel_iommu *iommu, u32 bus, u32 
>>>>>>> devfn,
>>>>>>> + void *ppap, struct context_entry *ce)
>>>>>>> +{
>>>>>>> +   int ret = 0; 

Re: [PATCH 1/1] x86/iommu: fix incorrect bit operations in setting values

2014-11-13 Thread Li, ZhenHua

On 11/13/2014 09:37 PM, Baoquan He wrote:

On 11/13/14 at 05:06pm, Li, ZhenHua wrote:

Minfei,
Thanks for your testing.
On my system, I got error messages:

[8.019096] usb usb2: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[8.019617] dmar: DRHD: handling fault status reg 102
[8.019621] dmar: DMAR:[DMA Read] Request device [01:00.0] fault
addr fff6a000
[8.019621] DMAR:[fault reason 06] PTE Read access is not set
[8.019627] dmar: DRHD: handling fault status reg 202
[8.019631] dmar: DMAR:[DMA Read] Request device [21:00.0] fault
addr fff6a000
[8.019631] DMAR:[fault reason 06] PTE Read access is not set
[8.019638] dmar: DRHD: handling fault status reg 202
[8.019641] dmar: DMAR:[DMA Read] Request device [41:00.0] fault
addr fff6a000
[8.019641] DMAR:[fault reason 06] PTE Read access is not set
[8.019647] dmar: DRHD: handling fault status reg 202
[8.019651] dmar: DMAR:[DMA Read] Request device [61:00.0] fault
addr fff6a000
[8.019651] DMAR:[fault reason 06] PTE Read access is not set

And this patch can fix this.


The reason you do not get error messages may be there is no ongoing DMA
request on your PCI devices when kdump kernel boots, I am not sure of
this.


I think Minfei means he applied this patch, so no error message is got.
The patches he listed includes this one:

0006-x86-iommu-fix-incorrect-bit-operations-in-setting-va.patch

Oh, thank you very much for your correction.


Hi Zhenhua,

Below is abstracted from Joerg's comments which he and David discussed
and get to this conclusion. So the 1st step is the same as your patches,
how do you think the 2nd step?

1. If the VT-d driver finds the IOMMU enabled, it reuses its
root-context table. This way the IOMMU must not be disabled
and re-enabled, eliminating the race we have now.
Other data structures like the context-entries are copied
over from the old kernel.  We basically keep all mappings
from the old kernel, allowing any in-flight DMA to succeed.
No memory or disk data corruption.
The issue here is, that we modify data from the old kernel
which is about to be dumped. But there are ways to work
around that.

 2. When a device driver issues the first dma_map command for a
device, we assign a new and empty page-table, thus removing
all mappings from the old kernel for the device.
Rationale is, that at this point the device driver should
have reset the device to a point where all in-flight DMA is
canceled.

1st step shows we should NOT disable the iommu when it is already 
enabled. But current code does disable-enable. So there is still works 
to do.


I think step 2 is necessary, because when the driver initializes, the 
device need a new map, and the old data from the old kernel can not be 
used for the new driver.


Now I am trying to implement these ideas.


Thanks
Baoquan



Zhenhua
On 11/12/2014 07:28 PM, Minfei Huang wrote:

The kdump starts 2nd kernel without any error message when I use
3.18.0-rc4 merged last 6 patchs. The following is the message which 2nd
kernel prints during booting.

Patchset:
0001-iommu-vt-d-Update-iommu_attach_domain-and-its-caller.patch
0002-iommu-vt-d-Items-required-for-kdump.patch
0003-iommu-vt-d-data-types-and-functions-used-for-kdump.patch
0004-iommu-vt-d-Add-domain-id-functions.patch
0005-iommu-vt-d-enable-kdump-support-in-iommu-module.patch
0006-x86-iommu-fix-incorrect-bit-operations-in-setting-va.patch

log:
[1.689604] IOMMU intel_iommu_in_crashdump = true
[1.694310] IOMMU Skip disabling iommu hardware translations
[1.699987] DMAR: No ATSR found
[1.703151] IOMMU Copying translate tables from panicked kernel
[1.710786] IOMMU: root_new_virt:0x8800296ec000 phys:0x296ec000
[1.717401] IOMMU:0 Domain ids from panicked kernel:
[1.722364] DID did:13(0x000d)
[1.725424] DID did:12(0x000c)
[1.728482] DID did:11(0x000b)
[1.731542] DID did:10(0x000a)
[1.734603] DID did:6(0x0006)
[1.737574] DID did:7(0x0007)
[1.740549] DID did:5(0x0005)
[1.743522] DID did:9(0x0009)
[1.746495] DID did:8(0x0008)
[1.749467] DID did:4(0x0004)
[1.752439] DID did:3(0x0003)
[1.755413] DID did:2(0x0002)
[1.758385] DID did:0(0x)
[1.761357] DID did:1(0x0001)
[1.764331] 
[1.769302] IOMMU 0 0xfed5: using Queued invalidation

On 11/05/14 at 03:30pm, Li, Zhen-Hua wrote:

The function context_set_address_root() and set_root_value are setting new
address in a wrong way, and this patch is trying to fix this problem.

According to Intel Vt-d specs(Feb 2011, Revision 1.3), Chapter 9.1 and 9.2,
field ctp in root entry is using bits 12:63, field asr in context entry is
using bits 12:63.

To set these fields, the following func

Re: [PATCH 1/1] x86/iommu: fix incorrect bit operations in setting values

2014-11-13 Thread Li, ZhenHua

Minfei,
Thanks for your testing.
On my system, I got error messages:

[8.019096] usb usb2: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1

[8.019617] dmar: DRHD: handling fault status reg 102
[8.019621] dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr 
fff6a000

[8.019621] DMAR:[fault reason 06] PTE Read access is not set
[8.019627] dmar: DRHD: handling fault status reg 202
[8.019631] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr 
fff6a000

[8.019631] DMAR:[fault reason 06] PTE Read access is not set
[8.019638] dmar: DRHD: handling fault status reg 202
[8.019641] dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr 
fff6a000

[8.019641] DMAR:[fault reason 06] PTE Read access is not set
[8.019647] dmar: DRHD: handling fault status reg 202
[8.019651] dmar: DMAR:[DMA Read] Request device [61:00.0] fault addr 
fff6a000

[8.019651] DMAR:[fault reason 06] PTE Read access is not set

And this patch can fix this.


The reason you do not get error messages may be there is no ongoing DMA
request on your PCI devices when kdump kernel boots, I am not sure of
this.

Zhenhua
On 11/12/2014 07:28 PM, Minfei Huang wrote:

The kdump starts 2nd kernel without any error message when I use
3.18.0-rc4 merged last 6 patchs. The following is the message which 2nd
kernel prints during booting.

Patchset:
0001-iommu-vt-d-Update-iommu_attach_domain-and-its-caller.patch
0002-iommu-vt-d-Items-required-for-kdump.patch
0003-iommu-vt-d-data-types-and-functions-used-for-kdump.patch
0004-iommu-vt-d-Add-domain-id-functions.patch
0005-iommu-vt-d-enable-kdump-support-in-iommu-module.patch
0006-x86-iommu-fix-incorrect-bit-operations-in-setting-va.patch

log:
[1.689604] IOMMU intel_iommu_in_crashdump = true
[1.694310] IOMMU Skip disabling iommu hardware translations
[1.699987] DMAR: No ATSR found
[1.703151] IOMMU Copying translate tables from panicked kernel
[1.710786] IOMMU: root_new_virt:0x8800296ec000 phys:0x296ec000
[1.717401] IOMMU:0 Domain ids from panicked kernel:
[1.722364] DID did:13(0x000d)
[1.725424] DID did:12(0x000c)
[1.728482] DID did:11(0x000b)
[1.731542] DID did:10(0x000a)
[1.734603] DID did:6(0x0006)
[1.737574] DID did:7(0x0007)
[1.740549] DID did:5(0x0005)
[1.743522] DID did:9(0x0009)
[1.746495] DID did:8(0x0008)
[1.749467] DID did:4(0x0004)
[1.752439] DID did:3(0x0003)
[1.755413] DID did:2(0x0002)
[1.758385] DID did:0(0x)
[1.761357] DID did:1(0x0001)
[1.764331] 
[1.769302] IOMMU 0 0xfed5: using Queued invalidation

On 11/05/14 at 03:30pm, Li, Zhen-Hua wrote:

The function context_set_address_root() and set_root_value are setting new
address in a wrong way, and this patch is trying to fix this problem.

According to Intel Vt-d specs(Feb 2011, Revision 1.3), Chapter 9.1 and 9.2,
field ctp in root entry is using bits 12:63, field asr in context entry is
using bits 12:63.

To set these fields, the following functions are used:
static inline void context_set_address_root(struct context_entry *context,
 unsigned long value);
and
static inline void set_root_value(struct root_entry *root, unsigned long value)

But they are using an invalid method to set these fields, in current code, only
a '|' operator is used to set it. This will not set the asr to the expected
value if it has an old value.

For example:
Before calling this function,
context->lo = 0x3456789012111;
value = 0x123456789abcef12;

After we call context_set_address_root(context, value), expected result is
context->lo == 0x123456789abce111;

But the actual result is:
context->lo == 0x1237577f9bbde111;

So we need to clear bits 12:63 before setting the new value, this will fix
this problem.

Signed-off-by: Li, Zhen-Hua 
---
  drivers/iommu/intel-iommu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a27d6cb..11ac47b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -195,6 +195,7 @@ static inline void set_root_present(struct root_entry *root)
  }
  static inline void set_root_value(struct root_entry *root, unsigned long 
value)
  {
+   root->val &= ~VTD_PAGE_MASK;
root->val |= value & VTD_PAGE_MASK;
  }

@@ -247,6 +248,7 @@ static inline void context_set_translation_type(struct 
context_entry *context,
  static inline void context_set_address_root(struct context_entry *context,
unsigned long value)
  {
+   context->lo &= ~VTD_PAGE_MASK;
context->lo |= value & VTD_PAGE_MASK;
  }

--
2.0.0-rc0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please r

Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-11-06 Thread Li, ZhenHua
Thank you very much for your testing and fix. I will also test it on my
system.
I will let you know when I get a result.

Regards
Zhenhua

On 11/06/2014 03:51 PM, Takao Indoh wrote:
> (2014/11/06 11:11), Li, ZhenHua wrote:
>> This patch does the same thing as you said in your mail.
>> It should work, I have tested on my HP huge system.
> 
> Yep, I also confirmed it worked.
> 
> BTW, I found another problem. When I tested your patches with 3.17
> kernel, iommu initialization failed with the following message.
> 
> IOMMU intel_iommu_in_crashdump = true
> IOMMU Skip disabling iommu hardware translations
> IOMMU Copying translate tables from panicked kernel
> IOMMU: Copy translate tables failed
> IOMMU: dmar init failed
> 
> 
> I found that oldcopy() from physical address 0x15000 was failed.
> oldcopy() copies data using ioremap, and ioremap for the memory region
> around 0x15000 does not work because it is already mapped to virtual
> space.
> 
> 
> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>  unsigned long size, unsigned long prot_val, void *caller)
> {
> (snip)
>  /*
>   * Don't allow anybody to remap normal RAM that we're using..
>   */
>  pfn  = phys_addr >> PAGE_SHIFT;
>  last_pfn = last_addr >> PAGE_SHIFT;
>  if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
>__ioremap_check_ram) == 1)
>  return NULL;
> 
> 
> 
> I'm not sure how we should handle this, but as far as I tested the
> following fix works.
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3a9e7b8..8d2bd23 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4871,7 +4871,12 @@ static int oldcopy(void *to, void *from, int size)
> 
>  pfn = ((unsigned long) from) >> VTD_PAGE_SHIFT;
>  offset = ((unsigned long) from) & (~VTD_PAGE_MASK);
> -   ret = copy_oldmem_page(pfn, buf, csize, offset, userbuf);
> +
> +   if (page_is_ram(pfn)) {
> +   memcpy(buf, pfn_to_kaddr(pfn) + offset, csize);
> +   ret = size;
> +   } else
> +   ret = copy_oldmem_page(pfn, buf, csize, offset, userbuf);
> 
>  return (int) ret;
>   }
> 
> 
> Thanks,
> Takao Indoh
> 
> 
>>
>> On 11/06/2014 09:48 AM, Takao Indoh wrote:
>>> (2014/11/06 10:35), Li, ZhenHua wrote:
>>>> Yes, that's it. The function context_set_address_root does not set the
>>>> address root correctly.
>>>>
>>>> I have created another patch for it, see
>>>>https://lkml.org/lkml/2014/11/5/43
>>>
>>> Oh, ok. I'll try again with this patch, thank you.
>>>
>>> Thanks,
>>> Takao Indoh
>>>
>>>>
>>>> Thanks
>>>> Zhenhua
>>>>
>>>> On 11/06/2014 09:31 AM, Takao Indoh wrote:
>>>>> Hi Zhenhua, Baoquan,
>>>>>
>>>>> (2014/10/22 19:05), Baoquan He wrote:
>>>>>> Hi Zhenhua,
>>>>>>
>>>>>> I tested your latest patch on 3.18.0-rc1+, there are still some dmar
>>>>>> errors. I remember it worked well with Bill's original patchset.
>>>>>
>>>>> This should be a problem in copy_context_entry().
>>>>>
>>>>>> +static int copy_context_entry(struct intel_iommu *iommu, u32 bus, u32 
>>>>>> devfn,
>>>>>> +  void *ppap, struct context_entry *ce)
>>>>>> +{
>>>>>> +int ret = 0;/* Integer Return Code */
>>>>>> +u32 shift = 0;  /* bits to shift page_addr  */
>>>>>> +u64 page_addr = 0;  /* Address of translated page */
>>>>>> +struct dma_pte *pgt_old_phys;   /* Adr(page_table in the old 
>>>>>> kernel) */
>>>>>> +struct dma_pte *pgt_new_phys;   /* Adr(page_table in the new 
>>>>>> kernel) */
>>>>>> +u8  t;  /* Translation-type from 
>>>>>> context */
>>>>>> +u8  aw; /* Address-width from context */
>>>>>> +u32 aw_shift[8] = {
>>>>>> +12+9+9, /* [000b] 30-bit AGAW (2-level page 
>>>>>> table) */
>>>>>> +12+9+9+9

Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-11-05 Thread Li, ZhenHua
This patch does the same thing as you said in your mail.
It should work, I have tested on my HP huge system.

On 11/06/2014 09:48 AM, Takao Indoh wrote:
> (2014/11/06 10:35), Li, ZhenHua wrote:
>> Yes, that's it. The function context_set_address_root does not set the
>> address root correctly.
>>
>> I have created another patch for it, see
>>  https://lkml.org/lkml/2014/11/5/43
> 
> Oh, ok. I'll try again with this patch, thank you.
> 
> Thanks,
> Takao Indoh
> 
>>
>> Thanks
>> Zhenhua
>>
>> On 11/06/2014 09:31 AM, Takao Indoh wrote:
>>> Hi Zhenhua, Baoquan,
>>>
>>> (2014/10/22 19:05), Baoquan He wrote:
>>>> Hi Zhenhua,
>>>>
>>>> I tested your latest patch on 3.18.0-rc1+, there are still some dmar
>>>> errors. I remember it worked well with Bill's original patchset.
>>>
>>> This should be a problem in copy_context_entry().
>>>
>>>> +static int copy_context_entry(struct intel_iommu *iommu, u32 bus, u32 
>>>> devfn,
>>>> +void *ppap, struct context_entry *ce)
>>>> +{
>>>> +  int ret = 0;/* Integer Return Code */
>>>> +  u32 shift = 0;  /* bits to shift page_addr  */
>>>> +  u64 page_addr = 0;  /* Address of translated page */
>>>> +  struct dma_pte *pgt_old_phys;   /* Adr(page_table in the old kernel) */
>>>> +  struct dma_pte *pgt_new_phys;   /* Adr(page_table in the new kernel) */
>>>> +  u8  t;  /* Translation-type from context */
>>>> +  u8  aw; /* Address-width from context */
>>>> +  u32 aw_shift[8] = {
>>>> +  12+9+9, /* [000b] 30-bit AGAW (2-level page table) */
>>>> +  12+9+9+9,   /* [001b] 39-bit AGAW (3-level page table) */
>>>> +  12+9+9+9+9, /* [010b] 48-bit AGAW (4-level page table) */
>>>> +  12+9+9+9+9+9,   /* [011b] 57-bit AGAW (5-level page table) */
>>>> +  12+9+9+9+9+9+9, /* [100b] 64-bit AGAW (6-level page table) */
>>>> +  0,  /* [111b] Reserved */
>>>> +  0,  /* [110b] Reserved */
>>>> +  0,  /* [111b] Reserved */
>>>> +  };
>>>> +
>>>> +  struct domain_values_entry *dve = NULL;
>>>> +
>>>> +
>>>> +  if (!context_present(ce)) { /* If (context not present) */
>>>> +  ret = RET_CCE_NOT_PRESENT;  /* Skip it */
>>>> +  goto exit;
>>>> +  }
>>>> +
>>>> +  t = context_translation_type(ce);
>>>> +
>>>> +  /* If we have seen this domain-id before on this iommu,
>>>> +   * give this context the same page-tables and we are done.
>>>> +   */
>>>> +  list_for_each_entry(dve, &domain_values_list[iommu->seq_id], link) {
>>>> +  if (dve->did == (int) context_domain_id(ce)) {
>>>> +  switch (t) {
>>>> +  case 0: /* page tables */
>>>> +  case 1: /* page tables */
>>>> +  context_set_address_root(ce,
>>>> +  virt_to_phys(dve->pgd));
>>>
>>> Here, in context_set_address_root(), the new address is set like this:
>>>
>>> context->lo |= value & VTD_PAGE_MASK;
>>>
>>> This is wrong, the logical disjunction of old address and new address
>>> becomes invalid address.
>>>
>>> This should be like this.
>>>
>>> case 1: /* page tables */
>>> ce->lo &= (~VTD_PAGE_MASK);
>>> context_set_address_root(ce,
>>> virt_to_phys(dve->pgd));
>>>
>>> And one more,
>>>
>>>> +  ret = RET_CCE_PREVIOUS_DID;
>>>> +  break;
>>>> +
>>>> +  case 2: /* Pass through */
>>>> +  if (dve->pgd == NULL)
>>>> +  ret =  RET_CCE_PASS_THROUGH_2;
>>>> +  else
>>>> +  ret = RET_BADCOPY;
>>>> +  break;
>>>> +
>>>> +  de

Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-11-05 Thread Li, ZhenHua
Yes, that's it. The function context_set_address_root does not set the
address root correctly.

I have created another patch for it, see
https://lkml.org/lkml/2014/11/5/43

Thanks
Zhenhua

On 11/06/2014 09:31 AM, Takao Indoh wrote:
> Hi Zhenhua, Baoquan,
> 
> (2014/10/22 19:05), Baoquan He wrote:
>> Hi Zhenhua,
>>
>> I tested your latest patch on 3.18.0-rc1+, there are still some dmar
>> errors. I remember it worked well with Bill's original patchset.
> 
> This should be a problem in copy_context_entry().
> 
>> +static int copy_context_entry(struct intel_iommu *iommu, u32 bus, u32 devfn,
>> +  void *ppap, struct context_entry *ce)
>> +{
>> +int ret = 0;/* Integer Return Code */
>> +u32 shift = 0;  /* bits to shift page_addr  */
>> +u64 page_addr = 0;  /* Address of translated page */
>> +struct dma_pte *pgt_old_phys;   /* Adr(page_table in the old kernel) */
>> +struct dma_pte *pgt_new_phys;   /* Adr(page_table in the new kernel) */
>> +u8  t;  /* Translation-type from context */
>> +u8  aw; /* Address-width from context */
>> +u32 aw_shift[8] = {
>> +12+9+9, /* [000b] 30-bit AGAW (2-level page table) */
>> +12+9+9+9,   /* [001b] 39-bit AGAW (3-level page table) */
>> +12+9+9+9+9, /* [010b] 48-bit AGAW (4-level page table) */
>> +12+9+9+9+9+9,   /* [011b] 57-bit AGAW (5-level page table) */
>> +12+9+9+9+9+9+9, /* [100b] 64-bit AGAW (6-level page table) */
>> +0,  /* [111b] Reserved */
>> +0,  /* [110b] Reserved */
>> +0,  /* [111b] Reserved */
>> +};
>> +
>> +struct domain_values_entry *dve = NULL;
>> +
>> +
>> +if (!context_present(ce)) { /* If (context not present) */
>> +ret = RET_CCE_NOT_PRESENT;  /* Skip it */
>> +goto exit;
>> +}
>> +
>> +t = context_translation_type(ce);
>> +
>> +/* If we have seen this domain-id before on this iommu,
>> + * give this context the same page-tables and we are done.
>> + */
>> +list_for_each_entry(dve, &domain_values_list[iommu->seq_id], link) {
>> +if (dve->did == (int) context_domain_id(ce)) {
>> +switch (t) {
>> +case 0: /* page tables */
>> +case 1: /* page tables */
>> +context_set_address_root(ce,
>> +virt_to_phys(dve->pgd));
> 
> Here, in context_set_address_root(), the new address is set like this:
> 
>   context->lo |= value & VTD_PAGE_MASK;
> 
> This is wrong, the logical disjunction of old address and new address
> becomes invalid address.
> 
> This should be like this.
> 
>   case 1: /* page tables */
>   ce->lo &= (~VTD_PAGE_MASK);
>   context_set_address_root(ce,
>   virt_to_phys(dve->pgd));
> 
> And one more,
> 
>> +ret = RET_CCE_PREVIOUS_DID;
>> +break;
>> +
>> +case 2: /* Pass through */
>> +if (dve->pgd == NULL)
>> +ret =  RET_CCE_PASS_THROUGH_2;
>> +else
>> +ret = RET_BADCOPY;
>> +break;
>> +
>> +default: /* Bad value of 't'*/
>> +ret = RET_BADCOPY;
>> +break;
>> +}
>> +goto exit;
>> +}
>> +}
> (snip)
>> +if (t == 0 || t == 1) { /* If (context has page tables) */
>> +aw = context_address_width(ce);
>> +shift = aw_shift[aw];
>> +
>> +pgt_old_phys = (struct dma_pte *)
>> +(context_address_root(ce) << VTD_PAGE_SHIFT);
>> +
>> +ret = copy_page_table(&pgt_new_phys, pgt_old_phys,
>> +shift-9, page_addr, iommu, bus, devfn, dve, ppap);
>> +
>> +if (ret)/* if (problem) bail out */
>> +goto exit;
>> +
>> +context_set_address_root(ce, (unsigned long)(pgt_new_phys));
> 
> ditto.
> 
> Thanks,
> Takao Indoh
> 
> 
>>
>>
>> 0console [earlya[0.00] allocate tes of page_cg  'a ong[
>> 0.00] tsc: Fast TSC calibration using PIT
>> 0031] Calibrating delay loop (skipped), value calculated using timer
>> frequency.. 5586.77 BogoMIPS (lpj=2793386)
>> [0.010682] pid_max: default: 32768 minimum: 301
>> [0.015317] ACPI: Core revision 20140828
>> [0.044598] ACPI: All ACPI Tables successfully acquired
>> [0.126450] Security Framework initialized
>> [0.130569] SELinux:  Initializing.
>> [

Re: [PATCH 1/1] x86/iommu: fix incorrect bit operations in setting values

2014-11-04 Thread Li, ZhenHua

Hi Joerg,

While debugging Bill's patches, I found this problem:
When copying iommu data from old kernel to the kdump kernel, the 
original function context_set_address_root() may cause kdump kernel 
using incorrect address root value.


So I created this patch to fix it.

Zhenhua

On 11/05/2014 03:30 PM, Li, Zhen-Hua wrote:

The function context_set_address_root() and set_root_value are setting new
address in a wrong way, and this patch is trying to fix this problem.

According to Intel Vt-d specs(Feb 2011, Revision 1.3), Chapter 9.1 and 9.2,
field ctp in root entry is using bits 12:63, field asr in context entry is
using bits 12:63.

To set these fields, the following functions are used:
static inline void context_set_address_root(struct context_entry *context,
 unsigned long value);
and
static inline void set_root_value(struct root_entry *root, unsigned long value)

But they are using an invalid method to set these fields, in current code, only
a '|' operator is used to set it. This will not set the asr to the expected
value if it has an old value.

For example:
Before calling this function,
context->lo = 0x3456789012111;
value = 0x123456789abcef12;

After we call context_set_address_root(context, value), expected result is
context->lo == 0x123456789abce111;

But the actual result is:
context->lo == 0x1237577f9bbde111;

So we need to clear bits 12:63 before setting the new value, this will fix
this problem.

Signed-off-by: Li, Zhen-Hua 
---
  drivers/iommu/intel-iommu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a27d6cb..11ac47b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -195,6 +195,7 @@ static inline void set_root_present(struct root_entry *root)
  }
  static inline void set_root_value(struct root_entry *root, unsigned long 
value)
  {
+   root->val &= ~VTD_PAGE_MASK;
root->val |= value & VTD_PAGE_MASK;
  }

@@ -247,6 +248,7 @@ static inline void context_set_translation_type(struct 
context_entry *context,
  static inline void context_set_address_root(struct context_entry *context,
unsigned long value)
  {
+   context->lo &= ~VTD_PAGE_MASK;
context->lo |= value & VTD_PAGE_MASK;
  }




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-27 Thread Li, ZhenHua

Hi Baoquan,
I failed in testing this patchset for 3.18.0-rc1, this upstream
3.18.0-rc1 kernel cannot boot on my system, have not found out the
reason.

Could you please test this patchset on 3.17.0 to see whether it has 
these faults?


Thanks
Zhenhua

On 10/22/2014 06:05 PM, Baoquan He wrote:

Hi Zhenhua,

I tested your latest patch on 3.18.0-rc1+, there are still some dmar
errors. I remember it worked well with Bill's original patchset.


0console [earlya[0.00] allocate tes of page_cg  'a ong[
0.00] tsc: Fast TSC calibration using PIT
0031] Calibrating delay loop (skipped), value calculated using timer
frequency.. 5586.77 BogoMIPS (lpj=2793386)
[0.010682] pid_max: default: 32768 minimum: 301
[0.015317] ACPI: Core revision 20140828
[0.044598] ACPI: All ACPI Tables successfully acquired
[0.126450] Security Framework initialized
[0.130569] SELinux:  Initializing.
[0.135211] Dentry cache hash table entries: 2097152 (order: 12,
16777216 bytes)
[0.145731] Inode-cache hash table entries: 1048576 (order: 11,
8388608 bytes)
[0.154249] Mount-cache hash table entries: 32768 (order: 6, 262144
bytes)
[0.161163] Mountpoint-cache hash table entries: 32768 (order: 6,
262144 bytes)
[0.168731] Initializing cgroup subsys memory
[0.173110] Initializing cgroup subsys devices
[0.177570] Initializing cgroup subsys freezer
[0.182026] Initializing cgroup subsys net_cls
[0.186483] Initializing cgroup subsys blkio
[0.190763] Initializing cgroup subsys perf_event
[0.195479] Initializing cgroup subsys hugetlb
[0.199955] CPU: Physical Processor ID: 0
[0.203972] CPU: Processor Core ID: 0
[0.207649] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[0.207649] ENERGY_PERF_BIAS: View and update with
x86_energy_perf_policy(8)
[0.220704] mce: CPU supports 16 MCE banks
[0.224832] CPU0: Thermal monitoring enabled (TM1)
[0.229658] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[0.229658] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[0.241537] Freeing SMP alternatives memory: 24K (81e8 -
81e86000)
[0.250740] ftrace: allocating 27051 entries in 106 pages
[0.268137] dmar: Host address width 46
[0.271986] dmar: DRHD base: 0x00dfffc000 flags: 0x1
[0.277314] dmar: IOMMU 0: reg_base_addr dfffc000 ver 1:0 cap
d2078c106f0462 ecap f020fe
[0.285423] dmar: RMRR base: 0x00cba11000 end: 0x00cba27fff
[0.291703] dmar: ATSR flags: 0x0
[0.295122] IOAPIC id 0 under DRHD base  0xdfffc000 IOMMU 0
[0.300704] IOAPIC id 2 under DRHD base  0xdfffc000 IOMMU 0
[0.306281] HPET id 0 under DRHD base 0xdfffc000
[0.311011] Enabled IRQ remapping in xapic mode
[0.316070] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[0.332096] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-1603 0 @ 2.80GHz
(fam: 06, model: 2d, stepping: 07)
[0.341495] Performance Events: PEBS fmt1+, 16-deep LBR, SandyBridge
events, full-width counters, Intel PMU driver.
[0.352047] perf_event_intel: PEBS disabled due to CPU errata, please
upgrade microcode
[0.360060] ... version:3
[0.364081] ... bit width:  48
[0.368182] ... generic registers:  8
[0.372196] ... value mask: 
[0.377513] ... max period: 
[0.382829] ... fixed-purpose events:   3
[0.386842] ... event mask: 000700ff
[0.393368] x86: Booting SMP configuration:
[0.397563]  node  #0, CPUs:  #1
[0.414672] NMI watchdog: enabled on all CPUs, permanently consumes
one hw-PMU counter.
[0.422957]  #2 #3
[0.451320] x86: Booted up 1 node, 4 CPUs
[0.455539] smpboot: Total of 4 processors activated (22347.08
BogoMIPS)
[0.466369] devtmpfs: initialized
[0.472993] PM: Registering ACPI NVS region [mem
0xcb75-0xcb7dafff] (569344 bytes)
[0.480930] PM: Registering ACPI NVS region [mem
0xcbaad000-0xcbaaefff] (8192 bytes)
[0.488689] PM: Registering ACPI NVS region [mem
0xcbabb000-0xcbacdfff] (77824 bytes)
[0.496535] PM: Registering ACPI NVS region [mem
0xcbb56000-0xcbb5dfff] (32768 bytes)
[0.504380] PM: Registering ACPI NVS region [mem
0xcbb71000-0xcbff] (4780032 bytes)
[0.513294] atomic64_test: passed for x86-64 platform with CX8 and
with SSE
[0.520272] pinctrl core: initialized pinctrl subsystem
[0.525549] RTC time:  9:52:43, date: 10/22/14
[0.530096] NET: Registered protocol family 16
[0.539573] cpuidle: using governor menu
[0.543583] ACPI: bus type PCI registered
[0.547608] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[0.554133] PCI: MMCONFIG for domain  [bus 00-ff] at [mem
0xe000-0xefff] (base 0xe000)
[0.563457] PCI: MMCONFIG at [mem 0xe000-0xefff] reserved in
E820
[0.570548] PCI: Using configuration type 1 for base access
[0.582492] ACPI: Added _OSI(Module Device)
[0.586683] AC

Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel

2014-10-23 Thread Li, ZhenHua

On 10/22/2014 10:47 AM, Bjorn Helgaas wrote:

[+cc Joerg, Eric, Tom, David, iommu list]

On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh  wrote:

(2014/10/14 18:34), Li, ZhenHua wrote:

I tested on the latest stable version 3.17, it works well.

On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote:

On a HP system with Intel vt-d supported and many PCI devices on it,
when kernel crashed and the kdump kernel boots with intel_iommu=on,
there may be some unexpected DMA requests on this adapter, which will
cause DMA Remapping faults like:
  dmar: DRHD: handling fault status reg 102
  dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr fff81000
  DMAR:[fault reason 01] Present bit in root entry is clear

This bug may happen on *any* PCI device.
Analysis for this bug:

The present bit is set in this function:

static struct context_entry * device_to_context_entry(
  struct intel_iommu *iommu, u8 bus, u8 devfn)
{
  ..
  set_root_present(root);
  ..
}

Calling tree:
  device driver
  intel_alloc_coherent
  __intel_map_single
  domain_context_mapping
  domain_context_mapping_one
  device_to_context_entry

This means, the present bit in root entry will not be set until the device
driver is loaded.

But in the kdump kernel, hardware devices are not aware that control has
transferred to the second kernel, and those drivers must initialize again.
Consequently there may be unexpected DMA requests from devices activity
initiated in the first kernel leading to the DMA Remapping errors in the
second kernel.

To fix this DMAR fault, we need to reset the bus that this device on. Reset
the device itself does not work.


You have not explained why the DMAR faults are a problem.  The fault
is just an indication that the IOMMU prevented a DMA from completing.
If the DMA is an artifact of the crashed kernel, we probably don't
*want* it to complete, so taking a DMAR fault seems like exactly the
right thing.

If the problem is that we're being flooded with messages, it's easy
enough to just tone down the printks.


A patch for this bug that has been sent before:
https://lkml.org/lkml/2014/9/30/55
As in discussion, this bug may happen on *any* device, so we need to reset all
pci devices.

There was an original version(Takao Indoh) that resets the pcie devices:
https://lkml.org/lkml/2013/5/14/9


As far as I can remember, the original patch was nacked by
the following reasons:

1) On sparc, the IOMMU is initialized before PCI devices are enumerated,
so there would still be a window where ongoing DMA could cause an
IOMMU error.

2) Basically Bjorn is thinking device reset should be done in the
1st kernel before jumping into 2nd kernel.


If you're referring to this: https://lkml.org/lkml/2013/6/12/16, what
I said was "It would be at least conceivable to reset the devices ...
before the kexec."  That's not a requirement to do it in the first
kernel, just an idea that I thought should be investigated.  And Eric
has good reasons for *not* doing the reset in the first kernel, so it
turned out not to be a very good idea.

My fundamental problem with this whole reset thing is that it's a
sledgehammer approach and it's ugly.  Using the IOMMU seems like a
much more elegant approach.

So if we are forced to accept the reset solution, I want to at least
have a concise explanation of why we can't use the IOMMU.


In my understanding, the reset is necessary, because no one knows if
there are any other problems if the devices are not reset in the kdump
kernel.  OS must boot with the devices in a clean state.

But this will not prevent us from using Bill's iommu patch. I think the
most important part for Bill's patch is to dump the data, not to fix
the dmar faults, though it also fixed that.

Zhenhua

The changelog above is perfectly accurate, but it's really not very
useful because it only explains the code without exploring any of the
interesting issues.

Bjorn


And Bill Sumner proposed another idea.
http://comments.gmane.org/gmane.linux.kernel.iommu/4828
I don't know the current status of this patch, but I think Jerry Hoemann
is working on this.

Thanks,
Takao Indoh




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-21 Thread Li, ZhenHua
Need more time to read and think about these mails. I just want to 
clarify one thing: Bill has left HP, and now I inherited his works.

That's why I sent an update of his patch
https://lkml.org/lkml/2014/10/21/134

On 10/22/2014 10:47 AM, Eric W. Biederman wrote:

Bjorn Helgaas  writes:


[-cc Bill, +cc Zhen-Hua, Eric, Tom, Jerry]

Hi Joerg,

I was looking at Zhen-Hua's recent patches, trying to figure out if I
need to do anything with them.  Resetting devices in the old kernel
seems like a non-starter.  Resetting devices in the new kernel, ...,
well, maybe.  It seems ugly, and it seems like the sort of problem
that IOMMUs are designed to solve.  Anyway, I found this old
discussion that I didn't quite understand:


For context here is the kexec on panic design, and what I know from
previous rounds of similar conversations.

The way kexec on panic aka kdump is designed to work is that the
recovery kernel lives in a piece of memory reserved at boot time and
known not to be in use by any driver (because we never ever use it for
DMA).  If DMA's continue from any source the old kernel may be a little
more corrupted but our currently running kernel should not.

Device drivers that we use in the recovery kernel are required to be
able to initialize their devices from an arbitrary state or fail to
initialize their devices.

We have discussed things on various occassions but IOMMUs all have their
own individual idiosynchrousies and came late to the party so that it
is hard to generalize.

The reserved region is generally low enough in memory that simply
not using IOMMUs works.

The major challenge with initializing an IOMMU would be that there are
potentially devices whose driver is not loaded in the recover kernel
with on-going DMA sessions (perhaps a NIC in response to network
packet).

Which essentially means that if you are going to use an IOMMU slot in a
recovery kernel you have to either know that IOMMU slot was reserved for
the recovery kernel (what has always felt like the easiest way to me).
Or you have to know everything that could target that IOMMU slot has
been reset or has it's driver loaded.

I have always thought the simplist and easiest solution would be to
reserve a few IOMMU slots for the kexec on panic kernel.  But if folks
can find other ways to guarantee that an on-going DMA isn't targeting
an IOMMU slot (such as resetting everything downstream from that
IOMMU slot) more power to you.


On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel  wrote:

On Wed, Apr 30, 2014 at 11:49:33AM +0100, David Woodhouse wrote:



After the last round of this patchset, we discussed a potential
improvement where you point every virtual bus address at the *same*
physical scratch page.


That is a solution to prevent the in-flight DMA failures. But what
happens when there is some in-flight DMA to a disk to write some inodes
or a new superblock. Then this scratch address-space may cause
filesystem corruption at worst.


This in-flight DMA is from a device programmed by the old kernel, and
it would be reading data from the old kernel's buffers.  I think
you're suggesting that we might want that DMA read to complete so the
device can update filesystem metadata?

I don't really understand that argument.  Don't we usually want to
stop any data from escaping the machine after a crash, on the theory
that the old kernel is crashing because something is catastrophically
wrong and we may have already corrupted things in memory?  If so,
allowing this old DMA to complete is just as likely to make things
worse as to make them better.

Without kdump, we likely would reboot through the BIOS and the device
would get reset and the DMA would never happen at all.  So if we made
the dump kernel program the IOMMU to prevent the DMA, that seems like
a similar situation.


So with this in mind I would prefer initially taking over the
page-tables from the old kernel before the device drivers re-initialize
the devices.


This makes the dump kernel more dependent on data from the old kernel,
which we obviously want to avoid when possible.

I didn't find the previous discussion where pointing every virtual bus
address at the same physical scratch page was proposed.  Why was that
better than programming the IOMMU to reject every DMA?

Bjorn


Eric



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel

2014-10-21 Thread Li, ZhenHua

On 10/22/2014 10:47 AM, Bjorn Helgaas wrote:

[+cc Joerg, Eric, Tom, David, iommu list]

On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh  wrote:

(2014/10/14 18:34), Li, ZhenHua wrote:

I tested on the latest stable version 3.17, it works well.

On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote:

On a HP system with Intel vt-d supported and many PCI devices on it,
when kernel crashed and the kdump kernel boots with intel_iommu=on,
there may be some unexpected DMA requests on this adapter, which will
cause DMA Remapping faults like:
  dmar: DRHD: handling fault status reg 102
  dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr fff81000
  DMAR:[fault reason 01] Present bit in root entry is clear

This bug may happen on *any* PCI device.
Analysis for this bug:

The present bit is set in this function:

static struct context_entry * device_to_context_entry(
  struct intel_iommu *iommu, u8 bus, u8 devfn)
{
  ..
  set_root_present(root);
  ..
}

Calling tree:
  device driver
  intel_alloc_coherent
  __intel_map_single
  domain_context_mapping
  domain_context_mapping_one
  device_to_context_entry

This means, the present bit in root entry will not be set until the device
driver is loaded.

But in the kdump kernel, hardware devices are not aware that control has
transferred to the second kernel, and those drivers must initialize again.
Consequently there may be unexpected DMA requests from devices activity
initiated in the first kernel leading to the DMA Remapping errors in the
second kernel.

To fix this DMAR fault, we need to reset the bus that this device on. Reset
the device itself does not work.


You have not explained why the DMAR faults are a problem.  The fault
is just an indication that the IOMMU prevented a DMA from completing.
If the DMA is an artifact of the crashed kernel, we probably don't
*want* it to complete, so taking a DMAR fault seems like exactly the
right thing.
Well, I still need more time to think about other contents you mentioned 
and explained in these mails. But about the DMA fault, I think it is not 
"the iommu prevented a DMA from completing", it is the iommu could not 
help system complete the dma, so the iommu reported an error.


Also I agree with you that these DMA should not be completed.
But I still think,  these dma, programmed by the old kernel, should be 
stopped because they are some kind of illegal for the kdump kernel, no 
matter what iommu did.




If the problem is that we're being flooded with messages, it's easy
enough to just tone down the printks.


A patch for this bug that has been sent before:
https://lkml.org/lkml/2014/9/30/55
As in discussion, this bug may happen on *any* device, so we need to reset all
pci devices.

There was an original version(Takao Indoh) that resets the pcie devices:
https://lkml.org/lkml/2013/5/14/9


As far as I can remember, the original patch was nacked by
the following reasons:

1) On sparc, the IOMMU is initialized before PCI devices are enumerated,
so there would still be a window where ongoing DMA could cause an
IOMMU error.

2) Basically Bjorn is thinking device reset should be done in the
1st kernel before jumping into 2nd kernel.


If you're referring to this: https://lkml.org/lkml/2013/6/12/16, what
I said was "It would be at least conceivable to reset the devices ...
before the kexec."  That's not a requirement to do it in the first
kernel, just an idea that I thought should be investigated.  And Eric
has good reasons for *not* doing the reset in the first kernel, so it
turned out not to be a very good idea.

My fundamental problem with this whole reset thing is that it's a
sledgehammer approach and it's ugly.  Using the IOMMU seems like a
much more elegant approach.

So if we are forced to accept the reset solution, I want to at least
have a concise explanation of why we can't use the IOMMU.

The changelog above is perfectly accurate, but it's really not very
useful because it only explains the code without exploring any of the
interesting issues.

Bjorn


And Bill Sumner proposed another idea.
http://comments.gmane.org/gmane.linux.kernel.iommu/4828
I don't know the current status of this patch, but I think Jerry Hoemann
is working on this.

Thanks,
Takao Indoh




Update of this new version, comparing with Takao Indoh's version:
  Add support for legacy PCI devices.
  Use pci_try_reset_bus instead of do_downstream_device_reset in original 
version

Randy Wright corrects some misunderstanding in this description.

Signed-off-by: Li, Zhen-Hua 
Signed-off-by: Takao Indoh 
Signed-off-by: Randy Wright 
---
   drivers/pci/pci.c | 84 
+++
   1 file changed, 84 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2c9ac70..8c

Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-15 Thread Li, ZhenHua

Add Tom to CC list.
On 10/15/2014 04:10 PM, Li, ZhenHua wrote:

David, Joerg,
I plan to merge this patch set with 3.17 stable kernel, and split this
patch set into two :
1. The core part, including the changed functions, like [Patch 4/8],
[Patch 8/8].
2. For the formatting issues, like [Patch 1/8],[Patch 3/8],  including
the changes for code formations, creation of new files
intel-iommu-kdump.c, intel-iommu-private.h.

I believe this will make the patch set more clear to read and understand.

What are your suggestions?

Thanks
Zhenhua


On 07/12/2014 12:27 AM, Jerry Hoemann wrote:

On Wed, Jul 02, 2014 at 03:32:59PM +0200, Joerg Roedel wrote:

Hi David,

On Wed, Apr 30, 2014 at 11:49:33AM +0100, David Woodhouse wrote:

There could be all kinds of existing mappings in the DMA page tables,
and I'm not sure it's safe to preserve them. What prevents the
crashdump
kernel from trying to use any of the physical pages which are
accessible, and which could thus be corrupted by stray DMA?

In fact, the old kernel could even have set up 1:1 passthrough mappings
for some devices, which would then be able to DMA *anywhere*. Surely we
need to prevent that?


Ideally we would prevent that, yes. But the problem is that a failed DMA
transaction might put the device into an unrecoverable state. Usually
any in-flight DMA transactions should only target buffers set up by the
previous kernel and not corrupt any data.


After the last round of this patchset, we discussed a potential
improvement where you point every virtual bus address at the *same*
physical scratch page.


That is a solution to prevent the in-flight DMA failures. But what
happens when there is some in-flight DMA to a disk to write some inodes
or a new superblock. Then this scratch address-space may cause
filesystem corruption at worst.

So with this in mind I would prefer initially taking over the
page-tables from the old kernel before the device drivers re-initialize
the devices.


Joerg


David, Joerg,

What do you think here?  Do you want me to update the patch set for 3.17?

Jerry





___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-15 Thread Li, ZhenHua

David, Joerg,
I plan to merge this patch set with 3.17 stable kernel, and split this 
patch set into two :
1. The core part, including the changed functions, like [Patch 4/8], 
[Patch 8/8].
2. For the formatting issues, like [Patch 1/8],[Patch 3/8],  including 
the changes for code formations, creation of new files 
intel-iommu-kdump.c, intel-iommu-private.h.


I believe this will make the patch set more clear to read and understand.

What are your suggestions?

Thanks
Zhenhua


On 07/12/2014 12:27 AM, Jerry Hoemann wrote:

On Wed, Jul 02, 2014 at 03:32:59PM +0200, Joerg Roedel wrote:

Hi David,

On Wed, Apr 30, 2014 at 11:49:33AM +0100, David Woodhouse wrote:

There could be all kinds of existing mappings in the DMA page tables,
and I'm not sure it's safe to preserve them. What prevents the crashdump
kernel from trying to use any of the physical pages which are
accessible, and which could thus be corrupted by stray DMA?

In fact, the old kernel could even have set up 1:1 passthrough mappings
for some devices, which would then be able to DMA *anywhere*. Surely we
need to prevent that?


Ideally we would prevent that, yes. But the problem is that a failed DMA
transaction might put the device into an unrecoverable state. Usually
any in-flight DMA transactions should only target buffers set up by the
previous kernel and not corrupt any data.


After the last round of this patchset, we discussed a potential
improvement where you point every virtual bus address at the *same*
physical scratch page.


That is a solution to prevent the in-flight DMA failures. But what
happens when there is some in-flight DMA to a disk to write some inodes
or a new superblock. Then this scratch address-space may cause
filesystem corruption at worst.

So with this in mind I would prefer initially taking over the
page-tables from the old kernel before the device drivers re-initialize
the devices.


Joerg


David, Joerg,

What do you think here?  Do you want me to update the patch set for 3.17?

Jerry



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d : clear old root entry for dump kernel

2014-08-20 Thread Li, ZhenHua


On my system, error message:
[4.322008] dmar: DRHD: handling fault status reg 2
[4.327484] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr 
fff66000

[4.327484] DMAR:[fault reason 01] Present bit in root entry is clear

fault happens on device : 21:00.0

To describe this problem clearly, we mark two time points:
A: Just before dump kernel init the DMAR/IOMMU structures.
B: Just after DMAR/IOMMU finished initialization.
C: Just before  21:00.0

When the first kernel crashed, then the dump kernel will boot.
Then there comes a DMA request on  21:00.0 when durting A to C. And this 
will cause this error.


If I clear the root entry table when iommu structure is allocted, this 
error will not happen during A to B, but still can be seen during B and 
C.  I guess the cause is the un-expected DMA request.


I will apply Bill Sumner's patch to see whether it will cause the errors 
disappear.



-- Zhenhua

On 08/19/2014 07:02 PM, Joerg Roedel wrote:

On Mon, Aug 18, 2014 at 11:27:01PM +, Li, Zhen-Hua wrote:

: [fault reason 01] Present bit in root entry is clear
It appears when iommu initializing in the kdump kernel.


Hmm, do you have an explanation how this can happen? From how I read the
code, the kdump kernel disables translation on the IOMMU, then sets a
new root entry, and then re-enabled translation. To me it looks like
there is no point in time where translation is enabled and the
root-entry is clear (present-bit==0).

But obviously I am missing something if you see the message above.

Btw, have you looked into this patch-set posted earlier this year:

https://lkml.org/lkml/2014/4/24/836

It approaches the same problem-space, but also cares about in-flight
DMA.


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d : clear old root entry for dump kernel

2014-08-18 Thread Li, ZhenHua

I found there are more data need to be cleared for the dump kernel.
So please ignore this patch, I will send out another one.

Thanks
Zhenhua

On 08/19/2014 07:59 AM, Li, Zhen-Hua wrote:

My debugging result is this:

1. Clear the old root entry table, dump kernel will choose another
  memory region for root entry.
2. Do NOT clear the old root entry, when dump kernel initializing
the iommu data structure, it  will allocate memory for root entry,
this is different from the old address.

If not clear old entry , the error message appears before dump kernel
finishes the iommu init works, and also appears in other places(before
device inits).

If I clear the old root entry, the error message disappears before iommu
init work finish, but still appears in other places.

-Original Message-
From: Li, Zhen-Hua
Sent: Tuesday, August 19, 2014 7:48 AM
To: Li, Zhen-Hua; Joerg Roedel
Cc: David Woodhouse; iommu@lists.linux-foundation.org; 
linux-ker...@vger.kernel.org
Subject: RE: [PATCH 1/1] iommu/vt-d : clear old root entry for dump kernel

When the dump kernel boots, it will initialize iommu again, and the root entry 
will be allocted
in another memory region.

That means, no matter kernel clears the old root entry table or not,  the dump 
kernel will use
another memory region when iommu initializing.

-Original Message-
From: Li, Zhen-Hua
Sent: Tuesday, August 19, 2014 7:27 AM
To: 'Joerg Roedel'
Cc: David Woodhouse; iommu@lists.linux-foundation.org; 
linux-ker...@vger.kernel.org
Subject: RE: [PATCH 1/1] iommu/vt-d : clear old root entry for dump kernel

: [fault reason 01] Present bit in root entry is clear
It appears when iommu initializing in the kdump kernel.

-Original Message-
From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Tuesday, August 19, 2014 7:23 AM
To: Li, Zhen-Hua
Cc: David Woodhouse; iommu@lists.linux-foundation.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH 1/1] iommu/vt-d : clear old root entry for dump kernel

On Mon, Aug 18, 2014 at 11:01:56PM +, Li, Zhen-Hua wrote:

There is a bug when Linux running on an HP large system:
when kdump kernel runs, the hardware is still using the old
root entry. This causes error message when iommu not finished initialization.

What error message are you seeing? When the kdump kernel boots the iommu
should be still enabled from the old kernel with the old root-entry. So
any in-flight DMA initiated from the old kernel can still pass and there
should be no error messages.

When you clear the root-entry that in-flight DMA might go to another
random location in system memory or just fail, no?


Joerg



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] x86/iommu: use bit structures for context_entry

2014-01-10 Thread Li, ZhenHua

This patch is sent again because a ';' was missed in my last mail.

> +  context->asr = value >> VTD_PAGE_SHIFT

I corrected it and created a new one. And it is tested .
Please use the latest one.


Sorry for that.

ZhenHua

On 01/10/2014 04:27 PM, Li, Zhen-Hua wrote:

There is a structure named context_entry used by intel iommu, and there
are some bit operations on it. Use bit structure may make these operations
easy.
Also the function context_set_address_root may cause problem because it uses
  |= operation, not set the new value. And this patch can fix it.

Signed-off-by: Li, Zhen-Hua 
---
  drivers/iommu/intel-iommu.c | 37 +++--
  1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 43b9bfe..65cd480 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -221,47 +221,64 @@ get_context_addr_from_root(struct root_entry *root)
   * 8-23: domain id
   */
  struct context_entry {
-   u64 lo;
-   u64 hi;
+   union {
+   struct {
+   u64 present:1,
+   fpd:1,
+   translation_type:2,
+   reserved_l:8,
+   asr:52;
+   };
+   u64 lo;
+   };
+   union {
+   struct {
+   u64 aw:3,
+   avail:4,
+   reserved_h1:1,
+   did:16,
+   reserved_h2:40;
+   };
+   u64 hi;
+   };
  };

  static inline bool context_present(struct context_entry *context)
  {
-   return (context->lo & 1);
+   return context->present;
  }
  static inline void context_set_present(struct context_entry *context)
  {
-   context->lo |= 1;
+   context->present = 1;
  }

  static inline void context_set_fault_enable(struct context_entry *context)
  {
-   context->lo &= (((u64)-1) << 2) | 1;
+   context->fpd = 1;
  }

  static inline void context_set_translation_type(struct context_entry *context,
unsigned long value)
  {
-   context->lo &= (((u64)-1) << 4) | 3;
-   context->lo |= (value & 3) << 2;
+   context->translation_type = value;
  }

  static inline void context_set_address_root(struct context_entry *context,
unsigned long value)
  {
-   context->lo |= value & VTD_PAGE_MASK;
+   context->asr = value >> VTD_PAGE_SHIFT;
  }

  static inline void context_set_address_width(struct context_entry *context,
 unsigned long value)
  {
-   context->hi |= value & 7;
+   context->aw  = value;
  }

  static inline void context_set_domain_id(struct context_entry *context,
 unsigned long value)
  {
-   context->hi |= (value & ((1 << 16) - 1)) << 8;
+   context->did = value;
  }

  static inline void context_clear_entry(struct context_entry *context)



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] x86/iommu: use bit structures for context_entry

2014-01-09 Thread Li, ZhenHua
I have not seen such a bug yet . but obviously a '=' should be used when 
you want to set a value.


for example, if x != 0,
x=10
and
x|=10
will cause different result.

ZhenHua

On 01/07/2014 10:41 PM, Joerg Roedel wrote:

On Fri, Dec 20, 2013 at 04:45:01PM +0800, Li, Zhen-Hua wrote:

There is a structure named context_entry used by intel iommu, and there
are some bit operations on it. Use bit structure may make these operations
easy.
Also the function context_set_address_root may cause problem because it uses
  |= operation, not set the new value. And this patch can fix it.


What is the problem you are trying to fix here? Is it an actual bug?


Joerg




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] x86/iommu: use bit structures for context_entry

2014-01-05 Thread Li, ZhenHua
Yes, that's the problem. And some other structures like this, see "union 
irte".


On 12/26/2013 01:12 PM, Wu, Feng wrote:

I think if using |= operation, it should use &= operation to clear those
bits first.

Thanks,

Feng

*From:*iommu-boun...@lists.linux-foundation.org
[mailto:iommu-boun...@lists.linux-foundation.org] *On Behalf Of *Kai Huang
*Sent:* Thursday, December 26, 2013 9:55 AM
*To:* Li, Zhen-Hua
*Cc:* David Woodhouse; iommu@lists.linux-foundation.org;
linux-ker...@vger.kernel.org
*Subject:* Re: [PATCH 1/1] x86/iommu: use bit structures for context_entry

Just curious, why would |= operation cause problem? :)

On Fri, Dec 20, 2013 at 4:45 PM, Li, Zhen-Hua mailto:zhen-h...@hp.com>> wrote:

There is a structure named context_entry used by intel iommu, and there
are some bit operations on it. Use bit structure may make these operations
easy.
Also the function context_set_address_root may cause problem because it uses
  |= operation, not set the new value. And this patch can fix it.

Signed-off-by: Li, Zhen-Hua mailto:zhen-h...@hp.com>>
---
  drivers/iommu/intel-iommu.c | 37 +++--
  1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 43b9bfe..65cd480 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -221,47 +221,64 @@ get_context_addr_from_root(struct root_entry *root)
   * 8-23: domain id
   */
  struct context_entry {
-   u64 lo;
-   u64 hi;
+   union {
+   struct {
+   u64 present:1,
+   fpd:1,
+   translation_type:2,
+   reserved_l:8,
+   asr:52;
+   };
+   u64 lo;
+   };
+   union {
+   struct {
+   u64 aw:3,
+   avail:4,
+   reserved_h1:1,
+   did:16,
+   reserved_h2:40;
+   };
+   u64 hi;
+   };
  };

  static inline bool context_present(struct context_entry *context)
  {
-   return (context->lo & 1);
+   return context->present;
  }
  static inline void context_set_present(struct context_entry *context)
  {
-   context->lo |= 1;
+   context->present = 1;
  }

  static inline void context_set_fault_enable(struct context_entry *context)
  {
-   context->lo &= (((u64)-1) << 2) | 1;
+   context->fpd = 1;
  }

  static inline void context_set_translation_type(struct context_entry
*context,
 unsigned long value)
  {
-   context->lo &= (((u64)-1) << 4) | 3;
-   context->lo |= (value & 3) << 2;
+   context->translation_type = value;
  }

  static inline void context_set_address_root(struct context_entry *context,
 unsigned long value)
  {
-   context->lo |= value & VTD_PAGE_MASK;
+   context->asr = value >> VTD_PAGE_SHIFT
  }

  static inline void context_set_address_width(struct context_entry
*context,
  unsigned long value)
  {
-   context->hi |= value & 7;
+   context->aw  = value;
  }

  static inline void context_set_domain_id(struct context_entry *context,
  unsigned long value)
  {
-   context->hi |= (value & ((1 << 16) - 1)) << 8;
+   context->did = value;
  }

  static inline void context_clear_entry(struct context_entry *context)
--
1.8.4.3

___
iommu mailing list
iommu@lists.linux-foundation.org 
https://lists.linuxfoundation.org/mailman/listinfo/iommu



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu