Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/12/15 at 01:57pm, Dave Young wrote:
> On 05/11/15 at 12:11pm, Joerg Roedel wrote:
> > On Thu, May 07, 2015 at 09:56:00PM +0800, Dave Young wrote:
> > > Joerg, I can not find the last reply from you, so I am just replying here
> > > about my worries.
> > >
> > > I said that the patchset will cause more problems; let me explain more
> > > about it here:
> > >
> > > Suppose the page table was corrupted, i.e. the original mapping
> > > iova1 -> page 1 was accidentally changed to iova1 -> page 2 while the
> > > crash was happening; thus future DMA will read/write page 2 instead of
> > > page 1, right?
> >
> > When the page-table is corrupted then it is a left-over from the old
> > kernel. When the kdump kernel boots, the situation can at least not get
> > worse. For the page tables it is also hard to detect wrong mappings (if
> > this were possible, the hardware could already do it), so any checks we
> > could do there are of limited use anyway.
>
> Joerg, since both of you guys do not think it is a problem I will object it

s/will object/will not object

> any more, though I still do not like reusing the old page tables. So let's
> leave it as a future issue.
>
> Thanks
> Dave

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/11/15 at 12:11pm, Joerg Roedel wrote:
> On Thu, May 07, 2015 at 09:56:00PM +0800, Dave Young wrote:
> > Joerg, I can not find the last reply from you, so I am just replying here
> > about my worries.
> >
> > I said that the patchset will cause more problems; let me explain more
> > about it here:
> >
> > Suppose the page table was corrupted, i.e. the original mapping
> > iova1 -> page 1 was accidentally changed to iova1 -> page 2 while the
> > crash was happening; thus future DMA will read/write page 2 instead of
> > page 1, right?
>
> When the page-table is corrupted then it is a left-over from the old
> kernel. When the kdump kernel boots, the situation can at least not get
> worse. For the page tables it is also hard to detect wrong mappings (if
> this were possible, the hardware could already do it), so any checks we
> could do there are of limited use anyway.

Joerg, since both of you guys do not think it is a problem I will object it
any more, though I still do not like reusing the old page tables. So let's
leave it as a future issue.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On Thu, May 07, 2015 at 09:56:00PM +0800, Dave Young wrote:
> Joerg, I can not find the last reply from you, so I am just replying here
> about my worries.
>
> I said that the patchset will cause more problems; let me explain more
> about it here:
>
> Suppose the page table was corrupted, i.e. the original mapping
> iova1 -> page 1 was accidentally changed to iova1 -> page 2 while the
> crash was happening; thus future DMA will read/write page 2 instead of
> page 1, right?

When the page-table is corrupted then it is a left-over from the old
kernel. When the kdump kernel boots, the situation can at least not get
worse. For the page tables it is also hard to detect wrong mappings (if
this were possible, the hardware could already do it), so any checks we
could do there are of limited use anyway.

	Joerg
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/07/2015 09:21 PM, Dave Young wrote:
> On 05/07/15 at 10:25am, Don Dutile wrote:
> > On 05/07/2015 10:00 AM, Dave Young wrote:
> > > On 04/07/15 at 10:12am, Don Dutile wrote:
> > > > On 04/06/2015 11:46 PM, Dave Young wrote:
> > > > > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > > > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > > > > Hi Dave,
> > > > > > > >
> > > > > > > > There may be some possibilities that the old iommu data was
> > > > > > > > corrupted by some other module. Currently we do not have a
> > > > > > > > better solution for the DMAR faults.
> > > > > > > >
> > > > > > > > But I think when this happens, we need to fix the module that
> > > > > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > > > > normal kernel: the queue used by the qi_* functions was
> > > > > > > > overwritten by another module. The fix was in that module, not
> > > > > > > > in the iommu module.
> > > > > > >
> > > > > > > It is too late; there will be no chance to save the vmcore then.
> > > > > > >
> > > > > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > > > > because of using the old iommu tables, then it will cause more
> > > > > > > problems.
> > > > > > >
> > > > > > > So I think the tables at least need some verification before
> > > > > > > being used.
> > > > > >
> > > > > > Yes, that is good thinking, and verification is also an interesting
> > > > > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and
> > > > > > then verify it again in purgatory when a panic happens. This checks
> > > > > > whether any code stomped into the region reserved for kexec/kdump
> > > > > > and corrupted the loaded kernel.
> > > > > >
> > > > > > If this is decided on, it should be an enhancement to the current
> > > > > > patchset, not an approach change. Since this patchset is very close
> > > > > > to the point the maintainers expected, maybe it can be merged first
> > > > > > and the enhancement considered afterwards. After all, without this
> > > > > > patchset vt-d often raised error messages and hung.
> > > > >
> > > > > It does not convince me; we should do it right at the beginning
> > > > > instead of introducing something wrong.
> > > > >
> > > > > I wonder why the old DMA can not be remapped to a specific page in
> > > > > the kdump kernel so that it will not corrupt more memory. But I may
> > > > > have missed something; I will look for the old threads and catch up.
> > > > >
> > > > > Thanks
> > > > > Dave
> > > >
> > > > The (only) issue is not corruption: once the iommu is re-configured,
> > > > the old, not-stopped-yet DMA engines will use IOVAs that will generate
> > > > DMAR faults, which will be enabled when the iommu is re-configured
> > > > (even to a single/simple paging scheme) in the kexec kernel.
> > >
> > > Don, so if the iommu is not reconfigured, then these faults will not
> > > happen?
> >
> > Well, if the iommu is not reconfigured, then, provided the crash wasn't
> > caused by an IOMMU fault (some systems have firmware-first catch the IOMMU
> > fault and convert it into an NMI_IOCK), the DMAs will continue into the
> > old kernel memory space.
>
> So NMI_IOCK is one reason a kernel can hang. I think I'm still not clear
> about what "re-configured" means, though. Originally DMAR faults happen
> (that is the old behavior), but we are removing the faults by allowing DMA
> to continue into the old memory space.

A flood of faults occurs when the 2nd kernel (re-)configures the IOMMU,
because the second kernel effectively clears/disables all DMA except RMRRs,
so any DMA from the 1st kernel will flood the system with faults. It's the
flood of DMAR faults that eventually wedges and/or crashes the system while
trying to take a kdump.

> > > Baoquan and I ran into a confusion today about iommu=off/intel_iommu=off:
> > >
> > > intel_iommu_init()
> > > {
> > > ...
> > > 	dmar_table_init();
> > >
> > > 	disable active iommu translations;
> > >
> > > 	if (no_iommu || dmar_disabled)
> > > 		goto out_free_dmar;
> > > ...
> > > }
> > >
> > > Any reason not to move the no_iommu check to the beginning of the
> > > intel_iommu_init function?
> >
> > What does that do/help?
>
> I just do not know why the previous handling is necessary with iommu=off;
> shouldn't we do nothing and return earlier?
>
> Also, a guess: the DMAR faults appear after iommu_init, so I am not sure
> whether the code that runs before the dmar_disabled check has some effect
> on enabling the fault messages.
>
> Thanks
> Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/07/2015 10:00 AM, Dave Young wrote:
> On 04/07/15 at 10:12am, Don Dutile wrote:
> > On 04/06/2015 11:46 PM, Dave Young wrote:
> > > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > > Hi Dave,
> > > > > >
> > > > > > There may be some possibilities that the old iommu data was
> > > > > > corrupted by some other module. Currently we do not have a better
> > > > > > solution for the DMAR faults.
> > > > > >
> > > > > > But I think when this happens, we need to fix the module that
> > > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > > normal kernel: the queue used by the qi_* functions was overwritten
> > > > > > by another module. The fix was in that module, not in the iommu
> > > > > > module.
> > > > >
> > > > > It is too late; there will be no chance to save the vmcore then.
> > > > >
> > > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > > because of using the old iommu tables, then it will cause more
> > > > > problems.
> > > > >
> > > > > So I think the tables at least need some verification before being
> > > > > used.
> > > >
> > > > Yes, that is good thinking, and verification is also an interesting
> > > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and then
> > > > verify it again in purgatory when a panic happens. This checks whether
> > > > any code stomped into the region reserved for kexec/kdump and corrupted
> > > > the loaded kernel.
> > > >
> > > > If this is decided on, it should be an enhancement to the current
> > > > patchset, not an approach change. Since this patchset is very close to
> > > > the point the maintainers expected, maybe it can be merged first and
> > > > the enhancement considered afterwards. After all, without this patchset
> > > > vt-d often raised error messages and hung.
> > >
> > > It does not convince me; we should do it right at the beginning instead
> > > of introducing something wrong.
> > >
> > > I wonder why the old DMA can not be remapped to a specific page in the
> > > kdump kernel so that it will not corrupt more memory. But I may have
> > > missed something; I will look for the old threads and catch up.
> > >
> > > Thanks
> > > Dave
> >
> > The (only) issue is not corruption: once the iommu is re-configured, the
> > old, not-stopped-yet DMA engines will use IOVAs that will generate DMAR
> > faults, which will be enabled when the iommu is re-configured (even to a
> > single/simple paging scheme) in the kexec kernel.
>
> Don, so if the iommu is not reconfigured, then these faults will not happen?
>
> Baoquan and I ran into a confusion today about iommu=off/intel_iommu=off:
>
> intel_iommu_init()
> {
> ...
> 	dmar_table_init();
>
> 	disable active iommu translations;
>
> 	if (no_iommu || dmar_disabled)
> 		goto out_free_dmar;
> ...
> }
>
> Any reason not to move the no_iommu check to the beginning of the
> intel_iommu_init function?
>
> Thanks
> Dave

Looks like you could.
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/07/15 at 10:25am, Don Dutile wrote:
> On 05/07/2015 10:00 AM, Dave Young wrote:
> > On 04/07/15 at 10:12am, Don Dutile wrote:
> > > On 04/06/2015 11:46 PM, Dave Young wrote:
> > > > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > > > Hi Dave,
> > > > > > >
> > > > > > > There may be some possibilities that the old iommu data was
> > > > > > > corrupted by some other module. Currently we do not have a better
> > > > > > > solution for the DMAR faults.
> > > > > > >
> > > > > > > But I think when this happens, we need to fix the module that
> > > > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > > > normal kernel: the queue used by the qi_* functions was
> > > > > > > overwritten by another module. The fix was in that module, not in
> > > > > > > the iommu module.
> > > > > >
> > > > > > It is too late; there will be no chance to save the vmcore then.
> > > > > >
> > > > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > > > because of using the old iommu tables, then it will cause more
> > > > > > problems.
> > > > > >
> > > > > > So I think the tables at least need some verification before being
> > > > > > used.
> > > > >
> > > > > Yes, that is good thinking, and verification is also an interesting
> > > > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and
> > > > > then verify it again in purgatory when a panic happens. This checks
> > > > > whether any code stomped into the region reserved for kexec/kdump and
> > > > > corrupted the loaded kernel.
> > > > >
> > > > > If this is decided on, it should be an enhancement to the current
> > > > > patchset, not an approach change. Since this patchset is very close
> > > > > to the point the maintainers expected, maybe it can be merged first
> > > > > and the enhancement considered afterwards. After all, without this
> > > > > patchset vt-d often raised error messages and hung.
> > > >
> > > > It does not convince me; we should do it right at the beginning instead
> > > > of introducing something wrong.
> > > >
> > > > I wonder why the old DMA can not be remapped to a specific page in the
> > > > kdump kernel so that it will not corrupt more memory. But I may have
> > > > missed something; I will look for the old threads and catch up.
> > > >
> > > > Thanks
> > > > Dave
> > >
> > > The (only) issue is not corruption: once the iommu is re-configured, the
> > > old, not-stopped-yet DMA engines will use IOVAs that will generate DMAR
> > > faults, which will be enabled when the iommu is re-configured (even to a
> > > single/simple paging scheme) in the kexec kernel.
> >
> > Don, so if the iommu is not reconfigured, then these faults will not
> > happen?
>
> Well, if the iommu is not reconfigured, then, provided the crash wasn't
> caused by an IOMMU fault (some systems have firmware-first catch the IOMMU
> fault and convert it into an NMI_IOCK), the DMAs will continue into the old
> kernel memory space.

So NMI_IOCK is one reason a kernel can hang. I think I'm still not clear
about what "re-configured" means, though. Originally DMAR faults happen
(that is the old behavior), but we are removing the faults by allowing DMA
to continue into the old memory space.

> > Baoquan and I ran into a confusion today about iommu=off/intel_iommu=off:
> >
> > intel_iommu_init()
> > {
> > ...
> > 	dmar_table_init();
> >
> > 	disable active iommu translations;
> >
> > 	if (no_iommu || dmar_disabled)
> > 		goto out_free_dmar;
> > ...
> > }
> >
> > Any reason not to move the no_iommu check to the beginning of the
> > intel_iommu_init function?
>
> What does that do/help?

I just do not know why the previous handling is necessary with iommu=off;
shouldn't we do nothing and return earlier?

Also, a guess: the DMAR faults appear after iommu_init, so I am not sure
whether the code that runs before the dmar_disabled check has some effect on
enabling the fault messages.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/07/2015 10:00 AM, Dave Young wrote:
> On 04/07/15 at 10:12am, Don Dutile wrote:
> > On 04/06/2015 11:46 PM, Dave Young wrote:
> > > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > > Hi Dave,
> > > > > >
> > > > > > There may be some possibilities that the old iommu data was
> > > > > > corrupted by some other module. Currently we do not have a better
> > > > > > solution for the DMAR faults.
> > > > > >
> > > > > > But I think when this happens, we need to fix the module that
> > > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > > normal kernel: the queue used by the qi_* functions was overwritten
> > > > > > by another module. The fix was in that module, not in the iommu
> > > > > > module.
> > > > >
> > > > > It is too late; there will be no chance to save the vmcore then.
> > > > >
> > > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > > because of using the old iommu tables, then it will cause more
> > > > > problems.
> > > > >
> > > > > So I think the tables at least need some verification before being
> > > > > used.
> > > >
> > > > Yes, that is good thinking, and verification is also an interesting
> > > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and then
> > > > verify it again in purgatory when a panic happens. This checks whether
> > > > any code stomped into the region reserved for kexec/kdump and corrupted
> > > > the loaded kernel.
> > > >
> > > > If this is decided on, it should be an enhancement to the current
> > > > patchset, not an approach change. Since this patchset is very close to
> > > > the point the maintainers expected, maybe it can be merged first and
> > > > the enhancement considered afterwards. After all, without this patchset
> > > > vt-d often raised error messages and hung.
> > >
> > > It does not convince me; we should do it right at the beginning instead
> > > of introducing something wrong.
> > >
> > > I wonder why the old DMA can not be remapped to a specific page in the
> > > kdump kernel so that it will not corrupt more memory. But I may have
> > > missed something; I will look for the old threads and catch up.
> > >
> > > Thanks
> > > Dave
> >
> > The (only) issue is not corruption: once the iommu is re-configured, the
> > old, not-stopped-yet DMA engines will use IOVAs that will generate DMAR
> > faults, which will be enabled when the iommu is re-configured (even to a
> > single/simple paging scheme) in the kexec kernel.
>
> Don, so if the iommu is not reconfigured, then these faults will not happen?

Well, if the iommu is not reconfigured, then, provided the crash wasn't
caused by an IOMMU fault (some systems have firmware-first catch the IOMMU
fault and convert it into an NMI_IOCK), the DMAs will continue into the old
kernel memory space.

> Baoquan and I ran into a confusion today about iommu=off/intel_iommu=off:
>
> intel_iommu_init()
> {
> ...
> 	dmar_table_init();
>
> 	disable active iommu translations;
>
> 	if (no_iommu || dmar_disabled)
> 		goto out_free_dmar;
> ...
> }
>
> Any reason not to move the no_iommu check to the beginning of the
> intel_iommu_init function?

What does that do/help?

> Thanks
> Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/07/15 at 10:12am, Don Dutile wrote:
> On 04/06/2015 11:46 PM, Dave Young wrote:
> > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > Hi Dave,
> > > > >
> > > > > There may be some possibilities that the old iommu data was corrupted
> > > > > by some other module. Currently we do not have a better solution for
> > > > > the DMAR faults.
> > > > >
> > > > > But I think when this happens, we need to fix the module that
> > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > normal kernel: the queue used by the qi_* functions was overwritten
> > > > > by another module. The fix was in that module, not in the iommu
> > > > > module.
> > > >
> > > > It is too late; there will be no chance to save the vmcore then.
> > > >
> > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > because of using the old iommu tables, then it will cause more
> > > > problems.
> > > >
> > > > So I think the tables at least need some verification before being
> > > > used.
> > >
> > > Yes, that is good thinking, and verification is also an interesting
> > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and then
> > > verify it again in purgatory when a panic happens. This checks whether
> > > any code stomped into the region reserved for kexec/kdump and corrupted
> > > the loaded kernel.
> > >
> > > If this is decided on, it should be an enhancement to the current
> > > patchset, not an approach change. Since this patchset is very close to
> > > the point the maintainers expected, maybe it can be merged first and the
> > > enhancement considered afterwards. After all, without this patchset vt-d
> > > often raised error messages and hung.
> >
> > It does not convince me; we should do it right at the beginning instead of
> > introducing something wrong.
> >
> > I wonder why the old DMA can not be remapped to a specific page in the
> > kdump kernel so that it will not corrupt more memory. But I may have
> > missed something; I will look for the old threads and catch up.
> >
> > Thanks
> > Dave
>
> The (only) issue is not corruption: once the iommu is re-configured, the
> old, not-stopped-yet DMA engines will use IOVAs that will generate DMAR
> faults, which will be enabled when the iommu is re-configured (even to a
> single/simple paging scheme) in the kexec kernel.

Don, so if the iommu is not reconfigured, then these faults will not happen?

Baoquan and I ran into a confusion today about iommu=off/intel_iommu=off:

intel_iommu_init()
{
...
	dmar_table_init();

	disable active iommu translations;

	if (no_iommu || dmar_disabled)
		goto out_free_dmar;
...
}

Any reason not to move the no_iommu check to the beginning of the
intel_iommu_init function?

Thanks
Dave
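The ordering question can be made concrete with a small userspace simulation. This is not the real driver code: the flag names mirror the kernel's no_iommu/dmar_disabled, but dmar_table_init() is a stub and the early-check variant is only the proposed reordering, not an accepted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the intel_iommu_init() ordering question.  The names
 * mirror the kernel's flags, but everything here is a stub. */
static bool no_iommu, dmar_disabled;
static int tables_parsed;           /* records whether dmar_table_init() ran */

static void dmar_table_init(void) { tables_parsed = 1; }

/* Current ordering: parse the DMAR tables (and, in the real driver,
 * disable any active translations) before checking the disable flags. */
static int intel_iommu_init_current(void)
{
    dmar_table_init();
    /* "disable active iommu translations" would happen here */
    if (no_iommu || dmar_disabled)
        return -1;                  /* goto out_free_dmar */
    return 0;
}

/* Proposed reordering: bail out first and touch nothing. */
static int intel_iommu_init_early_check(void)
{
    if (no_iommu || dmar_disabled)
        return -1;
    dmar_table_init();
    return 0;
}
```

One possible answer to "why not" is that even with intel_iommu=off the driver may still want the parsed DMAR tables and the chance to shut down translations the old kernel left enabled; that is speculation here, not something the thread settles.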
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/04/15 at 01:05pm, Joerg Roedel wrote:
> On Fri, Apr 03, 2015 at 04:40:31PM +0800, Dave Young wrote:
> > Have not read all the patches, but I have a question; not sure this has
> > been answered before. Old memory is not reliable; what if the old memory
> > got corrupted before the panic? Is it safe to continue using it in the
> > 2nd kernel? I worry that it will cause problems.
>
> Yes, the old memory could be corrupted, and there are more failure cases
> left which we have no way of handling yet (if iommu data structures are in
> kdump backup areas).
>
> The question is what to do if we find some of the old data structures
> corrupted, and how far the tests should go. Should we also check the
> page-tables, for example? I think if some of the data structures for a
> device are corrupted it probably already failed in the old kernel, and
> things won't get worse in the new one.

Joerg, I can not find your last reply, so I am just replying here about my
worries.

I said that the patchset will cause more problems; let me explain more about
it here:

Suppose the page table was corrupted, i.e. the original mapping
iova1 -> page 1 was accidentally changed to iova1 -> page 2 while the crash
was happening; thus future DMA will read/write page 2 instead of page 1,
right?

So the behavior changes like this: originally, DMAR faults happen, but the
kdump kernel may still boot with these faults and the vmcore can be saved;
with the patchset, the DMAR faults do not happen and the DMA translation
will be handled, but a DMA write could corrupt unrelated memory.

This might be a corner case, but who knows; because the kernel panicked we
can not assume the old page table is right.

It seems you all think it is safe, but let us understand each other first
and then settle on a solution. Today I talked with Zhenhua about the
problem; I think both of us are now clear about the problems. He just thinks
it can be left as future work.

Thanks
Dave
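The corruption scenario Dave describes can be sketched in a few lines. This is a deliberately tiny single-level model with made-up names (real VT-d tables are multi-level), only meant to show why software cannot notice the misdirected write.

```c
#include <assert.h>
#include <string.h>

/* Toy single-level "IOMMU page table": pt[iova] gives a page index.
 * A DMA write goes wherever the table points; nothing in the data path
 * can tell that the entry was corrupted during the crash. */
enum { PAGES = 4, PAGE_SIZE = 16 };
static char mem[PAGES][PAGE_SIZE];
static int pt[1];

static void dma_write(int iova, const char *data)
{
    memcpy(mem[pt[iova]], data, strlen(data) + 1);
}
```

With pt[0] = 1 the write lands in page 1; after the entry is flipped to 2, the same device write silently lands in page 2, which is exactly the "corrupt unrelated memory" behavior described above.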
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/06/15 at 10:16am, Joerg Roedel wrote:
> On Wed, May 06, 2015 at 09:46:49AM +0800, Dave Young wrote:
> > For the original problem, the key issue is that DMAR faults cause the
> > kdump kernel to hang so that the vmcore can not be saved. I do not know
> > the reason why it hangs; I think it would be acceptable if the kdump
> > kernel booted fine with some DMA errors.
>
> It hangs because some devices can't handle the DMAR faults, so the kdump
> kernel can't initialize them and will hang itself. For that it doesn't
> matter whether the fault was caused by a read or a write request.

Ok, thanks for the explanation. That explains why the kdump kernel sometimes
boots fine with faults and sometimes hangs instead.

Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On Wed, May 06, 2015 at 09:46:49AM +0800, Dave Young wrote:
> For the original problem, the key issue is that DMAR faults cause the kdump
> kernel to hang so that the vmcore can not be saved. I do not know the
> reason why it hangs; I think it would be acceptable if the kdump kernel
> booted fine with some DMA errors.

It hangs because some devices can't handle the DMAR faults, so the kdump
kernel can't initialize them and will hang itself. For that it doesn't
matter whether the fault was caused by a read or a write request.

	Joerg
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/05/15 at 05:23pm, Joerg Roedel wrote:
> On Tue, May 05, 2015 at 02:09:31PM +0800, Dave Young wrote:
> > I agree that we can do nothing with the old corrupted data, but I worry
> > about future corruption from using the old corrupted data. I wonder if we
> > can mark all of the oldmem as read-only so that we can lower the risk. Is
> > that reasonable?
>
> Do you mean marking it read-only for the devices? That will very likely
> cause DMAR faults, re-introducing the problem this patch-set tries to fix.

I mean to block all DMA writes to oldmem; I believe that will cause DMA
errors, but all other DMA read requests will continue to work. This would
avoid possible future corruption. It would solve at least half of the
problem, wouldn't it?

For the original problem, the key issue is that DMAR faults cause the kdump
kernel to hang so that the vmcore can not be saved. I do not know the reason
why it hangs; I think it would be acceptable if the kdump kernel booted fine
with some DMA errors.

Thanks
Dave
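Blocking DMA writes while keeping reads amounts to clearing the write-permission bit in every old page-table entry. A minimal sketch, assuming a flat array of VT-d-style PTEs: the bit positions follow the driver's DMA_PTE_READ (bit 0) and DMA_PTE_WRITE (bit 1), but the walk over a real multi-level table is omitted and make_oldmem_readonly is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_PTE_READ  (1ULL << 0)   /* bit layout as in the intel-iommu driver */
#define DMA_PTE_WRITE (1ULL << 1)

/* Clear write permission on every present entry so device reads still
 * translate but device writes fault instead of landing in oldmem. */
static void make_oldmem_readonly(uint64_t *pte, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pte[i] & DMA_PTE_READ)  /* skip non-present entries */
            pte[i] &= ~DMA_PTE_WRITE;
}
```

As Joerg points out above, every blocked write still raises a DMAR fault, so this trades memory safety for exactly the fault storm the patchset is trying to avoid.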
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On Tue, May 05, 2015 at 02:09:31PM +0800, Dave Young wrote:
> I agree that we can do nothing with the old corrupted data, but I worry
> about future corruption from using the old corrupted data. I wonder if we
> can mark all of the oldmem as read-only so that we can lower the risk. Is
> that reasonable?

Do you mean marking it read-only for the devices? That will very likely
cause DMAR faults, re-introducing the problem this patch-set tries to fix.

	Joerg
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/04/15 at 01:05pm, Joerg Roedel wrote:
> On Fri, Apr 03, 2015 at 04:40:31PM +0800, Dave Young wrote:
> > Have not read all the patches, but I have a question; not sure this has
> > been answered before. Old memory is not reliable; what if the old memory
> > got corrupted before the panic? Is it safe to continue using it in the
> > 2nd kernel? I worry that it will cause problems.
>
> Yes, the old memory could be corrupted, and there are more failure cases
> left which we have no way of handling yet (if iommu data structures are in
> kdump backup areas).
>
> The question is what to do if we find some of the old data structures
> corrupted, and how far the tests should go. Should we also check the
> page-tables, for example? I think if some of the data structures for a
> device are corrupted it probably already failed in the old kernel, and
> things won't get worse in the new one.

I agree that we can do nothing with the old corrupted data, but I worry
about future corruption from using the old corrupted data. I wonder if we
can mark all of the oldmem as read-only so that we can lower the risk. Is
that reasonable?

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 05/04/2015 07:05 AM, Joerg Roedel wrote:
> On Fri, Apr 03, 2015 at 04:40:31PM +0800, Dave Young wrote:
> > Have not read all the patches, but I have a question; not sure this has
> > been answered before. Old memory is not reliable; what if the old memory
> > got corrupted before the panic? Is it safe to continue using it in the
> > 2nd kernel? I worry that it will cause problems.
>
> Yes, the old memory could be corrupted, and there are more failure cases
> left which we have no way of handling yet (if iommu data structures are in
> kdump backup areas).
>
> The question is what to do if we find some of the old data structures
> corrupted, and how far the tests should go. Should we also check the
> page-tables, for example? I think if some of the data structures for a
> device are corrupted it probably already failed in the old kernel, and
> things won't get worse in the new one.
>
> So checking is not strictly necessary in the first version of these patches
> (unless we find a valid failure scenario). Once we have a good plan for
> what to do if we find corruption, we can of course add checking.
>
> Regards,
> Joerg

Agreed. This is a significant improvement over what we (don't) have.

Corruption related to the IOMMU must occur within the host, and it must be
software corruption, b/c the IOMMU inherently protects itself by protecting
all of memory from errant DMAs. Therefore, if the only IOMMU corruptor is in
the host, it's likely the entire host kernel crash dump will either be
useless, or corrupted by the security breach, at which point this is just
another scenario of a failed crash dump that will never be taken.

The kernel can't protect the mapping tables, which are the most likely area
to be corrupted, b/c it'd (minimally) have to be per-device (to avoid
locking & coherency issues), and would require significant overhead to
keep/update a checksum-like scheme on (potentially) 4 levels of page tables.
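For scale, the checksum idea Don is weighing is the same pattern kexec/kdump already uses for the loaded kernel: hash the protected region at setup time, re-hash at crash time, and refuse to trust it on mismatch. A sketch with FNV-1a standing in for the sha256 that kexec actually uses; table_hash and the flat byte region are illustrative stand-ins, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, standing in for sha256: hash the table pages once when they
 * are set up, and compare at crash time before reusing them.  A mismatch
 * means something stomped on the tables after the hash was taken. */
static uint64_t table_hash(const uint8_t *p, size_t len)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV-1a prime */
    }
    return h;
}
```

Don's overhead objection shows up immediately in this model: every legitimate map/unmap in the old kernel would have to update the stored hash, across up to four table levels, per device.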
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On Fri, Apr 03, 2015 at 04:40:31PM +0800, Dave Young wrote:
> Have not read all the patches, but I have a question; not sure this has
> been answered before. Old memory is not reliable; what if the old memory
> got corrupted before the panic? Is it safe to continue using it in the 2nd
> kernel? I worry that it will cause problems.

Yes, the old memory could be corrupted, and there are more failure cases
left which we have no way of handling yet (if iommu data structures are in
kdump backup areas).

The question is what to do if we find some of the old data structures
corrupted, and how far the tests should go. Should we also check the
page-tables, for example? I think if some of the data structures for a
device are corrupted it probably already failed in the old kernel, and
things won't get worse in the new one.

So checking is not strictly necessary in the first version of these patches
(unless we find a valid failure scenario). Once we have a good plan for what
to do if we find corruption, we can of course add checking.

Regards,
	Joerg
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/07/15 at 05:55pm, Li, ZhenHua wrote:
> On 04/07/2015 05:08 PM, Dave Young wrote:
> > On 04/07/15 at 11:46am, Dave Young wrote:
> > > On 04/05/15 at 09:54am, Baoquan He wrote:
> > > > On 04/03/15 at 05:21pm, Dave Young wrote:
> > > > > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > > > > > Hi Dave,
> > > > > >
> > > > > > There may be some possibilities that the old iommu data was
> > > > > > corrupted by some other module. Currently we do not have a better
> > > > > > solution for the DMAR faults.
> > > > > >
> > > > > > But I think when this happens, we need to fix the module that
> > > > > > corrupted the old iommu data. I once met a similar problem in a
> > > > > > normal kernel: the queue used by the qi_* functions was overwritten
> > > > > > by another module. The fix was in that module, not in the iommu
> > > > > > module.
> > > > >
> > > > > It is too late; there will be no chance to save the vmcore then.
> > > > >
> > > > > Also, if it is possible to keep corrupting other areas of oldmem
> > > > > because of using the old iommu tables, then it will cause more
> > > > > problems.
> > > > >
> > > > > So I think the tables at least need some verification before being
> > > > > used.
> > > >
> > > > Yes, that is good thinking, and verification is also an interesting
> > > > idea. kexec/kdump do a sha256 calculation on the loaded kernel and then
> > > > verify it again in purgatory when a panic happens. This checks whether
> > > > any code stomped into the region reserved for kexec/kdump and corrupted
> > > > the loaded kernel.
> > > >
> > > > If this is decided on, it should be an enhancement to the current
> > > > patchset, not an approach change. Since this patchset is very close to
> > > > the point the maintainers expected, maybe it can be merged first and
> > > > the enhancement considered afterwards. After all, without this patchset
> > > > vt-d often raised error messages and hung.
> > >
> > > It does not convince me; we should do it right at the beginning instead
> > > of introducing something wrong.
> > >
> > > I wonder why the old DMA can not be remapped to a specific page in the
> > > kdump kernel so that it will not corrupt more memory. But I may have
> > > missed something; I will look for the old threads and catch up.
> >
> > I have read the old discussion; that approach was dropped because it could
> > corrupt the filesystem. Apologies for commenting late.
> >
> > But the current solution sounds bad to me because it uses old memory,
> > which is not reliable.
> >
> > Thanks
> > Dave
>
> Seems we do not have a better solution for the DMAR faults. But I believe
> we can find out how to verify the iommu data which is located in old
> memory.

That will be great, thanks.

So there are two things:

1) Make sure the old page tables are right; this is what we were talking
about.

2) Avoid writing old memory. I suppose only a DMA read could corrupt the
filesystem, right? So how about creating a scratch page in 2nd-kernel memory
for any DMA writes, and only using the old page tables for DMA reads.

Thanks
Dave
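Dave's scratch-page idea can be sketched on a flat array of PTE-like entries (hypothetical names; real VT-d tables are multi-level and may use superpages): writable entries are re-pointed at one page owned by the kdump kernel, while read-only entries keep their old target so in-flight reads still see correct data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_READ   (1ULL << 0)
#define PTE_WRITE  (1ULL << 1)
#define ADDR_MASK  (~0xFFFULL)      /* 4 KiB page-frame bits */

/* Re-point every writable entry at a single scratch page so stray device
 * writes can no longer corrupt oldmem; leave read-only entries alone so
 * device reads still return the old kernel's data. */
static void redirect_writes(uint64_t *pte, size_t n, uint64_t scratch_pa)
{
    for (size_t i = 0; i < n; i++)
        if (pte[i] & PTE_WRITE)
            pte[i] = (pte[i] & ~ADDR_MASK) | (scratch_pa & ADDR_MASK);
}
```

One shared scratch page means concurrent writes from different devices land on top of each other, but since the data is discarded anyway, the only goal is keeping it out of real memory. No faults are raised, which is what distinguishes this from the read-only approach discussed earlier in the thread.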
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/06/2015 11:46 PM, Dave Young wrote:
> [snip]
>
> It does not convince me, we should do it right at the beginning instead of introducing something wrong.
>
> I wonder why the old dma can not be remapped to a specific page in the kdump kernel so that it will not corrupt more memory. But I may have missed something, I will look for the old threads and catch up.
>
> Thanks
> Dave

The (only) issue is not corruption, but once the iommu is re-configured, the old, not-yet-stopped dma engines will use iovas that will generate dmar faults, which will be enabled when the iommu is re-configured (even to a single/simple paging scheme) in the kexec kernel.
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/07/2015 05:08 PM, Dave Young wrote:
> [snip]
>
> I have read the old discussion, the above way was dropped because it could corrupt the filesystem. Apologies for the late comment.
>
> But the current solution sounds bad to me because it uses old memory which is not reliable.
>
> Thanks
> Dave

Seems we do not have a better solution for the dmar faults. But I believe we can find out how to verify the iommu data which is located in old memory.

Thanks
Zhenhua
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/07/15 at 11:46am, Dave Young wrote:
> [snip]
>
> It does not convince me, we should do it right at the beginning instead of introducing something wrong.
>
> I wonder why the old dma can not be remapped to a specific page in the kdump kernel so that it will not corrupt more memory. But I may have missed something, I will look for the old threads and catch up.

I have read the old discussion, the above way was dropped because it could corrupt the filesystem. Apologies for the late comment.

But the current solution sounds bad to me because it uses old memory which is not reliable.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/05/15 at 09:54am, Baoquan He wrote:
> [snip]
>
> Yes, it's good thinking about this and verification is also an interesting idea. kexec/kdump do a sha256 calculation on the loaded kernel and then verify it again when panic happens in purgatory. This checks whether any code stomps into the region reserved for the kexec kernel and corrupts the loaded kernel.
>
> If this is decided it should be an enhancement to the current patchset, not an approach change. Since this patchset is very close to the point the maintainers expected, maybe it can be merged first, and then we can think about enhancements. After all, without this patchset vt-d often raised error messages and hung.

It does not convince me, we should do it right at the beginning instead of introducing something wrong.

I wonder why the old dma can not be remapped to a specific page in the kdump kernel so that it will not corrupt more memory. But I may have missed something, I will look for the old threads and catch up.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/03/15 at 02:05pm, Li, Zhen-Hua wrote:
> The hardware will do some verification, but not completely. If people think the OS should also do this, then it should be another patchset, I think.

If there is a chance of corrupting more memory, I think it is not the right way. We should think about a better solution instead of fixing it later.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/03/15 at 05:21pm, Dave Young wrote:
> On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > Hi Dave,
> >
> > There may be some possibilities that the old iommu data is corrupted by some other modules. Currently we do not have a better solution for the dmar faults.
> >
> > But I think when this happens, we need to fix the module that corrupted the old iommu data. I once met a similar problem in a normal kernel, the queue used by the qi_* functions was written again by another module. The fix was in that module, not in the iommu module.
>
> It is too late, there will be no chance to save the vmcore then.
>
> Also if it is possible to continue to corrupt other areas of oldmem because of using the old iommu tables then it will cause more problems.
>
> So I think the tables at least need some verification before being used.

Yes, it's good thinking about this and verification is also an interesting idea. kexec/kdump do a sha256 calculation on the loaded kernel and then verify it again when panic happens in purgatory. This checks whether any code stomps into the region reserved for the kexec kernel and corrupts the loaded kernel.

If this is decided it should be an enhancement to the current patchset, not an approach change. Since this patchset is very close to the point the maintainers expected, maybe it can be merged first, and then we can think about enhancements. After all, without this patchset vt-d often raised error messages and hung.

By the way, I tested this patchset and it works very well on my HP z420 workstation.

Thanks
Baoquan
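[Editorial note: the verify-before-use pattern Baoquan describes can be sketched as below. This is a hedged illustration — a toy FNV-1a digest stands in for the sha256 used by kexec/kdump purgatory, and the struct and function names are hypothetical.]

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Toy digest standing in for the sha256 used by kexec/kdump purgatory. */
static uint64_t fnv1a(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Record a digest when the region is set up, re-check before reuse.
 * Applied to old iommu tables, a mismatch would mean "do not trust
 * the old tables" rather than silently reusing corrupted mappings. */
struct guarded_region {
    const void *base;
    size_t len;
    uint64_t digest;
};

static void guard_region(struct guarded_region *g, const void *base, size_t len)
{
    g->base = base;
    g->len = len;
    g->digest = fnv1a(base, len);
}

static int region_intact(const struct guarded_region *g)
{
    return fnv1a(g->base, g->len) == g->digest;
}
```

The catch, as noted elsewhere in the thread, is that the old kernel would have to compute and store the digest before crashing, and the hardware keeps walking the tables in the meantime.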
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> Hi Dave,
>
> There may be some possibilities that the old iommu data is corrupted by some other modules. Currently we do not have a better solution for the dmar faults.
>
> But I think when this happens, we need to fix the module that corrupted the old iommu data. I once met a similar problem in a normal kernel, the queue used by the qi_* functions was written again by another module. The fix was in that module, not in the iommu module.

It is too late, there will be no chance to save the vmcore then.

Also if it is possible to continue to corrupt other areas of oldmem because of using the old iommu tables then it will cause more problems.

So I think the tables at least need some verification before being used.

> [snip]
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 03/19/15 at 01:36pm, Li, Zhen-Hua wrote:
> This patchset is an update of Bill Sumner's patchset, and implements a fix for:
> If a kernel boots with intel_iommu=on on a system that supports intel vt-d, when a panic happens, the kdump kernel will boot with these faults:
>
> dmar: DRHD: handling fault status reg 102
> dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr fff8
> DMAR:[fault reason 01] Present bit in root entry is clear
>
> dmar: DRHD: handling fault status reg 2
> dmar: INTR-REMAP: Request device [61:00.0] fault index 42
> INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
>
> On some systems, the interrupt remapping fault will also happen even if intel_iommu is not set to on, because interrupt remapping will be enabled when x2apic is needed by the system.
>
> The cause of the DMA fault is described in Bill's original version, and the INTR-REMAP fault is caused by a similar reason. In short, the initialization of the vt-d drivers causes the in-flight DMA and interrupt requests to get a wrong response.
>
> To fix this problem, we modify the behavior of the intel vt-d in the crashdump kernel:
>
> For DMA Remapping:
> 1. Accept the vt-d hardware in an active state,
> 2. Do not disable and re-enable the translation, keep it enabled.
> 3. Use the old root entry table, do not rewrite the RTA register.
> 4. Malloc and use new context entry tables, copy data from the old ones that were used by the old kernel.
> 5. Keep using the old page tables before the driver is loaded.
> 6. After the device driver is loaded, when it issues the first dma_map command, free the dmar_domain structure for this device and generate a new one, so that the device can be assigned a new and empty page table.
> 7. When a new context entry table is generated, also save its address to the old root entry table.
>
> For Interrupt Remapping:
> 1. Accept the vt-d hardware in an active state,
> 2. Do not disable and re-enable the interrupt remapping, keep it enabled.
> 3. Use the old interrupt remapping table, do not rewrite the IRTA register.
> 4. When an ioapic entry is set up, the interrupt remapping table is changed, and the updated data will be stored to the old interrupt remapping table.
>
> Advantages of this approach:
> 1. All manipulation of the IO-device is done by the Linux device-driver for that device.
> 2. This approach behaves in a manner very similar to operation without an active iommu.
> 3. Any activity between the IO-device and its RMRR areas is handled by the device-driver in the same manner as during a non-kdump boot.
> 4. If an IO-device has no driver in the kdump kernel, it is simply left alone. This supports the practice of creating a special kdump kernel without drivers for any devices that are not required for taking a crashdump.
> 5. Minimal code changes among the existing mainline intel vt-d code.
>
> Summary of changes in this patch set:
> 1. Added some useful functions for the root entry table in intel-iommu.c
> 2. Added new members to struct root_entry and struct irte;
> 3. Functions to load the old root entry table to iommu->root_entry from the memory of the old kernel.
> 4. Functions to malloc new context entry tables and copy the data from the old ones to the malloced new ones.
> 5. Functions to enable support for DMA remapping in the kdump kernel.
> 6. Functions to load old irte data from the old kernel to the kdump kernel.
> 7. Some code changes that support the other behaviors that have been listed.
> 8. In the new functions, use physical addresses as "unsigned long" type, not pointers.
>
> Original version by Bill Sumner:
> https://lkml.org/lkml/2014/1/10/518
> https://lkml.org/lkml/2014/4/15/716
> https://lkml.org/lkml/2014/4/24/836
>
> Zhenhua's updates:
> https://lkml.org/lkml/2014/10/21/134
> https://lkml.org/lkml/2014/12/15/121
> https://lkml.org/lkml/2014/12/22/53
> https://lkml.org/lkml/2015/1/6/1166
> https://lkml.org/lkml/2015/1/12/35
>
> Changelog[v9]:
> 1. Add new function iommu_attach_domain_with_id.
> 2. Do not copy old page tables, keep using the old ones.
> 3. Remove functions:
>    intel_iommu_did_to_domain_values_entry
>    intel_iommu_get_dids_from_old_kernel
>    device_to_domain_id
>    copy_page_addr
>    copy_page_table
>    copy_context_entry
>    copy_context_entry_table
> 4. Add new function device_to_existing_context_entry.
>
> Changelog[v8]:
> 1. Add a missing __iommu_flush_cache in function copy_page_table.
>
> Changelog[v7]:
> 1. Use __iommu_flush_cache to flush the data to hardware.
>
> Changelog[v6]:
> 1. Use "unsigned long" as the type of physical addresses.
> 2. Use new function unmap_device_dma to unmap the old dma.
> 3. Fix some small incorrect bit orders for the aw shift.
>
> Changelog[v5]:
> 1. Do not disable a
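[Editorial note: step 4 of the DMA-remapping plan above (malloc new context entry tables and copy the old data) can be sketched as a user-space simulation. The struct layouts below are deliberately simplified stand-ins, not the real VT-d entries; the real code accesses the old kernel's tables through the physical addresses held in the old root entry table.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the VT-d root/context entries (the real
 * structures in the kernel are defined differently and hold physical
 * addresses, not pointers). */
struct ctx_entry  { uint64_t lo, hi; };
struct root_entry { struct ctx_entry *ctx; };

/* Allocate a fresh context table owned by the kdump kernel and copy
 * the old kernel's entries into it, then point the root entry at the
 * new table (step 7 of the plan): in-flight DMA keeps resolving, but
 * the memory backing the table now belongs to the new kernel. */
static struct ctx_entry *copy_context_table(struct root_entry *re,
                                            size_t nr_entries)
{
    struct ctx_entry *new_ctx = calloc(nr_entries, sizeof(*new_ctx));
    if (!new_ctx)
        return NULL;
    memcpy(new_ctx, re->ctx, nr_entries * sizeof(*new_ctx));
    re->ctx = new_ctx;   /* save the new table's address in the root entry */
    return new_ctx;
}
```

In the real driver this copy would also need a cache flush (cf. the `__iommu_flush_cache` fix in Changelog[v8]) so the hardware sees the new table contents.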
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
> To fix this problem, we modify the behavior of the intel vt-d in the crashdump kernel:
>
> For DMA Remapping:
> 1. Accept the vt-d hardware in an active state,
> 2. Do not disable and re-enable the translation, keep it enabled.
> 3. Use the old root entry table, do not rewrite the RTA register.
> 4. Malloc and use new context entry tables, copy data from the old ones that were used by the old kernel.

I have not read all the patches, but I have a question; not sure whether this has been answered before. Old memory is not reliable, so what if the old memory gets corrupted before the panic? Is it safe to continue using it in the 2nd kernel? I worry that it will cause problems.

Hope I'm wrong though.

Thanks
Dave
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
The hardware will do some verification, but not completely. If people think the OS should also do this, then it should be another patchset, I think.

Thanks
Zhenhua

> On Apr 3, 2015, at 17:21, Dave Young wrote:
>
> > On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> > [snip]
>
> It is too late, there will be no chance to save the vmcore then.
>
> Also if it is possible to continue to corrupt other areas of oldmem because of using the old iommu tables then it will cause more problems.
>
> So I think the tables at least need some verification before being used.
>
> [snip]
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Hi Dave,

There may be some possibilities that the old iommu data is corrupted by some other modules. Currently we do not have a better solution for the dmar faults.

But I think when this happens, we need to fix the module that corrupted the old iommu data. I once met a similar problem in a normal kernel, the queue used by the qi_* functions was written again by another module. The fix was in that module, not in the iommu module.

Thanks
Zhenhua

On 04/03/2015 04:40 PM, Dave Young wrote:
> [snip]
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
On 04/03/2015 04:28 PM, Dave Young wrote:
> On 03/19/15 at 01:36pm, Li, Zhen-Hua wrote:
> > This patchset is an update of Bill Sumner's patchset, and implements a fix for:
> > If a kernel boots with intel_iommu=on on a system that supports intel vt-d, when a panic happens, the kdump kernel will boot with these faults:
>
> Zhenhua, I will review the patchset soon, sorry for jumping in late.
>
> Thanks
> Dave

Hi Dave,

Thanks for your review. And please also take a look at the plan I sent in another mail: change the use of

    if (is_kdump_kernel()) { }

to:

    if (iommu_enabled_in_last_kernel) { }
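[Editorial note: a minimal sketch of what an iommu_enabled_in_last_kernel-style check could look like, assuming it reads the translation-enable status (TES) bit from the VT-d global status register. The MMIO register set is simulated with a plain buffer, and the function name follows Zhenhua's proposal rather than any merged code.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* VT-d global status register: TES (translation enable status) is
 * bit 31 of GSTS, at offset 0x1c in the register set, per the VT-d
 * specification. Simulated here with a plain buffer instead of MMIO. */
#define DMAR_GSTS_REG  0x1c
#define DMA_GSTS_TES   (1U << 31)

/* The idea behind the rename: trust the hardware state, not just the
 * boot mode -- old tables only need to be reused when the previous
 * kernel actually left translation enabled. */
static int iommu_enabled_in_last_kernel(const uint8_t *regs)
{
    uint32_t gsts;
    memcpy(&gsts, regs + DMAR_GSTS_REG, sizeof(gsts));
    return !!(gsts & DMA_GSTS_TES);
}
```

This also covers the case mentioned in the cover letter where interrupt remapping is active even without intel_iommu=on, since the decision keys off hardware state rather than the command line.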
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Hi Joerg,

This is quite strange. I checked the patches from patch 01 to 10 using ./scripts/checkpatch.pl under the kernel source directory, but got 0 errors and 0 warnings. There are only some white spaces in cover letter 00, but it could not be checked by this script.

However, I checked intel-iommu.c using "checkpatch.pl -f" and found too many warnings and errors. Maybe we need a new patch to fix them.

Thanks
Zhenhua

On 04/02/2015 07:11 PM, Joerg Roedel wrote:
> Hi Zhen-Hua,
>
> On Thu, Mar 19, 2015 at 01:36:18PM +0800, Li, Zhen-Hua wrote:
> > This patchset is an update of Bill Sumner's patchset, and implements a fix for:
> > If a kernel boots with intel_iommu=on on a system that supports intel vt-d, when a panic happens, the kdump kernel will boot with these faults:
>
> I reviewed this patch-set and it is getting closer to a point where it could be merged. I found a few white-space errors in the review, please do a checkpatch run on the next round and fix these.
>
> Besides that, and given some third-party testing and reviews, I think we can look forward to merging it early after the merge window for v4.1, to give it enough testing in -next too.
>
> Joerg
Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Hi Zhen-Hua,

On Thu, Mar 19, 2015 at 01:36:18PM +0800, Li, Zhen-Hua wrote:
> This patchset is an update of Bill Sumner's patchset, and implements a fix for:
> If a kernel boots with intel_iommu=on on a system that supports intel vt-d, when a panic happens, the kdump kernel will boot with these faults:

I reviewed this patch-set and it is getting closer to a point where it could be merged. I found a few white-space errors in the review, please do a checkpatch run on the next round and fix these.

Besides that, and given some third-party testing and reviews, I think we can look forward to merging it early after the merge window for v4.1, to give it enough testing in -next too.

Joerg