Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote: > On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote: >> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long >> error_code) >> */ >> int lmce = 1; >> >> -/

[PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-19 Thread Xunlei Pang
://patchwork.kernel.org/patch/6167631/ https://lists.gt.net/linux/kernel/2146557 Cc: Naoya Horiguchi <n-horigu...@ah.jp.nec.com> Suggested-by: Borislav Petkov <b...@alien8.de> Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- v1->v2: Using crashing_cpu according to Borislav's suggestion.

[PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-19 Thread Xunlei Pang
://patchwork.kernel.org/patch/6167631/ https://lists.gt.net/linux/kernel/2146557 Cc: Naoya Horiguchi Suggested-by: Borislav Petkov Signed-off-by: Xunlei Pang --- v1->v2: Using crashing_cpu according to Borislav's suggestion. arch/x86/include/asm/reboot.h| 1 + arch/x86/kernel/cpu/mcheck/mce.c |

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-02-17 Thread Xunlei Pang
On 02/17/2017 at 05:07 PM, Borislav Petkov wrote: > On Fri, Feb 17, 2017 at 09:53:21AM +0800, Xunlei Pang wrote: >> It changes the value of cpu_online_mask/etc which will cause confusion to >> vmcore analysis. > Then export the crashing_cpu variable, initialize it to s

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-02-16 Thread Xunlei Pang
On 02/16/2017 at 08:22 PM, Borislav Petkov wrote: > On Thu, Feb 16, 2017 at 07:52:09PM +0800, Xunlei Pang wrote: >> then mce will be broadcast to the other cpus which are still running >> in the first kernel(i.e. looping in crash_nmi_callback). > Simple: the crash code

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-02-16 Thread Xunlei Pang
On 02/16/2017 at 06:18 PM, Borislav Petkov wrote: > On Thu, Feb 16, 2017 at 01:36:37PM +0800, Xunlei Pang wrote: >> I tried to use qemu to inject SRAO("mce -b 0 0 0xb100 0x5 0x0 >> 0x0"), >> it works well in 1st kernel, but it doesn't work for 1st ker

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-02-15 Thread Xunlei Pang
On 01/26/2017 at 02:44 PM, Borislav Petkov wrote: > On Thu, Jan 26, 2017 at 02:30:02PM +0800, Xunlei Pang wrote: >> The hardware machine check is hard to reproduce, but the mce code of >> RHEL7 is quite the same as that of tip/master, anyway we are able to >> inject soft

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-25 Thread Xunlei Pang
On 01/24/2017 at 08:22 PM, Borislav Petkov wrote: > On Tue, Jan 24, 2017 at 09:27:45AM +0800, Xunlei Pang wrote: >> It occurred on real hardware when testing crash dump. >> >> 1) SysRq-c was injected for the test in 1st kernel >> [ 49.897279] SysRq : Trigger a cras

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
On 01/24/2017 at 02:14 AM, Borislav Petkov wrote: > On Mon, Jan 23, 2017 at 10:01:53AM -0800, Luck, Tony wrote: >> will ignore the machine check on the other cpus ... assuming >> that "cpu_is_offline(smp_processor_id())" does the right thing >> in the kexec case where this is an "old" cpu that

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
On 01/24/2017 at 09:46 AM, Xunlei Pang wrote: > On 01/24/2017 at 01:51 AM, Borislav Petkov wrote: >> Hey Tony, >> >> a "welcome back" is in order? :-) >> >> On Mon, Jan 23, 2017 at 09:40:09AM -0800, Luck, Tony wrote: >>> If the system had e

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
On 01/24/2017 at 01:51 AM, Borislav Petkov wrote: > Hey Tony, > > a "welcome back" is in order? :-) > > On Mon, Jan 23, 2017 at 09:40:09AM -0800, Luck, Tony wrote: >> If the system had experienced some memory corruption, but >> recovered ... then there would be some pages sitting around >> that

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
On 01/23/2017 at 10:50 PM, Borislav Petkov wrote: > On Mon, Jan 23, 2017 at 09:35:53PM +0800, Xunlei Pang wrote: >> One possible timing sequence would be: >> 1st kernel running on multiple cpus panicked >> then the crash dump code starts >> the crash dump code s

Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
On 01/23/2017 at 08:51 PM, Borislav Petkov wrote: > On Mon, Jan 23, 2017 at 04:01:51PM +0800, Xunlei Pang wrote: >> We met an issue for kdump: after kdump kernel boots up, >> and there comes a broadcasted mce in first kernel, the > How does that even happen? > >

Re: [PATCH v2] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-23 Thread Xunlei Pang
On 01/23/2017 at 04:48 PM, Dave Young wrote: > Hi, Xunlei > > On 01/23/17 at 02:48pm, Xunlei Pang wrote: >> CRASH_KERNEL_ADDR_MAX has been missing for a long time, >> update it with more detailed explanation. >> >> Cc: Robert LeBlanc <rob...@leblancnet.us

Re: [PATCH v2] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-23 Thread Xunlei Pang
On 01/23/2017 at 04:48 PM, Dave Young wrote: > Hi, Xunlei > > On 01/23/17 at 02:48pm, Xunlei Pang wrote: >> CRASH_KERNEL_ADDR_MAX has been missing for a long time, >> update it with more detailed explanation. >> >> Cc: Robert LeBlanc >> Cc: Ba

[tip:x86/debug] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-23 Thread tip-bot for Xunlei Pang
Commit-ID: a8d4c8246b290ce97f88752d833804843041ac84 Gitweb: http://git.kernel.org/tip/a8d4c8246b290ce97f88752d833804843041ac84 Author: Xunlei Pang <xlp...@redhat.com> AuthorDate: Mon, 23 Jan 2017 14:48:23 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Mon, 23 Ja

[tip:x86/debug] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-23 Thread tip-bot for Xunlei Pang
Commit-ID: a8d4c8246b290ce97f88752d833804843041ac84 Gitweb: http://git.kernel.org/tip/a8d4c8246b290ce97f88752d833804843041ac84 Author: Xunlei Pang AuthorDate: Mon, 23 Jan 2017 14:48:23 +0800 Committer: Ingo Molnar CommitDate: Mon, 23 Jan 2017 08:57:55 +0100 x86/crash: Update the stale

[PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
/patch/6167631/ https://lists.gt.net/linux/kernel/2146557 Cc: Naoya Horiguchi <n-horigu...@ah.jp.nec.com> Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- arch/x86/kernel/cpu/mcheck/mce.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/arc

[PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic

2017-01-23 Thread Xunlei Pang
/patch/6167631/ https://lists.gt.net/linux/kernel/2146557 Cc: Naoya Horiguchi Signed-off-by: Xunlei Pang --- arch/x86/kernel/cpu/mcheck/mce.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck

[PATCH v2] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-22 Thread Xunlei Pang
CRASH_KERNEL_ADDR_MAX has been missing for a long time, update it with more detailed explanation. Cc: Robert LeBlanc <rob...@leblancnet.us> Cc: Baoquan He <b...@redhat.com> Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- arch/x86/kernel/setup.c | 4 +++- 1 file changed,

[PATCH v2] x86/crash: Update the stale comment in reserve_crashkernel()

2017-01-22 Thread Xunlei Pang
CRASH_KERNEL_ADDR_MAX has been missing for a long time, update it with more detailed explanation. Cc: Robert LeBlanc Cc: Baoquan He Signed-off-by: Xunlei Pang --- arch/x86/kernel/setup.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86

Re: [PATCH] Add +~800M crashkernel explaination

2017-01-11 Thread Xunlei Pang
On 01/12/2017 at 03:35 AM, Robert LeBlanc wrote: > On Wed, Dec 14, 2016 at 4:17 PM, Xunlei Pang <xp...@redhat.com> wrote: >> As I replied in another post, if you really want to detail the behaviour, >> should mention >> "crashkernel=size[KMG][@offset[KM

Re: [PATCH] Add +~800M crashkernel explaination

2017-01-11 Thread Xunlei Pang
On 01/12/2017 at 03:35 AM, Robert LeBlanc wrote: > On Wed, Dec 14, 2016 at 4:17 PM, Xunlei Pang wrote: >> As I replied in another post, if you really want to detail the behaviour, >> should mention >> "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG] specifi

Re: [PATCH] x86/crash: Update the stale comment in reserve_crashkernel()

2016-12-23 Thread Xunlei Pang
On 12/22/2016 at 11:22 AM, Baoquan He wrote: > On 12/15/16 at 11:30am, Xunlei Pang wrote: >> CRASH_KERNEL_ADDR_MAX was missing for a long time, update it >> with more detailed explanation. >> >> Cc: Robert LeBlanc <rob...@leblancnet.us> >> Cc: Baoquan He <

Re: [PATCH] x86/crash: Update the stale comment in reserve_crashkernel()

2016-12-23 Thread Xunlei Pang
On 12/22/2016 at 11:22 AM, Baoquan He wrote: > On 12/15/16 at 11:30am, Xunlei Pang wrote: >> CRASH_KERNEL_ADDR_MAX was missing for a long time, update it >> with more detailed explanation. >> >> Cc: Robert LeBlanc >> Cc: Baoquan He >> Signed-off-by:

Re: [PATCH v2] kexec: add cond_resched into kimage_alloc_crash_control_pages

2016-12-20 Thread Xunlei Pang
On 12/19/2016 at 11:23 AM, Baoquan He wrote: > On 12/09/16 at 03:16pm, Xunlei Pang wrote: >> On 12/09/2016 at 01:13 PM, zhong jiang wrote: >>> On 2016/12/8 17:41, Xunlei Pang wrote: >>>> On 12/08/2016 at 10:37 AM, zhongjiang wrote: >>>>> From: zhong jia

[PATCH] x86/crash: Update the stale comment in reserve_crashkernel()

2016-12-14 Thread Xunlei Pang
CRASH_KERNEL_ADDR_MAX was missing for a long time, update it with more detailed explanation. Cc: Robert LeBlanc <rob...@leblancnet.us> Cc: Baoquan He <b...@redhat.com> Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- arch/x86/kernel/setup.c | 5 - 1 file changed,

[PATCH] x86/crash: Update the stale comment in reserve_crashkernel()

2016-12-14 Thread Xunlei Pang
CRASH_KERNEL_ADDR_MAX was missing for a long time, update it with more detailed explanation. Cc: Robert LeBlanc Cc: Baoquan He Signed-off-by: Xunlei Pang --- arch/x86/kernel/setup.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-14 Thread Xunlei Pang
On 12/15/2016 at 01:50 AM, Robert LeBlanc wrote: > On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang <xp...@redhat.com> wrote: >> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: >>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <b...@redhat.com> wrote: >>>> On

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-14 Thread Xunlei Pang
On 12/15/2016 at 01:50 AM, Robert LeBlanc wrote: > On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang wrote: >> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: >>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He wrote: >>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-13 Thread Xunlei Pang
On 12/14/2016 at 11:08 AM, Xunlei Pang wrote: > On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: >> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <b...@redhat.com> wrote: >>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote: >>>> When trying to configure c

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-13 Thread Xunlei Pang
On 12/14/2016 at 11:08 AM, Xunlei Pang wrote: > On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: >> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He wrote: >>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote: >>>> When trying to configure crashkernel greater than about

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-13 Thread Xunlei Pang
On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: > On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He wrote: >> On 12/09/16 at 05:22pm, Robert LeBlanc wrote: >>> When trying to configure crashkernel greater than about 800 MB, the >>> kernel fails to allocate memory on x86 and x86_64.

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-13 Thread Xunlei Pang
On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote: > On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He wrote: >> On 12/09/16 at 05:22pm, Robert LeBlanc wrote: >>> When trying to configure crashkernel greater than about 800 MB, the >>> kernel fails to allocate memory on x86 and x86_64. This is due to an

Re: [PATCH v2] kexec: add cond_resched into kimage_alloc_crash_control_pages

2016-12-08 Thread Xunlei Pang
On 12/09/2016 at 01:13 PM, zhong jiang wrote: > On 2016/12/8 17:41, Xunlei Pang wrote: >> On 12/08/2016 at 10:37 AM, zhongjiang wrote: >>> From: zhong jiang <zhongji...@huawei.com> >>> >>> A soft lockup will occur when I run trinity in syscall kexec_loa

Re: [PATCH v2] kexec: add cond_resched into kimage_alloc_crash_control_pages

2016-12-08 Thread Xunlei Pang
On 12/09/2016 at 01:13 PM, zhong jiang wrote: > On 2016/12/8 17:41, Xunlei Pang wrote: >> On 12/08/2016 at 10:37 AM, zhongjiang wrote: >>> From: zhong jiang >>> >>> A soft lockup will occur when I run trinity in syscall kexec_load. >>> the

Re: [PATCH v2] kexec: add cond_resched into kimage_alloc_crash_control_pages

2016-12-08 Thread Xunlei Pang
On 12/08/2016 at 10:37 AM, zhongjiang wrote: > From: zhong jiang > > A soft lockup will occur when I run trinity in syscall kexec_load. > the corresponding stack information is as follows. > > [ 237.235937] BUG: soft lockup - CPU#6 stuck for 22s! [trinity-c6:13859] > [

Re: [PATCH v2] kexec: add cond_resched into kimage_alloc_crash_control_pages

2016-12-08 Thread Xunlei Pang
On 12/08/2016 at 10:37 AM, zhongjiang wrote: > From: zhong jiang > > A soft lockup will occur when I run trinity in syscall kexec_load. > the corresponding stack information is as follows. > > [ 237.235937] BUG: soft lockup - CPU#6 stuck for 22s! [trinity-c6:13859] > [ 237.242699] Kernel panic

[PATCH v3] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped

2016-12-05 Thread Xunlei Pang
"iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- v2->v3: Flush context cache only and add Fixes-tag, according to Joerg's comments. drivers/iommu/intel-iommu.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/drivers/iom

[PATCH v3] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped

2016-12-05 Thread Xunlei Pang
Copy translation tables from old kernel") Fixes: dbcd861f252d ("iommu/vt-d: Do not re-use domain-ids from the old kernel") Fixes: cf484d0e6939 ("iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang --- v2->v3: Flush context cache only and add Fixes-tag, accordi

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-12-01 Thread Xunlei Pang
On 12/01/2016 at 06:33 PM, Joerg Roedel wrote: > On Thu, Dec 01, 2016 at 10:15:45AM +0800, Xunlei Pang wrote: >> index 3965e73..624eac9 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -2024,6 +2024,25 @@ static int domain

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-30 Thread Xunlei Pang
On 11/30/2016 at 10:26 PM, Joerg Roedel wrote: > On Wed, Nov 30, 2016 at 06:23:34PM +0800, Baoquan He wrote: >> OK, talked with Xunlei. The old cache could be entry with present bit >> set. > -EPARSE > > Anyway, what I was trying to say is, that the IOMMU TLB is tagged with > domain-ids, and that

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-30 Thread Xunlei Pang
On 11/29/2016 at 10:35 PM, Joerg Roedel wrote: > On Thu, Nov 17, 2016 at 10:47:28AM +0800, Xunlei Pang wrote: >> As per the comment, the code here only needs to flush context caches >> for the special domain 0 which is used to tag the >> non-present/erroneous caches, seems we

Re: [PATCH v2] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped

2016-11-27 Thread Xunlei Pang
Ping Joerg/David, do you have any comment on it? On 2016/11/19 at 00:23, Xunlei Pang wrote: > We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers > under kdump, it can be steadily reproduced on several different machines, > the dmesg log is like(running on 4.9.0-rc

[PATCH v2] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped

2016-11-18 Thread Xunlei Pang
race <don.br...@microsemi.com> CC: Baoquan He <b...@redhat.com> CC: Dave Young <dyo...@redhat.com> Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- v1 -> v2: Flush caches using old domain id. drivers/iommu/intel-iommu.c | 22 ++ 1 file changed, 22 in

[PATCH v2] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped

2016-11-18 Thread Xunlei Pang
-by: Xunlei Pang --- v1 -> v2: Flush caches using old domain id. drivers/iommu/intel-iommu.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 3965e73..653304d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/driv

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
On 2016/11/16 at 22:58, Myron Stowe wrote: > On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang <xp...@redhat.com> wrote: >> Ccing David >> On 2016/11/16 at 17:02, Xunlei Pang wrote: >>> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers >>

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
On 2016/11/16 at 22:58, Myron Stowe wrote: > On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang wrote: >> Ccing David >> On 2016/11/16 at 17:02, Xunlei Pang wrote: >>> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers >>> under kdump, it can

Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
Ccing David On 2016/11/16 at 17:02, Xunlei Pang wrote: > We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers > under kdump, it can be steadily reproduced on several different machines, > the dmesg log is like: > HP HPSA Driver (v 3.4.16-0) > hpsa :02:00.0:

[PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
can survive the kdump tests. CC: Myron Stowe <myron.st...@redhat.com> CC: Don Brace <don.br...@microsemi.com> CC: Baoquan He <b...@redhat.com> CC: Dave Young <dyo...@redhat.com> Tested-by: Joseph Szczypek <jszcz...@redhat.com> Signed-off-by: Xunlei Pang <xlp...@red

[PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
can survive the kdump tests. CC: Myron Stowe CC: Don Brace CC: Baoquan He CC: Dave Young Tested-by: Joseph Szczypek Signed-off-by: Xunlei Pang --- drivers/iommu/intel-iommu.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers

Re: [PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-10-31 Thread Xunlei Pang
es — still not ideal, but better than before. > > Reported by Mika Kuoppala <mika.kuopp...@linux.intel.com> and also by > Xunlei Pang <xlp...@redhat.com> who submitted a simpler patch to fix > only the allocation (and not the free) to the "correct" limit... whic

Re: [PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-10-31 Thread Xunlei Pang
es — still not ideal, but better than before. > > Reported by Mika Kuoppala and also by > Xunlei Pang who submitted a simpler patch to fix > only the allocation (and not the free) to the "correct" limit... which > was still problematic. > > Signed-off-by: David W

Re: [PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-10-10 Thread Xunlei Pang
Ping David for confirmation On 2016/09/19 at 20:18, Joerg Roedel wrote: > [Cc'ing David] > > On Mon, Sep 12, 2016 at 10:49:11AM +0800, Xunlei Pang wrote: >> According to the vt-d spec, the size of pasid (state) entry is 8B >> which equals 3 in power of 2, the number of

Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread Xunlei Pang
>> >>> Reported-by: Daniel Walker <dwal...@fifo99.com> >>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" >>> option) >>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai...@hitachi.com> >>> Cc: Dave Young <dyo...@

Re: [V4 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version in panic path

2016-09-20 Thread Xunlei Pang
>>> - Revise comments, description, and symbol names >>> >>> Changes in V2: >>> - Replace smp_send_stop() call with crash_kexec version which >>> saves cpu states and cleans up VMX/SVM >>> - Drop a fix for Problem 1 at this moment >>>

[PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-09-11 Thread Xunlei Pang
. Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- drivers/iommu/intel-svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 8ebb353..cfa75c2 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@

[PATCH] iommu/vt-d: Fix the size calculation of pasid table

2016-09-11 Thread Xunlei Pang
. Signed-off-by: Xunlei Pang --- drivers/iommu/intel-svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 8ebb353..cfa75c2 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -39,7

Re: [PATCH][RFC v4] timekeeping: ignore the bogus sleep time if pm_trace is enabled

2016-08-27 Thread Xunlei Pang
rtc is used as persist clock to compensate for sleep time, >(because system does not have a nonstop clocksource) or > 2. rtc is used to calculate the sleep time in rtc_resume. > > Cc: Rafael J. Wysocki <r...@rjwysocki.net> > Cc: John Stultz <john.stu...@linaro.org

Re: [PATCH][RFC v4] timekeeping: ignore the bogus sleep time if pm_trace is enabled

2016-08-27 Thread Xunlei Pang
rtc is used as persist clock to compensate for sleep time, >(because system does not have a nonstop clocksource) or > 2. rtc is used to calculate the sleep time in rtc_resume. > > Cc: Rafael J. Wysocki > Cc: John Stultz > Cc: Thomas Gleixner > Cc: Xunlei Pang

Re: [PATCH v2 1/2] kexec: Introduce "/sys/kernel/kexec_crash_low_size"

2016-08-24 Thread Xunlei Pang
On 2016/08/24 at 16:20, Dave Young wrote: > On 08/23/16 at 06:11pm, Yinghai Lu wrote: >> On Wed, Aug 17, 2016 at 1:20 AM, Dave Young <dyo...@redhat.com> wrote: >>> On 08/17/16 at 09:50am, Xunlei Pang wrote: >>>> "/sys/kernel/kexec_crash_size" only han

Re: [PATCH v2 1/2] kexec: Introduce "/sys/kernel/kexec_crash_low_size"

2016-08-24 Thread Xunlei Pang
On 2016/08/24 at 16:20, Dave Young wrote: > On 08/23/16 at 06:11pm, Yinghai Lu wrote: >> On Wed, Aug 17, 2016 at 1:20 AM, Dave Young wrote: >>> On 08/17/16 at 09:50am, Xunlei Pang wrote: >>>> "/sys/kernel/kexec_crash_size" only handles crashk_res, it >

Re: [RFC 0/4] Kexec: Enable run time memory resrvation of crash kernel

2016-08-23 Thread Xunlei Pang
On 2016/08/22 at 18:59, Pratyush Anand wrote: > On 12/08/2016:07:48:38 PM, Ronit Halder wrote: >> Currently linux kernel reserves memory at the boot time for crash kernel. >> It will be very useful if we can reserve memory in run time. The user can >> reserve the memory whenever needed instead of

Re: [PATCH][RFC v4] timekeeping: ignore the bogus sleep time if pm_trace is enabled

2016-08-18 Thread Xunlei Pang
On 2016/08/18 at 18:36, Oliver Neukum wrote: > On Thu, 2016-08-18 at 18:43 +0800, Chen Yu wrote: >> Previously we encountered some memory overflow issues due to >> the bogus sleep time brought by inconsistent rtc, which is >> triggered when pm_trace is enabled, please refer to: >>

[PATCH] fib_trie: Fix the description of pos and bits

2016-08-17 Thread Xunlei Pang
1) Fix one typo: s/tn/tp/ 2) Fix the description about the "u" bits. Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- net/ipv4/fib_trie.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index d07fc07..eb7c5d1

[PATCH] fib_trie: Fix the description of pos and bits

2016-08-17 Thread Xunlei Pang
1) Fix one typo: s/tn/tp/ 2) Fix the description about the "u" bits. Signed-off-by: Xunlei Pang --- net/ipv4/fib_trie.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index d07fc07..eb7c5d1 100644 --- a/net/ipv4/fib_tr

Re: [PATCH v7 2/2] Documentation: kdump: add description of enable multi-cpus support

2016-08-17 Thread Xunlei Pang
On 2016/08/18 at 09:50, Zhou Wenjian wrote: > multi-cpu support is useful to improve the performance of kdump in > some cases. So add the description of enable multi-cpu support in > dump-capture kernel. > > Signed-off-by: Zhou Wenjian > Acked-by: Baoquan He

Re: [PATCH v7 2/2] Documentation: kdump: add description of enable multi-cpus support

2016-08-17 Thread Xunlei Pang
On 2016/08/18 at 09:50, Zhou Wenjian wrote: > multi-cpu support is useful to improve the performance of kdump in > some cases. So add the description of enable multi-cpu support in > dump-capture kernel. > > Signed-off-by: Zhou Wenjian > Acked-by: Baoquan He > --- >

Re: [PATCH v2 2/2] kexec: Consider crashk_low_res in sanity_check_segment_list()

2016-08-17 Thread Xunlei Pang
On 2016/08/17 at 15:24, Dave Young wrote: > Hi, Xunlei, > > On 08/17/16 at 09:50am, Xunlei Pang wrote: >> We have crashk_res only in most cases, but sometimes we have >> crashk_low_res. >> >> For example, on 64-bit x86 systems, when "crashkernel=32M,high"

[PATCH v2 1/2] kexec: Introduce "/sys/kernel/kexec_crash_low_size"

2016-08-16 Thread Xunlei Pang
corresponding sysfs file "/sys/kernel/kexec_crash_low_size" for crashk_low_res. So, the exact total reserved memory is the sum of the two. crashk_low_res can also be shrunk via this new interface, and users should be aware of what they are doing. Suggested-by: Dave Young <dyo...@

[PATCH v2 1/2] kexec: Introduce "/sys/kernel/kexec_crash_low_size"

2016-08-16 Thread Xunlei Pang
corresponding sysfs file "/sys/kernel/kexec_crash_low_size" for crashk_low_res. So, the exact total reserved memory is the sum of the two. crashk_low_res can also be shrunk via this new interface, and users should be aware of what they are doing. Suggested-by: Dave Young Signed-off-by:

[PATCH v2 2/2] kexec: Consider crashk_low_res in sanity_check_segment_list()

2016-08-16 Thread Xunlei Pang
fail it as a memory violation in these cases. Thus, we add the case to regard the segment as valid if it is within crashk_low_res. Signed-off-by: Xunlei Pang <xlp...@redhat.com> --- kernel/kexec_core.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/ke

[PATCH v2 2/2] kexec: Consider crashk_low_res in sanity_check_segment_list()

2016-08-16 Thread Xunlei Pang
fail it as a memory violation in these cases. Thus, we add the case to regard the segment as valid if it is within crashk_low_res. Signed-off-by: Xunlei Pang --- kernel/kexec_core.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/kernel/kexec_core.c b/kernel/kexec_

Re: [PATCH] kexec: Account crashk_low_res to kexec_crash_size

2016-08-15 Thread Xunlei Pang
On 2016/08/15 at 15:17, Dave Young wrote: > Hi Xunlei, > > On 08/13/16 at 04:26pm, Xunlei Pang wrote: >> "/sys/kernel/kexec_crash_size" only includes crashk_res, it >> is fine in most cases, but sometimes we have crashk_low_res. >> For example, when

[PATCH] kexec: Account crashk_low_res to kexec_crash_size

2016-08-13 Thread Xunlei Pang
ays should stay consistent. Note that write to "/sys/kernel/kexec_crash_size" is to shrink the reserved memory, and we want to shrink crashk_res only. So we add some additional check in crash_shrink_memory() since crashk_low_res now is involved. Signed-off-by: Xunlei Pang <xlp...@redh

[PATCH] kexec: Account crashk_low_res to kexec_crash_size

2016-08-13 Thread Xunlei Pang
ays should stay consistent. Note that write to "/sys/kernel/kexec_crash_size" is to shrink the reserved memory, and we want to shrink crashk_res only. So we add some additional check in crash_shrink_memory() since crashk_low_res now is involved. Signed-off-by: Xunlei Pang --- kernel/

[tip:sched/core] sched/fair: Fix typo in sync_throttle()

2016-08-10 Thread tip-bot for Xunlei Pang
Commit-ID: b8922125e4790fa237a8a4204562ecf457ef54bb Gitweb: http://git.kernel.org/tip/b8922125e4790fa237a8a4204562ecf457ef54bb Author: Xunlei Pang <xlp...@redhat.com> AuthorDate: Sat, 9 Jul 2016 15:54:22 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Wed, 10 Au

[tip:sched/core] sched/fair: Fix typo in sync_throttle()

2016-08-10 Thread tip-bot for Xunlei Pang
Commit-ID: b8922125e4790fa237a8a4204562ecf457ef54bb Gitweb: http://git.kernel.org/tip/b8922125e4790fa237a8a4204562ecf457ef54bb Author: Xunlei Pang AuthorDate: Sat, 9 Jul 2016 15:54:22 +0800 Committer: Ingo Molnar CommitDate: Wed, 10 Aug 2016 13:32:55 +0200 sched/fair: Fix typo

Re: [PATCH v5] sched/deadline: remove useless param from setup_new_dl_entity

2016-08-05 Thread Xunlei Pang
sks wakeup time anyway. By doing > so, we don't need to worry about a potential PI donor anymore, as > rt_mutex_setprio() takes care of that already for us. > > Cc: Ingo Molnar <mi...@redhat.com> > Cc: Peter Zijlstra <pet...@infradead.org> > Cc: Steven Rostedt
