[tip:x86/apic] x86/apic: Update comment about disabling processor focus
Commit-ID: 5035da41996d346c648a65c1d7a9f6469c7d358a Gitweb: http://git.kernel.org/tip/5035da41996d346c648a65c1d7a9f6469c7d358a Author: Wei Jiangang AuthorDate: Fri, 19 Aug 2016 11:22:37 +0800 Committer: Ingo Molnar CommitDate: Wed, 24 Aug 2016 11:24:33 +0200 x86/apic: Update comment about disabling processor focus Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Wei Jiangang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: b...@suse.de Link: http://lkml.kernel.org/r/1471576957-12961-2-git-send-email-weijg.f...@cn.fujitsu.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 63b7484..1cbae30 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1374,7 +1374,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /*
[tip:x86/apic] x86/smpboot: Check APIC ID before setting up default routing
Commit-ID: 384d9fe3741657c8ed8cd9bf30bc1d4611864d56 Gitweb: http://git.kernel.org/tip/384d9fe3741657c8ed8cd9bf30bc1d4611864d56 Author: Wei Jiangang AuthorDate: Fri, 19 Aug 2016 11:22:36 +0800 Committer: Ingo Molnar CommitDate: Wed, 24 Aug 2016 11:24:33 +0200 x86/smpboot: Check APIC ID before setting up default routing This is not a bugfix, but code optimization. If the BSP's APIC ID in local APIC is unexpected, a kernel panic will occur and the system will halt. That means no need to enable APIC mode, and no reason to set up the default routing for APIC. The combination of default_setup_apic_routing() and apic_bsp_setup() are used to enable APIC mode. They two should be kept together, rather than being separated by the codes of checking APIC ID. Just like their usage in APIC_init_uniprocessor(). Signed-off-by: Wei Jiangang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: b...@suse.de Link: http://lkml.kernel.org/r/1471576957-12961-1-git-send-email-weijg.f...@cn.fujitsu.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/smpboot.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 2a6e84a..8216b99 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1316,14 +1316,13 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) break; } - default_setup_apic_routing(); - if (read_apic_id() != boot_cpu_physical_apicid) { panic("Boot APIC ID in local APIC unexpected (%d vs %d)", read_apic_id(), boot_cpu_physical_apicid); /* Or can we switch back to PIC here? */ } + default_setup_apic_routing(); cpu0_logical_apicid = apic_bsp_setup(false); pr_info("CPU%d: ", 0);
Re: [PATCH] vfio/pci: Fix typos in comments
ping... On Wed, 2016-08-17 at 14:37 +0800, Wei Jiangang wrote: > Signed-off-by: Wei Jiangang > --- > drivers/vfio/pci/vfio_pci_config.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > b/drivers/vfio/pci/vfio_pci_config.c > index 688691d9058d..c4f235452d81 100644 > --- a/drivers/vfio/pci/vfio_pci_config.c > +++ b/drivers/vfio/pci/vfio_pci_config.c > @@ -70,7 +70,7 @@ static const u8 pci_cap_length[PCI_CAP_ID_MAX + 1] = { > > /* > * Lengths of PCIe/PCI-X Extended Config Capabilities > - * 0: Removed or masked from the user visible capabilty list > + * 0: Removed or masked from the user visible capability list > * FF: Variable length > */ > static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = { > @@ -355,7 +355,7 @@ static int alloc_perm_bits(struct perm_bits *perm, int > size) >* ignore whether a read/write exceeds the defined capability >* structure. We can do this because: >* - Standard config space is already dword aligned > - * - Capabilities are all dword alinged (bits 0:1 of next reserved) > + * - Capabilities are all dword aligned (bits 0:1 of next reserved) >* - Express capabilities defined as dword aligned >*/ > size = round_up(size, 4); > @@ -1516,10 +1516,10 @@ static int vfio_ecap_init(struct vfio_pci_device > *vdev) > * space which tracks reads and writes to bits that we emulate for > * the user. Initial values filled from device. > * > - * Using shared stuct perm_bits between all vfio-pci devices saves > + * Using shared struct perm_bits between all vfio-pci devices saves > * us from allocating cfg_size buffers for virt and write for every > * device. We could remove vconfig and allocate individual buffers > - * for each area requring emulated bits, but the array of pointers > + * for each area requiring emulated bits, but the array of pointers > * would be comparable in size (at least for standard config space). > */ > int vfio_config_init(struct vfio_pci_device *vdev)
[PATCH v2 1/2] x86/smpboot: Check APIC ID before setting up default routing
This is not a bugfix, but code optimization. If the BSP's APIC ID in local APIC is unexpected, a kernel panic will occur and the system will halt. That means no need to enable APIC mode, and no reason to set up the default routing for APIC. The combination of default_setup_apic_routing() and apic_bsp_setup() are used to enable APIC mode. They two should be kept together, rather than being separated by the codes of checking APIC ID. Just like their usage in APIC_init_uniprocessor(). Signed-off-by: Wei Jiangang --- arch/x86/kernel/smpboot.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 2a6e84a30a54..8216b997c1c9 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1316,14 +1316,13 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) break; } - default_setup_apic_routing(); - if (read_apic_id() != boot_cpu_physical_apicid) { panic("Boot APIC ID in local APIC unexpected (%d vs %d)", read_apic_id(), boot_cpu_physical_apicid); /* Or can we switch back to PIC here? */ } + default_setup_apic_routing(); cpu0_logical_apicid = apic_bsp_setup(false); pr_info("CPU%d: ", 0); -- 1.9.3
[PATCH v2 2/2] x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 20abd912f0e4..9d203712c146 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1350,7 +1350,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /* -- 1.9.3
Re: [PATCH 1/2] x86/smpboot: Check APIC ID before setting up default routing
On Thu, 2016-08-18 at 11:47 +0200, Ingo Molnar wrote: > * Wei Jiangang wrote: > > > Check the boot APIC ID firstly, > > and then setup the default routing of APIC looks better. > > > > And move default_setup_apic_routing() close to apic_bsp_setup(), > > which staying in step with the codes in APIC_init_uniprocessor(). > > > > Signed-off-by: Wei Jiangang > > --- > > arch/x86/kernel/smpboot.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > If it's not a bug then please clearly note so in the changelog and > explain why the change results in better code. It's not a bug. The combination of default_setup_apic_routing() and apic_bsp_setup() is used to enable APIC mode. If the return of read_apic_id() is not equal to boot_cpu_physical_apicid, it means the APIC ID of current CPU‘s local APIC is unexpected and enable APIC mode maybe fail. so no need to set up the apic routing. That's why I want to move default_setup_apic_routing() behind checking the boot APIC ID. Sorry for obscure commit message. I will improve it in next version. > > If it's fixing a bug/misfeature then please fix the changelog to > conform to the standard changelog style: > > - first describe the symptoms of the bug - how does a user notice? > > - then describe how the code behaves today and how that is causing the bug > > - and then only describe how it's fixed. Thanks for your detailed explanation. I'll keep in my mind. > > Thanks, > > Ingo > >
[PATCH 1/2] x86/smpboot: Check APIC ID before setting up default routing
Check the boot APIC ID firstly, and then setup the default routing of APIC looks better. And move default_setup_apic_routing() close to apic_bsp_setup(), which staying in step with the codes in APIC_init_uniprocessor(). Signed-off-by: Wei Jiangang --- arch/x86/kernel/smpboot.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 2a6e84a30a54..8216b997c1c9 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1316,14 +1316,13 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) break; } - default_setup_apic_routing(); - if (read_apic_id() != boot_cpu_physical_apicid) { panic("Boot APIC ID in local APIC unexpected (%d vs %d)", read_apic_id(), boot_cpu_physical_apicid); /* Or can we switch back to PIC here? */ } + default_setup_apic_routing(); cpu0_logical_apicid = apic_bsp_setup(false); pr_info("CPU%d: ", 0); -- 1.9.3
[PATCH 2/2] x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 20abd912f0e4..9d203712c146 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1350,7 +1350,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /* -- 1.9.3
[PATCH] vfio/pci: Fix typos in comments
Signed-off-by: Wei Jiangang --- drivers/vfio/pci/vfio_pci_config.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 688691d9058d..c4f235452d81 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -70,7 +70,7 @@ static const u8 pci_cap_length[PCI_CAP_ID_MAX + 1] = { /* * Lengths of PCIe/PCI-X Extended Config Capabilities - * 0: Removed or masked from the user visible capabilty list + * 0: Removed or masked from the user visible capability list * FF: Variable length */ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = { @@ -355,7 +355,7 @@ static int alloc_perm_bits(struct perm_bits *perm, int size) * ignore whether a read/write exceeds the defined capability * structure. We can do this because: * - Standard config space is already dword aligned -* - Capabilities are all dword alinged (bits 0:1 of next reserved) +* - Capabilities are all dword aligned (bits 0:1 of next reserved) * - Express capabilities defined as dword aligned */ size = round_up(size, 4); @@ -1516,10 +1516,10 @@ static int vfio_ecap_init(struct vfio_pci_device *vdev) * space which tracks reads and writes to bits that we emulate for * the user. Initial values filled from device. * - * Using shared stuct perm_bits between all vfio-pci devices saves + * Using shared struct perm_bits between all vfio-pci devices saves * us from allocating cfg_size buffers for virt and write for every * device. We could remove vconfig and allocate individual buffers - * for each area requring emulated bits, but the array of pointers + * for each area requiring emulated bits, but the array of pointers * would be comparable in size (at least for standard config space). */ int vfio_config_init(struct vfio_pci_device *vdev) -- 1.9.3
Re: [PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode for bsp
Hi Eric, I have several questions about kdump and APIC mode. specific issues at the end of the mail. On Tue, 2016-08-02 at 09:26 -0500, Eric W. Biederman wrote: > "Wei, Jiangang" writes: > > > Hi Eric, > > > > Thanks for your response. > > But I have some different ideas... > > Apologies for not replying to this earlier your reply got lost in my > spam folder and I overlooked it. > > > On Mon, 2016-07-25 at 22:53 -0500, Eric W. Biederman wrote: > >> Wei Jiangang writes: > >> > >> > If we specify the 'notsc' parameter for the dump-capture kernel, > >> > and then trigger a crash(panic) by using "ALT-SysRq-c" or > >> > "echo c > /proc/sysrq-trigger", the dump-capture kernel will > >> > hang in calibrate_delay_converge() and wait for jiffies changes. > >> > serial log as follows: > >> > > >> > tsc: Fast TSC calibration using PIT > >> > tsc: Detected 2099.947 MHz processor > >> > Calibrating delay loop... > >> > > >> > The reason for jiffies not changes is there's no timer interrupt > >> > passed to dump-capture kernel. > >> > > >> > In fact, once kernel panic occurs, the local APIC is disabled > >> > by lapic_shutdown() in reboot path. > >> > generly speaking, local APIC state can be initialized by BIOS > >> > after Power-Up or Reset, which doesn't apply to kdump case. > >> > so the kernel has to be responsible for initialize the interrupt > >> > mode properly according the latest status of APIC in bootup path. > >> > > >> > An MP operating system is booted under either PIC mode or > >> > virtual wire mode. Later, the operating system switches to > >> > symmetric I/O mode as it enters multiprocessor mode. > >> > Two kinds of virtual wire mode are defined in Intel MP spec: > >> > virtual wire mode via local APIC or via I/O APIC. > >> > > >> > Now we determine the mode of APIC only through a SMP BIOS(MP table). > >> > That's not enough. It's better to do further check if APIC works > >> > with effective interrupt mode, and then, do some proper setting. > >> > >> Reading through the code let me pause a moment and say: > >> "Yowzers the interrupt initialization code has gotten hard to follow. It > >> is now full of indirection with ill defined semantics." pre_vector_init > >> indeed. > >> > >> I will argue this is the wrong fix. > >> > >> We really should not have to worry about getting the system functional > >> in virtual wire mode on a modern system. And looking at the code > >> someone has done half the work and made it conditional under > >> acpi_gbl_reduced_hardware. > >> > >> Now reduced hardware implies a bit more than we ware talking about but > >> if there is ACPI apic information we should not need to worry about > >> external interrupts and can just enable the apics. > >> > >> In fact I think having MPtable information is enough for that. > > > > In my opinion, The MP table is not to be trusted if system starts > > without BIOS phrase. > > > > The chapter 3.8 System Initial State (MP v1.4 spec) said below: > > "1. All local APICs are disabled, except for the local APIC of the BSP > > if the system starts in Virtual Wire Mode." > > > > And > > . > > "The BIOS must disable interrupts to all processors and set the APICs to > > the system initial state before giving control to the operating system. > > The operating system is responsible for initializing all of the APICs" > > > > The dump-capture kernel won't through the BIOS phrase, > > so there is no guarantee of the interrupt mode of APIC and the status of > > local APIC. > > > > Although the MP table is read-only, > > the dump-capture kernel uses the MP table which be generated before the > > first kernel boots up is unreasonable. > > At least, only checking MP table is not enough. > > That's the starting point of my patch. > > > > You said “this is the wrong fix”, > > Is there any wrong with the codes or my solution to check the status of > > APIC in bootup path? > > Today in a dump capture kernel it uses the MP table or the ACPI MP > information to set up the apics. > > How I read your patch is you do something clever to get in virtual wire > mode. AKA a historical P
Re: [PATCH v2 0/3] Fix dump-capture kernel hangs with notsc
Hi Eric, Thanks for your reply firstly. On Mon, 2016-08-01 at 12:09 -0500, Eric W. Biederman wrote: > "Wei, Jiangang" writes: > > > Ping ... > > May I ask for some community attention to this series? > > I purpose is fixing the dump-capture kernel hangs in > > calibrate_delay_converge() while specifying notsc. > > Did you not see my reply to patch 3/3? Yes, I read your email and made a reply (https://lkml.org/lkml/2016/7/26/112) . I put forward several questions in that letter, but no feedback... > > The short version of my feedback is that you seem to be fixing a case > that should not exist. So the good fix is to skip completely past > virtual wire mode and into full apic mode as soon as possible. I am afraid that there are some disagreements between us. 1) The case that dump-capture kernel boot up with the disabled APIC is very real, and the bug can be reproduced 100%. I want to emphasize that there is no guarantee of the interrupt mode of APIC and status of local APIC, Especially for the dump-capture kernel that won't through the BIOS phrase. That's why I do more check in init_bsp_APIC(), not only depends on the MP tables which be generated before the first kernel boots up. Make a point here, The BIOS must disable interrupts to all processors and set the APICs to the system initial state before giving control to the operating system. That means APICs won't be reset to initial state without BIOS phrase. 2) Your proposal (switch into full apic mode as soon as possible) seems to contradict the Intel Spec, "An MP operating system is booted under either one of the two PC/AT-compatible modes. Later the operating system switches to Symmetric I/O Mode **as it enters multiprocessor mode**." And in other words, the BSP should be in PIC mode or Virtual wire mode in startup stage. 3) The apic initialization codes maybe need a overhaul, but it goes out the scope of this patch. I focus on fixing kdump failure with notsc. And the apic initialization codes has no modification for a long time and can be regard as stable. Overhaul of it increases the chances of hitting a bug. If there's anything wrong with my understanding, please point out. Thanks, wei > > For a subset of cases the code already supports that. > > Eric > >
Re: [PATCH v2 0/3] Fix dump-capture kernel hangs with notsc
Ping ... May I ask for some community attention to this series? I purpose is fixing the dump-capture kernel hangs in calibrate_delay_converge() while specifying notsc. Thanks in advance. wei On Tue, 2016-07-26 at 10:59 +0800, Wei Jiangang wrote: > v2: > Just about the commit ("x86/apic: Improved the setting of interrupt > mode for bsp") > > - Unify the name > s/virtual_wire_via_*/virt_wire_through_* > - Add check for PIC mode > suggested-by Baoquan He > - Add check enable/disable flag for IO-APIC > suggested-by Xunlei Pang > - Update comments > > v1: > The goal is to fix dump-capture kernel with notsc option hangs > in calibrate_delay_converge() > > Wei Jiangang (3): > x86/apic: Remove "focus disabled" for 64bit case > x86/apic: Update comment about disabling processor focus > x86/apic: Improved the setting of interrupt mode for bsp > > arch/x86/include/asm/io_apic.h | 5 > arch/x86/kernel/apic/apic.c| 63 > +++--- > arch/x86/kernel/apic/io_apic.c | 28 +++ > 3 files changed, 92 insertions(+), 4 deletions(-) >
Re: [PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode for bsp
Hi Eric, Thanks for your response. But I have some different ideas... On Mon, 2016-07-25 at 22:53 -0500, Eric W. Biederman wrote: > Wei Jiangang writes: > > > If we specify the 'notsc' parameter for the dump-capture kernel, > > and then trigger a crash(panic) by using "ALT-SysRq-c" or > > "echo c > /proc/sysrq-trigger", the dump-capture kernel will > > hang in calibrate_delay_converge() and wait for jiffies changes. > > serial log as follows: > > > > tsc: Fast TSC calibration using PIT > > tsc: Detected 2099.947 MHz processor > > Calibrating delay loop... > > > > The reason for jiffies not changes is there's no timer interrupt > > passed to dump-capture kernel. > > > > In fact, once kernel panic occurs, the local APIC is disabled > > by lapic_shutdown() in reboot path. > > generly speaking, local APIC state can be initialized by BIOS > > after Power-Up or Reset, which doesn't apply to kdump case. > > so the kernel has to be responsible for initialize the interrupt > > mode properly according the latest status of APIC in bootup path. > > > > An MP operating system is booted under either PIC mode or > > virtual wire mode. Later, the operating system switches to > > symmetric I/O mode as it enters multiprocessor mode. > > Two kinds of virtual wire mode are defined in Intel MP spec: > > virtual wire mode via local APIC or via I/O APIC. > > > > Now we determine the mode of APIC only through a SMP BIOS(MP table). > > That's not enough. It's better to do further check if APIC works > > with effective interrupt mode, and then, do some proper setting. > > Reading through the code let me pause a moment and say: > "Yowzers the interrupt initialization code has gotten hard to follow. It > is now full of indirection with ill defined semantics." pre_vector_init > indeed. > > I will argue this is the wrong fix. > > We really should not have to worry about getting the system functional > in virtual wire mode on a modern system. And looking at the code > someone has done half the work and made it conditional under > acpi_gbl_reduced_hardware. > > Now reduced hardware implies a bit more than we ware talking about but > if there is ACPI apic information we should not need to worry about > external interrupts and can just enable the apics. > > In fact I think having MPtable information is enough for that. In my opinion, The MP table is not to be trusted if system starts without BIOS phrase. The chapter 3.8 System Initial State (MP v1.4 spec) said below: "1. All local APICs are disabled, except for the local APIC of the BSP if the system starts in Virtual Wire Mode." And . "The BIOS must disable interrupts to all processors and set the APICs to the system initial state before giving control to the operating system. The operating system is responsible for initializing all of the APICs" The dump-capture kernel won't through the BIOS phrase, so there is no guarantee of the interrupt mode of APIC and the status of local APIC. Although the MP table is read-only, the dump-capture kernel uses the MP table which be generated before the first kernel boots up is unreasonable. At least, only checking MP table is not enough. That's the starting point of my patch. You said “this is the wrong fix”, Is there any wrong with the codes or my solution to check the status of APIC in bootup path? > So I think what needs to happens is for the apic initialization to get > an overhaul that makes apic initialization the happy path and the other > irq controllers the odd backwards compatibility path. And when we > are done we never run in anything except full apic mode unless the > hardware doesn't support it. The spec requests "An MP operating system is booted under either one of the two PC/AT-compatible modes. Later the operating system switches to Symmetric I/O Mode as it enters multiprocessor mode." And it seems that the latest kernel follows the rule. Does the apic initialization really need to be changed? Thanks, wei > > I think that will leave things more robust as we don't need to setup > and then reset up the interrupts during boot. > > Eric > > > > Signed-off-by: Cao jin > > Signed-off-by: Wei Jiangang > > --- > > arch/x86/include/asm/io_apic.h | 5 > > arch/x86/kernel/apic/apic.c| 60 > > +- > > arch/x86/kernel/apic/io_apic.c | 28 > > 3 files changed, 92 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h > &g
[PATCH v2 1/3] x86/apic: Remove "focus disabled" for 64bit case
Disable processor focus for 64bit causes a crash, Call Trace as following: [] dump_stack+0x63/0x84 [] __warn+0xd1/0xf0 [] warn_slowpath_fmt+0x5f/0x80 [] ex_handler_wrmsr_unsafe+0x62/0x70 [] fixup_exception+0x39/0x50 [] do_general_protection+0x80/0x160 [] general_protection+0x28/0x30 [] ? native_write_msr+0x4/0x30 [] ? native_apic_msr_write+0x32/0x40 [] init_bsp_APIC+0x5f/0x118 [] init_ISA_irqs+0x19/0x4c [] native_init_IRQ+0xd/0x377 [] init_IRQ+0x42/0x49 [] start_kernel+0x2ce/0x4c8 [] ? set_init_arg+0x55/0x55 [] ? early_idt_handler_array+0x120/0x120 [] x86_64_start_reservations+0x2f/0x31 [] x86_64_start_kernel+0x14c/0x16f Keep a consistent implementation with the setup_local_APIC(), always use processor focus for 64bit. more details refer to commit 89c38c2867eb ("x86: apic - unify setup_local_APIC") Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a67d7e3..0273b652c689 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1154,9 +1154,7 @@ void __init init_bsp_APIC(void) if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && (boot_cpu_data.x86 == 15)) value &= ~APIC_SPIV_FOCUS_DISABLED; - else #endif - value |= APIC_SPIV_FOCUS_DISABLED; value |= SPURIOUS_APIC_VECTOR; apic_write(APIC_SPIV, value); -- 1.9.3
[PATCH v2 2/3] x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 0273b652c689..8e25b9b2d351 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1346,7 +1346,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /* -- 1.9.3
[PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode for bsp
If we specify the 'notsc' parameter for the dump-capture kernel, and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > /proc/sysrq-trigger", the dump-capture kernel will hang in calibrate_delay_converge() and wait for jiffies changes. serial log as follows: tsc: Fast TSC calibration using PIT tsc: Detected 2099.947 MHz processor Calibrating delay loop... The reason for jiffies not changes is there's no timer interrupt passed to dump-capture kernel. In fact, once kernel panic occurs, the local APIC is disabled by lapic_shutdown() in reboot path. generly speaking, local APIC state can be initialized by BIOS after Power-Up or Reset, which doesn't apply to kdump case. so the kernel has to be responsible for initialize the interrupt mode properly according the latest status of APIC in bootup path. An MP operating system is booted under either PIC mode or virtual wire mode. Later, the operating system switches to symmetric I/O mode as it enters multiprocessor mode. Two kinds of virtual wire mode are defined in Intel MP spec: virtual wire mode via local APIC or via I/O APIC. Now we determine the mode of APIC only through a SMP BIOS(MP table). That's not enough. It's better to do further check if APIC works with effective interrupt mode, and then, do some proper setting. Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/include/asm/io_apic.h | 5 arch/x86/kernel/apic/apic.c| 60 +- arch/x86/kernel/apic/io_apic.c | 28 3 files changed, 92 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h index 6cbf2cfb3f8a..a3257366bf7f 100644 --- a/arch/x86/include/asm/io_apic.h +++ b/arch/x86/include/asm/io_apic.h @@ -190,6 +190,7 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg) } extern void setup_IO_APIC(void); +extern bool virt_wire_through_ioapic(void); extern void enable_IO_APIC(void); extern void disable_IO_APIC(void); extern void setup_ioapic_dest(void); @@ -231,6 +232,10 @@ static inline void io_apic_init_mappings(void) { } #define native_disable_io_apic NULL static inline void setup_IO_APIC(void) { } +static inline bool virt_wire_through_ioapic(void) +{ + return false; +} static inline void enable_IO_APIC(void) { } static inline void setup_ioapic_dest(void) { } diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8e25b9b2d351..a3939fb130cc 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1124,6 +1124,58 @@ void __init sync_Arb_IDs(void) } /* + * Check APIC enable/disable flag + */ +static bool check_apic_enabled(void) +{ + unsigned int value; + + /* +* If APIC is disabled globally (IA32_APIC_BASE[11] == 0) +* the boot cpu hasn't X86_FEATURE_APIC, +* and init_bsp_APIC() has already checked it before. +* so no need to check global enable/disable flag here +*/ + + /* Check the software enable/disable flag */ + value = apic_read(APIC_SPIV); + if (!(value & APIC_SPIV_APIC_ENABLED)) + return false; + + return true; +} + +/* + * Return false means the through-local-APIC virtual wire mode is inactive + */ +static bool virt_wire_through_lapic(void) +{ + unsigned int value; + + /* +* The through-local-APIC virtual wire mode requests +* local APIC to enable LINT0 for ExtINT delivery mode +* and LINT1 for NMI delivery mode +*/ + value = apic_read(APIC_LVT0); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_EXTINT) + return false; + + value = apic_read(APIC_LVT1); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_NMI) + return false; + + return true; +} + +static bool check_virt_wire_mode(void) +{ + /* If neither of virtual wire mode is active, return false */ + return (check_apic_enabled() && (virt_wire_through_lapic() || + virt_wire_through_ioapic())); +} + +/* * An initial setup of the virtual wire mode. */ void __init init_bsp_APIC(void) @@ -1133,8 +1185,14 @@ void __init init_bsp_APIC(void) /* * Don't do the setup now if we have a SMP BIOS as the * through-I/O-APIC virtual wire mode might be active. +* +* It's better to do further check if either through-I/O-APIC +* or through-local-APIC is active. +* the worst case is that both of them are inactive, If so, +* we need to enable the through-local-APIC virtual wire mode */ - if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC)) + if (pic_mode || !boot_cpu_has(X86_FEATURE_APIC) || + (smp_found_config && check_virt_wire_mode())) return; /* diff --git a/arch/x86
[PATCH v2 0/3] Fix dump-capture kernel hangs with notsc
v2: Just about the commit ("x86/apic: Improved the setting of interrupt mode for bsp") - Unify the name s/virtual_wire_via_*/virt_wire_through_* - Add check for PIC mode suggested-by Baoquan He - Add check enable/disable flag for IO-APIC suggested-by Xunlei Pang - Update comments v1: The goal is to fix dump-capture kernel with notsc option hangs in calibrate_delay_converge() Wei Jiangang (3): x86/apic: Remove "focus disabled" for 64bit case x86/apic: Update comment about disabling processor focus x86/apic: Improved the setting of interrupt mode for bsp arch/x86/include/asm/io_apic.h | 5 arch/x86/kernel/apic/apic.c| 63 +++--- arch/x86/kernel/apic/io_apic.c | 28 +++ 3 files changed, 92 insertions(+), 4 deletions(-) -- 1.9.3
Re: [PATCH 3/3] x86/apic: Improved the setting of interrupt mode for bsp
On Mon, 2016-07-25 at 16:21 +0800, Xunlei Pang wrote: > On 2016/07/25 at 11:04, Wei, Jiangang wrote: > > Hi He, > > > > Thanks for your response firstly. > > > > On Fri, 2016-07-22 at 18:40 +0800, Baoquan He wrote: > >> Hi Jiangang, > >> > >> This is very nice, should be the stuff Eric and Ingo would like to see. > >> But I have several questions: > >> > >> 1) Are you not going to clean up the old legacy irq mode setting code in > >> 1st kernel? > > Yes, I would like to pay more attention on fixing kdump's failure with > > notsc. > > No plan to clean up the irq mode setting codes in the crash kernel > > reboot path. > > If you are interested in it, please go on. > > > >> 2)I call them legacy irq mode because not only apic virtual wire mode > >> exists, but the PIC mode in x86 32bit system. You need consider it too. > >> Then init_bsp_APIC need be renamaed to an appropriate one. And assume > >> this has been tested on x86 32bit system. > > Thanks for your reminders. > > As the comment of init_bsp_APIC(), it's just used to setup the virtual > > wire mode. > > so I won't change its name. > > > > But i agree with that PIC mode should be considered. > > Next version will be like below, > > > > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > > index 04358e0cf1e2..d40bab947a2a 100644 > > --- a/arch/x86/kernel/apic/apic.c > > +++ b/arch/x86/kernel/apic/apic.c > > @@ -1186,7 +1186,7 @@ void __init init_bsp_APIC(void) > > * the worst case is that both of them are inactive, > > * If so, We need to enable the virtual wire mode via > > through-local-APIC > > */ > > - if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC) > > + if ( pic_mode || (smp_found_config && check_virtual_wire_mode()) > > I think this is needless, as init_bsp_APIC() is made under the following > condition: > #if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC) Nope, I suggest you take a look at default_get_smp_config(). . #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86_32) if (mpf->feature2 & (1 << 7)) { pr_info("IMCR and PIC compatibility mode.\n"); pic_mode = 1; } else { pr_info("Virtual Wire compatibility mode.\n"); pic_mode = 0; } #endif ... if we declare CONFIG_X86_32 and CONFIG_X86_LOCAL_APIC, . init_bsp_APIC may be called. so check pic_mode is useful. > > > || > > !boot_cpu_has(X86_FEATURE_APIC)) > > return; > >> 3) > >> > >> *)About IO-APIC setting as virtual wire mode, I am always confused. In > >> MP Spec 3.6.2.2, it says "the interrupt signal passes through both the > >> I/O APIC and the BSP’s local APIC". That means IO-APIC virtual wire mode > >> need both IO-APIC and LAPIC to be set, and with my understanding only > >> pin of IO-APIC is set as ExtInt, LAPIC should be pass-through. > >> > >> *)But in Intel Arch manual 3A 10.1, there's only below sentences to mention > >> it: > >> > >> ~~~ > >> The I/O APIC also has a “virtual wire mode” that allows it to communicate > >> with a standard 8259A-style external interrupt controller. Note that the > >> local APIC can be disabled. This allows an associated processor core to > >> receive interrupts directly from an 8259A interrupt controller. > >> > >> > >> Eric's code in native_disable_io_apic() has the same point as above > >> words. > > IMO, > > the through-IO-APIC mode has no relationship with the setting of local > > APIC. > > that's why i only check the pin of IO-APIC in virtual_wire_via_ioapic(). > > For through-IO-APIC mode, I think the bsp local apic should be on, but its > LINT0 pin must not > be set to be "ExtINT"(I guess with this mode set will make it to be > through-LAPIC mode), and > we can clearly see the fact from disconnect_bsp_APIC()'s implementation. Yes, your opinion looks reasonable. both through-IO-APIC and through-local-APIC enable the software flag. /* For the spurious interrupt use vector F, and enable it */ value = apic_read(APIC_SPIV); value &= ~APIC_VECTOR_MASK; value |= APIC_SPIV_APIC_ENABLED; value |= 0xf; apic_write(APIC_SPI
Re: [PATCH 3/3] x86/apic: Improved the setting of interrupt mode for bsp
Hi He, Thanks for your response firstly. On Fri, 2016-07-22 at 18:40 +0800, Baoquan He wrote: > Hi Jiangang, > > This is very nice, should be the stuff Eric and Ingo would like to see. > But I have several questions: > > 1) Are you not going to clean up the old legacy irq mode setting code in > 1st kernel? Yes, I would like to pay more attention on fixing kdump's failure with notsc. No plan to clean up the irq mode setting codes in the crash kernel reboot path. If you are interested in it, please go on. > > 2)I call them legacy irq mode because not only apic virtual wire mode > exists, but the PIC mode in x86 32bit system. You need consider it too. > Then init_bsp_APIC need be renamaed to an appropriate one. And assume > this has been tested on x86 32bit system. Thanks for your reminders. As the comment of init_bsp_APIC(), it's just used to setup the virtual wire mode. so I won't change its name. But i agree with that PIC mode should be considered. Next version will be like below, diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 04358e0cf1e2..d40bab947a2a 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1186,7 +1186,7 @@ void __init init_bsp_APIC(void) * the worst case is that both of them are inactive, * If so, We need to enable the virtual wire mode via through-local-APIC */ - if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC) + if ( pic_mode || (smp_found_config && check_virtual_wire_mode()) || !boot_cpu_has(X86_FEATURE_APIC)) return; > > 3) > > *)About IO-APIC setting as virtual wire mode, I am always confused. In > MP Spec 3.6.2.2, it says "the interrupt signal passes through both the > I/O APIC and the BSP’s local APIC". That means IO-APIC virtual wire mode > need both IO-APIC and LAPIC to be set, and with my understanding only > pin of IO-APIC is set as ExtInt, LAPIC should be pass-through. > > *)But in Intel Arch manual 3A 10.1, there's only below sentences to mention > it: > > ~~~ > The I/O APIC also has a “virtual wire mode” that allows it to communicate > with a standard 8259A-style external interrupt controller. Note that the > local APIC can be disabled. This allows an associated processor core to > receive interrupts directly from an 8259A interrupt controller. > > > Eric's code in native_disable_io_apic() has the same point as above > words. IMO, the through-IO-APIC mode has no relationship with the setting of local APIC. that's why i only check the pin of IO-APIC in virtual_wire_via_ioapic(). Thanks wei > > *)However please read code comments in irq_remapping_disable_io_apic(), > Joerg's description give me a different impression that we can choose > to only use LAPIC virtual wire mode. Joerg is IOMMU maintainers, he is > very familiar with io-apic since IOMMU need take over io-apic entry > filling, there must be reason he wrote that. Add Joerg to CC list. > > Seems it's difficult to find a system with IO-APIC virtual wire mode, > maybe we can just keep it as is. Not sure if Intel engineers can help > explain and confirm this. > > That's all I can think of. > > Thanks > Baoquan > > On 07/22/16 at 04:10pm, Wei Jiangang wrote: > > If we specify the 'notsc' parameter for the dump-capture kernel, > > and then trigger a crash(panic) by using "ALT-SysRq-c" or > > "echo c > /proc/sysrq-trigger", the dump-capture kernel will > > hang in calibrate_delay_converge() and wait for jiffies changes. > > serial log as follows: > > > > tsc: Fast TSC calibration using PIT > > tsc: Detected 2099.947 MHz processor > > Calibrating delay loop... > > > > The reason for jiffies not changes is there's no timer interrupt > > passed to dump-capture kernel. > > > > In fact, once kernel panic occurs, the local APIC is disabled > > by lapic_shutdown() in reboot path. > > generly speaking, local APIC state can be initialized by BIOS > > after Power-Up or Reset, which doesn't apply to kdump case. > > so the kernel has to be responsible for initialize the interrupt > > mode properly according the latest status of APIC in bootup path. > > > > An MP operating system is booted under either PIC mode or > > virtual wire mode. Later, the operating system switches to > > symmetric I/O mode as it enters multiprocessor mode. > > Two kinds of virtual wire mode are defined in Intel MP spec: > > virtual
[PATCH 3/3] x86/apic: Improved the setting of interrupt mode for bsp
If we specify the 'notsc' parameter for the dump-capture kernel, and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > /proc/sysrq-trigger", the dump-capture kernel will hang in calibrate_delay_converge() and wait for jiffies changes. serial log as follows: tsc: Fast TSC calibration using PIT tsc: Detected 2099.947 MHz processor Calibrating delay loop... The reason for jiffies not changes is there's no timer interrupt passed to dump-capture kernel. In fact, once kernel panic occurs, the local APIC is disabled by lapic_shutdown() in reboot path. generly speaking, local APIC state can be initialized by BIOS after Power-Up or Reset, which doesn't apply to kdump case. so the kernel has to be responsible for initialize the interrupt mode properly according the latest status of APIC in bootup path. An MP operating system is booted under either PIC mode or virtual wire mode. Later, the operating system switches to symmetric I/O mode as it enters multiprocessor mode. Two kinds of virtual wire mode are defined in Intel MP spec: virtual wire mode via local APIC or via I/O APIC. Now we determine the mode of APIC only through a SMP BIOS(MP table). That's not enough. It's better to do further check if APIC works with effective mode, and do some porper setting. Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/include/asm/io_apic.h | 5 arch/x86/kernel/apic/apic.c| 55 +- arch/x86/kernel/apic/io_apic.c | 28 + 3 files changed, 87 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h index 6cbf2cfb3f8a..6550bd43fa39 100644 --- a/arch/x86/include/asm/io_apic.h +++ b/arch/x86/include/asm/io_apic.h @@ -190,6 +190,7 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg) } extern void setup_IO_APIC(void); +extern bool virtual_wire_via_ioapic(void); extern void enable_IO_APIC(void); extern void disable_IO_APIC(void); extern void setup_ioapic_dest(void); @@ -231,6 +232,10 @@ static inline void io_apic_init_mappings(void) { } #define native_disable_io_apic NULL static inline void setup_IO_APIC(void) { } +static inline bool virtual_wire_via_ioapic(void) +{ + return true; +} static inline void enable_IO_APIC(void) { } static inline void setup_ioapic_dest(void) { } diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8e25b9b2d351..04358e0cf1e2 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1124,6 +1124,53 @@ void __init sync_Arb_IDs(void) } /* + * Return false means the virtual wire mode through-local-apic is inactive + */ +static bool virtual_wire_via_lapic(void) +{ + unsigned int value; + + /* Check the APIC global enable/disable flag firstly */ + if (boot_cpu_data.x86 >= 6) { + u32 h, l; + + rdmsr(MSR_IA32_APICBASE, l, h); + /* +* If local APIC is disabled by BIOS +* do nothing, but return true +*/ + if (!(l & MSR_IA32_APICBASE_ENABLE)) + return true; + } + + /* Check the software enable/disable flag */ + value = apic_read(APIC_SPIV); + if (!(value & APIC_SPIV_APIC_ENABLED)) + return false; + + /* +* Virtual wire mode via local APIC requests +* APIC to enable the LINT0 for edge-trggered ExtINT delivery mode +* and LINT1 for level-triggered NMI delivery mode +*/ + value = apic_read(APIC_LVT0); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_EXTINT) + return false; + + value = apic_read(APIC_LVT1); + if (GET_APIC_DELIVERY_MODE(value) != APIC_MODE_NMI) + return false; + + return true; +} + +static bool check_virtual_wire_mode(void) +{ + /* Neither of virtual wire mode is active, return false */ + return virtual_wire_via_lapic() || virtual_wire_via_ioapic(); +} + +/* * An initial setup of the virtual wire mode. */ void __init init_bsp_APIC(void) @@ -1133,8 +1180,14 @@ void __init init_bsp_APIC(void) /* * Don't do the setup now if we have a SMP BIOS as the * through-I/O-APIC virtual wire mode might be active. +* +* It's better to do further check if either through-I/O-APIC +* or through-local-APIC is active. +* the worst case is that both of them are inactive, +* If so, We need to enable the virtual wire mode via through-local-APIC */ - if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC)) + if ((smp_found_config && check_virtual_wire_mode()) || + !boot_cpu_has(X86_FEATURE_APIC)) return; /* diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel
[PATCH 2/3] x86/apic: Update comment about disabling processor focus
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 0273b652c689..8e25b9b2d351 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1346,7 +1346,6 @@ void setup_local_APIC(void) * Actually disabling the focus CPU check just makes the hang less * frequent as it makes the interrupt distributon model be more * like LRU than MRU (the short-term load is more even across CPUs). -* See also the comment in end_level_ioapic_irq(). --macro */ /* -- 1.9.3
[PATCH 1/3] x86/apic: Remove "focus disabled" for 64bit case
Disable processor focus for 64bit causes a crash, Call Trace as following: [] dump_stack+0x63/0x84 [] __warn+0xd1/0xf0 [] warn_slowpath_fmt+0x5f/0x80 [] ex_handler_wrmsr_unsafe+0x62/0x70 [] fixup_exception+0x39/0x50 [] do_general_protection+0x80/0x160 [] general_protection+0x28/0x30 [] ? native_write_msr+0x4/0x30 [] ? native_apic_msr_write+0x32/0x40 [] init_bsp_APIC+0x5f/0x118 [] init_ISA_irqs+0x19/0x4c [] native_init_IRQ+0xd/0x377 [] init_IRQ+0x42/0x49 [] start_kernel+0x2ce/0x4c8 [] ? set_init_arg+0x55/0x55 [] ? early_idt_handler_array+0x120/0x120 [] x86_64_start_reservations+0x2f/0x31 [] x86_64_start_kernel+0x14c/0x16f Keep a consistent implementation with the setup_local_APIC(), always use processor focus for 64bit. more details refer to commit 89c38c2867eb ("x86: apic - unify setup_local_APIC") Signed-off-by: Cao jin Signed-off-by: Wei Jiangang --- arch/x86/kernel/apic/apic.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a67d7e3..0273b652c689 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -1154,9 +1154,7 @@ void __init init_bsp_APIC(void) if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && (boot_cpu_data.x86 == 15)) value &= ~APIC_SPIV_FOCUS_DISABLED; - else #endif - value |= APIC_SPIV_FOCUS_DISABLED; value |= SPURIOUS_APIC_VECTOR; apic_write(APIC_SPIV, value); -- 1.9.3
Re: [PATCH 0/3] Enable legacy irq mode before jump to kexec/kdump kernel
Hi Baoquan He, Well, Indeed there‘s a relationship between the dump-capture hangs in calibrate_delay_converge() and the interrupt mode. but there‘s no essential difference between your patches and mine that calls disable_IO_APIC() again. Actually, disable_IO_APIC will set APIC to virtual wire mode. In fact, Eric and Ingo suggested that "it should be fixed in the bootup path of the dump kernel, not the crash kernel reboot path", which is convincing and reasonable. And i find a better method can fix the problem. It's better to set virtual wire mode for apic in init_bsp_APIC(), which in the bootup path of dump kernel. But now, init_bsp_APIC doesn't initialize the apic to vitual wire mode when smp_found_config is non-zero. FYI, I'm working on this point. later i will send patches to mail list. Wei On Wed, 2016-07-20 at 10:58 +0800, Baoquan He wrote: > Wei Jiangang reported kdump kernel always hang when "notsc" is specified > in boot parameter. After debugging I found there's no timer interrupt > in the current kexec/kdump kernel. This is caused by commit 522e66464467 > ("x86/apic: Disable I/O APIC before shutdown of the local APIC"). Originally > Eric posted below patch to make system be virtual wire mode in which 8259- > equivalent PIC fields all interrupts and the LAPIC becomes a virtual wire. > Like this interrupts can be delivered from PIC to CPU via the LAPIC's local > interrupt 0 (LINTIN0). In virtual wire APIC mode is disabled while LAPIC > is software enabled and its LINT0 and LINT1 need be programmed specifically. > > https://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm1/broken-out/x86_64-apic-virtwire-on-shutdown.patch > > But with commit 522e66464 you can see after disable_IO_APIC had setting > virtual wire mode, lapic_shutdown disabled LAPIC again. Now virtual wire > mode doesn't work, then it cause no timer interrupt during kdump kernel > initialization stage until system enter into APIC mode. > > So people may be wondering why only kdump kernel hang, the normal kernel > with "notsc" can still work. This is because BIOS has already built PIC mode > or virtual wire mode while kexec/kdump kernel doesn't go through BIOS > initialization. That is why we have to change system to be PIC mode or > virtual wire mode before jump to kexec/kdump kernel. > > Then why kdump kernel didn't hang when "notsc" is not specified. This is > because tsc_init will assign the already calibrated value to lpj_fine. > Then kernel doesn't need to count cpu loops between jiffies with the help > of timer interrupt. So "notsc" is not victim, but a informer. > > In patch 1/3 disable_IO_APIC is changed to only contain code of changeing > system to be PIC mode or virtual wire mode and is renamed as > switch_to_legacy_irq_mode. Now only call clear_IO_APIC where IO-APIC need > be disabled, and call switch_to_legacy_irq_mode before jump to kexe/kdump > kernel. > > Patch 2/3 and 3/3 are clean up patch. > > Baoquan He (3): > x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump > kernel > x86/apic: Clean up the names of legacy irq mode setting related > functions > x86/apic: Clean up the apic delivery mode macro definition > > arch/x86/include/asm/apic.h| 2 +- > arch/x86/include/asm/apicdef.h | 1 - > arch/x86/include/asm/io_apic.h | 6 +++--- > arch/x86/kernel/apic/apic.c| 19 +++ > arch/x86/kernel/apic/io_apic.c | 32 +--- > arch/x86/kernel/crash.c| 2 +- > arch/x86/kernel/machine_kexec_32.c | 15 +-- > arch/x86/kernel/machine_kexec_64.c | 15 +-- > arch/x86/kernel/reboot.c | 2 +- > arch/x86/kernel/x86_init.c | 2 +- > drivers/iommu/irq_remapping.c | 2 +- > 11 files changed, 46 insertions(+), 52 deletions(-) >
[tip:x86/apic] x86/apic: Remove the unused struct apic::apic_id_mask field
Commit-ID: 102bb9fef68a21f357dc813d4792666c8295bc35 Gitweb: http://git.kernel.org/tip/102bb9fef68a21f357dc813d4792666c8295bc35 Author: Wei Jiangang AuthorDate: Thu, 14 Jul 2016 10:24:06 +0800 Committer: Ingo Molnar CommitDate: Fri, 15 Jul 2016 10:39:05 +0200 x86/apic: Remove the unused struct apic::apic_id_mask field The only user verify_local_APIC() had been removed by commit: 4399c03c6780 ("x86/apic: Remove verify_local_APIC()") ... so there is no need to keep it. Signed-off-by: Wei Jiangang Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: boris.ostrov...@oracle.com Cc: b...@redhat.com Cc: david.vra...@citrix.com Cc: jgr...@suse.com Cc: konrad.w...@oracle.com Cc: xen-de...@lists.xenproject.org Link: http://lkml.kernel.org/r/1468463046-20849-1-git-send-email-weijg.f...@cn.fujitsu.com Signed-off-by: Ingo Molnar --- arch/x86/include/asm/apic.h | 1 - arch/x86/kernel/apic/apic_flat_64.c | 2 -- arch/x86/kernel/apic/apic_noop.c | 1 - arch/x86/kernel/apic/apic_numachip.c | 2 -- arch/x86/kernel/apic/bigsmp_32.c | 1 - arch/x86/kernel/apic/probe_32.c | 1 - arch/x86/kernel/apic/x2apic_cluster.c | 1 - arch/x86/kernel/apic/x2apic_phys.c| 1 - arch/x86/kernel/apic/x2apic_uv_x.c| 1 - arch/x86/xen/apic.c | 1 - 10 files changed, 12 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index bc27611..f5befd4 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -300,7 +300,6 @@ struct apic { unsigned int (*get_apic_id)(unsigned long x); unsigned long (*set_apic_id)(unsigned int id); - unsigned long apic_id_mask; int (*cpu_mask_to_apicid_and)(const struct cpumask *cpumask, const struct cpumask *andmask, diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c index 76f89e2..0487477 100644 --- a/arch/x86/kernel/apic/apic_flat_64.c +++ b/arch/x86/kernel/apic/apic_flat_64.c @@ -181,7 +181,6 @@ static struct apic apic_flat = { .get_apic_id= flat_get_apic_id, .set_apic_id= set_apic_id, - .apic_id_mask = 0xFFu << 24, .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, @@ -278,7 +277,6 @@ static struct apic apic_physflat = { .get_apic_id= flat_get_apic_id, .set_apic_id= set_apic_id, - .apic_id_mask = 0xFFu << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/apic_noop.c b/arch/x86/kernel/apic/apic_noop.c index 13d19ed..2cebf59 100644 --- a/arch/x86/kernel/apic/apic_noop.c +++ b/arch/x86/kernel/apic/apic_noop.c @@ -141,7 +141,6 @@ struct apic apic_noop = { .get_apic_id= noop_get_apic_id, .set_apic_id= NULL, - .apic_id_mask = 0x0F << 24, .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c index ab5c2c6..714d4fd 100644 --- a/arch/x86/kernel/apic/apic_numachip.c +++ b/arch/x86/kernel/apic/apic_numachip.c @@ -269,7 +269,6 @@ static const struct apic apic_numachip1 __refconst = { .get_apic_id= numachip1_get_apic_id, .set_apic_id= numachip1_set_apic_id, - .apic_id_mask = 0xffU << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, @@ -321,7 +320,6 @@ static const struct apic apic_numachip2 __refconst = { .get_apic_id= numachip2_get_apic_id, .set_apic_id= numachip2_set_apic_id, - .apic_id_mask = 0xffU << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c index cf9bd89..06dbaa4 100644 --- a/arch/x86/kernel/apic/bigsmp_32.c +++ b/arch/x86/kernel/apic/bigsmp_32.c @@ -171,7 +171,6 @@ static struct apic apic_bigsmp = { .get_apic_id= bigsmp_get_apic_id, .set_apic_id= NULL, - .apic_id_mask = 0xFF << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c index f316e34..93edfa0 100644 --- a/arch/x86/kernel/apic/probe_32.c +++ b/arch/x86/kernel/apic/probe_32.c @@ -101,7 +101,6 @@ static struct apic apic_default = { .get_apic_id= default_get_apic_id, .set_apic_id= NULL, - .apic_id_mask =
[tip:x86/timers] x86/tsc: Remove the unused check_tsc_disabled()
Commit-ID: c48ec42d6eae08f55685ab660f0743ed33b9f22a Gitweb: http://git.kernel.org/tip/c48ec42d6eae08f55685ab660f0743ed33b9f22a Author: Wei Jiangang AuthorDate: Fri, 15 Jul 2016 16:12:10 +0800 Committer: Ingo Molnar CommitDate: Fri, 15 Jul 2016 10:35:08 +0200 x86/tsc: Remove the unused check_tsc_disabled() check_tsc_disabled() was introduced by commit: c73deb6aecda ("perf/x86: Add ability to calculate TSC from perf sample timestamps") The only caller was arch_perf_update_userpage(), which had been refactored by commit: d8b11a0cbd1c ("perf/x86: Clean up cap_user_time* setting") ... so no need keep and export it any more. Signed-off-by: Wei Jiangang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: a.p.zijls...@chello.nl Cc: adrian.hun...@intel.com Cc: b...@suse.de Link: http://lkml.kernel.org/r/1468570330-25810-1-git-send-email-weijg.f...@cn.fujitsu.com Signed-off-by: Ingo Molnar --- arch/x86/include/asm/tsc.h | 1 - arch/x86/kernel/tsc.c | 6 -- 2 files changed, 7 deletions(-) diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h index a30591e..33b6365 100644 --- a/arch/x86/include/asm/tsc.h +++ b/arch/x86/include/asm/tsc.h @@ -35,7 +35,6 @@ extern void tsc_init(void); extern void mark_tsc_unstable(char *reason); extern int unsynchronized_tsc(void); extern int check_tsc_unstable(void); -extern int check_tsc_disabled(void); extern unsigned long native_calibrate_cpu(void); extern unsigned long native_calibrate_tsc(void); extern unsigned long long native_sched_clock_from_tsc(u64 tsc); diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 2a952fc..a804b5a 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -335,12 +335,6 @@ int check_tsc_unstable(void) } EXPORT_SYMBOL_GPL(check_tsc_unstable); -int check_tsc_disabled(void) -{ - return tsc_disabled; -} -EXPORT_SYMBOL_GPL(check_tsc_disabled); - #ifdef CONFIG_X86_TSC int __init notsc_setup(char *str) {
Re: [PATCH] x86/tsc: remove the unused check_tsc_disabled()
On Fri, 2016-07-15 at 10:34 +0200, Ingo Molnar wrote: > * Wei Jiangang wrote: > > > check_tsc_disabled() was introduced by commit c73deb6aecda ("perf/x86: > > Add ability to calculate TSC from perf sample timestamps"). > > The only caller arch_perf_update_userpage() had been refactored > > by commit fa9cbf320e99 ("perf/x86: Move perf_event.c ... > > => x86/events/core.c"). > > so no need keep and export it any more. > > No, it wasn't refactored by fa9cbf320e99, but by: > > d8b11a0cbd1c perf/x86: Clean up cap_user_time* setting > > I've fixed the changelog. Thanks for your correction and quick reply. wei > > Thanks, > > Ingo > >
[PATCH] x86/tsc: remove the unused check_tsc_disabled()
check_tsc_disabled() was introduced by commit c73deb6aecda ("perf/x86: Add ability to calculate TSC from perf sample timestamps"). The only caller arch_perf_update_userpage() had been refactored by commit fa9cbf320e99 ("perf/x86: Move perf_event.c ... => x86/events/core.c"). so no need keep and export it any more. Signed-off-by: Wei Jiangang --- arch/x86/include/asm/tsc.h | 1 - arch/x86/kernel/tsc.c | 6 -- 2 files changed, 7 deletions(-) diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h index 7428697c5b8d..a2828127afe0 100644 --- a/arch/x86/include/asm/tsc.h +++ b/arch/x86/include/asm/tsc.h @@ -35,7 +35,6 @@ extern void tsc_init(void); extern void mark_tsc_unstable(char *reason); extern int unsynchronized_tsc(void); extern int check_tsc_unstable(void); -extern int check_tsc_disabled(void); extern unsigned long native_calibrate_tsc(void); extern unsigned long long native_sched_clock_from_tsc(u64 tsc); diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 38ba6de56ede..67ca51438af2 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -335,12 +335,6 @@ int check_tsc_unstable(void) } EXPORT_SYMBOL_GPL(check_tsc_unstable); -int check_tsc_disabled(void) -{ - return tsc_disabled; -} -EXPORT_SYMBOL_GPL(check_tsc_disabled); - #ifdef CONFIG_X86_TSC int __init notsc_setup(char *str) { -- 1.9.3
[PATCH] apic: remove the unused field apic_id_mask
The only user verify_local_APIC() had been removed by commit 4399c03c6780 ("x86/apic: Remove verify_local_APIC()"), so no need to keep it. Signed-off-by: Wei Jiangang --- arch/x86/include/asm/apic.h | 1 - arch/x86/kernel/apic/apic_flat_64.c | 2 -- arch/x86/kernel/apic/apic_noop.c | 1 - arch/x86/kernel/apic/apic_numachip.c | 2 -- arch/x86/kernel/apic/bigsmp_32.c | 1 - arch/x86/kernel/apic/probe_32.c | 1 - arch/x86/kernel/apic/x2apic_cluster.c | 1 - arch/x86/kernel/apic/x2apic_phys.c| 1 - arch/x86/kernel/apic/x2apic_uv_x.c| 1 - arch/x86/xen/apic.c | 1 - 10 files changed, 12 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index bc27611fa58f..f5befd4945f2 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -300,7 +300,6 @@ struct apic { unsigned int (*get_apic_id)(unsigned long x); unsigned long (*set_apic_id)(unsigned int id); - unsigned long apic_id_mask; int (*cpu_mask_to_apicid_and)(const struct cpumask *cpumask, const struct cpumask *andmask, diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c index 76f89e2b245a..048747778d37 100644 --- a/arch/x86/kernel/apic/apic_flat_64.c +++ b/arch/x86/kernel/apic/apic_flat_64.c @@ -181,7 +181,6 @@ static struct apic apic_flat = { .get_apic_id= flat_get_apic_id, .set_apic_id= set_apic_id, - .apic_id_mask = 0xFFu << 24, .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, @@ -278,7 +277,6 @@ static struct apic apic_physflat = { .get_apic_id= flat_get_apic_id, .set_apic_id= set_apic_id, - .apic_id_mask = 0xFFu << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/apic_noop.c b/arch/x86/kernel/apic/apic_noop.c index 13d19ed58514..2cebf59092d8 100644 --- a/arch/x86/kernel/apic/apic_noop.c +++ b/arch/x86/kernel/apic/apic_noop.c @@ -141,7 +141,6 @@ struct apic apic_noop = { .get_apic_id= noop_get_apic_id, .set_apic_id= NULL, - .apic_id_mask = 0x0F << 24, .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c index ab5c2c685a3c..714d4fda0d52 100644 --- a/arch/x86/kernel/apic/apic_numachip.c +++ b/arch/x86/kernel/apic/apic_numachip.c @@ -269,7 +269,6 @@ static const struct apic apic_numachip1 __refconst = { .get_apic_id= numachip1_get_apic_id, .set_apic_id= numachip1_set_apic_id, - .apic_id_mask = 0xffU << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, @@ -321,7 +320,6 @@ static const struct apic apic_numachip2 __refconst = { .get_apic_id= numachip2_get_apic_id, .set_apic_id= numachip2_set_apic_id, - .apic_id_mask = 0xffU << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c index cf9bd896c12d..06dbaa458bfe 100644 --- a/arch/x86/kernel/apic/bigsmp_32.c +++ b/arch/x86/kernel/apic/bigsmp_32.c @@ -171,7 +171,6 @@ static struct apic apic_bigsmp = { .get_apic_id= bigsmp_get_apic_id, .set_apic_id= NULL, - .apic_id_mask = 0xFF << 24, .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c index f316e34abb42..93edfa01b408 100644 --- a/arch/x86/kernel/apic/probe_32.c +++ b/arch/x86/kernel/apic/probe_32.c @@ -101,7 +101,6 @@ static struct apic apic_default = { .get_apic_id= default_get_apic_id, .set_apic_id= NULL, - .apic_id_mask = 0x0F << 24, .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index aca8b75c1552..24170d0809ba 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -270,7 +270,6 @@ static struct apic apic_x2apic_cluster = { .get_apic_id= x2apic_get_apic_id, .set_apic_id= x2apic_set_apic_id, - .apic_id_mask = 0xu, .cpu_mask_to_apicid_and = x2apic_cpu_mask_to_apicid_and, diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/a
Re: [Xen-devel] [PATCH] xen/apic: Update the comment for apic_id_mask
On Thu, 2016-07-07 at 11:37 -0400, Boris Ostrovsky wrote: > On 07/07/2016 11:25 AM, Konrad Rzeszutek Wilk wrote: > > On Thu, Jul 07, 2016 at 11:28:18AM +0800, Wei Jiangang wrote: > >> verify_local_APIC() had been removed by > >> commit 4399c03c6780 ("x86/apic: Remove verify_local_APIC()"), > >> so apic_id_mask isn't used by it. > > Is anyone actually using this field? It looks like 4399c03c6780 removed > the only user. Indeed, the field is useless. Maybe we can remove this field from the struct apic . what's your opinion? Thanks, wei > > -boris > > > > CC-ing the proper maintainers. > >> Signed-off-by: Wei Jiangang > >> --- > >> arch/x86/xen/apic.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c > >> index db52a7fafcc2..9cbb1f48381b 100644 > >> --- a/arch/x86/xen/apic.c > >> +++ b/arch/x86/xen/apic.c > >> @@ -177,7 +177,7 @@ static struct apic xen_pv_apic = { > >> > >>.get_apic_id= xen_get_apic_id, > >>.set_apic_id= xen_set_apic_id, /* Can be NULL on > >> 32-bit. */ > >> - .apic_id_mask = 0xFF << 24, /* Used by > >> verify_local_APIC. Match with what xen_get_apic_id does. */ > >> + .apic_id_mask = 0xFF << 24, /* Match with what > >> xen_get_apic_id does. */ > >> > >>.cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, > >> > >> -- > >> 1.9.3 > >> > >> > >> > >> > >> ___ > >> Xen-devel mailing list > >> xen-de...@lists.xen.org > >> https://lists.xen.org/xen-devel > > > >
Re: [PATCH v2] kexec: Fix kdump failure with notsc
On Mon, 2016-07-11 at 18:28 +0800, Wei Jiangang wrote: > Hi , Ingo > > On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote: > > * Eric W. Biederman wrote: > > > > > Sigh. Can we please just do the work to rip out the apic shutdown code > > > from the > > > kexec on panic code path? > > > > > > I forgetting details but the only reason we have do any apic shutdown is > > > bugs in > > > older kernels that could not initialize a system properly if we did not > > > shut > > > down the apics. > > > > > > I certainly don't see an issue with goofy cases like notsc not working on > > > a > > > crash capture kernel if we are not initializing the hardware properly. > > > > > > The strategy really needs to be to only do the absolutely essential > > > hardware > > > shutdown in the crashing kernel, every adintional line of code we execute > > > in the > > > crashing kernel increases our chances of hitting a bug. > > > > Fully agreed. > > > > > Under that policy things like requring we don't pass boot options that > > > inhibit > > > the dump catpure kernel from initializing the hardware from a random > > > state are > > > reasonable requirements. AKA I don't see any justification in this as to > > > why we > > > would even want to support notsc on the dump capture kernel. Especially > > > when > > > things clearly work when that option is not specified. > > > > So at least on the surface it appears 'surprising' that the 'notsc' option > > (which, > > supposedly, disables TSC handling) interferes with being able to fully > > boot. Even > > if 'notsc' is specified we are still using the local APIC, right? > > In most case, It's no problem that using local APIC while notsc is > specified. > But not for kdump. > > We can get evidence, Especially from "Spurious LAPIC timer interrupt on > cpu 0". > > ###serial log, > > [0.00] NR_IRQS:524544 nr_irqs:256 16 > [0.00] Spurious LAPIC timer interrupt on cpu 0 > [0.00] Console: colour dummy device 80x25 > [0.00] console [tty0] enabled > [0.00] console [ttyS0] enabled > [0.00] tsc: Fast TSC calibration using PIT > [0.00] tsc: Detected 2099.947 MHz processor > [0.00] Calibrating delay loop... > > > Due to the local apic and local apic timer hasn't been setup and enabled > fully, The event_handler of clock event is NULL. > > ###codes, > > static void local_apic_timer_interrupt(void) > { > int cpu = smp_processor_id(); > struct clock_event_device *evt = &per_cpu(lapic_events, cpu); > > /* > * Normally we should not be here till LAPIC has been initialized > but > * in some cases like kdump, its possible that there is a pending > LAPIC > * timer interrupt from previous kernel's context and is delivered > in > * new kernel the moment interrupts are enabled. > * > * Interrupts are enabled early and LAPIC is setup much later, hence > * its possible that when we get here evt->event_handler is NULL. > * Check for event_handler being NULL and discard the interrupt as > * spurious. > */ > if (!evt->event_handler) { > pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); > /* Switch it off */ > lapic_timer_shutdown(evt); > return; > } > > . > } > > > IMHO, > If we specify notsc, the dump-capture kernel waits for jiffies being > updated early and LAPIC and timer are setup much later, which causes no > timer interrupts is passed to BSP. as following, > > start_kernel --> > 1)-> calibrate_delay() -> calibrate_delay_converge() # hang and wait > for jiffies changed > > 2)-> rest_init() -> kernel_init() -> -> apic_bsp_setup() -> > setup_local_APIC() > > -> setup_percpu_clockev(). > > the setup_percpu_clockev points setup_boot_APIC_clock() which used to > setup the boot APIC and timer. > > > > So it might be a good idea to find the root cause of this bootup fragility > > even if > > 'notsc' is specified. And I fully agree that it should be fixed in the > > bootup path > > of the dump kernel, not the crash kernel reboot path. > Can anyone give some advice or commet on the following idea? Thanks in advance. > Becaus
Re: [PATCH v2] kexec: Fix kdump failure with notsc
On Tue, 2016-07-12 at 16:21 +0800, Baoquan He wrote: > On 07/12/16 at 02:52pm, Xunlei Pang wrote: > > On 2016/07/07 at 18:17, Wei Jiangang wrote: > > > Signed-off-by: Wei Jiangang > > > --- > > > +/* Local APIC is disabled by the kernel for crash or reboot path */ > > > +static int disabled_local_apic; > > > + > > > /* > > > * Knob to control our willingness to enable the local APIC. > > > * > > > @@ -1097,10 +1100,16 @@ void lapic_shutdown(void) > > > #endif > > > disable_local_APIC(); > > > > > > + disabled_local_apic = 1; > > > > > > local_irq_restore(flags); > > > } > > > > > > +int lapic_disabled(void) > > > +{ > > > + return disabled_local_apic; > > > +} > > > + > > > /** > > > * sync_Arb_IDs - synchronize APIC bus arbitration IDs > > > */ > > > diff --git a/arch/x86/kernel/machine_kexec_32.c > > > b/arch/x86/kernel/machine_kexec_32.c > > > index 469b23d6acc2..c934a7868e6b 100644 > > > --- a/arch/x86/kernel/machine_kexec_32.c > > > +++ b/arch/x86/kernel/machine_kexec_32.c > > > @@ -202,14 +202,13 @@ void machine_kexec(struct kimage *image) > > > local_irq_disable(); > > > hw_breakpoint_disable(); > > > > > > - if (image->preserve_context) { > > > + if (image->preserve_context || lapic_disabled()) { > > > #ifdef CONFIG_X86_IO_APIC > > > /* > > >* We need to put APICs in legacy mode so that we can > > >* get timer interrupts in second kernel. kexec/kdump > > >* paths already have calls to disable_IO_APIC() in > > > - * one form or other. kexec jump path also need > > > - * one. > > > + * one form or other. kexec jump path also need one. > > >*/ > > > disable_IO_APIC(); > > > > Hi Wei, > > > > As the comment says, kexec/kdump paths already have disable_IO_APIC(), why > > again here? > > I also have this question. I guess Jiangang didn't post his modification > correctly. He should remove calling of disable_IO_APIC in > native_machine_crash_shutdown(). Assume his test was done on correct > code change. Hi he, Thanks for your suggestion firstly. In fact, Eric had suggested me to do that last week (https://lkml.org/lkml/2016/7/8/14). And i had a try to remove the calling of disable_IO_APIC in native_machine_crash_shutdown(). But it doesn't work for kdump with notsc. Thanks, wei > > > > > Regards, > > Xunlei > > > > > #endif > > > diff --git a/arch/x86/kernel/machine_kexec_64.c > > > b/arch/x86/kernel/machine_kexec_64.c > > > index 5a294e48b185..d3598cdd6437 100644 > > > --- a/arch/x86/kernel/machine_kexec_64.c > > > +++ b/arch/x86/kernel/machine_kexec_64.c > > > @@ -23,6 +23,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > #include > > > #include > > > #include > > > @@ -269,14 +270,13 @@ void machine_kexec(struct kimage *image) > > > local_irq_disable(); > > > hw_breakpoint_disable(); > > > > > > - if (image->preserve_context) { > > > + if (image->preserve_context || lapic_disabled()) { > > > #ifdef CONFIG_X86_IO_APIC > > > /* > > >* We need to put APICs in legacy mode so that we can > > >* get timer interrupts in second kernel. kexec/kdump > > >* paths already have calls to disable_IO_APIC() in > > > - * one form or other. kexec jump path also need > > > - * one. > > > + * one form or other. kexec jump path also need one. > > >*/ > > > disable_IO_APIC(); > > > #endif > > > > > > ___ > > kexec mailing list > > ke...@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/kexec > >
Re: [PATCH v2] kexec: Fix kdump failure with notsc
On Tue, 2016-07-12 at 14:52 +0800, Xunlei Pang wrote: > On 2016/07/07 at 18:17, Wei Jiangang wrote: > > If we specify the 'notsc' boot parameter for the dump-capture kernel, > > and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > > > /proc/sysrq-trigger", > > the dump-capture kernel will hang in calibrate_delay_converge(): > > > > /* wait for "start of" clock tick */ > > ticks = jiffies; > > while (ticks == jiffies) > > ; /* nothing */ > > > > serial log of the hang is as follows: > > > > tsc: Fast TSC calibration using PIT > > tsc: Detected 2099.947 MHz processor > > Calibrating delay loop... > > > > The reason is that the dump-capture kernel hangs in while loops and > > waits for jiffies to be updated, but no timer interrupts is passed > > to BSP by APIC. > > > > In fact, the local APIC was disabled in reboot and crash path by > > lapic_shutdown(). We need to put APIC in legacy mode in kexec jump path > > (put the system into PIT during the crash kernel), > > so that the dump-capture kernel can get timer interrupts. > > > > BTW, > > I found the buggy commit 522e66464467 ("x86/apic: Disable I/O APIC > > before shutdown of the local APIC") via bisection. > > > > Originally, I want to revert it. > > But Ingo Molnar comments that "By reverting the change can paper over > > the bug, but re-introduce the bug that can result in certain CPUs hanging > > if IO-APIC sends an APIC message if the lapic is disabled prematurely" > > And I think it's pertinent. > > > > Signed-off-by: Wei Jiangang > > --- > > arch/x86/include/asm/apic.h| 5 + > > arch/x86/kernel/apic/apic.c| 9 + > > arch/x86/kernel/machine_kexec_32.c | 5 ++--- > > arch/x86/kernel/machine_kexec_64.c | 6 +++--- > > 4 files changed, 19 insertions(+), 6 deletions(-) > > > > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > > index bc27611fa58f..5d7e635e580a 100644 > > --- a/arch/x86/include/asm/apic.h > > +++ b/arch/x86/include/asm/apic.h > > @@ -128,6 +128,7 @@ extern void clear_local_APIC(void); > > extern void disconnect_bsp_APIC(int virt_wire_setup); > > extern void disable_local_APIC(void); > > extern void lapic_shutdown(void); > > +extern int lapic_disabled(void); > > extern void sync_Arb_IDs(void); > > extern void init_bsp_APIC(void); > > extern void setup_local_APIC(void); > > @@ -165,6 +166,10 @@ extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 > > msg_type, u8 mask); > > > > #else /* !CONFIG_X86_LOCAL_APIC */ > > static inline void lapic_shutdown(void) { } > > +static inline int lapic_disabled(void) > > +{ > > + return 0; > > +} > > #define local_apic_timer_c2_ok 1 > > static inline void init_apic_mappings(void) { } > > static inline void disable_local_APIC(void) { } > > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > > index 60078a67d7e3..d1df250994bb 100644 > > --- a/arch/x86/kernel/apic/apic.c > > +++ b/arch/x86/kernel/apic/apic.c > > @@ -133,6 +133,9 @@ static inline void imcr_apic_to_pic(void) > > } > > #endif > > > > +/* Local APIC is disabled by the kernel for crash or reboot path */ > > +static int disabled_local_apic; > > + > > /* > > * Knob to control our willingness to enable the local APIC. > > * > > @@ -1097,10 +1100,16 @@ void lapic_shutdown(void) > > #endif > > disable_local_APIC(); > > > > + disabled_local_apic = 1; > > > > local_irq_restore(flags); > > } > > > > +int lapic_disabled(void) > > +{ > > + return disabled_local_apic; > > +} > > + > > /** > > * sync_Arb_IDs - synchronize APIC bus arbitration IDs > > */ > > diff --git a/arch/x86/kernel/machine_kexec_32.c > > b/arch/x86/kernel/machine_kexec_32.c > > index 469b23d6acc2..c934a7868e6b 100644 > > --- a/arch/x86/kernel/machine_kexec_32.c > > +++ b/arch/x86/kernel/machine_kexec_32.c > > @@ -202,14 +202,13 @@ void machine_kexec(struct kimage *image) > > local_irq_disable(); > > hw_breakpoint_disable(); > > > > - if (image->preserve_context) { > > + if (image->preserve_context || lapic_disabled()) { > > #ifdef CONFIG_X86_IO_APIC > > /* > > * We need to put APICs in legacy mode so that we can > >
Re: [PATCH v2] kexec: Fix kdump failure with notsc
Hi , Ingo On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote: > * Eric W. Biederman wrote: > > > Sigh. Can we please just do the work to rip out the apic shutdown code > > from the > > kexec on panic code path? > > > > I forgetting details but the only reason we have do any apic shutdown is > > bugs in > > older kernels that could not initialize a system properly if we did not > > shut > > down the apics. > > > > I certainly don't see an issue with goofy cases like notsc not working on a > > crash capture kernel if we are not initializing the hardware properly. > > > > The strategy really needs to be to only do the absolutely essential > > hardware > > shutdown in the crashing kernel, every adintional line of code we execute > > in the > > crashing kernel increases our chances of hitting a bug. > > Fully agreed. > > > Under that policy things like requring we don't pass boot options that > > inhibit > > the dump catpure kernel from initializing the hardware from a random state > > are > > reasonable requirements. AKA I don't see any justification in this as to > > why we > > would even want to support notsc on the dump capture kernel. Especially > > when > > things clearly work when that option is not specified. > > So at least on the surface it appears 'surprising' that the 'notsc' option > (which, > supposedly, disables TSC handling) interferes with being able to fully boot. > Even > if 'notsc' is specified we are still using the local APIC, right? In most case, It's no problem that using local APIC while notsc is specified. But not for kdump. We can get evidence, Especially from "Spurious LAPIC timer interrupt on cpu 0". ###serial log, [0.00] NR_IRQS:524544 nr_irqs:256 16 [0.00] Spurious LAPIC timer interrupt on cpu 0 [0.00] Console: colour dummy device 80x25 [0.00] console [tty0] enabled [0.00] console [ttyS0] enabled [0.00] tsc: Fast TSC calibration using PIT [0.00] tsc: Detected 2099.947 MHz processor [0.00] Calibrating delay loop... Due to the local apic and local apic timer hasn't been setup and enabled fully, The event_handler of clock event is NULL. ###codes, static void local_apic_timer_interrupt(void) { int cpu = smp_processor_id(); struct clock_event_device *evt = &per_cpu(lapic_events, cpu); /* * Normally we should not be here till LAPIC has been initialized but * in some cases like kdump, its possible that there is a pending LAPIC * timer interrupt from previous kernel's context and is delivered in * new kernel the moment interrupts are enabled. * * Interrupts are enabled early and LAPIC is setup much later, hence * its possible that when we get here evt->event_handler is NULL. * Check for event_handler being NULL and discard the interrupt as * spurious. */ if (!evt->event_handler) { pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); /* Switch it off */ lapic_timer_shutdown(evt); return; } . } IMHO, If we specify notsc, the dump-capture kernel waits for jiffies being updated early and LAPIC and timer are setup much later, which causes no timer interrupts is passed to BSP. as following, start_kernel --> 1)-> calibrate_delay() -> calibrate_delay_converge() # hang and wait for jiffies changed 2)-> rest_init() -> kernel_init() -> -> apic_bsp_setup() -> setup_local_APIC() -> setup_percpu_clockev(). the setup_percpu_clockev points setup_boot_APIC_clock() which used to setup the boot APIC and timer. > So it might be a good idea to find the root cause of this bootup fragility > even if > 'notsc' is specified. And I fully agree that it should be fixed in the bootup > path > of the dump kernel, not the crash kernel reboot path. Because the lapic and timer are not ready when dump-capture waits them to update the jiffies value. so I suggest to put APIC in legacy mode in local_apic_timer_interrupt() temporarily, which in the bootup path of dump kernel. diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index dcb52850a28f..af3be93997ed 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -879,6 +879,7 @@ static void local_apic_timer_interrupt(void) pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); /* Switch it off */ lapic_timer_shutdown(evt); + disable_IO_APIC(); return; } And the new solution can fix the problem. What‘s your opinion about it? Thanks, wei > > Thanks, > > Ingo > >
Re: [PATCH v2] kexec: Fix kdump failure with notsc
Hi , Eric Thanks for your comments firstly. On Thu, 2016-07-07 at 12:55 -0500, Eric W. Biederman wrote: > Wei Jiangang writes: > > > If we specify the 'notsc' boot parameter for the dump-capture kernel, > > and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > > > /proc/sysrq-trigger", > > the dump-capture kernel will hang in calibrate_delay_converge(): > > > > /* wait for "start of" clock tick */ > > ticks = jiffies; > > while (ticks == jiffies) > > ; /* nothing */ > > > > serial log of the hang is as follows: > > > > tsc: Fast TSC calibration using PIT > > tsc: Detected 2099.947 MHz processor > > Calibrating delay loop... > > > > The reason is that the dump-capture kernel hangs in while loops and > > waits for jiffies to be updated, but no timer interrupts is passed > > to BSP by APIC. > > > > In fact, the local APIC was disabled in reboot and crash path by > > lapic_shutdown(). We need to put APIC in legacy mode in kexec jump path > > (put the system into PIT during the crash kernel), > > so that the dump-capture kernel can get timer interrupts. > > > > BTW, > > I found the buggy commit 522e66464467 ("x86/apic: Disable I/O APIC > > before shutdown of the local APIC") via bisection. > > > > Originally, I want to revert it. > > But Ingo Molnar comments that "By reverting the change can paper over > > the bug, but re-introduce the bug that can result in certain CPUs hanging > > if IO-APIC sends an APIC message if the lapic is disabled prematurely" > > And I think it's pertinent. > > Sigh. Can we please just do the work to rip out the apic shutdown code > from the kexec on panic code path? Do you mean remove the calls for disable_IO_APIC() and lapic_shutdown() in native_machine_crash_shutdown()? If so, I have tried it, but it doesn't work for this problem. > > I forgetting details but the only reason we have do any apic shutdown > is bugs in older kernels that could not initialize a system properly > if we did not shut down the apics. > > I certainly don't see an issue with goofy cases like notsc not working > on a crash capture kernel if we are not initializing the hardware > properly. > > The strategy really needs to be to only do the absolutely essential > hardware shutdown in the crashing kernel, every adintional line of code > we execute in the crashing kernel increases our chances of hitting a > bug. > > Under that policy things like requring we don't pass boot options that > inhibit the dump catpure kernel from initializing the hardware from a > random state are reasonable requirements. AKA I don't see any > justification in this as to why we would even want to support notsc > on the dump capture kernel. Especially when things clearly work when > that option is not specified. firstly do some clarification, My commit message metioned that "specify the 'notsc' boot parameter for the dump-capture kernel ". That's just the reproducing method used by myself for this problem. In fact, If we specify notsc only for the first kernel, which also can trigger the bug. And secondly, In multiple CPU configurations the TSC values on different processors may be different, which may cause random (bad) results. for example, FUJITSU's server (PRIMEQUEST 2000 series) supports Dynamic Reconfiguration. http://www.fujitsu.com/global/products/computing/servers/mission-critical/primequest/technology/availability/dynamic-reconfiguration.html This feature enables to hot-add system board which contains cpus and memories, this means some cpus can be hot-added to system. tsc of hot-added cpus is not consistent with tsc of existing-from-boot-time cpus. (though hardware and firmware make an effort to speficy the same tsc value as existing one) PRIMEQUEST can happen this tsc-inconsistency, we recommend to specify "notsc" boot option for Dynamic Reconfiguration users. so we really need to specify 'notsc'. Regards, wei > Eric > > > > Signed-off-by: Wei Jiangang > > --- > > arch/x86/include/asm/apic.h| 5 + > > arch/x86/kernel/apic/apic.c| 9 + > > arch/x86/kernel/machine_kexec_32.c | 5 ++--- > > arch/x86/kernel/machine_kexec_64.c | 6 +++--- > > 4 files changed, 19 insertions(+), 6 deletions(-) > > > > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > > index bc27611fa58f..5d7e635e580a 100644 > > --- a/arch/x86/include/asm/apic.h > > +++ b/arch/x86/include/asm/apic.h > > @@ -128,6 +128,7 @@ extern void cle
Re: [PATCH] kexec: Fix kdump failure with notsc
Hi all, I fixed the compile error and sent the v2. Please ignore this serial. Thanks, wei On Thu, 2016-07-07 at 16:46 +0800, kbuild test robot wrote: > Hi, > > [auto build test ERROR on tip/x86/core] > [also build test ERROR on v4.7-rc6 next-20160706] > [if your patch is applied to the wrong git tree, please drop us a note to > help improve the system] > > url: > https://github.com/0day-ci/linux/commits/Wei-Jiangang/kexec-Fix-kdump-failure-with-notsc/20160707-152535 > config: x86_64-randconfig-s5-07071541 (attached as .config) > compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430 > reproduce: > # save the attached .config to linux build tree > make ARCH=x86_64 > > All errors (new ones prefixed by >>): > >arch/x86/kernel/machine_kexec_64.c: In function 'machine_kexec': > >> arch/x86/kernel/machine_kexec_64.c:272:33: error: implicit declaration of > >> function 'lapic_disabled' [-Werror=implicit-function-declaration] > if (image->preserve_context || lapic_disabled()) { > ^~ >cc1: some warnings being treated as errors > > vim +/lapic_disabled +272 arch/x86/kernel/machine_kexec_64.c > >266save_ftrace_enabled = __ftrace_enabled_save(); >267 >268/* Interrupts aren't acceptable while we reboot */ >269local_irq_disable(); >270hw_breakpoint_disable(); >271 > > 272if (image->preserve_context || lapic_disabled()) { >273#ifdef CONFIG_X86_IO_APIC >274/* >275 * We need to put APICs in legacy mode so that > we can > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation > >
[PATCH v2] kexec: Fix kdump failure with notsc
If we specify the 'notsc' boot parameter for the dump-capture kernel, and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > /proc/sysrq-trigger", the dump-capture kernel will hang in calibrate_delay_converge(): /* wait for "start of" clock tick */ ticks = jiffies; while (ticks == jiffies) ; /* nothing */ serial log of the hang is as follows: tsc: Fast TSC calibration using PIT tsc: Detected 2099.947 MHz processor Calibrating delay loop... The reason is that the dump-capture kernel hangs in while loops and waits for jiffies to be updated, but no timer interrupts is passed to BSP by APIC. In fact, the local APIC was disabled in reboot and crash path by lapic_shutdown(). We need to put APIC in legacy mode in kexec jump path (put the system into PIT during the crash kernel), so that the dump-capture kernel can get timer interrupts. BTW, I found the buggy commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC") via bisection. Originally, I want to revert it. But Ingo Molnar comments that "By reverting the change can paper over the bug, but re-introduce the bug that can result in certain CPUs hanging if IO-APIC sends an APIC message if the lapic is disabled prematurely" And I think it's pertinent. Signed-off-by: Wei Jiangang --- arch/x86/include/asm/apic.h| 5 + arch/x86/kernel/apic/apic.c| 9 + arch/x86/kernel/machine_kexec_32.c | 5 ++--- arch/x86/kernel/machine_kexec_64.c | 6 +++--- 4 files changed, 19 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index bc27611fa58f..5d7e635e580a 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -128,6 +128,7 @@ extern void clear_local_APIC(void); extern void disconnect_bsp_APIC(int virt_wire_setup); extern void disable_local_APIC(void); extern void lapic_shutdown(void); +extern int lapic_disabled(void); extern void sync_Arb_IDs(void); extern void init_bsp_APIC(void); extern void setup_local_APIC(void); @@ -165,6 +166,10 @@ extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask); #else /* !CONFIG_X86_LOCAL_APIC */ static inline void lapic_shutdown(void) { } +static inline int lapic_disabled(void) +{ + return 0; +} #define local_apic_timer_c2_ok 1 static inline void init_apic_mappings(void) { } static inline void disable_local_APIC(void) { } diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a67d7e3..d1df250994bb 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -133,6 +133,9 @@ static inline void imcr_apic_to_pic(void) } #endif +/* Local APIC is disabled by the kernel for crash or reboot path */ +static int disabled_local_apic; + /* * Knob to control our willingness to enable the local APIC. * @@ -1097,10 +1100,16 @@ void lapic_shutdown(void) #endif disable_local_APIC(); + disabled_local_apic = 1; local_irq_restore(flags); } +int lapic_disabled(void) +{ + return disabled_local_apic; +} + /** * sync_Arb_IDs - synchronize APIC bus arbitration IDs */ diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c index 469b23d6acc2..c934a7868e6b 100644 --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -202,14 +202,13 @@ void machine_kexec(struct kimage *image) local_irq_disable(); hw_breakpoint_disable(); - if (image->preserve_context) { + if (image->preserve_context || lapic_disabled()) { #ifdef CONFIG_X86_IO_APIC /* * We need to put APICs in legacy mode so that we can * get timer interrupts in second kernel. kexec/kdump * paths already have calls to disable_IO_APIC() in -* one form or other. kexec jump path also need -* one. +* one form or other. kexec jump path also need one. */ disable_IO_APIC(); #endif diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 5a294e48b185..d3598cdd6437 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -269,14 +270,13 @@ void machine_kexec(struct kimage *image) local_irq_disable(); hw_breakpoint_disable(); - if (image->preserve_context) { + if (image->preserve_context || lapic_disabled()) { #ifdef CONFIG_X86_IO_APIC /* * We need to put APICs in legacy mode so that we can * get timer interrupts in second kernel. kexec/kdump * paths already have calls to disable_IO_APIC() in -* one form or other.
[PATCH] kexec: Fix kdump failure with notsc
If we specify the 'notsc' boot parameter for the dump-capture kernel, and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c > /proc/sysrq-trigger", the dump-capture kernel will hang in calibrate_delay_converge(): /* wait for "start of" clock tick */ ticks = jiffies; while (ticks == jiffies) ; /* nothing */ serial log of the hang is as follows: tsc: Fast TSC calibration using PIT tsc: Detected 2099.947 MHz processor Calibrating delay loop... The reason is that the dump-capture kernel hangs in while loops and waits for jiffies to be updated, but no timer interrupts is passed to BSP by APIC. In fact, the local APIC was disabled in reboot and crash path by lapic_shutdown(). We need to put APIC in legacy mode in kexec jump path (put the system into PIT during the crash kernel), so that the dump-capture kernel can get timer interrupts. BTW, I found the buggy commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC") via bisection. Originally, I want to revert it. But Ingo Molnar comments that "By reverting the change can paper over the bug, but re-introduce the bug that can result in certain CPUs hanging if IO-APIC sends an APIC message if the lapic is disabled prematurely" And I think it's pertinent. Signed-off-by: Wei Jiangang --- arch/x86/include/asm/apic.h| 5 + arch/x86/kernel/apic/apic.c| 9 + arch/x86/kernel/machine_kexec_32.c | 5 ++--- arch/x86/kernel/machine_kexec_64.c | 5 ++--- 4 files changed, 18 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index bc27611fa58f..5d7e635e580a 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -128,6 +128,7 @@ extern void clear_local_APIC(void); extern void disconnect_bsp_APIC(int virt_wire_setup); extern void disable_local_APIC(void); extern void lapic_shutdown(void); +extern int lapic_disabled(void); extern void sync_Arb_IDs(void); extern void init_bsp_APIC(void); extern void setup_local_APIC(void); @@ -165,6 +166,10 @@ extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask); #else /* !CONFIG_X86_LOCAL_APIC */ static inline void lapic_shutdown(void) { } +static inline int lapic_disabled(void) +{ + return 0; +} #define local_apic_timer_c2_ok 1 static inline void init_apic_mappings(void) { } static inline void disable_local_APIC(void) { } diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 60078a67d7e3..d1df250994bb 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -133,6 +133,9 @@ static inline void imcr_apic_to_pic(void) } #endif +/* Local APIC is disabled by the kernel for crash or reboot path */ +static int disabled_local_apic; + /* * Knob to control our willingness to enable the local APIC. * @@ -1097,10 +1100,16 @@ void lapic_shutdown(void) #endif disable_local_APIC(); + disabled_local_apic = 1; local_irq_restore(flags); } +int lapic_disabled(void) +{ + return disabled_local_apic; +} + /** * sync_Arb_IDs - synchronize APIC bus arbitration IDs */ diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c index 469b23d6acc2..c934a7868e6b 100644 --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -202,14 +202,13 @@ void machine_kexec(struct kimage *image) local_irq_disable(); hw_breakpoint_disable(); - if (image->preserve_context) { + if (image->preserve_context || lapic_disabled()) { #ifdef CONFIG_X86_IO_APIC /* * We need to put APICs in legacy mode so that we can * get timer interrupts in second kernel. kexec/kdump * paths already have calls to disable_IO_APIC() in -* one form or other. kexec jump path also need -* one. +* one form or other. kexec jump path also need one. */ disable_IO_APIC(); #endif diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 5a294e48b185..4925a6bb932a 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -269,14 +269,13 @@ void machine_kexec(struct kimage *image) local_irq_disable(); hw_breakpoint_disable(); - if (image->preserve_context) { + if (image->preserve_context || lapic_disabled()) { #ifdef CONFIG_X86_IO_APIC /* * We need to put APICs in legacy mode so that we can * get timer interrupts in second kernel. kexec/kdump * paths already have calls to disable_IO_APIC() in -* one form or other. kexec jump path also need -* one. +* one form or other. ke
[PATCH] xen/apic: Update the comment for apic_id_mask
verify_local_APIC() had been removed by commit 4399c03c6780 ("x86/apic: Remove verify_local_APIC()"), so apic_id_mask isn't used by it. Signed-off-by: Wei Jiangang --- arch/x86/xen/apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c index db52a7fafcc2..9cbb1f48381b 100644 --- a/arch/x86/xen/apic.c +++ b/arch/x86/xen/apic.c @@ -177,7 +177,7 @@ static struct apic xen_pv_apic = { .get_apic_id= xen_get_apic_id, .set_apic_id= xen_set_apic_id, /* Can be NULL on 32-bit. */ - .apic_id_mask = 0xFF << 24, /* Used by verify_local_APIC. Match with what xen_get_apic_id does. */ + .apic_id_mask = 0xFF << 24, /* Match with what xen_get_apic_id does. */ .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, -- 1.9.3
Re: [PATCH 1/2] x86/apic: shutdown local APIC before I/O APIC during crash
Hi, Ingo Thanks for your comments firstly. On Fri, 2016-07-01 at 12:36 +0200, Ingo Molnar wrote: > * Wei Jiangang wrote: > > > commit <522e66464467> disables I/O APIC before shutdown of > > the local APIC for both reboot and crash path. > > and commit <2885432aaf15> declares that 'it still makes sense to > > quiet IO APIC before disabling Local APIC'. > > That's not how we refer to commits in changelogs. > OK, I will fix it and pay attention to it in the following patch. > > However, the former introduced a bug for crashdown. > > What is 'crashdown'? It's not referred to in the kernel source even once. well, I mean ... If we trigger kernel panic with the following commands, the capture kernel should boot normally and captures the dump image. #echo 1 > /proc/sys/kernel/sysrq #echo c > /proc/sysrq-trigger But due to commit 522e66464467 changes the APIC shutdown sequence in native_machine_crash_shutdown(), the capture kernel doesn't boot normally and hang in calibrate_delay_converge(), waiting for the jiffies to be updated. BTW, without commit 522e66464467, the capture kernel works well. > > > If specify 'notsc' for capture-kernel, and then trigger crashdown. > > The capture-kernel will be blocked at calibrate_delay_converge(). > > This is a more readable way of saying the same: > > If we specify the 'notsc' boot parameter for the dump-capture kernel, > and then trigger a crash-down, then the dump-capture kernel will hang > in calibrate_delay_converge(): > > (Assuming the changelog first explains what a 'crash-down' is.) > > > /* wait for "start of" clock tick */ > > ticks = jiffies; > > while (ticks == jiffies) > > ; /* nothing */ > > Plase align quoted code to the right with at least a single tab. > OK > > serial console log as following, > > serial log of the hang is as follows: > > > > > [0.00] Linux version 4.7.0-rc2+ (root@localhost.localdomain) > > (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun > > 156 > > [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ > > root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap > > vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M > > vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8 > > irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off > > panic=10 rootflags=nofail acpi_no_memhotplug notsc > > > > [0.00] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable > > TSC completely > > > > [0.00] clocksource: hpet: mask: 0x max_cycles: > > 0x, max_idle_ns: 133484882848 ns > > [0.00] tsc: Fast TSC calibration using PIT > > [0.00] tsc: Detected 3192.714 MHz processor > > [0.00] Calibrating delay loop... > > Just quote the last few lines and skip the useless timestamp column. Also, > please > right-align this too. OK > > > The bug remains and unsolved for a long time, since 2013. > > I find the arch-criminal by bisect. > > What is an arch-criminal? Did you want to say: > > The bug has been introduced in 2013. I found the buggy commit via bisection. > > ? Yes, That's what i want to say. > > > The commit <522e66464467> used to fix erratum AVR31 for "Intel Atom > > Processor C2000 Product Family Specification Update". > > You can find the doc at http://www.intel.com/content/dam/www/public/us > > /en/documents/specification-updates/atom-c2000-family-spec-update.pdf. > > > > IMO, > > It doesn't make sense that change the order of disabling between > > I/O APIC and local APIC just for a certain model C2000. > > And I couldn't find any related descriptions for Intel 64 and IA-32 Arch. > > > > so, I want to revert the crash part of commit <522e66464467>. > > So why does the crashdump kernel hang in calibrate_delay_converge()? The jiffies value doesn't increase, which causes the capture kernel hang in calibrate_delay_converge(). It seems that there's a relationship with the shutdown(disable) order between IO APIC and local APIC. I'm not sure of this point One thing for sure by debugging is that do_timer() is not called while capture kernel boots up. I suspect the timer interrupts (irq0) is not passed to cpu by APIC. > > To me it appears this is a weakness in the crashdump kernel: it is unable to > boot > if we crash the original host system in a particular hardware state, right? Maybe you're right ... I specify 'notsc' only for capture-kernel, not the original host system(first kernel). And I suspect the APIC shutdown sequence in first kernel maybe bring some bad influence on capture kernel. I need to do more investigation. Do you have any advice? Thanks in advance. Wei > By reverting this change we'll just paper over the bug and re-introduce the > bug > that can result in certain CPUs hanging if the IO-APIC sends an APIC message > if > the lapic is disabled prematurely. > Thanks, > > Ingo > >
[tip:timers/core] timers/nohz: Fix several typos
Commit-ID: 6168f8ed01dc46a277908938294f1132d723f58d Gitweb: http://git.kernel.org/tip/6168f8ed01dc46a277908938294f1132d723f58d Author: Wei Jiangang AuthorDate: Wed, 29 Jun 2016 12:51:50 +0800 Committer: Ingo Molnar CommitDate: Fri, 1 Jul 2016 12:39:22 +0200 timers/nohz: Fix several typos Signed-off-by: Wei Jiangang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: fenghua...@intel.com Link: http://lkml.kernel.org/r/1467175910-2966-2-git-send-email-weijg.f...@cn.fujitsu.com Signed-off-by: Ingo Molnar --- kernel/time/tick-sched.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 536ada8..6d83e9c 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -61,7 +61,7 @@ static void tick_do_update_jiffies64(ktime_t now) if (delta.tv64 < tick_period.tv64) return; - /* Reevalute with jiffies_lock held */ + /* Reevaluate with jiffies_lock held */ write_seqlock(&jiffies_lock); delta = ktime_sub(now, last_jiffies_update); @@ -117,7 +117,7 @@ static void tick_sched_do_timer(ktime_t now) /* * Check if the do_timer duty was dropped. We don't care about * concurrency: This happens only when the cpu in charge went -* into a long sleep. If two cpus happen to assign themself to +* into a long sleep. If two cpus happen to assign themselves to * this duty, then the jiffies update is still serialized by * jiffies_lock. */ @@ -571,7 +571,7 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts) * @last_update_time: variable to store update time in. Do not update * counters if NULL. * - * Return the cummulative idle time (since boot) for a given + * Return the cumulative idle time (since boot) for a given * CPU, in microseconds. * * This time is measured via accounting rather than sampling, @@ -612,7 +612,7 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); * @last_update_time: variable to store update time in. Do not update * counters if NULL. * - * Return the cummulative iowait time (since boot) for a given + * Return the cumulative iowait time (since boot) for a given * CPU, in microseconds. * * This time is measured via accounting rather than sampling, @@ -733,7 +733,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts, * do_timer() never invoked. Keep track of the fact that it * was the one which had the do_timer() duty last. If this cpu * is the one which had the do_timer() duty last, we limit the -* sleep time to the timekeeping max_deferement value. +* sleep time to the timekeeping max_deferment value. * Otherwise we can sleep as long as we want. */ delta = timekeeping_max_deferment();
[PATCH 1/2] x86/apic: shutdown local APIC before I/O APIC during crash
commit <522e66464467> disables I/O APIC before shutdown of the local APIC for both reboot and crash path. and commit <2885432aaf15> declares that 'it still makes sense to quiet IO APIC before disabling Local APIC'. However, the former introduced a bug for crashdown. If specify 'notsc' for capture-kernel, and then trigger crashdown. The capture-kernel will be blocked at calibrate_delay_converge(). /* wait for "start of" clock tick */ ticks = jiffies; while (ticks == jiffies) ; /* nothing */ serial console log as following, [0.00] Linux version 4.7.0-rc2+ (root@localhost.localdomain) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun 156 [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off panic=10 rootflags=nofail acpi_no_memhotplug notsc [0.00] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable TSC completely [0.00] clocksource: hpet: mask: 0x max_cycles: 0x, max_idle_ns: 133484882848 ns [0.00] tsc: Fast TSC calibration using PIT [0.00] tsc: Detected 3192.714 MHz processor [0.00] Calibrating delay loop... The bug remains and unsolved for a long time, since 2013. I find the arch-criminal by bisect. The commit <522e66464467> used to fix erratum AVR31 for "Intel Atom Processor C2000 Product Family Specification Update". You can find the doc at http://www.intel.com/content/dam/www/public/us /en/documents/specification-updates/atom-c2000-family-spec-update.pdf. IMO, It doesn't make sense that change the order of disabling between I/O APIC and local APIC just for a certain model C2000. And I couldn't find any related descriptions for Intel 64 and IA-32 Arch. so, I want to revert the crash part of commit <522e66464467>. Signed-off-by: Wei Jiangang --- arch/x86/kernel/crash.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 9ef978d69c22..9aa0235c1f7d 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -179,12 +179,13 @@ void native_machine_crash_shutdown(struct pt_regs *regs) */ cpu_emergency_stop_pt(); + lapic_shutdown(); #ifdef CONFIG_X86_IO_APIC /* Prevent crash_kexec() from deadlocking on ioapic_lock. */ ioapic_zap_locks(); disable_IO_APIC(); #endif - lapic_shutdown(); + #ifdef CONFIG_HPET_TIMER hpet_disable(); #endif -- 1.9.3
[PATCH 2/2] time/tick-schede: fix typos
When investigating kdump's failure with 'notsc' and jiffies not incrementing, I found several spelling mistakes Signed-off-by: Wei Jiangang --- kernel/time/tick-sched.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 536ada80f6dd..6d83e9c4a302 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -61,7 +61,7 @@ static void tick_do_update_jiffies64(ktime_t now) if (delta.tv64 < tick_period.tv64) return; - /* Reevalute with jiffies_lock held */ + /* Reevaluate with jiffies_lock held */ write_seqlock(&jiffies_lock); delta = ktime_sub(now, last_jiffies_update); @@ -117,7 +117,7 @@ static void tick_sched_do_timer(ktime_t now) /* * Check if the do_timer duty was dropped. We don't care about * concurrency: This happens only when the cpu in charge went -* into a long sleep. If two cpus happen to assign themself to +* into a long sleep. If two cpus happen to assign themselves to * this duty, then the jiffies update is still serialized by * jiffies_lock. */ @@ -571,7 +571,7 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts) * @last_update_time: variable to store update time in. Do not update * counters if NULL. * - * Return the cummulative idle time (since boot) for a given + * Return the cumulative idle time (since boot) for a given * CPU, in microseconds. * * This time is measured via accounting rather than sampling, @@ -612,7 +612,7 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); * @last_update_time: variable to store update time in. Do not update * counters if NULL. * - * Return the cummulative iowait time (since boot) for a given + * Return the cumulative iowait time (since boot) for a given * CPU, in microseconds. * * This time is measured via accounting rather than sampling, @@ -733,7 +733,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts, * do_timer() never invoked. Keep track of the fact that it * was the one which had the do_timer() duty last. If this cpu * is the one which had the do_timer() duty last, we limit the -* sleep time to the timekeeping max_deferement value. +* sleep time to the timekeeping max_deferment value. * Otherwise we can sleep as long as we want. */ delta = timekeeping_max_deferment(); -- 1.9.3
Re: RFC: Fix kdump failed with 'notsc'
Hi Alok, Thanks for your reply. CC Ingo and fenghua.yu I found the arch-criminal by bisect. [root@localhost linux]# git show 522e6646 commit 522e66464467543c0d88d023336eec4df03ad40b Author: Fenghua Yu Date: Wed Oct 23 18:30:12 2013 -0700 x86/apic: Disable I/O APIC before shutdown of the local APIC In reboot and crash path, when we shut down the local APIC, the I/O APIC is still active. This may cause issues because external interrupts can still come in and disturb the local APIC during shutdown process. To quiet external interrupts, disable I/O APIC before shutdown local APIC. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua...@intel.com Cc: [ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ] Signed-off-by: Ingo Molnar When I revert it and disable local APIC before disabling IO-APIC in native_machine_crash_shutdown(), The kdump can finish without any error. The commit 2885432 said 'it still makes sense to quiet IO APIC before disabling Local APIC', But I don't find any descriptions about the sequence of disable IO-APIC and Local APIC in "Intel 64 and IA-32 Architectures software developer's manual volume 3A". Only erratum AVR31 for "Intel Atom Processor C2000 Product Family Specification Update". IMO, It doesn't make sense that change the order of disabling IO APIC and Local APIC just for a certain model C2000. do you have any suggestion to fix it? thanks in advance. PS, My machine is Lenovo's QiTianM4340, and the CPU is Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz, 4Cores. Thanks, wei On Fri, 2016-06-24 at 10:41 +, Alok Kataria wrote: > Hi Wei, > > On Tue, 2016-06-14 at 09:56 +, Wei, Jiangang wrote: > > Hi, > > > > When I trigger kernel crash and specify 'notsc' for capture-kernel, > > The process of kdump will be blocked at calibrate_delay_converge(). > > > > /* wait for "start of" clock tick */ > > ticks = jiffies; > > while (ticks == jiffies) > > ; /* nothing */ > > > > The reason is that the jiffies remains the same, no changed. > > > > serial console log as following, > > > > [0.00] Linux version 4.7.0-rc2+ (root@localhost.localdomain) > > (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun > > 156 > > [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ > > root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap > > vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M > > vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8 > > irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off > > panic=10 rootflags=nofail acpi_no_memhotplug notsc > > > > [0.00] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable > > TSC completely > > > > [0.00] clocksource: hpet: mask: 0x max_cycles: > > 0x, max_idle_ns: 133484882848 ns > > [0.00] tsc: Fast TSC calibration using PIT > > [0.00] tsc: Detected 3192.714 MHz processor > > [0.00] Calibrating delay loop... > > > > # The last log is raised by calibrate_delay(), which calls > > calibrate_delay_converge() to compute the lpj value. > > > > # So far, I don't know why the jiffies stays the same. > > # But I found two methods can avoid this problem。 > > > > 1)specify the 'lpj=' with 'notsc' together. > > > > 2) revert the 70de9a9. > > > > commit 70de9a97049e0ba79dc040868564408d5ce697f9 > > Author: Alok Kataria > > Date: Mon Nov 3 11:18:47 2008 -0800 > > > > x86: don't use tsc_khz to calculate lpj if notsc is passed > > > > Impact: fix udelay when "notsc" boot parameter is passed > > > > With notsc passed on commandline, tsc may not be used for > > udelays, make sure that we do not use tsc_khz to calculate > > the lpj value in such cases. > > > > IMO, > > The flow of getting tsc_khz as following, > > tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate(). > > No codes use or call 'rdtsc'. > > The intent of that change was to skip calculating the lpj value based on > the tsc_khz value if notsc is specified. Note that it has noting to do > with using rdtsc for tsc frequency calibration, instead we use the tsc > frequency (tsc_khz) derived lpj value for udelay (see delay_tsc). > > If notsc is pas
Re: RFC: Fix kdump failed with 'notsc'
ping... On Tue, 2016-06-14 at 17:54 +0800, Wei Jiangang wrote: > Hi, > > When I trigger kernel crash and specify 'notsc' for capture-kernel, > The process of kdump will be blocked at calibrate_delay_converge(). > > /* wait for "start of" clock tick */ > ticks = jiffies; > while (ticks == jiffies) > ; /* nothing */ > > The reason is that the jiffies remains the same, no changed. > > serial console log as following, > > [0.00] Linux version 4.7.0-rc2+ (root@localhost.localdomain) > (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun > 156 > [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ > root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap > vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M > vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8 > irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off > panic=10 rootflags=nofail acpi_no_memhotplug notsc > > [0.00] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable > TSC completely > > [0.00] clocksource: hpet: mask: 0x max_cycles: > 0x, max_idle_ns: 133484882848 ns > [0.00] tsc: Fast TSC calibration using PIT > [0.00] tsc: Detected 3192.714 MHz processor > [0.00] Calibrating delay loop... > > # The last log is raised by calibrate_delay(), which calls > calibrate_delay_converge() to compute the lpj value. > > # So far, I don't know why the jiffies stays the same. > # But I found two methods can avoid this problem。 > > 1)specify the 'lpj=' with 'notsc' together. > > 2) revert the 70de9a9. > > commit 70de9a97049e0ba79dc040868564408d5ce697f9 > Author: Alok Kataria > Date: Mon Nov 3 11:18:47 2008 -0800 > > x86: don't use tsc_khz to calculate lpj if notsc is passed > > Impact: fix udelay when "notsc" boot parameter is passed > > With notsc passed on commandline, tsc may not be used for > udelays, make sure that we do not use tsc_khz to calculate > the lpj value in such cases. > > IMO, > The flow of getting tsc_khz as following, > tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate(). > No codes use or call 'rdtsc'. > > Even if ‘notsc’ is passed, the tsc_khz is credible. > and we can get lpj by it. > > So I want to push a patch to revert the 70de9a9. > Any comments or suggestions is appreciated. > > Thanks, > wei > > >
RFC: Fix kdump failed with 'notsc'
Hi, When I trigger kernel crash and specify 'notsc' for capture-kernel, The process of kdump will be blocked at calibrate_delay_converge(). /* wait for "start of" clock tick */ ticks = jiffies; while (ticks == jiffies) ; /* nothing */ The reason is that the jiffies remains the same, no changed. serial console log as following, [0.00] Linux version 4.7.0-rc2+ (root@localhost.localdomain) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun 156 [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off panic=10 rootflags=nofail acpi_no_memhotplug notsc [0.00] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable TSC completely [0.00] clocksource: hpet: mask: 0x max_cycles: 0x, max_idle_ns: 133484882848 ns [0.00] tsc: Fast TSC calibration using PIT [0.00] tsc: Detected 3192.714 MHz processor [0.00] Calibrating delay loop... # The last log is raised by calibrate_delay(), which calls calibrate_delay_converge() to compute the lpj value. # So far, I don't know why the jiffies stays the same. # But I found two methods can avoid this problem。 1)specify the 'lpj=' with 'notsc' together. 2) revert the 70de9a9. commit 70de9a97049e0ba79dc040868564408d5ce697f9 Author: Alok Kataria Date: Mon Nov 3 11:18:47 2008 -0800 x86: don't use tsc_khz to calculate lpj if notsc is passed Impact: fix udelay when "notsc" boot parameter is passed With notsc passed on commandline, tsc may not be used for udelays, make sure that we do not use tsc_khz to calculate the lpj value in such cases. IMO, The flow of getting tsc_khz as following, tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate(). No codes use or call 'rdtsc'. Even if ‘notsc’ is passed, the tsc_khz is credible. and we can get lpj by it. So I want to push a patch to revert the 70de9a9. Any comments or suggestions is appreciated. Thanks, wei
Re: [PATCH] tools: ffs-aio-example: free memory upon failure
On Wed, 2015-11-18 at 10:46 +0100, Robert Baldyga wrote: > Hi Wei, > > On 11/18/2015 07:13 AM, Wei, Jiangang wrote: > > To whom it may concern: > > > > Sorry to bother again, > > But any comment about this patch? > > > > Reviewed-by: Robert Baldyga > > It looks good to me. By the way, maybe we should also close file > descriptors in error path? Thanks for your kindness and comments, I update it and send the v2. If you have any other advice, Please let me know. Thanks, wei > > Thanks, > Robert Baldyga > > > On Mon, 2015-11-09 at 14:16 +0800, Wei Jiangang wrote: > >> Free buffer to avoid memory leak upon failure occurs. > >> > >> Signed-off-by: Wei Jiangang > >> --- > >> tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c | 4 > >> tools/usb/ffs-aio-example/simple/device_app/aio_simple.c | 4 > >> 2 files changed, 8 insertions(+) > >> > >> diff --git > >> a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > >> b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > >> index aaca1f44e788..3eb1a92baacf 100644 > >> --- a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > >> +++ b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > >> @@ -263,20 +263,24 @@ int main(int argc, char *argv[]) > >>sprintf(ep_path, "%s/ep0", argv[1]); > >>ep0 = open(ep_path, O_RDWR); > >>if (ep0 < 0) { > >> + free(ep_path); > >>perror("unable to open ep0"); > >>return 1; > >>} > >>if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { > >> + free(ep_path); > >>perror("unable do write descriptors"); > >>return 1; > >>} > >>if (write(ep0, &strings, sizeof(strings)) < 0) { > >> + free(ep_path); > >>perror("unable to write strings"); > >>return 1; > >>} > >>sprintf(ep_path, "%s/ep1", argv[1]); > >>ep1 = open(ep_path, O_RDWR); > >>if (ep1 < 0) { > >> + free(ep_path); > >>perror("unable to open ep1"); > >>return 1; > >>} > >> diff --git a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > >> b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > >> index 1f44a29818bf..ac96892ca5d2 100644 > >> --- a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > >> +++ b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > >> @@ -234,14 +234,17 @@ int main(int argc, char *argv[]) > >>sprintf(ep_path, "%s/ep0", argv[1]); > >>ep0 = open(ep_path, O_RDWR); > >>if (ep0 < 0) { > >> + free(ep_path); > >>perror("unable to open ep0"); > >>return 1; > >>} > >>if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { > >> + free(ep_path); > >>perror("unable do write descriptors"); > >>return 1; > >>} > >>if (write(ep0, &strings, sizeof(strings)) < 0) { > >> + free(ep_path); > >>perror("unable to write strings"); > >>return 1; > >>} > >> @@ -249,6 +252,7 @@ int main(int argc, char *argv[]) > >>sprintf(ep_path, "%s/ep%d", argv[1], i+1); > >>ep[i] = open(ep_path, O_RDWR); > >>if (ep[i] < 0) { > >> + free(ep_path); > >>printf("unable to open ep%d: %s\n", i+1, > >> strerror(errno)); > >>return 1; > > >
[PATCH v2] tools: ffs-aio-example: clean up upon failure
Free buffer to avoid memory leak and close fd upon failure occurs. Signed-off-by: Wei Jiangang --- tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c | 6 ++ tools/usb/ffs-aio-example/simple/device_app/aio_simple.c | 6 ++ 2 files changed, 12 insertions(+) diff --git a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c index aaca1f44e788..c08b1b0f7a6b 100644 --- a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c +++ b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c @@ -263,20 +263,26 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep0", argv[1]); ep0 = open(ep_path, O_RDWR); if (ep0 < 0) { + free(ep_path); perror("unable to open ep0"); return 1; } if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { + close(ep0); + free(ep_path); perror("unable do write descriptors"); return 1; } if (write(ep0, &strings, sizeof(strings)) < 0) { + close(ep0); + free(ep_path); perror("unable to write strings"); return 1; } sprintf(ep_path, "%s/ep1", argv[1]); ep1 = open(ep_path, O_RDWR); if (ep1 < 0) { + free(ep_path); perror("unable to open ep1"); return 1; } diff --git a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c index 1f44a29818bf..9ff657ed0bec 100644 --- a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c +++ b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c @@ -234,14 +234,19 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep0", argv[1]); ep0 = open(ep_path, O_RDWR); if (ep0 < 0) { + free(ep_path); perror("unable to open ep0"); return 1; } if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { + close(ep0); + free(ep_path); perror("unable do write descriptors"); return 1; } if (write(ep0, &strings, sizeof(strings)) < 0) { + close(ep0); + free(ep_path); perror("unable to write strings"); return 1; } @@ -249,6 +254,7 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep%d", argv[1], i+1); ep[i] = open(ep_path, O_RDWR); if (ep[i] < 0) { + free(ep_path); printf("unable to open ep%d: %s\n", i+1, strerror(errno)); return 1; -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tools: ffs-aio-example: free memory upon failure
To whom it may concern: Sorry to bother again, But any comment about this patch? Regards, wei On Mon, 2015-11-09 at 14:16 +0800, Wei Jiangang wrote: > Free buffer to avoid memory leak upon failure occurs. > > Signed-off-by: Wei Jiangang > --- > tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c | 4 > tools/usb/ffs-aio-example/simple/device_app/aio_simple.c | 4 > 2 files changed, 8 insertions(+) > > diff --git a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > index aaca1f44e788..3eb1a92baacf 100644 > --- a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > +++ b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c > @@ -263,20 +263,24 @@ int main(int argc, char *argv[]) > sprintf(ep_path, "%s/ep0", argv[1]); > ep0 = open(ep_path, O_RDWR); > if (ep0 < 0) { > + free(ep_path); > perror("unable to open ep0"); > return 1; > } > if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { > + free(ep_path); > perror("unable do write descriptors"); > return 1; > } > if (write(ep0, &strings, sizeof(strings)) < 0) { > + free(ep_path); > perror("unable to write strings"); > return 1; > } > sprintf(ep_path, "%s/ep1", argv[1]); > ep1 = open(ep_path, O_RDWR); > if (ep1 < 0) { > + free(ep_path); > perror("unable to open ep1"); > return 1; > } > diff --git a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > index 1f44a29818bf..ac96892ca5d2 100644 > --- a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > +++ b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c > @@ -234,14 +234,17 @@ int main(int argc, char *argv[]) > sprintf(ep_path, "%s/ep0", argv[1]); > ep0 = open(ep_path, O_RDWR); > if (ep0 < 0) { > + free(ep_path); > perror("unable to open ep0"); > return 1; > } > if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { > + free(ep_path); > perror("unable do write descriptors"); > return 1; > } > if (write(ep0, &strings, sizeof(strings)) < 0) { > + free(ep_path); > perror("unable to write strings"); > return 1; > } > @@ -249,6 +252,7 @@ int main(int argc, char *argv[]) > sprintf(ep_path, "%s/ep%d", argv[1], i+1); > ep[i] = open(ep_path, O_RDWR); > if (ep[i] < 0) { > + free(ep_path); > printf("unable to open ep%d: %s\n", i+1, > strerror(errno)); > return 1; N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�&j:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
[PATCH] tools:perf: fix typo in scripts/perl/Perf-Trace-Util/README
Correct typo in tools/perf/scripts/perl/Perf-Trace-Util/README, desciptions => descriptions. Signed-off-by: Wei Jiangang --- tools/perf/scripts/perl/Perf-Trace-Util/README | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/scripts/perl/Perf-Trace-Util/README b/tools/perf/scripts/perl/Perf-Trace-Util/README index 2f0c7f3043ee..66d1238f4945 100644 --- a/tools/perf/scripts/perl/Perf-Trace-Util/README +++ b/tools/perf/scripts/perl/Perf-Trace-Util/README @@ -12,7 +12,7 @@ executable; scripts wishing to do that should 'use Context.pm'. The Perl->C perf interface is completely driven by Context.xs. If you want to add new Perl functions that end up accessing C data in the -perf executable, you add desciptions of the new functions here. +perf executable, you add descriptions of the new functions here. scripting_context is a pointer to the perf data in the perf executable that you want to access - it's passed as the second parameter, $context, to all handler functions. -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] tools:testing/selftests: fix typo in futex/README
Correct typo in tools/testing/selftests/futex/README. Signed-off-by: Wei Jiangang --- tools/testing/selftests/futex/README | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/futex/README b/tools/testing/selftests/futex/README index 3224a049b196..0558bb9ce0a6 100644 --- a/tools/testing/selftests/futex/README +++ b/tools/testing/selftests/futex/README @@ -27,7 +27,7 @@ o The build system shall remain as simple as possible, avoiding any archive or o Where possible, any helper functions or other package-wide code shall be implemented in header files, avoiding the need to compile intermediate object files. -o External dependendencies shall remain as minimal as possible. Currently gcc +o External dependencies shall remain as minimal as possible. Currently gcc and glibc are the only dependencies. o Tests return 0 for success and < 0 for failure. -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] tools: ffs-aio-example: free memory upon failure
Free buffer to avoid memory leak upon failure occurs. Signed-off-by: Wei Jiangang --- tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c | 4 tools/usb/ffs-aio-example/simple/device_app/aio_simple.c | 4 2 files changed, 8 insertions(+) diff --git a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c index aaca1f44e788..3eb1a92baacf 100644 --- a/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c +++ b/tools/usb/ffs-aio-example/multibuff/device_app/aio_multibuff.c @@ -263,20 +263,24 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep0", argv[1]); ep0 = open(ep_path, O_RDWR); if (ep0 < 0) { + free(ep_path); perror("unable to open ep0"); return 1; } if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { + free(ep_path); perror("unable do write descriptors"); return 1; } if (write(ep0, &strings, sizeof(strings)) < 0) { + free(ep_path); perror("unable to write strings"); return 1; } sprintf(ep_path, "%s/ep1", argv[1]); ep1 = open(ep_path, O_RDWR); if (ep1 < 0) { + free(ep_path); perror("unable to open ep1"); return 1; } diff --git a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c index 1f44a29818bf..ac96892ca5d2 100644 --- a/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c +++ b/tools/usb/ffs-aio-example/simple/device_app/aio_simple.c @@ -234,14 +234,17 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep0", argv[1]); ep0 = open(ep_path, O_RDWR); if (ep0 < 0) { + free(ep_path); perror("unable to open ep0"); return 1; } if (write(ep0, &descriptors, sizeof(descriptors)) < 0) { + free(ep_path); perror("unable do write descriptors"); return 1; } if (write(ep0, &strings, sizeof(strings)) < 0) { + free(ep_path); perror("unable to write strings"); return 1; } @@ -249,6 +252,7 @@ int main(int argc, char *argv[]) sprintf(ep_path, "%s/ep%d", argv[1], i+1); ep[i] = open(ep_path, O_RDWR); if (ep[i] < 0) { + free(ep_path); printf("unable to open ep%d: %s\n", i+1, strerror(errno)); return 1; -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/