Re: [PATCH] svm: Fix AVIC incomplete IPI emulation
Hi Suravee, Turns out, the _same_ bug was already discussed in the past by yourself, Paolo and Radim (both now 'cc'-ed) Please read it here: https://patchwork.kernel.org/patch/8292231/ After reading that thread, I have couple of questions: First, You wrote : "I have tried NOT setting the IRR, and only kick_vcpu(). And things seem to work fine. Therefore, I think your analysis is likely to be correct." AFAIU, it means that the below patch is wrong just as Paolo suggested in his original answer and you did fixed it back than, but the code is now back ? Second, Did you made sure (with your HW desginer) that what the specifications refer as "atomically" means that the IRR is set (i.e the cacheline is taken exclusively) _before_ the isRunning bit is _read_ ? Because, if it doesn't, just like Paolo suggested, it means there is no way to use the feature as a "sending" vcpu can send IPI without any exit and the receiving cpu will never see the IRR bit as the sending cpu didn't made sure the IRR is set _before_ reading the isRunning. Thanks, Oren On 03/13/2019 09:30 AM, Oren Twaig wrote: Hi Suravee, Please see below.. On 03/11/2019 01:38 PM, Suthikulpanit, Suravee wrote: Hi Oren, Sorry for delay response. On 3/5/19 1:15 AM, Oren Twaig wrote: Hello Suravee, According to AMD's SDM, the target-not-running incomplete ipi exit is only received if any of the destination cpus had the not-running bit set in the avic backing page. I believe you are referring to the "isRunning" (IR) bit is in the AVIC physical APIC ID table entry. I meant cause ID=1 in IPI Delivery Failure Cause (SDM rev 3.30, sep 2018, Table 15-28): " 1: IPI Target Not Running: IsRunning bit of the target for a Singlecast/Broadcast/Multicast IPI is not set in the physical APIC ID table. " However, not before the CPU _already_ set the relevant IRR bit in all these cpus. Not sure what you meant here. Here is the full snippet from the specifications: " 5.For every valid destination: - Atomically set the appropriate IRR bit in each of the destinations’ vAPIC backing page. - Check the IsRunning status of each destination. - If the destination IsRunning bit is set, send a doorbell message using the host physical core number from the Physical APIC ID table. 6. If any destinations are identified as not currently scheduled on a physical core (i.e., the IsRunning bit for that virtual processor is not set), cause a #VMEXIT. " According to the specification above, the HW should first set the appropriate bit in the IRR (Interrupt Request Register) _before_ causing VMEXIT of IPI-delivery-not-completed with ID=1 (Target not running). In this change, the patch forces KVM to send another interrupt to the vcpu whether SVM already did that or not. Which means the vcpu/s, under some conditions, can get an EXTRA interrupt it never intended to get >> Example: 1. vcpu B: Is in "not-running" state. 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B 3. vcpu A: SVM updates vcpu B IRR with bit 80 4. vcpu A: SVM exits on incomplete IPI target-not-running exit. 5. vcpu A: Now stops executing any code @ hypervisor level. 6. vcpu B: Due to another interrupt (like lapic timer) resumes running the guest. While handling interrupts, it also handles interrupt vector 80 (as it's in his IRR) 7. vcpu A: resumes executing the below code and sends an _additional_interrupt to vcpu B. Overall, vcpu B got two interrupts. The second is unwanted and not documented in the system architecture. Can you please elaborate more to why the implementation below conflict with the specifications (which was the code before this commit) ? This patch was introduced to fix an issue where the SVM driver tries to handle the step 5 above by scheduling vcpu B into _running_ state to handle the IPI from vcpu A. However, prior to this patch, vcpu B was never get scheduled to run unless there are other interrupts (e.g. timer). Exactly. Only what needed here is *only* to wakeup the vcpu B. Why ? because the apic of vcpu B _already_ contains the interrupt in the pending IRR. Than, once vcpu B will run it will process the IRR which contains the vector placed by the HW and will deliver it. This should not be the case as Vcpu B should have been running regardless of other interrupts. So, I don't think step 6 and 7 above are correct. The example of vcpu A that stops executing is just to highlight that the code can't depend on that the kvm code of vcpu A will finish the ICR "fake" call before vcpu B runs (beacuse of any interrupt) and process that IRR request placed by the HW. The issue was caused by the apic->irr_pending not set to true when trying to get vcpu B scheduled. This flag is checked in apic_find_highest_irr() before searching for the highest bit. To fix the issue, I decided to leverage the existing emulation
Re: [PATCH] svm: Fix AVIC incomplete IPI emulation
Hi Suravee, Please see below.. On 03/11/2019 01:38 PM, Suthikulpanit, Suravee wrote: Hi Oren, Sorry for delay response. On 3/5/19 1:15 AM, Oren Twaig wrote: Hello Suravee, According to AMD's SDM, the target-not-running incomplete ipi exit is only received if any of the destination cpus had the not-running bit set in the avic backing page. I believe you are referring to the "isRunning" (IR) bit is in the AVIC physical APIC ID table entry. I meant cause ID=1 in IPI Delivery Failure Cause (SDM rev 3.30, sep 2018, Table 15-28): " 1: IPI Target Not Running: IsRunning bit of the target for a Singlecast/Broadcast/Multicast IPI is not set in the physical APIC ID table. " However, not before the CPU _already_ set the relevant IRR bit in all these cpus. Not sure what you meant here. Here is the full snippet from the specifications: " 5.For every valid destination: - Atomically set the appropriate IRR bit in each of the destinations’ vAPIC backing page. - Check the IsRunning status of each destination. - If the destination IsRunning bit is set, send a doorbell message using the host physical core number from the Physical APIC ID table. 6. If any destinations are identified as not currently scheduled on a physical core (i.e., the IsRunning bit for that virtual processor is not set), cause a #VMEXIT. " According to the specification above, the HW should first set the appropriate bit in the IRR (Interrupt Request Register) _before_ causing VMEXIT of IPI-delivery-not-completed with ID=1 (Target not running). In this change, the patch forces KVM to send another interrupt to the vcpu whether SVM already did that or not. Which means the vcpu/s, under some conditions, can get an EXTRA interrupt it never intended to get >> Example: 1. vcpu B: Is in "not-running" state. 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B 3. vcpu A: SVM updates vcpu B IRR with bit 80 4. vcpu A: SVM exits on incomplete IPI target-not-running exit. 5. vcpu A: Now stops executing any code @ hypervisor level. 6. vcpu B: Due to another interrupt (like lapic timer) resumes running the guest. While handling interrupts, it also handles interrupt vector 80 (as it's in his IRR) 7. vcpu A: resumes executing the below code and sends an _additional_interrupt to vcpu B. Overall, vcpu B got two interrupts. The second is unwanted and not documented in the system architecture. Can you please elaborate more to why the implementation below conflict with the specifications (which was the code before this commit) ? This patch was introduced to fix an issue where the SVM driver tries to handle the step 5 above by scheduling vcpu B into _running_ state to handle the IPI from vcpu A. However, prior to this patch, vcpu B was never get scheduled to run unless there are other interrupts (e.g. timer). Exactly. Only what needed here is *only* to wakeup the vcpu B. Why ? because the apic of vcpu B _already_ contains the interrupt in the pending IRR. Than, once vcpu B will run it will process the IRR which contains the vector placed by the HW and will deliver it. This should not be the case as Vcpu B should have been running regardless of other interrupts. So, I don't think step 6 and 7 above are correct. The example of vcpu A that stops executing is just to highlight that the code can't depend on that the kvm code of vcpu A will finish the ICR "fake" call before vcpu B runs (beacuse of any interrupt) and process that IRR request placed by the HW. The issue was caused by the apic->irr_pending not set to true when trying to get vcpu B scheduled. This flag is checked in apic_find_highest_irr() before searching for the highest bit. To fix the issue, I decided to leverage the existing emulation code for ICR and ICR2, which in turn calls apic_send_ipi() to deliver interrupt to vpu B. However, looking a bit more closely, I notice the logic in svm_deliver_avic_intr() should also have been changed from kvm_vcpu_wake_up() to kvm_vcpu_kick() since the latter will result in clearing the IRR bit for the IPI vector when trying to send IPI as part of the following call path. vcpu_enter_guest() |-- inject_pending_event() |-- kvm_cpu_get_interrupt() |-- kvm_get_apic_interrupt() |-- apic_clear_irr() |-- apic_set_isr() |-- apic_update_ppr() Please see the patch below. Not sure if this would address the problem you are seeing. I still think there a bug here where vcpu B will get two interrupts instead of one. Thanks, Oren Thanks, Suravee diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 24dfa6a93711..d2841c3dbc04 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -5219,11 +5256,13 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec) kvm_lapic_set_irr(vec, vcpu->arch.apic);
Re: [PATCH] svm: Fix AVIC incomplete IPI emulation
Hi, Any help appreciated.. Thanks, Oren Twaig On 3/4/2019 8:15 PM, Oren Twaig wrote: Hello Suravee, According to AMD's SDM, the target-not-running incomplete ipi exit is only received if any of the destination cpus had the not-running bit set in the avic backing page. However, not before the CPU _already_ set the relevant IRR bit in all these cpus. In this change, the patch forces KVM to send another interrupt to the vcpu whether SVM already did that or not. Which means the vcpu/s, under some conditions, can get an EXTRA interrupt it never intended to get. Example: 1. vcpu B: Is in "not-running" state. 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B 3. vcpu A: SVM updates vcpu B IRR with bit 80 4. vcpu A: SVM exits on incomplete IPI target-not-running exit. 5. vcpu A: Now stops executing any code @ hypervisor level. 6. vcpu B: Due to another interrupt (like lapic timer) resumes running the guest. While handling interrupts, it also handles interrupt vector 80 (as it's in his IRR) 7. vcpu A: resumes executing the below code and sends an _additional_interrupt to vcpu B. Overall, vcpu B got two interrupts. The second is unwanted and not documented in the system architecture. Can you please elaborate more to why the implementation below conflict with the specifications (which was the code before this commit) ? Thanks, Oren Twaig > From "Suthikulpanit, Suravee" <> > Subject [PATCH] svm: Fix AVIC incomplete IPI emulation > Date Tue, 22 Jan 2019 10:25:13 + > share > From: Suravee Suthikulpanit > > In case of incomplete IPI with invalid interrupt type, the current > SVM driver does not properly emulate the IPI, and fails to boot > FreeBSD guests with multiple vcpus when enabling AVIC. > > Fix this by update APIC ICR high/low registers, which also > emulate sending the IPI. > > Signed-off-by: Suravee Suthikulpanit > --- > arch/x86/kvm/svm.c | 19 --- > 1 file changed, 4 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > index 2aff835a65ed..8a0c9a1f6ac8 100644 > --- a/arch/x86/kvm/svm.c > +++ b/arch/x86/kvm/svm.c > @@ -4504,25 +4504,14 @@ static int avic_incomplete_ipi_interception(struct vcpu_svm *svm) > kvm_lapic_reg_write(apic, APIC_ICR, icrl); > break; > case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: { > - int i; > - struct kvm_vcpu *vcpu; > - struct kvm *kvm = svm->vcpu.kvm; > struct kvm_lapic *apic = svm->vcpu.arch.apic; > /* > - * At this point, we expect that the AVIC HW has already > - * set the appropriate IRR bits on the valid target > - * vcpus. So, we just need to kick the appropriate vcpu. > + * Update ICR high and low, then emulate sending IPI, > + * which is handled when writing APIC_ICR. > */ > - kvm_for_each_vcpu(i, vcpu, kvm) { > - bool m = kvm_apic_match_dest(vcpu, apic, > - icrl & KVM_APIC_SHORT_MASK, > - GET_APIC_DEST_FIELD(icrh), > - icrl & KVM_APIC_DEST_MASK); > - > - if (m && !avic_vcpu_is_running(vcpu)) > - kvm_vcpu_wake_up(vcpu); > - } > + kvm_lapic_reg_write(apic, APIC_ICR2, icrh); > + kvm_lapic_reg_write(apic, APIC_ICR, icrl); > break; > } > case AVIC_IPI_FAILURE_INVALID_TARGET: > -- > 2.17.1
re: [PATCH] svm: Fix AVIC incomplete IPI emulation
Hello Suravee, According to AMD's SDM, the target-not-running incomplete ipi exit is only received if any of the destination cpus had the not-running bit set in the avic backing page. However, not before the CPU _already_ set the relevant IRR bit in all these cpus. In this change, the patch forces KVM to send another interrupt to the vcpu whether SVM already did that or not. Which means the vcpu/s, under some conditions, can get an EXTRA interrupt it never intended to get. Example: 1. vcpu B: Is in "not-running" state. 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B 3. vcpu A: SVM updates vcpu B IRR with bit 80 4. vcpu A: SVM exits on incomplete IPI target-not-running exit. 5. vcpu A: Now stops executing any code @ hypervisor level. 6. vcpu B: Due to another interrupt (like lapic timer) resumes running the guest. While handling interrupts, it also handles interrupt vector 80 (as it's in his IRR) 7. vcpu A: resumes executing the below code and sends an _additional_interrupt to vcpu B. Overall, vcpu B got two interrupts. The second is unwanted and not documented in the system architecture. Can you please elaborate more to why the implementation below conflict with the specifications (which was the code before this commit) ? Thanks, Oren Twaig > From "Suthikulpanit, Suravee" <> > Subject [PATCH] svm: Fix AVIC incomplete IPI emulation > Date Tue, 22 Jan 2019 10:25:13 + > share > From: Suravee Suthikulpanit > > In case of incomplete IPI with invalid interrupt type, the current > SVM driver does not properly emulate the IPI, and fails to boot > FreeBSD guests with multiple vcpus when enabling AVIC. > > Fix this by update APIC ICR high/low registers, which also > emulate sending the IPI. > > Signed-off-by: Suravee Suthikulpanit > --- > arch/x86/kvm/svm.c | 19 --- > 1 file changed, 4 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > index 2aff835a65ed..8a0c9a1f6ac8 100644 > --- a/arch/x86/kvm/svm.c > +++ b/arch/x86/kvm/svm.c > @@ -4504,25 +4504,14 @@ static int avic_incomplete_ipi_interception(struct vcpu_svm *svm) > kvm_lapic_reg_write(apic, APIC_ICR, icrl); > break; > case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: { > - int i; > - struct kvm_vcpu *vcpu; > - struct kvm *kvm = svm->vcpu.kvm; > struct kvm_lapic *apic = svm->vcpu.arch.apic; > /* > - * At this point, we expect that the AVIC HW has already > - * set the appropriate IRR bits on the valid target > - * vcpus. So, we just need to kick the appropriate vcpu. > + * Update ICR high and low, then emulate sending IPI, > + * which is handled when writing APIC_ICR. > */ > - kvm_for_each_vcpu(i, vcpu, kvm) { > - bool m = kvm_apic_match_dest(vcpu, apic, > - icrl & KVM_APIC_SHORT_MASK, > - GET_APIC_DEST_FIELD(icrh), > - icrl & KVM_APIC_DEST_MASK); > - > - if (m && !avic_vcpu_is_running(vcpu)) > - kvm_vcpu_wake_up(vcpu); > - } > + kvm_lapic_reg_write(apic, APIC_ICR2, icrh); > + kvm_lapic_reg_write(apic, APIC_ICR, icrl); > break; > } > case AVIC_IPI_FAILURE_INVALID_TARGET: > -- > 2.17.1
Re: [x86_64] Question about early page tables initialization
Hi, This is the corresponding C code which can help you understand: u64 *pml4 = (u64*)pgtable; u64 pdp = pgtable + 0x1000; u64 pml4_entry = pdp | PTE_P | PTE_W | PTU; // present, write, userspace = 0x7 pml4[0] = pml4_entry; The 0x1007 you see is just the calculation of the pml4_entry. Oren Twaig. On 02/03/2015 02:25 PM, Alex Kuleshov wrote: > Hello All, > > I have a question about page tables initialization in the > arch/x86/boot/compressed/head_64.S > > After we clear memory for page tables, there is code which > build PML4: > > lealpgtable + 0(%ebx), %edi > leal0x1007(%edi), %eax > movl%eax, 0(%edi) > > Why there is offset 0x1007 instead just 0x7? 0x1007 is > 4k + 7bit (PML4E) flags as i understand correctly. But > why we skip first 4k here? > > Thank you. > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [x86_64] Question about early page tables initialization
Hi, This is the corresponding C code which can help you understand: u64 *pml4 = (u64*)pgtable; u64 pdp = pgtable + 0x1000; u64 pml4_entry = pdp | PTE_P | PTE_W | PTU; // present, write, userspace = 0x7 pml4[0] = pml4_entry; The 0x1007 you see is just the calculation of the pml4_entry. Oren Twaig. On 02/03/2015 02:25 PM, Alex Kuleshov wrote: Hello All, I have a question about page tables initialization in the arch/x86/boot/compressed/head_64.S After we clear memory for page tables, there is code which build PML4: lealpgtable + 0(%ebx), %edi leal0x1007(%edi), %eax movl%eax, 0(%edi) Why there is offset 0x1007 instead just 0x7? 0x1007 is 4k + 7bit (PML4E) flags as i understand correctly. But why we skip first 4k here? Thank you. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: x86: vmalloc and THP
Hi Kirill, I saw the thread has developed nicely :), still - wanted to answer your question below. On 8/12/2014 9:07 AM, Kirill A. Shutemov wrote: On Tue, Aug 12, 2014 at 08:00:54AM +0300, Oren Twaig wrote: plain/text, please. Yes - noticed the html, sent again in plain text. If not, is there any fast way to change this behavior ? Maybe by changing the granularity/alignment of such allocations to allow such mapping ? What's the point to use vmalloc() in this case? I've noticed that some lock/s are using linear addresses which are located at 0xc901922b4500 and from what I understand from mm.txt (kernel 3.0.101): *c900 - e8ff (=45 bits) vmalloc/ioremap space *So I'm not sure who/how/why this lock got allocated there, but obviously it is using that linear set. No ? --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: x86: vmalloc and THP
Hi Kirill, I saw the thread has developed nicely :), still - wanted to answer your question below. On 8/12/2014 9:07 AM, Kirill A. Shutemov wrote: On Tue, Aug 12, 2014 at 08:00:54AM +0300, Oren Twaig wrote: html style=direction: ltr; plain/text, please. Yes - noticed the html, sent again in plain text. If not, is there any fast way to change this behavior ? Maybe by changing the granularity/alignment of such allocations to allow such mapping ? What's the point to use vmalloc() in this case? I've noticed that some lock/s are using linear addresses which are located at 0xc901922b4500 and from what I understand from mm.txt (kernel 3.0.101): *c900 - e8ff (=45 bits) vmalloc/ioremap space *So I'm not sure who/how/why this lock got allocated there, but obviously it is using that linear set. No ? --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
x86: vmalloc and THP
Hello, Does memory allocated using vmalloc() will be mapped using huge pages either directly or later by THP ? If not, is there any fast way to change this behavior ? Maybe by changing the granularity/alignment of such allocations to allow such mapping ? Thanks, Oren Twaig. --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
x86: vmalloc and THP
Hello, Does memory allocated using vmalloc() will be mapped using huge pages either directly or later by THP ? If not, is there any fast way to change this behavior ? Maybe by changing the granularity/alignment of such allocations to allow such mapping ? Thanks, Oren Twaig. --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:x86/apic] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Commit-ID: 411cf9ee2946492c0ac7eca48422fcf94a723ce5 Gitweb: http://git.kernel.org/tip/411cf9ee2946492c0ac7eca48422fcf94a723ce5 Author: Oren Twaig AuthorDate: Sun, 29 Jun 2014 13:01:08 +0300 Committer: H. Peter Anvin CommitDate: Sun, 13 Jul 2014 17:48:03 -0700 x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box() When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim Signed-off-by: Oren Twaig Link: http://lkml.kernel.org/r/1404036068-11674-1-git-send-email-o...@scalemp.com Signed-off-by: H. Peter Anvin --- arch/x86/include/asm/apic.h | 8 -- arch/x86/kernel/apic/apic.c | 60 + 2 files changed, 1 insertion(+), 67 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 69ed79a..b40ea7e 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ - return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ca1bd75..6b35d30 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ - int i, clusters, zeros; - unsigned id; - u16 *bios_cpu_apicid; - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - - for (i = 0; i < nr_cpu_ids; i++) { - /* are we being called early in kernel startup? */ - if (bios_cpu_apicid) { - id = bios_cpu_apicid[i]; - } else if (i < nr_cpu_ids) { - if (cpu_present(i)) - id = per_cpu(x86_bios_cpu_apicid, i); - else - continue; - } else - break; - - if (id != BAD_APICID) - __set_bit(APIC_CLUSTERID(id), clustermap); - } - - /* Problem: Partially populated chassis may not have CPUs in some of -* the APIC clusters they have been allocated. Only present CPUs have -* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. -* Since clusters are allocated sequentially, count zeros only if -* they are bounded by ones. -*/ - clusters = 0; - zeros = 0; - for (i = 0; i < NUM_APIC_CLUSTERS; i++) { - if (test_bit(i, clustermap)) { - clusters += 1 + zeros; - zeros = 0; - } else - ++zeros; - } - - return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); - if (multi) - return 1; - - if (!is_vsmp_box()) - return 0; - - /* -* ScaleMP vSMPowered boxes have one cluster per board and TSCs are -* not guaranteed to be synced between boards -*/ - if (apic_cluster_num() > 1) - return 1; - - return 0; + return multi; } #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
I see - thanks. BTW, the v2 was sent because I had a mailer send-as-text issues which caused the patch not to be sent properly and you had some conflicts with it when merging. All should be good now with the v2. Thanks, Oren. On 7/13/2014 7:40 PM, H. Peter Anvin wrote: Yes, although sometimes pings we useful to keep things from getting lost. Typically 1-2 weeks without response is a good reason to ping. *Don't repost*, just put a ping as a reply to the original 0/ patch, which in most threaded mail readers brings the thread back to the front. Sent from my tablet, pardon any formatting problems. On Jul 13, 2014, at 2:10, Richard Weinberger wrote: Am 13.07.2014 08:51, schrieb Oren Twaig: ping On 07/06/2014 09:33 AM, Oren Twaig wrote: Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. AFACT Peter is still crawling through his mail backlog to catch up. Thanks, //richard -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
I see - Thanks for the quick respond. Oren. On 07/13/2014 12:10 PM, Richard Weinberger wrote: > Am 13.07.2014 08:51, schrieb Oren Twaig: >> ping >> On 07/06/2014 09:33 AM, Oren Twaig wrote: >>> Hi, >>> >>> Quick question: As I'm new at this (submitting patches), what is a >>> reasonable >>> time to wait before ping-ing a patch ? >>> >>> Oren. > AFACT Peter is still crawling through his mail backlog to catch up. > > Thanks, > //richard > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
ping On 07/06/2014 09:33 AM, Oren Twaig wrote: > Hi, > > Quick question: As I'm new at this (submitting patches), what is a reasonable > time to wait before ping-ing a patch ? > > Oren. > > On 06/29/2014 01:01 PM, Oren Twaig wrote: >> When a vSMP Foundation box is detected, the function apic_cluster_num() >> counts >> the number of APIC clusters found. If more than one found, a multi board >> configuration is assumed, and TSC marked as unstable. This behavior is >> incorrect as vSMP Foundation may use processors from single node only, >> attached >> to memory of other nodes - and such node may have more than one APIC cluster >> (typically any recent intel box has more than single APIC_CLUSTERID(x)). >> >> To fix this, we simply remove the code which detects a vSMP Foundation box >> and >> affects apic_is_clusted_box() return value. This can be done because later >> the >> kernel checks by itself if the TSC is stable using the >> check_tsc_sync_[source|target]() functions and marks TSC as unstable if >> needed. >> >> Acked-by: Shai Fultheim >> Signed-off-by: Oren Twaig >> --- >> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h >> index 19b0eba..c100694 100644 >> --- a/arch/x86/include/asm/apic.h >> +++ b/arch/x86/include/asm/apic.h >> @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) >> #include >> #endif >> >> -#ifdef CONFIG_X86_64 >> -extern int is_vsmp_box(void); >> -#else >> -static inline int is_vsmp_box(void) >> -{ >> -return 0; >> -} >> -#endif >> extern int setup_profiling_timer(unsigned int); >> >> static inline void native_apic_mem_write(u32 reg, u32 v) >> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c >> index ad28db7..2b85bb9 100644 >> --- a/arch/x86/kernel/apic/apic.c >> +++ b/arch/x86/kernel/apic/apic.c >> @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } >> >> #ifdef CONFIG_X86_64 >> >> -static int apic_cluster_num(void) >> -{ >> -int i, clusters, zeros; >> -unsigned id; >> -u16 *bios_cpu_apicid; >> -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); >> - >> -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); >> -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); >> - >> -for (i = 0; i < nr_cpu_ids; i++) { >> -/* are we being called early in kernel startup? */ >> -if (bios_cpu_apicid) { >> -id = bios_cpu_apicid[i]; >> -} else if (i < nr_cpu_ids) { >> -if (cpu_present(i)) >> -id = per_cpu(x86_bios_cpu_apicid, i); >> -else >> -continue; >> -} else >> -break; >> - >> -if (id != BAD_APICID) >> -__set_bit(APIC_CLUSTERID(id), clustermap); >> -} >> - >> -/* Problem: Partially populated chassis may not have CPUs in some of >> - * the APIC clusters they have been allocated. Only present CPUs have >> - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. >> - * Since clusters are allocated sequentially, count zeros only if >> - * they are bounded by ones. >> - */ >> -clusters = 0; >> -zeros = 0; >> -for (i = 0; i < NUM_APIC_CLUSTERS; i++) { >> -if (test_bit(i, clustermap)) { >> -clusters += 1 + zeros; >> -zeros = 0; >> -} else >> -++zeros; >> -} >> - >> -return clusters; >> -} >> - >> static int multi_checked; >> static int multi; >> >> @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) >> int apic_is_clustered_box(void) >> { >> dmi_check_multi(); >> -if (multi) >> -return 1; >> - >> -if (!is_vsmp_box()) >> -return 0; >> - >> -/* >> - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are >> - * not guaranteed to be synced between boards >> - */ >> -if (apic_cluster_num() > 1) >> -return 1; >> - >> -return 0; >> +return multi; >> } >> #endif >> >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
ping On 07/06/2014 09:33 AM, Oren Twaig wrote: Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. On 06/29/2014 01:01 PM, Oren Twaig wrote: When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim s...@scalemp.com Signed-off-by: Oren Twaig o...@scalemp.com --- diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ -return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ -int i, clusters, zeros; -unsigned id; -u16 *bios_cpu_apicid; -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - -for (i = 0; i nr_cpu_ids; i++) { -/* are we being called early in kernel startup? */ -if (bios_cpu_apicid) { -id = bios_cpu_apicid[i]; -} else if (i nr_cpu_ids) { -if (cpu_present(i)) -id = per_cpu(x86_bios_cpu_apicid, i); -else -continue; -} else -break; - -if (id != BAD_APICID) -__set_bit(APIC_CLUSTERID(id), clustermap); -} - -/* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ -clusters = 0; -zeros = 0; -for (i = 0; i NUM_APIC_CLUSTERS; i++) { -if (test_bit(i, clustermap)) { -clusters += 1 + zeros; -zeros = 0; -} else -++zeros; -} - -return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); -if (multi) -return 1; - -if (!is_vsmp_box()) -return 0; - -/* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ -if (apic_cluster_num() 1) -return 1; - -return 0; +return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
I see - Thanks for the quick respond. Oren. On 07/13/2014 12:10 PM, Richard Weinberger wrote: Am 13.07.2014 08:51, schrieb Oren Twaig: ping On 07/06/2014 09:33 AM, Oren Twaig wrote: Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. AFACT Peter is still crawling through his mail backlog to catch up. Thanks, //richard -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
I see - thanks. BTW, the v2 was sent because I had a mailer send-as-text issues which caused the patch not to be sent properly and you had some conflicts with it when merging. All should be good now with the v2. Thanks, Oren. On 7/13/2014 7:40 PM, H. Peter Anvin wrote: Yes, although sometimes pings we useful to keep things from getting lost. Typically 1-2 weeks without response is a good reason to ping. *Don't repost*, just put a ping as a reply to the original 0/ patch, which in most threaded mail readers brings the thread back to the front. Sent from my tablet, pardon any formatting problems. On Jul 13, 2014, at 2:10, Richard Weinberger rich...@nod.at wrote: Am 13.07.2014 08:51, schrieb Oren Twaig: ping On 07/06/2014 09:33 AM, Oren Twaig wrote: Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. AFACT Peter is still crawling through his mail backlog to catch up. Thanks, //richard -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:x86/apic] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Commit-ID: 411cf9ee2946492c0ac7eca48422fcf94a723ce5 Gitweb: http://git.kernel.org/tip/411cf9ee2946492c0ac7eca48422fcf94a723ce5 Author: Oren Twaig o...@scalemp.com AuthorDate: Sun, 29 Jun 2014 13:01:08 +0300 Committer: H. Peter Anvin h...@zytor.com CommitDate: Sun, 13 Jul 2014 17:48:03 -0700 x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box() When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim s...@scalemp.com Signed-off-by: Oren Twaig o...@scalemp.com Link: http://lkml.kernel.org/r/1404036068-11674-1-git-send-email-o...@scalemp.com Signed-off-by: H. Peter Anvin h...@zytor.com --- arch/x86/include/asm/apic.h | 8 -- arch/x86/kernel/apic/apic.c | 60 + 2 files changed, 1 insertion(+), 67 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 69ed79a..b40ea7e 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ - return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ca1bd75..6b35d30 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ - int i, clusters, zeros; - unsigned id; - u16 *bios_cpu_apicid; - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - - for (i = 0; i nr_cpu_ids; i++) { - /* are we being called early in kernel startup? */ - if (bios_cpu_apicid) { - id = bios_cpu_apicid[i]; - } else if (i nr_cpu_ids) { - if (cpu_present(i)) - id = per_cpu(x86_bios_cpu_apicid, i); - else - continue; - } else - break; - - if (id != BAD_APICID) - __set_bit(APIC_CLUSTERID(id), clustermap); - } - - /* Problem: Partially populated chassis may not have CPUs in some of -* the APIC clusters they have been allocated. Only present CPUs have -* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. -* Since clusters are allocated sequentially, count zeros only if -* they are bounded by ones. -*/ - clusters = 0; - zeros = 0; - for (i = 0; i NUM_APIC_CLUSTERS; i++) { - if (test_bit(i, clustermap)) { - clusters += 1 + zeros; - zeros = 0; - } else - ++zeros; - } - - return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); - if (multi) - return 1; - - if (!is_vsmp_box()) - return 0; - - /* -* ScaleMP vSMPowered boxes have one cluster per board and TSCs are -* not guaranteed to be synced between boards -*/ - if (apic_cluster_num() 1) - return 1; - - return 0; + return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. On 06/29/2014 01:01 PM, Oren Twaig wrote: > When a vSMP Foundation box is detected, the function apic_cluster_num() counts > the number of APIC clusters found. If more than one found, a multi board > configuration is assumed, and TSC marked as unstable. This behavior is > incorrect as vSMP Foundation may use processors from single node only, > attached > to memory of other nodes - and such node may have more than one APIC cluster > (typically any recent intel box has more than single APIC_CLUSTERID(x)). > > To fix this, we simply remove the code which detects a vSMP Foundation box and > affects apic_is_clusted_box() return value. This can be done because later the > kernel checks by itself if the TSC is stable using the > check_tsc_sync_[source|target]() functions and marks TSC as unstable if > needed. > > Acked-by: Shai Fultheim > Signed-off-by: Oren Twaig > --- > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > index 19b0eba..c100694 100644 > --- a/arch/x86/include/asm/apic.h > +++ b/arch/x86/include/asm/apic.h > @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) > #include > #endif > > -#ifdef CONFIG_X86_64 > -extern int is_vsmp_box(void); > -#else > -static inline int is_vsmp_box(void) > -{ > -return 0; > -} > -#endif > extern int setup_profiling_timer(unsigned int); > > static inline void native_apic_mem_write(u32 reg, u32 v) > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index ad28db7..2b85bb9 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } > > #ifdef CONFIG_X86_64 > > -static int apic_cluster_num(void) > -{ > -int i, clusters, zeros; > -unsigned id; > -u16 *bios_cpu_apicid; > -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); > - > -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); > -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); > - > -for (i = 0; i < nr_cpu_ids; i++) { > -/* are we being called early in kernel startup? */ > -if (bios_cpu_apicid) { > -id = bios_cpu_apicid[i]; > -} else if (i < nr_cpu_ids) { > -if (cpu_present(i)) > -id = per_cpu(x86_bios_cpu_apicid, i); > -else > -continue; > -} else > -break; > - > -if (id != BAD_APICID) > -__set_bit(APIC_CLUSTERID(id), clustermap); > -} > - > -/* Problem: Partially populated chassis may not have CPUs in some of > - * the APIC clusters they have been allocated. Only present CPUs have > - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. > - * Since clusters are allocated sequentially, count zeros only if > - * they are bounded by ones. > - */ > -clusters = 0; > -zeros = 0; > -for (i = 0; i < NUM_APIC_CLUSTERS; i++) { > -if (test_bit(i, clustermap)) { > -clusters += 1 + zeros; > -zeros = 0; > -} else > -++zeros; > -} > - > -return clusters; > -} > - > static int multi_checked; > static int multi; > > @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) > int apic_is_clustered_box(void) > { > dmi_check_multi(); > -if (multi) > -return 1; > - > -if (!is_vsmp_box()) > -return 0; > - > -/* > - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are > - * not guaranteed to be synced between boards > - */ > -if (apic_cluster_num() > 1) > -return 1; > - > -return 0; > +return multi; > } > #endif > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. On 06/29/2014 01:01 PM, Oren Twaig wrote: > When a vSMP Foundation box is detected, the function apic_cluster_num() counts > the number of APIC clusters found. If more than one found, a multi board > configuration is assumed, and TSC marked as unstable. This behavior is > incorrect as vSMP Foundation may use processors from single node only, > attached > to memory of other nodes - and such node may have more than one APIC cluster > (typically any recent intel box has more than single APIC_CLUSTERID(x)). > > To fix this, we simply remove the code which detects a vSMP Foundation box and > affects apic_is_clusted_box() return value. This can be done because later the > kernel checks by itself if the TSC is stable using the > check_tsc_sync_[source|target]() functions and marks TSC as unstable if > needed. > > Acked-by: Shai Fultheim > Signed-off-by: Oren Twaig > --- > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > index 19b0eba..c100694 100644 > --- a/arch/x86/include/asm/apic.h > +++ b/arch/x86/include/asm/apic.h > @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) > #include > #endif > > -#ifdef CONFIG_X86_64 > -extern int is_vsmp_box(void); > -#else > -static inline int is_vsmp_box(void) > -{ > - return 0; > -} > -#endif > extern int setup_profiling_timer(unsigned int); > > static inline void native_apic_mem_write(u32 reg, u32 v) > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index ad28db7..2b85bb9 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } > > #ifdef CONFIG_X86_64 > > -static int apic_cluster_num(void) > -{ > - int i, clusters, zeros; > - unsigned id; > - u16 *bios_cpu_apicid; > - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); > - > - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); > - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); > - > - for (i = 0; i < nr_cpu_ids; i++) { > - /* are we being called early in kernel startup? */ > - if (bios_cpu_apicid) { > - id = bios_cpu_apicid[i]; > - } else if (i < nr_cpu_ids) { > - if (cpu_present(i)) > - id = per_cpu(x86_bios_cpu_apicid, i); > - else > - continue; > - } else > - break; > - > - if (id != BAD_APICID) > - __set_bit(APIC_CLUSTERID(id), clustermap); > - } > - > - /* Problem: Partially populated chassis may not have CPUs in some of > - * the APIC clusters they have been allocated. Only present CPUs have > - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. > - * Since clusters are allocated sequentially, count zeros only if > - * they are bounded by ones. > - */ > - clusters = 0; > - zeros = 0; > - for (i = 0; i < NUM_APIC_CLUSTERS; i++) { > - if (test_bit(i, clustermap)) { > - clusters += 1 + zeros; > - zeros = 0; > - } else > - ++zeros; > - } > - > - return clusters; > -} > - > static int multi_checked; > static int multi; > > @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) > int apic_is_clustered_box(void) > { > dmi_check_multi(); > - if (multi) > - return 1; > - > - if (!is_vsmp_box()) > - return 0; > - > - /* > - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are > - * not guaranteed to be synced between boards > - */ > - if (apic_cluster_num() > 1) > - return 1; > - > - return 0; > + return multi; > } > #endif > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. On 06/29/2014 01:01 PM, Oren Twaig wrote: When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim s...@scalemp.com Signed-off-by: Oren Twaig o...@scalemp.com --- diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ - return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ - int i, clusters, zeros; - unsigned id; - u16 *bios_cpu_apicid; - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - - for (i = 0; i nr_cpu_ids; i++) { - /* are we being called early in kernel startup? */ - if (bios_cpu_apicid) { - id = bios_cpu_apicid[i]; - } else if (i nr_cpu_ids) { - if (cpu_present(i)) - id = per_cpu(x86_bios_cpu_apicid, i); - else - continue; - } else - break; - - if (id != BAD_APICID) - __set_bit(APIC_CLUSTERID(id), clustermap); - } - - /* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ - clusters = 0; - zeros = 0; - for (i = 0; i NUM_APIC_CLUSTERS; i++) { - if (test_bit(i, clustermap)) { - clusters += 1 + zeros; - zeros = 0; - } else - ++zeros; - } - - return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); - if (multi) - return 1; - - if (!is_vsmp_box()) - return 0; - - /* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ - if (apic_cluster_num() 1) - return 1; - - return 0; + return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, Quick question: As I'm new at this (submitting patches), what is a reasonable time to wait before ping-ing a patch ? Oren. On 06/29/2014 01:01 PM, Oren Twaig wrote: When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim s...@scalemp.com Signed-off-by: Oren Twaig o...@scalemp.com --- diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ -return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ -int i, clusters, zeros; -unsigned id; -u16 *bios_cpu_apicid; -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - -for (i = 0; i nr_cpu_ids; i++) { -/* are we being called early in kernel startup? */ -if (bios_cpu_apicid) { -id = bios_cpu_apicid[i]; -} else if (i nr_cpu_ids) { -if (cpu_present(i)) -id = per_cpu(x86_bios_cpu_apicid, i); -else -continue; -} else -break; - -if (id != BAD_APICID) -__set_bit(APIC_CLUSTERID(id), clustermap); -} - -/* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ -clusters = 0; -zeros = 0; -for (i = 0; i NUM_APIC_CLUSTERS; i++) { -if (test_bit(i, clustermap)) { -clusters += 1 + zeros; -zeros = 0; -} else -++zeros; -} - -return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); -if (multi) -return 1; - -if (!is_vsmp_box()) -return 0; - -/* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ -if (apic_cluster_num() 1) -return 1; - -return 0; +return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, There was actually a problem that the mailer which changed a '\t' to spaces. This is why it conflicted with the tip:x86/apic branch. Any how, I've also tested on the x86/apic branch and everything is ok. Sent : "[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()" Sorry for any troubles. Thanks, Oren. On 06/27/2014 08:39 AM, H. Peter Anvin wrote: > On 06/26/2014 10:05 PM, Oren Twaig wrote: >> ping > > This patch conflicts with the changes on the tip:x86/apic branch. Could > you please rebase the patch on top of that branch? > > -hpa > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim Signed-off-by: Oren Twaig --- diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ - return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ - int i, clusters, zeros; - unsigned id; - u16 *bios_cpu_apicid; - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - - for (i = 0; i < nr_cpu_ids; i++) { - /* are we being called early in kernel startup? */ - if (bios_cpu_apicid) { - id = bios_cpu_apicid[i]; - } else if (i < nr_cpu_ids) { - if (cpu_present(i)) - id = per_cpu(x86_bios_cpu_apicid, i); - else - continue; - } else - break; - - if (id != BAD_APICID) - __set_bit(APIC_CLUSTERID(id), clustermap); - } - - /* Problem: Partially populated chassis may not have CPUs in some of -* the APIC clusters they have been allocated. Only present CPUs have -* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. -* Since clusters are allocated sequentially, count zeros only if -* they are bounded by ones. -*/ - clusters = 0; - zeros = 0; - for (i = 0; i < NUM_APIC_CLUSTERS; i++) { - if (test_bit(i, clustermap)) { - clusters += 1 + zeros; - zeros = 0; - } else - ++zeros; - } - - return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); - if (multi) - return 1; - - if (!is_vsmp_box()) - return 0; - - /* -* ScaleMP vSMPowered boxes have one cluster per board and TSCs are -* not guaranteed to be synced between boards -*/ - if (apic_cluster_num() > 1) - return 1; - - return 0; + return multi; } #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Acked-by: Shai Fultheim s...@scalemp.com Signed-off-by: Oren Twaig o...@scalemp.com --- diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ - return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ - int i, clusters, zeros; - unsigned id; - u16 *bios_cpu_apicid; - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); - bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - - for (i = 0; i nr_cpu_ids; i++) { - /* are we being called early in kernel startup? */ - if (bios_cpu_apicid) { - id = bios_cpu_apicid[i]; - } else if (i nr_cpu_ids) { - if (cpu_present(i)) - id = per_cpu(x86_bios_cpu_apicid, i); - else - continue; - } else - break; - - if (id != BAD_APICID) - __set_bit(APIC_CLUSTERID(id), clustermap); - } - - /* Problem: Partially populated chassis may not have CPUs in some of -* the APIC clusters they have been allocated. Only present CPUs have -* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. -* Since clusters are allocated sequentially, count zeros only if -* they are bounded by ones. -*/ - clusters = 0; - zeros = 0; - for (i = 0; i NUM_APIC_CLUSTERS; i++) { - if (test_bit(i, clustermap)) { - clusters += 1 + zeros; - zeros = 0; - } else - ++zeros; - } - - return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); - if (multi) - return 1; - - if (!is_vsmp_box()) - return 0; - - /* -* ScaleMP vSMPowered boxes have one cluster per board and TSCs are -* not guaranteed to be synced between boards -*/ - if (apic_cluster_num() 1) - return 1; - - return 0; + return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Hi, There was actually a problem that the mailer which changed a '\t' to spaces. This is why it conflicted with the tip:x86/apic branch. Any how, I've also tested on the x86/apic branch and everything is ok. Sent : [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box() Sorry for any troubles. Thanks, Oren. On 06/27/2014 08:39 AM, H. Peter Anvin wrote: On 06/26/2014 10:05 PM, Oren Twaig wrote: ping This patch conflicts with the changes on the tip:x86/apic branch. Could you please rebase the patch on top of that branch? -hpa -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
ping On 06/23/2014 08:35 AM, Oren Twaig wrote: > Remove invalid code which caused TSC to be declared as "unstable" on vSMP > Foundation box even if it was stable and let the kernel decide for itself. > > When a vSMP Foundation box is detected, the function apic_cluster_num() counts > the number of APIC clusters found. If more than one found, a multi board > configuration is assumed, and TSC marked as unstable. This behavior is > incorrect as vSMP Foundation may use processors from single node only, > attached > to memory of other nodes - and such node may have more than one APIC cluster > (typically any recent intel box has more than single APIC_CLUSTERID(x)). > > To fix this, we simply remove the code which detects a vSMP Foundation box and > affects apic_is_clusted_box() return value. This can be done because later the > kernel checks by itself if the TSC is stable using the > check_tsc_sync_[source|target]() functions and marks TSC as unstable if > needed. > > Signed-off-by: Oren Twaig > Acked-by: Shai Fultheim > --- > arch/x86/include/asm/apic.h |8 -- > arch/x86/kernel/apic/apic.c | 60 > +-- > 2 files changed, 1 insertion(+), 67 deletions(-) > > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > index 19b0eba..c100694 100644 > --- a/arch/x86/include/asm/apic.h > +++ b/arch/x86/include/asm/apic.h > @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) > #include > #endif > > -#ifdef CONFIG_X86_64 > -extern int is_vsmp_box(void); > -#else > -static inline int is_vsmp_box(void) > -{ > -return 0; > -} > -#endif > extern int setup_profiling_timer(unsigned int); > > static inline void native_apic_mem_write(u32 reg, u32 v) > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index ad28db7..2b85bb9 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } > > #ifdef CONFIG_X86_64 > > -static int apic_cluster_num(void) > -{ > -int i, clusters, zeros; > -unsigned id; > -u16 *bios_cpu_apicid; > -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); > - > -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); > -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); > - > -for (i = 0; i < nr_cpu_ids; i++) { > -/* are we being called early in kernel startup? */ > -if (bios_cpu_apicid) { > -id = bios_cpu_apicid[i]; > -} else if (i < nr_cpu_ids) { > -if (cpu_present(i)) > -id = per_cpu(x86_bios_cpu_apicid, i); > -else > -continue; > -} else > -break; > - > -if (id != BAD_APICID) > -__set_bit(APIC_CLUSTERID(id), clustermap); > -} > - > -/* Problem: Partially populated chassis may not have CPUs in some of > - * the APIC clusters they have been allocated. Only present CPUs have > - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. > - * Since clusters are allocated sequentially, count zeros only if > - * they are bounded by ones. > - */ > -clusters = 0; > -zeros = 0; > -for (i = 0; i < NUM_APIC_CLUSTERS; i++) { > -if (test_bit(i, clustermap)) { > -clusters += 1 + zeros; > -zeros = 0; > -} else > -++zeros; > -} > - > -return clusters; > -} > - > static int multi_checked; > static int multi; > > @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) > int apic_is_clustered_box(void) > { > dmi_check_multi(); > -if (multi) > -return 1; > - > -if (!is_vsmp_box()) > -return 0; > - > -/* > - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are > - * not guaranteed to be synced between boards > - */ > -if (apic_cluster_num() > 1) > -return 1; > - > -return 0; > +return multi; > } > #endif > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
ping On 06/23/2014 08:35 AM, Oren Twaig wrote: Remove invalid code which caused TSC to be declared as unstable on vSMP Foundation box even if it was stable and let the kernel decide for itself. When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com --- arch/x86/include/asm/apic.h |8 -- arch/x86/kernel/apic/apic.c | 60 +-- 2 files changed, 1 insertion(+), 67 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ -return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ -int i, clusters, zeros; -unsigned id; -u16 *bios_cpu_apicid; -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - -for (i = 0; i nr_cpu_ids; i++) { -/* are we being called early in kernel startup? */ -if (bios_cpu_apicid) { -id = bios_cpu_apicid[i]; -} else if (i nr_cpu_ids) { -if (cpu_present(i)) -id = per_cpu(x86_bios_cpu_apicid, i); -else -continue; -} else -break; - -if (id != BAD_APICID) -__set_bit(APIC_CLUSTERID(id), clustermap); -} - -/* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ -clusters = 0; -zeros = 0; -for (i = 0; i NUM_APIC_CLUSTERS; i++) { -if (test_bit(i, clustermap)) { -clusters += 1 + zeros; -zeros = 0; -} else -++zeros; -} - -return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); -if (multi) -return 1; - -if (!is_vsmp_box()) -return 0; - -/* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ -if (apic_cluster_num() 1) -return 1; - -return 0; +return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Remove invalid code which caused TSC to be declared as "unstable" on vSMP Foundation box even if it was stable and let the kernel decide for itself. When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Signed-off-by: Oren Twaig Acked-by: Shai Fultheim --- arch/x86/include/asm/apic.h |8 -- arch/x86/kernel/apic/apic.c | 60 +-- 2 files changed, 1 insertion(+), 67 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ -return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ -int i, clusters, zeros; -unsigned id; -u16 *bios_cpu_apicid; -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - -for (i = 0; i < nr_cpu_ids; i++) { -/* are we being called early in kernel startup? */ -if (bios_cpu_apicid) { -id = bios_cpu_apicid[i]; -} else if (i < nr_cpu_ids) { -if (cpu_present(i)) -id = per_cpu(x86_bios_cpu_apicid, i); -else -continue; -} else -break; - -if (id != BAD_APICID) -__set_bit(APIC_CLUSTERID(id), clustermap); -} - -/* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ -clusters = 0; -zeros = 0; -for (i = 0; i < NUM_APIC_CLUSTERS; i++) { -if (test_bit(i, clustermap)) { -clusters += 1 + zeros; -zeros = 0; -} else -++zeros; -} - -return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); -if (multi) -return 1; - -if (!is_vsmp_box()) -return 0; - -/* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ -if (apic_cluster_num() > 1) -return 1; - -return 0; +return multi; } #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
Remove invalid code which caused TSC to be declared as unstable on vSMP Foundation box even if it was stable and let the kernel decide for itself. When a vSMP Foundation box is detected, the function apic_cluster_num() counts the number of APIC clusters found. If more than one found, a multi board configuration is assumed, and TSC marked as unstable. This behavior is incorrect as vSMP Foundation may use processors from single node only, attached to memory of other nodes - and such node may have more than one APIC cluster (typically any recent intel box has more than single APIC_CLUSTERID(x)). To fix this, we simply remove the code which detects a vSMP Foundation box and affects apic_is_clusted_box() return value. This can be done because later the kernel checks by itself if the TSC is stable using the check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com --- arch/x86/include/asm/apic.h |8 -- arch/x86/kernel/apic/apic.c | 60 +-- 2 files changed, 1 insertion(+), 67 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 19b0eba..c100694 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void) #include asm/paravirt.h #endif -#ifdef CONFIG_X86_64 -extern int is_vsmp_box(void); -#else -static inline int is_vsmp_box(void) -{ -return 0; -} -#endif extern int setup_profiling_timer(unsigned int); static inline void native_apic_mem_write(u32 reg, u32 v) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index ad28db7..2b85bb9 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { } #ifdef CONFIG_X86_64 -static int apic_cluster_num(void) -{ -int i, clusters, zeros; -unsigned id; -u16 *bios_cpu_apicid; -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS); - -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid); -bitmap_zero(clustermap, NUM_APIC_CLUSTERS); - -for (i = 0; i nr_cpu_ids; i++) { -/* are we being called early in kernel startup? */ -if (bios_cpu_apicid) { -id = bios_cpu_apicid[i]; -} else if (i nr_cpu_ids) { -if (cpu_present(i)) -id = per_cpu(x86_bios_cpu_apicid, i); -else -continue; -} else -break; - -if (id != BAD_APICID) -__set_bit(APIC_CLUSTERID(id), clustermap); -} - -/* Problem: Partially populated chassis may not have CPUs in some of - * the APIC clusters they have been allocated. Only present CPUs have - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap. - * Since clusters are allocated sequentially, count zeros only if - * they are bounded by ones. - */ -clusters = 0; -zeros = 0; -for (i = 0; i NUM_APIC_CLUSTERS; i++) { -if (test_bit(i, clustermap)) { -clusters += 1 + zeros; -zeros = 0; -} else -++zeros; -} - -return clusters; -} - static int multi_checked; static int multi; @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void) int apic_is_clustered_box(void) { dmi_check_multi(); -if (multi) -return 1; - -if (!is_vsmp_box()) -return 0; - -/* - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are - * not guaranteed to be synced between boards - */ -if (apic_cluster_num() 1) -return 1; - -return 0; +return multi; } #endif -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] x86, Clean up smp_num_siblings calculation
Hi Prarit, See below, On 05/30/2014 02:43 PM, Prarit Bhargava wrote: > I have a system on which I have disabled threading in the BIOS, and I am > booting > the kernel with the option "idle=poll". > > The kernel displays > > process: WARNING: polling idle and HT enabled, performance may degrade > > which is incorrect -- I've already disabled HT. > > This warning is issued here: > > void select_idle_routine(const struct cpuinfo_x86 *c) > { > if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1) > pr_warn_once("WARNING: polling idle and HT enabled, > performance may degrade\n"); > > >From my understanding of the other ares of kernel that use > smp_num_siblings, the value is supposed to be the the number of threads > per core. > > The value of smp_num_siblings is incorrect. In theory, it should be 1 but it > is reported as 2. When I looked into how smp_num_siblings is calculated I > found the following call sequence in the kernel: > > start_kernel -> > check_bugs -> > identify_boot_cpu -> > identify_cpu -> > c_init = init_intel > init_intel -> > > detect_extended_topology > (sets value) > > OR > > c_init = init_amd > init_amd -> amd_detect_cmp > -> > amd_get_topology > (sets value) > -> detect_ht() > ...(sets value) > detect_ht() > (also sets value) > > ie) it is set three times in some cases and is overwritten by the call > to detect_ht() from identify_cpu() in all cases. > > It should be noted that nothing in the identify_cpu() path or the cpu_up() > path requires smp_num_siblings to be set, prior to the final call to > detect_ht(). > > For x86 boxes, smp_num_siblings is set to a value read in a CPUID call in > detect_ht(). This value is the *factory defined* value in all cases; even > if HT is disabled in BIOS the value still returns 2 if the CPU supports > HT. AMD also reports the factory defined value in all cases. The above is incorrect in case of X-TOPOLOGY mode. I such a case the information about number of siblings comes from the LEVEL_MAX_SIBLINGS() macro and the X86_FEATURE_XTOPOLOGY flag is set to skip detect_ht() work : void detect_ht(struct cpuinfo_x86 *c) ... if (cpu_has(c, X86_FEATURE_XTOPOLOGY)) return; In addition, the information about the number of sibling no longer comes from CPUID(0x1)->ebx but rather from the 0xb leaf of CPUID. I've marked below the problematic code change. Thanks, Oren Twaig > > Other uses of smp_num_siblings involve oprofile (used after boot), and > the perf code which is done well after the initial cpus are brought online. > > This patch removes dead code and moves the assignment of smp_num_siblings > to only the detect_ht() code; it is still always reporting 2. A follow > on patch will fix the calculation. > > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: "H. Peter Anvin" > Cc: x...@kernel.org > Cc: Borislav Petkov > Cc: Paul Gortmaker > Cc: Andrew Morton > Cc: Andi Kleen > Cc: Dave Jones > Cc: Torsten Kaiser > Cc: Jan Beulich > Cc: Jan Kiszka > Cc: Toshi Kani > Cc: Andrew Jones > Signed-off-by: Prarit Bhargava > --- > arch/x86/kernel/cpu/amd.c |1 - > arch/x86/kernel/cpu/common.c | 23 +++ > arch/x86/kernel/cpu/topology.c |2 +- > arch/x86/kernel/smpboot.c |5 ++--- > 4 files changed, 14 insertions(+), 17 deletions(-) > > diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c > index ce8b8ff..6aca2b6 100644 > --- a/arch/x86/kernel/cpu/amd.c > +++ b/arch/x86/kernel/cpu/amd.c > @@ -304,7 +304,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) > node_id = ecx & 7; > > /* get compute unit information */ > -smp_num_siblings = ((ebx >> 8) & 3) + 1; > c->compute_unit_id = ebx & 0xff; > cores_per_cu += ((ebx >> 8) & 3); > } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { > diff --git a/a
Re: [PATCH 1/2] x86, Clean up smp_num_siblings calculation
Hi Prarit, See below, On 05/30/2014 02:43 PM, Prarit Bhargava wrote: I have a system on which I have disabled threading in the BIOS, and I am booting the kernel with the option idle=poll. The kernel displays process: WARNING: polling idle and HT enabled, performance may degrade which is incorrect -- I've already disabled HT. This warning is issued here: void select_idle_routine(const struct cpuinfo_x86 *c) { if (boot_option_idle_override == IDLE_POLL smp_num_siblings 1) pr_warn_once(WARNING: polling idle and HT enabled, performance may degrade\n); From my understanding of the other ares of kernel that use smp_num_siblings, the value is supposed to be the the number of threads per core. The value of smp_num_siblings is incorrect. In theory, it should be 1 but it is reported as 2. When I looked into how smp_num_siblings is calculated I found the following call sequence in the kernel: start_kernel - check_bugs - identify_boot_cpu - identify_cpu - c_init = init_intel init_intel - detect_extended_topology (sets value) OR c_init = init_amd init_amd - amd_detect_cmp - amd_get_topology (sets value) - detect_ht() ...(sets value) detect_ht() (also sets value) ie) it is set three times in some cases and is overwritten by the call to detect_ht() from identify_cpu() in all cases. It should be noted that nothing in the identify_cpu() path or the cpu_up() path requires smp_num_siblings to be set, prior to the final call to detect_ht(). For x86 boxes, smp_num_siblings is set to a value read in a CPUID call in detect_ht(). This value is the *factory defined* value in all cases; even if HT is disabled in BIOS the value still returns 2 if the CPU supports HT. AMD also reports the factory defined value in all cases. The above is incorrect in case of X-TOPOLOGY mode. I such a case the information about number of siblings comes from the LEVEL_MAX_SIBLINGS() macro and the X86_FEATURE_XTOPOLOGY flag is set to skip detect_ht() work : void detect_ht(struct cpuinfo_x86 *c) ... if (cpu_has(c, X86_FEATURE_XTOPOLOGY)) return; In addition, the information about the number of sibling no longer comes from CPUID(0x1)-ebx but rather from the 0xb leaf of CPUID. I've marked below the problematic code change. Thanks, Oren Twaig Other uses of smp_num_siblings involve oprofile (used after boot), and the perf code which is done well after the initial cpus are brought online. This patch removes dead code and moves the assignment of smp_num_siblings to only the detect_ht() code; it is still always reporting 2. A follow on patch will fix the calculation. Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: x...@kernel.org Cc: Borislav Petkov b...@suse.de Cc: Paul Gortmaker paul.gortma...@windriver.com Cc: Andrew Morton a...@linux-foundation.org Cc: Andi Kleen a...@linux.intel.com Cc: Dave Jones da...@redhat.com Cc: Torsten Kaiser just.for.l...@googlemail.com Cc: Jan Beulich jbeul...@suse.com Cc: Jan Kiszka jan.kis...@siemens.com Cc: Toshi Kani toshi.k...@hp.com Cc: Andrew Jones drjo...@redhat.com Signed-off-by: Prarit Bhargava pra...@redhat.com --- arch/x86/kernel/cpu/amd.c |1 - arch/x86/kernel/cpu/common.c | 23 +++ arch/x86/kernel/cpu/topology.c |2 +- arch/x86/kernel/smpboot.c |5 ++--- 4 files changed, 14 insertions(+), 17 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index ce8b8ff..6aca2b6 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -304,7 +304,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c) node_id = ecx 7; /* get compute unit information */ -smp_num_siblings = ((ebx 8) 3) + 1; c-compute_unit_id = ebx 0xff; cores_per_cu += ((ebx 8) 3); } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a135239..fc1235c 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -507,42 +507,41 @@ void detect_ht(struct cpuinfo_x86 *c) u32 eax, ebx, ecx, edx; int index_msb
Re: [PATCH] x86, hugetlb: add missing TLB page invalidation for hugetlb_cow()
On 05/15/2014 08:00 PM, Anthony Iliopoulos wrote: > I have dismissed this case, since I assume that there are many more > cycles spent in servicing the TLB invalidation IPI, walking the pgtable > plus other related overhead (e.g. sched) than in updating the pte/pmd > so I am not sure how possible it would be to hit this condition. Hi Anthony, I have a question about the above statement. What will happen with multi cpu VMs ? won't the race described above can happen ? I.e one virtual CPU can will visit the host and the other will continue to encounter your race ? Thanks, Oren. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86, hugetlb: add missing TLB page invalidation for hugetlb_cow()
On 05/15/2014 08:00 PM, Anthony Iliopoulos wrote: I have dismissed this case, since I assume that there are many more cycles spent in servicing the TLB invalidation IPI, walking the pgtable plus other related overhead (e.g. sched) than in updating the pte/pmd so I am not sure how possible it would be to hit this condition. Hi Anthony, I have a question about the above statement. What will happen with multi cpu VMs ? won't the race described above can happen ? I.e one virtual CPU can will visit the host and the other will continue to encounter your race ? Thanks, Oren. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:x86/urgent] x86/vsmp: Fix irq routing
Commit-ID: 39025ba38278f3003ee538409f7c98970620ef49 Gitweb: http://git.kernel.org/tip/39025ba38278f3003ee538409f7c98970620ef49 Author: Oren Twaig AuthorDate: Mon, 28 Apr 2014 10:21:37 +0300 Committer: Ingo Molnar CommitDate: Mon, 28 Apr 2014 09:27:34 +0200 x86/vsmp: Fix irq routing Correct IRQ routing in case a vSMP box is detected but the Interrupt Routing Comply (IRC) value is set to "comply", which leads to incorrect IRQ routing. Before the patch: When a vSMP box was detected and IRC was set to "comply", users (and the kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this "overrided" mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to "comply", users (and the kernel) can control the destination of the IRQs as we will not be changing the default "apic->vector_allocation_domain". Signed-off-by: Oren Twaig Acked-by: Shai Fultheim Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-o...@scalemp.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar --- arch/x86/kernel/vsmp_64.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..5edc34b 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +/* Flag below is initialized once during vSMP PCI initialization. */ +static int irq_routing_comply = 1; + #if defined CONFIG_PCI && defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap & ctl & BIT(8)) { ctl &= ~BIT(8); + + /* Interrupt routing set to ignore */ + irq_routing_comply = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic->phys_pkg_id = apicid_phys_pkg_id; - apic->vector_allocation_domain = fill_vector_allocation_domain; + + if (!irq_routing_comply) + apic->vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:x86/urgent] x86/vsmp: Fix irq routing
Commit-ID: 39025ba38278f3003ee538409f7c98970620ef49 Gitweb: http://git.kernel.org/tip/39025ba38278f3003ee538409f7c98970620ef49 Author: Oren Twaig o...@scalemp.com AuthorDate: Mon, 28 Apr 2014 10:21:37 +0300 Committer: Ingo Molnar mi...@kernel.org CommitDate: Mon, 28 Apr 2014 09:27:34 +0200 x86/vsmp: Fix irq routing Correct IRQ routing in case a vSMP box is detected but the Interrupt Routing Comply (IRC) value is set to comply, which leads to incorrect IRQ routing. Before the patch: When a vSMP box was detected and IRC was set to comply, users (and the kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this overrided mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to comply, users (and the kernel) can control the destination of the IRQs as we will not be changing the default apic-vector_allocation_domain. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-o...@scalemp.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar mi...@kernel.org --- arch/x86/kernel/vsmp_64.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..5edc34b 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +/* Flag below is initialized once during vSMP PCI initialization. */ +static int irq_routing_comply = 1; + #if defined CONFIG_PCI defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap ctl BIT(8)) { ctl = ~BIT(8); + + /* Interrupt routing set to ignore */ + irq_routing_comply = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic-phys_pkg_id = apicid_phys_pkg_id; - apic-vector_allocation_domain = fill_vector_allocation_domain; + + if (!irq_routing_comply) + apic-vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Correct IRQ routing in case a vSMP Foundation box is detected but the Interrupt Routing Comply (IRC) is set to "comply". Before the patch: When a vSMP Foundation box was detected and IRC was set to "comply", users (and kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this "overrided" mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to "comply", Users (and kernel) can control the destination of the IRQs as we will not be changing the default "apic->vector_allocation_domain". Signed-off-by: Oren Twaig Acked-by: Shai Fultheim --- arch/x86/kernel/vsmp_64.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..39dd854 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +/* Flag below is initialized once during vSMP Foundation PCI initialization. */ +static int interrupt_routing_comply = 1; + #if defined CONFIG_PCI && defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap & ctl & BIT(8)) { ctl &= ~BIT(8); + + /* Interrupt routing set to ignore */ + interrupt_routing_comply = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic->phys_pkg_id = apicid_phys_pkg_id; - apic->vector_allocation_domain = fill_vector_allocation_domain; + + if (!interrupt_routing_comply) + apic->vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Andi, On 04/27/2014 09:34 PM, Andi Kleen wrote: > Again, what lock protects it? > > If you cannot answer that question you likely shouldn't use static. The only function which touches this variable is vsmp_init() which is an "_init" function which is guarantee to run by a single cpu - this means, no race. Thanks, Oren > > -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Andi, On 04/27/2014 09:34 PM, Andi Kleen wrote: Again, what lock protects it? If you cannot answer that question you likely shouldn't use static. The only function which touches this variable is vsmp_init() which is an _init function which is guarantee to run by a single cpu - this means, no race. Thanks, Oren -Andi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Correct IRQ routing in case a vSMP Foundation box is detected but the Interrupt Routing Comply (IRC) is set to comply. Before the patch: When a vSMP Foundation box was detected and IRC was set to comply, users (and kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this overrided mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to comply, Users (and kernel) can control the destination of the IRQs as we will not be changing the default apic-vector_allocation_domain. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com --- arch/x86/kernel/vsmp_64.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..39dd854 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +/* Flag below is initialized once during vSMP Foundation PCI initialization. */ +static int interrupt_routing_comply = 1; + #if defined CONFIG_PCI defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap ctl BIT(8)) { ctl = ~BIT(8); + + /* Interrupt routing set to ignore */ + interrupt_routing_comply = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic-phys_pkg_id = apicid_phys_pkg_id; - apic-vector_allocation_domain = fill_vector_allocation_domain; + + if (!interrupt_routing_comply) + apic-vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Andi, On 04/25/2014 05:22 PM, Andi Kleen wrote: >> +static int irc = 1; > Using a static for such state is very unusual. You need to describe what > protects it against races and why that is needed over a cleaner solution. The only reason I've used a static variable is because I wanted to avoid inserting another code/functions which are depended on CONFIG_PCI. The code is used once during initialization and hence cannot be racy. But, if static variables are unusual (new at linux kernel), I will change the flow to read the HW state again (using the PCI functions). Please let me know if that is desirable. Thanks, Oren. > > -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Ingo, On 04/26/2014 09:09 AM, Ingo Molnar wrote: > I still don't see a clear explanation of what the _user_ saw and sees > before and after the change. What is the effect of the patch: correct > IRQ routing (i.e. before the change IRQs would end up on the wrong > CPU), lower overhead IRQ routing (i.e. before the change IRQ routing > overhead was more expensive), or something else? > > You don't spell this out clearly and it's a crucial piece of > information that comes before every other explanation. > I see.. I tried to explain the entire flow and that was confusing - I'll explain only the patch. As you stated, in general, the patch corrects IRQ routing in case a vSMP Foundation box is detected but the Interrupt Routing Comply (IRC) is set to "comply". Before the patch: When a vSMP Foundation box was detected and IRC was set to "comply", users (and kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this "overrided" mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to "comply", Users (and kernel) can control the destination of the IRQs as we will not be changing the default "apic->vector_allocation_domain". Thanks, Oren > Thanks, > > Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Ingo, On 04/26/2014 09:09 AM, Ingo Molnar wrote: I still don't see a clear explanation of what the _user_ saw and sees before and after the change. What is the effect of the patch: correct IRQ routing (i.e. before the change IRQs would end up on the wrong CPU), lower overhead IRQ routing (i.e. before the change IRQ routing overhead was more expensive), or something else? You don't spell this out clearly and it's a crucial piece of information that comes before every other explanation. I see.. I tried to explain the entire flow and that was confusing - I'll explain only the patch. As you stated, in general, the patch corrects IRQ routing in case a vSMP Foundation box is detected but the Interrupt Routing Comply (IRC) is set to comply. Before the patch: When a vSMP Foundation box was detected and IRC was set to comply, users (and kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this overrided mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to comply, Users (and kernel) can control the destination of the IRQs as we will not be changing the default apic-vector_allocation_domain. Thanks, Oren Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Andi, On 04/25/2014 05:22 PM, Andi Kleen wrote: +static int irc = 1; Using a static for such state is very unusual. You need to describe what protects it against races and why that is needed over a cleaner solution. The only reason I've used a static variable is because I wanted to avoid inserting another code/functions which are depended on CONFIG_PCI. The code is used once during initialization and hence cannot be racy. But, if static variables are unusual (new at linux kernel), I will change the flow to read the HW state again (using the PCI functions). Please let me know if that is desirable. Thanks, Oren. -Andi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
On 4/25/2014 11:01 AM, Ingo Molnar wrote: > * Oren Twaig wrote: > >> vSMP Foundation provides locality based interrupt routing which needed >> vector_allocation_domain to allow all online cpus can handle all possible >> vectors. >> >> Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as >> otherwise the IOAPIC, MSI and MSIX destination selectors to always select the >> lowest online cpu as the destination. I.e affinity of HW interrupts cannot be >> controled by kernel and/or userspace code. >> >> The purpose of the patch is to fix the code to set override vector allocation >> domain only when IRC is set to ignore to allow the kernel and userspace to >> effectively control the destination of the HW interrupts. >> >> Signed-off-by: Oren Twaig >> Acked-by: Shai Fultheim > > So what was the behavior before the change - certain IRQs did not get > routed, they just ended up on CPU0 or on some other undesirable CPU? > Or was IRQ distribution random? It's not clear from the changelog. It all depends on the IRC flag. When set to "ignore" by the linux kernel, vSMP Foundation knew that it can deliver the IRQ to the CPU which would result in less virtualization overhead. For example, we could deliver the HW interrupt to the CPU which got it or any other CPU in the system. We couldn't have done it without the kernel making sure that each vector can be passed to all CPUs. This is why we override the verctor allocation domain to signal all CPUs. But, when the IRC is set to "comply" we, before this patch, still efected the allocation domains alltough it wasn't needed. It wasn't needed because when in "comply" mode, we always pass the HW interrupt to the CPU the kernel requested (by setting the IOAPIC entry, MSI/X entry or IR entry) Thanks, Oren > Thanks, > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
On 4/25/2014 11:01 AM, Ingo Molnar wrote: * Oren Twaig o...@scalemp.com wrote: vSMP Foundation provides locality based interrupt routing which needed vector_allocation_domain to allow all online cpus can handle all possible vectors. Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as otherwise the IOAPIC, MSI and MSIX destination selectors to always select the lowest online cpu as the destination. I.e affinity of HW interrupts cannot be controled by kernel and/or userspace code. The purpose of the patch is to fix the code to set override vector allocation domain only when IRC is set to ignore to allow the kernel and userspace to effectively control the destination of the HW interrupts. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com So what was the behavior before the change - certain IRQs did not get routed, they just ended up on CPU0 or on some other undesirable CPU? Or was IRQ distribution random? It's not clear from the changelog. It all depends on the IRC flag. When set to ignore by the linux kernel, vSMP Foundation knew that it can deliver the IRQ to the CPU which would result in less virtualization overhead. For example, we could deliver the HW interrupt to the CPU which got it or any other CPU in the system. We couldn't have done it without the kernel making sure that each vector can be passed to all CPUs. This is why we override the verctor allocation domain to signal all CPUs. But, when the IRC is set to comply we, before this patch, still efected the allocation domains alltough it wasn't needed. It wasn't needed because when in comply mode, we always pass the HW interrupt to the CPU the kernel requested (by setting the IOAPIC entry, MSI/X entry or IR entry) Thanks, Oren Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
vSMP Foundation provides locality based interrupt routing which needed vector_allocation_domain to allow all online cpus can handle all possible vectors. Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as otherwise the IOAPIC, MSI and MSIX destination selectors to always select the lowest online cpu as the destination. I.e affinity of HW interrupts cannot be controled by kernel and/or userspace code. The purpose of the patch is to fix the code to set override vector allocation domain only when IRC is set to ignore to allow the kernel and userspace to effectively control the destination of the HW interrupts. Signed-off-by: Oren Twaig Acked-by: Shai Fultheim --- arch/x86/kernel/vsmp_64.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..b7f8e5b 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,8 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +static int irc = 1; + #if defined CONFIG_PCI && defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +103,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap & ctl & BIT(8)) { ctl &= ~BIT(8); + + /* Interrupt routing set to ignore */ + irc = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +224,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic->phys_pkg_id = apicid_phys_pkg_id; - apic->vector_allocation_domain = fill_vector_allocation_domain; + + if (!irc) + apic->vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Ingo, On 04/24/2014 09:22 AM, Ingo Molnar wrote: > > * Oren Twaig wrote: > >> Set all inclusive vector allocation domain only if interrupt routing >> is set to ignore. When in comply mode, vector allocation behavior >> isn't changed. > > This changelog is too terse: > > Please update the changelog to describe the current behavior, and how > it affects your platform. (I.e. how do users notice, if at all?) > > Then describe why you think that behavior should be changed. ie: > what's the reason for this patch. > > Only then describe the details of the change itself. I will send v2 of the patch with a revised and more detailed explanation about the behavior aspect of the patch. > >> -apic->vector_allocation_domain = fill_vector_allocation_domain; >> + >> +if (!irc) >> +apic->vector_allocation_domain = fill_vector_allocation_domain; > > Please also run scripts/checkpatch.pl over your patch which will > report trivial coding style problems like the one here. Yep - did that just like the instructions on "SubmmitingPatches", unfortunately, although I've followed the Thunderbird configuration instructions with external editor, it seems to sent the the email content wrong (bad line endings as I saw now). So sorry for the trouble, will change to a text email client and will resend the patch. Thanks, Oren. > > Thanks, > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Hi Ingo, On 04/24/2014 09:22 AM, Ingo Molnar wrote: * Oren Twaig o...@scalemp.com wrote: Set all inclusive vector allocation domain only if interrupt routing is set to ignore. When in comply mode, vector allocation behavior isn't changed. This changelog is too terse: Please update the changelog to describe the current behavior, and how it affects your platform. (I.e. how do users notice, if at all?) Then describe why you think that behavior should be changed. ie: what's the reason for this patch. Only then describe the details of the change itself. I will send v2 of the patch with a revised and more detailed explanation about the behavior aspect of the patch. -apic-vector_allocation_domain = fill_vector_allocation_domain; + +if (!irc) +apic-vector_allocation_domain = fill_vector_allocation_domain; Please also run scripts/checkpatch.pl over your patch which will report trivial coding style problems like the one here. Yep - did that just like the instructions on SubmmitingPatches, unfortunately, although I've followed the Thunderbird configuration instructions with external editor, it seems to sent the the email content wrong (bad line endings as I saw now). So sorry for the trouble, will change to a text email client and will resend the patch. Thanks, Oren. Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
vSMP Foundation provides locality based interrupt routing which needed vector_allocation_domain to allow all online cpus can handle all possible vectors. Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as otherwise the IOAPIC, MSI and MSIX destination selectors to always select the lowest online cpu as the destination. I.e affinity of HW interrupts cannot be controled by kernel and/or userspace code. The purpose of the patch is to fix the code to set override vector allocation domain only when IRC is set to ignore to allow the kernel and userspace to effectively control the destination of the HW interrupts. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com --- arch/x86/kernel/vsmp_64.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..b7f8e5b 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -26,6 +26,8 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 +static int irc = 1; + #if defined CONFIG_PCI defined CONFIG_PARAVIRT /* * Interrupt control on vSMPowered systems: @@ -101,6 +103,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap ctl BIT(8)) { ctl = ~BIT(8); + + /* Interrupt routing set to ignore */ + irc = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +224,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic-phys_pkg_id = apicid_phys_pkg_id; - apic-vector_allocation_domain = fill_vector_allocation_domain; + + if (!irc) + apic-vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Set all inclusive vector allocation domain only if interrupt routing is set to ignore. When in comply mode, vector allocation behavior isn't changed. Signed-off-by: Oren Twaig Acked-by: Shai Fultheim --- diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..60a195d 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -27,6 +27,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT + +static int irc = 1; + /* * Interrupt control on vSMPowered systems: * ~AC is a shadow of IF. If IF is 'on' AC should be 'off' @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap & ctl & BIT(8)) { ctl &= ~BIT(8); + +/* Interrupt routing set to ignore */ +irc = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic->phys_pkg_id = apicid_phys_pkg_id; -apic->vector_allocation_domain = fill_vector_allocation_domain; + +if (!irc) +apic->vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore
Set all inclusive vector allocation domain only if interrupt routing is set to ignore. When in comply mode, vector allocation behavior isn't changed. Signed-off-by: Oren Twaig o...@scalemp.com Acked-by: Shai Fultheim s...@scalemp.com --- diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index f6584a9..60a195d 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -27,6 +27,9 @@ #define TOPOLOGY_REGISTER_OFFSET 0x10 #if defined CONFIG_PCI defined CONFIG_PARAVIRT + +static int irc = 1; + /* * Interrupt control on vSMPowered systems: * ~AC is a shadow of IF. If IF is 'on' AC should be 'off' @@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void) #ifdef CONFIG_SMP if (cap ctl BIT(8)) { ctl = ~BIT(8); + +/* Interrupt routing set to ignore */ +irc = 0; + #ifdef CONFIG_PROC_FS /* Don't let users change irq affinity via procfs */ no_irq_affinity = 1; @@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void) { /* need to update phys_pkg_id */ apic-phys_pkg_id = apicid_phys_pkg_id; -apic-vector_allocation_domain = fill_vector_allocation_domain; + +if (!irc) +apic-vector_allocation_domain = fill_vector_allocation_domain; } void __init vsmp_init(void) --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/