Re: [PATCH] svm: Fix AVIC incomplete IPI emulation

2019-03-13 Thread Oren Twaig

Hi Suravee,

Turns out, the _same_ bug was already discussed in the past by
yourself, Paolo and Radim (both now 'cc'-ed)

Please read it here:
https://patchwork.kernel.org/patch/8292231/


After reading that thread, I have couple of questions:

First,

You wrote : "I have tried NOT setting the IRR, and only
kick_vcpu(). And things seem to work fine. Therefore, I think your
analysis is likely to be correct."

AFAIU, it means that the below patch is wrong just as Paolo
suggested in his original answer and you did fixed it back than, but the
code is now back ?


Second,

Did you made sure (with your HW desginer) that what the
specifications refer as "atomically" means that the IRR is
set (i.e the cacheline is taken exclusively) _before_ the
isRunning bit is _read_ ?

Because, if it doesn't, just like Paolo suggested,
it means there is no way to use the feature as a
"sending" vcpu can send IPI without any exit and
the receiving cpu will never see the IRR bit as the
sending cpu didn't made sure the IRR is set _before_
reading the isRunning.

Thanks,
Oren

On 03/13/2019 09:30 AM, Oren Twaig wrote:

Hi Suravee,

Please see below..

On 03/11/2019 01:38 PM, Suthikulpanit, Suravee wrote:

Hi Oren,

Sorry for delay response.

On 3/5/19 1:15 AM, Oren Twaig wrote:

Hello Suravee,

According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the
not-running bit set in the avic backing page.

I believe you are referring to the "isRunning" (IR) bit is in the
AVIC physical APIC ID table entry.
I meant cause ID=1 in IPI Delivery Failure Cause (SDM rev 3.30, sep 
2018, Table 15-28):


"
1: IPI Target Not Running:

IsRunning bit of the target for a
Singlecast/Broadcast/Multicast IPI is not set in
the physical APIC ID table.

"




However, not before the CPU _already_ set the relevant IRR bit
in all these cpus.

Not sure what you meant here.


Here is the full snippet from the specifications:
"
5.For every valid destination:
- Atomically set the appropriate IRR bit in each of the destinations’
  vAPIC backing page.
- Check the IsRunning status of each destination.
- If the destination IsRunning bit is set, send a doorbell message
  using the host physical core number from the Physical APIC ID table.

6. If any destinations are identified as not currently scheduled on
 a physical core (i.e., the IsRunning
 bit for that virtual processor is not set), cause a #VMEXIT.
"

According to the specification above, the HW should first
set the appropriate bit in the IRR (Interrupt Request Register)
_before_ causing VMEXIT of IPI-delivery-not-completed
with  ID=1 (Target not running).




In this change, the patch forces KVM to send another interrupt
to the vcpu whether SVM already did that or not. Which means
the vcpu/s, under some conditions, can get an EXTRA interrupt
it never intended to get >> Example:
    1. vcpu B: Is in "not-running" state.
    2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
    3. vcpu A: SVM updates vcpu B IRR with bit 80
    4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
    5. vcpu A: Now stops executing any code @ hypervisor level.
    6. vcpu B: Due to another interrupt (like lapic timer)
   resumes running the guest. While handling interrupts,
   it also handles interrupt vector 80 (as it's in his IRR)
    7. vcpu A: resumes executing the below code and sends
   an _additional_interrupt to vcpu B.

Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.

Can you please elaborate more to why the implementation
below conflict with the specifications (which was the code
before this commit) ?

This patch was introduced to fix an issue where the SVM driver tries to
handle the step 5 above by scheduling vcpu B into _running_ state to 
handle

the IPI from vcpu A. However, prior to this patch, vcpu B was never get
scheduled to run unless there are other interrupts (e.g. timer).
Exactly. Only what needed here is *only* to wakeup the vcpu B. Why ? 
because

the apic of vcpu B _already_ contains the interrupt in the pending
IRR. Than, once vcpu B will run it will process the IRR which contains
the vector placed by the HW and will deliver it.
This should not be the case as Vcpu B should have been running 
regardless

of other interrupts. So, I don't think step 6 and 7 above are correct.

The example of vcpu A that stops executing is just to highlight that
the code can't depend on that the kvm code of vcpu A will finish the
ICR "fake" call before vcpu B runs (beacuse of any interrupt) and process
that IRR request placed by the HW.


The issue was caused by the apic->irr_pending not set to true when 
trying to
get vcpu B scheduled. This flag is checked in apic_find_highest_irr() 
before

searching for the highest bit.

To fix the issue, I decided to leverage the existing emulation 

Re: [PATCH] svm: Fix AVIC incomplete IPI emulation

2019-03-13 Thread Oren Twaig

Hi Suravee,

Please see below..

On 03/11/2019 01:38 PM, Suthikulpanit, Suravee wrote:

Hi Oren,

Sorry for delay response.

On 3/5/19 1:15 AM, Oren Twaig wrote:

Hello Suravee,

According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the
not-running bit set in the avic backing page.

I believe you are referring to the "isRunning" (IR) bit is in the
AVIC physical APIC ID table entry.
I meant cause ID=1 in IPI Delivery Failure Cause (SDM rev 3.30, sep 
2018, Table 15-28):


"
1: IPI Target Not Running:

IsRunning bit of the target for a
Singlecast/Broadcast/Multicast IPI is not set in
the physical APIC ID table.

"




However, not before the CPU _already_ set the relevant IRR bit
in all these cpus.

Not sure what you meant here.


Here is the full snippet from the specifications:
"
5.For every valid destination:
- Atomically set the appropriate IRR bit in each of the destinations’
  vAPIC backing page.
- Check the IsRunning status of each destination.
- If the destination IsRunning bit is set, send a doorbell message
  using the host physical core number from the Physical APIC ID table.

6. If any destinations are identified as not currently scheduled on
 a physical core (i.e., the IsRunning
 bit for that virtual processor is not set), cause a #VMEXIT.
"

According to the specification above, the HW should first
set the appropriate bit in the IRR (Interrupt Request Register)
_before_ causing VMEXIT of IPI-delivery-not-completed
with  ID=1 (Target not running).




In this change, the patch forces KVM to send another interrupt
to the vcpu whether SVM already did that or not. Which means
the vcpu/s, under some conditions, can get an EXTRA interrupt
it never intended to get >> Example:
    1. vcpu B: Is in "not-running" state.
    2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
    3. vcpu A: SVM updates vcpu B IRR with bit 80
    4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
    5. vcpu A: Now stops executing any code @ hypervisor level.
    6. vcpu B: Due to another interrupt (like lapic timer)
   resumes running the guest. While handling interrupts,
   it also handles interrupt vector 80 (as it's in his IRR)
    7. vcpu A: resumes executing the below code and sends
   an _additional_interrupt to vcpu B.

Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.

Can you please elaborate more to why the implementation
below conflict with the specifications (which was the code
before this commit) ?

This patch was introduced to fix an issue where the SVM driver tries to
handle the step 5 above by scheduling vcpu B into _running_ state to handle
the IPI from vcpu A. However, prior to this patch, vcpu B was never get
scheduled to run unless there are other interrupts (e.g. timer).

Exactly. Only what needed here is *only* to wakeup the vcpu B. Why ? because
the apic of vcpu B _already_ contains the interrupt in the pending
IRR. Than, once vcpu B will run it will process the IRR which contains
the vector placed by the HW and will deliver it.

This should not be the case as Vcpu B should have been running regardless
of other interrupts. So, I don't think step 6 and 7 above are correct.

The example of vcpu A that stops executing is just to highlight that
the code can't depend on that the kvm code of vcpu A will finish the
ICR "fake" call before vcpu B runs (beacuse of any interrupt) and process
that IRR request placed by the HW.


The issue was caused by the apic->irr_pending not set to true when trying to
get vcpu B scheduled. This flag is checked in apic_find_highest_irr() before
searching for the highest bit.

To fix the issue, I decided to leverage the existing emulation code for
ICR and ICR2, which in turn calls apic_send_ipi() to deliver interrupt to vpu B.

However, looking a bit more closely, I notice the logic in 
svm_deliver_avic_intr()
should also have been changed from kvm_vcpu_wake_up() to kvm_vcpu_kick()
since the latter will result in clearing the IRR bit for the IPI vector
when trying to send IPI as part of the following call path.

vcpu_enter_guest()
  |-- inject_pending_event()
|-- kvm_cpu_get_interrupt()
  |--  kvm_get_apic_interrupt()
|-- apic_clear_irr()
|-- apic_set_isr()
|-- apic_update_ppr() 

Please see the patch below.

Not sure if this would address the problem you are seeing.

I still think there a bug here where vcpu B will get two interrupts
instead of one.

Thanks,
Oren


Thanks,
Suravee

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 24dfa6a93711..d2841c3dbc04 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5219,11 +5256,13 @@ static void svm_deliver_avic_intr(struct kvm_vcpu 
*vcpu, int vec)
  kvm_lapic_set_irr(vec, vcpu->arch.apic);
  

Re: [PATCH] svm: Fix AVIC incomplete IPI emulation

2019-03-08 Thread Oren Twaig

Hi,


Any help appreciated..


Thanks,

Oren Twaig

On 3/4/2019 8:15 PM, Oren Twaig wrote:

Hello Suravee,

According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the

not-running bit set in the avic backing page. However, not

before the CPU _already_ set the relevant IRR bit in all these cpus.


In this change, the patch forces KVM to send another interrupt

to the vcpu whether SVM already did that or not. Which means

the vcpu/s, under some conditions, can get an EXTRA interrupt

it never intended to get.


Example:
  1. vcpu B: Is in "not-running" state.
  2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
  3. vcpu A: SVM updates vcpu B IRR with bit 80
  4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
  5. vcpu A: Now stops executing any code @ hypervisor level.
  6. vcpu B: Due to another interrupt (like lapic timer)
 resumes running the guest. While handling interrupts,
 it also handles interrupt vector 80 (as it's in his IRR)
  7. vcpu A: resumes executing the below code and sends
 an _additional_interrupt to vcpu B.

Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.

Can you please elaborate more to why the implementation

below conflict with the specifications (which was the code

before this commit) ?


Thanks,
Oren Twaig


> From    "Suthikulpanit, Suravee" <>
> Subject    [PATCH] svm: Fix AVIC incomplete IPI emulation
> Date    Tue, 22 Jan 2019 10:25:13 +
> share
> From: Suravee Suthikulpanit 
>
> In case of incomplete IPI with invalid interrupt type, the current
> SVM driver does not properly emulate the IPI, and fails to boot
> FreeBSD guests with multiple vcpus when enabling AVIC.
>
> Fix this by update APIC ICR high/low registers, which also
> emulate sending the IPI.
>
> Signed-off-by: Suravee Suthikulpanit 
> ---
> arch/x86/kvm/svm.c | 19 ---
> 1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2aff835a65ed..8a0c9a1f6ac8 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4504,25 +4504,14 @@ static int 
avic_incomplete_ipi_interception(struct vcpu_svm *svm)

>      kvm_lapic_reg_write(apic, APIC_ICR, icrl);
>      break;
>  case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: {
> -        int i;
> -        struct kvm_vcpu *vcpu;
> -        struct kvm *kvm = svm->vcpu.kvm;
>      struct kvm_lapic *apic = svm->vcpu.arch.apic;
>
     /*
> -         * At this point, we expect that the AVIC HW has already
> -         * set the appropriate IRR bits on the valid target
> -         * vcpus. So, we just need to kick the appropriate vcpu.
> +         * Update ICR high and low, then emulate sending IPI,
> +         * which is handled when writing APIC_ICR.
>   */
> -        kvm_for_each_vcpu(i, vcpu, kvm) {
> -            bool m = kvm_apic_match_dest(vcpu, apic,
> -                         icrl & KVM_APIC_SHORT_MASK,
> - GET_APIC_DEST_FIELD(icrh),
> -  icrl & KVM_APIC_DEST_MASK);
> -
> -            if (m && !avic_vcpu_is_running(vcpu))
> - kvm_vcpu_wake_up(vcpu);
> -        }
> +        kvm_lapic_reg_write(apic, APIC_ICR2, icrh);
> +        kvm_lapic_reg_write(apic, APIC_ICR, icrl);
>      break;
>  }
>  case AVIC_IPI_FAILURE_INVALID_TARGET:
> --
> 2.17.1



re: [PATCH] svm: Fix AVIC incomplete IPI emulation

2019-03-04 Thread Oren Twaig

Hello Suravee,

According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the

not-running bit set in the avic backing page. However, not

before the CPU _already_ set the relevant IRR bit in all these cpus.


In this change, the patch forces KVM to send another interrupt

to the vcpu whether SVM already did that or not. Which means

the vcpu/s, under some conditions, can get an EXTRA interrupt

it never intended to get.


Example:
  1. vcpu B: Is in "not-running" state.
  2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
  3. vcpu A: SVM updates vcpu B IRR with bit 80
  4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
  5. vcpu A: Now stops executing any code @ hypervisor level.
  6. vcpu B: Due to another interrupt (like lapic timer)
 resumes running the guest. While handling interrupts,
 it also handles interrupt vector 80 (as it's in his IRR)
  7. vcpu A: resumes executing the below code and sends
 an _additional_interrupt to vcpu B.

Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.

Can you please elaborate more to why the implementation

below conflict with the specifications (which was the code

before this commit) ?


Thanks,
Oren Twaig


> From    "Suthikulpanit, Suravee" <>
> Subject    [PATCH] svm: Fix AVIC incomplete IPI emulation
> Date    Tue, 22 Jan 2019 10:25:13 +
> share
> From: Suravee Suthikulpanit 
>
> In case of incomplete IPI with invalid interrupt type, the current
> SVM driver does not properly emulate the IPI, and fails to boot
> FreeBSD guests with multiple vcpus when enabling AVIC.
>
> Fix this by update APIC ICR high/low registers, which also
> emulate sending the IPI.
>
> Signed-off-by: Suravee Suthikulpanit 
> ---
> arch/x86/kvm/svm.c | 19 ---
> 1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2aff835a65ed..8a0c9a1f6ac8 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4504,25 +4504,14 @@ static int 
avic_incomplete_ipi_interception(struct vcpu_svm *svm)

>      kvm_lapic_reg_write(apic, APIC_ICR, icrl);
>      break;
>  case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: {
> -        int i;
> -        struct kvm_vcpu *vcpu;
> -        struct kvm *kvm = svm->vcpu.kvm;
>      struct kvm_lapic *apic = svm->vcpu.arch.apic;
>
     /*
> -         * At this point, we expect that the AVIC HW has already
> -         * set the appropriate IRR bits on the valid target
> -         * vcpus. So, we just need to kick the appropriate vcpu.
> +         * Update ICR high and low, then emulate sending IPI,
> +         * which is handled when writing APIC_ICR.
>   */
> -        kvm_for_each_vcpu(i, vcpu, kvm) {
> -            bool m = kvm_apic_match_dest(vcpu, apic,
> -                         icrl & KVM_APIC_SHORT_MASK,
> - GET_APIC_DEST_FIELD(icrh),
> -  icrl & KVM_APIC_DEST_MASK);
> -
> -            if (m && !avic_vcpu_is_running(vcpu))
> - kvm_vcpu_wake_up(vcpu);
> -        }
> +        kvm_lapic_reg_write(apic, APIC_ICR2, icrh);
> +        kvm_lapic_reg_write(apic, APIC_ICR, icrl);
>      break;
>  }
>  case AVIC_IPI_FAILURE_INVALID_TARGET:
> --
> 2.17.1


Re: [x86_64] Question about early page tables initialization

2015-02-03 Thread Oren Twaig
Hi,

This is the corresponding C code which can help you understand:

   u64 *pml4 = (u64*)pgtable;
   u64 pdp = pgtable + 0x1000;
   u64 pml4_entry = pdp | PTE_P | PTE_W | PTU; // present, write, userspace = 
0x7
   pml4[0] = pml4_entry;

The 0x1007 you see is just the calculation of the pml4_entry.

Oren Twaig.

On 02/03/2015 02:25 PM, Alex Kuleshov wrote:
> Hello All,
>
> I have a question about page tables initialization in the
> arch/x86/boot/compressed/head_64.S
>
> After we clear memory for page tables, there is code which
> build PML4:
>
> lealpgtable + 0(%ebx), %edi
> leal0x1007(%edi), %eax 
> movl%eax, 0(%edi)
>
> Why there is offset 0x1007 instead just 0x7? 0x1007 is
> 4k + 7bit (PML4E) flags as i understand correctly. But
> why we skip first 4k here?
>
> Thank you.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [x86_64] Question about early page tables initialization

2015-02-03 Thread Oren Twaig
Hi,

This is the corresponding C code which can help you understand:

   u64 *pml4 = (u64*)pgtable;
   u64 pdp = pgtable + 0x1000;
   u64 pml4_entry = pdp | PTE_P | PTE_W | PTU; // present, write, userspace = 
0x7
   pml4[0] = pml4_entry;

The 0x1007 you see is just the calculation of the pml4_entry.

Oren Twaig.

On 02/03/2015 02:25 PM, Alex Kuleshov wrote:
 Hello All,

 I have a question about page tables initialization in the
 arch/x86/boot/compressed/head_64.S

 After we clear memory for page tables, there is code which
 build PML4:

 lealpgtable + 0(%ebx), %edi
 leal0x1007(%edi), %eax 
 movl%eax, 0(%edi)

 Why there is offset 0x1007 instead just 0x7? 0x1007 is
 4k + 7bit (PML4E) flags as i understand correctly. But
 why we skip first 4k here?

 Thank you.
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: x86: vmalloc and THP

2014-08-12 Thread Oren Twaig

Hi Kirill,

I saw the thread has developed nicely :), still - wanted to answer your 
question

below.

On 8/12/2014 9:07 AM, Kirill A. Shutemov wrote:

On Tue, Aug 12, 2014 at 08:00:54AM +0300, Oren Twaig wrote:



plain/text, please.

Yes - noticed the html, sent again in plain text.

If not, is there any fast way to change this behavior ? Maybe by
changing the granularity/alignment of such allocations to allow such
mapping ?

What's the point to use vmalloc() in this case?

I've noticed that some lock/s are using linear addresses which are
located at 0xc901922b4500 and from what I understand
from mm.txt (kernel 3.0.101):
*c900 - e8ff (=45 bits) vmalloc/ioremap space

*So I'm not sure who/how/why this lock got allocated there, but obviously
it is using that linear set. No ?






---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: x86: vmalloc and THP

2014-08-12 Thread Oren Twaig

Hi Kirill,

I saw the thread has developed nicely :), still - wanted to answer your 
question

below.

On 8/12/2014 9:07 AM, Kirill A. Shutemov wrote:

On Tue, Aug 12, 2014 at 08:00:54AM +0300, Oren Twaig wrote:

html style=direction: ltr;

plain/text, please.

Yes - noticed the html, sent again in plain text.

If not, is there any fast way to change this behavior ? Maybe by
changing the granularity/alignment of such allocations to allow such
mapping ?

What's the point to use vmalloc() in this case?

I've noticed that some lock/s are using linear addresses which are
located at 0xc901922b4500 and from what I understand
from mm.txt (kernel 3.0.101):
*c900 - e8ff (=45 bits) vmalloc/ioremap space

*So I'm not sure who/how/why this lock got allocated there, but obviously
it is using that linear set. No ?






---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


x86: vmalloc and THP

2014-08-11 Thread Oren Twaig

Hello,

Does memory allocated using vmalloc() will be mapped using huge pages 
either directly or later by THP ?


If not, is there any fast way to change this behavior ? Maybe by 
changing the granularity/alignment of such allocations to allow such 
mapping ?


Thanks,
Oren Twaig.

---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


x86: vmalloc and THP

2014-08-11 Thread Oren Twaig

Hello,

Does memory allocated using vmalloc() will be mapped using huge pages 
either directly or later by THP ?


If not, is there any fast way to change this behavior ? Maybe by 
changing the granularity/alignment of such allocations to allow such 
mapping ?


Thanks,
Oren Twaig.

---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[tip:x86/apic] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread tip-bot for Oren Twaig
Commit-ID:  411cf9ee2946492c0ac7eca48422fcf94a723ce5
Gitweb: http://git.kernel.org/tip/411cf9ee2946492c0ac7eca48422fcf94a723ce5
Author: Oren Twaig 
AuthorDate: Sun, 29 Jun 2014 13:01:08 +0300
Committer:  H. Peter Anvin 
CommitDate: Sun, 13 Jul 2014 17:48:03 -0700

x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Acked-by: Shai Fultheim 
Signed-off-by: Oren Twaig 
Link: 
http://lkml.kernel.org/r/1404036068-11674-1-git-send-email-o...@scalemp.com
Signed-off-by: H. Peter Anvin 
---
 arch/x86/include/asm/apic.h |  8 --
 arch/x86/kernel/apic/apic.c | 60 +
 2 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 69ed79a..b40ea7e 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include 
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-   return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ca1bd75..6b35d30 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-   int i, clusters, zeros;
-   unsigned id;
-   u16 *bios_cpu_apicid;
-   DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-   bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-   bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-   for (i = 0; i < nr_cpu_ids; i++) {
-   /* are we being called early in kernel startup? */
-   if (bios_cpu_apicid) {
-   id = bios_cpu_apicid[i];
-   } else if (i < nr_cpu_ids) {
-   if (cpu_present(i))
-   id = per_cpu(x86_bios_cpu_apicid, i);
-   else
-   continue;
-   } else
-   break;
-
-   if (id != BAD_APICID)
-   __set_bit(APIC_CLUSTERID(id), clustermap);
-   }
-
-   /* Problem:  Partially populated chassis may not have CPUs in some of
-* the APIC clusters they have been allocated.  Only present CPUs have
-* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
-* Since clusters are allocated sequentially, count zeros only if
-* they are bounded by ones.
-*/
-   clusters = 0;
-   zeros = 0;
-   for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
-   if (test_bit(i, clustermap)) {
-   clusters += 1 + zeros;
-   zeros = 0;
-   } else
-   ++zeros;
-   }
-
-   return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
dmi_check_multi();
-   if (multi)
-   return 1;
-
-   if (!is_vsmp_box())
-   return 0;
-
-   /*
-* ScaleMP vSMPowered boxes have one cluster per board and TSCs are
-* not guaranteed to be synced between boards
-*/
-   if (apic_cluster_num() > 1)
-   return 1;
-
-   return 0;
+   return multi;
 }
 #endif
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig

I see - thanks.

BTW, the v2 was sent because I had a mailer send-as-text issues which caused
the patch not to be sent properly and you had some conflicts with it when
merging.

All should be good now with the v2.

Thanks,
Oren.
On 7/13/2014 7:40 PM, H. Peter Anvin wrote:

Yes, although sometimes pings we useful to keep things from getting lost.  
Typically 1-2 weeks without response is a good reason to ping.  *Don't repost*, 
just put a ping as a reply to the original 0/ patch, which in most threaded 
mail readers brings the thread back to the front.

Sent from my tablet, pardon any formatting problems.


On Jul 13, 2014, at 2:10, Richard Weinberger  wrote:

Am 13.07.2014 08:51, schrieb Oren Twaig:

ping
On 07/06/2014 09:33 AM, Oren Twaig wrote:

Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable
time to wait before ping-ing a patch ?

Oren.

AFACT Peter is still crawling through his mail backlog to catch up.

Thanks,
//richard

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/





---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig
I see - Thanks for the quick respond.

Oren.
On 07/13/2014 12:10 PM, Richard Weinberger wrote:
> Am 13.07.2014 08:51, schrieb Oren Twaig:
>> ping
>> On 07/06/2014 09:33 AM, Oren Twaig wrote:
>>> Hi,
>>>
>>> Quick question: As I'm new at this (submitting patches), what is a 
>>> reasonable
>>> time to wait before ping-ing a patch ?
>>>
>>> Oren.
> AFACT Peter is still crawling through his mail backlog to catch up.
>
> Thanks,
> //richard
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig
ping
On 07/06/2014 09:33 AM, Oren Twaig wrote:
> Hi,
>
> Quick question: As I'm new at this (submitting patches), what is a reasonable
> time to wait before ping-ing a patch ?
>
> Oren.
>
> On 06/29/2014 01:01 PM, Oren Twaig wrote:
>> When a vSMP Foundation box is detected, the function apic_cluster_num() 
>> counts
>> the number of APIC clusters found. If more than one found, a multi board
>> configuration is assumed, and TSC marked as unstable. This behavior is
>> incorrect as vSMP Foundation may use processors from single node only, 
>> attached
>> to memory of other nodes - and such node may have more than one APIC cluster
>> (typically any recent intel box has more than single APIC_CLUSTERID(x)).
>>
>> To fix this, we simply remove the code which detects a vSMP Foundation box 
>> and
>> affects apic_is_clusted_box() return value. This can be done because later 
>> the
>> kernel checks by itself if the TSC is stable using the
>> check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
>> needed.
>>
>> Acked-by: Shai Fultheim 
>> Signed-off-by: Oren Twaig 
>> ---
>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> index 19b0eba..c100694 100644
>> --- a/arch/x86/include/asm/apic.h
>> +++ b/arch/x86/include/asm/apic.h
>> @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
>>  #include 
>>  #endif
>>
>> -#ifdef CONFIG_X86_64
>> -extern int is_vsmp_box(void);
>> -#else
>> -static inline int is_vsmp_box(void)
>> -{
>> -return 0;
>> -}
>> -#endif
>>  extern int setup_profiling_timer(unsigned int);
>>
>>  static inline void native_apic_mem_write(u32 reg, u32 v)
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index ad28db7..2b85bb9 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
>>
>>  #ifdef CONFIG_X86_64
>>
>> -static int apic_cluster_num(void)
>> -{
>> -int i, clusters, zeros;
>> -unsigned id;
>> -u16 *bios_cpu_apicid;
>> -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
>> -
>> -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
>> -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
>> -
>> -for (i = 0; i < nr_cpu_ids; i++) {
>> -/* are we being called early in kernel startup? */
>> -if (bios_cpu_apicid) {
>> -id = bios_cpu_apicid[i];
>> -} else if (i < nr_cpu_ids) {
>> -if (cpu_present(i))
>> -id = per_cpu(x86_bios_cpu_apicid, i);
>> -else
>> -continue;
>> -} else
>> -break;
>> -
>> -if (id != BAD_APICID)
>> -__set_bit(APIC_CLUSTERID(id), clustermap);
>> -}
>> -
>> -/* Problem:  Partially populated chassis may not have CPUs in some of
>> - * the APIC clusters they have been allocated.  Only present CPUs have
>> - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
>> - * Since clusters are allocated sequentially, count zeros only if
>> - * they are bounded by ones.
>> - */
>> -clusters = 0;
>> -zeros = 0;
>> -for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
>> -if (test_bit(i, clustermap)) {
>> -clusters += 1 + zeros;
>> -zeros = 0;
>> -} else
>> -++zeros;
>> -}
>> -
>> -return clusters;
>> -}
>> -
>>  static int multi_checked;
>>  static int multi;
>>
>> @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
>>  int apic_is_clustered_box(void)
>>  {
>>  dmi_check_multi();
>> -if (multi)
>> -return 1;
>> -
>> -if (!is_vsmp_box())
>> -return 0;
>> -
>> -/*
>> - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
>> - * not guaranteed to be synced between boards
>> - */
>> -if (apic_cluster_num() > 1)
>> -return 1;
>> -
>> -return 0;
>> +return multi;
>>  }
>>  #endif
>>
>>
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig
ping
On 07/06/2014 09:33 AM, Oren Twaig wrote:
 Hi,

 Quick question: As I'm new at this (submitting patches), what is a reasonable
 time to wait before ping-ing a patch ?

 Oren.

 On 06/29/2014 01:01 PM, Oren Twaig wrote:
 When a vSMP Foundation box is detected, the function apic_cluster_num() 
 counts
 the number of APIC clusters found. If more than one found, a multi board
 configuration is assumed, and TSC marked as unstable. This behavior is
 incorrect as vSMP Foundation may use processors from single node only, 
 attached
 to memory of other nodes - and such node may have more than one APIC cluster
 (typically any recent intel box has more than single APIC_CLUSTERID(x)).

 To fix this, we simply remove the code which detects a vSMP Foundation box 
 and
 affects apic_is_clusted_box() return value. This can be done because later 
 the
 kernel checks by itself if the TSC is stable using the
 check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
 needed.

 Acked-by: Shai Fultheim s...@scalemp.com
 Signed-off-by: Oren Twaig o...@scalemp.com
 ---
 diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
 index 19b0eba..c100694 100644
 --- a/arch/x86/include/asm/apic.h
 +++ b/arch/x86/include/asm/apic.h
 @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
  #include asm/paravirt.h
  #endif

 -#ifdef CONFIG_X86_64
 -extern int is_vsmp_box(void);
 -#else
 -static inline int is_vsmp_box(void)
 -{
 -return 0;
 -}
 -#endif
  extern int setup_profiling_timer(unsigned int);

  static inline void native_apic_mem_write(u32 reg, u32 v)
 diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
 index ad28db7..2b85bb9 100644
 --- a/arch/x86/kernel/apic/apic.c
 +++ b/arch/x86/kernel/apic/apic.c
 @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }

  #ifdef CONFIG_X86_64

 -static int apic_cluster_num(void)
 -{
 -int i, clusters, zeros;
 -unsigned id;
 -u16 *bios_cpu_apicid;
 -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
 -
 -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
 -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
 -
 -for (i = 0; i  nr_cpu_ids; i++) {
 -/* are we being called early in kernel startup? */
 -if (bios_cpu_apicid) {
 -id = bios_cpu_apicid[i];
 -} else if (i  nr_cpu_ids) {
 -if (cpu_present(i))
 -id = per_cpu(x86_bios_cpu_apicid, i);
 -else
 -continue;
 -} else
 -break;
 -
 -if (id != BAD_APICID)
 -__set_bit(APIC_CLUSTERID(id), clustermap);
 -}
 -
 -/* Problem:  Partially populated chassis may not have CPUs in some of
 - * the APIC clusters they have been allocated.  Only present CPUs have
 - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
 - * Since clusters are allocated sequentially, count zeros only if
 - * they are bounded by ones.
 - */
 -clusters = 0;
 -zeros = 0;
 -for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
 -if (test_bit(i, clustermap)) {
 -clusters += 1 + zeros;
 -zeros = 0;
 -} else
 -++zeros;
 -}
 -
 -return clusters;
 -}
 -
  static int multi_checked;
  static int multi;

 @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
  int apic_is_clustered_box(void)
  {
  dmi_check_multi();
 -if (multi)
 -return 1;
 -
 -if (!is_vsmp_box())
 -return 0;
 -
 -/*
 - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
 - * not guaranteed to be synced between boards
 - */
 -if (apic_cluster_num()  1)
 -return 1;
 -
 -return 0;
 +return multi;
  }
  #endif






--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig
I see - Thanks for the quick respond.

Oren.
On 07/13/2014 12:10 PM, Richard Weinberger wrote:
 Am 13.07.2014 08:51, schrieb Oren Twaig:
 ping
 On 07/06/2014 09:33 AM, Oren Twaig wrote:
 Hi,

 Quick question: As I'm new at this (submitting patches), what is a 
 reasonable
 time to wait before ping-ing a patch ?

 Oren.
 AFACT Peter is still crawling through his mail backlog to catch up.

 Thanks,
 //richard
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread Oren Twaig

I see - thanks.

BTW, the v2 was sent because I had a mailer send-as-text issues which caused
the patch not to be sent properly and you had some conflicts with it when
merging.

All should be good now with the v2.

Thanks,
Oren.
On 7/13/2014 7:40 PM, H. Peter Anvin wrote:

Yes, although sometimes pings we useful to keep things from getting lost.  
Typically 1-2 weeks without response is a good reason to ping.  *Don't repost*, 
just put a ping as a reply to the original 0/ patch, which in most threaded 
mail readers brings the thread back to the front.

Sent from my tablet, pardon any formatting problems.


On Jul 13, 2014, at 2:10, Richard Weinberger rich...@nod.at wrote:

Am 13.07.2014 08:51, schrieb Oren Twaig:

ping
On 07/06/2014 09:33 AM, Oren Twaig wrote:

Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable
time to wait before ping-ing a patch ?

Oren.

AFACT Peter is still crawling through his mail backlog to catch up.

Thanks,
//richard

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/





---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[tip:x86/apic] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-13 Thread tip-bot for Oren Twaig
Commit-ID:  411cf9ee2946492c0ac7eca48422fcf94a723ce5
Gitweb: http://git.kernel.org/tip/411cf9ee2946492c0ac7eca48422fcf94a723ce5
Author: Oren Twaig o...@scalemp.com
AuthorDate: Sun, 29 Jun 2014 13:01:08 +0300
Committer:  H. Peter Anvin h...@zytor.com
CommitDate: Sun, 13 Jul 2014 17:48:03 -0700

x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Acked-by: Shai Fultheim s...@scalemp.com
Signed-off-by: Oren Twaig o...@scalemp.com
Link: 
http://lkml.kernel.org/r/1404036068-11674-1-git-send-email-o...@scalemp.com
Signed-off-by: H. Peter Anvin h...@zytor.com
---
 arch/x86/include/asm/apic.h |  8 --
 arch/x86/kernel/apic/apic.c | 60 +
 2 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 69ed79a..b40ea7e 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include asm/paravirt.h
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-   return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ca1bd75..6b35d30 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-   int i, clusters, zeros;
-   unsigned id;
-   u16 *bios_cpu_apicid;
-   DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-   bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-   bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-   for (i = 0; i  nr_cpu_ids; i++) {
-   /* are we being called early in kernel startup? */
-   if (bios_cpu_apicid) {
-   id = bios_cpu_apicid[i];
-   } else if (i  nr_cpu_ids) {
-   if (cpu_present(i))
-   id = per_cpu(x86_bios_cpu_apicid, i);
-   else
-   continue;
-   } else
-   break;
-
-   if (id != BAD_APICID)
-   __set_bit(APIC_CLUSTERID(id), clustermap);
-   }
-
-   /* Problem:  Partially populated chassis may not have CPUs in some of
-* the APIC clusters they have been allocated.  Only present CPUs have
-* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
-* Since clusters are allocated sequentially, count zeros only if
-* they are bounded by ones.
-*/
-   clusters = 0;
-   zeros = 0;
-   for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
-   if (test_bit(i, clustermap)) {
-   clusters += 1 + zeros;
-   zeros = 0;
-   } else
-   ++zeros;
-   }
-
-   return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
dmi_check_multi();
-   if (multi)
-   return 1;
-
-   if (!is_vsmp_box())
-   return 0;
-
-   /*
-* ScaleMP vSMPowered boxes have one cluster per board and TSCs are
-* not guaranteed to be synced between boards
-*/
-   if (apic_cluster_num()  1)
-   return 1;
-
-   return 0;
+   return multi;
 }
 #endif
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-06 Thread Oren Twaig
Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable
time to wait before ping-ing a patch ?

Oren.

On 06/29/2014 01:01 PM, Oren Twaig wrote:
> When a vSMP Foundation box is detected, the function apic_cluster_num() counts
> the number of APIC clusters found. If more than one found, a multi board
> configuration is assumed, and TSC marked as unstable. This behavior is
> incorrect as vSMP Foundation may use processors from single node only, 
> attached
> to memory of other nodes - and such node may have more than one APIC cluster
> (typically any recent intel box has more than single APIC_CLUSTERID(x)).
>
> To fix this, we simply remove the code which detects a vSMP Foundation box and
> affects apic_is_clusted_box() return value. This can be done because later the
> kernel checks by itself if the TSC is stable using the
> check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
> needed.
>
> Acked-by: Shai Fultheim 
> Signed-off-by: Oren Twaig 
> ---
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 19b0eba..c100694 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
>  #include 
>  #endif
> 
> -#ifdef CONFIG_X86_64
> -extern int is_vsmp_box(void);
> -#else
> -static inline int is_vsmp_box(void)
> -{
> -return 0;
> -}
> -#endif
>  extern int setup_profiling_timer(unsigned int);
> 
>  static inline void native_apic_mem_write(u32 reg, u32 v)
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index ad28db7..2b85bb9 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
> 
>  #ifdef CONFIG_X86_64
> 
> -static int apic_cluster_num(void)
> -{
> -int i, clusters, zeros;
> -unsigned id;
> -u16 *bios_cpu_apicid;
> -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
> -
> -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
> -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
> -
> -for (i = 0; i < nr_cpu_ids; i++) {
> -/* are we being called early in kernel startup? */
> -if (bios_cpu_apicid) {
> -id = bios_cpu_apicid[i];
> -} else if (i < nr_cpu_ids) {
> -if (cpu_present(i))
> -id = per_cpu(x86_bios_cpu_apicid, i);
> -else
> -continue;
> -} else
> -break;
> -
> -if (id != BAD_APICID)
> -__set_bit(APIC_CLUSTERID(id), clustermap);
> -}
> -
> -/* Problem:  Partially populated chassis may not have CPUs in some of
> - * the APIC clusters they have been allocated.  Only present CPUs have
> - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
> - * Since clusters are allocated sequentially, count zeros only if
> - * they are bounded by ones.
> - */
> -clusters = 0;
> -zeros = 0;
> -for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
> -if (test_bit(i, clustermap)) {
> -clusters += 1 + zeros;
> -zeros = 0;
> -} else
> -++zeros;
> -}
> -
> -return clusters;
> -}
> -
>  static int multi_checked;
>  static int multi;
> 
> @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
>  int apic_is_clustered_box(void)
>  {
>  dmi_check_multi();
> -if (multi)
> -return 1;
> -
> -if (!is_vsmp_box())
> -return 0;
> -
> -/*
> - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
> - * not guaranteed to be synced between boards
> - */
> -if (apic_cluster_num() > 1)
> -return 1;
> -
> -return 0;
> +return multi;
>  }
>  #endif
> 
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-06 Thread Oren Twaig
Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable 
time to wait before ping-ing a patch ?

Oren.

On 06/29/2014 01:01 PM, Oren Twaig wrote:
> When a vSMP Foundation box is detected, the function apic_cluster_num() counts
> the number of APIC clusters found. If more than one found, a multi board
> configuration is assumed, and TSC marked as unstable. This behavior is
> incorrect as vSMP Foundation may use processors from single node only, 
> attached
> to memory of other nodes - and such node may have more than one APIC cluster
> (typically any recent intel box has more than single APIC_CLUSTERID(x)).
>
> To fix this, we simply remove the code which detects a vSMP Foundation box and
> affects apic_is_clusted_box() return value. This can be done because later the
> kernel checks by itself if the TSC is stable using the
> check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
> needed.
>
> Acked-by: Shai Fultheim 
> Signed-off-by: Oren Twaig 
> ---
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 19b0eba..c100694 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
>  #include 
>  #endif
>  
> -#ifdef CONFIG_X86_64
> -extern int is_vsmp_box(void);
> -#else
> -static inline int is_vsmp_box(void)
> -{
> - return 0;
> -}
> -#endif
>  extern int setup_profiling_timer(unsigned int);
>  
>  static inline void native_apic_mem_write(u32 reg, u32 v)
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index ad28db7..2b85bb9 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
>  
>  #ifdef CONFIG_X86_64
>  
> -static int apic_cluster_num(void)
> -{
> - int i, clusters, zeros;
> - unsigned id;
> - u16 *bios_cpu_apicid;
> - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
> -
> - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
> - bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
> -
> - for (i = 0; i < nr_cpu_ids; i++) {
> - /* are we being called early in kernel startup? */
> - if (bios_cpu_apicid) {
> - id = bios_cpu_apicid[i];
> - } else if (i < nr_cpu_ids) {
> - if (cpu_present(i))
> - id = per_cpu(x86_bios_cpu_apicid, i);
> - else
> - continue;
> - } else
> - break;
> -
> - if (id != BAD_APICID)
> - __set_bit(APIC_CLUSTERID(id), clustermap);
> - }
> -
> - /* Problem:  Partially populated chassis may not have CPUs in some of
> -  * the APIC clusters they have been allocated.  Only present CPUs have
> -  * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
> -  * Since clusters are allocated sequentially, count zeros only if
> -  * they are bounded by ones.
> -  */
> - clusters = 0;
> - zeros = 0;
> - for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
> - if (test_bit(i, clustermap)) {
> - clusters += 1 + zeros;
> - zeros = 0;
> - } else
> - ++zeros;
> - }
> -
> - return clusters;
> -}
> -
>  static int multi_checked;
>  static int multi;
>  
> @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
>  int apic_is_clustered_box(void)
>  {
>   dmi_check_multi();
> - if (multi)
> - return 1;
> -
> - if (!is_vsmp_box())
> - return 0;
> -
> - /*
> -  * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
> -  * not guaranteed to be synced between boards
> -  */
> - if (apic_cluster_num() > 1)
> - return 1;
> -
> - return 0;
> + return multi;
>  }
>  #endif
>  
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-06 Thread Oren Twaig
Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable 
time to wait before ping-ing a patch ?

Oren.

On 06/29/2014 01:01 PM, Oren Twaig wrote:
 When a vSMP Foundation box is detected, the function apic_cluster_num() counts
 the number of APIC clusters found. If more than one found, a multi board
 configuration is assumed, and TSC marked as unstable. This behavior is
 incorrect as vSMP Foundation may use processors from single node only, 
 attached
 to memory of other nodes - and such node may have more than one APIC cluster
 (typically any recent intel box has more than single APIC_CLUSTERID(x)).

 To fix this, we simply remove the code which detects a vSMP Foundation box and
 affects apic_is_clusted_box() return value. This can be done because later the
 kernel checks by itself if the TSC is stable using the
 check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
 needed.

 Acked-by: Shai Fultheim s...@scalemp.com
 Signed-off-by: Oren Twaig o...@scalemp.com
 ---
 diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
 index 19b0eba..c100694 100644
 --- a/arch/x86/include/asm/apic.h
 +++ b/arch/x86/include/asm/apic.h
 @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
  #include asm/paravirt.h
  #endif
  
 -#ifdef CONFIG_X86_64
 -extern int is_vsmp_box(void);
 -#else
 -static inline int is_vsmp_box(void)
 -{
 - return 0;
 -}
 -#endif
  extern int setup_profiling_timer(unsigned int);
  
  static inline void native_apic_mem_write(u32 reg, u32 v)
 diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
 index ad28db7..2b85bb9 100644
 --- a/arch/x86/kernel/apic/apic.c
 +++ b/arch/x86/kernel/apic/apic.c
 @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
  
  #ifdef CONFIG_X86_64
  
 -static int apic_cluster_num(void)
 -{
 - int i, clusters, zeros;
 - unsigned id;
 - u16 *bios_cpu_apicid;
 - DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
 -
 - bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
 - bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
 -
 - for (i = 0; i  nr_cpu_ids; i++) {
 - /* are we being called early in kernel startup? */
 - if (bios_cpu_apicid) {
 - id = bios_cpu_apicid[i];
 - } else if (i  nr_cpu_ids) {
 - if (cpu_present(i))
 - id = per_cpu(x86_bios_cpu_apicid, i);
 - else
 - continue;
 - } else
 - break;
 -
 - if (id != BAD_APICID)
 - __set_bit(APIC_CLUSTERID(id), clustermap);
 - }
 -
 - /* Problem:  Partially populated chassis may not have CPUs in some of
 -  * the APIC clusters they have been allocated.  Only present CPUs have
 -  * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
 -  * Since clusters are allocated sequentially, count zeros only if
 -  * they are bounded by ones.
 -  */
 - clusters = 0;
 - zeros = 0;
 - for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
 - if (test_bit(i, clustermap)) {
 - clusters += 1 + zeros;
 - zeros = 0;
 - } else
 - ++zeros;
 - }
 -
 - return clusters;
 -}
 -
  static int multi_checked;
  static int multi;
  
 @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
  int apic_is_clustered_box(void)
  {
   dmi_check_multi();
 - if (multi)
 - return 1;
 -
 - if (!is_vsmp_box())
 - return 0;
 -
 - /*
 -  * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
 -  * not guaranteed to be synced between boards
 -  */
 - if (apic_cluster_num()  1)
 - return 1;
 -
 - return 0;
 + return multi;
  }
  #endif
  


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-07-06 Thread Oren Twaig
Hi,

Quick question: As I'm new at this (submitting patches), what is a reasonable
time to wait before ping-ing a patch ?

Oren.

On 06/29/2014 01:01 PM, Oren Twaig wrote:
 When a vSMP Foundation box is detected, the function apic_cluster_num() counts
 the number of APIC clusters found. If more than one found, a multi board
 configuration is assumed, and TSC marked as unstable. This behavior is
 incorrect as vSMP Foundation may use processors from single node only, 
 attached
 to memory of other nodes - and such node may have more than one APIC cluster
 (typically any recent intel box has more than single APIC_CLUSTERID(x)).

 To fix this, we simply remove the code which detects a vSMP Foundation box and
 affects apic_is_clusted_box() return value. This can be done because later the
 kernel checks by itself if the TSC is stable using the
 check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
 needed.

 Acked-by: Shai Fultheim s...@scalemp.com
 Signed-off-by: Oren Twaig o...@scalemp.com
 ---
 diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
 index 19b0eba..c100694 100644
 --- a/arch/x86/include/asm/apic.h
 +++ b/arch/x86/include/asm/apic.h
 @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
  #include asm/paravirt.h
  #endif
 
 -#ifdef CONFIG_X86_64
 -extern int is_vsmp_box(void);
 -#else
 -static inline int is_vsmp_box(void)
 -{
 -return 0;
 -}
 -#endif
  extern int setup_profiling_timer(unsigned int);
 
  static inline void native_apic_mem_write(u32 reg, u32 v)
 diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
 index ad28db7..2b85bb9 100644
 --- a/arch/x86/kernel/apic/apic.c
 +++ b/arch/x86/kernel/apic/apic.c
 @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
  #ifdef CONFIG_X86_64
 
 -static int apic_cluster_num(void)
 -{
 -int i, clusters, zeros;
 -unsigned id;
 -u16 *bios_cpu_apicid;
 -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
 -
 -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
 -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
 -
 -for (i = 0; i  nr_cpu_ids; i++) {
 -/* are we being called early in kernel startup? */
 -if (bios_cpu_apicid) {
 -id = bios_cpu_apicid[i];
 -} else if (i  nr_cpu_ids) {
 -if (cpu_present(i))
 -id = per_cpu(x86_bios_cpu_apicid, i);
 -else
 -continue;
 -} else
 -break;
 -
 -if (id != BAD_APICID)
 -__set_bit(APIC_CLUSTERID(id), clustermap);
 -}
 -
 -/* Problem:  Partially populated chassis may not have CPUs in some of
 - * the APIC clusters they have been allocated.  Only present CPUs have
 - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
 - * Since clusters are allocated sequentially, count zeros only if
 - * they are bounded by ones.
 - */
 -clusters = 0;
 -zeros = 0;
 -for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
 -if (test_bit(i, clustermap)) {
 -clusters += 1 + zeros;
 -zeros = 0;
 -} else
 -++zeros;
 -}
 -
 -return clusters;
 -}
 -
  static int multi_checked;
  static int multi;
 
 @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
  int apic_is_clustered_box(void)
  {
  dmi_check_multi();
 -if (multi)
 -return 1;
 -
 -if (!is_vsmp_box())
 -return 0;
 -
 -/*
 - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
 - * not guaranteed to be synced between boards
 - */
 -if (apic_cluster_num()  1)
 -return 1;
 -
 -return 0;
 +return multi;
  }
  #endif
 



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-29 Thread Oren Twaig
Hi,

There was actually a problem that the mailer which changed a '\t' to spaces.
This is why it conflicted with the tip:x86/apic branch. Any how, I've also
tested on the x86/apic branch and everything is ok.

Sent :
"[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()"

Sorry for any troubles.

Thanks,
Oren.

On 06/27/2014 08:39 AM, H. Peter Anvin wrote:
> On 06/26/2014 10:05 PM, Oren Twaig wrote:
>> ping
>
> This patch conflicts with the changes on the tip:x86/apic branch.  Could
> you please rebase the patch on top of that branch?
>
> -hpa
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-29 Thread Oren Twaig
When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Acked-by: Shai Fultheim 
Signed-off-by: Oren Twaig 
---
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19b0eba..c100694 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include 
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-   return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad28db7..2b85bb9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-   int i, clusters, zeros;
-   unsigned id;
-   u16 *bios_cpu_apicid;
-   DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-   bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-   bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-   for (i = 0; i < nr_cpu_ids; i++) {
-   /* are we being called early in kernel startup? */
-   if (bios_cpu_apicid) {
-   id = bios_cpu_apicid[i];
-   } else if (i < nr_cpu_ids) {
-   if (cpu_present(i))
-   id = per_cpu(x86_bios_cpu_apicid, i);
-   else
-   continue;
-   } else
-   break;
-
-   if (id != BAD_APICID)
-   __set_bit(APIC_CLUSTERID(id), clustermap);
-   }
-
-   /* Problem:  Partially populated chassis may not have CPUs in some of
-* the APIC clusters they have been allocated.  Only present CPUs have
-* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
-* Since clusters are allocated sequentially, count zeros only if
-* they are bounded by ones.
-*/
-   clusters = 0;
-   zeros = 0;
-   for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
-   if (test_bit(i, clustermap)) {
-   clusters += 1 + zeros;
-   zeros = 0;
-   } else
-   ++zeros;
-   }
-
-   return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
dmi_check_multi();
-   if (multi)
-   return 1;
-
-   if (!is_vsmp_box())
-   return 0;
-
-   /*
-* ScaleMP vSMPowered boxes have one cluster per board and TSCs are
-* not guaranteed to be synced between boards
-*/
-   if (apic_cluster_num() > 1)
-   return 1;
-
-   return 0;
+   return multi;
 }
 #endif
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-29 Thread Oren Twaig
When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Acked-by: Shai Fultheim s...@scalemp.com
Signed-off-by: Oren Twaig o...@scalemp.com
---
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19b0eba..c100694 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include asm/paravirt.h
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-   return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad28db7..2b85bb9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-   int i, clusters, zeros;
-   unsigned id;
-   u16 *bios_cpu_apicid;
-   DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-   bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-   bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-   for (i = 0; i  nr_cpu_ids; i++) {
-   /* are we being called early in kernel startup? */
-   if (bios_cpu_apicid) {
-   id = bios_cpu_apicid[i];
-   } else if (i  nr_cpu_ids) {
-   if (cpu_present(i))
-   id = per_cpu(x86_bios_cpu_apicid, i);
-   else
-   continue;
-   } else
-   break;
-
-   if (id != BAD_APICID)
-   __set_bit(APIC_CLUSTERID(id), clustermap);
-   }
-
-   /* Problem:  Partially populated chassis may not have CPUs in some of
-* the APIC clusters they have been allocated.  Only present CPUs have
-* x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
-* Since clusters are allocated sequentially, count zeros only if
-* they are bounded by ones.
-*/
-   clusters = 0;
-   zeros = 0;
-   for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
-   if (test_bit(i, clustermap)) {
-   clusters += 1 + zeros;
-   zeros = 0;
-   } else
-   ++zeros;
-   }
-
-   return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
dmi_check_multi();
-   if (multi)
-   return 1;
-
-   if (!is_vsmp_box())
-   return 0;
-
-   /*
-* ScaleMP vSMPowered boxes have one cluster per board and TSCs are
-* not guaranteed to be synced between boards
-*/
-   if (apic_cluster_num()  1)
-   return 1;
-
-   return 0;
+   return multi;
 }
 #endif
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-29 Thread Oren Twaig
Hi,

There was actually a problem that the mailer which changed a '\t' to spaces.
This is why it conflicted with the tip:x86/apic branch. Any how, I've also
tested on the x86/apic branch and everything is ok.

Sent :
[PATCH v2] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

Sorry for any troubles.

Thanks,
Oren.

On 06/27/2014 08:39 AM, H. Peter Anvin wrote:
 On 06/26/2014 10:05 PM, Oren Twaig wrote:
 ping

 This patch conflicts with the changes on the tip:x86/apic branch.  Could
 you please rebase the patch on top of that branch?

 -hpa


 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-26 Thread Oren Twaig
ping
On 06/23/2014 08:35 AM, Oren Twaig wrote:
> Remove invalid code which caused TSC to be declared as "unstable" on vSMP
> Foundation box even if it was stable and let the kernel decide for itself.
>
> When a vSMP Foundation box is detected, the function apic_cluster_num() counts
> the number of APIC clusters found. If more than one found, a multi board
> configuration is assumed, and TSC marked as unstable. This behavior is
> incorrect as vSMP Foundation may use processors from single node only, 
> attached
> to memory of other nodes - and such node may have more than one APIC cluster
> (typically any recent intel box has more than single APIC_CLUSTERID(x)).
>
> To fix this, we simply remove the code which detects a vSMP Foundation box and
> affects apic_is_clusted_box() return value. This can be done because later the
> kernel checks by itself if the TSC is stable using the
> check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
> needed.
>
> Signed-off-by: Oren Twaig 
> Acked-by: Shai Fultheim 
> ---
>  arch/x86/include/asm/apic.h |8 --
>  arch/x86/kernel/apic/apic.c |   60 
> +--
>  2 files changed, 1 insertion(+), 67 deletions(-)
>
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 19b0eba..c100694 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
>  #include 
>  #endif
> 
> -#ifdef CONFIG_X86_64
> -extern int is_vsmp_box(void);
> -#else
> -static inline int is_vsmp_box(void)
> -{
> -return 0;
> -}
> -#endif
>  extern int setup_profiling_timer(unsigned int);
> 
>  static inline void native_apic_mem_write(u32 reg, u32 v)
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index ad28db7..2b85bb9 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
> 
>  #ifdef CONFIG_X86_64
> 
> -static int apic_cluster_num(void)
> -{
> -int i, clusters, zeros;
> -unsigned id;
> -u16 *bios_cpu_apicid;
> -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
> -
> -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
> -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
> -
> -for (i = 0; i < nr_cpu_ids; i++) {
> -/* are we being called early in kernel startup? */
> -if (bios_cpu_apicid) {
> -id = bios_cpu_apicid[i];
> -} else if (i < nr_cpu_ids) {
> -if (cpu_present(i))
> -id = per_cpu(x86_bios_cpu_apicid, i);
> -else
> -continue;
> -} else
> -break;
> -
> -if (id != BAD_APICID)
> -__set_bit(APIC_CLUSTERID(id), clustermap);
> -}
> -
> -/* Problem:  Partially populated chassis may not have CPUs in some of
> - * the APIC clusters they have been allocated.  Only present CPUs have
> - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
> - * Since clusters are allocated sequentially, count zeros only if
> - * they are bounded by ones.
> - */
> -clusters = 0;
> -zeros = 0;
> -for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
> -if (test_bit(i, clustermap)) {
> -clusters += 1 + zeros;
> -zeros = 0;
> -} else
> -++zeros;
> -}
> -
> -return clusters;
> -}
> -
>  static int multi_checked;
>  static int multi;
> 
> @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
>  int apic_is_clustered_box(void)
>  {
>  dmi_check_multi();
> -if (multi)
> -return 1;
> -
> -if (!is_vsmp_box())
> -return 0;
> -
> -/*
> - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
> - * not guaranteed to be synced between boards
> - */
> -if (apic_cluster_num() > 1)
> -return 1;
> -
> -return 0;
> +return multi;
>  }
>  #endif
> 
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-26 Thread Oren Twaig
ping
On 06/23/2014 08:35 AM, Oren Twaig wrote:
 Remove invalid code which caused TSC to be declared as unstable on vSMP
 Foundation box even if it was stable and let the kernel decide for itself.

 When a vSMP Foundation box is detected, the function apic_cluster_num() counts
 the number of APIC clusters found. If more than one found, a multi board
 configuration is assumed, and TSC marked as unstable. This behavior is
 incorrect as vSMP Foundation may use processors from single node only, 
 attached
 to memory of other nodes - and such node may have more than one APIC cluster
 (typically any recent intel box has more than single APIC_CLUSTERID(x)).

 To fix this, we simply remove the code which detects a vSMP Foundation box and
 affects apic_is_clusted_box() return value. This can be done because later the
 kernel checks by itself if the TSC is stable using the
 check_tsc_sync_[source|target]() functions and marks TSC as unstable if 
 needed.

 Signed-off-by: Oren Twaig o...@scalemp.com
 Acked-by: Shai Fultheim s...@scalemp.com
 ---
  arch/x86/include/asm/apic.h |8 --
  arch/x86/kernel/apic/apic.c |   60 
 +--
  2 files changed, 1 insertion(+), 67 deletions(-)

 diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
 index 19b0eba..c100694 100644
 --- a/arch/x86/include/asm/apic.h
 +++ b/arch/x86/include/asm/apic.h
 @@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
  #include asm/paravirt.h
  #endif
 
 -#ifdef CONFIG_X86_64
 -extern int is_vsmp_box(void);
 -#else
 -static inline int is_vsmp_box(void)
 -{
 -return 0;
 -}
 -#endif
  extern int setup_profiling_timer(unsigned int);
 
  static inline void native_apic_mem_write(u32 reg, u32 v)
 diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
 index ad28db7..2b85bb9 100644
 --- a/arch/x86/kernel/apic/apic.c
 +++ b/arch/x86/kernel/apic/apic.c
 @@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
  #ifdef CONFIG_X86_64
 
 -static int apic_cluster_num(void)
 -{
 -int i, clusters, zeros;
 -unsigned id;
 -u16 *bios_cpu_apicid;
 -DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
 -
 -bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
 -bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
 -
 -for (i = 0; i  nr_cpu_ids; i++) {
 -/* are we being called early in kernel startup? */
 -if (bios_cpu_apicid) {
 -id = bios_cpu_apicid[i];
 -} else if (i  nr_cpu_ids) {
 -if (cpu_present(i))
 -id = per_cpu(x86_bios_cpu_apicid, i);
 -else
 -continue;
 -} else
 -break;
 -
 -if (id != BAD_APICID)
 -__set_bit(APIC_CLUSTERID(id), clustermap);
 -}
 -
 -/* Problem:  Partially populated chassis may not have CPUs in some of
 - * the APIC clusters they have been allocated.  Only present CPUs have
 - * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
 - * Since clusters are allocated sequentially, count zeros only if
 - * they are bounded by ones.
 - */
 -clusters = 0;
 -zeros = 0;
 -for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
 -if (test_bit(i, clustermap)) {
 -clusters += 1 + zeros;
 -zeros = 0;
 -} else
 -++zeros;
 -}
 -
 -return clusters;
 -}
 -
  static int multi_checked;
  static int multi;
 
 @@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
  int apic_is_clustered_box(void)
  {
  dmi_check_multi();
 -if (multi)
 -return 1;
 -
 -if (!is_vsmp_box())
 -return 0;
 -
 -/*
 - * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
 - * not guaranteed to be synced between boards
 - */
 -if (apic_cluster_num()  1)
 -return 1;
 -
 -return 0;
 +return multi;
  }
  #endif
 




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-22 Thread Oren Twaig
Remove invalid code which caused TSC to be declared as "unstable" on vSMP
Foundation box even if it was stable and let the kernel decide for itself.

When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Signed-off-by: Oren Twaig 
Acked-by: Shai Fultheim 
---
 arch/x86/include/asm/apic.h |8 --
 arch/x86/kernel/apic/apic.c |   60 +--
 2 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19b0eba..c100694 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include 
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad28db7..2b85bb9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-int i, clusters, zeros;
-unsigned id;
-u16 *bios_cpu_apicid;
-DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-for (i = 0; i < nr_cpu_ids; i++) {
-/* are we being called early in kernel startup? */
-if (bios_cpu_apicid) {
-id = bios_cpu_apicid[i];
-} else if (i < nr_cpu_ids) {
-if (cpu_present(i))
-id = per_cpu(x86_bios_cpu_apicid, i);
-else
-continue;
-} else
-break;
-
-if (id != BAD_APICID)
-__set_bit(APIC_CLUSTERID(id), clustermap);
-}
-
-/* Problem:  Partially populated chassis may not have CPUs in some of
- * the APIC clusters they have been allocated.  Only present CPUs have
- * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
- * Since clusters are allocated sequentially, count zeros only if
- * they are bounded by ones.
- */
-clusters = 0;
-zeros = 0;
-for (i = 0; i < NUM_APIC_CLUSTERS; i++) {
-if (test_bit(i, clustermap)) {
-clusters += 1 + zeros;
-zeros = 0;
-} else
-++zeros;
-}
-
-return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
 dmi_check_multi();
-if (multi)
-return 1;
-
-if (!is_vsmp_box())
-return 0;
-
-/*
- * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
- * not guaranteed to be synced between boards
- */
-if (apic_cluster_num() > 1)
-return 1;
-
-return 0;
+return multi;
 }
 #endif
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()

2014-06-22 Thread Oren Twaig
Remove invalid code which caused TSC to be declared as unstable on vSMP
Foundation box even if it was stable and let the kernel decide for itself.

When a vSMP Foundation box is detected, the function apic_cluster_num() counts
the number of APIC clusters found. If more than one found, a multi board
configuration is assumed, and TSC marked as unstable. This behavior is
incorrect as vSMP Foundation may use processors from single node only, attached
to memory of other nodes - and such node may have more than one APIC cluster
(typically any recent intel box has more than single APIC_CLUSTERID(x)).

To fix this, we simply remove the code which detects a vSMP Foundation box and
affects apic_is_clusted_box() return value. This can be done because later the
kernel checks by itself if the TSC is stable using the
check_tsc_sync_[source|target]() functions and marks TSC as unstable if needed.

Signed-off-by: Oren Twaig o...@scalemp.com
Acked-by: Shai Fultheim s...@scalemp.com
---
 arch/x86/include/asm/apic.h |8 --
 arch/x86/kernel/apic/apic.c |   60 +--
 2 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 19b0eba..c100694 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -85,14 +85,6 @@ static inline bool apic_from_smp_config(void)
 #include asm/paravirt.h
 #endif
 
-#ifdef CONFIG_X86_64
-extern int is_vsmp_box(void);
-#else
-static inline int is_vsmp_box(void)
-{
-return 0;
-}
-#endif
 extern int setup_profiling_timer(unsigned int);
 
 static inline void native_apic_mem_write(u32 reg, u32 v)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad28db7..2b85bb9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2451,51 +2451,6 @@ static void apic_pm_activate(void) { }
 
 #ifdef CONFIG_X86_64
 
-static int apic_cluster_num(void)
-{
-int i, clusters, zeros;
-unsigned id;
-u16 *bios_cpu_apicid;
-DECLARE_BITMAP(clustermap, NUM_APIC_CLUSTERS);
-
-bios_cpu_apicid = early_per_cpu_ptr(x86_bios_cpu_apicid);
-bitmap_zero(clustermap, NUM_APIC_CLUSTERS);
-
-for (i = 0; i  nr_cpu_ids; i++) {
-/* are we being called early in kernel startup? */
-if (bios_cpu_apicid) {
-id = bios_cpu_apicid[i];
-} else if (i  nr_cpu_ids) {
-if (cpu_present(i))
-id = per_cpu(x86_bios_cpu_apicid, i);
-else
-continue;
-} else
-break;
-
-if (id != BAD_APICID)
-__set_bit(APIC_CLUSTERID(id), clustermap);
-}
-
-/* Problem:  Partially populated chassis may not have CPUs in some of
- * the APIC clusters they have been allocated.  Only present CPUs have
- * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
- * Since clusters are allocated sequentially, count zeros only if
- * they are bounded by ones.
- */
-clusters = 0;
-zeros = 0;
-for (i = 0; i  NUM_APIC_CLUSTERS; i++) {
-if (test_bit(i, clustermap)) {
-clusters += 1 + zeros;
-zeros = 0;
-} else
-++zeros;
-}
-
-return clusters;
-}
-
 static int multi_checked;
 static int multi;
 
@@ -2540,20 +2495,7 @@ static void dmi_check_multi(void)
 int apic_is_clustered_box(void)
 {
 dmi_check_multi();
-if (multi)
-return 1;
-
-if (!is_vsmp_box())
-return 0;
-
-/*
- * ScaleMP vSMPowered boxes have one cluster per board and TSCs are
- * not guaranteed to be synced between boards
- */
-if (apic_cluster_num()  1)
-return 1;
-
-return 0;
+return multi;
 }
 #endif
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] x86, Clean up smp_num_siblings calculation

2014-06-01 Thread Oren Twaig
Hi Prarit,

See below,

On 05/30/2014 02:43 PM, Prarit Bhargava wrote:
> I have a system on which I have disabled threading in the BIOS, and I am 
> booting
> the kernel with the option "idle=poll".
>
> The kernel displays
>
> process: WARNING: polling idle and HT enabled, performance may degrade
>
> which is incorrect -- I've already disabled HT.
>
> This warning is issued here:
>
> void select_idle_routine(const struct cpuinfo_x86 *c)
> {
> if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
> pr_warn_once("WARNING: polling idle and HT enabled, 
> performance may degrade\n");
>
> >From my understanding of the other ares of kernel that use
> smp_num_siblings, the value is supposed to be the the number of threads
> per core.
>
> The value of smp_num_siblings is incorrect.  In theory, it should be 1 but it
> is reported as 2.  When I looked into how smp_num_siblings is calculated I
> found the following call sequence in the kernel:
>
> start_kernel ->
> check_bugs ->
> identify_boot_cpu ->
> identify_cpu ->
> c_init = init_intel
> init_intel ->
> 
> detect_extended_topology
> (sets value)
>
> OR
>
> c_init = init_amd
> init_amd -> amd_detect_cmp
>  -> 
> amd_get_topology
> (sets value)
>  -> detect_ht()
> ...(sets value)
> detect_ht()
> (also sets value)
>
> ie) it is set three times in some cases and is overwritten by the call
> to detect_ht() from identify_cpu() in all cases.
>
> It should be noted that nothing in the identify_cpu() path or the cpu_up()
> path requires smp_num_siblings to be set, prior to the final call to
> detect_ht().
>
> For x86 boxes, smp_num_siblings is set to a value read in a CPUID call in
> detect_ht(). This value is the *factory defined* value in all cases; even
> if HT is disabled in BIOS the value still returns 2 if the CPU supports
> HT.  AMD also reports the factory defined value in all cases.

The above is incorrect in case of X-TOPOLOGY mode. I such a case the information
about number of siblings comes from the LEVEL_MAX_SIBLINGS() macro and the
X86_FEATURE_XTOPOLOGY flag is set to skip detect_ht() work :
void detect_ht(struct cpuinfo_x86 *c)
...
if (cpu_has(c, X86_FEATURE_XTOPOLOGY))
return;

In addition, the information about the number of sibling no longer comes from
CPUID(0x1)->ebx but rather from the 0xb leaf of CPUID.

I've marked below the problematic code change.

Thanks,
Oren Twaig

>
> Other uses of smp_num_siblings involve oprofile (used after boot), and
> the perf code which is done well after the initial cpus are brought online.
>
> This patch removes dead code and moves the assignment of smp_num_siblings
> to only the detect_ht() code; it is still always reporting 2.  A follow
> on patch will fix the calculation.
>
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> Cc: Borislav Petkov 
> Cc: Paul Gortmaker 
> Cc: Andrew Morton 
> Cc: Andi Kleen 
> Cc: Dave Jones 
> Cc: Torsten Kaiser 
> Cc: Jan Beulich 
> Cc: Jan Kiszka 
> Cc: Toshi Kani 
> Cc: Andrew Jones 
> Signed-off-by: Prarit Bhargava 
> ---
>  arch/x86/kernel/cpu/amd.c  |1 -
>  arch/x86/kernel/cpu/common.c   |   23 +++
>  arch/x86/kernel/cpu/topology.c |2 +-
>  arch/x86/kernel/smpboot.c  |5 ++---
>  4 files changed, 14 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index ce8b8ff..6aca2b6 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -304,7 +304,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
>  node_id = ecx & 7;
> 
>  /* get compute unit information */
> -smp_num_siblings = ((ebx >> 8) & 3) + 1;
>  c->compute_unit_id = ebx & 0xff;
>  cores_per_cu += ((ebx >> 8) & 3);
>  } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
> diff --git a/a

Re: [PATCH 1/2] x86, Clean up smp_num_siblings calculation

2014-06-01 Thread Oren Twaig
Hi Prarit,

See below,

On 05/30/2014 02:43 PM, Prarit Bhargava wrote:
 I have a system on which I have disabled threading in the BIOS, and I am 
 booting
 the kernel with the option idle=poll.

 The kernel displays

 process: WARNING: polling idle and HT enabled, performance may degrade

 which is incorrect -- I've already disabled HT.

 This warning is issued here:

 void select_idle_routine(const struct cpuinfo_x86 *c)
 {
 if (boot_option_idle_override == IDLE_POLL  smp_num_siblings  1)
 pr_warn_once(WARNING: polling idle and HT enabled, 
 performance may degrade\n);

 From my understanding of the other ares of kernel that use
 smp_num_siblings, the value is supposed to be the the number of threads
 per core.

 The value of smp_num_siblings is incorrect.  In theory, it should be 1 but it
 is reported as 2.  When I looked into how smp_num_siblings is calculated I
 found the following call sequence in the kernel:

 start_kernel -
 check_bugs -
 identify_boot_cpu -
 identify_cpu -
 c_init = init_intel
 init_intel -
 
 detect_extended_topology
 (sets value)

 OR

 c_init = init_amd
 init_amd - amd_detect_cmp
  - 
 amd_get_topology
 (sets value)
  - detect_ht()
 ...(sets value)
 detect_ht()
 (also sets value)

 ie) it is set three times in some cases and is overwritten by the call
 to detect_ht() from identify_cpu() in all cases.

 It should be noted that nothing in the identify_cpu() path or the cpu_up()
 path requires smp_num_siblings to be set, prior to the final call to
 detect_ht().

 For x86 boxes, smp_num_siblings is set to a value read in a CPUID call in
 detect_ht(). This value is the *factory defined* value in all cases; even
 if HT is disabled in BIOS the value still returns 2 if the CPU supports
 HT.  AMD also reports the factory defined value in all cases.

The above is incorrect in case of X-TOPOLOGY mode. I such a case the information
about number of siblings comes from the LEVEL_MAX_SIBLINGS() macro and the
X86_FEATURE_XTOPOLOGY flag is set to skip detect_ht() work :
void detect_ht(struct cpuinfo_x86 *c)
...
if (cpu_has(c, X86_FEATURE_XTOPOLOGY))
return;

In addition, the information about the number of sibling no longer comes from
CPUID(0x1)-ebx but rather from the 0xb leaf of CPUID.

I've marked below the problematic code change.

Thanks,
Oren Twaig


 Other uses of smp_num_siblings involve oprofile (used after boot), and
 the perf code which is done well after the initial cpus are brought online.

 This patch removes dead code and moves the assignment of smp_num_siblings
 to only the detect_ht() code; it is still always reporting 2.  A follow
 on patch will fix the calculation.

 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Ingo Molnar mi...@redhat.com
 Cc: H. Peter Anvin h...@zytor.com
 Cc: x...@kernel.org
 Cc: Borislav Petkov b...@suse.de
 Cc: Paul Gortmaker paul.gortma...@windriver.com
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Andi Kleen a...@linux.intel.com
 Cc: Dave Jones da...@redhat.com
 Cc: Torsten Kaiser just.for.l...@googlemail.com
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Jan Kiszka jan.kis...@siemens.com
 Cc: Toshi Kani toshi.k...@hp.com
 Cc: Andrew Jones drjo...@redhat.com
 Signed-off-by: Prarit Bhargava pra...@redhat.com
 ---
  arch/x86/kernel/cpu/amd.c  |1 -
  arch/x86/kernel/cpu/common.c   |   23 +++
  arch/x86/kernel/cpu/topology.c |2 +-
  arch/x86/kernel/smpboot.c  |5 ++---
  4 files changed, 14 insertions(+), 17 deletions(-)

 diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
 index ce8b8ff..6aca2b6 100644
 --- a/arch/x86/kernel/cpu/amd.c
 +++ b/arch/x86/kernel/cpu/amd.c
 @@ -304,7 +304,6 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
  node_id = ecx  7;
 
  /* get compute unit information */
 -smp_num_siblings = ((ebx  8)  3) + 1;
  c-compute_unit_id = ebx  0xff;
  cores_per_cu += ((ebx  8)  3);
  } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
 index a135239..fc1235c 100644
 --- a/arch/x86/kernel/cpu/common.c
 +++ b/arch/x86/kernel/cpu/common.c
 @@ -507,42 +507,41 @@ void detect_ht(struct cpuinfo_x86 *c)
  u32 eax, ebx, ecx, edx;
  int index_msb

Re: [PATCH] x86, hugetlb: add missing TLB page invalidation for hugetlb_cow()

2014-05-15 Thread Oren Twaig

On 05/15/2014 08:00 PM, Anthony Iliopoulos wrote:
> I have dismissed this case, since I assume that there are many more
> cycles spent in servicing the TLB invalidation IPI, walking the pgtable
> plus other related overhead (e.g. sched) than in updating the pte/pmd
> so I am not sure how possible it would be to hit this condition.

Hi Anthony,

I have a question about the above statement. What will happen with multi
cpu VMs ? won't the race described above can happen ? I.e one virtual CPU
can will visit the host and the other will continue to encounter your race ?

Thanks,
Oren.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, hugetlb: add missing TLB page invalidation for hugetlb_cow()

2014-05-15 Thread Oren Twaig

On 05/15/2014 08:00 PM, Anthony Iliopoulos wrote:
 I have dismissed this case, since I assume that there are many more
 cycles spent in servicing the TLB invalidation IPI, walking the pgtable
 plus other related overhead (e.g. sched) than in updating the pte/pmd
 so I am not sure how possible it would be to hit this condition.

Hi Anthony,

I have a question about the above statement. What will happen with multi
cpu VMs ? won't the race described above can happen ? I.e one virtual CPU
can will visit the host and the other will continue to encounter your race ?

Thanks,
Oren.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[tip:x86/urgent] x86/vsmp: Fix irq routing

2014-04-29 Thread tip-bot for Oren Twaig
Commit-ID:  39025ba38278f3003ee538409f7c98970620ef49
Gitweb: http://git.kernel.org/tip/39025ba38278f3003ee538409f7c98970620ef49
Author: Oren Twaig 
AuthorDate: Mon, 28 Apr 2014 10:21:37 +0300
Committer:  Ingo Molnar 
CommitDate: Mon, 28 Apr 2014 09:27:34 +0200

x86/vsmp: Fix irq routing

Correct IRQ routing in case a vSMP box is detected
but the  Interrupt Routing Comply (IRC) value is set to
"comply", which leads to incorrect IRQ routing.

Before the patch:

When a vSMP box was detected and IRC was set to "comply",
users (and the kernel) couldn't effectively set the
destination of the IRQs. This is because the hook inside
vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask.
Later, this "overrided" mask caused the kernel to set the IRQ
destination to the lowest online CPU in the mask (CPU0 usually).

After the patch:

When the IRC is set to "comply", users (and the kernel) can control
the destination of the IRQs as we will not be changing the
default "apic->vector_allocation_domain".

Signed-off-by: Oren Twaig 
Acked-by: Shai Fultheim 
Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-o...@scalemp.com
[ Minor readability edits. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/vsmp_64.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..5edc34b 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,9 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+/* Flag below is initialized once during vSMP PCI initialization. */
+static int irq_routing_comply = 1;
+
 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap & ctl & BIT(8)) {
ctl &= ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   irq_routing_comply = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic->phys_pkg_id = apicid_phys_pkg_id;
-   apic->vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!irq_routing_comply)
+   apic->vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[tip:x86/urgent] x86/vsmp: Fix irq routing

2014-04-29 Thread tip-bot for Oren Twaig
Commit-ID:  39025ba38278f3003ee538409f7c98970620ef49
Gitweb: http://git.kernel.org/tip/39025ba38278f3003ee538409f7c98970620ef49
Author: Oren Twaig o...@scalemp.com
AuthorDate: Mon, 28 Apr 2014 10:21:37 +0300
Committer:  Ingo Molnar mi...@kernel.org
CommitDate: Mon, 28 Apr 2014 09:27:34 +0200

x86/vsmp: Fix irq routing

Correct IRQ routing in case a vSMP box is detected
but the  Interrupt Routing Comply (IRC) value is set to
comply, which leads to incorrect IRQ routing.

Before the patch:

When a vSMP box was detected and IRC was set to comply,
users (and the kernel) couldn't effectively set the
destination of the IRQs. This is because the hook inside
vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask.
Later, this overrided mask caused the kernel to set the IRQ
destination to the lowest online CPU in the mask (CPU0 usually).

After the patch:

When the IRC is set to comply, users (and the kernel) can control
the destination of the IRQs as we will not be changing the
default apic-vector_allocation_domain.

Signed-off-by: Oren Twaig o...@scalemp.com
Acked-by: Shai Fultheim s...@scalemp.com
Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-o...@scalemp.com
[ Minor readability edits. ]
Signed-off-by: Ingo Molnar mi...@kernel.org
---
 arch/x86/kernel/vsmp_64.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..5edc34b 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,9 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+/* Flag below is initialized once during vSMP PCI initialization. */
+static int irq_routing_comply = 1;
+
 #if defined CONFIG_PCI  defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap  ctl  BIT(8)) {
ctl = ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   irq_routing_comply = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic-phys_pkg_id = apicid_phys_pkg_id;
-   apic-vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!irq_routing_comply)
+   apic-vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-28 Thread Oren Twaig
Correct IRQ routing in case a vSMP Foundation box is detected but the 
Interrupt Routing Comply (IRC) is set to "comply".

Before the patch:
When a vSMP Foundation box was detected and IRC was set to "comply", users (and
kernel) couldn't effectively set the destination of the IRQs. This is because
the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask. Later, this
"overrided" mask caused the kernel to set the IRQ destination to the lowest
online CPU in the mask (CPU0 usually).

After the patch:
When the IRC is set to "comply", Users (and kernel) can control the destination
of the IRQs as we will not be changing the default
"apic->vector_allocation_domain".

Signed-off-by: Oren Twaig 
Acked-by: Shai Fultheim 
---
 arch/x86/kernel/vsmp_64.c |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..39dd854 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,9 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+/* Flag below is initialized once during vSMP Foundation PCI initialization. */
+static int interrupt_routing_comply = 1;
+
 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap & ctl & BIT(8)) {
ctl &= ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   interrupt_routing_comply = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic->phys_pkg_id = apicid_phys_pkg_id;
-   apic->vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!interrupt_routing_comply)
+   apic->vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-28 Thread Oren Twaig
Hi Andi,
On 04/27/2014 09:34 PM, Andi Kleen wrote:
> Again, what lock protects it?
>
> If you cannot answer that question you likely shouldn't use static.

The only function which touches this variable is vsmp_init() which is an
"_init" function which is guarantee to run by a single cpu - this means, no 
race.

Thanks,
  Oren
>
> -Andi


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-28 Thread Oren Twaig
Hi Andi,
On 04/27/2014 09:34 PM, Andi Kleen wrote:
 Again, what lock protects it?

 If you cannot answer that question you likely shouldn't use static.

The only function which touches this variable is vsmp_init() which is an
_init function which is guarantee to run by a single cpu - this means, no 
race.

Thanks,
  Oren

 -Andi


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-28 Thread Oren Twaig
Correct IRQ routing in case a vSMP Foundation box is detected but the 
Interrupt Routing Comply (IRC) is set to comply.

Before the patch:
When a vSMP Foundation box was detected and IRC was set to comply, users (and
kernel) couldn't effectively set the destination of the IRQs. This is because
the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask. Later, this
overrided mask caused the kernel to set the IRQ destination to the lowest
online CPU in the mask (CPU0 usually).

After the patch:
When the IRC is set to comply, Users (and kernel) can control the destination
of the IRQs as we will not be changing the default
apic-vector_allocation_domain.

Signed-off-by: Oren Twaig o...@scalemp.com
Acked-by: Shai Fultheim s...@scalemp.com
---
 arch/x86/kernel/vsmp_64.c |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..39dd854 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,9 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+/* Flag below is initialized once during vSMP Foundation PCI initialization. */
+static int interrupt_routing_comply = 1;
+
 #if defined CONFIG_PCI  defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap  ctl  BIT(8)) {
ctl = ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   interrupt_routing_comply = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic-phys_pkg_id = apicid_phys_pkg_id;
-   apic-vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!interrupt_routing_comply)
+   apic-vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-27 Thread Oren Twaig
Hi Andi,

On 04/25/2014 05:22 PM, Andi Kleen wrote:
>> +static int irc = 1;
> Using a static for such state is very unusual. You need to describe what
> protects it against races and why that is needed over a cleaner solution.

The only reason I've used a static variable is because I wanted to avoid
inserting another code/functions which are depended on CONFIG_PCI.  The code is
used once during initialization and hence cannot be racy.

 
But, if static variables are unusual (new at linux kernel), I will change the
flow to read the HW state again (using the PCI functions).  Please let me know 
if that is desirable.

Thanks,
   Oren.
 
>
> -Andi


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-27 Thread Oren Twaig
Hi Ingo,

On 04/26/2014 09:09 AM, Ingo Molnar wrote:
> I still don't see a clear explanation of what the _user_ saw and sees
> before and after the change. What is the effect of the patch: correct
> IRQ routing (i.e. before the change IRQs would end up on the wrong
> CPU), lower overhead IRQ routing (i.e. before the change IRQ routing
> overhead was more expensive), or something else?
>
> You don't spell this out clearly and it's a crucial piece of
> information that comes before every other explanation.
>
I see.. I tried to explain the entire flow and that was confusing - I'll explain
only the patch.

As you stated, in general, the patch corrects IRQ routing in case a vSMP
Foundation box is detected but the Interrupt Routing Comply (IRC) is set to
"comply".

Before the patch:
When a vSMP Foundation box was detected and IRC was set to "comply", users (and
kernel) couldn't effectively set the destination of the IRQs. This is because
the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask. Later, this
"overrided" mask caused the kernel to set the IRQ destination to the lowest
online CPU in the mask (CPU0 usually).

After the patch:
When the IRC is set to "comply", Users (and kernel) can control the destination
of the IRQs as we will not be changing the default
"apic->vector_allocation_domain".

Thanks,
   Oren

> Thanks,
>
> Ingo


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-27 Thread Oren Twaig
Hi Ingo,

On 04/26/2014 09:09 AM, Ingo Molnar wrote:
 I still don't see a clear explanation of what the _user_ saw and sees
 before and after the change. What is the effect of the patch: correct
 IRQ routing (i.e. before the change IRQs would end up on the wrong
 CPU), lower overhead IRQ routing (i.e. before the change IRQ routing
 overhead was more expensive), or something else?

 You don't spell this out clearly and it's a crucial piece of
 information that comes before every other explanation.

I see.. I tried to explain the entire flow and that was confusing - I'll explain
only the patch.

As you stated, in general, the patch corrects IRQ routing in case a vSMP
Foundation box is detected but the Interrupt Routing Comply (IRC) is set to
comply.

Before the patch:
When a vSMP Foundation box was detected and IRC was set to comply, users (and
kernel) couldn't effectively set the destination of the IRQs. This is because
the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask. Later, this
overrided mask caused the kernel to set the IRQ destination to the lowest
online CPU in the mask (CPU0 usually).

After the patch:
When the IRC is set to comply, Users (and kernel) can control the destination
of the IRQs as we will not be changing the default
apic-vector_allocation_domain.

Thanks,
   Oren

 Thanks,

 Ingo


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-27 Thread Oren Twaig
Hi Andi,

On 04/25/2014 05:22 PM, Andi Kleen wrote:
 +static int irc = 1;
 Using a static for such state is very unusual. You need to describe what
 protects it against races and why that is needed over a cleaner solution.

The only reason I've used a static variable is because I wanted to avoid
inserting another code/functions which are depended on CONFIG_PCI.  The code is
used once during initialization and hence cannot be racy.

 
But, if static variables are unusual (new at linux kernel), I will change the
flow to read the HW state again (using the PCI functions).  Please let me know 
if that is desirable.

Thanks,
   Oren.
 

 -Andi


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-25 Thread Oren Twaig

On 4/25/2014 11:01 AM, Ingo Molnar wrote:



> * Oren Twaig  wrote:
>
>> vSMP Foundation provides locality based interrupt routing which needed
>> vector_allocation_domain to allow all online cpus can handle all 
possible

>> vectors.
>>
>> Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug 
this hook as
>> otherwise the IOAPIC, MSI and MSIX destination selectors to always 
select the
>> lowest online cpu as the destination. I.e affinity of HW interrupts 
cannot be

>> controled by kernel and/or userspace code.
>>
>> The purpose of the patch is to fix the code to set override vector 
allocation
>> domain only when IRC is set to ignore to allow the kernel and 
userspace to

>> effectively control the destination of the HW interrupts.
>>
>> Signed-off-by: Oren Twaig 
>> Acked-by: Shai Fultheim 
>
> So what was the behavior before the change - certain IRQs did not get
> routed, they just ended up on CPU0 or on some other undesirable CPU?
> Or was IRQ distribution random? It's not clear from the changelog.

It all depends on the IRC flag. When set to "ignore" by the linux
kernel, vSMP Foundation knew that it can deliver the IRQ to the CPU
which would result in less virtualization overhead. For example, we
could deliver the HW interrupt to the CPU which got it or any other CPU
in the system. We couldn't have done it without the kernel making sure
that each vector can be passed to all CPUs. This is why we override the
verctor allocation domain to signal all CPUs.

But, when the IRC is set to "comply" we, before this patch, still
efected the allocation domains alltough it wasn't needed. It wasn't
needed because when in "comply" mode, we always pass the HW interrupt to
the CPU the kernel requested (by setting the IOAPIC entry, MSI/X entry
or IR entry)

Thanks,
  Oren




> Thanks,
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe 
linux-kernel" in

> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-25 Thread Oren Twaig

On 4/25/2014 11:01 AM, Ingo Molnar wrote:



 * Oren Twaig o...@scalemp.com wrote:

 vSMP Foundation provides locality based interrupt routing which needed
 vector_allocation_domain to allow all online cpus can handle all 
possible

 vectors.

 Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug 
this hook as
 otherwise the IOAPIC, MSI and MSIX destination selectors to always 
select the
 lowest online cpu as the destination. I.e affinity of HW interrupts 
cannot be

 controled by kernel and/or userspace code.

 The purpose of the patch is to fix the code to set override vector 
allocation
 domain only when IRC is set to ignore to allow the kernel and 
userspace to

 effectively control the destination of the HW interrupts.

 Signed-off-by: Oren Twaig o...@scalemp.com
 Acked-by: Shai Fultheim s...@scalemp.com

 So what was the behavior before the change - certain IRQs did not get
 routed, they just ended up on CPU0 or on some other undesirable CPU?
 Or was IRQ distribution random? It's not clear from the changelog.

It all depends on the IRC flag. When set to ignore by the linux
kernel, vSMP Foundation knew that it can deliver the IRQ to the CPU
which would result in less virtualization overhead. For example, we
could deliver the HW interrupt to the CPU which got it or any other CPU
in the system. We couldn't have done it without the kernel making sure
that each vector can be passed to all CPUs. This is why we override the
verctor allocation domain to signal all CPUs.

But, when the IRC is set to comply we, before this patch, still
efected the allocation domains alltough it wasn't needed. It wasn't
needed because when in comply mode, we always pass the HW interrupt to
the CPU the kernel requested (by setting the IOAPIC entry, MSI/X entry
or IR entry)

Thanks,
  Oren




 Thanks,

 Ingo
 --
 To unsubscribe from this list: send the line unsubscribe 
linux-kernel in

 the body of a message to majord...@vger.kernel.org
 More majordomo info at http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/




---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-24 Thread Oren Twaig
vSMP Foundation provides locality based interrupt routing which needed
vector_allocation_domain to allow all online cpus can handle all possible
vectors. 

Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as
otherwise the IOAPIC, MSI and MSIX destination selectors to always select the
lowest online cpu as the destination. I.e affinity of HW interrupts cannot be
controled by kernel and/or userspace code.

The purpose of the patch is to fix the code to set override vector allocation
domain only when IRC is set to ignore to allow the kernel and userspace to
effectively control the destination of the HW interrupts.

Signed-off-by: Oren Twaig 
Acked-by: Shai Fultheim 
---

 arch/x86/kernel/vsmp_64.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..b7f8e5b 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,8 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+static int irc = 1;
+
 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +103,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap & ctl & BIT(8)) {
ctl &= ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   irc = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +224,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic->phys_pkg_id = apicid_phys_pkg_id;
-   apic->vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!irc)
+   apic->vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-24 Thread Oren Twaig
Hi Ingo,

On 04/24/2014 09:22 AM, Ingo Molnar wrote:
>
> * Oren Twaig  wrote:
>
>> Set all inclusive vector allocation domain only if interrupt routing
>> is set to ignore. When in comply mode, vector allocation behavior
>> isn't changed.
>
> This changelog is too terse:
>
> Please update the changelog to describe the current behavior, and how
> it affects your platform. (I.e. how do users notice, if at all?)
>
> Then describe why you think that behavior should be changed. ie:
> what's the reason for this patch.
>
> Only then describe the details of the change itself.

I will send v2 of the patch with a revised and more detailed
explanation about the behavior aspect of the patch.

>
>> -apic->vector_allocation_domain = fill_vector_allocation_domain;
>> +
>> +if (!irc)
>> +apic->vector_allocation_domain = fill_vector_allocation_domain;
>
> Please also run scripts/checkpatch.pl over your patch which will
> report trivial coding style problems like the one here.

Yep - did that just like the instructions on "SubmmitingPatches", 
unfortunately, although I've followed the Thunderbird configuration instructions
with external editor, it seems to sent the the email content wrong (bad line 
endings as I saw now).

So sorry for the trouble, will change to a text email client and will
resend the patch.

Thanks,
Oren.
>
> Thanks,
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-24 Thread Oren Twaig
Hi Ingo,

On 04/24/2014 09:22 AM, Ingo Molnar wrote:

 * Oren Twaig o...@scalemp.com wrote:

 Set all inclusive vector allocation domain only if interrupt routing
 is set to ignore. When in comply mode, vector allocation behavior
 isn't changed.

 This changelog is too terse:

 Please update the changelog to describe the current behavior, and how
 it affects your platform. (I.e. how do users notice, if at all?)

 Then describe why you think that behavior should be changed. ie:
 what's the reason for this patch.

 Only then describe the details of the change itself.

I will send v2 of the patch with a revised and more detailed
explanation about the behavior aspect of the patch.


 -apic-vector_allocation_domain = fill_vector_allocation_domain;
 +
 +if (!irc)
 +apic-vector_allocation_domain = fill_vector_allocation_domain;

 Please also run scripts/checkpatch.pl over your patch which will
 report trivial coding style problems like the one here.

Yep - did that just like the instructions on SubmmitingPatches, 
unfortunately, although I've followed the Thunderbird configuration instructions
with external editor, it seems to sent the the email content wrong (bad line 
endings as I saw now).

So sorry for the trouble, will change to a text email client and will
resend the patch.

Thanks,
Oren.

 Thanks,

 Ingo
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-24 Thread Oren Twaig
vSMP Foundation provides locality based interrupt routing which needed
vector_allocation_domain to allow all online cpus can handle all possible
vectors. 

Enforcing Interrupt Routing Comply (IRC) mode requires us to unplug this hook as
otherwise the IOAPIC, MSI and MSIX destination selectors to always select the
lowest online cpu as the destination. I.e affinity of HW interrupts cannot be
controled by kernel and/or userspace code.

The purpose of the patch is to fix the code to set override vector allocation
domain only when IRC is set to ignore to allow the kernel and userspace to
effectively control the destination of the HW interrupts.

Signed-off-by: Oren Twaig o...@scalemp.com
Acked-by: Shai Fultheim s...@scalemp.com
---

 arch/x86/kernel/vsmp_64.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..b7f8e5b 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,6 +26,8 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
+static int irc = 1;
+
 #if defined CONFIG_PCI  defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -101,6 +103,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
if (cap  ctl  BIT(8)) {
ctl = ~BIT(8);
+
+   /* Interrupt routing set to ignore */
+   irc = 0;
+
 #ifdef CONFIG_PROC_FS
/* Don't let users change irq affinity via procfs */
no_irq_affinity = 1;
@@ -218,7 +224,9 @@ static void vsmp_apic_post_init(void)
 {
/* need to update phys_pkg_id */
apic-phys_pkg_id = apicid_phys_pkg_id;
-   apic-vector_allocation_domain = fill_vector_allocation_domain;
+
+   if (!irc)
+   apic-vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-23 Thread Oren Twaig
Set all inclusive vector allocation domain only if interrupt routing is 
set to ignore. When in comply mode, vector allocation behavior isn't 
changed.


Signed-off-by: Oren Twaig 
Acked-by: Shai Fultheim 
---
diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..60a195d 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -27,6 +27,9 @@
 #define TOPOLOGY_REGISTER_OFFSET 0x10

 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT
+
+static int irc = 1;
+
 /*
  * Interrupt control on vSMPowered systems:
  * ~AC is a shadow of IF.  If IF is 'on' AC should be 'off'
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
 if (cap & ctl & BIT(8)) {
 ctl &= ~BIT(8);
+
+/* Interrupt routing set to ignore */
+irc = 0;
+
 #ifdef CONFIG_PROC_FS
 /* Don't let users change irq affinity via procfs */
 no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
 /* need to update phys_pkg_id */
 apic->phys_pkg_id = apicid_phys_pkg_id;
-apic->vector_allocation_domain = fill_vector_allocation_domain;
+
+if (!irc)
+apic->vector_allocation_domain = fill_vector_allocation_domain;
 }

 void __init vsmp_init(void)


---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] X86: Hook apic vector allocation domain only when interrupt routing are set to ignore

2014-04-23 Thread Oren Twaig
Set all inclusive vector allocation domain only if interrupt routing is 
set to ignore. When in comply mode, vector allocation behavior isn't 
changed.


Signed-off-by: Oren Twaig o...@scalemp.com
Acked-by: Shai Fultheim s...@scalemp.com
---
diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index f6584a9..60a195d 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -27,6 +27,9 @@
 #define TOPOLOGY_REGISTER_OFFSET 0x10

 #if defined CONFIG_PCI  defined CONFIG_PARAVIRT
+
+static int irc = 1;
+
 /*
  * Interrupt control on vSMPowered systems:
  * ~AC is a shadow of IF.  If IF is 'on' AC should be 'off'
@@ -101,6 +104,10 @@ static void __init set_vsmp_pv_ops(void)
 #ifdef CONFIG_SMP
 if (cap  ctl  BIT(8)) {
 ctl = ~BIT(8);
+
+/* Interrupt routing set to ignore */
+irc = 0;
+
 #ifdef CONFIG_PROC_FS
 /* Don't let users change irq affinity via procfs */
 no_irq_affinity = 1;
@@ -218,7 +225,9 @@ static void vsmp_apic_post_init(void)
 {
 /* need to update phys_pkg_id */
 apic-phys_pkg_id = apicid_phys_pkg_id;
-apic-vector_allocation_domain = fill_vector_allocation_domain;
+
+if (!irc)
+apic-vector_allocation_domain = fill_vector_allocation_domain;
 }

 void __init vsmp_init(void)


---
This email is free from viruses and malware because avast! Antivirus protection 
is active.
http://www.avast.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/