Re: [PATCH] kvm: x86: svm: remove SVM_EXIT_READ_CR* intercepts

2015-03-16 Thread Joel Schopp


On 03/12/2015 04:20 PM, Radim Krčmář wrote:
> 2015-03-12 15:17-0500, Joel Schopp:
>> There isn't really a valid reason for kvm to intercept cr* reads
>> on svm hardware.  The current kvm code just ends up returning
>> the register
> There is no need to intercept CR* if the value that the guest should see
> is equal to what we set there, but that is not always the case:
> - CR0 might differ from what the guest should see because of lazy fpu
Based on our previous conversations I understand why we have to trap
writes to the CR0 TS bit for lazy FPU, but I don't understand why that
should affect a read.  I'll take another look at the code to see what
I'm missing.  You are probably correct, in which case I'll modify the
patch to turn off the read intercepts only when lazy FPU isn't active.

> - CR3 isn't intercepted with nested paging and it should differ
>   otherwise
> - CR4 contains PAE bit when run without nested paging
>
> CR2 and CR8 already aren't intercepted, so it looks like only CR0 and
> CR4 could use some optimizations.
I'll send out a v2 with these less aggressive optimizations.
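
The conditions listed in the review can be read as a per-register predicate.
A minimal sketch in plain C, assuming a hypothetical helper
cr_read_needs_intercept() and two booleans (npt_enabled, fpu_active) that
stand in for the real KVM state; this is an illustration, not the svm.c code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not KVM code: decide whether a guest read of CRn
 * has to be intercepted because the hardware value may differ from the
 * value the guest should see.
 */
static bool cr_read_needs_intercept(int cr, bool npt_enabled, bool fpu_active)
{
	switch (cr) {
	case 0:
		/* With lazy FPU the real CR0 may differ from the guest view. */
		return !fpu_active;
	case 3:
		/* Without nested paging the real CR3 is not the guest's CR3. */
		return !npt_enabled;
	case 4:
		/* Without nested paging the real CR4 carries the PAE bit. */
		return !npt_enabled;
	case 2:
	case 8:
		/* Already not intercepted, per the review. */
		return false;
	default:
		/* Unknown register: stay conservative and intercept. */
		return true;
	}
}
```

With nested paging on and the FPU active, none of CR0, CR3 or CR4 would need
a read intercept in this model, which matches the "less aggressive" direction
for v2.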
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mce: use safe MSR accesses

2015-03-13 Thread Joel Schopp

On 03/13/2015 11:03 AM, jesse.lar...@amd.com wrote:
> From: Jesse Larrew 
>
> Certain MSRs are only relevant to a kernel in host mode, and kvm had
> chosen not to implement these MSRs at all for guests. If a guest kernel
> ever tried to access these MSRs, the result was a general protection
> fault.
>
> KVM will be separately patched to return 0 when these MSRs are read,
> and this patch ensures that MSR accesses are tolerant of exceptions.
>
> Signed-off-by: Jesse Larrew 
> ---
>  arch/x86/kernel/cpu/mcheck/mce.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c 
> b/arch/x86/kernel/cpu/mcheck/mce.c
> index 61a9668ce..2737ced 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1540,7 +1540,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
> *c)
>if (c->x86 == 0x15 &&
>(c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
>int i;
> -  u64 val, hwcr;
> +  u64 hwcr;
>bool need_toggle;
>u32 msrs[] = {
>   0x0413, /* MC4_MISC0 */
> @@ -1556,13 +1556,8 @@ static int __mcheck_cpu_apply_quirks(struct 
> cpuinfo_x86 *c)
>wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
>  
>for (i = 0; i < ARRAY_SIZE(msrs); i++) {
> -  rdmsrl(msrs[i], val);
> -
> -  /* CntP bit set? */
> -  if (val & BIT_64(62)) {
> - val &= ~BIT_64(62);
> - wrmsrl(msrs[i], val);
> -  }
> +  /* Clear CntP bit safely */
> +      msr_clear_bit(msrs[i], 62);
>}
>  
>/* restore old settings */
I like it.

Reviewed-by: Joel Schopp 


[PATCH] kvm: x86: svm: remove SVM_EXIT_READ_CR* intercepts

2015-03-12 Thread Joel Schopp
There isn't really a valid reason for kvm to intercept cr* reads
on svm hardware.  The current kvm code just ends up returning
the register.

Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |   41 -
 1 file changed, 4 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cc618c8..c3d10e6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1090,9 +1090,6 @@ static void init_vmcb(struct vcpu_svm *svm)
svm->vcpu.fpu_active = 1;
svm->vcpu.arch.hflags = 0;
 
-   set_cr_intercept(svm, INTERCEPT_CR0_READ);
-   set_cr_intercept(svm, INTERCEPT_CR3_READ);
-   set_cr_intercept(svm, INTERCEPT_CR4_READ);
set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
set_cr_intercept(svm, INTERCEPT_CR3_WRITE);
set_cr_intercept(svm, INTERCEPT_CR4_WRITE);
@@ -1174,7 +1171,6 @@ static void init_vmcb(struct vcpu_svm *svm)
control->nested_ctl = 1;
clr_intercept(svm, INTERCEPT_INVLPG);
clr_exception_intercept(svm, PF_VECTOR);
-   clr_cr_intercept(svm, INTERCEPT_CR3_READ);
clr_cr_intercept(svm, INTERCEPT_CR3_WRITE);
save->g_pat = 0x0007040600070406ULL;
save->cr3 = 0;
@@ -2968,29 +2964,10 @@ static int cr_interception(struct vcpu_svm *svm)
kvm_queue_exception(&svm->vcpu, UD_VECTOR);
return 1;
}
-   } else { /* mov from cr */
-   switch (cr) {
-   case 0:
-   val = kvm_read_cr0(&svm->vcpu);
-   break;
-   case 2:
-   val = svm->vcpu.arch.cr2;
-   break;
-   case 3:
-   val = kvm_read_cr3(&svm->vcpu);
-   break;
-   case 4:
-   val = kvm_read_cr4(&svm->vcpu);
-   break;
-   case 8:
-   val = kvm_get_cr8(&svm->vcpu);
-   break;
-   default:
-   WARN(1, "unhandled read from CR%d", cr);
-   kvm_queue_exception(&svm->vcpu, UD_VECTOR);
-   return 1;
-   }
-   kvm_register_write(&svm->vcpu, reg, val);
+   } else { /* mov from cr, should never trap in svm */
+   WARN(1, "unhandled read from CR%d", cr);
+   kvm_queue_exception(&svm->vcpu, UD_VECTOR);
+   return 1;
}
kvm_complete_insn_gp(&svm->vcpu, err);
 
@@ -3321,10 +3298,6 @@ static int mwait_interception(struct vcpu_svm *svm)
 }
 
 static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
-   [SVM_EXIT_READ_CR0] = cr_interception,
-   [SVM_EXIT_READ_CR3] = cr_interception,
-   [SVM_EXIT_READ_CR4] = cr_interception,
-   [SVM_EXIT_READ_CR8] = cr_interception,
[SVM_EXIT_CR0_SEL_WRITE]= emulate_on_interception,
[SVM_EXIT_WRITE_CR0]= cr_interception,
[SVM_EXIT_WRITE_CR3]= cr_interception,
@@ -4151,11 +4124,9 @@ static const struct __x86_intercept {
u32 exit_code;
enum x86_intercept_stage stage;
 } x86_intercept_map[] = {
-   [x86_intercept_cr_read] = POST_EX(SVM_EXIT_READ_CR0),
[x86_intercept_cr_write]= POST_EX(SVM_EXIT_WRITE_CR0),
[x86_intercept_clts]= POST_EX(SVM_EXIT_WRITE_CR0),
[x86_intercept_lmsw]= POST_EX(SVM_EXIT_WRITE_CR0),
-   [x86_intercept_smsw]= POST_EX(SVM_EXIT_READ_CR0),
[x86_intercept_dr_read] = POST_EX(SVM_EXIT_READ_DR0),
[x86_intercept_dr_write]= POST_EX(SVM_EXIT_WRITE_DR0),
[x86_intercept_sldt]= POST_EX(SVM_EXIT_LDTR_READ),
@@ -4221,10 +4192,6 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
goto out;
 
switch (icpt_info.exit_code) {
-   case SVM_EXIT_READ_CR0:
-   if (info->intercept == x86_intercept_cr_read)
-   icpt_info.exit_code += info->modrm_reg;
-   break;
case SVM_EXIT_WRITE_CR0: {
unsigned long cr0, val;
u64 intercept;



Re: [PATCH] mce: use safe MSR accesses

2015-03-12 Thread Joel Schopp


On 03/11/2015 05:47 PM, Luck, Tony wrote:
>> When running as a guest under kvm, it's possible that the MSR
>> being accessed may not be implemented. All MSR accesses should
>> be prepared to handle exceptions.
> Isn't that a KVM bug?  The code here first checks family/model before 
> accessing the MSR:
>
>  if (c->x86 == 0x15 &&
>  (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
>
> If kvm tells the guest that it is running on one of these models, shouldn't 
> it provide
> complete coverage for that model?
These MSRs don't make sense in guest mode.  The real question is whether
we fix that in KVM, here, or both.  I'm a fan of fixing it in both places.
Xen's behavior is to return a value of 0 if the guest tries to access
these MSRs, which seems like a reasonable thing to do in KVM as well.  I am
volunteering myself to write that patch for KVM, but I would encourage
accepting an updated version of this patch as well.
>
> If that isn't possible - then you should still do more than just 
> s/rdmsrl/rdmsrl_safe/ ... like
> check the return value to see whether you got an exception .. and thus should 
> skip past
> code that uses the "val" that you thought you read from the non-existent MSR.
Initializing val to 0 where it is declared should have the desired effect.
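
Either approach discussed here (checking the return value, or initializing
val to 0) keeps a value that was never read from being written back.  A
minimal user-space sketch of the return-value variant, assuming a hypothetical
table of MSRs (struct fake_msr) and hypothetical fake_*_safe() helpers; none
of these names are kernel APIs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a machine's MSR space; not a kernel API. */
struct fake_msr {
	uint32_t index;
	uint64_t value;
	bool implemented;	/* false models an MSR the hypervisor lacks */
};

/* Returns 0 and fills *val on success, -1 if the access would fault. */
static int fake_rdmsr_safe(const struct fake_msr *tbl, size_t n,
			   uint32_t index, uint64_t *val)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].index == index && tbl[i].implemented) {
			*val = tbl[i].value;
			return 0;
		}
	}
	return -1;
}

/* Returns 0 on success, -1 if the access would fault. */
static int fake_wrmsr_safe(struct fake_msr *tbl, size_t n,
			   uint32_t index, uint64_t val)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].index == index && tbl[i].implemented) {
			tbl[i].value = val;
			return 0;
		}
	}
	return -1;
}

/*
 * Clear one bit only when the read succeeded.
 * Returns 1 if the bit was cleared, 0 if it was already clear,
 * -1 if either access would have faulted.
 */
static int fake_msr_clear_bit(struct fake_msr *tbl, size_t n,
			      uint32_t index, unsigned int bit)
{
	uint64_t mask = UINT64_C(1) << bit;
	uint64_t val;

	if (fake_rdmsr_safe(tbl, n, index, &val))
		return -1;	/* faulted: skip the write entirely */
	if (!(val & mask))
		return 0;	/* nothing to do */
	if (fake_wrmsr_safe(tbl, n, index, val & ~mask))
		return -1;
	return 1;
}
```

The point of the sketch is only the control flow: the write is unreachable
unless the read reported success.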


[PATCH v2] x86: svm: use cr_interception for SVM_EXIT_CR0_SEL_WRITE

2015-03-06 Thread Joel Schopp
From: David Kaplan 

Another patch in my war on emulate_on_interception() use as a svm exit handler.

These were pulled out of a larger patch at the suggestion of Radim Krcmar, see
https://lkml.org/lkml/2015/2/25/559

Changes since v1:
* fixed typo introduced after test, retested

Signed-off-by: David Kaplan 
[separated out just cr_interception part from larger removal of
INTERCEPT_CR0_WRITE, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..16ad05b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2940,7 +2940,10 @@ static int cr_interception(struct vcpu_svm *svm)
return emulate_on_interception(svm);
 
reg = svm->vmcb->control.exit_info_1 & SVM_EXITINFO_REG_MASK;
-   cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
+   if (svm->vmcb->control.exit_code == SVM_EXIT_CR0_SEL_WRITE)
+   cr = SVM_EXIT_WRITE_CR0 - SVM_EXIT_READ_CR0;
+   else
+   cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
 
err = 0;
if (cr >= 16) { /* mov to cr */
@@ -3325,7 +3328,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm 
*svm) = {
[SVM_EXIT_READ_CR3] = cr_interception,
[SVM_EXIT_READ_CR4] = cr_interception,
[SVM_EXIT_READ_CR8] = cr_interception,
-   [SVM_EXIT_CR0_SEL_WRITE]= emulate_on_interception,
+   [SVM_EXIT_CR0_SEL_WRITE]= cr_interception,
[SVM_EXIT_WRITE_CR0]= cr_interception,
[SVM_EXIT_WRITE_CR3]= cr_interception,
[SVM_EXIT_WRITE_CR4]= cr_interception,
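
The decode step in cr_interception() can be sketched outside the kernel.  The
exit code values below are taken from my reading of the SVM header and should
be treated as assumptions; struct cr_access and decode_cr_exit() are
hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed exit code values; check them against the SVM header. */
#define SVM_EXIT_READ_CR0	0x000
#define SVM_EXIT_WRITE_CR0	0x010
#define SVM_EXIT_CR0_SEL_WRITE	0x065

struct cr_access {		/* hypothetical result type */
	int cr;			/* control register number */
	bool is_write;
};

static struct cr_access decode_cr_exit(uint32_t exit_code)
{
	struct cr_access a;
	int cr;

	if (exit_code == SVM_EXIT_CR0_SEL_WRITE)
		/* Selective CR0 write: treat it like a plain CR0 write. */
		cr = SVM_EXIT_WRITE_CR0 - SVM_EXIT_READ_CR0;
	else
		cr = (int)exit_code - SVM_EXIT_READ_CR0;

	/* Write exit codes start 16 after the read exit codes. */
	a.is_write = cr >= 16;
	a.cr = a.is_write ? cr - 16 : cr;
	return a;
}
```

Without the special case, the selective-write exit code would be decoded as a
write to a control register that does not exist, which is why the handler
cannot simply be pointed at cr_interception() without this change.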



[PATCH] x86: svm: use cr_interception for SVM_EXIT_CR0_SEL_WRITE

2015-03-06 Thread Joel Schopp
From: David Kaplan 

Another patch in my war on emulate_on_interception() use as a svm exit handler.

These were pulled out of a larger patch at the suggestion of Radim Krcmar, see
https://lkml.org/lkml/2015/2/25/559

Signed-off-by: David Kaplan 
[separated out just cr_interception part from larger removal of
INTERCEPT_CR0_WRITE, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..57f0240 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2940,7 +2940,10 @@ static int cr_interception(struct vcpu_svm *svm)
return emulate_on_interception(svm);
 
reg = svm->vmcb->control.exit_info_1 & SVM_EXITINFO_REG_MASK;
-   cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
+   if (svm->vmcb->control.exit_code == SVM_EXIT_SEL_CR0_WRITE)
+   cr = SVM_EXIT_WRITE_CR0 - SVM_EXIT_READ_CR0;
+   else
+   cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
 
err = 0;
if (cr >= 16) { /* mov to cr */
@@ -3325,7 +3328,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm 
*svm) = {
[SVM_EXIT_READ_CR3] = cr_interception,
[SVM_EXIT_READ_CR4] = cr_interception,
[SVM_EXIT_READ_CR8] = cr_interception,
-   [SVM_EXIT_CR0_SEL_WRITE]= emulate_on_interception,
+   [SVM_EXIT_CR0_SEL_WRITE]= cr_interception,
[SVM_EXIT_WRITE_CR0]= cr_interception,
[SVM_EXIT_WRITE_CR3]= cr_interception,
[SVM_EXIT_WRITE_CR4]= cr_interception,



Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()

2015-03-03 Thread Joel Schopp

On 03/03/2015 10:44 AM, Radim Krčmář wrote:
> 2015-03-02 15:02-0600, Joel Schopp:
>> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
>> +{
>> +unsigned long val;
>> +int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>> +   port, &val, 1);
>> +
> Btw. does this return 1 in some scenario?
If a function returns a value it is always a good idea to check it and
act appropriately.  That said...
emulator_pio_in_emulated will return 1 if emulator_pio_in_out returns 1
or if vcpu->arch.pio.count != 0
emulator_pio_in_out returns 1 if kernel_pio returns 0
kernel_pio returns 0 if kvm_io_bus_read returns 0
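
The chain above can be modeled in a few lines.  This is a simplified,
hypothetical model (struct pio_model and the model_*() functions are invented
for illustration) and not the x86.c code; in particular, leaving count at 1 on
the pending path is my assumption about how the userspace round trip is
tracked:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified state for one port read. */
struct pio_model {
	bool handled_in_kernel;	/* would the in-kernel bus read succeed? */
	unsigned int count;	/* non-zero: a PIO request is pending */
};

/* Models kernel_pio(): 0 means an in-kernel device handled the access. */
static int model_kernel_pio(const struct pio_model *m)
{
	return m->handled_in_kernel ? 0 : -1;
}

/* Models emulator_pio_in_out(): 1 when handled in kernel, else 0. */
static int model_pio_in_out(struct pio_model *m)
{
	if (model_kernel_pio(m) == 0)
		return 1;
	m->count = 1;		/* assumption: request stays pending */
	return 0;		/* caller must exit to userspace */
}

/* Models emulator_pio_in_emulated(). */
static int model_pio_in_emulated(struct pio_model *m)
{
	if (m->count != 0) {
		/* Data from a completed userspace exit is available. */
		m->count = 0;
		return 1;
	}
	return model_pio_in_out(m);
}
```

So a return value of 1 covers two cases: an in-kernel device answered
immediately, or the call is a re-entry after userspace filled in the data.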




Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()

2015-03-03 Thread Joel Schopp
Thank you for your detailed review on several of my patches.

>>  
>> +static int complete_fast_pio(struct kvm_vcpu *vcpu)
> (complete_fast_pio_in()?)
If I do a v4 I'll adopt that name.
>> +{
>> +unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> Shouldn't we handle writes in EAX differently than in AX and AL, because
> of implicit zero extension.
I don't think the implicit zero extension hurts us here, but maybe there
is something I'm missing that I need to understand. Could you explain this
further?
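
For reference, the register write rules the question refers to can be sketched
as follows.  merge_in_result() is a hypothetical helper, and the sketch assumes
a 64-bit guest; it shows what the hardware would leave in RAX, which is not
necessarily what a byte-wise copy into the low end of the register produces:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the new RAX after an IN of `size` bytes,
 * following the x86-64 rules for partial register writes.
 */
static uint64_t merge_in_result(uint64_t old_rax, uint32_t data, int size)
{
	switch (size) {
	case 1:	/* AL: upper 56 bits are preserved */
		return (old_rax & ~UINT64_C(0xff)) | (data & 0xff);
	case 2:	/* AX: upper 48 bits are preserved */
		return (old_rax & ~UINT64_C(0xffff)) | (data & 0xffff);
	case 4:	/* EAX: upper 32 bits are cleared */
		return (uint64_t)data;
	default:
		assert(!"IN only comes in 1, 2 and 4 byte sizes");
		return old_rax;
	}
}
```

The 4-byte case is the one that differs from a plain copy of four bytes into
the low half of the old value, since such a copy would keep the old upper half.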
>
>> +
>> +BUG_ON(!vcpu->arch.pio.count);
>> +BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
> (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
>  sufficient.)
I prefer the checks that are there now after your last review,
especially since surrounded by BUG_ON they only run on debug kernels.

>
>> +
>> +memcpy(&new_rax, vcpu->arch.pio_data, sizeof(new_rax));
>> +trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
>> +  vcpu->arch.pio.count, vcpu->arch.pio_data);
>> +kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>> +vcpu->arch.pio.count = 0;
> I think it is better to call emulator_pio_in_emulated directly, like
>
>   emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
>   vcpu->arch.pio.port, &new_rax, 1);
>   kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>
> because we know that vcpu->arch.pio.count != 0.
I think the two extra lines of code in my patch versus your suggestion are
worth it to a) reduce execution path length, b) increase readability,
c) avoid breaking the abstraction by not checking the return code, and
d) avoid any future bugs introduced by changes to the function that would
make it return a value other than 1.
>
> Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
> (A better name is always welcome.)
The pointer chasing is making me dizzy.  I'm not sure why
emulator_pio_in_emulated takes an x86_emulate_ctxt when all it does is
immediately translate that to a vcpu and never use the x86_emulate_ctxt.
Why not pass the vcpu in the first place?



[PATCH v3] x86: svm: use kvm_fast_pio_in()

2015-03-02 Thread Joel Schopp
From: David Kaplan 

We can make the IN instruction faster the same way the OUT instruction
already is.

Changes from v2[Joel]:
* changed rax from u32 to unsigned long
* changed a couple return 0 to BUG_ON()
* changed 8 to sizeof(new_rax)
* added trace hook
* removed redundant clearing of count
Changes from v1[Joel]
* Added kvm_fast_pio_in() implementation that was left out of v1

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, addressed reviews,
tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/svm.c  |4 +++-
 arch/x86/kvm/x86.c  |   30 ++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a236e39..b976824 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -931,6 +931,7 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data 
*msr);
 struct x86_emulate_ctxt;
 
 int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port);
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
 int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..f8c906b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1899,7 +1899,7 @@ static int io_interception(struct vcpu_svm *svm)
++svm->vcpu.stat.io_exits;
string = (io_info & SVM_IOIO_STR_MASK) != 0;
in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
-   if (string || in)
+   if (string)
return emulate_instruction(vcpu, 0) == EMULATE_DONE;
 
port = io_info >> 16;
@@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
svm->next_rip = svm->vmcb->control.exit_info_2;
skip_emulated_instruction(&svm->vcpu);
 
+   if (in)
+   return kvm_fast_pio_in(vcpu, size, port);
return kvm_fast_pio_out(vcpu, size, port);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bd7a70b..d05efaf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5463,6 +5463,36 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, 
unsigned short port)
 }
 EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
 
+static int complete_fast_pio(struct kvm_vcpu *vcpu)
+{
+   unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
+
+   BUG_ON(!vcpu->arch.pio.count);
+   BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
+
+   memcpy(&new_rax, vcpu->arch.pio_data, sizeof(new_rax));
+   trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
+ vcpu->arch.pio.count, vcpu->arch.pio_data);
+   kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
+   vcpu->arch.pio.count = 0;
+   return 1;
+}
+
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
+{
+   unsigned long val;
+   int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
+  port, &val, 1);
+
+   if (ret)
+   kvm_register_write(vcpu, VCPU_REGS_RAX, val);
+   else
+   vcpu->arch.complete_userspace_io = complete_fast_pio;
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_fast_pio_in);
+
 static void tsc_bad(void *info)
 {
__this_cpu_write(cpu_tsc_khz, 0);
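
The routing that io_interception() performs after this patch can be sketched
in user space.  The bit positions below are my assumptions about the SVM IOIO
exit information layout, and enum io_path, struct io_decoded and
decode_io_exit() are hypothetical names for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the IOIO exit information; verify before relying on it. */
#define IOIO_TYPE_MASK	(1u << 0)	/* 1 = IN, 0 = OUT */
#define IOIO_STR_MASK	(1u << 2)	/* string instruction (INS/OUTS) */
#define IOIO_SIZE_SHIFT	4
#define IOIO_SIZE_MASK	(7u << IOIO_SIZE_SHIFT)	/* 1, 2 or 4 bytes */

enum io_path { IO_EMULATE, IO_FAST_IN, IO_FAST_OUT };	/* hypothetical */

struct io_decoded {
	enum io_path path;
	unsigned int size;
	uint16_t port;
};

static struct io_decoded decode_io_exit(uint32_t io_info)
{
	struct io_decoded d;

	d.port = (uint16_t)(io_info >> 16);
	d.size = (io_info & IOIO_SIZE_MASK) >> IOIO_SIZE_SHIFT;

	if (io_info & IOIO_STR_MASK)
		d.path = IO_EMULATE;	/* string I/O still uses the emulator */
	else if (io_info & IOIO_TYPE_MASK)
		d.path = IO_FAST_IN;	/* new with this patch */
	else
		d.path = IO_FAST_OUT;
	return d;
}
```

Before the patch, the IN case shared the IO_EMULATE path with string I/O; the
change only moves non-string IN onto its own fast path.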



Re: [PATCH v2] x86: svm: use kvm_fast_pio_in()

2015-03-02 Thread Joel Schopp

>> return emulate_instruction(vcpu, 0) == EMULATE_DONE;
>>
>> port = io_info >> 16;
>> @@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
>> svm->next_rip = svm->vmcb->control.exit_info_2;
>> skip_emulated_instruction(&svm->vcpu);
>>
>> +   if (in)
>> +   return kvm_fast_pio_in(vcpu, size, port);
>> return kvm_fast_pio_out(vcpu, size, port);
> (kvm_fast_pio() comes to mind.)
If you combined them you'd have to have an extra argument to say if it
was in or out. You'd then have to have code to parse that.  I prefer
this way.

>>  }
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index bd7a70b..089247c 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5463,6 +5463,39 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size,
>> unsigned short port)
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
>>
>> +static int complete_fast_pio(struct kvm_vcpu *vcpu)
>> +{
>> +   u32 new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> u64.
Good call.  I'll use unsigned long like kvm_fast_pio_out() uses.

>> +
>> +   if (!vcpu->arch.pio.count)
>> +   return 0;
>> +   if (vcpu->arch.pio.count * vcpu->arch.pio.size > 8)
>> +   return 0;
> sizeof(new_rax).  (safer and easier to understand)
>
> Both should never happen in KVM code, BUG_ON().
Agreed on both counts.

>> +
>> +   memcpy(&new_rax, vcpu->arch.pio_data,
>> +  vcpu->arch.pio.count * vcpu->arch.pio.size);
> Use emulator_pio_in_emulated() here, for code sharing.
> (We want to trace the read here too;  it could be better to split
>  the path from emulator_pio_in_emulated() first.)
I looked at pulling this out; it was painful.  I'll add the trace hook.

>> +   kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>> +
>> +   vcpu->arch.pio.count = 0;
>> +   return 1;
>> +}
>> +
>> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
>> +{
>> +   unsigned long val;
>> +   int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>> +  port, &val, 1);
>> +
>> +   if (ret) {
>> +   kvm_register_write(vcpu, VCPU_REGS_RAX, val);
>> +   vcpu->arch.pio.count = 0;
> (emulator_pio_in_emulated() sets count to zero if it returns true.)
Will remove the "= 0" line.


[PATCH v3 0/2] kvm: x86: kvm_emulate_*

2015-03-02 Thread Joel Schopp
Review comments from v1 that used kvm_emulate_wbinvd() pointed out that
kvm_emulate_* was inconsistent about skipping the instruction, while
kvm_emulate() always skips.  The first patch cleans up the existing use, while
the second patch adds use of the updated version of kvm_emulate_wbinvd() in svm.

Changes since v2:
* fixed email subject line on series short description
* renamed kvm_emulate_halt_noskip() to kvm_vcpu_halt()
* added header declaration for kvm_vcpu_halt()
* squashed blank line 
---

David Kaplan (1):
  x86: svm: make wbinvd faster

Joel Schopp (1):
  kvm: x86: make kvm_emulate_* consistant


 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/svm.c  |   10 +++---
 arch/x86/kvm/vmx.c  |9 +++--
 arch/x86/kvm/x86.c  |   23 ---
 4 files changed, 31 insertions(+), 12 deletions(-)
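
The convention the series settles on can be sketched in miniature: an
"emulate" entry point that always skips the instruction, wrapped around a core
helper that does not.  struct toy_vcpu and the toy_*() functions are
hypothetical and only model the shape of the change, not the KVM code:

```c
#include <assert.h>

/* Hypothetical miniature vcpu: rip stands in for the guest RIP. */
struct toy_vcpu {
	unsigned long rip;
	unsigned long insn_len;
	int halted;
};

static void toy_skip_instruction(struct toy_vcpu *v)
{
	v->rip += v->insn_len;
}

/* Core action only, for callers where RIP was already advanced. */
static int toy_vcpu_halt(struct toy_vcpu *v)
{
	v->halted = 1;
	return 0;
}

/* Intercept-path entry point: always skips, then does the work. */
static int toy_emulate_halt(struct toy_vcpu *v)
{
	toy_skip_instruction(v);
	return toy_vcpu_halt(v);
}
```

Exit handlers call the skipping variant and drop their own skip; paths that
come out of the instruction emulator, which has already moved RIP, call the
core helper directly.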



[PATCH v3 1/2] kvm: x86: make kvm_emulate_* consistant

2015-03-02 Thread Joel Schopp
Currently kvm_emulate() skips the instruction but kvm_emulate_* sometimes
don't.  The end result is that the caller ends up doing the skip itself.
Let's make them consistent.

Signed-off-by: Joel Schopp 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/svm.c  |2 --
 arch/x86/kvm/vmx.c  |9 +++--
 arch/x86/kvm/x86.c  |   23 ---
 4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a236e39..bf5a160 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -933,6 +933,7 @@ struct x86_emulate_ctxt;
 int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
+int kvm_vcpu_halt(struct kvm_vcpu *vcpu);
 int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
 
 void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..0c9e377 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1929,14 +1929,12 @@ static int nop_on_interception(struct vcpu_svm *svm)
 static int halt_interception(struct vcpu_svm *svm)
 {
svm->next_rip = kvm_rip_read(&svm->vcpu) + 1;
-   skip_emulated_instruction(&svm->vcpu);
return kvm_emulate_halt(&svm->vcpu);
 }
 
 static int vmmcall_interception(struct vcpu_svm *svm)
 {
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
-   skip_emulated_instruction(&svm->vcpu);
kvm_emulate_hypercall(&svm->vcpu);
return 1;
 }
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 14c1a18..eef7f53 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4995,7 +4995,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
if (emulate_instruction(vcpu, 0) == EMULATE_DONE) {
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
-   return kvm_emulate_halt(vcpu);
+   return kvm_vcpu_halt(vcpu);
}
return 1;
}
@@ -5522,13 +5522,11 @@ static int handle_interrupt_window(struct kvm_vcpu 
*vcpu)
 
 static int handle_halt(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
return kvm_emulate_halt(vcpu);
 }
 
 static int handle_vmcall(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
kvm_emulate_hypercall(vcpu);
return 1;
 }
@@ -5559,7 +5557,6 @@ static int handle_rdpmc(struct kvm_vcpu *vcpu)
 
 static int handle_wbinvd(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
kvm_emulate_wbinvd(vcpu);
return 1;
 }
@@ -5898,7 +5895,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu 
*vcpu)
 
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
-   ret = kvm_emulate_halt(vcpu);
+   ret = kvm_vcpu_halt(vcpu);
goto out;
}
 
@@ -9513,7 +9510,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool 
launch)
vmcs12->launch_state = 1;
 
if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT)
-   return kvm_emulate_halt(vcpu);
+   return kvm_vcpu_halt(vcpu);
 
vmx->nested.nested_run_pending = 1;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bd7a70b..6ff90f7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4706,7 +4706,7 @@ static void emulator_invlpg(struct x86_emulate_ctxt 
*ctxt, ulong address)
kvm_mmu_invlpg(emul_to_vcpu(ctxt), address);
 }
 
-int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
+int kvm_emulate_wbinvd_noskip(struct kvm_vcpu *vcpu)
 {
if (!need_emulate_wbinvd(vcpu))
return X86EMUL_CONTINUE;
@@ -4723,11 +4723,19 @@ int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
wbinvd();
return X86EMUL_CONTINUE;
 }
+
+int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
+{
+   kvm_x86_ops->skip_emulated_instruction(vcpu);
+   return kvm_emulate_wbinvd_noskip(vcpu);
+}
 EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd);
 
+
+
 static void emulator_wbinvd(struct x86_emulate_ctxt *ctxt)
 {
-   kvm_emulate_wbinvd(emul_to_vcpu(ctxt));
+   kvm_emulate_wbinvd_noskip(emul_to_vcpu(ctxt));
 }
 
 int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest)
@@ -5817,7 +5825,7 @@ void kvm_arch_exit(void)
free_percpu(shared_msrs);
 }
 
-int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
++vcpu->stat.halt_exits;
if (irqchip_in_kernel(vcpu->kvm)) {
@@ -5828,6 +5836,13 @@ int kvm_emulate_halt(struct kvm_vcpu

[PATCH v3 2/2] x86: svm: make wbinvd faster

2015-03-02 Thread Joel Schopp
From: David Kaplan 

No need to re-decode WBINVD since we know what it is from the intercept.

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, tested, style cleanup]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0c9e377..6fa4222 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2774,6 +2774,12 @@ static int skinit_interception(struct vcpu_svm *svm)
return 1;
 }
 
+static int wbinvd_interception(struct vcpu_svm *svm)
+{
+   kvm_emulate_wbinvd(&svm->vcpu);
+   return 1;
+}
+
 static int xsetbv_interception(struct vcpu_svm *svm)
 {
u64 new_bv = kvm_read_edx_eax(&svm->vcpu);
@@ -3374,7 +3380,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm 
*svm) = {
[SVM_EXIT_STGI] = stgi_interception,
[SVM_EXIT_CLGI] = clgi_interception,
[SVM_EXIT_SKINIT]   = skinit_interception,
-   [SVM_EXIT_WBINVD]   = emulate_on_interception,
+   [SVM_EXIT_WBINVD]   = wbinvd_interception,
[SVM_EXIT_MONITOR]  = monitor_interception,
[SVM_EXIT_MWAIT]= mwait_interception,
[SVM_EXIT_XSETBV]   = xsetbv_interception,
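
The idea behind the table change is that the exit code already identifies the
instruction, so a dedicated handler can act without a fetch-and-decode pass.
A minimal sketch with hypothetical names (enum toy_exit, struct toy_cpu and the
toy_*() handlers are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

enum toy_exit { TOY_EXIT_WBINVD, TOY_EXIT_OTHER, TOY_EXIT_MAX };

struct toy_cpu {
	int decoded;	/* times the generic emulator had to decode */
	int flushed;	/* times the wbinvd action ran */
};

/* Generic path: models fetching and decoding the instruction first. */
static int toy_emulate_on_interception(struct toy_cpu *c)
{
	c->decoded++;
	return 1;
}

/* Dedicated path: the exit code already says which instruction it was. */
static int toy_wbinvd_interception(struct toy_cpu *c)
{
	c->flushed++;
	return 1;
}

/* Dispatch table indexed by exit code, using designated initializers. */
static int (*const toy_exit_handlers[TOY_EXIT_MAX])(struct toy_cpu *) = {
	[TOY_EXIT_WBINVD] = toy_wbinvd_interception,
	[TOY_EXIT_OTHER]  = toy_emulate_on_interception,
};

static int toy_handle_exit(struct toy_cpu *c, enum toy_exit code)
{
	if ((size_t)code >= TOY_EXIT_MAX || !toy_exit_handlers[code])
		return -1;	/* no handler registered */
	return toy_exit_handlers[code](c);
}
```

Swapping one table entry is the whole change at the dispatch level; the saving
comes from never entering the decode path for that exit.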



Re: [PATCH v2 1/2] kvm: x86: make kvm_emulate_* consistant

2015-03-02 Thread Joel Schopp



>> ---
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> @@ -4995,7 +4995,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
>> if (emulate_instruction(vcpu, 0) == EMULATE_DONE) {
>> if (vcpu->arch.halt_request) {
>> vcpu->arch.halt_request = 0;
>> -   return kvm_emulate_halt(vcpu);
>> +   return kvm_emulate_halt_noskip(vcpu);
> noskip is used without being declared ... it shouldn't compile.
I tested on AMD hardware.  I thought I had turned on the Intel KVM module
as well, but it turns out I hadn't.  Will fix in v3.

> *_noskip makes the usual case harder to understand: we just want to halt
> the vcpu, so name it more directly ... like kvm_vcpu_halt()?
I like that better.  Will make the change in v3.



[PATCH v2 2/2] x86: svm: make wbinvd faster

2015-03-02 Thread Joel Schopp
From: David Kaplan 
No need to re-decode WBINVD since we know what it is from the intercept.

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0c9e377..794bca7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2774,6 +2774,13 @@ static int skinit_interception(struct vcpu_svm *svm)
return 1;
 }
 
+static int wbinvd_interception(struct vcpu_svm *svm)
+{
+   kvm_emulate_wbinvd(&svm->vcpu);
+   return 1;
+}
+
+
 static int xsetbv_interception(struct vcpu_svm *svm)
 {
u64 new_bv = kvm_read_edx_eax(&svm->vcpu);
@@ -3374,7 +3381,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm 
*svm) = {
[SVM_EXIT_STGI] = stgi_interception,
[SVM_EXIT_CLGI] = clgi_interception,
[SVM_EXIT_SKINIT]   = skinit_interception,
-   [SVM_EXIT_WBINVD]   = emulate_on_interception,
+   [SVM_EXIT_WBINVD]   = wbinvd_interception,
[SVM_EXIT_MONITOR]  = monitor_interception,
[SVM_EXIT_MWAIT]= mwait_interception,
[SVM_EXIT_XSETBV]   = xsetbv_interception,



[PATCH v2 0/2] Series short description

2015-03-02 Thread Joel Schopp
Review comments from v1 that used kvm_emulate_wbinvd() pointed out that
kvm_emulate_* was inconsistent about skipping the instruction, while
kvm_emulate() always skips.  The first patch cleans up the existing use, while
the second patch adds use of the updated version of kvm_emulate_wbinvd() in svm.

---

Joel Schopp (2):
  kvm: x86: make kvm_emulate_* consistant
  x86: svm: make wbinvd faster


 arch/x86/kvm/svm.c |   11 ---
 arch/x86/kvm/vmx.c |9 +++--
 arch/x86/kvm/x86.c |   23 ---
 3 files changed, 31 insertions(+), 12 deletions(-)



[PATCH v2 1/2] kvm: x86: make kvm_emulate_* consistant

2015-03-02 Thread Joel Schopp
Currently kvm_emulate() skips the instruction but kvm_emulate_* sometimes
don't.  The end result is that the caller ends up doing the skip itself.
Let's make them consistent.

Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |2 --
 arch/x86/kvm/vmx.c |9 +++--
 arch/x86/kvm/x86.c |   23 ---
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..0c9e377 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1929,14 +1929,12 @@ static int nop_on_interception(struct vcpu_svm *svm)
 static int halt_interception(struct vcpu_svm *svm)
 {
svm->next_rip = kvm_rip_read(&svm->vcpu) + 1;
-   skip_emulated_instruction(&svm->vcpu);
return kvm_emulate_halt(&svm->vcpu);
 }
 
 static int vmmcall_interception(struct vcpu_svm *svm)
 {
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
-   skip_emulated_instruction(&svm->vcpu);
kvm_emulate_hypercall(&svm->vcpu);
return 1;
 }
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 14c1a18..b7dcd3c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4995,7 +4995,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
if (emulate_instruction(vcpu, 0) == EMULATE_DONE) {
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
-   return kvm_emulate_halt(vcpu);
+   return kvm_emulate_halt_noskip(vcpu);
}
return 1;
}
@@ -5522,13 +5522,11 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu)
 
 static int handle_halt(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
return kvm_emulate_halt(vcpu);
 }
 
 static int handle_vmcall(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
kvm_emulate_hypercall(vcpu);
return 1;
 }
@@ -5559,7 +5557,6 @@ static int handle_rdpmc(struct kvm_vcpu *vcpu)
 
 static int handle_wbinvd(struct kvm_vcpu *vcpu)
 {
-   skip_emulated_instruction(vcpu);
kvm_emulate_wbinvd(vcpu);
return 1;
 }
@@ -5898,7 +5895,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
 
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
-   ret = kvm_emulate_halt(vcpu);
+   ret = kvm_emulate_halt_noskip(vcpu);
goto out;
}
 
@@ -9513,7 +9510,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
vmcs12->launch_state = 1;
 
if (vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT)
-   return kvm_emulate_halt(vcpu);
+   return kvm_emulate_halt_noskip(vcpu);
 
vmx->nested.nested_run_pending = 1;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bd7a70b..96a8333 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4706,7 +4706,7 @@ static void emulator_invlpg(struct x86_emulate_ctxt *ctxt, ulong address)
kvm_mmu_invlpg(emul_to_vcpu(ctxt), address);
 }
 
-int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
+int kvm_emulate_wbinvd_noskip(struct kvm_vcpu *vcpu)
 {
if (!need_emulate_wbinvd(vcpu))
return X86EMUL_CONTINUE;
@@ -4723,11 +4723,19 @@ int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
wbinvd();
return X86EMUL_CONTINUE;
 }
+
+int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
+{
+   kvm_x86_ops->skip_emulated_instruction(vcpu);
+   return kvm_emulate_wbinvd_noskip(vcpu);
+}
 EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd);
 
+
+
 static void emulator_wbinvd(struct x86_emulate_ctxt *ctxt)
 {
-   kvm_emulate_wbinvd(emul_to_vcpu(ctxt));
+   kvm_emulate_wbinvd_noskip(emul_to_vcpu(ctxt));
 }
 
 int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest)
@@ -5817,7 +5825,7 @@ void kvm_arch_exit(void)
free_percpu(shared_msrs);
 }
 
-int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+int kvm_emulate_halt_noskip(struct kvm_vcpu *vcpu)
 {
++vcpu->stat.halt_exits;
if (irqchip_in_kernel(vcpu->kvm)) {
@@ -5828,6 +5836,13 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
return 0;
}
 }
+EXPORT_SYMBOL_GPL(kvm_emulate_halt_noskip);
+
+int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+{
+   kvm_x86_ops->skip_emulated_instruction(vcpu);
+   return kvm_emulate_halt_noskip(vcpu);
+}
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
 int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
@@ -5912,6 +5927,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
unsigned long nr, a0, a1, a2, a3, ret;
int op_64_bit, r = 1;
 
+   kvm_x86_ops->skip_emulated_instruction(vcpu);
+
if (kvm_hv_hypercall_enabled(vcpu->kvm))
return kvm_hv_

Re: [PATCH] x86: svm: make wbinvd faster

2015-03-02 Thread Joel Schopp


On 03/02/2015 10:03 AM, Radim Krčmář wrote:

2015-03-02 10:25-0500, Bandan Das:

Radim Krčmář  writes:

2015-03-01 21:29-0500, Bandan Das:

Joel Schopp  writes:

+static int wbinvd_interception(struct vcpu_svm *svm)
+{
+   kvm_emulate_wbinvd(&svm->vcpu);
+   skip_emulated_instruction(&svm->vcpu);
+   return 1;
+}

Can't we merge this to kvm_emulate_wbinvd, and just call that function
directly for both vmx and svm ?

kvm_emulate_wbinvd() lives in x86.c and skip_emulated_instruction() is
from svm.c/vmx.c:  so we'd have to create a new x86 op and change the
emulator code as well ... it's probably better like this.

There's already one - kvm_x86_ops->skip_emulated_instruction

My bad, its usage is inconsistent and I only looked at two close
interceptions where it was used ... kvm_emulate_cpuid() calls
kvm_x86_ops->skip_emulated_instruction(), while kvm_emulate_halt() and
kvm_emulate_hypercall() need an external skip.

We do "skip" the instruction with kvm_emulate(), so automatically
skipping the instruction on kvm_emulate_*() makes sense:
  1. rename kvm_emulate_halt() and kvm_emulate_wbinvd() to accommodate
 callers that don't want to skip
  2. introduce kvm_emulate_{halt,wbinvd}() and move the skip to
 kvm_emulate_{halt,wbinvd,hypercall}()

The alternative is to remove kvm_x86_ops->skip_emulated_instruction():
  1. remove skip from kvm_emulate_cpuid() and modify callers
  2. move kvm_complete_insn_gp to a header file and use
 skip_emulated_instruction directly
  3. remove unused kvm_x86_ops->skip_emulated_instruction()

Which one do you prefer?
I prefer renaming them, i.e. kvm_emulate_wbinvd_noskip(), and making the
existing ones, i.e. kvm_emulate_wbinvd(), call the noskip version and add a
skip similar to how wbinvd_interception above does.  I can send out a
patch later today with that rework.
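
For illustration only, the shape of that rework can be sketched in plain
userspace C.  Everything prefixed toy_ is a hypothetical stand-in, not the
kernel's types or functions; the sketch only shows a "noskip" worker plus a
wrapper that advances the instruction pointer first.

```c
#include <assert.h>

/* Hypothetical stand-in for struct kvm_vcpu; not the kernel type. */
struct toy_vcpu {
	unsigned long rip;       /* guest instruction pointer */
	unsigned int insn_len;   /* length of the intercepted instruction */
	int wbinvd_count;        /* how many times the flush was "emulated" */
};

/* Stand-in for kvm_x86_ops->skip_emulated_instruction(). */
static void toy_skip_emulated_instruction(struct toy_vcpu *vcpu)
{
	vcpu->rip += vcpu->insn_len;
}

/* Does the work only; for callers that advance RIP themselves. */
static int toy_emulate_wbinvd_noskip(struct toy_vcpu *vcpu)
{
	vcpu->wbinvd_count++;
	return 0;
}

/* Skips the instruction, then does the work; for intercept handlers. */
static int toy_emulate_wbinvd(struct toy_vcpu *vcpu)
{
	toy_skip_emulated_instruction(vcpu);
	return toy_emulate_wbinvd_noskip(vcpu);
}
```

With this split an intercept handler calls toy_emulate_wbinvd() and nothing
else, while an emulator path that has already accounted for the instruction
length calls the _noskip variant.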



[PATCH v2] x86: svm: use kvm_fast_pio_in()

2015-03-02 Thread Joel Schopp
From: David Kaplan 

We can make the IN instruction faster the same way the OUT instruction
already is.

Changes from v1
* Added kvm_fast_pio_in() implementation that was left out of v1
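
As a rough sketch of what completing a non-string IN involves, the data read
from the port has to be merged into the low bytes of the guest's EAX.  The
helper below is hypothetical (merge_pio_data is not a kernel function) and
the explicit size check is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Merge `size` bytes of port data into the low bytes of the old EAX value.
 * data[0] is the least significant byte, as on x86.  Sizes other than
 * 1, 2 or 4 are rejected and the old value is returned unchanged.
 */
static uint32_t merge_pio_data(uint32_t old_eax, const uint8_t *data,
			       size_t size)
{
	uint32_t eax = old_eax;
	size_t i;

	if (size != 1 && size != 2 && size != 4)
		return old_eax;

	for (i = 0; i < size; i++) {
		eax &= ~((uint32_t)0xff << (8 * i));   /* clear byte i */
		eax |= (uint32_t)data[i] << (8 * i);   /* insert new byte i */
	}
	return eax;
}
```

A one-byte IN therefore leaves the upper three bytes of EAX untouched, and a
four-byte IN replaces the whole value.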

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/svm.c  |4 +++-
 arch/x86/kvm/x86.c  |   33 +
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a236e39..b976824 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -931,6 +931,7 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 struct x86_emulate_ctxt;
 
 int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port);
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
 int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..f8c906b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1899,7 +1899,7 @@ static int io_interception(struct vcpu_svm *svm)
++svm->vcpu.stat.io_exits;
string = (io_info & SVM_IOIO_STR_MASK) != 0;
in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
-   if (string || in)
+   if (string)
return emulate_instruction(vcpu, 0) == EMULATE_DONE;
 
port = io_info >> 16;
@@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
svm->next_rip = svm->vmcb->control.exit_info_2;
skip_emulated_instruction(&svm->vcpu);
 
+   if (in)
+   return kvm_fast_pio_in(vcpu, size, port);
return kvm_fast_pio_out(vcpu, size, port);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bd7a70b..089247c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5463,6 +5463,39 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port)
 }
 EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
 
+static int complete_fast_pio(struct kvm_vcpu *vcpu)
+{
+   u32 new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
+
+   if (!vcpu->arch.pio.count)
+   return 0;
+   if (vcpu->arch.pio.count * vcpu->arch.pio.size > 8)
+   return 0;
+
+   memcpy(&new_rax, vcpu->arch.pio_data,
+  vcpu->arch.pio.count * vcpu->arch.pio.size);
+   kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
+
+   vcpu->arch.pio.count = 0;
+   return 1;
+}
+
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
+{
+   unsigned long val;
+   int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
+  port, &val, 1);
+
+   if (ret) {
+   kvm_register_write(vcpu, VCPU_REGS_RAX, val);
+   vcpu->arch.pio.count = 0;
+   } else
+   vcpu->arch.complete_userspace_io = complete_fast_pio;
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_fast_pio_in);
+
 static void tsc_bad(void *info)
 {
__this_cpu_write(cpu_tsc_khz, 0);



Re: [PATCH] x86: svm: use kvm_fast_pio_in()

2015-03-02 Thread Joel Schopp



+   if (in)
+   return kvm_fast_pio_in(vcpu, size, port);

Have I missed a patch that defined kvm_fast_pio_in()?
Not sure how I managed to leave out the bulk of the patch. Resending v2 
momentarily.



[PATCH] x86: svm: make wbinvd faster

2015-02-27 Thread Joel Schopp
From: David Kaplan 
No need to re-decode WBINVD since we know what it is from the intercept.
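
As a rough userspace illustration of the idea (all toy_ names are
hypothetical, not kernel symbols): the exit code already identifies the
instruction, so the handler table can point at a dedicated handler instead
of a generic one that has to decode the instruction again.

```c
#include <assert.h>
#include <stddef.h>

enum toy_exit { TOY_EXIT_OTHER, TOY_EXIT_WBINVD, TOY_EXIT_MAX };

struct toy_vcpu {
	int decoded;   /* times the generic decode-and-emulate path ran */
	int flushed;   /* times the dedicated WBINVD handler ran */
};

/* Generic path: would fetch and decode the guest instruction. */
static int toy_emulate_on_interception(struct toy_vcpu *v)
{
	v->decoded++;
	return 1;
}

/* Dedicated path: the exit code already says which instruction it was. */
static int toy_wbinvd_interception(struct toy_vcpu *v)
{
	v->flushed++;
	return 1;
}

static int (*const toy_exit_handlers[TOY_EXIT_MAX])(struct toy_vcpu *) = {
	[TOY_EXIT_OTHER]  = toy_emulate_on_interception,
	[TOY_EXIT_WBINVD] = toy_wbinvd_interception,
};

/* Dispatch on the exit code; returns -1 for codes without a handler. */
static int toy_handle_exit(struct toy_vcpu *v, enum toy_exit code)
{
	if ((unsigned int)code >= TOY_EXIT_MAX || !toy_exit_handlers[code])
		return -1;
	return toy_exit_handlers[code](v);
}
```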

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..86ecd21 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2776,6 +2776,14 @@ static int skinit_interception(struct vcpu_svm *svm)
return 1;
 }
 
+static int wbinvd_interception(struct vcpu_svm *svm)
+{
+   kvm_emulate_wbinvd(&svm->vcpu);
+   skip_emulated_instruction(&svm->vcpu);
+   return 1;
+}
+
+
 static int xsetbv_interception(struct vcpu_svm *svm)
 {
u64 new_bv = kvm_read_edx_eax(&svm->vcpu);
@@ -3376,7 +3384,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_STGI] = stgi_interception,
[SVM_EXIT_CLGI] = clgi_interception,
[SVM_EXIT_SKINIT]   = skinit_interception,
-   [SVM_EXIT_WBINVD]   = emulate_on_interception,
+   [SVM_EXIT_WBINVD]   = wbinvd_interception,
[SVM_EXIT_MONITOR]  = monitor_interception,
[SVM_EXIT_MWAIT]= mwait_interception,
[SVM_EXIT_XSETBV]   = xsetbv_interception,



[PATCH] x86: svm: use kvm_fast_pio_in()

2015-02-27 Thread Joel Schopp
From: David Kaplan 

We can make the IN instruction faster the same way the OUT instruction
already is.

Signed-off-by: David Kaplan 
[extracted from larger unrelated patch, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..f8c906b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1899,7 +1899,7 @@ static int io_interception(struct vcpu_svm *svm)
++svm->vcpu.stat.io_exits;
string = (io_info & SVM_IOIO_STR_MASK) != 0;
in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
-   if (string || in)
+   if (string)
return emulate_instruction(vcpu, 0) == EMULATE_DONE;
 
port = io_info >> 16;
@@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
svm->next_rip = svm->vmcb->control.exit_info_2;
skip_emulated_instruction(&svm->vcpu);
 
+   if (in)
+   return kvm_fast_pio_in(vcpu, size, port);
return kvm_fast_pio_out(vcpu, size, port);
 }
 



Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-25 Thread Joel Schopp

On 02/25/2015 02:26 PM, Radim Krčmář wrote:
> 2015-02-24 15:25-0600, Joel Schopp:
>>>> -  clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>>>>} else {
>>>>set_cr_intercept(svm, INTERCEPT_CR0_READ);
>>> (There is no point in checking fpu_active if cr0s are equal.)
>>>
>>>> -  set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>>> KVM uses lazy FPU and the state is undefined before the first access.
>>> We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
>>> if we allow the guest to clear cr0.ts without exiting, it can access FPU
>>> with undefined state.
>> Thanks for the valuable feedback.  It's apparent I hadn't thought
>> through the interaction with lazy FPU and will need to go back and
>> rethink my approach here.
> I don't think we can gain much without sacrificing some laziness, like:
> when a guest with lazy FPU clears CR0.TS, it is going to use that FPU,
> so we could pre-load FPU in this case and drop the write intercept too;
> guests that unconditionally clear CR0.TS would perform worse though.
>
> It's going to take a lot of time, but two hunks in your patch, that made
> selective intercept benefit from decode assists, look useful even now.
>
> Would you post them separately?
I can re-post those separately.  On their own they are less useful, though
probably still worth doing, because SVM_EXIT_WRITE_CR0 takes precedence
over SVM_EXIT_CR0_SEL_WRITE.
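
To make the precedence point concrete, here is a small userspace model.  The
toy_ names are hypothetical; the model assumes the selective intercept fires
only when a bit other than CR0.TS or CR0.MP changes, and that the
unconditional write intercept wins whenever it is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CR0_MP (1u << 1)   /* CR0.MP is bit 1 */
#define TOY_CR0_TS (1u << 3)   /* CR0.TS is bit 3 */

enum toy_exit { TOY_NO_EXIT, TOY_EXIT_WRITE_CR0, TOY_EXIT_CR0_SEL_WRITE };

/* Which exit, if any, a guest write to CR0 produces in this model. */
static enum toy_exit toy_cr0_write_exit(bool intercept_cr0_write,
					bool intercept_selective_cr0,
					uint32_t old_cr0, uint32_t new_cr0)
{
	/* The unconditional intercept catches every write, TS/MP included. */
	if (intercept_cr0_write)
		return TOY_EXIT_WRITE_CR0;

	/* The selective intercept ignores changes confined to TS and MP. */
	if (intercept_selective_cr0 &&
	    ((old_cr0 ^ new_cr0) & ~(TOY_CR0_TS | TOY_CR0_MP)))
		return TOY_EXIT_CR0_SEL_WRITE;

	return TOY_NO_EXIT;
}
```

In this model the selective exit is only ever seen once the unconditional
write intercept is cleared, which is why the handler changes matter little
while INTERCEPT_CR0_WRITE stays set.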


Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-24 Thread Joel Schopp

>> -clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>>  } else {
>>  set_cr_intercept(svm, INTERCEPT_CR0_READ);
> (There is no point in checking fpu_active if cr0s are equal.)
>
>> -set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
> KVM uses lazy FPU and the state is undefined before the first access.
> We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
> if we allow the guest to clear cr0.ts without exiting, it can access FPU
> with undefined state.
Thanks for the valuable feedback.  It's apparent I hadn't thought
through the interaction with lazy FPU and will need to go back and
rethink my approach here.


[PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-20 Thread Joel Schopp
From: David Kaplan 

Reduce the number of exits by avoiding exiting when the guest writes TS or MP
bits of CR0.  INTERCEPT_CR0_WRITE intercepts all writes to CR0 including TS and
MP bits. It intercepts these even if INTERCEPT_SELECTIVE_CR0 is set.  What we
should be doing is setting INTERCEPT_SELECTIVE_CR0 and not setting
INTERCEPT_CR0_WRITE.

Signed-off-by: David Kaplan 
[added remove of clr_cr_intercept in init_vmcb, fixed check in handle_exit,
added emulation on interception back in, forward ported, tested]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..55822e5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1093,7 +1093,6 @@ static void init_vmcb(struct vcpu_svm *svm)
set_cr_intercept(svm, INTERCEPT_CR0_READ);
set_cr_intercept(svm, INTERCEPT_CR3_READ);
set_cr_intercept(svm, INTERCEPT_CR4_READ);
-   set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
set_cr_intercept(svm, INTERCEPT_CR3_WRITE);
set_cr_intercept(svm, INTERCEPT_CR4_WRITE);
set_cr_intercept(svm, INTERCEPT_CR8_WRITE);
@@ -1539,10 +1538,8 @@ static void update_cr0_intercept(struct vcpu_svm *svm)
 
if (gcr0 == *hcr0 && svm->vcpu.fpu_active) {
clr_cr_intercept(svm, INTERCEPT_CR0_READ);
-   clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
} else {
set_cr_intercept(svm, INTERCEPT_CR0_READ);
-   set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
}
 }
 
@@ -2940,7 +2937,11 @@ static int cr_interception(struct vcpu_svm *svm)
return emulate_on_interception(svm);
 
reg = svm->vmcb->control.exit_info_1 & SVM_EXITINFO_REG_MASK;
-   cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
+
+   if (svm->vmcb->control.exit_code == SVM_EXIT_CR0_SEL_WRITE)
+  cr = 16;
+   else
+  cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
 
err = 0;
if (cr >= 16) { /* mov to cr */
@@ -3325,7 +3326,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_READ_CR3] = cr_interception,
[SVM_EXIT_READ_CR4] = cr_interception,
[SVM_EXIT_READ_CR8] = cr_interception,
-   [SVM_EXIT_CR0_SEL_WRITE]= emulate_on_interception,
+   [SVM_EXIT_CR0_SEL_WRITE]= cr_interception,
[SVM_EXIT_WRITE_CR0]= cr_interception,
[SVM_EXIT_WRITE_CR3]= cr_interception,
[SVM_EXIT_WRITE_CR4]= cr_interception,
@@ -3502,7 +3503,7 @@ static int handle_exit(struct kvm_vcpu *vcpu)
struct kvm_run *kvm_run = vcpu->run;
u32 exit_code = svm->vmcb->control.exit_code;
 
-   if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE))
+   if (!is_cr_intercept(svm, INTERCEPT_SELECTIVE_CR0))
vcpu->arch.cr0 = svm->vmcb->save.cr0;
if (npt_enabled)
vcpu->arch.cr3 = svm->vmcb->save.cr3;



[PATCH v2] x86: svm: use kvm_register_write()/read()

2015-02-20 Thread Joel Schopp
From: David Kaplan 

KVM has nice wrappers to access the register values; clean up a few places
that should use them but currently do not.
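
For reference, the EDX:EAX arithmetic that RDMSR and WRMSR rely on can be
sketched in standalone C.  The toy_ helpers are hypothetical and only mirror
the arithmetic; they are not the KVM wrappers themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the low 32 bits of RAX and RDX into one 64-bit value (EDX:EAX). */
static uint64_t toy_read_edx_eax(uint64_t rax, uint64_t rdx)
{
	return (rax & 0xffffffffull) | ((rdx & 0xffffffffull) << 32);
}

/* Split a 64-bit MSR value into the halves returned in EAX and EDX. */
static void toy_split_msr_value(uint64_t data, uint32_t *eax, uint32_t *edx)
{
	*eax = (uint32_t)(data & 0xffffffffull);
	*edx = (uint32_t)(data >> 32);
}
```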

Signed-off-by: David Kaplan 
[forward port and testing]
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |   19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..a7d88e4 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2757,11 +2757,11 @@ static int invlpga_interception(struct vcpu_svm *svm)
 {
struct kvm_vcpu *vcpu = &svm->vcpu;
 
-   trace_kvm_invlpga(svm->vmcb->save.rip, vcpu->arch.regs[VCPU_REGS_RCX],
- vcpu->arch.regs[VCPU_REGS_RAX]);
+   trace_kvm_invlpga(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RCX),
+ kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
/* Let's treat INVLPGA the same as INVLPG (can be optimized!) */
-   kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
+   kvm_mmu_invlpg(vcpu, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
skip_emulated_instruction(&svm->vcpu);
@@ -2770,7 +2770,7 @@ static int invlpga_interception(struct vcpu_svm *svm)
 
 static int skinit_interception(struct vcpu_svm *svm)
 {
-   trace_kvm_skinit(svm->vmcb->save.rip, svm->vcpu.arch.regs[VCPU_REGS_RAX]);
+   trace_kvm_skinit(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
kvm_queue_exception(&svm->vcpu, UD_VECTOR);
return 1;
@@ -3133,7 +3133,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data)
 
 static int rdmsr_interception(struct vcpu_svm *svm)
 {
-   u32 ecx = svm->vcpu.arch.regs[VCPU_REGS_RCX];
+   u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
u64 data;
 
if (svm_get_msr(&svm->vcpu, ecx, &data)) {
@@ -3142,8 +3142,8 @@ static int rdmsr_interception(struct vcpu_svm *svm)
} else {
trace_kvm_msr_read(ecx, data);
 
-   svm->vcpu.arch.regs[VCPU_REGS_RAX] = data & 0xffffffff;
-   svm->vcpu.arch.regs[VCPU_REGS_RDX] = data >> 32;
+   kvm_register_write(&svm->vcpu, VCPU_REGS_RAX, data & 0xffffffff);
+   kvm_register_write(&svm->vcpu, VCPU_REGS_RDX, data >> 32);
svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
skip_emulated_instruction(&svm->vcpu);
}
@@ -3246,9 +3246,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 static int wrmsr_interception(struct vcpu_svm *svm)
 {
struct msr_data msr;
-   u32 ecx = svm->vcpu.arch.regs[VCPU_REGS_RCX];
-   u64 data = (svm->vcpu.arch.regs[VCPU_REGS_RAX] & -1u)
-   | ((u64)(svm->vcpu.arch.regs[VCPU_REGS_RDX] & -1u) << 32);
+   u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
+   u64 data = kvm_read_edx_eax(&svm->vcpu);
 
msr.data = data;
msr.index = ecx;



Re: [PATCH] x86: svm: use kvm_register_write()/read()

2015-02-20 Thread Joel Schopp

On 02/20/2015 02:54 PM, Borislav Petkov wrote:
> On Fri, Feb 20, 2015 at 12:39:40PM -0600, Joel Schopp wrote:
>> KVM has nice wrappers to access the register values, clean up a few places
>> that should use them but currently do not.
>>
>> Signed-off-by:David Kaplan 
>> Signed-off-by:Joel Schopp 
> This SOB chain looks strange. If David is the author, you want to have
> him in From: at the beginning of the patch.
>
> Stuff you did ontop should be in []-braces before your SOB, like this:
Will resend with From: line and braces for clarification.

>
> Signed-off-by: David Kaplan 
> [ Did this and that to patch. ]
> Signed-off-by: Joel Schopp 
>



[PATCH] x86: svm: use kvm_register_write()/read()

2015-02-20 Thread Joel Schopp
KVM has nice wrappers to access the register values; clean up a few places
that should use them but currently do not.

Signed-off-by: David Kaplan 
Signed-off-by: Joel Schopp 
---
 arch/x86/kvm/svm.c |   19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..a7d88e4 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2757,11 +2757,11 @@ static int invlpga_interception(struct vcpu_svm *svm)
 {
struct kvm_vcpu *vcpu = &svm->vcpu;
 
-   trace_kvm_invlpga(svm->vmcb->save.rip, vcpu->arch.regs[VCPU_REGS_RCX],
- vcpu->arch.regs[VCPU_REGS_RAX]);
+   trace_kvm_invlpga(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RCX),
+ kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
/* Let's treat INVLPGA the same as INVLPG (can be optimized!) */
-   kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
+   kvm_mmu_invlpg(vcpu, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
skip_emulated_instruction(&svm->vcpu);
@@ -2770,7 +2770,7 @@ static int invlpga_interception(struct vcpu_svm *svm)
 
 static int skinit_interception(struct vcpu_svm *svm)
 {
-   trace_kvm_skinit(svm->vmcb->save.rip, svm->vcpu.arch.regs[VCPU_REGS_RAX]);
+   trace_kvm_skinit(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
kvm_queue_exception(&svm->vcpu, UD_VECTOR);
return 1;
@@ -3133,7 +3133,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data)
 
 static int rdmsr_interception(struct vcpu_svm *svm)
 {
-   u32 ecx = svm->vcpu.arch.regs[VCPU_REGS_RCX];
+   u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
u64 data;
 
if (svm_get_msr(&svm->vcpu, ecx, &data)) {
@@ -3142,8 +3142,8 @@ static int rdmsr_interception(struct vcpu_svm *svm)
} else {
trace_kvm_msr_read(ecx, data);
 
-   svm->vcpu.arch.regs[VCPU_REGS_RAX] = data & 0xffffffff;
-   svm->vcpu.arch.regs[VCPU_REGS_RDX] = data >> 32;
+   kvm_register_write(&svm->vcpu, VCPU_REGS_RAX, data & 0xffffffff);
+   kvm_register_write(&svm->vcpu, VCPU_REGS_RDX, data >> 32);
svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
skip_emulated_instruction(&svm->vcpu);
}
@@ -3246,9 +3246,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 static int wrmsr_interception(struct vcpu_svm *svm)
 {
struct msr_data msr;
-   u32 ecx = svm->vcpu.arch.regs[VCPU_REGS_RCX];
-   u64 data = (svm->vcpu.arch.regs[VCPU_REGS_RAX] & -1u)
-   | ((u64)(svm->vcpu.arch.regs[VCPU_REGS_RDX] & -1u) << 32);
+   u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
+   u64 data = kvm_read_edx_eax(&svm->vcpu);
 
msr.data = data;
msr.index = ecx;



Re: [Qemu-devel] [RFC PATCH] virtio-mmio: support for multiple irqs

2014-11-05 Thread Joel Schopp

On 11/05/2014 03:12 AM, Shannon Zhao wrote:
> Hi Rémy,
>
> On 2014/11/5 16:26, GAUGUEY Rémy 228890 wrote:
>> Hi Shannon, 
>>
>>> Type of backend          bandwidth (GBytes/sec)
>>> virtio-net               0.66
>>> vhost-net                1.49
>>> vhost-net with irqfd     2.01
>>>
>>> Test cmd: ./iperf -c 192.168.0.2 -P 1 -i 10 -p 5001 -f G -t 60
>> Impressive results !
>> Could you please detail your setup ? which platform are you using and which 
>> GbE controller ?
> Sorry for not telling the test scenario. This test scenario is from Host to
> Guest. It just compare the performance of different backends. I did this test
> on ARM64 platform.
>
> The setup was based on:
> 1)on host kvm-arm should support ioeventfd and irqfd
>   The irqfd patch is from Eric "ARM: KVM: add irqfd support".
>   http://www.spinics.net/lists/kvm-arm/msg11014.html
>
>   The ioeventfd patch is reworked by me from Antonios.
>   http://www.spinics.net/lists/kvm-arm/msg08413.html
>
> 2)qemu should enable ioeventfd support for virtio-mmio
>   This patch is refer to Ying-Shiuan Pan and reworked for new qemu branch.
>   https://lists.gnu.org/archive/html/qemu-devel/2014-11/msg00594.html
>
> 3)qemu should enable multiple irqs for virtio-mmio
>   This patch isn't sent to qemu maillist as we want to check whether this 
> is the right direction.
>   If you want to test, I'll send it to you.
I'm not a maintainer so my opinion isn't worth a lot here, but this
seems like the right direction to me.  I'd like to see the qemu patch
(do mention the dependency on the kernel patch) on the qemu-devel
mailing list.  I think these numbers also support some of the prereqs
listed above that have gone through several iterations getting queued up
for 3.19.


Re: [PATCH] PCI: Add ACS support for AMD A88X southbridge devices

2014-10-02 Thread Joel Schopp


On 10/02/2014 08:47 AM, Alex Williamson wrote:

On Thu, 2014-10-02 at 16:05 +0300, Marti Raudsepp wrote:

AMD has confirmed that peer-to-peer between two southbridge functions
does not occur.

Joel Schopp at https://bugzilla.kernel.org/show_bug.cgi?id=81841#c15

 +-14.4-[01]05.0  Dialogic Corporation PRI
The legacy PCI should be isolated from the other devices identified.
Not sure what is going on here.

 +-14.5  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
This OHCI Controller should also be isolated from the other devices.

Signed-off-by: Marti Raudsepp 

The bugzilla comments aren't quite as decisive as I'd like to see for a
quirk, so I think we should probably get an ACK from Joel before
including this.  Thanks,


My apologies for not being as clear as I could have been in the 
bugzilla.  These are isolated.  Acked-by is below.




Alex


---
  drivers/pci/quirks.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 80c2d01..ce43316 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3582,6 +3582,11 @@ struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
   * 1002:439d SB7x0/SB8x0/SB9x0 LPC host controller
   * 1002:4384 SBx00 PCI to PCI Bridge
   * 1002:4399 SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
+ *
+ * https://bugzilla.kernel.org/show_bug.cgi?id=81841#c15
+ *
+ * 1022:780f [AMD] FCH PCI Bridge
+ * 1022:7809 [AMD] FCH USB OHCI Controller
   */
  static int pci_quirk_amd_sb_acs(struct pci_dev *dev, u16 acs_flags)
  {
@@ -3675,6 +3680,8 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs },
{ PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs },
{ PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs },
+   { PCI_VENDOR_ID_AMD, 0x780f, pci_quirk_amd_sb_acs },
+   { PCI_VENDOR_ID_AMD, 0x7809, pci_quirk_amd_sb_acs },
{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs },
{ 0 }
  };


Acked-by: Joel Schopp 
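
For readers unfamiliar with the quirk table touched above, the lookup it
feeds can be sketched roughly as follows.  This is a userspace toy: the toy_
names, the wildcard constant and the return values are assumptions for
illustration, and only the vendor:device pairs are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ANY_ID 0xffffu   /* wildcard, in the spirit of PCI_ANY_ID */

struct toy_acs_quirk {
	uint16_t vendor;
	uint16_t device;
	int (*check)(uint16_t acs_flags);
};

/* Toy handlers: return distinct values so the caller can tell them apart. */
static int toy_amd_sb_acs(uint16_t acs_flags) { (void)acs_flags; return 1; }
static int toy_intel_pch_acs(uint16_t acs_flags) { (void)acs_flags; return 2; }

static const struct toy_acs_quirk toy_quirks[] = {
	{ 0x1002, 0x439d, toy_amd_sb_acs },          /* ATI LPC host controller */
	{ 0x1022, 0x780f, toy_amd_sb_acs },          /* AMD FCH PCI Bridge */
	{ 0x1022, 0x7809, toy_amd_sb_acs },          /* AMD FCH USB OHCI */
	{ 0x8086, TOY_ANY_ID, toy_intel_pch_acs },   /* any Intel device */
	{ 0, 0, NULL }                               /* terminator */
};

/* Walk the table; the first matching entry decides.  -1 means no quirk. */
static int toy_dev_specific_acs(uint16_t vendor, uint16_t device,
				uint16_t acs_flags)
{
	const struct toy_acs_quirk *q;

	for (q = toy_quirks; q->check; q++) {
		if ((q->vendor == vendor || q->vendor == TOY_ANY_ID) &&
		    (q->device == device || q->device == TOY_ANY_ID))
			return q->check(acs_flags);
	}
	return -1;
}
```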


Re: [PATCH v6 2/7] arm64: Introduce VA_BITS and translation level options

2014-07-14 Thread Joel Schopp
I agree that these patches would be very useful.  I just rebased my fix
for a VTTBR_BADDR_MASK bug on one of these patches that could be pulled
out independently.  See
https://lists.cs.columbia.edu/pipermail/kvmarm/2014-July/010480.html

The original author Jungseok Lee is no longer available to work on
future versions of these patches.  If they don't get picked up as they
are, I was thinking that, with the original author's blessing, I would
pick them up and keep them forward ported and resubmitted.  I have an
SOC to test them on.

-Joel

On 07/14/2014 02:53 PM, Timur Tabi wrote:
> On Mon, May 12, 2014 at 4:40 AM, Jungseok Lee  wrote:
>> This patch adds virtual address space size and a level of translation
>> tables to kernel configuration. It facilitates introduction of
>> different MMU options, such as 4KB + 4 levels, 16KB + 4 levels and
>> 64KB + 3 levels, easily.
> Is there a reason why this patch has not yet been picked up?  It
> appears to work just fine, and the change is necessary for ARM SOCs
> that support large amounts of memory.  It seems weird that after so
> many versions, reviews, and ACKs, it is still not in linux-next.
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



Re: [RFC PATCH 3/9] irqchip: GIC: Convert to EOImode == 1

2014-06-25 Thread Joel Schopp



+	if (resource_size(&cpu_res) >= SZ_8K)
+		supports_deactivate = true;
+	else
+		pr_warn("GIC: CPU interface size is %x, DT is probably wrong\n", (int)resource_size(&cpu_res));

This will not work on APM X-Gene because, for
X-Gene first CPU page is at 0x7802 and
second CPU page is at 0x7803.

Ian sent out a patch a long time back to extend
GIC dt-bindings for addressing this issue.
(http://www.spinics.net/lists/arm-kernel/msg283767.html)


We have a similar issue with an AMD SOC.  You can add 0xf000 (60K) page 
offset to it to cleverly work around the issue but it seems quite likely 
that the page offset has to be communicated to userspace and handled 
there at no small effort.


Anybody want to revive Ian's split patches?  They do seem convoluted but 
seem like they might work as an approach.
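
To make the problem concrete, here is a small sketch of the two checks involved. The first mirrors the size test in the quoted hunk. The second models the split-page case, where the page holding the deactivate register (GICC_DIR, at offset 0x1000 in the GICv2 CPU interface) does not directly follow the first page. The function names and any addresses a caller passes in are hypothetical and for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K 0x1000u
#define SZ_8K 0x2000u

/* Same test as the quoted hunk: one region must cover both 4K pages. */
static bool region_supports_deactivate(uint64_t cpu_if_size)
{
	return cpu_if_size >= SZ_8K;
}

/*
 * Split-page case: the second CPU interface page is described separately.
 * A single "base + 0x1000" access only works if the pages are contiguous.
 */
static bool pages_contiguous(uint64_t first_page, uint64_t second_page)
{
	return second_page == first_page + SZ_4K;
}
```

With 64K-aligned pages, pages_contiguous() is false, which is why a size check alone cannot describe such hardware and some form of explicit second-page address or offset is needed.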


-Joel



Re: [PATCH 0/2] Introduce ARM GICv2m MSI(-X) support

2014-06-24 Thread Joel Schopp
I've been running and doing development on top of these patches.  I 
found a problem in an earlier version that I can confirm is now fixed in 
this current version.


Reviewed-by: Joel Schopp 

On 06/23/2014 07:32 PM, suravee.suthikulpa...@amd.com wrote:

From: Suravee Suthikulpanit 

This patch set introduces support for MSI(-X) in GICv2m specification,
which is implemented in some variation of GIC400.

This depends on and has been tested with the V7 of "Add support for PCI in AArch64"
(https://lkml.org/lkml/2014/3/14/320).

Suravee Suthikulpanit (2):
   arm/gic: Add binding probe for GIC400
   arm/gic: Add supports for GICv2m MSI(-X)

  Documentation/devicetree/bindings/arm/gic.txt |  18 +-
  drivers/irqchip/Kconfig   |   6 +
  drivers/irqchip/Makefile  |   1 +
  drivers/irqchip/gic-msi-v2m.c | 249 ++
  drivers/irqchip/gic-msi-v2m.h |  20 +++
  drivers/irqchip/irq-gic.c |  23 ++-
  6 files changed, 313 insertions(+), 4 deletions(-)
  create mode 100644 drivers/irqchip/gic-msi-v2m.c
  create mode 100644 drivers/irqchip/gic-msi-v2m.h





Re: [tpmdd-devel] [PATCH] tpm: MAINTAINERS: Add myself as tpm maintainer

2013-10-23 Thread Joel Schopp

> These would have been posted as patch numbers 8 through 13 in the
> original series.
> 
> I think what happened is at this point in the series module compile
> broke. That is fixed now in the for-james pull, so the rest of the
> series should be looked at.
> 
> Peter's checkpatch clean up will create some minor conflicts, so I
> should probably resend the lot after rebasing it.
> 

If you rebase and resend I will commit to reviewing them.



Re: Aw: Re: [tpmdd-devel] [PATCH] tpm: MAINTAINERS: Add myself as tpm maintainer

2013-10-22 Thread Joel Schopp

>> I have no objection to you adding yourself here. I do think we should
>> probably also cut the list down at the same time as I don't think all
>> the listed maintainers are active anymore. Also, the list is getting a
>> bit unwieldy. If everyone maintains it nobody maintains it.
> 
> I agree with you here, this was also the reason I took it over when Kent 
> stepped down and no one else stepped in of the current maintainers, leaving 
> the subsystem more or less unmaintained.

FYI I'm one of the two people who took over co-maintaining TrouSerS
after Kent stepped down.  I am also reviewing tpm device driver patches
that go by, but I am glad others more knowledgeable about those drivers
such as you are there to be the maintainers.  Thanks for stepping up.



Re: [tpmdd-devel] [PATCH] tpm: MAINTAINERS: Add myself as tpm maintainer

2013-10-22 Thread Joel Schopp
On 10/22/2013 12:36 PM, Peter Huewe wrote:
> Since I'm actively maintaining the tpm subsystem for a few months now,
> it's time to step up and be an official maintainer for the tpm subsystem,
> atleast until I hear something different from my company.
> 
> The maintaining is done solely in my private time, out of private interest.
> Speaking only on behalf of myself, trying to be as vendor neutral as possible.
> 
> Signed-off-by: Peter Huewe 
> ---
>  MAINTAINERS |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4fde706..936adb4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8475,6 +8475,7 @@ F:  drivers/media/usb/tm6000/
>  TPM DEVICE DRIVER
>  M:   Leonidas Da Silva Barbosa 
>  M:   Ashley Lai 
> +M:   Peter Huewe 
>  M:   Rajiv Andrade 
>  W:   http://tpmdd.sourceforge.net
>  M:   Marcel Selhorst 
> 

I have no objection to you adding yourself here.  I do think we should
probably also cut the list down at the same time as I don't think all
the listed maintainers are active anymore.  Also, the list is getting a
bit unwieldy.  If everyone maintains it nobody maintains it.



Re: [tpmdd-devel] [PATCH 02/13] tpm atmel: Call request_region with the correct base

2013-10-04 Thread Joel Schopp
On 10/02/2013 11:36 PM, Jason Gunthorpe wrote:
> On Wed, Oct 02, 2013 at 07:11:14PM -0500, Ashley D Lai wrote:
> 
>>> I somewhat have the feeling that we should maybe begin to deprecate
>>> the vendor specific 1.1 tpms...
> 
>> I agree. If we have a machine to test and it fails then we know we don't
>> have a user for this.
> 
> Is this driver only used on IBM systems? If so, will IBM provide
> support for those systems on RHEL7? If not the driver can probably
> safely be dropped.
> 
> The trouble with these old drivers is that they don't follow modern
> conventions (there are several little bugs at least) and nobody can
> test them to safely fix them.
> 
> If you do find hardware, we can at least take a solid run at sprucing
> up the testable drivers which should give them more life..
> 
> Jason

We don't have any plans to support 1.1 tpms in any new major
distribution releases.  I am in support of deprecating 1.1 tpms and just
supporting 1.2 and 2.0 going forward.

This is the direction we are taking with TrouSerS as well for what it's
worth.  So that's at least one major user you won't see complain about
missing tpm 1.1 support.

-Joel



Re: [tpmdd-devel] [PATCH 09/13] tpm: Pull everything related to sysfs into tpm-sysfs.c

2013-09-30 Thread Joel Schopp


> There is also the fact that the driver may not be able to tell if a
> locality is available without doing some kind of test command. The Xen
> TPM interface doesn't expose what localities are available, for example,
> and the TIS interface may need to test to see if locality 3 and 4 are
> actually blocked by the chipset - at least 3 might be available on some
> systems (the spec leaves this "implementation dependent").
> 
>>> Perhaps:
>>> default_locality - default to CONFIG_USER_DEFAULT_LOCALITY
>>> sysfs node permissions 0644
>>> kernel_locality - default to #CONFIG_KERNEL_DEFAULT_LOCALITY
>>> 0444 if CONFIG_KERNEL_ONLY_LOCALITY=y
>>> 0644 if CONFIG_KERNEL_ONLY_LOCALITY=n
>>> ioctl TPM_{GET,SET}_LOCALITY on an open /dev/tpmX
>>>
>>> If CONFIG_KERNEL_ONLY_LOCALITY=y, the userspace locality is not
>>> permitted to be equal to kernel_locality (but may take any other valid
>>> value).  Drivers may reject locality values that they consider invalid
>>> (the default should be to only allow 0-4, which is all that is defined
>>> in the spec) or may fail on attempted use of the TPM by passing down an
>>> error from the hardware - I would expect the latter to be the case on
>>> attempts to use locality 4 in the tpm_tis driver.
>>
>> Seems reasonable, CONFIG_KERNEL_ONLY_LOCALITY could be
>> CONFIG_TPM_ONE_TIME_LOCALITY (eg you get to set kernel_locality only
>> once)
> 
> Hmm, how much trouble would it be to make this a menu selection? Even
> with the one-time-set option, you still need a default set either in
> the code or by CONFIG_ so that the TPM is not unavailable before the
> sysfs write. The options would be:
> 
>   - CONFIG_TPM_KERNEL_DEFAULT_LOCALITY = [int]
>   - CONFIG_TPM_KERNEL_LOCALITY_FIXED - no changes from userspace
>   - CONFIG_TPM_KERNEL_LOCALITY_ONESHOT - only one change possible
>   - CONFIG_TPM_KERNEL_LOCALITY_ANY - may be changed freely
> 
> The userspace locality is not allowed to use the kernel locality if
> the mode is either FIXED or ONESHOT, but may share locality if ANY
> is used.
> 
> Or, for more flexibility (I actually like this one better):
> 
>   - CONFIG_TPM_KERNEL_DEFAULT_LOCALITY = [int]
>   - CONFIG_TPM_KERNEL_LOCALITY_FIXED = [bool]

This seems best of the options discussed to me.

> 
> And sysfs contains:
>   - kernel_locality [0644, int; 0444 if FIXED=y or when locked(?)]
>   - lock_kernel_locality [write-once; only exists if FIXED=n]
> 
> Where kernel_locality may be changed until a write is made to
> lock_kernel_locality, at which time the value of kernel_locality
> becomes read-only and no longer available via /dev/tpmX.
> 
>>> The only one I see immediately is seal/unseal (security/keys/trusted.c).
>>> The invocation of the seal command would need to be changed to seal the
>>> trusted keys to the kernel-only locality in order to take advantage of
>>> the increased protection provided by a kernel-only locality.
>>
>> Right
> 
> Actually, only the invocation needs to be changed - the PCR selection
> is passed in from userspace, which will just need to use PCR_INFO_LONG
> with the proper locality mask.
> 
 Do you know anyone on the userspace SW side who could look at this?
>>
>>> I should be able to find someone.
>>
>> Okay, let me know. I'd like to get a few more clean ups done before
>> mucking with the sysfs, but the way forward for locality looks pretty
>> clear..
>>
>> Thanks,
>> Jason
> 
> So far, nobody I have talked to has offered any strong opinions on
> what locality should be used or how it should be set. I think finding
> a developer of trousers may be the most useful to talk about how the
> ioctl portion of this would need to be set up - if someone is actually
> needed.
> 

I am a TrouSerS developer and am ccing Richard, another TrouSerS
developer, and ccing the trousers-tech list.  It would be good if you
could elaborate on the question and context for those not following the
entire thread, myself included.
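
For those joining the thread late, the preferred proposal quoted above (a default locality, a FIXED option, and a write-once lock_kernel_locality node) can be sketched as a small state machine. This is one reading of the proposal, not existing driver code: the struct, the function names and the chosen error codes are all assumptions, and the 0-4 range is the default range mentioned earlier in the thread.

```c
#include <errno.h>
#include <stdbool.h>

struct tpm_locality_cfg {
	int kernel_locality;	/* starts at the configured default */
	bool fixed;		/* models CONFIG_TPM_KERNEL_LOCALITY_FIXED */
	bool locked;		/* set by a write to lock_kernel_locality */
};

/* sysfs write to kernel_locality: allowed until fixed or locked. */
static int set_kernel_locality(struct tpm_locality_cfg *c, int loc)
{
	if (c->fixed || c->locked)
		return -EPERM;
	if (loc < 0 || loc > 4)
		return -EINVAL;
	c->kernel_locality = loc;
	return 0;
}

/* sysfs write to lock_kernel_locality: write-once, absent if FIXED=y. */
static int lock_kernel_locality(struct tpm_locality_cfg *c)
{
	if (c->fixed)
		return -ENOENT;
	if (c->locked)
		return -EPERM;
	c->locked = true;
	return 0;
}

/* Locality requested through /dev/tpmX (for example via an ioctl). */
static int check_user_locality(const struct tpm_locality_cfg *c, int loc)
{
	if (loc < 0 || loc > 4)
		return -EINVAL;
	/* Once the kernel locality is pinned, userspace may not share it. */
	if ((c->fixed || c->locked) && loc == c->kernel_locality)
		return -EPERM;
	return 0;
}
```

The open question for userspace is mostly the last function: how a TSS would request a locality on an open /dev/tpmX and what error it should expect when that locality is reserved.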



Re: [tpmdd-devel] [PATCH 00/13] TPM cleanup

2013-09-23 Thread Joel Schopp

> Jason Gunthorpe (13):
>   tpm: ibmvtpm: Use %zd formatting for size_t format arguments
>   tpm atmel: Call request_region with the correct base
>   tpm: xen-tpmfront: Fix default durations
>   tpm: Store devname in the tpm_chip
>   tpm: Use container_of to locate the tpm_chip in tpm_open
>   tpm: Remove redundant dev_set_drvdata
>   tpm: Remove tpm_show_caps_1_2
>   tpm: Pull everything related to /dev/tpmX into tpm-dev.c
>   tpm: Pull everything related to sysfs into tpm-sysfs.c
>   tpm: Create a tpm_class_ops structure and use it in the drivers
>   tpm: Use the ops structure instead of a copy in tpm_vendor_specific
>   tpm: st33: Remove chip->data_buffer access from this driver
>   tpm: Make tpm-dev allocate a per-file structure
> 
>  drivers/char/tpm/Makefile   |   2 +-
>  drivers/char/tpm/tpm-dev.c  | 213 +++
>  drivers/char/tpm/tpm-sysfs.c| 318 ++
>  drivers/char/tpm/tpm.c  | 524 +++-
>  drivers/char/tpm/tpm.h  |  86 +++---
>  drivers/char/tpm/tpm_atmel.c|  30 +--
>  drivers/char/tpm/tpm_i2c_atmel.c|  42 +--
>  drivers/char/tpm/tpm_i2c_infineon.c |  44 +--
>  drivers/char/tpm/tpm_i2c_nuvoton.c  |  42 +--
>  drivers/char/tpm/tpm_i2c_stm_st33.c |  51 +---
>  drivers/char/tpm/tpm_ibmvtpm.c  |  44 +--
>  drivers/char/tpm/tpm_infineon.c |  28 +-
>  drivers/char/tpm/tpm_nsc.c  |  28 +-
>  drivers/char/tpm/tpm_spi_stm_st33.c |  50 +---
>  drivers/char/tpm/tpm_tis.c  |  43 +--
>  drivers/char/tpm/xen-tpmfront.c |  57 +---
>  include/linux/tpm.h |  15 ++
>  17 files changed, 638 insertions(+), 979 deletions(-)
>  create mode 100644 drivers/char/tpm/tpm-dev.c
>  create mode 100644 drivers/char/tpm/tpm-sysfs.c
> 

For what it's worth I have nothing to say except the cleanups look sane
to me.
Reviewed-by: Joel Schopp 



Re: [PATCH v2 8/9] bfs: remove multiple assignments

2008-01-27 Thread Joel Schopp

-inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME_SEC;
+inode->i_mtime = CURRENT_TIME_SEC;
+inode->i_atime = CURRENT_TIME_SEC;
+inode->i_ctime = CURRENT_TIME_SEC;

multiple assignments like "x = y = z = value;" can potentially
(depending on the compiler and arch) be faster than "x = value; y =
value; z=value;"

I am surprised that this script complains about them as it is a
perfectly valid thing to do in C.


I think it seems wise to ask the maintainers of checkpatch.pl to
comment on that. I'm Cc:ing them now.



There are plenty of things that are valid to do in C that don't make for 
maintainable code.  These scripts are designed to make your code easier for 
real people to review and maintain.


As for if this can be faster we don't deal in the realm of "can".  Please 
show a concrete example of gcc making Linux kernel code faster with 
multiple assignments per line.  If you can do that I'm willing to change my 
mind and I'll lead the charge for multiple assignments per line.
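
One way to produce that concrete example is to put both forms in a file, build it with something like "gcc -O2 -S", and compare the generated assembly. The sketch below is a userspace stand-in: struct fake_inode and the two function names are made up for the comparison and are not kernel code.

```c
#include <time.h>

/* Stand-in for the three timestamp fields touched by the bfs hunk. */
struct fake_inode {
	time_t i_mtime;
	time_t i_atime;
	time_t i_ctime;
};

/* Chained form, as in the line the patch removes. */
static void stamp_chained(struct fake_inode *inode, time_t now)
{
	inode->i_mtime = inode->i_atime = inode->i_ctime = now;
}

/* One assignment per line, as in the lines the patch adds. */
static void stamp_separate(struct fake_inode *inode, time_t now)
{
	inode->i_mtime = now;
	inode->i_atime = now;
	inode->i_ctime = now;
}
```

Both functions leave the structure in the same state, so any difference would have to show up in the compiler output rather than in behavior, and that output is the evidence being asked for.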





Re: [PATCH] update checkpatch.pl to version 0.10

2007-09-28 Thread Joel Schopp

The only question is whether this should default to on.  You are voting
off.  I personally think on.

Andrew?  Randy?  Joel?


The main audience of this is new contributors, who should have more verbose 
output, including nitpicky things like multiple assignments per line.  The 
default should target them.  More advanced users can certainly use a flag 
that says "give me only the real errors".


It might be a good idea to have three levels.
--really-errors
--really-picky
--really-experimental

Only with better names.  --really-picky would be default, but would only 
include tests that have a very very high ratio of hits to false positives, 
but would still call out things like multiple assignments per line. 
--really-errors we could call the Igno level.  --really-experimental would 
call out all issues, even on checks that generate a fair number of false 
positives.
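
A rough sketch of how three such levels could gate the output follows. checkpatch.pl itself is Perl; this is written in C only to show the filtering rule, and the enum, struct and function names are placeholders, not anything from the script.

```c
/* Levels ordered from least to most verbose. */
enum check_level {
	LEVEL_ERRORS,		/* only the real errors */
	LEVEL_PICKY,		/* proposed default: adds low-false-positive style checks */
	LEVEL_EXPERIMENTAL	/* everything, including noisy checks */
};

struct report {
	enum check_level min_level;	/* lowest level at which this is shown */
	const char *msg;
};

/* A report is printed when the selected level is at least its minimum. */
static int should_print(const struct report *r, enum check_level selected)
{
	return r->min_level <= selected;
}

/* Count how many of n reports would be printed at the selected level. */
static int count_printed(const struct report *r, int n, enum check_level selected)
{
	int i, printed = 0;

	for (i = 0; i < n; i++)
		if (should_print(&r[i], selected))
			printed++;
	return printed;
}
```

Under this scheme a check such as "multiple assignments per line" would be tagged LEVEL_PICKY, so new contributors see it by default and advanced users can drop down to LEVEL_ERRORS.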


-Joel



Re: [PATCH 2.6.23] ibmebus: Prevent bus_id collisions

2007-08-30 Thread Joel Schopp
There are currently two GX devices, eHCA and eHEA, which both reside 
beneath the root node - this is required by architecture for those 
devices. Unless they invent a device called 
"supercalifragilisticexpialidocious", devices in the root node will have a
full_name of less than 31 chars. Even in that case, the truncation occurs 
at the beginning, so the @xxx part that makes the nodes unique will stay 
in place.




OK, didn't realize it had to be beneath the root node, and that the 
truncation truncated the front and not the back.  I would have done it 
differently, but this should work.
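
For anyone else who misread it the way I did, truncating from the front can be sketched like this. The helper name, the 32-byte buffer used when calling it, and the unit address in the usage note are assumptions for illustration; this is not the patch's code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy name into buf, keeping at most size-1 characters. When the name is
 * too long, characters are dropped from the front so the trailing "@unit"
 * part that makes the node unique is preserved.
 */
static void bus_id_from_tail(char *buf, size_t size, const char *name)
{
	size_t len = strlen(name);

	if (size == 0)
		return;
	if (len >= size)
		name += len - (size - 1);
	memcpy(buf, name, strlen(name) + 1);
}
```

Called with a 32-byte buffer and a made-up name such as "/supercalifragilisticexpialidocious@23c00100", the result is the last 31 characters, which still end in "@23c00100".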


Acked-by: Joel Schopp <[EMAIL PROTECTED]>



Re: [PATCH] update checkpatch.pl to version 0.06

2007-06-22 Thread Joel Schopp

Several of our on-disk filesystems have an ioctl function that already
has indented goto labels.  I don't think it's quite worth churning all
of these (working) filesystems to make a style checker happy.

I think it's worse style to be mixing label indentation in a file than it
is to create new "correct" indentation labels.  That's why I suggested
using context in the file to determine it rather than absolute rules.


If it is bad coding style that is justified because there is already other bad coding 
style to match -- that is not a judgment call for a script to make, but for a real 
person to make.


There is no law that says you have to fix 100% of the warnings the script generates, 
even if they are valid warnings.  You'd just better be ready to justify them is all.
Your justification seems reasonable in this case -- it is worse to mix right and 
wrong label indentation vs indenting wrongly but consistently.


Indent consistently wrong and feel happy about it, just don't expect the style 
checker to give you a free pass when you perpetuate somebody else's wrong.  If we 
start down the path of bad coding style always being OK if there is already bad 
coding style around it I think that is a slippery slope.  There should be some 
friction when we perpetuate bad style so there is some incentive to fix the style for 
future generations to be able to read our code better.




Re: [PATCH] update checkpatch.pl to version 0.06

2007-06-22 Thread Joel Schopp

foo_ioctl()
{
	switch(ioctl) {
	case FOO:
		lots
		of
		code
	error:
		return result;
	case BAR:
		return result;
	}

Notice that the "error:" label is indented.  Each of the case is kinda
like a mini function with its own variables and return statement.


If it is "kinda like a mini function" why not make it "actually a mini function" and 
call it?


I really don't like the indenting here.  When I first glanced over that code I 
thought "case FOO:", "case error:", "case BAR:".  Only later after reading your 
description did I realize error wasn't part of the switch, but an independent label.
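
To show what "actually a mini function" could look like, here is a sketch with the FOO case pulled out into its own helper. Everything in it (the command values, the argument check, the helper name) is invented for the example. The point is only that the error label then lives at function scope, where the usual unindented label style applies.

```c
#include <errno.h>

enum { FOO = 1, BAR = 2 };	/* placeholder command numbers */

/* The body of "case FOO:" as a real function with its own error label. */
static int foo_ioctl_foo(long arg)
{
	int result;

	if (arg < 0) {
		result = -EINVAL;
		goto error;
	}
	/* lots of code */
	result = 0;
error:
	return result;
}

static int foo_ioctl(unsigned int cmd, long arg)
{
	switch (cmd) {
	case FOO:
		return foo_ioctl_foo(arg);
	case BAR:
		return 0;
	default:
		return -ENOTTY;
	}
}
```

The switch now reads as a plain dispatch table, and nobody has to work out whether "error:" is a case.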




Do you think it is worth teaching the patch checker about these?  It
seems pretty sane style to me.


It hurts my eyes.  Not that I'm the coding style czar or anything, if I were the 
kernel coding style would be different in several ways.  But inasmuch as this is a 
democracy (which it isn't) then I am opposed to crazy indentation such as your example.








Re: [PATCH] checkpatch: add an exclusion for 'for_each' helper macros

2007-06-08 Thread Joel Schopp


Dan Williams wrote:

checkpatch currently complains about macros like the following:

#define for_each_dma_cap_mask(cap, mask) \
	for ((cap) = first_dma_cap(mask);	\
		(cap) < DMA_TX_TYPE_END;	\
		(cap) = next_dma_cap((cap), (mask)))


Signed-off-by: Dan Williams <[EMAIL PROTECTED]>


I'd like it if this patch updated Chapter 12 of Documentation/CodingStyle as well. 
That section is where the rule to check came from and it would be nice for it to 
mention the exception to the rule as well.
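
As a possible illustration for that documentation update, the sketch below shows why this kind of macro cannot follow the "wrap multi-statement macros in do { } while (0)" rule: the caller supplies the loop body, so the macro has to expand to a bare for statement. The names (CAP_END, first_cap, next_cap, for_each_cap) are simplified stand-ins, not the dmaengine helpers.

```c
#define CAP_END 8	/* illustrative number of capability bits */

/* Return the next set bit in mask after position cap, or CAP_END. */
static int next_cap(int cap, unsigned int mask)
{
	for (cap++; cap < CAP_END; cap++)
		if (mask & (1u << cap))
			return cap;
	return CAP_END;
}

static int first_cap(unsigned int mask)
{
	return next_cap(-1, mask);
}

/* Expands to a for statement; the statement after it is the loop body. */
#define for_each_cap(cap, mask)			\
	for ((cap) = first_cap(mask);		\
	     (cap) < CAP_END;			\
	     (cap) = next_cap((cap), (mask)))

static int count_caps(unsigned int mask)
{
	int cap, n = 0;

	for_each_cap(cap, mask)
		n++;
	return n;
}
```

A do-while wrapper would close the loop before the caller's body, so the exception in the checker matches how these helpers are meant to be used.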






Re: [PATCH] add a trivial patch style checker v2

2007-05-29 Thread Joel Schopp

As a first step package up the current state of the patch style
checker and include it in the kernel tree.  Add instructions
suggesting running it on submissions.  This adds version v0.01 of
the checkpatch.pl script.

Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>


Signed-off-by: Joel Schopp <[EMAIL PROTECTED]>


Re: [PATCH] add a trivial patch style checker

2007-05-29 Thread Joel Schopp

+   if(!($prevline=~/\/\*\*/) && length($lineforcounting) > 80){

Actually, I think this should be "> 79" (after stripping a .diff's
control column), since the cursor may move to the 81st column when
editing an 80-col line - which is what we want to avoid, no?


80 tends to work for me because of that "if on 80 then don't wrap until
there is another character" behaviour of most terminals.  Anyone else
with a firm opinion.


I think 80 is good.  What the specific number is does not matter much, we all have 
screens wider than 80 characters.  The point is just to have a number that prevents 
really long lines and prevents people from indenting too many levels past our minds' 
ability to keep up.  We've already all been coding to 80, and it happens to be a nice 
round number we can all remember and love.  The only reason I see to select 79 is 
that prime numbers are generally cooler than other numbers.
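
For reference, the check under discussion boils down to the following, sketched here in C rather than the script's Perl. The function name is made up, and the sketch leaves out two things the real test does or may do: the kernel-doc exception visible in the quoted line, and any tab expansion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether one line of a unified diff exceeds the column limit.
 * Assumes diff_line starts with the diff control column ('+', '-' or ' ').
 */
static bool line_too_long(const char *diff_line, size_t limit)
{
	size_t len = strlen(diff_line);

	/* Ignore a trailing newline, if present. */
	if (len > 0 && diff_line[len - 1] == '\n')
		len--;
	/* Do not count the diff control column. */
	if (len > 0)
		len--;
	return len > limit;
}
```

With limit set to 80, a line of exactly 80 characters passes, and the 79-versus-80 question is just a matter of which value is passed in.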


-Joel



Re: [PATCH] add a trivial patch style checker

2007-05-29 Thread Joel Schopp

Randy Dunlap wrote:

On Sun, 27 May 2007 18:11:25 +0100 Andy Whitcroft wrote:


Also if either Joel or Randy want to be on the MAINTAINERS
entry yell and we'll get it updated, wouldn't want to list
anyone without permission.


Yes, please.


Yes.  Add me as well.



Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-05 Thread Joel Schopp
If you only need to allocate 1 page size and smaller allocations then no 
it's not a problem.  As soon as you go above that it will be.  You don't 
need to go all the way up to MAX_ORDER size to see an impact, it's just 
increasingly more severe as you get away from 1 page and towards MAX_ORDER.


We allocate order 1 and 2 pages for stuff without too much problem.


The question I want to know is where do you draw the line as to what is acceptable to 
allocate in a single contiguous block?


1 page?  8 pages?  256 pages?  4K pages?  Obviously 1 page works fine. With 4K page 
size and 16MB MAX_ORDER 4K pages is theoretically supported, but doesn't work under 
almost any circumstances (unless you use Mel's patches).
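
The arithmetic behind those numbers, as a small sketch: an order-n allocation is 2^n contiguous pages, so with a 4K page size a 16MB block is 4096 pages, which is order 12. The helper names below are made up for the example.

```c
#include <stdint.h>

/* Number of pages in an order-n allocation. */
static uint64_t order_to_pages(unsigned int order)
{
	return 1ull << order;
}

/* Size in bytes of an order-n allocation for a given page size. */
static uint64_t order_to_bytes(unsigned int order, uint64_t page_size)
{
	return page_size << order;
}

/* Smallest order whose allocation size is at least bytes. */
static unsigned int bytes_to_order(uint64_t bytes, uint64_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < bytes)
		order++;
	return order;
}
```

So the order 1 and 2 allocations mentioned above are 8K and 16K with 4K pages, far from the order 12 end of the range where contiguous blocks are hardest to find.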



on-demand hugepages could be done better anyway by having the hypervisor
defrag physical memory and provide some way for the guest to ask for a
hugepage, no?


Unless you break the 1:1 virt-phys mapping it doesn't matter if the hypervisor can 
defrag this for you as the kernel will have the physical address cached away 
somewhere and expect the data not to move.


I'm a big fan of making this somebody else's problem and the hypervisor would be a 
good place.  I just can't figure out how to actually do it at that layer without 
changing Linux in a way that is unacceptable to the community at large.





Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-05 Thread Joel Schopp

But if you don't require a lot of higher order allocations anyway, then
guest fragmentation caused by ballooning doesn't seem like much problem.


If you only need to allocate 1 page size and smaller allocations then no it's not a 
problem.  As soon as you go above that it will be.  You don't need to go all the way 
up to MAX_ORDER size to see an impact, it's just increasingly more severe as you get 
away from 1 page and towards MAX_ORDER.




If you need higher order allocations, then ballooning is bad because of
fragmentation, so you need memory unplug, so you need higher order
allocations. Goto 1.


Yes, it's a closed loop.  But hotplug isn't the only one that needs higher order 
allocations.  In fact it's pretty far down the list.  I look at it like this, a lot 
of users need high order allocations for better performance and things like on-demand 
hugepages.  As a bonus you get memory hot-remove.



Ballooning probably does skew memory management stats and watermarks, but
that's just because it is implemented as a module. A couple of hooks
should be enough to allow things to be adjusted?


That is a good idea independent of the current discussion.



Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-02 Thread Joel Schopp

Linus Torvalds wrote:


On Thu, 1 Mar 2007, Andrew Morton wrote:

So some urgent questions are: how are we going to do mem hotunplug and
per-container RSS?


The people who were trying to do memory hot-unplug basically all stopped waiting for 
these patches, or something similar, to solve the fragmentation problem.  Our last 
working set of patches built on top of an earlier version of Mel's list based solution.




Also: how are we going to do this in virtualized environments? Usually the 
people who care about memory hotunplug are exactly the same people who 
also care (or claim to care, or _will_ care) about virtualization.


Yes, we are.  And we are very much in favor of these patches.  At last year's OLS 
developers from IBM, HP, Xen coauthored a paper titled "Resizing Memory with Balloons 
and Hotplug".  http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf  Our 
conclusion was that ballooning is simply not good enough and we need memory 
hot-unplug.  Here is a quote from the article I find relevant to today's discussion:


"Memory Hotplug remove is not in mainline.
Patches exist, released under the GPL, but are
only occasionally rebased. To be worthwhile
the existing patches would need either a remappable
kernel, which remains highly doubtful, or
a fragmentation avoidance strategy to keep migrateable
and non-migrateable pages clumped
together nicely."

At IBM all of our Power4, Power5, and future hardware supports a lot of 
virtualization features.  This hardware took "Best Virtualization Solution" at 
LinuxWorld Expo, so we aren't talking research projects here. 
http://www-03.ibm.com/press/us/en/pressrelease/20138.wss


My personal opinion is that while I'm not a huge fan of virtualization, 
these kinds of things really _can_ be handled more cleanly at that layer, 
and not in the kernel at all. Afaik, it's what IBM already does, and has 
been doing for a while. There's no shame in looking at what already works, 
especially if it's simpler.


I believe you are talking about the zSeries (aka mainframe) because the rest of IBM 
needs these patches.  zSeries built their whole processor instruction set, memory 
model, etc around their form of virtualization, and I doubt the rest of us are going 
to change our processor instruction set that drastically.  I've had a lot of talks 
with Martin Schwidefsky (the maintainer of Linux on zSeries) about how we could do 
more of what they do and the basic answer is we can't because what they do is so 
fundamentally incompatible.


While I appreciate that we should all dump our current hardware and buy mainframes it 
seems to me that an easier solution is to take a few patches from Mel and work with 
the hardware we already have.




Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-02 Thread Joel Schopp

Exhibiting a workload where the list patch breaks down and the zone
patch rescues it might help if it's felt that the combination isn't as
good as lists in isolation. I'm sure one can be dredged up somewhere.


I can't think of a workload that totally makes a mess out of list-based. 
However, list-based makes no guarantees on availability. If a system 
administrator knows they need between 10,000 and 100,000 huge pages and 
doesn't want to waste memory pinning too many huge pages at boot-time, 
the zone-based mechanism would be what he wanted.


From our testing with earlier versions of list based for memory hot-unplug on 
pSeries machines we were able to hot-unplug huge amounts of memory after running the 
nastiest workloads we could find for over a week.  Without the patches we were unable 
to hot-unplug anything within minutes of running the same workloads.


If something works for 99.999% of people (list based) and there is an easy way to 
configure it for the other 0.001% of the people ("zone" based) I call that a great 
solution.  I really don't understand what the resistance is to these patches.




Re: 2.6.13-mm1

2005-09-01 Thread Joel Schopp

Try the diff below although I suspect much of the extra logic can go
away and something like

	len = tty_buffer_request_room(tty, HVCS_BUFF_LEN);
	if (len) {
		len = hvc_get_chars(, len);
		tty_insert_flip_string(tty, buf, len);
	}

is better.
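
The pattern suggested above (ask how much room is available, then insert no more than that) can be sketched in userspace as follows. This is a simplified model, not the kernel tty API: the buffer type, the fixed capacity, and the sketch_* function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 16	/* hypothetical capacity for the sketch */

struct sketch_flip_buf {
	char data[SKETCH_BUF_SIZE];
	size_t used;
};

/* Return how many of the wanted bytes actually fit; may be fewer, or 0. */
static size_t sketch_request_room(const struct sketch_flip_buf *b, size_t want)
{
	size_t room;

	assert(b->used <= SKETCH_BUF_SIZE);
	room = SKETCH_BUF_SIZE - b->used;
	return want < room ? want : room;
}

/* Copy at most the granted amount and report how many bytes were stored. */
static size_t sketch_insert_string(struct sketch_flip_buf *b,
				   const char *src, size_t len)
{
	size_t n = sketch_request_room(b, len);

	memcpy(b->data + b->used, src, n);
	b->used += n;
	return n;
}
```

The caller never needs to inspect a count field in the buffer directly; it only acts on the amount of room it was granted.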


It's like whack-a-mole.  There are 30 more errors now in drivers/serial/jsm/jsm_tty.c 
and drivers/serial/icom.c




Re: 2.6.13-mm1

2005-09-01 Thread Joel Schopp

   /* If flip is full, just reschedule a later read */
   if (count == 0) {
   poll_mask |= HVC_POLL_READ;

shouldn't be deleting the declaration of count. 
and possibly the "flip removal" was incomplete (line 636) ???



Yep. You can remove the tty->flip.count test or use count, but at that
point count is guaranteed to be > 0, I believe. Fixed both in my tree; I will
push a new diff to Andrew soon.


There are at least a couple of other spots where flip got missed.  After 
fixing the count and flip problems mentioned above, these come up:


drivers/char/hvcs.c:459: error: structure has no member named `flip'
drivers/char/hvcs.c:472: error: structure has no member named `flip'




Re: [patch] ibmvscsi timeout fix

2005-08-22 Thread Joel Schopp

This patch fixes a long term borkenness in
ibmvscsi where we were using the wrong timeout
field from the scsi command (and using the 
wrong units.)  Now broken by the fact that the
scsi_cmnd timeout field is gone entirely.
This only worked before because all the SCSI
targets assumed that 0 was default.


That was fast.  I report the error to you and get a patch next time I 
look at my mail.  This does fix the build break I saw in a 
2.6.13-rc6-mm1 defconfig on ppc64.


Adding Andrew Morton to the distribution list since somebody else is 
bound to notice that ppc64 -mm doesn't compile anymore.


Acked-by: Joel Schopp <[EMAIL PROTECTED]>



Signed-off-by: Dave Boutcher <[EMAIL PROTECTED]>

--- linux-2.6.13-rc6-mm1-orig/drivers/scsi/ibmvscsi/ibmvscsi.c  2005-08-22 13:54:20.111955197 -0500
+++ linux-2.6.13-rc6-mm1/drivers/scsi/ibmvscsi/ibmvscsi.c   2005-08-22 14:22:56.265042174 -0500
@@ -594,7 +594,7 @@
 	init_event_struct(evt_struct,
 			  handle_cmd_rsp,
 			  VIOSRP_SRP_FORMAT,
-			  cmnd->timeout);
+			  cmnd->timeout_per_command/HZ);
 
 	evt_struct->cmnd = cmnd;
 	evt_struct->cmnd_done = done;
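
The unit change in the diff above divides a per-command timeout by HZ, which presumably converts scheduler ticks into whole seconds. A minimal userspace sketch of that arithmetic follows; the helper name sketch_ticks_to_seconds is hypothetical, and the tick rate is passed as a parameter instead of using the kernel's HZ macro.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a tick count to whole seconds.
 * hz is the number of ticks per second (HZ in the patch above).
 */
static unsigned long sketch_ticks_to_seconds(unsigned long ticks,
					     unsigned long hz)
{
	assert(hz > 0);		/* avoid division by zero */
	return ticks / hz;	/* integer division drops partial seconds */
}
```

With a tick rate of 1000, a timeout of 30000 ticks becomes 30 seconds, and 1500 ticks becomes 1 second because the division truncates.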






Re: ppc64 build broke between 2.6.11-bk6 and 2.6.11-bk7

2005-03-18 Thread Joel Schopp
Mikael Pettersson wrote:
Andrew Morton writes:
 > "Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
 > >
 > > drivers/built-in.o(.text+0x182bc): In function `.matroxfb_probe':
 > > : undefined reference to `.mac_vmode_to_var'
 > > make: *** [.tmp_vmlinux1] Error 1
 > > 
 > > Anyone know what that is?
 > > 
 > 
 > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm4/broken-out/fbdev-kconfig-fix-for-macmodes-and-ppc.patch
 > 
 > should fix it.

It seems the culprit is "matroxfb-compile-error.patch" which unconditionally adds
macmodes.o to the Makefile line for CONFIG_FB_MATROX. This obviously breaks on !ppc.
The patch Andrew mentions above converts the Kconfig entry for FB_MATROX to do a
"select FB_MACMODES if PPC_PMAC", so dropping matroxfb-compile-error.patch should
suffice.

matroxfb-compile-error.patch was a valid fix for a compile problem. It 
was against 2.6.11-bk10, so it wasn't in the 2.6.11-bk6 or 2.6.11-bk7 
you had problems with and didn't cause this mess to begin with.

It appears the problem was more systemic than what I saw during my 
compile, thus the fbdev-kconfig-fix-for-macmodes-and-ppc.patch probably 
fixes the problem I fixed and a host of others.  Of course it conflicts 
with my patch.

Please drop the matroxfb-compile-error.patch and if the problem isn't 
truly fixed by fbdev-kconfig-fix-for-macmodes-and-ppc.patch I will 
resend it.


[PATCH] matroxfb compile error

2005-03-15 Thread Joel Schopp
When compiling 2.6.11-bk10 I get a compile error.  The attached patch 
fixes it for me.  Please apply if you haven't gotten another patch for 
this already.

This patch fixes this compile error by causing the object that
contains the referenced function to be built:

drivers/built-in.o(.text+0x26db8): In function `.initMatrox2':
: undefined reference to `.mac_vmode_to_var' 

Signed-off-by: Joel Schopp <[EMAIL PROTECTED]>
---


diff -puN drivers/video/Makefile~matrox drivers/video/Makefile
--- 2.6.11-bk10/drivers/video/Makefile~matrox   2005-03-15 11:08:44.0 -0600
+++ 2.6.11-bk10-jschopp/drivers/video/Makefile  2005-03-15 11:12:08.0 -0600
@@ -26,7 +26,7 @@ obj-$(CONFIG_FB_CYBER2000)+= cyb
 obj-$(CONFIG_FB_PM2)  += pm2fb.o
 obj-$(CONFIG_FB_PM3) += pm3fb.o
 
-obj-$(CONFIG_FB_MATROX)  += matrox/
+obj-$(CONFIG_FB_MATROX)  += matrox/ macmodes.o
 obj-$(CONFIG_FB_RIVA)+= riva/ vgastate.o
 obj-$(CONFIG_FB_NVIDIA)  += nvidia/
 obj-$(CONFIG_FB_ATY) += aty/
_


Re: [PATCH] explicitly bind idle tasks

2005-03-07 Thread Joel Schopp
Nathan Lynch wrote:
With hotplug cpu and preempt, we tend to see smp_processor_id warnings
from idle loop code because it's always checking whether its cpu has
gone offline.  Replacing every use of smp_processor_id with
_smp_processor_id in all idle loop code is one solution; another way
is explicitly binding idle threads to their cpus (the smp_processor_id
warning does not fire if the caller is bound only to the calling cpu).
This has the (admittedly slight) advantage of letting us know if an
idle thread ever runs on the wrong cpu.

I also prefer explicitly binding idle threads to their cpus instead of 
replacing use of smp_processor_id with _smp_processor_id.


Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Acked-by: Joel Schopp <[EMAIL PROTECTED]>
Index: linux-2.6.11-rc5-mm1/init/main.c
===
--- linux-2.6.11-rc5-mm1.orig/init/main.c	2005-03-02 00:12:07.0 +
+++ linux-2.6.11-rc5-mm1/init/main.c	2005-03-02 00:53:04.0 +
@@ -638,6 +638,10 @@
 {
 	lock_kernel();
 	/*
+	 * init can run on any cpu.
+	 */
+	set_cpus_allowed(current, CPU_MASK_ALL);
+	/*
 	 * Tell the world that we're going to be the grim
 	 * reaper of innocent orphaned children.
 	 *
Index: linux-2.6.11-rc5-mm1/kernel/sched.c
===
--- linux-2.6.11-rc5-mm1.orig/kernel/sched.c	2005-03-02 00:12:07.0 +
+++ linux-2.6.11-rc5-mm1/kernel/sched.c	2005-03-02 00:47:14.0 +
@@ -4092,6 +4092,7 @@
 	idle->array = NULL;
 	idle->prio = MAX_PRIO;
 	idle->state = TASK_RUNNING;
+	idle->cpus_allowed = cpumask_of_cpu(cpu);
 	set_task_cpu(idle, cpu);
 
 	spin_lock_irqsave(&rq->lock, flags);
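
The condition described above (the warning does not fire when the caller is bound only to the calling cpu) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel's implementation: the mask is a plain unsigned long, and the names sketch_cpumask_t, sketch_mask_of_cpu and sketch_bound_only_to are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cpu mask: one bit per cpu. */
typedef unsigned long sketch_cpumask_t;

/* Build a mask that allows exactly one cpu. */
static sketch_cpumask_t sketch_mask_of_cpu(unsigned int cpu)
{
	assert(cpu < sizeof(sketch_cpumask_t) * CHAR_BIT);
	return 1UL << cpu;
}

/* True only when the mask allows this_cpu and no other cpu. */
static bool sketch_bound_only_to(sketch_cpumask_t allowed,
				 unsigned int this_cpu)
{
	return allowed == sketch_mask_of_cpu(this_cpu);
}
```

In this model an idle task whose mask was set for its own cpu passes the check, while a task allowed on all cpus (as init is after the first hunk) does not.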
