On 07/12/2012 11:25 AM, Raghavendra K T wrote:
>>
>> The problem occurs even with no overcommit at all. One vcpu is in a
>> legitimately long pause loop. All those exits accomplish nothing, since
>> all vcpus are scheduled. Better to let it spin in guest mode.
>>
>
> I agree. One idea is we can
On Wed, 11 Jul 2012 14:04:03 +0300, Avi Kivity wrote:
>
> > So this would probably improve guests that use cpu_relax, for example
> > stop_machine_run. I have no measurements, though.
>
> smp_call_function() too (though that can be converted to directed yield
> too). It seems worthwhile.
>
Wi
On 07/12/2012 08:11 AM, Raghavendra K T wrote:
>> Ah, I thought you objected to the CONFIG var. Maybe call it
>> cpu_relax_intercepted since that's the linuxy name for the instruction.
>>
>
> OK, just to be on the same page, I'll have:
> 1. cpu_relax_intercepted instead of pause_loop_exited.
>
> 2.
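For what it's worth, the rename discussed above could look something like the minimal sketch below. The struct and helper names here are hypothetical stand-ins, not the actual KVM code; in the real series the flag would live in struct kvm_vcpu.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative only: a pared-down vcpu carrying the renamed flag. */
struct vcpu {
    bool cpu_relax_intercepted;   /* was: pause_loop_exited */
};

/* On a cpu_relax() intercept, mark the vcpu as spinning so the
 * directed-yield logic can avoid picking another spinner as a target. */
static void handle_cpu_relax_intercept(struct vcpu *v)
{
    v->cpu_relax_intercepted = true;
    /* ... choose a yield candidate and yield to it ... */
}

/* Clear the flag when the vcpu re-enters the guest. */
static void vcpu_enter_guest(struct vcpu *v)
{
    v->cpu_relax_intercepted = false;
    /* ... resume guest execution ... */
}
```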
On Wed, 2012-07-11 at 14:23 +0300, Avi Kivity wrote:
> On 07/11/2012 02:16 PM, Alexander Graf wrote:
> >>
> >>> yes the data structure itself seems based on the algorithm
> >> and not on arch-specific things. That should work. If we move that to
> >>> common
> >>> code then s390 will use that s
> ARM doesn't have an instruction for cpu_relax(), so it can't intercept
> it. Given ppc's dislike of overcommit, and the way it implements
> cpu_relax() by adjusting hw thread priority, I'm guessing it doesn't
> intercept those either, but I'm copying the ppc people in case I'm
> wrong. So it's
On 07/11/2012 05:21 PM, Raghavendra K T wrote:
On 07/11/2012 03:47 PM, Christian Borntraeger wrote:
On 11/07/12 11:06, Avi Kivity wrote:
[...]
So there is no win here, but there are other cases were diag44 is
used, e.g. cpu_relax.
I have to double-check with others whether these cases are critical
On 07/11/2012 02:16 PM, Alexander Graf wrote:
>>
>>> yes the data structure itself seems based on the algorithm
>>> and not on arch-specific things. That should work. If we move that to
>>> common
>>> code then s390 will use that scheme automatically for the cases where we
>>> call
>>> kvm_vcpu
On 11/07/12 11:06, Avi Kivity wrote:
[...]
>> Almost all s390 kernels use diag9c (directed yield to a given guest cpu) for
>> spinlocks, though.
>
> Perhaps x86 should copy this.
See arch/s390/lib/spinlock.c
The basic idea is using several heuristics:
- loop for a given amount of loops
- check i
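The heuristic sketched above (spin a bounded number of times, then yield directly to the lock holder instead of burning more cycles) can be illustrated in user space. This is a sketch under stated assumptions: SPIN_RETRY, try_lock(), and directed_yield_to_holder() are hypothetical stand-ins, not the actual arch/s390/lib/spinlock.c code.

```c
#include <assert.h>
#include <stdbool.h>

#define SPIN_RETRY 1000            /* bounded spin budget per round */

static int lock_free_after;        /* model: iteration at which the lock frees */
static int yields_issued;

/* Stand-in for an atomic trylock on the real lock word. */
static bool try_lock(int spins_so_far)
{
    return spins_so_far >= lock_free_after;
}

/* Stand-in for diag9c: give our timeslice to the holder so it can
 * make progress and release the lock sooner. */
static void directed_yield_to_holder(void)
{
    yields_issued++;
    lock_free_after -= SPIN_RETRY;  /* model: holder advances while we yield */
}

/* Spin for SPIN_RETRY iterations; if the lock is still held,
 * escalate to a directed yield and try again. */
static void spin_lock_retry(void)
{
    for (;;) {
        for (int i = 0; i < SPIN_RETRY; i++)
            if (try_lock(i))
                return;
        directed_yield_to_holder();
    }
}
```

Under no contention the fast path never yields at all; the yield only happens once the bounded spin budget is exhausted.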
On 07/10/2012 12:47 AM, Andrew Theurer wrote:
>
> For the cpu threads in the host that are actually active (in this case
> 1/2 of them), ~50% of their time is in kernel and ~43% in guest. This
> is for a no-IO workload, so that's just incredible to see so much cpu
> wasted. I feel that
On Mon, 2012-07-09 at 11:50 +0530, Raghavendra K T wrote:
> Currently Pause Loop Exit (PLE) handler is doing directed yield to a
> random VCPU on PL exit. Though we already have filtering while choosing
> the candidate to yield_to, we can do better.
Hi, Raghu.
> Problem is, for large vcpu guests
Currently Pause Loop Exit (PLE) handler is doing directed yield to a
random VCPU on PL exit. Though we already have filtering while choosing
the candidate to yield_to, we can do better.
Problem is, for large vcpu guests, we have more probability of yielding
to a bad vcpu. We are not able to prev
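The filtering idea being proposed can be sketched as follows. The vcpu fields and the selection rule are illustrative assumptions, not the actual patch: the point is simply that yielding to a vcpu that is already running, or to one that is itself spinning, accomplishes nothing, so a better candidate is a preempted, non-spinning vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified vcpu state for illustration; the real struct kvm_vcpu
 * carries far more. */
struct vcpu {
    int  id;
    bool running;                /* currently scheduled on a physical cpu */
    bool cpu_relax_intercepted;  /* itself spinning in a pause loop */
};

/* Return the first vcpu that is preempted and not spinning, or NULL
 * if there is no useful target (in which case resuming the guest and
 * letting it spin may be the better choice). */
static struct vcpu *pick_yield_candidate(struct vcpu *vcpus, size_t n,
                                         int self_id)
{
    for (size_t i = 0; i < n; i++) {
        struct vcpu *v = &vcpus[i];
        if (v->id == self_id || v->running || v->cpu_relax_intercepted)
            continue;
        return v;  /* preempted and doing real work: likely the lock holder */
    }
    return NULL;
}
```

With a random pick, a large-vcpu guest has a high chance of selecting one of the "bad" targets; the filter above restricts the choice to vcpus where the yielded time can actually help.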