Re: [PATCH 0/9] qspinlock stuff -v15
On 03/30/2015 12:29 PM, Peter Zijlstra wrote:
> On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote:
>> I did it differently in my PV portion of the qspinlock patch. Instead of
>> just waking up the CPU, the new lock holder will check if the new queue
>> head has been halted. If so, it will set the slowpath flag for the halted
>> queue head in the lock so as to wake it up at unlock time. This should
>> eliminate your concern of doing twice as many VMEXITs in an overcommitted
>> scenario.
>
> We can still do that on top of all this, right? As you might have realized,
> I'm a fan of gradual complexity :-)

Of course. I am just saying that the concern can be addressed with some
additional code change.

-Longman
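[For readers unfamiliar with the mechanism being discussed, here is a minimal
userspace sketch of the idea Waiman describes: the new lock holder notices
that the queue head has already halted and marks the lock word, so that the
eventual unlock kicks that waiter directly instead of forcing a second
halt/wake round trip. The value names (_Q_LOCKED_VAL, _Q_SLOW_VAL), the
pv_node structure and pv_kick() are illustrative stand-ins, not symbols from
the posted patches, and races between halting and checking are ignored.]

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define _Q_LOCKED_VAL	1U	/* lock held, fast-path unlock is fine   */
#define _Q_SLOW_VAL	3U	/* lock held, queue head halted: kick it */

struct pv_node {
	atomic_bool halted;	/* set when this waiter's vCPU goes to sleep */
};

static atomic_uint lock_val;	/* models the qspinlock locked byte */

/* New lock holder: the lock is taken, now look at the next queue head. */
static void pv_lock_acquired(struct pv_node *next_head)
{
	atomic_store(&lock_val, _Q_LOCKED_VAL);

	/*
	 * If the next queue head already halted, flag the lock so the
	 * unlock path knows it must wake that waiter.
	 */
	if (next_head && atomic_load(&next_head->halted))
		atomic_store(&lock_val, _Q_SLOW_VAL);
}

/* Stand-in for the hypercall that wakes a halted vCPU. */
static void pv_kick(struct pv_node *node)
{
	atomic_store(&node->halted, false);
	printf("kick queue head\n");
}

static void pv_unlock(struct pv_node *head)
{
	/* Release the lock and learn whether a waiter was halted. */
	unsigned int old = atomic_exchange(&lock_val, 0);

	if (old == _Q_SLOW_VAL)
		pv_kick(head);	/* exactly one wakeup, at unlock time */
}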
Re: [PATCH 0/9] qspinlock stuff -v15
On 03/27/2015 10:07 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote:
>> On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
>>> Ah nice. That could be spun out as a separate patch to optimize the
>>> existing ticket locks, I presume.
>>
>> Yes, I suppose we can do something similar for the ticket and patch in
>> the right increment. We'd need to restructure the code a bit, but it's
>> not fundamentally impossible.
>>
>> We could equally apply the head hashing to the current ticket
>> implementation and avoid the current bitmap iteration.
>>
>>> Now with the old pv ticketlock code a vCPU would only go to sleep once
>>> and be woken up when it was its turn. With this new code it is woken up
>>> twice (and twice it goes to sleep). In an overcommit scenario this would
>>> imply that we will have at least twice as many VMEXITs as with the
>>> previous code.
>>
>> An astute observation, I had not considered that.
>
> Thank you.
>
>>> I presume when you did benchmarking this did not even register? Though I
>>> wonder if it would if you ran the benchmark for a week or so.
>>
>> You presume I benchmarked :-) I managed to boot something virt and run
>> hackbench in it. I wouldn't know a representative virt setup if I ran
>> into it.
>>
>> The thing is, we want this qspinlock for real hardware because it's
>> faster and I really want to avoid having to carry two spinlock
>> implementations -- although I suppose that if we really, really have to,
>> we could.
>
> In some way you already have that - for virtualized environments where
> you don't have a PV mechanism you just use the byte spinlock - which is
> good. And switching to a PV ticketlock implementation after boot.. ugh.
> I feel your pain.
>
> What if you used a PV bytelock implementation? The code you posted
> already 'sprays' all the vCPUs to wake up. And that is exactly what you
> need for PV bytelocks - well, you only need to wake up the vCPUs that
> have gone to sleep waiting on a specific 'struct spinlock' and just
> stash those in a per-cpu area. The old Xen spinlock code (before 3.11?)
> had this.
>
> Just an idea though.

The current code should have just woken up one sleeping vCPU. We shouldn't
want to wake up all of them and have almost all except one go back to
sleep. I think the PV bytelock you suggest is workable. It should also
simplify the implementation. It is just a matter of how much we value the
fairness attribute of the PV ticket or queue spinlock implementation that
we have.

-Longman
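[To make the "stash the lock in a per-cpu area and only kick the vCPUs
parked on it" idea concrete, here is a rough userspace model of that scheme,
roughly what the old Xen PV byte-lock code did. NCPUS, cpu_kick() and the
field names are illustrative assumptions, not code from any of the patches
under discussion.]

#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define NCPUS 16

struct spinlock;	/* opaque here; only its address matters */

/* Per-cpu slot: the lock this vCPU is blocked on, or NULL. */
static _Atomic(struct spinlock *) waiting_on[NCPUS];

/* Stand-in for the hypercall waking a specific vCPU. */
static void cpu_kick(int cpu)
{
	printf("kick vCPU %d\n", cpu);
}

/* Called by a vCPU just before it halts while spinning on @lock. */
static void pv_block_on(int cpu, struct spinlock *lock)
{
	atomic_store(&waiting_on[cpu], lock);
	/* ... halt via hypercall until kicked ... */
	atomic_store(&waiting_on[cpu], (struct spinlock *)NULL);
}

/* Called by the unlocker: wake only the vCPUs parked on this lock. */
static void pv_unlock_kick(struct spinlock *lock)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		if (atomic_load(&waiting_on[cpu]) == lock)
			cpu_kick(cpu);
}

[Waiman's point above is that a queue-based lock already knows the single
queue head, so only that one vCPU needs the kick; the scan over all CPUs is
the price the byte-lock variant would pay instead.]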
Re: [PATCH 0/9] qspinlock stuff -v15
On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote:
> I did it differently in my PV portion of the qspinlock patch. Instead of
> just waking up the CPU, the new lock holder will check if the new queue
> head has been halted. If so, it will set the slowpath flag for the halted
> queue head in the lock so as to wake it up at unlock time. This should
> eliminate your concern of doing twice as many VMEXITs in an overcommitted
> scenario.

We can still do that on top of all this, right? As you might have realized,
I'm a fan of gradual complexity :-)
Re: [PATCH 0/9] qspinlock stuff -v15
On 03/25/2015 03:47 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 16, 2015 at 02:16:13PM +0100, Peter Zijlstra wrote:
>> Hi Waiman,
>>
>> As promised; here is the paravirt stuff I did during the trip to BOS
>> last week.
>>
>> All the !paravirt patches are more or less the same as before (the only
>> real change is the copyright lines in the first patch).
>>
>> The paravirt stuff is 'simple' and KVM only -- the Xen code was a little
>> more convoluted and I've no real way to test that, but it should be
>> straightforward to make work.
>>
>> I ran this using the virtme tool (thanks Andy) on my laptop with a 4x
>> overcommit on vcpus (16 vcpus as compared to the 4 my laptop actually
>> has) and it both booted and survived a hackbench run (perf bench sched
>> messaging -g 20 -l 5000).
>>
>> So while the paravirt code isn't the most optimal code ever conceived,
>> it does work.
>>
>> Also, the paravirt patching includes replacing the call with movb $0,
>> %arg1 for the native case, which should greatly reduce the cost of
>> having CONFIG_PARAVIRT_SPINLOCKS enabled on actual hardware.
>
> Ah nice. That could be spun out as a separate patch to optimize the
> existing ticket locks, I presume.

The goal is to replace the ticket spinlock by the queue spinlock. We may
not want to support two different spinlock implementations in the kernel.

> Now with the old pv ticketlock code a vCPU would only go to sleep once
> and be woken up when it was its turn. With this new code it is woken up
> twice (and twice it goes to sleep). In an overcommit scenario this would
> imply that we will have at least twice as many VMEXITs as with the
> previous code.

I did it differently in my PV portion of the qspinlock patch. Instead of
just waking up the CPU, the new lock holder will check if the new queue
head has been halted. If so, it will set the slowpath flag for the halted
queue head in the lock so as to wake it up at unlock time. This should
eliminate your concern of doing twice as many VMEXITs in an overcommitted
scenario.

BTW, I did some qspinlock vs. ticket spinlock benchmarks using the AIM7
high_systime workload on a 4-socket IvyBridge-EX system (60 cores, 120
threads), with some interesting results.

In terms of the performance benefit of this patch, I ran the high_systime
workload (which does a lot of fork() and exit()) at various load levels
(500, 1000, 1500 and 2000 users) on a 4-socket IvyBridge-EX bare-metal
system (60 cores, 120 threads) with the intel_pstate driver and the
performance scaling governor. The JPM (jobs/minute) and execution time
results were as follows:

  Kernel            JPM          Execution Time
  ------            ---          --------------
  At 500 users:
  3.19              118857.14    26.25s
  3.19-qspinlock    134889.75    23.13s
  % change          +13.5%       -11.9%

  At 1000 users:
  3.19              204255.32    30.55s
  3.19-qspinlock    239631.34    26.04s
  % change          +17.3%       -14.8%

  At 1500 users:
  3.19              177272.73    52.80s
  3.19-qspinlock    326132.40    28.70s
  % change          +84.0%       -45.6%

  At 2000 users:
  3.19              196690.31    63.45s
  3.19-qspinlock    341730.56    36.52s
  % change          +73.7%       -42.4%

It turns out that this workload was causing quite a lot of spinlock
contention in the vanilla 3.19 kernel. The performance advantage of this
patch increases with heavier loads.

With the powersave governor, the JPM data were as follows:

  Users     3.19          3.19-qspinlock    % Change
  -----     ----          --------------    --------
   500      112635.38     132596.69         +17.7%
  1000      171240.40     240369.80         +40.4%
  1500      130507.53     324436.74         +148.6%
  2000      175972.93     341637.01         +94.1%

With the qspinlock patch, there wasn't too much difference in performance
between the two scaling governors. Without this patch, the powersave
governor was much slower than the performance governor.
By disabling the intel_pstate driver and using acpi_cpufreq instead, the
benchmark performance (JPM) at the 1000-user level for the performance and
ondemand governors was:

  Governor       3.19          3.19-qspinlock    % Change
  --------       ----          --------------    --------
  performance    124949.94     219950.65         +76.0%
  ondemand         4838.90     206690.96         +4171%

The performance was just horrible when there was significant spinlock
contention with the ondemand governor. There was also significant
run-to-run variation; a second run of the same benchmark gave a result of
22115 JPMs. With the qspinlock patch, however, the performance was much
more stable under different cpufreq drivers and governors. That is not the
case with the default ticket spinlock implementation.

The %CPU times spent on spinlock contention (from perf) with the
performance governor and the intel_pstate driver were:

  Kernel Function       3.19 kernel    3.19-qspinlock kernel
  ---------------       -----------    ---------------------
Re: [PATCH 0/9] qspinlock stuff -v15
On 03/16/2015 06:46 PM, Peter Zijlstra wrote:
> Hi Waiman,
>
> As promised; here is the paravirt stuff I did during the trip to BOS last
> week.
>
> All the !paravirt patches are more or less the same as before (the only
> real change is the copyright lines in the first patch).
>
> The paravirt stuff is 'simple' and KVM only -- the Xen code was a little
> more convoluted and I've no real way to test that, but it should be
> straightforward to make work.
>
> I ran this using the virtme tool (thanks Andy) on my laptop with a 4x
> overcommit on vcpus (16 vcpus as compared to the 4 my laptop actually
> has) and it both booted and survived a hackbench run (perf bench sched
> messaging -g 20 -l 5000).
>
> So while the paravirt code isn't the most optimal code ever conceived, it
> does work.
>
> Also, the paravirt patching includes replacing the call with movb $0,
> %arg1 for the native case, which should greatly reduce the cost of having
> CONFIG_PARAVIRT_SPINLOCKS enabled on actual hardware.
>
> I feel that if someone were to do a Xen patch we can go ahead and merge
> this stuff (finally!).
>
> These patches do not implement the paravirt spinlock debug stats
> currently implemented (separately) by KVM and Xen, but that should not be
> too hard to do on top and in the 'generic' code -- no reason to duplicate
> all that.
>
> Of course; once this lands people can look at improving the paravirt
> nonsense.

Last time I had reported some hangs in the kvm case, and I can confirm that
the current set of patches works fine. Feel free to add

Tested-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com #kvm pv

As far as performance is concerned (with my 16-core + HT machine having
16-vcpu guests, even with and without the lfsr hash patchset), I do not see
any significant observations to report, though I understand that we could
see much more benefit with a large number of vcpus because of the possible
reduction in cache bouncing.
Re: [PATCH 0/9] qspinlock stuff -v15
On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
>> Ah nice. That could be spun out as a separate patch to optimize the
>> existing ticket locks, I presume.
>
> Yes, I suppose we can do something similar for the ticket and patch in
> the right increment. We'd need to restructure the code a bit, but it's
> not fundamentally impossible.
>
> We could equally apply the head hashing to the current ticket
> implementation and avoid the current bitmap iteration.
>
>> Now with the old pv ticketlock code a vCPU would only go to sleep once
>> and be woken up when it was its turn. With this new code it is woken up
>> twice (and twice it goes to sleep). In an overcommit scenario this would
>> imply that we will have at least twice as many VMEXITs as with the
>> previous code.
>
> An astute observation, I had not considered that.

Thank you.

>> I presume when you did benchmarking this did not even register? Though I
>> wonder if it would if you ran the benchmark for a week or so.
>
> You presume I benchmarked :-) I managed to boot something virt and run
> hackbench in it. I wouldn't know a representative virt setup if I ran
> into it.
>
> The thing is, we want this qspinlock for real hardware because it's
> faster and I really want to avoid having to carry two spinlock
> implementations -- although I suppose that if we really, really have to,
> we could.

In some way you already have that - for virtualized environments where you
don't have a PV mechanism you just use the byte spinlock - which is good.
And switching to a PV ticketlock implementation after boot.. ugh. I feel
your pain.

What if you used a PV bytelock implementation? The code you posted already
'sprays' all the vCPUs to wake up. And that is exactly what you need for PV
bytelocks - well, you only need to wake up the vCPUs that have gone to
sleep waiting on a specific 'struct spinlock' and just stash those in a
per-cpu area. The old Xen spinlock code (before 3.11?) had this.

Just an idea though.
Re: [PATCH 0/9] qspinlock stuff -v15
On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
> Ah nice. That could be spun out as a separate patch to optimize the
> existing ticket locks, I presume.

Yes, I suppose we can do something similar for the ticket and patch in the
right increment. We'd need to restructure the code a bit, but it's not
fundamentally impossible.

We could equally apply the head hashing to the current ticket
implementation and avoid the current bitmap iteration.

> Now with the old pv ticketlock code a vCPU would only go to sleep once
> and be woken up when it was its turn. With this new code it is woken up
> twice (and twice it goes to sleep). In an overcommit scenario this would
> imply that we will have at least twice as many VMEXITs as with the
> previous code.

An astute observation, I had not considered that.

> I presume when you did benchmarking this did not even register? Though I
> wonder if it would if you ran the benchmark for a week or so.

You presume I benchmarked :-) I managed to boot something virt and run
hackbench in it. I wouldn't know a representative virt setup if I ran into
it.

The thing is, we want this qspinlock for real hardware because it's faster
and I really want to avoid having to carry two spinlock implementations --
although I suppose that if we really, really have to, we could.
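[The "head hashing" remark refers to looking the lock up in a small hash
table to find the one waiter to kick, rather than iterating a bitmap of
halted vCPUs at unlock time. Below is a hedged sketch of such a lookup
structure; the table size, the hash function and the pv_hash()/pv_unhash()
names are assumptions for illustration, it assumes the entry is always
found and the table never fills, and the store ordering is simplified
compared to what real kernel code would need.]

#include <stdatomic.h>
#include <stdint.h>

#define HASH_BITS 6
#define HASH_SIZE (1U << HASH_BITS)

struct pv_hash_entry {
	_Atomic(uintptr_t) lock;	/* lock address, 0 if the slot is free */
	int                cpu;		/* halted waiter to kick at unlock     */
};

static struct pv_hash_entry pv_hash_table[HASH_SIZE];

/* Cheap multiplicative hash of the lock address. */
static unsigned int hash_lock(uintptr_t lock)
{
	return (unsigned int)((lock * 0x9E3779B97F4A7C15ULL) >> (64 - HASH_BITS));
}

/* Waiter side: record "cpu is parked on lock" before halting. */
static void pv_hash(void *lockp, int cpu)
{
	uintptr_t lock = (uintptr_t)lockp;
	unsigned int i;

	for (i = hash_lock(lock); ; i = (i + 1) & (HASH_SIZE - 1)) {
		uintptr_t expected = 0;

		/* Claim the first free slot by open addressing. */
		if (atomic_compare_exchange_strong(&pv_hash_table[i].lock,
						   &expected, lock)) {
			pv_hash_table[i].cpu = cpu;	/* real code orders these stores */
			return;
		}
	}
}

/* Unlocker side: one lookup instead of a bitmap scan over all vCPUs. */
static int pv_unhash(void *lockp)
{
	uintptr_t lock = (uintptr_t)lockp;
	unsigned int i;

	for (i = hash_lock(lock); ; i = (i + 1) & (HASH_SIZE - 1)) {
		if (atomic_load(&pv_hash_table[i].lock) == lock) {
			int cpu = pv_hash_table[i].cpu;

			atomic_store(&pv_hash_table[i].lock, 0);
			return cpu;	/* caller kicks this vCPU */
		}
	}
}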
Re: [PATCH 0/9] qspinlock stuff -v15
On Mon, Mar 16, 2015 at 02:16:13PM +0100, Peter Zijlstra wrote:
> Hi Waiman,
>
> As promised; here is the paravirt stuff I did during the trip to BOS last
> week.
>
> All the !paravirt patches are more or less the same as before (the only
> real change is the copyright lines in the first patch).
>
> The paravirt stuff is 'simple' and KVM only -- the Xen code was a little
> more convoluted and I've no real way to test that, but it should be
> straightforward to make work.
>
> I ran this using the virtme tool (thanks Andy) on my laptop with a 4x
> overcommit on vcpus (16 vcpus as compared to the 4 my laptop actually
> has) and it both booted and survived a hackbench run (perf bench sched
> messaging -g 20 -l 5000).
>
> So while the paravirt code isn't the most optimal code ever conceived, it
> does work.
>
> Also, the paravirt patching includes replacing the call with movb $0,
> %arg1 for the native case, which should greatly reduce the cost of having
> CONFIG_PARAVIRT_SPINLOCKS enabled on actual hardware.

Ah nice. That could be spun out as a separate patch to optimize the
existing ticket locks, I presume.

Now with the old pv ticketlock code a vCPU would only go to sleep once and
be woken up when it was its turn. With this new code it is woken up twice
(and twice it goes to sleep). In an overcommit scenario this would imply
that we will have at least twice as many VMEXITs as with the previous code.

I presume when you did benchmarking this did not even register? Though I
wonder if it would if you ran the benchmark for a week or so.

> I feel that if someone were to do a Xen patch we can go ahead and merge
> this stuff (finally!).
>
> These patches do not implement the paravirt spinlock debug stats
> currently implemented (separately) by KVM and Xen, but that should not be
> too hard to do on top and in the 'generic' code -- no reason to duplicate
> all that.
>
> Of course; once this lands people can look at improving the paravirt
> nonsense.
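[The "movb $0, %arg1" remark concerns the native, bare-metal case: the
paravirt unlock call site is patched at boot into a single byte store to
the lock word, with the lock pointer in the first-argument register (%rdi
on x86-64), so CONFIG_PARAVIRT_SPINLOCKS costs native hardware essentially
nothing. The snippet below is a hedged sketch of what that native unlock
amounts to; the struct layout and function name only approximate the posted
patches, and it assumes the locked byte is the first byte of the lock word
(little-endian).]

#include <stdint.h>

struct qspinlock {
	uint32_t val;	/* low byte holds the locked bit */
};

/*
 * Native unlock: a release store of 0 to the locked byte.  With the
 * paravirt patching described above, the call at the unlock site is
 * rewritten into the equivalent "movb $0, (%rdi)" instruction, so no
 * call overhead remains on bare metal.
 */
static inline void native_queued_spin_unlock(struct qspinlock *lock)
{
	__atomic_store_n((uint8_t *)&lock->val, 0, __ATOMIC_RELEASE);
}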
Re: [PATCH 0/9] qspinlock stuff -v15
On 03/16/2015 09:16 AM, Peter Zijlstra wrote:
> Hi Waiman,
>
> As promised; here is the paravirt stuff I did during the trip to BOS last
> week.
>
> All the !paravirt patches are more or less the same as before (the only
> real change is the copyright lines in the first patch).
>
> The paravirt stuff is 'simple' and KVM only -- the Xen code was a little
> more convoluted and I've no real way to test that, but it should be
> straightforward to make work.
>
> I ran this using the virtme tool (thanks Andy) on my laptop with a 4x
> overcommit on vcpus (16 vcpus as compared to the 4 my laptop actually
> has) and it both booted and survived a hackbench run (perf bench sched
> messaging -g 20 -l 5000).
>
> So while the paravirt code isn't the most optimal code ever conceived, it
> does work.
>
> Also, the paravirt patching includes replacing the call with movb $0,
> %arg1 for the native case, which should greatly reduce the cost of having
> CONFIG_PARAVIRT_SPINLOCKS enabled on actual hardware.
>
> I feel that if someone were to do a Xen patch we can go ahead and merge
> this stuff (finally!).
>
> These patches do not implement the paravirt spinlock debug stats
> currently implemented (separately) by KVM and Xen, but that should not be
> too hard to do on top and in the 'generic' code -- no reason to duplicate
> all that.
>
> Of course; once this lands people can look at improving the paravirt
> nonsense.

Thanks for sending this out. I have no problem with the !paravirt patch. I
do have some comments on the paravirt one, which I will reply to
individually.

Cheers,
Longman