Longman,
We further optimized the kernel spinlock in the attached ali-spin-lock.patch,
based on kernel 4.3.0-rc4.
We ran thread.c in user space with the kernel patch (ali-spin-lock.patch) on an E5-2699 v3
and compared it against the original spinlock:
The printed data indicates the performance in the critical path is
On 11/30/2015 01:17 AM, Ling Ma wrote:
Any comments? Is the patch acceptable?
Thanks
Ling
Ling,
The core idea of your current patch hasn't changed from your previous
patch.
My comment is that you should not attempt to sell it as a replacement
for the current spinlock mechanism. I just
Any comments? Is the patch acceptable?
Thanks
Ling
2015-11-26 17:00 GMT+08:00 Ling Ma :
> Running thread.c with a clean kernel 4.3.0-rc4, perf top -G also indicates
> that the cache_flusharray and cache_alloc_refill functions spend 25.6% of
> their time in queued_spin_lock_slowpath in total. It means the compared data
Running thread.c with a clean kernel 4.3.0-rc4, perf top -G also indicates
that the cache_flusharray and cache_alloc_refill functions spend 25.6% of
their time in queued_spin_lock_slowpath in total, which means the compared data
from our spinlock-test.patch are reliable.
Thanks
Ling
2015-11-26 11:49 GMT+08:00 Ling Ma :
>
Hi Longman,
All the compared data comes from the operation below in spinlock-test.patch:
+#if ORG_QUEUED_SPINLOCK
+	org_queued_spin_lock((struct qspinlock *)&n->list_lock);
+	refill_fn();
+	org_queued_spin_unlock((struct qspinlock *)&n->list_lock);
+#else
+	new_spin_lock((struct
On 11/23/2015 04:41 AM, Ling Ma wrote:
> Hi Longman,
>
> Attachments include the user space application thread.c and the kernel patch
> spinlock-test.patch, based on kernel 4.3.0-rc4.
>
> We run thread.c with the kernel patch, testing the original and the new
> spinlock respectively;
> perf top -G indicates thread.c
Any comments about it?
Thanks
Ling
2015-11-23 17:41 GMT+08:00 Ling Ma :
> Hi Longman,
>
> Attachments include the user space application thread.c and the kernel patch
> spinlock-test.patch, based on kernel 4.3.0-rc4.
>
> We run thread.c with the kernel patch, testing the original and the new
> spinlock respectively,
>
Hi Longman,
Attachments include the user space application thread.c and the kernel patch
spinlock-test.patch, based on kernel 4.3.0-rc4.
We run thread.c with the kernel patch, testing the original and the new
spinlock respectively;
perf top -G indicates that thread.c causes the cache_alloc_refill and
cache_flusharray functions to
On 11/05/2015 11:28 PM, Ling Ma wrote:
Longman,
Thanks for your suggestion.
We will look for a real scenario to test. Could you please recommend
some benchmarks for spinlocks?
Regards
Ling
The kernel has been so well optimized for most common workloads that
spinlock contention is usually not
Longman,
Thanks for your suggestion.
We will look for a real scenario to test. Could you please recommend
some benchmarks for spinlocks?
Regards
Ling
>
> Your new spinlock code completely changes the API and the semantics of the
> existing spinlock calls. That requires changes to thousands of
Hi All,
(sent again for linux-kernel@vger.kernel.org)
Spinlocks cause cache line ping-pong between cores,
so we have to spend a lot of time obtaining serialized execution.
However, if we hand the serialized work to one core,
it saves us much of that time.
In the attachment we changed code based on