[tip: core/rcu] rcu: Update comment from rsp->rcu_gp_seq to rsp->gp_seq

2020-07-31 Thread tip-bot2 for Lihao Liang
The following commit has been merged into the core/rcu branch of tip:

Commit-ID:     360fbbb4897c98971e8955b063c01250817a2191
Gitweb:        https://git.kernel.org/tip/360fbbb4897c98971e8955b063c01250817a2191
Author:        Lihao Liang
AuthorDate:    Thu, 14 May 2020 21:34:34 +01:00
Committer:     Paul E. McKenney
CommitterDate: Mon, 29 Jun 2020 11:58:50 -07:00

rcu: Update comment from rsp->rcu_gp_seq to rsp->gp_seq

Signed-off-by: Lihao Liang 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 9c6f734..575745f 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -41,7 +41,7 @@ struct rcu_node {
    raw_spinlock_t __private lock;  /* Root rcu_node's lock protects */
                                    /*  some rcu_state fields as well as */
                                    /*  following. */
-   unsigned long gp_seq;   /* Track rsp->rcu_gp_seq. */
+   unsigned long gp_seq;   /* Track rsp->gp_seq. */
    unsigned long gp_seq_needed; /* Track furthest future GP request. */
    unsigned long completedqs; /* All QSes done for this node. */
    unsigned long qsmask;   /* CPUs or groups that need to switch in */
@@ -149,7 +149,7 @@ union rcu_noqs {
 /* Per-CPU data for read-copy update. */
 struct rcu_data {
    /* 1) quiescent-state and grace-period handling : */
-   unsigned long   gp_seq; /* Track rsp->rcu_gp_seq counter. */
+   unsigned long   gp_seq; /* Track rsp->gp_seq counter. */
    unsigned long   gp_seq_needed;  /* Track furthest future GP request. */
    union rcu_noqs  cpu_no_qs;  /* No QSes yet for this CPU. */
    bool            core_needs_qs;  /* Core waits for quiesc state. */
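
For readers wondering what "track rsp->gp_seq" amounts to, the following user-space sketch may help. It assumes the two-bit state encoding used by the rcu_seq_*() helpers in kernel/rcu/rcu.h; every model_* name is hypothetical and nothing below is kernel code. The point is only that a node's gp_seq is a copy that may briefly lag the global counter.

```c
#include <assert.h>

/* Assumed to match the encoding in kernel/rcu/rcu.h: the low two bits
 * hold grace-period state, the remaining bits count grace periods. */
#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

struct model_state { unsigned long gp_seq; };	/* hypothetical stand-in for rcu_state */
struct model_node  { unsigned long gp_seq; };	/* hypothetical stand-in for rcu_node */

/* Number of completed grace periods, ignoring the state bits. */
unsigned long model_seq_ctr(unsigned long s)
{
	return s >> RCU_SEQ_CTR_SHIFT;
}

/* Non-zero while a grace period is in progress. */
unsigned long model_seq_state(unsigned long s)
{
	return s & RCU_SEQ_STATE_MASK;
}

/* Start a grace period: the state bits become non-zero. */
void model_gp_start(struct model_state *sp)
{
	sp->gp_seq += 1;
}

/* End a grace period: clear the state bits and bump the counter. */
void model_gp_end(struct model_state *sp)
{
	sp->gp_seq = (sp->gp_seq | RCU_SEQ_STATE_MASK) + 1;
}

/* The node catches up by copying the global value when it notices a change. */
void model_node_sync(struct model_node *np, const struct model_state *sp)
{
	np->gp_seq = sp->gp_seq;
}
```

Between model_gp_start() and model_node_sync() the two gp_seq values differ, which is the lag that the "Track ..." comments describe.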


[PATCH] doc: Update comment from rsp->rcu_gp_seq to rsp->gp_seq

2020-05-14 Thread Lihao Liang
Signed-off-by: Lihao Liang 
---
 kernel/rcu/tree.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 2d7fcb9bdd34..508c46421eb3 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -41,7 +41,7 @@ struct rcu_node {
    raw_spinlock_t __private lock;  /* Root rcu_node's lock protects */
                                    /*  some rcu_state fields as well as */
                                    /*  following. */
-   unsigned long gp_seq;   /* Track rsp->rcu_gp_seq. */
+   unsigned long gp_seq;   /* Track rsp->gp_seq. */
    unsigned long gp_seq_needed; /* Track furthest future GP request. */
    unsigned long completedqs; /* All QSes done for this node. */
    unsigned long qsmask;   /* CPUs or groups that need to switch in */
@@ -149,7 +149,7 @@ union rcu_noqs {
 /* Per-CPU data for read-copy update. */
 struct rcu_data {
    /* 1) quiescent-state and grace-period handling : */
-   unsigned long   gp_seq; /* Track rsp->rcu_gp_seq counter. */
+   unsigned long   gp_seq; /* Track rsp->gp_seq counter. */
    unsigned long   gp_seq_needed;  /* Track furthest future GP request. */
    union rcu_noqs  cpu_no_qs;  /* No QSes yet for this CPU. */
    bool            core_needs_qs;  /* Core waits for quiesc state. */
-- 
2.26.2.761.g0e0b3e54be-goog



Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol

2018-01-27 Thread Lihao Liang
On Sat, Jan 27, 2018 at 7:57 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Sat, Jan 27, 2018 at 07:22:27AM +0000, Lihao Liang wrote:
>> On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
>> <paul...@linux.vnet.ibm.com> wrote:
>> > On Tue, Jan 23, 2018 at 03:59:25PM +0800, liangli...@huawei.com wrote:
>> >> From: Lihao Liang <liangli...@huawei.com>
>> >>
>> >> Dear Paul,
>> >>
>> >> This patch set implements a preemptive version of RCU (PRCU) based on the
>> >> following paper:
>> >>
>> >> Fast Consensus Using Bounded Staleness for Scalable Read-mostly
>> >> Synchronization.
>> >> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> >> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> >> https://dl.acm.org/citation.cfm?id=3024114.3024143
>> >>
>> >> We have also added preliminary callback-handling support.  Thus, the
>> >> current version provides the APIs prcu_read_lock(), prcu_read_unlock(),
>> >> synchronize_prcu(), call_prcu(), and prcu_barrier().
>> >>
>> >> This is an experimental patch, so it would be good to have some feedback.
>> >>
>> >> A known shortcoming is that the grace-period version is incremented only
>> >> in synchronize_prcu().  If call_prcu() or prcu_barrier() is called but
>> >> synchronize_prcu() is never invoked, callbacks cannot be invoked.  A later
>> >> version should address this issue, e.g. by adding a grace-period
>> >> expedition mechanism.  Other possible improvements include using a
>> >> hierarchical structure that takes the NUMA topology into account to send
>> >> IPIs in synchronize_prcu().
>> >>
>> >> We have tested the implementation using rcutorture on both an x86 and an
>> >> ARM64 machine.  PRCU passed 1h and 3h tests on all the newly added config
>> >> files, except that PRCU07 reported a BUG in a 1h run.
>> >>
>> >> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
>> >> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
>> >> [ 1594.73] smpboot: CPU 14 is now offline
>> >> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
>> >> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
>> >> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
>> >> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
>> >> [ 1599.365098] prcu-torture: rtc: b0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
>> >> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
>> >> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
>> >> [ 1599.367946] prcu-torture: !!!
>> >> [ 1599.367966] ------------[ cut here ]------------
>> >
>> > The "rtbe: 1" indicates that your implementation of prcu_barrier()
>> > failed to wait for all preceding call_prcu() callbacks to be invoked.
>> >
>> > Does the immediately following "Reader Pipe:" list have any but the
>> > first two numbers non-zero?
>>
>> Yes.
>
> If the third or subsequent numbers are non-zero, that would indicate
> too-short grace periods.  This would be a critical bug in PRCU.
>
>> >> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to
>> >> true, that is, synchronize_rcu_expedited() was tested.
>> >>
>> >> The rcuperf results are as follows (average grace-period duration in ms
>> >> of ten 10min runs):
>> >>
>> >> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
>> >>
>> >> CPUs      2       4       8      12      15       16
>> >> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
>> >> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
>> >>
>> >> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
>> >>
>> >> CPUs       2       4        8      16      32       48       63        64
>> >> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
>> >> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
>> >
>> > Well, at the very least, this is a bug report on either expedited RCU
>> > grace-period latency or on rcuperf's measurements, and thank you for that.

Re: [PATCH RFC 01/16] prcu: Add PRCU implementation

2018-01-26 Thread Lihao Liang
On Thu, Jan 25, 2018 at 6:16 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:26PM +0800, liangli...@huawei.com wrote:
>> From: Heng Zhang <hen...@huawei.com>
>>
>> This RCU implementation (PRCU) is based on a fast consensus protocol
>> published in the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly 
>> Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>>
>> Signed-off-by: Heng Zhang <hen...@huawei.com>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
>
> A few comments and questions interspersed.
>
> Thanx, Paul
>
>> ---
>>  include/linux/prcu.h |  37 +++
>>  kernel/rcu/Makefile  |   2 +-
>>  kernel/rcu/prcu.c    | 125 +++
>>  kernel/sched/core.c  |   2 +
>>  4 files changed, 165 insertions(+), 1 deletion(-)
>>  create mode 100644 include/linux/prcu.h
>>  create mode 100644 kernel/rcu/prcu.c
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> new file mode 100644
>> index ..653b4633
>> --- /dev/null
>> +++ b/include/linux/prcu.h
>> @@ -0,0 +1,37 @@
>> +#ifndef __LINUX_PRCU_H
>> +#define __LINUX_PRCU_H
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#define CONFIG_PRCU
>> +
>> +struct prcu_local_struct {
>> + unsigned int locked;
>> + unsigned int online;
>> + unsigned long long version;
>> +};
>> +
>> +struct prcu_struct {
>> + atomic64_t global_version;
>> + atomic_t active_ctr;
>> + struct mutex mtx;
>> + wait_queue_head_t wait_q;
>> +};
>> +
>> +#ifdef CONFIG_PRCU
>> +void prcu_read_lock(void);
>> +void prcu_read_unlock(void);
>> +void synchronize_prcu(void);
>> +void prcu_note_context_switch(void);
>> +
>> +#else /* #ifdef CONFIG_PRCU */
>> +
>> +#define prcu_read_lock() do {} while (0)
>> +#define prcu_read_unlock() do {} while (0)
>> +#define synchronize_prcu() do {} while (0)
>> +#define prcu_note_context_switch() do {} while (0)
>
> If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you
> get a build error rather than an error-free but inoperative PRCU?
>

Very good point, thank you!
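
One way to get the behavior Paul describes is to drop the no-op stubs, so that code using PRCU fails to build when PRCU is configured out. A minimal user-space sketch, assuming a hypothetical MODEL_CONFIG_PRCU switch and toy model_* functions in place of the real API (none of these names come from the patch):

```c
#include <assert.h>

/* Hypothetical stand-in for the Kconfig symbol.  Set it to 0 and the
 * reader below no longer builds, instead of silently doing nothing. */
#define MODEL_CONFIG_PRCU 1

#if MODEL_CONFIG_PRCU
int model_prcu_nesting;	/* toy nesting counter, not the real per-CPU state */

void model_prcu_read_lock(void)   { model_prcu_nesting++; }
void model_prcu_read_unlock(void) { model_prcu_nesting--; }
#endif
/* Deliberately no "#else" stubs: with MODEL_CONFIG_PRCU == 0 the names
 * above are undeclared, so any user is rejected at build time. */

/* Example user: returns the nesting depth seen inside the read-side section. */
int model_reader(void)
{
	int depth;

	model_prcu_read_lock();
	depth = model_prcu_nesting;
	model_prcu_read_unlock();
	return depth;
}
```

The trade-off is that every caller must then be guarded by the same configuration option, which is arguably what one wants for a synchronization primitive.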

> Of course, Peter's question about purpose of the patch set applies
> here as well.
>

The main motivation for this patch set is the rcuperf comparison between
PRCU and Tree RCU, in which PRCU outperformed Tree RCU by a large margin.

As you indicated in your reply to the cover letter of this patch series,

[PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol

this may instead point to a bug in either expedited RCU grace-period latency
or rcuperf's measurements.

Many thanks,
Lihao.

>> +
>> +#endif /* #ifdef CONFIG_PRCU */
>> +#endif /* __LINUX_PRCU_H */
>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>> index 23803c7d..8791419c 100644
>> --- a/kernel/rcu/Makefile
>> +++ b/kernel/rcu/Makefile
>> @@ -2,7 +2,7 @@
>>  # and is generally not a function of system call inputs.
>>  KCOV_INSTRUMENT := n
>>
>> -obj-y += update.o sync.o
>> +obj-y += update.o sync.o prcu.o
>>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> new file mode 100644
>> index ..a00b9420
>> --- /dev/null
>> +++ b/kernel/rcu/prcu.c
>> @@ -0,0 +1,125 @@
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include 
>> +
>> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>> +
>> +struct prcu_struct global_prcu = {
>> + .global_version = ATOMIC64_INIT(0),
>> + .active_ctr = ATOMIC_INIT(0),
>> + .mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
>> + .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>> +};
>> +struct prcu_struct *prcu = &global_prcu;
>> +
>> +static inline void prcu_report(struct prcu_local_struct *local)
>> +{
>> + unsigned long long global_version;
>> + unsigned long long local_version;
>> +
>> + global_version = atomic64_read(&prcu->global_version);
>> + local_version = local->version;
>> + if (global_version > local_version)

Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol

2018-01-26 Thread Lihao Liang
On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:25PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> Dear Paul,
>>
>> This patch set implements a preemptive version of RCU (PRCU) based on the
>> following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly
>> Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>>
>> We have also added preliminary callback-handling support.  Thus, the
>> current version provides the APIs prcu_read_lock(), prcu_read_unlock(),
>> synchronize_prcu(), call_prcu(), and prcu_barrier().
>>
>> This is an experimental patch, so it would be good to have some feedback.
>>
>> A known shortcoming is that the grace-period version is incremented only
>> in synchronize_prcu().  If call_prcu() or prcu_barrier() is called but
>> synchronize_prcu() is never invoked, callbacks cannot be invoked.  A later
>> version should address this issue, e.g. by adding a grace-period
>> expedition mechanism.  Other possible improvements include using a
>> hierarchical structure that takes the NUMA topology into account to send
>> IPIs in synchronize_prcu().
>>
>> We have tested the implementation using rcutorture on both an x86 and an
>> ARM64 machine.  PRCU passed 1h and 3h tests on all the newly added config
>> files, except that PRCU07 reported a BUG in a 1h run.
>>
>> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
>> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
>> [ 1594.73] smpboot: CPU 14 is now offline
>> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
>> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
>> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
>> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
>> [ 1599.365098] prcu-torture: rtc: b0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
>> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
>> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
>> [ 1599.367946] prcu-torture: !!!
>> [ 1599.367966] ------------[ cut here ]------------
>
> The "rtbe: 1" indicates that your implementation of prcu_barrier()
> failed to wait for all preceding call_prcu() callbacks to be invoked.
>
> Does the immediately following "Reader Pipe:" list have any but the
> first two numbers non-zero?
>

Yes.
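
For context, the check behind this question can be sketched as below. It encodes only the rule stated in this thread, namely that non-zero counts in the third or later "Reader Pipe:" slots indicate too-short grace periods. The function name and the plain array layout are assumptions made for illustration, not rcutorture code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if any "Reader Pipe:" entry past the
 * first two is non-zero, which this thread treats as evidence of
 * too-short grace periods, and 0 otherwise.
 */
int model_reader_pipe_too_short(const long *pipe, size_t n)
{
	size_t i;

	for (i = 2; i < n; i++) {
		if (pipe[i] != 0)
			return 1;
	}
	return 0;
}
```

A pipe such as {1000, 5, 0, 0} passes this check, while {1000, 5, 1, 0} does not.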

>> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to
>> true, that is, synchronize_rcu_expedited() was tested.
>>
>> The rcuperf results are as follows (average grace-period duration in ms of
>> ten 10min runs):
>>
>> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
>>
>> CPUs      2       4       8      12      15       16
>> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
>> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
>>
>> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
>>
>> CPUs       2       4        8      16      32       48       63        64
>> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
>> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
>
> Well, at the very least, this is a bug report on either expedited RCU
> grace-period latency or on rcuperf's measurements, and thank you for that.
> I will look into this.  In the meantime, could you please let me know
> exactly how you invoked rcuperf?
>

We used the following command to invoke rcuperf:

sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE

The script actually used to run the experiments, run-rcuperf.sh, can be
found in the following patch of this series:

[PATCH RFC 15/16] rcutorture: Add scripts to run experiments

Please let us know how it goes.

Many thanks,
Lihao.
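
To put the quoted rcuperf tables in perspective, here is a small sketch that computes the smallest TREE/PRCU ratio per machine. The durations are copied from the tables above; the function and array names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Average grace-period durations in ms, copied from the quoted tables. */
const double x86_prcu_ms[] = { 0.14, 1.07, 4.15, 8.02, 10.79, 15.16 };
const double x86_tree_ms[] = { 49.30, 104.75, 277.55, 390.82, 620.82, 1381.54 };
const double arm_prcu_ms[] = { 0.23, 19.69, 38.28, 63.21, 95.41, 167.18,
			       252.01, 1841.44 };
const double arm_tree_ms[] = { 416.73, 901.89, 1060.86, 743.00, 920.66,
			       1325.21, 1646.20, 23806.27 };

/* Smallest TREE/PRCU ratio across the columns of one table (n >= 1). */
double model_min_ratio(const double *tree_ms, const double *prcu_ms, size_t n)
{
	double min = tree_ms[0] / prcu_ms[0];
	size_t i;

	for (i = 1; i < n; i++) {
		double r = tree_ms[i] / prcu_ms[i];

		if (r < min)
			min = r;
	}
	return min;
}
```

On these numbers the smallest ratio is a little under 49 on the x86 machine and about 6.5 on the ARM64 machine. Gaps of that size are why Paul treats the TREE figures as a possible bug in expedited grace-period latency or in rcuperf's measurements.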

> I have a few comments on some of your patches based on a quick scan
> through them.
>
> Thanx, Paul
>
>> Best wishes,
>> Lihao.
>>
>>
>> Lihao Liang (15):
>>   rcutorture: Add PRCU rcu_torture_ops
>>   rcutorture: Add PRCU test config files
>>   rcuperf: Add PRCU rcu_perf_ops
>>   rcuperf: Add PRCU test config files
>>   rcuperf: Set gp_exp to true for tests to run
>>   prcu: Implement call_prcu() API

Re: [PATCH RFC 07/16] prcu: Implement call_prcu() API

2018-01-26 Thread Lihao Liang
On Thu, Jan 25, 2018 at 6:20 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:32PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> This is PRCU's counterpart of RCU's call_rcu() API.
>>
>> Reviewed-by: Heng Zhang <hen...@huawei.com>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
>> ---
>>  include/linux/prcu.h | 25
>>  init/main.c          |  2 ++
>>  kernel/rcu/prcu.c    | 67 +---
>>  3 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> index 653b4633..e5e09c9b 100644
>> --- a/include/linux/prcu.h
>> +++ b/include/linux/prcu.h
>> @@ -2,15 +2,36 @@
>>  #define __LINUX_PRCU_H
>>
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>
>>  #define CONFIG_PRCU
>>
>> +struct prcu_version_head {
>> + unsigned long long version;
>> + struct prcu_version_head *next;
>> +};
>> +
>> +/* Simple unsegmented callback list for PRCU. */
>> +struct prcu_cblist {
>> + struct rcu_head *head;
>> + struct rcu_head **tail;
>> + struct prcu_version_head *version_head;
>> + struct prcu_version_head **version_tail;
>> + long len;
>> +};
>> +
>> +#define PRCU_CBLIST_INITIALIZER(n) { \
>> + .head = NULL, .tail = &n.head, \
>> + .version_head = NULL, .version_tail = &n.version_head, \
>> +}
>> +
>>  struct prcu_local_struct {
>>   unsigned int locked;
>>   unsigned int online;
>>   unsigned long long version;
>> + struct prcu_cblist cblist;
>>  };
>>
>>  struct prcu_struct {
>> @@ -24,6 +45,8 @@ struct prcu_struct {
>>  void prcu_read_lock(void);
>>  void prcu_read_unlock(void);
>>  void synchronize_prcu(void);
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func);
>> +void prcu_init(void);
>>  void prcu_note_context_switch(void);
>>
>>  #else /* #ifdef CONFIG_PRCU */
>> @@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
>>  #define prcu_read_lock() do {} while (0)
>>  #define prcu_read_unlock() do {} while (0)
>>  #define synchronize_prcu() do {} while (0)
>> +#define call_prcu() do {} while (0)
>> +#define prcu_init() do {} while (0)
>>  #define prcu_note_context_switch() do {} while (0)
>>
>>  #endif /* #ifdef CONFIG_PRCU */
>> diff --git a/init/main.c b/init/main.c
>> index f8665104..4925964e 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -38,6 +38,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
>>   workqueue_init_early();
>>
>>   rcu_init();
>> + prcu_init();
>>
>>   /* Trace events are available after this */
>>   trace_init();
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> index a00b9420..f198285c 100644
>> --- a/kernel/rcu/prcu.c
>> +++ b/kernel/rcu/prcu.c
>> @@ -1,11 +1,12 @@
>>  #include 
>> -#include 
>>  #include 
>> -#include 
>> +#include 
>>  #include 
>> -
>> +#include 
>>  #include 
>>
>> +#include "rcu.h"
>> +
>>  DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>>
>>  struct prcu_struct global_prcu = {
>> @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
>>  };
>>  struct prcu_struct *prcu = &global_prcu;
>>
>> +/* Initialize simple callback list. */
>> +static void prcu_cblist_init(struct prcu_cblist *rclp)
>> +{
>> + rclp->head = NULL;
>> + rclp->tail = &rclp->head;
>> + rclp->version_head = NULL;
>> + rclp->version_tail = &rclp->version_head;
>> + rclp->len = 0;
>> +}
>> +
>>  static inline void prcu_report(struct prcu_local_struct *local)
>>  {
>>   unsigned long long global_version;
>> @@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
>>   prcu_report(local);
>>   put_cpu_ptr(&prcu_local);
>>  }
>> +
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func)
>> +{
>> + unsigned long flags;
>> + struct prcu_local_struct *local;
>> + struct prcu_cblist *rclp;
>> + struct prcu_version_head *vhp;
>> +
>> + debug_rcu_

Re: [PATCH RFC 07/16] prcu: Implement call_prcu() API

2018-01-26 Thread Lihao Liang
On Thu, Jan 25, 2018 at 6:20 AM, Paul E. McKenney wrote:
> On Tue, Jan 23, 2018 at 03:59:32PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang 
>>
>> This is PRCU's counterpart of RCU's call_rcu() API.
>>
>> Reviewed-by: Heng Zhang 
>> Signed-off-by: Lihao Liang 
>> ---
>>  include/linux/prcu.h | 25 
>>  init/main.c          |  2 ++
>>  kernel/rcu/prcu.c    | 67 +---
>>  3 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> index 653b4633..e5e09c9b 100644
>> --- a/include/linux/prcu.h
>> +++ b/include/linux/prcu.h
>> @@ -2,15 +2,36 @@
>>  #define __LINUX_PRCU_H
>>
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>
>>  #define CONFIG_PRCU
>>
>> +struct prcu_version_head {
>> + unsigned long long version;
>> + struct prcu_version_head *next;
>> +};
>> +
>> +/* Simple unsegmented callback list for PRCU. */
>> +struct prcu_cblist {
>> + struct rcu_head *head;
>> + struct rcu_head **tail;
>> + struct prcu_version_head *version_head;
>> + struct prcu_version_head **version_tail;
>> + long len;
>> +};
>> +
>> +#define PRCU_CBLIST_INITIALIZER(n) { \
>> + .head = NULL, .tail = &n.head, \
>> + .version_head = NULL, .version_tail = &n.version_head, \
>> +}
>> +
>>  struct prcu_local_struct {
>>   unsigned int locked;
>>   unsigned int online;
>>   unsigned long long version;
>> + struct prcu_cblist cblist;
>>  };
>>
>>  struct prcu_struct {
>> @@ -24,6 +45,8 @@ struct prcu_struct {
>>  void prcu_read_lock(void);
>>  void prcu_read_unlock(void);
>>  void synchronize_prcu(void);
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func);
>> +void prcu_init(void);
>>  void prcu_note_context_switch(void);
>>
>>  #else /* #ifdef CONFIG_PRCU */
>> @@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
>>  #define prcu_read_lock() do {} while (0)
>>  #define prcu_read_unlock() do {} while (0)
>>  #define synchronize_prcu() do {} while (0)
>> +#define call_prcu() do {} while (0)
>> +#define prcu_init() do {} while (0)
>>  #define prcu_note_context_switch() do {} while (0)
>>
>>  #endif /* #ifdef CONFIG_PRCU */
>> diff --git a/init/main.c b/init/main.c
>> index f8665104..4925964e 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -38,6 +38,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
>>   workqueue_init_early();
>>
>>   rcu_init();
>> + prcu_init();
>>
>>   /* Trace events are available after this */
>>   trace_init();
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> index a00b9420..f198285c 100644
>> --- a/kernel/rcu/prcu.c
>> +++ b/kernel/rcu/prcu.c
>> @@ -1,11 +1,12 @@
>>  #include 
>> -#include 
>>  #include 
>> -#include 
>> +#include 
>>  #include 
>> -
>> +#include 
>>  #include 
>>
>> +#include "rcu.h"
>> +
>>  DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>>
>>  struct prcu_struct global_prcu = {
>> @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
>>  };
>>  struct prcu_struct *prcu = &global_prcu;
>>
>> +/* Initialize simple callback list. */
>> +static void prcu_cblist_init(struct prcu_cblist *rclp)
>> +{
>> + rclp->head = NULL;
>> + rclp->tail = &rclp->head;
>> + rclp->version_head = NULL;
>> + rclp->version_tail = &rclp->version_head;
>> + rclp->len = 0;
>> +}
>> +
>>  static inline void prcu_report(struct prcu_local_struct *local)
>>  {
>>   unsigned long long global_version;
>> @@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
>>   prcu_report(local);
>>   put_cpu_ptr(&prcu_local);
>>  }
>> +
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func)
>> +{
>> + unsigned long flags;
>> + struct prcu_local_struct *local;
>> + struct prcu_cblist *rclp;
>> + struct prcu_version_head *vhp;
>> +
>> + debug_rcu_head_queue(head);
>> +
>> + /* Use GFP_ATOMIC with IRQs disabled */
>> + vhp = kmalloc(si

Re: [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run

2018-01-26 Thread Lihao Liang
On Thu, Jan 25, 2018 at 6:18 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:31PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
>> ---
>>  kernel/rcu/rcuperf.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
>> index ea80fa3e..baccc123 100644
>> --- a/kernel/rcu/rcuperf.c
>> +++ b/kernel/rcu/rcuperf.c
>> @@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney <paul...@linux.vnet.ibm.com>");
>>  #define VERBOSE_PERFOUT_ERRSTRING(s) \
>>   do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
>>
>> -torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
>> +torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
>
> This is fine as a convenience for internal testing, but the usual way
> to make this happen is using the rcuperf.gp_exp kernel boot parameter.
> Or was that not working for you?
>

Sure. It should work if rcuperf.gp_exp=1 is added to the .boot files
(it wouldn't work if rcuperf.gp_exp=false is used).

Thanks,
Lihao.

> Thanx, Paul
>
>>  torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
>>  torture_param(int, nreaders, -1, "Number of RCU reader threads");
>>  torture_param(int, nwriters, -1, "Number of RCU updater threads");
>> --
>> 2.14.1.729.g59c0ea183
>>
>
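
For reference, the boot-parameter route discussed above could be scripted roughly as follows. This is only a sketch: the helper name add_boot_param and the scratch file name are hypothetical and are not part of the rcutorture scripts; only the parameter string rcuperf.gp_exp=1 comes from the thread.

```bash
# Hypothetical helper (not part of rcutorture): append a kernel boot
# parameter to a scenario's .boot file unless it is already listed.
add_boot_param () {
	local bootfile="$1"
	local param="$2"

	touch "$bootfile"
	# -F: fixed string, -w: whole word, -q: quiet.
	if ! grep -qwF -- "$param" "$bootfile"
	then
		echo "$param" >> "$bootfile"
	fi
}

# Example with a scratch file; the second call leaves the file unchanged.
bootfile="${TMPDIR:-/tmp}/demo.$$.boot"
add_boot_param "$bootfile" rcuperf.gp_exp=1
add_boot_param "$bootfile" rcuperf.gp_exp=1
cat "$bootfile"
rm -f "$bootfile"
```

Checking for the parameter first keeps the file from accumulating duplicate entries when the helper is run repeatedly.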


Re: [PATCH RFC 01/16] prcu: Add PRCU implementation

2018-01-24 Thread Lihao Liang
Dear Peter,

Many thanks for your comments. I will provide a proper changelog.
Alternatively, the paper can be found at

http://ipads.se.sjtu.edu.cn/lib/exe/fetch.php?media=publications:consensus-tpds16.pdf

Best,
Lihao.

On Wed, Jan 24, 2018 at 11:26 AM, Peter Zijlstra wrote:
> On Tue, Jan 23, 2018 at 03:59:26PM +0800, liangli...@huawei.com wrote:
>> From: Heng Zhang 
>>
>> This RCU implementation (PRCU) is based on a fast consensus protocol
>> published in the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly 
>> Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>
> That's an utterly useless changelog for something like a new RCU
> implementation.
>
> You fail to describe why you're proposing a new RCU implementation; what
> problems does it fix?, how is it better?
>
> All you provide is a paywalled link to some paper that we can't read.
>
> Please write a real changelog that describes things properly and
> provide, if at all possible, a readily accessible link to your paper.


Re: [PATCH v3] rcutorture: Add basic ARM64 support to run scripts

2018-01-12 Thread Lihao Liang


On 2018/1/13 1:52, Paul E. McKenney wrote:
> On Fri, Jan 12, 2018 at 06:11:32PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> This commit adds support for the qemu command qemu-system-aarch64
>> to rcutorture.
>>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
> 
> This is to replace your previous patch, not to apply on top of it,
> correct?  (Either way is fine, just please let me know.)
> 

Please replace the previous one.

Thanks,
Lihao.

>   Thanx, Paul
> 
>> ---
>>
>> Compared with the previous version, this patch lifts the 8-CPU limit
>> of the "-M virt" option by adding "gic-version=host" to it. This
>> allows qemu to use the maximum number of CPUs supported by the actual
>> hardware.
>>
>> This commit is against RCU's git branch rcu/dev
>>
>> commit 505b61b2ec1d ("EXP: rcu: Add debugging info to other assertion")
>>
>>
>>  tools/testing/selftests/rcutorture/bin/functions.sh | 17 +++--
>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
>> index 07a1377..65f6655 100644
>> --- a/tools/testing/selftests/rcutorture/bin/functions.sh
>> +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
>> @@ -136,6 +136,9 @@ identify_boot_image () {
>>  qemu-system-x86_64|qemu-system-i386)
>>  echo arch/x86/boot/bzImage
>>  ;;
>> +qemu-system-aarch64)
>> +echo arch/arm64/boot/Image
>> +;;
>>  *)
>>  echo vmlinux
>>  ;;
>> @@ -158,6 +161,9 @@ identify_qemu () {
>>  elif echo $u | grep -q "Intel 80386"
>>  then
>>  echo qemu-system-i386
>> +elif echo $u | grep -q aarch64
>> +then
>> +echo qemu-system-aarch64
>>  elif uname -a | grep -q ppc64
>>  then
>>  echo qemu-system-ppc64
>> @@ -176,16 +182,20 @@ identify_qemu () {
>>  # Output arguments for the qemu "-append" string based on CPU type
>>  # and the TORTURE_QEMU_INTERACTIVE environment variable.
>>  identify_qemu_append () {
>> +local console=ttyS0
>>  case "$1" in
>>  qemu-system-x86_64|qemu-system-i386)
>>  echo noapic selinux=0 initcall_debug debug
>>  ;;
>> +qemu-system-aarch64)
>> +console=ttyAMA0
>> +;;
>>  esac
>>  if test -n "$TORTURE_QEMU_INTERACTIVE"
>>  then
>>  echo root=/dev/sda
>>  else
>> -echo console=ttyS0
>> +echo console=$console
>>  fi
>>  }
>>
>> @@ -197,6 +207,9 @@ identify_qemu_args () {
>>  case "$1" in
>>  qemu-system-x86_64|qemu-system-i386)
>>  ;;
>> +qemu-system-aarch64)
>> +echo -machine virt,gic-version=host -cpu host
>> +;;
>>  qemu-system-ppc64)
>>  echo -enable-kvm -M pseries -nodefaults
>>  echo -device spapr-vscsi
>> @@ -254,7 +267,7 @@ specify_qemu_cpus () {
>>  echo $2
>>  else
>>  case "$1" in
>> -qemu-system-x86_64|qemu-system-i386)
>> +qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
>>  echo $2 -smp $3
>>  ;;
>>  qemu-system-ppc64)
>> -- 
>> 2.7.4
>>
> 
> 
> .
> 
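
As an illustration of the identify_qemu_append hunk above, the function can be pulled out and run on its own. The body below is taken from the patch, with indentation restored by hand (the archive flattened it); the three calls at the end are added here as examples only.

```bash
# identify_qemu_append as it reads with the patch applied.
# $1 is the qemu command name.
identify_qemu_append () {
	local console=ttyS0
	case "$1" in
	qemu-system-x86_64|qemu-system-i386)
		echo noapic selinux=0 initcall_debug debug
		;;
	qemu-system-aarch64)
		console=ttyAMA0
		;;
	esac
	if test -n "$TORTURE_QEMU_INTERACTIVE"
	then
		echo root=/dev/sda
	else
		echo console=$console
	fi
}

TORTURE_QEMU_INTERACTIVE=
identify_qemu_append qemu-system-aarch64   # prints console=ttyAMA0
identify_qemu_append qemu-system-ppc64     # prints console=ttyS0
TORTURE_QEMU_INTERACTIVE=1
identify_qemu_append qemu-system-aarch64   # prints root=/dev/sda
```

Keeping the console name in a local variable means a further architecture only needs one more case arm, not another nested case in the else branch.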



Re: [PATCH v2] rcutorture: Add basic ARM64 support to run scripts

2018-01-12 Thread Lihao Liang
Hi Paul,

On 2017/12/19 7:31, Paul E. McKenney wrote:
> On Tue, Dec 12, 2017 at 05:19:25PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> This commit adds support for the qemu command qemu-system-aarch64
>> to rcutorture.
>>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
> 
> Queued for further review and testing, thank you!
> 
> (This one has been on my list for quite some time.)
> 
>   Thanx, Paul
> 
>> ---
>> This commit is against RCU's git tree rcu/dev branch
>>
>> commit 505b61b2ec1d ("EXP: rcu: Add debugging info to other assertion")
>>
>> Note that the max CPUs supported by qemu machine 'virt' is 8 so the value of
>> CONFIG_NR_CPUS in some test configuration files needs to be adjusted.
>>
>>  tools/testing/selftests/rcutorture/bin/functions.sh | 17 +++--
>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
>> index 07a1377..0541d10 100644
>> --- a/tools/testing/selftests/rcutorture/bin/functions.sh
>> +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
>> @@ -136,6 +136,9 @@ identify_boot_image () {
>>  qemu-system-x86_64|qemu-system-i386)
>>  echo arch/x86/boot/bzImage
>>  ;;
>> +qemu-system-aarch64)
>> +echo arch/arm64/boot/Image
>> +;;
>>  *)
>>  echo vmlinux
>>  ;;
>> @@ -158,6 +161,9 @@ identify_qemu () {
>>  elif echo $u | grep -q "Intel 80386"
>>  then
>>  echo qemu-system-i386
>> +elif echo $u | grep -q aarch64
>> +then
>> +echo qemu-system-aarch64
>>  elif uname -a | grep -q ppc64
>>  then
>>  echo qemu-system-ppc64
>> @@ -176,16 +182,20 @@ identify_qemu () {
>>  # Output arguments for the qemu "-append" string based on CPU type
>>  # and the TORTURE_QEMU_INTERACTIVE environment variable.
>>  identify_qemu_append () {
>> +local console=ttyS0
>>  case "$1" in
>>  qemu-system-x86_64|qemu-system-i386)
>>  echo noapic selinux=0 initcall_debug debug
>>  ;;
>> +qemu-system-aarch64)
>> +console=ttyAMA0
>> +;;
>>  esac
>>  if test -n "$TORTURE_QEMU_INTERACTIVE"
>>  then
>>  echo root=/dev/sda
>>  else
>> -echo console=ttyS0
>> +echo console=$console
>>  fi
>>  }
>>
>> @@ -197,6 +207,9 @@ identify_qemu_args () {
>>  case "$1" in
>>  qemu-system-x86_64|qemu-system-i386)
>>  ;;
>> +qemu-system-aarch64)
>> +echo -M virt -cpu host

The qemu option "-M virt" supports at most 8 CPUs. We can lift this
limitation by adding "gic-version=host" to it, which allows qemu to use the
maximum number of CPUs supported by the actual hardware.

I have sent you a new version in a separate email.

Best,
Lihao.

>> +;;
>>  qemu-system-ppc64)
>>  echo -enable-kvm -M pseries -nodefaults
>>  echo -device spapr-vscsi
>> @@ -254,7 +267,7 @@ specify_qemu_cpus () {
>>  echo $2
>>  else
>>  case "$1" in
>> -qemu-system-x86_64|qemu-system-i386)
>> +qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
>>  echo $2 -smp $3
>>  ;;
>>  qemu-system-ppc64)
>> -- 
>> 2.7.4
>>
> 
> 
> .
> 
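
To make the resulting option string concrete, here is a small sketch that combines the aarch64 machine options from the newer patch with an explicit CPU count. The helper name aarch64_qemu_args is hypothetical; the real scripts split this work between identify_qemu_args and specify_qemu_cpus.

```bash
# Hypothetical helper: emit the aarch64 machine options discussed above
# followed by an -smp argument. $1 is the requested CPU count.
aarch64_qemu_args () {
	local ncpus="$1"

	echo -machine virt,gic-version=host -cpu host -smp "$ncpus"
}

aarch64_qemu_args 16   # prints -machine virt,gic-version=host -cpu host -smp 16
```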



Re: [PATCH] rcutorture: Add basic ARM64 support to run scripts

2017-12-12 Thread Lihao Liang
Hi Paul,

Many thanks for your helpful comments! I have addressed all of them in a new 
version of the patch, which is sent out in a separate email.

If you have further comments, please let me know.

Best regards,
Lihao.

On 2017/12/12 0:32, Paul E. McKenney wrote:
> On Fri, Dec 08, 2017 at 06:13:43PM +0800, liangli...@huawei.com wrote:
>> From: Lihao Liang <liangli...@huawei.com>
>>
>> This commit adds support for the qemu command qemu-system-aarch64
>> to rcutorture. Use the following command to run:
>>
>>   ./kvm.sh --qemu-cmd qemu-system-aarch64
>>
>> Signed-off-by: Lihao Liang <liangli...@huawei.com>
> 
> Nice!!!  Getting ARM support for rcutorture has been on my todo list
> for some time!
> 
> A few questions and comments below.
> 
> Feedback from ARM experts also welcome!
> 
>   Thanx, Paul
> 
>> ---
>>
>> The max CPUs supported by qemu machine 'virt' is 8 so the value of
>> CONFIG_NR_CPUS in some test configuration files needs to be adjusted.
>>
>>  tools/testing/selftests/rcutorture/bin/functions.sh | 18 +-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
>> index 07a1377..5ffe4fe 100644
>> --- a/tools/testing/selftests/rcutorture/bin/functions.sh
>> +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
>> @@ -136,6 +136,9 @@ identify_boot_image () {
>>  qemu-system-x86_64|qemu-system-i386)
>>  echo arch/x86/boot/bzImage
>>  ;;
>> +qemu-system-aarch64)
>> +echo arch/arm64/boot/Image
>> +;;
>>  *)
>>  echo vmlinux
>>  ;;
> 
> Is it possible to automatically select ARM based on the kernel binary?
> See the identify_qemu function for how this is done for i386, x86_64,
> and PowerPC.  Can an "elif" be added for ARM?
> 
>> @@ -185,7 +188,14 @@ identify_qemu_append () {
>>  then
>>  echo root=/dev/sda
>>  else
>> -echo console=ttyS0
>> +case "$1" in
>> +qemu-system-aarch64)
>> +echo console=ttyAMA0
>> +;;
>> +*)
>> +echo console=ttyS0
>> +;;
>> +esac
>>  fi
>>  }
> 
> This approach is going to result in very ugly nesting if support is
> added for additional CPU families.  How about something like this?
> 
> identify_qemu_append () {
>   local console=ttyS0
> 
>   case "$1" in
>   qemu-system-x86_64|qemu-system-i386)
>   echo noapic selinux=0 initcall_debug debug
>   ;;
>   qemu-system-aarch64)
>   console=ttyAMA0
>   ;;
>   esac
>   if test -n "$TORTURE_QEMU_INTERACTIVE"
>   then
>   echo root=/dev/sda
>   else
>   echo console=$console
>   fi
> }
> 
>> @@ -197,6 +207,9 @@ identify_qemu_args () {
>>  case "$1" in
>>  qemu-system-x86_64|qemu-system-i386)
>>  ;;
>> +qemu-system-aarch64)
>> +echo -M virt -cpu host
>> +;;
>>  qemu-system-ppc64)
>>  echo -enable-kvm -M pseries -nodefaults
>>  echo -device spapr-vscsi
>> @@ -257,6 +270,9 @@ specify_qemu_cpus () {
>>  qemu-system-x86_64|qemu-system-i386)
> 
> How about the following instead, eliminating the need for an additional
> case?
> 
>   qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
> 
>>  echo $2 -smp $3
>>  ;;
>> +qemu-system-aarch64)
>> +echo $2 -smp $3
>> +;;
>>  qemu-system-ppc64)
>>  nt="`lscpu | grep '^NUMA node0' | sed -e 's/^[^,]*,\([0-9]*\),.*$/\1/'`"
>>  echo $2 -smp cores=`expr \( $3 + $nt - 1 \) / $nt`,threads=$nt
>> -- 
>> 2.7.4
>>
> 
> 
> .
> 
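
The combined case pattern suggested in the review can be tried out in isolation. The function below is a simplified sketch, not the real specify_qemu_cpus: the name sketch_qemu_cpus is hypothetical, and the ppc64 branch and the check for an existing -smp argument are left out.

```bash
# Simplified sketch of the -smp selection.
# $1: qemu command name, $2: qemu arguments so far, $3: CPU count.
sketch_qemu_cpus () {
	case "$1" in
	qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
		# One arm covers all three commands; "|" separates alternatives.
		echo $2 -smp $3
		;;
	*)
		echo $2
		;;
	esac
}

sketch_qemu_cpus qemu-system-aarch64 "-m 512" 4   # prints -m 512 -smp 4
```

Because the aarch64 action is identical to the x86 one, adding it as a third alternative avoids a duplicated case arm.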



Re: linux-next: Signed-off-by missing for commit in the rcu tree

2017-11-28 Thread Lihao Liang


On 2017/11/29 11:14, Hanjun Guo wrote:
> Hi Lihao,
> 
> On 2017/11/29 10:48, Lihao Liang wrote:
>> Hi Paul,
>>
>> Signed-off-by: Lihao Liang <lihao.li...@gmail.com>
> 
> ...
> 
>>
>> Many thanks,
>> Lihao.
>>
>> On 2017/11/29 9:14, Paul E. McKenney wrote:
>>> On Wed, Nov 29, 2017 at 11:51:51AM +1100, Stephen Rothwell wrote:
>>>> Hi Paul,
>>>>
>>>> Commit
>>>>
>>>>   d7e182c9c324 ("rcu: Remove unnecessary spinlock in rcu_boot_init_percpu_data()")
>>>>
>>>> is missing a Signed-off-by from its author.
>>> Good catch, Stephen!
>>>
>>> Lihao, would you please get me your Signed-off-by?  The patch is below.
>>>
>>> Thanx, Paul
>>>
>>> --------
>>>
>>> commit d7e182c9c32480c1f579dd888ac50e88bfb39596
>>> Author: Liang Lihao <liangli...@huawei.com>
> 
> So it's better to keep the author and Signed-off-by the same :)
> 

Sure. Paul and Stephen, please use

Signed-off-by: Lihao Liang <liangli...@huawei.com>

Many thanks,
Lihao.

> Thanks
> Hanjun
> 
> 
> .
> 



Re: linux-next: Signed-off-by missing for commit in the rcu tree

2017-11-28 Thread Lihao Liang
Hi Paul,

Signed-off-by: Lihao Liang <lihao.li...@gmail.com>

Many thanks,
Lihao.

On 2017/11/29 9:14, Paul E. McKenney wrote:
> On Wed, Nov 29, 2017 at 11:51:51AM +1100, Stephen Rothwell wrote:
>> Hi Paul,
>>
>> Commit
>>
>>   d7e182c9c324 ("rcu: Remove unnecessary spinlock in rcu_boot_init_percpu_data()")
>>
>> is missing a Signed-off-by from its author.
> 
> Good catch, Stephen!
> 
> Lihao, would you please get me your Signed-off-by?  The patch is below.
> 
>   Thanx, Paul
> 
> 
> 
> commit d7e182c9c32480c1f579dd888ac50e88bfb39596
> Author: Liang Lihao <liangli...@huawei.com>
> Date:   Wed Nov 22 19:00:55 2017 +
> 
> rcu: Remove unnecessary spinlock in rcu_boot_init_percpu_data()
> 
> Since rcu_boot_init_percpu_data() is only called at boot time,
> there is no data race and spinlock is not needed.
> 
> Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 69722817d6d6..0abe1db53d70 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3636,12 +3636,9 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
>  static void __init
>  rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
>  {
> - unsigned long flags;
>   struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
> - struct rcu_node *rnp = rcu_get_root(rsp);
>  
>   /* Set up local state, ensuring consistent view of global state. */
> - raw_spin_lock_irqsave_rcu_node(rnp, flags);
>   rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
>   rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
>   WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1);
> @@ -3649,7 +3646,6 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
>   rdp->cpu = cpu;
>   rdp->rsp = rsp;
>   rcu_boot_init_nocb_percpu_data(rdp);
> - raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>  }
>  
>  /*
> 
> 
> 


