Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-02-02 Thread Emilio G. Cota
Just noticed the message above mistakenly sat in my outbox for
nearly 2 months. Just flushed it, so do not be surprised by
its original date.

Sorry for the noise,

Emilio



Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-02-02 Thread Emilio G. Cota
On Tue, Nov 29, 2016 at 12:46:59 +0100, Paolo Bonzini wrote:
> A QemuLockCnt comprises a counter and a mutex, with primitives
> to increment and decrement the counter, and to take and release the
> mutex.  It can be used to do lock-free visits to a data structure
> whenever mutexes would be too heavy-weight and the critical section
> is too long for RCU.
> 
> This could be implemented simply by protecting the counter with the
> mutex, but QemuLockCnt is harder to misuse and more efficient.
> 
> Signed-off-by: Paolo Bonzini 
(snip)
> +++ b/docs/lockcnt.txt
> @@ -0,0 +1,343 @@
> +DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
> +===
(snip)
> +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt);
> +
> +If the count is 1, decrement the count to zero, lock
> +the mutex and return true.  Otherwise, return false.
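For illustration (a sketch, not code from the patch; "node" and its list are
hypothetical), a visitor that finds a deleted node might use this primitive
roughly as follows:

    /* We are inside a visit, i.e. we already hold one reference. */
    if (qemu_lockcnt_dec_if_lock(lockcnt)) {
        /* We were the only visitor: the count is zero and the mutex is
         * held, so it is safe to unlink and free the node.
         */
        QLIST_REMOVE(node, next);
        g_free(node);
        /* Restore our reference and drop the mutex to resume the visit. */
        qemu_lockcnt_inc_and_unlock(lockcnt);
    }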
> +
(snip)
> +++ b/util/lockcnt.c
(snip)
> +void qemu_lockcnt_init(QemuLockCnt *lockcnt)
> +{
> +    qemu_mutex_init(&lockcnt->mutex);
> +    lockcnt->count = 0;

Minor nit: a release barrier here wouldn't harm.
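For example (a sketch only, not part of the patch), something along these lines:

    lockcnt->count = 0;
    smp_wmb();   /* or an equivalent release barrier, so the zeroed counter
                  * is published before the lockcnt can be observed */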

> +}
> +
> +void qemu_lockcnt_destroy(QemuLockCnt *lockcnt)
> +{
> +    qemu_mutex_destroy(&lockcnt->mutex);
> +}
> +
> +void qemu_lockcnt_inc(QemuLockCnt *lockcnt)
> +{
> +    int old;
> +    for (;;) {
> +        old = atomic_read(&lockcnt->count);
> +        if (old == 0) {
> +            qemu_lockcnt_lock(lockcnt);
> +            qemu_lockcnt_inc_and_unlock(lockcnt);
> +            return;
> +        } else {
> +            if (atomic_cmpxchg(&lockcnt->count, old, old + 1) == old) {
> +                return;
> +            }
> +        }
> +    }
> +}
> +
> +void qemu_lockcnt_dec(QemuLockCnt *lockcnt)
> +{
> +    atomic_dec(&lockcnt->count);
> +}
> +
> +/* Decrement a counter, and return locked if it is decremented to zero.
> + * It is impossible for the counter to become nonzero while the mutex
> + * is taken.
> + */
> +bool qemu_lockcnt_dec_and_lock(QemuLockCnt *lockcnt)
> +{
> +    int val = atomic_read(&lockcnt->count);
> +    while (val > 1) {
> +        int old = atomic_cmpxchg(&lockcnt->count, val, val - 1);
> +        if (old != val) {
> +            val = old;
> +            continue;
> +        }
> +
> +        return false;
> +    }

Minor nit:
    while (val > 1) {
        int old = cmpxchg();
        if (old == val) {
            return false;
        }
        val = old;
    }
seems to me a little easier to read.

> +    qemu_lockcnt_lock(lockcnt);
> +    if (atomic_fetch_dec(&lockcnt->count) == 1) {
> +        return true;
> +    }
> +
> +    qemu_lockcnt_unlock(lockcnt);
> +    return false;
> +}
> +
> +/* Decrement a counter and return locked if it is decremented to zero.
> + * Otherwise do nothing.

This comment doesn't match the code below nor the description in the
.txt file (quoted above) [we might not decrement the counter at all!]

> + *
> + * It is impossible for the counter to become nonzero while the mutex
> + * is taken.
> + */
> +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt)
> +{
> +    int val = atomic_mb_read(&lockcnt->count);
> +    if (val > 1) {
> +        return false;
> +    }
> +
> +    qemu_lockcnt_lock(lockcnt);
> +    if (atomic_fetch_dec(&lockcnt->count) == 1) {
> +        return true;
> +    }
> +
> +    qemu_lockcnt_inc_and_unlock(lockcnt);

The choice of dec+(maybe)inc over cmpxchg seems a little odd to me.
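For concreteness, a cmpxchg-based variant could look roughly like the sketch
below (assuming only the 1 -> 0 transition matters once the mutex is held;
this is not code from the patch):

    qemu_lockcnt_lock(lockcnt);
    if (atomic_cmpxchg(&lockcnt->count, 1, 0) == 1) {
        return true;
    }
    qemu_lockcnt_unlock(lockcnt);
    return false;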

Emilio



[Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-12 Thread Paolo Bonzini
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex.  It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.

This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.

Signed-off-by: Paolo Bonzini 
---
 docs/lockcnt.txt  | 278 ++
 include/qemu/thread.h | 110 
 util/Makefile.objs|   1 +
 util/lockcnt.c| 114 +
 4 files changed, 503 insertions(+)
 create mode 100644 docs/lockcnt.txt
 create mode 100644 util/lockcnt.c

diff --git a/docs/lockcnt.txt b/docs/lockcnt.txt
new file mode 100644
index 000..25a8091
--- /dev/null
+++ b/docs/lockcnt.txt
@@ -0,0 +1,278 @@
+DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
+===
+
+QEMU often uses reference counts to track data structures that are being
+accessed and should not be freed.  For example, a loop that invoke
+callbacks like this is not safe:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->revents & G_IO_OUT) {
+            ioh->fd_write(ioh->opaque);
+        }
+    }
+
+QLIST_FOREACH_SAFE protects against deletion of the current node (ioh)
+by stashing away its "next" pointer.  However, ioh->fd_write could
+actually delete the next node from the list.  The simplest way to
+avoid this is to mark the node as deleted, and remove it from the
+list in the above loop:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            QLIST_REMOVE(ioh, next);
+            g_free(ioh);
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+
+If however this loop must also be reentrant, i.e. it is possible that
+ioh->fd_write invokes the loop again, some kind of counting is needed:
+
+    walking_handlers++;
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            if (walking_handlers == 1) {
+                QLIST_REMOVE(ioh, next);
+                g_free(ioh);
+            }
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+    walking_handlers--;
+
+One may think of using the RCU primitives, rcu_read_lock() and
+rcu_read_unlock(); effectively, the RCU nesting count would take
+the place of the walking_handlers global variable.  Indeed,
+reference counting and RCU have similar purposes, but their usage in
+general is complementary:
+
+- reference counting is fine-grained and limited to a single data
+  structure; RCU delays reclamation of *all* RCU-protected data
+  structures;
+
+- reference counting works even in the presence of code that keeps
+  a reference for a long time; RCU critical sections in principle
+  should be kept short;
+
+- reference counting is often applied to code that is not thread-safe
+  but is reentrant; in fact, usage of reference counting in QEMU predates
+  the introduction of threads by many years.  RCU is generally used to
+  protect readers from other threads freeing memory after concurrent
+  modifications to a data structure.
+
+- reclaiming data can be done by a separate thread in the case of RCU;
+  this can improve performance, but also delay reclamation undesirably.
+  With reference counting, reclamation is deterministic.
+
+This file documents QemuLockCnt, an abstraction for using reference
+counting in code that has to be both thread-safe and reentrant.
+
+
+QemuLockCnt concepts
+
+
+A QemuLockCnt comprises both a counter and a mutex; it has primitives
+to increment and decrement the counter, and to take and release the
+mutex.  The counter notes how many visits to the data structures are
+taking place (the visits could be from different threads, or there could
+be multiple reentrant visits from the same thread).  The basic rules
+governing the counter/mutex pair then are the following:
+
+- Data protected by the QemuLockCnt must not be freed unless the
+  counter is zero and the mutex is taken.
+
+- A new visit cannot be started while the counter is zero and the
+  mutex is taken.
+
+Most of the time, the mutex protects all writes to the data structure,
+not just frees, though there could be cases where this is not necessary.
+
+Reads, instead, can be done without taking the mutex, as long as the
+readers and writers use the same macros that are used for RCU, for
+example atomic_rcu_read, atomic_rcu_set, QLIST_FOREACH_RCU, etc.  This is
+because the reads are done outside a lock and a set or QLIST_INSERT_HEAD
+can happen concurrently with the read.  The RCU API ensures that the
+processor and the compiler see all required memory barriers.
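
To make the rules above concrete, here is a sketch (not part of the patch) of
how a reader and a writer might use a QemuLockCnt; the lockcnt, list and
"deleted" flag below are illustrative names only:

    /* Visit (reader): counted, but lock-free. */
    qemu_lockcnt_inc(&io_lockcnt);
    QLIST_FOREACH_RCU(ioh, &io_handlers, next) {
        if (ioh->revents & G_IO_OUT) {
            ioh->fd_write(ioh->opaque);
        }
    }
    qemu_lockcnt_dec(&io_lockcnt);

    /* Writer: removal becomes a free only when no visit is in flight. */
    qemu_lockcnt_lock(&io_lockcnt);
    if (qemu_lockcnt_count(&io_lockcnt) == 0) {
        QLIST_REMOVE(ioh, next);
        g_free(ioh);
    } else {
        ioh->deleted = true;    /* let the last visitor reap it */
    }
    qemu_lockcnt_unlock(&io_lockcnt);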

Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-11 Thread Stefan Hajnoczi
On Wed, Jan 04, 2017 at 02:26:17PM +0100, Paolo Bonzini wrote:
> +/* Decrement a counter, and return locked if it is decremented to zero.
> + * It is impossible for the counter to become nonzero while the mutex
> + * is taken.
> + */
> +bool qemu_lockcnt_dec_and_lock(QemuLockCnt *lockcnt)
> +{
> +    int val = atomic_read(&lockcnt->count);

I think I've figured out the answer to my question why this isn't an
atomic_mb_read():

The following accesses use atomic_cmpxchg() or atomic_fetch_dec().
There is no adverse effect if we read an outdated value here since the
cmpxchg/fetch_dec are sequentially consistent and explicitly require us
to handle the old value.

> +    while (val > 1) {
> +        int old = atomic_cmpxchg(&lockcnt->count, val, val - 1);
> +        if (old != val) {
> +            val = old;
> +            continue;
> +        }
> +
> +        return false;
> +    }
> +
> +    qemu_lockcnt_lock(lockcnt);
> +    if (atomic_fetch_dec(&lockcnt->count) == 1) {
> +        return true;
> +    }
> +
> +    qemu_lockcnt_unlock(lockcnt);
> +    return false;




Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-11 Thread Paolo Bonzini


On 11/01/2017 17:35, Stefan Hajnoczi wrote:
> On Wed, Jan 04, 2017 at 02:26:17PM +0100, Paolo Bonzini wrote:
>> +/* Decrement a counter, and return locked if it is decremented to zero.
>> + * It is impossible for the counter to become nonzero while the mutex
>> + * is taken.
>> + */
>> +bool qemu_lockcnt_dec_and_lock(QemuLockCnt *lockcnt)
>> +{
>> +    int val = atomic_read(&lockcnt->count);
> 
> Why does qemu_lockcnt_inc() use atomic_mb_read() while this function
> uses plain atomic_read()?

qemu_lockcnt_inc can use atomic_read indeed.

In both cases, the actual synchronization happens in atomic_cmpxchg or
qemu_lockcnt_lock.  This is just an "advisory" read to decide what path
to go through.  The worst case that can happen is that you go uselessly
once through the while-cmpxchg loop, or that you don't use the fast path
even though you could have.

In contrast, qemu_lockcnt_dec_if_lock needs atomic_mb_read to do a
load-acquire.

Paolo





Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-11 Thread Stefan Hajnoczi
On Wed, Jan 04, 2017 at 02:26:17PM +0100, Paolo Bonzini wrote:
> +/* Decrement a counter, and return locked if it is decremented to zero.
> + * It is impossible for the counter to become nonzero while the mutex
> + * is taken.
> + */
> +bool qemu_lockcnt_dec_and_lock(QemuLockCnt *lockcnt)
> +{
> +    int val = atomic_read(&lockcnt->count);

Why does qemu_lockcnt_inc() use atomic_mb_read() while this function
uses plain atomic_read()?




Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-11 Thread Paolo Bonzini


On 11/01/2017 16:48, Fam Zheng wrote:
> On Wed, 01/04 14:26, Paolo Bonzini wrote:
>> +For example, QEMU uses QemuLockCnt to manage an AioContext's list of
>> +bottom halves and file descriptor handlers.  Modifications to the list
>> +of file descriptor handlers are rare.  Creation of a new bottom half is
>> +frequent and can happen on a fast path; however: 1) it is almost never
>> +concurrent with a visit to the list of bottom halves; 
> 
> Isn't it common that thread A is busy notifying thread B with BH, and thread B
> is busy processing the notification? In that case creation and visiting the BH
> list are concurrent.

For multi-threaded cases you can create the BH just once
(thread-pool does it, and aio_co_schedule will be there in the next part
of the work to simplify "remote" qemu_coroutine_enter).

The case where thread A creates the bottom half does occur with rbd.c,
but then, the next point applies:

>> 2) it only has
>> +three instructions in the critical path, two assignments and a smp_wmb().
>> +
>> +/**
>> + * qemu_lockcnt_dec: possibly decrement a QemuLockCnt's counter and lock it.
> 
> s/qemu_lockcnt_dec/qemu_lockcnt_dec_if_lock/

Oops, I'll wait for full review and send v4.

>> + * @lockcnt: the lockcnt to operate on
>> + *
>> + * If the count is 1, decrement the count to zero, lock
>> + * the mutex and return true.  Otherwise, return false.
>> + */
>> +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt);
>> +
>> +/**
>> + * qemu_lockcnt_lock: lock a QemuLockCnt's mutex.
>> + * @lockcnt: the lockcnt to operate on
>> + *
>> + * Remember that concurrent visits are not blocked unless the count is
>> + * also zero.  You can use qemu_lockcnt_count to check for this inside a
>> + * critical section.
>> + */
>> +void qemu_lockcnt_lock(QemuLockCnt *lockcnt);
>> diff --git a/util/lockcnt.c b/util/lockcnt.c
>> new file mode 100644
>> index 000..78ed1e4
>> --- /dev/null
>> +++ b/util/lockcnt.c
>> @@ -0,0 +1,113 @@
>> +/*
>> + * QemuLockCnt implementation
>> + *
>> + * Copyright Red Hat, Inc. 2015
> 
> 2015, 2016?

Or 2017 even. :)

Paolo

>> + *
>> + * Author:
>> + *   Paolo Bonzini 
>> + */
>> +#include "qemu/osdep.h"
>> +#include "qemu/thread.h"
>> +#include "qemu/atomic.h"



Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-11 Thread Fam Zheng
On Wed, 01/04 14:26, Paolo Bonzini wrote:
> +For example, QEMU uses QemuLockCnt to manage an AioContext's list of
> +bottom halves and file descriptor handlers.  Modifications to the list
> +of file descriptor handlers are rare.  Creation of a new bottom half is
> +frequent and can happen on a fast path; however: 1) it is almost never
> +concurrent with a visit to the list of bottom halves; 

Isn't it common that thread A is busy notifying thread B with BH, and thread B
is busy processing the notification? In that case creation and visiting the BH
list are concurrent.

> 2) it only has
> +three instructions in the critical path, two assignments and a smp_wmb().
> +
> +/**
> + * qemu_lockcnt_dec: possibly decrement a QemuLockCnt's counter and lock it.

s/qemu_lockcnt_dec/qemu_lockcnt_dec_if_lock/

> + * @lockcnt: the lockcnt to operate on
> + *
> + * If the count is 1, decrement the count to zero, lock
> + * the mutex and return true.  Otherwise, return false.
> + */
> +bool qemu_lockcnt_dec_if_lock(QemuLockCnt *lockcnt);
> +
> +/**
> + * qemu_lockcnt_lock: lock a QemuLockCnt's mutex.
> + * @lockcnt: the lockcnt to operate on
> + *
> + * Remember that concurrent visits are not blocked unless the count is
> + * also zero.  You can use qemu_lockcnt_count to check for this inside a
> + * critical section.
> + */
> +void qemu_lockcnt_lock(QemuLockCnt *lockcnt);
> diff --git a/util/lockcnt.c b/util/lockcnt.c
> new file mode 100644
> index 000..78ed1e4
> --- /dev/null
> +++ b/util/lockcnt.c
> @@ -0,0 +1,113 @@
> +/*
> + * QemuLockCnt implementation
> + *
> + * Copyright Red Hat, Inc. 2015

2015, 2016?

> + *
> + * Author:
> + *   Paolo Bonzini 
> + */
> +#include "qemu/osdep.h"
> +#include "qemu/thread.h"
> +#include "qemu/atomic.h"



[Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2017-01-04 Thread Paolo Bonzini
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex.  It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.

This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.

Signed-off-by: Paolo Bonzini 
---
 docs/lockcnt.txt  | 278 ++
 include/qemu/thread.h | 109 
 util/Makefile.objs|   1 +
 util/lockcnt.c| 113 
 4 files changed, 501 insertions(+)
 create mode 100644 docs/lockcnt.txt
 create mode 100644 util/lockcnt.c

diff --git a/docs/lockcnt.txt b/docs/lockcnt.txt
new file mode 100644
index 000..25a8091
--- /dev/null
+++ b/docs/lockcnt.txt
@@ -0,0 +1,278 @@
+DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
+===
+
+QEMU often uses reference counts to track data structures that are being
+accessed and should not be freed.  For example, a loop that invoke
+callbacks like this is not safe:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->revents & G_IO_OUT) {
+            ioh->fd_write(ioh->opaque);
+        }
+    }
+
+QLIST_FOREACH_SAFE protects against deletion of the current node (ioh)
+by stashing away its "next" pointer.  However, ioh->fd_write could
+actually delete the next node from the list.  The simplest way to
+avoid this is to mark the node as deleted, and remove it from the
+list in the above loop:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            QLIST_REMOVE(ioh, next);
+            g_free(ioh);
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+
+If however this loop must also be reentrant, i.e. it is possible that
+ioh->fd_write invokes the loop again, some kind of counting is needed:
+
+    walking_handlers++;
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            if (walking_handlers == 1) {
+                QLIST_REMOVE(ioh, next);
+                g_free(ioh);
+            }
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+    walking_handlers--;
+
+One may think of using the RCU primitives, rcu_read_lock() and
+rcu_read_unlock(); effectively, the RCU nesting count would take
+the place of the walking_handlers global variable.  Indeed,
+reference counting and RCU have similar purposes, but their usage in
+general is complementary:
+
+- reference counting is fine-grained and limited to a single data
+  structure; RCU delays reclamation of *all* RCU-protected data
+  structures;
+
+- reference counting works even in the presence of code that keeps
+  a reference for a long time; RCU critical sections in principle
+  should be kept short;
+
+- reference counting is often applied to code that is not thread-safe
+  but is reentrant; in fact, usage of reference counting in QEMU predates
+  the introduction of threads by many years.  RCU is generally used to
+  protect readers from other threads freeing memory after concurrent
+  modifications to a data structure.
+
+- reclaiming data can be done by a separate thread in the case of RCU;
+  this can improve performance, but also delay reclamation undesirably.
+  With reference counting, reclamation is deterministic.
+
+This file documents QemuLockCnt, an abstraction for using reference
+counting in code that has to be both thread-safe and reentrant.
+
+
+QemuLockCnt concepts
+
+
+A QemuLockCnt comprises both a counter and a mutex; it has primitives
+to increment and decrement the counter, and to take and release the
+mutex.  The counter notes how many visits to the data structures are
+taking place (the visits could be from different threads, or there could
+be multiple reentrant visits from the same thread).  The basic rules
+governing the counter/mutex pair then are the following:
+
+- Data protected by the QemuLockCnt must not be freed unless the
+  counter is zero and the mutex is taken.
+
+- A new visit cannot be started while the counter is zero and the
+  mutex is taken.
+
+Most of the time, the mutex protects all writes to the data structure,
+not just frees, though there could be cases where this is not necessary.
+
+Reads, instead, can be done without taking the mutex, as long as the
+readers and writers use the same macros that are used for RCU, for
+example atomic_rcu_read, atomic_rcu_set, QLIST_FOREACH_RCU, etc.  This is
+because the reads are done outside a lock and a set or QLIST_INSERT_HEAD
+can happen concurrently with the read.  The RCU API ensures that the
+processor and the compiler see all required memory barriers.

[Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2016-12-21 Thread Paolo Bonzini
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex.  It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.

This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.

Signed-off-by: Paolo Bonzini 
---
 docs/lockcnt.txt  | 278 ++
 include/qemu/thread.h | 109 
 util/Makefile.objs|   1 +
 util/lockcnt.c| 113 
 4 files changed, 501 insertions(+)
 create mode 100644 docs/lockcnt.txt
 create mode 100644 util/lockcnt.c

diff --git a/docs/lockcnt.txt b/docs/lockcnt.txt
new file mode 100644
index 000..25a8091
--- /dev/null
+++ b/docs/lockcnt.txt
@@ -0,0 +1,278 @@
+DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
+===
+
+QEMU often uses reference counts to track data structures that are being
+accessed and should not be freed.  For example, a loop that invoke
+callbacks like this is not safe:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->revents & G_IO_OUT) {
+            ioh->fd_write(ioh->opaque);
+        }
+    }
+
+QLIST_FOREACH_SAFE protects against deletion of the current node (ioh)
+by stashing away its "next" pointer.  However, ioh->fd_write could
+actually delete the next node from the list.  The simplest way to
+avoid this is to mark the node as deleted, and remove it from the
+list in the above loop:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            QLIST_REMOVE(ioh, next);
+            g_free(ioh);
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+
+If however this loop must also be reentrant, i.e. it is possible that
+ioh->fd_write invokes the loop again, some kind of counting is needed:
+
+    walking_handlers++;
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            if (walking_handlers == 1) {
+                QLIST_REMOVE(ioh, next);
+                g_free(ioh);
+            }
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+    walking_handlers--;
+
+One may think of using the RCU primitives, rcu_read_lock() and
+rcu_read_unlock(); effectively, the RCU nesting count would take
+the place of the walking_handlers global variable.  Indeed,
+reference counting and RCU have similar purposes, but their usage in
+general is complementary:
+
+- reference counting is fine-grained and limited to a single data
+  structure; RCU delays reclamation of *all* RCU-protected data
+  structures;
+
+- reference counting works even in the presence of code that keeps
+  a reference for a long time; RCU critical sections in principle
+  should be kept short;
+
+- reference counting is often applied to code that is not thread-safe
+  but is reentrant; in fact, usage of reference counting in QEMU predates
+  the introduction of threads by many years.  RCU is generally used to
+  protect readers from other threads freeing memory after concurrent
+  modifications to a data structure.
+
+- reclaiming data can be done by a separate thread in the case of RCU;
+  this can improve performance, but also delay reclamation undesirably.
+  With reference counting, reclamation is deterministic.
+
+This file documents QemuLockCnt, an abstraction for using reference
+counting in code that has to be both thread-safe and reentrant.
+
+
+QemuLockCnt concepts
+
+
+A QemuLockCnt comprises both a counter and a mutex; it has primitives
+to increment and decrement the counter, and to take and release the
+mutex.  The counter notes how many visits to the data structures are
+taking place (the visits could be from different threads, or there could
+be multiple reentrant visits from the same thread).  The basic rules
+governing the counter/mutex pair then are the following:
+
+- Data protected by the QemuLockCnt must not be freed unless the
+  counter is zero and the mutex is taken.
+
+- A new visit cannot be started while the counter is zero and the
+  mutex is taken.
+
+Most of the time, the mutex protects all writes to the data structure,
+not just frees, though there could be cases where this is not necessary.
+
+Reads, instead, can be done without taking the mutex, as long as the
+readers and writers use the same macros that are used for RCU, for
+example atomic_rcu_read, atomic_rcu_set, QLIST_FOREACH_RCU, etc.  This is
+because the reads are done outside a lock and a set or QLIST_INSERT_HEAD
+can happen concurrently with the read.  The RCU API ensures that the
+processor and the compiler see all required memory barriers.

Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2016-11-30 Thread Stefan Hajnoczi
On Tue, Nov 29, 2016 at 12:46:59PM +0100, Paolo Bonzini wrote:
> A QemuLockCnt comprises a counter and a mutex, with primitives
> to increment and decrement the counter, and to take and release the
> mutex.  It can be used to do lock-free visits to a data structure
> whenever mutexes would be too heavy-weight and the critical section
> is too long for RCU.
> 
> This could be implemented simply by protecting the counter with the
> mutex, but QemuLockCnt is harder to misuse and more efficient.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  docs/lockcnt.txt  | 343 
> ++
>  include/qemu/thread.h |  17 +++
>  util/Makefile.objs|   1 +
>  util/lockcnt.c| 113 +
>  4 files changed, 474 insertions(+)
>  create mode 100644 docs/lockcnt.txt
>  create mode 100644 util/lockcnt.c
> 
> diff --git a/docs/lockcnt.txt b/docs/lockcnt.txt
> new file mode 100644
> index 000..fc5d240
> --- /dev/null
> +++ b/docs/lockcnt.txt
> @@ -0,0 +1,343 @@
> +DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
> +===

This file contains all the documentation but the header file has no doc
comments.  Could you move everything into the header file (like
include/qom/object.h)?




Re: [Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2016-11-29 Thread Eric Blake
On 11/29/2016 05:46 AM, Paolo Bonzini wrote:
> A QemuLockCnt comprises a counter and a mutex, with primitives
> to increment and decrement the counter, and to take and release the
> mutex.  It can be used to do lock-free visits to a data structure
> whenever mutexes would be too heavy-weight and the critical section
> is too long for RCU.
> 
> This could be implemented simply by protecting the counter with the
> mutex, but QemuLockCnt is harder to misuse and more efficient.
> 
> Signed-off-by: Paolo Bonzini 
> ---


> +int qemu_lockcnt_count(QemuLockCnt *lockcnt);
> +
> +Return the lockcnt's count.  The count can change at any time
> +any time; still, while the lockcnt is locked, one can usefully

duplicate 'any time'


> +++ b/util/lockcnt.c
> @@ -0,0 +1,113 @@
> +/*
> + * QemuLockCnt implementation
> + *
> + * Copyright Red Hat, Inc. 2015

You've been sitting on this a while :)  Want to add 2016?

The documentation is a huge help to understanding the code; overall it
looks pretty clean.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





[Qemu-devel] [PATCH 02/10] qemu-thread: introduce QemuLockCnt

2016-11-29 Thread Paolo Bonzini
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex.  It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.

This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.

Signed-off-by: Paolo Bonzini 
---
 docs/lockcnt.txt  | 343 ++
 include/qemu/thread.h |  17 +++
 util/Makefile.objs|   1 +
 util/lockcnt.c| 113 +
 4 files changed, 474 insertions(+)
 create mode 100644 docs/lockcnt.txt
 create mode 100644 util/lockcnt.c

diff --git a/docs/lockcnt.txt b/docs/lockcnt.txt
new file mode 100644
index 000..fc5d240
--- /dev/null
+++ b/docs/lockcnt.txt
@@ -0,0 +1,343 @@
+DOCUMENTATION FOR LOCKED COUNTERS (aka QemuLockCnt)
+===
+
+QEMU often uses reference counts to track data structures that are being
+accessed and should not be freed.  For example, a loop that invoke
+callbacks like this is not safe:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->revents & G_IO_OUT) {
+            ioh->fd_write(ioh->opaque);
+        }
+    }
+
+QLIST_FOREACH_SAFE protects against deletion of the current node (ioh)
+by stashing away its "next" pointer.  However, ioh->fd_write could
+actually delete the next node from the list.  The simplest way to
+avoid this is to mark the node as deleted, and remove it from the
+list in the above loop:
+
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            QLIST_REMOVE(ioh, next);
+            g_free(ioh);
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+
+If however this loop must also be reentrant, i.e. it is possible that
+ioh->fd_write invokes the loop again, some kind of counting is needed:
+
+    walking_handlers++;
+    QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
+        if (ioh->deleted) {
+            if (walking_handlers == 1) {
+                QLIST_REMOVE(ioh, next);
+                g_free(ioh);
+            }
+        } else {
+            if (ioh->revents & G_IO_OUT) {
+                ioh->fd_write(ioh->opaque);
+            }
+        }
+    }
+    walking_handlers--;
+
+One may think of using the RCU primitives, rcu_read_lock() and
+rcu_read_unlock(); effectively, the RCU nesting count would take
+the place of the walking_handlers global variable.  Indeed,
+reference counting and RCU have similar purposes, but their usage in
+general is complementary:
+
+- reference counting is fine-grained and limited to a single data
+  structure; RCU delays reclamation of *all* RCU-protected data
+  structures;
+
+- reference counting works even in the presence of code that keeps
+  a reference for a long time; RCU critical sections in principle
+  should be kept short;
+
+- reference counting is often applied to code that is not thread-safe
+  but is reentrant; in fact, usage of reference counting in QEMU predates
+  the introduction of threads by many years.  RCU is generally used to
+  protect readers from other threads freeing memory after concurrent
+  modifications to a data structure.
+
+- reclaiming data can be done by a separate thread in the case of RCU;
+  this can improve performance, but also delay reclamation undesirably.
+  With reference counting, reclamation is deterministic.
+
+This file documents QemuLockCnt, an abstraction for using reference
+counting in code that has to be both thread-safe and reentrant.
+
+
+QemuLockCnt concepts
+
+
+A QemuLockCnt comprises both a counter and a mutex; it has primitives
+to increment and decrement the counter, and to take and release the
+mutex.  The counter notes how many visits to the data structures are
+taking place (the visits could be from different threads, or there could
+be multiple reentrant visits from the same thread).  The basic rules
+governing the counter/mutex pair then are the following:
+
+- Data protected by the QemuLockCnt must not be freed unless the
+  counter is zero and the mutex is taken.
+
+- A new visit cannot be started while the counter is zero and the
+  mutex is taken.
+
+Most of the time, the mutex protects all writes to the data structure,
+not just frees, though there could be cases where this is not necessary.
+
+Reads, instead, can be done without taking the mutex, as long as the
+readers and writers use the same macros that are used for RCU, for
+example atomic_rcu_read, atomic_rcu_set, QLIST_FOREACH_RCU, etc.  This is
+because the reads are done outside a lock and a set or QLIST_INSERT_HEAD
+can happen concurrently with the read.  The RCU API ensures that the
+processor and the compiler see all required memory barriers.