Re: [Xen-devel] [PATCH v4 2/2] gnttab: refactor locking for scalability

2015-01-12 Thread Jan Beulich
>>> On 09.01.15 at 16:12,  wrote:
> @@ -188,6 +191,26 @@ nr_active_grant_frames(struct grant_table *gt)
>  return num_act_frames_from_sha_frames(nr_grant_frames(gt));
>  }
>  
> +static inline struct active_grant_entry *
> +active_entry_acquire(struct grant_table *t, grant_ref_t e)
> +{
> +struct active_grant_entry *act;
> +
> +/* not perfect, but better than nothing for a debug build
> + * sanity check */

When coding style issues are being pointed out, please make sure
you go through your patch series and address _all_ of them, even
if not each and every violation was explicitly named.
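For reference, Xen's CODING_STYLE asks for block comments with the opening and closing delimiters on their own lines, rather than the two-line form used in the hunk above. A minimal compilable illustration (`styled_ok` is just a placeholder function, not Xen code):

```c
#include <assert.h>

/* Single-line comments keep the text on one line like this. */

/*
 * Multi-line comments open with a lone delimiter, continue with
 * aligned asterisks, and close with the delimiter on its own line,
 * unlike the two-line form flagged in the quoted hunk.
 */
static int styled_ok(void)
{
    return 1;
}
```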

> @@ -514,17 +513,22 @@ static int grant_map_exists(const struct domain *ld,
> nr_grant_entries(rgt));
>  for ( ref = *ref_count; ref < max_iter; ref++ )
>  {
> -act = &active_entry(rgt, ref);
> +struct active_grant_entry *act;
> +act = active_entry_acquire(rgt, ref);
>  
> -if ( !act->pin )
> -continue;
> +#define CONTINUE_IF(cond) \
> +if ((cond)) { \
> +active_entry_release(act); \
> +continue; \
> +}
>  
> -if ( act->domid != ld->domain_id )
> -continue;
> +CONTINUE_IF(!act->pin);
> +CONTINUE_IF(act->domid != ld->domain_id);
> +CONTINUE_IF(act->frame != mfn);

I highly question whether this is in any way better than the v3 code. Is there
a clear reason not to follow the advice of putting the
active_entry_release(act) into the third of the for() expressions?

And if you _really_ want/need it this way, please again get the coding
style right (and drop the pointless redundant parentheses).

> @@ -545,16 +555,32 @@ static void mapcount(
>  grant_handle_t handle;
>  
>  *wrc = *rdc = 0;
> +ASSERT(rw_is_locked(&rd->grant_table->lock));
> +
> +/* N.B.: while taking the left side maptrack spinlock prevents
> + * any mapping changes, the right side active entries could be
> + * changing while we are counting. To be perfectly correct, the
> + * caller should hold the grant table write lock, or some other
> + * mechanism should be used to prevent concurrent changes during
> + * this operation. This is tricky because we can't promote a read
> + * lock into a write lock.
> + */
> +spin_lock(&lgt->maptrack_lock);

So you added the suggested ASSERT(), but you didn't adjust the
comment (which actually should precede the ASSERT() imo) in any
way.

> @@ -698,12 +723,9 @@ __gnttab_map_grant_ref(
>  GNTPIN_hstr_inc : GNTPIN_hstw_inc;
>  
>  frame = act->frame;
> -act_pin = act->pin;
>  
>  cache_flags = (shah->flags & (GTF_PAT | GTF_PWT | GTF_PCD) );
>  
> -read_unlock(&rgt->lock);
> -
>  /* pg may be set, with a refcount included, from __get_paged_frame */
>  if ( !pg )
>  {
> @@ -778,8 +800,6 @@ __gnttab_map_grant_ref(
>  goto undo_out;
>  }
>  
> -double_maptrack_lock(lgt, rgt);

Again an un-addressed comment from v3: "Taking these two together
- isn't patch 1 wrong then, in that it acquires both maptrack locks in
double_gt_lock()?"
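Independent of which locks double_gt_lock() ought to take, the standard way to acquire two peer locks without deadlock is to impose a global order on them. A generic sketch with toy types (address-ordered here; Xen's actual code orders by comparing the two grant tables, and `toy_lock_t` is a placeholder, not Xen's spinlock API):

```c
#include <assert.h>
#include <stdint.h>

/* Toy spinlock standing in for Xen's spin_lock()/spin_unlock(). */
typedef struct { int held; } toy_lock_t;

static void toy_lock(toy_lock_t *l)   { l->held = 1; }
static void toy_unlock(toy_lock_t *l) { l->held = 0; }

/*
 * Acquire two locks in a fixed global order (by address) so that two
 * CPUs taking the same pair in opposite argument order can never
 * deadlock against each other.
 */
static void double_lock(toy_lock_t *a, toy_lock_t *b)
{
    if ( (uintptr_t)a < (uintptr_t)b )
    {
        toy_lock(a);
        toy_lock(b);
    }
    else
    {
        toy_lock(b);
        toy_lock(a);
    }
}

static void double_unlock(toy_lock_t *a, toy_lock_t *b)
{
    toy_unlock(a);
    toy_unlock(b);
}
```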

> @@ -941,18 +960,19 @@ __gnttab_unmap_common(
>  
>  rgt = rd->grant_table;
>  read_lock(&rgt->lock);
> -double_maptrack_lock(lgt, rgt);
> +
> +read_lock(&rgt->lock);

Acquiring the same read lock twice in a row?

> @@ -3045,9 +3097,11 @@ static void gnttab_usage_print(struct domain *rd)
>  uint16_t status;
>  uint64_t frame;
>  
> -act = &active_entry(gt, ref);
> -if ( !act->pin )
> +act = active_entry_acquire(gt, ref);
> +if ( !act->pin ) {

Coding style again.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 2/2] gnttab: refactor locking for scalability

2015-01-09 Thread Christoph Egger
From: Matt Wilson 

This patch refactors grant table locking. It splits the previous
single spinlock per grant table into multiple locks. The heavily
modified components of the grant table (the maptrack state and the
active entries) are now protected by their own spinlocks. The
remaining elements of the grant table are read-mostly, so the main
grant table lock is modified to be a rwlock to improve concurrency.

Workloads with high grant table operation rates, especially map/unmap
operations as used by blkback/blkfront when persistent grants are not
supported, show marked improvement with these changes. A domU with 24
VBDs in a streaming 2M write workload achieved 1,400 MB/sec before
this change. Performance more than doubles with this patch, reaching
3,000 MB/sec before tuning and 3,600 MB/sec after adjusting event
channel vCPU bindings.

Signed-off-by: Matt Wilson 
[chegger: ported to xen-staging, split into multiple commits]

v4:
  * Addressed ASSERT()s from Jan Beulich
  * Addressed locking issues in unmap_common pointed out
by Jan Beulich
v3:
  * Addressed gnttab_swap_grant_ref() comment from Andrew Cooper

Signed-off-by: Christoph Egger 
Cc: Anthony Liguori 
Cc: Jan Beulich 
Cc: Keir Fraser 
---
 docs/misc/grant-tables.txt |   21 +
 xen/common/grant_table.c   |  219 +++-
 2 files changed, 158 insertions(+), 82 deletions(-)

diff --git a/docs/misc/grant-tables.txt b/docs/misc/grant-tables.txt
index c9ae8f2..1ada018 100644
--- a/docs/misc/grant-tables.txt
+++ b/docs/misc/grant-tables.txt
@@ -63,6 +63,7 @@ is complete.
   act->domid : remote domain being granted rights
   act->frame : machine frame being granted
   act->pin   : used to hold reference counts
+  act->lock  : spinlock used to serialize access to active entry state
 
  Map tracking
  
@@ -87,6 +88,8 @@ is complete.
> version, partially initialized active table pages, etc.
   grant_table->maptrack_lock : spinlock used to protect the maptrack state
+  active_grant_entry->lock   : spinlock used to serialize modifications to
+   active entries
 
  The primary lock for the grant table is a read/write spinlock. All
  functions that access members of struct grant_table must acquire a
@@ -102,6 +105,24 @@ is complete.
  state can be rapidly modified under some workloads, and the critical
  sections are very small, thus we use a spinlock to protect them.
 
+ Active entries are obtained by calling active_entry_acquire(gt, ref).
+ This function returns a pointer to the active entry after locking its
+ spinlock. The caller must hold the rwlock for the gt in question
+ before calling active_entry_acquire(). This is because the grant
+ table can be dynamically extended via gnttab_grow_table() while a
+ domain is running and must be fully initialized. Once all access to
+ the active entry is complete, release the lock by calling
+ active_entry_release(act).
+
+ Summary of rules for locking:
+  active_entry_acquire() and active_entry_release() can only be
+  called when holding the relevant grant table's lock. I.e.:
> +read_lock(&gt->lock);
+act = active_entry_acquire(gt, ref);
+...
+active_entry_release(act);
> +read_unlock(&gt->lock);
+
 

 
  Granting a foreign domain access to frames
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 586ca94..92b3cb1 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -157,10 +157,13 @@ struct active_grant_entry {
in the page.   */
 unsigned  length:16; /* For sub-page grants, the length of the
 grant.*/
> +spinlock_t    lock;  /* lock to protect access of this entry.
+see docs/misc/grant-tables.txt for
+locking protocol  */
 };
 
 #define ACGNT_PER_PAGE (PAGE_SIZE / sizeof(struct active_grant_entry))
-#define active_entry(t, e) \
+#define _active_entry(t, e) \
 ((t)->active[(e)/ACGNT_PER_PAGE][(e)%ACGNT_PER_PAGE])
 
 static inline void gnttab_flush_tlb(const struct domain *d)
@@ -188,6 +191,26 @@ nr_active_grant_frames(struct grant_table *gt)
 return num_act_frames_from_sha_frames(nr_grant_frames(gt));
 }
 
+static inline struct active_grant_entry *
+active_entry_acquire(struct grant_table *t, grant_ref_t e)
+{
+struct active_grant_entry *act;
+
+/* not perfect, but better than nothing for a debug build
+ * sanity check */
+ASSERT(rw_is_locked(&t->lock));
+
+act = &_active_entry(t, e);
+spin_lock(&act->lock);
+
+return act;
+}
+
+static inline void active_entry_release(struct active_grant_entry *act)
+{
+spin_unlock(&act->lock);
+}
+
 /* Check if the page has been paged out, or needs