Re: [RFC mach] vm: fix locking issues

2015-08-27 Thread Justus Winter
Quoting Samuel Thibault (2015-08-27 00:56:23)
 Justus Winter, le Mon 17 Aug 2015 17:51:09 +0200, a écrit :
  @@ -2157,10 +2162,13 @@ start_pass_1:
/*
 *  Check for permanent objects in the destination.
 */
  -
  - if ((entry-object.vm_object != VM_OBJECT_NULL) 
  -!entry-object.vm_object-temporary)
  - contains_permanent_objects = TRUE;
  + object = entry-object.vm_object;
  + if ((object != VM_OBJECT_NULL)
  +  ! contains_permanent_objects) {
  + vm_object_lock(object);
  + contains_permanent_objects = object-temporary;
 
 This looks wrong, shouldn't it be !object-temporary?

Yes.

 Also, taking the lock just to read a boolean usually means there's
 something bigger wrong somewhere.  Either the boolean was not meant to
 be protected (e.g. because it is constant), or more than just reading it
 should be locked, for the whole coherency.  Here it never changes except
 when memory_object_set_attributes_common is called, but then there is
 nothing that will prevent memory_object_set_attributes_common from doing
 it between the time you take the lock, get the temporary field, release
 the lock, and the time when you actually do something about it, e.g.
 here return KERN_FAILURE because you believe that there's a permanent
 object while it has just been turned temporary actually. That's probably
 not a problem in practice because the two calls will probably have some
 dependency, but that's the actual issue here.

Right.

 
  @@ -2275,8 +2284,15 @@ start_pass_1:
 */
   
object = entry-object.vm_object;
  + temporary = 0;
  + if (object != VM_OBJECT_NULL) {
  + vm_object_lock(object);
  + temporary = object-temporary;
  + vm_object_unlock(object);
  + }
  +
 
 Here it's worse: the copy strategy changes. Perhaps we could think that
 it doesn't pose problem in practice because the two calls would probably
 have a dependency, I don't know about memory objects to know.

Right, I will look more closely at the fields and how they are meant
to be used, not how it is documented in the header.

  diff --git a/vm/vm_object.c b/vm/vm_object.c
  index deac0c2..1d3e727 100644
  --- a/vm/vm_object.c
  +++ b/vm/vm_object.c
  @@ -217,9 +217,11 @@ static void _vm_object_setup(
vm_size_t   size)
   {
*object = vm_object_template;
  - queue_init(object-memq);
vm_object_lock_init(object);
  + vm_object_lock(object);
  + queue_init(object-memq);
object-size = size;
  + vm_object_unlock(object);
 
 Here it's a fresh object, that's why there's no need to lock, but oh
 well.

Yes.  I pondered enclosing these with #if MACH_LDEBUG, but that's ugly
and so I postponed it until it turns out to be a bottleneck.

  @@ -1474,14 +1491,11 @@ vm_object_t vm_object_copy_delayed(
 *  must be done carefully, to avoid deadlock.
 */
   
  - /*
  -  *  Allocate a new copy object before locking, even
  -  *  though we may not need it later.
  -  */
  + vm_object_lock(src_object);
   
new_copy = vm_object_allocate(src_object-size);
   
  - vm_object_lock(src_object);
 
 Better make a copy of src_object-size to avoid keeping the object
 locked while we allocate stuff.

Ok.

  @@ -252,6 +252,8 @@ vm_pageout_setup(
vm_object_unlock(new_object);
}
   
  + vm_object_lock(old_object);
  +
if (flush) {
/*
 *  Create a place-holder page where the old one was,
  @@ -262,7 +264,6 @@ vm_pageout_setup(
== VM_PAGE_NULL)
vm_page_more_fictitious();
  - vm_object_lock(old_object);
 
 Why keeping the lock while allocating a page?
 
vm_page_lock_queues();
vm_page_remove(m);
vm_page_unlock_queues();
  @@ -281,8 +282,6 @@ vm_pageout_setup(
VM_EXTERNAL_STATE_EXISTS);
   #endif   /* MACH_PAGEMAP */
   
  - vm_object_unlock(old_object);
 
 Why keeping it locked here?  For accessing old_object-internal?  It is
 constant across the life of an object, thus does not need to be
 protected by the lock.

Ok.  I wish that kind of knowledge was stated in the comments
describing the objects.


Cheers,
Justus



Re: [RFC mach] vm: fix locking issues

2015-08-27 Thread Samuel Thibault
Justus Winter, le Thu 27 Aug 2015 13:19:51 +0200, a écrit :
  Why keeping it locked here?  For accessing old_object-internal?  It is
  constant across the life of an object, thus does not need to be
  protected by the lock.
 
 Ok.  I wish that kind of knowledge was stated in the comments
 describing the objects.

Agreed, patch welcome :)

Samuel



Re: [RFC mach] vm: fix locking issues

2015-08-26 Thread Samuel Thibault
Justus Winter, le Mon 17 Aug 2015 17:51:09 +0200, a écrit :
 @@ -2157,10 +2162,13 @@ start_pass_1:
   /*
*  Check for permanent objects in the destination.
*/
 -
 - if ((entry-object.vm_object != VM_OBJECT_NULL) 
 -!entry-object.vm_object-temporary)
 - contains_permanent_objects = TRUE;
 + object = entry-object.vm_object;
 + if ((object != VM_OBJECT_NULL)
 +  ! contains_permanent_objects) {
 + vm_object_lock(object);
 + contains_permanent_objects = object-temporary;

This looks wrong, shouldn't it be !object-temporary?

Also, taking the lock just to read a boolean usually means there's
something bigger wrong somewhere.  Either the boolean was not meant to
be protected (e.g. because it is constant), or more than just reading it
should be locked, for the whole coherency.  Here it never changes except
when memory_object_set_attributes_common is called, but then there is
nothing that will prevent memory_object_set_attributes_common from doing
it between the time you take the lock, get the temporary field, release
the lock, and the time when you actually do something about it, e.g.
here return KERN_FAILURE because you believe that there's a permanent
object while it has just been turned temporary actually. That's probably
not a problem in practice because the two calls will probably have some
dependency, but that's the actual issue here.

 @@ -2275,8 +2284,15 @@ start_pass_1:
*/
  
   object = entry-object.vm_object;
 + temporary = 0;
 + if (object != VM_OBJECT_NULL) {
 + vm_object_lock(object);
 + temporary = object-temporary;
 + vm_object_unlock(object);
 + }
 +

Here it's worse: the copy strategy changes. Perhaps we could think that
it doesn't pose problem in practice because the two calls would probably
have a dependency, I don't know about memory objects to know.

 diff --git a/vm/vm_object.c b/vm/vm_object.c
 index deac0c2..1d3e727 100644
 --- a/vm/vm_object.c
 +++ b/vm/vm_object.c
 @@ -217,9 +217,11 @@ static void _vm_object_setup(
   vm_size_t   size)
  {
   *object = vm_object_template;
 - queue_init(object-memq);
   vm_object_lock_init(object);
 + vm_object_lock(object);
 + queue_init(object-memq);
   object-size = size;
 + vm_object_unlock(object);

Here it's a fresh object, that's why there's no need to lock, but oh
well.

 @@ -244,8 +246,11 @@ vm_object_t vm_object_allocate(
   port = ipc_port_alloc_kernel();
   if (port == IP_NULL)
   panic(vm_object_allocate);
 +
 + vm_object_lock(object);
   object-pager_name = port;
   ipc_kobject_set(port, (ipc_kobject_t) object, IKOT_PAGING_NAME);
 + vm_object_unlock(object);

Ditto (and later with new_copy and result)

 @@ -1474,14 +1491,11 @@ vm_object_t vm_object_copy_delayed(
*  must be done carefully, to avoid deadlock.
*/
  
 - /*
 -  *  Allocate a new copy object before locking, even
 -  *  though we may not need it later.
 -  */
 + vm_object_lock(src_object);
  
   new_copy = vm_object_allocate(src_object-size);
  
 - vm_object_lock(src_object);

Better make a copy of src_object-size to avoid keeping the object
locked while we allocate stuff.

 @@ -252,6 +252,8 @@ vm_pageout_setup(
   vm_object_unlock(new_object);
   }
  
 + vm_object_lock(old_object);
 +
   if (flush) {
   /*
*  Create a place-holder page where the old one was,
 @@ -262,7 +264,6 @@ vm_pageout_setup(
   == VM_PAGE_NULL)
   vm_page_more_fictitious();
 - vm_object_lock(old_object);

Why keeping the lock while allocating a page?

   vm_page_lock_queues();
   vm_page_remove(m);
   vm_page_unlock_queues();
 @@ -281,8 +282,6 @@ vm_pageout_setup(
   VM_EXTERNAL_STATE_EXISTS);
  #endif   /* MACH_PAGEMAP */
  
 - vm_object_unlock(old_object);

Why keeping it locked here?  For accessing old_object-internal?  It is
constant across the life of an object, thus does not need to be
protected by the lock.

Samuel



[RFC mach] vm: fix locking issues

2015-08-17 Thread Justus Winter
Avoid accessing fields of `vm_object' objects without having it
locked.  These problems have been found using a code transformation
done by Coccinelle that instrumented all accesses with a runtime
check, and manual inspection.

* vm/memory_object.c (memory_object_data_supply): Avoid accessing
fields without the lock.
* vm/vm_fault.c (vm_fault_page): Likewise.
* vm/vm_map.c (vm_map_submap): Properly lock `object'.
(vm_map_copy_overwrite): Avoid accessing fields without the lock.
(vm_map_copyin): Lock `src_object'.
* vm/vm_object.c (_vm_object_setup): Likewise.
(vm_object_allocate): Likewise.
(vm_object_terminate): Avoid accessing fields without the lock.
(vm_object_copy_delayed): Lock `src_object' earlier, lock `new_copy'.
(vm_object_shadow): Lock `result'.
(vm_object_enter): Properly lock `object'.  Avoid accessing fields
without the lock.
* vm/vm_pageout.c (vm_pageout_setup): Properly lock `old_object'.
---
 vm/memory_object.c | 17 +++
 vm/vm_fault.c  | 39 +++--
 vm/vm_map.c| 55 ---
 vm/vm_object.c | 84 +-
 vm/vm_pageout.c|  9 ++
 5 files changed, 141 insertions(+), 63 deletions(-)

diff --git a/vm/memory_object.c b/vm/memory_object.c
index 097ed23..0a07429 100644
--- a/vm/memory_object.c
+++ b/vm/memory_object.c
@@ -101,6 +101,7 @@ kern_return_t memory_object_data_supply(
vm_page_t   *page_list;
boolean_t   was_absent;
vm_map_copy_t   orig_copy = data_copy;
+   pager_request_t pager_request;
 
/*
 *  Look for bogus arguments
@@ -270,6 +271,7 @@ retry_lookup:
/*
 *  Send reply if one was requested.
 */
+   pager_request = object-pager_request;
vm_object_paging_end(object);
vm_object_unlock(object);
 
@@ -279,7 +281,7 @@ retry_lookup:
if (IP_VALID(reply_to)) {
memory_object_supply_completed(
reply_to, reply_to_type,
-   object-pager_request,
+   pager_request,
original_offset,
original_length,
result,
@@ -788,7 +790,9 @@ MACRO_END
continue;
 
case MEMORY_OBJECT_LOCK_RESULT_MUST_CLEAN:
-   case MEMORY_OBJECT_LOCK_RESULT_MUST_RETURN:
+   case MEMORY_OBJECT_LOCK_RESULT_MUST_RETURN:;
+   vm_offset_t object_paging_offset;
+
/*
 * The clean and return cases are similar.
 *
@@ -811,6 +815,7 @@ MACRO_END
PAGEOUT_PAGES;
}
 
+   object_paging_offset = object-paging_offset;
vm_object_unlock(object);
 
/*
@@ -821,8 +826,7 @@ MACRO_END
if (new_object == VM_OBJECT_NULL) {
new_object = vm_object_allocate(original_size);
new_offset = 0;
-   paging_offset = m-offset +
-   object-paging_offset;
+   paging_offset = m-offset + object_paging_offset;
pageout_action = page_lock_result;
}
 
@@ -831,7 +835,7 @@ MACRO_END
 *  new object.
 */
m = vm_pageout_setup(m,
-   m-offset + object-paging_offset,
+   m-offset + object_paging_offset,
new_object,
new_offset,
should_flush);
@@ -859,11 +863,12 @@ MACRO_END
}
 
if (IP_VALID(reply_to)) {
+   pager_request_t pager_request = object-pager_request;
vm_object_unlock(object);
 
/* consumes our naked send-once/send right for reply_to */
(void) memory_object_lock_completed(reply_to, reply_to_type,
-   object-pager_request, original_offset, original_size);
+   pager_request, original_offset, original_size);
 
vm_object_lock(object);
}
diff --git a/vm/vm_fault.c b/vm/vm_fault.c
index 46779f6..101ebce 100644
--- a/vm/vm_fault.c
+++ b/vm/vm_fault.c
@@ -229,6 +229,17 @@ vm_fault_return_t vm_fault_page(
boolean_t   look_for_page;
vm_prot_t   access_required;
 
+   /* We need to unlock an object before making requests to a
+  memory manager.  We use this object to temporarily store
+  object attributes needed for the request to avoid accessing
+  the object while it is unlocked.  */
+   struct
+   {