Re: [PATCH] Port leak when using clisp
I've split the patch into two. Cheers Flavio On Thu, 27 Aug 2015 at 11:53 Justus Winter 4win...@informatik.uni-hamburg.de wrote: Hi, Quoting Flávio Cruz (2015-08-27 00:26:13) Em qua, 26 de ago de 2015 às 12:18, Justus Winter 4win...@informatik.uni-hamburg.de escreveu: After looking around and using KDB, I've figured out that the following loop in kern/exceptions.c/exception_raise_continue_slow was the culprit: Could you elaborate a little on your method? I changed the code to chain all the allocated ports together in a linked list. Then, I added new fields to ipc_port to track send-only rights and to understand where those rights were being created and where the port was being manipulated. Finally, I added a new KDB command to show all the inactive ports. It was then easy to track exactly how the port was handled inside exception_raise_continue_slow. Neat. Thanks for sharing. Fwiw, I've used Coccinelle for code instrumentations. Also for fixing up code in a mechanical way. It is an awesome tool :) while (mr == MACH_RCV_INTERRUPTED) { /* * Somebody is trying to force this thread * to a clean point. We must cooperate * and then resume the receive. */ while (thread_should_halt(self)) { /* don't terminate while holding a reference */ if (self-ast AST_TERMINATE) ipc_port_release(reply_port); thread_halt_self(); } ip_lock(reply_port); } The reply port of the target thread (from CLISP?) is not being released since it enters the while(thread_should_halt(self)) loop and the thread terminates inside thread_halt_self but the previous condition does not hold, which fails to release the port. I also think that once the code enters thread_halt_self, it never comes back since the stack is discarded (for both if cases inside the function). While I agree that the function thread_halt_self does not return, it only terminates the thread if the AST_TERMINATE flag is set. Otherwise, it returns to userspace. Yes, it returns to userspace using thread_exception_return, but notice that a reference must be released anyway (check first statement after the while loop). Right. We have two references, one for the send-once right, and one for the port being saved in ith_port (see exception.c:386). For the else case in thread_halt_self, the thread may throw away the current stack and run thread_exception_return in thread_block, which is also executed in the case of success in exception_raise_continue_slow. And indeed exception_raise_continue_fast releases two references too. I find the use of `while' dodgy in the first place b/c if thread_halt_self doesn't return, an if would do the trick as well and don't imply the chance of that code looping... I think there might be some situations where it will loop around depending on the scheduling order (see line 798 on sched_prim.c). Could be. I keep messing up the semantics of thread_invoke in my head , I've added a continuation argument to thread_halt_self so that the port is released if the thread resumes using the continuation (followed by thread_exception_return). Otherwise, it may loop and it will eventually also release the reference and call thread_exception_return. Yes. This looks like the clean solution. I've also made some cosmetic changes to the code. Would be best to split the change up nevertheless. Thanks for looking into this :) Justus port-leak.patch Description: Binary data no-continuation.patch Description: Binary data
[PATCH] Fix race condition in ext2fs when remounting
On some systems, ext2fs.static would regularly hang at startup, as a race condition meant it would process paging requests while remounting. To fix this, libpager has been altered to allow inhibiting and resuming its worker threads, and ext2fs uses this to inhibit paging while remounting. * console/pager.c (pager_requests): New variable. (user_pager_init): Updated call to pager_start_workers to use new pager_requests variable. * daemons/runsystem.sh: Removed artificial delay working around the race condition. * ext2fs/ext2fs.c (diskfs_reload_global_state): Call new inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as non-NULL so it will be munmapped. * ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions. * ext2fs/pager.c (file_pager_requests): New variable. (create_disk_pager): Updated call to pager_start_workers to use new file_pager_requests variable. (inhibit_ext2_pager,resume_ext2_pager): New functions. * fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions. * fatfs/pager.c (file_pager_requests): New variable. (create_fat_pager): Updated call to pager_start_workers to use new file_pager_requests variable. (inhibit_fat_pager,resume_fat_pager): New functions. * libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable. (diskfs_start_disk_pager): Updated call to pager_start_workers to use new diskfs_disk_pager_requests variable. * libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable. * libpager/demuxer.c (struct pager_requests): Renamed struct requests to struct pager_requests. Replaced queue with queue_in and queue_out pointers. Added inhibit_wakeup field. (pager_demuxer): Updated to use new queue_in/queue_out pointers. Only wake up workers if not inhibited. (worker_func): Updated to use new queue_in/queue_out pointers. Final worker thread to sleep notifies the inhibit_wakeup condition variable. (pager_start_workers): Added out parameter for the requests instance. Allocate heap space shared by both queues. Initialise new inhibit_wakeup condition. (pager_inhibit_workers,pager_resume_workers): New functions. * libpager/pager.h (struct pager_requests): Public forward definition. (pager_start_workers): Added out parameter for the requests instance. (pager_inhibit_workers,pager_resume_workers): New functions. * libpager/queue.h (queue_empty): New function. * storeio/pager.c (pager_requests): New variable. (init_dev_paging): Updated call to pager_start_workers to use new pager_requests variable. --- console/pager.c | 3 +- daemons/runsystem.sh | 3 -- ext2fs/ext2fs.c | 12 - ext2fs/ext2fs.h | 6 +++ ext2fs/pager.c | 33 - fatfs/fatfs.h| 2 + fatfs/pager.c| 33 - libdiskfs/disk-pager.c | 3 +- libdiskfs/diskfs-pager.h | 1 + libpager/demuxer.c | 119 --- libpager/pager.h | 28 ++- libpager/queue.h | 8 storeio/pager.c | 3 +- 13 files changed, 227 insertions(+), 27 deletions(-) diff --git a/console/pager.c b/console/pager.c index 5e13ba4..818e49d 100644 --- a/console/pager.c +++ b/console/pager.c @@ -42,6 +42,7 @@ struct user_pager_info /* We need a separate bucket for the pager ports. */ static struct port_bucket *pager_bucket; +static struct pager_requests *pager_requests; /* Implement the pager_clear_user_data callback from the pager library. */ @@ -133,7 +134,7 @@ user_pager_init (void) error (5, errno, Cannot create pager bucket); /* Start libpagers worker threads. */ - err = pager_start_workers (pager_bucket); + err = pager_start_workers (pager_bucket, pager_requests); if (err) error (5, err, Cannot start pager worker threads); } diff --git a/daemons/runsystem.sh b/daemons/runsystem.sh index ae25a7d..5d0ad01 100644 --- a/daemons/runsystem.sh +++ b/daemons/runsystem.sh @@ -118,9 +118,6 @@ esac /hurd/mach-defpager # This is necessary to make stat / return the correct device ids. -# Work around a race condition (probably in the root translator). -for i in `seq 1 10` ; do : ; done # XXX - fsysopts / --update --readonly # Finally, start the actual init. diff --git a/ext2fs/ext2fs.c b/ext2fs/ext2fs.c index d0fdfe7..03c9eed 100644 --- a/ext2fs/ext2fs.c +++ b/ext2fs/ext2fs.c @@ -207,10 +207,20 @@ main (int argc, char **argv) error_t diskfs_reload_global_state () { + error_t err; + pokel_flush (global_pokel); pager_flush (diskfs_disk_pager, 1); - sblock = NULL; + + /* libdiskfs is not responsible for inhibiting paging. */ + err = inhibit_ext2_pager (); + if (err) +return err; + get_hypermetadata (); map_hypermetadata (); + + resume_ext2_pager (); + return 0; } diff --git a/ext2fs/ext2fs.h b/ext2fs/ext2fs.h index 96d8e9d..a744685 100644 --- a/ext2fs/ext2fs.h +++ b/ext2fs/ext2fs.h @@ -201,6 +201,12 @@ struct user_pager_info /* Set up the disk pager. */ void
Re: [PATCH] Port leak when using clisp
Hello, Applied, thanks! Samuel
Re: [PATCH] Port leak when using clisp
Quoting Flávio Cruz (2015-08-27 00:26:13) I wrote: Fwiw, I prefer inlined patches so I can reply easier ;) OK ;) Oh no, but the new patch got corrupted now :/ I should have suggested git send-email, but didn't. Sorry about that... Justus
Re: [PATCH] Port leak when using clisp
Hi, Quoting Flávio Cruz (2015-08-27 00:26:13) Em qua, 26 de ago de 2015 às 12:18, Justus Winter 4win...@informatik.uni-hamburg.de escreveu: After looking around and using KDB, I've figured out that the following loop in kern/exceptions.c/exception_raise_continue_slow was the culprit: Could you elaborate a little on your method? I changed the code to chain all the allocated ports together in a linked list. Then, I added new fields to ipc_port to track send-only rights and to understand where those rights were being created and where the port was being manipulated. Finally, I added a new KDB command to show all the inactive ports. It was then easy to track exactly how the port was handled inside exception_raise_continue_slow. Neat. Thanks for sharing. Fwiw, I've used Coccinelle for code instrumentations. Also for fixing up code in a mechanical way. It is an awesome tool :) while (mr == MACH_RCV_INTERRUPTED) { /* * Somebody is trying to force this thread * to a clean point. We must cooperate * and then resume the receive. */ while (thread_should_halt(self)) { /* don't terminate while holding a reference */ if (self-ast AST_TERMINATE) ipc_port_release(reply_port); thread_halt_self(); } ip_lock(reply_port); } The reply port of the target thread (from CLISP?) is not being released since it enters the while(thread_should_halt(self)) loop and the thread terminates inside thread_halt_self but the previous condition does not hold, which fails to release the port. I also think that once the code enters thread_halt_self, it never comes back since the stack is discarded (for both if cases inside the function). While I agree that the function thread_halt_self does not return, it only terminates the thread if the AST_TERMINATE flag is set. Otherwise, it returns to userspace. Yes, it returns to userspace using thread_exception_return, but notice that a reference must be released anyway (check first statement after the while loop). Right. We have two references, one for the send-once right, and one for the port being saved in ith_port (see exception.c:386). For the else case in thread_halt_self, the thread may throw away the current stack and run thread_exception_return in thread_block, which is also executed in the case of success in exception_raise_continue_slow. And indeed exception_raise_continue_fast releases two references too. I find the use of `while' dodgy in the first place b/c if thread_halt_self doesn't return, an if would do the trick as well and don't imply the chance of that code looping... I think there might be some situations where it will loop around depending on the scheduling order (see line 798 on sched_prim.c). Could be. I keep messing up the semantics of thread_invoke in my head , I've added a continuation argument to thread_halt_self so that the port is released if the thread resumes using the continuation (followed by thread_exception_return). Otherwise, it may loop and it will eventually also release the reference and call thread_exception_return. Yes. This looks like the clean solution. I've also made some cosmetic changes to the code. Would be best to split the change up nevertheless. Thanks for looking into this :) Justus
Re: [RFC mach] vm: fix locking issues
Quoting Samuel Thibault (2015-08-27 00:56:23) Justus Winter, le Mon 17 Aug 2015 17:51:09 +0200, a écrit : @@ -2157,10 +2162,13 @@ start_pass_1: /* * Check for permanent objects in the destination. */ - - if ((entry-object.vm_object != VM_OBJECT_NULL) -!entry-object.vm_object-temporary) - contains_permanent_objects = TRUE; + object = entry-object.vm_object; + if ((object != VM_OBJECT_NULL) + ! contains_permanent_objects) { + vm_object_lock(object); + contains_permanent_objects = object-temporary; This looks wrong, shouldn't it be !object-temporary? Yes. Also, taking the lock just to read a boolean usually means there's something bigger wrong somewhere. Either the boolean was not meant to be protected (e.g. because it is constant), or more than just reading it should be locked, for the whole coherency. Here it never changes except when memory_object_set_attributes_common is called, but then there is nothing that will prevent memory_object_set_attributes_common from doing it between the time you take the lock, get the temporary field, release the lock, and the time when you actually do something about it, e.g. here return KERN_FAILURE because you believe that there's a permanent object while it has just been turned temporary actually. That's probably not a problem in practice because the two calls will probably have some dependency, but that's the actual issue here. Right. @@ -2275,8 +2284,15 @@ start_pass_1: */ object = entry-object.vm_object; + temporary = 0; + if (object != VM_OBJECT_NULL) { + vm_object_lock(object); + temporary = object-temporary; + vm_object_unlock(object); + } + Here it's worse: the copy strategy changes. Perhaps we could think that it doesn't pose problem in practice because the two calls would probably have a dependency, I don't know about memory objects to know. Right, I will look more closely at the fields and how they are meant to be used, not how it is documented in the header. diff --git a/vm/vm_object.c b/vm/vm_object.c index deac0c2..1d3e727 100644 --- a/vm/vm_object.c +++ b/vm/vm_object.c @@ -217,9 +217,11 @@ static void _vm_object_setup( vm_size_t size) { *object = vm_object_template; - queue_init(object-memq); vm_object_lock_init(object); + vm_object_lock(object); + queue_init(object-memq); object-size = size; + vm_object_unlock(object); Here it's a fresh object, that's why there's no need to lock, but oh well. Yes. I pondered enclosing these with #if MACH_LDEBUG, but that's ugly and so I postponed it until it turns out to be a bottleneck. @@ -1474,14 +1491,11 @@ vm_object_t vm_object_copy_delayed( * must be done carefully, to avoid deadlock. */ - /* - * Allocate a new copy object before locking, even - * though we may not need it later. - */ + vm_object_lock(src_object); new_copy = vm_object_allocate(src_object-size); - vm_object_lock(src_object); Better make a copy of src_object-size to avoid keeping the object locked while we allocate stuff. Ok. @@ -252,6 +252,8 @@ vm_pageout_setup( vm_object_unlock(new_object); } + vm_object_lock(old_object); + if (flush) { /* * Create a place-holder page where the old one was, @@ -262,7 +264,6 @@ vm_pageout_setup( == VM_PAGE_NULL) vm_page_more_fictitious(); - vm_object_lock(old_object); Why keeping the lock while allocating a page? vm_page_lock_queues(); vm_page_remove(m); vm_page_unlock_queues(); @@ -281,8 +282,6 @@ vm_pageout_setup( VM_EXTERNAL_STATE_EXISTS); #endif /* MACH_PAGEMAP */ - vm_object_unlock(old_object); Why keeping it locked here? For accessing old_object-internal? It is constant across the life of an object, thus does not need to be protected by the lock. Ok. I wish that kind of knowledge was stated in the comments describing the objects. Cheers, Justus
Re: [RFC mach] vm: fix locking issues
Justus Winter, le Thu 27 Aug 2015 13:19:51 +0200, a écrit : Why keeping it locked here? For accessing old_object-internal? It is constant across the life of an object, thus does not need to be protected by the lock. Ok. I wish that kind of knowledge was stated in the comments describing the objects. Agreed, patch welcome :) Samuel