Date: Thu, 14 Mar 2019 08:06:58 -0700
From: Jason Thorpe <thor...@me.com>
Message-ID: <134778ad-a675-414a-bbb3-7eeeaf2c2...@me.com>
  | Great sleuthing, you pretty much nailed it.

OK, great.  This is the patch I plan to commit soon (rather than waiting
on lots of review, so that the b5 tests can start working again ASAP).
If anyone has any issues with this, please make them known soon (as in
quite soon...).  Of course, we can always revise, or revert, these
changes later.  They do seem to work as intended, however.

Before I added this to my test system, I ran (using the original kernel)
several iterations of munlock(buf, 0) (where buf is page aligned) without
any problems, as expected.  After applying this patch, the t_mlock test
passes with no failures (as it exists in -current ... I will fix all its
idiocy after it has been used a few times in its current state; however
broken the test is, it should not be able to crash the kernel).  Also,
after this patch, I tested address wraparound (addr + len < addr) and got
the ENOMEM error that the patch returns, which seems to be the error that
POSIX specifies for this kind of thing.

And speaking of which, we appear to differ from POSIX in several error
codes.  POSIX does not use EINVAL for mlock()/munlock() of a region which
crosses a "hole"; that would be ENOMEM (munlock() converts the internal
EINVAL into ENOMEM before returning, mlock() doesn't).  And munlock()
without a preceding mlock() is not an error at all: POSIX says "regardless
of how many times mlock() has been called by the process for any of the
pages in the specified range", and zero is a number of times that mlock()
might have been called.  Also, EAGAIN is correct for mlock() when the
maximum number of pages the kernel will allow to be locked have already
been locked, but not for the case where a process exceeds its resource
limit of locked pages ... that should be ENOMEM.  There might be more
like this.  This patch doesn't do anything with those issues; they're
not urgent, and we can fix them anytime (if we agree to do so).
kre

Index: uvm_map.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_map.c,v
retrieving revision 1.358
diff -u -r1.358 uvm_map.c
--- uvm_map.c	3 Mar 2019 17:37:36 -0000	1.358
+++ uvm_map.c	14 Mar 2019 16:29:57 -0000
@@ -3359,6 +3359,14 @@
 	}
 	entry = start_entry;
 
+	if (start == end) {		/* nothing required */
+		if ((lockflags & UVM_LK_EXIT) == 0)
+			vm_map_unlock(map);
+
+		UVMHIST_LOG(maphist,"<- done (nothing)",0,0,0,0);
+		return 0;
+	}
+
 	/*
 	 * handle wiring and unwiring separately.
 	 */
Index: uvm_mmap.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_mmap.c,v
retrieving revision 1.169
diff -u -r1.169 uvm_mmap.c
--- uvm_mmap.c	19 Dec 2017 18:34:47 -0000	1.169
+++ uvm_mmap.c	14 Mar 2019 16:29:58 -0000
@@ -759,8 +759,12 @@
 	pageoff = (addr & PAGE_MASK);
 	addr -= pageoff;
-	size += pageoff;
-	size = (vsize_t)round_page(size);
+	if (size != 0) {
+		size += pageoff;
+		size = (vsize_t)round_page(size);
+	}
+	if (addr + size < addr)
+		return ENOMEM;
 
 	error = range_test(&p->p_vmspace->vm_map, addr, size, false);
 	if (error)
@@ -810,8 +814,12 @@
 	pageoff = (addr & PAGE_MASK);
 	addr -= pageoff;
-	size += pageoff;
-	size = (vsize_t)round_page(size);
+	if (size != 0) {
+		size += pageoff;
+		size = (vsize_t)round_page(size);
+	}
+	if (addr + size < addr)
+		return ENOMEM;
 
 	error = range_test(&p->p_vmspace->vm_map, addr, size, false);
 	if (error)
Index: uvm_page.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_page.c,v
retrieving revision 1.198
diff -u -r1.198 uvm_page.c
--- uvm_page.c	19 May 2018 15:03:26 -0000	1.198
+++ uvm_page.c	14 Mar 2019 16:29:58 -0000
@@ -1605,9 +1605,11 @@
 uvm_pageunwire(struct vm_page *pg)
 {
 	KASSERT(mutex_owned(&uvm_pageqlock));
+	KASSERT(pg->wire_count != 0);
 	pg->wire_count--;
 	if (pg->wire_count == 0) {
 		uvm_pageactivate(pg);
+		KASSERT(uvmexp.wired != 0);
 		uvmexp.wired--;
 	}
 }