On Tue, Dec 24, 2024 at 12:44:49PM +0300, Sergey Bugaev wrote:
> > + ASSERT(err == KERN_SUCCESS, "cannot set VM limits");
>
> There's ASSERT_RET, which also stringifies the error code.
>
Good, that's much better.
> > + /* check limits are actually saved */
> > + err = vm_get_size_limit(mach_task_self(), &cur, &max);
> > + ASSERT(err == KERN_SUCCESS, "getting the VM limit failed");
> > + ASSERT(cur == M_128M, "cur limit was not expected");
> > + ASSERT(max == M_512M, "max limit was not expected");
> > +
> > + /* check we can no longer increase the hard limit */
> > + err = vm_set_size_limit(mach_host_self(), mach_task_self(), M_128M,
> > M_512M * 2);
> > + ASSERT(err == KERN_INVALID_HOST, "raising VM hard limit shall fail");
>
> Well, uh. Looking at it like this, I wonder if this should return
> KERN_INVALID_ARGUMENT or something in this case, not
> KERN_INVALID_HOST. "Invalid host" makes sense if the way you think of
> it is "I want to increase the size limit, would this host port
> suffice"; but when written out like this it reads the other way
> around: here's a host port, I'm attempting to set the limit ("no
> sorry, increasing the limit is not allowed" => KERN_INVALID_ARGUMENT).
>
I've been expanding the comment in gnumach.defs:
/*
* Set a task virtual memory limit parameters
*
* HOST_PORT must be the privileged host control port
* if the caller desires to increase the current max limit.
*
* On the other hand, if the max limit is being decreased, the
* unprivileged host name (as returned by mach_host_self()) can
* be provided.
*
* Returns:
* - KERN_SUCESS when the size limit has been set
* - KERN_INVALID_ARGUMENT in the following cases:
* * current_limit > max_limit
* * map is NULL
* * attemt to increase max limi without providing
* the privileged control host port.
*/
Should I still return KERN_INVALID_HOST in case the "host port dance" fails?
Or is it ok to map every error to KERN_INVALID_ARGUMENT?
> > + vm_deallocate(mach_task_self(), mem, (128l * 1024l));
>
> Could check for the error here as well :)
>
> > + err = vm_set_size_limit(host_priv(), mach_task_self(), M_128M, M_512M *
> > 2);
> > + ASSERT(err == KERN_SUCCESS, "privileged tasks shall be allowed to
> > increase the max limit");
>
> And perhaps check that you can alloc that bigger chunk now?
>
Ok
> > diff --git a/vm/vm_map.c b/vm/vm_map.c
> > index 03d22ea1..eded31a0 100644
> > --- a/vm/vm_map.c
> > +++ b/vm/vm_map.c
> > @@ -189,6 +189,7 @@ void vm_map_setup(
> >
> > map->size = 0;
> > map->size_wired = 0;
> > + map->size_none = 0;
> > map->ref_count = 1;
> > map->pmap = pmap;
> > map->min_offset = min;
> > @@ -198,6 +199,9 @@ void vm_map_setup(
> > map->first_free = vm_map_to_entry(map);
> > map->hint = vm_map_to_entry(map);
> > map->name = NULL;
> > + /* TODO add to default limit the swap size */
> > + map->size_cur_limit = vm_page_mem_size() / 2;
> > + map->size_max_limit = vm_page_mem_size() / 2;
>
> Perhaps also skip this for kernel maps?
>
Like this?
if (pmap != kernel_pmap) {
map->size_cur_limit = vm_page_mem_size() / 2;
map->size_max_limit = vm_page_mem_size() / 2;
} else {
map->size_cur_limit = (~0UL);
map->size_max_limit = (~0UL);
}
> > vm_map_lock_init(map);
> > simple_lock_init(&map->ref_lock);
> > simple_lock_init(&map->hint_lock);
> > @@ -268,6 +272,49 @@ void vm_map_unlock(struct vm_map *map)
> > lock_write_done(&map->lock);
> > }
> > @@ -789,6 +836,10 @@ kern_return_t vm_map_find_entry(
> > vm_map_entry_t entry, new_entry;
> > vm_offset_t start;
> > vm_offset_t end;
> > + kern_return_t err;
> > +
> > + if ((err = vm_map_enforce_limit(map, size, "vm_map_find_entry")) !=
> > KERN_SUCCESS)
> > + return err;
>
> Hm, so the map is said to be locked, good. But: we don't have
> max_protection here (?), so we cannot skip the check for max_prot ==
> VM_PROT_NONE. Oh well.
>
This is called by projected buffers (which seems to be kernel memory mapped
into user maps and are locate in vm/vm_kern.c).
There it says:
* The user is precluded from manipulating the VM entry of this buffer
* (i.e. changing protection, inheritance or machine attributes).
So hopefully it doesn't make sense to make the protection VM_PROT_NONE.
Also, I couldn't find any callers to projected_buffer_allocate :/
> > @@ -1160,6 +1221,7 @@ kern_return_t vm_map_enter(
> >
> > vm_map_entry_link(map, entry, new_entry);
> > map->size += size;
> > + map->size_none += ((max_protection == VM_PROT_NONE) ? size : 0);
>
> I'd rather have written this as just
>
> if (max_protection == VM_PROT_NONE)
> map->size_none += size;
>
I changed to your suggestion. Better not to try to "smart" unneedlessly.
> > @@ -2882,6 +2945,11 @@ kern_return_t vm_map_copyout(
> > return KERN_NO_SPACE;
> > }
> >
> > + if ((kr = vm_map_enforce_limit(dst_map, size, "vm_map_copyout")) !=
> > KERN_SUCCESS) {
> > + vm_map_unlock(dst_map);
> > + return kr;
> > + }
>
> Ok. Perhaps add a comment mentioning that the new entries are
> protected VM_PROT_DEFAULT / VM_PROT_ALL, hence no need to adjust
> map->size_none.
>
Added the comment where `dst_map->size` is incremented.
> > +
> > +/*
> > + * vm_get_size_limit
> > + *
> > + * Gets the current/maximum virtual adress space limits
> > + * of the provided `map`.
> > + */
> > +kern_return_t
> > +vm_get_size_limit(
> > + vm_map_t map,
> > + vm_size_t *current_limit,
> > + vm_size_t *max_limit)
> > +{
>
> Missing the check for map == VM_MAP_NULL.
>
Hmm, I've got an error from mach_msg when the receiver is NULL.
But I'll add it just in case.
> > + vm_map_lock_read(map);
> > + *current_limit = map->size_cur_limit;
> > + *max_limit = map->size_max_limit;
> > + vm_map_unlock_read(map);
> > +
> > + return KERN_SUCCESS;
> > +}
>
> And remember, you still have to handle the cases of:
> - deallocating a VM_PROT_NONE mapping,
Isn't this hunk enough?
@@ -2042,6 +2104,7 @@ void vm_map_entry_delete(
vm_map_entry_unlink(map, entry);
map->size -= size;
+ map->size_none -= ((entry->max_protection == VM_PROT_NONE) ? size :
0);
vm_map_entry_dispose(map, entry);
}
> - downgrading an existing mapping to VM_PROT_NONE.
Hmm, maybe I should start tracking the protection instead of the max_protection?
Or do you mean calls to vm_map_protect with set_max == TRUE?