Re: [PATCH gnumach v1] Implement per task virtual memory limit

Diego Nieto Cid Tue, 24 Dec 2024 09:11:43 -0800


On Tue, Dec 24, 2024 at 12:44:49PM +0300, Sergey Bugaev wrote:
> > +  ASSERT(err == KERN_SUCCESS, "cannot set VM limits");
> 
> There's ASSERT_RET, which also stringifies the error code.
>


Good, that's much better.

> > +  /* check limits are actually saved */
> > +  err = vm_get_size_limit(mach_task_self(), &cur, &max);
> > +  ASSERT(err == KERN_SUCCESS, "getting the VM limit failed");
> > +  ASSERT(cur == M_128M, "cur limit was not expected");
> > +  ASSERT(max == M_512M, "max limit was not expected");
> > +
> > +  /* check we can no longer increase the hard limit */
> > +  err = vm_set_size_limit(mach_host_self(), mach_task_self(), M_128M, 
> > M_512M * 2);
> > +  ASSERT(err == KERN_INVALID_HOST, "raising VM hard limit shall fail");
> 
> Well, uh. Looking at it like this, I wonder if this should return
> KERN_INVALID_ARGUMENT or something in this case, not
> KERN_INVALID_HOST. "Invalid host" makes sense if the way you think of
> it is "I want to increase the size limit, would this host port
> suffice"; but when written out like this it reads the other way
> around: here's a host port, I'm attempting to set the limit ("no
> sorry, increasing the limit is not allowed" => KERN_INVALID_ARGUMENT).
>

I've been expanding the comment in gnumach.defs:

    /*
     *      Set a task virtual memory limit parameters
     *
     *      HOST_PORT must be the privileged host control port
     *      if the caller desires to increase the current max limit.
     *
     *      On the other hand, if the max limit is being decreased, the
     *      unprivileged host name (as returned by mach_host_self()) can
     *      be provided.
     *
     *      Returns:
     *        - KERN_SUCESS when the size limit has been set
     *        - KERN_INVALID_ARGUMENT in the following cases:
     *            * current_limit > max_limit
     *            * map is NULL
     *            * attemt to increase max limi without providing
     *              the privileged control host port.
     */

Should I still return KERN_INVALID_HOST in case the "host port dance" fails?
Or is it ok to map every error to KERN_INVALID_ARGUMENT?

> > +  vm_deallocate(mach_task_self(), mem, (128l * 1024l));
> 
> Could check for the error here as well :)
> 
> > +  err = vm_set_size_limit(host_priv(), mach_task_self(), M_128M, M_512M * 
> > 2);
> > +  ASSERT(err == KERN_SUCCESS, "privileged tasks shall be allowed to 
> > increase the max limit");
> 
> And perhaps check that you can alloc that bigger chunk now?
> 

Ok

> > diff --git a/vm/vm_map.c b/vm/vm_map.c
> > index 03d22ea1..eded31a0 100644
> > --- a/vm/vm_map.c
> > +++ b/vm/vm_map.c
> > @@ -189,6 +189,7 @@ void vm_map_setup(
> >
> >         map->size = 0;
> >         map->size_wired = 0;
> > +       map->size_none = 0;
> >         map->ref_count = 1;
> >         map->pmap = pmap;
> >         map->min_offset = min;
> > @@ -198,6 +199,9 @@ void vm_map_setup(
> >         map->first_free = vm_map_to_entry(map);
> >         map->hint = vm_map_to_entry(map);
> >         map->name = NULL;
> > +       /* TODO add to default limit the swap size */
> > +       map->size_cur_limit = vm_page_mem_size() / 2;
> > +       map->size_max_limit = vm_page_mem_size() / 2;
> 
> Perhaps also skip this for kernel maps?
> 

Like this?

    if (pmap != kernel_pmap) {
        map->size_cur_limit = vm_page_mem_size() / 2;
        map->size_max_limit = vm_page_mem_size() / 2;
    } else {
        map->size_cur_limit = (~0UL);
        map->size_max_limit = (~0UL);
    }


> >         vm_map_lock_init(map);
> >         simple_lock_init(&map->ref_lock);
> >         simple_lock_init(&map->hint_lock);
> > @@ -268,6 +272,49 @@ void vm_map_unlock(struct vm_map *map)
> >         lock_write_done(&map->lock);
> >  }
> > @@ -789,6 +836,10 @@ kern_return_t vm_map_find_entry(
> >         vm_map_entry_t  entry, new_entry;
> >         vm_offset_t     start;
> >         vm_offset_t     end;
> > +       kern_return_t   err;
> > +
> > +       if ((err = vm_map_enforce_limit(map, size, "vm_map_find_entry")) != 
> > KERN_SUCCESS)
> > +               return err;
> 
> Hm, so the map is said to be locked, good. But: we don't have
> max_protection here (?), so we cannot skip the check for max_prot ==
> VM_PROT_NONE. Oh well.
>

This is called by projected buffers (which seems to be kernel memory mapped
into user maps and are locate in vm/vm_kern.c).

There it says:

     *      The user is precluded from manipulating the VM entry of this buffer
     *      (i.e. changing protection, inheritance or machine attributes).

So hopefully it doesn't make sense to make the protection VM_PROT_NONE.

Also, I couldn't find any callers to projected_buffer_allocate :/

> > @@ -1160,6 +1221,7 @@ kern_return_t vm_map_enter(
> >
> >         vm_map_entry_link(map, entry, new_entry);
> >         map->size += size;
> > +       map->size_none += ((max_protection == VM_PROT_NONE) ? size : 0);
> 
> I'd rather have written this as just
> 
> if (max_protection == VM_PROT_NONE)
>         map->size_none += size;
>

I changed to your suggestion. Better not to try to "smart" unneedlessly.

> > @@ -2882,6 +2945,11 @@ kern_return_t vm_map_copyout(
> >                 return KERN_NO_SPACE;
> >         }
> >
> > +       if ((kr = vm_map_enforce_limit(dst_map, size, "vm_map_copyout")) != 
> > KERN_SUCCESS) {
> > +               vm_map_unlock(dst_map);
> > +               return kr;
> > +       }
> 
> Ok. Perhaps add a comment mentioning that the new entries are
> protected VM_PROT_DEFAULT / VM_PROT_ALL, hence no need to adjust
> map->size_none.
>

Added the comment where `dst_map->size` is incremented.

> > +
> > +/*
> > + *     vm_get_size_limit
> > + *
> > + *     Gets the current/maximum virtual adress space limits
> > + *     of the provided `map`.
> > + */
> > +kern_return_t
> > +vm_get_size_limit(
> > +       vm_map_t        map,
> > +       vm_size_t       *current_limit,
> > +       vm_size_t       *max_limit)
> > +{
> 
> Missing the check for map == VM_MAP_NULL.
>

Hmm, I've got an error from mach_msg when the receiver is NULL.
But I'll add it just in case.

> > +       vm_map_lock_read(map);
> > +       *current_limit = map->size_cur_limit;
> > +       *max_limit = map->size_max_limit;
> > +       vm_map_unlock_read(map);
> > +
> > +       return KERN_SUCCESS;
> > +}
> 
> And remember, you still have to handle the cases of:
> - deallocating a VM_PROT_NONE mapping,

Isn't this hunk enough?

    @@ -2042,6 +2104,7 @@ void vm_map_entry_delete(
    
            vm_map_entry_unlink(map, entry);
            map->size -= size;
    +       map->size_none -= ((entry->max_protection == VM_PROT_NONE) ? size : 
0);
    
            vm_map_entry_dispose(map, entry);
     }

> - downgrading an existing mapping to VM_PROT_NONE.

Hmm, maybe I should start tracking the protection instead of the max_protection?
Or do you mean calls to vm_map_protect with set_max == TRUE?

Re: [PATCH gnumach v1] Implement per task virtual memory limit

Reply via email to