Re: [PATCH] new interface: memory_object_get_proxy

2021-11-07 Thread Samuel Thibault
Hello,

Thanks!

Applied at last: there were various typing errors.

Samuel



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-06 Thread Joan Lledó
Hi,

I tested it and everything works fine. I'm attaching the two patches for 
convenience.

About the name: feel free to change it if you want, although I think that
if we rename it, the function should be moved to vm_map.c, maybe just below
vm_region().





Re: [PATCH] new interface: memory_object_get_proxy

2021-11-04 Thread Sergey Bugaev
On Sat, Oct 30, 2021 at 9:38 AM Joan Lledó  wrote:
> On 24/10/21 at 19:50, Sergey Bugaev wrote:
> > Naming: perhaps memory_object_create_vm_proxy ()? or even
> > memory_object_create_task_vm_proxy ()?
>
> I don't care about the name, you guys decide.

Actually, how about vm_region_create_proxy ()? It has a nice symmetry to it:

memory_object_create_proxy () — create a proxy for the given memory object(s)
vm_region_create_proxy () — create a proxy for the given VM region(s)

So we'd say that it's the '_create_proxy' part that is common between
the proxy RPC names, not the 'memory_object_' prefix. After all, the
proxy RPCs are not the actual memory_object_... () RPCs that memory
objects themselves implement.

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Joan Lledó

Hi,

On 1/11/21 at 17:47, Sergey Bugaev wrote:

With this diff (on top of your patch), it finally works sanely for me:


Cool, great work. I'd like to try it myself, but I won't have the time
until next weekend. I'll merge your changes with mine and run my tests.




Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Samuel Thibault
Sergey Bugaev, on Mon. 01 Nov 2021 19:47:06 +0300, wrote:
> That, and also proxies actually don't support non-zero offset, but
> they do support non-zero 'start' (I don't know what the difference was
> supposed to be between the two).

offset is inside the original object, and start is in the proxy object.
Think for instance of /dev/mem, within which you could want to map some
VGA-compatible range at 0xA (that's the start), but the mapping
range is a subpart of the memory of some device (that's the offset).

Samuel



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Samuel Thibault
Sergey Bugaev, on Mon. 01 Nov 2021 22:59:59 +0300, wrote:
> On Mon, Nov 1, 2021 at 10:36 PM Samuel Thibault wrote:
> > offset is inside the original object, and start is in the proxy object.
> > Think for instance of /dev/mem, within which you could want to map some
> > VGA-compatible range at 0xA (that's the start), but the mapping
> > range is a subpart of the memory of some device (that's the offset).
> 
> Makes sense, thank you. But that means the code in vm_map () is wrong:
> it'd need to subtract 'start' and add 'offset' of the proxy to the
> user-specified 'offset'. Currently it adds 'start' and ignores
> 'offset'.

Ah, sorry. Re-reading the comment of the RPC, it's the converse of what
I said: start is in the original memory object, and offset is in the
proxy object.
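
To pin the corrected meaning down, here is a hypothetical user-space
sketch (the helper and its arguments are made up for illustration, and,
as noted elsewhere in the thread, the current implementation only
accepts a zero 'offset'):

#include <mach.h>

/* Expose one page of MEMOBJ, starting at START_IN_OBJECT, through a
   read-only proxy.  Per the corrected comment: 'start' is a position
   inside the original memory object, 'offset' a position inside the
   proxy.  */
kern_return_t
make_page_proxy (memory_object_t memobj, vm_offset_t start_in_object,
                 mach_port_t *proxy)
{
  vm_offset_t offset = 0;              /* inside the proxy */
  vm_offset_t start = start_in_object; /* inside the original object */
  vm_offset_t len = vm_page_size;

  return memory_object_create_proxy (mach_task_self (), VM_PROT_READ,
                                     &memobj, 1,
                                     &offset, 1,
                                     &start, 1,
                                     &len, 1,
                                     proxy);
}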

Samuel



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
On Mon, Nov 1, 2021 at 10:36 PM Samuel Thibault  wrote:
> offset is inside the original object, and start is in the proxy object.
> Think for instance of /dev/mem, within which you could want to map some
> VGA-compatible range at 0xA (that's the start), but the mapping
> range is a subpart of the memory of some device (that's the offset).

Makes sense, thank you. But that means the code in vm_map () is wrong:
it'd need to subtract 'start' and add 'offset' of the proxy to the
user-specified 'offset'. Currently it adds 'start' and ignores
'offset'.

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
On Mon, Nov 1, 2021 at 7:13 PM Sergey Bugaev  wrote:
> But again, let's try to get the simple version working first. As I've
> said, with vm_object_pager_create () it kind of works, except it
> doesn't give me the actual data for whatever reason.

Phew, I figured out why that was!

We were not calculating the offset correctly. We have to take into
account the offset of our address inside the entry, not only the
offset of the entry in its memory object (compare to vm_map.c:4694).
That, and also proxies actually don't support non-zero offset, but
they do support non-zero 'start' (I don't know what the difference was
supposed to be between the two).

With this diff (on top of your patch), it finally works sanely for me:

diff --git a/vm/memory_object_proxy.c b/vm/memory_object_proxy.c
index 82b3611c..4343c12e 100644
--- a/vm/memory_object_proxy.c
+++ b/vm/memory_object_proxy.c
@@ -201,6 +201,7 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
 {
   kern_return_t ret;
   vm_map_entry_t entry, tmp_entry;
+  vm_object_t object;
   vm_offset_t offset, start;
   ipc_port_t pager;

@@ -217,16 +218,28 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
     entry = tmp_entry;
   }

+  if (entry->is_sub_map) {
+    vm_map_unlock_read(task->map);
+    return(KERN_INVALID_ARGUMENT);
+  }
+
   /* Limit the allowed protection and range to the entry ones */
   if (len > entry->vme_end - entry->vme_start) {
     vm_map_unlock_read(task->map);
     return(KERN_INVALID_ARGUMENT);
   }
-
   max_protection &= entry->max_protection;
-  pager = ipc_port_copy_send(entry->object.vm_object->pager);
-  offset = entry->offset;
-  start = 0;
+
+  object = entry->object.vm_object;
+  vm_object_lock(object);
+  /* Create a pager in case this is an internal object that does
+     not yet have one. */
+  vm_object_pager_create(object);
+  pager = ipc_port_copy_send(object->pager);
+  vm_object_unlock(object);
+
+  start = (address - entry->vme_start) + entry->offset;
+  offset = 0;

   vm_map_unlock_read(task->map);

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
On Mon, Nov 1, 2021 at 6:44 PM Joan Lledó  wrote:
> > * "Anonymous" mappings (created with a null memory object)
>
> If they provide no object when calling vm_map, then the kernel uses the
> default pager, doesn't it?

Conceptually, yes. But the way it's implemented, the object->pager
port stays null until it has to be paged out (due to lack of memory),
and only then is a pager port created and sent to the default pager.
So what I've been trying to do by calling vm_object_pager_create () is
to force this to happen sooner.
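
In kernel terms, the idea is roughly this (a sketch of the intent; the
actual diff is in my other mail):

  object = entry->object.vm_object;
  vm_object_lock (object);
  /* For an internal (anonymous) object this allocates the pager port
     now and hands the object over to the default pager, instead of
     waiting for the first page-out.  */
  vm_object_pager_create (object);
  pager = ipc_port_copy_send (object->pager);
  vm_object_unlock (object);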

> Can't you send a reference to the default pager
> to memory_object_create_proxy()?

Yes, that's the plan. I'm not sure I _like_ having to involve the
default pager even when we don't actually need its paging
functionality, simply to make some VM operation representable; but
let's at least get *that* working, and then see if we maybe need a way
to avoid the default pager altogether.

> The RPC is supposed to return a proxy, and proxies always need an original object
> to proxy. If there's no object, I'm not sure how this RPC should behave.

We could make proxies reference either a memory object (as in, a pager
port — like now) or a vm_object; this way we wouldn't need to involve
the defpager.

But again, let's try to get the simple version working first. As I've
said, with vm_object_pager_create () it kind of works, except it
doesn't give me the actual data for whatever reason.

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Joan Lledó

Hi,

On 1/11/21 at 14:36, Sergey Bugaev wrote:

* "Anonymous" mappings (created with a null memory object)


If they provide no object when calling vm_map, then the kernel uses the 
default pager, doesn't it? Can't you send a reference to the default pager
to memory_object_create_proxy()?


The RPC is supposed to return a proxy, and proxies always need an original
object to proxy. If there's no object, I'm not sure how this RPC should behave.




Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
So I've applied the patch and have tried to actually exercise the new
functionality.

And unfortunately, I'm reminded (by a kernel panic and a corrupted
filesystem) of the two important cases that we have forgotten:

* Submaps
* "Anonymous" mappings (created with a null memory object)

Submaps could be doable by recursing down through them — or really, we
can just explicitly not support submaps; they only seem to be relevant
for the kernel's own memory map.

Anonymous mappings are exactly what we're interested in for
realloc/mremap, so it's important that we support them. After reading
up on how memory objects (pagers) and vm_objects and pages and
whatnot work within the kernel, I believe a call to
vm_object_pager_create () should be sufficient to "realize" a
previously-anonymous mapping. I've applied these changes to your
patch:

--- a/vm/memory_object_proxy.c
+++ b/vm/memory_object_proxy.c
@@ -201,6 +201,7 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
 {
   kern_return_t ret;
   vm_map_entry_t entry, tmp_entry;
+  vm_object_t object;
   vm_offset_t offset, start;
   ipc_port_t pager;

@@ -217,14 +218,26 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
     entry = tmp_entry;
   }

+  if (entry->is_sub_map) {
+    vm_map_unlock_read(task->map);
+    return(KERN_INVALID_ARGUMENT);
+  }
+
   /* Limit the allowed protection and range to the entry ones */
   if (len > entry->vme_end - entry->vme_start) {
     vm_map_unlock_read(task->map);
     return(KERN_INVALID_ARGUMENT);
   }
-
   max_protection &= entry->max_protection;
-  pager = ipc_port_copy_send(entry->object.vm_object->pager);
+
+  object = entry->object.vm_object;
+  vm_object_lock(object);
+  /* Create a pager in case this is an internal object that does
+     not yet have one. */
+  vm_object_pager_create(object);
+  pager = ipc_port_copy_send(object->pager);
+  vm_object_unlock(object);
+
   offset = entry->offset;
   start = 0;

And now the kernel at least no longer crashes, even after multiple
iterations of this (good!). I am able to successfully create a proxy
and map it, and it does find the same underlying vm_object_t.
vm_region () also shows the same name and offset between the old and
the new mappings, so all is good in theory.

The only problem is that when I try to actually read data from the new
mapping, I get complete garbage :(
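
For reference, the kind of test I mean looks roughly like this (a
sketch rather than my actual test program; it assumes a user-side stub
for the new RPC is available):

#include <mach.h>
#include <stdio.h>
#include <string.h>

int
main (void)
{
  vm_address_t orig = 0, copy = 0;
  mach_port_t proxy;
  kern_return_t kr;

  /* Back an anonymous region with recognizable data.  */
  kr = vm_allocate (mach_task_self (), &orig, vm_page_size, TRUE);
  if (kr != KERN_SUCCESS)
    return 1;
  memset ((void *) orig, 0xAB, vm_page_size);

  /* Ask the kernel for a proxy covering the region.  */
  kr = memory_object_get_proxy (mach_task_self (), orig,
                                VM_PROT_READ | VM_PROT_WRITE,
                                vm_page_size, &proxy);
  if (kr != KERN_SUCCESS)
    return 1;

  /* Map the proxy elsewhere and compare the two views.  */
  kr = vm_map (mach_task_self (), &copy, vm_page_size, 0, TRUE,
               proxy, 0, FALSE, VM_PROT_READ, VM_PROT_READ,
               VM_INHERIT_NONE);
  if (kr != KERN_SUCCESS)
    return 1;

  puts (memcmp ((void *) orig, (void *) copy, vm_page_size)
        ? "garbage" : "same data");
  return 0;
}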

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Joan Lledó
Here you go





Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
On Mon, Nov 1, 2021 at 12:32 PM Joan Lledó  wrote:
> +kern_return_t
> +memory_object_get_proxy (task_t task, const vm_offset_t address,
> +                         vm_prot_t max_protection, vm_offset_t len,
> +                         ipc_port_t *port)
> +{
> +  kern_return_t err;

Super minor nitpick: this variable is typically named 'kr' or 'ret',
it would seem. 'err' is more of a userspace thing with glibc's
error_t.

> +  vm_map_lock_read(task->map);
> +  if (!vm_map_lookup_entry(task->map, address, &tmp_entry)) {
> +    if ((entry = tmp_entry->vme_next) == vm_map_to_entry(task->map)) {
> +      vm_map_unlock_read(task->map);
> +      return(KERN_NO_SPACE);
> +    }
> +  } else {
> +    entry = tmp_entry;
> +  }
> +
> +  /* Limit the allowed protection and range to the entry ones */
> +  if (len > entry->vme_end - entry->vme_start)
> +    return(KERN_INVALID_ARGUMENT);

You also need to unlock the map here.

> +  pager = ipc_port_copy_send(entry->object.vm_object->pager);
> +  offset = entry->offset;
> +  start = 0;
> +
> +  vm_map_unlock_read(task->map);
> +
> +  err = memory_object_create_proxy(task->itk_space, max_protection,
> +                                   &pager, 1,
> +                                   &offset, 1,
> +                                   &start, 1,
> +                                   &len, 1, port);
> +  if (err)
> +    ipc_port_release_send(pager);
> +
> +  return err;

Yes, this looks correct now, at least as far as I understand kernel
internals. Thank you! :)

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Sergey Bugaev
On Mon, Nov 1, 2021 at 12:33 PM Joan Lledó  wrote:
> Thanks for your explanations. It makes sense, but there's still one thing I
> don't understand: if memory_object_create_proxy() is the owner of the pager
> it receives as a parameter, and the caller doesn't care about releasing it,
> then where is it released?

When the proxy itself is destroyed. Namely, in
memory_object_proxy_notify () there's an ipc_port_release_send
(proxy->object) call.

> I modified my patch to meet all your points; I assumed the function to release
> the reference is ipc_port_release_send(). Tell me if I'm wrong.

Yeah, I think that's the one.

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-11-01 Thread Joan Lledó
Hi,

On 30/10/21 at 14:06, Sergey Bugaev wrote:
> I hope this makes sense; I'd be happy to expand if not.

Thanks for your explanations. It makes sense, but there's still one thing I
don't understand: if memory_object_create_proxy() is the owner of the pager it
receives as a parameter, and the caller doesn't care about releasing it, then
where is it released?

I modified my patch to meet all your points; I assumed the function to release
the reference is ipc_port_release_send(). Tell me if I'm wrong.

Thanks





Re: [PATCH] new interface: memory_object_get_proxy

2021-10-30 Thread Samuel Thibault
I forgot to answer this:

Sergey Bugaev, on Sat. 30 Oct 2021 15:06:30 +0300, wrote:
> > > Should the implementation cap the length to that of the entry
> > > silently, or should it return an error if called with an overly long
> > > len argument?
> > >
> >
> > I don't know, Samuel, what do you think?
> 
> Actually now that I think about it, it would probably make sense for
> this to, some day, transparently support proxying multiple regions,
> which is something that the memory object proxies API supports in
> theory (but it's currently unimplemented). So maybe it should return a
> "not yet supported" error for now.

Yes, though we don't have such an error; KERN_INVALID_ARGUMENT would be
fine enough.

Samuel



Re: [PATCH] new interface: memory_object_get_proxy

2021-10-30 Thread Sergey Bugaev
On Sat, Oct 30, 2021 at 3:06 PM Sergey Bugaev  wrote:
> But
> memory_object_create_proxy () does add a reference to the memory
> object by doing ipc_port_copy_send (object[0]). It would seem that it
> should not do that; but then I'm not sure why this doesn't cause a
> leak — I know Samuel has ensured that proxies don't cause leaks, at
> least not immediately visible ones.

Aha, that is the exact thing that Samuel fixed here [0] (and had
mentioned to me at the time).

[0]: https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/vm/memory_object_proxy.c?id=5179bcf28dac3fb39d7d4949964f038fe697bf4e

I was just testing the new version, but looking at the old version of
the source code in my local checkout. Sorry for any confusion that
I've caused by not adding 2 and 2 together.

So yes, you want to take a reference to the pager port while still
holding the lock, and then pass it over to the proxy, which takes
ownership of the port if it returns success — but if it doesn't, you
have to release the reference!

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-10-30 Thread Samuel Thibault
Joan Lledó, on Sat. 30 Oct 2021 08:38:23 +0200, wrote:
> > I don't think you can access the entry once you've unlocked the map.
> > 
> 
> You're probably right, because the entry doesn't seem to be accessed after
> the lock release anywhere else in the code.

You have to keep the lock, yes. The problem is: what if another thread
tries to unmap the address range concurrently? You want to keep the lock
held until you have acquired a ref on the memory object. Then you are
safe: the ref guarantees that the memory object won't disappear. And from
the rest of the discussion it seems you indeed need such a ref, which
you'll just transfer to the creation of the proxy.
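
In other words, the shape you want is (a sketch):

  vm_map_lock_read (task->map);
  /* ... look up the entry ... */
  /* Take the ref while the lock still protects the entry.  */
  pager = ipc_port_copy_send (entry->object.vm_object->pager);
  vm_map_unlock_read (task->map);
  /* The entry may be unmapped from now on, but the send right keeps
     the memory object alive; it will be handed over to the proxy.  */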

Samuel



Re: [PATCH] new interface: memory_object_get_proxy

2021-10-30 Thread Sergey Bugaev
On Sat, Oct 30, 2021 at 9:38 AM Joan Lledó  wrote:
> On 24/10/21 at 19:50, Sergey Bugaev wrote:
> > I would expect the request port argument to be a vm_task_t (i.e. a
> > vm_map), not a full task. But I see that you need to pass
> > task->itk_space to memory_object_create_proxy (). But
> > memory_object_create_proxy () doesn't actually need the task either,
> > it just checks it against NULL (as usual) and never uses it again. So
> > maybe it'd be cleaner to make both calls accept a vm_task_t? Not that
> > this matters.
>
> Yes, that's true, but if we remove the parameter from
> memory_object_create_proxy() then we must remove it from all calls in
> the code, and from the documentation if there's any. So that's beyond the
> scope of this patch; it's something we can do later on.

That argument is needed simply because it's the request port, so we
can't remove it entirely. What I meant is we could change its type
from ipc_space_t to vm_task_t; then the userspace would still be
passing a task port (task_t), but you'd pass a vm_map in the kernel.
And there aren't any other in-kernel callers of
memory_object_create_proxy () besides the one you're adding, so that
shouldn't be an issue either.

But you're right that this could be done separately if you prefer to
go with the full task_t for now.
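
To illustrate, the kernel-side change would amount to the type of the
request-port parameter (a hypothetical sketch, not a committed interface):

/* Current shape: the request port arrives as a full task.  */
kern_return_t memory_object_get_proxy (task_t task, const vm_offset_t address,
                                       vm_prot_t max_protection, vm_offset_t len,
                                       ipc_port_t *port);

/* With a vm_task_t request port, the kernel would receive the map
   directly; userspace would still pass a task port.  */
kern_return_t memory_object_get_proxy (vm_map_t map, const vm_offset_t address,
                                       vm_prot_t max_protection, vm_offset_t len,
                                       ipc_port_t *port);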

> > Should the implementation cap the length to that of the entry
> > silently, or should it return an error if called with an overly long
> > len argument?
> >
>
> I don't know, Samuel, what do you think?

Actually now that I think about it, it would probably make sense for
this to, some day, transparently support proxying multiple regions,
which is something that the memory object proxies API supports in
theory (but it's currently unimplemented). So maybe it should return a
"not yet supported" error for now.

> I'm not familiar with the concept of consuming objects; could you
> explain it to me?

Consuming, as in taking ownership of, or "eating", the passed-in reference.

If a call consumes something, the callee takes ownership of the
passed-in reference:

some_resource_t *global_resource;

void
callee (some_resource_t *argument)
{
  /* Take ownership.  */
  global_resource = argument;
}

void
caller ()
{
  some_resource_t *res = make_resource ();
  callee (res);
  /* No need to release the resource -- the callee took ownership of it.  */
}

If the call is *not* consuming something, the caller keeps ownership
of the reference, and the callee has to add another reference if it
wants to keep the resource for longer:

some_resource_t *global_resource;

void
callee (some_resource_t *argument)
{
  /* Add a reference since we're storing the resource for longer.  */
  some_resource_ref (argument);
  global_resource = argument;
}

void
caller ()
{
  some_resource_t *res = make_resource ();
  callee (res);
  some_resource_rele (res);
}

For MIG routines, the convention is that the routine implementation
takes ownership of all passed-in resources (except the request port)
if and only if it returns KERN_SUCCESS (aka 0) or MIG_NO_REPLY. In
other cases (so, for error returns) MIG runtime (or technically not
even MIG but rather mach_msg_server ()) will destroy any resources in
the message. If we look at mach_ports_register () in kern/ipc_tt.c, we
can see it copies the ports from the 'memory' array and takes
ownership of them (doesn't add any references). But
memory_object_create_proxy () does add a reference to the memory
object by doing ipc_port_copy_send (object[0]). It would seem that it
should not do that; but then I'm not sure why this doesn't cause a
leak — I know Samuel has ensured that proxies don't cause leaks, at
least not immediately visible ones.

So my point is, *if* we decide that memory_object_create_proxy () is
indeed supposed to consume the passed-in object reference, then your
routine should create that reference before calling
memory_object_create_proxy (), i.e. you should pass ipc_port_copy_send
(entry->object.vm_object->pager) instead of passing
entry->object.vm_object->pager directly.
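
In code, the pattern would look something like this (a sketch, under
the assumption that the proxy does consume the reference on success):

  /* Take our own send right; memory_object_create_proxy () would
     consume it on success.  */
  pager = ipc_port_copy_send (entry->object.vm_object->pager);

  err = memory_object_create_proxy (task->itk_space, max_protection,
                                    &pager, 1,
                                    &offset, 1,
                                    &start, 1,
                                    &len, 1, port);
  if (err)
    /* Error return: nothing was consumed, so release our reference.  */
    ipc_port_release_send (pager);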

I hope this makes sense; I'd be happy to expand if not.

Sergey



Re: [PATCH] new interface: memory_object_get_proxy

2021-10-29 Thread Joan Lledó

Hi,

On 24/10/21 at 19:50, Sergey Bugaev wrote:

Naming: perhaps memory_object_create_vm_proxy ()? or even
memory_object_create_task_vm_proxy ()?


I don't care about the name, you guys decide.


I would expect the request port argument to be a vm_task_t (i.e. a
vm_map), not a full task. But I see that you need to pass
task->itk_space to memory_object_create_proxy (). But
memory_object_create_proxy () doesn't actually need the task either,
it just checks it against NULL (as usual) and never uses it again. So
maybe it'd be cleaner to make both calls accept a vm_task_t? Not that
this matters.


Yes, that's true, but if we remove the parameter from
memory_object_create_proxy() then we must remove it from all calls in
the code, and from the documentation if there's any. So that's beyond the
scope of this patch; it's something we can do later on.




I don't think you can access the entry once you've unlocked the map.



You're probably right, because the entry doesn't seem to be accessed
after the lock release anywhere else in the code.



Should the implementation cap the length to that of the entry
silently, or should it return an error if called with an overly long
len argument?



I don't know, Samuel, what do you think?



I'm... not sure if this is wrong, but just to bring some attention to
it: isn't memory_object_create_proxy () supposed to *consume* the
'objects' ports array in the successful case? I see it uses
ipc_port_copy_send (object[0]), why does that not cause a leak? For a
reference point, mach_ports_register (), which is another kernel RPC
that accepts a port array, sure does consume it.

If it *is* the case that memory_object_create_proxy () should consume
the ports, then your new routine should make an extra ref to the pager
port before passing it to memory_object_create_proxy ().


I'm not familiar with the concept of consuming objects; could you
explain it to me?




Re: [PATCH] new interface: memory_object_get_proxy

2021-10-24 Thread Sergey Bugaev
On Sun, Oct 24, 2021 at 7:28 PM Joan Lledó  wrote:
> +routine memory_object_get_proxy(
> +   task: task_t;

Naming: perhaps memory_object_create_vm_proxy ()? or even
memory_object_create_task_vm_proxy ()?

I would expect the request port argument to be a vm_task_t (i.e. a
vm_map), not a full task. But I see that you need to pass
task->itk_space to memory_object_create_proxy (). But
memory_object_create_proxy () doesn't actually need the task either,
it just checks it against NULL (as usual) and never uses it again. So
maybe it'd be cleaner to make both calls accept a vm_task_t? Not that
this matters.

> +  vm_map_lock_read(task->map);
> +  if (!vm_map_lookup_entry(task->map, address, &tmp_entry)) {
> +    if ((entry = tmp_entry->vme_next) == vm_map_to_entry(task->map)) {
> +      vm_map_unlock_read(task->map);
> +      return(KERN_NO_SPACE);
> +    }
> +  } else {
> +    entry = tmp_entry;
> +  }
> +  vm_map_unlock_read(task->map);
> +
> +  /* Limit the allowed protection and range to the entry ones */
> +  max_protection &= entry->max_protection;
> +  if (len > entry->vme_end - entry->vme_start)
> +    len = entry->vme_end - entry->vme_start;
> +  offset = entry->offset;
> +  start = 0;

I don't think you can access the entry once you've unlocked the map.

(But note that I'm not very familiar with the kernel internals, so I
can be wrong about this.)

Should the implementation cap the length to that of the entry
silently, or should it return an error if called with an overly long
len argument?

> +
> +  return memory_object_create_proxy(task->itk_space, max_protection,
> +                                    &entry->object.vm_object->pager, 1,
> +                                    &offset, 1,
> +                                    &start, 1,
> +                                    &len, 1, port);
> +}

I'm... not sure if this is wrong, but just to bring some attention to
it: isn't memory_object_create_proxy () supposed to *consume* the
'objects' ports array in the successful case? I see it uses
ipc_port_copy_send (object[0]), why does that not cause a leak? For a
reference point, mach_ports_register (), which is another kernel RPC
that accepts a port array, sure does consume it.

If it *is* the case that memory_object_create_proxy () should consume
the ports, then your new routine should make an extra ref to the pager
port before passing it to memory_object_create_proxy ().

Overall: yes, this is what I was suggesting! Thank you for working on it!

Sergey