Re: [PATCH] new interface: memory_object_get_proxy
Hello,

Thanks! Applied at last: there were various typing errors.

Samuel
Re: [PATCH] new interface: memory_object_get_proxy
Hi,

I tested it and everything works fine. I'm attaching the two patches for convenience.

About the name, feel free to change it if you want, although I think if we rename it then the function should be moved to vm_map.c, maybe just below vm_region().
Re: [PATCH] new interface: memory_object_get_proxy
On Sat, Oct 30, 2021 at 9:38 AM Joan Lledó wrote:
> On 24/10/21 at 19:50, Sergey Bugaev wrote:
> > Naming: perhaps memory_object_create_vm_proxy ()? or even
> > memory_object_create_task_vm_proxy ()?
>
> I don't care about the name, you guys decide.

Actually, how about vm_region_create_proxy ()? It has a nice symmetry to it:

memory_object_create_proxy () — create a proxy for this memory object(s)
vm_region_create_proxy () — create a proxy for this VM region(s)

So we'd say that it's the '_create_proxy' part that is common between the proxy RPC names, not the 'memory_object_' prefix. After all, the proxy RPCs are not the actual memory_object_... () RPCs that memory objects themselves implement.

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
Hi,

On 1/11/21 at 17:47, Sergey Bugaev wrote:
> With this diff (on top of your patch), it finally works sanely for me:

Cool, great work. I'd like to try it myself but I won't have the time until next weekend. I'll merge your changes with mine and run my tests.
Re: [PATCH] new interface: memory_object_get_proxy
Sergey Bugaev, on Mon, 01 Nov 2021 19:47:06 +0300, wrote:
> That, and also proxies actually don't support non-zero offset, but
> they do support non-zero 'start' (I don't know what the difference was
> supposed to be between the two).

offset is inside the original object, and start is in the proxy object. Think for instance of /dev/mem, within which you could want to map some VGA-compatible range at 0xA0000 (that's the start), but the mapping range is a subpart of the memory of some device (that's the offset).

Samuel
Re: [PATCH] new interface: memory_object_get_proxy
Sergey Bugaev, on Mon, 01 Nov 2021 22:59:59 +0300, wrote:
> On Mon, Nov 1, 2021 at 10:36 PM Samuel Thibault wrote:
> > offset is inside the original object, and start is in the proxy object.
> > Think for instance of /dev/mem, within which you could want to map some
> > VGA-compatible range at 0xA0000 (that's the start), but the mapping
> > range is a subpart of the memory of some device (that's the offset).
>
> Makes sense, thank you. But that means the code in vm_map () is wrong:
> it'd need to subtract 'start' and add 'offset' of the proxy to the
> user-specified 'offset'. Currently it adds 'start' and ignores
> 'offset'.

Ah, sorry, re-reading the comment of the RPC: it's the converse of what I said: start is in the original memory object, and offset is in the proxy object.

Samuel
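A minimal sketch of this translation, under the corrected reading above (start relative to the original memory object, offset relative to the proxy). The helper name and the numbers are invented for illustration; this is not gnumach code:

```c
/* Hypothetical helper (not gnumach's): translate an offset given by the
   user relative to a proxy into an offset inside the original memory
   object, assuming the proxied window begins at 'start' in the original
   object and at 'offset' in the proxy.  */
unsigned long
proxy_to_object_offset (unsigned long start, unsigned long offset,
                        unsigned long user_offset)
{
  /* Add 'start', subtract the proxy-side 'offset'.  */
  return start + (user_offset - offset);
}
```

For example, a /dev/mem proxy with start = 0xA0000 and offset = 0 would translate user offset 0x1000 to 0xA1000 in the original object.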
Re: [PATCH] new interface: memory_object_get_proxy
On Mon, Nov 1, 2021 at 10:36 PM Samuel Thibault wrote:
> offset is inside the original object, and start is in the proxy object.
> Think for instance of /dev/mem, within which you could want to map some
> VGA-compatible range at 0xA0000 (that's the start), but the mapping
> range is a subpart of the memory of some device (that's the offset).

Makes sense, thank you.

But that means the code in vm_map () is wrong: it'd need to subtract 'start' and add 'offset' of the proxy to the user-specified 'offset'. Currently it adds 'start' and ignores 'offset'.

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
On Mon, Nov 1, 2021 at 7:13 PM Sergey Bugaev wrote:
> But again, let's try to get the simple version working first. As I've
> said, with vm_object_pager_create () it kind of works, except it
> doesn't give me the actual data for whatever reason.

Phew, I figured out why that was! We were not calculating the offset correctly. We have to take into account the offset of our address inside the entry, not only the offset of the entry in its memory object (compare to vm_map.c:4694).

That, and also proxies actually don't support non-zero offset, but they do support non-zero 'start' (I don't know what the difference was supposed to be between the two).

With this diff (on top of your patch), it finally works sanely for me:

diff --git a/vm/memory_object_proxy.c b/vm/memory_object_proxy.c
index 82b3611c..4343c12e 100644
--- a/vm/memory_object_proxy.c
+++ b/vm/memory_object_proxy.c
@@ -201,6 +201,7 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
 {
   kern_return_t ret;
   vm_map_entry_t entry, tmp_entry;
+  vm_object_t object;
   vm_offset_t offset, start;
   ipc_port_t pager;

@@ -217,16 +218,28 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
     entry = tmp_entry;
   }

+  if (entry->is_sub_map) {
+    vm_map_unlock_read(task->map);
+    return(KERN_INVALID_ARGUMENT);
+  }
+
   /* Limit the allowed protection and range to the entry ones */
   if (len > entry->vme_end - entry->vme_start) {
     vm_map_unlock_read(task->map);
     return(KERN_INVALID_ARGUMENT);
   }
-  max_protection &= entry->max_protection;
-  pager = ipc_port_copy_send(entry->object.vm_object->pager);
-  offset = entry->offset;
-  start = 0;
+
+  object = entry->object.vm_object;
+  vm_object_lock(object);
+  /* Create a pager in case this is an internal object that does
+     not yet have one.  */
+  vm_object_pager_create(object);
+  pager = ipc_port_copy_send(object->pager);
+  vm_object_unlock(object);
+
+  start = (address - entry->vme_start) + entry->offset;
+  offset = 0;

   vm_map_unlock_read(task->map);

Sergey
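The corrected 'start' computation in the diff above is just the address's displacement inside the map entry plus the entry's own offset into its backing object. A toy illustration of that arithmetic; the struct and numbers are made up and much simpler than gnumach's real vm_map_entry:

```c
/* Toy model of a vm_map entry, for illustration only.  */
struct toy_entry
{
  unsigned long vme_start;  /* where the entry begins in the address space */
  unsigned long vme_end;    /* where it ends */
  unsigned long offset;     /* the entry's offset into its backing object */
};

/* Mirrors: start = (address - entry->vme_start) + entry->offset */
unsigned long
toy_start_for_address (const struct toy_entry *e, unsigned long address)
{
  return (address - e->vme_start) + e->offset;
}
```

So an address one page into an entry that itself sits some pages into its object yields the sum of both displacements, which is what was missing from the first version of the patch.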
Re: [PATCH] new interface: memory_object_get_proxy
On Mon, Nov 1, 2021 at 6:44 PM Joan Lledó wrote:
> > * "Anonymous" mappings (created with a null memory object)
>
> If they provide no object when calling vm_map, then the kernel uses the
> default pager, isn't it?

Conceptually, yes. But the way it's implemented, the object->pager port stays null until it has to be paged out (due to lack of memory), and only then is a pager port created and sent to the default pager. So what I've been trying to do by calling vm_object_pager_create () is to force this to happen sooner.

> Can't you send a reference to the default pager
> to memory_object_create_proxy()?

Yes, that's the plan. I'm not sure I _like_ having to involve the default pager even when we don't actually need its paging functionality, simply to make some VM operation representable; but let's at least get *that* working, and then see if we maybe need a way to avoid the default pager altogether.

> The RPC is to return a proxy, and proxies always need an original object
> to proxy, if there's no object, I'm not sure how this RPC should behave

We could make proxies reference either a memory object (as in, a pager port — like now) or a vm_object; this way we wouldn't need to involve the defpager.

But again, let's try to get the simple version working first. As I've said, with vm_object_pager_create () it kind of works, except it doesn't give me the actual data for whatever reason.

Sergey
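The lazy pager creation described here can be modelled in miniature: the pager stays null until something forces its creation, and forcing it a second time is harmless. This is only an analogy with invented names, not gnumach's actual vm_object_pager_create ():

```c
#include <stdlib.h>

/* Toy stand-in; gnumach's vm_object and pager are far more involved.  */
struct toy_vm_object
{
  void *pager;  /* stays NULL until a pager is actually needed */
};

/* Analogue of vm_object_pager_create (): create the pager on demand,
   doing nothing if one already exists.  */
void
toy_pager_create (struct toy_vm_object *object)
{
  if (object->pager == NULL)
    object->pager = malloc (1);  /* stand-in for allocating a pager port */
}
```

Calling the helper before handing out object->pager is exactly the "force this to happen sooner" step: after it returns, the pager field is guaranteed non-null.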
Re: [PATCH] new interface: memory_object_get_proxy
Hi,

On 1/11/21 at 14:36, Sergey Bugaev wrote:
> * "Anonymous" mappings (created with a null memory object)

If they provide no object when calling vm_map, then the kernel uses the default pager, isn't it? Can't you send a reference to the default pager to memory_object_create_proxy()?

The RPC is to return a proxy, and proxies always need an original object to proxy; if there's no object, I'm not sure how this RPC should behave.
Re: [PATCH] new interface: memory_object_get_proxy
So I've applied the patch and have tried to actually exercise the new functionality. And unfortunately, I'm reminded (by a kernel panic and a corrupted filesystem) of the two important cases that we have forgotten:

* Submaps
* "Anonymous" mappings (created with a null memory object)

Submaps could be doable by recursing down through them — or really, we can just explicitly not support submaps, they only seem to be relevant for the kernel's own memory map. Anonymous mappings are exactly what we're interested in for realloc/mremap, so it's important that we support them.

After reading up on how memory objects (pagers) and vm_object's and pages and whatnot work within the kernel, I believe a call to vm_object_pager_create () should be sufficient to "realize" a previously-anonymous mapping. I've applied these changes to your patch:

--- a/vm/memory_object_proxy.c
+++ b/vm/memory_object_proxy.c
@@ -201,6 +201,7 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
 {
   kern_return_t ret;
   vm_map_entry_t entry, tmp_entry;
+  vm_object_t object;
   vm_offset_t offset, start;
   ipc_port_t pager;

@@ -217,14 +218,26 @@ memory_object_get_proxy (task_t task, const vm_offset_t address,
     entry = tmp_entry;
   }

+  if (entry->is_sub_map) {
+    vm_map_unlock_read(task->map);
+    return(KERN_INVALID_ARGUMENT);
+  }
+
   /* Limit the allowed protection and range to the entry ones */
   if (len > entry->vme_end - entry->vme_start) {
     vm_map_unlock_read(task->map);
     return(KERN_INVALID_ARGUMENT);
   }
-  max_protection &= entry->max_protection;
-  pager = ipc_port_copy_send(entry->object.vm_object->pager);
+
+  object = entry->object.vm_object;
+  vm_object_lock(object);
+  /* Create a pager in case this is an internal object that does
+     not yet have one.  */
+  vm_object_pager_create(object);
+  pager = ipc_port_copy_send(object->pager);
+  vm_object_unlock(object);
+
   offset = entry->offset;
   start = 0;

And now the kernel at least no longer crashes, even after multiple iterations of this (good!). I am able to successfully create a proxy and map it, and it does find the same underlying vm_object_t. vm_region () also shows the same name and offset between the old and the new mappings, so all is good in theory.

The only problem is when I try to actually read data from the new mapping, I get complete garbage :(

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
Here you go
Re: [PATCH] new interface: memory_object_get_proxy
On Mon, Nov 1, 2021 at 12:32 PM Joan Lledó wrote:
> +kern_return_t
> +memory_object_get_proxy (task_t task, const vm_offset_t address,
> +                         vm_prot_t max_protection, vm_offset_t len,
> +                         ipc_port_t *port)
> +{
> +  kern_return_t err;

Super minor nitpick: this variable is typically named 'kr' or 'ret', it would seem. 'err' is more of a userspace thing with glibc's error_t.

> +  vm_map_lock_read(task->map);
> +  if (!vm_map_lookup_entry(task->map, address, &tmp_entry)) {
> +    if ((entry = tmp_entry->vme_next) == vm_map_to_entry(task->map)) {
> +      vm_map_unlock_read(task->map);
> +      return(KERN_NO_SPACE);
> +    }
> +  } else {
> +    entry = tmp_entry;
> +  }
> +
> +  /* Limit the allowed protection and range to the entry ones */
> +  if (len > entry->vme_end - entry->vme_start)
> +    return(KERN_INVALID_ARGUMENT);

You also need to unlock the map here.

> +  pager = ipc_port_copy_send(entry->object.vm_object->pager);
> +  offset = entry->offset;
> +  start = 0;
> +
> +  vm_map_unlock_read(task->map);
> +
> +  err = memory_object_create_proxy(task->itk_space, max_protection,
> +                                   &pager, 1,
> +                                   &offset, 1,
> +                                   &start, 1,
> +                                   &len, 1, port);
> +  if (err)
> +    ipc_port_release_send(pager);
> +
> +  return err;

Yes, this looks correct now, at least as far as I understand kernel internals. Thank you! :)

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
On Mon, Nov 1, 2021 at 12:33 PM Joan Lledó wrote:
> Thanks for your explanations, it makes sense but there's still one thing I
> don't understand: if memory_object_create_proxy() is the owner of the pager
> it receives as parameter, and the caller doesn't care about releasing it,
> then where is it released?

When the proxy itself is destroyed. Namely, in memory_object_proxy_notify () there's an ipc_port_release_send (proxy->object) call.

> I modified my patch to meet all your points, I assumed the function to release
> the reference is ipc_port_release_send(). Tell me if I'm wrong.

Yeah, I think that's the one.

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
Hi,

On 30/10/21 at 14:06, Sergey Bugaev wrote:
> I hope this makes sense; I'd be happy to expand if not.

Thanks for your explanations, it makes sense, but there's still one thing I don't understand: if memory_object_create_proxy() is the owner of the pager it receives as parameter, and the caller doesn't care about releasing it, then where is it released?

I modified my patch to meet all your points; I assumed the function to release the reference is ipc_port_release_send(). Tell me if I'm wrong.

Thanks
Re: [PATCH] new interface: memory_object_get_proxy
I forgot to answer this:

Sergey Bugaev, on Sat, 30 Oct 2021 15:06:30 +0300, wrote:
> > > Should the implementation cap the length to that of the entry
> > > silently, or should it return an error if called with an overly long
> > > len argument?
> >
> > I don't know, Samuel, what do you think?
>
> Actually now that I think about it, it would probably make sense for
> this to, some day, transparently support proxying multiple regions,
> which is something that the memory object proxies API supports in
> theory (but it's currently unimplemented). So maybe it should return a
> "not yet supported" error for now.

Yes, though we don't have such an error; KERN_INVALID_ARGUMENT would be fine enough.

Samuel
Re: [PATCH] new interface: memory_object_get_proxy
On Sat, Oct 30, 2021 at 3:06 PM Sergey Bugaev wrote:
> But
> memory_object_create_proxy () does add a reference to the memory
> object by doing ipc_port_copy_send (object[0]). It would seem that it
> should not do that; but then I'm not sure why this doesn't cause a
> leak — I know Samuel has ensured that proxies don't cause leaks, at
> least not immediately visible ones.

Aha, that is the exact thing that Samuel has fixed here [0] (and has mentioned to me at the time).

[0]: https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/vm/memory_object_proxy.c?id=5179bcf28dac3fb39d7d4949964f038fe697bf4e

I was just testing the new version, but looking at the old version of the source code in my local checkout. Sorry for any confusion that I've caused by not adding 2 and 2 together.

So yes, you want to take a reference to the pager port while still holding the lock, and then pass it over to the proxy, which takes ownership of the port if it returns success — but if it doesn't, you have to release the reference!

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
Joan Lledó, on Sat, 30 Oct 2021 08:38:23 +0200, wrote:
> > I don't think you can access the entry once you've unlocked the map.
>
> You're probably right b/c it doesn't seem the entry is being accessed after
> the lock release in any other place of the code.

You have to, yes. The problem is what happens if another thread tries to unmap the address range concurrently. You want to keep the lock held until you have acquired a ref on the memory object. Then you are safe: the ref guarantees that the memory object won't disappear. And from the rest of the discussion it seems you indeed need such a ref, which you'll just transfer to the creation of the proxy.

Samuel
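The rule stated here (don't drop the map lock until you hold a reference) can be illustrated with a toy single-threaded model. A real kernel would of course use an actual lock and concurrent unmap; the names below are invented:

```c
/* Toy model: the "map lock" is just a flag we can inspect.  */
static int map_locked;

static void map_lock_read (void)   { map_locked = 1; }
static void map_unlock_read (void) { map_locked = 0; }

struct toy_object { int refs; };

/* Safe pattern: take the reference while the map is still locked, so a
   concurrent unmap could not free the object out from under us.  Only
   after the ref is held is it safe to drop the lock.  */
struct toy_object *
lookup_and_ref (struct toy_object *object)
{
  map_lock_read ();
  object->refs++;       /* ref taken under the lock */
  map_unlock_read ();
  return object;        /* still safe: the ref pins the object */
}
```

The unsafe variant would unlock first and take the ref afterwards, leaving a window where the object can vanish; that window is exactly what the patch review is warning about.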
Re: [PATCH] new interface: memory_object_get_proxy
On Sat, Oct 30, 2021 at 9:38 AM Joan Lledó wrote:
> On 24/10/21 at 19:50, Sergey Bugaev wrote:
> > I would expect the request port argument to be a vm_task_t (i.e. a
> > vm_map), not a full task. But I see that you need to pass
> > task->itk_space to memory_object_create_proxy (). But
> > memory_object_create_proxy () doesn't actually need the task either,
> > it just checks it against NULL (as usual) and never uses it again. So
> > maybe it'd be cleaner to make both calls accept a vm_task_t? Not that
> > this matters.
>
> yes, that's true, but if we remove the parameter from
> memory_object_create_proxy() then we must remove it from all calls in
> the code, and in the documentation if there's any. So that's beyond the
> scope of this patch, it's something we can do later on.

That argument is needed simply because it's the request port, so we can't remove it entirely. What I meant is we could change its type from ipc_space_t to vm_task_t; then the userspace would still be passing a task port (task_t), but you'd pass a vm_map in the kernel. And there aren't any other in-kernel callers of memory_object_create_proxy () besides the one you're adding, so that shouldn't be an issue either. But you're right that this could be done separately if you prefer to go with the full task_t for now.

> > Should the implementation cap the length to that of the entry
> > silently, or should it return an error if called with an overly long
> > len argument?
>
> I don't know, Samuel, what do you think?

Actually now that I think about it, it would probably make sense for this to, some day, transparently support proxying multiple regions, which is something that the memory object proxies API supports in theory (but it's currently unimplemented). So maybe it should return a "not yet supported" error for now.

> I'm not familiar with the concept of consuming objects, could you
> explain it to me?

Consuming, as in taking ownership of, or "eating" the passed-in reference. If a call consumes something, the callee takes ownership of the passed-in reference:

some_resource_t *global_resource;

void callee (some_resource_t *argument)
{
  /* Take ownership. */
  global_resource = argument;
}

void caller ()
{
  some_resource_t *res = make_resource ();
  callee (res);
  /* No need to release the resource -- the callee took ownership of it. */
}

If the call is *not* consuming something, the caller keeps ownership of the reference, and the callee has to add another reference if it wants to keep the resource for longer:

some_resource_t *global_resource;

void callee (some_resource_t *argument)
{
  /* Add a reference since we're storing the resource for longer. */
  some_resource_ref (argument);
  global_resource = argument;
}

void caller ()
{
  some_resource_t *res = make_resource ();
  callee (res);
  some_resource_rele (res);
}

For MIG routines, the convention is that the routine implementation takes ownership of all passed-in resources (except the request port) if and only if it returns KERN_SUCCESS (aka 0) or MIG_NO_REPLY. In other cases (so, for error returns) the MIG runtime (or technically not even MIG but rather mach_msg_server ()) will destroy any resources in the message.

If we look at mach_ports_register () in kern/ipc_tt.c, we can see it copies the ports from the 'memory' array and takes ownership of them (doesn't add any references). But memory_object_create_proxy () does add a reference to the memory object by doing ipc_port_copy_send (object[0]). It would seem that it should not do that; but then I'm not sure why this doesn't cause a leak — I know Samuel has ensured that proxies don't cause leaks, at least not immediately visible ones.

So my point is, *if* we decide that memory_object_create_proxy () is indeed supposed to consume the passed-in object reference, then your routine should create that reference before calling memory_object_create_proxy (), i.e. you should pass ipc_port_copy_send (entry->object.vm_object->pager) instead of passing entry->object.vm_object->pager directly.

I hope this makes sense; I'd be happy to expand if not.

Sergey
Re: [PATCH] new interface: memory_object_get_proxy
Hi,

On 24/10/21 at 19:50, Sergey Bugaev wrote:
> Naming: perhaps memory_object_create_vm_proxy ()? or even
> memory_object_create_task_vm_proxy ()?

I don't care about the name, you guys decide.

> I would expect the request port argument to be a vm_task_t (i.e. a
> vm_map), not a full task. But I see that you need to pass
> task->itk_space to memory_object_create_proxy (). But
> memory_object_create_proxy () doesn't actually need the task either,
> it just checks it against NULL (as usual) and never uses it again. So
> maybe it'd be cleaner to make both calls accept a vm_task_t? Not that
> this matters.

yes, that's true, but if we remove the parameter from memory_object_create_proxy() then we must remove it from all calls in the code, and in the documentation if there's any. So that's beyond the scope of this patch, it's something we can do later on.

> I don't think you can access the entry once you've unlocked the map.

You're probably right b/c it doesn't seem the entry is being accessed after the lock release in any other place of the code.

> Should the implementation cap the length to that of the entry
> silently, or should it return an error if called with an overly long
> len argument?

I don't know, Samuel, what do you think?

> I'm... not sure if this is wrong, but just to bring some attention to
> it: isn't memory_object_create_proxy () supposed to *consume* the
> 'objects' ports array in the successful case? I see it uses
> ipc_port_copy_send (object[0]), why does that not cause a leak? For a
> reference point, mach_ports_register (), which is another kernel RPC
> that accepts a port array, sure does consume it. If it *is* the case
> that memory_object_create_proxy () should consume the ports, then your
> new routine should make an extra ref to the pager port before passing
> it to memory_object_create_proxy ().

I'm not familiar with the concept of consuming objects, could you explain it to me?
Re: [PATCH] new interface: memory_object_get_proxy
On Sun, Oct 24, 2021 at 7:28 PM Joan Lledó wrote:
> +routine memory_object_get_proxy(
> +    task: task_t;

Naming: perhaps memory_object_create_vm_proxy ()? or even memory_object_create_task_vm_proxy ()?

I would expect the request port argument to be a vm_task_t (i.e. a vm_map), not a full task. But I see that you need to pass task->itk_space to memory_object_create_proxy (). But memory_object_create_proxy () doesn't actually need the task either, it just checks it against NULL (as usual) and never uses it again. So maybe it'd be cleaner to make both calls accept a vm_task_t? Not that this matters.

> +  vm_map_lock_read(task->map);
> +  if (!vm_map_lookup_entry(task->map, address, &tmp_entry)) {
> +    if ((entry = tmp_entry->vme_next) == vm_map_to_entry(task->map)) {
> +      vm_map_unlock_read(task->map);
> +      return(KERN_NO_SPACE);
> +    }
> +  } else {
> +    entry = tmp_entry;
> +  }
> +  vm_map_unlock_read(task->map);
> +
> +  /* Limit the allowed protection and range to the entry ones */
> +  max_protection &= entry->max_protection;
> +  if (len > entry->vme_end - entry->vme_start)
> +    len = entry->vme_end - entry->vme_start;
> +  offset = entry->offset;
> +  start = 0;

I don't think you can access the entry once you've unlocked the map. (But note that I'm not very familiar with the kernel internals, so I can be wrong about this.)

Should the implementation cap the length to that of the entry silently, or should it return an error if called with an overly long len argument?

> +
> +  return memory_object_create_proxy(task->itk_space, max_protection,
> +                                    &entry->object.vm_object->pager, 1,
> +                                    &offset, 1,
> +                                    &start, 1,
> +                                    &len, 1, port);
> +}

I'm... not sure if this is wrong, but just to bring some attention to it: isn't memory_object_create_proxy () supposed to *consume* the 'objects' ports array in the successful case? I see it uses ipc_port_copy_send (object[0]), why does that not cause a leak? For a reference point, mach_ports_register (), which is another kernel RPC that accepts a port array, sure does consume it. If it *is* the case that memory_object_create_proxy () should consume the ports, then your new routine should make an extra ref to the pager port before passing it to memory_object_create_proxy ().

Overall: yes, this is what I was suggesting! Thank you for working on it!

Sergey