Re: p_vmspace in syscall
On Mon, 2 Jul 2007, Nicolas Cormier wrote:
> I am trying to map some data allocated in kernel to a user process (via a syscall). I need the proc's vmspace, but the value of p_vmspace of the input proc argument is NULL ... How can I get a valid vmspace ?

When operating in a system call, the 'td' argument to the system call function is the current thread pointer. You can follow td->td_proc to get to the current process (and therefore, its address space).

In general, I prefer mapping user pages into kernel instead of kernel pages into user space, as it reduces the chances of leakage of kernel data to user space, and there are some useful primitives for making this easier. For example, take a look at the sf_buf infrastructure used for things like socket zero-copy send, which manages a temporary kernel mapping for a page.

Robert N M Watson
Computer Laboratory
University of Cambridge
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
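The pointer chain described above can be sketched in plain C. The structures below are deliberately simplified stand-ins for the kernel's struct thread, struct proc, struct vmspace and struct vm_map (the real definitions carry many more fields), and current_vm_map() is a hypothetical helper name, so this illustrates the lookup only and is not kernel code.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * needed to show the pointer chain are kept. */
struct vm_map  { unsigned long min_offset, max_offset; };
struct vmspace { struct vm_map vm_map; };   /* the map is embedded */
struct proc    { struct vmspace *p_vmspace; };
struct thread  { struct proc *td_proc; };

/* Hypothetical helper: return the address-space map of the process
 * that owns thread td, or NULL if any link in the chain is missing. */
static struct vm_map *
current_vm_map(struct thread *td)
{
    if (td == NULL || td->td_proc == NULL ||
        td->td_proc->p_vmspace == NULL)
        return (NULL);
    return (&td->td_proc->p_vmspace->vm_map);
}
```

Inside a system call handler, td would be the thread pointer the handler receives as its argument, as described in the reply above.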
Re: p_vmspace in syscall
On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
> On Mon, 2 Jul 2007, Nicolas Cormier wrote:
>> I am trying to map some data allocated in kernel to a user process (via a syscall). I need the proc's vmspace, but the value of p_vmspace of the input proc argument is NULL ... How can I get a valid vmspace ?
>
> When operating in a system call, the 'td' argument to the system call function is the current thread pointer. You can follow td->td_proc to get to the current process (and therefore, its address space).
>
> In general, I prefer mapping user pages into kernel instead of kernel pages into user space, as it reduces the chances of leakage of kernel data to user space, and there are some useful primitives for making this easier. For example, take a look at the sf_buf infrastructure used for things like socket zero-copy send, which manages a temporary kernel mapping for a page.

Yes, Roman told me in private that I was wrong about the first argument; I thought that it was a proc*...

For my module I am trying to create a simple interface for a network allocator. User code should look like this:

    unsigned id;
    void* data = netmalloc(host, size, &id);
    memcpy(data, toto, sizeof(toto));
    netdetach(data);

and later, in another process:

    void* data = netattach(host, id);
    ...
    netfree(data);

The netmalloc syscall does something like this:
- query the distant host to allocate size bytes
- receive an id from the distant host
- malloc size bytes in the kernel
- map the buffer into the user process (*)

The netdetach syscall:
- send the data to the distant host

The netattach syscall:
- get the data from the host
- malloc size bytes in the kernel
- map the buffer into the user process (*)

(*) I have already looked at the function vm_pgmoveco (http://fxr.watson.org/fxr/source/kern/kern_subr.c?v=RELENG62#L78). I used vm_pgmoveco as follows:

    vm_map_t mapa = &proc->p_vmspace->vm_map;
    size = round_page(size);
    void* data = malloc(size, M_NETMALLOC, M_WAITOK);
    vm_offset_t addr = vm_map_min(mapa);
    vm_map_find(mapa, NULL, 0, &addr, size, TRUE, VM_PROT_ALL, VM_PROT_ALL, MAP_NOFAULT);
    vm_pgmoveco(mapa, (vm_offset_t)data, addr);

With this I get a panic in vm_page_insert, and I am not sure I understand the reason for it. Can't I have multiple virtual pages on the same physical page?

Thanks!
-- 
Nicolas Cormier
Re: p_vmspace in syscall
On Wed, 4 Jul 2007, Nicolas Cormier wrote:
> On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
>> On Mon, 2 Jul 2007, Nicolas Cormier wrote:
>>> I am trying to map some data allocated in kernel to a user process (via a syscall). I need the proc's vmspace, but the value of p_vmspace of the input proc argument is NULL ... How can I get a valid vmspace ?
>>
>> When operating in a system call, the 'td' argument to the system call function is the current thread pointer. You can follow td->td_proc to get to the current process (and therefore, its address space).
>>
>> In general, I prefer mapping user pages into kernel instead of kernel pages into user space, as it reduces the chances of leakage of kernel data to user space, and there are some useful primitives for making this easier. For example, take a look at the sf_buf infrastructure used for things like socket zero-copy send, which manages a temporary kernel mapping for a page.
>
> Yes, Roman told me in private that I was wrong about the first argument; I thought that it was a proc*...
>
> For my module I am trying to create a simple interface for a network allocator. User code should look like this:
>
>     unsigned id;
>     void* data = netmalloc(host, size, &id);
>     memcpy(data, toto, sizeof(toto));
>     netdetach(data);
>
> and later, in another process:
>
>     void* data = netattach(host, id);
>     ...
>     netfree(data);
>
> The netmalloc syscall does something like this:
> - query the distant host to allocate size bytes
> - receive an id from the distant host
> - malloc size bytes in the kernel
> - map the buffer into the user process (*)
>
> The netdetach syscall:
> - send the data to the distant host
>
> The netattach syscall:
> - get the data from the host
> - malloc size bytes in the kernel
> - map the buffer into the user process (*)
>
> (*) I have already looked at the function vm_pgmoveco (http://fxr.watson.org/fxr/source/kern/kern_subr.c?v=RELENG62#L78). I used vm_pgmoveco as follows:
>
>     vm_map_t mapa = &proc->p_vmspace->vm_map;
>     size = round_page(size);
>     void* data = malloc(size, M_NETMALLOC, M_WAITOK);
>     vm_offset_t addr = vm_map_min(mapa);
>     vm_map_find(mapa, NULL, 0, &addr, size, TRUE, VM_PROT_ALL, VM_PROT_ALL, MAP_NOFAULT);
>     vm_pgmoveco(mapa, (vm_offset_t)data, addr);
>
> With this I get a panic in vm_page_insert, and I am not sure I understand the reason for it. Can't I have multiple virtual pages on the same physical page?

I think part of what you're running into here is a conceptual issue. The pages allocated by malloc(9) belong to the kernel memory allocator, and are generally managed by the slab allocator. While in principle you can map them into user space, you're going to have to set up a lot of book-keeping to properly free them again later, etc. There are really two approaches you could be looking at:

(1) The user app allocates memory pages, perhaps using mmap() to map anonymous memory or a file. You then borrow those pages to use in-kernel, mapping as required.

(2) Your kernel code allocates pages directly from the VM system, possibly anonymous swap-backed pages from the page allocator, and maps them into the kernel as required.

In either case, you'll need to think about address space limits, especially if the buffer is large -- the kernel address space on 32-bit systems is limited in size, since it shares the address space with a user application. On 64-bit systems, this is not an issue. You'll also need to make sure that the pages are both paged in and pinned in memory.

So before we talk about the details of the calls, we should think about how you plan to use the memory:

- How much memory are we talking about -- enough to potentially run into kernel address space problems on 32-bit systems?
- How long will the mappings persist -- do you map them into kernel for a brief period to fill them, and then leave them mapped into user space, or is this going to be a persistent shared mapping over a very long period of time?
- Is the memory going to be pageable?
- How will it interact with things like mprotect(), msync(), etc?
- What should happen if the pages are released by the process using munmap() or by mapping over the region with mmap()?
- What should happen in a child process if a process forks after netattach() and the parent calls netdetach()?
- What happens if the process calls send() using a source address in the memory region, and zero-copy sockets are enabled, which would normally lead the page to be borrowed from the user process?

The underlying point here is that there is a model by which VM is managed -- pages, pagers, memory objects, mappings, address spaces, etc. We can't just talk about pages being shared or mapped, we need to think about what is to be accomplished, and how to map that into the abstractions that already exist. Memory comes in different flavours, and generally speaking, you don't want to use pages that come from malloc(9) for sharing with userspace, so we need to think about what kind of memory you do need.

Robert N M Watson
Computer Laboratory
University of Cambridge
Re: p_vmspace in syscall
On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
> How much memory are we talking about -- enough to potentially run into kernel address space problems on 32-bit systems? How long will the mappings persist -- do you map them into kernel for a brief period to fill them, and then leave them mapped into user space, or is this going to be a persistent shared mapping over a very long period of time? Is the memory going to be pageable? How will it interact with things like mprotect(), msync(), etc? What should happen if the pages are released by the process using munmap() or by mapping over the region with mmap()? What should happen in a child process if a process forks after netattach() and the parent calls netdetach()? What happens if the process calls send() using a source address in the memory region, and zero-copy sockets are enabled, which would normally lead the page to be borrowed from the user process?

Currently I'm just trying to play with kernel/modules/vm ... I'm a newbie in kernel development and I just want to make a little prototype of an in-kernel network allocator. To start, I only need to map a page (1024 bytes) from the kernel into a user process. This memory will never be used by the kernel between the call to net(malloc/attach) and the call to net(detach/free), so user and kernel will never use this page at the same time.

> The underlying point here is that there is a model by which VM is managed -- pages, pagers, memory objects, mappings, address spaces, etc. We can't just talk about pages being shared or mapped, we need to think about what is to be accomplished, and how to map that into the abstractions that already exist. Memory comes in different flavours, and generally speaking, you don't want to use pages that come from malloc(9) for sharing with userspace, so we need to think about what kind of memory you do need.

Thank you for your answer. Right now I just want to do it as easily as possible. I don't know whether this kind of project would interest other people. I am happy to work more on it later on if there is any further interest in it.

-- 
Nicolas Cormier
Re: p_vmspace in syscall
On Wed, 4 Jul 2007, Nicolas Cormier wrote:
> Currently I'm just trying to play with kernel/modules/vm ... I'm a newbie in kernel development and I just want to make a little prototype of an in-kernel network allocator. To start, I only need to map a page (1024 bytes) from the kernel into a user process. This memory will never be used by the kernel between the call to net(malloc/attach) and the call to net(detach/free), so user and kernel will never use this page at the same time.
>
>> The underlying point here is that there is a model by which VM is managed -- pages, pagers, memory objects, mappings, address spaces, etc. We can't just talk about pages being shared or mapped, we need to think about what is to be accomplished, and how to map that into the abstractions that already exist. Memory comes in different flavours, and generally speaking, you don't want to use pages that come from malloc(9) for sharing with userspace, so we need to think about what kind of memory you do need.
>
> Thank you for your answer. Right now I just want to do it as easily as possible. I don't know whether this kind of project would interest other people. I am happy to work more on it later on if there is any further interest in it.

What do you mean by a network allocator? How do you plan to use these pages?

If you haven't already, you should look at the zero-copy socket code in uipc_cow.c. The main criticism of this approach has been that it uses copy-on-write, leading to potential IPIs for VM shootdowns, etc. An alternative, more along the lines of IO-Lite, would be to allow user space to explicitly abandon the page on send, then map a new page to replace it. In which case you might consider a variation on the send system call that accepts only page-aligned arguments and has the effect of unmapping the pages that are sent. In neither case, on the transmit side, does this require a modification to the kernel memory allocator. The receive side has always been more tricky to deal with...

Robert N M Watson
Computer Laboratory
University of Cambridge
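A send variant that accepts only page-aligned arguments would first have to reject any buffer that does not cover whole pages. The check below is a minimal userland sketch of that validation, under the assumption that "page-aligned" means both the start address and the length are multiples of the page size; is_page_aligned_range() is a hypothetical name, and no such system call is defined in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* True if [buf, buf + len) starts on a page boundary and covers a
 * non-zero whole number of pages. The memory is never dereferenced. */
static bool
is_page_aligned_range(const void *buf, size_t len)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);

    return (len != 0 &&
        (uintptr_t)buf % pagesz == 0 &&
        len % pagesz == 0);
}
```

Only ranges that pass this test can be unmapped from the sender as complete pages, which is what the abandon-on-send idea relies on.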
Re: p_vmspace in syscall
On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
> What do you mean by a network allocator? How do you plan to use these pages?

First, I just want to access a local copy of a distant buffer. After that, the goal is to share memory between hosts (no concurrent access).

> If you haven't already, you should look at the zero-copy socket code in uipc_cow.c. The main criticism of this approach has been that it uses copy-on-write, leading to potential IPIs for VM shootdowns, etc. An alternative, more along the lines of IO-Lite, would be to allow user space to explicitly abandon the page on send, then map a new page to replace it. In which case you might consider a variation on the send system call that accepts only page-aligned arguments and has the effect of unmapping the pages that are sent. In neither case, on the transmit side, does this require a modification to the kernel memory allocator. The receive side has always been more tricky to deal with...

Ok, I will take a look at uipc_cow.c. Thank you.

-- 
Nicolas Cormier
Re: p_vmspace in syscall
On 7/4/07, Steve Watt [EMAIL PROTECTED] wrote:
> In [EMAIL PROTECTED], Nicolas Cormier [EMAIL PROTECTED] wrote:
>> On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
>>> When operating in a system call, the 'td' argument to the system call function is the current thread pointer. You can follow td->td_proc to get to the current process (and therefore, its address space).
>>>
>>> In general, I prefer mapping user pages into kernel instead of kernel pages into user space, as it reduces the chances of leakage of kernel data to user space, and there are some useful primitives for making this easier. For example, take a look at the sf_buf infrastructure used for things like socket zero-copy send, which manages a temporary kernel mapping for a page.
>>
>> The netmalloc syscall does something like this:
>> - query the distant host to allocate size bytes
>> - receive an id from the distant host
>> - malloc size bytes in the kernel
>> - map the buffer into the user process (*)
>>
>> The netdetach syscall:
>> - send the data to the distant host
>>
>> The netattach syscall:
>> - get the data from the host
>> - malloc size bytes in the kernel
>> - map the buffer into the user process (*)
>
> What this really sounds like is network shared memory or remote DMA. I would architect this to remove as much of the management code as possible from the kernel (i.e. query the distant host, get ID, etc.) into a userland daemon. Depending on the exact semantics you want, you'll probably need to write a new kind of pager.
>
> Basically, at the netmalloc call, you would simply pass the request back to the userland daemon, which would format it in whatever way is needed to cross the net, send the request off, receive the ID, and give association information back to the kernel (number of pages, protections, whatever). Then the call would map the new pages into the userland process just like it was a shared memory segment. At detach time, the message would again go to the userland daemon, which would map the pages locally and probably use a zero-copy send to ship the data to the remote host.
>
> There are some fun potential interactions in there in code I haven't looked at in a long time. I'll resist the urge to dive in and hack something together, since VM systems have a way of being tricky in unexpected places.

Thank you for this post! Your design should be a good start.

-- 
Nicolas Cormier
Re: p_vmspace in syscall
In [EMAIL PROTECTED], Nicolas Cormier [EMAIL PROTECTED] wrote:
> On 7/4/07, Robert Watson [EMAIL PROTECTED] wrote:
>> When operating in a system call, the 'td' argument to the system call function is the current thread pointer. You can follow td->td_proc to get to the current process (and therefore, its address space).
>>
>> In general, I prefer mapping user pages into kernel instead of kernel pages into user space, as it reduces the chances of leakage of kernel data to user space, and there are some useful primitives for making this easier. For example, take a look at the sf_buf infrastructure used for things like socket zero-copy send, which manages a temporary kernel mapping for a page.
>
> The netmalloc syscall does something like this:
> - query the distant host to allocate size bytes
> - receive an id from the distant host
> - malloc size bytes in the kernel
> - map the buffer into the user process (*)
>
> The netdetach syscall:
> - send the data to the distant host
>
> The netattach syscall:
> - get the data from the host
> - malloc size bytes in the kernel
> - map the buffer into the user process (*)

What this really sounds like is network shared memory or remote DMA. I would architect this to remove as much of the management code as possible from the kernel (i.e. query the distant host, get ID, etc.) into a userland daemon. Depending on the exact semantics you want, you'll probably need to write a new kind of pager.

Basically, at the netmalloc call, you would simply pass the request back to the userland daemon, which would format it in whatever way is needed to cross the net, send the request off, receive the ID, and give association information back to the kernel (number of pages, protections, whatever). Then the call would map the new pages into the userland process just like it was a shared memory segment. At detach time, the message would again go to the userland daemon, which would map the pages locally and probably use a zero-copy send to ship the data to the remote host.

There are some fun potential interactions in there in code I haven't looked at in a long time. I'll resist the urge to dive in and hack something together, since VM systems have a way of being tricky in unexpected places.

-- 
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.5 / 37N 20' 15.3
Internet: steve @ Watt.COM Whois: SW32-ARIN
Free time? There's no such thing. It just comes in varying prices...
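The kernel-to-daemon exchange described above can be pictured as a pair of small records. The sketch below is only an illustration under assumed names: the thread defines no message format, so every type, field and constant here (netmem_request, netmem_reply, NETMEM_PROT_*, netmem_alloc_reply()) is hypothetical. It shows the daemon building the "association information" (ID, number of pages, protections) for an allocation request once the remote host has answered.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical operations forwarded from the kernel to the daemon. */
enum netmem_op { NETMEM_ALLOC, NETMEM_DETACH, NETMEM_ATTACH, NETMEM_FREE };

#define NETMEM_PROT_READ  0x1u
#define NETMEM_PROT_WRITE 0x2u

/* Kernel -> daemon: what the calling process asked for. */
struct netmem_request {
    enum netmem_op op;
    uint32_t id;      /* segment ID; unused for NETMEM_ALLOC */
    uint64_t size;    /* requested bytes; used by NETMEM_ALLOC */
};

/* Daemon -> kernel: association information for the mapping. */
struct netmem_reply {
    int      error;   /* 0 on success, errno value otherwise */
    uint32_t id;      /* ID returned by the remote host */
    uint64_t npages;  /* pages the kernel should map */
    uint32_t prot;    /* NETMEM_PROT_* bits */
};

/* Daemon side: build the reply to an allocation request after the
 * remote host has handed back remote_id. */
static struct netmem_reply
netmem_alloc_reply(const struct netmem_request *req, uint32_t remote_id,
    uint64_t page_size)
{
    struct netmem_reply rep = { 0, 0, 0, 0 };

    if (req->op != NETMEM_ALLOC || req->size == 0 || page_size == 0) {
        rep.error = EINVAL;
        return (rep);
    }
    rep.id = remote_id;
    /* Round up to whole pages without overflowing on huge sizes. */
    rep.npages = req->size / page_size + (req->size % page_size != 0);
    rep.prot = NETMEM_PROT_READ | NETMEM_PROT_WRITE;
    return (rep);
}
```

Keeping the network conversation in the daemon means the kernel side only consumes a reply like this one and performs the mapping, which matches the split suggested in the message above.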
Re: p_vmspace in syscall
On 7/2/07, Nicolas Cormier [EMAIL PROTECTED] wrote:
> Hi,
> I am trying to map some data allocated in kernel to a user process (via a syscall). I need the proc's vmspace, but the value of p_vmspace of the input proc argument is NULL ... How can I get a valid vmspace ?
> Thanks !

Ok, the syscall function is passed a proc* as argument; I don't know where this proc* comes from, but it works with:

    struct thread *td = curthread;
    p = td->td_proc;

-- 
Nicolas Cormier
Re: p_vmspace in syscall
> Ok, syscall function passed a proc* as arguments, I don't know where

this does not make any sense... userland processes have no way to determine where a proc is stored... what exactly are you trying to achieve?