Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Fri, Jun 27, 2014 at 10:42 AM, Dave Hansen wrote:
> On 06/27/2014 10:34 AM, Dave Hansen wrote:
>> I'm claiming that we need COW behavior for the bounds tables, at least
>> by default.  If userspace knows enough about the ways that it is using
>> the tables and knows how to share them, let it go to town.  The kernel
>> will permit this kind of usage model, but we simply won't be helping
>> with the management of the tables when userspace creates them.
>
> Actually, this is another reason we need to mark VMAs as being
> MPX-related explicitly instead of inferring it from the tables.  If
> userspace does something really specialized like this, the kernel does
> not want to confuse these VMAs with the ones it created.
>

Good point.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/27/2014 10:34 AM, Dave Hansen wrote:
> I'm claiming that we need COW behavior for the bounds tables, at least
> by default.  If userspace knows enough about the ways that it is using
> the tables and knows how to share them, let it go to town.  The kernel
> will permit this kind of usage model, but we simply won't be helping
> with the management of the tables when userspace creates them.

Actually, this is another reason we need to mark VMAs as being
MPX-related explicitly instead of inferring it from the tables.  If
userspace does something really specialized like this, the kernel does
not want to confuse these VMAs with the ones it created.
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/26/2014 05:26 PM, Andy Lutomirski wrote:
> On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen wrote:
>> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>>> Also, egads: what happens when a bound table entry is associated with
>>> a MAP_SHARED page?
>>
>> Bounds table entries are for pointers.  Do we keep pointers inside of
>> MAP_SHARED-mapped things? :)
>
> Sure, if it's MAP_SHARED | MAP_ANONYMOUS.  For example:
>
> struct thing {
>   struct thing *next;
> };
>
> struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
> storage[0].next = &storage[1];
> fork();
>
> I'm not suggesting that this needs to *work* in the first incarnation
> of this :)

I'm not sure I'm seeing the issue.

I'm claiming that we need COW behavior for the bounds tables, at least
by default.  If userspace knows enough about the ways that it is using
the tables and knows how to share them, let it go to town.  The kernel
will permit this kind of usage model, but we simply won't be helping
with the management of the tables when userspace creates them.

You've demonstrated a case where userspace might theoretically want to
share bounds tables (although I think it's pretty dangerous).  It's
equally theoretically possible that userspace might *not* want to share
the tables, for instance if one process narrowed the bounds and the
other did not.
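The copy-on-write versus shared distinction argued above can be seen with ordinary anonymous memory. The following is a minimal userspace sketch, not MPX code: the function name child_write_visible() is made up for illustration, and a single int stands in for a bounds table entry that one process "narrows" after fork().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): returns 1 if a write made by
 * the child after fork() is visible to the parent, 0 if it is not, and -1
 * on error.  map_type must be MAP_PRIVATE or MAP_SHARED.
 */
int child_write_visible(int map_type)
{
    assert(map_type == MAP_PRIVATE || map_type == MAP_SHARED);

    int *cell = mmap(NULL, sizeof(*cell), PROT_READ | PROT_WRITE,
                     map_type | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;

    *cell = 1;                      /* state before fork() */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof(*cell));
        return -1;
    }
    if (pid == 0) {
        *cell = 2;                  /* the child "narrows the bounds" */
        _exit(0);
    }

    int status;
    int visible = -1;
    if (waitpid(pid, &status, 0) == pid)
        visible = (*cell == 2);     /* did the parent see the child's write? */

    munmap(cell, sizeof(*cell));
    return visible;
}
```

With MAP_PRIVATE the child's store lands in its own copy of the page, so the parent keeps its value; with MAP_SHARED both processes see the same page. That is the behavior difference at stake for tables that are cowed on fork by default.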
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen wrote:
> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>> So here's my mental image of how I might do this if I were doing it
>> entirely in userspace: I'd create a file or memfd for the bound tables
>> and another for the bound directory.  These files would be *huge*: the
>> bound directory file would be 2GB and the bounds table file would be
>> 2^48 bytes or whatever it is.  (Maybe even bigger?)
>>
>> Then I'd just map pieces of those files wherever they'd need to be,
>> and I'd make the mappings sparse.  I suspect that you don't actually
>> want a vma for each piece of bound table that gets mapped -- the space
>> of vmas could end up incredibly sparse.  So I'd at least map (in the
>> vma sense, not the pte sense) an entire bound table at a time.  And
>> I'd probably just map the bound directory in one big piece.
>>
>> Then I'd populate it in the fault handler.
>>
>> This is almost what the code is doing, I think, modulo the files.
>>
>> This has one killer problem: these mappings need to be private (cowed
>> on fork).  So memfd is no good.
>
> This essentially uses the page cache's radix tree as a parallel data
> structure in order to keep a vaddr->mpx_vma map.  That's not a bad idea,
> but it is a parallel data structure that does not handle copy-on-write
> very well.
>
> I'm pretty sure we need the semantics that anonymous memory provides.
>
>> There's got to be an easyish way to
>> modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
>> mmap_region parameter or something?  Maybe even a special anon_vma,
>> but I don't really understand how those work.
>
> Yeah, we very well might end up having to go down that path.
>
>> Also, egads: what happens when a bound table entry is associated with
>> a MAP_SHARED page?
>
> Bounds table entries are for pointers.  Do we keep pointers inside of
> MAP_SHARED-mapped things? :)

Sure, if it's MAP_SHARED | MAP_ANONYMOUS.  For example:

struct thing {
  struct thing *next;
};

struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
storage[0].next = &storage[1];
fork();

I'm not suggesting that this needs to *work* in the first incarnation of
this :)

--Andy
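For readers who want to try the fragment above, here is one way it might be fleshed out into something that compiles. This is only a sketch under assumptions: the value field, the function name follow_shared_link_after_fork(), and the concrete mmap() arguments are invented for illustration and are not part of the original example. It involves no MPX instrumentation; it only shows that a pointer stored inside a MAP_SHARED | MAP_ANONYMOUS region stays meaningful in both processes after fork().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct thing {
    struct thing *next;
    int value;              /* hypothetical payload, not in the original */
};

/*
 * Hypothetical helper: the child writes child_value through
 * storage[0].next; the parent then reads it back through the same
 * pointer.  Returns the value the parent observes, or -1 on error.
 */
int follow_shared_link_after_fork(int child_value)
{
    assert(child_value >= 0);

    struct thing *storage = mmap(NULL, 2 * sizeof(*storage),
                                 PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (storage == MAP_FAILED)
        return -1;

    /* A pointer that lives inside the shared mapping itself. */
    storage[0].next = &storage[1];
    storage[1].next = NULL;
    storage[1].value = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(storage, 2 * sizeof(*storage));
        return -1;
    }
    if (pid == 0) {
        /* The mapping sits at the same address in the child. */
        storage[0].next->value = child_value;
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = storage[0].next->value;

    munmap(storage, 2 * sizeof(*storage));
    return result;
}
```

Calling follow_shared_link_after_fork(42) returns 42 in the parent: the data page holding the pointer is shared, which is why the question of whether the matching bounds table entry should be shared or cowed comes up at all.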
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
> So here's my mental image of how I might do this if I were doing it
> entirely in userspace: I'd create a file or memfd for the bound tables
> and another for the bound directory.  These files would be *huge*: the
> bound directory file would be 2GB and the bounds table file would be
> 2^48 bytes or whatever it is.  (Maybe even bigger?)
>
> Then I'd just map pieces of those files wherever they'd need to be,
> and I'd make the mappings sparse.  I suspect that you don't actually
> want a vma for each piece of bound table that gets mapped -- the space
> of vmas could end up incredibly sparse.  So I'd at least map (in the
> vma sense, not the pte sense) an entire bound table at a time.  And
> I'd probably just map the bound directory in one big piece.
>
> Then I'd populate it in the fault handler.
>
> This is almost what the code is doing, I think, modulo the files.
>
> This has one killer problem: these mappings need to be private (cowed
> on fork).  So memfd is no good.

This essentially uses the page cache's radix tree as a parallel data
structure in order to keep a vaddr->mpx_vma map.  That's not a bad idea,
but it is a parallel data structure that does not handle copy-on-write
very well.

I'm pretty sure we need the semantics that anonymous memory provides.

> There's got to be an easyish way to
> modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
> mmap_region parameter or something?  Maybe even a special anon_vma,
> but I don't really understand how those work.

Yeah, we very well might end up having to go down that path.

> Also, egads: what happens when a bound table entry is associated with
> a MAP_SHARED page?

Bounds table entries are for pointers.  Do we keep pointers inside of
MAP_SHARED-mapped things? :)
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Thu, Jun 26, 2014 at 3:58 PM, Dave Hansen wrote:
> On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
>> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen wrote:
>>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>>>> created a per-mm memfd and mapped it, it all just might work.
>>>
>>> memfd_create() seems to bring a fair amount of baggage along (the fd
>>> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
>>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>>
>>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
>>
>> Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
>> userspace, then memfd_create would be exactly the right solution, I
>> think.
>>
>> Would it work to just scan the bound directory to figure out how many
>> bound tables exist?
>
> Theoretically, perhaps.
>
> Practically, the bounds directory is 2GB, and it is likely to be very
> sparse.  You would have to walk the page tables finding where pages were
> mapped, then search the mapped pages for bounds table entries.
>
> Assuming that it was aligned and minimally populated, that's a *MINIMUM*
> search looking for a PGD entry, then you have to look at 512 PUD
> entries.  A full search would have to look at half a million ptes.
> That's just finding out how sparse the first level of the tables are
> before you've looked at a byte of actual data, and if they were empty.
>
> We could keep another, parallel, data structure that handles this better
> other than the hardware tables.  Like, say, an rbtree that stores ranges
> of virtual addresses.  We could call them vm_area_somethings ... wait a
> sec... we have a structure like that. ;)
>

So here's my mental image of how I might do this if I were doing it
entirely in userspace: I'd create a file or memfd for the bound tables
and another for the bound directory.  These files would be *huge*: the
bound directory file would be 2GB and the bounds table file would be
2^48 bytes or whatever it is.  (Maybe even bigger?)

Then I'd just map pieces of those files wherever they'd need to be,
and I'd make the mappings sparse.  I suspect that you don't actually
want a vma for each piece of bound table that gets mapped -- the space
of vmas could end up incredibly sparse.  So I'd at least map (in the
vma sense, not the pte sense) an entire bound table at a time.  And
I'd probably just map the bound directory in one big piece.

Then I'd populate it in the fault handler.

This is almost what the code is doing, I think, modulo the files.

This has one killer problem: these mappings need to be private (cowed
on fork).  So memfd is no good.  There's got to be an easyish way to
modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
mmap_region parameter or something?  Maybe even a special anon_vma,
but I don't really understand how those work.

Also, egads: what happens when a bound table entry is associated with
a MAP_SHARED page?

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen wrote:
>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>>> created a per-mm memfd and mapped it, it all just might work.
>>
>> memfd_create() seems to bring a fair amount of baggage along (the fd
>> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>
>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
>
> Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
> userspace, then memfd_create would be exactly the right solution, I
> think.
>
> Would it work to just scan the bound directory to figure out how many
> bound tables exist?

Theoretically, perhaps.

Practically, the bounds directory is 2GB, and it is likely to be very
sparse.  You would have to walk the page tables finding where pages were
mapped, then search the mapped pages for bounds table entries.

Assuming that it was aligned and minimally populated, that's a *MINIMUM*
search looking for a PGD entry, then you have to look at 512 PUD
entries.  A full search would have to look at half a million ptes.
That's just finding out how sparse the first level of the tables are
before you've looked at a byte of actual data, and if they were empty.

We could keep another, parallel, data structure that handles this better
other than the hardware tables.  Like, say, an rbtree that stores ranges
of virtual addresses.  We could call them vm_area_somethings ... wait a
sec... we have a structure like that. ;)
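The "half a million ptes" figure above can be checked with a little arithmetic. The sketch below assumes x86-64 4-level paging with 4 KiB pages and a bounds directory aligned to its own size; the macro names and the helper entries_to_cover() are made up for illustration and do not come from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by one entry at each page-table level (4 KiB pages). */
#define PTE_SPAN        (1ULL << 12)    /* 4 KiB */
#define PMD_SPAN        (1ULL << 21)    /* 2 MiB */
#define PUD_SPAN        (1ULL << 30)    /* 1 GiB */

/* Size of the bounds directory as stated in the thread. */
#define BOUNDS_DIR_SIZE (2ULL << 30)    /* 2 GiB */

/*
 * Hypothetical helper: number of page-table entries needed to cover
 * span_bytes when each entry covers entry_bytes (rounded up).
 */
uint64_t entries_to_cover(uint64_t span_bytes, uint64_t entry_bytes)
{
    assert(entry_bytes != 0);
    return (span_bytes + entry_bytes - 1) / entry_bytes;
}
```

Under those assumptions, entries_to_cover(BOUNDS_DIR_SIZE, PTE_SPAN) is 524288, which is the half a million ptes a full walk would visit. The same 2GB spans 1024 PMD entries and 2 PUD entries, and those PUD entries sit in a PUD page of 512 slots that has to be scanned to find them.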
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen wrote:
> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>> created a per-mm memfd and mapped it, it all just might work.
>
> memfd_create() seems to bring a fair amount of baggage along (the fd
> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
> some way to plumb to userspace the RSS values of VMAs with that bit set.
>
> Creating and mmap()'ing a fd seems a rather roundabout way to get there.

Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
userspace, then memfd_create would be exactly the right solution, I
think.

Would it work to just scan the bound directory to figure out how many
bound tables exist?

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
> Hmm.  the memfd_create thing may be able to do this for you.  If you
> created a per-mm memfd and mapped it, it all just might work.

memfd_create() seems to bring a fair amount of baggage along (the fd
part :) if all we want is a marker.  Really, all we need is _a_ bit, and
some way to plumb to userspace the RSS values of VMAs with that bit set.

Creating and mmap()'ing a fd seems a rather roundabout way to get there.
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/25/2014 02:04 PM, Andy Lutomirski wrote:
> On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei wrote:
>> Hmm, _install_special_mapping should completely prevent merging, even among
>> MPX vmas.
>>
>> So, could you tell me how to set MPX specific ->name to the vma when it is
>> created? It seems I could not find such an interface.
>
> You may need to add one.
>
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

I shared this with Qiaowei privately, but might as well repeat myself
here in case anyone wants to set me straight.

Most of the interfaces that set vm_ops do it in the file_operations
->mmap op.  Nobody sets ->vm_ops on anonymous VMAs, so we're in
uncharted territory.

My suggestion: you can either plumb a new API down in to mmap_region()
to get the VMA or set ->vm_ops, or just call find_vma() after
mmap_region() or get_unmapped_area() and set it manually.  Just make
sure you still have mmap_sem held over the whole thing.

I think I prefer just setting ->vm_ops directly, even though it's a wee
bit of a hack to create something just to look it up a moment later.
Oh, well.
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Wed, Jun 25, 2014 at 2:04 PM, Andy Lutomirski wrote: > On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei wrote: >> On 2014-06-25, Andy Lutomirski wrote: >>> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei >>> wrote: On 2014-06-24, Andy Lutomirski wrote: >> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >>> Can the new vm_operation "name" be use for this? The magic >>> "always written to core dumps" feature might need to be reconsidered. >> >> One thing I'd like to avoid is an MPX vma getting merged with a >> non-MPX vma. I don't see any code to prevent two VMAs with >> different vm_ops->names from getting merged. That seems like a >> bit of a design oversight for ->name. Right? > > AFAIK there are no ->name users that don't also set ->close, for > exactly that reason. I'd be okay with adding a check for ->name, too. > > Hmm. If MPX vmas had a real struct file attached, this would all > come for free. Maybe vmas with non-default vm_ops and file != NULL > should never be mergeable? > >> >> Thinking out loud a bit... There are also some more complicated >> but more performant cleanup mechanisms that I'd like to go after in the >> future. >> Given a page, we might want to figure out if it is an MPX page or not. >> I wonder if we'll ever collide with some other user of vm_ops->name. >> It looks fairly narrowly used at the moment, but would this keep >> us from putting these pages on, say, a tmpfs mount? Doesn't look >> that way at the moment. > > You could always check the vm_ops pointer to see if it's MPX. > > One feature I've wanted: a way to have special per-process vmas that > can be easily found. For example, I want to be able to efficiently > find out where the vdso and vvar vmas are. I don't think this is > currently supported. > Andy, if you add a check for ->name to avoid the MPX vmas merged with >>> non-MPX vmas, I guess the work flow should be as follow (use >>> _install_special_mapping to get a new vma): unsigned long mpx_mmap(unsigned long len) { .. 
>>>>     static struct vm_special_mapping mpx_mapping = {
>>>>             .name = "[mpx]",
>>>>             .pages = no_pages,
>>>>     };
>>>>     ...
>>>>     vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
>>>>     ..
>>>> }
>>>>
>>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>>
>>> Does this actually create a vma backed with real memory? Doesn't this
>>> need to go through anon_vma or something? _install_special_mapping
>>> completely prevents merging.
>>>
>> Hmm, _install_special_mapping should completely prevent merging, even among
>> MPX vmas.
>>
>> So, could you tell me how to set MPX specific ->name to the vma when it is
>> created? Seems like that I could not find such interface.
>
> You may need to add one.
>
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

Hmm. The memfd_create thing may be able to do this for you. If you created a per-mm memfd and mapped it, it all just might work.

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei wrote: > On 2014-06-25, Andy Lutomirski wrote: >> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei >> wrote: >>> On 2014-06-24, Andy Lutomirski wrote: > On 06/23/2014 01:06 PM, Andy Lutomirski wrote: >> Can the new vm_operation "name" be use for this? The magic >> "always written to core dumps" feature might need to be reconsidered. > > One thing I'd like to avoid is an MPX vma getting merged with a > non-MPX vma. I don't see any code to prevent two VMAs with > different vm_ops->names from getting merged. That seems like a > bit of a design oversight for ->name. Right? AFAIK there are no ->name users that don't also set ->close, for exactly that reason. I'd be okay with adding a check for ->name, too. Hmm. If MPX vmas had a real struct file attached, this would all come for free. Maybe vmas with non-default vm_ops and file != NULL should never be mergeable? > > Thinking out loud a bit... There are also some more complicated > but more performant cleanup mechanisms that I'd like to go after in the > future. > Given a page, we might want to figure out if it is an MPX page or not. > I wonder if we'll ever collide with some other user of vm_ops->name. > It looks fairly narrowly used at the moment, but would this keep > us from putting these pages on, say, a tmpfs mount? Doesn't look > that way at the moment. You could always check the vm_ops pointer to see if it's MPX. One feature I've wanted: a way to have special per-process vmas that can be easily found. For example, I want to be able to efficiently find out where the vdso and vvar vmas are. I don't think this is currently supported. >>> Andy, if you add a check for ->name to avoid the MPX vmas merged >>> with >> non-MPX vmas, I guess the work flow should be as follow (use >> _install_special_mapping to get a new vma): >>> >>> unsigned long mpx_mmap(unsigned long len) { >>> .. 
>>>     static struct vm_special_mapping mpx_mapping = {
>>>             .name = "[mpx]",
>>>             .pages = no_pages,
>>>     };
>>>
>>>     ...
>>>     vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
>>>     ..
>>> }
>>>
>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>
>> Does this actually create a vma backed with real memory? Doesn't this
>> need to go through anon_vma or something? _install_special_mapping
>> completely prevents merging.
>>
> Hmm, _install_special_mapping should completely prevent merging, even among
> MPX vmas.
>
> So, could you tell me how to set MPX specific ->name to the vma when it is
> created? Seems like that I could not find such interface.

You may need to add one.

I'd suggest posting a new thread to linux-mm describing what you need and asking how to do it.

--Andy
RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 2014-06-25, Andy Lutomirski wrote: > On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei > wrote: >> On 2014-06-24, Andy Lutomirski wrote: On 06/23/2014 01:06 PM, Andy Lutomirski wrote: > Can the new vm_operation "name" be use for this? The magic > "always written to core dumps" feature might need to be reconsidered. One thing I'd like to avoid is an MPX vma getting merged with a non-MPX vma. I don't see any code to prevent two VMAs with different vm_ops->names from getting merged. That seems like a bit of a design oversight for ->name. Right? >>> >>> AFAIK there are no ->name users that don't also set ->close, for >>> exactly that reason. I'd be okay with adding a check for ->name, too. >>> >>> Hmm. If MPX vmas had a real struct file attached, this would all >>> come for free. Maybe vmas with non-default vm_ops and file != NULL >>> should never be mergeable? >>> Thinking out loud a bit... There are also some more complicated but more performant cleanup mechanisms that I'd like to go after in the future. Given a page, we might want to figure out if it is an MPX page or not. I wonder if we'll ever collide with some other user of vm_ops->name. It looks fairly narrowly used at the moment, but would this keep us from putting these pages on, say, a tmpfs mount? Doesn't look that way at the moment. >>> >>> You could always check the vm_ops pointer to see if it's MPX. >>> >>> One feature I've wanted: a way to have special per-process vmas that >>> can be easily found. For example, I want to be able to efficiently >>> find out where the vdso and vvar vmas are. I don't think this is >>> currently supported. >>> >> Andy, if you add a check for ->name to avoid the MPX vmas merged >> with > non-MPX vmas, I guess the work flow should be as follow (use > _install_special_mapping to get a new vma): >> >> unsigned long mpx_mmap(unsigned long len) { >> .. >> static struct vm_special_mapping mpx_mapping = { >> .name = "[mpx]", >> .pages = no_pages, >> }; >> >> ... 
>>     vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
>>     ..
>> }
>>
>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>
> Does this actually create a vma backed with real memory? Doesn't this
> need to go through anon_vma or something? _install_special_mapping
> completely prevents merging.
>
Hmm, _install_special_mapping should completely prevent merging, even among MPX vmas.

So, could you tell me how to set MPX specific ->name to the vma when it is created? Seems like that I could not find such interface.

Thanks,
Qiaowei
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei wrote: > On 2014-06-24, Andy Lutomirski wrote: >>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote: Can the new vm_operation "name" be use for this? The magic "always written to core dumps" feature might need to be reconsidered. >>> >>> One thing I'd like to avoid is an MPX vma getting merged with a >>> non-MPX vma. I don't see any code to prevent two VMAs with >>> different vm_ops->names from getting merged. That seems like a bit >>> of a design oversight for ->name. Right? >> >> AFAIK there are no ->name users that don't also set ->close, for >> exactly that reason. I'd be okay with adding a check for ->name, too. >> >> Hmm. If MPX vmas had a real struct file attached, this would all come >> for free. Maybe vmas with non-default vm_ops and file != NULL should >> never be mergeable? >> >>> >>> Thinking out loud a bit... There are also some more complicated but >>> more performant cleanup mechanisms that I'd like to go after in the future. >>> Given a page, we might want to figure out if it is an MPX page or not. >>> I wonder if we'll ever collide with some other user of vm_ops->name. >>> It looks fairly narrowly used at the moment, but would this keep us >>> from putting these pages on, say, a tmpfs mount? Doesn't look that >>> way at the moment. >> >> You could always check the vm_ops pointer to see if it's MPX. >> >> One feature I've wanted: a way to have special per-process vmas that >> can be easily found. For example, I want to be able to efficiently >> find out where the vdso and vvar vmas are. I don't think this is currently >> supported. >> > Andy, if you add a check for ->name to avoid the MPX vmas merged with non-MPX > vmas, I guess the work flow should be as follow (use _install_special_mapping > to get a new vma): > > unsigned long mpx_mmap(unsigned long len) > { > .. > static struct vm_special_mapping mpx_mapping = { > .name = "[mpx]", > .pages = no_pages, > }; > > ... 
>     vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
>     ..
> }
>
> Then, we could check the ->name to see if the VMA is MPX specific. Right?

Does this actually create a vma backed with real memory? Doesn't this need to go through anon_vma or something? _install_special_mapping completely prevents merging.

Possibly silly question: would it make more sense to just create one giant vma for the MPX tables and only populate pieces of it as needed? This wouldn't work for 32-bit code, but maybe we don't care. (I see no reason why it couldn't work for x32, though.)

(I don't really understand how anonymous memory works at all. I'm not an mm person.)

--Andy
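The "one giant vma, populated only where needed" idea has a rough userspace analogue that may help picture it. The sketch below is an assumption-laden illustration, not anything proposed in the thread: it uses ordinary Linux mmap() with MAP_NORESERVE plus mincore(), and the helper name resident_pages_after_one_touch() is hypothetical. It reserves a large anonymous range, writes a single byte, and counts how many pages the kernel reports as resident.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map region_bytes of private anonymous memory,
 * touch one byte at touch_offset, and return the number of pages that
 * mincore() reports as resident. Returns -1 on error.
 */
long resident_pages_after_one_touch(size_t region_bytes, size_t touch_offset)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t npages, i;
	unsigned char *vec;
	char *base;
	long resident = 0;

	assert(page_size > 0 && touch_offset < region_bytes);
	npages = (region_bytes + (size_t)page_size - 1) / (size_t)page_size;

	/* One large mapping; no backing pages are allocated yet. */
	base = mmap(NULL, region_bytes, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	vec = malloc(npages);
	if (!vec) {
		munmap(base, region_bytes);
		return -1;
	}

	/* The first write faults in the page (or huge page) that covers it. */
	base[touch_offset] = 1;

	if (mincore(base, region_bytes, vec) != 0) {
		resident = -1;
	} else {
		for (i = 0; i < npages; i++)
			resident += vec[i] & 1;	/* bit 0 set: page is resident */
	}

	free(vec);
	munmap(base, region_bytes);
	return resident;
}
```

Calling resident_pages_after_one_touch(64UL << 20, 5UL << 20) should report at least one resident page. The exact count depends on the page size and on transparent hugepage settings, so only the lower bound is meaningful.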
RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 2014-06-24, Andy Lutomirski wrote:
>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>> Can the new vm_operation "name" be used for this? The magic "always
>>> written to core dumps" feature might need to be reconsidered.
>>
>> One thing I'd like to avoid is an MPX vma getting merged with a
>> non-MPX vma. I don't see any code to prevent two VMAs with
>> different vm_ops->names from getting merged. That seems like a bit
>> of a design oversight for ->name. Right?
>
> AFAIK there are no ->name users that don't also set ->close, for
> exactly that reason. I'd be okay with adding a check for ->name, too.
>
> Hmm. If MPX vmas had a real struct file attached, this would all come
> for free. Maybe vmas with non-default vm_ops and file != NULL should
> never be mergeable?
>
>>
>> Thinking out loud a bit... There are also some more complicated but
>> more performant cleanup mechanisms that I'd like to go after in the future.
>> Given a page, we might want to figure out if it is an MPX page or not.
>> I wonder if we'll ever collide with some other user of vm_ops->name.
>> It looks fairly narrowly used at the moment, but would this keep us
>> from putting these pages on, say, a tmpfs mount? Doesn't look that
>> way at the moment.
>
> You could always check the vm_ops pointer to see if it's MPX.
>
> One feature I've wanted: a way to have special per-process vmas that
> can be easily found. For example, I want to be able to efficiently
> find out where the vdso and vvar vmas are. I don't think this is currently
> supported.
>
Andy, if you add a check for ->name to avoid the MPX vmas getting merged with non-MPX vmas, I guess the work flow should be as follows (use _install_special_mapping to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
	..
	static struct vm_special_mapping mpx_mapping = {
		.name = "[mpx]",
		.pages = no_pages,
	};

	...
	vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
	..
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei
RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 2014-06-24, Andy Lutomirski wrote:
>> +	/* Make bounds tables and bounds directory unlocked. */
>> +	if (vm_flags & VM_LOCKED)
>> +		vm_flags &= ~VM_LOCKED;
>
> Why? I would expect MCL_FUTURE to lock these.
>
Andy, I was just a little confused about LOCKED & POPULATE earlier, and I thought VM_LOCKED was not necessary for MPX specific bounds tables.

Now, this check should be removed, and mm_populate() should be called for the VM_LOCKED case after mmap_region():

	if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
		mm_populate(addr, len);

Thanks,
Qiaowei
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Mon, Jun 23, 2014 at 1:28 PM, Dave Hansen wrote:
> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>> Can the new vm_operation "name" be used for this? The magic "always
>> written to core dumps" feature might need to be reconsidered.
>
> One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
> vma. I don't see any code to prevent two VMAs with different
> vm_ops->names from getting merged. That seems like a bit of a design
> oversight for ->name. Right?

AFAIK there are no ->name users that don't also set ->close, for exactly that reason. I'd be okay with adding a check for ->name, too.

Hmm. If MPX vmas had a real struct file attached, this would all come for free. Maybe vmas with non-default vm_ops and file != NULL should never be mergeable?

>
> Thinking out loud a bit... There are also some more complicated but more
> performant cleanup mechanisms that I'd like to go after in the future.
> Given a page, we might want to figure out if it is an MPX page or not.
> I wonder if we'll ever collide with some other user of vm_ops->name. It
> looks fairly narrowly used at the moment, but would this keep us from
> putting these pages on, say, a tmpfs mount? Doesn't look that way at
> the moment.

You could always check the vm_ops pointer to see if it's MPX.

One feature I've wanted: a way to have special per-process vmas that can be easily found. For example, I want to be able to efficiently find out where the vdso and vvar vmas are. I don't think this is currently supported.

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
> Can the new vm_operation "name" be used for this? The magic "always
> written to core dumps" feature might need to be reconsidered.

One thing I'd like to avoid is an MPX vma getting merged with a non-MPX vma. I don't see any code to prevent two VMAs with different vm_ops->names from getting merged. That seems like a bit of a design oversight for ->name. Right?

Thinking out loud a bit... There are also some more complicated but more performant cleanup mechanisms that I'd like to go after in the future. Given a page, we might want to figure out if it is an MPX page or not. I wonder if we'll ever collide with some other user of vm_ops->name. It looks fairly narrowly used at the moment, but would this keep us from putting these pages on, say, a tmpfs mount? Doesn't look that way at the moment.
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Mon, Jun 23, 2014 at 1:03 PM, Dave Hansen wrote:
> On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
>> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>>> This patch adds one MPX specific mmap interface, which only handles
>>> mpx related maps, including bounds table and bounds directory.
>>>
>>> In order to track MPX specific memory usage, this interface is added
>>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>>> bounds table or bounds directory.
>>
>> I imagine the linux-mm people would want to think about any new vm flag.
>> Why is this needed?
>
> These tables can take huge amounts of memory. In the worst-case
> scenario, the tables can be 4x the size of the data structure being
> tracked. IOW, a 1-page structure can require 4 bounds-table pages.
>
> My expectation is that folks using MPX are going to be keen on figuring
> out how much memory is being dedicated to it. With this feature, plus
> some grepping in /proc/$pid/smaps one could take a pretty good stab at it.
>
> I know VM flags are scarce, and I'm open to other ways to skin this cat.
>
Can the new vm_operation "name" be used for this? The magic "always written to core dumps" feature might need to be reconsidered.

There's also arch_vma_name, but I just finished removing it for x86, and I'd be a little sad to see it come right back.

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>> This patch adds one MPX specific mmap interface, which only handles
>> mpx related maps, including bounds table and bounds directory.
>>
>> In order to track MPX specific memory usage, this interface is added
>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>> bounds table or bounds directory.
>
> I imagine the linux-mm people would want to think about any new vm flag.
> Why is this needed?

These tables can take huge amounts of memory. In the worst-case
scenario, the tables can be 4x the size of the data structure being
tracked. IOW, a 1-page structure can require 4 bounds-table pages.

My expectation is that folks using MPX are going to be keen on figuring
out how much memory is being dedicated to it. With this feature, plus
some grepping in /proc/$pid/smaps one could take a pretty good stab at it.

I know VM flags are scarce, and I'm open to other ways to skin this cat.
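The "grepping in /proc/$pid/smaps" idea above can be sketched in user
space. This is only an illustration, not part of the patch: it sums the
"Size:" of every VMA whose "VmFlags:" line carries a given two-letter
token. The function names are hypothetical, and the token that would
mark an MPX VMA ("mp" in the note below) is an assumption, since the
thread does not say how VM_MPX would be displayed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the whitespace-separated list contains `flag` as a whole token. */
static int has_flag(const char *list, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = list;

	while (*p) {
		const char *start;

		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;
		start = p;
		while (*p && *p != ' ' && *p != '\t' && *p != '\n')
			p++;
		if (n > 0 && (size_t)(p - start) == n && strncmp(start, flag, n) == 0)
			return 1;
	}
	return 0;
}

/*
 * Sum the "Size:" values (in kB) of all VMAs in an smaps-style stream
 * whose "VmFlags:" line contains `flag`. Hypothetical helper.
 */
unsigned long smaps_kb_with_flag(FILE *f, const char *flag)
{
	char line[512];
	unsigned long size_kb = 0, total_kb = 0;

	assert(f != NULL && flag != NULL);

	while (fgets(line, sizeof(line), f)) {
		unsigned long kb;

		if (sscanf(line, "Size: %lu kB", &kb) == 1) {
			/* Remember the size of the VMA currently being read. */
			size_kb = kb;
		} else if (strncmp(line, "VmFlags:", 8) == 0) {
			/* VmFlags closes a VMA record: count it if it matches. */
			if (has_flag(line + 8, flag))
				total_kb += size_kb;
			size_kb = 0;
		}
	}
	return total_kb;
}
```

A caller would open /proc/<pid>/smaps with fopen() and pass something
like "mp" as the flag; whatever token the kernel ends up printing for
MPX VMAs would go there instead.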
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
> This patch adds one MPX specific mmap interface, which only handles
> mpx related maps, including bounds table and bounds directory.
>
> In order to track MPX specific memory usage, this interface is added
> to stick new vm_flag VM_MPX in the vma_area_struct when create a
> bounds table or bounds directory.

I imagine the linux-mm people would want to think about any new vm flag.
Why is this needed?

>
> Signed-off-by: Qiaowei Ren
> ---
>  arch/x86/Kconfig           |    4 +++
>  arch/x86/include/asm/mpx.h |   38
>  arch/x86/mm/Makefile       |    2 +
>  arch/x86/mm/mpx.c          |   58
>  4 files changed, 102 insertions(+), 0 deletions(-)
>  create mode 100644 arch/x86/include/asm/mpx.h
>  create mode 100644 arch/x86/mm/mpx.c
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 25d2c6f..0194790 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
>  	def_bool y
>  	depends on INTEL_IOMMU && ACPI
>
> +config X86_INTEL_MPX
> +	def_bool y
> +	depends on CPU_SUP_INTEL
> +
>  config X86_32_SMP
>  	def_bool y
>  	depends on X86_32 && SMP
> diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
> new file mode 100644
> index 000..5725ac4
> --- /dev/null
> +++ b/arch/x86/include/asm/mpx.h
> @@ -0,0 +1,38 @@
> +#ifndef _ASM_X86_MPX_H
> +#define _ASM_X86_MPX_H
> +
> +#include <linux/types.h>
> +#include <asm/ptrace.h>
> +
> +#ifdef CONFIG_X86_64
> +
> +/* upper 28 bits [47:20] of the virtual address in 64-bit used to
> + * index into bounds directory (BD).
> + */
> +#define MPX_BD_ENTRY_OFFSET	28
> +#define MPX_BD_ENTRY_SHIFT	3
> +/* bits [19:3] of the virtual address in 64-bit used to index into
> + * bounds table (BT).
> + */
> +#define MPX_BT_ENTRY_OFFSET	17
> +#define MPX_BT_ENTRY_SHIFT	5
> +#define MPX_IGN_BITS		3
> +
> +#else
> +
> +#define MPX_BD_ENTRY_OFFSET	20
> +#define MPX_BD_ENTRY_SHIFT	2
> +#define MPX_BT_ENTRY_OFFSET	10
> +#define MPX_BT_ENTRY_SHIFT	4
> +#define MPX_IGN_BITS		2
> +
> +#endif
> +
> +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
> +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
> +
> +#define MPX_BNDSTA_ERROR_CODE	0x3
> +
> +unsigned long mpx_mmap(unsigned long len);
> +
> +#endif /* _ASM_X86_MPX_H */
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index 6a19ad9..ecfdc46 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)	+= srat.o
>  obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o
>
>  obj-$(CONFIG_MEMTEST)		+= memtest.o
> +
> +obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> new file mode 100644
> index 000..546c5d1
> --- /dev/null
> +++ b/arch/x86/mm/mpx.c
> @@ -0,0 +1,58 @@
> +#include <linux/kernel.h>
> +#include <linux/syscalls.h>
> +#include <asm/mpx.h>
> +#include <asm/mman.h>
> +#include <linux/sched/sysctl.h>
> +
> +/*
> + * this is really a simplified "vm_mmap". it only handles mpx
> + * related maps, including bounds table and bounds directory.
> + *
> + * here we can stick new vm_flag VM_MPX in the vma_area_struct
> + * when create a bounds table or bounds directory, in order to
> + * track MPX specific memory.
> + */
> +unsigned long mpx_mmap(unsigned long len)
> +{
> +	unsigned long ret;
> +	unsigned long addr, pgoff;
> +	struct mm_struct *mm = current->mm;
> +	vm_flags_t vm_flags;
> +
> +	/* Only bounds table and bounds directory can be allocated here */
> +	if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
> +		return -EINVAL;
> +
> +	down_write(&mm->mmap_sem);
> +
> +	/* Too many mappings? */
> +	if (mm->map_count > sysctl_max_map_count) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	/* Obtain the address to map to. we verify (or select) it and ensure
> +	 * that it represents a valid section of the address space.
> +	 */
> +	addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
> +	if (addr & ~PAGE_MASK) {
> +		ret = addr;
> +		goto out;
> +	}
> +
> +	vm_flags = VM_READ | VM_WRITE | VM_MPX |
> +			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
> +
> +	/* Make bounds tables and bouds directory unlocked. */
> +	if (vm_flags & VM_LOCKED)
> +		vm_flags &= ~VM_LOCKED;

Why? I would expect MCL_FUTURE to lock these.

--Andy
Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On Mon, Jun 23, 2014 at 1:28 PM, Dave Hansen <dave.han...@intel.com> wrote:
> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>> Can the new vm_operation "name" be used for this? The magic "always
>> written to core dumps" feature might need to be reconsidered.
>
> One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
> vma. I don't see any code to prevent two VMAs with different
> vm_ops->names from getting merged. That seems like a bit of a design
> oversight for ->name. Right?

AFAIK there are no ->name users that don't also set ->close, for exactly
that reason. I'd be okay with adding a check for ->name, too.

Hmm. If MPX vmas had a real struct file attached, this would all come
for free. Maybe vmas with non-default vm_ops and file != NULL should
never be mergeable?

> Thinking out loud a bit... There are also some more complicated but more
> performant cleanup mechanisms that I'd like to go after in the future.
> Given a page, we might want to figure out if it is an MPX page or not.
> I wonder if we'll ever collide with some other user of vm_ops->name. It
> looks fairly narrowly used at the moment, but would this keep us from
> putting these pages on, say, a tmpfs mount? Doesn't look that way at
> the moment.

You could always check the vm_ops pointer to see if it's MPX.

One feature I've wanted: a way to have special per-process vmas that can
be easily found. For example, I want to be able to efficiently find out
where the vdso and vvar vmas are. I don't think this is currently
supported.

--Andy
RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 2014-06-24, Andy Lutomirski wrote:
>> +	/* Make bounds tables and bouds directory unlocked. */
>> +	if (vm_flags & VM_LOCKED)
>> +		vm_flags &= ~VM_LOCKED;
>
> Why? I would expect MCL_FUTURE to lock these.

Andy, I was just a little confused about LOCKED & POPULATE earlier and I
thought VM_LOCKED is not necessary for MPX specific bounds tables.

Now, this checking should be removed, and there should be mm_populate()
for VM_LOCKED case after mmap_region():

	if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
		mm_populate(addr, len);

Thanks,
Qiaowei
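The revised flag handling proposed above can be modelled in user space
to see how it behaves. This is a sketch under stated assumptions, not
kernel code: the flag values are illustrative stand-ins (VM_MPX in
particular is a placeholder, since the thread does not give its value),
IS_ERR_VALUE() is re-created locally with the usual "last 4095 values
are errors" convention, and mpx_vm_flags() / mpx_should_populate() are
hypothetical names. It assumes, as Andy's comment implies, that an
mlockall(MCL_FUTURE) request reaches the mapping as VM_LOCKED in
mm->def_flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in values; VM_MPX is a placeholder. */
#define VM_READ		0x00000001UL
#define VM_WRITE	0x00000002UL
#define VM_MAYREAD	0x00000010UL
#define VM_MAYWRITE	0x00000020UL
#define VM_MAYEXEC	0x00000040UL
#define VM_LOCKED	0x00002000UL
#define VM_MPX		0x02000000UL

/* Local model of the kernel's error-value convention. */
#define MAX_ERRNO	4095UL
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* Flags for a bounds table/directory mapping, with VM_LOCKED no longer masked. */
unsigned long mpx_vm_flags(unsigned long def_flags)
{
	return VM_READ | VM_WRITE | VM_MPX |
	       def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
}

/* Mirrors the proposed check guarding mm_populate(addr, len). */
bool mpx_should_populate(unsigned long addr, unsigned long vm_flags)
{
	return !IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED);
}

/* Usage check: locked mappings are populated, failed mappings are not. */
void mpx_flags_demo(void)
{
	unsigned long ok = 0x10000000UL;
	unsigned long err = (unsigned long)-12;	/* an -ENOMEM style return */

	assert(mpx_should_populate(ok, mpx_vm_flags(VM_LOCKED)));
	assert(!mpx_should_populate(ok, mpx_vm_flags(0)));
	assert(!mpx_should_populate(err, mpx_vm_flags(VM_LOCKED)));
}
```

The point of the model is only the ordering: the error check comes
first, so a failed mmap_region() never reaches mm_populate(), and the
tables are populated only when locking was requested.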
RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
On 2014-06-24, Andy Lutomirski wrote:
>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>> Can the new vm_operation "name" be used for this? The magic "always
>>> written to core dumps" feature might need to be reconsidered.
>>
>> One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
>> vma. I don't see any code to prevent two VMAs with different
>> vm_ops->names from getting merged. That seems like a bit of a design
>> oversight for ->name. Right?
>
> AFAIK there are no ->name users that don't also set ->close, for exactly
> that reason. I'd be okay with adding a check for ->name, too.
>
> Hmm. If MPX vmas had a real struct file attached, this would all come
> for free. Maybe vmas with non-default vm_ops and file != NULL should
> never be mergeable?
>
>> Thinking out loud a bit... There are also some more complicated but more
>> performant cleanup mechanisms that I'd like to go after in the future.
>> Given a page, we might want to figure out if it is an MPX page or not.
>> I wonder if we'll ever collide with some other user of vm_ops->name. It
>> looks fairly narrowly used at the moment, but would this keep us from
>> putting these pages on, say, a tmpfs mount? Doesn't look that way at
>> the moment.
>
> You could always check the vm_ops pointer to see if it's MPX.
>
> One feature I've wanted: a way to have special per-process vmas that can
> be easily found. For example, I want to be able to efficiently find out
> where the vdso and vvar vmas are. I don't think this is currently
> supported.

Andy, if you add a check for ->name to avoid the MPX vmas merged with
non-MPX vmas, I guess the work flow should be as follows (use
_install_special_mapping to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
	..
	static struct vm_special_mapping mpx_mapping = {
		.name = "[mpx]",
		.pages = no_pages,
	};

	...
	vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
	..
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei
[PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX specific memory usage, this interface is added
to stick new vm_flag VM_MPX in the vma_area_struct when create a
bounds table or bounds directory.

Signed-off-by: Qiaowei Ren
---
 arch/x86/Kconfig           |    4 +++
 arch/x86/include/asm/mpx.h |   38
 arch/x86/mm/Makefile       |    2 +
 arch/x86/mm/mpx.c          |   58
 4 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f..0194790 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
 	def_bool y
 	depends on INTEL_IOMMU && ACPI

+config X86_INTEL_MPX
+	def_bool y
+	depends on CPU_SUP_INTEL
+
 config X86_32_SMP
 	def_bool y
 	depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET	28
+#define MPX_BD_ENTRY_SHIFT	3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET	17
+#define MPX_BT_ENTRY_SHIFT	5
+#define MPX_IGN_BITS		3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET	20
+#define MPX_BD_ENTRY_SHIFT	2
+#define MPX_BT_ENTRY_OFFSET	10
+#define MPX_BT_ENTRY_SHIFT	4
+#define MPX_IGN_BITS		2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE	0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)	+= srat.o
 obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o

 obj-$(CONFIG_MEMTEST)		+= memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..546c5d1
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,58 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+	unsigned long ret;
+	unsigned long addr, pgoff;
+	struct mm_struct *mm = current->mm;
+	vm_flags_t vm_flags;
+
+	/* Only bounds table and bounds directory can be allocated here */
+	if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+		return -EINVAL;
+
+	down_write(&mm->mmap_sem);
+
+	/* Too many mappings? */
+	if (mm->map_count > sysctl_max_map_count) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/* Obtain the address to map to. we verify (or select) it and ensure
+	 * that it represents a valid section of the address space.
+	 */
+	addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+	if (addr & ~PAGE_MASK) {
+		ret = addr;
+		goto out;
+	}
+
+	vm_flags = VM_READ | VM_WRITE | VM_MPX |
+			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+	/* Make bounds tables and bouds directory unlocked. */
+	if (vm_flags & VM_LOCKED)
+		vm_flags &= ~VM_LOCKED;
+
+	/* Set pgoff according to addr for anon_vma */
+	pgoff = addr >> PAGE_SHIFT;
+
+	ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+
+out:
+	up_write(&mm->mmap_sem);
+	return ret;
+}
--
1.7.1
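The constants in asm/mpx.h fix the table geometry, and a small
user-space sketch makes the arithmetic concrete. It copies the 64-bit
values from the patch; the helper functions (mpx_bd_index,
mpx_bt_index, mpx_bt_overhead, mpx_layout_check) are hypothetical names
introduced here for illustration and are not part of the patch. The
index helpers simply follow the header's comments: bits [47:20] select
a bounds directory entry and bits [19:3] select a bounds table entry.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit constants, copied from the patch's asm/mpx.h. */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BD_ENTRY_SHIFT	3
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_BT_ENTRY_SHIFT	5
#define MPX_IGN_BITS		3

/* Same formulas as the patch, with a fixed 64-bit type for portability. */
#define MPX_BD_SIZE_BYTES \
	(UINT64_C(1) << (MPX_BD_ENTRY_OFFSET + MPX_BD_ENTRY_SHIFT))
#define MPX_BT_SIZE_BYTES \
	(UINT64_C(1) << (MPX_BT_ENTRY_OFFSET + MPX_BT_ENTRY_SHIFT))

/* Bounds directory index: address bits [47:20]. */
uint64_t mpx_bd_index(uint64_t addr)
{
	return (addr >> (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS)) &
	       ((UINT64_C(1) << MPX_BD_ENTRY_OFFSET) - 1);
}

/* Bounds table index: address bits [19:3]. */
uint64_t mpx_bt_index(uint64_t addr)
{
	return (addr >> MPX_IGN_BITS) &
	       ((UINT64_C(1) << MPX_BT_ENTRY_OFFSET) - 1);
}

/* Bytes of bounds table per byte of address space it covers. */
uint64_t mpx_bt_overhead(void)
{
	/* One 2^5-byte entry describes each 2^3-byte slot. */
	return (UINT64_C(1) << MPX_BT_ENTRY_SHIFT) >> MPX_IGN_BITS;
}

/* Usage check: sizes implied by the 64-bit constants. */
void mpx_layout_check(void)
{
	assert(MPX_BD_SIZE_BYTES == UINT64_C(2147483648));	/* 2 GiB directory */
	assert(MPX_BT_SIZE_BYTES == UINT64_C(4194304));		/* 4 MiB per table */
	assert(mpx_bt_overhead() == 4);
}
```

With these values a bounds table is 4 MiB and covers 1 MiB of address
space (bits [19:0]), which appears to be where the "4x" worst case
mentioned in the review comes from. The 32-bit constants give the same
ratio (16-byte entries per 4-byte slot).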