Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-27 Thread Andy Lutomirski
On Fri, Jun 27, 2014 at 10:42 AM, Dave Hansen  wrote:
> On 06/27/2014 10:34 AM, Dave Hansen wrote:
>> I'm claiming that we need COW behavior for the bounds tables, at least
>> by default.  If userspace knows enough about the ways that it is using
>> the tables and knows how to share them, let it go to town.  The kernel
>> will permit this kind of usage model, but we simply won't be helping
>> with the management of the tables when userspace creates them.
>
> Actually, this is another reason we need to mark VMAs as being
> MPX-related explicitly instead of inferring it from the tables.  If
> userspace does something really specialized like this, the kernel does
> not want to confuse these VMAs with the ones it created.
>

Good point.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-27 Thread Dave Hansen
On 06/27/2014 10:34 AM, Dave Hansen wrote:
> I'm claiming that we need COW behavior for the bounds tables, at least
> by default.  If userspace knows enough about the ways that it is using
> the tables and knows how to share them, let it go to town.  The kernel
> will permit this kind of usage model, but we simply won't be helping
> with the management of the tables when userspace creates them.

Actually, this is another reason we need to mark VMAs as being
MPX-related explicitly instead of inferring it from the tables.  If
userspace does something really specialized like this, the kernel does
not want to confuse these VMAs with the ones it created.




Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-27 Thread Dave Hansen
On 06/26/2014 05:26 PM, Andy Lutomirski wrote:
> On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen  wrote:
>> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>>> Also, egads: what happens when a bound table entry is associated with
>>> a MAP_SHARED page?
>>
>> Bounds table entries are for pointers.  Do we keep pointers inside of
>> MAP_SHARED-mapped things? :)
> 
> Sure, if it's MAP_SHARED | MAP_ANONYMOUS.  For example:
> 
> struct thing {
>   struct thing *next;
> };
> 
> struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
> storage[0].next = &storage[1];
> fork();
> 
> I'm not suggesting that this needs to *work* in the first incarnation of this :)

I'm not sure I'm seeing the issue.

I'm claiming that we need COW behavior for the bounds tables, at least
by default.  If userspace knows enough about the ways that it is using
the tables and knows how to share them, let it go to town.  The kernel
will permit this kind of usage model, but we simply won't be helping
with the management of the tables when userspace creates them.

You've demonstrated a case where userspace might theoretically want
to share bounds tables (although I think it's pretty dangerous).
It's equally possible, in theory, that userspace might *not* want to
share the tables, for instance if one process narrowed the bounds and
the other did not.
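
For reference, the copy-on-write behavior being asked for here is what a
private anonymous mapping already provides across fork().  The following
is a minimal userspace sketch, not kernel or MPX code; the helper name
parent_value_after_child_write() is made up for illustration.  The child
stands in for the process that "narrowed the bounds", and the parent for
the one that did not.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write 1 into a
 * MAP_PRIVATE | MAP_ANONYMOUS mapping, fork, let the child overwrite it
 * with 2, and return what the parent reads afterwards (-1 on error).
 */
static int parent_value_after_child_write(void)
{
    size_t len = sizeof(int);
    int *slot = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return -1;

    assert(*slot == 0);     /* anonymous memory starts out zero-filled */
    *slot = 1;              /* state both processes share at fork time */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(slot, len);
        return -1;
    }
    if (pid == 0) {
        *slot = 2;          /* lands in the child's private COW copy */
        _exit(0);
    }

    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *slot;       /* the parent still reads its own copy */
    munmap(slot, len);
    return seen;
}
```

With a private mapping the parent keeps reading 1; the child's write never
reaches it, which is the "one process narrowed the bounds and the other
did not" situation.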


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Andy Lutomirski
On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen  wrote:
> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>> So here's my mental image of how I might do this if I were doing it
>> entirely in userspace: I'd create a file or memfd for the bound tables
>> and another for the bound directory.  These files would be *huge*: the
>> bound directory file would be 2GB and the bounds table file would be
>> 2^48 bytes or whatever it is.  (Maybe even bigger?)
>>
>> Then I'd just map pieces of those files wherever they'd need to be,
>> and I'd make the mappings sparse.  I suspect that you don't actually
>> want a vma for each piece of bound table that gets mapped -- the space
>> of vmas could end up incredibly sparse.  So I'd at least map (in the
>> vma sense, not the pte sense) an entire bound table at a time.  And
>> I'd probably just map the bound directory in one big piece.
>>
>> Then I'd populate it in the fault handler.
>>
>> This is almost what the code is doing, I think, modulo the files.
>>
>> This has one killer problem: these mappings need to be private (cowed
>> on fork).  So memfd is no good.
>
> This essentially uses the page cache's radix tree as a parallel data
> structure in order to keep a vaddr->mpx_vma map.  That's not a bad idea,
> but it is a parallel data structure that does not handle copy-on-write
> very well.
>
> I'm pretty sure we need the semantics that anonymous memory provides.
>
>> There's got to be an easyish way to
>> modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
>> mmap_region parameter or something?  Maybe even a special anon_vma,
>> but I don't really understand how those work.
>
> Yeah, we very well might end up having to go down that path.
>
>> Also, egads: what happens when a bound table entry is associated with
>> a MAP_SHARED page?
>
> Bounds table entries are for pointers.  Do we keep pointers inside of
> MAP_SHARED-mapped things? :)

Sure, if it's MAP_SHARED | MAP_ANONYMOUS.  For example:

struct thing {
  struct thing *next;
};

struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
storage[0].next = &storage[1];
fork();

I'm not suggesting that this needs to *work* in the first incarnation of this :)
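
A runnable variation on the fragment above, as a sketch only: here the
child stores the pointer *after* fork(), so the pointer becomes visible to
the parent through the shared page while any per-process metadata the
child created for it would not.  The mmap() arguments and the helper name
parent_sees_pointer_written_by_child() are assumptions made for the
example; they are not taken from the patch set.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct thing {
    struct thing *next;
};

/*
 * Hypothetical helper (illustration only): returns 1 if a pointer stored
 * by the child into a MAP_SHARED | MAP_ANONYMOUS mapping is visible to
 * the parent, 0 if it is not, and -1 on error.
 */
static int parent_sees_pointer_written_by_child(void)
{
    size_t len = 2 * sizeof(struct thing);
    struct thing *storage = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (storage == MAP_FAILED)
        return -1;

    assert(storage[0].next == NULL);    /* fresh mapping is zero-filled */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(storage, len);
        return -1;
    }
    if (pid == 0) {
        /* Same virtual address in both processes, same physical page. */
        storage[0].next = &storage[1];
        _exit(0);
    }

    int result = -1;
    if (waitpid(pid, NULL, 0) == pid)
        result = (storage[0].next == &storage[1]);
    munmap(storage, len);
    return result;
}
```

The helper is expected to return 1: the pointer itself crosses the process
boundary, which is why a bounds table entry tied to it raises the question.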

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Dave Hansen
On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
> So here's my mental image of how I might do this if I were doing it
> entirely in userspace: I'd create a file or memfd for the bound tables
> and another for the bound directory.  These files would be *huge*: the
> bound directory file would be 2GB and the bounds table file would be
> 2^48 bytes or whatever it is.  (Maybe even bigger?)
> 
> Then I'd just map pieces of those files wherever they'd need to be,
> and I'd make the mappings sparse.  I suspect that you don't actually
> want a vma for each piece of bound table that gets mapped -- the space
> of vmas could end up incredibly sparse.  So I'd at least map (in the
> vma sense, not the pte sense) an entire bound table at a time.  And
> I'd probably just map the bound directory in one big piece.
> 
> Then I'd populate it in the fault handler.
> 
> This is almost what the code is doing, I think, modulo the files.
> 
> This has one killer problem: these mappings need to be private (cowed
> on fork).  So memfd is no good.

This essentially uses the page cache's radix tree as a parallel data
structure in order to keep a vaddr->mpx_vma map.  That's not a bad idea,
but it is a parallel data structure that does not handle copy-on-write
very well.

I'm pretty sure we need the semantics that anonymous memory provides.

> There's got to be an easyish way to
> modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
> mmap_region parameter or something?  Maybe even a special anon_vma,
> but I don't really understand how those work.

Yeah, we very well might end up having to go down that path.

> Also, egads: what happens when a bound table entry is associated with
> a MAP_SHARED page?

Bounds table entries are for pointers.  Do we keep pointers inside of
MAP_SHARED-mapped things? :)



Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Andy Lutomirski
On Thu, Jun 26, 2014 at 3:58 PM, Dave Hansen  wrote:
> On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
>> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen  wrote:
>>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>>>> created a per-mm memfd and mapped it, it all just might work.
>>>
>>> memfd_create() seems to bring a fair amount of baggage along (the fd
>>> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
>>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>>
>>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
>>
>> Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
>> userspace, then memfd_create would be exactly the right solution, I
>> think.
>>
>> Would it work to just scan the bound directory to figure out how many
>> bound tables exist?
>
> Theoretically, perhaps.
>
> Practically, the bounds directory is 2GB, and it is likely to be very
> sparse.  You would have to walk the page tables finding where pages were
> mapped, then search the mapped pages for bounds table entries.
>
> Assuming that it was aligned and minimally populated, that's a *MINIMUM*
> search looking for a PGD entry, then you have to look at 512 PUD
> entries.  A full search would have to look at half a million ptes.
> That's just finding out how sparse the first level of the tables is
> before you've looked at a byte of actual data, and if they were empty.
>
> We could keep another, parallel, data structure that handles this better
> than the hardware tables.  Like, say, an rbtree that stores ranges
> of virtual addresses.  We could call them vm_area_somethings ... wait a
> sec... we have a structure like that. ;)
>
>

So here's my mental image of how I might do this if I were doing it
entirely in userspace: I'd create a file or memfd for the bound tables
and another for the bound directory.  These files would be *huge*: the
bound directory file would be 2GB and the bounds table file would be
2^48 bytes or whatever it is.  (Maybe even bigger?)

Then I'd just map pieces of those files wherever they'd need to be,
and I'd make the mappings sparse.  I suspect that you don't actually
want a vma for each piece of bound table that gets mapped -- the space
of vmas could end up incredibly sparse.  So I'd at least map (in the
vma sense, not the pte sense) an entire bound table at a time.  And
I'd probably just map the bound directory in one big piece.

Then I'd populate it in the fault handler.

This is almost what the code is doing, I think, modulo the files.

This has one killer problem: these mappings need to be private (cowed
on fork).  So memfd is no good.  There's got to be an easyish way to
modify the mm code to allow anonymous maps with vm_ops.  Maybe a new
mmap_region parameter or something?  Maybe even a special anon_vma,
but I don't really understand how those work.


Also, egads: what happens when a bound table entry is associated with
a MAP_SHARED page?

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Dave Hansen
On 06/26/2014 03:19 PM, Andy Lutomirski wrote:
> On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen  wrote:
>> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>>> created a per-mm memfd and mapped it, it all just might work.
>>
>> memfd_create() seems to bring a fair amount of baggage along (the fd
>> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
>> some way to plumb to userspace the RSS values of VMAs with that bit set.
>>
>> Creating and mmap()'ing a fd seems a rather roundabout way to get there.
> 
> Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
> userspace, then memfd_create would be exactly the right solution, I
> think.
> 
> Would it work to just scan the bound directory to figure out how many
> bound tables exist?

Theoretically, perhaps.

Practically, the bounds directory is 2GB, and it is likely to be very
sparse.  You would have to walk the page tables finding where pages were
mapped, then search the mapped pages for bounds table entries.

Assuming that it was aligned and minimally populated, that's a *MINIMUM*
search looking for a PGD entry, then you have to look at 512 PUD
entries.  A full search would have to look at half a million ptes.
That's just finding out how sparse the first level of the tables is
before you've looked at a byte of actual data, and if they were empty.

We could keep another, parallel, data structure that handles this better
than the hardware tables.  Like, say, an rbtree that stores ranges
of virtual addresses.  We could call them vm_area_somethings ... wait a
sec... we have a structure like that. ;)
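
The "512" and "half a million" figures above can be checked with a little
arithmetic.  This sketch assumes 4 KiB pages and 8-byte page-table
entries, which are the usual x86-64 values but are not spelled out in this
thread; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Entries that fit in one page-table page. */
static uint64_t entries_per_table(uint64_t page_size, uint64_t entry_size)
{
    assert(entry_size != 0);
    return page_size / entry_size;
}

/* Last-level entries (PTEs) needed to map `bytes` of address space,
 * rounding up to whole pages. */
static uint64_t ptes_to_cover(uint64_t bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}
```

For example, entries_per_table(4096, 8) gives 512, and
ptes_to_cover(2ULL << 30, 4096) gives 524288 for the 2GB bounds directory.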




Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Andy Lutomirski
On Wed, Jun 25, 2014 at 2:45 PM, Dave Hansen  wrote:
> On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
>> Hmm.  the memfd_create thing may be able to do this for you.  If you
>> created a per-mm memfd and mapped it, it all just might work.
>
> memfd_create() seems to bring a fair amount of baggage along (the fd
> part :) if all we want is a marker.  Really, all we need is _a_ bit, and
> some way to plumb to userspace the RSS values of VMAs with that bit set.
>
> Creating and mmap()'ing a fd seems a rather roundabout way to get there.

Hmm.  So does VM_MPX, though.  If this stuff were done entirely in
userspace, then memfd_create would be exactly the right solution, I
think.

Would it work to just scan the bound directory to figure out how many
bound tables exist?

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-26 Thread Dave Hansen
On 06/25/2014 02:05 PM, Andy Lutomirski wrote:
> Hmm.  the memfd_create thing may be able to do this for you.  If you
> created a per-mm memfd and mapped it, it all just might work.

memfd_create() seems to bring a fair amount of baggage along (the fd
part :) if all we want is a marker.  Really, all we need is _a_ bit, and
some way to plumb to userspace the RSS values of VMAs with that bit set.

Creating and mmap()'ing a fd seems a rather roundabout way to get there.
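
To make the "_a_ bit plus RSS" idea concrete, here is a toy userspace
model, not kernel code.  struct sketch_vma, SKETCH_VM_MPX and its bit
value are all invented for the example; the real flag and the real RSS
accounting would live in the mm code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_VM_MPX 0x1UL     /* hypothetical bit value, illustration only */

/* Toy stand-in for a VMA: an address range, flags, and resident pages. */
struct sketch_vma {
    unsigned long start, end;
    unsigned long flags;
    unsigned long rss_pages;
};

/* Sum the resident pages of every region that carries the marker bit. */
static unsigned long sketch_mpx_rss(const struct sketch_vma *vmas, size_t n)
{
    assert(vmas != NULL || n == 0);

    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        if (vmas[i].flags & SKETCH_VM_MPX)
            total += vmas[i].rss_pages;
    return total;
}
```

The point of the sketch is only that a single flag is enough to pick out
the interesting regions; no file descriptor is involved.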


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-25 Thread Dave Hansen
On 06/25/2014 02:04 PM, Andy Lutomirski wrote:
> On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei  wrote:
>> Hmm, _install_special_mapping should completely prevent merging, even
>> among MPX vmas.
>>
>> So, could you tell me how to set an MPX-specific ->name on the vma when
>> it is created?  It seems that I could not find such an interface.
> 
> You may need to add one.
> 
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

I shared this with Qiaowei privately, but might as well repeat myself
here in case anyone wants to set me straight.

Most of the interfaces that set vm_ops do it in the file_operations ->mmap
op.  Nobody sets ->vm_ops on anonymous VMAs, so we're in uncharted
territory.

My suggestion: you can either plumb a new API down into mmap_region()
to get the VMA or set ->vm_ops, or just call find_vma() after
mmap_region() or get_unmapped_area() and set it manually.  Just make
sure you still have mmap_sem held over the whole thing.

I think I prefer just setting ->vm_ops directly, even though it's a wee
bit of a hack to create something just to look it up a moment later.
Oh, well.



Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-25 Thread Andy Lutomirski
On Wed, Jun 25, 2014 at 2:04 PM, Andy Lutomirski  wrote:
> On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei  wrote:
>> On 2014-06-25, Andy Lutomirski wrote:
>>> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei 
>>> wrote:
>>>> On 2014-06-24, Andy Lutomirski wrote:
>>>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>>>>> Can the new vm_operation "name" be use for this?  The magic
>>>>>>> "always written to core dumps" feature might need to be reconsidered.
>>>>>>
>>>>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>>>>> non-MPX vma.  I don't see any code to prevent two VMAs with
>>>>>> different vm_ops->names from getting merged.  That seems like a
>>>>>> bit of a design oversight for ->name.  Right?
>>>>>
>>>>> AFAIK there are no ->name users that don't also set ->close, for
>>>>> exactly that reason.  I'd be okay with adding a check for ->name, too.
>>>>>
>>>>> Hmm.  If MPX vmas had a real struct file attached, this would all
>>>>> come for free. Maybe vmas with non-default vm_ops and file != NULL
>>>>> should never be mergeable?
>>>>>
>>>>>>
>>>>>> Thinking out loud a bit... There are also some more complicated
>>>>>> but more performant cleanup mechanisms that I'd like to go after in the
>>>>>> future.
>>>>>> Given a page, we might want to figure out if it is an MPX page or not.
>>>>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>>>>> It looks fairly narrowly used at the moment, but would this keep
>>>>>> us from putting these pages on, say, a tmpfs mount?  Doesn't look
>>>>>> that way at the moment.
>>>>>
>>>>> You could always check the vm_ops pointer to see if it's MPX.
>>>>>
>>>>> One feature I've wanted: a way to have special per-process vmas that
>>>>> can be easily found.  For example, I want to be able to efficiently
>>>>> find out where the vdso and vvar vmas are.  I don't think this is
>>>>> currently supported.
>>>>>
>>>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>>>> with
>>> non-MPX vmas, I guess the work flow should be as follow (use
>>> _install_special_mapping to get a new vma):
>>>>
>>>> unsigned long mpx_mmap(unsigned long len) {
>>>> ..
>>>> static struct vm_special_mapping mpx_mapping = {
>>>> .name = "[mpx]",
>>>> .pages = no_pages,
>>>> };
>>>>
>>>> ... vma = _install_special_mapping(mm, addr, len, vm_flags,
>>>> &mpx_mapping); ..
>>>> }
>>>>
>>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>>
>>> Does this actually create a vma backed with real memory?  Doesn't this
>>> need to go through anon_vma or something?  _install_special_mapping
>>> completely prevents merging.
>>>
>> Hmm, _install_special_mapping should completely prevent merging, even among 
>> MPX vmas.
>>
>> So, could you tell me how to set MPX specific ->name to the vma when it is 
>> created? Seems like that I could not find such interface.
>
> You may need to add one.
>
> I'd suggest posting a new thread to linux-mm describing what you need
> and asking how to do it.

Hmm.  The memfd_create thing may be able to do this for you.  If you
created a per-mm memfd and mapped it, it all just might work.

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-25 Thread Andy Lutomirski
On Tue, Jun 24, 2014 at 6:40 PM, Ren, Qiaowei  wrote:
> On 2014-06-25, Andy Lutomirski wrote:
>> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei 
>> wrote:
>>> On 2014-06-24, Andy Lutomirski wrote:
>>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>>>> Can the new vm_operation "name" be use for this?  The magic
>>>>>> "always written to core dumps" feature might need to be reconsidered.
>>>>>
>>>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>>>> non-MPX vma.  I don't see any code to prevent two VMAs with
>>>>> different vm_ops->names from getting merged.  That seems like a
>>>>> bit of a design oversight for ->name.  Right?
>>>>
>>>> AFAIK there are no ->name users that don't also set ->close, for
>>>> exactly that reason.  I'd be okay with adding a check for ->name, too.
>>>>
>>>> Hmm.  If MPX vmas had a real struct file attached, this would all
>>>> come for free. Maybe vmas with non-default vm_ops and file != NULL
>>>> should never be mergeable?
>>>>
>>>>>
>>>>> Thinking out loud a bit... There are also some more complicated
>>>>> but more performant cleanup mechanisms that I'd like to go after in the
>>>>> future.
>>>>> Given a page, we might want to figure out if it is an MPX page or not.
>>>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>>>> It looks fairly narrowly used at the moment, but would this keep
>>>>> us from putting these pages on, say, a tmpfs mount?  Doesn't look
>>>>> that way at the moment.
>>>>
>>>> You could always check the vm_ops pointer to see if it's MPX.
>>>>
>>>> One feature I've wanted: a way to have special per-process vmas that
>>>> can be easily found.  For example, I want to be able to efficiently
>>>> find out where the vdso and vvar vmas are.  I don't think this is
>>>> currently supported.
>>>>
>>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>>> with
>> non-MPX vmas, I guess the work flow should be as follow (use
>> _install_special_mapping to get a new vma):
>>>
>>> unsigned long mpx_mmap(unsigned long len) {
>>> ..
>>> static struct vm_special_mapping mpx_mapping = {
>>> .name = "[mpx]",
>>> .pages = no_pages,
>>> };
>>>
>>> ... vma = _install_special_mapping(mm, addr, len, vm_flags,
>>> _mapping); ..
>>> }
>>>
>>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
>>
>> Does this actually create a vma backed with real memory?  Doesn't this
>> need to go through anon_vma or something?  _install_special_mapping
>> completely prevents merging.
>>
> Hmm, _install_special_mapping should completely prevent merging, even among 
> MPX vmas.
>
> So, could you tell me how to set MPX specific ->name to the vma when it is 
> created? Seems like that I could not find such interface.

You may need to add one.

I'd suggest posting a new thread to linux-mm describing what you need
and asking how to do it.

--Andy


RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-24 Thread Ren, Qiaowei
On 2014-06-25, Andy Lutomirski wrote:
> On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei 
> wrote:
>> On 2014-06-24, Andy Lutomirski wrote:
>>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>>> Can the new vm_operation "name" be use for this?  The magic
>>>>> "always written to core dumps" feature might need to be reconsidered.
>>>>
>>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>>> non-MPX vma.  I don't see any code to prevent two VMAs with
>>>> different vm_ops->names from getting merged.  That seems like a
>>>> bit of a design oversight for ->name.  Right?
>>> 
>>> AFAIK there are no ->name users that don't also set ->close, for
>>> exactly that reason.  I'd be okay with adding a check for ->name, too.
>>> 
>>> Hmm.  If MPX vmas had a real struct file attached, this would all
>>> come for free. Maybe vmas with non-default vm_ops and file != NULL
>>> should never be mergeable?
>>> 
>>>>
>>>> Thinking out loud a bit... There are also some more complicated
>>>> but more performant cleanup mechanisms that I'd like to go after in the
>>>> future.
>>>> Given a page, we might want to figure out if it is an MPX page or not.
>>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>>> It looks fairly narrowly used at the moment, but would this keep
>>>> us from putting these pages on, say, a tmpfs mount?  Doesn't look
>>>> that way at the moment.
>>> 
>>> You could always check the vm_ops pointer to see if it's MPX.
>>> 
>>> One feature I've wanted: a way to have special per-process vmas that
>>> can be easily found.  For example, I want to be able to efficiently
>>> find out where the vdso and vvar vmas are.  I don't think this is
>>> currently supported.
>>> 
>> Andy, if you add a check for ->name to avoid the MPX vmas merged
>> with
> non-MPX vmas, I guess the work flow should be as follow (use
> _install_special_mapping to get a new vma):
>> 
>> unsigned long mpx_mmap(unsigned long len) {
>> ..
>> static struct vm_special_mapping mpx_mapping = {
>> .name = "[mpx]",
>> .pages = no_pages,
>> };
>> 
>> ... vma = _install_special_mapping(mm, addr, len, vm_flags,
>> &mpx_mapping); ..
>> }
>> 
>> Then, we could check the ->name to see if the VMA is MPX specific. Right?
> 
> Does this actually create a vma backed with real memory?  Doesn't this
> need to go through anon_vma or something?  _install_special_mapping
> completely prevents merging.
> 
Hmm, _install_special_mapping should completely prevent merging, even among MPX 
vmas.

So, could you tell me how to set an MPX-specific ->name on the vma when it is
created? It seems that I could not find such an interface.

Thanks,
Qiaowei

Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-24 Thread Andy Lutomirski
On Mon, Jun 23, 2014 at 10:53 PM, Ren, Qiaowei  wrote:
> On 2014-06-24, Andy Lutomirski wrote:
>>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>>> Can the new vm_operation "name" be use for this?  The magic "always
>>>> written to core dumps" feature might need to be reconsidered.
>>>
>>> One thing I'd like to avoid is an MPX vma getting merged with a
>>> non-MPX vma.  I don't see any code to prevent two VMAs with
>>> different vm_ops->names from getting merged.  That seems like a bit
>>> of a design oversight for ->name.  Right?
>>
>> AFAIK there are no ->name users that don't also set ->close, for
>> exactly that reason.  I'd be okay with adding a check for ->name, too.
>>
>> Hmm.  If MPX vmas had a real struct file attached, this would all come
>> for free. Maybe vmas with non-default vm_ops and file != NULL should
>> never be mergeable?
>>
>>>
>>> Thinking out loud a bit... There are also some more complicated but
>>> more performant cleanup mechanisms that I'd like to go after in the future.
>>> Given a page, we might want to figure out if it is an MPX page or not.
>>> I wonder if we'll ever collide with some other user of vm_ops->name.
>>> It looks fairly narrowly used at the moment, but would this keep us
>>> from putting these pages on, say, a tmpfs mount?  Doesn't look that
>>> way at the moment.
>>
>> You could always check the vm_ops pointer to see if it's MPX.
>>
>> One feature I've wanted: a way to have special per-process vmas that
>> can be easily found.  For example, I want to be able to efficiently
>> find out where the vdso and vvar vmas are.  I don't think this is currently 
>> supported.
>>
> Andy, if you add a check for ->name to avoid the MPX vmas merged with non-MPX 
> vmas, I guess the work flow should be as follow (use _install_special_mapping 
> to get a new vma):
>
> unsigned long mpx_mmap(unsigned long len)
> {
> ..
> static struct vm_special_mapping mpx_mapping = {
> .name = "[mpx]",
> .pages = no_pages,
> };
>
> ...
> vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
> ..
> }
>
> Then, we could check the ->name to see if the VMA is MPX specific. Right?

Does this actually create a vma backed with real memory?  Doesn't this
need to go through anon_vma or something?  _install_special_mapping
completely prevents merging.

Possibly silly question: would it make more sense to just create one
giant vma for the MPX tables and only populate pieces of it as needed?
This wouldn't work for 32-bit code, but maybe we don't care.  (I see
no reason why it couldn't work for x32, though.)

(I don't really understand how anonymous memory works at all.  I'm not
an mm person.)

--Andy


RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>> Can the new vm_operation "name" be use for this?  The magic "always
>>> written to core dumps" feature might need to be reconsidered.
>> 
>> One thing I'd like to avoid is an MPX vma getting merged with a
>> non-MPX vma.  I don't see any code to prevent two VMAs with
>> different vm_ops->names from getting merged.  That seems like a bit
>> of a design oversight for ->name.  Right?
> 
> AFAIK there are no ->name users that don't also set ->close, for
> exactly that reason.  I'd be okay with adding a check for ->name, too.
> 
> Hmm.  If MPX vmas had a real struct file attached, this would all come
> for free. Maybe vmas with non-default vm_ops and file != NULL should
> never be mergeable?
> 
>> 
>> Thinking out loud a bit... There are also some more complicated but
>> more performant cleanup mechanisms that I'd like to go after in the future.
>> Given a page, we might want to figure out if it is an MPX page or not.
>> I wonder if we'll ever collide with some other user of vm_ops->name.
>> It looks fairly narrowly used at the moment, but would this keep us
>> from putting these pages on, say, a tmpfs mount?  Doesn't look that
>> way at the moment.
> 
> You could always check the vm_ops pointer to see if it's MPX.
> 
> One feature I've wanted: a way to have special per-process vmas that
> can be easily found.  For example, I want to be able to efficiently
> find out where the vdso and vvar vmas are.  I don't think this is currently 
> supported.
> 
Andy, if you add a check for ->name to keep MPX vmas from being merged with
non-MPX vmas, I guess the workflow should be as follows (use
_install_special_mapping to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
	..
	static struct vm_special_mapping mpx_mapping = {
		.name = "[mpx]",
		.pages = no_pages,
	};

	...
	vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
	..
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei



RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> +/* Make bounds tables and bouds directory unlocked. */
>> +if (vm_flags & VM_LOCKED)
>> +vm_flags &= ~VM_LOCKED;
> 
> Why?  I would expect MCL_FUTURE to lock these.
> 
Andy, I was a little confused about LOCKED and POPULATE earlier, and I
thought VM_LOCKED was not necessary for the MPX-specific bounds tables. This
check should be removed, and mm_populate() should be called for the VM_LOCKED
case after mmap_region():

   if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
           mm_populate(addr, len);

Thanks,
Qiaowei



Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Andy Lutomirski
On Mon, Jun 23, 2014 at 1:28 PM, Dave Hansen  wrote:
> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>> Can the new vm_operation "name" be use for this?  The magic "always
>> written to core dumps" feature might need to be reconsidered.
>
> One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
> vma.  I don't see any code to prevent two VMAs with different
> vm_ops->names from getting merged.  That seems like a bit of a design
> oversight for ->name.  Right?

AFAIK there are no ->name users that don't also set ->close, for
exactly that reason.  I'd be okay with adding a check for ->name, too.

Hmm.  If MPX vmas had a real struct file attached, this would all come
for free.  Maybe vmas with non-default vm_ops and file != NULL should
never be mergeable?
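
The merge rule being discussed can be sketched as a toy user-space model.
Everything below (struct toy_vma, struct toy_vm_ops, toy_mergeable() and
toy_mergeable_name_check()) is a hypothetical stand-in, not kernel code, and
it assumes the existing check only compares flags and file and refuses any
vma whose vm_ops has a ->close.  The second function adds the ->name check
suggested above:

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma;

/* Toy stand-in for vm_operations_struct; only the two hooks of interest. */
struct toy_vm_ops {
	void (*close)(struct toy_vma *vma);
	const char *(*name)(struct toy_vma *vma);
};

/* Toy stand-in for a vma; only the fields the merge check looks at. */
struct toy_vma {
	unsigned long flags;
	const void *file;		/* NULL models anonymous memory */
	const struct toy_vm_ops *ops;	/* NULL models default vm_ops */
};

/* Assumed current rule: same flags, same file, and no ->close hook. */
static int toy_mergeable(const struct toy_vma *vma, const void *file,
			 unsigned long flags)
{
	if (vma->flags != flags)
		return 0;
	if (vma->file != file)
		return 0;
	if (vma->ops && vma->ops->close)
		return 0;
	return 1;
}

/* Same rule plus the proposed check: a vma with a ->name never merges. */
static int toy_mergeable_name_check(const struct toy_vma *vma,
				    const void *file, unsigned long flags)
{
	if (vma->ops && vma->ops->name)
		return 0;
	return toy_mergeable(vma, file, flags);
}

static const char *toy_mpx_name(struct toy_vma *vma)
{
	(void)vma;
	return "[mpx]";
}

static void toy_mpx_close(struct toy_vma *vma)
{
	(void)vma;
}

/* A ->name user without ->close, and one that sets both. */
static const struct toy_vm_ops toy_name_only_ops = {
	.name = toy_mpx_name,
};
static const struct toy_vm_ops toy_name_close_ops = {
	.close = toy_mpx_close,
	.name = toy_mpx_name,
};
```

In this model a vma using toy_name_only_ops still merges under the assumed
current rule and is only kept apart by the added ->name check, while a vma
using toy_name_close_ops is already unmergeable because of ->close.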

>
> Thinking out loud a bit... There are also some more complicated but more
> performant cleanup mechanisms that I'd like to go after in the future.
> Given a page, we might want to figure out if it is an MPX page or not.
> I wonder if we'll ever collide with some other user of vm_ops->name.  It
> looks fairly narrowly used at the moment, but would this keep us from
> putting these pages on, say, a tmpfs mount?  Doesn't look that way at
> the moment.

You could always check the vm_ops pointer to see if it's MPX.

One feature I've wanted: a way to have special per-process vmas that
can be easily found.  For example, I want to be able to efficiently
find out where the vdso and vvar vmas are.  I don't think this is
currently supported.

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Dave Hansen
On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
> Can the new vm_operation "name" be use for this?  The magic "always
> written to core dumps" feature might need to be reconsidered.

One thing I'd like to avoid is an MPX vma getting merged with a non-MPX
vma.  I don't see any code to prevent two VMAs with different
vm_ops->names from getting merged.  That seems like a bit of a design
oversight for ->name.  Right?

Thinking out loud a bit... There are also some more complicated but more
performant cleanup mechanisms that I'd like to go after in the future.
Given a page, we might want to figure out if it is an MPX page or not.
I wonder if we'll ever collide with some other user of vm_ops->name.  It
looks fairly narrowly used at the moment, but would this keep us from
putting these pages on, say, a tmpfs mount?  Doesn't look that way at
the moment.


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Andy Lutomirski
On Mon, Jun 23, 2014 at 1:03 PM, Dave Hansen  wrote:
> On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
>> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>>> This patch adds one MPX specific mmap interface, which only handles
>>> mpx related maps, including bounds table and bounds directory.
>>>
>>> In order to track MPX specific memory usage, this interface is added
>>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>>> bounds table or bounds directory.
>>
>> I imagine the linux-mm people would want to think about any new vm flag.
>>  Why is this needed?
>
> These tables can take huge amounts of memory.  In the worst-case
> scenario, the tables can be 4x the size of the data structure being
> tracked.  IOW, a 1-page structure can require 4 bounds-table pages.
>
> My expectation is that folks using MPX are going to be keen on figuring
> out how much memory is being dedicated to it.  With this feature, plus
> some grepping in /proc/$pid/smaps one could take a pretty good stab at it.
>
> I know VM flags are scarce, and I'm open to other ways to skin this cat.
>

Can the new vm_operation "name" be used for this?  The magic "always
written to core dumps" feature might need to be reconsidered.

There's also arch_vma_name, but I just finished removing it for x86, and
I'd be a little sad to see it come right back.

--Andy


Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Dave Hansen
On 06/23/2014 12:49 PM, Andy Lutomirski wrote:
> On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
>> This patch adds one MPX specific mmap interface, which only handles
>> mpx related maps, including bounds table and bounds directory.
>>
>> In order to track MPX specific memory usage, this interface is added
>> to stick new vm_flag VM_MPX in the vma_area_struct when create a
>> bounds table or bounds directory.
> 
> I imagine the linux-mm people would want to think about any new vm flag.
>  Why is this needed?

These tables can take huge amounts of memory.  In the worst-case
scenario, the tables can be 4x the size of the data structure being
tracked.  IOW, a 1-page structure can require 4 bounds-table pages.

My expectation is that folks using MPX are going to be keen on figuring
out how much memory is being dedicated to it.  With this feature, plus
some grepping in /proc/$pid/smaps one could take a pretty good stab at it.

I know VM flags are scarce, and I'm open to other ways to skin this cat.
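
To put numbers on the sizes discussed above, here is a minimal user-space sketch that evaluates the same formula as the patch's MPX_BD_SIZE_BYTES and MPX_BT_SIZE_BYTES macros. The enum names and both helper functions are hypothetical and exist only for this illustration. Reading MPX_IGN_BITS as log2 of the pointer-slot size is an assumption, made because it reproduces the 4x worst case quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch's asm/mpx.h; the names are local to this sketch. */
enum {
	BD_ENTRY_OFFSET_64 = 28, BD_ENTRY_SHIFT_64 = 3,
	BT_ENTRY_OFFSET_64 = 17, BT_ENTRY_SHIFT_64 = 5, IGN_BITS_64 = 3,
	BD_ENTRY_OFFSET_32 = 20, BD_ENTRY_SHIFT_32 = 2,
	BT_ENTRY_OFFSET_32 = 10, BT_ENTRY_SHIFT_32 = 4, IGN_BITS_32 = 2,
};

/* Same formula as the size macros: 2^(index bits + log2(entry size)). */
static uint64_t mpx_table_bytes(unsigned int index_bits, unsigned int entry_shift)
{
	assert(index_bits + entry_shift < 64);
	return UINT64_C(1) << (index_bits + entry_shift);
}

/*
 * Assumption: one bounds-table entry covers one pointer-sized slot of
 * 2^ign_bits bytes, so the worst-case overhead is entry size / slot size.
 */
static uint64_t mpx_worst_case_ratio(unsigned int entry_shift, unsigned int ign_bits)
{
	assert(entry_shift >= ign_bits);
	return UINT64_C(1) << (entry_shift - ign_bits);
}
```

With the 64-bit values this gives a 2 GiB bounds directory (virtual size), 4 MiB per bounds table, and a ratio of 4, which is where the "1-page structure can require 4 bounds-table pages" figure comes from.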



Re: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Andy Lutomirski
On 06/18/2014 02:44 AM, Qiaowei Ren wrote:
> This patch adds one MPX specific mmap interface, which only handles
> mpx related maps, including bounds table and bounds directory.
> 
> In order to track MPX specific memory usage, this interface is added
> to stick new vm_flag VM_MPX in the vma_area_struct when create a
> bounds table or bounds directory.

I imagine the linux-mm people would want to think about any new vm flag.
 Why is this needed?

> 
> Signed-off-by: Qiaowei Ren 
> ---
>  arch/x86/Kconfig   |4 +++
>  arch/x86/include/asm/mpx.h |   38 
>  arch/x86/mm/Makefile   |2 +
>  arch/x86/mm/mpx.c  |   58 
> 
>  4 files changed, 102 insertions(+), 0 deletions(-)
>  create mode 100644 arch/x86/include/asm/mpx.h
>  create mode 100644 arch/x86/mm/mpx.c
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 25d2c6f..0194790 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
>   def_bool y
>   depends on INTEL_IOMMU && ACPI
>  
> +config X86_INTEL_MPX
> + def_bool y
> + depends on CPU_SUP_INTEL
> +
>  config X86_32_SMP
>   def_bool y
>   depends on X86_32 && SMP
> diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
> new file mode 100644
> index 000..5725ac4
> --- /dev/null
> +++ b/arch/x86/include/asm/mpx.h
> @@ -0,0 +1,38 @@
> +#ifndef _ASM_X86_MPX_H
> +#define _ASM_X86_MPX_H
> +
> +#include <linux/types.h>
> +#include <asm/ptrace.h>
> +
> +#ifdef CONFIG_X86_64
> +
> +/* upper 28 bits [47:20] of the virtual address in 64-bit used to
> + * index into bounds directory (BD).
> + */
> +#define MPX_BD_ENTRY_OFFSET  28
> +#define MPX_BD_ENTRY_SHIFT   3
> +/* bits [19:3] of the virtual address in 64-bit used to index into
> + * bounds table (BT).
> + */
> +#define MPX_BT_ENTRY_OFFSET  17
> +#define MPX_BT_ENTRY_SHIFT   5
> +#define MPX_IGN_BITS 3
> +
> +#else
> +
> +#define MPX_BD_ENTRY_OFFSET  20
> +#define MPX_BD_ENTRY_SHIFT   2
> +#define MPX_BT_ENTRY_OFFSET  10
> +#define MPX_BT_ENTRY_SHIFT   4
> +#define MPX_IGN_BITS 2
> +
> +#endif
> +
> +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
> +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
> +
> +#define MPX_BNDSTA_ERROR_CODE 0x3
> +
> +unsigned long mpx_mmap(unsigned long len);
> +
> +#endif /* _ASM_X86_MPX_H */
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index 6a19ad9..ecfdc46 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o
>  obj-$(CONFIG_NUMA_EMU)   += numa_emulation.o
>  
>  obj-$(CONFIG_MEMTEST)+= memtest.o
> +
> +obj-$(CONFIG_X86_INTEL_MPX)  += mpx.o
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> new file mode 100644
> index 000..546c5d1
> --- /dev/null
> +++ b/arch/x86/mm/mpx.c
> @@ -0,0 +1,58 @@
> +#include <linux/kernel.h>
> +#include <linux/syscalls.h>
> +#include <asm/mpx.h>
> +#include <asm/mman.h>
> +#include <linux/sched/sysctl.h>
> +
> +/*
> + * this is really a simplified "vm_mmap". it only handles mpx
> + * related maps, including bounds table and bounds directory.
> + *
> + * here we can stick new vm_flag VM_MPX in the vma_area_struct
> + * when create a bounds table or bounds directory, in order to
> + * track MPX specific memory.
> + */
> +unsigned long mpx_mmap(unsigned long len)
> +{
> + unsigned long ret;
> + unsigned long addr, pgoff;
> + struct mm_struct *mm = current->mm;
> + vm_flags_t vm_flags;
> +
> + /* Only bounds table and bounds directory can be allocated here */
> + if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
> + return -EINVAL;
> +
> + down_write(&mm->mmap_sem);
> +
> + /* Too many mappings? */
> + if (mm->map_count > sysctl_max_map_count) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + /* Obtain the address to map to. we verify (or select) it and ensure
> +  * that it represents a valid section of the address space.
> +  */
> + addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
> + if (addr & ~PAGE_MASK) {
> + ret = addr;
> + goto out;
> + }
> +
> + vm_flags = VM_READ | VM_WRITE | VM_MPX |
> + mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
> +
> + /* Make bounds tables and bouds directory unlocked. */
> + if (vm_flags & VM_LOCKED)
> + vm_flags &= ~VM_LOCKED;

Why?  I would expect MCL_FUTURE to lock these.

--Andy






RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> +	/* Make bounds tables and bouds directory unlocked. */
>> +	if (vm_flags & VM_LOCKED)
>> +		vm_flags &= ~VM_LOCKED;
>
> Why?  I would expect MCL_FUTURE to lock these.
>
Andy, I was just a little confused about LOCKED & POPULATE earlier, and I
thought VM_LOCKED was not necessary for MPX-specific bounds tables. Now, this
check should be removed, and mm_populate() should be called for the VM_LOCKED
case after mmap_region():

   if (!IS_ERR_VALUE(addr) && (vm_flags & VM_LOCKED))
           mm_populate(addr, len);

Thanks,
Qiaowei
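
For readers unfamiliar with the two error checks in play here (the patch's "addr & ~PAGE_MASK" test and the IS_ERR_VALUE() test proposed above), the following user-space sketch shows why both catch a negative errno returned through an unsigned long. Everything prefixed with sketch_ or SKETCH_ is hypothetical; the 4 KiB page size, the 4095 errno ceiling and the VM_LOCKED bit value are assumptions chosen to resemble the kernel's definitions, not code taken from it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((1UL << SKETCH_PAGE_SHIFT) - 1))
#define SKETCH_MAX_ERRNO  4095UL
#define SKETCH_VM_LOCKED  0x2000UL /* illustrative bit value */

/* Like the patch's "addr & ~PAGE_MASK": a valid mapping address is page aligned. */
static bool sketch_is_unaligned(unsigned long addr)
{
	return (addr & ~SKETCH_PAGE_MASK) != 0;
}

/* Like IS_ERR_VALUE(): error codes occupy the top SKETCH_MAX_ERRNO values. */
static bool sketch_is_err_value(unsigned long v)
{
	return v >= (0UL - SKETCH_MAX_ERRNO);
}

/* The proposed rule: populate only after a successful map with VM_LOCKED set. */
static bool sketch_should_populate(unsigned long addr, unsigned long vm_flags)
{
	return !sketch_is_err_value(addr) && (vm_flags & SKETCH_VM_LOCKED) != 0;
}
```

A value such as (unsigned long)-ENOMEM is both unaligned and inside the error range, so either test rejects it, while a page-aligned address passes both.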



RE: [PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-23 Thread Ren, Qiaowei
On 2014-06-24, Andy Lutomirski wrote:
>> On 06/23/2014 01:06 PM, Andy Lutomirski wrote:
>>> Can the new vm_operation "name" be used for this?  The magic "always
>>> written to core dumps" feature might need to be reconsidered.
>>
>> One thing I'd like to avoid is an MPX vma getting merged with a
>> non-MPX vma.  I don't see any code to prevent two VMAs with
>> different vm_ops->names from getting merged.  That seems like a bit
>> of a design oversight for ->name.  Right?
>
> AFAIK there are no ->name users that don't also set ->close, for
> exactly that reason.  I'd be okay with adding a check for ->name, too.
>
> Hmm.  If MPX vmas had a real struct file attached, this would all come
> for free. Maybe vmas with non-default vm_ops and file != NULL should
> never be mergeable?
>
>>
>> Thinking out loud a bit... There are also some more complicated but
>> more performant cleanup mechanisms that I'd like to go after in the future.
>> Given a page, we might want to figure out if it is an MPX page or not.
>> I wonder if we'll ever collide with some other user of vm_ops->name.
>> It looks fairly narrowly used at the moment, but would this keep us
>> from putting these pages on, say, a tmpfs mount?  Doesn't look that
>> way at the moment.
>
> You could always check the vm_ops pointer to see if it's MPX.
>
> One feature I've wanted: a way to have special per-process vmas that
> can be easily found.  For example, I want to be able to efficiently
> find out where the vdso and vvar vmas are.  I don't think this is currently
> supported.
>
Andy, if you add a check for ->name to keep MPX vmas from being merged with
non-MPX vmas, I guess the workflow should be as follows (using
_install_special_mapping to get a new vma):

unsigned long mpx_mmap(unsigned long len)
{
	..
	static struct vm_special_mapping mpx_mapping = {
		.name = "[mpx]",
		.pages = no_pages,
	};

	...
	vma = _install_special_mapping(mm, addr, len, vm_flags, &mpx_mapping);
	..
}

Then, we could check the ->name to see if the VMA is MPX specific. Right?

Thanks,
Qiaowei
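
The two identification ideas in this exchange (comparing the vm_ops pointer, and refusing to merge VMAs that carry a ->name) can be sketched in user space as follows. This is not kernel code: the sketch_ structures are hypothetical stand-ins that keep only the fields needed for the illustration, and the merge rule is one possible reading of "adding a check for ->name", modeled on the existing practice of not merging VMAs whose vm_ops set ->close.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_vma;

/* Stand-in for vm_operations_struct, reduced to the ->name hook. */
struct sketch_vm_ops {
	const char *(*name)(struct sketch_vma *vma);
};

/* Stand-in for vm_area_struct, reduced to the vm_ops pointer. */
struct sketch_vma {
	const struct sketch_vm_ops *vm_ops;
};

static const char *sketch_mpx_name(struct sketch_vma *vma)
{
	(void)vma;
	return "[mpx]";
}

static const struct sketch_vm_ops sketch_mpx_vm_ops = {
	.name = sketch_mpx_name,
};

/* Pointer identity: no string comparison, no collision with other ->name users. */
static bool sketch_vma_is_mpx(const struct sketch_vma *vma)
{
	assert(vma != NULL);
	return vma->vm_ops == &sketch_mpx_vm_ops;
}

/* Conservative rule: a VMA whose vm_ops provide ->name never merges. */
static bool sketch_vma_can_merge(const struct sketch_vma *a,
				 const struct sketch_vma *b)
{
	bool a_named = a->vm_ops && a->vm_ops->name;
	bool b_named = b->vm_ops && b->vm_ops->name;

	return !a_named && !b_named;
}
```

Comparing the ops pointer rather than the returned string sidesteps the collision concern raised earlier in the thread, since two unrelated users could pick the same name but cannot share one ops table by accident.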



[PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-18 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX specific memory usage, this interface is added
to stick new vm_flag VM_MPX in the vma_area_struct when create a
bounds table or bounds directory.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 +++
 arch/x86/include/asm/mpx.h |   38 
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   58 
 4 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f..0194790 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET	28
+#define MPX_BD_ENTRY_SHIFT	3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET	17
+#define MPX_BT_ENTRY_SHIFT	5
+#define MPX_IGN_BITS		3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET	20
+#define MPX_BD_ENTRY_SHIFT	2
+#define MPX_BT_ENTRY_OFFSET	10
+#define MPX_BT_ENTRY_SHIFT	4
+#define MPX_IGN_BITS		2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..546c5d1
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,58 @@
+#include <linux/kernel.h>
+#include <linux/syscalls.h>
+#include <asm/mpx.h>
+#include <asm/mman.h>
+#include <linux/sched/sysctl.h>
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vma_area_struct
+ * when create a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. we verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Make bounds tables and bouds directory unlocked. */
+   if (vm_flags & VM_LOCKED)
+   vm_flags &= ~VM_LOCKED;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+
+out:
+   up_write(&mm->mmap_sem);
+   return ret;
+}
-- 
1.7.1


