Re: some cleanup of of uvm_map.c

2010-04-22 Thread Anton Maksimenkov
So, can I remind this simple cleanup?


2010/1/31 Anton Maksimenkov anton...@gmail.com:
 Here is some cleanup of uvm_map.c code.

 At first, in uvm_map_advice() function.
 Let it looks as simple as uvm_map_inherit() - no need to lock/unlock map
 just to realize that sanity check fail. Also no need to do it in every
loop.

 Second, remove redundant temp_entry variable from both functions.
 One entry variable is enough to do the job.

 $ cvs diff -Nup sys/uvm/uvm_map.c
 Index: sys/uvm/uvm_map.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_map.c,v
 retrieving revision 1.123
 diff -u -p -r1.123 uvm_map.c
 --- sys/uvm/uvm_map.c   28 Aug 2009 00:40:03 -  1.123
 +++ sys/uvm/uvm_map.c   31 Jan 2010 11:35:56 -
 @@ -2473,7 +2473,7 @@ int
  uvm_map_inherit(struct vm_map *map, vaddr_t start, vaddr_t end,
 vm_inherit_t new_inheritance)
  {
 -   struct vm_map_entry *entry, *temp_entry;
 +   struct vm_map_entry *entry;
UVMHIST_FUNC(uvm_map_inherit); UVMHIST_CALLED(maphist);
UVMHIST_LOG(maphist,(map=%p,start=0x%lx,end=0x%lx,new_inh=0x%lx),
map, start, end, new_inheritance);
 @@ -2492,11 +2492,10 @@ uvm_map_inherit(struct vm_map *map, vadd

VM_MAP_RANGE_CHECK(map, start, end);

 -   if (uvm_map_lookup_entry(map, start, temp_entry)) {
 -   entry = temp_entry;
 +   if (uvm_map_lookup_entry(map, start, entry)) {
UVM_MAP_CLIP_START(map, entry, start);
} else {
 -   entry = temp_entry-next;
 +   entry = entry-next;
}

while ((entry != map-header)  (entry-start  end)) {
 @@ -2519,18 +2518,29 @@ uvm_map_inherit(struct vm_map *map, vadd
  int
  uvm_map_advice(struct vm_map *map, vaddr_t start, vaddr_t end, int
new_advice)
  {
 -   struct vm_map_entry *entry, *temp_entry;
 +   struct vm_map_entry *entry;
UVMHIST_FUNC(uvm_map_advice); UVMHIST_CALLED(maphist);
UVMHIST_LOG(maphist,(map=%p,start=0x%lx,end=0x%lx,new_adv=0x%lx),
map, start, end, new_advice);

 +   switch (new_advice) {
 +   case MADV_NORMAL:
 +   case MADV_RANDOM:
 +   case MADV_SEQUENTIAL:
 +   /* nothing special here */
 +   break;
 +
 +   default:
 +   UVMHIST_LOG(maphist,- done (INVALID ARG),0,0,0,0);
 +   return (EINVAL);
 +   }
 +
vm_map_lock(map);
VM_MAP_RANGE_CHECK(map, start, end);
 -   if (uvm_map_lookup_entry(map, start, temp_entry)) {
 -   entry = temp_entry;
 +   if (uvm_map_lookup_entry(map, start, entry)) {
UVM_MAP_CLIP_START(map, entry, start);
} else {
 -   entry = temp_entry-next;
 +   entry = entry-next;
}

/*
 @@ -2539,19 +2549,6 @@ uvm_map_advice(struct vm_map *map, vaddr

while ((entry != map-header)  (entry-start  end)) {
UVM_MAP_CLIP_END(map, entry, end);
 -
 -   switch (new_advice) {
 -   case MADV_NORMAL:
 -   case MADV_RANDOM:
 -   case MADV_SEQUENTIAL:
 -   /* nothing special here */
 -   break;
 -
 -   default:
 -   vm_map_unlock(map);
 -   UVMHIST_LOG(maphist,- done (INVALID
ARG),0,0,0,0);
 -   return (EINVAL);
 -   }
entry-advice = new_advice;
entry = entry-next;
}

 --
 antonvm




--
antonvm



Re: some cleanup of of uvm_map.c

2010-04-22 Thread Ted Unangst
On Thu, Apr 22, 2010 at 11:56 PM, Anton Maksimenkov anton...@gmail.com wrote:
 So, can I remind this simple cleanup?

Thanks, applied, but next time watch the tabs vs spaces when mailing a diff.

 2010/1/31 Anton Maksimenkov anton...@gmail.com:
 Here is some cleanup of uvm_map.c code.



Re: some cleanup of of uvm_map.c

2010-02-04 Thread Ariane van der Steldt
On Thu, Feb 04, 2010 at 11:51:11AM +0500, Anton Maksimenkov wrote:
 2010/2/3 Ariane van der Steldt ari...@stack.nl:
  On Tue, Feb 02, 2010 at 01:21:55AM +0500, Anton Maksimenkov wrote:
  uvm_map_lookup_entry()... in uvm_map_findspace()...
 
  I'm pretty sure that function is bugged: I think it doesn't do
  wrap-around on the search for an entry. ?
 
  Both functions are rather complex. And both functions pre-date random
  allocation. All this map-hint business has no reason to remain, since
  random allocation is here to stay and the hint has a high failure rate
  since then (if it is used, which it usually isn't anyway).
 
  The RB_TREE code in uvm_map_findspace is a pretty good solution given
  the map entry struct. You will need to understand the RB_AUGMENT hack
  that is only used in this place of the kernel though.
 
 2010/2/3 Owain Ainsworth zer...@googlemail.com:
  On Wed, Feb 03, 2010 at 04:33:50PM +0100, Ariane van der Steldt wrote:
  I'm pretty sure that function is bugged...
  Yes. Henning has a bgpd core router where bgpd occasionally dies due to
 
 
 Let me introduce my idea.
 
 In fact, we have only two functions in uvm_map.c which make searching
 in maps. These are uvm_map_lookup_entry() and uvm_map_findspace().
 The uvm_map_lookup_entry() do search by address (VA), while
 uvm_map_findspace() do search by free space.

uvm_map_findspace() is also used when searching by addr:
mmap(..., MAP_FIXED) requires a search by addr.
uvm_map_findspace() should also be able to take an addr constraint: like
Oga said, i386 has a segmented view of memory wrt W^X.

 And we have a RB_HEAD(uvm_tree, vm_map_entry) rbhead, which is
 indexed by address (see uvm_compare()). Since that this RB_TREE
 provide a searching by address option to us. This is what the
 RB_TREE used for.
 But when we try to use that RB_TREE for track free space and to
 SEARCH by free space ? we do dirty and ugly things!
 Then we add a RB_AUGMENT hack (it is so ugly and hard to use right
 with others that you can't find it anywhere in source tree... exept
 the uvm_map.c) and other tricks...

Nah, RB_AUGMENT is easy. When an item in the tree is deleted, each node
in the tree that is altered (position or insertion/removal) has each of
its children and each of its parents RB_AUGMENTed. The RB_AUGMENT calls
are done in such a way that each node is process prior to
RB_PARENT(node) being processed.

 In the end we got very unclear
 uvm_map.c code in these places, which is very hard to understand and
 track, and which is overloaded and overcomplexed by that tricks and
 crutches.
 We must stop it, because it keeps hold our hands and brains.
 
 We can do things very clear and easy. Let me show how exactly.
 
 Let's just add second RB_TREE, which is indexed by space, let's call
 it RB_HEAD(uvm_tree_by_space, vm_map_entry) rbhead_by_space, for
 example. That uvm_tree_by_space will contain vm_map_entry'es sorted by
 free space. And that uvm_tree_by_space will be used only in
 uvm_map_findspace() to search for  vm_map_entry with needed free
 space. RB_NFIND() is our best friend here! Simple, reasonable fast,
 clearly and easy to understand.

I use that trick in uvm_pmemrange. It's a nice algorithm, easy to
implement. Keep in mind however: you'll need to support random
allocation, or the diff will be rejected.

I was actually thinking along the same lines: rb-tree of free space,
ordered by size.
Let: start = first rb-entry with enough space for allocation
Let: end = rb_max(free tree)
Let: chosen = random chunk in [start, end]

You'll want indices on the tree, so the random generator can be given a
number to use. (If the number is too large, you'll get bias on the
largest segments, if too small, you won't consider them.)

Once you have 'chosen', you need to generate a random address inside.

Let: offset = random number between 0 and (end-start)
Now you have a random address inside chosen:
chosen.start + offset.

Disadvantage of the algorithm is that it requires 2 random numbers (the
current algorithm only requires 1). Advantage is that this algorithm
always produces a valid address, so no forward searching is necessary
anymore.

The address may have to be shifted because pmap_prefer says so. If
that's the case, do so whenever possible: it removes aliasing problems
on some architectures (afaik pmap has a way of dealing with it, but
it's hideously expensive when it has to).

 Actually, since we can have two or more vm_map_entry'es with equal
 space, so each RB_TREE element will be the list of vm_map_entry'es
 with that equal space. We can use some additional logic here when we
 push/pop vm_map_entry from that list ? we can pop vm_map_entry with
 smallest start address, for example.

You can't simply take the lowest start address. It would make addresses
completely predictable.

 Then we must free the first RB_TREE (uvm_tree, indexed by address)
 from tracking space/ownspace and other nasty tricks, and stop using it
 in uvm_map_findspace(). This uvm_tree must 

Re: some cleanup of of uvm_map.c

2010-02-04 Thread Anton Maksimenkov
2010/2/3 Owain Ainsworth zer...@googlemail.com:
 ...you can't just go straight to the start of the map.
 Say i386 where W^X is done using segments. if you dump
 something right at the other end of the segment, that pretty much screws
 it. Perhaps going back to the min hint for the protection and trying to
 push just a little bit down may work?

2010/2/4 Ariane van der Steldt ari...@stack.nl:
 mmap(..., MAP_FIXED) requires a search by addr.
 uvm_map_findspace() should also be able to take an addr constraint:
 like Oga said, i386 has a segmented view of memory wrt W^X.
...
 One more problem you may face, is uvm_km_getpage starvation. If it
 happens, you'll be unable to allocate a vm_map_entry. In your design,
 you'll need 2 per allocation most of the time. If that's too much
 pressure, the kernel may starve itself and be unable to create new
 entries, becoming completely jammed. This is a problem on any
 non-pmap_direct architectures (like i386).
I remember about MAP_FIXED, just not mentioned it in my not-so-short message ;)
And I don't want 2 vm_map_entry per allocation, I only need to keep
each vm_map_entry in both trees. One vm_map_entry can contain 2
separate RB_TREE entries (for both trees), so each tree can work with
that vm_map_entry independantly.


Can anyone explain me what is the problem with i386 segments? Or
better supply some links to docs which explain it.
I don't understand what you mean - MAP_FIXED flag is the problem or it
used as workaround for some problem?
-- 
antonvm



Re: some cleanup of of uvm_map.c

2010-02-03 Thread Ariane van der Steldt
On Sun, Jan 31, 2010 at 05:43:02PM +0500, Anton Maksimenkov wrote:
 Here is some cleanup of uvm_map.c code.
 
 At first, in uvm_map_advice() function.
 Let it looks as simple as uvm_map_inherit() - no need to lock/unlock map
 just to realize that sanity check fail. Also no need to do it in every loop.

The functional change is that a wrong advice on an empty range will now
return EINVAL instead of 0. I actually like that.

 Second, remove redundant temp_entry variable from both functions.
 One entry variable is enough to do the job.

I agree temp_entry is a redundant variable in both functions. I think
the compiler may already optimize that away (otoh, we are talking gcc
here...)

 $ cvs diff -Nup sys/uvm/uvm_map.c
 Index: sys/uvm/uvm_map.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_map.c,v
 retrieving revision 1.123
 diff -u -p -r1.123 uvm_map.c
 --- sys/uvm/uvm_map.c   28 Aug 2009 00:40:03 -  1.123
 +++ sys/uvm/uvm_map.c   31 Jan 2010 11:35:56 -
 @@ -2473,7 +2473,7 @@ int
  uvm_map_inherit(struct vm_map *map, vaddr_t start, vaddr_t end,
  vm_inherit_t new_inheritance)
  {
 -   struct vm_map_entry *entry, *temp_entry;
 +   struct vm_map_entry *entry;
 UVMHIST_FUNC(uvm_map_inherit); UVMHIST_CALLED(maphist);
 UVMHIST_LOG(maphist,(map=%p,start=0x%lx,end=0x%lx,new_inh=0x%lx),
 map, start, end, new_inheritance);
 @@ -2492,11 +2492,10 @@ uvm_map_inherit(struct vm_map *map, vadd
 
 VM_MAP_RANGE_CHECK(map, start, end);
 
 -   if (uvm_map_lookup_entry(map, start, temp_entry)) {
 -   entry = temp_entry;
 +   if (uvm_map_lookup_entry(map, start, entry)) {
 UVM_MAP_CLIP_START(map, entry, start);
 } else {
 -   entry = temp_entry-next;
 +   entry = entry-next;
 }
 
 while ((entry != map-header)  (entry-start  end)) {
 @@ -2519,18 +2518,29 @@ uvm_map_inherit(struct vm_map *map, vadd
  int
  uvm_map_advice(struct vm_map *map, vaddr_t start, vaddr_t end, int 
 new_advice)
  {
 -   struct vm_map_entry *entry, *temp_entry;
 +   struct vm_map_entry *entry;
 UVMHIST_FUNC(uvm_map_advice); UVMHIST_CALLED(maphist);
 UVMHIST_LOG(maphist,(map=%p,start=0x%lx,end=0x%lx,new_adv=0x%lx),
 map, start, end, new_advice);
 
 +   switch (new_advice) {
 +   case MADV_NORMAL:
 +   case MADV_RANDOM:
 +   case MADV_SEQUENTIAL:
 +   /* nothing special here */
 +   break;
 +
 +   default:
 +   UVMHIST_LOG(maphist,- done (INVALID ARG),0,0,0,0);
 +   return (EINVAL);
 +   }
 +
 vm_map_lock(map);
 VM_MAP_RANGE_CHECK(map, start, end);
 -   if (uvm_map_lookup_entry(map, start, temp_entry)) {
 -   entry = temp_entry;
 +   if (uvm_map_lookup_entry(map, start, entry)) {
 UVM_MAP_CLIP_START(map, entry, start);
 } else {
 -   entry = temp_entry-next;
 +   entry = entry-next;
 }
 
 /*
 @@ -2539,19 +2549,6 @@ uvm_map_advice(struct vm_map *map, vaddr
 
 while ((entry != map-header)  (entry-start  end)) {
 UVM_MAP_CLIP_END(map, entry, end);
 -
 -   switch (new_advice) {
 -   case MADV_NORMAL:
 -   case MADV_RANDOM:
 -   case MADV_SEQUENTIAL:
 -   /* nothing special here */
 -   break;
 -
 -   default:
 -   vm_map_unlock(map);
 -   UVMHIST_LOG(maphist,- done (INVALID ARG),0,0,0,0);
 -   return (EINVAL);
 -   }
 entry-advice = new_advice;
 entry = entry-next;
 }

Ok with me.
-- 
Ariane



Re: some cleanup of of uvm_map.c

2010-02-03 Thread Ariane van der Steldt
On Tue, Feb 02, 2010 at 01:21:55AM +0500, Anton Maksimenkov wrote:
 2010/2/2 Anton Maksimenkov anton...@gmail.com:
  Overcomplicated using of RB_TREE in uvm_map_lookup_entry() making it
  hard to understand.
  I thinking about drop this RB_TREE out from uvm_map.c or completely
  rewrite using of this RB_TREE.
 
 Oh, sorry, I mean in uvm_map_findspace() of course.

I'm pretty sure that function is bugged: I think it doesn't do
wrap-around on the search for an entry. (Should be easily provable, just
hint=vm_map_max(map) when entering uvm_map_findspace().)
You won't notice in kernel space, since every caller will set
hint=vm_map_min(map), but I suspect a number of failed allocations in
userspace are actually caused by that.

 In uvm_map_lookup_entry() the use of RB_TREE is understandable.

Both functions are rather complex. And both functions pre-date random
allocation. All this map-hint business has no reason to remain, since
random allocation is here to stay and the hint has a high failure rate
since then (if it is used, which it usually isn't anyway).

The RB_TREE code in uvm_map_findspace is a pretty good solution given
the map entry struct. You will need to understand the RB_AUGMENT hack
that is only used in this place of the kernel though.

Ciao,
-- 
Ariane



Re: some cleanup of of uvm_map.c

2010-02-03 Thread Anton Maksimenkov
2010/2/3 Ariane van der Steldt ari...@stack.nl:
 On Tue, Feb 02, 2010 at 01:21:55AM +0500, Anton Maksimenkov wrote:
 uvm_map_lookup_entry()... in uvm_map_findspace()...

 I'm pretty sure that function is bugged: I think it doesn't do
 wrap-around on the search for an entry. 

 Both functions are rather complex. And both functions pre-date random
 allocation. All this map-hint business has no reason to remain, since
 random allocation is here to stay and the hint has a high failure rate
 since then (if it is used, which it usually isn't anyway).

 The RB_TREE code in uvm_map_findspace is a pretty good solution given
 the map entry struct. You will need to understand the RB_AUGMENT hack
 that is only used in this place of the kernel though.

2010/2/3 Owain Ainsworth zer...@googlemail.com:
 On Wed, Feb 03, 2010 at 04:33:50PM +0100, Ariane van der Steldt wrote:
 I'm pretty sure that function is bugged...
 Yes. Henning has a bgpd core router where bgpd occasionally dies due to


Let me introduce my idea.

In fact, we have only two functions in uvm_map.c which make searching
in maps. These are uvm_map_lookup_entry() and uvm_map_findspace().
The uvm_map_lookup_entry() do search by address (VA), while
uvm_map_findspace() do search by free space.

And we have a RB_HEAD(uvm_tree, vm_map_entry) rbhead, which is
indexed by address (see uvm_compare()). Since that this RB_TREE
provide a searching by address option to us. This is what the
RB_TREE used for.
But when we try to use that RB_TREE for track free space and to
SEARCH by free space  we do dirty and ugly things!
Then we add a RB_AUGMENT hack (it is so ugly and hard to use right
with others that you can't find it anywhere in source tree... exept
the uvm_map.c) and other tricks... In the end we got very unclear
uvm_map.c code in these places, which is very hard to understand and
track, and which is overloaded and overcomplexed by that tricks and
crutches.
We must stop it, because it keeps hold our hands and brains.

We can do things very clear and easy. Let me show how exactly.

Let's just add second RB_TREE, which is indexed by space, let's call
it RB_HEAD(uvm_tree_by_space, vm_map_entry) rbhead_by_space, for
example. That uvm_tree_by_space will contain vm_map_entry'es sorted by
free space. And that uvm_tree_by_space will be used only in
uvm_map_findspace() to search for  vm_map_entry with needed free
space. RB_NFIND() is our best friend here! Simple, reasonable fast,
clearly and easy to understand.
Actually, since we can have two or more vm_map_entry'es with equal
space, so each RB_TREE element will be the list of vm_map_entry'es
with that equal space. We can use some additional logic here when we
push/pop vm_map_entry from that list  we can pop vm_map_entry with
smallest start address, for example.

Then we must free the first RB_TREE (uvm_tree, indexed by address)
from tracking space/ownspace and other nasty tricks, and stop using it
in uvm_map_findspace(). This uvm_tree must be used only in
uvm_map_lookup_entry(). Again, RB_NFIND() is our best friend here (of
course, in couple with RB_LEFT and other logic, etc). And again,
simple, reasonable fast, clearly and easy to understand!
And we can (and must) COMPLETELY STOP using RB_AUGMENT hack and forget
about it as a bad dream, at last.

If we decide that using of RB_TREE at each lookup/findspace is heavy
when there are little of vm_map_entry'es in map, we can just use
linear search in this case. And use RB_FIND/NFIND only when number of
vm_map_entry'es will be bigger than some number.

We can forget about hint and probably drop it's usage here in
uvm_map.c (I'll review it later). RB_FIND/RB_NFIND will do their best,
especially with random allocation, especially when there will be
many vm_map_entry'es in the map.

So in the end.
I have some pieces of patch in my mind, and I'll create diff with
ideas introduced here.
If you have any positives and negatives so please, say it here.
--
antonvm



Re: some cleanup of of uvm_map.c

2010-02-01 Thread Bob Beck
On 31 January 2010 05:43, Anton Maksimenkov anton...@gmail.com wrote:
 Here is some cleanup of uvm_map.c code.

 Second, remove redundant temp_entry variable from both functions.
 One entry variable is enough to do the job.


I don't think you want to do that.. uvm_map_lookup_entry can screw
with that pointer even when it fails.

bad juju



Re: some cleanup of of uvm_map.c

2010-02-01 Thread Anton Maksimenkov
2010/2/1 Bob Beck b...@ualberta.ca:
 On 31 January 2010 05:43, Anton Maksimenkov anton...@gmail.com wrote:
 Second, remove redundant temp_entry variable from both functions.
 One entry variable is enough to do the job.
 I don't think you want to do that.. uvm_map_lookup_entry can screw
 with that pointer even when it fails.

Sorry, I don't clearly understand what you mean. What is bad?
I think that screw with that pointer even when it fails is exactly
what uvm_map_lookup_entry must do and what we need here.

uvm_map_protect() exploit exactly same trick. And uvm_map_extract() do the same:
...
if (uvm_map_lookup_entry(srcmap, start, entry)) {
...do some work with entry...
} else {
/* start is not within an entry ... skip to next entry */
if (flags  UVM_EXTRACT_CONTIG) {
error = EINVAL;
goto bad;/* definite hole here ... */
}
entry = entry-next;
fudge = 0;
}

AFAIS, uvm_map_lookup_entry() set the entry either to the
vm_map_entry which contain start address (success)
or to the last vm_map_entry in the map (fail).
If it fail and set entry to the last vm_map_entry then entry =
entry-next move entry to the map-header.
So loop with
 while ((entry != map-header)  (entry-start  end)) {
will never be executed (it is right thing).

Overcomplicated using of RB_TREE in uvm_map_lookup_entry() making it
hard to understand.
I thinking about drop this RB_TREE out from uvm_map.c or completely
rewrite using of this RB_TREE.
-- 
antonvm