Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Meador Inge
On 06/12/2012 09:08 AM, Richard Henderson wrote:

 I think this is one of those cases where the -B or -R options
 (or QEMU_GUEST_BASE and QEMU_RESERVED_VA env variables) are the best
 way forward for whatever cpu you're emulating.  That or a change to
 the target's default ld script, not to link real executables quite so 
 low in the address space.

Per Richard's recommendation I experimented with -R for my use cases.  It seems
to mostly work, but for ARM GNU/Linux there is an issue that makes it awkward
to work with.

In particular, this commit [1] added validation for the guest base as a way to
ensure that the kernel-provided user mode helper functions on ARM can be mapped.
The validation function is invoked by 'probe_guest_base', but also in
main.c:3456 whenever -R or -B is used:

    if (reserved_va || have_guest_base) {
        if (!guest_validate_base(guest_base)) {
            fprintf(stderr, "Guest base/Reserved VA rejected by guest code\n");
            exit(1);
        }
    }

Thus we might be able to allocate the reserved VA region, but it might fail the
validation and exit.  I had this actually happen on many test cases when testing
'-R 128M' with portions of the GCC testsuite.

To solve this issue I experimented with performing a similar probing in 'main'
as in 'probe_guest_base' so that we can find a reserved VA region that also
passes validation.  If a region isn't found that can be validated, then QEMU
gives up.  Does this approach seem reasonable?
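
A minimal sketch of the probing idea described above, written in C and independent of QEMU. The names `reserve_validated`, `validate_fn`, `accept_all`, and `reject_all` are hypothetical; the callback only stands in for `guest_validate_base`, and advancing the hint one host page at a time is an assumption about how the search might proceed.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for guest_validate_base(). */
typedef bool (*validate_fn)(uintptr_t base);

/* Trivial validators, used only to exercise the loop. */
static bool accept_all(uintptr_t base) { (void)base; return true; }
static bool reject_all(uintptr_t base) { (void)base; return false; }

/*
 * Reserve `size` bytes of address space, starting the search at `hint` and
 * moving the hint up one page after each rejected candidate.  Returns the
 * accepted base, or 0 if mmap fails or `max_tries` candidates are rejected.
 */
static uintptr_t reserve_validated(uintptr_t hint, size_t size,
                                   validate_fn validate, unsigned max_tries)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    for (unsigned i = 0; i < max_tries; i++, hint += page) {
        void *p = mmap((void *)hint, size, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) {
            return 0;               /* the host cannot provide the space */
        }
        if (validate((uintptr_t)p)) {
            return (uintptr_t)p;    /* reserved and accepted */
        }
        munmap(p, size);            /* rejected: release and try again */
    }
    return 0;
}
```

Bounding the number of attempts keeps the search finite even when the validator can never succeed, which corresponds to the "QEMU gives up" case.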


[1]
http://git.qemu.org/?p=qemu.git;a=commit;h=97cc75606aef406e90a243cdb25347039003e7f0

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Richard Henderson
On 06/27/2012 08:51 AM, Meador Inge wrote:
 To solve this issue I experimented with performing a similar probing in 'main'
 as in 'probe_guest_base' so that we can find a reserved VA region that also
 passes validation.  If a region isn't found that can be validated, then QEMU
 gives up.  Does this approach seem reasonable?

I guess so, depending on how you adjust the hint each time.

I do wonder if it wouldn't be better to rearrange things such that
for 64-bit hosts and 32-bit guests we *always* reserve 4G so that
there's zero possibility of the guest stomping on host memory.  That
would also solve your problem.


r~



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Peter Maydell
On 27 June 2012 18:32, Richard Henderson r...@twiddle.net wrote:
 I do wonder if it wouldn't be better to rearrange things such that
 for 64-bit hosts and 32-bit guests we *always* reserve 4G so that
 there's zero possibility of the guest stomping on host memory.  That
 would also solve your problem.

We already almost do that;

#if (TARGET_LONG_BITS == 32) && (HOST_LONG_BITS == 64)
/*
 * When running 32-on-64 we should make sure we can fit all of the possible
 * guest address space into a contiguous chunk of virtual host memory.
 *
 * This way we will never overlap with our own libraries or binaries or stack
 * or anything else that QEMU maps.
 */
unsigned long reserved_va = 0xf7000000;
#else
unsigned long reserved_va;
#endif
#endif

The only reason this isn't asking for the full 4GB is that pesky
ARM commpage, and (as you hint) the right way to fix this is to
make the commpage cope OK with being inside the reserved region
as well as outside it, and then we could make that reserved_va
value actually be 4GB.
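
To make the arithmetic concrete, here is a small C sketch. It assumes the ARM kernel helper page ("commpage") sits at guest address 0xffff0000, which is where the Linux kernel documents it; the macro and function names are illustrative and not taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESERVED_VA_DEFAULT 0xf7000000ull   /* default from the snippet above */
#define RESERVED_VA_FULL    (1ull << 32)    /* the full 4GB */
#define ARM_COMMPAGE_ADDR   0xffff0000u     /* assumption: ARM kernel helper page */

/* True if a guest address falls inside a reservation that starts at guest
 * address 0 and is reserved_va bytes long. */
static bool guest_addr_in_reserved(uint64_t reserved_va, uint32_t guest_addr)
{
    return (uint64_t)guest_addr < reserved_va;
}
```

With the default value the commpage address lies above the reserved region; with a full 4GB reservation it would lie inside it, which is the case the commpage code would have to cope with.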

-- PMM



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Andreas Färber
Am 27.06.2012 19:32, schrieb Richard Henderson:
 On 06/27/2012 08:51 AM, Meador Inge wrote:
 To solve this issue I experimented with performing a similar probing in 
 'main'
 as in 'probe_guest_base' so that we can find a reserved VA region that also
 passes validation.  If a region isn't found that can be validated, then QEMU
 gives up.  Does this approach seem reasonable?
 
 I guess so, depending on how you adjust the hint each time.
 
 I do wonder if it wouldn't be better to rearrange things such that
 for 64-bit hosts and 32-bit guests we *always* reserve 4G so that
 there's zero possibility of the guest stomping on host memory.  That
 would also solve your problem.

openSUSE uses a version patched so that IIUC 3G are reserved.
Just today this failed on a system where swap got disabled and the
mmap() thus failed.

Alex suggested an algorithm that starts at 3G (4G, whatever) and when
that fails probes lower limits until it succeeds.
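
A rough C sketch of such a descending probe, under the assumption that "lower limits" means retrying with a smaller reservation size. The names `reserve_descending` and `struct reservation`, and the fixed step size, are hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

struct reservation {
    void *base;     /* NULL if nothing could be reserved */
    size_t size;    /* number of bytes actually reserved */
};

/*
 * Try to reserve start_size bytes; on failure shrink the request by `step`
 * and retry, never going below min_size.
 */
static struct reservation reserve_descending(size_t start_size,
                                             size_t min_size, size_t step)
{
    struct reservation r = { NULL, 0 };
    size_t size = start_size;

    assert(step > 0 && min_size > 0 && start_size >= min_size);
    for (;;) {
        void *p = mmap(NULL, size, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p != MAP_FAILED) {
            r.base = p;
            r.size = size;
            return r;
        }
        if (size - min_size < step) {
            return r;       /* the next step would drop below min_size */
        }
        size -= step;
    }
}
```

A caller might start at 3G or 4G with a step of, say, 256M; the trade-off is that a smaller reservation leaves part of the guest address space unprotected from host mappings.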

Either way, it's not a new problem, and each solution so far has had
other drawbacks... cc'ing the relevant folks.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Richard Henderson
On 06/27/2012 10:53 AM, Andreas Färber wrote:
 openSUSE uses a version patched so that IIUC 3G are reserved.
 Just today this failed on a system where swap got disabled and the
 mmap() thus failed.

Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...


r~



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Andreas Färber
Am 27.06.2012 20:36, schrieb Richard Henderson:
 On 06/27/2012 10:53 AM, Andreas Färber wrote:
 openSUSE uses a version patched so that IIUC 3G are reserved.
 Just today this failed on a system where swap got disabled and the
 mmap() thus failed.
 
 Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...

Wasn't my system... Adrian?

/-F

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Paul Brook
  openSUSE uses a version patched so that IIUC 3G are reserved.
  Just today this failed on a system where swap got disabled and the
  mmap() thus failed.
 
 Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...

I can't say if it's the same cause, but we fail with ulimit -v 4046848.
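
One observation on that number, offered as a possible reading rather than a confirmed diagnosis: assuming `ulimit -v` takes KiB (as in bash) and sets the address-space limit (`RLIMIT_AS`), which counts `PROT_NONE` reservations as well, 4046848 KiB is exactly 0xf7000000 bytes. The C sketch below only checks that arithmetic; the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESERVED_VA 0xf7000000ull   /* default 32-on-64 reservation size */

/* Convert a `ulimit -v` value (assumed to be in KiB) to bytes. */
static uint64_t ulimit_v_to_bytes(uint64_t kib)
{
    return kib * 1024u;
}

/* True if a new mapping of request_bytes still fits under the limit, given
 * in_use_bytes of address space that is already mapped. */
static bool reservation_fits(uint64_t limit_bytes, uint64_t in_use_bytes,
                             uint64_t request_bytes)
{
    return in_use_bytes <= limit_bytes &&
           request_bytes <= limit_bytes - in_use_bytes;
}
```

Under those assumptions, a process that already has even one page mapped cannot add a 0xf7000000-byte reservation below that limit, regardless of `MAP_NORESERVE`.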

Incidentally, it seems strange that we only reserve 0xf7000000 bytes, not
the full 4G.

Paul



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Alexander Graf

On 28.06.2012, at 02:06, Paul Brook wrote:

 openSUSE uses a version patched so that IIUC 3G are reserved.
 Just today this failed on a system where swap got disabled and the
 mmap() thus failed.
 
 Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...
 
 I can't say if it's the same cause, but we fail with ulimit -v 4046848.
 
 Incidentally, it seems strange that we only reserve 0xf7000000 bytes, not
 the full 4G.

Uh, I think that was because of the vdso shared page that is allocated on top 
of -R.

Either way, this whole approach only works for 32-on-64. For 64-on-64, we can't 
reserve enough virtual memory on the host to satisfy the guest process for all 
archs.


Alex




Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Meador Inge
On 06/27/2012 12:32 PM, Richard Henderson wrote:

 On 06/27/2012 08:51 AM, Meador Inge wrote:
 To solve this issue I experimented with performing a similar probing in 
 'main'
 as in 'probe_guest_base' so that we can find a reserved VA region that also
 passes validation.  If a region isn't found that can be validated, then QEMU
 gives up.  Does this approach seem reasonable?
 
 I guess so, depending on how you adjust the hint each time.

What I am currently experimenting with is essentially the same as what is in
'probe_guest_base'.  So something like (not an actual patch submission, just
listing this here for discussion):

Index: linux-user/main.c
===================================================================
--- linux-user/main.c   (revision 376549)
+++ linux-user/main.c   (working copy)
@@ -3486,35 +3486,53 @@ int main(int argc, char **argv, char **e
     guest_base = HOST_PAGE_ALIGN(guest_base);

     if (reserved_va) {
-        void *p;
+        unsigned long host_start, real_start, first_start, host_size;
         int flags;

         flags = MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE;
         if (have_guest_base) {
             flags |= MAP_FIXED;
         }
-        p = mmap((void *)guest_base, reserved_va, PROT_NONE, flags, -1, 0);
-        if (p == MAP_FAILED) {
-            fprintf(stderr, "Unable to reserve guest address space\n");
-            exit(1);
-        }
-        guest_base = (unsigned long)p;
-        /* Make sure the address is properly aligned.  */
-        if (guest_base & ~qemu_host_page_mask) {
-            munmap(p, reserved_va);
-            p = mmap((void *)guest_base, reserved_va + qemu_host_page_size,
-                     PROT_NONE, flags, -1, 0);
-            if (p == MAP_FAILED) {
+
+        first_start = host_start = HOST_PAGE_ALIGN(guest_base);
+        while (1) {
+            host_size = reserved_va;
+            real_start = (unsigned long) mmap((void *)host_start, host_size,
+                                              PROT_NONE, flags, -1, 0);
+            if (real_start == (unsigned long)-1) {
+                fprintf(stderr, "Unable to reserve guest address space\n");
+                exit(1);
+            }
+            guest_base = host_start;
+            /* Make sure the address is properly aligned.  */
+            if (guest_base & ~qemu_host_page_mask) {
+                munmap((void*)real_start, host_size);
+                host_size += qemu_host_page_size;
+                real_start = (unsigned long) mmap((void *)guest_base,
+                                                  host_size,
+                                                  PROT_NONE, flags, -1, 0);
+                if (real_start == (unsigned long)-1) {
+                    fprintf(stderr, "Unable to reserve guest address space\n");
+                    exit(1);
+                }
+                guest_base = HOST_PAGE_ALIGN(real_start);
+            }
+
+            if (guest_validate_base(guest_base))
+                break;
+
+            munmap((void *)real_start, host_size);
+            host_start += qemu_host_page_size;
+            if (host_start == first_start) {
                 fprintf(stderr, "Unable to reserve guest address space\n");
                 exit(1);
             }
-            guest_base = HOST_PAGE_ALIGN((unsigned long)p);
         }
         qemu_log("Reserved 0x%lx bytes of guest address space\n", reserved_va);
         mmap_next_start = reserved_va;
     }

-    if (reserved_va || have_guest_base) {
+    if (have_guest_base) {
         if (!guest_validate_base(guest_base)) {
             fprintf(stderr, "Guest base/Reserved VA rejected by guest code\n");
             exit(1);

 I do wonder if it wouldn't be better to rearrange things such that
 for 64-bit hosts and 32-bit guests we *always* reserve 4G so that
 there's zero possibility of the guest stomping on host memory.  That
 would also solve your problem.

I am seeing problems with 32-on-32 where the ARM commpage check wraps around
(incidentally I also ran into problems with -B because for some values of
guest_base it is easy for guest_base >= min_mmap_addr and guest_base +
kernel_helper_addr < min_mmap_addr to hold).
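
The wraparound can be illustrated with plain 32-bit arithmetic. This is a sketch only: the helper address 0xffff0000 and the 64 KiB minimum mapping address used in the example values are assumptions, and `commpage_wraps_32` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * On a 32-bit host, the host address of the helper page is
 * guest_base + helper_addr computed modulo 2^32.  Returns true when
 * guest_base itself is acceptable but the sum wraps around to an address
 * below min_addr.
 */
static bool commpage_wraps_32(uint32_t guest_base, uint32_t helper_addr,
                              uint32_t min_addr)
{
    uint32_t host_addr = guest_base + helper_addr;  /* unsigned wraparound */

    return guest_base >= min_addr && host_addr < min_addr;
}
```

For example, with guest_base = 0x00010000 and a helper address of 0xffff0000, the sum wraps to 0, which is below a 64 KiB minimum; with guest_base = 0x00100000 the sum wraps to 0x000f0000, which is not.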

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Paul Brook
 On 28.06.2012, at 02:06, Paul Brook wrote:
  openSUSE uses a version patched so that IIUC 3G are reserved.
  Just today this failed on a system where swap got disabled and the
  mmap() thus failed.
  
  Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...
  
  I can't say if it's the same cause, but we fail with ulimit -v 4046848.
  
  Incidentally, it seems strange that we only reserve 0xf7000000 bytes,
  not the full 4G.
 
 Uh, I think that was because of the vdso shared page that is allocated on
 top of -R.

That can't be right.  The whole point of -R is that it defines all the guest 
accessible virtual address space.  The surrounding space is liable to be used 
by something else, and we must not make any assumptions about it.

Further inspection shows that guest_validate_base contains some extremely 
bogus code.

If the guest needs something at the top of its address space then we need to 
offset address zero within the block, and ensure accesses wrap appropriately.
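
One way to read that suggestion, sketched in C for a 32-bit guest. This is an interpretation and not code from QEMU; `guest_to_host` and `zero_offset` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate a 32-bit guest address into a host address inside a single
 * reserved block.  Guest address zero sits zero_offset bytes into the block,
 * and the addition wraps modulo 2^32, so guest addresses at the very top of
 * the address space land at the start of the block.
 */
static uint64_t guest_to_host(uint64_t block_base, uint32_t zero_offset,
                              uint32_t guest_addr)
{
    uint32_t wrapped = guest_addr + zero_offset;    /* unsigned wraparound */

    return block_base + wrapped;
}
```

With zero_offset = 0x10000, guest address 0xffff0000 maps to the first byte of the block and guest address 0 to block_base + 0x10000, so a block of reserved_va + 0x10000 bytes would cover both the ordinary range and the top 64 KiB.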

Paul



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Meador Inge
On 06/27/2012 07:47 PM, Paul Brook wrote:

 On 28.06.2012, at 02:06, Paul Brook wrote:
 openSUSE uses a version patched so that IIUC 3G are reserved.
 Just today this failed on a system where swap got disabled and the
 mmap() thus failed.

 Err... why?  We map with MAP_NORESERVE, so swap shouldn't matter...

 I can't say if it's the same cause, but we fail with ulimit -v 4046848.

 Incidentally, it seems strange that we only reserve 0xf7000000 bytes,
 not the full 4G.

 Uh, I think that was because of the vdso shared page that is allocated on
 top of -R.
 
 That can't be right.  The whole point of -R is that it defines all the guest 
 accessible virtual address space.  The surrounding space is liable to be used 
 by something else, and we must not make any assumptions about it.
 
 Further inspection shows that guest_validate_base contains some extremely 
 bogus code.
 
 If the guest needs something at the top of its address space then we need to 
 offset address zero within the block, and ensure accesses wrap appropriately.

'guest_validate_base' is currently called for three reasons: (1) in main.c
when using -B, (2) in main.c when using -R after mapping the reserved VA
region, and (3) when probing for a guest base in probe_guest_base.

For case (1) I suppose things are pretty much the same -- we just need to map
the extra region when needed (e.g. for the ARM kernel helpers).

For case (2) maybe we can do a probing similar to what I mentioned here [1],
but taking into account what you stated above and ensuring that the probing
finds a single region for the requested VA region size and any needed extra stuff.

Case (3) is mostly the same as (2) but we are probing for a guest base with a
region size deduced from looking at the image we are loading.  I suppose it is
still OK to map two regions here.  The single region only applies to -R?

Thoughts?

[1] http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg04589.html

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-27 Thread Paul Brook
 'guest_validate_base' is currently called for three reasons: (1) in main.c
 when using -B, (2) in main.c when using -R after mapping the reserved VA
 region, and (3) when probing for a guest base in probe_guest_base.
 
 For case (1) I suppose things are pretty much the same -- we just need to
 map the extra region when needed (e.g. for the ARM kernel helpers).

Yes.
 
 For case (2) maybe we can do a probing similar to what I mentioned here
 [1], but taking into account what you stated above and ensuring that the
 probing finds a single region for the requested VA region size and any
 needed extra stuff.

Something like that, yes. I suspect there are better ways to implement it 
though.  In principle your patch is making (2) a variant of (3). Instead of 
probing for the segments covered by the image we probe for the reserved 
regions (e.g. for ARM [0-reserved_va, 0xffff0000 - 0xffffffff]).  A good 
implementation should automagically DTRT for both 32-bit and 64-bit hosts.

 Case (3) is mostly the same as (2) but we are probing for a guest base with
 a region size deduced from looking at the image we are loading.  I suppose
 it is still OK to map two regions here.  The single region only applies to
 -R?

I'd say (3) is more similar to (1).  There's no fundamental reason why -R has 
to allocate a single block.  In all cases we should be checking the same thing 
- are the addresses we need available on the host?  Having different code 
paths calling guest_validate_base, etc. for different reasons makes me think 
we're doing it wrong :-)
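
A rough sketch of what a single "are the addresses we need available on the host?" check could look like, in C. It is only an illustration under stated assumptions: `struct guest_range`, `claim_ranges`, `find_free_span`, and `demo_ranges` are hypothetical names, and it probes by mapping each range `PROT_NONE` at the wanted address and comparing the result instead of using `MAP_FIXED`.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define RESERVE_FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE)

struct guest_range {
    uintptr_t start;    /* guest address of the first byte */
    size_t len;         /* length in bytes */
};

/*
 * Try to reserve every guest range at guest_base + start.  Succeeds only if
 * the host hands back exactly the wanted address for each range; otherwise
 * everything claimed so far is released and false is returned.
 */
static bool claim_ranges(uintptr_t guest_base,
                         const struct guest_range *r, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        void *want = (void *)(guest_base + r[i].start);
        void *got = mmap(want, r[i].len, PROT_NONE, RESERVE_FLAGS, -1, 0);

        if (got == MAP_FAILED) {
            break;
        }
        if (got != want) {          /* the wanted address was not free */
            munmap(got, r[i].len);
            break;
        }
    }
    if (i == n) {
        return true;
    }
    while (i-- > 0) {               /* roll back ranges 0 .. i-1 */
        munmap((void *)(guest_base + r[i].start), r[i].len);
    }
    return false;
}

/* Demo helper: find a span that is currently free by mapping and unmapping it. */
static uintptr_t find_free_span(size_t span)
{
    void *p = mmap(NULL, span, PROT_NONE, RESERVE_FLAGS, -1, 0);

    if (p == MAP_FAILED) {
        return 0;
    }
    munmap(p, span);
    return (uintptr_t)p;
}

/* Two disjoint example ranges, loosely modelled on "image plus extra page". */
static const struct guest_range demo_ranges[] = {
    { 0x000000, 0x10000 },
    { 0x100000, 0x10000 },
};
```

The -B, -R, and probing cases would then differ only in which ranges they pass in and in how they choose candidate values of guest_base.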

Paul



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-12 Thread Richard Henderson
On 2012-06-07 13:59, Meador Inge wrote:
      load_addr = loaddr;
      if (ehdr->e_type == ET_DYN) {
 +        if (loaddr < mmap_min_addr)
 +            probe_guest_base(image_name, loaddr, hiaddr);

This doesn't make any sense.  loaddr is almost certainly 0, unless
you've pre-linked the ld.so image.  But the next statement is letting
the system pick the address at which the image will be loaded.

What you're actually wanting is to probe the address ranges of the
real program, which since this is essentially a program running a
program is not visible to us at all.

I think this is one of those cases where the -B or -R options
(or QEMU_GUEST_BASE and QEMU_RESERVED_VA env variables) are the best
way forward for whatever cpu you're emulating.  That or a change to
the target's default ld script, not to link real executables quite so 
low in the address space.


r~



Re: [Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-12 Thread Meador Inge
On 06/12/2012 09:08 AM, Richard Henderson wrote:

 On 2012-06-07 13:59, Meador Inge wrote:
      load_addr = loaddr;
      if (ehdr->e_type == ET_DYN) {
 +        if (loaddr < mmap_min_addr)
 +            probe_guest_base(image_name, loaddr, hiaddr);
 
 This doesn't make any sense.  loaddr is almost certainly 0, unless
 you've pre-linked the ld.so image.  But the next statement is letting
 the system pick the address at which the image will be loaded.

It usually is.  I just want guest_base to be computed to something that
will work for cases where a fixed address image is later loaded (at which
point it is too late to compute the guest_base).  Always probing is one way I
found to do that, but as I originally said I don't know this code very well so
maybe that is not a good method.

 I think this is one of those cases where the -B or -R options
 (or QEMU_GUEST_BASE and QEMU_RESERVED_VA env variables) are the best
 way forward for whatever cpu you're emulating.  That or a change to
 the target's default ld script, not to link real executables quite so 
 low in the address space.

Hmmm, OK.  I was really hoping to have something more automatic.  Perhaps
I will have to use the options.

Thanks for the review.

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software



[Qemu-devel] [RFC PATCH 1/1] linux-user: Probe the guest base for shared objects when needed

2012-06-07 Thread Meador Inge
In some cases when running a shared library directly from QEMU
(e.g. ld.so) the guest base should still be probed so that
any images loaded later at fixed addresses by the target code
can still be mapped.

Signed-off-by: Meador Inge mead...@codesourcery.com
---
 linux-user/elfload.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index f3b1552..c71c287 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1443,6 +1443,7 @@ static void probe_guest_base(const char *image_name,
                 goto exit_errmsg;
             }
         }
+        have_guest_base = 1;
         qemu_log("Relocating guest address space from 0x"
                  TARGET_ABI_FMT_lx " to 0x%lx\n",
                  loaddr, real_start);
@@ -1528,6 +1529,8 @@ static void load_elf_image(const char *image_name, int image_fd,
 
     load_addr = loaddr;
     if (ehdr->e_type == ET_DYN) {
+        if (loaddr < mmap_min_addr)
+            probe_guest_base(image_name, loaddr, hiaddr);
         /* The image indicates that it can be loaded anywhere.  Find a
            location that can hold the memory space required.  If the
            image is pre-linked, LOADDR will be non-zero.  Since we do
-- 
1.7.7.6