Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-20 Thread Peter Maydell
On 20 March 2018 at 18:49, Luke Shumaker  wrote:
> On Fri, 02 Mar 2018 09:13:12 -0500,
> Peter Maydell wrote:
>> On 28 December 2017 at 18:08, Luke Shumaker  wrote:
>> > +guest_full_size =
>> > +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
> ^
>> I think this is probably more clearly written as 0x1ULL,
>> since rounding down to the host-page-size then adding the host-page-size
>> gets us the full 32-bit size of the guest address space.
>
> Wait, is that right?  Isn't that only true if qemu_host_page_size is
> at least 8KiB (16 bits), enough to fill the zero in the middle?  Won't
> a typical qemu_host_page_size be only 4KiB?

Yeah, I just got the arithmetic wrong here because I had it
stuck in my head that the commpage was the top page of the
guest address space.

-- PMM



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-20 Thread Laurent Vivier
Le 20/03/2018 à 19:49, Luke Shumaker a écrit :
> On Fri, 02 Mar 2018 09:13:12 -0500,
> Peter Maydell wrote:
>> On 28 December 2017 at 18:08, Luke Shumaker  wrote:
>>> +guest_full_size =
>>> +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
> ^
>> I think this is probably more clearly written as 0x1ULL,
>> since rounding down to the host-page-size then adding the host-page-size
>> gets us the full 32-bit size of the guest address space.
> 
> Wait, is that right?  Isn't that only true if qemu_host_page_size is
> at least 8KiB (16 bits), enough to fill the zero in the middle?  Won't
> a typical qemu_host_page_size be only 4KiB?
> 
>> That shows up that there's a potential problem here if the host
>> is 32-bit, because in that case guest_full_size (being only unsigned
>> long) will be 0, and we'll end up trying an mmap with an incorrect size.
>>
>>> +host_full_size = guest_full_size - guest_start;
>>> +real_start = (unsigned long)
>>> +mmap(NULL, host_full_size, PROT_NONE, flags, -1, 0);
>>
>> I think the general approach is right, though. Sorry it took so long
>> for us to get to reviewing this patchset.
> 
> It's all good.  I'm amazed at the amount of traffic qemu-devel gets!
> 
>> Incidentally, this code would be rather less complicated if it didn't
>> have to account for qemu_host_page_size not actually being the host
>> page size (since then you couldn't get a return from mmap() that wasn't
>> aligned properly). Does anybody know why we allow the user to specify
>> it on the command line? (git revision history doesn't help, it just says
>> there's been a -pagesize argument since commit 54936004fddc5 in 2003,
>> right back when mmap emulation was first added...)
> 
> I have no idea, I just assumed that it was a feature useful to people
> far smarter than me.
> 

I'm going to add this patch in my upcoming linux-user pull-request
(currently running regression tests).

Thanks,
Laurent



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-20 Thread Luke Shumaker
On Fri, 02 Mar 2018 09:13:12 -0500,
Peter Maydell wrote:
> On 28 December 2017 at 18:08, Luke Shumaker  wrote:
> > +guest_full_size =
> > +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
^
> I think this is probably more clearly written as 0x1ULL,
> since rounding down to the host-page-size then adding the host-page-size
> gets us the full 32-bit size of the guest address space.

Wait, is that right?  Isn't that only true if qemu_host_page_size is
at least 8KiB (16 bits), enough to fill the zero in the middle?  Won't
a typical qemu_host_page_size be only 4KiB?

> That shows up that there's a potential problem here if the host
> is 32-bit, because in that case guest_full_size (being only unsigned
> long) will be 0, and we'll end up trying an mmap with an incorrect size.
> 
> > +host_full_size = guest_full_size - guest_start;
> > +real_start = (unsigned long)
> > +mmap(NULL, host_full_size, PROT_NONE, flags, -1, 0);
> 
> I think the general approach is right, though. Sorry it took so long
> for us to get to reviewing this patchset.

It's all good.  I'm amazed at the amount of traffic qemu-devel gets!

> Incidentally, this code would be rather less complicated if it didn't
> have to account for qemu_host_page_size not actually being the host
> page size (since then you couldn't get a return from mmap() that wasn't
> aligned properly). Does anybody know why we allow the user to specify
> it on the command line? (git revision history doesn't help, it just says
> there's been a -pagesize argument since commit 54936004fddc5 in 2003,
> right back when mmap emulation was first added...)

I have no idea, I just assumed that it was a feature useful to people
far smarter than me.

-- 
Happy hacking,
~ Luke Shumaker



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-20 Thread Peter Maydell
On 20 March 2018 at 15:23, Laurent Vivier  wrote:
> Le 02/03/2018 à 15:13, Peter Maydell a écrit :
>> On 28 December 2017 at 18:08, Luke Shumaker  wrote:
>>> +#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)
>>> +/* On 32-bit ARM, we need to map not just the usable memory, but
>>> + * also the commpage.  Try to find a suitable place by allocating
>>> + * a big chunk for all of it.  If host_start, then the naive
>>> + * strategy probably does good enough.
>>> + */
>>> +if (!host_start) {
>>> +unsigned long guest_full_size, host_full_size, real_start;
>>> +
>>> +guest_full_size =
>>> +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
>>
>> I think this is probably more clearly written as 0x1ULL,
>> since rounding down to the host-page-size then adding the host-page-size
>> gets us the full 32-bit size of the guest address space.
>
> Perhaps, I've missed something, but it seems not true.
>
> On x86_64, we have:
>
> qemu_host_page_mask = 0xf000
> qemu_host_page_size = 0x1000
>
> but
>
> 0x0f00 & 0xf000 = 0x
> then
> 0x + 0x1000 = 0x1000

Yes, you're right -- I'd thought that the kernel commpage was
right at the top of memory, but it isn't.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-20 Thread Laurent Vivier
Le 02/03/2018 à 15:13, Peter Maydell a écrit :
> On 28 December 2017 at 18:08, Luke Shumaker  wrote:
>> From: Luke Shumaker 
>>
>> At a fixed distance after the usable memory that init_guest_space maps, for
>> 32-bit ARM targets we also need to map a commpage.  The normal
>> init_guest_space logic doesn't keep this in mind when searching for an
>> address range.
>>
>> If !host_start, then try to find a big continuous segment where we can put
>> both the usable memory and the commpage; we then munmap that segment and
>> set current_start to that address; and let the normal code mmap the usable
>> memory and the commpage separately.  That is: if we don't have hint of
>> where to start looking for memory, come up with one that is better than
>> NULL.  Depending on host_size and guest_start, there may or may not be a
>> gap between the usable memory and the commpage, so this is slightly more
>> restrictive than it needs to be; but it's only a hint, so that's OK.
>>
>> We only do that for !host start, because if host_start, then either:
>>  - we got an address passed in with -B, in which case we don't want to
>>interfere with what the user said;
>>  - or host_start is based off of the ELF image's loaddr.  The check "if
>>(host_start && real_start != current_start)" suggests that we really
>>want lowest available address that is >= loaddr.  I don't know why that
>>is, but I'm trusting that Paul Brook knew what he was doing when he
>>wrote the original version of that check in
>>c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.
>>
>> Signed-off-by: Luke Shumaker 
>> ---
>>  linux-user/elfload.c | 49 +
>>  1 file changed, 49 insertions(+)
>>
>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>> index 7736ea2c3a..cd3a7d877d 100644
>> --- a/linux-user/elfload.c
>> +++ b/linux-user/elfload.c
>> @@ -1857,6 +1857,55 @@ unsigned long init_guest_space(unsigned long 
>> host_start,
>>
>>  /* Otherwise, a non-zero size region of memory needs to be mapped
>>   * and validated.  */
>> +
>> +#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)
>> +/* On 32-bit ARM, we need to map not just the usable memory, but
>> + * also the commpage.  Try to find a suitable place by allocating
>> + * a big chunk for all of it.  If host_start, then the naive
>> + * strategy probably does good enough.
>> + */
>> +if (!host_start) {
>> +unsigned long guest_full_size, host_full_size, real_start;
>> +
>> +guest_full_size =
>> +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
> 
> I think this is probably more clearly written as 0x1ULL,
> since rounding down to the host-page-size then adding the host-page-size
> gets us the full 32-bit size of the guest address space.

Perhaps, I've missed something, but it seems not true.

On x86_64, we have:

qemu_host_page_mask = 0xf000
qemu_host_page_size = 0x1000

but

0x0f00 & 0xf000 = 0x
then
0x + 0x1000 = 0x1000

Thanks,
Laurent



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-03 Thread Richard Henderson
On 03/02/2018 06:13 AM, Peter Maydell wrote:
> Does anybody know why we allow the user to specify
> it on the command line? (git revision history doesn't help, it just says
> there's been a -pagesize argument since commit 54936004fddc5 in 2003,
> right back when mmap emulation was first added...)

I *think* it is to allow two different hosts with different page sizes to
achieve the same memory layout, so that tcg traces are more easily compared, to
root out bugs in the second host's tcg backend.

That's the only thing that I've used it for, anyway.


r~



Re: [Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2018-03-02 Thread Peter Maydell
On 28 December 2017 at 18:08, Luke Shumaker  wrote:
> From: Luke Shumaker 
>
> At a fixed distance after the usable memory that init_guest_space maps, for
> 32-bit ARM targets we also need to map a commpage.  The normal
> init_guest_space logic doesn't keep this in mind when searching for an
> address range.
>
> If !host_start, then try to find a big continuous segment where we can put
> both the usable memory and the commpage; we then munmap that segment and
> set current_start to that address; and let the normal code mmap the usable
> memory and the commpage separately.  That is: if we don't have hint of
> where to start looking for memory, come up with one that is better than
> NULL.  Depending on host_size and guest_start, there may or may not be a
> gap between the usable memory and the commpage, so this is slightly more
> restrictive than it needs to be; but it's only a hint, so that's OK.
>
> We only do that for !host start, because if host_start, then either:
>  - we got an address passed in with -B, in which case we don't want to
>interfere with what the user said;
>  - or host_start is based off of the ELF image's loaddr.  The check "if
>(host_start && real_start != current_start)" suggests that we really
>want lowest available address that is >= loaddr.  I don't know why that
>is, but I'm trusting that Paul Brook knew what he was doing when he
>wrote the original version of that check in
>c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.
>
> Signed-off-by: Luke Shumaker 
> ---
>  linux-user/elfload.c | 49 +
>  1 file changed, 49 insertions(+)
>
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index 7736ea2c3a..cd3a7d877d 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -1857,6 +1857,55 @@ unsigned long init_guest_space(unsigned long 
> host_start,
>
>  /* Otherwise, a non-zero size region of memory needs to be mapped
>   * and validated.  */
> +
> +#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)
> +/* On 32-bit ARM, we need to map not just the usable memory, but
> + * also the commpage.  Try to find a suitable place by allocating
> + * a big chunk for all of it.  If host_start, then the naive
> + * strategy probably does good enough.
> + */
> +if (!host_start) {
> +unsigned long guest_full_size, host_full_size, real_start;
> +
> +guest_full_size =
> +(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;

I think this is probably more clearly written as 0x1ULL,
since rounding down to the host-page-size then adding the host-page-size
gets us the full 32-bit size of the guest address space.

That shows up that there's a potential problem here if the host
is 32-bit, because in that case guest_full_size (being only unsigned
long) will be 0, and we'll end up trying an mmap with an incorrect size.

> +host_full_size = guest_full_size - guest_start;
> +real_start = (unsigned long)
> +mmap(NULL, host_full_size, PROT_NONE, flags, -1, 0);

I think the general approach is right, though. Sorry it took so long
for us to get to reviewing this patchset.

Incidentally, this code would be rather less complicated if it didn't
have to account for qemu_host_page_size not actually being the host
page size (since then you couldn't get a return from mmap() that wasn't
aligned properly). Does anybody know why we allow the user to specify
it on the command line? (git revision history doesn't help, it just says
there's been a -pagesize argument since commit 54936004fddc5 in 2003,
right back when mmap emulation was first added...)

thanks
-- PMM



[Qemu-devel] [PATCH 10/10] linux-user: init_guest_space: Try to make ARM space+commpage continuous

2017-12-28 Thread Luke Shumaker
From: Luke Shumaker 

At a fixed distance after the usable memory that init_guest_space maps, for
32-bit ARM targets we also need to map a commpage.  The normal
init_guest_space logic doesn't keep this in mind when searching for an
address range.

If !host_start, then try to find a big continuous segment where we can put
both the usable memory and the commpage; we then munmap that segment and
set current_start to that address; and let the normal code mmap the usable
memory and the commpage separately.  That is: if we don't have hint of
where to start looking for memory, come up with one that is better than
NULL.  Depending on host_size and guest_start, there may or may not be a
gap between the usable memory and the commpage, so this is slightly more
restrictive than it needs to be; but it's only a hint, so that's OK.

We only do that for !host start, because if host_start, then either:
 - we got an address passed in with -B, in which case we don't want to
   interfere with what the user said;
 - or host_start is based off of the ELF image's loaddr.  The check "if
   (host_start && real_start != current_start)" suggests that we really
   want lowest available address that is >= loaddr.  I don't know why that
   is, but I'm trusting that Paul Brook knew what he was doing when he
   wrote the original version of that check in
   c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.

Signed-off-by: Luke Shumaker 
---
 linux-user/elfload.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 7736ea2c3a..cd3a7d877d 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1857,6 +1857,55 @@ unsigned long init_guest_space(unsigned long host_start,
 
 /* Otherwise, a non-zero size region of memory needs to be mapped
  * and validated.  */
+
+#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)
+/* On 32-bit ARM, we need to map not just the usable memory, but
+ * also the commpage.  Try to find a suitable place by allocating
+ * a big chunk for all of it.  If host_start, then the naive
+ * strategy probably does good enough.
+ */
+if (!host_start) {
+unsigned long guest_full_size, host_full_size, real_start;
+
+guest_full_size =
+(0x0f00 & qemu_host_page_mask) + qemu_host_page_size;
+host_full_size = guest_full_size - guest_start;
+real_start = (unsigned long)
+mmap(NULL, host_full_size, PROT_NONE, flags, -1, 0);
+if (real_start == (unsigned long)-1) {
+if (host_size < host_full_size - qemu_host_page_size) {
+/* We failed to map a continous segment, but we're
+ * allowed to have a gap between the usable memory and
+ * the commpage where other things can be mapped.
+ * This sparseness gives us more flexibility to find
+ * an address range.
+ */
+goto naive;
+}
+return (unsigned long)-1;
+}
+munmap((void *)real_start, host_full_size);
+if (real_start & ~qemu_host_page_mask) {
+/* The same thing again, but with an extra qemu_host_page_size
+ * so that we can shift around alignment.
+ */
+unsigned long real_size = host_full_size + qemu_host_page_size;
+real_start = (unsigned long)
+mmap(NULL, real_size, PROT_NONE, flags, -1, 0);
+if (real_start == (unsigned long)-1) {
+if (host_size < host_full_size - qemu_host_page_size) {
+goto naive;
+}
+return (unsigned long)-1;
+}
+munmap((void *)real_start, real_size);
+real_start = HOST_PAGE_ALIGN(real_start);
+}
+current_start = real_start;
+}
+ naive:
+#endif
+
 while (1) {
 unsigned long real_start, real_size, aligned_size;
 aligned_size = real_size = host_size;
-- 
2.15.1