Hi Henning,
On Monday, January 27, 2020 at 12:16:08 AM UTC-7, Henning Schild wrote:
>
> Ok, so we are just looking for differences between the inmate and the
> linux as non-root cell, because the jailhouse/virtualization overhead
> is acceptable or known.
>
I'm sorry, I was confused. That is actually not correct. I am looking for
the difference between the inmate running my simple workload vs. running
that same workload in the *root cell* rather than in a non-root Linux cell.
What I am doing is activating the root cell, then simply running the
workload in Linux with a wrapper program (sha3-512.c
<https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/mgh/workloads/src/sha3-512.c>).
Then, I activate my inmate and run the same workload, but this time within
the inmate in a real-time wrapper application (mgh-demo.c
<https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/inmates/demos/x86/mgh-demo.c>).
Both wrapper applications now use the exact same object file, compiled once
under the Jailhouse build system. But the results are still the same.
However, the input used by the program in the inmate is in a special
'add-on' memory region I had to create and map manually with map_range().
Here is the additional memory region in my config that I named the 'heap'
(I need it big enough to hold a 20 MiB+ data input):
	/* MGH: RAM - Heap */
	{
		/* MGH: We have 36 MB of memory allocated to the inmate
		 * in the root config, but are only using 1 MB for the
		 * inmate's stack and program. So create an additional
		 * "heap" area with the other 35 MB to allow the program
		 * more memory to work with. */
		.phys_start = 0x3a700000,
		.virt_start = 0x00200000,
		// 35 MB (3a7 + 23 = 3ca)
		.size = 0x02300000,
		.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
			JAILHOUSE_MEM_EXECUTE | JAILHOUSE_MEM_LOADABLE,
	},
https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/configs/x86/bazooka-inmate.c#L90-L103
I am able to map that large 35 MiB memory region into my inmate, and it
works ok:
#define MGH_HEAP_BASE 0x00200000
#define MGH_HEAP_SIZE (35 * MB)
...
/*
 * MGH: By default, x86 inmates only map the first 2 MB of virtual memory,
 * even when more memory is configured. So map configured memory pages behind
 * the virtual memory address MGH_HEAP_BASE. Without this, there is nothing
 * behind the virtual memory address and you'll get a page fault.
 */
static void expand_memory(void)
{
	map_range((char *)MGH_HEAP_BASE, MGH_HEAP_SIZE, MAP_UNCACHED);
	/* Set heap_pos to point to MGH_HEAP_BASE, instead of right after the
	 * inmate's stack, so alloc() can allocate more than 1 MB. */
	heap_pos = MGH_HEAP_BASE;
}
https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/inmates/demos/x86/mgh-demo.c#L113-L114
https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/inmates/demos/x86/mgh-demo.c#L930-L943
I have tried using both my 'heap' memory region (with
programmatically-generated input) as well as using input passed into the
IVSHMEM shared memory region
<https://github.com/hintron/jailhouse/blob/05824b901ce714c7a61770774b862ef24caf641e/configs/x86/bazooka-inmate.c#L79-L89>,
with the same results.
Maybe there is something wrong with the memory paging that is making things
a lot slower than expected, like you implied. Maybe regular Linux has a
faster way of setting up paging/memory.
In your last response, you said this:
"For the inmate itself the pagetable is constructed by the mapping
library. The code looks like it tries to do huge pages, make sure the
call map_range just once with your full memory range. Aligned and maybe
more than you actually need. Consider putting a few printfs into the
mapping code to see which path (page-size) it goes."
Could you explain the following suggestion a bit more: "make sure the call
map_range just once with your full memory range"? It looks like mgh-demo.c
calls map_range() twice: once in map_shmem_and_bars() (from your original
IVSHMEM demo code, which I based this on), and then again in expand_memory()
as shown above. Are you saying I should combine those into one single call?
Also, can you explain this: "Aligned and maybe more than you actually need.
Consider putting a few printfs into the mapping code to see which path
(page-size) it goes." I'm not sure what I should be looking for inside
map_range(). What do you mean by "which path (page-size) it goes," exactly?
What's the code path?
Sorry for the bother. I really need to understand why this is happening,
because this discrepancy completely overshadows my other slightly-positive
timing results in my research. Any help is greatly appreciated.
Thanks,
Michael