Re: How to clone CPUState in a new thread?

2019-11-08 Thread Michael Goffioul
On Thu, Nov 7, 2019 at 1:50 PM Michael Goffioul 
wrote:

> On Thu, Nov 7, 2019 at 7:53 AM Peter Maydell 
> wrote:
>
>> On Thu, 7 Nov 2019 at 12:46, Michael Goffioul
>>  wrote:
>> > Side question: is this the right mailing list to discuss this, or is
>> there a more appropriate one?
>>
>> You're more likely to find actual QEMU developers reading qemu-devel;
>> qemu-discuss has fewer contributors and they tend to be more
>> likely to be end-users or interested in end-user questions
>> rather than internals.
>>
>> As for your original question, if you're creating a new
>> thread and want the new thread's TCG CPU state to match
>> that of the old thread then the linux-user 'clone' call
>> is what you want to follow. Duplicating the existing CPU
>> state is a bit of a hacky codepath but it does work.
>> If you have a thread that doesn't want to follow the
>> existing CPU state of some other creating thread but
>> instead is started with a fixed entirely-known state,
>> then the code in linux-user/main.c which starts the first
>> thread of the process might be a better model.
>>
>> Overall, though, QEMU's code is not designed to be embedded
>> into some other runtime environment in the way you're doing
>> it, so I would expect there to be pain involved in trying
>> to get it to work (especially surrounding threads, new
>> processes and signals).
>>
>
>  Thanks for the input. I have a feeling that the problem is related to the
> stack setup for the emulated code, or rather, the lack thereof. I've
> successfully run JNI code in a separate Java thread, while still originally
> loaded and tcg-processed in the main thread, for things like integer
> computation or even creating and returning a Java string. However, something
> like sending a string to logcat makes the thing crash. The primary suspect
> is the TLS, which, if I'm not mistaken, is located somewhere in the
> thread's stack. Would that make sense?
>

It turned out the problem was indeed the missing TLS storage initialization
in the emulated code. In particular, Android P and lower use one of the 8
TLS slots defined in bionic to manage errno, meaning that any system call
would most likely lead to a crash. With some trickery, I managed to
initialize the emulated side of the thread properly and the code then runs
fine.
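
To make the errno point concrete, here is a minimal sketch in C of why an
uninitialized TLS area breaks things. It is not bionic's or QEMU's code: the
guest_tls type, the function names and the slot index are all made up for
illustration; only the slot count of 8 comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-thread TLS area: a small array of pointer-sized slots.
 * The slot count (8) is from the message; the index used for errno is an
 * assumption for illustration only. */
#define TLS_SLOT_COUNT 8
#define TLS_SLOT_ERRNO 2

typedef struct {
    void *slots[TLS_SLOT_COUNT];
    int errno_storage;          /* backing storage for this thread's errno */
} guest_tls;

/* Prepare the TLS area before the thread runs any emulated code. */
static void guest_tls_init(guest_tls *tls)
{
    memset(tls, 0, sizeof(*tls));
    tls->slots[TLS_SLOT_ERRNO] = &tls->errno_storage;
}

/* What a libc-style errno accessor does: follow the pointer in the slot.
 * This sketch returns NULL when the slot was never set up; real code
 * would dereference the pointer directly and crash. */
static int *guest_errno_location(guest_tls *tls)
{
    return (int *)tls->slots[TLS_SLOT_ERRNO];
}
```

Any emulated system call that fails writes through that pointer, so a thread
whose slots were never filled in faults on the first such write.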


Re: How to clone CPUState in a new thread?

2019-11-07 Thread Michael Goffioul
On Thu, Nov 7, 2019 at 7:53 AM Peter Maydell 
wrote:

> On Thu, 7 Nov 2019 at 12:46, Michael Goffioul
>  wrote:
> > Side question: is this the right mailing list to discuss this, or is
> there a more appropriate one?
>
> You're more likely to find actual QEMU developers reading qemu-devel;
> qemu-discuss has fewer contributors and they tend to be more
> likely to be end-users or interested in end-user questions
> rather than internals.
>
> As for your original question, if you're creating a new
> thread and want the new thread's TCG CPU state to match
> that of the old thread then the linux-user 'clone' call
> is what you want to follow. Duplicating the existing CPU
> state is a bit of a hacky codepath but it does work.
> If you have a thread that doesn't want to follow the
> existing CPU state of some other creating thread but
> instead is started with a fixed entirely-known state,
> then the code in linux-user/main.c which starts the first
> thread of the process might be a better model.
>
> Overall, though, QEMU's code is not designed to be embedded
> into some other runtime environment in the way you're doing
> it, so I would expect there to be pain involved in trying
> to get it to work (especially surrounding threads, new
> processes and signals).
>

 Thanks for the input. I have a feeling that the problem is related to the
stack setup for the emulated code, or rather, the lack thereof. I've
successfully run JNI code in a separate Java thread, while still originally
loaded and tcg-processed in the main thread, for things like integer
computation or even creating and returning a Java string. However, something
like sending a string to logcat makes the thing crash. The primary suspect
is the TLS, which, if I'm not mistaken, is located somewhere in the
thread's stack. Would that make sense?


Re: How to clone CPUState in a new thread?

2019-11-07 Thread Peter Maydell
On Thu, 7 Nov 2019 at 12:46, Michael Goffioul
 wrote:
> Side question: is this the right mailing list to discuss this, or is there a 
> more appropriate one?

You're more likely to find actual QEMU developers reading qemu-devel;
qemu-discuss has fewer contributors and they tend to be more
likely to be end-users or interested in end-user questions
rather than internals.

As for your original question, if you're creating a new
thread and want the new thread's TCG CPU state to match
that of the old thread then the linux-user 'clone' call
is what you want to follow. Duplicating the existing CPU
state is a bit of a hacky codepath but it does work.
If you have a thread that doesn't want to follow the
existing CPU state of some other creating thread but
instead is started with a fixed entirely-known state,
then the code in linux-user/main.c which starts the first
thread of the process might be a better model.
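
The two starting points described above can be sketched in C as follows.
This is only a toy model, not QEMU's code: toy_state and both functions
are invented names, and the register layout (r13 as stack pointer, r15 as
program counter) is the usual 32-bit ARM convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy register file; a stand-in for a real per-thread CPU state. */
typedef struct {
    uint32_t regs[16];   /* r0..r15, with r13 = sp and r15 = pc */
    uint32_t tls_base;   /* per-thread TLS pointer */
} toy_state;

/* Option 1: start from the creating thread's state, then override the
 * fields that must differ in the new thread (stack pointer, TLS base). */
static void toy_state_clone(toy_state *child, const toy_state *parent,
                            uint32_t new_sp, uint32_t new_tls)
{
    memcpy(child, parent, sizeof(*child));
    child->regs[13] = new_sp;
    child->tls_base = new_tls;
}

/* Option 2: start from a fixed, entirely known state. */
static void toy_state_fresh(toy_state *st, uint32_t entry,
                            uint32_t sp, uint32_t tls)
{
    memset(st, 0, sizeof(*st));
    st->regs[15] = entry;
    st->regs[13] = sp;
    st->tls_base = tls;
}
```

The clone variant inherits whatever the parent happened to hold in its
other registers, which is why the fresh variant may suit a thread that has
no meaningful parent on the emulated side.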

Overall, though, QEMU's code is not designed to be embedded
into some other runtime environment in the way you're doing
it, so I would expect there to be pain involved in trying
to get it to work (especially surrounding threads, new
processes and signals).

thanks
-- PMM



Re: How to clone CPUState in a new thread?

2019-11-07 Thread Michael Goffioul
On Thu, Nov 7, 2019 at 7:38 AM Michael Goffioul 
wrote:

>
>
> On Thu, Nov 7, 2019 at 4:57 AM Jakob Bohm  wrote:
>
>> On 07/11/2019 01:44, Michael Goffioul wrote:
>> > Hi,
>> >
>> > I'm working on a project that wants to replace houdini (ARM-to-x86
>> > translation layer for Android from Intel) with a free open-source
>> > implementation. I'm trying to leverage qemu user-mode to achieve that,
>> > but it requires code changes to allow executing dynamically loaded
>> > functions instead of running a single executable.
>> >
>> Basic question: Isn't the qemu user-mode emulator already able to run a
>> "single executable" that loads DLLs, creates dynamic code etc. in the
>> emulated instruction set?
>>
>> The obvious exception would be to skip the ARM instruction set
>> intermediary when translating Dalvik byte code from .dex files.
>>
>> From this perspective, emulated ARM thread creation would be just letting
>> qemu emulate the ARM code that would be called, including letting qemu
>> emulate the system calls such as "clone".
>>
>> A special case would be if houdini allows direct calls between ARM and x86
>> .so files.  I don't know if qemu-user has the ability to expose host
>> native DLLs to emulated code.
>>
>
> Basically Houdini implements the native bridge interface, as defined here:
> https://android.googlesource.com/platform/system/core/+/refs/tags/android-10.0.0_r11/libnativebridge/include/nativebridge/native_bridge.h#172
> It allows running Android APKs that contain ARM-compiled native/JNI code
> on an Android-x86 OS. It does so by taking care of loading the ARM .so JNI
> files and providing trampoline stubs to the Android runtime JVM. It does
> not expose the host native .so to the emulated code, instead it provides a
> set of ARM-compiled core libraries from Android: it is actually very
> similar to running dynamically linked code in qemu-user with a chroot'ed
> ARM environment. Actual interaction with the native host is happening
> mostly/only through binder socket.
>
> To initialize the qemu-user engine, I make it load a custom ARM .so/ELF
> file that uses the Android linker (from the ARM pseudo chroot environment)
> as interpreter. This allows me to delegate all dynamic linking aspects.
>
> So far, the emulation is working fine and I'm able to run simple
> ARM-compiled apps on Android-x86, even if the native code spawns new
> threads. My current (hopefully last) problem is when a Java thread
> (different from the one that initialized the qemu engine) is trying to run
> native code. I need to set up a new CPUState/CPUArchState instance for this
> Java thread.
>

Side question: is this the right mailing list to discuss this, or is there
a more appropriate one?


Re: How to clone CPUState in a new thread?

2019-11-07 Thread Michael Goffioul
On Thu, Nov 7, 2019 at 4:57 AM Jakob Bohm  wrote:

> On 07/11/2019 01:44, Michael Goffioul wrote:
> > Hi,
> >
> > I'm working on a project that wants to replace houdini (ARM-to-x86
> > translation layer for Android from Intel) with a free open-source
> > implementation. I'm trying to leverage qemu user-mode to achieve that,
> > but it requires code changes to allow executing dynamically loaded
> > functions instead of running a single executable.
> >
> Basic question: Isn't the qemu user-mode emulator already able to run a
> "single executable" that loads DLLs, creates dynamic code etc. in the
> emulated instruction set?
>
> The obvious exception would be to skip the ARM instruction set intermediary
> when translating Dalvik byte code from .dex files.
>
> From this perspective, emulated ARM thread creation would be just letting
> qemu emulate the ARM code that would be called, including letting qemu
> emulate the system calls such as "clone".
>
> A special case would be if houdini allows direct calls between ARM and x86
> .so files.  I don't know if qemu-user has the ability to expose host
> native DLLs to emulated code.
>

Basically Houdini implements the native bridge interface, as defined here:
https://android.googlesource.com/platform/system/core/+/refs/tags/android-10.0.0_r11/libnativebridge/include/nativebridge/native_bridge.h#172
It allows running Android APKs that contain ARM-compiled native/JNI code on
an Android-x86 OS. It does so by taking care of loading the ARM .so JNI
files and providing trampoline stubs to the Android runtime JVM. It does
not expose the host native .so to the emulated code, instead it provides a
set of ARM-compiled core libraries from Android: it is actually very
similar to running dynamically linked code in qemu-user with a chroot'ed
ARM environment. Actual interaction with the native host is happening
mostly/only through binder socket.

To initialize the qemu-user engine, I make it load a custom ARM .so/ELF
file that uses the Android linker (from the ARM pseudo chroot environment)
as interpreter. This allows me to delegate all dynamic linking aspects.

So far, the emulation is working fine and I'm able to run simple
ARM-compiled apps on Android-x86, even if the native code spawns new
threads. My current (hopefully last) problem is when a Java thread
(different from the one that initialized the qemu engine) is trying to run
native code. I need to set up a new CPUState/CPUArchState instance for this
Java thread.


Re: How to clone CPUState in a new thread?

2019-11-07 Thread Jakob Bohm

On 07/11/2019 01:44, Michael Goffioul wrote:
> Hi,
>
> I'm working on a project that wants to replace houdini (ARM-to-x86
> translation layer for Android from Intel) with a free open-source
> implementation. I'm trying to leverage qemu user-mode to achieve that,
> but it requires code changes to allow executing dynamically loaded
> functions instead of running a single executable.

Basic question: Isn't the qemu user-mode emulator already able to run a
"single executable" that loads DLLs, creates dynamic code etc. in the
emulated instruction set?

The obvious exception would be to skip the ARM instruction set intermediary
when translating Dalvik byte code from .dex files.

From this perspective, emulated ARM thread creation would be just letting
qemu emulate the ARM code that would be called, including letting qemu
emulate the system calls such as "clone".

A special case would be if houdini allows direct calls between ARM and x86
.so files.  I don't know if qemu-user has the ability to expose host
native DLLs to emulated code.

> In a nutshell, using ideas from unicorn-engine, I've enhanced
> CPUARMState with a stop address. Whenever this address is encountered
> in the translator, it generates a YIELD exception, which then makes
> cpu_loop exit.
>
> It works fine for simple cases, but I'm having trouble with the
> multi-threading aspect. Threads created from the native/ARM side do
> seem to work properly. The problem is when a new Java thread (not
> created from native/ARM) attempts to execute native code. The QEMU
> engine has been initialized in the main thread, but new Java threads
> do not have access to the thread-local variable thread_cpu.
>
> I've tried (maybe naively) to recreate what the clone syscall is doing
> to create a new CPUState/CPUArchState object, usable from the new
> thread, but executing any ARM code quickly leads to a crash. I suppose
> I'm doing something wrong, or missing something to properly initialize a
> new cpu. I'm hoping that someone could help me solve this problem.
>
> I've attached the current QEMU patch I'm using; most of the Android
> glue layer is in linux-user/main.c. It contains a set of utility
> functions that my Android native bridge implementation is using.




Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2860 Soborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded




How to clone CPUState in a new thread?

2019-11-06 Thread Michael Goffioul
Hi,

I'm working on a project that wants to replace houdini (ARM-to-x86
translation layer for Android from Intel) with a free open-source
implementation. I'm trying to leverage qemu user-mode to achieve that, but
it requires code changes to allow executing dynamically loaded functions
instead of running a single executable.

In a nutshell, using ideas from unicorn-engine, I've enhanced CPUARMState
with a stop address. Whenever this address is encountered in the
translator, it generates a YIELD exception, which then makes cpu_loop
exit.
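
As a rough illustration of the stop-address idea, here is a toy run loop in
C. It is not the actual patch and not QEMU's translator: toy_cpu,
toy_cpu_loop and the fixed 4-byte "instruction" are assumptions made only
to show the control flow.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state; stop_addr plays the role of the stop address field.
 * All names here are hypothetical, not QEMU's. */
typedef struct {
    uint32_t pc;
    uint32_t stop_addr;
    uint32_t steps;      /* number of instructions executed so far */
} toy_cpu;

enum toy_exit { TOY_EXIT_YIELD, TOY_EXIT_LIMIT };

/* Run until pc reaches stop_addr, or until max_steps instructions have
 * been executed in total. */
static enum toy_exit toy_cpu_loop(toy_cpu *cpu, uint32_t max_steps)
{
    while (cpu->steps < max_steps) {
        if (cpu->pc == cpu->stop_addr) {
            return TOY_EXIT_YIELD;   /* hand control back to the caller */
        }
        cpu->pc += 4;                /* pretend to execute one instruction */
        cpu->steps++;
    }
    return TOY_EXIT_LIMIT;
}
```

One way to use such a loop for calling a single function is to point the
return address at the stop address, so that the loop exits when the
function returns; that detail is an assumption here, not something taken
from the patch.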

It works fine for simple cases, but I'm having trouble with the
multi-threading aspect. Threads created from the native/ARM side do seem to
work properly. The problem is when a new Java thread (not created from
native/ARM) attempts to execute native code. The QEMU engine has been
initialized in the main thread, but new Java threads do not have access to
the thread-local variable thread_cpu.
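
The thread-local problem can be reproduced in isolation. The sketch below
uses a stand-in variable, current_cpu, in place of thread_cpu, and a
made-up fake_cpu type; it assumes a C11 compiler and POSIX threads (build
with -pthread where the toolchain requires it).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct { int id; } fake_cpu;         /* stand-in, not CPUState */

/* Plays the role of thread_cpu: every thread gets its own copy, which
 * starts out as NULL. */
static _Thread_local fake_cpu *current_cpu;

/* Thread body: record whether this thread sees a CPU object. */
static void *probe_thread(void *out)
{
    *(int *)out = (current_cpu != NULL);
    return NULL;
}

/* Returns 1 if a freshly created thread sees a non-NULL current_cpu,
 * 0 if it sees NULL, and -1 if the thread could not be created. */
static int new_thread_has_cpu(void)
{
    int seen = -1;
    pthread_t t;
    if (pthread_create(&t, NULL, probe_thread, &seen) != 0) {
        return -1;
    }
    pthread_join(t, NULL);
    return seen;
}
```

Setting current_cpu in the initializing thread has no effect on any other
thread, so each thread that enters the emulator needs its own CPU object
assigned before it runs emulated code.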

I've tried (maybe naively) to recreate what the clone syscall is doing to
create a new CPUState/CPUArchState object, usable from the new thread, but
executing any ARM code quickly leads to a crash. I suppose I'm doing
something wrong, or missing something to properly initialize a new cpu. I'm
hoping that someone could help me solve this problem.

I've attached the current QEMU patch I'm using; most of the Android glue
layer is in linux-user/main.c. It contains a set of utility functions that
my Android native bridge implementation is using.


qemu-android.diff.bz2
Description: application/bzip