Re: [PATCH] hw/openrisc: Fixed undercounting of TTCR in continuous mode

2024-06-10 Thread Joel Holdsworth
Hi Stafford, thanks for your response.

> - You sent this 2 times, is the only change in v2 the sender address?

Yes, I was just having some difficulty with Git and SMTP. Should be fixed now.


>> In the existing design, TTCR is prone to undercounting when running in
>> continuous mode. This manifests as a timer interrupt appearing to
>> trigger a few cycles prior to the deadline set in SPR_TTMR_TP.

> This is a good find, I have noticed the timer is off when running on OpenRISC
> but never tracked it down to this undercounting issue.  I also notice
> unexplained RCU stalls when running in Linux when tere is no load, this timer
issue might be related.

> Did you notice this via other system symptoms when running OpenRISC or just 
> via
> code auditing of QEMU?

I'm working on an OpenRISC port of Zephyr. The under-counting issue causes 
consistent deadlocks in my experiments with the test suite. I wouldn't be 
surprised if it causes problems for other OS's.


> In QEMU there is a function clock_ns_to_ticks(). Could this maybe be used
> instead to give us more standard fix?

Seems like a good idea, and I now have some nearly-complete patch that brings 
hw/openrisc/cputimer.c into closer alignment with 
target/mips/sysemu/cp0_timer.c . However, don't we run into problems with 
undercounting with clock_ns_to_ticks, because if I understand correctly it will 
round ticks down, not up?, which is the problem I was trying to avoid in the 
first place.

Joel


[PATCH] hw/openrisc: Fixed undercounting of TTCR in continuous mode

2024-06-07 Thread Joel Holdsworth via
In the existing design, TTCR is prone to undercounting when running in
continuous mode. This manifests as a timer interrupt appearing to
trigger a few cycles prior to the deadline set in SPR_TTMR_TP.

When the timer triggers, the virtual time delta in nanoseconds between
the time when the timer was set, and when it triggers is calculated.
This nanoseconds value is then divided by TIMER_PERIOD (50) to compute
an increment of cycles to apply to TTCR.

However, this calculation rounds down the number of cycles causing the
undercounting.

A simplistic solution would be to instead round up the number of cycles,
however this will result in the accumulation of timing error over time.

This patch corrects the issue by calculating the time delta in
nanoseconds between when the timer was last reset and the timer event.
This approach allows the TTCR value to be rounded up, but without
accumulating error over time.

Signed-off-by: Joel Holdsworth 
---
 hw/openrisc/cputimer.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/openrisc/cputimer.c b/hw/openrisc/cputimer.c
index 835986c4db..ddc129aa48 100644
--- a/hw/openrisc/cputimer.c
+++ b/hw/openrisc/cputimer.c
@@ -29,7 +29,8 @@
 /* Tick Timer global state to allow all cores to be in sync */
 typedef struct OR1KTimerState {
 uint32_t ttcr;
-uint64_t last_clk;
+uint32_t ttcr_offset;
+uint64_t clk_offset;
 } OR1KTimerState;
 
 static OR1KTimerState *or1k_timer;
@@ -37,6 +38,8 @@ static OR1KTimerState *or1k_timer;
 void cpu_openrisc_count_set(OpenRISCCPU *cpu, uint32_t val)
 {
 or1k_timer->ttcr = val;
+or1k_timer->ttcr_offset = val;
+or1k_timer->clk_offset = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 }
 
 uint32_t cpu_openrisc_count_get(OpenRISCCPU *cpu)
@@ -53,9 +56,8 @@ void cpu_openrisc_count_update(OpenRISCCPU *cpu)
 return;
 }
 now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-or1k_timer->ttcr += (uint32_t)((now - or1k_timer->last_clk)
-/ TIMER_PERIOD);
-or1k_timer->last_clk = now;
+or1k_timer->ttcr = (now - or1k_timer->clk_offset + TIMER_PERIOD - 1) / 
TIMER_PERIOD +
+or1k_timer->ttcr_offset;
 }
 
 /* Update the next timeout time as difference between ttmr and ttcr */
@@ -69,7 +71,7 @@ void cpu_openrisc_timer_update(OpenRISCCPU *cpu)
 }
 
 cpu_openrisc_count_update(cpu);
-now = or1k_timer->last_clk;
+now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 
 if ((cpu->env.ttmr & TTMR_TP) <= (or1k_timer->ttcr & TTMR_TP)) {
 wait = TTMR_TP - (or1k_timer->ttcr & TTMR_TP) + 1;
@@ -110,7 +112,8 @@ static void openrisc_timer_cb(void *opaque)
 case TIMER_NONE:
 break;
 case TIMER_INTR:
-or1k_timer->ttcr = 0;
+/* Zero the count by applying a negative offset to the counter */
+or1k_timer->ttcr_offset += UINT32_MAX - (cpu->env.ttmr & TTMR_TP);
 break;
 case TIMER_SHOT:
 cpu_openrisc_count_stop(cpu);
@@ -137,8 +140,8 @@ static void openrisc_count_reset(void *opaque)
 /* Reset the global timer state. */
 static void openrisc_timer_reset(void *opaque)
 {
-or1k_timer->ttcr = 0x;
-or1k_timer->last_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+OpenRISCCPU *cpu = opaque;
+cpu_openrisc_count_set(cpu, 0);
 }
 
 static const VMStateDescription vmstate_or1k_timer = {
@@ -147,7 +150,8 @@ static const VMStateDescription vmstate_or1k_timer = {
 .minimum_version_id = 1,
 .fields = (const VMStateField[]) {
 VMSTATE_UINT32(ttcr, OR1KTimerState),
-VMSTATE_UINT64(last_clk, OR1KTimerState),
+VMSTATE_UINT32(ttcr_offset, OR1KTimerState),
+VMSTATE_UINT64(clk_offset, OR1KTimerState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.44.1




Re: [Qemu-devel] [PATCH v2 2/4] linux-user: pass environment arguments in execve

2016-06-20 Thread Joel Holdsworth

On 20/06/16 22:40, Peter Maydell wrote:

On 20 June 2016 at 22:27, Joel Holdsworth
 wrote:

The current behaviour was quite unexpected to me - there were no warnings,
and the need to link qemu statically isn't documented anywhere. If you
really believe that static linking is the best answer here, then shouldn't
the shared library option be removed? Because with the shared-library build,
qemu-user is somewhat "broken".

Shared library QEMU works fine for simple use cases ("run a gcc
test case", for instance) or where the guest binary is linked
statically and doesn't much care where it runs.


But the distros won't like that because of the induced bloat.

I don't know of a distro which doesn't ship a statically linked
QEMU offhand. Debian, Ubuntu and SUSE certainly all do. You
basically need it for the chroot case, which is a really common one.
[These days if your distro-in-the-chroot is multiarch you could
put all the dynamic libraries for the QEMU binary in it too,
but in practice being able to just copy a single binary in is much
easier.]

thanks
-- PMM


Fair enough. It sounds like this is the way needs to be!

Even so, there is still the issue of the other glibc environment 
variables - see my LANG= example of the parent-guest wanting to run a 
child-guest Japanese, but the child-qemu should still run in English.




Re: [Qemu-devel] [PATCH v2 2/4] linux-user: pass environment arguments in execve

2016-06-20 Thread Joel Holdsworth

On 20/06/16 21:29, Laurent Vivier wrote:


Le 20/06/2016 à 21:51, Joel Holdsworth a écrit :

On 15/06/16 20:59, Laurent Vivier wrote:

Le 14/06/2016 à 21:26, Joel Holdsworth a écrit :

Previously, when emulating execve(2), qemu would execute a child
instance of the emulator with the environment variables provided by
the parent process. This caused problems with qemu if any of the
variables affected the child emulator's behaviour e.g.
LD_LIBRARY_PATH.

The best way to avoid that is to use a statically linked qemu.

Stepping back a bit; the problem I'm trying to solve is this...

There are some processes that invoke a helper process to do some work
for them e.g. gstreamer's gst-plugin-scanner. Previously qemu would
attempt to execute the helper executable as if it were machine-native,
which won't work. These patches modify qemu so that it will (optionally)
run the child process inside a child instance of qemu.

If the context is to use qemu to have a cross build/test environment, I
like the idea, but you should use chroot/binfmt to do that.

Even without the architecture change, the build/test environment must be
isolated (chroot) from the host environment to know exactly what you
build/test.
I do know what we test though: (a gstreamer unit test) -> 
(gst-plugin-scanner).


chroot+binfmt is a fine solution for testing a whole user-space, but 
rather overkill for just a single program.


Also, chroot and binfmt require root permissions. Also the libraries 
have to be installed in a rootfs tree - which isn't how my use case works.





My experience as a user was that it took a couple of hours of searching
through strace logs to figure out what the issue was. gstreamer would
just fail with a generic error about the helper. These patches are meant
to make qemu do the right thing.

Saying to the user that they should make a static linked build of qemu
isn't very practical. Having a command line argument is a much easier
solution for the user, that doesn't force them not to used
shared-library builds. The distros aren't going to go for that.

You can provide the RPM/DEB with the statically linked qemu.
(it will have no dependencies)


Users could do that, but as far as I'm concerned that isn't really 
satisfactory.


The current behaviour was quite unexpected to me - there were no 
warnings, and the need to link qemu statically isn't documented 
anywhere. If you really believe that static linking is the best answer 
here, then shouldn't the shared library option be removed? Because with 
the shared-library build, qemu-user is somewhat "broken".


But the distros won't like that because of the induced bloat.


Moreover, LD_LIBRARY_PATH is just one example. LD_PRELOAD is another.
Timezone and locale environment variables are also an issue.

all LD_ are for the ld.so, the dynamic loader, and with a statically
linked qemu, you don't use the host ld.so (see ld.so(8)).

Why timezone and local environment variables are also an issue?
All these environment variables affect the behaviour of qemu's glibc - 
these are examples of the the parent-guest being able to modify the 
behaviour of the child-host if the execve qemu wrapper patch is 
integrated. The correct way to pass the execve environ to the child qemu 
wrapper is through -E and -U arguments.



Child qemu instance should just ignore it. Thanks, Laurent 


The child qemu can't control what glibc will respond to.

There are a lot of environment variables that can affect it: 
http://www.scratchbox.org/documentation/general/tutorials/glibcenv.html


For example with LANG=, the parent-guest process might want to run the 
child-guest in Japanese, but the child-qemu should still run in English.





Re: [Qemu-devel] [PATCH v2 1/4] linux-user: add option to intercept execve() syscalls

2016-06-20 Thread Joel Holdsworth

On 15/06/16 20:31, Laurent Vivier wrote:


Le 14/06/2016 à 21:26, Joel Holdsworth a écrit :

From: Petros Angelatos 

In order for one to use QEMU user mode emulation under a chroot, it is
required to use binfmt_misc. This can be avoided by QEMU never doing a
raw execve() to the host system.

Introduce a new option, -execve, that uses the current QEMU interpreter
to intercept execve().

qemu_execve() will prepend the interpreter path , similar to what
binfmt_misc would do, and then pass the modified execve() to the host.

It is necessary to parse hashbang scripts in that function otherwise
the kernel will try to run the interpreter of a script without QEMU and
get an invalid exec format error.

Signed-off-by: Petros Angelatos 
Tested-by: Laurent Vivier 
Reviewed-by: Laurent Vivier 
---
  linux-user/main.c|  37 ++
  linux-user/qemu.h|   1 +
  linux-user/syscall.c | 137 ++-
  3 files changed, 164 insertions(+), 11 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index f8a8764..429dc06 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -18,6 +18,8 @@
   */
  #include "qemu/osdep.h"
  #include "qemu-version.h"
+
+#include 
  #include 
  #include 
  #include 
@@ -79,6 +81,7 @@ static void usage(int exitcode);
  
  static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;

  const char *qemu_uname_release;
+const char *qemu_execve_path;
  
  /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so

 we allocate a bigger stack. Need a better solution, for example
@@ -3947,6 +3950,38 @@ static void handle_arg_guest_base(const char *arg)
  have_guest_base = 1;
  }
  
+static void handle_arg_execve(const char *arg)

+{
+const char *execfn;
+char buf[PATH_MAX];
+char *ret;
+int len;
+
+/* try getauxval() */
+execfn = (const char *) getauxval(AT_EXECFN);
+
+if (execfn != 0) {
+ret = realpath(execfn, buf);
+
+if (ret != NULL) {
+qemu_execve_path = strdup(buf);
+return;
+}
+}
+
+/* try /proc/self/exe */
+len = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
+
+if (len != -1) {
+buf[len] = '\0';
+qemu_execve_path = strdup(buf);
+return;
+}
+
+fprintf(stderr, "qemu_execve: unable to determine intepreter's path\n");
+exit(EXIT_FAILURE);
+}
+
  static void handle_arg_reserved_va(const char *arg)
  {
  char *p;
@@ -4032,6 +4067,8 @@ static const struct qemu_argument arg_table[] = {
   "uname",  "set qemu uname release string to 'uname'"},
  {"B",  "QEMU_GUEST_BASE",  true,  handle_arg_guest_base,
   "address","set guest_base address to 'address'"},
+{"execve", "QEMU_EXECVE",  false, handle_arg_execve,
+ "",   "use this interpreter when a process calls execve()"},
  {"R",  "QEMU_RESERVED_VA", true,  handle_arg_reserved_va,
   "size",   "reserve 'size' bytes for guest virtual address space"},
  {"d",  "QEMU_LOG", true,  handle_arg_log,
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 56f29c3..567bcc1 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -149,6 +149,7 @@ void init_task_state(TaskState *ts);
  void task_settid(TaskState *);
  void stop_all_tasks(void);
  extern const char *qemu_uname_release;
+extern const char *qemu_execve_path;
  extern unsigned long mmap_min_addr;
  
  /* ??? See if we can avoid exposing so much of the loader internals.  */

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 71ccbd9..a478f56 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -106,6 +106,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
  #include 
  #endif
  #include 
+#include 
  #include "linux_loop.h"
  #include "uname.h"
  
@@ -6659,6 +6660,128 @@ static target_timer_t get_timer_id(abi_long arg)

  return timerid;
  }
  
+/* qemu_execve() Must return target values and target errnos. */

+static abi_long qemu_execve(char *filename, char *argv[],
+  char *envp[])
+{
+char *i_arg = NULL, *i_name = NULL;
+char **new_argp;
+int argc, fd, ret, i, offset = 3;
+char *cp;
+char buf[BINPRM_BUF_SIZE];
+
+/* normal execve case */
+if (qemu_execve_path == NULL || *qemu_execve_path == 0) {
+return get_errno(execve(filename, argv, envp));

You need a safe_execve() here, too.

Laurent
Thanks - now fixed on my branch. I will resubmit when the discussion has 
died down.


Joel



Re: [Qemu-devel] [PATCH v2 4/4] linux-user: pass strace argument in execve

2016-06-20 Thread Joel Holdsworth

On 15/06/16 21:37, Laurent Vivier wrote:

This is not needed: if you use QEMU_STRACE environment variable, it is
propagated to the child processes (this is also true for "-L" and
QEMU_LD_PREFIX).
I would say that breaks the rule of least surprise for the user. If 
we're going to invoke a child instance of qemu-user, then I think the 
user would expect us to propage the arguments from the parent.




In fact, your patch 2 breaks this...

Ok - my mistake. I will test it.



Did you try to use a statically linked qemu?

See my comments in patch #2.




Re: [Qemu-devel] [PATCH v2 3/4] linux-user: pass elf interpreter prefix in execve

2016-06-20 Thread Joel Holdsworth

On 15/06/16 21:06, Laurent Vivier wrote:

More details: why do we need this?

Add your Signed-off-by.

Le 14/06/2016 à 21:26, Joel Holdsworth a écrit :

---
  linux-user/syscall.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 440986e..1513f0f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6667,7 +6667,7 @@ static abi_long qemu_execve(char *filename, char *argv[],
  char *i_arg = NULL, *i_name = NULL;
  char **qemu_argp, **argp;
  int i, j;
-size_t qemu_argc = 3, argc, host_envc, envpc;
+size_t qemu_argc = 5, argc, host_envc, envpc;
  int fd, ret;
  char *cp;
  size_t def_envc = 0, undef_envc = 0;
@@ -6782,6 +6782,8 @@ static abi_long qemu_execve(char *filename, char *argv[],
  
  /* set up the qemu arguments */

  *argp++ = strdup(qemu_execve_path);
+*argp++ = strdup("-L");
+*argp++ = strdup(path("/"));

why "/"?

You should propagate the one from the parent (if != NULL).


path("/") seems to be the easiest way to get access to 'static struct 
pathelem *base' in util/path.c, which is set from the parent's -L 
in init_paths.






Re: [Qemu-devel] [PATCH v2 2/4] linux-user: pass environment arguments in execve

2016-06-20 Thread Joel Holdsworth

On 15/06/16 20:59, Laurent Vivier wrote:


Le 14/06/2016 à 21:26, Joel Holdsworth a écrit :

Previously, when emulating execve(2), qemu would execute a child
instance of the emulator with the environment variables provided by
the parent process. This caused problems with qemu if any of the
variables affected the child emulator's behaviour e.g.
LD_LIBRARY_PATH.

The best way to avoid that is to use a statically linked qemu.


Stepping back a bit; the problem I'm trying to solve is this...

There are some processes that invoke a helper process to do some work 
for them e.g. gstreamer's gst-plugin-scanner. Previously qemu would 
attempt to execute the helper executable as if it were machine-native, 
which won't work. These patches modify qemu so that it will (optionally) 
run the child process inside a child instance of qemu.


My experience as a user was that it took a couple of hours of searching 
through strace logs to figure out what the issue was. gstreamer would 
just fail with a generic error about the helper. These patches are meant 
to make qemu do the right thing.


Saying to the user that they should make a static linked build of qemu 
isn't very practical. Having a command line argument is a much easier 
solution for the user, that doesn't force them not to used 
shared-library builds. The distros aren't going to go for that.


Moreover, LD_LIBRARY_PATH is just one example. LD_PRELOAD is another. 
Timezone and locale environment variables are also an issue.


In an ideal world, it wouldn't even be necessary to add an argument - 
qemu would just do the right thing automatically. Though I'm not sure 
how that could be done correctly, so a command line option is a good 
compromise for a starting point.






This patch solves this issue by passing the environment variables
with '-E' arguments to the child qemu instance. The call to
execve(2) is replaced by a call to execv(2) so that the parent
emulator's environment variable state is propagated into the child.

Any variables from the host environment that are not in the in the
execve() call are removed with a '-U' argument.

Run ./scripts/checkpatch.pl on your patch...

and add your Signed-off-by here.

Sure ok.



The environment is already managed in linux-user/main.c:main(), I don't
understand why the qemu_execve() special case should differ from the
general case.
Maybe I've missed something, but the problem I'm trying to solve here is 
the issue of correctly propagating the guest environment variables into 
the child process.


The parent guest process (running inside qemu-user) constructs a set of 
environment variables and passes them to execve. However, if the parent 
qemu-user decides to run a child instance of qemu-user it should run 
with the same environment variables as the parent, but with environment 
variables the parent-guest passed through the arguments for the child-guest.


If gstreamer decides to run gst-plugin-scanner with a certain environ, 
that is for the child qemu guest, not the child qemu instance itself.






[Qemu-devel] [PATCH v2 4/4] linux-user: pass strace argument in execve

2016-06-14 Thread Joel Holdsworth
---
 linux-user/syscall.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1513f0f..00ee7a6 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6778,6 +6778,8 @@ static abi_long qemu_execve(char *filename, char *argv[],
 qemu_argc += undef_envc * 2;
 
 /* allocate the argument list */
+if (do_strace)
+qemu_argc++;
 argp = qemu_argp = alloca((qemu_argc + 1) * sizeof(void *));
 
 /* set up the qemu arguments */
@@ -6785,6 +6787,9 @@ static abi_long qemu_execve(char *filename, char *argv[],
 *argp++ = strdup("-L");
 *argp++ = strdup(path("/"));
 
+if (do_strace)
+*argp++ = strdup("-strace");
+
 /* add arguments for the enironment variables */
 for (i = 0; i < def_envc; i++) {
 *argp++ = strdup("-E");
-- 
1.9.1




[Qemu-devel] [PATCH v2 2/4] linux-user: pass environment arguments in execve

2016-06-14 Thread Joel Holdsworth
Previously, when emulating execve(2), qemu would execute a child
instance of the emulator with the environment variables provided by
the parent process. This caused problems with qemu if any of the
variables affected the child emulator's behaviour e.g.
LD_LIBRARY_PATH.

This patch solves this issue by passing the environment variables
with '-E' arguments to the child qemu instance. The call to
execve(2) is replaced by a call to execv(2) so that the parent
emulator's environment variable state is propagated into the child.

Any variables from the host environment that are not in the in the
execve() call are removed with a '-U' argument.
---
 linux-user/syscall.c | 96 ++--
 1 file changed, 70 insertions(+), 26 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a478f56..440986e 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6665,9 +6665,13 @@ static abi_long qemu_execve(char *filename, char *argv[],
   char *envp[])
 {
 char *i_arg = NULL, *i_name = NULL;
-char **new_argp;
-int argc, fd, ret, i, offset = 3;
+char **qemu_argp, **argp;
+int i, j;
+size_t qemu_argc = 3, argc, host_envc, envpc;
+int fd, ret;
 char *cp;
+size_t def_envc = 0, undef_envc = 0;
+char **def_env, **undef_env;
 char buf[BINPRM_BUF_SIZE];
 
 /* normal execve case */
@@ -6675,10 +6679,12 @@ static abi_long qemu_execve(char *filename, char 
*argv[],
 return get_errno(execve(filename, argv, envp));
 }
 
-for (argc = 0; argv[argc] != NULL; argc++) {
-/* nothing */ ;
-}
+/* count the number of arguments and environment variables */
+for (argc = 0; argv[argc]; argc++);
+for (host_envc = 0; environ[host_envc]; host_envc++);
+for (envpc = 0; envp[envpc]; envpc++);
 
+/* read the file header so we can check the shebang */
 fd = open(filename, O_RDONLY);
 if (fd == -1) {
 return get_errno(fd);
@@ -6739,36 +6745,74 @@ static abi_long qemu_execve(char *filename, char 
*argv[],
 i_arg = cp;
 }
 
-if (i_arg) {
-offset = 5;
-} else {
-offset = 4;
-}
+if (i_arg)
+qemu_argc += 2;
+else
+qemu_argc += 1;
+}
+
+/* list environment variables to define */
+def_env = alloca((envpc + 1) * sizeof(envp[0]));
+for (i = 0; i != envpc; i++) {
+for (j = 0; j != host_envc; j++)
+if (!strcmp(envp[i], environ[j]))
+break;
+if (j == host_envc)
+def_env[def_envc++] = envp[i];
 }
 
-new_argp = alloca((argc + offset + 1) * sizeof(void *));
+qemu_argc += def_envc * 2;
 
-/* Copy the original arguments with offset */
-for (i = 0; i < argc; i++) {
-new_argp[i + offset] = argv[i];
+/* list environment variables to undefine */
+undef_env = alloca((host_envc + 1) * sizeof(undef_env[0]));
+for (i = 0; i != host_envc; i++) {
+const char *const host_env = environ[i];
+   const size_t key_len = strchr(host_env, '=') - host_env;
+for (j = 0; j != envpc; j++)
+if (!strncmp(host_env, envp[j], key_len))
+break;
+if (j == envpc)
+undef_env[undef_envc++] = strndup(environ[i], key_len);
 }
 
-new_argp[0] = strdup(qemu_execve_path);
-new_argp[1] = strdup("-0");
-new_argp[offset] = filename;
-new_argp[argc + offset] = NULL;
+qemu_argc += undef_envc * 2;
 
-if (i_name) {
-new_argp[2] = i_name;
-new_argp[3] = i_name;
+/* allocate the argument list */
+argp = qemu_argp = alloca((qemu_argc + 1) * sizeof(void *));
 
-if (i_arg) {
-new_argp[4] = i_arg;
-}
+/* set up the qemu arguments */
+*argp++ = strdup(qemu_execve_path);
+
+/* add arguments for the enironment variables */
+for (i = 0; i < def_envc; i++) {
+*argp++ = strdup("-E");
+*argp++ = def_env[i];
+}
+
+for (i = 0; i < undef_envc; i++) {
+*argp++ = strdup("-U");
+*argp++ = undef_env[i];
+}
+
+/* add the path to the executable */
+*argp++ = strdup("-0");
+if (i_name) {
+*argp++ = i_name;
+*argp++ = i_name;
+if (i_arg)
+*argp++ = i_arg;
 } else {
-new_argp[2] = argv[0];
+*argp++ = argv[0];
 }
 
+*argp++ = filename;
+
+/* copy the original arguments with offset */
+for (i = 1; i < argc; i++)
+*argp++ = argv[i];
+
+*argp++ = NULL;
+
 /* Although execve() is not an interruptible syscall it is
  * a special case where we must use the safe_syscall wrapper:
  * if we allow a signal to happen before we make the host
@@ -6779,7 +6823,7 @@ static abi_long qemu_execve(char *filename, char *argv[],
  * before the execve completes and makes it the other
  * program's problem.
  */
-return get_errno

[Qemu-devel] [PATCH v2 3/4] linux-user: pass elf interpreter prefix in execve

2016-06-14 Thread Joel Holdsworth
---
 linux-user/syscall.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 440986e..1513f0f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6667,7 +6667,7 @@ static abi_long qemu_execve(char *filename, char *argv[],
 char *i_arg = NULL, *i_name = NULL;
 char **qemu_argp, **argp;
 int i, j;
-size_t qemu_argc = 3, argc, host_envc, envpc;
+size_t qemu_argc = 5, argc, host_envc, envpc;
 int fd, ret;
 char *cp;
 size_t def_envc = 0, undef_envc = 0;
@@ -6782,6 +6782,8 @@ static abi_long qemu_execve(char *filename, char *argv[],
 
 /* set up the qemu arguments */
 *argp++ = strdup(qemu_execve_path);
+*argp++ = strdup("-L");
+*argp++ = strdup(path("/"));
 
 /* add arguments for the enironment variables */
 for (i = 0; i < def_envc; i++) {
-- 
1.9.1




[Qemu-devel] [PATCH v2 1/4] linux-user: add option to intercept execve() syscalls

2016-06-14 Thread Joel Holdsworth
From: Petros Angelatos 

In order for one to use QEMU user mode emulation under a chroot, it is
required to use binfmt_misc. This can be avoided by QEMU never doing a
raw execve() to the host system.

Introduce a new option, -execve, that uses the current QEMU interpreter
to intercept execve().

qemu_execve() will prepend the interpreter path , similar to what
binfmt_misc would do, and then pass the modified execve() to the host.

It is necessary to parse hashbang scripts in that function otherwise
the kernel will try to run the interpreter of a script without QEMU and
get an invalid exec format error.

Signed-off-by: Petros Angelatos 
Tested-by: Laurent Vivier 
Reviewed-by: Laurent Vivier 
---
 linux-user/main.c|  37 ++
 linux-user/qemu.h|   1 +
 linux-user/syscall.c | 137 ++-
 3 files changed, 164 insertions(+), 11 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index f8a8764..429dc06 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -18,6 +18,8 @@
  */
 #include "qemu/osdep.h"
 #include "qemu-version.h"
+
+#include 
 #include 
 #include 
 #include 
@@ -79,6 +81,7 @@ static void usage(int exitcode);
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
 const char *qemu_uname_release;
+const char *qemu_execve_path;
 
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
we allocate a bigger stack. Need a better solution, for example
@@ -3947,6 +3950,38 @@ static void handle_arg_guest_base(const char *arg)
 have_guest_base = 1;
 }
 
+static void handle_arg_execve(const char *arg)
+{
+const char *execfn;
+char buf[PATH_MAX];
+char *ret;
+int len;
+
+/* try getauxval() */
+execfn = (const char *) getauxval(AT_EXECFN);
+
+if (execfn != 0) {
+ret = realpath(execfn, buf);
+
+if (ret != NULL) {
+qemu_execve_path = strdup(buf);
+return;
+}
+}
+
+/* try /proc/self/exe */
+len = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
+
+if (len != -1) {
+buf[len] = '\0';
+qemu_execve_path = strdup(buf);
+return;
+}
+
+fprintf(stderr, "qemu_execve: unable to determine intepreter's path\n");
+exit(EXIT_FAILURE);
+}
+
 static void handle_arg_reserved_va(const char *arg)
 {
 char *p;
@@ -4032,6 +4067,8 @@ static const struct qemu_argument arg_table[] = {
  "uname",  "set qemu uname release string to 'uname'"},
 {"B",  "QEMU_GUEST_BASE",  true,  handle_arg_guest_base,
  "address","set guest_base address to 'address'"},
+{"execve", "QEMU_EXECVE",  false, handle_arg_execve,
+ "",   "use this interpreter when a process calls execve()"},
 {"R",  "QEMU_RESERVED_VA", true,  handle_arg_reserved_va,
  "size",   "reserve 'size' bytes for guest virtual address space"},
 {"d",  "QEMU_LOG", true,  handle_arg_log,
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 56f29c3..567bcc1 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -149,6 +149,7 @@ void init_task_state(TaskState *ts);
 void task_settid(TaskState *);
 void stop_all_tasks(void);
 extern const char *qemu_uname_release;
+extern const char *qemu_execve_path;
 extern unsigned long mmap_min_addr;
 
 /* ??? See if we can avoid exposing so much of the loader internals.  */
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 71ccbd9..a478f56 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -106,6 +106,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
 #include 
 #endif
 #include 
+#include 
 #include "linux_loop.h"
 #include "uname.h"
 
@@ -6659,6 +6660,128 @@ static target_timer_t get_timer_id(abi_long arg)
 return timerid;
 }
 
+/* qemu_execve() Must return target values and target errnos. */
+static abi_long qemu_execve(char *filename, char *argv[],
+  char *envp[])
+{
+char *i_arg = NULL, *i_name = NULL;
+char **new_argp;
+int argc, fd, ret, i, offset = 3;
+char *cp;
+char buf[BINPRM_BUF_SIZE];
+
+/* normal execve case */
+if (qemu_execve_path == NULL || *qemu_execve_path == 0) {
+return get_errno(execve(filename, argv, envp));
+}
+
+for (argc = 0; argv[argc] != NULL; argc++) {
+/* nothing */ ;
+}
+
+fd = open(filename, O_RDONLY);
+if (fd == -1) {
+return get_errno(fd);
+}
+
+ret = read(fd, buf, BINPRM_BUF_SIZE);
+if (ret == -1) {
+close(fd);
+return get_errno(ret);
+}
+
+/* if we have less than 2 bytes, we can guess it is not executable */
+if (ret < 2) {
+close(fd);
+return -host_to_target_errno(ENOEXEC);
+}
+
+close(fd);
+
+/* adapted from the kernel
+ * 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/binfmt_script.c
+ */
+if ((buf[0] == '#') && (buf[1] == '!')) {
+/*
+ * T

[Qemu-devel] linux-user: add option to intercept execve() syscalls

2016-06-14 Thread Joel Holdsworth
This patch-set includes Peter Angelatos's previous patch-set [1] and
adds code to pass arguments for setting the environment variables,
passing the interpeter prefix, and passing the strace option.

Changes since v1:
 - Fixed argument array overflows.

[1] https://patchwork.ozlabs.org/patch/582756/




Re: [Qemu-devel] linux-user: add option to intercept execve() syscalls

2016-05-31 Thread Joel Holdsworth

On 30/05/16 08:52, Riku Voipio wrote:

On Wed, May 25, 2016 at 05:07:48PM +0100, Joel Holdsworth wrote:
Considering the messiness this serieas adds to QEMU, I do wonder how
much of win this avoidance really is.

Suggestions on how to make it less messy would be welcome.

If you have permissions to chroot,
you generally have permissions to set binfmt_misc too. Alternatively
these kind of exec manipulations are already done by external tools like
proot and scratchbox.
True. Though I would say that's quite a convoluted solution compared to 
this patch.


The issue I encountered came when I was trying to run ARM gstreamer unit 
tests. libgstreamer runs the gst-plugin-scanner helper executable at 
startup to scan for plugins. However it took me a long time to figure 
this issue out because in the current behavior linux-user doesn't make 
it clear what qemu will run all child processes on the host even if they 
are a different arch.


Adding these patches offers the choice, and makes the situation clearer 
to the user.



However if you are ready to stay around to maintain it, and nobody else
objects the code, I can merge it.

I'll be around for the time being certainly.

Best Regards
Joel Holdsworth



[Qemu-devel] [PATCH 2/4] linux-user: pass environment arguments in execve

2016-05-25 Thread Joel Holdsworth
Previously, when emulating execve(2), qemu would execute a child
instance of the emulator with the environment variables provided by
the parent process. This caused problems with qemu if any of the
variables affected the child emulator's behaviour e.g.
LD_LIBRARY_PATH.

This patch solves this issue by passing the environment variables
with '-E' arguments to the child qemu instance. The call to
execve(2) is replaced by a call to execv(2) so that the parent
emulator's environment variable state is propagated into the child.

Any variables from the host environment that are not in the in the
execve() call are removed with a '-U' argument.
---
 linux-user/syscall.c | 96 ++--
 1 file changed, 70 insertions(+), 26 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 750e381..b95f75a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5853,9 +5853,13 @@ static abi_long qemu_execve(char *filename, char *argv[],
   char *envp[])
 {
 char *i_arg = NULL, *i_name = NULL;
-char **new_argp;
-int argc, fd, ret, i, offset = 3;
+char **qemu_argp, **argp;
+int i, j;
+size_t qemu_argc = 3, argc, host_envc, envpc;
+int fd, ret;
 char *cp;
+size_t def_envc = 0, undef_envc = 0;
+char **def_env, **undef_env;
 char buf[BINPRM_BUF_SIZE];
 
 /* normal execve case */
@@ -5863,10 +5867,12 @@ static abi_long qemu_execve(char *filename, char 
*argv[],
 return get_errno(execve(filename, argv, envp));
 }
 
-for (argc = 0; argv[argc] != NULL; argc++) {
-/* nothing */ ;
-}
+/* count the number of arguments and environment variables */
+for (argc = 0; argv[argc]; argc++);
+for (host_envc = 0; environ[host_envc]; host_envc++);
+for (envpc = 0; envp[envpc]; envpc++);
 
+/* read the file header so we can check the shebang */
 fd = open(filename, O_RDONLY);
 if (fd == -1) {
 return get_errno(fd);
@@ -5927,37 +5933,75 @@ static abi_long qemu_execve(char *filename, char 
*argv[],
 i_arg = cp;
 }
 
-if (i_arg) {
-offset = 5;
-} else {
-offset = 4;
-}
+if (i_arg)
+qemu_argc += 2;
+else
+qemu_argc += 1;
+}
+
+/* list environment variables to define */
+def_env = alloca((envpc + 1) * sizeof(envp[0]));
+for (i = 0; i != envpc; i++) {
+for (j = 0; j != host_envc; j++)
+if (!strcmp(envp[i], environ[j]))
+break;
+if (j == host_envc)
+def_env[def_envc++] = envp[i];
 }
 
-new_argp = alloca((argc + offset + 1) * sizeof(void *));
+argc += def_envc * 2;
 
-/* Copy the original arguments with offset */
-for (i = 0; i < argc; i++) {
-new_argp[i + offset] = argv[i];
+/* list environment variables to undefine */
+undef_env = alloca((host_envc + 1) * sizeof(envp[0]));
+for (i = 0; i != host_envc; i++) {
+const char *const host_env = environ[i];
+   const size_t key_len = strchr(host_env, '=') - host_env;
+for (j = 0; j != envpc; j++)
+if (!strncmp(host_env, envp[j], key_len))
+break;
+if (j == envpc)
+undef_env[undef_envc++] = strndup(environ[i], key_len);
 }
 
-new_argp[0] = strdup(qemu_execve_path);
-new_argp[1] = strdup("-0");
-new_argp[offset] = filename;
-new_argp[argc + offset] = NULL;
+argc += undef_envc * 2;
 
-if (i_name) {
-new_argp[2] = i_name;
-new_argp[3] = i_name;
+/* allocate the argument list */
+argp = qemu_argp = alloca((qemu_argc + 1) * sizeof(void *));
 
-if (i_arg) {
-new_argp[4] = i_arg;
-}
+/* set up the qemu arguments */
+*argp++ = strdup(qemu_execve_path);
+
+/* add arguments for the enironment variables */
+for (i = 0; i < def_envc; i++) {
+*argp++ = strdup("-E");
+*argp++ = def_env[i];
+}
+
+for (i = 0; i < undef_envc; i++) {
+*argp++ = strdup("-U");
+*argp++ = undef_env[i];
+}
+
+/* add the path to the executable */
+*argp++ = strdup("-0");
+if (i_name) {
+*argp++ = i_name;
+*argp++ = i_name;
+if (i_arg)
+*argp++ = i_arg;
 } else {
-new_argp[2] = argv[0];
+*argp++ = argv[0];
 }
 
-return get_errno(execve(qemu_execve_path, new_argp, envp));
+*argp++ = filename;
+
+/* copy the original arguments with offset */
+for (i = 1; i < argc; i++)
+*argp++ = argv[i];
+
+*argp++ = NULL;
+
+return get_errno(execv(qemu_execve_path, qemu_argp));
 }
 
 /* do_syscall() should always have a single exit point at the end so
-- 
1.9.1




[Qemu-devel] [PATCH 1/4] linux-user: add option to intercept execve() syscalls

2016-05-25 Thread Joel Holdsworth
From: Petros Angelatos 

In order for one to use QEMU user mode emulation under a chroot, it is
required to use binfmt_misc. This can be avoided by QEMU never doing a
raw execve() to the host system.

Introduce a new option, -execve, that uses the current QEMU interpreter
to intercept execve().

qemu_execve() will prepend the interpreter path , similar to what
binfmt_misc would do, and then pass the modified execve() to the host.

It is necessary to parse hashbang scripts in that function otherwise
the kernel will try to run the interpreter of a script without QEMU and
get an invalid exec format error.

Signed-off-by: Petros Angelatos 
Tested-by: Laurent Vivier 
Reviewed-by: Laurent Vivier 
---
 linux-user/main.c|  36 
 linux-user/qemu.h|   1 +
 linux-user/syscall.c | 117 ++-
 3 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 95ed11d..093f858 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -17,6 +17,7 @@
  *  along with this program; if not, see .
  */
 #include "qemu/osdep.h"
+#include 
 #include 
 #include 
 #include 
@@ -78,6 +79,7 @@ static void usage(int exitcode);
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
 const char *qemu_uname_release;
+const char *qemu_execve_path;
 
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
we allocate a bigger stack. Need a better solution, for example
@@ -3868,6 +3870,38 @@ static void handle_arg_guest_base(const char *arg)
 have_guest_base = 1;
 }
 
+static void handle_arg_execve(const char *arg)
+{
+const char *execfn;
+char buf[PATH_MAX];
+char *ret;
+int len;
+
+/* try getauxval() */
+execfn = (const char *) getauxval(AT_EXECFN);
+
+if (execfn != 0) {
+ret = realpath(execfn, buf);
+
+if (ret != NULL) {
+qemu_execve_path = strdup(buf);
+return;
+}
+}
+
+/* try /proc/self/exe */
+len = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
+
+if (len != -1) {
+buf[len] = '\0';
+qemu_execve_path = strdup(buf);
+return;
+}
+
+fprintf(stderr, "qemu_execve: unable to determine intepreter's path\n");
+exit(EXIT_FAILURE);
+}
+
 static void handle_arg_reserved_va(const char *arg)
 {
 char *p;
@@ -3953,6 +3987,8 @@ static const struct qemu_argument arg_table[] = {
  "uname",  "set qemu uname release string to 'uname'"},
 {"B",  "QEMU_GUEST_BASE",  true,  handle_arg_guest_base,
  "address","set guest_base address to 'address'"},
+{"execve", "QEMU_EXECVE",  false, handle_arg_execve,
+ "",   "use this interpreter when a process calls execve()"},
 {"R",  "QEMU_RESERVED_VA", true,  handle_arg_reserved_va,
  "size",   "reserve 'size' bytes for guest virtual address space"},
 {"d",  "QEMU_LOG", true,  handle_arg_log,
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 208c63e..50fe716 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -138,6 +138,7 @@ void init_task_state(TaskState *ts);
 void task_settid(TaskState *);
 void stop_all_tasks(void);
 extern const char *qemu_uname_release;
+extern const char *qemu_execve_path;
 extern unsigned long mmap_min_addr;
 
 /* ??? See if we can avoid exposing so much of the loader internals.  */
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 032d338..750e381 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -101,6 +101,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
 #include 
 #include 
 #include 
+#include 
 #include "linux_loop.h"
 #include "uname.h"
 
@@ -5847,6 +5848,118 @@ static target_timer_t get_timer_id(abi_long arg)
 return timerid;
 }
 
+/* qemu_execve() Must return target values and target errnos. */
+static abi_long qemu_execve(char *filename, char *argv[],
+  char *envp[])
+{
+char *i_arg = NULL, *i_name = NULL;
+char **new_argp;
+int argc, fd, ret, i, offset = 3;
+char *cp;
+char buf[BINPRM_BUF_SIZE];
+
+/* normal execve case */
+if (qemu_execve_path == NULL || *qemu_execve_path == 0) {
+return get_errno(execve(filename, argv, envp));
+}
+
+for (argc = 0; argv[argc] != NULL; argc++) {
+/* nothing */ ;
+}
+
+fd = open(filename, O_RDONLY);
+if (fd == -1) {
+return get_errno(fd);
+}
+
+ret = read(fd, buf, BINPRM_BUF_SIZE);
+if (ret == -1) {
+close(fd);
+return get_errno(ret);
+}
+
+/* if we have less than 2 bytes, we can guess it is not executable */
+if (ret < 2) {
+close(fd);
+return -host_to_target_errno(ENOEXEC);
+}
+
+close(fd);
+
+/* adapted from the kernel
+ * 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/binfmt_script.c
+ */
+if ((buf[0] == '#')

[Qemu-devel] [PATCH 3/4] linux-user: pass elf interpreter prefix in execve

2016-05-25 Thread Joel Holdsworth
---
 linux-user/syscall.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b95f75a..fb75c09 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5855,7 +5855,7 @@ static abi_long qemu_execve(char *filename, char *argv[],
 char *i_arg = NULL, *i_name = NULL;
 char **qemu_argp, **argp;
 int i, j;
-size_t qemu_argc = 3, argc, host_envc, envpc;
+size_t qemu_argc = 5, argc, host_envc, envpc;
 int fd, ret;
 char *cp;
 size_t def_envc = 0, undef_envc = 0;
@@ -5970,6 +5970,8 @@ static abi_long qemu_execve(char *filename, char *argv[],
 
 /* set up the qemu arguments */
 *argp++ = strdup(qemu_execve_path);
+*argp++ = strdup("-L");
+*argp++ = strdup(path("/"));
 
 /* add arguments for the enironment variables */
 for (i = 0; i < def_envc; i++) {
-- 
1.9.1




[Qemu-devel] linux-user: add option to intercept execve() syscalls

2016-05-25 Thread Joel Holdsworth
This patch-set includes Peter Angelatos's previous patch-set [1] and
adds code to pass arguments for setting the environment variables,
passing the interpeter prefix, and passing the strace option.

[1] https://patchwork.ozlabs.org/patch/582756/




[Qemu-devel] [PATCH 4/4] linux-user: pass strace argument in execve

2016-05-25 Thread Joel Holdsworth
---
 linux-user/syscall.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index fb75c09..314a890 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5966,6 +5966,8 @@ static abi_long qemu_execve(char *filename, char *argv[],
 argc += undef_envc * 2;
 
 /* allocate the argument list */
+if (do_strace)
+qemu_argc++;
 argp = qemu_argp = alloca((qemu_argc + 1) * sizeof(void *));
 
 /* set up the qemu arguments */
@@ -5973,6 +5975,9 @@ static abi_long qemu_execve(char *filename, char *argv[],
 *argp++ = strdup("-L");
 *argp++ = strdup(path("/"));
 
+if (do_strace)
+*argp++ = strdup("-strace");
+
 /* add arguments for the enironment variables */
 for (i = 0; i < def_envc; i++) {
 *argp++ = strdup("-E");
-- 
1.9.1