Re: Unprivileged double fault with GDB and simple program written in assembly

2024-09-10 Thread Luca

Hi,

Il 08/09/24 01:23, Samuel Thibault ha scritto:

Hello,

AIUI, the fix you submitted was meant to fix this?


yes, at least to correctly decode the registers in the fault handler, I 
didn't look deeper yet.


Luca



[PATCH gnumach v3 1/3] add xfloat thread state interface

2024-09-04 Thread Luca Dariz
* i386/i386/fpu.c: extend current getter and setter to support the
  extended state; move the struct casting here to reuse the locking
  and allocation logic for the thread state; make sure the new state
  is set as valid, otherwise it won't be applied; add
  i386_get_xstate_size() to dynamically retrieve the FPU state size.
* i386/i386/fpu.h: update prototypes to accept generic thread state
* i386/i386/pcb.c: forward raw thread state to getter and setter, only
  checking for minimum size and use the new i386_get_xstate_size()
  helper.
* i386/include/mach/i386/mach_i386.defs: expose the new helper
  i386_get_xstate_size().
* i386/include/mach/i386/thread_status.h: add interface definition for
  I386_XFLOAT_STATE and the corresponding data structure.
---
 i386/i386/fpu.c| 139 +++--
 i386/i386/fpu.h|   9 +-
 i386/i386/pcb.c|  37 ++-
 i386/include/mach/i386/mach_i386.defs  |   7 ++
 i386/include/mach/i386/thread_status.h |  12 +++
 5 files changed, 161 insertions(+), 43 deletions(-)

diff --git a/i386/i386/fpu.c b/i386/i386/fpu.c
index 316e3b41..43771e7f 100644
--- a/i386/i386/fpu.c
+++ b/i386/i386/fpu.c
@@ -250,6 +250,17 @@ init_fpu(void)
}
 }
 
+kern_return_t
+i386_get_xstate_size(host_t host, vm_size_t *size)
+{
+   if (host == HOST_NULL)
+   return KERN_INVALID_ARGUMENT;
+
+   *size = sizeof(struct i386_xfloat_state) + fp_xsave_size;
+
+   return KERN_SUCCESS;
+}
+
 /*
  * Initialize FP handling.
  */
@@ -385,10 +396,11 @@ twd_fxsr_to_i387 (struct i386_xfp_save *fxsave)
  * concurrent fpu_set_state or fpu_get_state.
  */
 kern_return_t
-fpu_set_state(const thread_t thread,
- struct i386_float_state *state)
+fpu_set_state(const thread_t thread, void *state, int flavor)
 {
pcb_t pcb = thread->pcb;
+   struct i386_float_state *fstate = (struct i386_float_state*)state;
+   struct i386_xfloat_state *xfstate = (struct i386_xfloat_state*)state;
struct i386_fpsave_state *ifps;
struct i386_fpsave_state *new_ifps;
 
@@ -410,7 +422,8 @@ ASSERT_IPL(SPL0);
}
 #endif
 
-   if (state->initialized == 0) {
+   if ((flavor == i386_FLOAT_STATE && fstate->initialized == 0) ||
+   (flavor == i386_XFLOAT_STATE && xfstate->initialized == 0)) {
/*
 * new FPU state is 'invalid'.
 * Deallocate the fp state if it exists.
@@ -428,13 +441,6 @@ ASSERT_IPL(SPL0);
/*
 * Valid state.  Allocate the fp state if there is none.
 */
-   struct i386_fp_save *user_fp_state;
-   struct i386_fp_regs *user_fp_regs;
-
-   user_fp_state = (struct i386_fp_save *) &state->hw_state[0];
-   user_fp_regs  = (struct i386_fp_regs *)
-   &state->hw_state[sizeof(struct i386_fp_save)];
-
new_ifps = 0;
Retry:
simple_lock(&pcb->lock);
@@ -454,10 +460,43 @@ ASSERT_IPL(SPL0);
 * Ensure that reserved parts of the environment are 0.
 */
memset(ifps, 0, fp_xsave_size);
+   ifps->fp_valid = TRUE;
 
-   if (fp_save_kind != FP_FNSAVE) {
-   int i;
+   if (flavor == i386_FLOAT_STATE) {
+   struct i386_fp_save *user_fp_state;
+   struct i386_fp_regs *user_fp_regs;
+
+   user_fp_state = (struct i386_fp_save *) &fstate->hw_state[0];
+   user_fp_regs  = (struct i386_fp_regs *)
+   &fstate->hw_state[sizeof(struct i386_fp_save)];
 
+   if (fp_save_kind != FP_FNSAVE) {
+   int i;
+
+   ifps->xfp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->xfp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->xfp_save_state.fp_tag = 
twd_i387_to_fxsr(user_fp_state->fp_tag);
+   ifps->xfp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->xfp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->xfp_save_state.fp_opcode  = user_fp_state->fp_opcode;
+   ifps->xfp_save_state.fp_dp  = user_fp_state->fp_dp;
+   ifps->xfp_save_state.fp_ds  = user_fp_state->fp_ds;
+   for (i=0; i<8; i++)
+   memcpy(&ifps->xfp_save_state.fp_reg_word[i], 
&user_fp_regs->fp_reg_word[i], sizeof(user_fp_regs->fp_reg_word[i]));
+   } else {
+   ifps->fp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->fp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->fp_save_state.fp_tag = user_fp_state->fp_tag;
+   ifps->fp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->fp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->fp_save_state.fp_opcode  = user_fp_state->fp_opcode;
+

[PATCH gnumach 3/3] x86_64: fix double fault handler

2024-09-04 Thread Luca Dariz
* x86_64/locore.S: adjust to the changes in the thread state
  structure (segment registers), and add the missing opcode.
---
 x86_64/locore.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86_64/locore.S b/x86_64/locore.S
index 8f39a677..376f41c1 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -435,10 +435,10 @@ ENTRY(start_timer)
 ENTRY(t_dbl_fault)
INT_FIX
cli /* disable interrupts that might corrupt the state*/
+   pushq   $(T_DOUBLE_FAULT)   /* indicate fault type  */
pusha
movq%cr2,%rax
movq%rax,R_CR2-R_R15(%rsp)  /* CR2 might contain the faulting 
address */
-   subq$48,%rsp// FIXME remove when segments are cleaned up
movq%rsp,%rdi   /* pass the saved state */
callhandle_double_fault
jmp cpu_shutdown/* reset */
-- 
2.39.2




[PATCH gnumach v3 2/3] add tests for FLOAT/XFLOAT state

2024-09-04 Thread Luca Dariz
---
 tests/include/testlib.h  |   1 +
 tests/test-thread-state-fp.c | 240 +++
 tests/testlib.c  |  16 +++
 tests/user-qemu.mk   |   3 +-
 4 files changed, 259 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-thread-state-fp.c

diff --git a/tests/include/testlib.h b/tests/include/testlib.h
index 7c7c2b11..17a96660 100644
--- a/tests/include/testlib.h
+++ b/tests/include/testlib.h
@@ -70,6 +70,7 @@ thread_t test_thread_start(task_t task, 
void(*routine)(void*), void* arg);
 mach_port_t host_priv(void);
 mach_port_t device_priv(void);
 void wait_thread_terminated(thread_t th);
+void wait_thread_suspended(thread_t th);
 
 extern vm_size_t vm_page_size;
 
diff --git a/tests/test-thread-state-fp.c b/tests/test-thread-state-fp.c
new file mode 100644
index ..2aab8ca9
--- /dev/null
+++ b/tests/test-thread-state-fp.c
@@ -0,0 +1,240 @@
+/*
+ * Copyright (c) 2024 Free Software Foundation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#if defined(__i386__) || defined(__x86_64__)
+#include 
+
+static void printx(struct i386_xfloat_state *state, int size)
+{
+  printf("xfloat init %d fp %d exc %d\n",
+ state->initialized, state->fpkind, state->exc_status);
+  struct i386_xfp_save *xfp = (struct i386_xfp_save *) &state->hw_state[0];
+  printf("xfp %x %x %x %x %x %x %x %x\n",
+ xfp->fp_control, xfp->fp_status, xfp->fp_tag, xfp->fp_eip,
+ xfp->fp_cs, xfp->fp_opcode, xfp->fp_dp, xfp->fp_ds);
+  for (int i=0; i<8; i++)
+{
+  printf("fp%d", i);
+  for (int j=0; j<16; j++)
+printf(" %02x", xfp->fp_reg_word[i][j]);
+  printf("\n");
+}
+  for (int i=0; i<16; i++)
+{
+  printf("xmm%02d", i);
+  for (int j=0; j<16; j++)
+printf(" %02x", xfp->fp_xreg_word[i][j]);
+  printf("\n");
+}
+  printf("header xfp_features %llx bv %llx\n",
+ xfp->header.xfp_features, xfp->header.xcomp_bv);
+  int rem = size - sizeof(*state) - sizeof(struct i386_xfp_save);
+  if (rem > 0)
+{
+  int iter = 0;
+  while (rem > 0)
+{
+  const int len = 16;
+  int n;
+  if (rem > len)
+n = len;
+  else
+n = rem;
+  printf("ext");
+  for (int j=0; jextended[j + iter*len]);
+  printf("\n");
+  rem -= n;
+  iter++;
+}
+}
+}
+
+static void thread_fp_getset(void *arg)
+{
+  int err;
+  thread_t th = *(thread_t*)arg;
+
+  wait_thread_suspended(th);
+
+  mach_msg_type_number_t state_count = i386_FLOAT_STATE_COUNT;
+  struct i386_float_state state;
+
+  memset(&state, 0, sizeof(state));
+  err = thread_get_state(th, i386_FLOAT_STATE,
+(thread_state_t) &state, &state_count);
+  ASSERT_RET(err, "thread_get_state get failed");
+  ASSERT(state_count == i386_FLOAT_STATE_COUNT, "bad state_count");
+
+  struct i386_fp_regs *fpr =
+  (struct i386_fp_regs *) (&state.hw_state[0] + sizeof(struct 
i386_fp_save));
+
+  printf("fp regs get:\n");
+  for (int i=0; i<8; i++)
+  {
+  printf("fp%d", i);
+  for (int j=0; j<5; j++)
+  printf(" %04x", fpr->fp_reg_word[i][j]);
+  printf("\n");
+  }
+
+  char tmp[10];
+  memcpy(tmp, &fpr->fp_reg_word[1][0], sizeof(tmp));
+  memcpy(&fpr->fp_reg_word[1][0], &fpr->fp_reg_word[0][0], sizeof(tmp));
+  memcpy(&fpr->fp_reg_word[0][0], tmp, sizeof(tmp));
+
+  printf("fp regs set:\n");
+  for (int i=0; i<8; i++)
+  {
+  printf("fp%d", i);
+  for (int j=0; j<5; j++)
+  printf(" %04x", fpr->fp_reg_word[i][j]);
+  printf("\n");
+  }
+
+  err = thread_set_state(th, i386_FLOAT_STATE,
+(thread_state_t) &state, state_count);
+  ASSERT_RET(err, "thread_set_state set failed");
+
+  err = thread_resume(th);
+  ASSERT_RET(err, "error in thread_resume");
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+
+static void test_fp_state_getset()
+{
+  int err;
+  thread_t th = mach_thread_self();
+
+  /* load some known value in FP registers */
+  int n1[1] = {11};
+  float d2[1] = {123.456};
+  asm volatile ("fildl  %0\n"
+"fldl   %1\n"
+:: "m" (n1), "m" (d2) :);
+
+  /* then switch to the ge

Re: Unprivileged double fault with GDB and simple program written in assembly

2024-08-31 Thread Luca

Hi,

Il 28/08/24 15:01, J. E. Marinheiro ha scritto:

At this point, a double fault evidently happens, Mach starts panicking, and the 
registers are dumped:
* RAX=4010DE
* RBX=0
* RCX=1
* RDX=0
* RSI=0
* RDI=0
* RBP=0
* RSP=0
* R8 to R12=0
* EFLAGS=4000CE

The error message is:
`trapno 0: Divide error, error 01402ff8'
`panic ../i386/i386/trap.c:677: handle_double_fault: DOUBLE FAULT! This is 
critical'


I can reproduce the issues, both with and without gdb; in particular, I 
see that there is a bug in decoding the registers after a double fault; 
Fixing that I see


RAX 003c RBX 
RCX 004000de RDX 
RSI 004010de RDI 
RBP 0040 RSP 
R8  1403 R9  
R10  R11 
R12  R13 
R14  R15 
RIP 81012164 EFLAGS 00010102

where RIP seems to point the the syscall64 entry. It's also weird to 
have RSP=0 but that might be due to the bad state of the program being 
debugged.


GDB correctly prints errors related to signals, as there is no signal 
thread to handle them:


Thread 4 received signal ?, Unknown signal.
_start () at program.S:11
11mov $60, %rax
warning: Signal ? does not exist on this system.
warning: Can't deliver signal ?: No signal thread.

The signal seems caused by an invalid opcode exception after the second 
syscall, sent by the kernel and handled by GDB. From a kernel trace I 
collected it seems that the exception is sent twice, and it seems that 
somewhere after the second exception, when the program is resumed, the 
double fault happens. There might be an issue with restarting twice a 
task with an invalid state, if exceptions can be delivered; without GDB, 
the kernel terminates the task at the first error, as there is no 
exception port set.



When not using GDB, the program is simply killed by the system and nothing bad 
seems to happen. I'm guessing Linux syscalls need not be the same as Mach 
syscalls, but a double fault from some faulty program shouldn't trigger a panic 
without even root privileges.


The kernel has no idea about unix users, AFAIK in userspace the 
difference is basically the access to some privileged mach ports, but 
this is implemented in glibc and the hurd servers.



Luca




Re: [PATCH gnumach 2/3] add tests for FLOAT/XFLOAT state

2024-08-24 Thread Luca Dariz

Il 23/08/24 11:50, Sergey Bugaev ha scritto:

On Wed, Aug 21, 2024 at 7:37 PM Luca Dariz  wrote:

+#include 
+
+static void printx(struct i386_xfloat_state *state, int size)
+{
+  printf("xfloat init %d fp %d exc %d\n",
+ state->initialized, state->fpkind, state->exc_status);
+  struct i386_xfp_save *xfp = (struct i386_xfp_save *) &state->hw_state[0];
+  printf("xfp %d %d %d %d %d %d %d %d\n",
+ xfp->fp_control, xfp->fp_status, xfp->fp_tag, xfp->fp_eip,
+ xfp->fp_cs, xfp->fp_opcode, xfp->fp_dp, xfp->fp_ds);


Please make sure to disable this on non-x86. You can disable the whole
test, I guess, or -- would testing something like this also make sense
on other architectures?


Make it can just fail on other archs, I guess in general we can expect 
such test to be needed.


Luca



Re: [PATCH v2 gnumach 1/3] add xfloat thread state interface

2024-08-24 Thread Luca Dariz

Il 22/08/24 23:26, Samuel Thibault ha scritto:

Hello,

Thanks for the improved version!

Luca Dariz, le mer. 21 août 2024 18:36:14 +0200, a ecrit:

@@ -495,10 +534,11 @@ ASSERT_IPL(SPL0);
   * concurrent fpu_set_state or fpu_get_state.
   */
  kern_return_t
-fpu_get_state(const thread_t thread,
- struct i386_float_state *state)
+fpu_get_state(const thread_t thread, void *state, int flavor)
  {
pcb_t pcb = thread->pcb;
+   struct i386_float_state *fstate = (struct i386_float_state*)state;
+   struct i386_xfloat_state *xfstate = (struct i386_xfloat_state*)state;
struct i386_fpsave_state *ifps;
  
  ASSERT_IPL(SPL0);

@@ -512,7 +552,10 @@ ASSERT_IPL(SPL0);
 * No valid floating-point state.
 */
simple_unlock(&pcb->lock);
-   memset(state, 0, sizeof(struct i386_float_state));
+if (flavor == i386_FLOAT_STATE)
+memset(state, 0, sizeof(struct i386_float_state));
+else if (flavor == i386_XFLOAT_STATE)
+memset(state, 0, sizeof(struct i386_xfloat_state));


I guess we should also memset to 0 the fp_xsave_size - sizeof(struct
i386_xfp_save) part, to avoid leaking data? Thus just pass fp_xsave_size
rather than sizeof(struct i386_xfloat_state).


diff --git a/i386/include/mach/i386/thread_status.h 
b/i386/include/mach/i386/thread_status.h
index 94596a74..e5632ed6 100644
--- a/i386/include/mach/i386/thread_status.h
+++ b/i386/include/mach/i386/thread_status.h
@@ -148,6 +149,20 @@ struct i386_float_state {
  };
  #define i386_FLOAT_STATE_COUNT (sizeof(struct 
i386_float_state)/sizeof(unsigned int))
  
+#define XFP_STATE_BYTES (sizeof (struct i386_xfp_save))

+
+struct i386_xfloat_state {
+   int fpkind; /* FP_NO..FP_387X (readonly) */
+   int initialized;
+   int exc_status; /* exception status (readonly) 
*/
+   int fp_save_kind;   /* format of hardware state */
+   unsigned char   hw_state[XFP_STATE_BYTES]; /* actual "hardware" state */


I'm wondering if it's really useful to use XFP_STATE_BYTES here, since
callers are supposed to be able to support the extended part. Better
make hw_state zero-sized so that any miss in supporting the extended
part will hopefully be much more probably caught. That'll also make
i386_get_xstate_size simpler.


Good point, I'll fix it.


Luca




Re: [PATCH v2 gnumach 1/3] add xfloat thread state interface

2024-08-24 Thread Luca Dariz

Il 22/08/24 23:51, Samuel Thibault ha scritto:

Luca Dariz, le mer. 21 août 2024 18:36:14 +0200, a ecrit:

+   } else if (flavor == i386_XFLOAT_STATE) {
+   int i;
+   struct i386_xfp_save *user_fp_state = (struct i386_xfp_save *) 
&xfstate->hw_state[0];


We should also check that user_fp_state->fp_save_kind == fp_save_kind.


Right, will do.

Luca




Re: [PATCH v2 gnumach 1/3] add xfloat thread state interface

2024-08-24 Thread Luca Dariz

Il 22/08/24 23:44, Samuel Thibault ha scritto:

Luca Dariz, le mer. 21 août 2024 18:36:14 +0200, a ecrit:

diff --git a/i386/include/mach/i386/mach_i386.defs 
b/i386/include/mach/i386/mach_i386.defs
index 965d5c3b..61fed222 100644
--- a/i386/include/mach/i386/mach_i386.defs
+++ b/i386/include/mach/i386/mach_i386.defs
@@ -111,3 +111,9 @@ routine i386_get_gdt(
target_thread   : thread_t;
selector: int;
out desc: descriptor_t);
+
+/* Returns the size in bytes of the FPU extended state, to be used
+   with i386_XFLOAT_STATE instead of i386_XFLOAT_STATE_COUNT */
+routinei386_get_xstate_size(
+   host: host_t;
+   out size: natural_t);


In principle that should rather be a vm_size_t?


Yes, this was to match with thread_*_state(), which use 
mach_msg_type_number_t which is a natural_t, and considering that on 64 
bit sizeof(natural_t) < sizeof(vm_size_t).


If this makes sense, I'll document it here.

Luca




[PATCH gnumach 3/3] add rpc interrupted test

2024-08-21 Thread Luca Dariz
* tests/test-machmsg.c: add two use cases used by glibc during signal
  handling
* tests/include/testlib.h
* tests/testlib.c: add new wait_thread_terminated() helper
---
 tests/include/testlib.h |  1 +
 tests/test-machmsg.c| 80 +
 tests/testlib.c | 15 
 3 files changed, 96 insertions(+)

diff --git a/tests/include/testlib.h b/tests/include/testlib.h
index 1d08067b..737efd41 100644
--- a/tests/include/testlib.h
+++ b/tests/include/testlib.h
@@ -70,6 +70,7 @@ thread_t test_thread_start(task_t task, 
void(*routine)(void*), void* arg);
 mach_port_t host_priv(void);
 mach_port_t device_priv(void);
 void wait_thread_suspended(thread_t th);
+void wait_thread_terminated(thread_t th);
 
 extern vm_size_t vm_page_size;
 
diff --git a/tests/test-machmsg.c b/tests/test-machmsg.c
index ac292376..7f535bde 100644
--- a/tests/test-machmsg.c
+++ b/tests/test-machmsg.c
@@ -499,6 +499,83 @@ void test_msg_emptydesc(void)
 }
 
 
+void recv_to_be_interrupted(void *arg)
+{
+  mach_msg_header_t msg;
+  kern_return_t ret;
+  mach_port_t rcv_name;
+  long err = (long)arg;
+
+  ret = mach_port_allocate(mach_task_self (), MACH_PORT_RIGHT_RECEIVE, 
&rcv_name);
+  ASSERT_RET(ret, "creating rx port");
+
+  ret = mach_msg(&msg, MACH_RCV_MSG,
+ 0, sizeof(msg), rcv_name,
+ MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+
+  printf("mach_msg returned %x\n", ret);
+  ASSERT(ret == err, "recv not interrupted correctly");
+
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+
+void test_recv_interrupted(void)
+{
+  kern_return_t ret;
+  thread_t th;
+  th = test_thread_start(mach_task_self(), recv_to_be_interrupted, 
(void*)MACH_RCV_INTERRUPTED);
+  msleep(100);
+
+  ret = thread_suspend(th);
+  ASSERT_RET(ret, "thread_suspend");
+
+  ret = thread_abort(th);
+  ASSERT_RET(ret, "thread_abort");
+
+  ret = thread_resume(th);
+  ASSERT_RET(ret, "thread_resume");
+
+  wait_thread_terminated(th);
+}
+
+void test_recv_interrupted_setreturn(void)
+{
+  kern_return_t ret;
+  thread_t th;
+  th = test_thread_start(mach_task_self(), recv_to_be_interrupted, 
(void*)123L);
+  msleep(100);
+
+  ret = thread_suspend(th);
+  ASSERT_RET(ret, "thread_suspend");
+
+  ret = thread_abort(th);
+  ASSERT_RET(ret, "thread_abort");
+
+
+  struct i386_thread_state state;
+  unsigned int count;
+  count = i386_THREAD_STATE_COUNT;
+  ret = thread_get_state(th, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, &count);
+  ASSERT_RET(ret, "thread_get_state()");
+
+#ifdef __i386__
+  state.eax = 123;
+#elif defined(__x86_64__)
+  state.rax = 123;
+#endif
+  ret = thread_set_state(th, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, i386_THREAD_STATE_COUNT);
+  ASSERT_RET(ret, "thread_set_state");
+
+  ret = thread_resume(th);
+  ASSERT_RET(ret, "thread_resume");
+
+  wait_thread_terminated(th);
+}
+
+
 int
 main (int argc, char *argv[], int envc, char *envp[])
 {
@@ -512,5 +589,8 @@ main (int argc, char *argv[], int envc, char *envp[])
   test_msg_emptydesc();
   printf("test_iters()\n");
   test_iterations();
+  printf("test_recv_interrupted()\n");
+  test_recv_interrupted();
+  test_recv_interrupted_setreturn();
   return 0;
 }
diff --git a/tests/testlib.c b/tests/testlib.c
index df9d7113..4200979d 100644
--- a/tests/testlib.c
+++ b/tests/testlib.c
@@ -225,6 +225,21 @@ void wait_thread_suspended(thread_t th)
   } while (1);
 }
 
+void wait_thread_terminated(thread_t th)
+{
+  int err;
+  struct thread_basic_info info;
+  mach_msg_type_number_t count;
+  do {
+count = THREAD_BASIC_INFO_COUNT;
+err = thread_info(th, THREAD_BASIC_INFO, (thread_info_t)&info, &count);
+if (err == MACH_SEND_INVALID_DEST)
+break;
+ASSERT_RET(err, "error in thread_info");
+msleep(100); // don't poll continuously
+  } while (1);
+}
+
 /*
  * Minimal _start() for test modules, we just take the arguments from the
  * kernel, call main() and reboot. As in glibc, we expect the argument pointer
-- 
2.39.2




[PATCH gnumach 2/3] add tests for FLOAT/XFLOAT state

2024-08-21 Thread Luca Dariz
---
 tests/include/testlib.h  |   1 +
 tests/test-thread-state-fp.c | 232 +++
 tests/testlib.c  |  16 +++
 tests/user-qemu.mk   |   3 +-
 4 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-thread-state-fp.c

diff --git a/tests/include/testlib.h b/tests/include/testlib.h
index 035fdc28..1d08067b 100644
--- a/tests/include/testlib.h
+++ b/tests/include/testlib.h
@@ -69,6 +69,7 @@ thread_t test_thread_start(task_t task, 
void(*routine)(void*), void* arg);
 
 mach_port_t host_priv(void);
 mach_port_t device_priv(void);
+void wait_thread_suspended(thread_t th);
 
 extern vm_size_t vm_page_size;
 
diff --git a/tests/test-thread-state-fp.c b/tests/test-thread-state-fp.c
new file mode 100644
index ..985ca4d6
--- /dev/null
+++ b/tests/test-thread-state-fp.c
@@ -0,0 +1,232 @@
+/*
+ * Copyright (c) 2024 Free Software Foundation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void printx(struct i386_xfloat_state *state, int size)
+{
+  printf("xfloat init %d fp %d exc %d\n",
+ state->initialized, state->fpkind, state->exc_status);
+  struct i386_xfp_save *xfp = (struct i386_xfp_save *) &state->hw_state[0];
+  printf("xfp %d %d %d %d %d %d %d %d\n",
+ xfp->fp_control, xfp->fp_status, xfp->fp_tag, xfp->fp_eip,
+ xfp->fp_cs, xfp->fp_opcode, xfp->fp_dp, xfp->fp_ds);
+  for (int i=0; i<8; i++)
+{
+  printf("fp%d", i);
+  for (int j=0; j<16; j++)
+printf(" %02x", xfp->fp_reg_word[i][j]);
+  printf("\n");
+}
+  for (int i=0; i<16; i++)
+{
+  printf("xmm%02d", i);
+  for (int j=0; j<16; j++)
+printf(" %02x", xfp->fp_xreg_word[i][j]);
+  printf("\n");
+}
+  printf("header xfp_features %llx bv %llx\n",
+ xfp->header.xfp_features, xfp->header.xcomp_bv);
+  if (size > sizeof(*state))
+{
+  int rem = size - sizeof(*state), iter = 0;
+  while (rem > 0)
+{
+  const int len = 16;
+  int n;
+  if (rem > len)
+n = len;
+  else
+n = rem;
+  printf("ext");
+  for (int j=0; jextended[j + iter*len]);
+  printf("\n");
+  rem -= n;
+  iter++;
+}
+}
+}
+
+static void thread_fp_getset(void *arg)
+{
+  int err;
+  thread_t th = *(thread_t*)arg;
+
+  wait_thread_suspended(th);
+
+  mach_msg_type_number_t state_count = i386_FLOAT_STATE_COUNT;
+  struct i386_float_state state;
+
+  memset(&state, 0, sizeof(state));
+  err = thread_get_state(th, i386_FLOAT_STATE,
+(thread_state_t) &state, &state_count);
+  ASSERT_RET(err, "thread_get_state get failed");
+  ASSERT(state_count == i386_FLOAT_STATE_COUNT, "bad state_count");
+
+  struct i386_fp_regs *fpr =
+  (struct i386_fp_regs *) (&state.hw_state[0] + sizeof(struct 
i386_fp_save));
+
+  printf("fp regs get:\n");
+  for (int i=0; i<8; i++)
+  {
+  printf("fp%d", i);
+  for (int j=0; j<5; j++)
+  printf(" %04x", fpr->fp_reg_word[i][j]);
+  printf("\n");
+  }
+
+  char tmp[10];
+  memcpy(tmp, &fpr->fp_reg_word[1][0], sizeof(tmp));
+  memcpy(&fpr->fp_reg_word[1][0], &fpr->fp_reg_word[0][0], sizeof(tmp));
+  memcpy(&fpr->fp_reg_word[0][0], tmp, sizeof(tmp));
+
+  printf("fp regs set:\n");
+  for (int i=0; i<8; i++)
+  {
+  printf("fp%d", i);
+  for (int j=0; j<5; j++)
+  printf(" %04x", fpr->fp_reg_word[i][j]);
+  printf("\n");
+  }
+
+  err = thread_set_state(th, i386_FLOAT_STATE,
+(thread_state_t) &state, state_count);
+  ASSERT_RET(err, "thread_set_state set failed");
+
+  err = thread_resume(th);
+  ASSERT_RET(err, "error in thread_resume");
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+
+static void test_fp_state_getset()
+{
+  int err;
+  thread_t th = mach_thread_self();
+
+  /* load some known value in FP registers */
+  int n1[1] = {11};
+  float d2[1] = {123.456};
+  asm volatile ("fildl  %0\n"
+"fldl   %1\n"
+:: "m" (n1), "m" (d2) :);
+
+  /* then switch to the get/set test thread, and wait to be resumed */
+  test_thread_start(mach_task_self(), thread_fp_getset, &th);
+  er

[PATCH v2 gnumach 1/3] add xfloat thread state interface

2024-08-21 Thread Luca Dariz
* i386/i386/fpu.c: extend current getter and setter to support the
  extended state; move the struct casting here to reuse the locking
  and allocation logic for the thread state; make sure the new state
  is set as valid, otherwise it won't be applied; add
  i386_get_xstate_size() to dynamically retrieve the FPU state size.
* i386/i386/fpu.h: update prototypes to accept generic thread state
* i386/i386/pcb.c: forward raw thread state to getter and setter, only
  checking for minimum size and use the new i386_get_xstate_size()
  helper.
* i386/include/mach/i386/mach_i386.defs: expose the new helper
  i386_get_xstate_size().
* i386/include/mach/i386/thread_status.h: add interface definition for
  I386_XFLOAT_STATE and the corresponding data structure.
---
 i386/i386/fpu.c| 139 +++--
 i386/i386/fpu.h|   9 +-
 i386/i386/pcb.c|  37 ++-
 i386/include/mach/i386/mach_i386.defs  |   6 ++
 i386/include/mach/i386/thread_status.h |  15 +++
 5 files changed, 163 insertions(+), 43 deletions(-)

diff --git a/i386/i386/fpu.c b/i386/i386/fpu.c
index 316e3b41..2d8a5887 100644
--- a/i386/i386/fpu.c
+++ b/i386/i386/fpu.c
@@ -250,6 +250,17 @@ init_fpu(void)
}
 }
 
+kern_return_t
+i386_get_xstate_size(host_t host, natural_t *size)
+{
+   if (host == HOST_NULL)
+   return KERN_INVALID_ARGUMENT;
+
+   *size = sizeof(struct i386_xfloat_state) + fp_xsave_size - 
sizeof(struct i386_xfp_save);
+
+   return KERN_SUCCESS;
+}
+
 /*
  * Initialize FP handling.
  */
@@ -385,10 +396,11 @@ twd_fxsr_to_i387 (struct i386_xfp_save *fxsave)
  * concurrent fpu_set_state or fpu_get_state.
  */
 kern_return_t
-fpu_set_state(const thread_t thread,
- struct i386_float_state *state)
+fpu_set_state(const thread_t thread, void *state, int flavor)
 {
pcb_t pcb = thread->pcb;
+   struct i386_float_state *fstate = (struct i386_float_state*)state;
+   struct i386_xfloat_state *xfstate = (struct i386_xfloat_state*)state;
struct i386_fpsave_state *ifps;
struct i386_fpsave_state *new_ifps;
 
@@ -410,7 +422,8 @@ ASSERT_IPL(SPL0);
}
 #endif
 
-   if (state->initialized == 0) {
+   if ((flavor == i386_FLOAT_STATE && fstate->initialized == 0) ||
+   (flavor == i386_XFLOAT_STATE && xfstate->initialized == 0)) {
/*
 * new FPU state is 'invalid'.
 * Deallocate the fp state if it exists.
@@ -428,13 +441,6 @@ ASSERT_IPL(SPL0);
/*
 * Valid state.  Allocate the fp state if there is none.
 */
-   struct i386_fp_save *user_fp_state;
-   struct i386_fp_regs *user_fp_regs;
-
-   user_fp_state = (struct i386_fp_save *) &state->hw_state[0];
-   user_fp_regs  = (struct i386_fp_regs *)
-   &state->hw_state[sizeof(struct i386_fp_save)];
-
new_ifps = 0;
Retry:
simple_lock(&pcb->lock);
@@ -454,10 +460,43 @@ ASSERT_IPL(SPL0);
 * Ensure that reserved parts of the environment are 0.
 */
memset(ifps, 0, fp_xsave_size);
+   ifps->fp_valid = TRUE;
 
-   if (fp_save_kind != FP_FNSAVE) {
-   int i;
+   if (flavor == i386_FLOAT_STATE) {
+   struct i386_fp_save *user_fp_state;
+   struct i386_fp_regs *user_fp_regs;
+
+   user_fp_state = (struct i386_fp_save *) &fstate->hw_state[0];
+   user_fp_regs  = (struct i386_fp_regs *)
+   &fstate->hw_state[sizeof(struct i386_fp_save)];
 
+   if (fp_save_kind != FP_FNSAVE) {
+   int i;
+
+   ifps->xfp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->xfp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->xfp_save_state.fp_tag = 
twd_i387_to_fxsr(user_fp_state->fp_tag);
+   ifps->xfp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->xfp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->xfp_save_state.fp_opcode  = user_fp_state->fp_opcode;
+   ifps->xfp_save_state.fp_dp  = user_fp_state->fp_dp;
+   ifps->xfp_save_state.fp_ds  = user_fp_state->fp_ds;
+   for (i=0; i<8; i++)
+   memcpy(&ifps->xfp_save_state.fp_reg_word[i], 
&user_fp_regs->fp_reg_word[i], sizeof(user_fp_regs->fp_reg_word[i]));
+   } else {
+   ifps->fp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->fp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->fp_save_state.fp_tag = user_fp_state->fp_tag;
+   ifps->fp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->fp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->fp_save_state.fp_opcod

Re: [PATCH] add xfloat thread state interface

2024-08-17 Thread Luca Dariz

Hi,

Il 05/08/24 20:28, Sergey Bugaev ha scritto:

On Mon, Aug 5, 2024 at 9:23 PM Samuel Thibault  wrote:

Luca Dariz, le lun. 05 août 2024 14:52:24 +0200, a ecrit:

Il 05/08/24 00:32, Samuel Thibault ha scritto:

Luca Dariz, le ven. 02 août 2024 17:32:34 +0200, a ecrit:

+#define XFP_STATE_BYTES (sizeof (struct i386_xfp_save))


That is not sufficient: depending on the sse level and the saving style,
we have various amount of data to store. See fp_xsave_size computed in
init_fpu, we have to save all of that. We thus have to make get_state
check that the user-land buffer is large enough for that. And as I
mentioned on irc, the i386_xfloat_state stucture should contain the
fp_save_kind for glibc to know how to restore it.



Right, maybe we also need a way for userspace to discover the required size?
e.g. with a separate call i386_get_xstate_size().


I was thinking that perhaps glibc can compute it itself with cpuid, but
it'll be simpler to add some call to gnumach indeed.

Samuel


I haven't looked closely, but just a quick reminder that MIG can
return variable-sized data just fine, and the caller (glibc) will then
learn the size after having received the data.

The machine_thread_all_state abstraction (that glibc uses internally)
does make this somewhat awkward though.


AFAIK in thread_get_state() the caller needs to allocate enough space in 
any case, even if the exact amount is not known, as handling 
MIG_ARRAY_TOO_LARGE would lead to a partial state save. We could check 
if there is an upper bound on this, but it might result in more 
maintenance work if in the future the extended state is increased to a 
bigger size.


Retrieving the amount with an arch-specific call, we only need to handle 
this in gnumach (my understanding is that glibc just needs to save and 
restore the xfp state, without inspecting it, during signal handling).


Luca




Re: [PATCH] add xfloat thread state interface

2024-08-05 Thread Luca Dariz

Il 05/08/24 00:32, Samuel Thibault ha scritto:

Hello,

Luca Dariz, le ven. 02 août 2024 17:32:34 +0200, a ecrit:

+#define XFP_STATE_BYTES (sizeof (struct i386_xfp_save))


That is not sufficient: depending on the sse level and the saving style,
we have various amount of data to store. See fp_xsave_size computed in
init_fpu, we have to save all of that. We thus have to make get_state
check that the user-land buffer is large enough for that. And as I
mentioned on irc, the i386_xfloat_state stucture should contain the
fp_save_kind for glibc to know how to restore it.



Right, maybe we also need a way for userspace to discover the required 
size? e.g. with a separate call i386_get_xstate_size().


Luca



[PATCH] add xfloat thread state interface

2024-08-02 Thread Luca Dariz
* i386/i386/fpu.c: extend current getter and setter to support the
  extended state; move the struct casting here to reuse the locking
  and allocation logic for the thread state.
* i386/i386/fpu.h: update prototypes to accept generic thread state
* i386/i386/pcb.c: forward raw thread state to getter and setter, only
  checking for minimum size.
* i386/include/mach/i386/thread_status.h: add interface definition for
  I386_XFLOAT_STATE and the corresponding data structure.
---
 i386/i386/fpu.c| 115 ++---
 i386/i386/fpu.h|   8 +-
 i386/i386/pcb.c|  23 -
 i386/include/mach/i386/thread_status.h |  12 +++
 4 files changed, 115 insertions(+), 43 deletions(-)

diff --git a/i386/i386/fpu.c b/i386/i386/fpu.c
index 316e3b41..4f783293 100644
--- a/i386/i386/fpu.c
+++ b/i386/i386/fpu.c
@@ -385,10 +385,11 @@ twd_fxsr_to_i387 (struct i386_xfp_save *fxsave)
  * concurrent fpu_set_state or fpu_get_state.
  */
 kern_return_t
-fpu_set_state(const thread_t thread,
- struct i386_float_state *state)
+fpu_set_state(const thread_t thread, void *state, int flavor)
 {
pcb_t pcb = thread->pcb;
+   struct i386_float_state *fstate = (struct i386_float_state*)state;
+   struct i386_xfloat_state *xfstate = (struct i386_xfloat_state*)state;
struct i386_fpsave_state *ifps;
struct i386_fpsave_state *new_ifps;
 
@@ -410,7 +411,8 @@ ASSERT_IPL(SPL0);
}
 #endif
 
-   if (state->initialized == 0) {
+   if ((flavor == i386_FLOAT_STATE && fstate->initialized == 0) ||
+   (flavor == i386_XFLOAT_STATE && xfstate->initialized == 0)) {
/*
 * new FPU state is 'invalid'.
 * Deallocate the fp state if it exists.
@@ -428,13 +430,6 @@ ASSERT_IPL(SPL0);
/*
 * Valid state.  Allocate the fp state if there is none.
 */
-   struct i386_fp_save *user_fp_state;
-   struct i386_fp_regs *user_fp_regs;
-
-   user_fp_state = (struct i386_fp_save *) &state->hw_state[0];
-   user_fp_regs  = (struct i386_fp_regs *)
-   &state->hw_state[sizeof(struct i386_fp_save)];
-
new_ifps = 0;
Retry:
simple_lock(&pcb->lock);
@@ -455,9 +450,41 @@ ASSERT_IPL(SPL0);
 */
memset(ifps, 0, fp_xsave_size);
 
-   if (fp_save_kind != FP_FNSAVE) {
-   int i;
+   if (flavor == i386_FLOAT_STATE) {
+   struct i386_fp_save *user_fp_state;
+   struct i386_fp_regs *user_fp_regs;
+
+   user_fp_state = (struct i386_fp_save *) &fstate->hw_state[0];
+   user_fp_regs  = (struct i386_fp_regs *)
+   &fstate->hw_state[sizeof(struct i386_fp_save)];
 
+   if (fp_save_kind != FP_FNSAVE) {
+   int i;
+
+   ifps->xfp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->xfp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->xfp_save_state.fp_tag = 
twd_i387_to_fxsr(user_fp_state->fp_tag);
+   ifps->xfp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->xfp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->xfp_save_state.fp_opcode  = user_fp_state->fp_opcode;
+   ifps->xfp_save_state.fp_dp  = user_fp_state->fp_dp;
+   ifps->xfp_save_state.fp_ds  = user_fp_state->fp_ds;
+   for (i=0; i<8; i++)
+   memcpy(&ifps->xfp_save_state.fp_reg_word[i], 
&user_fp_regs->fp_reg_word[i], sizeof(user_fp_regs->fp_reg_word[i]));
+   } else {
+   ifps->fp_save_state.fp_control = user_fp_state->fp_control;
+   ifps->fp_save_state.fp_status  = user_fp_state->fp_status;
+   ifps->fp_save_state.fp_tag = user_fp_state->fp_tag;
+   ifps->fp_save_state.fp_eip = user_fp_state->fp_eip;
+   ifps->fp_save_state.fp_cs  = user_fp_state->fp_cs;
+   ifps->fp_save_state.fp_opcode  = user_fp_state->fp_opcode;
+   ifps->fp_save_state.fp_dp  = user_fp_state->fp_dp;
+   ifps->fp_save_state.fp_ds  = user_fp_state->fp_ds;
+   ifps->fp_regs = *user_fp_regs;
+   }
+   } else if (flavor == i386_XFLOAT_STATE) {
+   int i;
+   struct i386_xfp_save *user_fp_state = (struct i386_xfp_save *) 
&xfstate->hw_state[0];
ifps->xfp_save_state.fp_control = user_fp_state->fp_control;
ifps->xfp_save_state.fp_status  = user_fp_state->fp_status;
ifps->xfp_save_state.fp_tag = 
twd_i387_to_fxsr(user_fp_state->fp_tag);
@@ -467,17 +494,11 @@ ASSERT_IPL(SPL0);
ifps->xfp_save_state.fp_dp  = user_fp_state->fp_dp;
ifps->x

[PATCH 2/2] tests/machmsg: check rx message size on different code paths

2024-06-11 Thread Luca Dariz
* tests/test-machmsg.c: add more combinations to existing cases:
  - make tx and rx ports independent in the send/receive tests
  - add two more variants for send/receive tests, using two separate
system calls, using different code paths in mach_msg().
---
 tests/test-machmsg.c | 117 +--
 1 file changed, 114 insertions(+), 3 deletions(-)

diff --git a/tests/test-machmsg.c b/tests/test-machmsg.c
index 60f3f49f..ac292376 100644
--- a/tests/test-machmsg.c
+++ b/tests/test-machmsg.c
@@ -40,6 +40,7 @@ static uint32_t align(uint32_t val, size_t aln)
 
 struct echo_params
 {
+  mach_port_t tx_port;
   mach_port_t rx_port;
   mach_msg_size_t rx_size;
   mach_msg_size_t rx_number;
@@ -110,6 +111,7 @@ test_iterations (void)
 
   struct echo_params params;
   params.rx_port = port;
+  params.tx_port = port;
   params.rx_size = sizeof(message.header) + sizeof(message.type) + 5;
   ALIGN_INLINE(params.rx_size, MACH_MSG_USER_ALIGNMENT);
   params.rx_number = TEST_ITERATIONS;
@@ -183,6 +185,7 @@ run_test_simple(void *msg, mach_msg_size_t msglen, 
mach_msg_id_t msgid)
   ASSERT_RET(err, "syscall_mach_port_allocate 2");
 
   struct echo_params params;
+  params.tx_port = MACH_PORT_NULL;
   params.rx_port = port;
   params.rx_size = msglen;
   params.rx_number = 1;
@@ -208,6 +211,63 @@ run_test_simple(void *msg, mach_msg_size_t msglen, 
mach_msg_id_t msgid)
   ASSERT(head->msgh_size == msglen, "wrong size in final rx");
 }
 
+/* same as run_test_simple(), but use two different sysccalls for tx and rx */
+void
+run_test_simple_split(void *msg, mach_msg_size_t msglen, mach_msg_id_t msgid)
+{
+  mach_msg_header_t *head = msg;
+  mach_port_t port, receive;
+  int err;
+
+  err = syscall_mach_port_allocate (mach_task_self (),
+MACH_PORT_RIGHT_RECEIVE, &port);
+  ASSERT_RET(err, "syscall_mach_port_allocate");
+
+  err = syscall_mach_port_allocate (mach_task_self (),
+MACH_PORT_RIGHT_RECEIVE, &receive);
+  ASSERT_RET(err, "syscall_mach_port_allocate 2");
+
+  struct echo_params params;
+  params.tx_port = receive;
+  params.rx_port = port;
+  params.rx_size = msglen;
+  params.rx_number = 1;
+  test_thread_start (mach_task_self (), echo_thread, ¶ms);
+
+  head->msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_MAKE_SEND,
+   MACH_MSG_TYPE_MAKE_SEND_ONCE);
+  head->msgh_remote_port = port;
+  head->msgh_local_port = receive;
+  head->msgh_id = msgid;
+  head->msgh_size = 0;  // check that the echo thread receives the correct size
+
+  err = mach_msg (msg,
+  MACH_SEND_MSG,
+  msglen,
+  0,
+  MACH_PORT_NULL,
+  MACH_MSG_TIMEOUT_NONE,
+  MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg tx");
+
+  memset(msg, 0, msglen);
+
+  err = mach_msg (msg,
+  MACH_RCV_MSG,
+  0,
+  msglen,
+  receive,
+  MACH_MSG_TIMEOUT_NONE,
+  MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg rx");
+
+  printf("size in final rx: %d expected %d\n", head->msgh_size, msglen);
+  ASSERT(head->msgh_size == msglen, "wrong size in final rx");
+}
+
+/* Text tx and rx of a message, without using a different thread. We
+ * also use the same port to send and receive the message.
+ */
 void
 run_test_simple_self(void *msg, mach_msg_size_t msglen, mach_msg_id_t msgid)
 {
@@ -222,9 +282,6 @@ run_test_simple_self(void *msg, mach_msg_size_t msglen, 
mach_msg_id_t msgid)
   head->msgh_bits
 = MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND,
   MACH_MSG_TYPE_MAKE_SEND_ONCE);
-  /* head->msgh_bits */
-  /*   = MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND_ONCE, */
-  /* MACH_MSG_TYPE_COPY_SEND); */
 
   head->msgh_bits |= MACH_MSGH_BITS_COMPLEX;
   head->msgh_remote_port = port;
@@ -245,6 +302,52 @@ run_test_simple_self(void *msg, mach_msg_size_t msglen, 
mach_msg_id_t msgid)
   ASSERT(head->msgh_size == msglen, "wrong size in final rx\n");
 }
 
+/* same as run_test_simple_self(), but use two different sysccalls for tx and 
rx */
+void
+run_test_simple_self_split(void *msg, mach_msg_size_t msglen, mach_msg_id_t 
msgid)
+{
+  mach_msg_header_t *head = msg;
+  mach_port_t port, receive;
+  int err;
+
+  err = syscall_mach_port_allocate (mach_task_self (),
+MACH_PORT_RIGHT_RECEIVE, &port);
+  ASSERT_RET(err, "syscall_mach_port_allocate");
+
+  head->msgh_bits
+= MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND,
+  MACH_MSG_TYPE_MAKE_SEND_ONCE);
+
+  head->msgh_bits |= MACH_MSGH_BITS_COMPLEX;
+  head->msgh_remote_port = port;
+  head->msgh_local_port = port;
+  head->msgh_id = msgid;
+  head->msgh_size = msglen;
+
+  err = mach_msg (msg,
+  MACH_SEND_MSG,
+  msglen,
+  0,
+  port,
+  MACH_MSG_TIMEOUT_NON

[PATCH 1/2] x86_64: fix msg size forwarding in case it's not set by userspace

2024-06-11 Thread Luca Dariz
* ipc/copy_user.c: recent MIG stubs should always fill the size
  correctly in the msg header, but we shouldn't rely on that. Instead,
  we use the size that was correctly copied-in, overwriting the value
  in the header. This is already done by the 32-bit copyinmsg(), and
  was missing in the 64-bit version.
  Furthermore, the assertion about user/kernel size make sense with
  and without USER32, so take it out if the #ifdef.
---
 ipc/copy_user.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ipc/copy_user.c b/ipc/copy_user.c
index a4b238de..850ea49e 100644
--- a/ipc/copy_user.c
+++ b/ipc/copy_user.c
@@ -442,16 +442,18 @@ int copyinmsg (const void *userbuf, void *kernelbuf, 
const size_t usize, const s
 }
 
   kmsg->msgh_size = sizeof(mach_msg_header_t) + ksaddr - (vm_offset_t)(kmsg + 
1);
-  assert(kmsg->msgh_size <= ksize);
 #else
   /* The 64 bit interface ensures the header is the same size, so it does not 
need any resizing. */
   _Static_assert(sizeof(mach_msg_header_t) == sizeof(mach_msg_user_header_t),
 "mach_msg_header_t and mach_msg_user_header_t expected to be 
of the same size");
   if (copyin(umsg, kmsg, usize))
 return 1;
+
+  kmsg->msgh_size = usize;
   kmsg->msgh_remote_port &= 0x; // FIXME: still have port names here
   kmsg->msgh_local_port &= 0x;  // also, this assumes little-endian
 #endif
+  assert(kmsg->msgh_size <= ksize);
   return 0;
 }
 
-- 
2.39.2




Re: [PATCH 1/2] RFC enhance tracing utilities

2024-03-09 Thread Luca Dariz

Il 09/03/24 16:24, Etienne Brateau ha scritto:
Le sam. 9 mars 2024 à 15:03, Luca Dariz diff --git a/i386/i386/debug.h b/i386/i386/debug.h

index 84397ba8..eff330c6 100644
--- a/i386/i386/debug.h
+++ b/i386/i386/debug.h
@@ -54,6 +54,7 @@ void debug_trace_dump(void);

  #else  /* __ASSEMBLER__ */

+#ifndef __x86_64__
  #define DEBUG_TRACE                            \
         pushl   $__LINE__                       ;\
         pushl   $9f                             ;\
@@ -62,10 +63,21 @@ void debug_trace_dump(void);
         .data                                   ;\
  9:     .ascii  __FILE__"\0"                    ;\
         .text
-
+#else  /* __x86_64__ */
+#define DEBUG_TRACE                            \
+       pushq   %rdi                            ;\
+       pushq   %rs1                            ;\

Is it really %rs1 here and not %rsi ? you are pushing rs1 but popping rsi


it's probably %rsi, this part is still to be cleaned up (see the list in 
the descritption)



diff --git a/i386/i386/locore.S b/i386/i386/locore.S
index 9d0513a1..26c5843c 100644
--- a/i386/i386/locore.S
+++ b/i386/i386/locore.S
@@ -634,6 +634,15 @@ ENTRY(thread_bootstrap_return)
   */

  ENTRY(thread_syscall_return)
+#if KERNEL_TRACE
+       testb   $0xff,EXT(syscall_trace)
+       jz      1f
+       movl    S_ARG0,%eax                     /* get return value */
+       pushl   %eax
+       call    syscall_trace_return
+       movl    %eax,S_ARG0                     /* restore return
value */
+1:
+#endif
         movl    S_ARG0,%eax                     /* get return value */
         movl    %esp,%ecx                       /* get kernel stack */
         or      $(KERNEL_STACK_SIZE-1),%ecx
@@ -1174,18 +1183,26 @@ syscall_native:

  mach_call_call:

-#ifdef DEBUG
+#if KERNEL_TRACE
         testb   $0xff,EXT(syscall_trace)
         jz      0f
-       pushl   %eax
-       call    EXT(syscall_trace_print)
-       /* will return with syscallofs still (or again) in eax */
-       addl    $4,%esp
+       pushl   %eax                     /* add syscall num to args
array  */
+       pushl   %esp                     /* args array is the first
argument*/
+       call    EXT(syscall_trace_enter)
+       popl    %eax
+       popl    %eax

you are popping eax twice here, shouldn’t the second one be esp?


The first popl just discards the %esp value, the second one restores the 
correct %eax, so the real syscall can be called as if nothing had 
happened. For i386 all function parameters are on the stack, so they are 
not touched by the tracing call.



Luca



[PATCH 2/2] RFC: add kernel trace utilities

2024-03-09 Thread Luca Dariz
These are some utilities I used to decode and control the kernel
trace, compiled as part of gnumach just so it's easier to build
them. I'm not sure if they should be in their own package, for now
they complement the kernel tracing patch so it can be tested more
easily.

With ktrace we can collect the trace of a specific executable, or to
manually start/stop tracing.

With parsetrace we can decode the trace file collected (with ktrace or
from gdb) handling different configurations and mixed 32/64 bit
kernel/user environments. This helps a lot when bootstrapping a new
platform. The usage is very basic for now, but advanced filtering can
be implemented (even with bindings to other languages).
---
 tests/Makefrag.am|   9 +
 tests/ktrace.c   | 199 +++
 tests/parsetrace.cpp | 594 +++
 3 files changed, 802 insertions(+)
 create mode 100644 tests/ktrace.c
 create mode 100644 tests/parsetrace.cpp

diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index faabdf44..df1e1e90 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -32,3 +32,12 @@ clean-local: $(USER_TESTS_CLEAN)
rm -fr tests/include-mach
 
 endif # !PLATFORM_xen
+
+parsetrace: tests/parsetrace.cpp $(MACH_TESTINSTALL)
+   $(CXX) -Wall -ggdb3 -I$(MACH_TESTINCLUDE) -I$(MIG_OUTDIR) -o $@ $<
+
+ktrace: tests/ktrace.c $(MIG_OUTDIR)/gnumach.user.c $(MIG_OUTDIR)/mach.user.c
+   $(CC) -Wall -ggdb3 -I$(MACH_TESTINCLUDE) -I$(MIG_OUTDIR) -o $@ $^
+
+
+include tests/hurd-integration.mk
diff --git a/tests/ktrace.c b/tests/ktrace.c
new file mode 100644
index ..b8b7c607
--- /dev/null
+++ b/tests/ktrace.c
@@ -0,0 +1,199 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+
+/* start kernel tracing */
+void do_start(mach_port_t hostp, size_t bufsize, int kinds)
+{
+  int err = ktrace_start(hostp, bufsize, kinds);
+  if (err)
+error(2, err, "ktrace_start");
+}
+
+/* stop kernel tracing */
+void do_stop(mach_port_t hostp)
+{
+  int err = ktrace_stop(hostp);
+  if (err)
+error(2, err, "ktrace_stop");
+}
+
+/* collect tracing data and free the kernel resources */
+void do_collect(mach_port_t hostp, const char *path)
+{
+  vm_offset_t buf, last_ptr;
+  mach_msg_type_number_t bufsize;
+  int flags, err;
+  err = ktrace_collect(hostp, &buf, &bufsize, &last_ptr, &flags);
+  if (err)
+error(1, err, "ktrace_collect");
+  printf("out %s size %u last %lu flags %x overrun %d\n", path, bufsize, 
last_ptr, flags, flags & 0xF);
+  if (bufsize < last_ptr)
+printf("ERROR overrun?\n");
+  FILE *fp = fopen(path, "w");
+  if (fp)
+{
+  trace_header_t th;
+  memset(&th, 0, sizeof(th));
+  memcpy(th.magic, TH_MAGIC, 4);
+  th.flags = flags;
+  th.last = last_ptr;
+  err = fwrite(&th, 1, sizeof(th), fp);
+  if (err <= 0)
+error(0, errno, "fwrite header");
+
+  err = fwrite((void*)buf, 1, bufsize, fp);
+  if (err < bufsize)
+error(0, errno, "fwrite");
+  fclose(fp);
+}
+  else
+error(0, errno, "fopen, trace file not written");
+
+  err = vm_deallocate(mach_task_self(), buf, bufsize);
+  if (err)
+error(0, err, "vm_deallocate");
+
+  err = ktrace_free(hostp);
+  if (err)
+error(1, err, "ktrace_free");
+}
+
+static const char *usage_text =
+  "usage: ktrace [ -n BUFSIZE_M ] [ -m MASK ] [ -o TRACEFILE ] 
start|stop|collect|run [ prog arg1 arg2 ... ]\n\n"
+  "  -n BUFSIZE (on start) is in MiB, default 10 MiB\n"
+  "  -m MASK (on start) is an integer bitwise-mask of the traced events, 
default all:\n"
+  "  1 - raw syscall entry/exit\n"
+  "  2 - user or kernel trap\n"
+  "  4 - context switch\n"
+  "  8 - interrupt handlers\n"
+  "  16 - raw IPC send/recv, traced on copyin/copyout. If mach_msg() fails 
copyout is skipped.\n"
+  "  -o TRACEFILE (on collect) will contain the dump of trace ring buffer, the 
default file is trace.bin\n\n"
+  "start|stop|collect allow to manage the tracing freely\n"
+  "run atomatically starts and stop tracing of the command [ prog arg1 arg2 
... ].\n"
+  ;
+
+int main(int argc, char *argv[])
+{
+  int err;
+  mach_port_t hostp;
+  size_t b

[PATCH 3/3] move x86 copy_user.[ch] to ipc/ and make it arch-indipendent

2024-03-09 Thread Luca Dariz
From: LD 

---
 Makefrag.am|  2 ++
 i386/Makefrag.am   |  1 -
 {x86_64 => ipc}/copy_user.c|  7 +--
 {i386/i386 => ipc}/copy_user.h | 18 +-
 ipc/ipc_kmsg.c |  2 +-
 ipc/ipc_mqueue.c   |  2 +-
 ipc/mach_msg.c |  2 +-
 kern/ipc_mig.c |  2 +-
 x86_64/Makefrag.am |  1 -
 9 files changed, 20 insertions(+), 17 deletions(-)
 rename {x86_64 => ipc}/copy_user.c (99%)
 rename {i386/i386 => ipc}/copy_user.h (90%)

diff --git a/Makefrag.am b/Makefrag.am
index 5b61a1d6..82fce628 100644
--- a/Makefrag.am
+++ b/Makefrag.am
@@ -76,6 +76,8 @@ endif
 #
 
 libkernel_a_SOURCES += \
+   ipc/copy_user.c \
+   ipc/copy_user.h \
ipc/ipc_entry.c \
ipc/ipc_entry.h \
ipc/ipc_init.c \
diff --git a/i386/Makefrag.am b/i386/Makefrag.am
index 58ee3273..5e7d4740 100644
--- a/i386/Makefrag.am
+++ b/i386/Makefrag.am
@@ -91,7 +91,6 @@ endif
 #
 
 libkernel_a_SOURCES += \
-   i386/i386/copy_user.h \
i386/i386/cswitch.S \
i386/i386/debug_trace.S \
i386/i386/idt_inittab.S \
diff --git a/x86_64/copy_user.c b/ipc/copy_user.c
similarity index 99%
rename from x86_64/copy_user.c
rename to ipc/copy_user.c
index c6e125d9..5c6329d3 100644
--- a/x86_64/copy_user.c
+++ b/ipc/copy_user.c
@@ -16,14 +16,15 @@
  * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+#ifdef __LP64__
+
 #include 
 #include 
 
+#include 
 #include 
 #include 
 
-#include 
-
 
 /* Mach field descriptors measure size in bits */
 #define descsize_to_bytes(n) (n / 8)
@@ -611,3 +612,5 @@ int copyoutmsg (const void *kernelbuf, void *userbuf, const 
size_t ksize)
   return 0;
 
 }
+
+#endif /* __LP64__ */
diff --git a/i386/i386/copy_user.h b/ipc/copy_user.h
similarity index 90%
rename from i386/i386/copy_user.h
rename to ipc/copy_user.h
index 3d1c7278..a57b3ee5 100644
--- a/i386/i386/copy_user.h
+++ b/ipc/copy_user.h
@@ -28,7 +28,7 @@
 /*
  * The copyin_32to64() and copyout_64to32() routines are meant for data types
  * that have different size in kernel and user space. They should be 
independent
- * of endianness and hopefully can be reused in the future on other archs.
+ * of endianness and hopefully can be reused on all archs.
  * These types are e.g.:
  * - port names vs port pointers, on a 64-bit kernel
  * - memory addresses, on a 64-bit kernel and 32-bit user
@@ -71,23 +71,23 @@ static inline int copyout_address(const vm_offset_t *kaddr, 
rpc_vm_offset_t *uad
 
 static inline int copyin_port(const mach_port_name_t *uaddr, mach_port_t 
*kaddr)
 {
-#ifdef __x86_64__
+#ifdef __LP64__
   return copyin_32to64(uaddr, kaddr);
-#else /* __x86_64__ */
+#else /* __LP64__ */
   return copyin(uaddr, kaddr, sizeof(*uaddr));
-#endif /* __x86_64__ */
+#endif /* __LP64__ */
 }
 
 static inline int copyout_port(const mach_port_t *kaddr, mach_port_name_t 
*uaddr)
 {
-#ifdef __x86_64__
+#ifdef __LP64__
   return copyout_64to32(kaddr, uaddr);
-#else /* __x86_64__ */
+#else /* __LP64__ */
   return copyout(kaddr, uaddr, sizeof(*kaddr));
-#endif /* __x86_64__ */
+#endif /* __LP64__ */
 }
 
-#if defined(__x86_64__) && defined(USER32)
+#if defined(__LP64__) && defined(USER32)
 /* For 32 bit userland, kernel and user land messages are not the same size. */
 size_t msg_usize(const mach_msg_header_t *kmsg);
 #else
@@ -95,6 +95,6 @@ static inline size_t msg_usize(const mach_msg_header_t *kmsg)
 {
   return kmsg->msgh_size;
 }
-#endif /* __x86_64__ && USER32 */
+#endif /* __LP64__ && USER32 */
 
 #endif /* COPY_USER_H */
diff --git a/ipc/ipc_kmsg.c b/ipc/ipc_kmsg.c
index bd843804..8bd645ff 100644
--- a/ipc/ipc_kmsg.c
+++ b/ipc/ipc_kmsg.c
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -51,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/ipc/ipc_mqueue.c b/ipc/ipc_mqueue.c
index 44e1eb98..95308f35 100644
--- a/ipc/ipc_mqueue.c
+++ b/ipc/ipc_mqueue.c
@@ -36,13 +36,13 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/ipc/mach_msg.c b/ipc/mach_msg.c
index 6194ef7b..ff5e5b09 100644
--- a/ipc/mach_msg.c
+++ b/ipc/mach_msg.c
@@ -39,7 +39,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -49,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/kern/ipc_mig.c b/kern/ipc_mig.c
index d26d2c6d..b753a25f 100644
--- a/kern/ipc_mig.c
+++ b/kern/ipc_mig.c
@@ -30,7 +30,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -42,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/x86_64/Makefrag.am b/x86_64/Makefrag.am
index b0bc45c2..2bbed986 100644
--- a/x86_64/Makefrag.am
+++ b/x86_64/Makefrag.am
@@ -90,7 +90,6 @@ libkernel_a_SOURCES += \
i386/i386/

[PATCH 1/2] RFC enhance tracing utilities

2024-03-09 Thread Luca Dariz
This extends the previous debug utility to trace system calls with
more events and the ability to control the tracing from userspace,
collecting a trace of the whole systems.

This tool was quite useful in porting the rpc format to 64 bit and
handle the 32to64 translation, but also to debug user-space programs.

It's still a draft with some limitations and open points:
- DEBUG_TRACE for 64 bit is missing
- check if we can have meaningful and reliable data removing the
  syscall number on syscall return; this would avoid saving it in the
  thread structure.
- add more info to each trace entry, e.g. current CPU or timestamps
- trace more events
- add more filtering capabilities
- trace a single subhurd
- trace format is not exported
---
 Makefrag.am   |   7 +
 configfrag.ac |  12 ++
 i386/i386/debug.h |  16 +-
 i386/i386/debug_i386.c|  45 ++--
 i386/i386/locore.S|  33 ++-
 i386/i386/pcb.c   |   4 +
 i386/i386/trap.c  |   5 +
 include/mach/gnumach.defs |  33 +++
 include/mach/trace.h  |  61 ++
 ipc/ipc_kmsg.c|   6 +-
 ipc/mach_msg.c|   2 +
 kern/exception.c  |   1 +
 kern/startup.c|   2 +
 kern/thread.h |   4 +
 kern/trace.c  | 432 ++
 kern/trace.h  |  42 
 scripts/gnumach-gdb.py|  63 ++
 tests/test-trace.c|  58 +
 tests/user-qemu.mk|   3 +-
 x86_64/debug_trace.S  |   9 +-
 x86_64/interrupt.S|  19 ++
 x86_64/locore.S   |  84 +++-
 22 files changed, 887 insertions(+), 54 deletions(-)
 create mode 100644 include/mach/trace.h
 create mode 100644 kern/trace.c
 create mode 100644 kern/trace.h
 create mode 100644 scripts/gnumach-gdb.py
 create mode 100644 tests/test-trace.c

diff --git a/Makefrag.am b/Makefrag.am
index 82fce628..25d73ec3 100644
--- a/Makefrag.am
+++ b/Makefrag.am
@@ -221,6 +221,12 @@ libkernel_a_SOURCES += \
kern/xpr.h \
kern/elf-load.c \
kern/boot_script.c
+
+if KERNEL_TRACE
+libkernel_a_SOURCES += \
+   kern/trace.c
+endif
+
 EXTRA_DIST += \
kern/exc.defs \
kern/mach.srv \
@@ -407,6 +413,7 @@ include_mach_HEADERS = \
include/mach/profilparam.h \
include/mach/std_types.h \
include/mach/syscall_sw.h \
+   include/mach/trace.h \
include/mach/task_info.h \
include/mach/task_special_ports.h \
include/mach/thread_info.h \
diff --git a/configfrag.ac b/configfrag.ac
index b8b41261..f4dbe63d 100644
--- a/configfrag.ac
+++ b/configfrag.ac
@@ -139,6 +139,18 @@ AC_DEFINE([SLAB_VERIFY], [0], [SLAB_VERIFY])
 
 # Enable the CPU pool layer in the slab allocator.
 AC_DEFINE([SLAB_USE_CPU_POOLS], [0], [SLAB_USE_CPU_POOLS])
+
+# enable support for tracing various kernel events
+AC_ARG_ENABLE([kernel-trace],
+  AS_HELP_STRING([--enable-kernel-trace], [Enable kernel event tracing]))
+[if [ x"$enable_kernel_trace" = xyes ]; then]
+  AC_DEFINE([KERNEL_TRACE], [1], [Enable kernel event tracing])
+  AM_CONDITIONAL([KERNEL_TRACE], [true])
+[else]
+  AC_DEFINE([KERNEL_TRACE], [0], [Enable kernel event tracing])
+  AM_CONDITIONAL([KERNEL_TRACE], [false])
+[fi]
+
 
 #
 # Options.
diff --git a/i386/i386/debug.h b/i386/i386/debug.h
index 84397ba8..eff330c6 100644
--- a/i386/i386/debug.h
+++ b/i386/i386/debug.h
@@ -54,6 +54,7 @@ void debug_trace_dump(void);
 
 #else  /* __ASSEMBLER__ */
 
+#ifndef __x86_64__
 #define DEBUG_TRACE\
pushl   $__LINE__   ;\
pushl   $9f ;\
@@ -62,10 +63,21 @@ void debug_trace_dump(void);
.data   ;\
 9: .ascii  __FILE__"\0";\
.text
-
+#else  /* __x86_64__ */
+#define DEBUG_TRACE\
+   pushq   %rdi;\
+   pushq   %rs1;\
+   movq$__LINE__,%rdi  ;\
+   movq$9f,%rsi;\
+   call__debug_trace   ;\
+popq   %rsi;\
+popq   %rdi;\
+   .data   ;\
+9: .ascii  __FILE__"\0";\
+   .text
+#endif
 #endif /* __ASSEMBLER__ */
 
-
 #endif /* DEBUG */
 
 /* XXX #include_next "debug.h" */
diff --git a/i386/i386/debug_i386.c b/i386/i386/debug_i386.c
index 41d032e3..61d73dff 100644
--- a/i386/i386/debug_i386.c
+++ b/i386/i386/debug_i386.c
@@ -22,6 +22,7 @@
  */
 
 #include 
+#include 
 
 #include "thread.h"
 #include "trap.h"
@@ -140,39 +141,23 @@ debug_trace_dump(void)
 
splx(s);
 }
+#endif /* DEBUG */
 
-#include 
-
-int syscall_trace = 0;
-task_t syscall_trace_task;
+#if KERNEL_TRACE
 
-int
-syscall_trace_print(int syscallvec, ...)
+#ifndef __x86_64__
+#define SYSCALL_NUM_SHIFT 4
+#else
+#define SYSCALL_NUM_SHIFT 5
+#endif

[PATCH 1/3] x86_64: split SET_KERNEL_SEGMENTS() for NCPU > 1

2024-03-09 Thread Luca Dariz
This allows 32on64 to work again. Also, it's a clearer indication of a
missing part.
---
 x86_64/locore.S | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/x86_64/locore.S b/x86_64/locore.S
index 806762bb..8f39a677 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -163,17 +163,21 @@
 #define POP_SEGMENTS_ISR(reg)
 #endif
 
+#if NCPUS > 1
+#define SET_KERNEL_SEGMENTS(reg)\
+   ud2 /* TODO: use swapgs or similar */
+#else // NCPUS > 1
 #ifdef USER32
 #define SET_KERNEL_SEGMENTS(reg)  \
mov %ss,reg /* switch to kernel segments */ ;\
mov reg,%ds /* (same as kernel stack segment) */ ;\
mov reg,%es ;\
mov reg,%fs ;\
-   mov $(PERCPU_DS),reg;\
mov reg,%gs
-#else
+#else // USER32
 #define SET_KERNEL_SEGMENTS(reg)
-#endif
+#endif // USER32
+#endif // NCPUS > 1
 
 /*
  * Fault recovery.
-- 
2.39.2




[PATCH 2/3] remove machine/machspl.h as it duplicates machine/spl.h

2024-03-09 Thread Luca Dariz
From: LD 

---
 device/chario.c  |  2 +-
 device/ds_routines.c |  4 ++--
 device/net_io.c  |  4 ++--
 i386/Makefrag_x86.am |  1 -
 i386/i386/db_interface.c |  2 +-
 i386/i386/db_trace.c |  2 +-
 i386/i386/fpu.c  |  2 +-
 i386/i386/ipl.h  |  2 +-
 i386/i386/irq.c  |  2 +-
 i386/i386/machspl.h  | 29 -
 i386/i386/pic.c  |  2 +-
 i386/i386/trap.c |  2 +-
 i386/i386at/com.c|  2 +-
 i386/i386at/kd.c |  2 +-
 i386/i386at/kd.h |  2 +-
 i386/i386at/kd_event.c   |  2 +-
 i386/i386at/model_dep.c  |  2 +-
 i386/i386at/rtc.c|  2 +-
 kern/ast.c   |  2 +-
 kern/eventcount.c|  2 +-
 kern/ipc_host.c  |  2 +-
 kern/ipc_sched.c |  2 +-
 kern/mach_clock.c|  2 +-
 kern/machine.c   |  2 +-
 kern/priority.c  |  2 +-
 kern/sched_prim.c|  2 +-
 kern/startup.c   |  2 +-
 kern/syscall_subr.c  |  2 +-
 kern/task.c  |  2 +-
 kern/thread.c|  2 +-
 kern/thread_swap.c   |  2 +-
 kern/xpr.c   |  2 +-
 linux/dev/arch/i386/kernel/irq.c |  2 +-
 linux/dev/kernel/sched.c |  2 +-
 xen/console.c|  2 +-
 35 files changed, 35 insertions(+), 65 deletions(-)
 delete mode 100644 i386/i386/machspl.h

diff --git a/device/chario.c b/device/chario.c
index 3fe93ccb..efb55867 100644
--- a/device/chario.c
+++ b/device/chario.c
@@ -34,7 +34,7 @@
 #include 
 #include 
 #include 
-#include/* spl definitions */
+#include/* spl definitions */
 
 #include 
 
diff --git a/device/ds_routines.c b/device/ds_routines.c
index d97d229e..439fc5b3 100644
--- a/device/ds_routines.c
+++ b/device/ds_routines.c
@@ -63,7 +63,7 @@
 #include 
 #include 
 #include 
-#include/* spl definitions */
+#include/* spl definitions */
 
 #include 
 #include 
@@ -95,7 +95,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #ifdef LINUX_DEV
 extern struct device_emulation_ops linux_block_emulation_ops;
diff --git a/device/net_io.c b/device/net_io.c
index ee9435d7..efde9d6c 100644
--- a/device/net_io.c
+++ b/device/net_io.c
@@ -42,7 +42,7 @@
 #include 
 
 #include 
-#include/* spl definitions */
+#include/* spl definitions */
 #include 
 #include 
 #include 
@@ -64,7 +64,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #ifMACH_TTD
 #include 
diff --git a/i386/Makefrag_x86.am b/i386/Makefrag_x86.am
index 272de023..a6c7a5c8 100644
--- a/i386/Makefrag_x86.am
+++ b/i386/Makefrag_x86.am
@@ -50,7 +50,6 @@ libkernel_a_SOURCES += \
i386/i386/mach_param.h \
i386/i386/machine_routines.h \
i386/i386/machine_task.c \
-   i386/i386/machspl.h \
i386/i386/model_dep.h \
i386/i386/mp_desc.c \
i386/i386/mp_desc.h \
diff --git a/i386/i386/db_interface.c b/i386/i386/db_interface.c
index 483991d6..392e2b69 100644
--- a/i386/i386/db_interface.c
+++ b/i386/i386/db_interface.c
@@ -59,7 +59,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #if MACH_KDB
 /* Whether the kernel uses any debugging register.  */
diff --git a/i386/i386/db_trace.c b/i386/i386/db_trace.c
index 0ef72518..abc32554 100644
--- a/i386/i386/db_trace.c
+++ b/i386/i386/db_trace.c
@@ -34,7 +34,7 @@
 #include 
 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/i386/i386/fpu.c b/i386/i386/fpu.c
index 4cd31dd9..316e3b41 100644
--- a/i386/i386/fpu.c
+++ b/i386/i386/fpu.c
@@ -43,7 +43,7 @@
 #include 
 
 #include 
-#include/* spls */
+#include/* spls */
 #include 
 #include 
 #include 
diff --git a/i386/i386/ipl.h b/i386/i386/ipl.h
index 6e59b368..2d02e3c2 100644
--- a/i386/i386/ipl.h
+++ b/i386/i386/ipl.h
@@ -71,7 +71,7 @@ WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 
 #ifdef KERNEL
 #ifndef__ASSEMBLER__
-#include 
+#include 
 /* Note that interrupts have varying signatures */
 typedef void (*interrupt_handler_fn)(int);
 extern interrupt_handler_fn ivect[];
diff --git a/i386/i386/irq.c b/i386/i386/irq.c
index a7c98890..3c2f1748 100644
--- a/i386/i386/irq.c
+++ b/i386/i386/irq.c
@@ -22,7 +22,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 extern queue_head_t main_intr_queue;
 
diff --git a/i386/i386/machspl.h b/i386/i386/machspl.h
deleted file mode 100644
index bbb26754..
--- a/i386/i386/machspl.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/* 
- * Mach Operating System
- * Copyright (c) 1991,1990 Carnegie Mellon University
- * All Rights Reserved.
- * 
- * Permission to use, copy, modify and distribute this software and its
- * documentation is hereby granted, provided that both the copyright
- * notice and 

Re: Help with cross install

2024-01-12 Thread Luca

Hi,

Il 12/01/24 05:18, Joshua Branson ha scritto:

Here is my /boot/grub/custom.cfg

#+begin_example
menuentry "pci-arbiter + acpi + rumpdisk" {
 set root=(hd0,msdos1)
 multiboot /boot/gnumach-1.8-486.gz root=part:1:device:wd0 noide -s


you can try to add also "noahci" to gnumach command line after "noide"; 
this should prevent gnumach from probing AHCI devices and then rumpdisk 
should be able to find them.



Luca




[PATCH 02/11] add mach_host tests

2024-01-11 Thread Luca Dariz
---
 tests/test-mach_host.c | 81 ++
 tests/user-qemu.mk |  3 +-
 2 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-mach_host.c

diff --git a/tests/test-mach_host.c b/tests/test-mach_host.c
new file mode 100644
index ..53f30240
--- /dev/null
+++ b/tests/test-mach_host.c
@@ -0,0 +1,81 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+
+#include 
+
+void test_kernel_version()
+{
+  int err;
+  kernel_version_t kver;
+  err = host_get_kernel_version(mach_host_self(), kver);
+  ASSERT_RET(err, "host_kernel_info");
+  printf("kernel version: %s\n", kver);
+}
+
+void test_host_info()
+{
+  int err;
+  mach_msg_type_number_t count;
+  mach_port_t thishost = mach_host_self();
+
+  host_basic_info_data_t binfo;
+  count = HOST_BASIC_INFO_COUNT;
+  err = host_info(thishost, HOST_BASIC_INFO, (host_info_t)&binfo, &count);
+  ASSERT_RET(err, "host_basic_info");
+  ASSERT(count == HOST_BASIC_INFO_COUNT, "");
+  ASSERT(binfo.max_cpus > 0, "no cpu?");
+  ASSERT(binfo.avail_cpus > 0, "no cpu available?");
+  ASSERT(binfo.memory_size > (1024 * 1024), "memory too low");
+
+  const int maxcpus = 255;
+  int proc_slots[maxcpus];
+  count = maxcpus;
+  err = host_info(thishost, HOST_PROCESSOR_SLOTS, (host_info_t)&proc_slots, 
&count);
+  ASSERT_RET(err, "host_processor_slots");
+  ASSERT((1 <= count) && (count <= maxcpus), "");
+
+  host_sched_info_data_t sinfo;
+  count = HOST_SCHED_INFO_COUNT;
+  err = host_info(thishost, HOST_SCHED_INFO, (host_info_t)&sinfo, &count);
+  ASSERT_RET(err, "host_sched_info");
+  ASSERT(count == HOST_SCHED_INFO_COUNT, "");
+  ASSERT(sinfo.min_timeout < 1000, "timeout unexpectedly high");
+  ASSERT(sinfo.min_quantum < 1000, "quantum unexpectedly high");
+
+  host_load_info_data_t linfo;
+  count = HOST_LOAD_INFO_COUNT;
+  err = host_info(thishost, HOST_LOAD_INFO, (host_info_t)&linfo, &count);
+  ASSERT_RET(err, "host_load_info");
+  ASSERT(count == HOST_LOAD_INFO_COUNT, "");
+  for (int i=0; i<3; i++)
+  {
+  printf("avenrun %d\n", linfo.avenrun[i]);
+  printf("mach_factor %d\n", linfo.mach_factor[i]);
+  }
+}
+
+// TODO processor sets
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_kernel_version();
+  test_host_info();
+  return 0;
+}
diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index 78775938..8b338241 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -195,7 +195,8 @@ clean-test-%:
 
 
 USER_TESTS := \
-   tests/test-hello
+   tests/test-hello \
+   tests/test-mach_host
 
 USER_TESTS_CLEAN = $(subst tests/,clean-,$(USER_TESTS))
 
-- 
2.39.2




[PATCH 09/11] add raw mach_msg tests

2024-01-11 Thread Luca Dariz
---
 tests/test-machmsg.c | 405 +++
 tests/user-qemu.mk   |   3 +-
 2 files changed, 407 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-machmsg.c

diff --git a/tests/test-machmsg.c b/tests/test-machmsg.c
new file mode 100644
index ..60f3f49f
--- /dev/null
+++ b/tests/test-machmsg.c
@@ -0,0 +1,405 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define ECHO_MAX_BODY_LEN 256
+
+static uint32_t align(uint32_t val, size_t aln)
+{
+  // we should check aln is a power of 2
+  aln--;
+  return (val + aln) & (~aln);
+}
+
+#define ALIGN_INLINE(val, n) { (val) = align((val), (n)); }
+
+struct echo_params
+{
+  mach_port_t rx_port;
+  mach_msg_size_t rx_size;
+  mach_msg_size_t rx_number;
+};
+
+void echo_thread (void *arg)
+{
+  struct echo_params *params = arg;
+  int err;
+  struct message
+  {
+mach_msg_header_t header;
+char body[ECHO_MAX_BODY_LEN];
+  } message;
+
+  printf ("thread echo started\n");
+  for (mach_msg_size_t i=0; irx_number; i++)
+{
+  message.header.msgh_local_port = params->rx_port;
+  message.header.msgh_size = sizeof (message);
+
+  err = mach_msg (&message.header,
+  MACH_RCV_MSG,
+  0, sizeof (message),
+  params->rx_port, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg echo rx");
+  printf("echo rx %d expected 5d\n",
+ message.header.msgh_size, params->rx_size);
+  ASSERT(message.header.msgh_size == params->rx_size,
+ "wrong size in echo rx");
+
+  message.header.msgh_local_port = MACH_PORT_NULL;
+  printf ("echo: msgh_id %d\n", message.header.msgh_id);
+
+  err = mach_msg (&message.header,
+  MACH_SEND_MSG,
+  message.header.msgh_size, 0,
+  MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg echo tx");
+}
+  printf ("thread echo stopped\n");
+  thread_terminate (mach_thread_self ());
+  FAILURE("thread_terminate");
+}
+
+#define TEST_ITERATIONS 3
+
+// TODO run_test_iterations
+void
+test_iterations (void)
+{
+  mach_port_t port, receive;
+  int err;
+  struct message
+  {
+mach_msg_header_t header;
+mach_msg_type_t type;
+char data[64];
+  } message;
+
+  err = mach_port_allocate (mach_task_self (),
+   MACH_PORT_RIGHT_RECEIVE, &port);
+  ASSERT_RET(err, "mach_port_allocate");
+
+  err = mach_port_allocate (mach_task_self (),
+   MACH_PORT_RIGHT_RECEIVE, &receive);
+  ASSERT_RET(err, "mach_port_allocate 2");
+
+  struct echo_params params;
+  params.rx_port = port;
+  params.rx_size = sizeof(message.header) + sizeof(message.type) + 5;
+  ALIGN_INLINE(params.rx_size, MACH_MSG_USER_ALIGNMENT);
+  params.rx_number = TEST_ITERATIONS;
+  test_thread_start (mach_task_self (), echo_thread, ¶ms);
+
+  time_value_t start_time;
+  err = host_get_time (mach_host_self (), &start_time);
+  ASSERT_RET(err, "host_get_time");
+
+  /* Send a message down the port */
+  for (int i = 0; i < TEST_ITERATIONS; i++)
+{
+  struct message message;
+
+  memset (&message, 0, sizeof (message));
+  strcpy (message.data, "ciao");
+  size_t datalen = strlen (message.data) + 1;
+
+  message.header.msgh_bits
+   = MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND,
+ MACH_MSG_TYPE_MAKE_SEND_ONCE);
+  message.header.msgh_remote_port = port;  /* Request port */
+  message.header.msgh_local_port = receive;/* Reply port */
+  message.header.msgh_id = 123;/* Message id */
+  message.header.msgh_size = sizeof (message.header) + sizeof 
(message.type) + datalen;/* Message size */
+  ALIGN_INLINE(message.header.msgh_size, 4);
+  message.type.msgt_name = MACH_MSG_TYPE_STRING;   /* Parameter type */
+  message.type.msgt_size = 8 * datalen;/* # Bits */
+  message.type.msgt_number = 1;/* Number of elements */
+  message.type.msgt_inline = TRUE; /* Inlined */
+  message.type.msgt_longform = FALSE;  /* Shortform */
+  message.type.msgt_dealloc

[PATCH 03/11] add gsync tests

2024-01-11 Thread Luca Dariz
---
 tests/test-gsync.c | 122 +
 tests/user-qemu.mk |   3 +-
 2 files changed, 124 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-gsync.c

diff --git a/tests/test-gsync.c b/tests/test-gsync.c
new file mode 100644
index ..a5160651
--- /dev/null
+++ b/tests/test-gsync.c
@@ -0,0 +1,122 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/* Gsync flags.  */
+#ifndef GSYNC_SHARED
+# define GSYNC_SHARED  0x01
+# define GSYNC_QUAD0x02
+# define GSYNC_TIMED   0x04
+# define GSYNC_BROADCAST   0x08
+# define GSYNC_MUTATE  0x10
+#endif
+
+static uint32_t single_shared;
+static struct {
+  uint32_t val1;
+  uint32_t val2;
+} single_shared_quad;
+
+static void test_single()
+{
+  int err;
+  single_shared = 0;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait did not timeout");
+
+  single_shared = 1;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_INVALID_ARGUMENT, "gsync_wait on wrong value");
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait again on correct value did not 
timeout");
+
+  single_shared_quad.val1 = 1;
+  single_shared_quad.val2 = 2;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared_quad, 99, 88,
+   100, GSYNC_TIMED | GSYNC_QUAD);
+  ASSERT(err == KERN_INVALID_ARGUMENT, "gsync_wait on wrong quad value");
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared_quad, 1, 2,
+   100, GSYNC_TIMED | GSYNC_QUAD);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait again on correct value did not 
timeout");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake with nobody waiting");
+
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait not affected by previous 
gsync_wake");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 
GSYNC_BROADCAST);
+  ASSERT_RET(err, "gsync_wake broadcast with nobody waiting");
+
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait not affected by previous 
gsync_wake");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 2, 
GSYNC_MUTATE);
+  ASSERT_RET(err, "gsync_wake nobody + mutate");
+  ASSERT(single_shared == 2, "gsync_wake single_shared did not mutate");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake nobody without mutate");
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 3a");
+}
+
+static void single_thread_setter(void *arg)
+{
+  int err;
+  int val = (long)arg;
+
+  /* It should be enough to sleep a bit for our creator to call
+ gsync_wait(). To verify that the test is performed with the
+ correct sequence, we also change the value so if the wait is
+ called after our wake it will fail with KERN_INVALID_ARGUMENT */
+  msleep(100);
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, val, 
GSYNC_MUTATE);
+  ASSERT_RET(err, "gsync_wake from thread + mutate");
+
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+
+static void test_single_from_thread()
+{
+  int err;
+  single_shared = 10;
+  test_thread_start(mach_task_self(), single_thread_setter, (void*)11);
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 10, 0, 0, 0);
+  ASSERT_RET(err, "gsync_wait without timeout for wake from another thread");
+  ASSERT(single_shared == 11, "wake didn't mutate");
+}
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_single_from_thread();
+  test_single();
+  return 0;
+}
diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index 8b338241..7c5e6927 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -196,7 +196,8 @@ clean-test-%:
 
 USER_TE

[PATCH 06/11] add basic vm tests

2024-01-11 Thread Luca Dariz
---
 tests/test-vm.c| 85 ++
 tests/user-qemu.mk |  3 +-
 2 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-vm.c

diff --git a/tests/test-vm.c b/tests/test-vm.c
new file mode 100644
index ..4ece792e
--- /dev/null
+++ b/tests/test-vm.c
@@ -0,0 +1,85 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+
+static void test_memobj()
+{
+  // this emulates maptime() mapping and reading
+  struct mapped_time_value *mtime;
+  int64_t secs, usecs;
+  mach_port_t device, memobj;
+  int err;
+
+  err = device_open (device_priv(), 0, "time", &device);
+  ASSERT_RET(err, "device_open");
+  err = device_map (device, VM_PROT_READ, 0, sizeof(*mtime), &memobj, 0);
+  ASSERT_RET(err, "device_map");
+  err = mach_port_deallocate (mach_task_self (), device);
+  ASSERT_RET(err, "mach_port_deallocate");
+  mtime = 0;
+  err = vm_map(mach_task_self (), (vm_address_t *)&mtime, sizeof *mtime, 0, 1,
+   memobj, 0, 0, VM_PROT_READ, VM_PROT_READ, VM_INHERIT_NONE);
+  ASSERT_RET(err, "vm_map");
+
+  do
+{
+  secs = mtime->seconds;
+  __sync_synchronize ();
+  usecs = mtime->microseconds;
+  __sync_synchronize ();
+}
+  while (secs != mtime->check_seconds);
+  printf("mapped time is %lld.%lld\n",secs, usecs);
+
+  err = mach_port_deallocate (mach_task_self (), memobj);
+  ASSERT_RET(err, "mach_port_deallocate");
+  err = vm_deallocate(mach_task_self(), (vm_address_t)mtime, sizeof(*mtime));
+  ASSERT_RET(err, "vm_deallocate");
+}
+
+static void test_wire()
+{
+  int err = vm_wire_all(host_priv(), mach_task_self(), VM_WIRE_ALL);
+  ASSERT_RET(err, "vm_wire_all-ALL");
+  err = vm_wire_all(host_priv(), mach_task_self(), VM_WIRE_NONE);
+  ASSERT_RET(err, "vm_wire_all-NONE");
+  // TODO check that all memory is actually wired or unwired
+}
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  printf("VM_MIN_ADDRESS=0x%p\n", VM_MIN_ADDRESS);
+  printf("VM_MAX_ADDRESS=0x%p\n", VM_MAX_ADDRESS);
+  test_wire();
+  test_memobj();
+  return 0;
+}
diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index fe0f9569..1ab7cb8c 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -198,7 +198,8 @@ USER_TESTS := \
tests/test-hello \
tests/test-mach_host \
tests/test-gsync \
-   tests/test-mach_port
+   tests/test-mach_port \
+   tests/test-vm
 
 USER_TESTS_CLEAN = $(subst tests/,clean-,$(USER_TESTS))
 
-- 
2.39.2




[PATCH 11/11] add basic thread tests

2024-01-11 Thread Luca Dariz
---
 tests/test-threads.c | 104 +++
 tests/user-qemu.mk   |   3 +-
 2 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-threads.c

diff --git a/tests/test-threads.c b/tests/test-threads.c
new file mode 100644
index ..06630bef
--- /dev/null
+++ b/tests/test-threads.c
@@ -0,0 +1,104 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+void sleeping_thread(void* arg)
+{
+  printf("starting thread %d\n", arg);
+  for (int i=0; i<100; i++)
+  msleep(50);
+  printf("stopping thread %d\n", arg);
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+
+void test_many(void)
+{
+  for (long tid=0; tid<10; tid++)
+{
+  test_thread_start(mach_task_self(), sleeping_thread, (void*)tid);
+}
+  // TODO: wait for thread end notifications
+  msleep(6000);
+}
+
+#ifdef __x86_64__
+void test_fsgs_base_thread(void* tid)
+{
+  int err;
+#if defined(__SEG_FS) && defined(__SEG_GS)
+  long __seg_fs *fs_ptr;
+  long __seg_gs *gs_ptr;
+  long fs_value;
+  long gs_value;
+
+  struct i386_fsgs_base_state state;
+  state.fs_base = (unsigned long)&fs_value;
+  state.gs_base = (unsigned long)&gs_value;
+  err = thread_set_state(mach_thread_self(), i386_FSGS_BASE_STATE,
+ (thread_state_t) &state, i386_FSGS_BASE_STATE_COUNT);
+  ASSERT_RET(err, "thread_set_state");
+
+  fs_value = 0x100 + (long)tid;
+  gs_value = 0x200 + (long)tid;
+
+  msleep(50);  // allow the others to set their segment base
+
+  fs_ptr = 0;
+  gs_ptr = 0;
+  long rdvalue = *fs_ptr;
+  printf("FS expected %lx read %lx\n", fs_value, rdvalue);
+  ASSERT(fs_value == rdvalue, "FS base error\n");
+  rdvalue = *gs_ptr;
+  printf("GS expected %lx read %lx\n", gs_value, rdvalue);
+  ASSERT(gs_value == rdvalue, "GS base error\n");
+#else
+#error " missing __SEG_FS and __SEG_GS"
+#endif
+
+  thread_terminate(mach_thread_self());
+  FAILURE("thread_terminate");
+}
+#endif
+
+void test_fsgs_base(void)
+{
+#ifdef __x86_64__
+  int err;
+  for (long tid=0; tid<10; tid++)
+{
+  test_thread_start(mach_task_self(), test_fsgs_base_thread, (void*)tid);
+}
+  msleep(1000);  // TODO: wait for threads
+#endif
+}
+
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_fsgs_base();
+  test_many();
+  return 0;
+}
diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index 77259198..fd5ae1ab 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -203,7 +203,8 @@ USER_TESTS := \
tests/test-vm \
tests/test-syscalls \
tests/test-machmsg \
-   tests/test-task
+   tests/test-task \
+   tests/test-threads
 
 USER_TESTS_CLEAN = $(subst tests/,clean-,$(USER_TESTS))
 
-- 
2.39.2




[PATCH 05/11] adjust range when changing memory pageability

2024-01-11 Thread Luca Dariz
* vm/vm_map.c: use actual limits instead of min/max boundaries to
  change pageability of the currently mapped memory.

This caused the initial vm_wire_all(host, task VM_WIRE_ALL) in glibc
startup to fail with KERN_NO_SPACE.
---
 vm/vm_map.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/vm/vm_map.c b/vm/vm_map.c
index 26e18676..e454bb2a 100644
--- a/vm/vm_map.c
+++ b/vm/vm_map.c
@@ -1829,6 +1829,30 @@ kern_return_t vm_map_pageable(
return(KERN_SUCCESS);
 }
 
+/* Update pageability of all the memory currently in the map.
+ * The map must be locked, and protection mismatch will not be checked, see
+ * vm_map_pageable().
+ */
+static kern_return_t
+vm_map_pageable_current(vm_map_t map, vm_prot_t access_type)
+{
+   struct rbtree_node *node;
+   vm_offset_t min_address, max_address;
+
+   node = rbtree_first(&map->hdr.tree);
+   min_address = rbtree_entry(node, struct vm_map_entry,
+  tree_node)->vme_start;
+
+   node = rbtree_last(&map->hdr.tree);
+   max_address = rbtree_entry(node, struct vm_map_entry,
+  tree_node)->vme_end;
+
+   /* Returns with the map read-locked if successful */
+   return vm_map_pageable(map, min_address, max_address,access_type,
+  FALSE, FALSE);
+}
+
+
 /*
  * vm_map_pageable_all:
  *
@@ -1859,8 +1883,7 @@ vm_map_pageable_all(struct vm_map *map, vm_wire_t flags)
map->wiring_required = FALSE;
 
/* Returns with the map read-locked if successful */
-   kr = vm_map_pageable(map, map->min_offset, map->max_offset,
-VM_PROT_NONE, FALSE, FALSE);
+   kr = vm_map_pageable_current(map, VM_PROT_NONE);
vm_map_unlock(map);
return kr;
}
@@ -1873,9 +1896,7 @@ vm_map_pageable_all(struct vm_map *map, vm_wire_t flags)
 
if (flags & VM_WIRE_CURRENT) {
/* Returns with the map read-locked if successful */
-   kr = vm_map_pageable(map, map->min_offset, map->max_offset,
-VM_PROT_READ | VM_PROT_WRITE,
-FALSE, FALSE);
+   kr = vm_map_pageable_current(map, VM_PROT_READ | VM_PROT_WRITE);
 
if (kr != KERN_SUCCESS) {
if (flags & VM_WIRE_FUTURE) {
-- 
2.39.2




[PATCH 01/11] add basic user-space tests with qemu

2024-01-11 Thread Luca Dariz
* configure.ac: move test fragment to have USER32
* tests/Makefrag.am: add user tests
* tests/README: add basic info on how to run and debug user tests
* tests/configfrag.ac: allow the test compiler/flags to be
  autoconfigured or customized
* tests/grub.cfg.single.template: add minimal grub config to boot a
  module
* tests/include/device/cons.h: add a simplified version of
  device/cons.h usable for tests
* tests/include/kern/printf.h: symlink to kern/printf.h
* tests/include/mach/mig_support.h: add basic version for user-space
  tests
* tests/include/syscalls.h: add prototypes for syscalls used in tests.
* tests/include/testlib.h: add definitions for common test
  functionalities
* tests/include/util/atoi.h: symlink to util/atoi.h
* tests/run-qemu.sh.template: add a simple qemu test runner
* tests/start.S: add arch-specific entry point
* tests/syscalls.S: generate syscalls entry points
* tests/test-hello.c: add basic smoke test
* tests/testlib.c: add the minimal functionality to run a user-space
  executable and reboot the system, and some test helpers.
* tests/user-qemu.mk: add rules to build simple user-space test
  modules, including generating mig stubs. The tests reuse some kernel
  code (like printf(), mach_atoi(), mem*(), str*() functions) so we can
  use the freestanding environment and not depend on glibc.
---
 configure.ac |   6 +-
 tests/Makefrag.am|  11 +-
 tests/README |  37 ++
 tests/configfrag.ac  |  16 +++
 tests/grub.cfg.single.template   |   4 +
 tests/include/device/cons.h  |  27 
 tests/include/kern/printf.h  |   1 +
 tests/include/mach/mig_support.h |  71 +++
 tests/include/syscalls.h |  83 
 tests/include/testlib.h  |  74 +++
 tests/include/util/atoi.h|   1 +
 tests/run-qemu.sh.template   |  38 ++
 tests/start.S|  28 
 tests/syscalls.S |   4 +
 tests/test-hello.c   |  26 
 tests/testlib.c  | 114 +
 tests/user-qemu.mk   | 212 +++
 17 files changed, 748 insertions(+), 5 deletions(-)
 create mode 100644 tests/README
 create mode 100644 tests/grub.cfg.single.template
 create mode 100644 tests/include/device/cons.h
 create mode 12 tests/include/kern/printf.h
 create mode 100644 tests/include/mach/mig_support.h
 create mode 100644 tests/include/syscalls.h
 create mode 100644 tests/include/testlib.h
 create mode 12 tests/include/util/atoi.h
 create mode 100644 tests/run-qemu.sh.template
 create mode 100644 tests/start.S
 create mode 100644 tests/syscalls.S
 create mode 100644 tests/test-hello.c
 create mode 100644 tests/testlib.c
 create mode 100644 tests/user-qemu.mk

diff --git a/configure.ac b/configure.ac
index cadc33b6..69f75cf2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -123,9 +123,6 @@ AC_CHECK_PROG([PATCH], [patch], [patch], [patch-not-found])
 # configure fragments.
 #
 
-# The test suite.
-m4_include([tests/configfrag.ac])
-
 # Default set of device drivers.
 AC_ARG_ENABLE([device-drivers],
   AS_HELP_STRING([--enable-device-drivers=WHICH], [specify WHICH (on `ix86-at'
@@ -181,6 +178,9 @@ m4_include([configfrag.ac])
 
 # Linux code snarfed into GNU Mach.
 m4_include([linux/configfrag.ac])
+
+# The test suite.
+m4_include([tests/configfrag.ac])
 
 #
 # Compiler features.
diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index 2723f64a..88c7fe8c 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -21,6 +21,13 @@
 #
 
 if !PLATFORM_xen
+
+include tests/user-qemu.mk
+
 TESTS += \
-   tests/test-multiboot
-endif
+   tests/test-multiboot \
+   $(USER_TESTS)
+
+clean-local: $(USER_TESTS_CLEAN)
+
+endif # !PLATFORM_xen
diff --git a/tests/README b/tests/README
new file mode 100644
index ..3dacc184
--- /dev/null
+++ b/tests/README
@@ -0,0 +1,37 @@
+
+There are some basic tests that can be run qith qemu. You can run all the 
tests with
+
+$ make check
+
+or selectively with:
+
+$ make run-hello
+
+Also, you can debug the existing tests, or a new one, by starting on one shell
+
+$ make debug-hello
+
+and on another shell you can attach with gdb, load the symbols of the
+bootstrap module and break on its _start():
+
+$ gdb gnumach
+...
+(gdb) target remote :1234
+...
+(gdb) b setup_main
+Breakpoint 11 at 0x81019d60: file ../kern/startup.c, line 98.
+(gdb) c
+Continuing.
+
+Breakpoint 11, setup_main () at ../kern/startup.c:98
+98 cninit();
+(gdb) add-symbol-file ../gnumach/build-64/module-task
+Reading symbols from ../gnumach/build-64/module-task...
+(gdb) b _start
+Breakpoint 12 at 0x40324a: _start. (2 locations)
+(gdb) c
+Continuing.
+
+Breakpoint 12, _start () at ../tests/testlib.c:96
+96 {
+(gdb) 
diff --git a/tests/configfrag.ac b/tests/configfrag.ac
index 55c6da62..de87cbad 100

[PATCH 10/11] add basic task tests

2024-01-11 Thread Luca Dariz
---
 tests/test-task.c  | 171 +
 tests/user-qemu.mk |   3 +-
 2 files changed, 173 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-task.c

diff --git a/tests/test-task.c b/tests/test-task.c
new file mode 100644
index ..cbc75e23
--- /dev/null
+++ b/tests/test-task.c
@@ -0,0 +1,171 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+
+void test_task()
+{
+  mach_port_t ourtask = mach_task_self();
+  mach_msg_type_number_t count;
+  int err;
+
+  struct task_basic_info binfo;
+  count = TASK_BASIC_INFO_COUNT;
+  err = task_info(ourtask, TASK_BASIC_INFO, (task_info_t)&binfo, &count);
+  ASSERT_RET(err, "TASK_BASIC_INFO");
+  ASSERT(binfo.virtual_size > binfo.resident_size, "wrong memory counters");
+
+  struct task_events_info einfo;
+  count = TASK_EVENTS_INFO_COUNT;
+  err = task_info(ourtask, TASK_EVENTS_INFO, (task_info_t)&einfo, &count);
+  ASSERT_RET(err, "TASK_EVENTS_INFO");
+  printf("msgs sent %llu received %llu\n",
+ einfo.messages_sent, einfo.messages_received);
+
+  struct task_thread_times_info ttinfo;
+  count = TASK_THREAD_TIMES_INFO_COUNT;
+  err = task_info(ourtask, TASK_THREAD_TIMES_INFO, (task_info_t)&ttinfo, 
&count);
+  ASSERT_RET(err, "TASK_THREAD_TIMES_INFO");
+  printf("run user %lld system %lld\n",
+ ttinfo.user_time64.seconds, ttinfo.user_time64.nanoseconds);
+}
+
+
+void dummy_thread(void *arg)
+{
+  printf("started dummy thread\n");
+  while (1)
+;
+}
+
+void check_threads(thread_t *threads, mach_msg_type_number_t nthreads)
+{  
+  for (int tid=0; tid

[PATCH 07/11] add thread creation helper to tests

2024-01-11 Thread Luca Dariz
---
 tests/include/testlib.h  |  1 +
 tests/testlib_thread_start.c | 86 
 tests/user-qemu.mk   |  1 +
 3 files changed, 88 insertions(+)
 create mode 100644 tests/testlib_thread_start.c

diff --git a/tests/include/testlib.h b/tests/include/testlib.h
index e492f2f6..a3f3a6a8 100644
--- a/tests/include/testlib.h
+++ b/tests/include/testlib.h
@@ -65,6 +65,7 @@ const char* e2s(int err);
 const char* e2s_gnumach(int err);
 void halt();
 int msleep(uint32_t timeout);
+thread_t test_thread_start(task_t task, void(*routine)(void*), void* arg);
 
 mach_port_t host_priv(void);
 mach_port_t device_priv(void);
diff --git a/tests/testlib_thread_start.c b/tests/testlib_thread_start.c
new file mode 100644
index ..c52694ca
--- /dev/null
+++ b/tests/testlib_thread_start.c
@@ -0,0 +1,86 @@
+/*
+ * MIT License
+ *
+ * Copyright (c) 2017 Luc Chabassier
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included in 
all
+ * copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
THE
+ * SOFTWARE.
+ */
+
+/* This small helper was started from 
+ * https://github.com/dwarfmaster/mach-ipc/blob/master/minimal_threads/main.c
+ * and then reworked. */
+
+#include 
+#include 
+#include 
+
+/* This is just a temporary mapping to set up the stack */
+static long stack_top[PAGE_SIZE/sizeof(long)] __attribute__ ((aligned 
(PAGE_SIZE)));
+
+thread_t test_thread_start(task_t task, void(*routine)(void*), void* arg) {
+  const vm_size_t stack_size = PAGE_SIZE * 16;
+  kern_return_t ret;
+  vm_address_t stack;
+
+  ret = vm_allocate(task, &stack, stack_size, TRUE);
+  ASSERT_RET(ret, "can't allocate the stack for a new thread");
+
+  ret = vm_protect(task, stack, PAGE_SIZE, FALSE, VM_PROT_NONE);
+  ASSERT_RET(ret, "can't protect the stack from overflows");
+
+  long *top = (long*)((vm_offset_t)stack_top + PAGE_SIZE) - 1;
+#ifdef __i386__
+  *top = (long)arg; /* The argument is passed on the stack on x86_32 */
+  *(top - 1) = 0;   /* The return address */
+#elif defined(__x86_64__)
+  *top = 0; /* The return address */
+#endif
+  ret = vm_write(task, stack + stack_size - PAGE_SIZE, (vm_offset_t)stack_top, 
PAGE_SIZE);
+  ASSERT_RET(ret, "can't initialize the stack for the new thread");
+
+  thread_t thread;
+  ret = thread_create(task, &thread);
+  ASSERT_RET(ret, "thread_create()");
+
+  struct i386_thread_state state;
+  unsigned int count;
+  count = i386_THREAD_STATE_COUNT;
+  ret = thread_get_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, &count);
+  ASSERT_RET(ret, "thread_get_state()");
+
+#ifdef __i386__
+  state.eip = (long) routine;
+  state.uesp = (long) (stack + stack_size - sizeof(long) * 2);
+  state.ebp = 0;
+#elif defined(__x86_64__)
+  state.rip = (long) routine;
+  state.ursp = (long) (stack + stack_size - sizeof(long) * 1);
+  state.rbp = 0;
+  state.rdi = (long)arg;
+#endif
+  ret = thread_set_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, i386_THREAD_STATE_COUNT);
+  ASSERT_RET(ret, "thread_set_state");
+
+  ret = thread_resume(thread);
+  ASSERT_RET(ret, "thread_resume");
+
+  return thread;
+}
diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index 1ab7cb8c..8f10ba36 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -125,6 +125,7 @@ SRC_TESTLIB= \
$(srcdir)/tests/syscalls.S \
$(srcdir)/tests/start.S \
$(srcdir)/tests/testlib.c \
+   $(srcdir)/tests/testlib_thread_start.c \
$(builddir)/tests/errlist.c \
$(MIG_GEN_CC)
 
-- 
2.39.2




[PATCH 08/11] add syscall tests

2024-01-11 Thread Luca Dariz
---
 tests/test-syscalls.c | 166 ++
 tests/user-qemu.mk|   3 +-
 2 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-syscalls.c

diff --git a/tests/test-syscalls.c b/tests/test-syscalls.c
new file mode 100644
index ..be4df8c3
--- /dev/null
+++ b/tests/test-syscalls.c
@@ -0,0 +1,166 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+
+static struct {
+  mach_port_t exception_port;
+  mach_port_t thread;
+  mach_port_t task;
+  integer_t exception;
+  integer_t code;
+  integer_t subcode;
+} last_exc;
+kern_return_t catch_exception_raise(mach_port_t exception_port,
+mach_port_t thread, mach_port_t task,
+integer_t exception, integer_t code,
+long_integer_t subcode)
+{
+  printf("received catch_exception_raise(%u %u %u %d %d %d)\n",
+ exception_port, thread, task, exception, code, subcode);
+  last_exc.exception_port = exception_port;
+  last_exc.thread = thread;
+  last_exc.task = task;
+  last_exc.exception = exception;
+  last_exc.code = code;
+  last_exc.subcode = subcode;
+  return KERN_SUCCESS;
+}
+
+static char simple_request_data[PAGE_SIZE];
+static char simple_reply_data[PAGE_SIZE];
+int simple_msg_server(boolean_t (*demuxer) (mach_msg_header_t *request,
+ mach_msg_header_t *reply),
+  mach_port_t rcv_port_name,
+  int num_msgs)
+{
+  int midx = 0, mok = 0;
+  int ret;
+  mig_reply_header_t *request = (mig_reply_header_t*)simple_request_data;
+  mig_reply_header_t *reply = (mig_reply_header_t*)simple_reply_data;
+  while ((midx - num_msgs) < 0)
+{
+  ret = mach_msg(&request->Head, MACH_RCV_MSG, 0, PAGE_SIZE,
+ rcv_port_name, 0, MACH_PORT_NULL);
+  switch (ret)
+{
+case MACH_MSG_SUCCESS:
+  if ((*demuxer)(&request->Head, &reply->Head))
+mok++;  // TODO send reply
+  else
+FAILURE("demuxer didn't handle the message");
+  break;
+default:
+  ASSERT_RET(ret, "receiving in msg_server");
+  break;
+}
+  midx++;
+}
+  if (mok != midx)
+FAILURE("wrong number of message received");
+  return mok != midx;
+}
+
+
+void test_syscall_bad_arg_on_stack(void *arg)
+{
+  /* mach_msg() has 7 arguments, so the last one should be always
+ passed on the stack on x86. Here we make ESP/RSP point to the
+ wrong place to test the access check */
+#ifdef __x86_64__
+  asm volatile("movq   $0x123,%rsp;"   \
+   "movq   $-25,%rax;" \
+   "syscall;"   \
+   );
+#else
+  asm volatile("mov$0x123,%esp;"   \
+   "mov$-25,%eax;" \
+   "lcall  $0x7,$0x0;" \
+   );
+#endif
+  FAILURE("we shouldn't be here!");
+}
+
+void test_bad_syscall_num(void *arg)
+{
+#ifdef __x86_64__
+  asm volatile("movq   $0x123456,%rax;"\
+   "syscall;"   \
+   );
+#else
+  asm volatile("mov$0x123456,%eax;"\
+   "lcall  $0x7,$0x0;" \
+   );
+#endif
+  FAILURE("we shouldn't be here!");
+}
+
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  int err;
+  mach_port_t excp;
+
+  err = mach_port_allocate(mach_task_self (), MACH_PORT_RIGHT_RECEIVE, &excp);
+  ASSERT_RET(err, "creating exception port");
+
+  err = mach_port_insert_right(mach_task_self(), excp, excp,
+   MACH_MSG_TYPE_MAKE_SEND);
+  ASSERT_RET(err, "inserting send right into exception port");
+
+  err = task_set_special_port(mach_task_self(), TASK_EXCEPTION_PORT, excp);
+  ASSERT_RET(err, "setting task exception port");
+
+  /* FIXME: receiving an exception with small size causes GP on 64 bit 
userspace */
+  /* mig_reply_header_t msg; */
+  /* err = mach_msg(&msg.Head, /\* The header *\/ */
+  /*  

[PATCH 04/11] add mach_port tests

2024-01-11 Thread Luca Dariz
---
 tests/test-mach_port.c | 121 +
 tests/user-qemu.mk |   3 +-
 2 files changed, 123 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-mach_port.c

diff --git a/tests/test-mach_port.c b/tests/test-mach_port.c
new file mode 100644
index ..81a1072b
--- /dev/null
+++ b/tests/test-mach_port.c
@@ -0,0 +1,121 @@
+/*
+ *  Copyright (C) 2024 Free Software Foundation
+ *
+ * This program is free software ; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation ; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY ; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with the program ; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+
+void test_mach_port(void)
+{
+  kern_return_t err;
+
+  mach_port_name_t *namesp;
+  mach_msg_type_number_t namesCnt=0;
+  mach_port_type_t *typesp;
+  mach_msg_type_number_t typesCnt=0;
+  err = mach_port_names(mach_task_self(), &namesp, &namesCnt, &typesp, 
&typesCnt);
+  ASSERT_RET(err, "mach_port_names");
+  printf("mach_port_names: type/name length: %d %d\n", namesCnt, typesCnt);
+  ASSERT((namesCnt != 0) && (namesCnt == typesCnt),
+ "mach_port_names: wrong type/name length");
+  for (int i=0; i

Re: [PATCH 03/14] add mach_host tests

2024-01-06 Thread Luca

Il 06/01/24 20:43, Sergey Bugaev ha scritto:

On Sat, Jan 6, 2024 at 10:26 PM Luca  wrote:

Uhm, I still have an issue, although a bit different now:

By the way, the exception is still the same (General Protection, which
is usually forwarded to user space), but for a different reason,
apparently a non-canonical address in $rax=0x820175c0


I think this is just that you now have to add 2, not 1, to the result
of __builtin_frame_address().

Or better, do what I did in glibc, change it to an explicit function
argument [0], since __builtin_frame_address() doesn't return a useful
(for this purpose) value on aarch64.

[0]: 
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=24b707c1665afae7eb302542ffa92d53aa577111


great, this works :)
Thanks!


Luca




Re: [PATCH] news/2023-q4.mdwn: new qoth for q4 of 2023.

2024-01-06 Thread Luca

Il 06/01/24 21:20, Sergey Bugaev ha scritto:

On Sat, Jan 6, 2024 at 10:47 PM jbra...@dismail.de  wrote:

+Luca Dariz worked on adding [[some simple GNUMach user-space tests


I've never seen gnumach spelled like that (GNUMach), only GNU Mach or
gnumach. Is that intentional?

+Flavio Cruz [[improved GNUMach's
+IPC|https://lists.gnu.org/archive/html/bug-hurd/2023-11/msg00033.html]]
+by reordering `mach_msg_type_t` fields to byte align `msgt_name` and
+`msgt_size`.  He also created a patch series to [[avoid message
+resizing for
+x86_64|https://lists.gnu.org/archive/html/bug-hurd/2023-11/msg00028.html]].
+He also [[removed untyped mach RPC
+code|https://lists.gnu.org/archive/html/bug-hurd/2023-11/msg00026.html]],
+since the Hurd always uses typed Mach RPC.


It's not that much that *the Hurd* uses typed IPC, it's that gnumach
does. The Hurd could easily support either, and in fact this support
was present (though likely bitrotten), and this is what Flavio has
removed. This is because going forward, we don't see the Hurd running
on any other flavor of Mach; but it was originally the goal for it to
run on non-GNU Machs, IIRC.


+Sergey Bugaev added [[GNUMach entry re-coalescing
+support|https://darnassus.sceen.net/~hurd-web/open_issues/gnumach_vm_map_entry_forward_merging/]].
+Essentially, Mach is not always able to merge two vm entries that are
+made next to each other, which slows down ext2 and bash.  Sergey
+allowed GNUMach to merge entries in the common cases, which greatly
+helps ext2fs.


Is that actually true though — did it speed anything up?


I reinstalled hurd-amd64 and it does seem faster than a few months ago, 
at least noticeable during a foreign debootstrap (on both 32 and 64-bit).



Luca



Re: [PATCH 03/14] add mach_host tests

2024-01-06 Thread Luca

Il 06/01/24 20:02, Sergey Bugaev ha scritto:

On Sat, Jan 6, 2024 at 9:45 PM Samuel Thibault  wrote:


Luca, le sam. 06 janv. 2024 19:41:17 +0100, a ecrit:

Il 29/12/23 15:14, Luca Dariz ha scritto:

Il 29/12/23 14:44, Samuel Thibault ha scritto:

Also, it would be useful to compile the tests with
-ftrivial-auto-var-init=pattern so as to fill the structures with random
values before making the gnumach calls.


with this option all tests fail on the first mig-generated stub entry, which
is task_get_special_port(), in _start(). Maybe it's related to SSE somehow,
I see a page fault here:


Is $rbp unaligned?

(we do want to fix such bug anyway)


(gdb) disassemble task_get_special_port
Dump of assembler code for function task_get_special_port:
0x00416bc6 <+0>:   push   %rbp
0x00416bc7 <+1>:   mov%rsp,%rbp
0x00416bca <+4>:   sub$0xa0,%rsp
0x00416bd1 <+11>:  mov%edi,-0x94(%rbp)
0x00416bd7 <+17>:  mov%esi,-0x98(%rbp)
0x00416bdd <+23>:  mov%rdx,-0xa0(%rbp)
0x00416be4 <+30>:  lea-0x60(%rbp),%rax
0x00416be8 <+34>:  movdqa 0x124f0(%rip),%xmm0# 0x4290e0
=> 0x00416bf0 <+42>:  movaps %xmm0,(%rax)
0x00416bf3 <+45>:  movaps %xmm0,0x10(%rax)


Yes, you have to align the stack. An executable gets entered at _start
(or whatever the ELF header specifies) with %rsp 16-aligned, but you
must enter C code with %rsp being 8 modulo 8. To fix this, change your
_start like so:

asm(".global _start\n"
 "_start:\n"
 "  callq c_start");

void __attribute__((used, retain)) c_start()
{
 ...
}


Uhm, I still have an issue, although a bit different now:

(gdb) disassemble  c_start
Dump of assembler code for function c_start:
   0x00402ec1 <+0>:   push   %rbp
   0x00402ec2 <+1>:   mov%rsp,%rbp
   0x00402ec5 <+4>:   sub$0x30,%rsp
   0x00402ec9 <+8>:   movl   $0xfefefefe,-0x8(%rbp)
   0x00402ed0 <+15>:  movl   $0xfefefefe,-0xc(%rbp)
   0x00402ed7 <+22>:  mov%rbp,%rax
   0x00402eda <+25>:  add$0x8,%rax
   0x00402ede <+29>:  mov%rax,-0x18(%rbp)
   0x00402ee2 <+33>:  mov-0x18(%rbp),%rax
   0x00402ee6 <+37>:  mov%rax,-0x20(%rbp)
   0x00402eea <+41>:  mov-0x20(%rbp),%rax
   0x00402eee <+45>:  mov(%rax),%rax
   0x00402ef1 <+48>:	mov%eax,0x2a111(%rip)# 
0x42d008 

   0x00402ef7 <+54>:  mov-0x20(%rbp),%rax
   0x00402efb <+58>:  add$0x8,%rax
   0x00402eff <+62>:	mov%rax,0x2915a(%rip)# 
0x42c060 
   0x00402f06 <+69>:	mov0x29153(%rip),%rax# 
0x42c060 
   0x00402f0d <+76>:	mov0x2a0f5(%rip),%edx# 
0x42d008 

   0x00402f13 <+82>:  movslq %edx,%rdx
   0x00402f16 <+85>:  add$0x1,%rdx
   0x00402f1a <+89>:  shl$0x3,%rdx
   0x00402f1e <+93>:  add%rdx,%rax
   0x00402f21 <+96>:	mov%rax,0x2a0e8(%rip)# 
0x42d010 
   0x00402f28 <+103>:	movl   $0x0,0x2a0e6(%rip)# 
0x42d018 

   0x00402f32 <+113>: jmp0x402f43 
   0x00402f34 <+115>:	mov0x2a0de(%rip),%eax# 
0x42d018 

   0x00402f3a <+121>: add$0x1,%eax
   0x00402f3d <+124>:	mov%eax,0x2a0d5(%rip)# 
0x42d018 
   0x00402f43 <+130>:	mov0x2a0c6(%rip),%rax# 
0x42d010 
   0x00402f4a <+137>:	mov0x2a0c8(%rip),%edx# 
0x42d018 

   0x00402f50 <+143>: movslq %edx,%rdx
   0x00402f53 <+146>: shl$0x3,%rdx
   0x00402f57 <+150>: add%rdx,%rax
=> 0x000000402f5a <+153>:  mov(%rax),%rax
   0x00402f5d <+156>: test   %rax,%rax



By the way, the exception is still the same (General Protection, which 
is usually forwarded to user space), but for a different reason, 
apparently a non-canonical address in $rax=0x820175c0



Luca





Re: [PATCH 03/14] add mach_host tests

2024-01-06 Thread Luca

Il 06/01/24 19:44, Samuel Thibault ha scritto:

Luca, le sam. 06 janv. 2024 19:41:17 +0100, a ecrit:

Il 29/12/23 15:14, Luca Dariz ha scritto:

Il 29/12/23 14:44, Samuel Thibault ha scritto:

Also, it would be useful to compile the tests with
-ftrivial-auto-var-init=pattern so as to fill the structures with random
values before making the gnumach calls.


with this option all tests fail on the first mig-generated stub entry, which
is task_get_special_port(), in _start(). Maybe it's related to SSE somehow,
I see a page fault here:


Is $rbp unaligned?

(we do want to fix such bug anyway)


it's aligned to 8 byte, but it seems that for MOVAPS we need 16-biyte 
alignment.


By the way, by mistake that dump was with yet another change not part of 
this patch set, but also with the simple setup the failure is the same, 
just in another place (when printing the first string).



Luca



Re: [PATCH 03/14] add mach_host tests

2024-01-06 Thread Luca

Il 29/12/23 15:14, Luca Dariz ha scritto:

Il 29/12/23 14:44, Samuel Thibault ha scritto:

Also, it would be useful to compile the tests with
-ftrivial-auto-var-init=pattern so as to fill the structures with random
values before making the gnumach calls.


with this option all tests fail on the first mig-generated stub entry, 
which is task_get_special_port(), in _start(). Maybe it's related to SSE 
somehow, I see a page fault here:


(gdb) disassemble task_get_special_port
Dump of assembler code for function task_get_special_port:
   0x00416bc6 <+0>:   push   %rbp
   0x00416bc7 <+1>:   mov%rsp,%rbp
   0x00416bca <+4>:   sub$0xa0,%rsp
   0x00416bd1 <+11>:  mov%edi,-0x94(%rbp)
   0x00416bd7 <+17>:  mov%esi,-0x98(%rbp)
   0x00416bdd <+23>:  mov%rdx,-0xa0(%rbp)
   0x00416be4 <+30>:  lea-0x60(%rbp),%rax
   0x00416be8 <+34>:  movdqa 0x124f0(%rip),%xmm0# 0x4290e0
=> 0x00416bf0 <+42>:   movaps %xmm0,(%rax)
   0x00416bf3 <+45>:  movaps %xmm0,0x10(%rax)


Luca



Re: [PATCH 11/14] add raw mach_msg tests

2024-01-06 Thread Luca Dariz

Il 29/12/23 15:20, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:58 +0100, a ecrit:

+
+#define align_inline(val, n) { val = align(val, n); }


Rather name it ALIGN_INLINE, so people don't mistake this for a
function (that doesn't have side-effects on its parameters).


ok




+void
+run_test_simple(void *msg, mach_msg_size_t msglen, mach_msg_id_t msgid)
+{
+  mach_msg_header_t *head = msg;
+  mach_port_t port, receive;
+  int err;
+
+  err = syscall_mach_port_allocate (mach_task_self (),
+MACH_PORT_RIGHT_RECEIVE, &port);
+  ASSERT_RET(err, "syscall_mach_port_allocate");
+
+  err = syscall_mach_port_allocate (mach_task_self (),
+MACH_PORT_RIGHT_RECEIVE, &receive);
+  ASSERT_RET(err, "syscall_mach_port_allocate 2");
+
+  struct echo_params params;
+  params.rx_port = port;
+  params.rx_size = msglen;
+  params.rx_number = 1;
+  test_thread_start (mach_task_self (), echo_thread, ¶ms);
+  msleep(100);


Is this msleep really needed? If yes, this is racy. If it's just to make
the test a bit more stressful, then ok.


not really, I'll remove it


Luca




Re: [PATCH 08/14] add thread creation helper to tests

2024-01-06 Thread Luca Dariz

Il 29/12/23 15:14, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:55 +0100, a ecrit:

diff --git a/tests/testlib.c b/tests/testlib.c
index 6abe8c4d..e6be46ee 100644
--- a/tests/testlib.c
+++ b/tests/testlib.c
@@ -95,3 +95,56 @@ void _start()
printf("%s: test %s exit code %x\n", TEST_SUCCESS_MARKER, argv[0], ret);
halt();
  }
+
+// started from 
https://github.com/dwarfmaster/mach-ipc/blob/master/minimal_threads/main.c


So this *has* to reproduce the MIT license header of that repository.
Better put it in a separate file.


ok




+static uint32_t stack_top[PAGE_SIZE]  __attribute__ ((aligned (PAGE_SIZE)));
+thread_t test_thread_start(task_t task, void(*routine)(void*), void* arg) {
+  const vm_size_t stack_size = PAGE_SIZE * 16;
+  kern_return_t ret;
+  vm_address_t stack;
+
+  ret = vm_allocate(task, &stack, stack_size, TRUE);
+  ASSERT_RET(ret, "can't allocate the stack for a new thread");
+
+  ret = vm_protect(task, stack, PAGE_SIZE, FALSE, VM_PROT_NONE);
+  ASSERT_RET(ret, "can't protect the stack from overflows");
+
+  long *top = (long*)((vm_offset_t)stack_top + PAGE_SIZE) - 2;



+  *top = 0;/* The return address */
+#ifndef __x86_64__
+  *(top + 1) = (long)arg; /* The argument is passed on the stack on x86_32 */
+#endif
+  ret = vm_write(task, stack + stack_size - PAGE_SIZE, (vm_offset_t)stack_top, 
PAGE_SIZE);
+  ASSERT_RET(ret, "can't initialize the stack for the new thread");
+
+  thread_t thread;
+  ret = thread_create(task, &thread);
+  ASSERT_RET(ret, "thread_create()");
+
+  struct i386_thread_state state;
+  unsigned int count;
+  count = i386_THREAD_STATE_COUNT;
+  ret = thread_get_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, &count);
+  ASSERT_RET(ret, "thread_get_state()");
+
+#ifdef __x86_64__
+  state.rip = (long) routine;
+  state.ursp = (long) (stack + stack_size - 16);


Rather use sizeof(long)*2.


+  state.rbp = 0;
+  state.rdi = (long)arg;
+#else
+  state.eip = (long) routine;
+  state.uesp = (long) (stack + stack_size - 8);


And here as well.


ok




+  state.ebp = 0;
+#endif
+  ret = thread_set_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, i386_THREAD_STATE_COUNT);
+  ASSERT_RET(ret, "thread_set_state");
+
+  ret = thread_resume(thread);
+  ASSERT_RET(ret, "thread_resume");
+
+  return thread;
+}
--
2.39.2





Re: [PATCH 07/14] add basic vm tests

2024-01-06 Thread Luca Dariz

Il 29/12/23 15:09, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:54 +0100, a ecrit:

+  // this emulates maptime()
+  struct mapped_time_value *mtime;
+  mach_port_t device, memobj;
+  int err = device_open (device_priv(), 0, "time", &device);
+  ASSERT_RET(err, "device_open");
+  err = device_map (device, VM_PROT_READ, 0, sizeof(*mtime), &memobj, 0);
+  ASSERT_RET(err, "device_map");
+  err = mach_port_deallocate (mach_task_self (), device);
+  ASSERT_RET(err, "mach_port_deallocate");
+  mtime = 0;
+  err =
+vm_map (mach_task_self (), (vm_address_t *)&mtime, sizeof *mtime, 0, 1,
+memobj, 0, 0, VM_PROT_READ, VM_PROT_READ, VM_INHERIT_NONE);
+  ASSERT_RET(err, "vm_map");
+  err = mach_port_deallocate (mach_task_self (), memobj);
+  ASSERT_RET(err, "mach_port_deallocate");


I'd say try to print the content, so as to check that this did return a
pointer that is readable?


ok

Luca




Re: [PATCH 06/14] adjust range when changing memory pageabilityg

2024-01-06 Thread Luca Dariz

Il 29/12/23 15:06, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:53 +0100, a ecrit:

* vm/vm_map.c: if the start address is not in the map, try to find the
   nearest entry instead of failing.

This caused the initial vm_wire_all(host, task VM_WIRE_ALL) in glibc
startup to fail with KERN_NO_SPACE.
---
  vm/vm_map.c | 21 -
  1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/vm/vm_map.c b/vm/vm_map.c
index 26e18676..4a5861e3 100644
--- a/vm/vm_map.c
+++ b/vm/vm_map.c
@@ -1774,13 +1774,24 @@ kern_return_t vm_map_pageable(
  
  	if (!vm_map_lookup_entry(map, start, &start_entry)) {

/*
-*  Start address is not in map; this is fatal.
+*  Start address is not in map; try to find the nearest 
entry


It is wrong to drop the error like this. The caller did ask for these
addresses. If they are wrong, the caller should be told so, and not
papered over. Possibly it's vm_map_pageable_all or such that actually
needs to be fixed.


yes that seems a better place.


Luca




Re: [PATCH 05/14] add mach_port tests

2024-01-06 Thread Luca Dariz

Il 29/12/23 15:01, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:52 +0100, a ecrit:

+  mach_port_t newname = 123;


Why initializing it?


the idea was to check that the value it's actually set, I'll fix it by 
using a valid port name.





+  err = mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, 
&newname);
+  ASSERT_RET(err, "mach_port_allocate");
+  err = mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_PORT_SET, 
&newname);
+  ASSERT_RET(err, "mach_port_allocate");
+  err = mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_DEAD_NAME, 
&newname);
+  ASSERT_RET(err, "mach_port_allocate");


You could check that you are getting different names.


ok




+  err = mach_port_destroy(mach_task_self(), newname);
+  ASSERT_RET(err, "mach_port_destroy");
+
+  err = mach_port_deallocate(mach_task_self(), 1002);
+  ASSERT_RET(err, "mach_port_deallocate");
+
+  mach_port_urefs_t urefs;
+  err = mach_port_get_refs(mach_task_self(), 1002, MACH_PORT_RIGHT_DEAD_NAME, 
&urefs);
+  ASSERT_RET(err, "mach_port_get_refs");


You could also check the references of the ports created above.


ok


Luca




Re: 64bit startup

2024-01-05 Thread Luca

Il 04/01/24 10:55, Sergey Bugaev ha scritto:

P.S. I have posted all of my patches, so if you're interested in
hacking on aarch64-gnu Mach, you should be able to build the full
toolchain now.


sure, I've started looking into it, but it will take a while before I 
can run something in userspace. I'm working on top of your gnumach patch 
for now.



Luca



Re: 64bit startup

2024-01-05 Thread Luca

Il 05/01/24 19:12, Sergey Bugaev ha scritto:

/servers/crash-dump-core crashes on the memset () call in
hurd:exec/elfcore.c:fetch_thread_fpregset (); the (*fpregs) pointer is
NULL. The caller passes fpregs = ¬e.data.pr_fpreg, where note.data
is of type struct elf_lwpstatus, defined in hurd:include/sys/procfs.h,
whose pr_fpreg field is of type prfpregset_t, which is a typedef to
fpregset_t, which was an actual struct on i386, but is a pointer on
x86_64. This would've been easier to debug if I had debuginfo :)


I had this small patch applied that apparently is enough for me to have 
some kind of core dump, I'm not sure if it's a good solution:


diff --git a/exec/elfcore.c b/exec/elfcore.c
index c6aa2bc8b..405fa8e0c 100644
--- a/exec/elfcore.c
+++ b/exec/elfcore.c
@@ -544,6 +544,11 @@ dump_core (task_t task, file_t file, off_t corelimit,
   note.data.pr_info.si_code = sigcode;
   note.data.pr_info.si_errno = sigerror;

+#ifdef __x86_64__
+  struct _libc_fpstate fpstate;
+  memset(&fpstate, 0, sizeof(fpstate));
+  note.data.pr_fpreg = &fpstate;
+#endif
   fetch_thread_regset (threads[i], ¬e.data.pr_reg);
   fetch_thread_fpregset (threads[i], ¬e.data.pr_fpreg);


HTH
Luca




Re: 64bit startup

2024-01-03 Thread Luca

Hi Sergey,

Il 03/01/24 09:17, Sergey Bugaev ha scritto:

How are you running it? Should I still be using a ramdisk image and
not rumpdisk?


Recently I've been installing hurd-amd64 on another disk of my hurd-i386 
vm and booting from that. Basically I prepare the disk with debootstrap 
--foreign, then I reuse the i386 grub install to boot the 64 bit kernel 
with a custom entry, then run the --second stage, configure login, fstab 
and network and reboot. I can give you the exact commands and setup I'm 
using if you want (I need to reinstall it anyway due to latest changes),


I'm currently using qemu via virt-manager, mostly with the default 
configuration for an x86_64 vm; that means a virtual SATA disk 
controller and Q35 chipset.


The only issue I see is that sometimes at shutdown rumpdisk hangs and I 
can't halt the system, however this seems the same with hurd-i686 and it 
doesn't happen if I force the shutdown with halt-hurd (or reboot-hurd). 
I didn't had a deeper look at this so far, but I don't have issues 
booting or connecting via ssh. I didn't try heavy builds, so probably 
that's why I don't see Samuel's issue, but I've been building gdb and 
the hurd itself (for a fix for the crash server that I have in my queue).



Luca



Re: aarch64-gnu (and Happy New Year!)

2024-01-01 Thread Luca

Il 01/01/24 14:51, Sergey Bugaev ha scritto:

On Mon, Jan 1, 2024 at 4:02 PM Luca  wrote:

Hi Sergey,


Hi Luca,


Really great work! To work on gnumach we just need MIG and any armv8
compiler (also targeting GNU/Linux is fine), and it seems MIG works fine
without adjustments? Maybe there could be some issues once it's run
somewhere, e.g. alignment issues.


Well, I have the aarch64-gnu toolchain here, please see the binutils &
GCC patches that I posted. But yeah, any aarch64-targeting compiler
should work for gnumach, including aarch64-linux-gnu or just a generic
embedded aarch64 target.

MIG seems to just work (thanks to all the Flávio's work!). I'm using
the same message ABI as on x86_64, and haven't seen any issues so far
— neither compiler errors / failed static assertions (about struct
sizes and such), nor hardware errors from misaligned accesses.


Good to hear! so if I understand it correctly you tested it on aarch64 
GNU/Linux?




Once I post the preliminary gnumach patches, you should be able to
build a complete binutils + GCC + MIG aarch64-gnu toolchain.


Another issue with ARM in general is that the hardware support is much
less streamlined than x86, although with v8 there should be some
alignment on basic stuff like IRQ and UEFI.


So I've heard, yeah. But yeah I've also heard that newer chips support
UEFI, and from UEFI it should be possible to boot GRUB and from there
gnumach, so maybe the boot process is not as much of a problem as it
was, and we could avoid separate per-board builds of gnumach.


Starting the kernel itself will probably not a problem, thanks to GRUB, 
but it shouldn't be too hard to add support for U-Boot either if needed.


I think more issues might come out setting up the various pieces of the 
system. For example, some chips have heterogeneous cores, (e.g.mine has 
two A72 cores and four A53 cores) so SMP will be more complicated.


Also, about the serial console, it might be useful at some point to use 
a driver from userspace, if we can reuse some drivers from netbsd or 
linux, to avoid embedding all of them in gnumach.



But as always, I have no idea what I'm talking about.


Probably even the serial
console needs a platform-specific driver (I'm not sure, I'm more
familiar with older and more embedded variants like Cortex-M)

To bootstrap gnumach the first thing we'd need would probably be the
console, setting up the virtual memory, then thread states, context
switch, irqs and userspace entry points (list by no means exhaustive).

I actually have an armv8 server that would be handy for some
development, so I might be able to help with something in the future.


Sounds great! (But you should also be able to use qemu, no?)


You can emulate aarch64 from an amd64 machine, but it's not very 
efficient. Also, I would realay like to run GNU/Hurd natively :)




Even before you write any code, it would be great if you helped review
my gnumach patches (e.g. aarch64_thread_state & the arm/aarch64
exceptions).


sure, but I'll have to get a bit more familiar with the aarch64 details 
first.



Also, there is a bunch of design work to do.

Will/can AArch64 use the same mechanism for letting userland handle
interrupts? Do we have all the mechanisms required for userland to
poke at specific addresses in memory (to replace I/O ports)? — I
believe we do, but I haven't looked closely.


AFAIK there are no I/O ports in ARM, the usual way to configure things 
is with memory-mapped registers, so this might become even easier. About 
IRQs, probably it needs to be arch-specific anyway.



What should the API for manipulating PAC keys look like? I kind of
want it to be just another flavor of thread state, but then it is
really supposed to be per-trask, not per-thread, and maybe adding some
aarch64-specific RPCs to read and write the PAC keys in
mach_aarch64.defs is the right way. But also AFAICS Mach currently has
no notion of per-task arch-specific data (unlike for threads, and
other than the VM map), so it'd be interesting to add one — could it
be useful for something else?


Do you mean Pointer Authentication Code? That seems quite a new feature, 
so maybe not fundamental to have a first working kernel, but anyway I 
don't see issues in having arch-specific data in tasks.



What are the debugging facilities available on ARM / AArch64? Should
we expose them as more flavors of thread state, or...? What would GDB
need?


For gnumach, it might be easier to debug on real hw than x86 if we can 
use openocd as gdbserver (but JTAG lines are not always reachable).


For user space tasks I don't know.



Should gnumach accept tagged addresses (like PR_SET_TAGGED_ADDR_CTRL on Linux)?


It could, but also it doesn't seem mandatory, so I would start without it.


Can we make Linux code (in-Mach drivers, pfinet, netdde, ...) work on AArch64?


I think nowadays LKL would be a better target, unless 

Re: aarch64-gnu (and Happy New Year!)

2024-01-01 Thread Luca
teers for a gnumach port? :)


Another issue with ARM in general is that the hardware support is much 
less streamlined than x86, although with v8 there should be some 
alignment on basic stuff like IRQ and UEFI. Probably even the serial 
console needs a platform-specific driver (I'm not sure, I'm more 
familiar with older and more embedded variants like Cortex-M)


To bootstrap gnumach the first thing we'd need would probably be the 
console, setting up the virtual memory, then thread states, context 
switch, irqs and userspace entry points (list by no means exhaustive).


I actually have an armv8 server that would be handy for some 
development, so I might be able to help with something in the future.




Luca




Re: [PATCH 04/14] add gsync tests

2023-12-29 Thread Luca Dariz

Il 29/12/23 14:56, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:51 +0100, a ecrit:

+static void single_t2(void *arg)
+{
+  int err;
+  msleep(100);
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake t2");
+
+  err = thread_terminate(mach_thread_self());
+  ASSERT_RET(err, "thread_terminate");


thread_terminate is not actually supposed to return, so that'd rather be
an assert(0)


right, I think I added the FAILURE() macro later than this and I forgot 
to update



+}
+
+static void test_single()
+{
+  int err;
+  single_shared = 0;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait 1");
+
+  single_shared = 1;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_INVALID_ARGUMENT, "gsync_wait 2");
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait 2b");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 1");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 
GSYNC_BROADCAST);
+  ASSERT_RET(err, "gsync_wake 2");


Here, you want to try gsync_wait again, and check that they time out
again (i.e. the kernel did forget about the gsync_wake, as expected).


ok added


+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 2, 
GSYNC_MUTATE);
+  ASSERT_RET(err, "gsync_wake 2a");
+  ASSERT(single_shared == 2, "gsync_wake single_shared");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 3");
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 3a");


Why two such calls?


To be honest I don't remember, maybe a mistake. I'll try to add more 
descriptive messages on assertions.



+  single_shared = 1;
+  test_thread_start(mach_task_self(), single_t2, 0);
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 0, 0);
+  ASSERT_RET(err, "gsync_wait 4");


This is racy since the thread may start and call gsync_wake before
gsync_wait gets to block, and thus gsync_wait would stay stuck. Of
course in practice this doesn't happen because of the msleep(100) in the
thread, but that still leaves it racy and prone to fail if the system is
e.g. very busy. You want to make single_t2 modify single_shared, so that
if test_single2 happens to call gsync_wait after that, it will just not
block and return KERN_INVALID_ARGUMENT instead, as expected.


ok




+static void test_single2()
+{
+  int err;
+  test_thread_start(mach_task_self(), single_t2, 0);
+  single_shared = 2;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 2, 0, 0, 0);
+  ASSERT_RET(err, "gsync_wait 1");
+  // should this fail?
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 2");
+}


I don't understand the test. Why waking after waiting? There is nobody
to wake. You also want to set single_shared before starting the thread.


Looking at it again, it really seems the same case as above, I don't 
remember why I added the additional wake. I'll make this a 
test_single_from_thread() or similar, merge your suggestion above and 
remove the duplicate test from test_single(), I guess the exact value 
used in the synchronization is not relevant, as long as it's the 
expected one.



Luca



Re: [PATCH 03/14] add mach_host tests

2023-12-29 Thread Luca Dariz

Il 29/12/23 14:44, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:50 +0100, a ecrit:

---
  tests/test-mach_host.c | 54 ++
  1 file changed, 54 insertions(+)
  create mode 100644 tests/test-mach_host.c

diff --git a/tests/test-mach_host.c b/tests/test-mach_host.c
new file mode 100644
index ..99fc4aca
--- /dev/null
+++ b/tests/test-mach_host.c


Such files are also non-trivial, and thus deserve a copyright header too.


@@ -0,0 +1,54 @@
+
+#include 
+
+#include 
+
+void test_kernel_version()
+{
+  int err;
+  kernel_version_t kver;
+  err = host_get_kernel_version(mach_host_self(), kver);
+  ASSERT_RET(err, "host_kernel_info");
+  printf("kernel version: %s\n", kver);
+}
+
+void test_host_info()
+{
+  int err;
+  mach_msg_type_number_t count;
+  mach_port_t thishost = mach_host_self();
+
+  host_basic_info_data_t binfo;
+  count = HOST_BASIC_INFO_COUNT;
+  err = host_info(thishost, HOST_BASIC_INFO, (host_info_t)&binfo, &count);
+  ASSERT_RET(err, "host_basic_info");
+  ASSERT(count == HOST_BASIC_INFO_COUNT, "");


Perhaps also make some basic checks, such as binfo.avail_cpus <=
binfo.max_cpus.


ok




+  const int maxcpus = 255;
+  int proc_slots[maxcpus];
+  count = maxcpus;
+  err = host_info(thishost, HOST_PROCESSOR_SLOTS, (host_info_t)&proc_slots, 
&count);
+  ASSERT_RET(err, "host_processor_slots");
+  ASSERT((1 <= count) && (count <= maxcpus), "");
+
+  host_sched_info_data_t sinfo;
+  count = HOST_SCHED_INFO_COUNT;
+  err = host_info(thishost, HOST_SCHED_INFO, (host_info_t)&sinfo, &count);
+  ASSERT_RET(err, "host_sched_info");
+  ASSERT(count == HOST_SCHED_INFO_COUNT, "");


Here you can probably check that the min_* values is smaller than e.g.
1000, to catch randomly-filled values.


ok



Also, it would be useful to compile the tests with
-ftrivial-auto-var-init=pattern so as to fill the structures with random
values before making the gnumach calls.


good idea, I didn't know this feature.


Luca




Re: [PATCH 02/14] add basic user-space tests with qemu

2023-12-29 Thread Luca Dariz

Il 29/12/23 14:57, Samuel Thibault ha scritto:

Luca Dariz, le ven. 29 déc. 2023 14:51:31 +0100, a ecrit:

Il 29/12/23 14:37, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 28 déc. 2023 20:42:49 +0100, a ecrit:

new file mode 100644
index ..4cf25891
--- /dev/null
+++ b/tests/README
@@ -0,0 +1,37 @@
+
+There are some basic tests that can be run qith qemu. You can run all the 
tests with
+
+$ make check
+
+or selectively with:
+
+$ make run-hello
+
+Also, you can debug the existing tests, or a new one, by starting on one shell
+
+$ make debug-hello
+
+and on another shell you can attach with gdb, load the symbols of the
+bootstrap module and break on its _start():
+
+$ gdb gnuamch


Typo ;)


What would be a better command?


The command is fine, I just mean the typo gnuamch -> gnumach :)


aah ok :)




+++ b/tests/run-qemu.sh.template
+++ b/tests/test-hello.c
+++ b/tests/user-qemu.mk


These are non-trivial, and so definitely need a copyright header.


Right, I forgot to add it to all files... I'll resend the patch set


Perhaps wait for my reviews on the other patchs.


diff --git a/tests/syscalls.S b/tests/syscalls.S
new file mode 100644
index ..df9c9bc0
--- /dev/null
+++ b/tests/syscalls.S
@@ -0,0 +1,4 @@
+


Spurious line?


I'll add the copyright header here also


+#include 
+
+kernel_trap(invalid_syscall,-31,0)


For such a trivial content it's not really useful.

Samuel





Re: [PATCH 02/14] add basic user-space tests with qemu

2023-12-29 Thread Luca Dariz

Hi,

Il 29/12/23 14:37, Samuel Thibault ha scritto:

Hello,

Thanks, this looks good!

Luca Dariz, le jeu. 28 déc. 2023 20:42:49 +0100, a ecrit:

new file mode 100644
index ..4cf25891
--- /dev/null
+++ b/tests/README
@@ -0,0 +1,37 @@
+
+There are some basic tests that can be run qith qemu. You can run all the 
tests with
+
+$ make check
+
+or selectively with:
+
+$ make run-hello
+
+Also, you can debug the existing tests, or a new one, by starting on one shell
+
+$ make debug-hello
+
+and on another shell you can attach with gdb, load the symbols of the
+bootstrap module and break on its _start():
+
+$ gdb gnuamch


Typo ;)


What would be a better command? This is actually how I start gdb from 
the build directory, then I have a .gdbinit that connects to qemu and 
adds the userspace symbols if needed, sets breakpoints etc... Maybe I 
can add a minimal .gdbinit and copy it in the build directory?



+++ b/tests/run-qemu.sh.template
+++ b/tests/test-hello.c
+++ b/tests/user-qemu.mk


These are non-trivial, and so definitely need a copyright header.


Right, I forgot to add it to all files... I'll resend the patch set




diff --git a/tests/syscalls.S b/tests/syscalls.S
new file mode 100644
index ..df9c9bc0
--- /dev/null
+++ b/tests/syscalls.S
@@ -0,0 +1,4 @@
+


Spurious line?


I'll add the copyright header here also




+#include 
+
+kernel_trap(invalid_syscall,-31,0)





[PATCH 14/14] add tests to make check

2023-12-28 Thread Luca Dariz
---
 tests/user-qemu.mk | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tests/user-qemu.mk b/tests/user-qemu.mk
index 50b04736..669eb77a 100644
--- a/tests/user-qemu.mk
+++ b/tests/user-qemu.mk
@@ -178,7 +178,15 @@ clean-test-%:
 
 
 USER_TESTS := \
-   tests/test-hello
+   tests/test-hello \
+   tests/test-mach_host \
+   tests/test-gsync \
+   tests/test-machmsg \
+   tests/test-mach_port \
+   tests/test-syscalls \
+   tests/test-task \
+   tests/test-threads \
+   tests/test-vm
 
 USER_TESTS_CLEAN = $(subst tests/,clean-,$(USER_TESTS))
 
-- 
2.39.2




[PATCH 11/14] add raw mach_msg tests

2023-12-28 Thread Luca Dariz
---
 tests/test-machmsg.c | 390 +++
 1 file changed, 390 insertions(+)
 create mode 100644 tests/test-machmsg.c

diff --git a/tests/test-machmsg.c b/tests/test-machmsg.c
new file mode 100644
index ..c262f5f4
--- /dev/null
+++ b/tests/test-machmsg.c
@@ -0,0 +1,390 @@
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define ECHO_MAX_BODY_LEN 256
+
+static uint32_t align(uint32_t val, size_t aln)
+{
+  // we should check aln is a power of 2
+  aln--;
+  return (val + aln) & (~aln);
+}
+
+#define align_inline(val, n) { val = align(val, n); }
+
+struct echo_params
+{
+  mach_port_t rx_port;
+  mach_msg_size_t rx_size;
+  mach_msg_size_t rx_number;
+};
+
+void echo_thread (void *arg)
+{
+  struct echo_params *params = arg;
+  int err;
+  struct message
+  {
+mach_msg_header_t header;
+char body[ECHO_MAX_BODY_LEN];
+  } message;
+
+  printf ("thread echo started\n");
+  for (mach_msg_size_t i=0; irx_number; i++)
+{
+  message.header.msgh_local_port = params->rx_port;
+  message.header.msgh_size = sizeof (message);
+
+  err = mach_msg (&message.header,
+  MACH_RCV_MSG,
+  0, sizeof (message),
+  params->rx_port, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg echo rx");
+  printf("echo rx %d expected 5d\n",
+ message.header.msgh_size, params->rx_size);
+  ASSERT(message.header.msgh_size == params->rx_size,
+ "wrong size in echo rx");
+
+  message.header.msgh_local_port = MACH_PORT_NULL;
+  printf ("echo: msgh_id %d\n", message.header.msgh_id);
+
+  err = mach_msg (&message.header,
+  MACH_SEND_MSG,
+  message.header.msgh_size, 0,
+  MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+  ASSERT_RET(err, "mach_msg echo tx");
+}
+  printf ("thread echo stopped\n");
+  err = thread_terminate (mach_thread_self ());
+  ASSERT_RET(err, "thread_terminate");
+}
+
+#define TEST_ITERATIONS 3
+
+// TODO run_test_iterations
+void
+test_iterations (void)
+{
+  mach_port_t port, receive;
+  int err;
+  struct message
+  {
+mach_msg_header_t header;
+mach_msg_type_t type;
+char data[64];
+  } message;
+
+  err = mach_port_allocate (mach_task_self (),
+   MACH_PORT_RIGHT_RECEIVE, &port);
+  ASSERT_RET(err, "mach_port_allocate");
+
+  err = mach_port_allocate (mach_task_self (),
+   MACH_PORT_RIGHT_RECEIVE, &receive);
+  ASSERT_RET(err, "mach_port_allocate 2");
+
+  struct echo_params params;
+  params.rx_port = port;
+  params.rx_size = sizeof(message.header) + sizeof(message.type) + 5;
+  align_inline(params.rx_size, MACH_MSG_USER_ALIGNMENT);
+  params.rx_number = TEST_ITERATIONS;
+  test_thread_start (mach_task_self (), echo_thread, ¶ms);
+
+  time_value_t start_time;
+  err = host_get_time (mach_host_self (), &start_time);
+  ASSERT_RET(err, "host_get_time");
+
+  /* Send a message down the port */
+  for (int i = 0; i < TEST_ITERATIONS; i++)
+{
+  struct message message;
+
+  memset (&message, 0, sizeof (message));
+  strcpy (message.data, "ciao");
+  size_t datalen = strlen (message.data) + 1;
+
+  message.header.msgh_bits
+   = MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND,
+ MACH_MSG_TYPE_MAKE_SEND_ONCE);
+  message.header.msgh_remote_port = port;  /* Request port */
+  message.header.msgh_local_port = receive;/* Reply port */
+  message.header.msgh_id = 123;/* Message id */
+  message.header.msgh_size = sizeof (message.header) + sizeof 
(message.type) + datalen;/* Message size */
+  align_inline(message.header.msgh_size, 4);
+  message.type.msgt_name = MACH_MSG_TYPE_STRING;   /* Parameter type */
+  message.type.msgt_size = 8 * datalen;/* # Bits */
+  message.type.msgt_number = 1;/* Number of elements */
+  message.type.msgt_inline = TRUE; /* Inlined */
+  message.type.msgt_longform = FALSE;  /* Shortform */
+  message.type.msgt_deallocate = FALSE;/* Do not deallocate */
+  message.type.msgt_unused = 0;/* = 0 */
+
+  /* Send the message on its way and wait for a reply.  */
+  err = mach_msg (&message.header, /* The header */
+ MACH_SEND_MSG | MACH_RCV_MSG, /* Flags */
+ message.header.msgh_size, /* Send size */
+ sizeof (message), /* Max receive Size */
+ receive,  /* Receive port */
+ MACH_MSG_TIMEOUT_NONE,/* No timeout */
+ MACH_PORT_NULL);  /* No notification */
+  ASSERT_RET(err, "mach_msg txrx");
+}
+
+  time_value_t stop_time;
+  err = host_get_time (mach_host_self (), &stop_time);
+  ASSERT_RET(err, "host_get_time");
+
+  printf ("start: %d.%06d\n", start_time.seconds, star

[PATCH 12/14] add basic task tests

2023-12-28 Thread Luca Dariz
---
 tests/test-task.c | 149 ++
 1 file changed, 149 insertions(+)
 create mode 100644 tests/test-task.c

diff --git a/tests/test-task.c b/tests/test-task.c
new file mode 100644
index ..da9289ff
--- /dev/null
+++ b/tests/test-task.c
@@ -0,0 +1,149 @@
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+
+void test_task()
+{
+  mach_port_t ourtask = mach_task_self();
+  mach_msg_type_number_t count;
+  int err;
+
+  struct task_basic_info binfo;
+  count = TASK_BASIC_INFO_COUNT;
+  err = task_info(ourtask, TASK_BASIC_INFO, (task_info_t)&binfo, &count);
+  ASSERT_RET(err, "TASK_BASIC_INFO");
+
+  struct task_events_info einfo;
+  count = TASK_EVENTS_INFO_COUNT;
+  err = task_info(ourtask, TASK_EVENTS_INFO, (task_info_t)&einfo, &count);
+  ASSERT_RET(err, "TASK_EVENTS_INFO");
+
+  struct task_thread_times_info ttinfo;
+  count = TASK_THREAD_TIMES_INFO_COUNT;
+  err = task_info(ourtask, TASK_THREAD_TIMES_INFO, (task_info_t)&ttinfo, 
&count);
+  ASSERT_RET(err, "TASK_THREAD_TIMES_INFO");
+}
+
+
+void dummy_thread(void *arg)
+{
+  printf("started dummy thread\n");
+  while (1)
+;
+}
+
+void check_threads(thread_t *threads, mach_msg_type_number_t nthreads)
+{  
+  for (int tid=0; tid

[PATCH 13/14] add basic thread tests

2023-12-28 Thread Luca Dariz
---
 tests/test-threads.c | 88 
 1 file changed, 88 insertions(+)
 create mode 100644 tests/test-threads.c

diff --git a/tests/test-threads.c b/tests/test-threads.c
new file mode 100644
index ..659aaf3b
--- /dev/null
+++ b/tests/test-threads.c
@@ -0,0 +1,88 @@
+
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+void sleeping_thread(void* arg)
+{
+  printf("starting thread %d\n", arg);
+  for (int i=0; i<100; i++)
+  msleep(50);  // maybe we could just yield
+  printf("stopping thread %d\n", arg);
+  int err = thread_terminate(mach_thread_self());
+  ASSERT_RET(err, "thread_terminate");
+}
+
+void test_many(void)
+{
+  int err;
+  for (long tid=0; tid<10; tid++)
+{
+  test_thread_start(mach_task_self(), sleeping_thread, (void*)tid);
+}
+  // TODO: wait for thread end notifications
+  msleep(6000);
+}
+
+#ifdef __x86_64__
+void test_fsgs_base_thread(void* tid)
+{
+  int err;
+#if defined(__SEG_FS) && defined(__SEG_GS)
+  long __seg_fs *fs_ptr;
+  long __seg_gs *gs_ptr;
+  long fs_value;
+  long gs_value;
+
+  struct i386_fsgs_base_state state;
+  state.fs_base = (unsigned long)&fs_value;
+  state.gs_base = (unsigned long)&gs_value;
+  err = thread_set_state(mach_thread_self(), i386_FSGS_BASE_STATE,
+ (thread_state_t) &state, i386_FSGS_BASE_STATE_COUNT);
+  ASSERT_RET(err, "thread_set_state");
+
+  fs_value = 0x100 + (long)tid;
+  gs_value = 0x200 + (long)tid;
+
+  msleep(50);  // allow the others to set their segment base
+
+  fs_ptr = 0;
+  gs_ptr = 0;
+  long rdvalue = *fs_ptr;
+  printf("FS expected %lx read %lx\n", fs_value, rdvalue);
+  ASSERT(fs_value == rdvalue, "FS base error\n");
+  rdvalue = *gs_ptr;
+  printf("GS expected %lx read %lx\n", gs_value, rdvalue);
+  ASSERT(gs_value == rdvalue, "GS base error\n");
+#else
+#error " missing __SEG_FS and __SEG_GS"
+#endif
+
+  err = thread_terminate(mach_thread_self());
+  ASSERT_RET(err, "thread_terminate");
+}
+#endif
+
+void test_fsgs_base(void)
+{
+#ifdef __x86_64__
+  int err;
+  for (long tid=0; tid<10; tid++)
+{
+  test_thread_start(mach_task_self(), test_fsgs_base_thread, (void*)tid);
+}
+  msleep(1000);  // TODO: wait for threads
+#endif
+}
+
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_fsgs_base();
+  test_many();
+  return 0;
+}
-- 
2.39.2




[PATCH v2 00/14] add user tests to gnumach

2023-12-28 Thread Luca Dariz
This patch set adds simple user-space tests as part of gnumach's make
check, running on qemu, and some fixes found in the meantime.
The tests are far for complete and currently mostly cover issues I
found working on x86_64, but they are quite minimal and only require a
working MIG. They will be definitely useful in case of new
architectures, if supported by qemu.
Currently I ran the tests only on GNU/Linux as qemu does not run yet
on the Hurd.

Luca Dariz (14):
  USER32: change default to disabled and make it a general option
  add basic user-space tests with qemu
  add mach_host tests
  add gsync tests
  add mach_port tests
  adjust range when changing memory pageability
  add basic vm tests
  add thread creation helper to tests
  add syscall tests
  expose MACH_MSG_USER_ALIGNMENT for manually-built messages
  add raw mach_msg tests
  add basic task tests
  add basic thread tests
  add tests to make check

 configure.ac |  16 +-
 include/mach/message.h   |  19 +-
 tests/Makefrag.am|  11 +-
 tests/README |  37 +++
 tests/configfrag.ac  |  16 ++
 tests/grub.cfg.single.template   |   4 +
 tests/include/device/cons.h  |  10 +
 tests/include/kern/printf.h  |   1 +
 tests/include/mach/mig_support.h |  71 ++
 tests/include/syscalls.h |  83 +++
 tests/include/testlib.h  |  58 +
 tests/include/util/atoi.h|   1 +
 tests/run-qemu.sh.template   |  23 ++
 tests/syscalls.S |   4 +
 tests/test-gsync.c   |  87 +++
 tests/test-hello.c   |   9 +
 tests/test-mach_host.c   |  54 +
 tests/test-mach_port.c   |  80 +++
 tests/test-machmsg.c | 390 +++
 tests/test-syscalls.c| 149 
 tests/test-task.c| 149 
 tests/test-threads.c |  88 +++
 tests/test-vm.c  |  52 +
 tests/testlib.c  | 150 
 tests/user-qemu.mk   | 203 
 vm/vm_map.c  |  21 +-
 x86_64/configfrag.ac |  14 +-
 27 files changed, 1771 insertions(+), 29 deletions(-)
 create mode 100644 tests/README
 create mode 100644 tests/grub.cfg.single.template
 create mode 100644 tests/include/device/cons.h
 create mode 12 tests/include/kern/printf.h
 create mode 100644 tests/include/mach/mig_support.h
 create mode 100644 tests/include/syscalls.h
 create mode 100644 tests/include/testlib.h
 create mode 12 tests/include/util/atoi.h
 create mode 100644 tests/run-qemu.sh.template
 create mode 100644 tests/syscalls.S
 create mode 100644 tests/test-gsync.c
 create mode 100644 tests/test-hello.c
 create mode 100644 tests/test-mach_host.c
 create mode 100644 tests/test-mach_port.c
 create mode 100644 tests/test-machmsg.c
 create mode 100644 tests/test-syscalls.c
 create mode 100644 tests/test-task.c
 create mode 100644 tests/test-threads.c
 create mode 100644 tests/test-vm.c
 create mode 100644 tests/testlib.c
 create mode 100644 tests/user-qemu.mk

-- 
2.39.2




[PATCH 03/14] add mach_host tests

2023-12-28 Thread Luca Dariz
---
 tests/test-mach_host.c | 54 ++
 1 file changed, 54 insertions(+)
 create mode 100644 tests/test-mach_host.c

diff --git a/tests/test-mach_host.c b/tests/test-mach_host.c
new file mode 100644
index ..99fc4aca
--- /dev/null
+++ b/tests/test-mach_host.c
@@ -0,0 +1,54 @@
+
+#include 
+
+#include 
+
+void test_kernel_version()
+{
+  int err;
+  kernel_version_t kver;
+  err = host_get_kernel_version(mach_host_self(), kver);
+  ASSERT_RET(err, "host_kernel_info");
+  printf("kernel version: %s\n", kver);
+}
+
+void test_host_info()
+{
+  int err;
+  mach_msg_type_number_t count;
+  mach_port_t thishost = mach_host_self();
+
+  host_basic_info_data_t binfo;
+  count = HOST_BASIC_INFO_COUNT;
+  err = host_info(thishost, HOST_BASIC_INFO, (host_info_t)&binfo, &count);
+  ASSERT_RET(err, "host_basic_info");
+  ASSERT(count == HOST_BASIC_INFO_COUNT, "");
+
+  const int maxcpus = 255;
+  int proc_slots[maxcpus];
+  count = maxcpus;
+  err = host_info(thishost, HOST_PROCESSOR_SLOTS, (host_info_t)&proc_slots, 
&count);
+  ASSERT_RET(err, "host_processor_slots");
+  ASSERT((1 <= count) && (count <= maxcpus), "");
+
+  host_sched_info_data_t sinfo;
+  count = HOST_SCHED_INFO_COUNT;
+  err = host_info(thishost, HOST_SCHED_INFO, (host_info_t)&sinfo, &count);
+  ASSERT_RET(err, "host_sched_info");
+  ASSERT(count == HOST_SCHED_INFO_COUNT, "");
+
+  host_load_info_data_t linfo;
+  count = HOST_LOAD_INFO_COUNT;
+  err = host_info(thishost, HOST_LOAD_INFO, (host_info_t)&linfo, &count);
+  ASSERT_RET(err, "host_load_info");
+  ASSERT(count == HOST_LOAD_INFO_COUNT, "");
+}
+
+// TODO processor sets
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_kernel_version();
+  test_host_info();
+  return 0;
+}
-- 
2.39.2




[PATCH 09/14] add syscall tests

2023-12-28 Thread Luca Dariz
---
 tests/test-syscalls.c | 149 ++
 1 file changed, 149 insertions(+)
 create mode 100644 tests/test-syscalls.c

diff --git a/tests/test-syscalls.c b/tests/test-syscalls.c
new file mode 100644
index ..7e4234ac
--- /dev/null
+++ b/tests/test-syscalls.c
@@ -0,0 +1,149 @@
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+
+static struct {
+  mach_port_t exception_port;
+  mach_port_t thread;
+  mach_port_t task;
+  integer_t exception;
+  integer_t code;
+  integer_t subcode;
+} last_exc;
+kern_return_t catch_exception_raise(mach_port_t exception_port,
+mach_port_t thread, mach_port_t task,
+integer_t exception, integer_t code,
+long_integer_t subcode)
+{
+  printf("received catch_exception_raise(%u %u %u %d %d %d)\n",
+ exception_port, thread, task, exception, code, subcode);
+  last_exc.exception_port = exception_port;
+  last_exc.thread = thread;
+  last_exc.task = task;
+  last_exc.exception = exception;
+  last_exc.code = code;
+  last_exc.subcode = subcode;
+  return KERN_SUCCESS;
+}
+
+static char simple_request_data[PAGE_SIZE];
+static char simple_reply_data[PAGE_SIZE];
+int simple_msg_server(boolean_t (*demuxer) (mach_msg_header_t *request,
+ mach_msg_header_t *reply),
+  mach_port_t rcv_port_name,
+  int num_msgs)
+{
+  int midx = 0, mok = 0;
+  int ret;
+  mig_reply_header_t *request = (mig_reply_header_t*)simple_request_data;
+  mig_reply_header_t *reply = (mig_reply_header_t*)simple_reply_data;
+  while ((midx - num_msgs) < 0)
+{
+  ret = mach_msg(&request->Head, MACH_RCV_MSG, 0, PAGE_SIZE,
+ rcv_port_name, 0, MACH_PORT_NULL);
+  switch (ret)
+{
+case MACH_MSG_SUCCESS:
+  if ((*demuxer)(&request->Head, &reply->Head))
+mok++;  // TODO send reply
+  else
+FAILURE("demuxer didn't handle the message");
+  break;
+default:
+  ASSERT_RET(ret, "receiving in msg_server");
+  break;
+}
+  midx++;
+}
+  if (mok != midx)
+FAILURE("wrong number of message received");
+  return mok != midx;
+}
+
+
+void test_syscall_bad_arg_on_stack(void *arg)
+{
+  /* mach_msg() has 7 arguments, so the last one should be passed on the stack
+ make RSP points the the wrong place to check the access check
+  */
+#ifdef __x86_64__
+  asm volatile("movq   $0x123,%rsp;"   \
+   "movq   $-25,%rax;" \
+   "syscall;"   \
+   );
+#else
+  asm volatile("mov$0x123,%esp;"   \
+   "mov$-25,%eax;" \
+   "lcall  $0x7,$0x0;" \
+   );
+#endif
+  FAILURE("we shouldn't be here!");
+}
+
+void test_bad_syscall_num(void *arg)
+{
+#ifdef __x86_64__
+  asm volatile("movq   $0x123456,%rax;"\
+   "syscall;"   \
+   );
+#else
+  asm volatile("mov$0x123456,%eax;"\
+   "lcall  $0x7,$0x0;" \
+   );
+#endif
+  FAILURE("we shouldn't be here!");
+}
+
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  int err;
+  mach_port_t excp;
+
+  err = mach_port_allocate(mach_task_self (), MACH_PORT_RIGHT_RECEIVE, &excp);
+  ASSERT_RET(err, "creating exception port");
+
+  err = mach_port_insert_right(mach_task_self(), excp, excp,
+   MACH_MSG_TYPE_MAKE_SEND);
+  ASSERT_RET(err, "inserting sr into exception port");
+
+  err = task_set_special_port(mach_task_self(), TASK_EXCEPTION_PORT, excp);
+  ASSERT_RET(err, "setting task exception port");
+
+  /* FIXME: receiving an exception with small size causes GP on 64 bit 
userspace */
+  /* mig_reply_header_t msg; */
+  /* err = mach_msg(&msg.Head, /\* The header *\/ */
+  /*MACH_RCV_MSG, */
+  /*0, */
+  /*sizeof (msg),  /\* Max receive Size *\/ */
+  /*excp, */
+  /*1000, */
+  /*MACH_PORT_NULL); */
+
+  // FIXME: maybe MIG should provide this prototype?
+  boolean_t exc_server
+(mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP);
+
+  memset(&last_exc, 0, sizeof(last_exc));
+  test_thread_start(mach_task_self(), test_bad_syscall_num, NULL);
+  ASSERT_RET(simple_msg_server(exc_server, excp, 1), "error in exc server");
+  ASSERT((last_exc.exception == EXC_BAD_INSTRUCTION) && (last_exc.code == 
EXC_I386_INVOP),
+ "bad exception for test_bad_syscall_num()");
+
+  memset(&last_exc, 0, sizeof(last_exc));
+  test_thread_start(mach_task_self(), test_syscall_bad_arg_on_stack, NULL);
+  ASSERT_RET(simple_msg_server(e

[PATCH 07/14] add basic vm tests

2023-12-28 Thread Luca Dariz
---
 tests/test-vm.c | 52 +
 1 file changed, 52 insertions(+)
 create mode 100644 tests/test-vm.c

diff --git a/tests/test-vm.c b/tests/test-vm.c
new file mode 100644
index ..ba52876b
--- /dev/null
+++ b/tests/test-vm.c
@@ -0,0 +1,52 @@
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+
+static void test_memobj()
+{
+  // this emulates maptime()
+  struct mapped_time_value *mtime;
+  mach_port_t device, memobj;
+  int err = device_open (device_priv(), 0, "time", &device);
+  ASSERT_RET(err, "device_open");
+  err = device_map (device, VM_PROT_READ, 0, sizeof(*mtime), &memobj, 0);
+  ASSERT_RET(err, "device_map");
+  err = mach_port_deallocate (mach_task_self (), device);
+  ASSERT_RET(err, "mach_port_deallocate");
+  mtime = 0;
+  err =
+vm_map (mach_task_self (), (vm_address_t *)&mtime, sizeof *mtime, 0, 1,
+memobj, 0, 0, VM_PROT_READ, VM_PROT_READ, VM_INHERIT_NONE);
+  ASSERT_RET(err, "vm_map");
+  err = mach_port_deallocate (mach_task_self (), memobj);
+  ASSERT_RET(err, "mach_port_deallocate");
+  err = vm_deallocate(mach_task_self(), (vm_address_t)mtime, sizeof(*mtime));
+  ASSERT_RET(err, "vm_deallocate");
+}
+
+static void test_wire()
+{
+  int err = vm_wire_all(host_priv(), mach_task_self(), VM_WIRE_ALL);
+  ASSERT_RET(err, "vm_wire_all");
+}
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  printf("VM_MIN_ADDRESS=0x%p\n", VM_MIN_ADDRESS);
+  printf("VM_MAX_ADDRESS=0x%p\n", VM_MAX_ADDRESS);
+  test_wire();
+  test_memobj();
+  return 0;
+}
-- 
2.39.2




[PATCH 02/14] add basic user-space tests with qemu

2023-12-28 Thread Luca Dariz
* configure.ac: move test fragment to have USER32
* tests/Makefrag.am: add user tests
* tests/README: add basic info on how to run and debug user tests
* tests/configfrag.ac: allow the test compiler/flags to be
  autoconfigured or customized
* tests/grub.cfg.single.template: add minimal grub config to boot a
  module
* tests/include/device/cons.h: add a simplified version of
  device/cons.h usable for tests
* tests/include/kern/printf.h: symlink to kern/printf.h
* tests/include/mach/mig_support.h: add basic version for user-space
  tests
* tests/include/syscalls.h: add prototypes for syscalls used in tests.
* tests/include/testlib.h: add definitions for common test
  functionalities
* tests/include/util/atoi.h: symlink to util/atoi.h
* tests/run-qemu.sh.template: add a simple qemu test runner
* tests/syscalls.S: generate syscalls entry points
* tests/test-hello.c: add basic smoke test
* tests/testlib.c: add the imnimal funcitonality to run a user-space
  executable and reboot the system, and some test helpers.
* tests/user-qemu.mk: add rules to build simple user-space test
  modules, including generating mig stubs. The tests reuse some kernel
  code (like printf(), mach_atoi(), mem*(), str*() functions) so we can
  use the freestanding environment and not depend on glibc.
---
 configure.ac |   6 +-
 tests/Makefrag.am|  11 +-
 tests/README |  37 ++
 tests/configfrag.ac  |  16 +++
 tests/grub.cfg.single.template   |   4 +
 tests/include/device/cons.h  |  10 ++
 tests/include/kern/printf.h  |   1 +
 tests/include/mach/mig_support.h |  71 +++
 tests/include/syscalls.h |  83 +
 tests/include/testlib.h  |  57 +
 tests/include/util/atoi.h|   1 +
 tests/run-qemu.sh.template   |  23 
 tests/syscalls.S |   4 +
 tests/test-hello.c   |   9 ++
 tests/testlib.c  |  97 +++
 tests/user-qemu.mk   | 195 +++
 16 files changed, 620 insertions(+), 5 deletions(-)
 create mode 100644 tests/README
 create mode 100644 tests/grub.cfg.single.template
 create mode 100644 tests/include/device/cons.h
 create mode 12 tests/include/kern/printf.h
 create mode 100644 tests/include/mach/mig_support.h
 create mode 100644 tests/include/syscalls.h
 create mode 100644 tests/include/testlib.h
 create mode 12 tests/include/util/atoi.h
 create mode 100644 tests/run-qemu.sh.template
 create mode 100644 tests/syscalls.S
 create mode 100644 tests/test-hello.c
 create mode 100644 tests/testlib.c
 create mode 100644 tests/user-qemu.mk

diff --git a/configure.ac b/configure.ac
index 52205eee..0c5d4d61 100644
--- a/configure.ac
+++ b/configure.ac
@@ -123,9 +123,6 @@ AC_CHECK_PROG([PATCH], [patch], [patch], [patch-not-found])
 # configure fragments.
 #
 
-# The test suite.
-m4_include([tests/configfrag.ac])
-
 # Default set of device drivers.
 AC_ARG_ENABLE([device-drivers],
   AS_HELP_STRING([--enable-device-drivers=WHICH], [specify WHICH (on `ix86-at'
@@ -181,6 +178,9 @@ m4_include([configfrag.ac])
 
 # Linux code snarfed into GNU Mach.
 m4_include([linux/configfrag.ac])
+
+# The test suite.
+m4_include([tests/configfrag.ac])
 
 #
 # Compiler features.
diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index 2723f64a..88c7fe8c 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -21,6 +21,13 @@
 #
 
 if !PLATFORM_xen
+
+include tests/user-qemu.mk
+
 TESTS += \
-   tests/test-multiboot
-endif
+   tests/test-multiboot \
+   $(USER_TESTS)
+
+clean-local: $(USER_TESTS_CLEAN)
+
+endif # !PLATFORM_xen
diff --git a/tests/README b/tests/README
new file mode 100644
index ..4cf25891
--- /dev/null
+++ b/tests/README
@@ -0,0 +1,37 @@
+
+There are some basic tests that can be run qith qemu. You can run all the 
tests with
+
+$ make check
+
+or selectively with:
+
+$ make run-hello
+
+Also, you can debug the existing tests, or a new one, by starting on one shell
+
+$ make debug-hello
+
+and on another shell you can attach with gdb, load the symbols of the
+bootstrap module and break on its _start():
+
+$ gdb gnuamch
+...
+(gdb) target remote :1234
+...
+(gdb) b setup_main
+Breakpoint 11 at 0x81019d60: file ../kern/startup.c, line 98.
+(gdb) c
+Continuing.
+
+Breakpoint 11, setup_main () at ../kern/startup.c:98
+98 cninit();
+(gdb) add-symbol-file ../gnumach/build-64/module-task
+Reading symbols from ../gnumach/build-64/module-task...
+(gdb) b _start
+Breakpoint 12 at 0x40324a: _start. (2 locations)
+(gdb) c
+Continuing.
+
+Breakpoint 12, _start () at ../tests/testlib.c:96
+96 {
+(gdb) 
diff --git a/tests/configfrag.ac b/tests/configfrag.ac
index 55c6da62..de87cbad 100644
--- a/tests/configfrag.ac
+++ b/tests/configfrag.ac
@@ -22,6 +22,22 @@ dnl 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, 

[PATCH 06/14] adjust range when changing memory pageability

2023-12-28 Thread Luca Dariz
* vm/vm_map.c: if the start address is not in the map, try to find the
  nearest entry instead of failing.

This caused the initial vm_wire_all(host, task VM_WIRE_ALL) in glibc
startup to fail with KERN_NO_SPACE.
---
 vm/vm_map.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/vm/vm_map.c b/vm/vm_map.c
index 26e18676..4a5861e3 100644
--- a/vm/vm_map.c
+++ b/vm/vm_map.c
@@ -1774,13 +1774,24 @@ kern_return_t vm_map_pageable(
 
if (!vm_map_lookup_entry(map, start, &start_entry)) {
/*
-*  Start address is not in map; this is fatal.
+*  Start address is not in map; try to find the nearest 
entry
 */
-   if (lock_map) {
-   vm_map_unlock(map);
-   }
+   struct rbtree_node  *node;
+   node = rbtree_lookup_nearest(&map->hdr.tree, start,
+vm_map_entry_cmp_lookup, 
RBTREE_LEFT);
+
+   if (node == NULL) {
+   node = rbtree_lookup_nearest(&map->hdr.tree, start,
+vm_map_entry_cmp_lookup, 
RBTREE_RIGHT);
+   if (node == NULL) {
+   if (lock_map) {
+   vm_map_unlock(map);
+   }
+   return KERN_NO_SPACE;
+   }
 
-   return KERN_NO_SPACE;
+   }
+   start_entry = rbtree_entry(node, struct vm_map_entry, 
tree_node);
}
 
/*
-- 
2.39.2




[PATCH 10/14] expose MACH_MSG_USER_ALIGNMENT for manually-built messages

2023-12-28 Thread Luca Dariz
---
 include/mach/message.h | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/mach/message.h b/include/mach/message.h
index 0b8b34d4..9790ef98 100644
--- a/include/mach/message.h
+++ b/include/mach/message.h
@@ -401,6 +401,16 @@ typedef integer_t mach_msg_option_t;
 
 #define MACH_SEND_ALWAYS   0x0001  /* internal use only */
 
+#ifdef __x86_64__
+#if defined(KERNEL) && defined(USER32)
+#define MACH_MSG_USER_ALIGNMENT 4
+#else
+#define MACH_MSG_USER_ALIGNMENT 8
+#endif
+#else
+#define MACH_MSG_USER_ALIGNMENT 4
+#endif
+
 #ifdef KERNEL
 /* This is the alignment of msg descriptors and the actual data
  * for both in kernel messages and user land messages.
@@ -411,15 +421,6 @@ typedef integer_t mach_msg_option_t;
  * so that kernel messages are correctly aligned.
  */
 #define MACH_MSG_KERNEL_ALIGNMENT sizeof(uintptr_t)
-#ifdef __x86_64__
-#ifdef USER32
-#define MACH_MSG_USER_ALIGNMENT 4
-#else
-#define MACH_MSG_USER_ALIGNMENT 8
-#endif
-#else
-#define MACH_MSG_USER_ALIGNMENT 4
-#endif
 
 #define mach_msg_align(x, alignment)   \
( ( ((vm_offset_t)(x)) + ((alignment)-1) ) & ~((alignment)-1) )
-- 
2.39.2




[PATCH 08/14] add thread creation helper to tests

2023-12-28 Thread Luca Dariz
---
 tests/include/testlib.h |  1 +
 tests/testlib.c | 53 +
 2 files changed, 54 insertions(+)

diff --git a/tests/include/testlib.h b/tests/include/testlib.h
index 2b7a67c0..52605047 100644
--- a/tests/include/testlib.h
+++ b/tests/include/testlib.h
@@ -48,6 +48,7 @@ const char* e2s(int err);
 const char* e2s_gnumach(int err);
 void halt();
 int msleep(uint32_t timeout);
+thread_t test_thread_start(task_t task, void(*routine)(void*), void* arg);
 
 mach_port_t host_priv(void);
 mach_port_t device_priv(void);
diff --git a/tests/testlib.c b/tests/testlib.c
index 6abe8c4d..e6be46ee 100644
--- a/tests/testlib.c
+++ b/tests/testlib.c
@@ -95,3 +95,56 @@ void _start()
   printf("%s: test %s exit code %x\n", TEST_SUCCESS_MARKER, argv[0], ret);
   halt();
 }
+
+// started from 
https://github.com/dwarfmaster/mach-ipc/blob/master/minimal_threads/main.c
+
+static uint32_t stack_top[PAGE_SIZE]  __attribute__ ((aligned (PAGE_SIZE)));
+thread_t test_thread_start(task_t task, void(*routine)(void*), void* arg) {
+  const vm_size_t stack_size = PAGE_SIZE * 16;
+  kern_return_t ret;
+  vm_address_t stack;
+
+  ret = vm_allocate(task, &stack, stack_size, TRUE);
+  ASSERT_RET(ret, "can't allocate the stack for a new thread");
+
+  ret = vm_protect(task, stack, PAGE_SIZE, FALSE, VM_PROT_NONE);
+  ASSERT_RET(ret, "can't protect the stack from overflows");
+
+  long *top = (long*)((vm_offset_t)stack_top + PAGE_SIZE) - 2;
+  *top = 0;/* The return address */
+#ifndef __x86_64__
+  *(top + 1) = (long)arg; /* The argument is passed on the stack on x86_32 */
+#endif
+  ret = vm_write(task, stack + stack_size - PAGE_SIZE, (vm_offset_t)stack_top, 
PAGE_SIZE);
+  ASSERT_RET(ret, "can't initialize the stack for the new thread");
+
+  thread_t thread;
+  ret = thread_create(task, &thread);
+  ASSERT_RET(ret, "thread_create()");
+
+  struct i386_thread_state state;
+  unsigned int count;
+  count = i386_THREAD_STATE_COUNT;
+  ret = thread_get_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, &count);
+  ASSERT_RET(ret, "thread_get_state()");
+
+#ifdef __x86_64__
+  state.rip = (long) routine;
+  state.ursp = (long) (stack + stack_size - 16);
+  state.rbp = 0;
+  state.rdi = (long)arg;
+#else
+  state.eip = (long) routine;
+  state.uesp = (long) (stack + stack_size - 8);
+  state.ebp = 0;
+#endif
+  ret = thread_set_state(thread, i386_REGS_SEGS_STATE,
+ (thread_state_t) &state, i386_THREAD_STATE_COUNT);
+  ASSERT_RET(ret, "thread_set_state");
+
+  ret = thread_resume(thread);
+  ASSERT_RET(ret, "thread_resume");
+
+  return thread;
+}
-- 
2.39.2




[PATCH 04/14] add gsync tests

2023-12-28 Thread Luca Dariz
---
 tests/test-gsync.c | 87 ++
 1 file changed, 87 insertions(+)
 create mode 100644 tests/test-gsync.c

diff --git a/tests/test-gsync.c b/tests/test-gsync.c
new file mode 100644
index ..80701606
--- /dev/null
+++ b/tests/test-gsync.c
@@ -0,0 +1,87 @@
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/* Gsync flags.  */
+#ifndef GSYNC_SHARED
+# define GSYNC_SHARED  0x01
+# define GSYNC_QUAD0x02
+# define GSYNC_TIMED   0x04
+# define GSYNC_BROADCAST   0x08
+# define GSYNC_MUTATE  0x10
+#endif
+
+static uint32_t single_shared;
+static uint64_t single_shared_quad;
+
+static void single_t2(void *arg)
+{
+  int err;
+  msleep(100);
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake t2");
+
+  err = thread_terminate(mach_thread_self());
+  ASSERT_RET(err, "thread_terminate");
+}
+
+static void test_single()
+{
+  int err;
+  single_shared = 0;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait 1");
+
+  single_shared = 1;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 0, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_INVALID_ARGUMENT, "gsync_wait 2");
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 100, 
GSYNC_TIMED);
+  ASSERT(err == KERN_TIMEDOUT, "gsync_wait 2b");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 1");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 
GSYNC_BROADCAST);
+  ASSERT_RET(err, "gsync_wake 2");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 2, 
GSYNC_MUTATE);
+  ASSERT_RET(err, "gsync_wake 2a");
+  ASSERT(single_shared == 2, "gsync_wake single_shared");
+
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 3");
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 3a");
+
+  single_shared = 1;
+  test_thread_start(mach_task_self(), single_t2, 0);
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 1, 0, 0, 0);
+  ASSERT_RET(err, "gsync_wait 4");
+}
+
+static void test_single2()
+{
+  int err;
+  test_thread_start(mach_task_self(), single_t2, 0);
+  single_shared = 2;
+  err = gsync_wait(mach_task_self(), (vm_offset_t)&single_shared, 2, 0, 0, 0);
+  ASSERT_RET(err, "gsync_wait 1");
+  // should this fail?
+  err = gsync_wake(mach_task_self(), (vm_offset_t)&single_shared, 0, 0);
+  ASSERT_RET(err, "gsync_wake 2");
+}
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_single2();
+  test_single();
+  return 0;
+}
-- 
2.39.2




[PATCH 01/14] USER32: change default to disabled and make it a general option

2023-12-28 Thread Luca Dariz
* configfrag.ac: add the global option USER32; although it makes sense
  for 64-bit versions only, it can be used by future 64-bit
  architectiures and not only x86_64.
  Also, change the default setting to be disabled; now that we have a
  working full 64-bit system, it makes sense to consider it the common
  choice.
* x86_64/configfrag.ac: define the correct 32-bit cpu if USER32 is
  enabled.
---
 configure.ac | 10 ++
 x86_64/configfrag.ac | 14 --
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 3665d47a..52205eee 100644
--- a/configure.ac
+++ b/configure.ac
@@ -147,6 +147,16 @@ AC_ARG_ENABLE([device-drivers],
 [;;
 esac]
 
+AC_ARG_ENABLE([user32],
+AS_HELP_STRING([--enable-user32], [enable 32-bit user space on a 64-bit 
kernel]))
+[if [ x"$enable_user32" = xyes ]; then]
+AC_DEFINE([USER32], [], [enable 32-bit user on 64-bit kernel])
+AM_CONDITIONAL([enable_user32], [true])
+[else]
+AM_CONDITIONAL([enable_user32], [false])
+[fi]
+
+
 # Platform-specific configuration.
 
 # PC AT.
diff --git a/x86_64/configfrag.ac b/x86_64/configfrag.ac
index 71f8d8c1..f119a9a3 100644
--- a/x86_64/configfrag.ac
+++ b/x86_64/configfrag.ac
@@ -27,22 +27,16 @@ dnl USE OF THIS SOFTWARE.
 # Determines the size of the CPU cache line.
 AC_DEFINE([CPU_L1_SHIFT], [6], [CPU_L1_SHIFT])
 
-AC_ARG_ENABLE([user32],
-  AS_HELP_STRING([--disable-user32], [disable 32-bit user on 64-bit 
kernel]))
-[if [ x"$enable_user32" != xno ]; then]
-  AC_DEFINE([USER32], [], [enable 32-bit user on 64-bit kernel])
-  AM_CONDITIONAL([enable_user32], [true])
-[else]
-  AM_CONDITIONAL([enable_user32], [false])
-[fi]
+[if test x"$enable_user32" = xyes ; then
+  user32_cpu=i686
+fi]
 
 [# Does the architecture provide machine-specific interfaces?
 mach_machine_routines=1
 
 enable_pae=yes;;
   *)]
-AM_CONDITIONAL([HOST_x86_64], [false])
-AM_CONDITIONAL([enable_user32], [true])[;;
+AM_CONDITIONAL([HOST_x86_64], [false])[;;
 esac
 
 case $host_platform in
-- 
2.39.2




[PATCH 05/14] add mach_port tests

2023-12-28 Thread Luca Dariz
---
 tests/test-mach_port.c | 80 ++
 1 file changed, 80 insertions(+)
 create mode 100644 tests/test-mach_port.c

diff --git a/tests/test-mach_port.c b/tests/test-mach_port.c
new file mode 100644
index ..4f095047
--- /dev/null
+++ b/tests/test-mach_port.c
@@ -0,0 +1,80 @@
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+
+void test_mach_port(void)
+{
+  kern_return_t err;
+
+  mach_port_name_t *namesp;
+  mach_msg_type_number_t namesCnt=0;
+  mach_port_type_t *typesp;
+  mach_msg_type_number_t typesCnt=0;
+  err = mach_port_names(mach_task_self(), &namesp, &namesCnt, &typesp, 
&typesCnt);
+  ASSERT_RET(err, "mach_port_names");
+  printf("mach_port_names: type/name length: %d %d\n", namesCnt, typesCnt);
+  ASSERT((namesCnt != 0) && (namesCnt == typesCnt),
+ "mach_port_names: wrong type/name length");
+  for (int i=0; i

Re: [PATCH gnumach] x86_64: Support 8 byte inlined port rights to avoid message resizing.

2023-11-28 Thread Luca

Il 24/11/23 22:30, Flavio Cruz ha scritto:

If a port is inlined in a message, the user has to use
mach_port_name_inlined_t to define each port. Out of line memory
continues to use mach_port_name_t since that memory has to be copied to
the kernel anyway.

Both copyinmsg and copyoutmsg can be reduced to nothing (if we ignore
USER32) as a follow up but kept this patch simple for ease of review.


wouldn't they just become almost an alias to copyin/copyout, plus 
setting msg size? (as in the 32-bit case)


In that case probably copy_user.c is not needed anymore, on a full 
64-bit system, and would be also simpler than this change.



Luca



Re: [RFC PATCH 1/2] add basic user-space tests with qemu

2023-10-28 Thread Luca Dariz

Hi,

Il 22/10/23 17:19, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 19 oct. 2023 20:57:46 +0200, a ecrit:

* tests/configfrag.ac: add MIGUSER, required for user-space tests as
   it might be different from the one used to ubild the kernel.


Can't we just automatically select the proper mig? depending on
user-land bitness we can use i686-gnu-mig or x86_64-gnu-mig.


yes that should be the default indeed, but I would keep the possibility 
to override it, to test a new mig if needed.



* tests/include/kern/printf.h: reuse kern/printf.h
* tests/include/util/atoi.h: reuse util/atoi.h


Rather use a symlink for such files?


they are already a symlink, maybe I'll update the commit description to 
be clearer



diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index 2723f64a..c16326e8 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -23,4 +23,129 @@



+if HOST_ix86
+QEMU=qemu-system-i386


Using qemu-system-i386 fails on my system: grub starts, loads the
kernel, but booting it fails. It doesn't seem related to this patch
series, as qemu-system-i386 -kernel gnumach seems to fail the same way.
But we'd rather fix it before adding the test support :)


I've tried changing some options and I see the kernel rebooting only 
with -cpu pentium (pentium2 works). I'm on debian 12 with qemu 
7.2+dfsg-7+deb12u2, I tried with both debian's current gnumach and 
compiling from master.


Does it work for you if using qemu-system-x86_64? I would avoid using it 
for the 32-bit gnumach tests as it confuses gdb.



+QEMU_GDB_PORT ?= 11234


Typo?


Actually this is somehow related to my .gdbinit, as sometimes I was 
running two debug session in parallel, to compare the 32 and 64 bit 
behavior. Having it customizable is probably enough, with one single 
default.



+   cat $(srcdir)/tests/grub.cfg.single.template | \
+   sed -e s/BOOTMODULE/$$(builddir)/tests/isofiles/boot/grub/grub.cfg


rather
< $(srcdir)/tests/grub.cfg.single.template \
sed -e s/BOOTMODULE/$$(builddir)/tests/isofiles/boot/grub/grub.cfg


will do


diff --git a/tests/run-qemu.sh.template b/tests/run-qemu.sh.template
new file mode 100644
index ..1f116354
--- /dev/null
+++ b/tests/run-qemu.sh.template
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+set -e
+
+cmd="QEMU QEMU_OPTS -cdrom test-TESTNAME.iso"
+out=out-TESTNAME
+if which QEMU >/dev/null ; then
+if ! timeout -v --foreground --kill-after=3 15s $cmd > $out; then
+tail -n +18 $out  # skip terminal reconfiguration


Better use sed -n '/start module-/,$p' or such to properly skip all
kernel startup prints. You could also make the grub script print
something, that you can catch in the sed script. And you can put the sed
script instead of "> $out", so the output file is already clean and you
don't have to duplicate the one-liner to print it.


Good idea, no need to hardcode the number of lines then.


Also, please think about cleaning files :)


do you mean make clean or some other style cleanup in the code?


diff --git a/tests/testlib.c b/tests/testlib.c
new file mode 100644
index ..13aa2614
--- /dev/null
+++ b/tests/testlib.c
@@ -0,0 +1,125 @@
+// from glibc's sysdeps/mach/usleep.c


No need to keep the reference, it's small and trivial enough that it's
not copyrightable :)


ok


+int msleep(uint32_t timeout)
+{
+  mach_port_t recv = mach_reply_port();
+  return mach_msg(NULL, MACH_RCV_MSG|MACH_RCV_TIMEOUT|MACH_RCV_INTERRUPT,
+  0, 0, recv, timeout, MACH_PORT_NULL);
+}
+
+const char* e2s(int err)
+{
+switch (err)
+{
+case MACH_SEND_INVALID_DATA: return "MACH_SEND_INVALID_DATA";


You could use a macro

#define E2S(s) \
 case s: return #s;

E2S(MACH_SEND_INVALID_DATA)


In addition to this I was also thinking about auto-generating the 
switch/case from message.h kern_return.h and mig_errors.h, with the 
possibility to extend it.



Luca



[RFC PATCH 2/2] add mach_host tests

2023-10-19 Thread Luca Dariz
---
 tests/Makefrag.am  |  4 +++-
 tests/test-mach_host.c | 54 ++
 2 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-mach_host.c

diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index c16326e8..cb946bc7 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -146,6 +146,8 @@ debug-%: test-%.iso
 
 
 TESTS += \
-   tests/test-hello
+   tests/test-hello \
+   tests/test-mach_host
+
 
 endif # !PLATFORM_xen
diff --git a/tests/test-mach_host.c b/tests/test-mach_host.c
new file mode 100644
index ..2e9e6989
--- /dev/null
+++ b/tests/test-mach_host.c
@@ -0,0 +1,54 @@
+
+#include 
+
+#include 
+
+void test_kernel_version()
+{
+  int err;
+  kernel_version_t kver;
+  err = host_get_kernel_version(mach_host_self(), kver);
+  ASSERT_RET(err, "host_kernel_info");
+  printf("kernel version: %s\n", kver);
+}
+
+void test_host_info()
+{
+  int err;
+  mach_msg_type_number_t count;
+  mach_port_t thishost = mach_host_self();
+
+  host_basic_info_data_t binfo;
+  count = HOST_BASIC_INFO_COUNT;
+  err = host_info(thishost, HOST_BASIC_INFO, (host_info_t)&binfo, &count);
+  ASSERT_RET(err, "host_basic_info");
+  ASSERT(count == HOST_BASIC_INFO_COUNT);
+
+  const int maxcpus = 255;
+  int proc_slots[maxcpus];
+  count = maxcpus;
+  err = host_info(thishost, HOST_PROCESSOR_SLOTS, (host_info_t)&proc_slots, 
&count);
+  ASSERT_RET(err, "host_processor_slots");
+  ASSERT((1 <= count) && (count <= maxcpus));
+
+  host_sched_info_data_t sinfo;
+  count = HOST_SCHED_INFO_COUNT;
+  err = host_info(thishost, HOST_SCHED_INFO, (host_info_t)&sinfo, &count);
+  ASSERT_RET(err, "host_sched_info");
+  ASSERT(count == HOST_SCHED_INFO_COUNT);
+
+  host_load_info_data_t linfo;
+  count = HOST_LOAD_INFO_COUNT;
+  err = host_info(thishost, HOST_LOAD_INFO, (host_info_t)&linfo, &count);
+  ASSERT_RET(err, "host_load_info");
+  ASSERT(count == HOST_LOAD_INFO_COUNT);
+}
+
+// TODO processor sets
+
+int main(int argc, char *argv[], int envc, char *envp[])
+{
+  test_kernel_version();
+  test_host_info();
+  return 0;
+}
-- 
2.39.2




[RFC PATCH 0/2] gnumach testing

2023-10-19 Thread Luca Dariz
This patch adds the possibility to automate simple tests with
user-space programs, using qemu. I used this method to work on the
x86_64 port so far, and it was quite useful for short iterations and
testing special conditions (e.g. memory objects, syscall errors).

I'm not sure if this is the best way to integrate it in the existing
codebase, as it adds more dependencies for testing (qemu, two versions
of MIG if testing 32on64), or if it would be better to have a separate
repository for this kind of testing. For example the same method can
easily be combined with hurd code, by automating the build of the
relevant parts.

I usually use this method to check the 32bit (non-PAE), 32bit (PAE),
32on64 and full 64 bit before sending out patches. For example,
currently it seems that the 32on64 configuration does not start
properly.  I also add a small mach_host test example; once the testing
environment is settled I can adapt other tests I currently have.

Luca Dariz (2):
  add basic user-space tests with qemu
  add mach_host tests

 tests/Makefrag.am| 127 +++
 tests/README |  37 +
 tests/configfrag.ac  |  10 +++
 tests/grub.cfg.single.template   |   3 +
 tests/include/device/cons.h  |  10 +++
 tests/include/kern/printf.h  |   1 +
 tests/include/mach/mig_support.h |  71 +
 tests/include/syscalls.h |  83 
 tests/include/testlib.h  |  46 +++
 tests/include/util/atoi.h|   1 +
 tests/run-qemu.sh.template   |  22 ++
 tests/syscalls.S |   4 +
 tests/test-hello.c   |   9 +++
 tests/test-mach_host.c   |  54 +
 tests/testlib.c  | 125 ++
 15 files changed, 603 insertions(+)
 create mode 100644 tests/README
 create mode 100644 tests/grub.cfg.single.template
 create mode 100644 tests/include/device/cons.h
 create mode 12 tests/include/kern/printf.h
 create mode 100644 tests/include/mach/mig_support.h
 create mode 100644 tests/include/syscalls.h
 create mode 100644 tests/include/testlib.h
 create mode 12 tests/include/util/atoi.h
 create mode 100644 tests/run-qemu.sh.template
 create mode 100644 tests/syscalls.S
 create mode 100644 tests/test-hello.c
 create mode 100644 tests/test-mach_host.c
 create mode 100644 tests/testlib.c

-- 
2.39.2




[RFC PATCH 1/2] add basic user-space tests with qemu

2023-10-19 Thread Luca Dariz
* tests/Makefrag.am: add rules to build simple user-space test
  modules, including generating mig stubs. The tests reuse some kernel
  code (the printf(), mach_atoi(), mem*(), str*() functions) so we can
  use the freestanding environment and not depend on glibc.
* tests/README: add basic info on how to run and debug tests
* tests/configfrag.ac: add MIGUSER, required for user-space tests as
  it might be different from the one used to ubild the kernel.
* tests/grub.cfg.single.template: add minimal grub config to boot a
  module
* tests/include/device/cons.h: add a simplified version of
  device/cons.h usable for tests
* tests/include/kern/printf.h: reuse kern/printf.h
* tests/include/mach/mig_support.h: add basic version for user-space
  tests
* tests/include/syscalls.h: add prototypes for syscalls used in tests.
* tests/include/testlib.h: add definitions for common test
  functionalities
* tests/include/util/atoi.h: reuse util/atoi.h
* tests/run-qemu.sh.template: add a simple qemu test runner
* tests/syscalls.S: generate syscalls entry points
* tests/test-hello.c: add basic smoke test
* tests/testlib.c: add the imnimal funcitonality to run a user-space
  executable and reboot the system, and some test helpers.
---
 tests/Makefrag.am| 125 +++
 tests/README |  37 +
 tests/configfrag.ac  |  10 +++
 tests/grub.cfg.single.template   |   3 +
 tests/include/device/cons.h  |  10 +++
 tests/include/kern/printf.h  |   1 +
 tests/include/mach/mig_support.h |  71 ++
 tests/include/syscalls.h |  83 
 tests/include/testlib.h  |  46 
 tests/include/util/atoi.h|   1 +
 tests/run-qemu.sh.template   |  22 ++
 tests/syscalls.S |   4 +
 tests/test-hello.c   |   9 +++
 tests/testlib.c  | 125 +++
 14 files changed, 547 insertions(+)
 create mode 100644 tests/README
 create mode 100644 tests/grub.cfg.single.template
 create mode 100644 tests/include/device/cons.h
 create mode 12 tests/include/kern/printf.h
 create mode 100644 tests/include/mach/mig_support.h
 create mode 100644 tests/include/syscalls.h
 create mode 100644 tests/include/testlib.h
 create mode 12 tests/include/util/atoi.h
 create mode 100644 tests/run-qemu.sh.template
 create mode 100644 tests/syscalls.S
 create mode 100644 tests/test-hello.c
 create mode 100644 tests/testlib.c

diff --git a/tests/Makefrag.am b/tests/Makefrag.am
index 2723f64a..c16326e8 100644
--- a/tests/Makefrag.am
+++ b/tests/Makefrag.am
@@ -23,4 +23,129 @@
 if !PLATFORM_xen
 TESTS += \
tests/test-multiboot
+
+# FIXME: how can we reliably detect a change on any header and reinstall them?
+$(MACH_TESTINSTALL):
+   mkdir -p $@
+   $(MAKE) install-data DESTDIR=$@
+
+prepare-test: $(MACH_TESTINSTALL)
+
+$(MIG_OUTDIR):
+   mkdir -p $@
+
+MACH_TESTINSTALL=$(builddir)/tests/include-mach
+MACH_TESTINCLUDE=$(MACH_TESTINSTALL)/$(prefix)/include
+
+MIGCOMUSER=$(MIGUSER)  -n -cc cat - /dev/null
+MIG_OUTDIR=$(builddir)/tests/mig-out
+MIG_CPPFLAGS=-nostdinc -I$(MACH_TESTINCLUDE) -ggdb
+
+define generate_mig_client
+$(MIG_OUTDIR)/$(2).user.c: prepare-test $(MIG_OUTDIR) 
$(srcdir)/include/$(1)/$(2).defs
+   $(CPP) $(MIG_CPPFLAGS) -o $(MIG_OUTDIR)/$(2).user.defs 
$(srcdir)/include/$(1)/$(2).defs
+   $(MIGCOMUSER) $(MIGCOMFLAGS) $(MIGCOMUFLAGS)\
+ -user $(MIG_OUTDIR)/$(2).user.c -header $(MIG_OUTDIR)/$(2).user.h 
\
+ -list $(MIG_OUTDIR)/$(2).user.msgids  \
+ < $(MIG_OUTDIR)/$(2).user.defs
+endef
+
+define generate_mig_server
+$(MIG_OUTDIR)/$(2).server.c: prepare-test $(MIG_OUTDIR) 
$(srcdir)/include/$(1)/$(2).defs
+   $(CPP) $(MIG_CPPFLAGS) -o $(MIG_OUTDIR)/$(2).server.defs 
$(srcdir)/include/$(1)/$(2).defs
+   $(MIGCOMUSER) $(MIGCOMFLAGS) $(MIGCOMUFLAGS)\
+ -server $(MIG_OUTDIR)/$(2).server.c -header 
$(MIG_OUTDIR)/$(2).server.h   \
+ -list $(MIG_OUTDIR)/$(2).server.msgids
\
+ < $(MIG_OUTDIR)/$(2).server.defs
+endef
+
+MIG_GEN_CC= \
+   $(MIG_OUTDIR)/mach.user.c \
+   $(MIG_OUTDIR)/mach_host.user.c \
+   $(MIG_OUTDIR)/mach_port.user.c \
+   $(MIG_OUTDIR)/gnumach.user.c \
+   $(MIG_OUTDIR)/device_reply.user.c \
+   $(MIG_OUTDIR)/device.user.c \
+   $(MIG_OUTDIR)/exc.server.c
+
+# eval->info for debug
+$(eval $(call generate_mig_client,mach,mach))
+$(eval $(call generate_mig_client,mach,mach_host))
+$(eval $(call generate_mig_client,mach,mach_port))
+$(eval $(call generate_mig_client,mach,gnumach))
+$(eval $(call generate_mig_client,device,device_reply))
+$(eval $(call generate_mig_client,device,device))
+$(eval $(call generate_mig_server,mach,exc))
+
+
+TESTCFLAGS=-static -nostartfiles -nolibc \
+   -ffreestanding \
+   -I$(srcdir)/tests/include \
+   -I$

Re: [PATCH] Updated the emacs open issues page.

2023-09-05 Thread Luca

Hi,

Il 05/09/23 20:27, Guy-Fleury Iteriteka ha scritto:

https://people.debian.org/~sthibault/hurd-i386/initrd-amd64.img.gz
»

And how you launch this one. I get: no bootable module


It can be used as a ramdisk; one way to use it (maybe not the simplest, 
but more or less the one I used) is to build a bootable grub iso image:


* from the ramdisk extract ext2fs.static and exec.static
* compile gnumach for x86_64 user and kernel (with --disable-user32), 
make sure to include debian's ramdisk patch (or maybe take the image 
from 
https://people.debian.org/~sthibault/tmp/hurd-amd64/pool-hurd-amd64/g/gnumach/gnumach-image-1.8-486_1.8+git20230526-3_hurd-amd64.deb)

* use as grub.cfg something like this:
>8
set timeout=0
set default=0

menuentry "gnumach test" {
multiboot /boot/gnumach console=com0 TERM=mach-gnu-color init=/bin/sh
module --nounzip /boot/initrd.gz initrd '$(ramdisk-create)'
module /hurd/ext2fs.static ext2fs \
--multiboot-command-line='${kernel-command-line}' \
--host-priv-port='${host-port}' \
--device-master-port='${device-port}' \
--exec-server-task='${exec-task}' -T typed 
gunzip:device:rd0 \
'$(task-create)' '$(task-resume)'
module /hurd/exec.static exec \
 --device-master-port='${device-port}' \
 '$(exec-task=task-create)'
boot
}
>8
* then assemble the grub image and run it:
>8
mkdir -p isofiles/boot/grub
mkdir -p isofiles/hurd
cp grub.cfg isofiles/boot/grub/grub.cfg
cp gnumach initrd-amd64.img.gz isofiles/boot/
cp ext2fs.static exec.static isofiles/hurd/
grub-mkrescue -o test-ramhurd.iso ./isofiles
qemu-system-x86_64 -cdrom test-ramhurd.iso -m 4096 -nographic -no-reboot 
-boot d

>8
* once you have the ramdisk running, launch debootstrap's second stage 
as in https://www.gnu.org/software/hurd/open_issues/64-bit_port.html


To have something persistent, I installed it on a second disk attached 
to qemu:
- attach a disk image with -hda testdisk.img (which can also be a copy 
of initrd-amd64.img.gz, unpacked)
- once the first ramdisk debootstrap completes, mount the disk using the 
rumpdisk driver on /dev/wd0, chroot into it and run the --second-stage again
- to boot from disk, I use a similar .iso image, with a different 
grub.cfg, taking also the pci-arbiter.static and rumpdisk.static files 
from the disk image as above (I haven't tried installing grub on disk 
from hurd-amd64 yet):

>8
set timeout=0
set default=0

menuentry "gnumach test" {
multiboot /boot/gnumach console=com0 root=part:1:device:wd0

module /boot/pci-arbiter.static pci-arbiter \
--host-priv-port='${host-port}' \
--device-master-port='${device-port}' \
--next-task='${disk-task}' \
'$(task-create)' '$(task-resume)'

module /boot/rumpdisk.static rumpdisk \
--next-task='${fs-task}' \
'$(disk-task=task-create)'

module /boot/ext2fs.static ext2fs \
--multiboot-command-line='${kernel-command-line}' \
--exec-server-task='${exec-task}' -T typed '${root}' \
'$(fs-task=task-create)'

module /boot/exec.static exec '$(exec-task=task-create)'

boot
}
>8----
- once installed, you can configure a network interface and ssh into it 
as a 32-bit hurd virtual machine


Probably there is a simpler way to use it, but I hope this helps :)
I guess once the port is stabilized, it will be easier to have it running.


Luca



Re: [RFC PATCH gnumach] percpu area using gs segment

2023-08-27 Thread Luca

Hi,

Il 26/08/23 08:48, Damien Zammit ha scritto:

diff --git a/i386/i386/cpu_number.c b/i386/i386/cpu_number.c
index ef19e11f..241015b5 100644
--- a/i386/i386/cpu_number.c
+++ b/i386/i386/cpu_number.c
@@ -20,11 +20,17 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #if NCPUS > 1

-int cpu_number(void)
+int cpu_number_slow(void)
  {
return cpu_id_lut[apic_get_current_cpu()];
  }


This should also be defined for non-smp, otherwise compilation fails (I 
only tested x86_64). Also I would make compilaiton explicitely fail 
somewhere on x86_64+smp for now.



diff --git a/i386/i386/percpu.c b/i386/i386/percpu.c
new file mode 100644
index ..0bc8b234
--- /dev/null
+++ b/i386/i386/percpu.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright (c) 2023 Free Software Foundation, Inc.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include 
+#include 
+#include 
+
+struct percpu percpu_array[NCPUS] __aligned(0x8000);


Why do you need this alignment?



diff --git a/i386/i386/percpu.h b/i386/i386/percpu.h
new file mode 100644
index ..b22d512c
--- /dev/null
+++ b/i386/i386/percpu.h
@@ -0,0 +1,72 @@
+/*
+ * Copyright (c) 2023 Free Software Foundation, Inc.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _PERCPU_H_
+#define _PERCPU_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define percpu_assign(stm, val) \
+asm("mov %0, %%gs:%1"   \
+ : : "r" (val), "m" (__builtin_offsetof(struct percpu, stm)));
+
+#define percpu_ptr(typ, stm)\
+MACRO_BEGIN \
+typ *ptr_ = (typ *)__builtin_offsetof(struct percpu, stm); \
+\
+asm("add %%gs:0, %0"\
+ : "+r" (ptr_)  \
+ : );   \
+\
+ptr_;   \
+MACRO_END


If it could simplify this accessor, gcc (and also clang, if we ever need 
it) supports using a specific segment base for a variable:


https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Named-Address-Spaces.html#x86-Named-Address-Spaces

and

https://kernel.org/doc/html/next/x86/x86_64/fsgs.html#compiler-support-for-fs-gs-based-addressing


Luca



Re: [PATCH 1/3 gnumach] apic: Use cpuid to read the apic id for speed

2023-08-11 Thread Luca
Il 11/08/23 10:34, Damien Zammit ha scritto:> diff --git 
a/i386/i386/cpu_number.h b/i386/i386/cpu_number.h

index c00896e8..df086370 100644
--- a/i386/i386/cpu_number.h
+++ b/i386/i386/cpu_number.h
@@ -39,12 +39,30 @@
  #define   CX(addr, reg)   addr(,reg,8)
  #endif
  
-#define	CPU_NUMBER(reg)	\

+#defineCPU_NUMBER_NO_STACK(reg)\
movl%cs:lapic, reg  ;\
movl%cs:APIC_ID(reg), reg   ;\
shrl$24, reg;\
movl%cs:CX(cpu_id_lut, reg), reg;\
  
+/* Never call CPU_NUMBER(%esi) */

+#define CPU_NUMBER(reg)\
+   pushl   %esi;\
+   pushl   %eax;\
+   pushl   %ebx;\
+   pushl   %ecx;\
+   pushl   %edx;\
+   movl$1, %eax;\
+   cpuid   ;\
+   shrl$24, %ebx   ;\
+   movl%cs:CX(cpu_id_lut, %ebx), %esi  ;\
+   popl%edx;\
+   popl%ecx;\
+   popl%ebx;\
+   popl%eax;\
+   movl%esi, reg   ;\
+   popl%esi;\
+


How much faster is this? did you measure also on hw? I would use RDTSC 
to do it, I don't know if there are better ways.
Interestingly, Linux uses either LSL or RDPID to obtain the same value, 
I wonder if they could be an easy alternative.


Also, could it be that there is a particular use of it that is much more 
frequent than the others, and that might benefit from further optimizations?



Luca



Re: 64bit startup

2023-08-10 Thread Luca

Il 09/08/23 19:57, Samuel Thibault ha scritto:

Samuel Thibault, le mar. 08 août 2023 19:15:50 +0200, a ecrit:

I continued stabbing at the network issue, it was just a size alignment
problem, so that depending on their sizes, half of the network packets
would be discarded by mach_msg. Hurd-amd64 packages are getting rebuilt
to contain the fix.


They're rebuilt, the image as well.

Along the way, I fixed the hurd package getting built with pie enabled,
that was a missing commit to be uploaded to dpkg.

I'll also try to build with nostrip. In case it's not too big that'll
make debugging even easier.


Thanks! now I am able to set up networking so I can install some more 
stuff, I can also ssh from my host to the 64-bit hurd :D


By the way, if I try to add the repository key with

apt-key adv --recv-keys --keyserver keyring.debian.org 
900CB024B67931D40F82304BD0178C767D069EE6


it fails with

E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them 
is required for this operation


Were there any build failures with gnupg or was it just left out? (I see 
other gnupg-related packages built successfully, e.g. gpgv). I can 
bypass authentication anyway for now with --allow-unauthenticated and 
--allow-insecure-repositories.



Luca



Re: [PATCH 5/5] x86_64: remove unneeded segment selectors handling on full 64 bit

2023-08-09 Thread Luca Dariz

Il 04/08/23 23:50, Samuel Thibault ha scritto:

Luca Dariz, le sam. 29 juil. 2023 19:47:53 +0200, a ecrit:

@@ -803,10 +809,7 @@ kern_return_t thread_getstatus(
== 0)
saved_state->efl &= ~EFL_IF;
}
-   }
-   else
-#endif
-   {
+   } else {
/*
 * 386 mode.
 */
@@ -815,6 +818,7 @@ kern_return_t thread_getstatus(
state->fs = saved_state->fs & 0x;
state->gs = saved_state->gs & 0x;
}
+#endif


Mmm, I believe we shall then drop [gfed]s from i386_thread_state?
Otherwise we'll leak content in the structure in thread_getstatus calls.


Yes, they would not be necessary anymore. It seems this would also 
require to adapt glibc accordingly in 
sysdeps/mach/hurd/x86_64/bits/sigcontext.h, I tried bootstrapping the 
whole system with a simple patch but failed so far.


Luca



gnumach kernel memory map (was: Re: 64bit startup)

2023-08-09 Thread Luca

Il 09/08/23 11:20, Samuel Thibault ha scritto:

Sergey Bugaev, le mer. 09 août 2023 12:12:29 +0300, a ecrit:

It should just reuse whatever memory the bootloader has already loaded
the module into, no need to copy it out anywhere.


(my guess is that it's not so simple because possibly we somehow get rid
of the memory allocated for modules, otherwise we'd get stuck with that
piece of memory area that we cannot reuse for other system memory uses)


From what I understand so far it's a bit tricky, especially in the 
early boot phase. The current kernel map on x86_64 is very very simple, 
and inherits some constraints from the 32-bit map for simplicity. It's 
also partly limited by the simple bootstrapping procedure in boothdr.S. 
Some issues that I found so far are e.g.:
* decouple the bootstrap map created in boothdr.S from the runtime map 
and initalize it properly in pmap_bootstarap() (see also _phystokv vs 
phystokv). There is some memory allocated in the early C code that might 
need to be remapped (e.g the kernel page table map and the biosmem heap)
* the kernel .text/.bss/.data etc seem to be the only parts required to 
be in the upper 2G, due to mcmodel=kernel. Everything else could be 
anywhere.
* highmem is somehow limited by VM_KERNEL_MAP_SIZE, which itself is 
currently limited by the 2G range
* in some places it seems there is the assumption that memory is 
directly mapped (i.e. with a simple offset between virtual/physical 
address), I'm not sure how pervasive this assumption is and how 
difficult it would be to remove it where it's not needed.


Some time ago I started a first step in this direction, attempting to 
move down the kernel map so the highmem/directmap regions could be made 
much bigger, and still keeping the static data in the upper 2G. I 
uploaded it here [0] but it's really just a first step.


My idea here was to have as a first step the various 
dma/directnap/highmem regions between -4G and -2G and still contiguous, 
and then try to leave some red zone between them and make the regions 
bigger, once the limits could be freely moved.


For the bootstrap modules, it might be enough to explicitly reserve the 
memory in pmap_bootstrap(), although that would consume "low" memory. 
Once we have bigger memory regions, it shouldn't be too hard to move the 
bootstrap module somewhere else.


Any thoughts? I have probably overlooked some other issues, so please 
correct me if I'm wrong.


Luca


[0] 
https://gitlab.com/luckyd/gnumach/-/commit/f0564e8ed63b82b4e8b0342821c37f2461625438




Re: [PATCH gnumach] Update the 64bit RPC ABI to be simpler (v2)

2023-08-08 Thread luca

Il 09/08/23 03:09, Samuel Thibault ha scritto:

Hello,

So, is anybody against making this change?


Hi,
for me it's ok, I tried this patch together with the mig one and didn't 
see any issue so far.


Luca




[PATCH v2] x86_64: install emergency handler for double fault

2023-07-29 Thread Luca Dariz
* i386/i386/idt.c: add selector for the interrupt-specific stack
* i386/i386/ktss.c: configure ist1 to use a dedicated stack
* i386/i386/trap.c: add double fault handler, which just prints the
  state and panics. There is not much else to do in this case but it's
  useful for troubleshooting
* x86_64/idt_inittab.S: allow to specify an interrupt stack for custom
  handlers
* x86_64/locore.S: add double fault handler
---
 i386/i386/idt.c  | 12 +++-
 i386/i386/ktss.c |  4 +++-
 i386/i386/trap.c |  6 ++
 x86_64/idt_inittab.S | 25 +
 x86_64/locore.S  | 15 +++
 5 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/i386/i386/idt.c b/i386/i386/idt.c
index cdfb9a88..caa44d71 100644
--- a/i386/i386/idt.c
+++ b/i386/i386/idt.c
@@ -34,6 +34,10 @@ struct idt_init_entry
unsigned long entrypoint;
unsigned short vector;
unsigned short type;
+#ifdef __x86_64__
+   unsigned short ist;
+   unsigned short pad_0;
+#endif
 };
 extern struct idt_init_entry idt_inittab[];
 
@@ -49,7 +53,13 @@ idt_fill(struct real_gate *myidt)
/* Initialize the exception vectors from the idt_inittab.  */
while (iie->entrypoint)
{
-   fill_idt_gate(myidt, iie->vector, iie->entrypoint, KERNEL_CS, 
iie->type, 0);
+   fill_idt_gate(myidt, iie->vector, iie->entrypoint, KERNEL_CS, 
iie->type,
+#ifdef __x86_64__
+ iie->ist
+#else
+ 0
+#endif
+   );
iie++;
}
 
diff --git a/i386/i386/ktss.c b/i386/i386/ktss.c
index 1d880167..34cb6df2 100644
--- a/i386/i386/ktss.c
+++ b/i386/i386/ktss.c
@@ -43,9 +43,10 @@ struct task_tss ktss;
 void
 ktss_fill(struct task_tss *myktss, struct real_descriptor *mygdt)
 {
-   /* XXX temporary exception stack */
+   /* XXX temporary exception stacks */
/* FIXME: make it per-processor */
static int exception_stack[1024];
+   static int double_fault_stack[1024];
 
 #ifdef MACH_RING1
/* Xen won't allow us to do any I/O by default anyway, just register
@@ -61,6 +62,7 @@ ktss_fill(struct task_tss *myktss, struct real_descriptor 
*mygdt)
/* Initialize the master TSS.  */
 #ifdef __x86_64__
myktss->tss.rsp0 = (unsigned long)(exception_stack+1024);
+   myktss->tss.ist1 = (unsigned long)(double_fault_stack+1024);
 #else /* ! __x86_64__ */
myktss->tss.ss0 = KERNEL_DS;
myktss->tss.esp0 = (unsigned long)(exception_stack+1024);
diff --git a/i386/i386/trap.c b/i386/i386/trap.c
index f7bd8e38..b3689c9a 100644
--- a/i386/i386/trap.c
+++ b/i386/i386/trap.c
@@ -666,3 +666,9 @@ db_debug_all_traps (boolean_t enable)
 }
 
 #endif /* MACH_KDB */
+
+void handle_double_fault(struct i386_saved_state *regs)
+{
+  dump_ss(regs);
+  panic("DOUBLE FAULT! This is critical\n");
+}
diff --git a/x86_64/idt_inittab.S b/x86_64/idt_inittab.S
index f021b56d..fc1df0c7 100644
--- a/x86_64/idt_inittab.S
+++ b/x86_64/idt_inittab.S
@@ -50,12 +50,13 @@ ENTRY(idt_inittab)
.quad   entry   ;\
.text
 #else  /* MACH_PV_DESCRIPTORS */
-#defineIDT_ENTRY(n,entry,type) \
+#defineIDT_ENTRY(n,entry,type,ist) \
.data   2   ;\
.quad   entry   ;\
.word   n   ;\
.word   type;\
-   .long   0   /*pad*/ ;\
+   .word   ist ;\
+   .word   0   /*pad*/ ;\
.text
 #endif /* MACH_PV_DESCRIPTORS */
 
@@ -63,7 +64,7 @@ ENTRY(idt_inittab)
  * No error code.  Clear error code and push trap number.
  */
 #defineEXCEPTION(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(0);\
@@ -74,7 +75,7 @@ ENTRY(name)   ;\
  * User-accessible exception.  Otherwise, same as above.
  */
 #defineEXCEP_USR(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_U|ACC_TRAP_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_U|ACC_TRAP_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(0);\
@@ -85,7 +86,7 @@ ENTRY(name)   ;\
  * Error code has been pushed.  Just push trap number.
  */
 #defineEXCEP_ERR(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_INTR_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_INTR_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(n);\
@@ -95,25 +96,25 @@ ENTRY(name) ;\
  * Special interrupt code: dispatches to a unique entrypoint,
  * not defined automatically here.
  */
-#defineEXCEP_SPC(n,name)  \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE)
+#defineE

[PATCH 3/5] x86_64: format pusha/popa macros for readability

2023-07-29 Thread Luca Dariz
---
 x86_64/locore.S | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/x86_64/locore.S b/x86_64/locore.S
index 413d43c4..7127957b 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -39,8 +39,41 @@
 #include 
 #include 
 
-#define pusha pushq %rax ; pushq %rcx ; pushq %rdx ; pushq %rbx ; subq $8,%rsp 
; pushq %rbp ; pushq %rsi ; pushq %rdi ; pushq %r8 ; pushq %r9 ; pushq %r10 ; 
pushq %r11 ; pushq %r12 ; pushq %r13 ; pushq %r14 ; pushq %r15
-#define popa popq %r15 ; popq %r14 ; popq %r13 ; popq %r12 ; popq %r11 ; popq 
%r10 ; popq %r9 ; popq %r8 ; popq %rdi ; popq %rsi ; popq %rbp ; addq $8,%rsp ; 
popq %rbx ; popq %rdx ; popq %rcx ; popq %rax
+#define pusha  \
+   pushq %rax  ;\
+   pushq %rcx  ;\
+   pushq %rdx  ;\
+   pushq %rbx  ;\
+   subq $8,%rsp;\
+   pushq %rbp  ;\
+   pushq %rsi  ;\
+   pushq %rdi  ;\
+   pushq %r8   ;\
+   pushq %r9   ;\
+   pushq %r10  ;\
+   pushq %r11  ;\
+   pushq %r12  ;\
+   pushq %r13  ;\
+   pushq %r14  ;\
+   pushq %r15
+
+#define popa   \
+   popq %r15   ;\
+   popq %r14   ;\
+   popq %r13   ;\
+   popq %r12   ;\
+   popq %r11   ;\
+   popq %r10   ;\
+   popq %r9;\
+   popq %r8;\
+   popq %rdi   ;\
+   popq %rsi   ;\
+   popq %rbp   ;\
+   addq $8,%rsp;\
+   popq %rbx   ;\
+   popq %rdx   ;\
+   popq %rcx   ;\
+   popq %rax
 
 #ifdef USER32
 #define PUSH_FSGS  \
-- 
2.39.2




[PATCH 2/5] x86_64: disable V86 mode on full 64-bit configuration

2023-07-29 Thread Luca Dariz
* i386/i386/pcb.c: simplify exception stack location and adapt thread
  gettrs/setters
* i386/i386/thread.h: don't include V86 fields on full 64-bit
* x86_64/locore.S: don't include checks for V86 mode on full 64-bit
---
 i386/i386/pcb.c| 35 ++-
 i386/i386/thread.h |  6 ++
 x86_64/locore.S|  8 
 3 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index fb535709..d1c5fb50 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -145,9 +145,14 @@ void switch_ktss(pcb_t pcb)
 *  won`t save the v86 segments, so we leave room.
 */
 
+#if !defined(__x86_64__) || defined(USER32)
pcb_stack_top = (pcb->iss.efl & EFL_VM)
? (long) (&pcb->iss + 1)
: (long) (&pcb->iss.v86_segs);
+#else
+   pcb_stack_top = (vm_offset_t) (&pcb->iss + 1);
+#endif
+
 #ifdef __x86_64__
assert((pcb_stack_top & 0xF) == 0);
 #endif
@@ -534,6 +539,7 @@ kern_return_t thread_setstatus(
| EFL_USER_SET;
 #endif /* __x86_64__ && !USER32 */
 
+#if !defined(__x86_64__) || defined(USER32)
/*
 * Segment registers.  Set differently in V8086 mode.
 */
@@ -563,8 +569,9 @@ kern_return_t thread_setstatus(
thread->pcb->ims.v86s.flags =
saved_state->efl & (EFL_TF | EFL_IF);
}
-   }
-   else if (flavor == i386_THREAD_STATE) {
+   } else
+#endif
+   if (flavor == i386_THREAD_STATE) {
/*
 * 386 mode.  Set segment registers for flat
 * 32-bit address space.
@@ -630,7 +637,7 @@ kern_return_t thread_setstatus(
 #endif
break;
}
-
+#if !defined(__x86_64__) || defined(USER32)
case i386_V86_ASSIST_STATE:
{
struct i386_v86_assist_state *state;
@@ -657,7 +664,7 @@ kern_return_t thread_setstatus(
USER_REGS(thread)->efl & (EFL_TF | EFL_IF);
break;
}
-
+#endif
case i386_DEBUG_STATE:
{
struct i386_debug_state *state;
@@ -710,13 +717,20 @@ kern_return_t thread_getstatus(
 {
switch (flavor)  {
case THREAD_STATE_FLAVOR_LIST:
-   if (*count < 4)
+#if !defined(__x86_64__) || defined(USER32)
+   unsigned int ncount = 4;
+#else
+   unsigned int ncount = 3;
+#endif
+   if (*count < ncount)
return (KERN_INVALID_ARGUMENT);
tstate[0] = i386_THREAD_STATE;
tstate[1] = i386_FLOAT_STATE;
tstate[2] = i386_ISA_PORT_MAP_STATE;
+#if !defined(__x86_64__) || defined(USER32)
tstate[3] = i386_V86_ASSIST_STATE;
-   *count = 4;
+#endif
+   *count = ncount;
break;
 
case i386_THREAD_STATE:
@@ -770,6 +784,7 @@ kern_return_t thread_getstatus(
 
state->cs = saved_state->cs;
state->ss = saved_state->ss;
+#if !defined(__x86_64__) || defined(USER32)
if (saved_state->efl & EFL_VM) {
/*
 * V8086 mode.
@@ -789,7 +804,9 @@ kern_return_t thread_getstatus(
saved_state->efl &= ~EFL_IF;
}
}
-   else {
+   else
+#endif
+   {
/*
 * 386 mode.
 */
@@ -835,7 +852,7 @@ kern_return_t thread_getstatus(
*count = i386_ISA_PORT_MAP_STATE_COUNT;
break;
}
-
+#if !defined(__x86_64__) || defined(USER32)
case i386_V86_ASSIST_STATE:
{
struct i386_v86_assist_state *state;
@@ -850,7 +867,7 @@ kern_return_t thread_getstatus(
*count = i386_V86_ASSIST_STATE_COUNT;
break;
}
-
+#endif
case i386_DEBUG_STATE:
{
struct i386_debug_state *state;
diff --git a/i386/i386/thread.h b/i386/i386/thread.h
index b5fc5ffb..2378154f 100644
--- a/i386/i386/thread.h
+++ b/i386/i386/thread.h
@@ -85,12 +85,14 @@ struct i386_saved_state {
unsigned long   efl;
unsigned long   uesp;
unsigned long   ss;
+#if !defined(__x86_64__) || defined(USER32)
struct v86_segs {
unsigned long v86_es;   /* virtual 8086 segment registers */
unsigned long v86_ds;
unsigned long v86_fs;
unsigned long v86_gs;
} v86_segs;
+#endif
 };
 
 /*
@@ -144,6 +146,7 @@ struct i386_fpsave_state {
};
 };
 
+#if !defined(__x86_64__) || defined(USER32)
 /*
  * v86_assist_state:
  *
@@ -157,6 +160,7 @@ struct v86_assist_state {
unsigned short  flags;  /* 8086 flag bits */
 };

[PATCH 4/5] x86_64: refactor segment register handling

2023-07-29 Thread Luca Dariz
The actual values are not saved together with the rest of the thread
state, both because it would be quite espensive (reading MSR, unless
rdfsbase instructions are supported, but that's optional) and not
really needed. The only way the user has to change its value is with a
specific RPC, so we can intercept the change easily.  Furthermore,
Leaving the values there exposes them to being corrupted in case of a
double interruption, e.g. an irq is handled just before iretq but
after starting to restore the thread state.  This solution was
suggested by Sergey Bugaev.

* i386/i386/db_trace.c: remove fsbase/gsbase from the registers
  available
* i386/i386/debug_i386.c: remove fsbase/gsbase from the printed thread
  state
* i386/i386/i386asm.sym: remove fsbase/gsbase as it's not needed in
  asm anymore
* i386/i386/pcb.c: point fsbase/gsbase to the new location
* i386/i386/thread.h: move fsbase/gsbase to the machine state
* x86_64/locore.S: generalize segment-handling including es/ds/gs/fs
  and remove fsbase/gsbase handling. Also, factor out kernel segment
  selector setting to a macro.
---
 i386/i386/db_trace.c   |   2 -
 i386/i386/debug_i386.c |   2 -
 i386/i386/i386asm.sym  |   2 -
 i386/i386/pcb.c|  12 +--
 i386/i386/thread.h |  22 +++-
 x86_64/locore.S| 228 +++--
 6 files changed, 108 insertions(+), 160 deletions(-)

diff --git a/i386/i386/db_trace.c b/i386/i386/db_trace.c
index 2b6ad741..8bd86fa5 100644
--- a/i386/i386/db_trace.c
+++ b/i386/i386/db_trace.c
@@ -78,8 +78,6 @@ struct db_variable db_regs[] = {
{ "r13",(long *)&ddb_regs.r13, db_i386_reg_value },
{ "r14",(long *)&ddb_regs.r14, db_i386_reg_value },
{ "r15",(long *)&ddb_regs.r15, db_i386_reg_value },
-   { "fsb",(long *)&ddb_regs.fsbase,db_i386_reg_value },
-   { "gsb",(long *)&ddb_regs.gsbase,db_i386_reg_value },
 #endif
 };
 struct db_variable *db_eregs = db_regs + sizeof(db_regs)/sizeof(db_regs[0]);
diff --git a/i386/i386/debug_i386.c b/i386/i386/debug_i386.c
index b5465796..41d032e3 100644
--- a/i386/i386/debug_i386.c
+++ b/i386/i386/debug_i386.c
@@ -40,8 +40,6 @@ void dump_ss(const struct i386_saved_state *st)
st->r8, st->r9, st->r10, st->r11);
printf("R12 %016lx R13 %016lx R14 %016lx R15 %016lx\n",
st->r12, st->r13, st->r14, st->r15);
-   printf("FSBASE %016lx GSBASE %016lx\n",
-   st->fsbase, st->gsbase);
printf("RIP %016lx EFLAGS %08lx\n", st->eip, st->efl);
 #else
printf("EAX %08lx EBX %08lx ECX %08lx EDX %08lx\n",
diff --git a/i386/i386/i386asm.sym b/i386/i386/i386asm.sym
index fd0be557..1b9b40bb 100644
--- a/i386/i386/i386asm.sym
+++ b/i386/i386/i386asm.sym
@@ -108,8 +108,6 @@ offset  i386_saved_stater   r12
 offset i386_saved_stater   r13
 offset i386_saved_stater   r14
 offset i386_saved_stater   r15
-offset i386_saved_stater   fsbase
-offset i386_saved_stater   gsbase
 #endif
 
 offset i386_interrupt_statei   eip
diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index d1c5fb50..1cf87eb1 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -229,8 +229,8 @@ void switch_ktss(pcb_t pcb)
 #endif /* MACH_PV_DESCRIPTORS */
 
 #if defined(__x86_64__) && !defined(USER32)
-   wrmsr(MSR_REG_FSBASE, pcb->iss.fsbase);
-   wrmsr(MSR_REG_GSBASE, pcb->iss.gsbase);
+   wrmsr(MSR_REG_FSBASE, pcb->ims.sbs.fsbase);
+   wrmsr(MSR_REG_GSBASE, pcb->ims.sbs.gsbase);
 #endif
 
db_load_context(pcb);
@@ -687,8 +687,8 @@ kern_return_t thread_setstatus(
 return KERN_INVALID_ARGUMENT;
 
 state = (struct i386_fsgs_base_state *) tstate;
-thread->pcb->iss.fsbase = state->fs_base;
-thread->pcb->iss.gsbase = state->gs_base;
+thread->pcb->ims.sbs.fsbase = state->fs_base;
+thread->pcb->ims.sbs.gsbase = state->gs_base;
 if (thread == current_thread()) {
 wrmsr(MSR_REG_FSBASE, state->fs_base);
 wrmsr(MSR_REG_GSBASE, state->gs_base);
@@ -889,8 +889,8 @@ kern_return_t thread_getstatus(
 return KERN_INVALID_ARGUMENT;
 
 state = (struct i386_fsgs_base_state *) tstate;
-state->fs_base = thread->pcb->iss.fsbase;
-state->gs_base = thread->pcb->iss.gsbase;
+state->fs_base = thread->pcb->ims.sbs.fsbase;
+state->gs_base = thread->pcb->ims.sbs.gsbase;
 *count = i386_FSGS_BASE_STATE_COUNT;
 break;
 }
diff --git a/i386/i386/thread.h b/i386/i386/thread.h
index 2378154f..86a44098 100644
--- a/i386/i386/thread.h
+++ b/i386/i386/thread.h
@@ -51,10 +51,6 @@
  */
 
 struct i386_saved_state {
-#ifdef __x86_64__
-   unsigned long   fsbase;
-   unsi

[PATCH 5/5] x86_64: remove unneeded segment selectors handling on full 64 bit

2023-07-29 Thread Luca Dariz
* i386/i386/db_interface.c: don't set unused segment selectors on full
  64-bit
* i386/i386/db_trace.c: likewise.
* i386/i386/i386asm.sym: likewise.
* i386/i386/pcb.c:: likewise.
* i386/i386/thread.h: remove ES/DS/FS/GS from thread state on !USER32,
  as they are unused in this configuration. Only SS and CS are kept.
* x86_64/locore.S: convert segment handling macros to no-op on full
  64-bit
---
 i386/i386/db_interface.c | 10 +++---
 i386/i386/db_trace.c |  2 ++
 i386/i386/i386asm.sym|  2 ++
 i386/i386/pcb.c  | 12 
 i386/i386/thread.h   |  4 
 x86_64/locore.S  | 16 +---
 6 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/i386/i386/db_interface.c b/i386/i386/db_interface.c
index 5a4ace9f..8f0ab4ec 100644
--- a/i386/i386/db_interface.c
+++ b/i386/i386/db_interface.c
@@ -332,12 +332,13 @@ kdb_trap(
regs->ebp= ddb_regs.ebp;
regs->esi= ddb_regs.esi;
regs->edi= ddb_regs.edi;
-   regs->es = ddb_regs.es & 0x;
regs->cs = ddb_regs.cs & 0x;
+#if !defined(__x86_64__) || defined(USER32)
+   regs->es = ddb_regs.es & 0x;
regs->ds = ddb_regs.ds & 0x;
regs->fs = ddb_regs.fs & 0x;
regs->gs = ddb_regs.gs & 0x;
-
+#endif
if ((type == T_INT3) &&
(db_get_task_value(regs->eip, BKPT_SIZE, FALSE, TASK_NULL)
 == BKPT_INST))
@@ -401,11 +402,12 @@ kdb_kentry(
ddb_regs.esi = is->rsi;
ddb_regs.edi = is->rdi;
 #endif
+#if !defined(__x86_64__) || defined(USER32)
ddb_regs.ds  = is->ds;
ddb_regs.es  = is->es;
ddb_regs.fs  = is->fs;
ddb_regs.gs  = is->gs;
-
+#endif
cnpollc(TRUE);
db_task_trap(-1, 0, (ddb_regs.cs & 0x3) != 0);
cnpollc(FALSE);
@@ -430,10 +432,12 @@ kdb_kentry(
is->rsi = ddb_regs.esi;
is->rdi = ddb_regs.edi;
 #endif
+#if !defined(__x86_64__) || defined(USER32)
is->ds  = ddb_regs.ds & 0x;
is->es  = ddb_regs.es & 0x;
is->fs  = ddb_regs.fs & 0x;
is->gs  = ddb_regs.gs & 0x;
+#endif
}
 #ifNCPUS > 1
db_leave();
diff --git a/i386/i386/db_trace.c b/i386/i386/db_trace.c
index 8bd86fa5..b63e140f 100644
--- a/i386/i386/db_trace.c
+++ b/i386/i386/db_trace.c
@@ -54,10 +54,12 @@
  */
 struct db_variable db_regs[] = {
{ "cs", (long *)&ddb_regs.cs,  db_i386_reg_value },
+#if !defined(__x86_64__) || defined(USER32)
{ "ds", (long *)&ddb_regs.ds,  db_i386_reg_value },
{ "es", (long *)&ddb_regs.es,  db_i386_reg_value },
{ "fs", (long *)&ddb_regs.fs,  db_i386_reg_value },
{ "gs", (long *)&ddb_regs.gs,  db_i386_reg_value },
+#endif
{ "ss", (long *)&ddb_regs.ss,  db_i386_reg_value },
{ "eax",(long *)&ddb_regs.eax, db_i386_reg_value },
{ "ecx",(long *)&ddb_regs.ecx, db_i386_reg_value },
diff --git a/i386/i386/i386asm.sym b/i386/i386/i386asm.sym
index 1b9b40bb..8af0c5d6 100644
--- a/i386/i386/i386asm.sym
+++ b/i386/i386/i386asm.sym
@@ -84,8 +84,10 @@ size i386_kernel_state   iks
 
 size   i386_exception_link iel
 
+#if !defined(__x86_64__) || defined(USER32)
 offset i386_saved_stater   gs
 offset i386_saved_stater   fs
+#endif
 offset i386_saved_stater   cs
 offset i386_saved_stater   uesp
 offset i386_saved_stater   eax
diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index 1cf87eb1..e0f4f57a 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -417,10 +417,12 @@ void pcb_init(task_t parent_task, thread_t thread)
 */
pcb->iss.cs = USER_CS;
pcb->iss.ss = USER_DS;
+#if !defined(__x86_64__) || defined(USER32)
pcb->iss.ds = USER_DS;
pcb->iss.es = USER_DS;
pcb->iss.fs = USER_DS;
pcb->iss.gs = USER_DS;
+#endif
pcb->iss.efl = EFL_USER_SET;
 
thread->pcb = pcb;
@@ -578,10 +580,12 @@ kern_return_t thread_setstatus(
 */
saved_state->cs = USER_CS;
saved_state->ss = USER_DS;
+#if !defined(__x86_64__) || defined(USER32)
saved_state->ds = USER_DS;
saved_state->es = USER_DS;
saved_state->fs = USER_DS;
saved_state->gs = USER_DS;
+#endif
}
else {
/*
@@ -592,10 +596,12 @@ kern_return_t thread_setstatus(
 */
saved_state->cs = state->cs;
saved_state->ss = state->ss;
+#if !defined(__x86_64__) || defined(USER32)
saved_state->ds = state->ds;
saved_state->es = state->es;
saved_state->fs = state->fs;
saved_state->gs =

[PATCH 1/5] x86_64: fix stack handling on recursive interrupts for USER32

2023-07-29 Thread Luca Dariz
* x86_64/locore.S: ensure the thread state is filled completely even
  on recursive interrups. The value of the segment selectors is not
  very important in this case, but we still need to align the stack to
  the bottom of i386_interrupt_state.
---
 x86_64/locore.S | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/x86_64/locore.S b/x86_64/locore.S
index 16b0dde5..ac7138b7 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -752,17 +752,17 @@ ENTRY(all_intrs)
pushq   %r11
cld /* clear direction flag */
 
-   movq%rsp,%rdx   /* on an interrupt stack? */
-   and $(~(INTSTACK_SIZE-1)),%rdx
-   cmpq%ss:EXT(int_stack_base),%rdx
-   je  int_from_intstack   /* if not: */
-
movq%ds,%rdx/* save segment registers */
pushq   %rdx
movq%es,%rdx
pushq   %rdx
PUSH_FSGS_ISR
 
+   movq%rsp,%rdx   /* on an interrupt stack? */
+   and $(~(INTSTACK_SIZE-1)),%rdx
+   cmpq%ss:EXT(int_stack_base),%rdx
+   je  int_from_intstack   /* if not: */
+
mov %ss,%dx /* switch to kernel segments */
mov %dx,%ds
mov %dx,%es
@@ -830,6 +830,12 @@ int_from_intstack:
jb  stack_overflowed/* if not: */
callEXT(interrupt)  /* call interrupt routine */
 _return_to_iret_i: /* ( label for kdb_kintr) */
+   POP_FSGS_ISR
+   pop %rdx
+   mov %rdx,%es
+   pop %rdx
+   mov %rdx,%ds
+
pop %r11
pop %r10
pop %r9
-- 
2.39.2




Re: [RFC PATCH v4 1/2 hurd] libirqhelp: Add library

2023-07-07 Thread Luca

Il 05/07/23 09:59, Damien Zammit ha scritto:

diff --git a/libirqhelp/irqhelp.c b/libirqhelp/irqhelp.c
new file mode 100644
index 0..bafae6cf1
--- /dev/null
+++ b/libirqhelp/irqhelp.c
+void *
+irqhelp_server_loop(void *arg)
+{
+  struct irq *irq = (struct irq *)arg;
+
+  if (!irq)
+{
+  printf("irqhelp cannot start this irq thread\n");
+  return NULL;
+}
+
+  int interrupt_demuxer (mach_msg_header_t *inp,
+mach_msg_header_t *outp)
+  {
+device_intr_notification_t *n = (device_intr_notification_t *) inp;
+
+((mig_reply_header_t *) outp)->RetCode = MIG_NO_REPLY;
+if (n->intr_header.msgh_id != DEVICE_INTR_NOTIFY)
+  return 0;  /* not an interrupt */
+
+/* FIXME: id <-> gsi now has an indirection, assuming 1:1 */
+if (n->id != irq->gsi)
+  return 0;  /* interrupt not for us */
+
+/* wait if irq disabled */
+pthread_mutex_lock (&irq->irqlock);
+while (!irq->enabled)
+  pthread_cond_wait (&irq->irqcond, &irq->irqlock);
+pthread_mutex_unlock (&irq->irqlock); > +
+/* call handler */
+irq->handler(irq->context);
+
+/* ACK interrupt */
+device_intr_ack (irqdev, irq->port, MACH_MSG_TYPE_MAKE_SEND);
+
+if (irq->shutdown)
+  {
+   mach_port_deallocate(mach_task_self (), irq->port);
+   irq->port = MACH_PORT_NULL;
+   pthread_cond_destroy(&irq->irqcond);
+   pthread_mutex_destroy(&irq->irqlock);
+   free(irq);
+pthread_exit(NULL);


This would prevent the thread calling irqhelp_server_loop() to do some 
cleanup. Looking at the implementation in glibc (in mach/msgserver.c), 
maybe one way would be to set a bad reply message, that would cause an 
error different from MACH_SEND_INVALID_DEST or MACH_RCV_TOO_LARGE, so 
the server loop will return with this error code. It's a bit hacky but I 
don't know if there is another way, since the demuxer return code seems 
ignored.




Luca



Re: Recent patches break ACPI tables

2023-06-19 Thread Luca

Il 19/06/23 21:40, l...@orpolo.org ha scritto:

and at this stage the lapic pointer is not yet initialized:

(gdb) p lapic
$4 = (volatile ApicLocalUnit *) 0x0
(gdb) x &lapic
0xc109bc6c :    0x

I guess so far this worked because the address 0 was mapped, and now it 
isn't.


I'm not sure what would be the proper way to solve this. I tried 
anticipating the call to machine_init() to be before vm_mem_bootstrap() 
(to have lapic initialized) but this triggers another assert.


Strangely, initializing lapic with a dummy structure as in

===8<===
diff --git a/i386/i386/apic.c b/i386/i386/apic.c
index ff7ba3e2..db5cbc15 100644
--- a/i386/i386/apic.c
+++ b/i386/i386/apic.c
@@ -27,7 +27,9 @@
 #include 

-volatile ApicLocalUnit* lapic = NULL;
+static ApicLocalUnit init_lapic;
+
+volatile ApicLocalUnit* lapic = &init_lapic;

 ApicInfo apic_data;
===8<===

makes qemu crash, as if the cpu is started multiple times:

===8<===
qemu-system-i386 -m 512 -nographic -no-reboot -cdrom test-hello.iso -smp 2

GNU Mach 1.8
ELF section header table at c0010198
biosmem: physical memory map:
biosmem: 00:09f000, available
biosmem: 09fc00:0a, reserved
biosmem: 0f:10, reserved
biosmem: 10:001ffe, available
biosmem: 001ffe:002000, reserved
biosmem: 00fffc:01, reserved
Loaded ELF symbol table for mach (3162 symbols)
vm_page: page table size: 131024 entries (7168k)
vm_page: DMA: pages: 4080 (15M), free: 0 (0M)
vm_page: DMA: min:500 low:600 high:1000
vm_page: DIRECTMAP: pages: 126944 (495M), free: 124031 (484M)
vm_page: DIRECTMAP: min:6347 low:7616 high:12694
ACPI:
 rsdp = c00f59d0; rsdp->rsdt_addr = 1ffe1b21
 rsdt = e1184b21; rsdt->length = 34 (n = 4)
CPUS:
 CPU 0 - APIC ID 0 - addr=0xe1187000
 CPU 1 - APIC ID 1 - addr=0xe1187000
IOAPICS:
 IOAPIC 0 - APIC ID 0 - addr=0xe1188000
IRQ override: pin=0 gsi=2 trigger=EDGE polarity=HIGH
IRQ override: pin=5 gsi=5 trigger=LEVEL polarity=HIGH
IRQ override: pin=9 gsi=9 trigger=LEVEL polarity=HIGH
IRQ override: pin=10 gsi=10 trigger=LEVEL polarity=HIGH
IRQ override: pin=11 gsi=11 trigger=LEVEL polarity=HIGH
IOAPIC 0 configured
com 2 out of range
lpr0: at atbus0, port = 378x, spl = 6d, pic = 7.
RTC time is 2023-06-19 20:11:07
timer calibration... done
Starting AP 1
Trying to enable: 1
**
ERROR:../../accel/tcg/tcg-accel-ops-mttcg.c:110:mttcg_cpu_thread_fn: 
assertion failed: (cpu->halted)
Sending IPIs to APIC ID 1...Bail out! 
ERROR:../../accel/tcg/tcg-accel-ops-mttcg.c:110:mttcg_cpu_thread_fn: 
assertion failed: (cpu->halted)

===8<===


Luca



Re: Recent patches break ACPI tables

2023-06-19 Thread luca

Il 19/06/23 20:35, Almudena Garcia ha scritto:

But the code which starts the secondary cpus is so much later than the crash.

Then, the crash could be produced by the reading of ACPI tables, which are 
supposed to be in a certain memory region, defined by a physical address.

phystokv will doesn't solve fully the problem, because the lapic address is out 
of the range allowed by this function. Currently, we are using paging to map 
every ACPI table which we need to access (to get a virtual address of this).

But the search of the initial ACPI address is based in a physical address range.


I could go a bit further with debugging, and it seems that the problem 
is a bit different, it seems removing the 1:1 map exposed an issue that 
went hidden so far.


In my test the cpu is reset by a triple fault (you can see this by 
enabling interrupt and cpu_reset logging with qemu, e.g. using -d 
int,cpu_reset) which is triggered after the first call to splvm:


(gdb) bt
#0  splvm () at ../i386/i386/spl.S:122
#1  0xc1001da6 in pmap_enter (pmap=, v=, 
pa=, prot=, wired=) at 
../i386/intel/pmap.c:2171
#2  0xc1029b99 in pmap_steal_memory (size=) at 
../vm/vm_resident.c:278
#3  0xc1029c48 in vm_page_bootstrap (startp=, 
endp=) at ../vm/vm_resident.c:207

#4  0xc101b893 in vm_mem_bootstrap () at ../vm/vm_init.c:65
#5  0xc10161d1 in setup_main () at ../kern/startup.c:115
#6  0xc1004652 in c_boot_entry (bi=) at 
../i386/i386at/model_dep.c:578

#7  0xc193 in iplt_done () at ../i386/i386at/boothdr.S:103
(gdb) si
124 cli
1: x/i $pc
=> 0xc100ac5d :   cli
(gdb)
125 CPU_NUMBER(%edx)
1: x/i $pc
=> 0xc100ac5e :   mov%cs:0xc109bc6c,%edx
(gdb)
0xc100ac65  125 CPU_NUMBER(%edx)
1: x/i $pc
=> 0xc100ac65 :  mov%cs:0x20(%edx),%edx
(gdb)
t_page_fault () at ../i386/i386/locore.S:435
435 pushl   $(T_PAGE_FAULT) /* mark a page fault trap */
1: x/i $pc
=> 0xc100a42c :  push   $0xe

... and here it will enter recursively t_page_fault, because in 
trap_from_kernel there is another CPU_NUMBER. I guess the triple fault 
is triggered because at some point the exception stack overflows.


With --enable-ncpu=2 it seems that CPU_NUMBER is

#define CPU_NUMBER(reg) \
movl%cs:lapic, reg  ;\
movl%cs:APIC_ID(reg), reg   ;\
shrl$24, reg;\

and at this stage the lapic pointer is not yet initialized:

(gdb) p lapic
$4 = (volatile ApicLocalUnit *) 0x0
(gdb) x &lapic
0xc109bc6c :   0x

I guess so far this worked because the address 0 was mapped, and now it 
isn't.


I'm not sure what would be the proper way to solve this. I tried 
anticipating the call to machine_init() to be before vm_mem_bootstrap() 
(to have lapic initialized) but this triggers another assert.


Any idea?


Luca




Re: [PATCH 2/3] x86_64: install emergency handler for double fault

2023-06-19 Thread luca

Hi,

Il 19/06/23 19:53, Almudena Garcia ha scritto:

The cpus are started sequentially: don't start the next until the current has 
finished its configuration.


in this case I think it's not needed to have a per-cpu bootstrap stack, 
assuming the secondary cpu already runs a proper thread (i.e. KTSS is 
loaded) when the next cpu is started.


Luca



Re: [PATCH 2/3] x86_64: install emergency handler for double fault

2023-06-19 Thread Luca Dariz

Il 17/06/23 23:12, Samuel Thibault ha scritto:

Luca Dariz, le jeu. 15 juin 2023 23:49:30 +0200, a ecrit:

diff --git a/i386/i386/ktss.c b/i386/i386/ktss.c
index 1d880167..52f3722c 100644
--- a/i386/i386/ktss.c
+++ b/i386/i386/ktss.c
@@ -61,6 +61,7 @@ ktss_fill(struct task_tss *myktss, struct real_descriptor 
*mygdt)
/* Initialize the master TSS.  */
  #ifdef __x86_64__
myktss->tss.rsp0 = (unsigned long)(exception_stack+1024);
+   myktss->tss.ist1 = (unsigned long)(exception_stack+1024);


Shouldn't we use a different stack, to avoid overwriting information
from the first fault?


you're right, otherwise it will be less useful in case a double fault 
happens before scheduling starts (which then overwrites rsp0 with the 
pcb stack on every context switch).


Thinking about the smp case, are the cpu started in parallel or 
sequentially? There might also be a need to set a cpu-specific first 
rsp0, if there is the possibility to have early interrupts concurrently.



Luca




Re: Recent patches break ACPI tables

2023-06-19 Thread luca

Hi,

Il 18/06/23 22:52, Samuel Thibault ha scritto:

Hello,

Damien Zammit, le dim. 18 juin 2023 00:48:15 +, a ecrit:

Almu and I discovered that the following commit breaks --enable-apic 
--enable-ncpus= >1 --disable-linux-groups
   
* d972c01c pmap: only map lower BIOS memory 1:1 when using Linux drivers


I believe the ACPI tables need temporary low memory mapping to access them.


Luca, what do you think?

Can we perhaps fix the ACPI code to use phystokv to translate from
physical addresses to virtual addresses?


yes that's probably the best way to fix the code.

By the way, how, is it exactly crashing? It might not be related to 
ACPI/APIC code, I tried a simple program on a 32-bit kernel with 
--enable-apic but without --enable-ncpus and it seems to work. If I use 
--enable-ncpus=2 it seems to fail quite early, even without -smp=2 in 
qemu. At the moment I'm not able to set a breakpoint even to the C entry 
point, I'm not sure why, I think I did it in the past.


Maybe there is some other part, SMP_specific, that assumes the lower 
memory is mapped 1:1? I remember there was some bootstrap code to start 
the secondary cpus, but I don't remember the details.



Luca



[PATCH 1/3] x86_64: use solid intstack already during bootstrap

2023-06-15 Thread Luca Dariz
* x86_64/boothdr.S: there is no reason to not use it right away
---
 x86_64/boothdr.S | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/x86_64/boothdr.S b/x86_64/boothdr.S
index d81f9a78..0ab9bd55 100644
--- a/x86_64/boothdr.S
+++ b/x86_64/boothdr.S
@@ -158,7 +158,7 @@ switch64:
 
 boot_entry64:
/* Switch to our own interrupt stack.  */
-   movq$(_intstack+INTSTACK_SIZE),%rax
+   movq$solid_intstack+INTSTACK_SIZE-16, %rax
andq$(~15),%rax
movq%rax,%rsp
 
@@ -192,9 +192,6 @@ iplt_done:
/* not reached */
nop
 
-   .section .boot.data
-   .comm   _intstack,INTSTACK_SIZE
-
.code32
.section .boot.data
.align 4096
-- 
2.39.2




[PATCH 2/3] x86_64: install emergency handler for double fault

2023-06-15 Thread Luca Dariz
* i386/i386/idt.c: add selector for the interrupt-specific stack
* i386/i386/ktss.c: configure ist1
* i386/i386/trap.c: add double fault handler, which just prints the
  state and panics. There is not much else to do in this case but it's
  useful for troubleshooting
* x86_64/idt_inittab.S: allow to specify an interrupt stack for custom
  handlers
* x86_64/locore.S: add double fault handler
---
 i386/i386/idt.c  | 12 +++-
 i386/i386/ktss.c |  1 +
 i386/i386/trap.c |  6 ++
 x86_64/idt_inittab.S | 25 +
 x86_64/locore.S  | 15 +++
 5 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/i386/i386/idt.c b/i386/i386/idt.c
index cdfb9a88..caa44d71 100644
--- a/i386/i386/idt.c
+++ b/i386/i386/idt.c
@@ -34,6 +34,10 @@ struct idt_init_entry
unsigned long entrypoint;
unsigned short vector;
unsigned short type;
+#ifdef __x86_64__
+   unsigned short ist;
+   unsigned short pad_0;
+#endif
 };
 extern struct idt_init_entry idt_inittab[];
 
@@ -49,7 +53,13 @@ idt_fill(struct real_gate *myidt)
/* Initialize the exception vectors from the idt_inittab.  */
while (iie->entrypoint)
{
-   fill_idt_gate(myidt, iie->vector, iie->entrypoint, KERNEL_CS, 
iie->type, 0);
+   fill_idt_gate(myidt, iie->vector, iie->entrypoint, KERNEL_CS, 
iie->type,
+#ifdef __x86_64__
+ iie->ist
+#else
+ 0
+#endif
+   );
iie++;
}
 
diff --git a/i386/i386/ktss.c b/i386/i386/ktss.c
index 1d880167..52f3722c 100644
--- a/i386/i386/ktss.c
+++ b/i386/i386/ktss.c
@@ -61,6 +61,7 @@ ktss_fill(struct task_tss *myktss, struct real_descriptor 
*mygdt)
/* Initialize the master TSS.  */
 #ifdef __x86_64__
myktss->tss.rsp0 = (unsigned long)(exception_stack+1024);
+   myktss->tss.ist1 = (unsigned long)(exception_stack+1024);
 #else /* ! __x86_64__ */
myktss->tss.ss0 = KERNEL_DS;
myktss->tss.esp0 = (unsigned long)(exception_stack+1024);
diff --git a/i386/i386/trap.c b/i386/i386/trap.c
index f7bd8e38..b3689c9a 100644
--- a/i386/i386/trap.c
+++ b/i386/i386/trap.c
@@ -666,3 +666,9 @@ db_debug_all_traps (boolean_t enable)
 }
 
 #endif /* MACH_KDB */
+
+void handle_double_fault(struct i386_saved_state *regs)
+{
+  dump_ss(regs);
+  panic("DOUBLE FAULT! This is critical\n");
+}
diff --git a/x86_64/idt_inittab.S b/x86_64/idt_inittab.S
index f021b56d..fc1df0c7 100644
--- a/x86_64/idt_inittab.S
+++ b/x86_64/idt_inittab.S
@@ -50,12 +50,13 @@ ENTRY(idt_inittab)
.quad   entry   ;\
.text
 #else  /* MACH_PV_DESCRIPTORS */
-#defineIDT_ENTRY(n,entry,type) \
+#defineIDT_ENTRY(n,entry,type,ist) \
.data   2   ;\
.quad   entry   ;\
.word   n   ;\
.word   type;\
-   .long   0   /*pad*/ ;\
+   .word   ist ;\
+   .word   0   /*pad*/ ;\
.text
 #endif /* MACH_PV_DESCRIPTORS */
 
@@ -63,7 +64,7 @@ ENTRY(idt_inittab)
  * No error code.  Clear error code and push trap number.
  */
 #defineEXCEPTION(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(0);\
@@ -74,7 +75,7 @@ ENTRY(name)   ;\
  * User-accessible exception.  Otherwise, same as above.
  */
 #defineEXCEP_USR(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_U|ACC_TRAP_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_U|ACC_TRAP_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(0);\
@@ -85,7 +86,7 @@ ENTRY(name)   ;\
  * Error code has been pushed.  Just push trap number.
  */
 #defineEXCEP_ERR(n,name) \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_INTR_GATE);\
+   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_INTR_GATE, 0);\
 ENTRY(name);\
INT_FIX ;\
pushq   $(n);\
@@ -95,25 +96,25 @@ ENTRY(name) ;\
  * Special interrupt code: dispatches to a unique entrypoint,
  * not defined automatically here.
  */
-#defineEXCEP_SPC(n,name)  \
-   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE)
+#defineEXCEP_SPC(n,name, ist)  \
+   IDT_ENTRY(n,EXT(name),ACC_PL_K|ACC_TRAP_GATE, ist)
 
 
 EXCEPTION(0x00,t_zero_div)
-EXCEP_SPC(0x01,t_debug)
+EXCEP_SPC(0x01,t_debug, 0)
 /* skip NMI interrupt - let more specific code figure that out.  */
 EXCEP_USR(0x03,t_int3)
 EXCEP_USR(0x04,t_into)
 EXCEP_USR(0x05,t_bounds)
 EXCEPTION(0x06,t_invop)
 EXCEPTION(0x07,t_nofpu)
-EXCEPTION(0x08,a_dbl_fault)
+EXCEP_SPC(0x08,t_dbl_fault, 1)
 EXCEPTION(0x09,a_fpu_over)
 EXCEPTION(0x0

[PATCH 3/3] x86_64: add a critical section on entry and exit from syscall/sysret

2023-06-15 Thread Luca Dariz
When entering a syscall we're still using the user stack, so we can't
reliably handle exceptions or interrupts, otherwise a user thread can
easily crash the machine with an invalid stack. Instead, disable
interrupts and (hopefullly) avoid traps in the fragments where we need
to have the user stack in RSP.

* i386/i386/ldt.c: mask interrupts and IOPL on syscall entry
* x86_64/locore.S: keep interrupts disabled when we use the user stack
---
 i386/i386/ldt.c | 3 ++-
 x86_64/locore.S | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/i386/i386/ldt.c b/i386/i386/ldt.c
index 4d7ec19a..5db36426 100644
--- a/i386/i386/ldt.c
+++ b/i386/i386/ldt.c
@@ -27,6 +27,7 @@
  * "Local" descriptor table.  At the moment, all tasks use the
  * same LDT.
  */
+#include 
 #include 
 #include 
 
@@ -75,7 +76,7 @@ ldt_fill(struct real_descriptor *myldt, struct 
real_descriptor *mygdt)
 wrmsr(MSR_REG_EFER, rdmsr(MSR_REG_EFER) | MSR_EFER_SCE);
 wrmsr(MSR_REG_LSTAR, (vm_offset_t)syscall64);
 wrmsr(MSR_REG_STAR, long)USER_CS - 16) << 16) | (long)KERNEL_CS) 
<< 32);
-wrmsr(MSR_REG_FMASK, 0);  // ?
+wrmsr(MSR_REG_FMASK, EFL_IF | EFL_IOPL_USER);
 #else /* defined(__x86_64__) && ! defined(USER32) */
fill_ldt_gate(myldt, USER_SCALL,
  (vm_offset_t)&syscall, KERNEL_CS,
diff --git a/x86_64/locore.S b/x86_64/locore.S
index a6697fb9..16b0dde5 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -1405,9 +1405,10 @@ ENTRY(syscall64)
mov %r11,%rbx   /* prepare for error handling */
mov %r10,%rcx   /* fix arg3 location according to C ABI 
*/
 
-   /* switch to kernel stack */
+   /* switch to kernel stack, then we can enable interrupts */
CPU_NUMBER(%r11)
movqCX(EXT(kernel_stack),%r11),%rsp
+   sti
 
/* Now we have saved state and args 1-6 are in place.
 * Before invoking the syscall we do some bound checking and,
@@ -1468,6 +1469,7 @@ _syscall64_check_for_ast:
 
 _syscall64_restore_state:
/* Restore thread state and return to user using sysret. */
+   cli /* block interrupts when using the user stack in kernel space */
movqCX(EXT(active_threads),%r11),%r11 /* point to current thread */
movqTH_PCB(%r11),%r11   /* point to pcb */
addq$ PCB_ISS,%r11  /* point to saved state */
-- 
2.39.2




Re: [VERY RFC PATCH] x86_64: Rework storing segment registers

2023-06-15 Thread Luca

Il 15/06/23 19:41, Sergey Bugaev ha scritto:

On Thu, Jun 15, 2023 at 8:02 PM Luca  wrote:

That's a bit strange, with my kernel I can now successfully end the
second stage, and without these warnings. Strangely, I don't see the
"Setting up translators: " line...


Check /debootstrap/debootatrap.log :)

The output I posted was from both debootstrap std{out,err} and the
log, interleaved. I'm launching tail -f /debootstrap/debootatrap.log
in parallel with deboostrap itself, following Samuel's instructions.


ah yes, now it makes sense :D I saw that, but then I forgot...


+#if !defined(__x86_64__) || defined(USER32)


I'm wondering if we could simplify this case by defining USER32 also on
i386.


Perhaps, but then you'd need to go over the other usages of USER32 and
verify they don't imply x86_64.


from a quick look It seems indeed that most cases assume x86_64. Maybe 
it would make sense to define USER64 as defined(__x86_64__) && 
!defined(USER32) since that seems the most common case.



+#if defined(__x86_64__) && !defined(USER32)
+struct i386_saved_fsgs_base_state {
+ unsigned long   fsbase;
+ unsigned long   gsbase;
+};
+#endif
+


These can maybe be placed in the i386_machine_state below


Yeah, that'd probably be better.


Why did you remove these? The idea was that they could now be empty on
full 64-bit, and encapsulating the segment handling code for 32-bit,
which is repeated in various places... One addition could also be to set
the kernel segments after saving the user ones, or are there some more
variations of them?


Because there's no longer a need to do anything special for fs/gs,
they are now handled the same way as es/ds, namely ifdef'ed out on
!USER32.

Maybe it would make sense to define new macros like PUSH_SEGS and
POP_SEGS that would push/pop all of (ds, es, fs, gs), but that'd be a
separate change from this patch.


ok, in any case I would keep PUSH_SEGS and SET_KERNEL_SEGMENTS 
separated, as I think they need to be split in all_intrs.



Luca



Re: [VERY RFC PATCH] x86_64: Rework storing segment registers

2023-06-15 Thread Luca

Hi Sergey,

Il 15/06/23 16:54, Sergey Bugaev ha scritto:

* For USER32, don't store fs/gs base at all
* For !USER32, store fs/gs base outside of the PCB stack
* For !USER32, don't store or touch es, ds, fs, gs (but keep ss and cs)
* For !USER32, disable all of the v86 code
---


Incidentally I have a patch very very similar to this that I'm testing 
in various configurations (also thanks to your suggestion about the 
stack corruption!), although in my case I had to disable some more code 
(e.g. in db_trace.c, db_interface.c, probably because I have kdb enabled).



So I went ahead and just made x86_64 !USER32 not store or access those
segment registers, along with moving fs/gs base out of iss and disabling
v86 (not that I know what v86 is, but it sounds like something we don't
need to support considering we only allow running x86_64 code).


I think V86 is not even used on the 32-bit build, apparently it is a 
special protected mode to support older x86 programs...



I have only tested my configuration (x86_64 !USER32 !MACH_KDB) -- quite
likely this doesn't build or work in others; but for me it seems to work
very well and I haven't got a single crash, in kernel space or user
space.

Please do review!

debootstrap is still not quite happy. I've uploaded the log here: [0]

[0]: https://paste.gg/p/anonymous/c976008dc38342cd963b0778586ead19


That's a bit strange, with my kernel I can now successfully end the 
second stage, and without these warnings. Strangely, I don't see the 
"Setting up translators: " line...


On the other hand, I had to fix a tricky combination of syscalls and 
interrupts, that for some reason was triggered quite often and caused a 
triple fault, together with an issue with nested interrupts (which I'm 
not sure are still used, but the code to handle it looks weird).


How are you launching the second stage? For now I used Samuel's archive 
as ramdisk, then launched bash and then debootstrap --second-stage:


https://paste.debian.net/1283107/



  i386/i386/debug_i386.c |   2 -
  i386/i386/i386asm.sym  |   4 +-
  i386/i386/pcb.c|  30 
  i386/i386/thread.h |  27 +--
  x86_64/locore.S| 105 +
  5 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/i386/i386/debug_i386.c b/i386/i386/debug_i386.c
index b5465796..41d032e3 100644
--- a/i386/i386/debug_i386.c
+++ b/i386/i386/debug_i386.c
@@ -40,8 +40,6 @@ void dump_ss(const struct i386_saved_state *st)
st->r8, st->r9, st->r10, st->r11);
printf("R12 %016lx R13 %016lx R14 %016lx R15 %016lx\n",
st->r12, st->r13, st->r14, st->r15);
-   printf("FSBASE %016lx GSBASE %016lx\n",
-   st->fsbase, st->gsbase);
printf("RIP %016lx EFLAGS %08lx\n", st->eip, st->efl);
  #else
printf("EAX %08lx EBX %08lx ECX %08lx EDX %08lx\n",
diff --git a/i386/i386/i386asm.sym b/i386/i386/i386asm.sym
index fd0be557..8af0c5d6 100644
--- a/i386/i386/i386asm.sym
+++ b/i386/i386/i386asm.sym
@@ -84,8 +84,10 @@ size i386_kernel_state   iks
  
  size	i386_exception_link	iel
  
+#if !defined(__x86_64__) || defined(USER32)


I'm wondering if we could simplify this case by defining USER32 also on 
i386.



  offseti386_saved_stater   gs
  offseti386_saved_stater   fs
+#endif
  offseti386_saved_stater   cs
  offseti386_saved_stater   uesp
  offseti386_saved_stater   eax
@@ -108,8 +110,6 @@ offset  i386_saved_stater   r12
  offseti386_saved_stater   r13
  offseti386_saved_stater   r14
  offseti386_saved_stater   r15
-offset i386_saved_stater   fsbase
-offset i386_saved_stater   gsbase
  #endif
  
  offset	i386_interrupt_state	i	eip

diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index fb535709..d8987ddf 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -145,9 +145,14 @@ void switch_ktss(pcb_t pcb)
 *  won`t save the v86 segments, so we leave room.
 */
  
+#if !defined(__x86_64__) || defined(USER32)

pcb_stack_top = (pcb->iss.efl & EFL_VM)
? (long) (&pcb->iss + 1)
: (long) (&pcb->iss.v86_segs);
+#else
+   pcb_stack_top = (vm_offset_t) (&pcb->iss + 1);
+#endif
+
  #ifdef __x86_64__
assert((pcb_stack_top & 0xF) == 0);
  #endif
@@ -224,8 +229,8 @@ void switch_ktss(pcb_t pcb)
  #endif /* MACH_PV_DESCRIPTORS */
  
  #if defined(__x86_64__) && !defined(USER32)

-   wrmsr(MSR_REG_FSBASE, pcb->iss.fsbase);
-   wrmsr(MSR_REG_GSBASE, pcb->iss.gsbase);
+   wrmsr(MSR_REG_FSBASE, pcb->isbs.fsbase);
+   wrmsr(MSR_REG_GSBASE, pcb->isbs.gsbase);
  #endif
  
  	db_load_context(pcb);

@@ -412,10 +417,12 @@ void pcb_init(task_t parent_task, thread_t thread)
 */
pcb->iss.cs = USER_CS;
pcb->iss.ss = USER_DS;
+#if !de

Re: 64bit startup

2023-06-10 Thread Luca

Hi,

Il 09/06/23 19:49, Sergey Bugaev ha scritto:

with the following hacky patch, I no longer see any crashes,
debootstrap --second-stage runs all the way and leaves me with an
almost full Debian GNU/Hurd x86_64 system \o/


I was able to avoid the issues with a different approach, basically 
commenting any write to ES/DS in x86_64/locore.S. In my case the 
corruption usually happens when handling the error of writing a bad 
value to ES/DS, and disabling this probably hides the corruption issue, 
but it's still a different issue that could be solved by just removing 
the useless segment handling.


It might be that the corruption itself is caused by a second trap/irq 
happening when the exit code from a first trap is executing (e.g. pop 
the last registers before iretq). This would enter the trap/irq handler 
using the same pcb stack, and probably some fsbase/gsbase push/pop is 
not done correctly in this case.



Luca



Re: 64bit startup

2023-06-06 Thread Luca
Il 6 giugno 2023 20:22:53 UTC, Sergey Bugaev  ha scritto:
>On Mon, Jun 5, 2023 at 3:03 PM Sergey Bugaev  wrote:
>> That is going to be much easier to debug than debootstrap, thank you!
>
>Unfortunately I'm facing some troubles :|
>
>For one thing you seem to have rebuilt/updated the packages, but not
>the rootfs image, so now the debuginfo I download doesn't match the
>binaries in the image. Please update the image too!
>
>Also, gdb seems to suddenly have some sort of trouble with
>interpreting the debuginfo files, specifically I'm trying ones from
>the 'hurd-dbsym' package (to be even more specific:
>/usr/lib/debug/.build-id/bf/dd0c0525d0ca383bd842796063345a2dd0c001.debug
>from that package, which corresponds to ext2fs.static -- but I've done
>a quick check and other files seem to behave the same way too). GDB
>loads regular symbols from them, but not the debuginfo, i.e. I can see
>what function is at which address, but not map addresses to source
>lines or access local variables or use types (tcbhead_t is the one I
>currently need most). I don't know enough about GDB and DWARF to
>diagnose exactly what's going on; readelf --debug-dump=info seems to
>dump the debuginfo just fine.
>
>Please try to reproduce this with your GDB (no Hurd system required),
>and if you have changed something recently about how debug files are
>generated, maybe that's what has broken it.
>
>So that all being said, here's one crash I am (and have been) seeing a
>lot: the crash at any sort of TCB access when fs_base suddenly turns
>out to be equal to the address of _kret_popl_ds. This makes no sense
>-- surely userspace would never set that, so it must be a gnumach bug.
>
>I've got a little theory of how something like that could happen:
>
>It is my understanding that "the PCB stack" (whatever that is) where
>locore.S pushes user's registers and thread->pcb->iss is really the
>exact same place, pushing registers onto that stack is exactly writing
>to the thread's i386_saved_state structure. The first four members of
>struct i386_saved_state are unsigned long fsbase, gsbase, gs, fs --
>and being the first members of the struct means they have the lowest
>addresses, i.e. are located at the top of the PCB stack.
>
>locore.S actually skips pushing or popping these four members:
>
>#define PUSH_FSGS   \
>subq$32,%rsp
>
>#define POP_FSGS\
>addq$32,%rsp
>
>This is because fs and gs we don't care about, and fsbase/gsbase of a
>thread state can only be changed by explicit thread_set_state calls
>and not by the thread itself, so, no need to rdmsr and push it, since
>the value is already saved in the PCB slot.
>
>However, *something* goes wrong and the fsbase slot gets overwritten
>with an unrelated value (_kret_popl_ds). The real %fs_base MSR keeps
>the proper value -- until we context-switch away from the thread and
>then back to it, at which point the bogus value gets loaded into
>%fs_base and then the userland tries to use it and faults.
>
>I don't know nearly enough about x86 interrupts/traps to say, but
>could it be that we get another interrupt/trap, while presumably
>executing _kret_popl_ds, and that causes the faulting %rip to be
>pushed onto the stack, but since we're at the PCB stack at that point
>it clobbers the stored fsbase? That doesn't cause issues for all the
>other registers because we have already popped their values off and
>won't be accessing them anymore; we'll push the new values the next
>time the thread enters the kernel -- though I guess it could show up
>in thread_get_state if you do that without stopping the thread on an
>SMP kernel.
>
>cc'ing Luca -- does what I'm saying make sense? could this happen? can
>you reproduce %fs_base getting set to _kret_popl_ds?
 
yes this makes perfectly sense, I think I'm chasing the same bug currently, or 
some variation of it. With some tracing I saw that this corruption seems to 
happen when an irq (usually timer) fires when returning from a trap, although 
not necessarily at one specific point.

I still have to find exactly the reason, in my tests fsbase gets overwritten 
either with a kernel address or 0x17, which might be a segment value. Btw in 
locore.S the 64-bit-only segment handling code is doing way too much, e.g. 
es/ds and such could be ignored, I guess simplifying this part may also solve 
this issue.


Luca
Hi Sergey,



  1   2   3   >