Re: modify syscall nr on-the-fly
On Thu, Aug 23, 2007 at 04:07:41PM +0400, Yuriy Tsibizov wrote:
> > I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).
>
> Ivan, current status patches are on http://chibis.persons.gfk.ru/linux/.

What FreeBSD version are you using? If 7.x, is it with 2.4 emulation or 2.6 emulation? Do you have /compat/linux/proc and sys mounted?

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: modify syscall nr on-the-fly
Simon 'corecode' Schubert wrote:
> Ivan Voras wrote:
> > This is very interesting. Do you have a web page with progress status, blog or something similar to track your work? I'm interested in testing this when it's ready (in absence of Xen, this could be the next best thing).
>
> You might also want to look into porting the vkernel stuff from DragonFly. It shouldn't be very much work to do. Don't expect great performance though, it's mostly still a development tool.

I know of it, but does it run Linux? :)
Re: modify syscall nr on-the-fly
Ivan Voras wrote:
> > You might also want to look into porting the vkernel stuff from DragonFly. It shouldn't be very much work to do. Don't expect great performance though, it's mostly still a development tool.
>
> I know of it, but does it run Linux? :)

No, but I've heard about people wondering if they could use it for UML or kqemu. But these are probably just ideas.

cheers
  simon
Re: modify syscall nr on-the-fly
> I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).

Ivan, current status patches are on http://chibis.persons.gfk.ru/linux/.

Yuriy.
Re: User-mode Linux (Was: modify syscall nr on-the-fly)
2007/8/21, Yuriy Tsibizov [EMAIL PROTECTED]:
> 2007/8/20, Kostik Belousov [EMAIL PROTECTED]:
> > On Sat, Aug 18, 2007 at 02:01:26PM +0400, Yuriy Tsibizov wrote:
> > > I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).
> > >
> > > User-mode Linux in its start-up tests tries to modify syscall number (to be called by kernel) on-the-fly (http://fxr.watson.org/fxr/source/arch/um/os-Linux/start_up.c?v=linux-2.6). It forks a child thread that stops (using SIGSTOP), calls getpid() (that will be intercepted by parent thread using PTRACE_SYSCALL) and returns some value based on getpid() results. Main thread waits for SIGSTOP in child process and enables PTRACE_SYSCALL (I have some code that implements it. It makes some incompatible changes to PT_SYSCALL that will break FreeBSD applications, but works for Linux apps). When main thread catches SIGTRAP (generated by ptrace) it tries to modify EAX of child thread (with PTRACE_PEEKUSR and PTRACE_POKEUSR) to replace getpid syscall with getppid.
> > >
> > > Is it possible to get updated EAX (and other registers as well) in syscall(...) after ptracestop(...) in PTRACESTOP_SC(...) returns?
> > >
> > > Hope for your help, Yuriy.
> >
> > If I understand right what you want, I doubt that existing code would allow you to change syscall number in debugger process for debuggee. You shall look at the sys/i386/i386/trap.c, syscall() function [adjust as needed for other arches]. It calculates callp before doing PTRACESTOP_SC, as well as copies the syscall arguments into the kernel address space.
>
> Yes, I know this. I'm going to recalculate callp after PTRACESTOP_SC. And, there will be no need to copyin from user space -- all syscalls parameters are passed in registers (it will be used only for processes running under Linux emulation).
>
> Updated registers are available via *frame.
>
> With some hacks (some return codes needed by user-mode Linux are hardcoded into kernel) it loads:
> [...]

I'll need two more flags in p_stops to add two options:
- respect PTRACE_(OLD)SETOPTIONS PTRACE_O_TRACESYSGOOD (generate SIGTRAP | 0x80 instead of plain SIGTRAP)
- use Linux PTRACE_SYSCALL conventions (clear S_PT_SCE and S_PT_SCX in PTRACESTOP_SC)
to make it more than just a set of hacks to run a single program. PTRACE_(PEEK|POKE)USR seems to need a small rewrite too.

The patch (against -CURRENT) is available at http://chibis.persons.gfk.ru/linux/ptrace.diff
You will need to rebuild both the kernel and the linux module.

Yuriy.
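A rough illustration of the first option in the list above, for readers following along: with PTRACE_O_TRACESYSGOOD set, a syscall stop is reported as SIGTRAP | 0x80, so the tracer can tell it apart from any other SIGTRAP. This is only a user-space sketch, not code from ptrace.diff; the names model_stop_signal, model_is_syscall_stop and MODEL_SYSGOOD_BIT are made up for the example.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical name for the 0x80 marker bit mentioned in the message. */
#define MODEL_SYSGOOD_BIT 0x80

/*
 * Model of the tracee side: which stop signal is reported to the tracer.
 * Only syscall stops are tagged, and only when the tracer asked for it.
 */
static int model_stop_signal(bool is_syscall_stop, bool tracesysgood)
{
    /* The marker bit must not collide with the bits of SIGTRAP itself. */
    assert((SIGTRAP & MODEL_SYSGOOD_BIT) == 0);

    if (is_syscall_stop && tracesysgood)
        return SIGTRAP | MODEL_SYSGOOD_BIT;
    return SIGTRAP;
}

/* Model of the tracer side: was this stop caused by a syscall entry/exit? */
static bool model_is_syscall_stop(int stopsig)
{
    return stopsig == (SIGTRAP | MODEL_SYSGOOD_BIT);
}
```

A tracer would apply model_is_syscall_stop() to the stop signal it extracts from the wait status; without the option every stop looks like a plain SIGTRAP and the check is always false.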
Re: User-mode Linux (Was: modify syscall nr on-the-fly)
here is a little review of mine... just little suggestions.

> Index: i386/i386/trap.c
> ===
> RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
> retrieving revision 1.307
> diff -u -r1.307 trap.c
> --- i386/i386/trap.c 26 Jul 2007 15:32:55 - 1.307
> +++ i386/i386/trap.c 22 Aug 2007 08:53:19 -
> @@ -1004,6 +1004,32 @@
>      PTRACESTOP_SC(p, td, S_PT_SCE);
>
> +    if (__predict_false(p->p_sysent->sv_name[0]=='L')) {

please use __predict_true(p->p_sysent != &elf_linux_sysvec)

> +        if (code != frame->tf_eax) {
> +            printf("linux sysctl patched: code %d return eax %d\n", code, frame->tf_eax);
> +            /* retry */
> +            code = frame->tf_eax;
> +
> +            if (p->p_sysent->sv_prepsyscall)
> +                /*
> +                 * The prep code is MP aware.
> +                 */
> +                (*p->p_sysent->sv_prepsyscall)(frame, args, &code, &params);
> +            /* else should always be null */
> +
> +            if (p->p_sysent->sv_mask)
> +                code &= p->p_sysent->sv_mask;

the sv_mask should be removed.. dont use it in your code. its entirely pointless when dealing with Linux binaries

> +            if (code >= p->p_sysent->sv_size)
> +                callp = &p->p_sysent->sv_table[0];
> +            else
> +                callp = &p->p_sysent->sv_table[code];
> +
> +            narg = callp->sy_narg;
> +            /* retry ends */
> +        }
> +    }
> +
>      AUDIT_SYSCALL_ENTER(code, td);
>      error = (*callp->sy_call)(td, args);
>      AUDIT_SYSCALL_EXIT(error, td);
> Index: i386/linux/linux_ptrace.c
> ===
> RCS file: /home/ncvs/src/sys/i386/linux/linux_ptrace.c,v
> retrieving revision 1.17
> diff -u -r1.17 linux_ptrace.c
> --- i386/linux/linux_ptrace.c 22 Feb 2006 18:57:49 - 1.17
> +++ i386/linux/linux_ptrace.c 22 Aug 2007 09:27:01 -
> @@ -78,6 +78,7 @@
>  #define PTRACE_SETFPXREGS 19
>
>  #define PTRACE_SETOPTIONS 21
> +#define PTRACE_O_TRACESYSGOOD 0x0001
>
>  /*
>   * Linux keeps debug registers at the following
> @@ -95,6 +96,10 @@
>      return ((signum == SIGSTOP)? 0 : signum);
>  }
>
> +struct linux_pt_lreg {
> +    l_long reg[19];
> +};
> +
>  struct linux_pt_reg {
>      l_long ebx;
>      l_long ecx;
> @@ -103,17 +108,17 @@
>      l_long edi;
>      l_long ebp;
>      l_long eax;
> -    l_int xds;
> -    l_int xes;
> -    l_int xfs;
> -    l_int xgs;
> +    l_long xds;
> +    l_long xes;
> +    l_long xfs;
> +    l_long xgs;
>      l_long orig_eax;
>      l_long eip;
> -    l_int xcs;
> +    l_long xcs;
>      l_long eflags;
>      l_long esp;
> -    l_int xss;
> -};
> +    l_long xss;
> +} __packed;

why is this necessary? how does it affect amd64 linux32 emulator?

>  /*
>   * Translate i386 ptrace registers between Linux and FreeBSD formats.
> @@ -247,6 +252,7 @@
>      struct linux_pt_reg reg;
>      struct linux_pt_fpreg fpreg;
>      struct linux_pt_fpxreg fpxreg;
> +    struct linux_pt_lreg lreg;
>  } r;
>  union {
>      struct reg bsd_reg;
> @@ -429,20 +435,21 @@
>   * as necessary.
>   */
>  if (uap->addr < sizeof(struct linux_pt_reg)) {
> +    if (uap->addr == (11 << 2)) /* orig_eax */
> +        uap->addr = (6 << 2); /* eax */
> +
>      error = kern_ptrace(td, PT_GETREGS, pid, &u.bsd_reg, 0);
>      if (error != 0)
>          break;
>
>      map_regs_to_linux(&u.bsd_reg, &r.reg);
>      if (req == PTRACE_PEEKUSR) {
> -        error = copyout((char *)&r.reg + uap->addr,
> -            (void *)uap->data, sizeof(l_int));
> +        error = copyout((l_long*)&(r.lreg.reg[uap->addr>>2]),
> +            (void *)uap->data, sizeof(l_long));
>          break;
>      }
>
> -    *(l_int *)((char *)&r.reg + uap->addr) =
> -        (l_int)uap->data;
> -
> +    r.lreg.reg[uap->addr>>2] = (l_long)uap->data;
>      map_regs_from_linux(&u.bsd_reg, &r.reg);
>      error = kern_ptrace(td, PT_SETREGS, pid, &u.bsd_reg, 0);
>  }
> @@ -470,11 +477,34 @@
>      error = kern_ptrace(td, PT_SETDBREGS, pid, &u.bsd_dbreg,
Re: modify syscall nr on-the-fly
Yuriy Tsibizov wrote:
> I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).

This is very interesting. Do you have a web page with progress status, blog or something similar to track your work? I'm interested in testing this when it's ready (in absence of Xen, this could be the next best thing).
Re: modify syscall nr on-the-fly
Ivan Voras wrote:
> This is very interesting. Do you have a web page with progress status, blog or something similar to track your work? I'm interested in testing this when it's ready (in absence of Xen, this could be the next best thing).

You might also want to look into porting the vkernel stuff from DragonFly. It shouldn't be very much work to do. Don't expect great performance though, it's mostly still a development tool.

cheers
  simon
Re: User-mode Linux (Was: modify syscall nr on-the-fly)
2007/8/23, Roman Divacky [EMAIL PROTECTED]:
> here is a little review of mine... just little suggestions.
>
> > Index: i386/i386/trap.c
> > ===
> > RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
> > retrieving revision 1.307
> > diff -u -r1.307 trap.c
> > --- i386/i386/trap.c 26 Jul 2007 15:32:55 - 1.307
> > +++ i386/i386/trap.c 22 Aug 2007 08:53:19 -
> > @@ -1004,6 +1004,32 @@
> >      PTRACESTOP_SC(p, td, S_PT_SCE);
> >
> > +    if (__predict_false(p->p_sysent->sv_name[0]=='L')) {
>
> please use __predict_true(p->p_sysent != &elf_linux_sysvec)

Will it be possible to link a (GENERIC) kernel with this check? I can't find elf_linux_sysvec in my /boot/kernel/kernel...

> > +        if (code != frame->tf_eax) {
> > +            printf("linux sysctl patched: code %d return eax %d\n", code, frame->tf_eax);
> > +            /* retry */
> > +            code = frame->tf_eax;
> > +
> > +            if (p->p_sysent->sv_prepsyscall)
> > +                /*
> > +                 * The prep code is MP aware.
> > +                 */
> > +                (*p->p_sysent->sv_prepsyscall)(frame, args, &code, &params);
> > +            /* else should always be null */
> > +
> > +            if (p->p_sysent->sv_mask)
> > +                code &= p->p_sysent->sv_mask;
>
> the sv_mask should be removed.. dont use it in your code. its entirely pointless when dealing with Linux binaries

ok

> > +            if (code >= p->p_sysent->sv_size)
> > +                callp = &p->p_sysent->sv_table[0];
> > +            else
> > +                callp = &p->p_sysent->sv_table[code];
> > +
> > +            narg = callp->sy_narg;
> > +            /* retry ends */
> > +        }
> > +    }
> > +
> >      AUDIT_SYSCALL_ENTER(code, td);
> >      error = (*callp->sy_call)(td, args);
> >      AUDIT_SYSCALL_EXIT(error, td);
> > Index: i386/linux/linux_ptrace.c
> > ===
> > RCS file: /home/ncvs/src/sys/i386/linux/linux_ptrace.c,v
> > retrieving revision 1.17
> > diff -u -r1.17 linux_ptrace.c
> > --- i386/linux/linux_ptrace.c 22 Feb 2006 18:57:49 - 1.17
> > +++ i386/linux/linux_ptrace.c 22 Aug 2007 09:27:01 -
> > @@ -78,6 +78,7 @@
> >  #define PTRACE_SETFPXREGS 19
> >
> >  #define PTRACE_SETOPTIONS 21
> > +#define PTRACE_O_TRACESYSGOOD 0x0001
> >
> >  /*
> >   * Linux keeps debug registers at the following
> > @@ -95,6 +96,10 @@
> >      return ((signum == SIGSTOP)? 0 : signum);
> >  }
> >
> > +struct linux_pt_lreg {
> > +    l_long reg[19];
> > +};
> > +
> >  struct linux_pt_reg {
> >      l_long ebx;
> >      l_long ecx;
> > @@ -103,17 +108,17 @@
> >      l_long edi;
> >      l_long ebp;
> >      l_long eax;
> > -    l_int xds;
> > -    l_int xes;
> > -    l_int xfs;
> > -    l_int xgs;
> > +    l_long xds;
> > +    l_long xes;
> > +    l_long xfs;
> > +    l_long xgs;
> >      l_long orig_eax;
> >      l_long eip;
> > -    l_int xcs;
> > +    l_long xcs;
> >      l_long eflags;
> >      l_long esp;
> > -    l_int xss;
> > -};
> > +    l_long xss;
> > +} __packed;
>
> why is this necessary? how does it affect amd64 linux32 emulator?

I'll need to re-check this. It should not access segment registers.

> >  /*
> >   * Translate i386 ptrace registers between Linux and FreeBSD formats.
> > @@ -247,6 +252,7 @@
> >      struct linux_pt_reg reg;
> >      struct linux_pt_fpreg fpreg;
> >      struct linux_pt_fpxreg fpxreg;
> > +    struct linux_pt_lreg lreg;
> >  } r;
> >  union {
> >      struct reg bsd_reg;
> > @@ -429,20 +435,21 @@
> >   * as necessary.
> >   */
> >  if (uap->addr < sizeof(struct linux_pt_reg)) {
> > +    if (uap->addr == (11 << 2)) /* orig_eax */
> > +        uap->addr = (6 << 2); /* eax */
> > +
> >      error = kern_ptrace(td, PT_GETREGS, pid, &u.bsd_reg, 0);
> >      if (error != 0)
> >          break;
> >
> >      map_regs_to_linux(&u.bsd_reg, &r.reg);
> >      if (req == PTRACE_PEEKUSR) {
> > -        error = copyout((char *)&r.reg + uap->addr,
> > -            (void *)uap->data, sizeof(l_int));
> > +        error = copyout((l_long*)&(r.lreg.reg[uap->addr>>2]),
> > +            (void *)uap->data, sizeof(l_long));
> >          break;
> >      }
> >
> > -    *(l_int *)((char *)&r.reg + uap->addr) =
> > -        (l_int)uap->data;
> > -
> > +
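For anyone trying to follow the PTRACE_(PEEK|POKE)USR hunk quoted above: the patch turns a byte offset into the Linux register area into a 4-byte slot index (addr >> 2) and redirects offset 11 << 2 (orig_eax) to 6 << 2 (eax). A small user-space sketch of that offset arithmetic follows. It assumes 32-bit slots as on i386; the names model_pt_reg, model_l_long and model_user_offset_to_slot are invented for the example and are not from the patch, and the alignment and range checks are my own additions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: every slot is 32 bits wide, as l_long is on i386. */
typedef int32_t model_l_long;

/* Same field order as struct linux_pt_reg in the quoted diff. */
struct model_pt_reg {
    model_l_long ebx, ecx, edx, esi, edi, ebp, eax;
    model_l_long xds, xes, xfs, xgs;
    model_l_long orig_eax, eip, xcs, eflags, esp, xss;
};

/* The magic numbers in the patch are just these two field offsets. */
static_assert(offsetof(struct model_pt_reg, eax) == (6 << 2),
    "eax is slot 6");
static_assert(offsetof(struct model_pt_reg, orig_eax) == (11 << 2),
    "orig_eax is slot 11");

/*
 * Map a PEEKUSR/POKEUSR byte offset to a register slot index.
 * Returns -1 for offsets that are misaligned or outside the register area.
 */
static int model_user_offset_to_slot(long addr)
{
    if (addr < 0 || (addr & 3) != 0)
        return -1;
    if ((size_t)addr >= sizeof(struct model_pt_reg))
        return -1;
    /* As in the patch: accesses to orig_eax are redirected to eax. */
    if (addr == (long)offsetof(struct model_pt_reg, orig_eax))
        addr = (long)offsetof(struct model_pt_reg, eax);
    return (int)(addr >> 2);
}
```

With this layout, a tracer that pokes orig_eax to change the syscall number ends up writing slot 6, which is where the FreeBSD side keeps the value it later reads back as frame->tf_eax.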
Re: modify syscall nr on-the-fly
2007/8/20, Kostik Belousov [EMAIL PROTECTED]:
> On Sat, Aug 18, 2007 at 02:01:26PM +0400, Yuriy Tsibizov wrote:
> > I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).
> >
> > User-mode Linux in its start-up tests tries to modify syscall number (to be called by kernel) on-the-fly (http://fxr.watson.org/fxr/source/arch/um/os-Linux/start_up.c?v=linux-2.6). It forks a child thread that stops (using SIGSTOP), calls getpid() (that will be intercepted by parent thread using PTRACE_SYSCALL) and returns some value based on getpid() results. Main thread waits for SIGSTOP in child process and enables PTRACE_SYSCALL (I have some code that implements it. It makes some incompatible changes to PT_SYSCALL that will break FreeBSD applications, but works for Linux apps). When main thread catches SIGTRAP (generated by ptrace) it tries to modify EAX of child thread (with PTRACE_PEEKUSR and PTRACE_POKEUSR) to replace getpid syscall with getppid.
> >
> > Is it possible to get updated EAX (and other registers as well) in syscall(...) after ptracestop(...) in PTRACESTOP_SC(...) returns?
> >
> > Hope for your help, Yuriy.
>
> If I understand right what you want, I doubt that existing code would allow you to change syscall number in debugger process for debuggee. You shall look at the sys/i386/i386/trap.c, syscall() function [adjust as needed for other arches]. It calculates callp before doing PTRACESTOP_SC, as well as copies the syscall arguments into the kernel address space.

Yes, I know this. I'm going to recalculate callp after PTRACESTOP_SC. And, there will be no need to copyin from user space -- all syscalls parameters are passed in registers (it will be used only for processes running under Linux emulation).

I know that there is no real use for this feature for native code.

Yuriy
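To make the "recalculate callp after PTRACESTOP_SC" idea concrete, here is a small user-space model of the dispatch order being discussed. It is only a sketch under stated assumptions: all model_* names are hypothetical, the syscall numbers 1 and 2 are arbitrary model values rather than real Linux numbers, and the tracer callback merely stands in for the point where the kernel would stop for the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's trapframe and sysent entries. */
struct model_frame { int tf_eax; };
typedef int (*model_call_t)(void);
struct model_sysent { int sy_narg; model_call_t sy_call; };

static int model_nosys(void)   { return -1; }
static int model_getpid(void)  { return 1191; }
static int model_getppid(void) { return 1; }

/* Slot 0 is the fallback for out-of-range codes, like sv_table[0]. */
static const struct model_sysent model_table[] = {
    { 0, model_nosys },
    { 0, model_getpid },   /* model syscall 1 */
    { 0, model_getppid },  /* model syscall 2 */
};
#define MODEL_TABLE_SIZE (sizeof(model_table) / sizeof(model_table[0]))

static const struct model_sysent *model_lookup(int code)
{
    if (code < 0 || (size_t)code >= MODEL_TABLE_SIZE)
        return &model_table[0];
    return &model_table[code];
}

/* The tracer may rewrite registers while the tracee is stopped. */
typedef void (*model_tracer_t)(struct model_frame *);

static int model_syscall(struct model_frame *frame, model_tracer_t tracer)
{
    int code = frame->tf_eax;
    const struct model_sysent *callp = model_lookup(code);

    if (tracer != NULL)
        tracer(frame);          /* stands in for the syscall-entry stop */

    /* The proposed change: redo the lookup if the tracer changed eax. */
    if (code != frame->tf_eax) {
        code = frame->tf_eax;
        callp = model_lookup(code);
    }

    assert(callp->sy_call != NULL);
    return callp->sy_call();
}

/* Example tracer: replace the pending call with model syscall 2. */
static void model_swap_to_getppid(struct model_frame *frame)
{
    frame->tf_eax = 2;
}
```

Without the second lookup, model_syscall() would still call the handler chosen before the stop, which is the behaviour Kostik describes for the existing code.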
User-mode Linux (Was: modify syscall nr on-the-fly)
(replying to myself)

2007/8/21, Yuriy Tsibizov [EMAIL PROTECTED]:
> 2007/8/20, Kostik Belousov [EMAIL PROTECTED]:
> > On Sat, Aug 18, 2007 at 02:01:26PM +0400, Yuriy Tsibizov wrote:
> > > I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).
> > >
> > > User-mode Linux in its start-up tests tries to modify syscall number (to be called by kernel) on-the-fly (http://fxr.watson.org/fxr/source/arch/um/os-Linux/start_up.c?v=linux-2.6). It forks a child thread that stops (using SIGSTOP), calls getpid() (that will be intercepted by parent thread using PTRACE_SYSCALL) and returns some value based on getpid() results. Main thread waits for SIGSTOP in child process and enables PTRACE_SYSCALL (I have some code that implements it. It makes some incompatible changes to PT_SYSCALL that will break FreeBSD applications, but works for Linux apps). When main thread catches SIGTRAP (generated by ptrace) it tries to modify EAX of child thread (with PTRACE_PEEKUSR and PTRACE_POKEUSR) to replace getpid syscall with getppid.
> > >
> > > Is it possible to get updated EAX (and other registers as well) in syscall(...) after ptracestop(...) in PTRACESTOP_SC(...) returns?
> > >
> > > Hope for your help, Yuriy.
> >
> > If I understand right what you want, I doubt that existing code would allow you to change syscall number in debugger process for debuggee. You shall look at the sys/i386/i386/trap.c, syscall() function [adjust as needed for other arches]. It calculates callp before doing PTRACESTOP_SC, as well as copies the syscall arguments into the kernel address space.
>
> Yes, I know this. I'm going to recalculate callp after PTRACESTOP_SC. And, there will be no need to copyin from user space -- all syscalls parameters are passed in registers (it will be used only for processes running under Linux emulation).

Updated registers are available via *frame.

With some hacks (some return codes needed by user-mode Linux are hardcoded into kernel) it loads:

Core dump limits :
  soft - NONE
  hard - NONE
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...missing
Checking for tmpfs mount on /dev/shm...nothing mounted on /dev/shm
Checking PROT_EXEC mmap in /tmp/...OK
Checking for the skas3 patch in the host:
  - /proc/mm...
  - PTRACE_FAULTINFO...
  - PTRACE_LDT...UML running in SKAS0 mode
Linux version 2.6.22-rc2 ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #342 Wed May 23 11:56:49 EDT 2007
Built 1 zonelists. Total pages: 8128
Kernel command line: root=98:0
PID hash table entries: 128 (order: 7, 512 bytes)
Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
Memory: 30288k available
Mount-cache hash table entries: 512
Checking for host processor cmov support...Yes
Checking for host processor xmm support...No
openpty failed, errno = 22
openpty failed, errno = 22
aio_thread failed to initialize context, err = 38
2.6 AIO not supported on the host - reverting to 2.4 AIO
2.6 host AIO support not used - falling back to I/O thread
NET: Registered protocol family 16
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 1024 (order: 1, 8192 bytes)
TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
TCP: Hash tables configured (established 1024 bind 1024)
TCP reno registered
Checking host MADV_REMOVE support...OK
os_set_fd_async : Failed to fcntl F_SETOWN (or F_SETSIG) fd 6 to pid 1191, errno = 22
Failed to get IRQ for management console
os_set_fd_async : Failed to fcntl F_SETOWN (or F_SETSIG) fd 8 to pid 1191, errno = 22
um_request_irq failed - errno = 22
Host TLS support detected
Detected host type: i386
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
Initializing software serial port version 1
Couldn't stat root_fs : err = 2
Failed to initialize ubd device 0 :Couldn't determine size of device's file
VFS: Cannot open root device 98:0 or unknown-block(98,0)
Please append a correct root= boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(98,0)
EIP: 0033:[28093021] CPU: 0 Not tainted ESP: 003b:28068fc4 EFLAGS: 0246
    Not tainted
EAX: EBX: 04aa ECX: 0013 EDX: 04aa
ESI: 04a7 EDI: EBP: 28068fd8 DS: 003b ES: 003b
087fce64: [08069628] show_regs+0xb4/0xb9
087fce90: [08057ca8] panic_exit+0x25/0x3f
087fcea4: [08078720] notifier_call_chain+0x21/0x46
087fcec4: [080787bb] __atomic_notifier_call_chain+0x17/0x19
Re: modify syscall nr on-the-fly
On Sat, Aug 18, 2007 at 02:01:26PM +0400, Yuriy Tsibizov wrote:
> I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).
>
> User-mode Linux in its start-up tests tries to modify syscall number (to be called by kernel) on-the-fly (http://fxr.watson.org/fxr/source/arch/um/os-Linux/start_up.c?v=linux-2.6). It forks a child thread that stops (using SIGSTOP), calls getpid() (that will be intercepted by parent thread using PTRACE_SYSCALL) and returns some value based on getpid() results. Main thread waits for SIGSTOP in child process and enables PTRACE_SYSCALL (I have some code that implements it. It makes some incompatible changes to PT_SYSCALL that will break FreeBSD applications, but works for Linux apps). When main thread catches SIGTRAP (generated by ptrace) it tries to modify EAX of child thread (with PTRACE_PEEKUSR and PTRACE_POKEUSR) to replace getpid syscall with getppid.
>
> Is it possible to get updated EAX (and other registers as well) in syscall(...) after ptracestop(...) in PTRACESTOP_SC(...) returns?
>
> Hope for your help, Yuriy.

If I understand right what you want, I doubt that existing code would allow you to change syscall number in debugger process for debuggee. You shall look at the sys/i386/i386/trap.c, syscall() function [adjust as needed for other arches]. It calculates callp before doing PTRACESTOP_SC, as well as copies the syscall arguments into the kernel address space.
modify syscall nr on-the-fly
I'm trying to get user-mode Linux to run under FreeBSD Linux emulation (on i386).

User-mode Linux, in its start-up tests, tries to modify the syscall number (to be called by the kernel) on-the-fly (http://fxr.watson.org/fxr/source/arch/um/os-Linux/start_up.c?v=linux-2.6). It forks a child thread that stops (using SIGSTOP), calls getpid() (which will be intercepted by the parent thread using PTRACE_SYSCALL) and returns some value based on the getpid() result. The main thread waits for SIGSTOP in the child process and enables PTRACE_SYSCALL (I have some code that implements it. It makes some incompatible changes to PT_SYSCALL that will break FreeBSD applications, but works for Linux apps). When the main thread catches SIGTRAP (generated by ptrace) it tries to modify EAX of the child thread (with PTRACE_PEEKUSR and PTRACE_POKEUSR) to replace the getpid syscall with getppid.

Is it possible to get the updated EAX (and other registers as well) in syscall(...) after ptracestop(...) in PTRACESTOP_SC(...) returns?

Hope for your help,
Yuriy.