Re: ptrace single-stepping change breaks Wine

2005-01-29 Thread Kari Hurtta

[ Reading just long long thread (actually from
gmane.comp.emulators.wine.devel) ]

<[EMAIL PROTECTED]>
Linus Torvalds <[EMAIL PROTECTED]>:

> +
> + /*
> +  * Was the TF flag set by a debugger? If so, clear it now,
> +  * so that register information is correct.
> +  */
> + if (tsk->ptrace & PT_DTRACE) {
> + regs->eflags &= ~TF_MASK;
> + tsk->ptrace &= ~PT_DTRACE;
=
> + if (!tsk->ptrace & PT_DTRACE)
 ===
> + goto clear_TF;
> + }
>   }

Perhaps, I'm stupid.

But is there something strange on that test of tsk->ptrace variable?

Before that PT_DTRACE was cleared from that same tsk->ptrace variable.

/ Kari Hurtta



Re: ptrace single-stepping change breaks Wine

2005-01-05 Thread Alexandre Julliard
Mike Hearn <[EMAIL PROTECTED]> writes:

> - Another possibility would be to create a new mmap API that lets
>   us ask for exactly what we want, instead of second-guessing the
>   kernel. I don't know exactly what sort of an API Alexandre has in
>   mind here, perhaps he could describe it.

Probably the easiest would be to have a way for an app to specify the
mmap range it wants. So instead of having the kernel try to guess from
brk and stack ulimit, both of which are meaningless for Win32 apps, we
could set the range from "end of win32 exe" to 0x7ff. This would
also avoid the need to preallocate everything above 0x8000 that we
currently do and that plays havoc with address space limits.

-- 
Alexandre Julliard
[EMAIL PROTECTED]



Re: ptrace single-stepping change breaks Wine

2005-01-05 Thread Ingo Molnar

* Thomas Sailer <[EMAIL PROTECTED]> wrote:

> > I'm afraid Alexandre has decided not to apply this patch (the ABI
> > personality syscall). His reasoning is as follows:
> 
> Quite understandably.

another workaround to switch off flex-mmap is to set the stack ulimit to
'unlimited':

 saturn:~> cat /proc/self/maps | tail -7
 b7db3000-b7db4000 r--p 00cfd000 03:41 3735973/usr/lib/locale/locale-archive
 b7db4000-b7de1000 r--p 00ccc000 03:41 3735973/usr/lib/locale/locale-archive
 b7de1000-b7de7000 r--p 00cc3000 03:41 3735973/usr/lib/locale/locale-archive
 b7de7000-b7fe7000 r--p  03:41 3735973/usr/lib/locale/locale-archive
 b7fe7000-b7fe8000 rw-p b7fe7000 00:00 0
 bff2c000-c000 rw-p bff2c000 00:00 0
 e000-f000 ---p  00:00 0

 saturn:~> ulimit -s unlimited

 saturn:~> cat /proc/self/maps | tail -7
 42188000-4218a000 rw-p 00014000 03:41 3433982/lib/ld-2.3.3.so
 4218c000-422aa000 r-xp  03:41 3434006/lib/tls/libc-2.3.3.so
 422aa000-422ac000 r--p 0011d000 03:41 3434006/lib/tls/libc-2.3.3.so
 422ac000-422ae000 rw-p 0011f000 03:41 3434006/lib/tls/libc-2.3.3.so
 422ae000-422b rw-p 422ae000 00:00 0
 bfea-c000 rw-p bfea 00:00 0
 e000-f000 ---p  00:00 0

e.g. SuSE defaults to an unlimited stack so flex-mmap is effectively
disabled there.

To set the VM to legacy, for all apps, set /proc/sys/vm/legacy_va_layout
to 1.

Ingo



Re: ptrace single-stepping change breaks Wine

2005-01-05 Thread Thomas Sailer
On Tue, 2005-01-04 at 21:15 +, Mike Hearn wrote:
> Context: this is not about ptrace stuff, but rather Thomas Sailors

s/Sailor/Sailer/

> I'm afraid Alexandre has decided not to apply this patch (the ABI
> personality syscall). His reasoning is as follows:

Quite understandably.

> Could you upload a +relay,+tid,+seh,+msgbox trace somewhere please? Of
> course if you could investigate it yourself that'd be the best thing.

http://www.baycom.org/~tom/wine/wine.xst.broken.relay.tid.seh.msgbox.gz
http://www.baycom.org/~tom/wine/wine.xst.working.relay.tid.seh.msgbox.gz

I used 2.6.10-ac1 and wine-20041201-1fc3winehq. The second log (which is
huge!, ~250MBytes compressed, compression ratio roughly 1:100) is with
setarch i386 -L.

Thanks,
Tom





Re: ptrace single-stepping change breaks Wine

2005-01-04 Thread Andrew Morton
Mike Hearn <[EMAIL PROTECTED]> wrote:
>
> Also a precise description of what flex-mmap does would be good. Google
>  wasn't too informative, best I could get was that it means mmap
>  allocates from the "top" of the address space down. But where is the top
>  exactly?

Ingo has described it thus:


before:

  0x0800 ... binary code
  0x08xx ... brk area
  0x4000 ... start of mmap, new mmaps go after old ones
  0xbfxx ... stack

after:

  0x0800 ... binary code
  0x08xx ... brk area
  0xbfxx ... _end_ of all mmaps, new mmaps go below old ones
  0xbfyy ... stack

the 'after' layout guarantees that brk area (malloc()) can grow
unlimited and mmap() can grow unlimited - they will meet somewhere
inbetween when almost all of the VM is used up. [the 'top' of the mmaps
in the 'after' layout is constrained by the stack ulimit - the stack
must still fit and we never allocate into the stack's yet unallocated
and growable hole.]

with the 'before' layout we've got 900 MB for brk() and 1.9GB for
mmaps() - a rigid limit.



Re: ptrace single-stepping change breaks Wine

2005-01-04 Thread Mike Hearn
Context: this is not about ptrace stuff, but rather Thomas Sailors
original problem which seems to be flex-mmap related.

On Fri, 2004-12-31 at 16:51 +0100, Thomas Sailer wrote:
> On Freitag 31 Dezember 2004 14.31, Mike Hearn wrote:
> 
> > What about this patch?
> 
> This works now. Happy new year...

I'm afraid Alexandre has decided not to apply this patch (the ABI
personality syscall). His reasoning is as follows:

- There are a million kernel patches out there which use the ABI 
  personality system to break stuff, which means that realistically
  we have no idea what it does

- Therefore, using it would be a big support problem as it may
  even switch off stuff (new features etc) we want, so this system
  doesn't really scale

- It'd be a lot better to provide a "disable this mmap
  feature/behaviour" flag for each thing that changes it, or *even*
  better to make it opt-in as having to change the sources to make
  old apps work again kind of defeats the point of backwards
  compatibility (ie, old apps should continue to work as-is)

- Another possibility would be to create a new mmap API that lets
  us ask for exactly what we want, instead of second-guessing the
  kernel. I don't know exactly what sort of an API Alexandre has in
  mind here, perhaps he could describe it.

- So rather than apply the patch, we have to figure out why 
  flex-mmap is breaking this program and add yet more magic VM code
  to stop it from happening :(

Could you upload a +relay,+tid,+seh,+msgbox trace somewhere please? Of
course if you could investigate it yourself that'd be the best thing.

I wasn't sure whether to CC some kernel people or not, but it seems from
previous threads that the ABI personality syscall system is meant for
people like us who are trying to keep "legacy" binaries working.
Unfortunately this sort of unbreak-me-please flag isn't really what we
need/want from the kernel.

Also a precise description of what flex-mmap does would be good. Google
wasn't too informative, best I could get was that it means mmap
allocates from the "top" of the address space down. But where is the top
exactly?

thanks -mike




Re: ptrace single-stepping change breaks Wine

2005-01-01 Thread Daniel Jacobowitz
On Fri, Dec 31, 2004 at 09:19:48AM -0800, Linus Torvalds wrote:
> 
> 
> On Fri, 31 Dec 2004, Daniel Jacobowitz wrote:
> > 
> > Lots, I like it.  The syscall trap will always be delivered before the
> > single-step trap, right, because signal delivery won't run until we
> > return to userspace?
> 
> Yes. Although I've not actually tested it.
> 
> Before, it used to show up as one event, and basically the "0x80" marker 
> got lost, so effectively the "system call exit" part would have got lost. 
> Now, it _may_ DTRT, with the caveat that the system call ptrace_notify() 
> thing still has the same problem with restarting-with-a-signal.
> 
> That's basically a "don't do that then", and is the status quo, of course,
> so this is at least not a regression. It's still pretty ugly, but 
> apparently nobody really cares ;)

Yes.  At some point, I'd like to make that an error - if you want to
restart with a signal, don't do it from the notification points where
it doesn't make sense (exit, fork, vfork-done, syscall).  Send a signal
by hand, and then resume, and if you want to fudge the siginfo you can
do that when stopped in the signal delivery path.

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Jesse Allen
On Fri, 31 Dec 2004 09:30:04 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
> 
> 
> On Fri, 31 Dec 2004, Davide Libenzi wrote:
> >
> > I don't think that the Wine problem resolution is due to the POPF
> > instruction handling. Basically Linus patch does a nice cleanup plus POPF
> > handling, so maybe the patch can be split.
> 
> The popf part is very nice in that it allows you to single-step and debug
> this thing.
> 
> The fact is, I can't debug Wine. The code-base is just too alien for me,
> so to debug this thing I needed to come up with a silly example of TF
> usage, and try to debug _that_ instead as if it were something unknown (ie
> debugging by knowing what the program does is a no-no, since that would
> have defeated the whole exercise).
> 
> And handlign "popf" correctly really was the only sane way to debug it,
> anything else would never have worked in a real-life debugging situation.
> It's easy enough to say "well, just do it by hand", but that's not
> practical when you debug with "si 1000" to try to pinpoint the behaviour a
> bit.
> 
> And clearly my debuggability exercise seem to have worked, since the end
> result apparently ended up doing the right thing for Wine.


I honestly think there may have been no other way.  There was a reason
why I was very vague at first.  I could not pin-point an exact
location of failure.  Just grepping a good log shows over 500,000
calls to RaiseException.  trap_handler ran over 300,000 times and as
many TF clears.  My brother told me to set watches or break points to
see if I could find something wrong, but I simply told him there is
too much going on.  The only thing I could think that can be done is
to mentally find debugging scenarios and hope to find where it could
be breaking.  But the only people that could have thought of a
scenario are people that know linux well enough what the kernel is
doing in the first place.



> 
> This is why debuggability is important. I realize that people may think
> I'm inconsistent (since I abhor kernel debuggers), but while _I_ abhor
> debuggers, I also think that the primary objective of an operating system
> is to make easy things easy, and hard things possible, so while I don't
> much like debuggers, I consider it a fundamental failure if the kernel
> doesn't have proper support for them.


I think that copy protection routines are particually nasty.  They
purposely make debugging hard (again see above, no sane program would
be like that).   And the program's reason for failing is not the
reason at all -- "please insert disc" -- the disc is already in there!
 So they don't say the real reason why it failed, leaving the user
totally hopelessly lost on what they should do.  It's really hard to
file a bug report on that alone!  Had I not placed my guess on ptrace
early on, this problem may have gone undiscovered for a long time.  I
have checked transgaming and they seem to be not aware about this, but
it would have been a rude awakening for them when they find that when
most distros had updated to 2.6.9, that all the SecuRom protected
games silently broke, and they would have had a heck of a time
debugging them to find the reason, I'm sure, even with the specs on
it!  Though who knows if cedega is affected because their code-base is
diverging, and I'm sure their copy protection support is very
different.


> 
> So I think it's worth splitting out the "popf" part of the patch, but even
> if that one doesn't actually matter for Warcraft, I'd put it in just so
> that we have a state where we're _supposed_ to be able to debug things
> with TF in them. Just having the mental expectation that things like that
> should work is important - otherwise we'll eventually end up having some
> other subtle problem with TF handling.
> 
> Linus
> 
> 


Sure.  I've tried the game and it doesn't require is_at_popf().  I
thought it did because I thought it was required for handling TF
correctly, but maybe this is another scenario.  I think it's a good
idea to keep it in too.  I actually don't care about debugging the
copy protection, I just want correct behavoir.  If doing all this has
pointed out these other issues too, then it's all well worth it,
because next time it will be /easier/ and we won't have subtle
problems.

Jesse



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Fri, 31 Dec 2004, Davide Libenzi wrote:
> 
> I don't think that the Wine problem resolution is due to the POPF 
> instruction handling. Basically Linus patch does a nice cleanup plus POPF 
> handling, so maybe the patch can be split.

The popf part is very nice in that it allows you to single-step and debug 
this thing.

The fact is, I can't debug Wine. The code-base is just too alien for me, 
so to debug this thing I needed to come up with a silly example of TF 
usage, and try to debug _that_ instead as if it were something unknown (ie 
debugging by knowing what the program does is a no-no, since that would 
have defeated the whole exercise).

And handlign "popf" correctly really was the only sane way to debug it, 
anything else would never have worked in a real-life debugging situation. 
It's easy enough to say "well, just do it by hand", but that's not 
practical when you debug with "si 1000" to try to pinpoint the behaviour a 
bit.

And clearly my debuggability exercise seem to have worked, since the end
result apparently ended up doing the right thing for Wine.

This is why debuggability is important. I realize that people may think 
I'm inconsistent (since I abhor kernel debuggers), but while _I_ abhor 
debuggers, I also think that the primary objective of an operating system 
is to make easy things easy, and hard things possible, so while I don't 
much like debuggers, I consider it a fundamental failure if the kernel 
doesn't have proper support for them.

So I think it's worth splitting out the "popf" part of the patch, but even
if that one doesn't actually matter for Warcraft, I'd put it in just so 
that we have a state where we're _supposed_ to be able to debug things 
with TF in them. Just having the mental expectation that things like that 
should work is important - otherwise we'll eventually end up having some 
other subtle problem with TF handling.

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Fri, 31 Dec 2004, Daniel Jacobowitz wrote:
> 
> Lots, I like it.  The syscall trap will always be delivered before the
> single-step trap, right, because signal delivery won't run until we
> return to userspace?

Yes. Although I've not actually tested it.

Before, it used to show up as one event, and basically the "0x80" marker 
got lost, so effectively the "system call exit" part would have got lost. 
Now, it _may_ DTRT, with the caveat that the system call ptrace_notify() 
thing still has the same problem with restarting-with-a-signal.

That's basically a "don't do that then", and is the status quo, of course,
so this is at least not a regression. It's still pretty ugly, but 
apparently nobody really cares ;)

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Fri, 31 Dec 2004, Andi Kleen wrote:
> 
> >  - you couldn't even debug signal handlers, because they were _really_ 
> >hard to get into unless you knew where they were and put a breakpoint 
> >on them.
> 
> Ok I see this as being a problem. But I bet it could be fixed
> much simpler without doing all this complicated and likely-to-be-buggy
> popf parsing you added.

And that is _exactly_ what we did.

Guess what broke Wine?

Uhhuh. 

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Thomas Sailer
On Freitag 31 Dezember 2004 14.31, Mike Hearn wrote:

> What about this patch?

This works now. Happy new year...

Tom



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Davide Libenzi
On Fri, 31 Dec 2004, Andi Kleen wrote:

> >  - you couldn't even debug signal handlers, because they were _really_ 
> >hard to get into unless you knew where they were and put a breakpoint 
> >on them.
> 
> Ok I see this as being a problem. But I bet it could be fixed
> much simpler without doing all this complicated and likely-to-be-buggy
> popf parsing you added.

I don't think that the Wine problem resolution is due to the POPF 
instruction handling. Basically Linus patch does a nice cleanup plus POPF 
handling, so maybe the patch can be split.



> >  - you couldn't see the instruction after a system call.
> 
> Are you sure? 

Yes, this was true with 2.4. Than it has been fixed some time ago. But 
handling that revealed a fragile handling of ptrace event delivery we had 
in do_syscall_trace(). Part of the Linus patch tries to solve the fact 
that on one side strace wants things to happen in a certain way, way that 
seem to break Wine. I think Linus cleanup of the ptrace event delivery can 
get strace, Wine and single-step-after-syscall right, w/out POPF handling. 
Then you guys can flame each other over the POPF handling ;)



- Davide




Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Daniel Jacobowitz
On Thu, Dec 30, 2004 at 09:47:42PM -0800, Linus Torvalds wrote:
> So I looked at just sharing the code with the debug trap handler, and the
> result is appended. strace works, as does all the TF tests I've thrown at
> it, and the code actually looks better anyway (the old do_debug code looks
> like it got the EIP wrong in VM86 mode, for example, this just cleans 
> that up too). Just use a common "send_sigtrap()" routine.
> 
> Does this look saner?

Lots, I like it.  The syscall trap will always be delivered before the
single-step trap, right, because signal delivery won't run until we
return to userspace?

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Mike Hearn
On Fri, 2004-12-31 at 14:13 +0100, Thomas Sailer wrote:
> No this doesn't work. The decision which address space layout to use is done 
> in arch/i386/mm/mmap.c:arch_pick_mmap_layout, and this function is called by 
> the elf loader in fs/binfmt_elf.c:load_elf_binary, i.e. the decision which 
> address space layout to use for the current wine process is already done long 
> time before your personality syscall takes effect.
> 
> I hoped there was some ELF section magic to turn this off (like execshield), 
> but there doesn't seem to be.

So it has to be done before an exec then? I had assumed changing the
personality would affect all mmaps from that point onwards, too bad.

We do some execs as part of the Wine startup code so it shouldn't be too
much of an issue. Anyway, all the setarch program does is do this
syscall then exec the program so it shouldn't be too hard to do the
equivalent in Wine.

What about this patch?

--- libs/wine/config.c  (revision 14)
+++ libs/wine/config.c  (local)
@@ -35,6 +35,10 @@
 #endif
 #include "wine/library.h"
 
+#ifdef HAVE_SYSCALL_H
+#include 
+#endif
+
 static const char server_config_dir[] = "/.wine";/* config dir 
relative to $HOME */
 static const char server_root_prefix[] = "/tmp/.wine-";  /* prefix for server 
root dir */
 static const char server_dir_prefix[] = "/server-";  /* prefix for server 
dir */
@@ -289,8 +293,13 @@ static void preloader_exec( char **argv,
 new_argv = xmalloc( (last_arg - argv + 2) * sizeof(*argv) );
 memcpy( new_argv + 1, argv, (last_arg - argv + 1) * sizeof(*argv) );
 new_argv[0] = full_name;
+
+/* set the abi personality */
+syscall(136, 0x8 /* PER_LINUX32 */ | 0x020 /* ADDR_COMPAT_LAYOUT 
*/);
+
 if (envp) execve( full_name, new_argv, envp );
 else execv( full_name, new_argv );
+
 free( new_argv );
 free( full_name );
 return;





Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Fri, 31 Dec 2004, Daniel Jacobowitz wrote:
> 
> Well, you put SIGTRAP|0x80 in si_code.  Coincidentally, 0x80 is
> SI_KERNEL.  So testing for SI_KERNEL | 0x80 is probably OK in the
> signal path, since most of its other arbitrary values would be either
> negative or not include SI_KERNEL.

Somebody would need to validate that no user can fool the kernel into 
doing something stupid...

That said, I think I solved the problem a different way: the 
TIF_SINGLESTEP code really fundamentally doesn't want to have _any_ of 
that SIGTRAP | 0x80 nonsense in its path, it actually wants to look like a 
debug trap - which sets si_code to TRAP_BRKPT. Which makes more sense 
anyway.

So I looked at just sharing the code with the debug trap handler, and the
result is appended. strace works, as does all the TF tests I've thrown at
it, and the code actually looks better anyway (the old do_debug code looks
like it got the EIP wrong in VM86 mode, for example, this just cleans 
that up too). Just use a common "send_sigtrap()" routine.

Does this look saner?

Linus


= include/asm-i386/ptrace.h 1.7 vs edited =
--- 1.7/include/asm-i386/ptrace.h   2004-09-03 16:55:27 -07:00
+++ edited/include/asm-i386/ptrace.h2004-12-30 21:27:39 -08:00
@@ -55,6 +55,7 @@
 #define PTRACE_SET_THREAD_AREA26
 
 #ifdef __KERNEL__
+extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs, int 
error_code);
 #define user_mode(regs) ((VM_MASK & (regs)->eflags) || (3 & (regs)->xcs))
 #define instruction_pointer(regs) ((regs)->eip)
 #if defined(CONFIG_SMP) && defined(CONFIG_FRAME_POINTER)
= arch/i386/kernel/signal.c 1.49 vs edited =
--- 1.49/arch/i386/kernel/signal.c  2004-11-22 09:47:16 -08:00
+++ edited/arch/i386/kernel/signal.c2004-12-30 11:40:35 -08:00
@@ -270,7 +270,6 @@
 struct pt_regs *regs, unsigned long mask)
 {
int tmp, err = 0;
-   unsigned long eflags;
 
tmp = 0;
__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
@@ -292,16 +291,7 @@
err |= __put_user(current->thread.error_code, &sc->err);
err |= __put_user(regs->eip, &sc->eip);
err |= __put_user(regs->xcs, (unsigned int __user *)&sc->cs);
-
-   /*
-* Iff TF was set because the program is being single-stepped by a
-* debugger, don't save that information on the signal stack.. We
-* don't want debugging to change state.
-*/
-   eflags = regs->eflags;
-   if (current->ptrace & PT_DTRACE)
-   eflags &= ~TF_MASK;
-   err |= __put_user(eflags, &sc->eflags);
+   err |= __put_user(regs->eflags, &sc->eflags);
err |= __put_user(regs->esp, &sc->esp_at_signal);
err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);
 
@@ -424,11 +414,9 @@
 * The tracer may want to single-step inside the
 * handler too.
 */
-   if (regs->eflags & TF_MASK) {
-   regs->eflags &= ~TF_MASK;
-   if (current->ptrace & PT_DTRACE)
-   ptrace_notify(SIGTRAP);
-   }
+   regs->eflags &= ~TF_MASK;
+   if (test_thread_flag(TIF_SINGLESTEP))
+   ptrace_notify(SIGTRAP);
 
 #if DEBUG_SIG
printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
@@ -519,11 +507,9 @@
 * The tracer may want to single-step inside the
 * handler too.
 */
-   if (regs->eflags & TF_MASK) {
-   regs->eflags &= ~TF_MASK;
-   if (current->ptrace & PT_DTRACE)
-   ptrace_notify(SIGTRAP);
-   }
+   regs->eflags &= ~TF_MASK;
+   if (test_thread_flag(TIF_SINGLESTEP))
+   ptrace_notify(SIGTRAP);
 
 #if DEBUG_SIG
printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
= arch/i386/kernel/traps.c 1.92 vs edited =
--- 1.92/arch/i386/kernel/traps.c   2004-12-28 11:07:48 -08:00
+++ edited/arch/i386/kernel/traps.c 2004-12-30 21:26:14 -08:00
@@ -718,23 +718,21 @@
 */
if ((regs->xcs & 3) == 0)
goto clear_TF_reenable;
-   if ((tsk->ptrace & (PT_DTRACE|PT_PTRACED)) == PT_DTRACE)
-   goto clear_TF;
+
+   /*
+* Was the TF flag set by a debugger? If so, clear it now,
+* so that register information is correct.
+*/
+   if (tsk->ptrace & PT_DTRACE) {
+   regs->eflags &= ~TF_MASK;
+   tsk->ptrace &= ~PT_DTRACE;
+   if (!tsk->ptrace & PT_DTRACE)
+   goto clear_TF;
+   }
}
 
/* Ok, finally something we can handle */
-   tsk->thread.trap_no = 1;
-   tsk->thread.error_code = error_code;
-   info.si_signo = SIGTRAP;
-   info.si_errno = 0;
-   info.si_code = TRAP_BRKPT;
-   
-   /* If this is a kernel mode trap, save the user PC on entry to 
-* the kernel, that

Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Thomas Sailer
On Donnerstag 30 Dezember 2004 14.06, you wrote:

> Tom, does this patch against Wine help? It should do the same thing as
> the setarch program, so if that fixes it then this should also (if I've
> understood how this mechanism works of course).

No this doesn't work. The decision which address space layout to use is done 
in arch/i386/mm/mmap.c:arch_pick_mmap_layout, and this function is called by 
the elf loader in fs/binfmt_elf.c:load_elf_binary, i.e. the decision which 
address space layout to use for the current wine process is already done long 
time before your personality syscall takes effect.

I hoped there was some ELF section magic to turn this off (like execshield), 
but there doesn't seem to be.

Tom



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Daniel Jacobowitz
On Thu, Dec 30, 2004 at 09:05:01PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 30 Dec 2004, Jesse Allen wrote:
> > 
> > Well I tried this patch and it works. 
> 
> Goodie. Are there other known problems with silly copy-protection 
> schemes?  It migth be worth testing.
> 
> However:
> 
> > Since I cannot spot any issue, the patch looks good.  Are there any
> > other test cases?
> 
> Yes. It seems I broke "strace" with it. Probably the difference in system
> call trace reporting that Dan Jacobowitz already pointed out.
> 
> Now, that should be easily handled by just separating out the cases of 
> system call tracing and debug trap handling, and using the old silly code 
> for system calls. I'd prefer a cleaner approach, but that seems to be the 
> sane thing to do for now.

Strace doesn't use PTRACE_SETOPTIONS as far as I can tell... so it must
be something different.

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Andi Kleen
On Thu, Dec 30, 2004 at 04:38:21PM -0800, Linus Torvalds wrote:
> > Can someone repeat again what was wrong with the old ptrace
> > semantics before the initial change that caused all these complex
> > changes?  It seemed to work well for years. How about we just
> > go back to the old state, revert all the recent ptrace changes 
> > and skip all that?
> 
> Let me count the ways that were wrong before the changes:
>  - you couldn't debug any code that set TF. Really. ptrace would totally 
>destroy the TF state in the controlled process, so it would do 
>something totally different when debugged.

Well, tough. I assume people who play with TF themselves know
how to handle it without debuggers.  Really adding instruction
parsing for such a corner case seems like extreme overkill to me.

I agree it is not nice, but is it really worth all that complexity
to hide it?

>  - you couldn't even debug signal handlers, because they were _really_ 
>hard to get into unless you knew where they were and put a breakpoint 
>on them.

Ok I see this as being a problem. But I bet it could be fixed
much simpler without doing all this complicated and likely-to-be-buggy
popf parsing you added.

>  - you couldn't see the instruction after a system call.

Are you sure? 

>  - ptrace returned bogus TF state after a single-step

I am sure all debuggers in existence deal with that (and they will
need to continue doing so because there are so many old kernels around) 

> descriptors, and we actually do that (badly) in another place: the AMD
> "prefetch" check does exactly the same thing except it seems to get a few
> details wrong (looks like it cannot handle 16-bit code), and only works
> for the current process.

Yes, it was intentional to simplify it. 16bit code is not supposed
to use prefetches (and even if they do the probability of faulting
is pretty small) Also we needed several iterations
to get all the subtle bugs out (and I bet there are some issues left)

-Andi



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Fri, 31 Dec 2004, Andi Kleen wrote:
> 
> Just looking at all this complexiy and thinking about
> making it work on x86-64 too doesn't exactly give a good
> feeling in my spine.
> 
> Not to belittle your archivement Linus but it all looks
> very overengineered to me.

Ehh, do you have any _alternatives_?

> I think such complex instruction emulation games will be 
> hard to maintain and there are very surely bugs in so 
> much subtle code. 

There is no complexity anywhere, and we don't actually emulate any 
instructions at all. The only thing we do is to check _whether_ the 
instruction is a "popf" - we let the CPU do all the work, we just say "ok, 
the instruction will set TF, so we should not touch it afterwards.

> Can someone repeat again what was wrong with the old ptrace
> semantics before the initial change that caused all these complex
> changes?  It seemed to work well for years. How about we just
> go back to the old state, revert all the recent ptrace changes 
> and skip all that?

Let me count the ways that were wrong before the changes:
 - you couldn't debug any code that set TF. Really. ptrace would totally 
   destroy the TF state in the controlled process, so it would do 
   something totally different when debugged.
 - you couldn't even debug signal handlers, because they were _really_ 
   hard to get into unless you knew where they were and put a breakpoint 
   on them.
 - you couldn't see the instruction after a system call.
 - ptrace returned bogus TF state after a single-step

> I would love to skip this all on x86-64, but I would prefer
> to not make the behaviour incompatible to i386.

I suspect all the code can be shared. In fact, the change to send a
SIGTRAP directly rather than play around with "ptrace_notify()" etc is
likely totally architecture-independent apart from the calling convention
magic, so all of "do_syscall_trace()" could probably be moved into
kernel/ptrace.c.

The _only_ real complexity is actually following the silly LDT
descriptors, and we actually do that (badly) in another place: the AMD
"prefetch" check does exactly the same thing except it seems to get a few
details wrong (looks like it cannot handle 16-bit code), and only works
for the current process.

I assume you have that same prefetch thing on x86-64 already, so if
anything, you could look at my replacement and see if it would be workable
to do the prefetch thing too..

IOW, none of the issues involved are new. 

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Daniel Jacobowitz
On Thu, Dec 30, 2004 at 03:17:01PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 30 Dec 2004, Daniel Jacobowitz wrote:
> > 
> > does not look right to me.  Before, we'd get an 0x80|SIGTRAP result
> > from wait.  Now, you've moved the 0x80 to live only inside the siginfo.
> > This is accessible to the debugger via ptrace, but only very recently
> > (late 2.5.x).  So this will probably break users of PT_TRACESYSGOOD.
> 
> I don't see how to easily emulate the old behaviour 100% - see
> ptrace_notify. We always sent the signal SIGTRAP to the process, and then
> set "exit_code" tp have the 0x80 marker by calling ptrace_stop() by hand.
> However, if we make it a real signal (which we need to do to get the 
> continue semantics right), we no longer have that out-of-band info 
> available to us.
> 
> We could make "get_signal_to_deliver()" pass in some other exit_code to 
> ptrace_stop() than just signr. It would have to pick it up from the 
> siginfo structure, but then we'd have to make sure that _other_ signal 
> users do that properly..
> 
> Patches/suggestions welcome.

Well, you put SIGTRAP|0x80 in si_code.  Coincidentally, 0x80 is
SI_KERNEL.  So testing for SI_KERNEL | 0x80 is probably OK in the
signal path, since most of its other arbitrary values would be either
negative or not include SI_KERNEL.

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Thu, 30 Dec 2004, Daniel Jacobowitz wrote:
> 
> does not look right to me.  Before, we'd get an 0x80|SIGTRAP result
> from wait.  Now, you've moved the 0x80 to live only inside the siginfo.
> This is accessible to the debugger via ptrace, but only very recently
> (late 2.5.x).  So this will probably break users of PT_TRACESYSGOOD.

I don't see how to easily emulate the old behaviour 100% - see
ptrace_notify. We always sent the signal SIGTRAP to the process, and then
set "exit_code" tp have the 0x80 marker by calling ptrace_stop() by hand.
However, if we make it a real signal (which we need to do to get the 
continue semantics right), we no longer have that out-of-band info 
available to us.

We could make "get_signal_to_deliver()" pass in some other exit_code to 
ptrace_stop() than just signr. It would have to pick it up from the 
siginfo structure, but then we'd have to make sure that _other_ signal 
users do that properly..

Patches/suggestions welcome.

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Andi Kleen
Linus Torvalds <[EMAIL PROTECTED]> writes:

> It's a bit more involved than I'd like, since especially the "popf" case 
> just is pretty complex, but I'd love to hear whether it works.
>
> NOTE NOTE NOTE! I've tested it, but only on one small test-case, so it 
> might be totally broken in many ways. I'd love to have people who are x86 
> and ptrace-aware give this a second look, and I'm confident Jesse will 
> find that it won't work, but can hopefully try to debug it a bit with 
> this..

Just looking at all this complexiy and thinking about
making it work on x86-64 too doesn't exactly give a good
feeling in my spine.

Not to belittle your archivement Linus but it all looks
very overengineered to me.

I think such complex instruction emulation games will be 
hard to maintain and there are very surely bugs in so 
much subtle code. 

Can someone repeat again what was wrong with the old ptrace
semantics before the initial change that caused all these complex
changes?  It seemed to work well for years. How about we just
go back to the old state, revert all the recent ptrace changes 
and skip all that?

e.g. reporting TF after popf in ptrace doesnt really seem like a big
issue to me that is worth fixing with that much code.  It is more an
unimportant corner case, isnt it? Same thing with forcing TF after
popf.  I bet most debuggers in existence get this special case wrong
and so far the world hasn't collapsed because of that.

I would love to skip this all on x86-64, but I would prefer
to not make the behaviour incompatible to i386.

-Andi




Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Thu, 30 Dec 2004, Jesse Allen wrote:
> 
> Well I tried this patch and it works. 

Goodie. Are there other known problems with silly copy-protection 
schemes?  It migth be worth testing.

However:

> Since I cannot spot any issue, the patch looks good.  Are there any
> other test cases?

Yes. It seems I broke "strace" with it. Probably the difference in system
call trace reporting that Dan Jacobowitz already pointed out.

Now, that should be easily handled by just separating out the cases of 
system call tracing and debug trap handling, and using the old silly code 
for system calls. I'd prefer a cleaner approach, but that seems to be the 
sane thing to do for now.

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Daniel Jacobowitz
i386 architecture details are really not my thing, so I'm going to
trust you on most of this, but this bit:

On Thu, Dec 30, 2004 at 02:46:17PM -0800, Linus Torvalds wrote:
>   /* the 0x80 provides a way for the tracing parent to distinguish
>  between a syscall stop and SIGTRAP delivery */
> - ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
> -  !test_thread_flag(TIF_SINGLESTEP) ? 0x80 : 0));
> + info.si_code = SIGTRAP;
> + if ((current->ptrace & PT_TRACESYSGOOD) && 
> !test_thread_flag(TIF_SINGLESTEP))
> + info.si_code = SIGTRAP | 0x80;
> + info.si_pid = current->tgid;
> + info.si_uid = current->uid;
>  
> - /*
> -  * this isn't the same as continuing with a signal, but it will do
> -  * for normal use.  strace only continues with a signal if the
> -  * stopping signal is not SIGTRAP.  -brl
> -  */
> - if (current->exit_code) {
> - send_sig(current->exit_code, current, 1);
> - current->exit_code = 0;
> - }
> + /* Send us the fakey SIGTRAP */
> + send_sig_info(SIGTRAP, &info, current);
>  }

does not look right to me.  Before, we'd get an 0x80|SIGTRAP result
from wait.  Now, you've moved the 0x80 to live only inside the siginfo.
This is accessible to the debugger via ptrace, but only very recently
(late 2.5.x).  So this will probably break users of PT_TRACESYSGOOD.

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-12-31 Thread Linus Torvalds


On Thu, 30 Dec 2004, Linus Torvalds wrote:
> 
> Working on a patch for this right now, I'll send something out soonish 
> (and I'll test it on my test-case before sending it, so that it at least 
> has some chance of working).

Ok, here's a patch that may or may not make Wine happier. It's a _lot_ 
more careful about TF handling, and in particular it's trying really 
really hard to make sure that a controlling process does not change the 
trap flag as it is modified or used by the process.

This hopefully fixes:

 - single-step over "popf" should work - we won't clear TF after the popf, 
   but instead let the popf results remain. 

   NOTE! I tried to make sure that it does the right thing for segments 
   with non-zero bases, but I never actually tested that code. It's fairly 
   obvious, but it's also fairly likely to have some silly problems. Wine
   may well show effects of this, although I don't know how common 
   non-zero bases are with any kind of half-modern windows binaries.

 - ptrace reporting of "eflags" register after a single-step (we used to 
   report TF as being set because the debugger set it - even though the
   "native state" without debugging had it cleared).

   This also hopefully means that all the conditional TF clearing games
   etc aren't necessary, since arch/i386/kernel/traps.c should now be 
   taking care of hiding the debugger for most cases ("pushf" still 
   remains, and is hard. See comment in ptrace.c part of the patch)

It's a bit more involved than I'd like, since especially the "popf" case 
just is pretty complex, but I'd love to hear whether it works.

NOTE NOTE NOTE! I've tested it, but only on one small test-case, so it 
might be totally broken in many ways. I'd love to have people who are x86 
and ptrace-aware give this a second look, and I'm confident Jesse will 
find that it won't work, but can hopefully try to debug it a bit with 
this..

Linus

-
= arch/i386/kernel/signal.c 1.49 vs edited =
--- 1.49/arch/i386/kernel/signal.c  2004-11-22 09:47:16 -08:00
+++ edited/arch/i386/kernel/signal.c2004-12-30 11:40:35 -08:00
@@ -270,7 +270,6 @@
 struct pt_regs *regs, unsigned long mask)
 {
int tmp, err = 0;
-   unsigned long eflags;
 
tmp = 0;
__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
@@ -292,16 +291,7 @@
err |= __put_user(current->thread.error_code, &sc->err);
err |= __put_user(regs->eip, &sc->eip);
err |= __put_user(regs->xcs, (unsigned int __user *)&sc->cs);
-
-   /*
-* Iff TF was set because the program is being single-stepped by a
-* debugger, don't save that information on the signal stack.. We
-* don't want debugging to change state.
-*/
-   eflags = regs->eflags;
-   if (current->ptrace & PT_DTRACE)
-   eflags &= ~TF_MASK;
-   err |= __put_user(eflags, &sc->eflags);
+   err |= __put_user(regs->eflags, &sc->eflags);
err |= __put_user(regs->esp, &sc->esp_at_signal);
err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);
 
@@ -424,11 +414,9 @@
 * The tracer may want to single-step inside the
 * handler too.
 */
-   if (regs->eflags & TF_MASK) {
-   regs->eflags &= ~TF_MASK;
-   if (current->ptrace & PT_DTRACE)
-   ptrace_notify(SIGTRAP);
-   }
+   regs->eflags &= ~TF_MASK;
+   if (test_thread_flag(TIF_SINGLESTEP))
+   ptrace_notify(SIGTRAP);
 
 #if DEBUG_SIG
printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
@@ -519,11 +507,9 @@
 * The tracer may want to single-step inside the
 * handler too.
 */
-   if (regs->eflags & TF_MASK) {
-   regs->eflags &= ~TF_MASK;
-   if (current->ptrace & PT_DTRACE)
-   ptrace_notify(SIGTRAP);
-   }
+   regs->eflags &= ~TF_MASK;
+   if (test_thread_flag(TIF_SINGLESTEP))
+   ptrace_notify(SIGTRAP);
 
 #if DEBUG_SIG
printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
= arch/i386/kernel/traps.c 1.92 vs edited =
--- 1.92/arch/i386/kernel/traps.c   2004-12-28 11:07:48 -08:00
+++ edited/arch/i386/kernel/traps.c 2004-12-30 12:36:30 -08:00
@@ -718,8 +718,17 @@
 */
if ((regs->xcs & 3) == 0)
goto clear_TF_reenable;
-   if ((tsk->ptrace & (PT_DTRACE|PT_PTRACED)) == PT_DTRACE)
-   goto clear_TF;
+
+   /*
+* Was the TF flag set by a debugger? If so, clear it now,
+* so that register information is correct.
+*/
+   if (tsk->ptrace & PT_DTRACE) {
+   regs->eflags &= ~TF_MASK;
+   tsk->ptrace &= ~PT_DTRACE;
+   if (!tsk->ptrace & PT_DTRACE)
+   goto clear_TF;
+   }
}
 
 

Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Jesse Allen
On Thu, 30 Dec 2004 21:47:42 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
>
> 
> 
> So I looked at just sharing the code with the debug trap handler, and the
> result is appended. strace works, as does all the TF tests I've thrown at
> it, and the code actually looks better anyway (the old do_debug code looks
> like it got the EIP wrong in VM86 mode, for example, this just cleans
> that up too). Just use a common "send_sigtrap()" routine.
> 
> Does this look saner?
> 


Yeah.  I've tested and this one works.  I don't have any other copy
protection schemes that are broken.  Of the cd-rom based, older
safedisc still worked, and the newer one needs a special device driver
that wine cannot load properly yet.  It's more likely if there is a
problem with one that it's a problem with wine at this point.

Jesse



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Jesse Allen
On Thu, 30 Dec 2004 14:46:17 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
> 
> 
> Ok, here's a patch that may or may not make Wine happier. It's a _lot_
> more careful about TF handling, and in particular it's trying really
> really hard to make sure that a controlling process does not change the
> trap flag as it is modified or used by the process.
> 
> This hopefully fixes:
> 
>  - single-step over "popf" should work - we won't clear TF after the popf,
>but instead let the popf results remain.
> 
>NOTE! I tried to make sure that it does the right thing for segments
>with non-zero bases, but I never actually tested that code. It's fairly
>obvious, but it's also fairly likely to have some silly problems. Wine
>may well show effects of this, although I don't know how common
>non-zero bases are with any kind of half-modern windows binaries.
> 
>  - ptrace reporting of "eflags" register after a single-step (we used to
>report TF as being set because the debugger set it - even though the
>"native state" without debugging had it cleared).
> 
>This also hopefully means that all the conditional TF clearing games
>etc aren't necessary, since arch/i386/kernel/traps.c should now be
>taking care of hiding the debugger for most cases ("pushf" still
>remains, and is hard. See comment in ptrace.c part of the patch)
> 
> It's a bit more involved than I'd like, since especially the "popf" case
> just is pretty complex, but I'd love to hear whether it works.
> 
> NOTE NOTE NOTE! I've tested it, but only on one small test-case, so it
> might be totally broken in many ways. I'd love to have people who are x86
> and ptrace-aware give this a second look, and I'm confident Jesse will
> find that it won't work, but can hopefully try to debug it a bit with
> this..
> 

Well I tried this patch and it works.  I captured a log showing the
signals and eflags again when running the program.  I compared it to
the working version.  There are differences, but none seem to be the
scenario TF was not set when it should have been.  Both log files are
just about the same size too.  I captured a second log in a row, and
compared the previous.  Again there are differences, so there is some
unavoidable randomness.

Since I cannot spot any issue, the patch looks good.  Are there any
other test cases?

Jesse



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Davide Libenzi
On Wed, 29 Dec 2004, Linus Torvalds wrote:

> On Wed, 29 Dec 2004, Davide Libenzi wrote:
> > 
> > I think same. My test simply let the function processing to let thru and 
> > reach the fake signal sending point.
> 
> No, your test-case doesn't even send a signal at all, because your
> test-program just uses a PTRACE_SINGLESTEP without any "send signal" - so
> "exit_code" will be zero after the debuggee gets released from the
> "ptrace_notify()", and the suspect code will never be executed..
> 
> The problem I think I see (and which the comment alludes to) is that if
> the controlling process continues the debuggee with a signal, we'll be
> doing the wrong thing: the code in do_syscall_trace() will take that
> signal (in "current->exit_code") and send it as a real signal to the
> debuggee, but since it is debugged, it will be caught (again) by the
> controlling process, which apparently in the case of Wine gets very
> confused.
> 
> So I _think_ the problem happens for the following schenario:
>  - wine for some reason does a PTRACE_SINGLESTEP over a system call
>  - after the single-step, wine ends up trying to get the sub-process to 
>enter a signal handler with ptrace( PTRACE_CONT, ... sig)
>  - the normal ptrace path (where we trap a signal - see kernel/signal.c 
>and the "ptrace_stop()" logic) handles this correctly, but 
>do_syscall_trace() will do a "send_sig()"
>  - we'll try to handle the signal when returning to the traced process
>  - now "get_signal_to_deliver()" will invoce the ptrace logic AGAIN, and 
>call ptrace_stop() for this fake signal, and we'll report a new event 
>to the controlling process.

Make sense to me. What about the one below? The problem is though, that we 
have two different events here. One is the single step trap and the other 
one is the signal. And according to the comment, strace want something 
different from SIGTRAP to continue with signal. So it wants the double 
event we are actually sending (ptrace_notify + send_sig). OTOH Wine does 
not seem to like the double event thingy. Hmm ...?



- Davide



--- ptrace.c.orig   2004-12-30 10:29:11.0 -0800
+++ ptrace.c2004-12-30 11:11:23.0 -0800
@@ -573,18 +573,14 @@
return;
if (!(current->ptrace & PT_PTRACED))
return;
-   /* the 0x80 provides a way for the tracing parent to distinguish
-  between a syscall stop and SIGTRAP delivery */
-   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
-!test_thread_flag(TIF_SINGLESTEP) ? 0x80 : 0));
 
-   /*
-* this isn't the same as continuing with a signal, but it will do
-* for normal use.  strace only continues with a signal if the
-* stopping signal is not SIGTRAP.  -brl
-*/
if (current->exit_code) {
-   send_sig(current->exit_code, current, 1);
+   ptrace_notify(current->exit_code);
current->exit_code = 0;
+   } else {
+   /* the 0x80 provides a way for the tracing parent to distinguish
+  between a syscall stop and SIGTRAP delivery */
+   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
+!test_thread_flag(TIF_SINGLESTEP) ? 
0x80 : 0));
}
 }



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Thu, 30 Dec 2004, Jesse Allen wrote:
> 
> Using the latest version of the patch on do_syscall_trace(), it still
> doesn't work unless I remove this test.  If indeed it's supposed to
> fall through to receive the proper signal, (ie to single step properly
> after an int op), then removing it is wrong, and I won't consider it
> anymore.  I also have to use the patch shown below, with a typo-fixed,
> to fix the other problem.  I broke it apart from the other because we
> might want to consider it seperately right now.

I'm actually able to now re-create some problems with TF handling with a 
simple non-wine test, and I think I'm zeroing in on the reason for Wine 
breaking. And I think it just happened to work by luck before.

Not only do we have problems single-stepping over a system call, we _also_ 
have problems single-stepping over a "popf" - we'll set TF (to 
single-step), and then afterwards we'll _clear_ TF, even if the "popf" 
also was supposed to set it. 

Working on a patch for this right now, I'll send something out soonish 
(and I'll test it on my test-case before sending it, so that it at least 
has some chance of working).

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Jesse Allen
On Thu, 30 Dec 2004 09:59:27 -0800 (PST), Davide Libenzi
 wrote:
> On Wed, 29 Dec 2004, Linus Torvalds wrote:
> 
> > On Wed, 29 Dec 2004, Davide Libenzi wrote:
> > >
> > > I think same. My test simply let the function processing to let thru and
> > > reach the fake signal sending point.
> >
> > No, your test-case doesn't even send a signal at all, because your
> > test-program just uses a PTRACE_SINGLESTEP without any "send signal" - so
> > "exit_code" will be zero after the debuggee gets released from the
> > "ptrace_notify()", and the suspect code will never be executed..
> 
> No no, my test case has nothing to do with the extra signal sent in
> do_syscall_trace(). But the test I put at the time:
> 
> -   if (!test_thread_flag(TIF_SYSCALL_TRACE))
> +   if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
> +   !test_thread_flag(TIF_SINGLESTEP))
> return;
> 
> will make the code to not execute the "return" (in the single step case)
> and to fall through the point where the signal is sent.


Using the latest version of the patch on do_syscall_trace(), it still
doesn't work unless I remove this test.  If indeed it's supposed to
fall through to receive the proper signal, (ie to single step properly
after an int op), then removing it is wrong, and I won't consider it
anymore.  I also have to use the patch shown below, with a typo-fixed,
to fix the other problem.  I broke it apart from the other because we
might want to consider it seperately right now.

I spent some time speaking to my brother about this.  He has done his
own kernel before for an embedded system.  He was rather frustrated
with me trying to find a reason for why a rather simple change broke
my program.  He told me I couldn't have it both ways.  However I
believe that I don't understand the linux code well enough to make
that conclusion.


> 
> 
> > The problem I think I see (and which the comment alludes to) is that if
> > the controlling process continues the debuggee with a signal, we'll be
> > doing the wrong thing: the code in do_syscall_trace() will take that
> > signal (in "current->exit_code") and send it as a real signal to the
> > debuggee, but since it is debugged, it will be caught (again) by the
> > controlling process, which apparently in the case of Wine gets very
> > confused.
> >
> > So I _think_ the problem happens for the following schenario:
> >  - wine for some reason does a PTRACE_SINGLESTEP over a system call
> >  - after the single-step, wine ends up trying to get the sub-process to
> >enter a signal handler with ptrace( PTRACE_CONT, ... sig)
> >  - the normal ptrace path (where we trap a signal - see kernel/signal.c
> >and the "ptrace_stop()" logic) handles this correctly, but
> >do_syscall_trace() will do a "send_sig()"
> >  - we'll try to handle the signal when returning to the traced process
> >  - now "get_signal_to_deliver()" will invoce the ptrace logic AGAIN, and
> >call ptrace_stop() for this fake signal, and we'll report a new event
> >to the controlling process.
> >
> > Does this make sense?
> 
> This might explain what they were seeing, but OTOH it seems that the real
> cause of their problems is related to something else (according to other
> emails on this thread).
> 
> 

Actually, I don't think the vanilla kernel has the code that breaks
those other wine programs.  I just learned of this yesterday and it's
not related.  I believe it's only in fedora core 3 or -ac kernels and 
I use vanilla kernels.


Jesse

--- linux/arch/i386/kernel/ptrace.c 2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/ptrace.c 2004-12-29 20:50:00.0 -0700
@@ -142,18 +142,31 @@
 {
long eflags;
 
+   /*
+* Always set TIF_SINGLESTEP - this guarantees that 
+* we single-step system calls etc.. 
+*/
set_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /*
+* If TF was already set, don't do anything else
+*/
eflags = get_stack_long(child, EFL_OFFSET);
+   if (eflags & TRAP_FLAG)
+   return;
put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
child->ptrace |= PT_DTRACE;
 }
 
 static void clear_singlestep(struct task_struct *child)
 {
+   /* Always clear TIF_SINGLESTEP... */
+   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /* But touch TF only if it was set by us.. */
if (child->ptrace & PT_DTRACE) {
long eflags;
 
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
eflags = get_stack_long(child, EFL_OFFSET);
put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
child->ptrace &= ~PT_DTRACE;



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Thu, 30 Dec 2004, Davide Libenzi wrote:
> 
> This might explain what they were seeing, but OTOH it seems that the real 
> cause of their problems is related to something else (according to other 
> emails on this thread).

There's two different problems: the one seen by Thomas (the Xilinx FPGA
synthesizer), which is apparently just due to Wine (or, more likely, the
Windows app itself) depending on a certain memory layout for the stack
and/or other allocations. That one I think we can consider solved, and
indeed had nothing to do with TF.

The other one is the copy-protection code breaking for some game 
(Warcraft) for Jesse Allen, and that one is definitely TF-related.. Jesse 
can fix it with patches, but those patches aren't acceptable for other 
uses, so that's why I'm trying to find something that DTRT both for Wine 
and for a regular debugging session..

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Thomas Sailer
On Wed, 2004-12-29 at 20:15 -0800, Andrew Morton wrote:

> You can globally disable flex-mmap with
> 
>   echo 1 > /proc/sys/vm/legacy_va_layout
> 
> Does it fix things?

Haven't tried. But setarch i386 -L /usr/bin/wine ... did fix it.

Tom





Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Wed, 29 Dec 2004, Davide Libenzi wrote:
> 
> I think same. My test simply let the function processing to let thru and 
> reach the fake signal sending point.

No, your test-case doesn't even send a signal at all, because your
test-program just uses a PTRACE_SINGLESTEP without any "send signal" - so
"exit_code" will be zero after the debuggee gets released from the
"ptrace_notify()", and the suspect code will never be executed..

The problem I think I see (and which the comment alludes to) is that if
the controlling process continues the debuggee with a signal, we'll be
doing the wrong thing: the code in do_syscall_trace() will take that
signal (in "current->exit_code") and send it as a real signal to the
debuggee, but since it is debugged, it will be caught (again) by the
controlling process, which apparently in the case of Wine gets very
confused.

So I _think_ the problem happens for the following schenario:
 - wine for some reason does a PTRACE_SINGLESTEP over a system call
 - after the single-step, wine ends up trying to get the sub-process to 
   enter a signal handler with ptrace( PTRACE_CONT, ... sig)
 - the normal ptrace path (where we trap a signal - see kernel/signal.c 
   and the "ptrace_stop()" logic) handles this correctly, but 
   do_syscall_trace() will do a "send_sig()"
 - we'll try to handle the signal when returning to the traced process
 - now "get_signal_to_deliver()" will invoce the ptrace logic AGAIN, and 
   call ptrace_stop() for this fake signal, and we'll report a new event 
   to the controlling process.

Does this make sense?

If so, we have a few options:

 - very hacky one: teach all ptrace users that sometimes the signal you
   continue with can be reflected back at you, and you should just "cont
   signr"  _again_.

   This is a really bad option, partly because it's hard to tell when it's 
   just a spurious reflected signal, partly because there are many ptrace 
   users, and to a large part just because it's clearly too hacky. But it
   might be a valid approach for a Wine person who is used to Wine, and
   wants to verify whether this is indeed the schenario that triggers the
   problem..

 - Stupid approach: mark the siginfo that we send as the fake one as being 
   reflected, and have get_signal_to_deliver() not apply the ptrace 
   stopping to that.

 - Possibly cleaner approach: make system call tracing just use a proper
   SIGTRAP in the first place, and always handle all the ptrace_stop() etc
   cruft in kernel/signal.c like it handles all other calls.

I dunno. The final one looks fairly simple and clean, something like the
following, but I'm most likely overlooking some reason why this won't
work..

(And as usual, this patch has not been tested in any shape, way or form. 
In fact, it hasn't even seen an x86 machine, since I edited it out as a 
fake on my ppc64.. Somebody with more brains than me should actually try 
to get it to work)

Linus


= arch/i386/kernel/ptrace.c 1.28 vs edited =
--- 1.28/arch/i386/kernel/ptrace.c  2004-11-22 09:44:52 -08:00
+++ edited/arch/i386/kernel/ptrace.c2004-12-29 23:21:58 -08:00
@@ -559,6 +559,8 @@
 __attribute__((regparm(3)))
 void do_syscall_trace(struct pt_regs *regs, int entryexit)
 {
+   struct siginfo info;
+
if (unlikely(current->audit_context)) {
if (!entryexit)
audit_syscall_entry(current, regs->orig_eax,
@@ -573,18 +575,18 @@
return;
if (!(current->ptrace & PT_PTRACED))
return;
+
+   memset(&info, 0, sizeof(info));
+   info.si_signo = SIGTRAP;
+
/* the 0x80 provides a way for the tracing parent to distinguish
   between a syscall stop and SIGTRAP delivery */
-   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
-!test_thread_flag(TIF_SINGLESTEP) ? 0x80 : 0));
+   info.si_code = SIGTRAP;
+   if ((current->ptrace & PT_TRACESYSGOOD) && 
!test_thread_flag(TIF_SINGLESTEP))
+   info.si_code = SIGTRAP | 0x80;
+   info.si_pid = current->tgid;
+   info.si_uid = current->uid;
 
-   /*
-* this isn't the same as continuing with a signal, but it will do
-* for normal use.  strace only continues with a signal if the
-* stopping signal is not SIGTRAP.  -brl
-*/
-   if (current->exit_code) {
-   send_sig(current->exit_code, current, 1);
-   current->exit_code = 0;
-   }
+   /* Send us the fakey SIGTRAP */
+   send_sig_info(SIGTRAP, &info, current);
 }



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Davide Libenzi
On Wed, 29 Dec 2004, Linus Torvalds wrote:

> On Wed, 29 Dec 2004, Linus Torvalds wrote:
> > 
> > .. no, I see what's up. System call returns _are_ special for 
> > single-stepping. I'll think about it..
> 
> Ok, I think I know what's up.
> 
> It's literally the bogus fake signal that do_syscall_trace() sends. I 
> think the TIF_SINGLESTEP case in do_syscall_trace() should only do the 
> ptrace_notify() and return..

I think same. My test simply let the function processing to let thru and 
reach the fake signal sending point.


- Davide




Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Davide Libenzi
On Wed, 29 Dec 2004, Linus Torvalds wrote:

> Will test whether it cleanly handles your test-case. Davide - you also 
> added the TIF_SINGLESTEP flag to that _TIF_WORK_MASK, can you explain 
> that?

I honestly do not remember, but I think is wrong and can be removed. 
That's not the problem though ...


- Davide




Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Andrew Morton
Thomas Sailer <[EMAIL PROTECTED]> wrote:
>
> Another pointer towards flexible mmap is that ulimit -s unlimited makes
> it work under 2.6.10-ac1 too.
> 

You can globally disable flex-mmap with

echo 1 > /proc/sys/vm/legacy_va_layout

Does it fix things?



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Wed, 29 Dec 2004, Linus Torvalds wrote:
> 
> So the updated patch would look something like the appended.

.. no, I see what's up. System call returns _are_ special for 
single-stepping. I'll think about it..

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Wed, 29 Dec 2004, Linus Torvalds wrote:
> 
> .. no, I see what's up. System call returns _are_ special for 
> single-stepping. I'll think about it..

Ok, I think I know what's up.

It's literally the bogus fake signal that do_syscall_trace() sends. I 
think the TIF_SINGLESTEP case in do_syscall_trace() should only do the 
ptrace_notify() and return..

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Thu, 30 Dec 2004, Thomas Sailer wrote:
>
> No joy with
> linux-2.6.10
> patch-2.6.10-ac1
> 01-ptrace-reverse.diff
> sigtrap-reverse.diff
> 
> Below is the seh trace output. In the working case (2.6.8) there is no
> trace:seh: output at this point.

I have no idea what "seh" is in wine-speak, but it appears that your 
problem is something totally different, especially as none of the eflags- 
changes seem to matter for you. Also, in your "seh" exception register 
dump, you don't actually seem to have TF set in %eflags (TF is 0x0100).

Some wine person would need to inform us about what the seh exception 
thing means.. "code c005"? 

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Davide Libenzi
On Wed, 29 Dec 2004, Linus Torvalds wrote:

> On Wed, 29 Dec 2004, Davide Libenzi wrote:
> > 
> > I think same. My test simply let the function processing to let thru and 
> > reach the fake signal sending point.
> 
> No, your test-case doesn't even send a signal at all, because your
> test-program just uses a PTRACE_SINGLESTEP without any "send signal" - so
> "exit_code" will be zero after the debuggee gets released from the
> "ptrace_notify()", and the suspect code will never be executed..

No no, my test case has nothing to do with the extra signal sent in 
do_syscall_trace(). But the test I put at the time:

-   if (!test_thread_flag(TIF_SYSCALL_TRACE))
+   if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
+   !test_thread_flag(TIF_SINGLESTEP))
return;

will make the code to not execute the "return" (in the single step case) 
and to fall through the point where the signal is sent.



> The problem I think I see (and which the comment alludes to) is that if
> the controlling process continues the debuggee with a signal, we'll be
> doing the wrong thing: the code in do_syscall_trace() will take that
> signal (in "current->exit_code") and send it as a real signal to the
> debuggee, but since it is debugged, it will be caught (again) by the
> controlling process, which apparently in the case of Wine gets very
> confused.
> 
> So I _think_ the problem happens for the following schenario:
>  - wine for some reason does a PTRACE_SINGLESTEP over a system call
>  - after the single-step, wine ends up trying to get the sub-process to 
>enter a signal handler with ptrace( PTRACE_CONT, ... sig)
>  - the normal ptrace path (where we trap a signal - see kernel/signal.c 
>and the "ptrace_stop()" logic) handles this correctly, but 
>do_syscall_trace() will do a "send_sig()"
>  - we'll try to handle the signal when returning to the traced process
>  - now "get_signal_to_deliver()" will invoce the ptrace logic AGAIN, and 
>call ptrace_stop() for this fake signal, and we'll report a new event 
>to the controlling process.
> 
> Does this make sense?

This might explain what they were seeing, but OTOH it seems that the real 
cause of their problems is related to something else (according to other 
emails on this thread).



- Davide




Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Thomas Sailer
Another pointer towards flexible mmap is that ulimit -s unlimited makes
it work under 2.6.10-ac1 too.

Tom





Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Thomas Sailer
On Wed, 2004-12-29 at 18:10 -0800, Linus Torvalds wrote:

> I have no idea what "seh" is in wine-speak, but it appears that your 

seh means structured exception handling in microsoft-speak.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/structured_exception_handling.asp
http://www.jorgon.freeserve.co.uk/ExceptFrame.htm

> Some wine person would need to inform us about what the seh exception 
> thing means.. "code c005"? 

c005 apparently means memory access violation. Looks like xst is
getting confused about its memory allocations...

Tom





Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Thomas Sailer
On Wed, 2004-12-29 at 18:10 -0800, Linus Torvalds wrote:

> Some wine person would need to inform us about what the seh exception 
> thing means.. "code c005"? 

There's an interesting thing. Fedora Kernel 2.6.7 works for me, Fedora
Kernel 2.6.8 breaks wine/xst. Interestingly, Linus 2.6.8 works. Now the
biggest changes between the Fedora Kernels 2.6.7-1.494.2.2 and
2.6.8-1.521, besides the rebasing, are some execshield changes and
flexible mmap. execshield cannot be the culprit as it's not in 2.6.10-
ac1, but flexible mmap seems to be in 2.6.10-ac1. To me, a candidate for
further scrutiny is flexmmap, although I do not claim to have a clue
about linux-mm...

Tom





Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Wed, 29 Dec 2004, Davide Libenzi wrote:
> 
> That test went in to be able to have ptrace single step, to see even the 
> instruction following the #int instruction (this was the target of the 
> patch itself). I just verified that, in 2.6.8 that does not have such test 
> anymore, the single-step-after-int capability is lost.

Ahh. That's because of a separate bug: we have this silly separation of 
"_TIF_WORK_MASK" (everything but tracing) and "_TIF_ALLWORK_MASK" 
(everything), and the system call stuff takes over _TIF_SINGLESTEP for 
some very non-obvious reasons.

I don't see why the system-call code thinks _TIF_SINGLESTEP is special, 
but it certainly explains why it doesn't get handled normally.

So the updated patch would look something like the appended.

Will test whether it cleanly handles your test-case. Davide - you also 
added the TIF_SINGLESTEP flag to that _TIF_WORK_MASK, can you explain 
that?

(And yes, I know you'd sent me the test-program before, but I'm about as 
organized as a Performing Seal with Alzheimers..)

Linus

--- 1.23/include/asm-i386/thread_info.h 2004-11-18 23:03:11 -08:00
+++ edited/include/asm-i386/thread_info.h   2004-12-29 17:52:16 -08:00
@@ -153,7 +153,7 @@
 
 /* work to do on interrupt/exception return */
 #define _TIF_WORK_MASK \
-  (0x & ~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP))
+  (0x & ~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT))
 #define _TIF_ALLWORK_MASK  0x  /* work to do on any return to 
u-space */
 
 /*
--- 1.28/arch/i386/kernel/ptrace.c  2004-11-22 09:44:52 -08:00
+++ edited/arch/i386/kernel/ptrace.c2004-12-29 17:36:41 -08:00
@@ -568,15 +568,13 @@
audit_syscall_exit(current, regs->eax);
}
 
-   if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
-   !test_thread_flag(TIF_SINGLESTEP))
+   if (!test_thread_flag(TIF_SYSCALL_TRACE))
return;
if (!(current->ptrace & PT_PTRACED))
return;
/* the 0x80 provides a way for the tracing parent to distinguish
   between a syscall stop and SIGTRAP delivery */
-   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
-!test_thread_flag(TIF_SINGLESTEP) ? 0x80 : 0));
+   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) ? 0x80 : 
0));
 
/*
 * this isn't the same as continuing with a signal, but it will do



Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Thomas Sailer
No joy with
linux-2.6.10
patch-2.6.10-ac1
01-ptrace-reverse.diff
sigtrap-reverse.diff

Below is the seh trace output. In the working case (2.6.8) there is no
trace:seh: output at this point.

Tom

Compiling vhdl file U:/home/sailer/src/vhdl/dvbc_pcseng/vprim.vhd in
Library synwork.
trace:seh:EXC_RtlRaiseException code=c005 flags=0 addr=0x770e6151
trace:seh:EXC_RtlRaiseException  info[0]=
trace:seh:EXC_RtlRaiseException  info[1]=72772078
trace:seh:EXC_RtlRaiseException  eax=72772074 ebx=b7a01d9c ecx=7f7daa0c
edx=fffa esi=7f7499b0 edi=7f7daa0c
trace:seh:EXC_RtlRaiseException  ebp=77058a70 esp=77ad2150 cs=0073
ds=007b es=007b fs=003b gs=0033 flags=00210206
trace:seh:EXC_CallHandler calling handler at 0x77134e9f code=c005
flags=0
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x103662d2 code=c005
flags=0
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x1034ec8e code=c005
flags=0
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x4024a0 code=c005
flags=0
trace:seh:EXC_RtlUnwind code=c005 flags=2
trace:seh:EXC_CallHandler calling handler at 0x77ebe2e0 code=c005
flags=2
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x77134e9f code=c005
flags=2
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x103662d2 code=c005
flags=2
trace:seh:EXC_CallHandler handler returned 1
trace:seh:EXC_CallHandler calling handler at 0x1034ec8e code=c005
flags=2
trace:seh:EXC_CallHandler handler returned 1
FATAL_ERROR:Xst:Portability/export/Port_Main.h:127:1.13 - This
application has discovered an exceptional condition from which it cannot
recover.  Process will terminate.  To resolve this error, please consult
the Answers Database and other online resources at
http://support.xilinx.com. If you need further assistance, please open a
Webcase by clicking on the "WebCase" link at http://support.xilinx.com






Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Davide Libenzi
On Wed, 29 Dec 2004, Linus Torvalds wrote:

> On Wed, 29 Dec 2004, Jesse Allen wrote:
> > > 
> > > So instead of removing the setting of TIF_SINGLESTEP in set_singlestep(),
> > > can you test whether removing the _testing_ of it in do_syscall_trace()
> > > makes things happier for you? Hmm?
> > > 
> > 
> > Yes, doing that does work.  But I still have to remove the conditional
> > TF clear.  Here's the diff now to show you.
> 
> Ok. That's a good piece of information. 
> 
> I don't actually understand why do_syscall_trace() does that 
> TIF_SINGLESTEP test in the first place, since the ptrace_notify() should 
> happen as part of the _normal_ TIF_SINGLESTEP handling, afaiks. No 
> apparent need to do it in syscall tracing, and do_syscall_trace() does 
> that bogus fake signal sending.
> 
> Hmm.. That TF_SINGLESTEP test was added by Davide Libenzi in August, and I
> think Davide had some test for exactly this. Davide? Can you recall why
> you did this in the system call tracing code, rather than depend on the
> regular TIF_SINGLESTEP handling in arch/i386/kernel/signal.c:
> do_notify_resume()?

That test went in to be able to have ptrace single step, to see even the 
instruction following the #int instruction (this was the target of the 
patch itself). I just verified that, in 2.6.8 that does not have such test 
anymore, the single-step-after-int capability is lost. Below I inlining a 
simple test program (that I already sent you time ago) to test this. In my 
2.6.8 I get this output:

waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485d5
waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485da
waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485df
waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485d5
waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485da
waiting ...
done: pid=21398  status=1407
sig=5
EIP=0x080485df

in front of this code:

80485d5:   b8 14 00 00 00  mov$0x14,%eax
80485da:   cd 80   int$0x80
80485dc:   89 45 ecmov%eax,0xffec(%ebp)
80485df:   eb f4   jmp80485d5 
 
You can see that the "mov %eax,0xffec(%ebp)" instruction at 0x80485dc 
is not seen by ptrace single step. With the test in place, it will show up 
again in your traces.



- Davide





#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 


int main(int ac, char **av) {
int i, status, res;
long start, end;
pid_t cpid, pid;
struct user_regs_struct ur;
struct sigaction sa;

sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = SIG_DFL;
sigaction(SIGCHLD, &sa, NULL);

printf("nargs=%d\n", ac);
if (ac == 1)
goto tracer;

printf("arg=%s\n", av[1]);
loop:
__asm__ volatile ("int $0x80"
  : "=a" (res)
  : "0" (__NR_getpid));
goto loop;
endloop:
exit(0);


tracer:
if ((cpid = fork()) != 0)
goto parent;

printf("child=%d\n", getpid());
ptrace(PTRACE_TRACEME, 0, NULL, NULL);

execl(av[0], av[0], "child", NULL);

exit(0);

parent:
start = (long) &&loop;
end = (long) &&endloop;

printf("pchild=%d\n", cpid);

for (;;) {
pid = wait(&status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
   pid, pid);
return 1;
}

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 
0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}

if (ur.eip >= start && ur.eip <= end)
break;
}


for (i = 0; i < 15; i++) {
printf("waiting ...\n");
pid = wait(&status);
printf("done: pid=%d  status=%d\n", pid, status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
printf("sig=%d\n", res);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
   pid, pid);
return 1;
}

printf("EIP=0x%08x\n", ur.eip);

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 
0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}
}

if (ptrace(PTRACE_CONT, cpid, NULL, SIGKILL)) {
perror("ptrace(PTRACE_SINGLESTEP)");

Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Linus Torvalds


On Wed, 29 Dec 2004, Jesse Allen wrote:
> > 
> > So instead of removing the setting of TIF_SINGLESTEP in set_singlestep(),
> > can you test whether removing the _testing_ of it in do_syscall_trace()
> > makes things happier for you? Hmm?
> > 
> 
> Yes, doing that does work.  But I still have to remove the conditional
> TF clear.  Here's the diff now to show you.

Ok. That's a good piece of information. 

I don't actually understand why do_syscall_trace() does that 
TIF_SINGLESTEP test in the first place, since the ptrace_notify() should 
happen as part of the _normal_ TIF_SINGLESTEP handling, afaiks. No 
apparent need to do it in syscall tracing, and do_syscall_trace() does 
that bogus fake signal sending.

Hmm.. That TF_SINGLESTEP test was added by Davide Libenzi in August, and I
think Davide had some test for exactly this. Davide? Can you recall why
you did this in the system call tracing code, rather than depend on the
regular TIF_SINGLESTEP handling in arch/i386/kernel/signal.c:
do_notify_resume()?

(Jesse's patch re-appended here in-line for Davide's "reading pleasure").

That still leaves the problem of the clearing of TF_MASK. I _appears_ that 
the problem is that TF was set both by Wine doing a PTRACE_SINGLESTEP 
(since PT_DTRACE is set) _and_ the application having TF enabled in eflags 
from before (since it seems to want to keep it enabled).

How about something like the attachment for the TF_MASK issue (replacing
your "don't clear" code)? The comments should make the intention pretty
obvious, but I have equally obviously not actually _tested_ any of this..

Linus

---
--- linux/arch/i386/kernel/ptrace.c 2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/ptrace.c 2004-12-29 14:22:33.0 -0700
@@ -568,8 +568,7 @@
audit_syscall_exit(current, regs->eax);
}
 
-   if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
-   !test_thread_flag(TIF_SINGLESTEP))
+   if (!test_thread_flag(TIF_SYSCALL_TRACE))
return;
if (!(current->ptrace & PT_PTRACED))
return;
--- linux/arch/i386/kernel/signal.c 2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/signal.c 2004-12-29 14:23:04.0 -0700
@@ -299,8 +299,8 @@
 * don't want debugging to change state.
 */
eflags = regs->eflags;
-   if (current->ptrace & PT_DTRACE)
-   eflags &= ~TF_MASK;
+// if (current->ptrace & PT_DTRACE)
+// eflags &= ~TF_MASK;
err |= __put_user(eflags, &sc->eflags);
err |= __put_user(regs->esp, &sc->esp_at_signal);
err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);= arch/i386/kernel/ptrace.c 1.28 vs edited =
--- 1.28/arch/i386/kernel/ptrace.c  2004-11-22 09:44:52 -08:00
+++ edited/arch/i386/kernel/ptrace.c2004-12-29 16:42:04 -08:00
@@ -142,18 +142,31 @@
 {
long eflags;
 
+   /*
+* Always set TIF_SINGLESTEP - this guarantees that 
+* we single-step system calls etc.. 
+*/
set_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /*
+* If TF was already set, don't do anything else
+*/
eflags = get_stack_long(child, EFL_OFFSET);
+   if (flags & TRAP_FLAG)
+   return;
put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
child->ptrace |= PT_DTRACE;
 }
 
 static void clear_singlestep(struct task_struct *child)
 {
+   /* Always clear TIF_SINGLESTEP... */
+   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /* But touch TF only if it was set by us.. */
if (child->ptrace & PT_DTRACE) {
long eflags;
 
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
eflags = get_stack_long(child, EFL_OFFSET);
put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
child->ptrace &= ~PT_DTRACE;


Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Mike Hearn
On Thu, 2004-12-30 at 11:09 +0100, Thomas Sailer wrote:
> On Wed, 2004-12-29 at 20:15 -0800, Andrew Morton wrote:
> > You can globally disable flex-mmap with
> > 
> > echo 1 > /proc/sys/vm/legacy_va_layout
> > 
> > Does it fix things?
> 
> Haven't tried. But setarch i386 -L /usr/bin/wine ... did fix it.
> 
> Tom

Tom, does this patch against Wine help? It should do the same thing as
the setarch program, so if that fixes it then this should also (if I've
understood how this mechanism works of course).

Mike Hearn <[EMAIL PROTECTED]>
Set the Linux ABI personality to get a legacy VM layout
--- configure.ac  (revision 14)
+++ configure.ac  (local)
@@ -188,6 +188,7 @@ AC_CHECK_HEADERS(\
 	linux/joystick.h \
 	linux/major.h \
 	linux/param.h \
+	linux/personality.h \
 	linux/serial.h \
 	linux/ucdrom.h \
 	machine/cpu.h \
--- libs/wine/loader.c  (revision 14)
+++ libs/wine/loader.c  (local)
@@ -41,6 +41,11 @@
 #include "winbase.h"
 #include "wine/library.h"
 
+#ifdef HAVE_LINUX_PERSONALITY_H
+#include 
+#include 
+#endif
+
 #ifdef __APPLE__
 #include 
 #define environ (*_NSGetEnviron())
@@ -515,6 +520,22 @@ static void debug_usage(void)
 
 
 /***
+ *   linux_personality_init
+ *
+ * Sets the "please unbreak me" mmap flag to disable things like
+ * flex-mmap. Note that this will also disable exec-shield
+ * randomization for any ELF DLLs loaded after this point but the
+ * initial libraries (libc etc) linked in automatically will be
+ * randomized.
+ */
+static void linux_personality_init(void)
+{
+#if defined(HAVE_LINUX_PERSONALITY_H) && defined(__i386__)
+syscall(SYS_personality, PER_LINUX32 | 0x020 /* ADDR_COMPAT_LAYOUT */);
+#endif
+}
+
+/***
  *   wine_init
  *
  * Main Wine initialisation.
@@ -526,6 +547,7 @@ void wine_init( int argc, char *argv[], 
 void *ntdll;
 void (*init_func)(void);
 
+linux_personality_init();
 build_dll_path();
 wine_init_argv0_path( argv[0] );
 __wine_main_argc = argc;


Re: ptrace single-stepping change breaks Wine

2004-12-30 Thread Mike Hearn
On Wed, 2004-12-29 at 18:10 -0800, Linus Torvalds wrote:
> Some wine person would need to inform us about what the seh exception 
> thing means.. "code c005"? 

STATUS_ACCESS_VIOLATION, or the Win32 equivalent of a segfault. It would
appear that the ptrace changes are not responsible here, though if
changing the kernel really does change this crash then maybe there is
another issue we haven't uncovered yet.

thanks -mike




Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Jesse Allen
On Wed, 29 Dec 2004 16:44:05 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
> 
> That still leaves the problem of the clearing of TF_MASK. I _appears_ that
> the problem is that TF was set both by Wine doing a PTRACE_SINGLESTEP
> (since PT_DTRACE is set) _and_ the application having TF enabled in eflags
> from before (since it seems to want to keep it enabled).
> 
> How about something like the attachment for the TF_MASK issue (replacing
> your "don't clear" code)? The comments should make the intention pretty
> obvious, but I have equally obviously not actually _tested_ any of this..
> 

The following patch works for me.  I still have to remove
TIF_SINGLESTEP test.  I also went ahead and tried adding the other fix
on the ptrace_notify call and _TIF_WORK_MASK, but for some reason,
changing _TIF_WORK_MASK seems to break the program now.  This patch
does fix the TF clearing problem.

Jesse


--- linux/arch/i386/kernel/ptrace.c 2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/ptrace.c 2004-12-29 20:50:00.0 -0700
@@ -142,18 +142,31 @@
 {
long eflags;
 
+   /*
+* Always set TIF_SINGLESTEP - this guarantees that 
+* we single-step system calls etc.. 
+*/
set_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /*
+* If TF was already set, don't do anything else
+*/
eflags = get_stack_long(child, EFL_OFFSET);
+   if (eflags & TRAP_FLAG)
+   return;
put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
child->ptrace |= PT_DTRACE;
 }
 
 static void clear_singlestep(struct task_struct *child)
 {
+   /* Always clear TIF_SINGLESTEP... */
+   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+
+   /* But touch TF only if it was set by us.. */
if (child->ptrace & PT_DTRACE) {
long eflags;
 
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
eflags = get_stack_long(child, EFL_OFFSET);
put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
child->ptrace &= ~PT_DTRACE;
@@ -568,15 +581,13 @@
audit_syscall_exit(current, regs->eax);
}
 
-   if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
-   !test_thread_flag(TIF_SINGLESTEP))
+   if (!test_thread_flag(TIF_SYSCALL_TRACE))
return;
if (!(current->ptrace & PT_PTRACED))
return;
/* the 0x80 provides a way for the tracing parent to distinguish
   between a syscall stop and SIGTRAP delivery */
-   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) &&
-!test_thread_flag(TIF_SINGLESTEP) ? 0x80 : 0));
+   ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD) ? 0x80 : 
0));
 
/*
 * this isn't the same as continuing with a signal, but it will do



Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Jesse Allen
On Wed, 29 Dec 2004 12:04:57 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
> On Wed, 29 Dec 2004, Jesse Allen wrote:
> > > So does removing the conditional TF clear make everything work again?
> > >
> >
> > Yes, as long as TIF_SINGLESTEP is not set in set_singlestep().
> 
> That may be a clue, if only because that makes absolutely _zero_ sense.
> 
> Setting TIF_SINGLESTEP shouldn't actually matter in this case, since we
> set the TRAP_FLAG in eflags by hand anyway (and that's what TIF_SINGESTEP
> will just re-do when returning to user space).
> 
> What TIF_SINGLESTEP _does_ do, however, is change how some other issues
> are reported to user space. In particular, it causes system call tracing
> (see arch/i386/kernel/ptrace.c: do_syscall_trace), and maybe it is _that_
> that messes up Wine.
> 
> So instead of removing the setting of TIF_SINGLESTEP in set_singlestep(),
> can you test whether removing the _testing_ of it in do_syscall_trace()
> makes things happier for you? Hmm?
> 

Yes, doing that does work.  But I still have to remove the conditional
TF clear.  Here's the diff now to show you.

Jesse
--- linux/arch/i386/kernel/ptrace.c	2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/ptrace.c	2004-12-29 14:22:33.0 -0700
@@ -568,8 +568,7 @@
 			audit_syscall_exit(current, regs->eax);
 	}
 
-	if (!test_thread_flag(TIF_SYSCALL_TRACE) &&
-	!test_thread_flag(TIF_SINGLESTEP))
+	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		return;
 	if (!(current->ptrace & PT_PTRACED))
 		return;
--- linux/arch/i386/kernel/signal.c	2004-12-29 14:10:34.0 -0700
+++ linux-mod/arch/i386/kernel/signal.c	2004-12-29 14:23:04.0 -0700
@@ -299,8 +299,8 @@
 	 * don't want debugging to change state.
 	 */
 	eflags = regs->eflags;
-	if (current->ptrace & PT_DTRACE)
-		eflags &= ~TF_MASK;
+//	if (current->ptrace & PT_DTRACE)
+//		eflags &= ~TF_MASK;
 	err |= __put_user(eflags, &sc->eflags);
 	err |= __put_user(regs->esp, &sc->esp_at_signal);
 	err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);


Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Linus Torvalds


On Wed, 29 Dec 2004, Jesse Allen wrote:
> 
> > 
> > So does removing the conditional TF clear make everything work again?
> > 
> 
> Yes, as long as TIF_SINGLESTEP is not set in set_singlestep(). 

That may be a clue, if only because that makes absolutely _zero_ sense. 

Setting TIF_SINGLESTEP shouldn't actually matter in this case, since we
set the TRAP_FLAG in eflags by hand anyway (and that's what TIF_SINGESTEP
will just re-do when returning to user space).

What TIF_SINGLESTEP _does_ do, however, is change how some other issues
are reported to user space. In particular, it causes system call tracing
(see arch/i386/kernel/ptrace.c: do_syscall_trace), and maybe it is _that_ 
that messes up Wine.

So instead of removing the setting of TIF_SINGLESTEP in set_singlestep(), 
can you test whether removing the _testing_ of it in do_syscall_trace() 
makes things happier for you? Hmm?

(Also, looking at the code, I get the feeling that set_singlestep() should 
_only_ set TIF_SINGLESTEP, and not set the TRAP_FLAG by hand at all, since 
TIF_SINGESTEP should take care of that detail regardless).

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Jesse Allen
On Wed, 29 Dec 2004 12:40:53 -0700, Jesse Allen <[EMAIL PROTECTED]> wrote:
> For the wine people, I will try to upload the seh debug channel logs
> as soon as possible.
> 

I have a page with the latest logs.

http://www.chez.com/alors/



Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Jesse Allen
On Wed, 29 Dec 2004 20:35:44 +0100, Thomas Sailer <[EMAIL PROTECTED]> wrote:
> 
> Mike,
> 
> thanks a lot for your mail. I've just tried Jesse Allen's Patch with
> 2.6.10-ac1, but unfortunately it doesn't seem to be enough to get xst
> working again. Let me know if (and how :)) I can help gather evidence.
> 
> Tom
> 
> 


If the single-stepping changes truly broke your program, and you know
< 2.6.9-rc1 works, try this attached diff against 2.6.10 in addition
to the other one I made.  This should effectively reverse change #2,
but like I said, the current code for change #2 actually works for me
now.  But maybe this is not enough.  You can also look here at my
original report:

http://www.winehq.org/hypermail/wine-devel/2004/11/0310.html

and reverse the original patches against 2.6.9, and see if that kernel
works then. Patch links:

http://linux.bkbits.net:8080/linux-2.6/[EMAIL PROTECTED]
http://linux.bkbits.net:8080/linux-2.6/[EMAIL PROTECTED]
http://linux.bkbits.net:8080/linux-2.6/[EMAIL PROTECTED]

Jesse
--- linux/arch/i386/kernel/signal.c	2004-12-29 12:54:59.0 -0700
+++ linux-mod/arch/i386/kernel/signal.c	2004-12-29 12:53:23.0 -0700
@@ -426,8 +426,8 @@
 	 */
 	if (regs->eflags & TF_MASK) {
 		regs->eflags &= ~TF_MASK;
-		if (current->ptrace & PT_DTRACE)
-			ptrace_notify(SIGTRAP);
+//		if (current->ptrace & PT_DTRACE)
+//			ptrace_notify(SIGTRAP);
 	}
 
 #if DEBUG_SIG


Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Thomas Sailer
On Wed, 2004-12-29 at 15:02 +, Mike Hearn wrote:

Mike,

thanks a lot for your mail. I've just tried Jesse Allen's Patch with
2.6.10-ac1, but unfortunately it doesn't seem to be enough to get xst
working again. Let me know if (and how :)) I can help gather evidence.

Tom





Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Linus Torvalds


On Wed, 29 Dec 2004, Mike Hearn wrote:
> 
> I can't see if he CCd anybody from the archives but Jesse Allen posted a
> nice analysis of the remaining problem here:
> 
> http://www.winehq.org/hypermail/wine-devel/2004/12/0691.html

Ok, I don't remember the context from the Wine lists (and it's not clear
from the older emails I was cc'd on), so the "#3 signal.c" change
description is a bit too vague. Jesse, willing to just point to the exact
diff that you need to make Warcraft work for you (and then maybe Thomas
Sailer can verify whether that part is indeed the one that causes him
problems).

The code in question now does

/*
 * Iff TF was set because the program is being single-stepped by a
 * debugger, don't save that information on the signal stack.. We
 * don't want debugging to change state.
 */
eflags = regs->eflags;
if (current->ptrace & PT_DTRACE)
eflags &= ~TF_MASK;
err |= __put_user(eflags, &sc->eflags);

and I guess it originally never cleared it. True?

So does removing the conditional TF clear make everything work again?

Linus



Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Jesse Allen
On Wed, 29 Dec 2004 10:53:54 -0800 (PST), Linus Torvalds
<[EMAIL PROTECTED]> wrote:
> Ok, I don't remember the context from the Wine lists (and it's not clear
> from the older emails I was cc'd on), so the "#3 signal.c" change
> description is a bit too vague. Jesse, willing to just point to the exact
> diff that you need to make Warcraft work for you (and then maybe Thomas
> Sailer can verify whether that part is indeed the one that causes him
> problems).

I have attached the diff attached in this message for the lkml.

> 
> The code in question now does
> 
> /*
>  * Iff TF was set because the program is being single-stepped by a
>  * debugger, don't save that information on the signal stack.. We
>  * don't want debugging to change state.
>  */
> eflags = regs->eflags;
> if (current->ptrace & PT_DTRACE)
> eflags &= ~TF_MASK;
> err |= __put_user(eflags, &sc->eflags);
> 
> and I guess it originally never cleared it. True?

Yes.

> 
> So does removing the conditional TF clear make everything work again?
> 

Yes, as long as TIF_SINGLESTEP is not set in set_singlestep(). 
set_singlestep also sets PT_DTRACE, so as it now is, a call to the
set_singlestep function will make this condition true clearing TF when
run.  So both the conditional TF clear and setting TIF_SINGLESTEP
needs to be removed, like I show in the diff.   Making these changes
returns the code to a 2.6.8-ish resemblence.

For the wine people, I will try to upload the seh debug channel logs
as soon as possible.

Jesse
diff -u linux/arch/i386/kernel/ptrace.c linux-mod/arch/i386/kernel/ptrace.c
--- linux/arch/i386/kernel/ptrace.c	2004-12-09 15:24:07.0 -0700
+++ linux-mod/arch/i386/kernel/ptrace.c	2004-12-25 16:09:52.0 -0700
@@ -142,7 +142,7 @@
 {
 	long eflags;
 
-	set_tsk_thread_flag(child, TIF_SINGLESTEP);
+//	set_tsk_thread_flag(child, TIF_SINGLESTEP);
 	eflags = get_stack_long(child, EFL_OFFSET);
 	put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
 	child->ptrace |= PT_DTRACE;
@@ -153,7 +153,7 @@
 	if (child->ptrace & PT_DTRACE) {
 		long eflags;
 
-		clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+//		clear_tsk_thread_flag(child, TIF_SINGLESTEP);
 		eflags = get_stack_long(child, EFL_OFFSET);
 		put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
 		child->ptrace &= ~PT_DTRACE;
diff -u linux/arch/i386/kernel/signal.c linux-mod/arch/i386/kernel/signal.c
--- linux/arch/i386/kernel/signal.c	2004-12-09 15:24:07.0 -0700
+++ linux-mod/arch/i386/kernel/signal.c	2004-12-25 16:10:10.0 -0700
@@ -299,8 +299,8 @@
 	 * don't want debugging to change state.
 	 */
 	eflags = regs->eflags;
-	if (current->ptrace & PT_DTRACE)
-		eflags &= ~TF_MASK;
+//	if (current->ptrace & PT_DTRACE)
+//		eflags &= ~TF_MASK;
 	err |= __put_user(eflags, &sc->eflags);
 	err |= __put_user(regs->esp, &sc->esp_at_signal);
 	err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);


Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Thomas Sailer
Any news about the ptrace single-stepping breakage of wine?

The application that stopped working for me is xst, the Xilinx HDL
synthesizer (there's a free as in beer version at
http://www.xilinx.com/xlnx/xebiz/designResources/ip_product_details.jsp?sGlobalNavPick=PRODUCTS&sSecondaryNavPick=Design+Tools&key=DS-ISE-WEBPACK
). I'm currently at kernel 2.6.10-ac1 (as packaged by Arjan van de Ven),
and wine-20041201-1fc3winehq.

Compiling vhdl file H:/sailer/src/vhdl/xxx/vprim.vhd in Library synwork.
FATAL_ERROR:Xst:Portability/export/Port_Main.h:127:1.13 - This
application has discovered an exceptional condition from which it cannot
recover.  Process will terminate.  To resolve this error, please consult
the Answers Database and other online resources at
http://support.xilinx.com. If you need further assistance, please open a
Webcase by clicking on the "WebCase" link at http://support.xilinx.com

Thanks,
Tom





Re: ptrace single-stepping change breaks Wine

2004-12-29 Thread Mike Hearn
On Wed, 2004-12-29 at 03:14 +0100, Thomas Sailer wrote:
> Any news about the ptrace single-stepping breakage of wine?

I can't see if he CCd anybody from the archives but Jesse Allen posted a
nice analysis of the remaining problem here:

http://www.winehq.org/hypermail/wine-devel/2004/12/0691.html

Here is one interesting portion of his email:

> #3 signal.c - Clearing the trap flag if being traced by debugger in
> setup_sigcontext()
> 
> For some reason, Warcraft III doesn't work if it is cleared here.  I
> have no idea if this TF clear is really necessary.  However,
> everything I've read on this seems to indicate to me that this change
> is important, so I need to find out what this causes to the game.

The other changes Linus made are apparently good and do not cause
Warcraft to regress.

thanks -mike




Re: ptrace single-stepping change breaks Wine

2004-11-28 Thread Eric Pouech
Jesse Allen a écrit :
On Sun, Nov 21, 2004 at 10:23:41PM -0800, Linus Torvalds wrote:
Ok, how about this patch?
It does basically two things:
- it makes the x86 version of ptrace be a lot more careful about the TF 
  bit in eflags, and in particular it never touches it _unless_ the 
  tracer has explicitly asked for it (ie we set TF only when doing a
  PTRACE_SINGESTEP, and we clear it only when it has been set by us, not 
  if it has been set by the program itself).

  This patch also cleans up the codepaths by doing all the common stuff
  in set_singlestep()/clear_singlestep().
- It clarifies signal handling, and makes it clear that we always push 
  the full eflags onto the signal stack, _except_ if the TF bit was set 
  by an external ptrace user, in which case we hide it so that the tracee 
  doesn't see it when it looks at its stack contents.

  It also adds a few comments, and makes it clear that the signal handler
  itself is always set up with TF _clear_. But if we were single-stepped 
  into it, we will have notified the debugger, so the debugger obviously 
  can (and often will) decide to continue single-stepping.

IMHO, this is a nice cleanup, and it also means that I can actually debug 
my "program from hell":

	[EMAIL PROTECTED] ~]$ gdb ./a.out 
	GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
	Copyright 2004 Free Software Foundation, Inc.
	GDB is free software, covered by the GNU General Public License, and you are
	welcome to change it and/or distribute copies of it under certain conditions.
	Type "show copying" to see the conditions.
	There is absolutely no warranty for GDB.  Type "show warranty" for details.
	This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols 
	found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".

	(gdb) run
	Starting program: /home/torvalds/a.out 
	Reading symbols from shared object read from target memory...(no debugging 
	symbols found)...done.
	Loaded system supplied DSO at 0xe000
	(no debugging symbols found)...(no debugging symbols found)...
	Program received signal SIGTRAP, Trace/breakpoint trap.
	0x08048480 in main ()
	(gdb) signal SIGTRAP
	Continuing with signal SIGTRAP.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048487 in main ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048488 in smc ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.
Copy protected: ok
	Program exited with code 01.
	(gdb) 

which I think is a sign that this patch actually fixes ptrace.
Does this help with wine? I dunno. Maybe some wine people can comment..

For the wine app in question, it does make a difference.  However, there is 
a new problem.  The program gets stuck at the loading screen at 100% CPU.
It's making a call to select, timing out, and then tries select again, 
repeats.  It's waiting for something that seems to never happen.

I've captured a log, "loop.log.bz2", and shortened to have only the relevent
information after the CMS32_MUTEX is created.  Look for occurances of
 "select()" -- I think the second one is where it starts.  It's on my ftp if 
anyone wants to take a look.  It probably can be compared to the working-
version log where this doesn't occur, but it might be a pain to spot the 
working particular instance.
Hi Jesse
Any network issue on your side? I tried to traceroute resnet.disp.net, but no 
avail.
So I cannot have a look to you newest trace.
A+



Re: ptrace single-stepping change breaks Wine

2004-11-23 Thread Linus Torvalds


On Mon, 22 Nov 2004, Eric Pouech wrote:
> 
> For the linux folks, here a small comparison of what happens in the working 
> (old) case and in the non-working (new) case:
> 
> In both case
> 
> - Wine gets a first SIGTRAP (in it's sig_trap handler)
>   + Wine converts it into a Windows exception (w-exception in short).
> This includes creating a context for the generic CPU registers
>   + This w-exception is sent to the w-exception handler the program
> installed (this one can modifiy the all registers)
>   o this handler touches one of the i386 debug registers
>   + as we need to update the debug registers values (and we don't do in
> the signal handler return), this task is delegated to the Wine server
> (our central process, which is in charge of synchronisation...)
>   > the Wine server ptrace-attach:es to the process which handled
> the SIGTRAP.
>   > Wine server wait4:s on the SIGSTOP (after ptrace:attach)
>   > modify (with ptrace) the debug registers
>   > and resumes excution (ptrace: cont)
>   + wine terminates the sig trap handler and resumes the execution with
> the modified basic registers (from the saved context), and the
> modified debug registers (from the Wine server round trip)
> - a second sig trap is generated
>   > since the wine server is still ptrace:attached, it gets the signal.   
> 
> What differs then in both execution:
> - in the working case, the sig trap handler is called on the client side, 
> whereas it's never called in the non-working case. We do have a couple of 
> protection (to avoid some misbehaving apps), but none of them get triggered. 
> So 
> it seems like the trap handler is not called (ugh).

This actually implies that the current -bk tree with my latest patch may 
actually fix it.

One of the things that 2.6.9 did wrong was exactly that it cleared TF too
much in the ptrace interface. The current development tree is a lot more
careful about that, and it fixes the horrid test-case that I used to debug
it. The test-case (when run under gdb) actually does something similar to
what Wine appears to do.

> - in Windows trap handling, the TF is explictly reset before calling the 
> windows 
> exception handler (which is what Wine does, before calling the window 
> exception 
> handler). Of course the handler can set it back if it wants to continue 
> single 
> stepping.

TF will be still set in Linux when ptrace gets access, but the ptracer can
choose to clear it with PTRACE_PEEKUSR/PTRACE_POKEUSR (or with
PTRACE_GETREGS/SETREGS). I assume you already do that, since I think that
has been true forever (although maybe you don't: PTRACE_CONTINUE used to 
unconditionally clear TF, so it may be that Wine may need some minor 
modification to not do that - but the good news is that mod should be 
backwards-compatible, so it should be pretty easy).

I actually broke down and am downloading the latest source tree of wine,
let's see if I can find the place you do this.

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-23 Thread Linus Torvalds


On Mon, 22 Nov 2004, Mike Hearn wrote:
> 
> this is where signals are converted to SEH exceptions (w-exceptions as
> Eric called them):
> 
>   dlls/ntdll/signal_i386.c 

Looks like it clears TF there already:

if (context->EFlags & 0x100)
{
context->EFlags &= ~0x100;  /* clear single-step flag */
}

so I'll just assume it's ok. 

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Jesse Allen
On Mon, Nov 22, 2004 at 04:15:21PM -0700, Jesse Allen wrote:
> 
> For the wine app in question, it does make a difference.  However, there is 
> a new problem.  The program gets stuck at the loading screen at 100% CPU.
> It's making a call to select, timing out, and then tries select again, 
> repeats.  It's waiting for something that seems to never happen.
> 
> I've captured a log, "loop.log.bz2", and shortened to have only the relevent
> information after the CMS32_MUTEX is created.  Look for occurances of
>  "select()" -- I think the second one is where it starts.  It's on my ftp if 
> anyone wants to take a look.  It probably can be compared to the working-
> version log where this doesn't occur, but it might be a pain to spot the 
> working particular instance.
> 
> 


Actually it's pretty obvious.  In the working version, it's supposed to be
getting SIGTRAPs while it's calling select().  So something's up there.  But
it's doing part of what it should be doing now.





Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Jesse Allen
On Sun, Nov 21, 2004 at 10:23:41PM -0800, Linus Torvalds wrote:
> 
> Ok, how about this patch?
> 
> It does basically two things:
> 
>  - it makes the x86 version of ptrace be a lot more careful about the TF 
>bit in eflags, and in particular it never touches it _unless_ the 
>tracer has explicitly asked for it (ie we set TF only when doing a
>PTRACE_SINGESTEP, and we clear it only when it has been set by us, not 
>if it has been set by the program itself).
> 
>This patch also cleans up the codepaths by doing all the common stuff
>in set_singlestep()/clear_singlestep().
> 
>  - It clarifies signal handling, and makes it clear that we always push 
>the full eflags onto the signal stack, _except_ if the TF bit was set 
>by an external ptrace user, in which case we hide it so that the tracee 
>doesn't see it when it looks at its stack contents.
> 
>It also adds a few comments, and makes it clear that the signal handler
>itself is always set up with TF _clear_. But if we were single-stepped 
>into it, we will have notified the debugger, so the debugger obviously 
>can (and often will) decide to continue single-stepping.
> 
> IMHO, this is a nice cleanup, and it also means that I can actually debug 
> my "program from hell":
> 
>   [EMAIL PROTECTED] ~]$ gdb ./a.out 
>   GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
>   Copyright 2004 Free Software Foundation, Inc.
>   GDB is free software, covered by the GNU General Public License, and 
> you are
>   welcome to change it and/or distribute copies of it under certain 
> conditions.
>   Type "show copying" to see the conditions.
>   There is absolutely no warranty for GDB.  Type "show warranty" for 
> details.
>   This GDB was configured as "i386-redhat-linux-gnu"...(no debugging 
> symbols 
>   found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
> 
>   (gdb) run
>   Starting program: /home/torvalds/a.out 
>   Reading symbols from shared object read from target memory...(no 
> debugging 
>   symbols found)...done.
>   Loaded system supplied DSO at 0xe000
>   (no debugging symbols found)...(no debugging symbols found)...
>   Program received signal SIGTRAP, Trace/breakpoint trap.
>   0x08048480 in main ()
>   (gdb) signal SIGTRAP
>   Continuing with signal SIGTRAP.
> 
>   Program received signal SIGTRAP, Trace/breakpoint trap.
>   0x08048487 in main ()
>   (gdb) signal SIGTRAP
>   Continuing with signal SIGTRAP.
> 
>   Program received signal SIGTRAP, Trace/breakpoint trap.
>   0x08048488 in smc ()
>   (gdb) signal SIGTRAP
>   Continuing with signal SIGTRAP.
>   Copy protected: ok
> 
>   Program exited with code 01.
>   (gdb) 
> 
> which I think is a sign that this patch actually fixes ptrace.
> 
> Does this help with wine? I dunno. Maybe some wine people can comment..
> 


For the wine app in question, it does make a difference.  However, there is 
a new problem.  The program gets stuck at the loading screen at 100% CPU.
It's making a call to select, timing out, and then tries select again, 
repeats.  It's waiting for something that seems to never happen.

I've captured a log, "loop.log.bz2", and shortened to have only the relevent
information after the CMS32_MUTEX is created.  Look for occurances of
 "select()" -- I think the second one is where it starts.  It's on my ftp if 
anyone wants to take a look.  It probably can be compared to the working-
version log where this doesn't occur, but it might be a pain to spot the 
working particular instance.


Jesse



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Mike Hearn
On Mon, 2004-11-22 at 13:10 -0800, Linus Torvalds wrote:
> I actually broke down and am downloading the latest source tree of wine,
> let's see if I can find the place you do this.

The Wine source tree is organised in the same way Windows is, which is
bizarre and unintuitive but we don't really have much choice. So here
are the files you'd be looking for.

this is where signals are converted to SEH exceptions (w-exceptions as
Eric called them):

  dlls/ntdll/signal_i386.c 

this is where the wineserver does context related things:

  server/context_i386.c

this is where the server does ptracing:

  server/ptrace.c

There may be one or two other places that are related, I only ever
looked at this code to deal with some other copy protection system that
wasn't happy (not kernels fault though).

thanks -mike

-- 
Mike Hearn <[EMAIL PROTECTED]>
Codeweavers, Inc




Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Eric Pouech
Jesse Allen a écrit :
On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
I'm getting the feeling that the question of whether to step into
signal handlers is orthogonal to single-stepping; maybe it should be a
separate ptrace operation.
I really don't see why. If a controlling process is asking for 
single-stepping, then it damn well should get it. It it doesn't want to 
single-step through a signal handler, then it could decide to just put a 
breakpoint on the return point (possibly by modifying the signal handler 
save area).

It's not like single-stepping into the signal handler in any way removes 
any information (while _not_ single-stepping into it clearly does).

With the patch I just posted (assuming it works for people), Wine should 
at least have the choice. The behaviour now should be:

- if the app sets TF on its own, it will cause a SIGTRAP which it can 
  catch.
- if the debugger sets TF with SINGLESTEP, it will single-step into a 
  signal handler.
- it the app sets TF _and_ you ptrace it, you the ptracer will see the 
  debug event and catch it. However, doing a "continue" at that point
  will remove the TF flag (and always has), the app will normally then
  never see the trap. You can do a "signal SIGTRAP" to actually force
  the trap handler to tun, but that one won't actually single-step (it's 
  a "continue" in all other senses).

It sounds like the third case is what wine wants.
		Linus

So an app running through wine could set TF on it's own?  It would be a 
good idea to find out what it is doing in the first place.  There has to be
a reason why War3 is so picky after the original change was introduced and
a reason why the latest changes don't seem to fix it. 

I've built a kernel 2.6.10-rc2 with the new ptrace patch.  The program still
says "please insert disc".  I haven't been able to get winedbg to tell me 
anything useful -- the program never crashes anyways.  So I went ahead and I 
captured a debug log.

the command:
WINEDEBUG=+all wine war3/Warcraft\ III.exe -opengl >& war3_and_2.6.10-rc2.log
I scanned for the part right before it calls up to display the error.  I found
that after loading advapi32.dll, the thread 000c creates a mutex and wakes up
000a.  Then 000c gets killed and right after that starts calling up the 
resources for the "insert disc" message box.  I put the log up on my ftp, and 
the interesting part in a seperate file:
ftp://resnet.dnip.net/

Any clue on what may be happening here?  Or maybe another idea on where else to 
search?
Jesse

For the linux folks, here a small comparison of what happens in the working 
(old) case and in the non-working (new) case:

In both case
- Wine gets a first SIGTRAP (in it's sig_trap handler)
+ Wine converts it into a Windows exception (w-exception in short).
  This includes creating a context for the generic CPU registers
+ This w-exception is sent to the w-exception handler the program
  installed (this one can modifiy the all registers)
o this handler touches one of the i386 debug registers
+ as we need to update the debug registers values (and we don't do in
  the signal handler return), this task is delegated to the Wine server
  (our central process, which is in charge of synchronisation...)
> the Wine server ptrace-attach:es to the process which handled
  the SIGTRAP.
> Wine server wait4:s on the SIGSTOP (after ptrace:attach)
> modify (with ptrace) the debug registers
> and resumes excution (ptrace: cont)
+ wine terminates the sig trap handler and resumes the execution with
  the modified basic registers (from the saved context), and the
  modified debug registers (from the Wine server round trip)
- a second sig trap is generated
> since the wine server is still ptrace:attached, it gets the signal.   
 
What differs then in both execution:
- in the working case, the sig trap handler is called on the client side, 
whereas it's never called in the non-working case. We do have a couple of 
protection (to avoid some misbehaving apps), but none of them get triggered. So 
it seems like the trap handler is not called (ugh).

A couple of notes:
- as the program tested is copy protected, and as it seems that the copy 
protection is what causes the harm, it can be interesting to know that the 
programe seems to set the TF flag (some copy protection schemes directly do an 
"int 1", but given the exception code we get, this isn't the case).
- in Windows trap handling, the TF is explictly reset before calling the windows 
exception handler (which is what Wine does, before calling the window exception 
handler). Of course the handler can set it back if it wants to continue single 
stepping.

HTH
A+



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Mon, 22 Nov 2004, Andreas Schwab wrote:

> Linus Torvalds <[EMAIL PROTECTED]> writes:
> 
> > IMHO, this is a nice cleanup, and it also means that I can actually debug 
> > my "program from hell":
> 
> Does it also work when trying to single step over it?  I guess all bets
> are off then.

If you single-step over the "popfl", then you need to generate the
SIGTRAP's by hand too. IOW, it's _possible_ to emulate the behaviour from
within the debugger, but it gets really really nasty very quickly.

I think the nastyness in that case is at least acceptable, since if you 
single-step, you actually _see_ what is happening, and thus you have a 
chance in hell of figuring it out. Practical? No. But debuggable at least 
in theory, which it really wasn't before.

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Davide Libenzi
On Sun, 21 Nov 2004, Linus Torvalds wrote:

> Ok, how about this patch?
> 
> It does basically two things:
> 
>  - it makes the x86 version of ptrace be a lot more careful about the TF 
>bit in eflags, and in particular it never touches it _unless_ the 
>tracer has explicitly asked for it (ie we set TF only when doing a
>PTRACE_SINGESTEP, and we clear it only when it has been set by us, not 
>if it has been set by the program itself).
> 
>This patch also cleans up the codepaths by doing all the common stuff
>in set_singlestep()/clear_singlestep().
> 
>  - It clarifies signal handling, and makes it clear that we always push 
>the full eflags onto the signal stack, _except_ if the TF bit was set 
>by an external ptrace user, in which case we hide it so that the tracee 
>doesn't see it when it looks at its stack contents.
> 
>It also adds a few comments, and makes it clear that the signal handler
>itself is always set up with TF _clear_. But if we were single-stepped 
>into it, we will have notified the debugger, so the debugger obviously 
>can (and often will) decide to continue single-stepping.

Looks like a nice cleanup. What does the test program below print for you?



- Davide



#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 


int main(int ac, char **av) {
int i, status, res;
long start, end;
pid_t cpid, pid;
struct user_regs_struct ur;
struct sigaction sa;

sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = SIG_DFL;
sigaction(SIGCHLD, &sa, NULL);

printf("nargs=%d\n", ac);
if (ac == 1)
goto tracer;

printf("arg=%s\n", av[1]);
loop:
__asm__ volatile ("int $0x80"
  : "=a" (res)
  : "0" (__NR_getpid));
goto loop;
endloop:
exit(0);


tracer:
if ((cpid = fork()) != 0)
goto parent;

printf("child=%d\n", getpid());
ptrace(PTRACE_TRACEME, 0, NULL, NULL);

execl(av[0], av[0], "child", NULL);

exit(0);

parent:
start = (long) &&loop;
end = (long) &&endloop;

printf("pchild=%d\n", cpid);

for (;;) {
pid = wait(&status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
   pid, pid);
return 1;
}

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 
0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}

if (ur.eip >= start && ur.eip <= end)
break;
}


for (i = 0; i < 15; i++) {
printf("waiting ...\n");
pid = wait(&status);
printf("done: pid=%d  status=%d\n", pid, status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
printf("sig=%d\n", res);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
   pid, pid);
return 1;
}

printf("EIP=0x%08x\n", ur.eip);

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 
0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}
}

if (ptrace(PTRACE_CONT, cpid, NULL, SIGKILL)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}

return 0;
}





Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Sun, 21 Nov 2004, Davide Libenzi wrote:
> 
> I'd agree with Linus here. A signal handler is part of the application, so 
> it should be single stepped in the same way other application code does. 
> My original patch simply reenabled the flag before returning to userspace, 
> and this had the consequence to single step into signal handlers too.

Hmmm.. I think I may have a test-case for the problem.

Lookie here:

#include 
#include 

void function(void)
{
printf("Copy protected: ok\n");
}

void handler(int signo)
{
extern char smc;
smc++;
}

#define TF 0x100

int main(int argc, char **argv)
{
void (*fnp)(void);

signal(SIGTRAP, handler);
mprotect((void *)(0xf000 & (unsigned long)main), 4096, 
PROT_READ | PROT_WRITE);
asm volatile("pushfl ; orl %0,(%%esp) ; popfl"
: :"i" (TF):"memory");
asm volatile("pushfl ; andl %0,(%%esp) ; popfl"
: :"i" (~TF):"memory"); 
asm volatile("\nsmc:\n\t"
".byte 0xb7\n\t"
".long function"
:"=d" (fnp));
fnp();
exit(1);
}

Compile it, run it, and it should say

Copy protected: ok

Now, try to "strace" it, or debug it with gdb, and see if you can repeat 
the behaviour.

Roland? Think of it as a challenge,

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Sun, 21 Nov 2004, Linus Torvalds wrote:
> 
> I'm by no means 100% sure that we should encourage the kind of programming 
> "skills" I showed with that example program, so in that sense this may not 
> be worth worrying about. That said, I do hate the notion of having 
> programs that are basically undebuggable, so from a QoI standpoint I'd 
> really like to say that you can run my horrid little program under the 
> debugger and see it work...

Ok, how about this patch?

It does basically two things:

 - it makes the x86 version of ptrace be a lot more careful about the TF 
   bit in eflags, and in particular it never touches it _unless_ the 
   tracer has explicitly asked for it (ie we set TF only when doing a
   PTRACE_SINGESTEP, and we clear it only when it has been set by us, not 
   if it has been set by the program itself).

   This patch also cleans up the codepaths by doing all the common stuff
   in set_singlestep()/clear_singlestep().

 - It clarifies signal handling, and makes it clear that we always push 
   the full eflags onto the signal stack, _except_ if the TF bit was set 
   by an external ptrace user, in which case we hide it so that the tracee 
   doesn't see it when it looks at its stack contents.

   It also adds a few comments, and makes it clear that the signal handler
   itself is always set up with TF _clear_. But if we were single-stepped 
   into it, we will have notified the debugger, so the debugger obviously 
   can (and often will) decide to continue single-stepping.

IMHO, this is a nice cleanup, and it also means that I can actually debug 
my "program from hell":

[EMAIL PROTECTED] ~]$ gdb ./a.out 
GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and 
you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for 
details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging 
symbols 
found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) run
Starting program: /home/torvalds/a.out 
Reading symbols from shared object read from target memory...(no 
debugging 
symbols found)...done.
Loaded system supplied DSO at 0xe000
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048480 in main ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048487 in main ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048488 in smc ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.
Copy protected: ok

Program exited with code 01.
(gdb) 

which I think is a sign that this patch actually fixes ptrace.

Does this help with wine? I dunno. Maybe some wine people can comment..

Roland, mind take a look? I assume you have some gdb test-suite that you 
use to test the things?

Linus



= arch/i386/kernel/ptrace.c 1.27 vs edited =
--- 1.27/arch/i386/kernel/ptrace.c  2004-11-07 18:10:34 -08:00
+++ edited/arch/i386/kernel/ptrace.c2004-11-21 21:34:58 -08:00
@@ -138,6 +138,28 @@
return retval;
 }
 
+static void set_singlestep(struct task_struct *child)
+{
+   long eflags;
+
+   set_tsk_thread_flag(child, TIF_SINGLESTEP);
+   eflags = get_stack_long(child, EFL_OFFSET);
+   put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
+   child->ptrace |= PT_DTRACE;
+}
+
+static void clear_singlestep(struct task_struct *child)
+{
+   if (child->ptrace & PT_DTRACE) {
+   long eflags;
+
+   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+   eflags = get_stack_long(child, EFL_OFFSET);
+   put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
+   child->ptrace &= ~PT_DTRACE;
+   }
+}
+
 /*
  * Called by kernel/ptrace.c when detaching..
  *
@@ -145,11 +167,7 @@
  */
 void ptrace_disable(struct task_struct *child)
 { 
-   long tmp;
-
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
-   tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
-   put_stack_long(child, EFL_OFFSET, tmp);
+   clear_singlestep(child);
 }
 
 /*
@@ -388,10 +406,8 @@
  }
  break;
 
-   case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall 
*/
-   case PTRACE_CONT: { /* restart after signal. */
-   long tmp;
-
+   case PTRACE_SYSCALL:/* continue and stop at next (return from) 
syscall */
+   case

Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Andreas Schwab
Linus Torvalds <[EMAIL PROTECTED]> writes:

> IMHO, this is a nice cleanup, and it also means that I can actually debug 
> my "program from hell":

Does it also work when trying to single step over it?  I guess all bets
are off then.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Davide Libenzi
On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
> 
> On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
> > 
> > I'm getting the feeling that the question of whether to step into
> > signal handlers is orthogonal to single-stepping; maybe it should be a
> > separate ptrace operation.
> 
> I really don't see why. If a controlling process is asking for 
> single-stepping, then it damn well should get it. It it doesn't want to 
> single-step through a signal handler, then it could decide to just put a 
> breakpoint on the return point (possibly by modifying the signal handler 
> save area).

I'd agree with Linus here. A signal handler is part of the application, so 
it should be single stepped in the same way other application code does. 
My original patch simply reenabled the flag before returning to userspace, 
and this had the consequence to single step into signal handlers too.



PS: I still cannot find the beginning this thread on lkml.


- Davide




Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Davide Libenzi
On Sun, 21 Nov 2004, Linus Torvalds wrote:

> On Mon, 22 Nov 2004, Andreas Schwab wrote:
> >
> > Linus Torvalds <[EMAIL PROTECTED]> writes:
> > 
> > > Now, try to "strace" it, or debug it with gdb, and see if you can repeat 
> > > the behaviour.
> > 
> > You'll always have hard time repeating that under strace or gdb, since a
> > debugger uses SIGTRAP for it's own purpose and does not pass it to the
> > program.
> 
> Sure. But "hard time" and "impossible" are two different things. 
> 
> It _should_ be perfectly easy to debug this, by using
> 
>   signal SIGTRAP
> 
> instead of "continue" when you get a SIGTRAP that wasn't due to anything 
> you did.
> 
> But try it. It doesn't work. Why? Because the kernel will have cleared TF 
> on the signal stack, so even when you do a "signal SIGTRAP", it will only 
> run the trap handler _once_, even though it should run it three times 
> (once for each instruction in between the "popfl"s.
> 
> I do think this is actually a bug, although whether it's the bug that 
> causes problems for Wine is not clear at all. I'mm too lazy to build 
> and boot an older kernel, but I bet that on an older kernel you _can_ do 
> "signal SIGTRAP" three times, and it will work correctly. That would 
> indeed make this a regression.

Hmmm ..., this is 2.4.27:


[EMAIL PROTECTED] davide]$ gcc -g -o  .c 
[EMAIL PROTECTED] davide]$ ./ 
Copy protected: ok
[EMAIL PROTECTED] davide]$ gdb ./
GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".

(gdb) r
Starting program: /home/davide/ 

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048454 in main (argc=1, argv=0xb9c4) at .c:26
26  asm volatile("pushfl ; andl %0,(%%esp) ; popfl"
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGSEGV, Segmentation fault.
0x0003 in ?? ()
(gdb) bt
#0  0x0003 in ?? ()
#1  0x0804846b in smc () at .c:32
#2  0x46649b7f in __libc_start_main () from /lib/i686/libc.so.6
#3  0x08048359 in _start ()


So it seems it did not work even before, the gdb-SIGTRAP stepping. In 
2.6.8 I get a straight segfault just for running it.



- Davide




Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Andreas Schwab
Linus Torvalds <[EMAIL PROTECTED]> writes:

> Now, try to "strace" it, or debug it with gdb, and see if you can repeat 
> the behaviour.

You'll always have hard time repeating that under strace or gdb, since a
debugger uses SIGTRAP for it's own purpose and does not pass it to the
program.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Sun, 21 Nov 2004, Davide Libenzi wrote:
> 
> You know you're sick, don't you? Making traps inc's to get you in the 
> correct opcode to move function in edx->fnp, is indeed fruit of a sick 
> mind :=)

I prefer "creative" over "sick". It's so much less judgemental.

I thought it was a fun way to illustrate the issue, and I'm sure somebody 
had fun trying to figure out what the thing did.

I could have _obfuscated_ the thing if I had wanted to make it hard to 
follow, but instead I tried to make it as obvious as possible so that it's 
quite clear from reading the code what it actually does.

But imagine hitting that thing without seeing the source code. NOT a lot 
of fun to debug, even if the debugger _worked_ on it.

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Sun, 21 Nov 2004, Davide Libenzi wrote:
> 
> So it seems it did not work even before, the gdb-SIGTRAP stepping. In 
> 2.6.8 I get a straight segfault just for running it.

Ok, that at least means it's not a regression, although it may be that the 
altered behaviour is enough to make some program work/not-work depending 
on exactly what it is testing. My example is certainly not the only way to 
try to mess up a debugger.

I'm by no means 100% sure that we should encourage the kind of programming 
"skills" I showed with that example program, so in that sense this may not 
be worth worrying about. That said, I do hate the notion of having 
programs that are basically undebuggable, so from a QoI standpoint I'd 
really like to say that you can run my horrid little program under the 
debugger and see it work...

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Davide Libenzi
On Sun, 21 Nov 2004, Linus Torvalds wrote:

>   void handler(int signo)
>   {
>   extern char smc;
>   smc++;
>   }
> 
>   asm volatile("\nsmc:\n\t"
>   ".byte 0xb7\n\t"
>   ".long function"
>   :"=d" (fnp));
>   fnp();

You know you're sick, don't you? Making traps inc's to get you in the 
correct opcode to move function in edx->fnp, is indeed fruit of a sick 
mind :=)



- Davide




Re: ptrace single-stepping change breaks Wine

2004-11-22 Thread Linus Torvalds


On Mon, 22 Nov 2004, Andreas Schwab wrote:
>
> Linus Torvalds <[EMAIL PROTECTED]> writes:
> 
> > Now, try to "strace" it, or debug it with gdb, and see if you can repeat 
> > the behaviour.
> 
> You'll always have hard time repeating that under strace or gdb, since a
> debugger uses SIGTRAP for it's own purpose and does not pass it to the
> program.

Sure. But "hard time" and "impossible" are two different things. 

It _should_ be perfectly easy to debug this, by using

signal SIGTRAP

instead of "continue" when you get a SIGTRAP that wasn't due to anything 
you did.

But try it. It doesn't work. Why? Because the kernel will have cleared TF 
on the signal stack, so even when you do a "signal SIGTRAP", it will only 
run the trap handler _once_, even though it should run it three times 
(once for each instruction in between the "popfl"s.

I do think this is actually a bug, although whether it's the bug that 
causes problems for Wine is not clear at all. I'mm too lazy to build 
and boot an older kernel, but I bet that on an older kernel you _can_ do 
"signal SIGTRAP" three times, and it will work correctly. That would 
indeed make this a regression.

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Jesse Allen
On Sat, Nov 20, 2004 at 02:49:15PM -0700, Jesse Allen wrote:
> the interesting part in a seperate file:
> ftp://resnet.dnip.net/
> 

I took another look at the section I found.  It doesn't describe much, but it
shows "000c: *signal* signal=5" for example, which are probably SIGTRAP's.

I decided to capture a log running under a kernel before the change, and see 
if I could find the same section again, based on the mutex name.  Well I did, 
and found alot more tracing.  The thread 000c didn't get killed either so it
shows something is different.  Of course under the old kernels I don't get the
"insert disc" message.  I put up the working version log on the ftp.


Jesse




Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Jesse Allen
On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
> 
> 
> On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
> > 
> > I'm getting the feeling that the question of whether to step into
> > signal handlers is orthogonal to single-stepping; maybe it should be a
> > separate ptrace operation.
> 
> I really don't see why. If a controlling process is asking for 
> single-stepping, then it damn well should get it. It it doesn't want to 
> single-step through a signal handler, then it could decide to just put a 
> breakpoint on the return point (possibly by modifying the signal handler 
> save area).
> 
> It's not like single-stepping into the signal handler in any way removes 
> any information (while _not_ single-stepping into it clearly does).
> 
> With the patch I just posted (assuming it works for people), Wine should 
> at least have the choice. The behaviour now should be:
> 
>  - if the app sets TF on its own, it will cause a SIGTRAP which it can 
>catch.
>  - if the debugger sets TF with SINGLESTEP, it will single-step into a 
>signal handler.
>  - it the app sets TF _and_ you ptrace it, you the ptracer will see the 
>debug event and catch it. However, doing a "continue" at that point
>will remove the TF flag (and always has), the app will normally then
>never see the trap. You can do a "signal SIGTRAP" to actually force
>the trap handler to tun, but that one won't actually single-step (it's 
>a "continue" in all other senses).
> 
> It sounds like the third case is what wine wants.
> 
>   Linus


So an app running through wine could set TF on it's own?  It would be a 
good idea to find out what it is doing in the first place.  There has to be
a reason why War3 is so picky after the original change was introduced and
a reason why the latest changes don't seem to fix it. 

I've built a kernel 2.6.10-rc2 with the new ptrace patch.  The program still
says "please insert disc".  I haven't been able to get winedbg to tell me 
anything useful -- the program never crashes anyways.  So I went ahead and I 
captured a debug log.

the command:
WINEDEBUG=+all wine war3/Warcraft\ III.exe -opengl >& war3_and_2.6.10-rc2.log

I scanned for the part right before it calls up to display the error.  I found
that after loading advapi32.dll, the thread 000c creates a mutex and wakes up
000a.  Then 000c gets killed and right after that starts calling up the 
resources for the "insert disc" message box.  I put the log up on my ftp, and 
the interesting part in a seperate file:
ftp://resnet.dnip.net/

Any clue on what may be happening here?  Or maybe another idea on where else to 
search?


Jesse




Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Linus Torvalds


On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
> 
> I'm getting the feeling that the question of whether to step into
> signal handlers is orthogonal to single-stepping; maybe it should be a
> separate ptrace operation.

I really don't see why. If a controlling process is asking for 
single-stepping, then it damn well should get it. It it doesn't want to 
single-step through a signal handler, then it could decide to just put a 
breakpoint on the return point (possibly by modifying the signal handler 
save area).

It's not like single-stepping into the signal handler in any way removes 
any information (while _not_ single-stepping into it clearly does).

With the patch I just posted (assuming it works for people), Wine should 
at least have the choice. The behaviour now should be:

 - if the app sets TF on its own, it will cause a SIGTRAP which it can 
   catch.
 - if the debugger sets TF with SINGLESTEP, it will single-step into a 
   signal handler.
 - it the app sets TF _and_ you ptrace it, you the ptracer will see the 
   debug event and catch it. However, doing a "continue" at that point
   will remove the TF flag (and always has), the app will normally then
   never see the trap. You can do a "signal SIGTRAP" to actually force
   the trap handler to tun, but that one won't actually single-step (it's 
   a "continue" in all other senses).

It sounds like the third case is what wine wants.

Linus



Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Roland McGrath
> I'm getting the feeling that the question of whether to step into
> signal handlers is orthogonal to single-stepping; 

No, it's not.  You only ever step into a handler when you ask to.
That's already the interface.

> Platforms which don't implement PTRACE_SINGLESTEP would probably
> appreciate this.  A "single step" which stops you after setting up the
> signal trampoline and adjusting the PC, before executing any
> instructions in the handler.

That's what PTRACE_SINGLESTEP with a nonzero signal number does since it
was fixed.



Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Daniel Jacobowitz
On Fri, Nov 19, 2004 at 09:41:44PM +0100, Eric Pouech wrote:
> >Btw, does wine ever _use_ PTRACE_SINGLESTEP for any of the things it does?
> >
> >If it does, then that woulc certainly explain why my "fix" made no 
> >difference: my fix _only_ handles the case where the ptracer never 
> >actually asks for single-stepping, and single-stepping was started 
> >entirely by the program being run (ie by setting TF in eflags from within 
> >the program itself).
> >
> >But if wine ends up using PTRACE_SINGESTEP because wine actually wants to 
> >single-step over some instructions, then the kernel will set the PT_DTRACE 
> >bit, and start tracing through signal handlers too. The way Wine doesn't 
> >want..
> 
> wine mixes both approches, we have (to control what's generated inside the 
> various exception) to ptrace from our NT-kernel-like process (the ptracer) 
> to get the context of the exception. Restart from the ptracer is done with 
> PTRACE_SINGLESTEP.

I'm getting the feeling that the question of whether to step into
signal handlers is orthogonal to single-stepping; maybe it should be a
separate ptrace operation.

Platforms which don't implement PTRACE_SINGLESTEP would probably
appreciate this.  A "single step" which stops you after setting up the
signal trampoline and adjusting the PC, before executing any
instructions in the handler.

-- 
Daniel Jacobowitz



Re: ptrace single-stepping change breaks Wine

2004-11-20 Thread Linus Torvalds


On Fri, 19 Nov 2004, Eric Pouech wrote:
> 
> wine mixes both approches, we have (to control what's generated inside the 
> various exception) to ptrace from our NT-kernel-like process (the ptracer) to 
> get the context of the exception. Restart from the ptracer is done with 
> PTRACE_SINGLESTEP.

Here's a new patch to try. Totally untested. 

It is more careful about clearing PT_DTRACED (which by now should probably
be renamed PT_PRACE_SINGLESTEP or something on x86, since we should never
be lazy about this thing any more), and it may or may not help.

Pls test _together_ with the previous patch (which is already applied in 
the current top-of-tree for anybody with really recent kernels).

Linus

-
= arch/i386/kernel/ptrace.c 1.27 vs edited =
--- 1.27/arch/i386/kernel/ptrace.c  2004-11-07 18:10:34 -08:00
+++ edited/arch/i386/kernel/ptrace.c2004-11-19 13:18:56 -08:00
@@ -138,6 +138,26 @@
return retval;
 }
 
+static void set_singlestep(struct task_struct *child)
+{
+   long eflags;
+
+   set_tsk_thread_flag(child, TIF_SINGLESTEP);
+   eflags = get_stack_long(child, EFL_OFFSET);
+   put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
+   child->ptrace |= PT_DTRACE;
+}
+
+static void clear_singlestep(struct task_struct *child)
+{
+   long eflags;
+
+   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+   eflags = get_stack_long(child, EFL_OFFSET);
+   put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
+   child->ptrace &= ~PT_DTRACE;
+}
+
 /*
  * Called by kernel/ptrace.c when detaching..
  *
@@ -145,11 +165,7 @@
  */
 void ptrace_disable(struct task_struct *child)
 { 
-   long tmp;
-
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
-   tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
-   put_stack_long(child, EFL_OFFSET, tmp);
+   clear_singlestep(child);
 }
 
 /*
@@ -388,10 +404,8 @@
  }
  break;
 
-   case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall 
*/
-   case PTRACE_CONT: { /* restart after signal. */
-   long tmp;
-
+   case PTRACE_SYSCALL:/* continue and stop at next (return from) 
syscall */
+   case PTRACE_CONT:   /* restart after signal. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
@@ -401,56 +415,39 @@
else {
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
}
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
child->exit_code = data;
-   /* make sure the single step bit is not set. */
-   tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
-   put_stack_long(child, EFL_OFFSET,tmp);
+   /* make sure the single step bit is not set. */
+   clear_singlestep(child);
wake_up_process(child);
ret = 0;
break;
-   }
 
 /*
  * make the child exit.  Best I can do is send it a sigkill. 
  * perhaps it should be put in the status that it wants to 
  * exit.
  */
-   case PTRACE_KILL: {
-   long tmp;
-
+   case PTRACE_KILL:
ret = 0;
if (child->exit_state == EXIT_ZOMBIE)   /* already dead */
break;
child->exit_code = SIGKILL;
-   clear_tsk_thread_flag(child, TIF_SINGLESTEP);
/* make sure the single step bit is not set. */
-   tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
-   put_stack_long(child, EFL_OFFSET, tmp);
+   clear_singlestep(child);
wake_up_process(child);
break;
-   }
-
-   case PTRACE_SINGLESTEP: {  /* set the trap flag. */
-   long tmp;
 
+   case PTRACE_SINGLESTEP: /* set the trap flag. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
-   if ((child->ptrace & PT_DTRACE) == 0) {
-   /* Spurious delayed TF traps may occur */
-   child->ptrace |= PT_DTRACE;
-   }
-   tmp = get_stack_long(child, EFL_OFFSET) | TRAP_FLAG;
-   put_stack_long(child, EFL_OFFSET, tmp);
-   set_tsk_thread_flag(child, TIF_SINGLESTEP);
+   set_singlestep(child);
child->exit_code = data;
/* give it a chance to run. */
wake_up_process(child);
ret = 0;
break;
-   }
 
case PTRACE_DETACH:
/* detach a process that was attached. */



Re: ptrace single-stepping change breaks Wine

2004-11-19 Thread Eric Pouech
Btw, does wine ever _use_ PTRACE_SINGLESTEP for any of the things it does?
If it does, then that woulc certainly explain why my "fix" made no 
difference: my fix _only_ handles the case where the ptracer never 
actually asks for single-stepping, and single-stepping was started 
entirely by the program being run (ie by setting TF in eflags from within 
the program itself).

But if wine ends up using PTRACE_SINGESTEP because wine actually wants to 
single-step over some instructions, then the kernel will set the PT_DTRACE 
bit, and start tracing through signal handlers too. The way Wine doesn't 
want..
wine mixes both approches, we have (to control what's generated inside the 
various exception) to ptrace from our NT-kernel-like process (the ptracer) to 
get the context of the exception. Restart from the ptracer is done with 
PTRACE_SINGLESTEP.

(BTW: I also CC:ed wine-devel ML, that might be of interest to them too)
A+