Re: gdbstub initial code, v5
On 08/23, Oleg Nesterov wrote:
>
> However. I spent all Monday trying to resolve the new bug, and so far
> I do not understand what happens. Extremely hard to reproduce, and the
> kernel just hangs silently, without any message. So far I suspect the
> problem in utrace.c, but this time I am not sure.

Solved. This was a scheduler bug fixed in 2.6.35, but I used 2.6.34.

This is really funny. This bug (PF_STARTING lockup) was found and fixed by me and Peter.

Oh. But I hit yet another problem, BUG_ON() in __utrace_engine_release(). Again, it is not reproducible, I saw it only once in dmesg and I do not even know for sure what I was doing.

I'll continue tomorrow, but if I am not able to resolve this problem quickly I am going to ignore it for now. This time I think ugdb is wrong.

Oleg.
Re: gdbstub initial code, v5
> When the main thread exits, gdbserver still exposes it to gdb as a
> running process. It is visible via info threads, you can switch to
> this thread, $Tp or $Hx result in OK as if this thread is alive.
> gdbserver even pretends that $vCont;x:DEAD_THREAD works, although
> this thread obviously can never report anything.

This is sort of consistent with the kernel treatment. The main thread stays around as a zombie, acting as a moniker for the whole process. But indeed that is not actually useful for any thread-granularity control or information (well, there is the dead thread's usage stats, but that's all).

> I don't think this is really right. This just confuses the user, and
> imho this should be considered a minor bug.

I tend to agree, but don't think it's a big issue either way, really.

> ugdb doesn't do this. If the main thread exits - it exits like any
> other thread. I played with gdb, it seems to handle this case fine.

Sounds good to me!

> - The exit code (Wxx) can be wrong in mt-case.
>
>   The problem is, ->report_death can't safely access
>   ->group_exit_code with kernel < 2.6.35. This is solvable.

Don't even worry about it. If there is something trivial to do that makes it better for earlier kernels, then go ahead. But if the easy thing to do gives correct results on >= 2.6.35 and racily wrong or random results on older kernels, then we can just live with that.

Thanks,
Roland
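
For readers not familiar with the packets mentioned above, here is a minimal sketch of the behaviour being discussed, written in Python only for illustration. It is not ugdb or gdbserver code. The packet framing ($payload#checksum, where the checksum is the sum of the payload bytes modulo 256 as two hex digits) is the standard gdb remote protocol framing. Everything else is an assumption made for the example: the names rsp_frame and ToyStub are hypothetical, the error number E01 is arbitrary, and thread ids are plain hex with no p<pid>.<tid> multiprocess syntax.

```python
def rsp_frame(payload: str) -> str:
    """Wrap a payload as $<payload>#<checksum>, checksum = sum of bytes mod 256."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


class ToyStub:
    """Hypothetical stub that only remembers which thread ids are still alive."""

    def __init__(self, tids):
        self.live = set(tids)

    def thread_exit(self, tid: int) -> None:
        # The main thread gets no special treatment: once it exits,
        # it is forgotten like any other thread.
        self.live.discard(tid)

    def handle_T(self, tid_hex: str) -> str:
        # 'T<tid>' asks whether a thread is alive. Answer OK only if it is;
        # E01 is an arbitrary error number chosen for this sketch.
        tid = int(tid_hex, 16)
        return rsp_frame("OK" if tid in self.live else "E01")
```

With this model, a T query for an exited main thread gets an error reply instead of OK, and a process exit with status 0 would be reported as rsp_frame("W00").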
Re: gdbstub initial code, v5
Just a small report to explain what I am doing...

On 08/20, Oleg Nesterov wrote:
>
> - I forgot to implement the attach to the thread group with the dead
>   leader. Next time.

Almost done, but we should avoid the races with exec somehow. But this is minor.

I tried to test this code as much as I can. Again, I do not use gdb at all, I am using scripts which try to really stress ugdb.

Found 2 bugs in ugdb.ko, the second one is not nice but at least I have a temporary fix.

However. I spent all Monday trying to resolve the new bug, and so far I do not understand what happens. Extremely hard to reproduce, and the kernel just hangs silently, without any message. So far I suspect the problem in utrace.c, but this time I am not sure.

Will continue tomorrow...

Oleg.