Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Nix
On 14 May 2008, Jeff Dike verbalised: > On Wed, May 14, 2008 at 09:12:12PM +0100, Nix wrote: >> Oh, and, it's pedantic of me, I know, but what does this do if time goes >> backwards in the NO_HZ case? (Or is handling that a 2.6.26 thing?) > > In all cases, it holds time steady until the host catch

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Jeff Dike
On Wed, May 14, 2008 at 09:12:12PM +0100, Nix wrote: > Oh, and, it's pedantic of me, I know, but what does this do if time goes > backwards in the NO_HZ case? (Or is handling that a 2.6.26 thing?) In all cases, it holds time steady until the host catches up with what the guest thinks the time shou
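For readers following along, the behaviour described above (holding the guest's time steady until the host catches up) might be sketched roughly as below. This is not the patch from this thread: clamp_state and clamp_nsecs are hypothetical names invented for illustration, and the real code works on whatever the UML time source actually returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for this sketch: the largest timestamp handed out so far. */
struct clamp_state {
    uint64_t last_nsecs;
};

/*
 * Return a timestamp that never decreases. host_nsecs is whatever the host
 * clock currently reports. If the host has stepped backwards, keep returning
 * the previous value until the host clock passes it again.
 */
static uint64_t clamp_nsecs(struct clamp_state *state, uint64_t host_nsecs)
{
    assert(state != NULL);

    if (host_nsecs > state->last_nsecs)
        state->last_nsecs = host_nsecs;

    return state->last_nsecs;
}
```

With a zero-initialised state, feeding in 1000, then 400, then 1500 yields 1000, 1000, 1500: the backwards step is hidden from the caller, at the cost of time appearing to stand still for a while.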

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Nix
[vincent-perrier <[EMAIL PROTECTED]> removed from Cc; his MTA says `Client host rejected: AP0002 Please use your ISP mailserver' only I don't *have* an ISP mailserver.] On 14 May 2008, [EMAIL PROTECTED] verbalised: > Annoyingly, now I've upgraded the host to 2.6.25 (hence sans skas3), > timings

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Nix
On 14 May 2008, Jeff Dike verbalised: > I finally reproduced this using 2.6.25.1 (2.6.25-mm1 was no good) with > your config. YAY! (I wonder why this was so .config-dependent? You'd think it would trigger on anything, but I couldn't even make it happen on all my hosts...) > The patch below fixes

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Jeff Dike
I finally reproduced this using 2.6.25.1 (2.6.25-mm1 was no good) with your config. The patch below fixes it for me. You'll notice a certain similarity between this and a previous patch that you posted. It's not clear to me why yours didn't work. Jeff -- Work email - jdike

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread vincent-perrier
Sorry if you receive this message twice, but I sent 2 messages following each other by a few minutes, and received only the second one, this was the first one: Hello, I know that I should not work as root, but I am too used to it, so my example is done with the root user. I am not sure that the

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread vincent-perrier
I forgot to tell you, if you launch the daemon as a normal user, the RSA files of this normal user will be used, and not the root ones. And also, the daemon has to mount a file when it creates the ubdb config file, one more reason to have the root sticky bit. This tool is made for network study a

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread vincent-perrier
Hello, I know that I should not work as root, but I am too used to it, so my example is done with the root user. I am not sure that the network simulator works with another user for the version you have, but I have checked that for next one, a normal user will do as long as the daemon is owned by

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-14 Thread Jeff Dike
On Fri, May 09, 2008 at 07:13:22PM +0200, vincent-perrier wrote: > The download is long, the untar too, but the freeze is guaranteed! > > to start after download: > > "./start_clownix_net virtual_platform_configs/single_machine" > > Hope you try it! Why is it mucking with root's ssh keys: RS

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-09 Thread Nix
On 9 May 2008, [EMAIL PROTECTED] verbalised: > The download is long, the untar too, but the freeze is guaranteed! If this doesn't work I can give you an account on, hm, the box which freezes uses a UML for its network link so if you flip the time on it you'd get cut off... and on the other machine

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-09 Thread vincent-perrier
To reproduce the case where a "date --set" back 5 sec on the host freezes a UML indefinitely (in update_wall_time, kernel/time/timekeeping.c), download plug_and_play_clownix_network_03 at http://clownix.net On my host, it does get stuck, but it may also depend on the host and host kernel. I have: uml_clow

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-09 Thread Jeff Dike
On Sat, May 03, 2008 at 11:08:24AM +0100, Nix wrote: > >>From a quick look, this seems right. And this would be the gold > > standard of preventing UML from seeing time going backwards. > > Oh. And it still doesn't work. Damn. Is it possible for me to get my hands on a UML which is doing this?

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-03 Thread Nix
On 3 May 2008, Jeff Dike told this: > On Sat, May 03, 2008 at 12:21:15AM +0100, Nix wrote: >> With this patch (migrating most of the work into os_nsecs(), with a >> non-NO_HZ version doing skew computations too, atop your first patch, >> and making a couple of variables static for good measure), I

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-02 Thread Jeff Dike
On Sat, May 03, 2008 at 12:21:15AM +0100, Nix wrote: > With this patch (migrating most of the work into os_nsecs(), with a > non-NO_HZ version doing skew computations too, atop your first patch, > and making a couple of variables static for good measure), I still had > no luck: From a quick look,

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-02 Thread Nix
On 2 May 2008, Jeff Dike stated: > On Fri, May 02, 2008 at 07:55:11PM +0100, Nix wrote: >> I'm trying something else now, arranging for os_nsecs() itself to do the >> never-backwards stuff on the assumption that something depends on >> monotonic timers not skipping backwards which presently they m

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-02 Thread Jeff Dike
On Fri, May 02, 2008 at 07:55:11PM +0100, Nix wrote: > No behavioural change :( Oh well. > I'm trying something else now, arranging for os_nsecs() itself to do the > never-backwards stuff on the assumption that something depends on > monotonic timers not skipping backwards which presently they mi

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-02 Thread Nix
On 2 May 2008, Jeff Dike verbalised: > With your config, I'm seeing a hang until the system time catches up > to what UML thought it should have been in the first place. But it's > only a few seconds, not forever. This is true sometimes, but not always: I just tried twice and got a rapid recovery

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-02 Thread Jeff Dike
On Fri, May 02, 2008 at 12:34:54AM +0100, Nix wrote: > loki:/tmp# /usr/bin/ntpdate -bv hades > 30 Apr 22:45:52 ntpdate[8833]: ntpdate [EMAIL PROTECTED] Fri Feb 22 18:37:11 > UTC 2008 (1) > 30 Apr 22:45:52 ntpdate[8833]: step time server 192.168.14.18 offset > -5.728539 sec > > so in effect yes :

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-01 Thread Nix
On 1 May 2008, Jeff Dike outgrape: > On Wed, Apr 30, 2008 at 10:49:27PM +0100, Nix wrote: >> Done, and tested by stepping the time (five seconds --- five seconds per >> day! I have pendulum clocks that keep better time than that!), and, >> oops, instant loop as before, with this debugging patch ap

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-05-01 Thread Jeff Dike
On Wed, Apr 30, 2008 at 10:49:27PM +0100, Nix wrote: > Done, and tested by stepping the time (five seconds --- five seconds per > day! I have pendulum clocks that keep better time than that!), and, > oops, instant loop as before, with this debugging patch applied, which > did not fire... Hmmm. I

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-30 Thread Nix
On 28 Apr 2008, Jeff Dike told this: > On Sat, Apr 26, 2008 at 07:31:44PM +0100, Nix wrote: >> The cause of this is almost certainly time-skewing on the *host* via >> adjtimex(). I stopped ntpd and there were no problems for half a day: I >> restarted it, and as soon as ntpd had synched and begun

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-28 Thread Jeff Dike
On Sat, Apr 26, 2008 at 07:31:44PM +0100, Nix wrote: > The cause of this is almost certainly time-skewing on the *host* via > adjtimex(). I stopped ntpd and there were no problems for half a day: I > restarted it, and as soon as ntpd had synched and begun slewing the time > (within a second of slew

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-27 Thread Nix
On 26 Apr 2008, [EMAIL PROTECTED] uttered the following: > On 25 Apr 2008, [EMAIL PROTECTED] uttered the following: > >> On 25 Apr 2008, [EMAIL PROTECTED] told this: >> >>> I suspect this can go wrong anywhere, but it happens to have been a >>> CBQ-triggered gettimeofday() while sending an arp tha

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-26 Thread Nix
On 25 Apr 2008, [EMAIL PROTECTED] uttered the following: > On 25 Apr 2008, [EMAIL PROTECTED] told this: > >> I suspect this can go wrong anywhere, but it happens to have been a >> CBQ-triggered gettimeofday() while sending an arp that did it. (My ADSL >> router pretty much bombs the poor damn mach

Re: [uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-25 Thread Nix
On 25 Apr 2008, [EMAIL PROTECTED] told this: > I suspect this can go wrong anywhere, but it happens to have been a > CBQ-triggered gettimeofday() while sending an arp that did it. (My ADSL > router pretty much bombs the poor damn machine with ARP packets all the > time.) Woo, it's happening a lot

[uml-devel] actually useful backtrace from a CPU-chewing hang

2008-04-25 Thread Nix
I suspect this can go wrong anywhere, but it happens to have been a CBQ-triggered gettimeofday() while sending an arp that did it. (My ADSL router pretty much bombs the poor damn machine with ARP packets all the time.) #0 getnstimeofday (ts=0x8217d10) at include/linux/time.h:182 #1 0x080824b4 in