Hi Roger et al,

I've restarted jujud-machine-0 and rsyslog. juju status hangs (seemingly indefinitely) and I'm still seeing rapid log growth. machine-0.log shows the output below, and all-machines.log shows what looks like the same behaviour for all of the other machines.

2013-11-29 14:40:04 DEBUG juju.state open.go:88 connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused
2013-11-29 14:40:05 INFO juju runner.go:253 worker: start "api"
2013-11-29 14:40:05 INFO juju apiclient.go:106 state/api: dialing "wss://localhost:17070/"
2013-11-29 14:40:05 ERROR juju apiclient.go:111 state/api: websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2013-11-29 14:40:05 ERROR juju runner.go:211 worker: exited "api": websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2013-11-29 14:40:05 INFO juju runner.go:245 worker: restarting "api" in 3s
2013-11-29 14:40:05 DEBUG juju.state open.go:88 connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused
2013-11-29 14:40:05 DEBUG juju.state open.go:88 connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused

On 29 November 2013 12:40, Peter Waller <pe...@scraperwiki.com> wrote:

> I've replied off-list to Roger with URLs to the logs. I'm happy for them
> to be shared internally between juju developers and to share them with
> anyone who is interested.
>
>
> On 29 November 2013 12:30, roger peppe <rogpe...@gmail.com> wrote:
>
>> On 29 November 2013 11:44, Peter Waller <pe...@scraperwiki.com> wrote:
>> > The pids appear to be constant since I last reported them. Your theory about
>> > the machine being out of disk is correct.
>> >
>> > Indeed the log files are 1.4 and 1.6 GB for all-machines and machine-0.log.
>> > I'll try xz'ing them and then sending them along to you. Is it okay if I
>> > e-mail them directly to you and anyone else who is interested (please mail
>> > me personally) assuming they compress down to ~megabytes?
>>
>> Yes please. Putting them somewhere like google drive might work
>> better than sending an attachment. I imagine they'll compress
>> very nicely though.
>>
>> BTW, just removing the log files will not work, as the files will still be
>> held open. You probably want to do something like (having first removed
>> enough files elsewhere to ensure some spare disk space):
>>
>>     cd /var/log/juju
>>     mv machine-0.log old-machine-0.log
>>     restart jujud-machine-0
>>
>> That will cause the machine agent to create a new log file, leaving
>> you free to archive the old one and remove it to save space.
>>
>> You can do a similar thing with all-machines.log, except you'll need
>> to restart rsyslog.
>>
>> I hope that when you've done that, mongo will start working again
>> (you might need to restart the juju-db service). If you find that
>> you can run juju status, it would be good to know what is
>> the value of the "agent-version" field in your environment
>> config (i.e. the output of "juju get-environment | grep agent-version").
>>
>> cheers,
>>   rog.
>>
>> >
>> > On 29 November 2013 11:26, roger peppe <rogpe...@gmail.com> wrote:
>> >>
>> >> On 29 November 2013 10:26, Peter Waller <pe...@scraperwiki.com> wrote:
>> >> > On 28 November 2013 17:44, Peter Waller <pe...@scraperwiki.com> wrote:
>> >> >>
>> >> >> I'm still having the problem of it spinning, every few seconds all of the
>> >> >> machines are still spewing into the logs, despite my attempt at asking it to
>> >> >> "upgrade" to a different version.
>> >> >
>> >> >
>> >> > Last night I left it spinning absent any better ideas. It didn't seem to be
>> >> > causing any obvious harm that I could find.
>> >> >
>> >> > This morning `juju status` doesn't work and nor does juju debug-log.
>> >> >
>> >> > On the bootstrap node the following is running for `ps aux | grep juju`
>> >> >
>> >> > root 23973 0.0 0.0 4440 624 ? Ss Aug29 0:00 /bin/sh -e
>> >> > -c /var/lib/juju/tools/machine-0/jujud machine --log-file
>> >> > /var/log/juju/machine-0.log --data-dir '/var/lib/juju' --machine-id 0
>> >> > --debug >> /var/log/juju/machine-0.log 2>&1 /bin/sh
>> >> > root 23974 0.2 1.4 309440 24340 ? Sl Aug29 346:48
>> >> > /var/lib/juju/tools/machine-0/jujud machine --log-file
>> >> > /var/log/juju/machine-0.log --data-dir /var/lib/juju --machine-id 0
>> >> > --debug
>> >> >
>> >> > $ juju debug-log | grep --line-buffered -v jsoncodec
>> >> > ERROR no reachable servers
>> >> >
>> >> > $ juju status
>> >> > <hangs seemingly indefinitely>
>> >> >
>> >> > Any advice or documentation someone can point me at to get this system back
>> >> > into a working state?
>> >>
>> >> If you do the above ps, then wait 5 seconds and try again, have the
>> >> process ids changed?
>> >> If so, that means it's probably continually bouncing for some reason.
>> >>
>> >> It would be useful to see the logs, in particular
>> >> /var/log/juju/machine-0.log.
>> >> Firstly how big are they? It is possible that the machine is out of disk space
>> >> and that's caused the mongo database storage to stop working.
>> >> You could try deleting the log files (perhaps save them first for
>> >> later diagnosis).
>> >>
>> >> Secondly, perhaps you could paste somewhere the last few thousand lines
>> >> of the logs, that might give us more of an idea of what's happening
>> >> currently.
>> >>
>> >> Please feel free to ping us on IRC (#juju-dev or #juju on freenode.net; my
>> >> nickname is rogpeppe there) and we could help more directly.
>> >>
>> >> cheers,
>> >>   rog.
>> >
>> >
>> >
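
To illustrate Roger's point in the quoted thread that removing a log file does not help while the agent still holds it open, here is a minimal shell sketch. It is only an illustration under stated assumptions: a throwaway temporary file stands in for machine-0.log, file descriptor 3 stands in for the handle jujud keeps open, and the function name demo_deleted_but_open is made up for this example.

```shell
#!/bin/sh
# Sketch: a file that is deleted while a process still has it open keeps
# accepting writes, so its disk space is not released.
# "demo_deleted_but_open" is a hypothetical name used only for illustration.

demo_deleted_but_open() {
    tmp=$(mktemp) || return 1

    # Open the file for appending on descriptor 3. This stands in for the
    # handle jujud keeps on /var/log/juju/machine-0.log.
    exec 3>>"$tmp"

    # Delete the path, as "rm machine-0.log" would.
    rm -f "$tmp"

    if [ -e "$tmp" ]; then
        echo "path exists: yes"
    else
        echo "path exists: no"
    fi

    # The descriptor is still valid, so the writer carries on regardless.
    if echo "another log line" >&3; then
        echo "write after delete: ok"
    else
        echo "write after delete: failed"
    fi

    # Only closing the descriptor (for jujud, restarting the agent)
    # lets the kernel reclaim the space.
    exec 3>&-
}

demo_deleted_but_open
```

The path disappears but the write still succeeds, which is why the suggested sequence is to move the file aside and then restart jujud-machine-0 (or rsyslog for all-machines.log), so the process reopens a fresh file and the old one can be archived and removed.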

--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju