Hi Roger et al,

I've restarted jujud-machine-0 and rsyslog. juju status still hangs
(seemingly indefinitely) and the logs are still growing rapidly.
machine-0.log shows the output below, and all-machines.log shows what
looks like the same behaviour for all of the other machines.

2013-11-29 14:40:04 DEBUG juju.state open.go:88 connection failed, will
retry: dial tcp 127.0.0.1:37017: connection refused
2013-11-29 14:40:05 INFO juju runner.go:253 worker: start "api"
2013-11-29 14:40:05 INFO juju apiclient.go:106 state/api: dialing
"wss://localhost:17070/"
2013-11-29 14:40:05 ERROR juju apiclient.go:111 state/api: websocket.Dial
wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2013-11-29 14:40:05 ERROR juju runner.go:211 worker: exited "api":
websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection
refused
2013-11-29 14:40:05 INFO juju runner.go:245 worker: restarting "api" in 3s
2013-11-29 14:40:05 DEBUG juju.state open.go:88 connection failed, will
retry: dial tcp 127.0.0.1:37017: connection refused
2013-11-29 14:40:05 DEBUG juju.state open.go:88 connection failed, will
retry: dial tcp 127.0.0.1:37017: connection refused
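
To see at a glance which endpoints are refusing connections, the log
can be summarised per address. This is only a rough sketch: the function
name count_refused is made up for illustration, and it assumes every
refusal appears on a single line in the form shown in the excerpt above
(the excerpt is wrapped by the mail client; the file itself should not
be). 17070 is the API address from the log; I take 37017 to be the
juju-db mongo port.

```shell
# Count "connection refused" errors per target address in a juju log.
# Reads log text on stdin and prints "<count> <host:port>" lines,
# most frequent first. count_refused is a hypothetical helper name.
count_refused() {
    grep -oE 'dial tcp [0-9.]+:[0-9]+: connection refused' \
        | awk '{ sub(/:$/, "", $3); print $3 }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example (not run here):
#   tail -n 5000 /var/log/juju/machine-0.log | count_refused
```

If only the 37017 address shows up, the database itself is probably
still down; the 17070 refusals would then likely follow from that.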



On 29 November 2013 12:40, Peter Waller <pe...@scraperwiki.com> wrote:

> I've replied off-list to Roger with URLs to the logs. I'm happy for them
> to be shared internally between juju developers and to share them with
> anyone who is interested.
>
>
> On 29 November 2013 12:30, roger peppe <rogpe...@gmail.com> wrote:
>
>> On 29 November 2013 11:44, Peter Waller <pe...@scraperwiki.com> wrote:
>> > The pids appear to be constant since I last reported them. Your theory
>> > about the machine being out of disk is correct.
>> >
>> > Indeed the log files are 1.4 and 1.6 GB for all-machines and
>> > machine-0.log. I'll try xz'ing them and then sending them along to you.
>> > Is it okay if I e-mail them directly to you and anyone else who is
>> > interested (please mail me personally) assuming they compress down to
>> > ~megabytes?
>>
>> Yes please. Putting them somewhere like google drive might work
>> better than sending an attachment. I imagine they'll compress
>> very nicely though.
>>
>> BTW, just removing the log files will not work, as the files will still
>> be held open. You probably want to do something like (having first
>> removed enough files elsewhere to ensure some spare disk space):
>>
>> cd /var/log/juju
>> mv machine-0.log old-machine-0.log
>> restart jujud-machine-0
>>
>> That will cause the machine agent to create a new log file, leaving
>> you free to archive the old one and remove it to save space.
>>
>> You can do a similar thing with all-machines.log, except you'll need
>> to restart rsyslog.
>>
>> I hope that when you've done that, mongo will start working again
>> (you might need to restart the juju-db service). If you find that
>> you can run juju status, it would be good to know what is
>> the value of the "agent-version" field in your environment
>> config (i.e. the output of "juju get-environment | grep agent-version").
>>
>>   cheers,
>>     rog.
>>
>> >
>> > On 29 November 2013 11:26, roger peppe <rogpe...@gmail.com> wrote:
>> >>
>> >> On 29 November 2013 10:26, Peter Waller <pe...@scraperwiki.com> wrote:
>> >> > On 28 November 2013 17:44, Peter Waller <pe...@scraperwiki.com> wrote:
>> >> >>
>> >> >> I'm still having the problem of it spinning, every few seconds all
>> >> >> of the machines are still spewing into the logs, despite my attempt
>> >> >> at asking it to "upgrade" to a different version.
>> >> >
>> >> >
>> >> > Last night I left it spinning absent any better ideas. It didn't
>> >> > seem to be causing any obvious harm that I could find.
>> >> >
>> >> > This morning `juju status` doesn't work and nor does juju debug-log.
>> >> >
>> >> > On the bootstrap node the following is running for
>> >> > `ps aux | grep juju`:
>> >> >
>> >> > root     23973  0.0  0.0   4440   624 ?        Ss   Aug29   0:00 /bin/sh -e -c /var/lib/juju/tools/machine-0/jujud machine --log-file /var/log/juju/machine-0.log --data-dir '/var/lib/juju' --machine-id 0 --debug >> /var/log/juju/machine-0.log 2>&1 /bin/sh
>> >> > root     23974  0.2  1.4 309440 24340 ?        Sl   Aug29 346:48 /var/lib/juju/tools/machine-0/jujud machine --log-file /var/log/juju/machine-0.log --data-dir /var/lib/juju --machine-id 0 --debug
>> >> >
>> >> > $ juju debug-log | grep --line-buffered -v jsoncodec
>> >> > ERROR no reachable servers
>> >> >
>> >> > $ juju status
>> >> > <hangs seemingly indefinitely>
>> >> >
>> >> > Any advice or documentation someone can point me at to get this
>> >> > system back into a working state?
>> >>
>> >> If you do the above ps, then wait 5 seconds and try again, have the
>> >> process ids changed?
>> >> If so, that means it's probably continually bouncing for some reason.
>> >>
>> >> It would be useful to see the logs, in particular
>> >> /var/log/juju/machine-0.log.
>> >> Firstly how big are they? It is possible that the machine is out of
>> >> disk space and that's caused the mongo database storage to stop working.
>> >> You could try deleting the log files (perhaps save them first for
>> >> later diagnosis).
>> >>
>> >> Secondly, perhaps you could paste somewhere the last few thousand lines
>> >> of the logs, that might give us more of an idea of what's happening
>> >> currently.
>> >>
>> >> Please feel free to ping us on IRC (#juju-dev or #juju on freenode.net;
>> >> my nickname is rogpeppe there) and we could help more directly.
>> >>
>> >>   cheers,
>> >>     rog.
>> >
>> >
>>
>
>
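
For reference, the rename-and-restart steps quoted above (mv the log,
then restart the service that holds it open) can be wrapped in a small
shell function. This is a sketch only: rotate_held_log and the DRY_RUN
switch are names invented here, and it assumes the Upstart "restart"
command used in the quoted instructions is what manages the service.

```shell
# Rotate a log that a running service holds open: rename it, then restart
# the service so it reopens a fresh file under the original name.
# rotate_held_log and DRY_RUN are hypothetical names for this sketch;
# with DRY_RUN=1 the commands are printed instead of executed.
rotate_held_log() {
    log_dir=$1
    log_name=$2
    service=$3
    run() {
        if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
    }
    # Only restart the service if the rename succeeded.
    run mv "$log_dir/$log_name" "$log_dir/old-$log_name" &&
        run restart "$service"
}

# Example (not run here):
#   rotate_held_log /var/log/juju machine-0.log jujud-machine-0
#   rotate_held_log /var/log/juju all-machines.log rsyslog
```

Renaming rather than deleting keeps the old log available to compress
and archive once the service has reopened a new file.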
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju
