Hello,
I was running Graylog 0.9.7 (I think) for a number of months, then upgraded
to 1.0.2 in the last couple of weeks. Since the upgrade, roughly once a day
the Graylog UI becomes unresponsive. When I check the status I get this:
root@graylog:/var/log/graylog/server# graylog-ctl status
down: elasticsearch: 0s, normally up, want up; run: log: (pid 1204) 911220s
run: etcd: (pid 4722) 177186s; run: log: (pid 1193) 911221s
run: graylog-server: (pid 13509) 160101s; run: log: (pid 1191) 911221s
run: graylog-web: (pid 4734) 177185s; run: log: (pid 1190) 911221s
run: mongodb: (pid 4792) 177184s; run: log: (pid 1192) 911221s
run: nginx: (pid 4806) 177184s; run: log: (pid 1208) 911220s
which suggests elasticsearch is down.
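
A check like this could be scripted rather than read by eye. The following is only a minimal sketch, assuming the status output keeps the "state: service:" line format shown above; the function name down_services and the sample string are made up for illustration and are not part of graylog-ctl.

```python
import re

# Matches the leading "<state>: <service>:" part of each status line.
# Assumption: every service line starts with "run:" or "down:" as in the paste above.
STATUS_RE = re.compile(r"^(run|down): ([\w-]+):")


def down_services(status_output):
    """Return the names of services whose status line starts with 'down:'."""
    down = []
    for line in status_output.splitlines():
        match = STATUS_RE.match(line.strip())
        if match and match.group(1) == "down":
            down.append(match.group(2))
    return down


# Sample taken from the status output above (first three lines only).
sample = """\
down: elasticsearch: 0s, normally up, want up; run: log: (pid 1204) 911220s
run: etcd: (pid 4722) 177186s; run: log: (pid 1193) 911221s
run: graylog-server: (pid 13509) 160101s; run: log: (pid 1191) 911221s
"""

print(down_services(sample))  # prints ['elasticsearch']
```

Only the leading state of each line is inspected, so the trailing "run: log:" part (the logger for each service) is ignored.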
When I restart:
root@graylog:/var/log/graylog/server# graylog-ctl restart
ok: run: elasticsearch: (pid 22945) 0s
ok: run: etcd: (pid 22955) 0s
timeout: run: graylog-server: (pid 13509) 160288s, got TERM
ok: run: graylog-web: (pid 23055) 0s
ok: run: mongodb: (pid 23091) 0s
ok: run: nginx: (pid 23096) 0s
So elasticsearch comes up, but graylog-server does not: it times out after
the TERM signal and keeps running under its old pid (13509). Issuing the
restart command a second time gives the same result. Issuing the stop
command also times out:
root@graylog:/var/log/graylog/server# graylog-ctl stop
ok: down: elasticsearch: 0s, normally up
ok: down: etcd: 0s, normally up
timeout: run: graylog-server: (pid 13509) 160328s, want down, got TERM
ok: down: graylog-web: 0s, normally up
ok: down: mongodb: 0s, normally up
ok: down: nginx: 1s, normally up
To get back up and running, I have to kill -9 the graylog-server process,
followed by graylog-ctl start.
I am not sure what time the service went down today, but the logs had
millions of these in the lead-up:
org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
Followed by a mix of these exceptions:
* org.elasticsearch.node.NodeClosedException: node closed
[graylog2-server][4aLJKPqeR2CCwR84ZO6I9w][graylog][inet[/10.4.11.143:9350]]{client=true,
data=false, master=false}
* com.mongodb.MongoException$Network: Read operation to server
127.0.0.1:27017 failed on database graylog
* com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting
to connect. Client view of cluster state is {type=Unknown,
servers=[{address=127.0.0.1:27017, type=Unknown, state=Connecting,
exception={com.mongodb.MongoException$Network: Exception opening the
socket}, caused by {java.net.ConnectException: Connection refused}}]
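
To see which of these exceptions dominates and when the mix changes, the log lines could be tallied by exception class. This is a rough sketch only: it assumes the exceptions appear in the log as fully qualified Java class names, as in the excerpts above, and the function name count_exceptions and the sample lines are illustrative.

```python
import re
from collections import Counter

# Fully qualified Java class names ending in "Exception", optionally
# followed by a nested class such as "$Network".
EXC_RE = re.compile(r"\b((?:[a-z_]\w*\.)+[A-Z]\w*Exception(?:\$\w+)?)")


def count_exceptions(lines):
    """Count how often each exception class name appears in the given lines."""
    counts = Counter()
    for line in lines:
        counts.update(EXC_RE.findall(line))
    return counts


# Sample lines copied from the excerpts above.
sample_lines = [
    "org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]",
    "org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]",
    "com.mongodb.MongoException$Network: Read operation to server",
    "com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting",
]

for name, count in count_exceptions(sample_lines).most_common():
    print(count, name)
```

Running the same tally over slices of the real log (for example per hour) would show roughly when the MasterNotDiscoveredException flood started relative to the MongoDB errors.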
I am running in an EC2 environment, with AMIs created using packer, using
the scripts
at https://github.com/Graylog2/graylog2-images/tree/master/packer with some
local extensions.
Is this related to any known issues? If not, can you offer help/advice on
how I should go about getting to the bottom of the issue?
Many thanks,
Mike.