We've upgraded our production system (AWS images) from 1.3.x to 2.0.2.
On the primary server, Graylog Server is fully operational.
On the secondary server, the process appears to be running, but
it is not writing anything to the logs and it does not appear in the UI as a
node.


On the troubled (secondary) server,
sudo graylog-ctl status shows:


run: elasticsearch: (pid 1036) 480s; run: log: (pid 1032) 480s 
run: etcd: (pid 1033) 480s; run: log: (pid 1028) 480s 
run: *graylog-server: (pid 1029)* 480s; run: log: (pid 1024) 480s 
run: nginx: (pid 1025) 480s; run: log: (pid 1022) 480s 



According to this output, graylog-server is running with pid 1029.

But if we look at what pid 1029 actually is,


ps -elf | grep 1029 shows:


0 S root      1029  1018  0  80   0 -  1110 -      21:26 ?        00:00:00 /bin/sh ./run
0 S root      1039  1029  0  80   0 -  2154 -      21:26 ?        00:00:00 timeout 600 bash -c until curl -s http://127.0.0.1:27017; do sleep 1; done
0 S ubuntu    2638  2524  0  80   0 -  2616 pipe_w 21:35 pts/0    00:00:00 grep --color=auto 1029


 

This is clearly *not* the graylog-server process: pid 1029 is a shell
running ./run, and its child (pid 1039) is a curl loop under timeout 600.
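
Because grep matches any line containing the number (child processes and the grep command itself included), it may be cleaner to ask ps about the reported pid directly. Below is a minimal sketch, assuming the procps version of ps (as shipped on Ubuntu). The helper names pid_cmd, pid_children and check_supervised are made up for illustration and are not part of graylog-ctl.

```shell
#!/bin/sh
# Hypothetical helper: print the full command line behind a PID.
# "ps -p" selects by PID, so unrelated lines (such as a grep) never appear.
pid_cmd() {
    ps -o args= -p "$1"
}

# Hypothetical helper: list the direct children of a PID with their
# command lines. "--ppid" is a procps option (assumption: Linux/procps).
pid_children() {
    ps -o pid=,args= --ppid "$1" || true
}

# Report whether the supervised PID is a JVM or something else,
# for example a wrapper script that has not started java yet.
check_supervised() {
    case "$(pid_cmd "$1")" in
        *java*) echo "java process" ;;
        "")     echo "no such process" ;;
        *)      echo "not java" ;;
    esac
}
```

Going by the ps output above, check_supervised 1029 on the secondary server should print "not java", and pid_children 1029 should list the curl loop (pid 1039).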


If we check the same thing on the primary server, where everything is
working fine,
sudo graylog-ctl status shows:


run: elasticsearch: (pid 12071) 1318s; run: log: (pid 1037) 333246s
run: etcd: (pid 12090) 1317s; run: log: (pid 1035) 333246s
run: *graylog-server: (pid 12125)* 1312s; run: log: (pid 1038) 333246s
run: mongodb: (pid 12132) 1311s; run: log: (pid 1036) 333246s
run: nginx: (pid 12134) 1311s; run: log: (pid 1039) 333246s



and ps -elf | grep 12125 shows:


4 S graylog  12125  1031 28  80   0 - 1169685 -    21:13 ?        00:06:14 /opt/graylog/embedded/jre/bin/java -Xms1g -Xmx1500m -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -jar -Dlog4j.configurationFile=file:///opt/graylog/conf/log4j2.xml -Djava.library.path=/opt/graylog/server/lib/sigar/ -Dgraylog2.installation_source=unknown /opt/graylog/server/graylog.jar server -f /opt/graylog/conf/graylog.conf
0 S ubuntu   17847  1419  0  80   0 -  2615 pipe_w 21:35 pts/1    00:00:00 grep --color=auto 12125





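Here pid 12125 is the graylog-server Java process, so on the primary it is
clearly running. Note also that the primary's status output lists a mongodb
service, while the secondary's does not.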

So my questions are:

   - Why does graylog-ctl think that graylog-server is running?
   - Why is graylog-server not running?
   - How can we narrow down the root cause? With graylog-server not
   running, the log files are not updated, so we have no clue what is going
   on.
   - Are there higher-level logs for graylog-ctl that would tell us
   what is going wrong when it tries to start graylog-server?


PS: We noticed that after a long while, the Graylog server eventually shows
up as a node in the UI, and the logs start filling.
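
One observation that may be related: in the ps output from the secondary server, the child of pid 1029 is polling http://127.0.0.1:27017 under "timeout 600", which would be consistent with a delay of up to ten minutes before anything else happens. We have not confirmed that this is the cause. A minimal sketch for running the same probe once by hand, assuming curl is installed; the function name probe is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: try the URL once and report whether anything answered.
# -s silences progress output, -o /dev/null discards the body, and
# --max-time 5 keeps the call from hanging.
probe() {
    if curl -s -o /dev/null --max-time 5 "$1"; then
        echo "answered"
    else
        echo "no answer"
    fi
}
```

Running probe http://127.0.0.1:27017 on the secondary server would show whether anything is listening on the port that the wait loop is polling.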

Looking for errors in the logs, we only found the following warning:


2016-06-17_17:04:56.90879 2016-06-17 17:04:56,908 WARN : 
org.graylog2.shared.events.DeadEventLoggingListener - Received unhandled 
event of type <org.graylog2.plugin.lifecycles.Lifecycle> from event bus 
<AsyncEventBus{graylog-eventbus}>


We are not certain it has any relevance to the problem of
graylog-server not starting immediately.


Any guidance on how to narrow this down would be greatly appreciated.

Thanks





