[ https://issues.apache.org/jira/browse/CLOUDSTACK-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sadhu suresh updated CLOUDSTACK-7131:
-------------------------------------
    Attachment: messages-20140717.rar

message log

> RVR: router's reduandant state shown as unknown(CheckRouterCommand is failing)
> ------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7131
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7131
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.5.0
>            Reporter: sadhu suresh
>         Attachments: management-server.rar, messages-20140717.rar
>
>
> 1. Set up an advanced zone with a VMware cluster.
> 2. Create a network offering with RVR enabled.
> 3. Create a network with the above network offering.
> 4. Deploy a VM using the above network.
> 5. Wait until the redundant routers (master and backup) are created successfully.
> 6. Reboot the master router.
> 7. Check the redundant state of the routers.
>
> Actual result:
> When the master is rebooted, the backup becomes master and the rebooted
> master becomes backup, as expected. After some time, however (when I checked
> the logs the next day), CheckRouterCommand is failing continuously and the
> redundant state of both routers is set to UNKNOWN.
>
> Name r-6-VM
> ID 83fe0027-d541-4795-9d0e-7306a068a9f4
> State Running
> Version 4.4.0
> Requires Upgrade No
> Network ID 63e9f4c8-da5c-42ca-93ea-318f56ab8e79
> Public IP Address 10.147.49.185
> Guest IP Address 10.1.1.138
> Link Local IP Address 10.147.41.20
> Host 10.147.40.9
> Compute offering System Offering For Software Router
> Network Domain cs2cloud.internal
> Domain ROOT
> Account admin
> Created 15 Jul 2014 21:21:40
> Redundant Router Yes
> Redundant state UNKNOWN
> VPC ID
>
> root@r-6-VM:~# cat /ramdisk/rrouter/keepalived.log
> To backup called
> Disable public ip 0
> Password server is not running
> Stopping DNS forwarder and DHCP server: dnsmasq(not running) ... (warning).
> cache internal:
> current active connections: 0
> connections created: 0 failed: 0
> connections updated: 0 failed: 0
> connections destroyed: 0 failed: 0
> cache external:
> current active connections: 0
> connections created: 0 failed: 0
> connections updated: 0 failed: 0
> connections destroyed: 0 failed: 0
> traffic processed:
> 0 Bytes 0 Pckts
> multicast traffic (active device=eth0):
> 16 Bytes sent 0 Bytes recv
> 2 Pckts sent 0 Pckts recv
> 0 Error send 0 Error recv
> message tracking:
> 0 Malformed msgs 0 Lost msgs
> Conntrackd switch to backup done
> Switch conntrackd mode backup 0
> Status: BACKUP
>
> root@r-7-VM:~# cat /ramdisk/rrouter/keepalived.log
> To backup called
> Disable public ip 0
> Password server is not running
> Stopping DNS forwarder and DHCP server: dnsmasq.
> cache internal:
> current active connections: 0
> connections created: 0 failed: 0
> connections updated: 0 failed: 0
> connections destroyed: 0 failed: 0
> cache external:
> current active connections: 0
> connections created: 0 failed: 0
> connections updated: 0 failed: 0
> connections destroyed: 0 failed: 0
> traffic processed:
> 0 Bytes 0 Pckts
> multicast traffic (active device=eth0):
> 16 Bytes sent 24 Bytes recv
> 1 Pckts sent 2 Pckts recv
> 0 Error send 0 Error recv
> message tracking:
> 0 Malformed msgs 0 Lost msgs
> Conntrackd switch to backup done
> Switch conntrackd mode backup 0
> Status: BACKUP
> To master called
> Password server is not running
> Removed cloud-passwd-srvr iptables rules
> Added cloud-passwd-srvr iptables rules
> 10.1.1.117/24 10.1.1.1/24
> Restarting DNS forwarder and DHCP server: dnsmasq.
> Enable public ip returned 0
> Conntrackd switch to primary done
> Switch conntrackd mode primary returned 0
> ARPING 10.147.49.185 from 10.147.49.185 eth2
> Sent 1 probes (1 broadcast(s))
> Received 0 response(s)
> ARPING 10.147.49.185 from 10.147.49.185 eth2
> Sent 1 probes (1 broadcast(s))
> Received 0 response(s)
> Status: MASTER
>
> root@r-7-VM:~# ls
> clearUsageRules.sh func.sh hv-kvp-daemon_3.1_amd64.deb monitorServices.py
> reconfigLB.sh redundant_router
> root@r-7-VM:~# cd /ramdisk/rrouter/
> root@r-7-VM:/ramdisk/rrouter# ls
> arping_gateways.sh check_bumpup.sh disable_pubip.sh fault.sh
> keepalived.log keepalived.ts2 primary-backup.sh
> backup.sh check_heartbeat.sh enable_pubip.sh heartbeat.sh
> keepalived.ts master.sh services.sh
> root@r-7-VM:/ramdisk/rrouter#
>
> content of log:
> **********
> Done with process of VM state report. host: 1
> 2014-07-19 04:55:53,680 ERROR [c.c.u.s.SshHelper]
> (DirectAgent-467:ctx-4099f7a4 10.147.40.9, job-88, cmd: CheckRouterCommand)
> SSH execution of command /opt/cloud/bin/checkrouter.sh null has an error
> status code in return.
> result output:
> 2014-07-19 04:55:53,702 DEBUG [c.c.h.v.r.VmwareResource]
> (DirectAgent-385:ctx-ebbeb14f 10.147.40.9, job-82, cmd: CheckRouterCommand)
> checkrouter.sh execution result: false
> 2014-07-19 04:55:53,706 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-385:ctx-ebbeb14f) Seq 1-3851140631355196172: Response Received:
> 2014-07-19 04:55:53,703 DEBUG [c.c.a.t.Request]
> (DirectAgent-467:ctx-4099f7a4) Seq 1-3851140631355196160: Processing: { Ans:
> , MgmtId: 7175246184473, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.CheckRouterAnswer":{"isBumped":false,"result":false,"details":"","wait":0}}]
> }
> 2014-07-19 04:55:53,716 DEBUG [c.c.a.t.Request]
> (DirectAgent-385:ctx-ebbeb14f) Seq 1-3851140631355196172: Processing: { Ans:
> , MgmtId: 7175246184473, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.CheckRouterAnswer":{"isBumped":false,"result":false,"details":"","wait":0}}]
> }
> 2014-07-19 04:55:53,719 DEBUG [c.c.a.m.AgentAttache]
> (DirectAgent-385:ctx-ebbeb14f) Seq 1-3851140631355196172: Unable to find
> listener.
> 2014-07-19 04:55:53,723 DEBUG [c.c.a.m.AgentAttache]
> (DirectAgent-467:ctx-4099f7a4) Seq 1-3851140631355196160: Unable to find
> listener.
> 2014-07-19 04:55:53,731 DEBUG [c.c.h.v.r.VmwareResource]
> (DirectAgent-91:ctx-4f8e6182 10.147.40.9, job-82, cmd: CheckRouterCommand)
> Use router's private IP for SSH control. IP : 10.147.41.20
> 2014-07-19 04:55:53,746 DEBUG [c.c.h.v.r.VmwareResource]
> (DirectAgent-91:ctx-4f8e6182 10.147.40.9, job-82, cmd: CheckRouterCommand)
> Run command on VR: 10.147.41.20, script: checkrouter.sh with args: null
> 2014-07-19 04:55:53,751 DEBUG [c.c.h.v.r.VmwareResource]
> (DirectAgent-176:ctx-95fd6537 10.147.40.9, job-83, cmd: CheckRouterCommand)
> Use router's private IP for SSH control.
> IP : 10.147.41.30
> 2014-07-19 04:55:53,759 DEBUG [c.c.h.v.r.VmwareResource]
> (DirectAgent-176:ctx-95fd6537 10.147.40.9, job-83, cmd: CheckRouterCommand)
> Run command on VR: 10.147.41.30, script: checkrouter.sh with args: null

--
This message was sent by Atlassian JIRA
(v6.2#6252)
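
The management-server excerpt quoted above shows checkrouter.sh returning false for each CheckRouterCommand. As a rough triage aid, the following Python sketch tallies failed results per job from a raw management-server log. The function name count_checkrouter_failures and the regular expression are illustrative and not part of CloudStack; the sketch assumes that a successful run logs "true" in the same position, and that the input is the unquoted log, without the ">" prefixes added by mail quoting.

```python
import re
from collections import Counter

# Matches the "checkrouter.sh execution result" entries seen in the excerpt.
# The "true" alternative is an assumption; only "false" appears in the report.
RESULT_RE = re.compile(
    r"\(DirectAgent-\d+:ctx-\w+ (?P<host>[\d.]+), (?P<job>job-\d+), "
    r"cmd: CheckRouterCommand\)\s+"
    r"checkrouter\.sh execution result: (?P<result>true|false)"
)


def count_checkrouter_failures(log_text):
    """Return a Counter mapping job id to the number of failed checks."""
    # Entries may be wrapped over several lines, so collapse all whitespace.
    flat = " ".join(log_text.split())
    failures = Counter()
    for match in RESULT_RE.finditer(flat):
        if match.group("result") == "false":
            failures[match.group("job")] += 1
    return failures


if __name__ == "__main__":
    sample = (
        "2014-07-19 04:55:53,686 DEBUG [c.c.h.v.r.VmwareResource]\n"
        "(DirectAgent-467:ctx-4099f7a4 10.147.40.9, job-88, cmd: CheckRouterCommand)\n"
        "checkrouter.sh execution result: false\n"
    )
    for job, count in count_checkrouter_failures(sample).items():
        print(job, count)
```

A job id that keeps accumulating failures across polling intervals would be consistent with the UNKNOWN redundant state described in the report, although the tally alone does not identify the cause.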