I was upgrading some nodes today and noticed that vdsmd restarts glusterd on a 
node when it activates it. This causes a short break in healing when the shd 
gets disconnected, and forces some extra healing once the heal process reports 
“Transport endpoint not connected” (shown as N/A in the oVirt GUI).

This is on a converged cluster (3 nodes, gluster replica volume across all 3, 
ovirt-engine running elsewhere). CentOS 7 install, just upgraded to oVirt 
4.1.2, running gluster 3.10 from the CentOS SIG.

The process I’m observing:

1. Place a node into maintenance via GUI
2. Update node from command line
3. Reboot node (kernel update)
4. Watch gluster heal itself after reboot
5. Activate node in GUI
6. gluster is completely stopped on this node
7. gluster is started on this node
8. Healing begins again, but isn’t working
9. “gluster vol heal XXXX info” reports this node’s information not available 
   because “Transport endpoint not connected”
10. This clears up in 5-10 minutes, then volume heals normally
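
For anyone who wants to reproduce this, here is a rough Python sketch of how 
the heal-info check in the steps above could be scripted. It assumes the 
output of "gluster volume heal VOLNAME info" consists of "Brick ..." lines 
followed by "Status: ..." lines, which is what I see on 3.10 but may differ on 
other versions. The function names are my own and are not part of gluster or 
vdsm.

```python
import subprocess


def parse_heal_info(text):
    """Map each brick to the text of its 'Status:' line.

    Assumes heal-info output has 'Brick <host>:<path>' lines, each
    followed by a 'Status: <message>' line (an assumption about the format).
    """
    status = {}
    brick = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Status:") and brick is not None:
            status[brick] = line[len("Status:"):].strip()
    return status


def disconnected_bricks(text):
    """Return the bricks whose status mentions 'not connected'."""
    statuses = parse_heal_info(text)
    return sorted(b for b, s in statuses.items() if "not connected" in s.lower())


def heal_info(volume):
    """Run the heal-info command for a volume (needs root on a gluster node)."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

Calling disconnected_bricks(heal_info("XXXX")) once a minute after activating 
the node should show how long the affected bricks stay unreachable.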

Could someone with a similar setup check this and see whether it’s something 
specific to my nodes, or a general problem with the way it’s restarting 
gluster? I’m looking for a little confirmation before I file a bug report on it.

Or could a dev comment on why it stops and then starts gluster, instead of doing 
a restart, which would presumably leave the brick processes and shd running and 
not cause this effect?
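
One way to tell the two cases apart would be to compare the brick process IDs 
before and after activating the node. The sketch below is only an 
illustration: it assumes the brick processes run under the name glusterfsd and 
that pgrep is available on the node, and the helper names are made up.

```python
import subprocess


def brick_pids():
    """Return the set of PIDs of running glusterfsd (brick) processes."""
    result = subprocess.run(
        ["pgrep", "-x", "glusterfsd"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return {int(pid) for pid in result.stdout.split()}


def bricks_survived(before, after):
    """True if there were bricks before and all of them are still running."""
    return bool(before) and before <= after
```

Record brick_pids() before clicking Activate and again afterwards. If 
bricks_survived(before, after) is False, the brick processes were stopped and 
started again rather than left running.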

Thanks,

  -Darrell
