Dear community,

I have a two-node Gluster cluster with one replicated volume that is shared to a client via NFS. I discovered that if the replication link (an Ethernet crossover cable) between the Gluster nodes breaks, my whole storage becomes unavailable.

I am using Pacemaker/corosync with two virtual IPs (the service IPs exposed to the clients). Each node has its own virtual IP, and if one node fails, Pacemaker/corosync moves the failed node's virtual IP to the node that is still running. This mechanism has worked well so far.

So, I have:

  gluster1: IP 10.196.150.251 and virtual IP 10.196.150.250
  gluster2: IP 10.196.150.252 and virtual IP 10.196.150.254

I am using DNS round-robin over the two virtual IPs to distribute the load across both Gluster nodes (name: gluster.mycompany.tld).
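
As an illustration, a minimal Python sketch for checking from the client which addresses the round-robin name returns might look like the following. The helper name `resolve_all` and the port 2049 (the standard NFS port) are assumptions made for this example and are not part of my actual setup.

```python
import socket


def resolve_all(hostname, port=2049, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    that the lookup can be replaced in tests (an assumption of this sketch).
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})


# Example call (needs a resolver that knows the name):
#   resolve_all("gluster.mycompany.tld")
```

With both virtual IPs published under the name, I would expect the call to list 10.196.150.250 and 10.196.150.254.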

If one node goes down, its virtual IP is handed over to the remaining node, and the client keeps working without any disruption. However, when the replication link between the Gluster nodes broke, we observed a service disruption: the client was unable to write data to the cluster. The replication link between the Gluster nodes therefore seems to be a single point of failure. Is that correct?

Daniel