Sorry, I forgot to mention that I already switched to "transport: udpu".
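(A quick sanity check that the running cluster really picked up the new transport, assuming the corosync 2.x cmap tool is installed:

    corosync-cmapctl | grep totem.transport

This should print something like "totem.transport (str) = udpu".)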
I tested multicast before creating the cluster. While the first test

    omping -c 10000 -i 0.001 -F -q px-a px-b px-c px-d

showed no packet loss, the second one mentioned at [1]

    omping -c 600 -i 1 -q px-a px-b px-c px-d

showed ~70% loss for multicast:

root@px-b # omping -c 600 -i 1 -q px-a px-b px-c px-d
[…]
px-a : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.077/0.250/0.443/0.065
px-a : multicast, xmt/rcv/%loss = 600/182/69%, min/avg/max/std-dev = 0.157/0.280/0.432/0.062
px-c : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.084/0.236/0.391/0.062
px-c : multicast, xmt/rcv/%loss = 600/182/69%, min/avg/max/std-dev = 0.153/0.265/0.407/0.057
px-d : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.080/0.243/0.400/0.066
px-d : multicast, xmt/rcv/%loss = 600/180/70%, min/avg/max/std-dev = 0.134/0.265/0.401/0.060

As I have no control over the switch in use, I decided to go with UDPU, since we don't plan to grow the cluster beyond ~15 nodes.

This is my corosync.conf (I'm using 169.254.42.0/24 for cluster-internal communication):

###############
logging {
  debug: off
  logfile: /var/log/corosync/corosync.log
  timestamp: on
  to_logfile: yes
  to_syslog: yes
}

nodelist {
  node {
    name: px-c
    nodeid: 3
    quorum_votes: 1
    ring0_addr: px-c
  }
  node {
    name: px-d
    nodeid: 4
    quorum_votes: 1
    ring0_addr: px-d
  }
  node {
    name: px-a
    nodeid: 1
    quorum_votes: 1
    ring0_addr: px-a
  }
  node {
    name: px-b
    nodeid: 2
    quorum_votes: 1
    ring0_addr: px-b
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: px-infra
  config_version: 5
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2

  interface {
    bindnetaddr: 169.254.42.48
    ringnumber: 0
  }
}
###############

[1] https://pve.proxmox.com/wiki/Multicast_notes#Diagnosis_from_first_principles
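As a sanity check after the switch, membership can be verified on each node with

    # both should list all four nodes and report "Quorate: Yes"
    pvecm status
    corosync-quorumtool -s

And regarding the original "connection reset by peer (596)" problem: as far as I understand, pveproxy on the node serving the GUI forwards requests for other nodes to the target node's pveproxy on port 8006 (verifying its certificate against the cluster CA, which is why "pvecm updatecerts" is commonly suggested for 596 errors). A rough way to probe basic reachability from px-a, using the node names from the thread below:

    # from px-a: does each peer's pveproxy answer on port 8006 at all?
    for n in px-b px-c px-d; do
        curl -sk -o /dev/null -w "$n: %{http_code}\n" "https://$n:8006/"
    done

If one of these already fails or hangs, the problem sits below the GUI.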
On 25.02.2017 at 06:54, Yannis Milios wrote:
> In my opinion this is related to difficulties in cluster communication. Have a
> look at these notes:
>
> https://pve.proxmox.com/wiki/Multicast_notes
>
> On Fri, 24 Feb 2017 at 22:45, Uwe Sauter <uwe.sauter...@gmail.com> wrote:
> > Hi,
> >
> > no, I didn't think about that.
> >
> > I tried it now and restarted pveproxy afterwards, but to no avail.
> >
> > Can you explain why you thought that this might help?
> >
> > Regards,
> >
> > Uwe
> >
> > On 24.02.2017 at 21:28, Gilberto Nunes wrote:
> > > Hi,
> > >
> > > did you try to execute
> > >
> > >     pvecm updatecerts
> > >
> > > on every node?
> > >
> > > 2017-02-24 15:04 GMT-03:00 Uwe Sauter <uwe.sauter...@gmail.com>:
> > > > Hi,
> > > >
> > > > I have a GUI problem with a four-node cluster that I installed recently. I was
> > > > able to follow this up to ext-all.js, but I'm no web developer, so this is where
> > > > I got stuck.
> > > >
> > > > Background:
> > > > * four-node cluster
> > > > * each node has two interfaces in use
> > > >   ** eth0 is 1Gb, used for management and some VM traffic
> > > >   ** eth2 is 10Gb, used for cluster synchronization, Ceph and more VM traffic
> > > > * host names are resolved via /etc/hosts
> > > > * let's call the nodes px-a, px-b, px-c, px-d
> > > > * Proxmox version 4.4-12/e71b7a74
> > > >
> > > > Problem:
> > > > When I access the cluster via the web GUI on px-a, I can view all info regarding
> > > > px-a without any problems. If I try to view info regarding the other nodes, I get
> > > > "connection reset by peer (596)" almost every time.
> > > > If I access the cluster GUI on px-b, I can view this node's info but not the info
> > > > of the other nodes.
> > > >
> > > > I started to migrate VMs to the cluster today. Before that, when the cluster had
> > > > no VMs running, the access between nodes worked without problems.
> > > >
> > > > Debugging:
> > > > I was able to trace this using Chrome's developer tools up to the point where
> > > > some method inside ext-all.js fails with said "connection reset by peer".
> > > >
> > > > Details, using a pretty-printed version of ext-all.js:
> > > >
> > > > Object (?) Ext.cmd.derive("Ext.data.request.Ajax", Ext.data.request.Base, ...)
> > > > begins at line 11370.
> > > > Method "start" begins at line 11394.
> > > > The error occurs at line 11409, "h.send(e);".
> > > >
> > > > I don't know what causes h.send(e) to fail. Any suggestions on what could cause
> > > > this or how to debug it further are appreciated.
> > > >
> > > > Regards,
> > > >
> > > > Uwe
> > >
> > > --
> > > Gilberto Ferreira
> > > +55 (47) 99676-7530
> > > Skype: gilberto.nunes36

_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user