FYI - I talked to our network folks, and it looks like they were testing port failover last night, which may or may not have caused this issue. However, I was able to correct it by fencing the problem nodes.
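For anyone who lands on this thread later, here is a rough sketch of how the problem nodes could be picked out of the `ccs --startall` output and queued for fencing. It assumes `fence_node` from the cman package is the fencing tool; the message above does not say which command was actually used. The `failed_nodes` helper is a hypothetical name, and the sample text is copied from the output quoted in this thread. The loop only prints the commands, since fencing power-cycles a node and should be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: print the node names that `ccs --startall`
# reported as "Unable to start ...".
failed_nodes() {
    sed -n 's/^Unable to start \([^,]*\),.*/\1/p'
}

# Sample lines taken from the output quoted in this thread.
sample='Unable to start map1-ops, possibly due to lack of quorum, try --startall
Error: ClientSocket(String): connect() failed: No such file or directory
Started cache2-ops
Unable to start data1-ops, possibly due to lack of quorum, try --startall'

# Dry run: echo the fence commands instead of executing them.
# Assumption: fence_node (cman) is the tool used to fence a node.
printf '%s\n' "$sample" | failed_nodes | while read -r node; do
    echo "fence_node $node"
done
```

On a live cluster the `sample` variable would be replaced by piping the real `ccs -h <host> --startall` output into `failed_nodes`.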

On Wed, Jun 3, 2015 at 10:31 AM, Megan . <nagem...@gmail.com> wrote:
> Anybody ever seen "Error: ClientSocket(String): connect() failed: No such
> file or directory" when doing a start all? Something seems to have
> broken with our cluster. Our UAT setup works as expected. I looked at
> tcpdumps the best that I could (I'm not a network person though) and I
> didn't see anything obvious. I shut down iptables on all nodes.
>
> We are running CentOS 6.6, ccs-0.16.2-75.el6_6.1.x86_64,
> cman-3.0.12.1-68.el6.x86_64. We have a 12 node cluster in production that
> allows us to share gfs2 iscsi mounts. No other services are used.
> clvmd -R runs fine at this time. ccs -h node --sync --activate also runs
> fine.
>
> [root@admin1 ~]# ccs -h admin1-ops --startall
> Unable to start map1-ops, possibly due to lack of quorum, try --startall
> Error: ClientSocket(String): connect() failed: No such file or directory
> Started cache2-ops
> Unable to start data1-ops, possibly due to lack of quorum, try --startall
> Error: ClientSocket(String): connect() failed: No such file or directory
> Started map2-ops
> Unable to start archive1-ops, possibly due to lack of quorum, try --startall
> Error: ClientSocket(String): connect() failed: No such file or directory
> Started data3-ops
> Started mgmt1-ops
> Unable to start admin1-ops, possibly due to lack of quorum, try --startall
> Error: ClientSocket(String): connect() failed: No such file or directory
> Started data2-ops
> Started cache1-ops
> [root@admin1 ~]#
>
> I have quorum:
>
> [root@admin1 ~]# clustat
> Cluster Status for bitsops @ Wed Jun 3 02:13:08 2015
> Member Status: Quorate
>
>  Member Name      ID   Status
>  ------ ----      ---- ------
>  admin1-ops        1   Online, Local
>  mgmt1-ops         2   Online
>  archive1-ops      3   Online
>  map1-ops          4   Online
>  map2-ops          5   Online
>  cache1-ops        6   Online
>  cache2-ops        7   Online
>  data1-ops         8   Online
>  data2-ops         9   Online
>  data3-ops        10   Online
>
> Here is what I expect, and what UAT gives me:
>
> [root@admin1-uat ~]# ccs -h admin1-uat --startall
> Started mgmt1-uat
> Started data1-uat
> Started data2-uat
> Started admin1-uat
> Started tools-uat
> Started map1-uat
> Started archive1-uat
> Started cache2-uat
> Started cache1-uat
> Started map2-uat
> [root@admin1-uat ~]#

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster