Hi!

What surprises me most is that a connect(..., O_NONBLOCK) actually blocks.
The connect(2) man page describes the expected non-blocking behaviour:

       EINPROGRESS
              The socket is nonblocking and the connection cannot be
              completed immediately.
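That is the pattern user space relies on: a non-blocking connect() returns
at once with EINPROGRESS and the caller polls for completion. A minimal
sketch of that expectation (plain TCP for simplicity; the address and port
below are placeholders, not taken from this thread):

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
        struct sockaddr_in sa;
        struct pollfd pfd;
        int err = 0;
        socklen_t len = sizeof(err);
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        /* Mark the socket non-blocking before connecting. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(21064);            /* placeholder port */
        inet_pton(AF_INET, "192.0.2.1", &sa.sin_addr);

        /* Must return immediately, normally with -1/EINPROGRESS;
         * it must not block for minutes. */
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 &&
            errno != EINPROGRESS) {
                perror("connect");
                return 1;
        }

        /* Poll up to 5 seconds for the handshake to finish. */
        pfd.fd = fd;
        pfd.events = POLLOUT;
        if (poll(&pfd, 1, 5000) == 1) {
                getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
                printf("connect result: %s\n", strerror(err));
        } else {
                printf("connect still pending after 5s\n");
        }
        close(fd);
        return 0;
}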
Regards,
Ulrich

>>> "Gang He" <g...@suse.com> wrote on 08.03.2018 at 10:48 in message
<5aa17765020000f9000ad...@prv-mh.provo.novell.com>:
> Hi Feldhost,
>
> I use the active rrp_mode in corosync.conf and rebooted the cluster to
> make the configuration take effect.
> But the roughly five-minute hang in the new_lockspace() function is
> still there.
>
> Thanks
> Gang
>
>> Hi,
>>
>> so try the active mode.
>>
>> https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_installation_terms.html
>>
>> The fixes I mentioned I saw in 4.14.*.
>>
>>> On 8 Mar 2018, at 09:12, Gang He <g...@suse.com> wrote:
>>>
>>> Hi Feldhost,
>>>
>>>> Hello Gang He,
>>>>
>>>> Which corosync rrp_mode do you use, passive or active?
>>> clvm1:/etc/corosync # cat corosync.conf | grep rrp_mode
>>> rrp_mode: passive
>>>
>>>> Did you try both?
>>> No, only this mode.
>>>
>>>> Also, what kernel version do you use? I see some SCTP fixes in the
>>>> latest kernels.
>>> clvm1:/etc/corosync # uname -r
>>> 4.4.114-94.11-default
>>>
>>> It looks like the sock->ops->connect() function is blocked for too
>>> long before returning when the network is broken.
>>> On a healthy network, sock->ops->connect() returns very quickly.
>>>
>>> Thanks
>>> Gang
>>>
>>>>> On 8 Mar 2018, at 08:52, Gang He <g...@suse.com> wrote:
>>>>>
>>>>> Hello list and David Teigland,
>>>>>
>>>>> I hit a problem on a two-ring cluster; it can be reproduced with
>>>>> the steps below.
>>>>> 1) Set up a two-ring cluster with two nodes (see the corosync
>>>>> sketch after step 4), e.g.
>>>>> clvm1 (nodeid 172204569) addr_list eth0 10.67.162.25 eth1 192.168.152.240
>>>>> clvm2 (nodeid 172204570) addr_list eth0 10.67.162.26 eth1 192.168.152.103
>>>>>
>>>>> 2) The whole cluster works well. Then I take eth0 down on node
>>>>> clvm2 and restart the pacemaker service on that node:
>>>>> ifconfig eth0 down
>>>>> rcpacemaker restart
>>>>>
>>>>> 3) The whole cluster still works well (that means corosync switches
>>>>> to the other ring very smoothly).
>>>>> Then I can mount an ocfs2 file system on node clvm2 quickly with
>>>>> the command
>>>>> mount /dev/sda /mnt/ocfs2
>>>>>
>>>>> 4) Next, I do the same mount on node clvm1. The mount command hangs
>>>>> for about five minutes before it finally completes.
>>>>> But if we set up an ocfs2 file system resource in pacemaker, the
>>>>> resource agent considers the ocfs2 file system resource a startup
>>>>> failure before the command returns, and pacemaker fences node clvm1.
>>>>> This problem affects our customers' expectations, since they assume
>>>>> the two rings switch over smoothly.
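>>>>>
>>>>> For reference, the two rings in step 1 come from a totem section
>>>>> along these lines (a sketch only; the networks match the addr_list
>>>>> above, the other values are illustrative):
>>>>>
>>>>> totem {
>>>>>         rrp_mode: passive
>>>>>         interface {
>>>>>                 ringnumber: 0
>>>>>                 bindnetaddr: 10.67.162.0
>>>>>                 mcastport: 5405
>>>>>         }
>>>>>         interface {
>>>>>                 ringnumber: 1
>>>>>                 bindnetaddr: 192.168.152.0
>>>>>                 mcastport: 5407
>>>>>         }
>>>>> }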
>>>>>
>>>>> With this problem, I can see the mount command hanging with the
>>>>> back trace below:
>>>>> clvm1:/ # cat /proc/6688/stack
>>>>> [<ffffffffa04b8f2d>] new_lockspace+0x92d/0xa70 [dlm]
>>>>> [<ffffffffa04b92d9>] dlm_new_lockspace+0x69/0x160 [dlm]
>>>>> [<ffffffffa04db758>] user_cluster_connect+0xc8/0x350 [ocfs2_stack_user]
>>>>> [<ffffffffa0483872>] ocfs2_cluster_connect+0x192/0x240 [ocfs2_stackglue]
>>>>> [<ffffffffa0577efc>] ocfs2_dlm_init+0x31c/0x570 [ocfs2]
>>>>> [<ffffffffa05c2983>] ocfs2_fill_super+0xb33/0x1200 [ocfs2]
>>>>> [<ffffffff8120e130>] mount_bdev+0x1a0/0x1e0
>>>>> [<ffffffff8120ea1a>] mount_fs+0x3a/0x170
>>>>> [<ffffffff81228bf2>] vfs_kern_mount+0x62/0x110
>>>>> [<ffffffff8122b123>] do_mount+0x213/0xcd0
>>>>> [<ffffffff8122bed5>] SyS_mount+0x85/0xd0
>>>>> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>
>>>>> The root cause is in the sctp_connect_to_sock() function in
>>>>> lowcomms.c:
>>>>> 1075
>>>>> 1076         log_print("connecting to %d", con->nodeid);
>>>>> 1077
>>>>> 1078         /* Turn off Nagle's algorithm */
>>>>> 1079         kernel_setsockopt(sock, SOL_TCP, TCP_NODELAY, (char *)&one,
>>>>> 1080                           sizeof(one));
>>>>> 1081
>>>>> 1082         result = sock->ops->connect(sock, (struct sockaddr *)&daddr, addr_len,
>>>>> 1083                                     O_NONBLOCK);
>>>>> 1084         printk(KERN_ERR "sctp_connect_to_sock connect: %d\n", result);
>>>>> 1085
>>>>> 1086         if (result == -EINPROGRESS)
>>>>> 1087                 result = 0;
>>>>> 1088         if (result == 0)
>>>>> 1089                 goto out;
>>>>> The connect() call at lines 1082-1083 takes more than five minutes
>>>>> before it returns ETIMEDOUT (-110).
>>>>>
>>>>> So I want to know whether this problem has been found/fixed before.
>>>>> It looks like DLM cannot switch to the second ring quickly, which
>>>>> keeps the applications above it (e.g. CLVM, ocfs2) from creating a
>>>>> new lock space during their startup.
>>>>>
>>>>> Thanks
>>>>> Gang
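P.S. One way to keep such a connect from hanging for minutes is to bound
it with a send timeout, which Linux also applies to the connect handshake.
A user-space sketch of the idea (plain TCP and a documentation address for
simplicity; in the kernel the analogue would be setting SO_SNDTIMEO via
kernel_setsockopt() around the sock->ops->connect() call above, though I
have not verified what upstream finally did):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/time.h>

int main(void)
{
        /* Cap how long connect() may block. */
        struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
        struct sockaddr_in sa;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));

        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(21064);            /* placeholder port */
        inet_pton(AF_INET, "192.0.2.1", &sa.sin_addr);

        /* Returns within ~5s (EINPROGRESS on timeout) instead of
         * waiting minutes for ETIMEDOUT. */
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
                fprintf(stderr, "connect: %s\n", strerror(errno));
        close(fd);
        return 0;
}

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org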