Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 11/09/2013, at 2:57 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello Christine, Andrew and all.
I'm sorry, I was a little unwell, so I did not answer. How do we conclude this thread? Which will change: corosync or pacemaker?

For now make sure you specify a nodeid and name. Longer term, Chrissie is looking at making the combined data set available in a different namespace for pacemaker to use.

05.09.2013, 15:49, Christine Caulfield ccaul...@redhat.com:
On 05/09/13 11:33, Andrew Beekhof wrote:
On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd:
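Editor's sketch: the advice at the top of this message, to specify both a nodeid and a name for every node, can be illustrated by generating the nodelist section of corosync.conf. This is only a minimal sketch. The helper names (Node, render_nodelist) are hypothetical, the first entry reuses the name, address and nodeid quoted in the thread, and the second entry (example-node3, 192.0.2.3) is a made-up placeholder, not a value from the thread.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """One cluster member (hypothetical helper type, not part of corosync)."""
    name: str
    ring0_addr: str
    nodeid: int


def render_nodelist(nodes: list[Node]) -> str:
    """Render a corosync.conf nodelist in which every node carries
    name, ring0_addr and an explicit nodeid."""
    lines = ["nodelist {"]
    for node in nodes:
        lines += [
            "    node {",
            f"        name: {node.name}",
            f"        ring0_addr: {node.ring0_addr}",
            f"        nodeid: {node.nodeid}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    nodes = [
        # Values quoted in the thread.
        Node("dev-cluster2-node2", "10.76.157.17", 2),
        # Placeholder entry; name and address are assumptions for illustration.
        Node("example-node3", "192.0.2.3", 3),
    ]
    print(render_nodelist(nodes))
```

The printed text is meant to be pasted into corosync.conf by hand; listing the nodeid explicitly avoids relying on any implicit value.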
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
Hello Christine, Andrew and all.
I'm sorry, I was a little unwell, so I did not answer. How do we conclude this thread? Which will change: corosync or pacemaker?

05.09.2013, 15:49, Christine Caulfield ccaul...@redhat.com:
On 05/09/13 11:33, Andrew Beekhof wrote:
On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer:
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too, i.e.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
    nodeid: 2
}
I
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 05/09/13 11:33, Andrew Beekhof wrote:
On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
03.09.2013, 08:27, Andrew Beekhof and...@beekhof.net:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too, i.e.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
    nodeid: 2
}
I _thought_ that
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote:
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net:
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote:
29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru:
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net:
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote:
28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net:
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,
Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following: if I reset the cman cluster with cibadmin --erase --force, the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql .
<nodes>
<node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
<node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
<node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>
Even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.
But if I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.
node {
    ring0_addr: 10.76.157.17
}
Try including the node name as well, e.g.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

The same thing.

I don't know what to say. I tested it here yesterday and it worked as expected.

I found the reason you and I got different results: I did not have a reverse DNS zone for these nodes. I know there should be one, but Pacemaker + CMAN worked without a reverse zone!

I was too hasty. Deleted everything. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups, reverse or otherwise.
Can you set PCMK_trace_files=corosync.c in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
    export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

Hello, Andrew. You misunderstood me a little.

No, I understood you fine.

I wrote that I rushed to judgment. After I set up the reverse DNS zone, the cluster behaved correctly. BUT after I tore down the cluster, dropped the configs and started again as a new cluster, it once more did not show all the nodes in the nodes section (only the node running pacemaker).
Here is a small portion of the log (Full log), in which I thought there is something interesting.

Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too, i.e.
node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
    nodeid: 2
}
I _thought_ that was implicit.
Chrissie: is nodelist.node.%d.nodeid always available for corosync2 or only if explicitly
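Editor's sketch: the trace line "Checking 172793107 vs 0 from nodelist.node.0.nodeid" compares the node's runtime id with a nodelist entry that reads back as 0. One plausible reading, which the thread itself does not confirm, is that no nodeid was configured and the runtime id was derived from the 32-bit IPv4 address of ring 0. Under that assumption the id can be decoded with the Python standard library; the function names below are hypothetical.

```python
import ipaddress


def nodeid_to_ipv4(nodeid: int) -> str:
    """Interpret a 32-bit node id as a dotted-quad IPv4 address.

    Assumption: the id was auto-generated from the ring0 IPv4 address.
    """
    return str(ipaddress.IPv4Address(nodeid))


def ipv4_to_nodeid(addr: str) -> int:
    """Inverse mapping: the integer value of an IPv4 address."""
    return int(ipaddress.IPv4Address(addr))


if __name__ == "__main__":
    # The id reported in the log excerpt.
    print(nodeid_to_ipv4(172793107))       # 10.76.157.19
    # The ring0_addr quoted in the thread for dev-cluster2-node2.
    print(ipv4_to_nodeid("10.76.157.17"))  # 172793105
```

If that reading holds, the decoded address falls in the same 10.76.157.x range as the ring0_addr quoted above, which would be consistent with an id that was generated rather than configured, and with the suggestion to set nodeid explicitly.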
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
03.09.2013, 17:52, Christine Caulfield ccaul...@redhat.com: On 03/09/13 05:20, Andrew Beekhof wrote: On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. 
node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. No, I understood you fine. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 03/09/2013, at 11:46 PM, Andrey Groshev gre...@yandex.ru wrote: 03.09.2013, 08:27, Andrew Beekhof and...@beekhof.net: On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. 
node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. No, I understood you fine. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too. ie.
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 05:20, Andrew Beekhof wrote: On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. 
node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. No, I understood you fine. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too. ie.

node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
    nodeid: 2
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
04.09.2013, 01:06, Andrew Beekhof and...@beekhof.net: On 03/09/2013, at 11:46 PM, Andrey Groshev gre...@yandex.ru wrote: 03.09.2013, 08:27, Andrew Beekhof and...@beekhof.net: On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. 
node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. No, I understood you fine. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name:
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. 
I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( cluster.c:338 ) notice: get_node_name: Defaulting to uname -n for the local corosync node name
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( attrd.c:651 ) debug: attrd_cib_callback: Update 4 for probe_complete=true passed
Aug 30 12:31:11 [9615] dev-cluster2-node4 corosync debug [QB] HUP conn (9616-9989-27)
Aug
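A side note on the nodeid in this log, offered as a sketch rather than a diagnosis: 172793107 looks like an IPv4 address packed into a 32-bit integer. As far as I understand corosync.conf(5), corosync derives the nodeid from the ring0 address when no nodeid is configured on IPv4; treat that as an assumption to check against your corosync version. The helper names below are hypothetical, and only the Python standard library is used.

```python
import ipaddress


def nodeid_to_ipv4(nodeid: int) -> str:
    """Interpret a 32-bit nodeid as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(nodeid))


def ipv4_to_nodeid(addr: str) -> int:
    """Pack a dotted-quad IPv4 address into a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


if __name__ == "__main__":
    # nodeid taken from the "Unable to get node name" log line
    print(nodeid_to_ipv4(172793107))  # prints 10.76.157.19
```

If that reading is right, the nodeid corresponds to 10.76.157.19, the ring0_addr listed for dev-cluster2-node4, so the lookup "Checking 172793107 vs 0" is comparing an auto-generated id against a nodelist entry that carries no nodeid at all.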
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. 
I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn! It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set PCMK_trace_files=corosync.c in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker export PCMK_trace_files=corosync.c It should produce additional logging[1] that will help diagnose the issue. [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/ Hello, Andrew. You are a little misunderstood me. No, I understood you fine. I wrote that I rushed to judgment. After I did the reverse DNS zone, the cluster behaved correctly. BUT after I took apart the cluster dropped configs and restarted on the new cluster, cluster again don't showed all the nodes in the nodes (only node with running pacemaker). A small portion of the log. Full log In which (I thought) there is something interesting. 
Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=all:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107

I wonder if you need to be including the nodeid too. ie.

node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
    nodeid: 2
}

I _thought_ that was implicit. Chrissie: is nodelist.node.%d.nodeid always available for corosync2 or only if explicitly
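To check whether every nodelist entry actually carries both a name and a nodeid, the output of "corosync-cmapctl | grep nodelist" can be scanned with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the line format is assumed to be the "key (type) = value" form shown in the corosync-cmapctl output posted in this thread.

```python
import re
from collections import defaultdict

# Matches lines such as: nodelist.node.0.name (str) = dev-cluster2-node2
_LINE = re.compile(r"^nodelist\.node\.(\d+)\.(\w+)\s+\((\w+)\)\s+=\s+(.*)$")


def parse_nodelist(cmap_text):
    """Return {index: {key: value}} for nodelist.node.* entries."""
    nodes = defaultdict(dict)
    for line in cmap_text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            index, key, _type, value = match.groups()
            nodes[int(index)][key] = value.strip()
    return dict(nodes)


def missing_keys(nodes, required=("name", "nodeid")):
    """Return {index: [missing keys]} for entries lacking a required key."""
    report = {}
    for index, entry in sorted(nodes.items()):
        absent = [key for key in required if key not in entry]
        if absent:
            report[index] = absent
    return report


# Sample copied from the cmapctl output posted in the thread.
SAMPLE = """\
nodelist.local_node_pos (u32) = 2
nodelist.node.0.name (str) = dev-cluster2-node2
nodelist.node.0.ring0_addr (str) = 10.76.157.17
nodelist.node.1.name (str) = dev-cluster2-node3
nodelist.node.1.ring0_addr (str) = 10.76.157.18
nodelist.node.2.name (str) = dev-cluster2-node4
nodelist.node.2.ring0_addr (str) = 10.76.157.19
"""

if __name__ == "__main__":
    print(missing_keys(parse_nodelist(SAMPLE)))
```

Run against the posted output, this reports nodeid as missing for all three entries, which matches the "vs 0 from nodelist.node.0.nodeid" comparison in the trace.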
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area! 
# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 2
nodelist.node.0.name (str) = dev-cluster2-node2
nodelist.node.0.ring0_addr (str) = 10.76.157.17
nodelist.node.1.name (str) = dev-cluster2-node3
nodelist.node.1.ring0_addr (str) = 10.76.157.18
nodelist.node.2.name (str) = dev-cluster2-node4
nodelist.node.2.ring0_addr (str) = 10.76.157.19

# corosync-quorumtool -s
Quorum information
------------------
Date:             Wed Aug 28 11:29:49 2013
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          172793107
Ring ID:          52
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
 172793107          1 dev-cluster2-node4 (local)

# cibadmin -Q
<cib epoch="25" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.7" cib-last-written="Wed Aug 28 11:24:06 2013" update-origin="dev-cluster2-node4" update-client="crmd" have-quorum="0" dc-uuid="172793107">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-1.el6-4f672bc"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="172793107" uname="dev-cluster2-node4"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
  <status>
    <node_state id="172793107" uname="dev-cluster2-node4" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="172793107">
        <lrm_resources/>
      </lrm>
      <transient_attributes id="172793107">
        <instance_attributes id="status-172793107">
          <nvpair id="status-172793107-probe_complete" name="probe_complete" value="true"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>

I figured out a way to get around this, but it would be easier if the CIB worked the way it does with CMAN. I simply do not start the main resource if the attribute is not defined or is not true. This slightly changes the logic of the cluster. But I'm not sure what the correct behavior is.

libqb 0.14.4
corosync 2.3.1
pacemaker 1.1.11
All built from source in the previous week.

Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 29/08/2013, at 7:31 PM, Andrey Groshev gre...@yandex.ru wrote: 29.08.2013, 12:25, Andrey Groshev gre...@yandex.ru: 29.08.2013, 02:55, Andrew Beekhof and...@beekhof.net: On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. 
I found that the reason that You and I have different results - I did not have reverse DNS zone for these nodes. I know what it should be, but (PACEMAKER + CMAN) worked without a reverse area!

Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - reverse or otherwise. Can you set

PCMK_trace_files=corosync.c

in your environment and retest? On RHEL6 that means putting the following in /etc/sysconfig/pacemaker

export PCMK_trace_files=corosync.c

It should produce additional logging[1] that will help diagnose the issue.

[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/

# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 2
nodelist.node.0.name (str) = dev-cluster2-node2
nodelist.node.0.ring0_addr (str) = 10.76.157.17
nodelist.node.1.name (str) = dev-cluster2-node3
nodelist.node.1.ring0_addr (str) = 10.76.157.18
nodelist.node.2.name (str) = dev-cluster2-node4
nodelist.node.2.ring0_addr (str) = 10.76.157.19

# corosync-quorumtool -s
Quorum information
------------------
Date:             Wed Aug 28 11:29:49 2013
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          172793107
Ring ID:          52
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
 172793107          1 dev-cluster2-node4 (local)

# cibadmin -Q
<cib epoch="25" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.7" cib-last-written="Wed Aug 28 11:24:06 2013" update-origin="dev-cluster2-node4" update-client="crmd" have-quorum="0" dc-uuid="172793107">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-1.el6-4f672bc"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="172793107" uname="dev-cluster2-node4"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
  <status>
    <node_state id="172793107" uname="dev-cluster2-node4" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="172793107">
        <lrm_resources/>
      </lrm>
      <transient_attributes id="172793107">
        <instance_attributes id="status-172793107">
          <nvpair id="status-172793107-probe_complete" name="probe_complete" value="true"/>
        </instance_attributes>
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 28/08/2013, at 5:38 PM, Andrey Groshev gre...@yandex.ru wrote: 28.08.2013, 04:06, Andrew Beekhof and...@beekhof.net: On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote: 27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net: On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? It does not work as I expected. Thats because you've used IP addresses in the node list. ie. node { ring0_addr: 10.76.157.17 } try including the node name as well, eg. node { name: dev-cluster2-node2 ring0_addr: 10.76.157.17 } The same thing. I don't know what to say. I tested it here yesterday and it worked as expected. 
# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 2
nodelist.node.0.name (str) = dev-cluster2-node2
nodelist.node.0.ring0_addr (str) = 10.76.157.17
nodelist.node.1.name (str) = dev-cluster2-node3
nodelist.node.1.ring0_addr (str) = 10.76.157.18
nodelist.node.2.name (str) = dev-cluster2-node4
nodelist.node.2.ring0_addr (str) = 10.76.157.19

# corosync-quorumtool -s
Quorum information
--
Date:             Wed Aug 28 11:29:49 2013
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          172793107
Ring ID:          52
Quorate:          No

Votequorum information
--
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
--
Nodeid      Votes  Name
172793107   1      dev-cluster2-node4 (local)

# cibadmin -Q
<cib epoch="25" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.7" cib-last-written="Wed Aug 28 11:24:06 2013" update-origin="dev-cluster2-node4" update-client="crmd" have-quorum="0" dc-uuid="172793107">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-1.el6-4f672bc"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="172793107" uname="dev-cluster2-node4"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
  <status>
    <node_state id="172793107" uname="dev-cluster2-node4" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="172793107">
        <lrm_resources/>
      </lrm>
      <transient_attributes id="172793107">
        <instance_attributes id="status-172793107">
          <nvpair id="status-172793107-probe_complete" name="probe_complete" value="true"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>

I figured out a way to get around this, but it would be easier if the CIB worked the way it does with CMAN. I simply do not start the main resource if the attribute is undefined or is not true. This slightly changes the logic of the cluster, but I'm not sure what the correct behavior is.

libqb 0.14.4
corosync 2.3.1
pacemaker 1.1.11
All built from source in the previous week.

Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: on
    }
}
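The mismatch shown in this message (three names in the corosync nodelist, but a single node entry in the CIB) can be checked mechanically. The following is only a minimal sketch: `missing_from_cib` is a hypothetical helper name, and it assumes both command outputs were saved to files and that the CIB is in its usual XML form with `<node id="..." uname="..."/>` entries.

```shell
#!/bin/sh
# Sketch: list node names that corosync knows from its nodelist but that have
# no <node> entry in the CIB. Both inputs are saved command output:
#   $1: output of `corosync-cmapctl | grep nodelist`
#   $2: output of `cibadmin -Q`
# missing_from_cib is a hypothetical helper, not part of pacemaker or corosync.
missing_from_cib() {
    # Turn "nodelist.node.N.name (str) = NAME" into "NAME".
    sed -n 's/^nodelist\.node\.[0-9]*\.name (str) = //p' "$1" |
    while read -r name; do
        # Match <node ... uname="NAME"> only; the trailing space after "node"
        # keeps <node_state> entries from the status section out of the match.
        if ! grep -q "<node [^>]*uname=\"$name\"" "$2"; then
            echo "$name"
        fi
    done
}

# Example, on a running node (file paths are placeholders):
#   corosync-cmapctl | grep nodelist > /tmp/cmap.txt
#   cibadmin -Q > /tmp/cib.xml
#   missing_from_cib /tmp/cmap.txt /tmp/cib.xml
```

With the output posted above, such a check would be expected to report dev-cluster2-node2 and dev-cluster2-node3, since only dev-cluster2-node4 appears under nodes in the CIB.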
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 27/08/2013, at 1:13 PM, Andrey Groshev gre...@yandex.ru wrote:
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,

Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following:

If I reset a cman cluster with

cibadmin --erase --force

the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql
.
<nodes>
  <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
  <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
  <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>

This happens even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.

If I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

That's because you've used IP addresses in the node list, i.e.

node {
    ring0_addr: 10.76.157.17
}

Try including the node name as well, e.g.

node {
    name: dev-cluster2-node2
    ring0_addr: 10.76.157.17
}

I figured out a way to get around this, but it would be easier if the CIB worked the way it does with CMAN. I simply do not start the main resource if the attribute is undefined or is not true. This slightly changes the logic of the cluster, but I'm not sure what the correct behavior is.

libqb 0.14.4
corosync 2.3.1
pacemaker 1.1.11
All built from source in the previous week.
Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: on
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 10.76.157.17
    }
    node {
        ring0_addr: 10.76.157.18
    }
    node {
        ring0_addr: 10.76.157.19
    }
}

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
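The advice in this message is to add a name to each node entry, and later in the thread Andrew recommends specifying both a nodeid and a name. As a rough sketch of what such a nodelist could look like, the helper below prints one from name:address pairs. `gen_nodelist` is a hypothetical helper, and the nodeid values 1, 2, 3 are an assumption chosen for illustration; the thread does not say which IDs to use.

```shell
#!/bin/sh
# Sketch: print a corosync.conf nodelist stanza in which every node carries an
# explicit nodeid and name in addition to ring0_addr.
# Usage: gen_nodelist name1:addr1 name2:addr2 ...
# gen_nodelist is a hypothetical helper; the sequential nodeid values are an
# assumption made for illustration only.
gen_nodelist() {
    id=1
    echo "nodelist {"
    for pair in "$@"; do
        name=${pair%%:*}   # text before the first colon
        addr=${pair#*:}    # text after the first colon
        echo "    node {"
        echo "        nodeid: $id"
        echo "        name: $name"
        echo "        ring0_addr: $addr"
        echo "    }"
        id=$((id + 1))
    done
    echo "}"
}

# Names and addresses taken from the corosync-cmapctl output in this thread.
gen_nodelist \
    dev-cluster2-node2:10.76.157.17 \
    dev-cluster2-node3:10.76.157.18 \
    dev-cluster2-node4:10.76.157.19
```

The printed stanza would replace the address-only nodelist in corosync.conf on every node; whether it resolves the missing CIB entries is what the rest of the thread goes on to test.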
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote: 26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net: On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote: Hello, Today I try remake my test cluster from cman to corosync2. I drew attention to the following: If I reset cluster with cman through cibadmin --erase --force In cib is still there exist names of nodes. Yes, the cluster puts back entries for all the nodes it know about automagically. cibadmin -Ql . nodes node id=dev-cluster2-node2.unix.tensor.ru uname=dev-cluster2-node2/ node id=dev-cluster2-node4.unix.tensor.ru uname=dev-cluster2-node4/ node id=dev-cluster2-node3.unix.tensor.ru uname=dev-cluster2-node3/ /nodes Even if cman and pacemaker running only one node. I'm assuming all three are configured in cluster.conf? Yes, there exist list nodes. And if I do too on cluster with corosync2 I see only names of nodes which run corosync and pacemaker. Since you're not included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour. I try and expected_node and nodelist. And it didn't work? What version of pacemaker? 
Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: on
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 10.76.157.17
    }
    node {
        ring0_addr: 10.76.157.18
    }
    node {
        ring0_addr: 10.76.157.19
    }
}
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
27.08.2013, 05:39, Andrew Beekhof and...@beekhof.net:
On 26/08/2013, at 3:09 PM, Andrey Groshev gre...@yandex.ru wrote:
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,

Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following:

If I reset a cman cluster with

cibadmin --erase --force

the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql
.
<nodes>
  <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
  <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
  <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>

This happens even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.

If I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

And it didn't work? What version of pacemaker?

It does not work as I expected.

I figured out a way to get around this, but it would be easier if the CIB worked the way it does with CMAN. I simply do not start the main resource if the attribute is undefined or is not true. This slightly changes the logic of the cluster, but I'm not sure what the correct behavior is.

libqb 0.14.4
corosync 2.3.1
pacemaker 1.1.11
All built from source in the previous week.
Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: on
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 10.76.157.17
    }
    node {
        ring0_addr: 10.76.157.18
    }
    node {
        ring0_addr: 10.76.157.19
    }
}
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,

Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following:

If I reset a cman cluster with

cibadmin --erase --force

the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql
.
<nodes>
  <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
  <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
  <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>

This happens even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

If I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

Let me explain why this is inconvenient. I need to set an attribute before starting a resource. On a cluster with cman I can do that, but with corosync2 I can't. Is there another way?
Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2
26.08.2013, 03:34, Andrew Beekhof and...@beekhof.net:
On 23/08/2013, at 9:39 PM, Andrey Groshev gre...@yandex.ru wrote:

Hello,

Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following:

If I reset a cman cluster with

cibadmin --erase --force

the node names still exist in the CIB.

Yes, the cluster puts back entries for all the nodes it knows about automagically.

cibadmin -Ql
.
<nodes>
  <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
  <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
  <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>

This happens even if cman and pacemaker are running on only one node.

I'm assuming all three are configured in cluster.conf?

Yes, the node list is there.

If I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist. If it did, you should get the same behaviour.

I tried both expected_node and nodelist.

Now in corosync.conf:

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.76.157.18
        mcastaddr: 239.94.1.56
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: on
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 10.76.157.17
    }
    node {
        ring0_addr: 10.76.157.18
    }
    node {
        ring0_addr: 10.76.157.19
    }
}
[Pacemaker] different behavior cibadmin -Ql with cman and corosync2
Hello,

Today I tried to rebuild my test cluster, moving from cman to corosync2. I noticed the following:

If I reset a cman cluster with

cibadmin --erase --force

the node names still exist in the CIB.

cibadmin -Ql
.
<nodes>
  <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
  <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
  <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
</nodes>

This happens even if cman and pacemaker are running on only one node.

If I do the same on a cluster with corosync2, I see only the names of the nodes that are running corosync and pacemaker.

Let me explain why this is inconvenient. I need to set an attribute before starting a resource. On a cluster with cman I can do that, but with corosync2 I can't. Is there another way?