On 03/09/2013, at 11:49 PM, Christine Caulfield <ccaul...@redhat.com> wrote:
> On 03/09/13 05:20, Andrew Beekhof wrote:
>> On 02/09/2013, at 5:27 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>> 30.08.2013, 07:18, "Andrew Beekhof" <and...@beekhof.net>:
>>>> On 29/08/2013, at 7:31 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>> 29.08.2013, 12:25, "Andrey Groshev" <gre...@yandex.ru>:
>>>>>> 29.08.2013, 02:55, "Andrew Beekhof" <and...@beekhof.net>:
>>>>>>> On 28/08/2013, at 5:38 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>>>>> 28.08.2013, 04:06, "Andrew Beekhof" <and...@beekhof.net>:
>>>>>>>>> On 27/08/2013, at 1:13 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>>>>>>> 27.08.2013, 05:39, "Andrew Beekhof" <and...@beekhof.net>:
>>>>>>>>>>> On 26/08/2013, at 3:09 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>>>>>>>>> 26.08.2013, 03:34, "Andrew Beekhof" <and...@beekhof.net>:
>>>>>>>>>>>>> On 23/08/2013, at 9:39 PM, Andrey Groshev <gre...@yandex.ru> wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Today I tried to remake my test cluster from cman to corosync2,
>>>>>>>>>>>>>> and I noticed the following: if I erase the cluster with cman
>>>>>>>>>>>>>> through "cibadmin --erase --force", the names of the nodes
>>>>>>>>>>>>>> still exist in the CIB.
>>>>>>>>>>>>> Yes, the cluster puts back entries for all the nodes it knows
>>>>>>>>>>>>> about automagically.
>>>>>>>>>>>>>> cibadmin -Ql
>>>>>>>>>>>>>> .....
>>>>>>>>>>>>>> <nodes>
>>>>>>>>>>>>>>   <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
>>>>>>>>>>>>>>   <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
>>>>>>>>>>>>>>   <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
>>>>>>>>>>>>>> </nodes>
>>>>>>>>>>>>>> ....
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This happens even if cman and pacemaker are running on only
>>>>>>>>>>>>>> one node.
>>>>>>>>>>>>> I'm assuming all three are configured in cluster.conf?
>>>>>>>>>>>> Yes, the list of nodes exists there.
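[Editor's note: for context, the cman-era cluster.conf assumed above would list all three nodes in a <clusternodes> section. A minimal sketch follows; the node names come from the thread, while the cluster name, config_version, and nodeids are hypothetical.]

```
<?xml version="1.0"?>
<!-- Minimal cluster.conf sketch; nodeids here are illustrative only -->
<cluster name="dev-cluster2" config_version="1">
  <clusternodes>
    <clusternode name="dev-cluster2-node2" nodeid="2"/>
    <clusternode name="dev-cluster2-node3" nodeid="3"/>
    <clusternode name="dev-cluster2-node4" nodeid="4"/>
  </clusternodes>
</cluster>
```

With a configuration like this, pacemaker on cman learns about all three nodes from cluster.conf, which is why the <nodes> section is repopulated even when only one node is up.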
>>>>>>>>>>>>>> And if I do the same on a cluster with corosync2,
>>>>>>>>>>>>>> I see only the names of the nodes which run corosync and
>>>>>>>>>>>>>> pacemaker.
>>>>>>>>>>>>> Since you've not included your config, I can only guess that
>>>>>>>>>>>>> your corosync.conf does not have a nodelist.
>>>>>>>>>>>>> If it did, you should get the same behaviour.
>>>>>>>>>>>> I tried both expected_node and nodelist.
>>>>>>>>>>> And it didn't work? What version of pacemaker?
>>>>>>>>>> It does not work as I expected.
>>>>>>>>> That's because you've used IP addresses in the node list.
>>>>>>>>> ie.
>>>>>>>>>
>>>>>>>>> node {
>>>>>>>>>     ring0_addr: 10.76.157.17
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> try including the node name as well, eg.
>>>>>>>>>
>>>>>>>>> node {
>>>>>>>>>     name: dev-cluster2-node2
>>>>>>>>>     ring0_addr: 10.76.157.17
>>>>>>>>> }
>>>>>>>> The same thing.
>>>>>>> I don't know what to say. I tested it here yesterday and it worked
>>>>>>> as expected.
>>>>>> I found the reason why you and I have different results: I did not
>>>>>> have a reverse DNS zone for these nodes.
>>>>>> I know there should be one, but (pacemaker + cman) worked without
>>>>>> the reverse zone!
>>>>> I was hasty. Deleted everything. Reinstalled. Configured. Not working
>>>>> again. Damn!
>>>>
>>>> That would have surprised me... pacemaker 1.1.11 doesn't do any DNS
>>>> lookups, reverse or otherwise.
>>>> Can you set
>>>>
>>>> PCMK_trace_files=corosync.c
>>>>
>>>> in your environment and retest?
>>>>
>>>> On RHEL6 that means putting the following in /etc/sysconfig/pacemaker:
>>>> export PCMK_trace_files=corosync.c
>>>>
>>>> It should produce additional logging[1] that will help diagnose the
>>>> issue.
>>>>
>>>> [1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/
>>>>
>>>
>>> Hello, Andrew.
>>>
>>> You have misunderstood me a little.
>>
>> No, I understood you fine.
>>
>>> I wrote that I rushed to judgment.
>>> After I created the reverse DNS zone, the cluster behaved correctly.
>>> BUT after I took the cluster apart, dropped the configs and restarted
>>> as a new cluster, the cluster again didn't show all the nodes in
>>> <nodes> (only the node with pacemaker running).
>>>
>>> A small portion of the log, in which (I thought) there is something
>>> interesting:
>>>
>>> Aug 30 12:31:11 [9986] dev-cluster2-node4 cib: ( corosync.c:423 ) trace: check_message_sanity: Verfied message 4: (dest=<all>:cib, from=dev-cluster2-node4:cib.9986, compressed=0, size=1551, total=2143)
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:96 ) trace: corosync_node_name: Checking 172793107 vs 0 from nodelist.node.0.nodeid
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( ipcc.c:378 ) debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-request-9616-9989-27-header
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-response-9616-9989-27-header
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: (ringbuffer.c:294 ) debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cmap-event-9616-9989-27-header
>>> Aug 30 12:31:11 [9989] dev-cluster2-node4 attrd: ( corosync.c:134 ) notice: corosync_node_name: Unable to get node name for nodeid 172793107
>>
>> I wonder if you need to be including the nodeid too. ie.
>>
>> node {
>>     name: dev-cluster2-node2
>>     ring0_addr: 10.76.157.17
>>     nodeid: 2
>> }
>>
>> I _thought_ that was implicit.
>> Chrissie: is "nodelist.node.%d.nodeid" always available for corosync2,
>> or only if explicitly defined in the config?
>>
>
> You do need to specify a nodeid if you don't want corosync to imply it
> from the IP address (or if you're using IPv6). corosync won't imply a
> nodeid from the order of the nodes in corosync.conf - that's not
> reliable enough.
Right, but is that implied nodeid available as "nodelist.node.%d.nodeid"?
Andrey's results suggest "no", and I would claim this is not expected/good :)

> Also bear in mind that 0 is not a valid node number :-)
>
> Chrissie
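[Editor's note: pulling the advice in this thread together, a corosync.conf nodelist that names every node and pins an explicit nodeid would look roughly like the sketch below. Only 10.76.157.17 and the node names appear in the thread; the other addresses and all nodeids are assumptions.]

```
# Sketch of the suggested corosync.conf nodelist.
# Names are from the thread; the .18/.19 addresses and nodeids are hypothetical.
nodelist {
    node {
        name: dev-cluster2-node2
        ring0_addr: 10.76.157.17
        nodeid: 2
    }
    node {
        name: dev-cluster2-node3
        ring0_addr: 10.76.157.18
        nodeid: 3
    }
    node {
        name: dev-cluster2-node4
        ring0_addr: 10.76.157.19
        nodeid: 4
    }
}
```

Whether the nodeid keys actually made it into cmap can be checked on a running node with something like "corosync-cmapctl | grep nodelist.node", which should print a nodelist.node.X.nodeid entry for every configured node; if those keys are missing, pacemaker's corosync_node_name() lookup (visible in the trace log above) has nothing to match against.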
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org