Greetings,

Heh. Well, the comment in corosync.conf makes sense to me now. Thanks, I've fixed that.
Here's my corosync.conf
----------------------------------------
totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.1.0.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
        ttl: 1
    }
    cluster_name: pecan
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 1
}

service {
    name: pacemaker
    ver: 1
}

nodelist {
    node {
        ring0_addr: smoking
        nodeid: 1
    }
    node {
        ring0_addr: mars
        nodeid: 2
    }
}
----------------------------------------

And a few things are behaving better than they did before.

At the moment my goal is to set up a partition as drbd. In the interest of bandwidth I will show the commands that I use and the result I finally get.

----------------------------------------
pcs cluster auth smoking mars
pcs property set stonith-enabled=true
stonith_admin --metadata --agent fence_pcmk
cibadmin -C -o resources --xml-file stonith.xml
pcs resource create floating_ip IPaddr2 ip=10.1.2.101 cidr_netmask=32
pcs resource defaults resource-stickiness=100
----------------------------------------

And at this point, all appears well. My pcs status output looks like I think it should.

Now, of course, I admit that setting up the floating_ip is not relevant to my goal of a drbd backed filesystem, but I've been doing it as a sanity check.
On to drbd

----------------------------------------
modprobe drbd
systemctl start drbd.service
[root@smoking cluster]# cat /proc/drbd
version: 8.4.8-1 (api:1/proto:86-101)
GIT-hash: 22b4c802192646e433d3f7399d578ec7fecc6272 build by mockbuild@, 2016-10-13 19:58:26
 0: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:10574 dw:10574 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 2: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
----------------------------------------

Again, this is stuff that hung around from the previous incarnation. But it looks okay to me. I'm planning to use the '1' device. The above is run on the secondary machine, so Secondary/Primary is correct. And UpToDate/UpToDate looks right to me.

Now it goes south. The mkfs.xfs appears to work, but that's not relevant anyway, right?
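The by-eye check of /proc/drbd above can be scripted so the same test runs on both nodes before DRBD is handed to the cluster. This is a minimal sketch, not part of the setup described here: the function names `drbd_state` and `drbd_ready` are made up for illustration, and it assumes the DRBD 8.4 /proc/drbd layout shown above, where each device line starts with the minor number followed by a colon.

```shell
#!/bin/sh
# Sketch: pull the cs/ro/ds fields for one DRBD minor out of /proc/drbd.
# drbd_state and drbd_ready are hypothetical helper names.

drbd_state() {
    # $1 = minor number, $2 = status file (defaults to /proc/drbd)
    awk -v minor="$1" '
        $1 == (minor ":") {
            sep = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^(cs|ro|ds):/) {
                    printf "%s%s", sep, $i
                    sep = " "
                }
            }
            print ""
        }
    ' "${2:-/proc/drbd}"
}

drbd_ready() {
    # Succeeds only if the minor is connected and both sides are UpToDate.
    state=$(drbd_state "$1" "${2:-/proc/drbd}")
    case "$state" in
        *"cs:Connected"*"ds:UpToDate/UpToDate"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: drbd_ready 1 && echo "minor 1 is connected and in sync"
```

With the status shown above, minor 1 would pass this check, while minors 0 and 2 (Diskless/Diskless) would not.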
----------------------------------------
pcs resource create BravoSpace \
    ocf:linbit:drbd drbd_resource=bravo \
    op monitor interval=60s

[root@smoking ~]# pcs status
Cluster name: pecan
Last updated: Sat Oct 15 01:33:37 2016
Last change: Sat Oct 15 01:18:56 2016 by root via cibadmin on mars
Stack: corosync
Current DC: mars (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 3 resources configured

Node mars: UNCLEAN (online)
Node smoking: UNCLEAN (online)

Full list of resources:

 Fencing        (stonith:fence_pcmk):       Started mars
 floating_ip    (ocf::heartbeat:IPaddr2):   Started mars
 BravoSpace     (ocf::linbit:drbd):         FAILED[ smoking mars ]

Failed Actions:
* BravoSpace_stop_0 on smoking 'not configured' (6): call=18, status=complete, exitreason='none',
    last-rc-change='Sat Oct 15 01:18:56 2016', queued=0ms, exec=63ms
* BravoSpace_stop_0 on mars 'not configured' (6): call=18, status=complete, exitreason='none',
    last-rc-change='Sat Oct 15 01:18:56 2016', queued=0ms, exec=60ms

PCSD Status:
  smoking: Online
  mars: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled
----------------------------------------

I've looked in /var/log/cluster/corosync.log and it doesn't seem happy, but I don't know what I'm looking at. On the primary machine it's 1800+ lines; on the secondary it's 600+ lines. There are 337 lines with BravoSpace in them alone. One of them says

drbd(BravoSpace)[3295]: 2016/10/15_01:18:56 ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset.

I tried adding clone-max=2, but the command barfed: that's not a legal parameter. So, what's wrong? (I'm a newbie, of course.)

I did a pcs resource cleanup. That shut down fencing and the IP. I tried pcs cluster start to get them back, no help. I did pcs cluster standby smoking, and then unstandby smoking. The IP started, but fencing has failed on BOTH machines.

I can't see what I'm doing wrong.

Thanks. I realize I'm consuming your time on the cheap.
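For reference, the clone-max in that error message is a meta attribute of a master/slave (multi-state) resource wrapped around the drbd primitive, not a parameter of the primitive itself, which would explain why adding clone-max=2 to the create command is rejected. The sketch below follows the approach in Clusters from Scratch and rests on several assumptions: pcs 0.9.x as shipped with el7 (where the subcommand is `pcs resource master`), a hypothetical wrapper name `BravoSpaceClone`, a hypothetical scratch file `drbd_cfg`, and the failed BravoSpace primitive having been removed first. The `run` helper and the `DRY_RUN` variable are illustrative only; by default the script just prints the commands.

```shell
#!/bin/sh
# Sketch: create the drbd primitive and its master/slave wrapper in one CIB push.
# BravoSpaceClone and drbd_cfg are hypothetical names. DRY_RUN=1 (the default)
# only prints the commands; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_drbd_master() {
    # Work on an offline copy of the CIB so the primitive never runs unwrapped.
    run pcs cluster cib drbd_cfg
    run pcs -f drbd_cfg resource create BravoSpace ocf:linbit:drbd \
        drbd_resource=bravo op monitor interval=60s
    # clone-max and friends are meta attributes of the master/slave resource.
    run pcs -f drbd_cfg resource master BravoSpaceClone BravoSpace \
        master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
    run pcs cluster cib-push drbd_cfg
}

# Example: setup_drbd_master            (prints the four commands)
#          DRY_RUN=0 setup_drbd_master  (runs them)
```

Building both resources in the scratch file and pushing them together means the cluster never sees the bare primitive, which is the state that produced the failed stop actions above.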
On Fri, Oct 14, 2016 at 3:33 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:

> On 10/14/2016 02:48 PM, Jay Scott wrote:
>
> > When I "start over" I stop all the services, delete the packages,
> > empty the configs and logs as best I know how. But this doesn't
> > completely clear everything: the drbd metadata is evidently still
> > on the partitions I've set aside for it.
>
> If it's small enough, dd if=/dev/zero of=/your/partition
>
> Get DRBD working and fully sync'ed outside of the cluster before you
> start adding it.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org