[Pacemaker] HA setup of MySQL service using Pacemaker/DRBD
Hi,

I have a working 2-node HA setup running on CentOS 6.5 with a very simple Apache web server serving a replicated index.html using DRBD 8.4. The setup is based on "Clusters from Scratch" Edition 5 for Fedora 13.

I now wish to replace Apache with a MySQL database, or simply add MySQL alongside it. How can I do so? I'm guessing the following:

1. Add the MySQL service to the cluster with a "crm configure primitive" command. I'm not sure what the params should be, though, e.g. the configfile.
2. Set the same colocation/order rules.
3. Create/initialize a separate DRBD partition for MySQL (or can I reuse the same partition as Apache, assuming I'll never exceed its capacity?).
4. Copy the database/tables into the mounted DRBD partition.
5. Configure the cluster for DRBD as per Chapter 7.4 of the guide.

Is this correct? Step-by-step instructions would be appreciated. I have some experience with RHEL/CentOS, but none with HA or MySQL.

Thanks!

--
- Goi Sihan
gois...@gmail.com
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
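One way the five steps above might translate into crm shell configuration is sketched below. It mirrors the Apache/DRBD layout from "Clusters from Scratch" and assumes a second DRBD resource, here called mysqldata, mounted on /var/lib/mysql, with MySQL managed by the ocf:heartbeat:mysql resource agent. The resource and constraint names (MySQLData, MySQLFS, MySQLServer and so on), the DRBD resource name, the config path, and the data directory are illustrative assumptions, not values taken from the guide.

```
# Assumed: a DRBD resource named "mysqldata" already exists and is in sync.
primitive MySQLData ocf:linbit:drbd \
        params drbd_resource=mysqldata \
        op monitor interval=60s
ms MySQLDataClone MySQLData \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
# Assumed mount point: the MySQL data directory.
primitive MySQLFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/mysqldata" directory="/var/lib/mysql" fstype=ext4
# Assumed config file location; adjust to the actual installation.
primitive MySQLServer ocf:heartbeat:mysql \
        params config="/etc/my.cnf" datadir="/var/lib/mysql" \
        op monitor interval=30s
colocation mysqlfs-on-drbd inf: MySQLFS MySQLDataClone:Master
order mysqlfs-after-drbd inf: MySQLDataClone:promote MySQLFS:start
colocation mysql-with-fs inf: MySQLServer MySQLFS
order mysql-after-fs inf: MySQLFS MySQLServer
```

Reusing the Apache partition would probably work as far as capacity goes, but a single file system can only be mounted on one node at a time, so MySQL and Apache would then always have to run on the same node. A separate DRBD resource keeps the two services independent.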
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

So it seems that my setup was not working because SELinux was not disabled. Once I disabled it, my web server displayed the correct index.html.

In the master node's /var/www/html I see the correct index.html, but in the slave's /var/www/html I still see the old index.html. Once I do a failover and the slave becomes the master, I see the correct index.html in the new master's /var/www/html, and the website works as expected with no downtime.

Is this the correct behavior? I was under the impression that both nodes would reflect the same contents, and that whatever is changed on the master would be replicated in near real time to the slave.

Also, I now wish to put a MySQL database on the DRBD block device. What would the procedure be? I suppose it would be similar to the Apache example, except that /var/www/html would be replaced by wherever the DB is installed?

On Thu, Nov 13, 2014 at 9:42 AM, Sihan Goi wrote:
> Hi,
>
> getenforce returns "Enforcing"
> ls -dZ /var/www/html returns "drwxr-xr-x. root root
> system_u:object_r:httpd_sys_content_t:s0 /var/www/html" on both nodes.
>
> Running restorecon doesn't change the ls -dZ output.
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

getenforce returns "Enforcing".
ls -dZ /var/www/html returns "drwxr-xr-x. root root system_u:object_r:httpd_sys_content_t:s0 /var/www/html" on both nodes.

Running restorecon doesn't change the ls -dZ output.

On Wed, Nov 12, 2014 at 2:24 PM, Vladislav Bogdanov wrote:
> 11.11.2014 07:27, Sihan Goi wrote:
> > Hi,
> >
> > DocumentRoot is still set to /var/www/html
> > ls -al /var/www/html shows different things on the 2 nodes
> > node01:
> >
> > total 28
> > drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
> > drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
> > -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
> > drwx------. 2 root root 16384 Oct 28 17:59 lost+found
> >
> > node02 only has index.html, no lost+found, and it's a different version
> > of the file.
>
> It looks like apache is unable to stat its document root.
> Could you please show the output of two commands:
>
> getenforce
> ls -dZ /var/www/html
>
> on both nodes when the fs is mounted on one of them?
> If you see 'Enforcing', and the last part of the selinux context of a
> mounted fs root is not httpd_sys_content_t, then run
> 'restorecon -R /var/www/html' on that node.
>
> > Status URL is enabled in both nodes.
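The check suggested in the quoted reply (compare the type part of the SELinux context against httpd_sys_content_t) can be sketched as a small shell helper. The function names selinux_type and check_web_context are hypothetical and exist only for this illustration; the context string is the one reported in the message above.

```shell
#!/bin/sh
# Extract the SELinux type (third colon-separated field) from a context
# string such as "system_u:object_r:httpd_sys_content_t:s0".
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Hypothetical helper: report whether a context carries the type that the
# quoted advice expects on the mounted document root.
check_web_context() {
    if [ "$(selinux_type "$1")" = "httpd_sys_content_t" ]; then
        echo "context ok"
    else
        echo "context mismatch: consider restorecon -R on the mounted directory"
    fi
}

# On a live node the context string would come from: ls -dZ /var/www/html
check_web_context "system_u:object_r:httpd_sys_content_t:s0"
```

The context has to be read while the DRBD file system is mounted, because the label of the mounted file system's root can differ from the label of the underlying local directory.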
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

I'm fluent in English, so I doubt it's a language barrier. I have reasonable user experience in Linux, though not extensive experience with the various system commands, and I have zero experience in HA. I'm in fact trying to make things as simple as possible by following the "Clusters from Scratch" guide step by step, and only modifying or omitting steps when they don't work.

I know a block device (like /dev/sda) is a device (such as a hard disk) that appears as a file in Linux, allowing users buffered access to the device.
I know a file system is something like FAT/NTFS/ext2/etc.
I know a mount point is a directory onto which you can mount an image file containing a file system. Once mounted, it is as if the entire file system has the mount point as its root directory.

I set up DRBD almost exactly as instructed in Chapter 7 of "Clusters from Scratch". The only differences are in our setups: the guide assumes Fedora 13 and DRBD 8.3, while I'm using CentOS 6.5 and DRBD 8.4.

Since I was following the guide from start to finish, /var/www/html already has an index.html in it. node01 has its own index.html and node02 has its own index.html, each with different content. The guide did not instruct me to delete these files, and it seems to configure the mount point to be /var/www/html (Chapter 7.4) with an ext4 file system, hence mounting the image onto a directory that already has files in it. Is this a problem?

On Tue, Nov 11, 2014 at 6:07 PM, Lars Ellenberg wrote:
> On Tue, Nov 11, 2014 at 12:27:23PM +0800, Sihan Goi wrote:
> > Hi,
> >
> > DocumentRoot is still set to /var/www/html
> > ls -al /var/www/html shows different things on the 2 nodes
> > node01:
> >
> > total 28
> > drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
> > drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
> > -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
> > drwx------. 2 root root 16384 Oct 28 17:59 lost+found
> >
> > node02 only has index.html, no lost+found, and it's a different version
> > of the file.
>
> I'm unsure if there is just a language barrier,
> or if you just do not have enough experience with Linux in general,
> or if you are trying to make things more complicated than they are.
>
> Do you know
> * what a block device is?
> * what a file system is?
> * what a mount point is?
> * that a mount point may not be empty, even though it typically is?
> * what it means to mount a file system to a mount point?
>
> Assuming you set up DRBD in a sane way,
> and it is mounted on *one* node (the node where it is Primary),
> then on the *other* node, where it is NOT mounted,
> you will only see the mount point,
> and whatever happens to be in there.
>
> You probably should clear out the contents of that mount point,
> so that you'd have an empty mount point.
>
> Or, if you like, replace it with some "dummy" content
> that clearly shows that this is the mount point,
> and not the file system that is intended to be mounted there.
>
> > Status URL is enabled in both nodes.
>
> As for the "DocumentRoot must be a directory",
> please double check for typos...
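The distinction drawn in the quoted reply, between the mount point itself and the file system mounted on it, can be checked on each node with a short shell sketch. The helper names describe_dir and mark_unmounted and the marker file name NOT_MOUNTED.txt are hypothetical; mountpoint is the standard util-linux command.

```shell
#!/bin/sh
# Report whether a directory currently has a file system mounted on it.
describe_dir() {
    if mountpoint -q "$1" 2>/dev/null; then
        echo "$1 is a mounted file system"
    else
        echo "$1 is a plain local directory"
    fi
}

# Hypothetical helper for the "dummy content" suggestion: drop a marker file
# into the directory, but only while nothing is mounted there.
mark_unmounted() {
    if mountpoint -q "$1" 2>/dev/null; then
        echo "refusing: $1 is mounted" >&2
        return 1
    fi
    echo "Mount point only; the DRBD file system is not mounted here." > "$1/NOT_MOUNTED.txt"
}

# Example, to be run on each node:
#   describe_dir /var/www/html
```

On the node where DRBD is Primary and the Filesystem resource is started, describe_dir should report a mounted file system; on the other node it should report a plain local directory, whose contents are local to that node and are not replicated.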
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

DocumentRoot is still set to /var/www/html.
ls -al /var/www/html shows different things on the 2 nodes.
node01:

total 28
drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
-rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
drwx------. 2 root root 16384 Oct 28 17:59 lost+found

node02 only has index.html, no lost+found, and it's a different version of the file.

Status URL is enabled on both nodes.

On Oct 30, 2014 11:14 AM, "Andrew Beekhof" wrote:
> > On 29 Oct 2014, at 1:01 pm, Sihan Goi wrote:
> >
> > Hi,
> >
> > I've never used crm_report before. I just read the man file and
> > generated a tarball from 1-2 hours before I reconfigured all the DRBD
> > related resources. I've put the tarball here -
> > https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0
> >
> > Hope you can help figure out what I'm doing wrong. Thanks for the help!
>
> Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start for /dev/drbd/by-res/wwwdata on /var/www/html
> Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem with ordered data mode. Opts:
> Oct 28 18:13:39 node02 crmd[9870]: notice: process_lrm_event: LRM operation WebFS_start_0 (call=164, rc=0, cib-update=298, confirmed=true) ok
> Oct 28 18:13:39 node02 crmd[9870]: notice: te_rsc_command: Initiating action 7: start WebSite_start_0 on node02 (local)
> Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error on line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
>
> Is DocumentRoot still set to /var/www/html?
> If so, what happens if you run 'ls -al /var/www/html' in a shell?
>
> Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not running
> Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
>
> Did you enable the status url?
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html
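The two questions in the quoted reply (what is DocumentRoot set to, and is that path a directory) can be combined into one small check. This is a rough sketch only: docroot_of and check_docroot are hypothetical helper names, and the parsing assumes a single-line DocumentRoot directive whose path contains no spaces.

```shell
#!/bin/sh
# Print the first active DocumentRoot value from an Apache config file,
# with surrounding double quotes removed. Commented-out lines are skipped
# because their first field is not exactly "DocumentRoot".
docroot_of() {
    awk 'tolower($1) == "documentroot" { gsub(/"/, "", $2); print $2; exit }' "$1"
}

# Report whether the configured DocumentRoot is a directory for this user.
check_docroot() {
    root=$(docroot_of "$1")
    if [ -n "$root" ] && [ -d "$root" ]; then
        echo "ok: $root is a directory"
    else
        echo "problem: '$root' is not an accessible directory"
    fi
}

# Example: check_docroot /etc/httpd/conf/httpd.conf
```

A result of "ok" from a root shell does not prove that httpd can reach the directory. With SELinux in Enforcing mode, the httpd process can still be denied access to a path that root can list.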
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

I've never used crm_report before. I just read the man page and generated a tarball covering the 1-2 hours before I reconfigured all the DRBD-related resources. I've put the tarball here:
https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0

Hope you can help figure out what I'm doing wrong. Thanks for the help!

On Wed, Oct 29, 2014 at 9:24 AM, Andrew Beekhof wrote:
> Can you run crm_report so we can see the logs and PE files?
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

I followed those steps previously. I just tried it again, but I'm still getting the same error. My "crm configure show" shows the following:

node node01 \
        attributes standby=off
node node02
primitive ClusterIP IPaddr2 \
        params ip=192.168.1.110 cidr_netmask=24 \
        op monitor interval=30s
primitive WebData ocf:linbit:drbd \
        params drbd_resource=wwwdata \
        op monitor interval=60s
primitive WebFS Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype=ext4
primitive WebSite apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval=1min
ms WebDataClone WebData \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
location prefer-node01 WebSite 50: node01
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip Mandatory: ClusterIP WebSite
property cib-bootstrap-options: \
        dc-version=1.1.10-14.el6_5.3-368c726 \
        cluster-infrastructure=cman \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc_defaults-options: \
        migration-threshold=1

What am I doing wrong?

On Tue, Oct 28, 2014 at 5:11 PM, Andrew Beekhof wrote:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_configure_the_cluster_for_drbd.html
>
> Look for "Now that DRBD is functioning we can configure a Filesystem
> resource to use it"
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

No, I did not do this. I followed "Pacemaker 1.1 - Clusters from Scratch", Edition 5 for Fedora 13, and in section 7.3.4 it instructed me to run the following steps, which I did:

mkfs.ext4 /dev/drbd1
mount /dev/drbd1 /mnt
create index.html file in /mnt
umount /dev/drbd1

After unmounting, there were no further instructions to mount any other directories.

So, how should I mount /dev/mapper/vg_node02-drbd--demo to /var/www/html? Should I be mounting /dev/mapper/vg_node02-drbd--demo, or /dev/drbd1? Since I've already created index.html in /dev/drbd1, should I be mounting that? I'm a little confused here.

On Tue, Oct 28, 2014 at 11:41 AM, Andrew Beekhof wrote:
> Um, you need to mount /dev/mapper/vg_node02-drbd--demo to /var/www/html
> with a FileSystem resource.
> Have you not done this?
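For reference, the Filesystem resource and constraints that appear in the "crm configure show" output posted elsewhere in this thread are reproduced below as a fragment. In that configuration the device being mounted is the DRBD device, addressed as /dev/drbd/by-res/wwwdata (the kernel log in this thread shows it as drbd1), not the backing logical volume.

```
primitive WebFS Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype=ext4
colocation fs_on_drbd inf: WebFS WebDataClone:Master
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
```

The backing device (/dev/mapper/vg_node02-drbd--demo here) should not be mounted directly while DRBD is using it, since writes made that way would bypass replication. The colocation and order constraints ensure the file system is mounted only on the node where DRBD has been promoted to Master.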
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

The offending line is as follows:
DocumentRoot "/var/www/html"

I'm guessing it needs to be updated to the DRBD block device, but I'm not sure how to do that, or even what the block device is.

fdisk -l shows the following, which I'm guessing is the block device?
/dev/mapper/vg_node02-drbd--demo

lvs shows the following:
drbd-demo vg_node02 -wi-ao 1.00g

Btw, I'm running the commands on node02 (secondary) rather than node01 (primary). It's just a matter of convenience due to the physical location of the machine. Does it matter?

Thanks.

On Mon, Oct 27, 2014 at 11:35 AM, Andrew Beekhof wrote:
> Oct 27 10:28:44 node02 apache(WebSite)[10515]: ERROR: Syntax error on line
> 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi Andrew,

Logs in /var/log/httpd/ are empty, but here's a snippet of /var/log/messages right after I start pacemaker and do a "crm status":

http://pastebin.com/ivQdyV4u

Seems like the Apache service doesn't come up. This only happens after I run the commands in the guide to configure DRBD.

On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof wrote:
> logs?
>
> > On 23 Oct 2014, at 1:08 pm, Sihan Goi wrote:
> >
> > Hi, can anyone help? Really stuck here...
> >
> > On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi wrote:
> > Hi,
> >
> > I'm following the "Clusters from Scratch" guide for Fedora 13, and I've managed to get a 2 node cluster working with Apache. However, once I tried to add DRBD 8.4 to the mix, it stopped working.
> >
> > I've followed the DRBD steps in the guide all the way till "cib commit fs" in Section 7.4, right before "Testing Migration". However, when I do a crm_mon, I get the following "failed actions".
> >
> > Last updated: Thu Oct 16 17:28:34 2014
> > Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
> > Stack: cman
> > Current DC: node02 - partition with quorum
> > Version: 1.1.10-14.el6_5.3-368c726
> > 2 Nodes configured
> > 5 Resources configured
> >
> > Online: [ node01 node02 ]
> >
> > ClusterIP (ocf::heartbeat:IPaddr2): Started node02
> > Master/Slave Set: WebDataClone [WebData]
> >     Masters: [ node02 ]
> >     Slaves: [ node01 ]
> > WebFS (ocf::heartbeat:Filesystem): Started node02
> >
> > Failed actions:
> >     WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
> >     WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
> >
> > Seems like the apache Website resource isn't starting up. Apache was working just fine before I configured DRBD. What did I do wrong?
--
- Goi Sihan
gois...@gmail.com
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi, can anyone help? Really stuck here...

On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi wrote:
> Hi,
>
> I'm following the "Clusters from Scratch" guide for Fedora 13, and I've managed to get a 2 node cluster working with Apache. However, once I tried to add DRBD 8.4 to the mix, it stopped working.
>
> I've followed the DRBD steps in the guide all the way till "cib commit fs" in Section 7.4, right before "Testing Migration". However, when I do a crm_mon, I get the following "failed actions".
>
> Last updated: Thu Oct 16 17:28:34 2014
> Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
> Stack: cman
> Current DC: node02 - partition with quorum
> Version: 1.1.10-14.el6_5.3-368c726
> 2 Nodes configured
> 5 Resources configured
>
> Online: [ node01 node02 ]
>
> ClusterIP (ocf::heartbeat:IPaddr2): Started node02
> Master/Slave Set: WebDataClone [WebData]
>     Masters: [ node02 ]
>     Slaves: [ node01 ]
> WebFS (ocf::heartbeat:Filesystem): Started node02
>
> Failed actions:
>     WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
>     WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
>
> Seems like the apache Website resource isn't starting up. Apache was working just fine before I configured DRBD. What did I do wrong?

--
- Goi Sihan
gois...@gmail.com
[Pacemaker] DRBD with Pacemaker on CentOs 6.5
Hi,

I'm following the "Clusters from Scratch" guide for Fedora 13, and I've managed to get a 2 node cluster working with Apache. However, once I tried to add DRBD 8.4 to the mix, it stopped working.

I've followed the DRBD steps in the guide all the way till "cib commit fs" in Section 7.4, right before "Testing Migration". However, when I do a crm_mon, I get the following "failed actions".

Last updated: Thu Oct 16 17:28:34 2014
Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
Stack: cman
Current DC: node02 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
5 Resources configured

Online: [ node01 node02 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started node02
Master/Slave Set: WebDataClone [WebData]
    Masters: [ node02 ]
    Slaves: [ node01 ]
WebFS (ocf::heartbeat:Filesystem): Started node02

Failed actions:
    WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
    WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms

Seems like the apache Website resource isn't starting up. Apache was working just fine before I configured DRBD. What did I do wrong?

--
- Goi Sihan
gois...@gmail.com
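A small sketch for pulling the failed start actions out of status output like the listing above. It assumes the "Failed actions:" line format shown in this thread; failed_starts is a hypothetical helper name, not a Pacemaker tool.

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon-style text on stdin and print
# "<resource> <node>" for each failed start action.
failed_starts() {
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_-]*\)_start_0 on \([^[:space:]]*\) .*/\1 \2/p'
}

# Sample input shortened from the status output in this thread.
sample="Failed actions:
    WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out
    WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out"

printf '%s\n' "$sample" | failed_starts
```

On a live node the input would come from "crm_mon -1" instead of the sample text. Once the underlying cause is fixed, crmsh's "crm resource cleanup WebSite" clears the recorded failure so the cluster retries the start.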
Re: [Pacemaker] Linux HA setup for CentOS 6.5
Thanks!

OK, so I've followed the DRBD steps in the guide all the way till "cib commit fs" in Section 7.4, right before "Testing Migration". However, when I do a crm_mon, I get the following "failed actions".

Last updated: Thu Oct 16 17:28:34 2014
Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
Stack: cman
Current DC: node02 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
5 Resources configured

Online: [ node01 node02 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started node02
Master/Slave Set: WebDataClone [WebData]
    Masters: [ node02 ]
    Slaves: [ node01 ]
WebFS (ocf::heartbeat:Filesystem): Started node02

Failed actions:
    WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
    WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms

Seems like the apache Website resource isn't starting up. Apache was working just fine before I configured DRBD. What did I do wrong?

On Thu, Oct 16, 2014 at 1:49 PM, Digimer wrote:
> On 16/10/14 12:14 AM, Sihan Goi wrote:
>
>> After following the guide, I've successfully managed to get Apache server up and running in the cluster as an active/passive setup, but with some differences. My cluster stack is stated as being cman while the guide's is openais. Not sure if that's a problem. Also, some commands in the guide don't seem to work.
>
> If you can provide examples of what issues you're having, I will be happy to try and help.
>
>> I'm moving on to DRBD installation now, but when I do a "yum install drbd-pacemaker drbd-udev", these packages are not available. After some googling, it seems that drbd83-utils/kmod-drbd83 or drbd84-utils/kmod-drbd84 is available via another repo. Does this work with the guide?
>
> You need to get them from a 3rd party repo (or install from source). I personally still use 8.3.16 (consistency during "Anvil!" generations), but I know that 8.4 is fine on EL6 (and EL7, to address an earlier comment). I have my own repos with these packages, but you would likely be better served using the ELRepo ones.
>
> https://alteeve.ca/w/AN!Cluster_Tutorial_2#Installing_DRBD
>
> The only real difference is to s/83/84/:
>
> + yum install drbd84-utils kmod-drbd84
> - yum install drbd83-utils kmod-drbd83
>
> If you run into any troubles, please share details and I am sure we'll get you sorted out in no time.
>
> Cheers
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without access to education?

--
- Goi Sihan
gois...@gmail.com
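The s/83/84/ substitution described in the reply above can be sketched as a tiny helper. drbd_pkgs is a hypothetical name; the package names are the ones quoted in the thread, and the sketch assumes a third-party repo such as ELRepo has already been enabled.

```shell
#!/bin/sh
# Hypothetical helper: print the DRBD package pair for a given series
# (83 or 84), using the package naming quoted in the thread.
drbd_pkgs() {
    echo "drbd$1-utils kmod-drbd$1"
}

drbd_pkgs 84
# To install (as root, with the third-party repo enabled):
#   yum install $(drbd_pkgs 84)
```

Keeping the series number in one place makes it harder to mix the 8.3 userland tools with the 8.4 kernel module by accident.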
Re: [Pacemaker] Linux HA setup for CentOS 6.5
After following the guide, I've successfully managed to get Apache server up and running in the cluster as an active/passive setup, but with some differences. My cluster stack is stated as being cman while the guide's is openais. Not sure if that's a problem. Also, some commands in the guide don't seem to work.

I'm moving on to DRBD installation now, but when I do a "yum install drbd-pacemaker drbd-udev", these packages are not available. After some googling, it seems that drbd83-utils/kmod-drbd83 or drbd84-utils/kmod-drbd84 is available via another repo. Does this work with the guide?

On Thu, Oct 16, 2014 at 9:35 AM, Sihan Goi wrote:
> Hi,
>
> Thanks for the guide! I thought I had the same exact version...mine is also named "Pacemaker 1.1 Clusters from Scratch Creating Active/Passive and Active/Active Clusters on Fedora Edition 5", but my version of the document is meant for Fedora 17, and uses pcs and systemctl calls which don't exist on CentOS 6.5. I was trying to get it to work on CentOS 7 but realized support for DRBD on CentOS 7 is really lacking.
>
> I'll refer to the version you posted from hereon.
>
> On Wed, Oct 15, 2014 at 11:43 PM, Digimer wrote:
>
>> Let pacemaker start cman/corosync on EL6.
>>
>> This is the guide that covers it, written by Pacemaker's author:
>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html
>>
>> It notes that it's based on Fedora 13, but that maps to EL6 almost perfectly.
>>
>> A very slightly altered approach is here, in my *very* unfinished tutorial:
>>
>> https://alteeve.ca/w/Anvil!_Tutorial_3_on_EL6#Configuring_the_Anvil.21
>>
>> The main difference is that Andrew's approach (see section 8.2.2) is to disable quorum via editing /etc/sysconfig/cman, where my approach handles it in the main /etc/cluster/cluster.conf (cman's main config file).
>>
>> In any case, from then on, start pacemaker and let it handle everything else.
Re: [Pacemaker] Linux HA setup for CentOS 6.5
Hi,

Thanks for the guide! I thought I had the same exact version...mine is also named "Pacemaker 1.1 Clusters from Scratch Creating Active/Passive and Active/Active Clusters on Fedora Edition 5", but my version of the document is meant for Fedora 17, and uses pcs and systemctl calls which don't exist on CentOS 6.5. I was trying to get it to work on CentOS 7 but realized support for DRBD on CentOS 7 is really lacking.

I'll refer to the version you posted from hereon.

On Wed, Oct 15, 2014 at 11:43 PM, Digimer wrote:
> Let pacemaker start cman/corosync on EL6.
>
> This is the guide that covers it, written by Pacemaker's author:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html
>
> It notes that it's based on Fedora 13, but that maps to EL6 almost perfectly.
>
> A very slightly altered approach is here, in my *very* unfinished tutorial:
>
> https://alteeve.ca/w/Anvil!_Tutorial_3_on_EL6#Configuring_the_Anvil.21
>
> The main difference is that Andrew's approach (see section 8.2.2) is to disable quorum via editing /etc/sysconfig/cman, where my approach handles it in the main /etc/cluster/cluster.conf (cman's main config file).
>
> In any case, from then on, start pacemaker and let it handle everything else.
>
> Cheers
>
> digimer
>
> On 15/10/14 04:27 AM, Sihan Goi wrote:
>
>> Hi,
>>
>> So I've decided to make things simpler and go with a wired network instead of wireless. I connected both boxes to a router, manually edited the ifcfg-eth0 files to set static IP addresses for both boxes (not before downloading and building a driver for the nic of 1 of the boxes), did a "chkconfig NetworkManager off", "service NetworkManager stop", and "service network restart".
>>
>> I'm able to ping each other via IP address and hostname. I also already have corosync, pacemaker, crmsh and cman installed.
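The advice quoted above (do not start corosync by hand; let pacemaker bring up cman/corosync) might look like the following on EL6. This is a sketch, not taken from the guide: the run wrapper and the DRY_RUN variable are hypothetical, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical wrapper: print each command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

start_cluster_el6() {
    run service corosync stop      # stop the corosync that was started by hand
    run chkconfig corosync off     # keep it from starting on its own at boot
    run service pacemaker start    # per the thread, this is what starts cman
}

start_cluster_el6
```

Running it as root with DRY_RUN=0 would execute the three commands. The "corosync cluster engine is already running" failure reported in this thread is consistent with corosync having been started manually before pacemaker tried to start cman.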
Re: [Pacemaker] Linux HA setup for CentOS 6.5
Hi,

So I've decided to make things simpler and go with a wired network instead of wireless. I connected both boxes to a router, manually edited the ifcfg-eth0 files to set static IP addresses for both boxes (not before downloading and building a driver for the nic of 1 of the boxes), did a "chkconfig NetworkManager off", "service NetworkManager stop", and "service network restart".

I'm able to ping each other via IP address and hostname. I also already have corosync, pacemaker, crmsh and cman installed.

I then did the following as per the guide at http://geekpeek.net/linux-cluster-corosync-pacemaker

service corosync start - success.
service pacemaker start - I get a "Starting cman...corosync cluster engine is already running [FAILED]"

What's up? :(

On Oct 15, 2014 12:23 PM, "Sihan Goi" wrote:
> No typo.
>
> [root@node02 network-scripts]# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
> -rw-r--r--. 1 root root 254 Oct 10 2013 /etc/sysconfig/network-scripts/ifcfg-lo
>
> I installed CentOS 6.5 with the LiveDVD. I found it weird as well that these files were missing.
>
> On Wed, Oct 15, 2014 at 11:54 AM, Digimer wrote:
>
>> Sure there isn't a typo there?
>>
>> an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
>> -rw-r--r--. 1 root root 225 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond0
>> -rw-r--r--. 1 root root 220 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond1
>> -rw-r--r--. 1 root root 198 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond2
>> -rw-r--r--. 1 root root 149 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth0
>> -rw-r--r--. 1 root root 144 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth1
>> -rw-r--r--. 1 root root 152 Mar 14 2013 /etc/sysconfig/network-scripts/ifcfg-eth2
>> -rw-r--r--. 1 root root 149 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth3
>> -rw-r--r--. 1 root root 144 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth4
>> -rw-r--r--. 1 root root 152 Mar 14 2013 /etc/sysconfig/network-scripts/ifcfg-eth5
>> -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
>> -rw-r--r--. 1 root root 213 Mar 13 2013 /etc/sysconfig/network-scripts/ifcfg-vbr2
>>
>> I've never seen an EL6 install without the files there, 'network' or NetworkManager aside.
>>
>> digimer
>>
>> On 14/10/14 11:32 PM, Sihan Goi wrote:
>>
>>> There aren't any config files in /etc/sysconfig/network-scripts. When I was using CentOS 7, the config files were there (ifcfg-something) but in this CentOS 6.5 installation, they are missing.
>>>
>>> Is it possible to not use cman, and just use corosync and pacemaker? If so, how?
>>>
>>> On Wed, Oct 15, 2014 at 11:22 AM, Digimer wrote:
>>>
>>> You can manually configure the wireless LAN without NetworkManager. If you take a look, there should be existing config files in /etc/sysconfig/network-scripts/ for the wireless connection. I've not done it myself since many Fedora's ago, but I believe you can change NMCONTROLLER="no" and then start it up with /etc/sysconfig/network start. I could be a bit wrong, but I am sure you can make wireless work without NM.
>>>
>>> Question; Servers with WLAN? I assume these won't be used for corosync?
>>>
>>> digimer
>>>
>>> On 14/10/14 11:17 PM, Sihan Goi wrote:
>>>
>>> Hi,
>>>
>>> Is there a tutorial showing how to get a basic Linux HA setup with replicated storage (via DRBD) working on CentOS 6.5? I want to have mySQL as the HA resource with the database replicated across the nodes. I've scoured the web for one but it seems that I get stuck in each one somewhere.
>>>
>>> To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames and static IPs. They are connected to a wireless AP, and can ping each other.
>>>
>>> I tried following this guide - http://clusterlabs.org/quickstart-redhat.html
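The static interface file mentioned in the message above could be generated along these lines. This is a sketch under stated assumptions: make_ifcfg is a hypothetical helper, and the address and netmask are placeholders, not values from the thread. Note that EL6 ifcfg files spell the NetworkManager opt-out key NM_CONTROLLED, whereas the quoted reply recalls it from memory as NMCONTROLLER.

```shell
#!/bin/sh
# Hypothetical helper: print a static ifcfg file body for one interface.
# Arguments: device name, IPv4 address, netmask (placeholders in this sketch).
make_ifcfg() {
    cat <<EOF
DEVICE=$1
BOOTPROTO=none
ONBOOT=yes
IPADDR=$2
NETMASK=$3
NM_CONTROLLED=no
EOF
}

make_ifcfg eth0 192.168.0.111 255.255.255.0
# As root, the output could be written to
# /etc/sysconfig/network-scripts/ifcfg-eth0, followed by "service network restart".
```

With NM_CONTROLLED=no and NetworkManager switched off via chkconfig, the classic network service owns the interface, which is the arrangement cman expects.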
Re: [Pacemaker] Linux HA setup for CentOS 6.5
No typo.

[root@node02 network-scripts]# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
-rw-r--r--. 1 root root 254 Oct 10 2013 /etc/sysconfig/network-scripts/ifcfg-lo

I installed CentOS 6.5 with the LiveDVD. I found it weird as well that these files were missing.

On Wed, Oct 15, 2014 at 11:54 AM, Digimer wrote:
> Sure there isn't a typo there?
>
> an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
> -rw-r--r--. 1 root root 225 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond0
> -rw-r--r--. 1 root root 220 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond1
> -rw-r--r--. 1 root root 198 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-bond2
> -rw-r--r--. 1 root root 149 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth0
> -rw-r--r--. 1 root root 144 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth1
> -rw-r--r--. 1 root root 152 Mar 14 2013 /etc/sysconfig/network-scripts/ifcfg-eth2
> -rw-r--r--. 1 root root 149 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth3
> -rw-r--r--. 1 root root 144 Jan 16 2013 /etc/sysconfig/network-scripts/ifcfg-eth4
> -rw-r--r--. 1 root root 152 Mar 14 2013 /etc/sysconfig/network-scripts/ifcfg-eth5
> -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
> -rw-r--r--. 1 root root 213 Mar 13 2013 /etc/sysconfig/network-scripts/ifcfg-vbr2
>
> I've never seen an EL6 install without the files there, 'network' or NetworkManager aside.
>
> digimer
>
> On 14/10/14 11:32 PM, Sihan Goi wrote:
>
>> There aren't any config files in /etc/sysconfig/network-scripts. When I was using CentOS 7, the config files were there (ifcfg-something) but in this CentOS 6.5 installation, they are missing.
>>
>> Is it possible to not use cman, and just use corosync and pacemaker? If so, how?
>>
>> On Wed, Oct 15, 2014 at 11:22 AM, Digimer wrote:
>>
>> You can manually configure the wireless LAN without NetworkManager. If you take a look, there should be existing config files in /etc/sysconfig/network-scripts/ for the wireless connection. I've not done it myself since many Fedora's ago, but I believe you can change NMCONTROLLER="no" and then start it up with /etc/sysconfig/network start. I could be a bit wrong, but I am sure you can make wireless work without NM.
>>
>> Question; Servers with WLAN? I assume these won't be used for corosync?
>>
>> digimer
>>
>> On 14/10/14 11:17 PM, Sihan Goi wrote:
>>
>> Hi,
>>
>> Is there a tutorial showing how to get a basic Linux HA setup with replicated storage (via DRBD) working on CentOS 6.5? I want to have mySQL as the HA resource with the database replicated across the nodes. I've scoured the web for one but it seems that I get stuck in each one somewhere.
>>
>> To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames and static IPs. They are connected to a wireless AP, and can ping each other.
>>
>> I tried following this guide - http://clusterlabs.org/quickstart-redhat.html
>> However, cman will not start when NetworkManager is running, and my nodes cannot connect to the wireless AP without NetworkManager running. Am I missing something or is that the stupidest dependency ever? How is a cluster supposed to work when the nodes aren't connected to one another?
>>
>> I also tried following the "clusters from scratch" guide but that seems to rely on systemctl calls which aren't available on CentOS 6.5.
>>
>> Any help?
>>
>> --
>> - Goi Sihan
>> gois...@gmail.com
Re: [Pacemaker] Linux HA setup for CentOS 6.5
There aren't any config files in /etc/sysconfig/network-scripts. When I was using CentOS 7, the config files were there (ifcfg-something) but in this CentOS 6.5 installation, they are missing.

Is it possible to not use cman, and just use corosync and pacemaker? If so, how?

On Wed, Oct 15, 2014 at 11:22 AM, Digimer wrote:
> You can manually configure the wireless LAN without NetworkManager. If you take a look, there should be existing config files in /etc/sysconfig/network-scripts/ for the wireless connection. I've not done it myself since many Fedora's ago, but I believe you can change NMCONTROLLER="no" and then start it up with /etc/sysconfig/network start. I could be a bit wrong, but I am sure you can make wireless work without NM.
>
> Question; Servers with WLAN? I assume these won't be used for corosync?
>
> digimer
>
> On 14/10/14 11:17 PM, Sihan Goi wrote:
>
>> Hi,
>>
>> Is there a tutorial showing how to get a basic Linux HA setup with replicated storage (via DRBD) working on CentOS 6.5? I want to have mySQL as the HA resource with the database replicated across the nodes. I've scoured the web for one but it seems that I get stuck in each one somewhere.
>>
>> To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames and static IPs. They are connected to a wireless AP, and can ping each other.
>>
>> I tried following this guide - http://clusterlabs.org/quickstart-redhat.html
>> However, cman will not start when NetworkManager is running, and my nodes cannot connect to the wireless AP without NetworkManager running. Am I missing something or is that the stupidest dependency ever? How is a cluster supposed to work when the nodes aren't connected to one another?
>>
>> I also tried following the "clusters from scratch" guide but that seems to rely on systemctl calls which aren't available on CentOS 6.5.
>>
>> Any help?
>>
>> --
>> - Goi Sihan
>> gois...@gmail.com
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without access to education?

--
- Goi Sihan
gois...@gmail.com
[Pacemaker] Linux HA setup for CentOS 6.5
Hi,

Is there a tutorial showing how to get a basic Linux HA setup with replicated storage (via DRBD) working on CentOS 6.5? I want to have mySQL as the HA resource with the database replicated across the nodes. I've scoured the web for one but it seems that I get stuck in each one somewhere.

To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames and static IPs. They are connected to a wireless AP, and can ping each other.

I tried following this guide - http://clusterlabs.org/quickstart-redhat.html
However, cman will not start when NetworkManager is running, and my nodes cannot connect to the wireless AP without NetworkManager running. Am I missing something or is that the stupidest dependency ever? How is a cluster supposed to work when the nodes aren't connected to one another?

I also tried following the "clusters from scratch" guide but that seems to rely on systemctl calls which aren't available on CentOS 6.5.

Any help?

--
- Goi Sihan
gois...@gmail.com
Re: [Pacemaker] ERROR: Unable to find nic or netmask.
I mean things like firewall settings, as well as services like pcsd, pacemaker and corosync not starting up automatically sometimes.

On Tue, Sep 16, 2014 at 5:10 PM, Nikita Michalko wrote:
> On 16.09.2014 10:31, Sihan Goi wrote:
>
> Figured out the problem - the firewall rules are somehow not persistent. After running the following commands:
>
> iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
> iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
> iptables -I INPUT -p igmp -j ACCEPT
> iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
> service iptables save
>
> Both nodes are able to communicate with each other.
>
> Seems like several things aren't persistent upon reboots, and need to be restarted/reconfigured. Is this the intended behavior?
>
> What do you mean by "several things"? Firewall/iptables on CentOS 7? Or Pacemaker/Corosync/pcs?
>
> Nikita
>
> On Tue, Sep 2, 2014 at 2:05 PM, Nikita Michalko wrote:
>
> Hi,
>
> Maybe the following is helpful:
> https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0CDEQFjAB&url=http%3A%2F%2Fhttpd.apache.org%2Fdocs%2Ftrunk%2Fbind.html&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNGCErofEEVtclS_x6ZXA3bXvJiaww&sig2=hR8kUWRcpmN4PE1V42t9kg&bvm=bv.74115972,d.bGE
> https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CC0QrAIwAA&url=http%3A%2F%2Fubuntuforums.org%2Fshowthread.php%3Ft%3D1636667&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNHcs7alJ_RwBc4tWq2X7ew4ynEmzg&sig2=ra1qjZ8nly8opwawrACidw&bvm=bv.74115972,d.bGE
>
> HTH
>
> Nikita
>
> On 02.09.2014 07:47, Sihan Goi wrote:
>
> Hi,
>
> After some investigation, it seems that my Apache is having trouble starting in both nodes. I get the following error message when I try to restart the service:
>
> Job for httpd.service failed. See 'systemctl status httpd.service' and 'journalctl -xn' for details.
>
> "systemctl status httpd.service" shows the following output:
>
> httpd.service - The Apache HTTP Server
>    Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
>    Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s ago
>   Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
>   Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
>  Main PID: 26093 (code=exited, status=1/FAILURE)
>
> Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably det...ge
> Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072: m...80
> Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available, shutti...wn
> Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
> Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited, code...RE
> Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
> Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
> Hint: Some lines were ellipsized, use -l to show in full.
>
> /var/log/messages also shows similar messages
>
> Sep 2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
> Sep 2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.112. Set the 'ServerName' directive globally to suppress this message
> Sep 2 13:41:12 node02 httpd: (98)Address already in use: AH00072: make_sock: could not bind to address 127.0.0.1:80
> Sep 2 13:41:12 node02 httpd: no listening sockets available, shutting down
> Sep 2 13:41:12 node02 httpd: AH00015: Unable to open logs
> Sep 2 13:41:12 node02 systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
> Sep 2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
> Sep 2 13:41:12 node02 systemd: Unit httpd.service entered failed state.
>
> Is this related to the problem?
> > > > On Tue, Sep 2, 2014 at 12:42 PM, Teerapatr Kittiratanachai > > wrote: > > > Try to set cidr_netmask=32 for resource only, and let the physical > interface's netmask be 24. > > On Tue, Sep 2, 2014 at 11:27 AM, Sihan Goi >wrote: > > Got it. Changed the netmask for both PCs to 255.255.255.0 and changed > cidr_netmask to 24 and it works...sort of. > > It was working for a while, and then I rebooted both PCs, and now each > thinks its online and the other is offline. > > "pcs status"
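The "pcs status" output quoted elsewhere in this thread lists corosync, pacemaker and pcsd as "active/disabled", which would be consistent with services that run now but are not enabled at boot. The following is only a minimal sketch for spotting that: the helper name disabled_daemons is made up for illustration, and it assumes the "Daemon Status" section looks like the output pasted in this thread.

```shell
# Hypothetical helper: read "pcs status" output on stdin and print the names
# of daemons whose line in the "Daemon Status" section ends in "/disabled".
disabled_daemons() {
    awk '
        /^Daemon Status:/ { in_section = 1; next }
        in_section && /\/disabled[ \t]*$/ {
            name = $1
            sub(/:$/, "", name)   # "corosync:" -> "corosync"
            print name
        }
    '
}

# Possible usage on a cluster node (not executed here):
#   pcs status | disabled_daemons | while read -r unit; do
#       echo "systemctl enable $unit.service"
#   done
```

The usage loop only prints the "systemctl enable" commands so they can be reviewed before anything is changed.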
Re: [Pacemaker] ERROR: Unable to find nic or netmask.
Figured out the problem - the firewall rules are somehow not persistent. After running the following commands:

iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
iptables -I INPUT -p igmp -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
service iptables save

Both nodes are able to communicate with each other.

Seems like several things aren't persistent upon reboots, and need to be restarted/reconfigured. Is this the intended behavior?

On Tue, Sep 2, 2014 at 2:05 PM, Nikita Michalko wrote:
> Hi,
>
> maybe the following is helpful:
> https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0CDEQFjAB&url=http%3A%2F%2Fhttpd.apache.org%2Fdocs%2Ftrunk%2Fbind.html&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNGCErofEEVtclS_x6ZXA3bXvJiaww&sig2=hR8kUWRcpmN4PE1V42t9kg&bvm=bv.74115972,d.bGE
> https://www.google.at/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CC0QrAIwAA&url=http%3A%2F%2Fubuntuforums.org%2Fshowthread.php%3Ft%3D1636667&ei=QV0FVK2YBYHO0QXPxYHQDw&usg=AFQjCNHcs7alJ_RwBc4tWq2X7ew4ynEmzg&sig2=ra1qjZ8nly8opwawrACidw&bvm=bv.74115972,d.bGE
>
> HTH
>
> Nikita
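Since the concern here is that the rules vanish after a reboot, one way to check is to compare the loaded rules against the ones added above. This is a rough sketch, not a definitive test: check_ha_rules is a made-up name, and it assumes the rule listing (for example from "iptables -S INPUT") prints the same option text that was used when the rules were inserted.

```shell
# Hypothetical helper: read firewall rules on stdin and report which of the
# cluster-related rules from this thread are absent. Returns 0 if all four
# patterns are found, 1 otherwise.
check_ha_rules() {
    rules=$(cat)
    missing=0
    for pattern in '--dports 5404,5405' '--dport 2224' '-p igmp' '--dst-type MULTICAST'; do
        # -F: fixed string, -e: pattern may start with a dash
        if ! printf '%s\n' "$rules" | grep -qF -e "$pattern"; then
            echo "missing rule matching: $pattern"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage after a reboot (not executed here):
#   iptables -S INPUT | check_ha_rules
```

Matching on option text is deliberately loose; it does not verify rule order or the ACCEPT target.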
[Pacemaker] Notification when a node is down
Hi,

Is there any way for a Pacemaker/Corosync/PCS setup to send a notification when it detects that a node in a cluster is down? I read that Pacemaker and Corosync log events to syslog, but where is the syslog file in CentOS? Do they log events such as a failover occurrence?

Thanks.

--
Goi Sihan
gois...@gmail.com
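On the syslog part of the question: on a default CentOS install, syslog messages normally end up in /var/log/messages, the same file examined elsewhere in this thread. As a small sketch for pulling out only the cluster daemons' lines, the filter below matches on program name; the function name cluster_log_lines and the list of daemon names (taken from Pacemaker 1.1 process names) are assumptions, and it does not by itself send any notification.

```shell
# Hypothetical filter: keep only syslog lines written by Corosync or by the
# Pacemaker 1.1 daemons. Expects lines in the usual "date host program[pid]: msg"
# layout on stdin.
cluster_log_lines() {
    grep -E ' (corosync|pacemakerd|crmd|pengine|cib|attrd|lrmd|stonith-ng)(\[[0-9]+\])?: '
}

# Possible usage (not executed here):
#   cluster_log_lines < /var/log/messages | tail -n 50
```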
Re: [Pacemaker] pcs cluster auth shows "Error: Unable to communicate with node" message
Tried that, same problem.

On Sep 9, 2014 3:44 PM, "emmanuel segura" wrote:
> systemctl enable pcsd.service ?
[Pacemaker] pcs cluster auth shows "Error: Unable to communicate with node" message
Hi,

I had a basic HA setup working with 2 nodes previously running a simple Apache web server on a private local network. However, I'm having trouble getting it to work right now, and I haven't changed anything other than rebooting a few times.

Firstly, I've noticed that I need to start the pcsd service manually after every reboot with "systemctl start pcsd". Corosync seems to start automatically.

After starting pcsd and restarting the cluster, the HA cluster used to work. However, now it doesn't seem to. "pcs status" on node01 shows node01 as online and node02 as offline, and vice versa on node02. When I try "pcs cluster auth node02" from node01, I get "Error: Unable to communicate with node02", even though I'm able to ping both the IP address and hostname of node02 from node01.

node01 and node02 each serve their own web page when I enter the virtual IP address in the browser URL bar. However, a 3rd device connected to the same network is unable to load the webpage from the virtual IP address.

What's wrong? Thanks!
Re: [Pacemaker] ERROR: Unable to find nic or netmask.
Hi,

After some investigation, it seems that my Apache is having trouble starting on both nodes. I get the following error message when I try to restart the service:

Job for httpd.service failed. See 'systemctl status httpd.service' and 'journalctl -xn' for details.

"systemctl status httpd.service" shows the following output:

httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s ago
  Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
  Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 26093 (code=exited, status=1/FAILURE)

Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably det...ge
Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072: m...80
Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available, shutti...wn
Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited, code...RE
Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

/var/log/messages also shows similar messages:

Sep 2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
Sep 2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.112. Set the 'ServerName' directive globally to suppress this message
Sep 2 13:41:12 node02 httpd: (98)Address already in use: AH00072: make_sock: could not bind to address 127.0.0.1:80
Sep 2 13:41:12 node02 httpd: no listening sockets available, shutting down
Sep 2 13:41:12 node02 httpd: AH00015: Unable to open logs
Sep 2 13:41:12 node02 systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
Sep 2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
Sep 2 13:41:12 node02 systemd: Unit httpd.service entered failed state.

Is this related to the problem?

On Tue, Sep 2, 2014 at 12:42 PM, Teerapatr Kittiratanachai <maillist...@gmail.com> wrote:
> Try to set cidr_netmask=32 for resource only, and let the physical
> interface's netmask be 24.
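The "(98)Address already in use" lines mean some other process already holds the socket httpd is trying to bind. As a small sketch for narrowing that down, the helper below (bind_failures is a made-up name) extracts the address:port values from such log lines; it assumes the "could not bind to address" wording shown in the log above.

```shell
# Hypothetical helper: read httpd/syslog lines on stdin and print each distinct
# address:port that Apache reported it could not bind to.
bind_failures() {
    sed -n 's/.*could not bind to address \([^ ]*\).*/\1/p' | sort -u
}

# Possible usage (not executed here):
#   bind_failures < /var/log/messages
#   ss -ltnp      # then look for the process listening on the reported port
```

One possibility worth checking, though it is only a guess from this output, is that another httpd instance is already running on the node (for example one started by the cluster's apache resource) while httpd.service is also being started by hand.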
Re: [Pacemaker] ERROR: Unable to find nic or netmask.
Got it. Changed the netmask for both PCs to 255.255.255.0 and changed cidr_netmask to 24 and it works...sort of.

It was working for a while, and then I rebooted both PCs, and now each thinks it's online and the other is offline.

"pcs status" on my node01 gives the following output:

Cluster name: cluster_web
Last updated: Tue Sep 2 12:21:25 2014
Last change: Tue Sep 2 12:13:27 2014 via cibadmin on node02
Stack: corosync
Current DC: node01 (1) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured

Online: [ node01 ]
OFFLINE: [ node02 ]

Full list of resources:

 virtual_ip (ocf::heartbeat:IPaddr2): Started node01
 webserver (ocf::heartbeat:apache): Started node01

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

However, "pcs status" on node02 shows the following output:

Cluster name: cluster_web
Last updated: Tue Sep 2 12:20:41 2014
Last change: Tue Sep 2 11:59:03 2014 via cibadmin on node02
Stack: corosync
Current DC: node02 (2) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured

Online: [ node02 ]
OFFLINE: [ node01 ]

Full list of resources:

 virtual_ip (ocf::heartbeat:IPaddr2): Started node02
 webserver (ocf::heartbeat:apache): Started node02

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Seems like each node thinks it's online and the other is not. I'm running HA on an Apache webserver, and if I access the webpage on node01, I get node01's index.html. If I access it on node02, I get node02's index.html. If I access it via another PC connected to the same AP, the webpage is unavailable.

What could be wrong?

On Mon, Sep 1, 2014 at 9:09 PM, John Lauro wrote:
> ip=192.168.0.110 cidr_netmask=32
> /32 leaves no room for any other IP addresses on that interface and so you
> have to specify the nic. Are you certain 192.168.0.111 and 192.168.0.112
> do not have a different netmask from 255.255.255.255, like 255.255.255.0
> for /24 or 255.255.0.0 for /16? If they do have 255.255.255.255 too, then
> they are probably not set up correctly...
>
> PS: cidr_netmask is optional. Assuming a proper netmask (not
> 255.255.255.255) is on 192.168.0.111 and 192.168.0.112 it should work
> without specifying cidr_netmask.

--
Goi Sihan
gois...@gmail.com
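The relationship between the dotted netmasks and the cidr_netmask values discussed above (255.255.255.0 for /24, 255.255.0.0 for /16, 255.255.255.255 for /32) can be worked out mechanically. The function below is just an illustrative sketch: mask_to_prefix is a made-up name, and it counts bits per octet without checking that the mask is contiguous.

```shell
# Hypothetical helper: convert a dotted-quad netmask to a CIDR prefix length.
# Does not verify that the set bits are contiguous across octets.
mask_to_prefix() {
    mask=$1
    prefix=0
    old_ifs=$IFS
    IFS=.
    set -- $mask            # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$prefix"
}

mask_to_prefix 255.255.255.0    # prints 24
```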
[Pacemaker] ERROR: Unable to find nic or netmask.
Hi,

I'm trying to create an HA cluster with 2 CentOS 7 PCs connected to a wireless AP. The PCs have the static IP addresses 192.168.0.111 and 192.168.0.112 and the hostnames node01 and node02 respectively.

I've tried to create a virtual IP address of 192.168.0.110 using the following command:

pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.110 cidr_netmask=32 op monitor interval=30s

However, when I do a "pcs status resources" I get the following output:

 virtual_ip (ocf::heartbeat:IPaddr2): Stopped

The virtual IP is stopped rather than started. I looked into /var/log/messages and /var/log/pacemaker.log and I find the following error messages:

node02 IPaddr2(virtual_ip)[25451]: ERROR: Unable to find nic or netmask.
node02 IPaddr2(virtual_ip)[25451]: ERROR: [findif] failed

It seems that it's unable to find my nic. How can I fix this?

Thanks.