Re: [Pacemaker] [ha-wg] [ha-wg-technical] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-11-11 Thread Fabio M. Di Nitto


On 11/5/2014 4:16 PM, Lars Ellenberg wrote:
 On Sat, Nov 01, 2014 at 01:19:35AM -0400, Digimer wrote:
 All the cool kids will be there.

 You want to be a cool kid, right?
 
 Well, no. ;-)
 
 But I'll still be there,
 and a few other Linbit'ers as well.
 
 Fabio, let us know what we could do to help make it happen.
 

I appreciate the offer.

Assuming we achieve quorum to do the event, I'd say that I'll take care
of the meeting rooms/hotel logistics and one lunch-and-learn pizza event.
It would be nice if others could organize a dinner event.

Cheers
Fabio



   Lars
 
 On 01/11/14 01:06 AM, Fabio M. Di Nitto wrote:
 just a kind reminder.

 On 9/8/2014 12:30 PM, Fabio M. Di Nitto wrote:
 All,

 it's been almost 6 years since we had a face to face meeting for all
 developers and vendors involved in Linux HA.

 I'd like to try and organize a new event and piggy-back with DevConf in
 Brno [1].

 DevConf will start Friday the 6th of Feb 2015 in Red Hat Brno offices.

 My suggestion would be to have a 2 days dedicated HA summit the 4th and
 the 5th of February.

 The goal for this meeting is, besides getting to know each other and
 enjoying the social aspects of such events, to tune the directions of the
 various HA projects and explore common areas of improvement.

 I am also very open to the idea of extending it to 3 days: 1 dedicated
 to customers/users and 2 dedicated to developers, by starting on the 3rd.

 Thoughts?

 Fabio

 PS Please hit reply all or include me in CC just to make sure I'll see
 an answer :)

 [1] http://devconf.cz/

 Could you please let me know by the end of November whether you are
 interested or not?

 I have heard from only a few people so far.

 Cheers
 Fabio
 ___
 ha-wg mailing list
 ha...@lists.linux-foundation.org
 https://lists.linuxfoundation.org/mailman/listinfo/ha-wg
 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Daemon Start attempt on wrong Server

2014-11-11 Thread Alexandre
You should use an opt-in cluster: set the cluster option
symmetric-cluster=false. This tells Pacemaker not to place a resource
anywhere on the cluster unless a location constraint explicitly tells it
where the resource should run.

Pacemaker will still probe the sql resources on the www hosts and return
rc 5, but this is expected and harmless.
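A minimal sketch of the change, assuming the crm shell (which the config
below appears to be written for); the existing inf: location constraints
then act as the opt-in whitelist:

    # make the cluster opt-in: no node is allowed unless a constraint says so
    crm configure property symmetric-cluster=false
    # sanity-check the resulting configuration against the live CIB
    crm_verify -LV

Alternatively, staying opt-out, a -inf: location constraint per unwanted
node (e.g. location loc_no_sql_www1 res_pgsql -inf: www-node1) gives the
same placement.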
On 11 Nov 2014 13:22, Hauke Homburg hhomb...@w3-creative.de wrote:

 Hello,

 I am installing a 6-node Pacemaker cluster: 3 nodes for Apache, 3 nodes
 for Postgres.

 My cluster config is:

 node kvm-node1
 node sql-node1
 node sql-node2
 node sql-node3
 node www-node1
 node www-node2
 node www-node3
 primitive pri_kvm_ip ocf:heartbeat:IPaddr2 \
 params ip=10.0.6.41 cidr_netmask=255.255.255.0 \
 op monitor interval=10s timeout=20s
 primitive pri_sql_ip ocf:heartbeat:IPaddr2 \
 params ip=10.0.6.31 cidr_netmask=255.255.255.0 \
 op monitor interval=10s timeout=20s
 primitive pri_www_ip ocf:heartbeat:IPaddr2 \
 params ip=10.0.6.21 cidr_netmask=255.255.255.0 \
 op monitor interval=10s timeout=20s
 primitive res_apache ocf:heartbeat:apache \
 params configfile=/etc/apache2/apache2.conf \
 op start interval=0 timeout=40 \
 op stop interval=0 timeout=60 \
 op monitor interval=60 timeout=120 start-delay=0 \
 meta target-role=Started
 primitive res_pgsql ocf:heartbeat:pgsql \
 params pgctl=/usr/lib/postgresql/9.1/bin/pg_ctl
 psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/9.1/main
 config=/etc/postgresql/9.1/main/postgresql.conf pgdba=postgres \
 op start interval=0 timeout=120s \
 op stop interval=0 timeout=120s \
 op monitor interval=30s timeout=30s depth=0
 location loc_kvm_ip_node1 pri_kvm_ip 10001: kvm-node1
 location loc_sql_ip_node1 pri_sql_ip inf: sql-node1
 location loc_sql_ip_node2 pri_sql_ip inf: sql-node2
 location loc_sql_ip_node3 pri_sql_ip inf: sql-node3
 location loc_sql_srv_node1 res_pgsql inf: sql-node1
 location loc_sql_srv_node2 res_pgsql inf: sql-node2
 location loc_sql_srv_node3 res_pgsql inf: sql-node3
 location loc_www_ip_node1 pri_www_ip inf: www-node1
 location loc_www_ip_node2 pri_www_ip inf: www-node2
 location loc_www_ip_node3 pri_www_ip inf: www-node3
 location loc_www_srv_node1 res_apache inf: www-node1
 location loc_www_srv_node2 res_apache inf: www-node2
 location loc_www_srv_node3 res_apache inf: www-node3
 property $id=cib-bootstrap-options \
 dc-version=1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff \
 cluster-infrastructure=openais \
 expected-quorum-votes=7 \
 stonith-enabled=false

 Why do I see the following output in crm_mon?

 Failed actions:
 res_pgsql_start_0 (node=www-node1, call=16, rc=5, status=complete):
 not installed
 res_pgsql_start_0 (node=www-node2, call=13, rc=5, status=complete):
 not installed
 pri_www_ip_monitor_1 (node=www-node3, call=22, rc=7,
 status=complete): not running
 res_pgsql_start_0 (node=www-node3, call=13, rc=5, status=complete):
 not installed
 res_apache_start_0 (node=sql-node2, call=18, rc=5, status=complete):
 not installed
 res_pgsql_start_0 (node=sql-node2, call=12, rc=5, status=complete):
 not installed
 res_apache_start_0 (node=sql-node3, call=12, rc=5, status=complete):
 not installed
 res_pgsql_start_0 (node=sql-node3, call=10, rc=5, status=complete):
 not installed
 res_apache_start_0 (node=kvm-node1, call=12, rc=5, status=complete):
 not installed
 res_pgsql_start_0 (node=kvm-node1, call=20, rc=5, status=complete):
 not installed


 I set infinity for pgsql on all 3 sql nodes, but not on the www
 nodes. Why does Pacemaker try to start the PostgreSQL server on the www
 nodes, for example?

 Thanks for your help

 Greetings

 Hauke


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Daemon Start attempt on wrong Server

2014-11-11 Thread Hauke Homburg

On 11.11.2014 13:34, Alexandre wrote:


You should use an opt-in cluster: set the cluster option
symmetric-cluster=false. This tells Pacemaker not to place a resource
anywhere on the cluster unless a location constraint explicitly tells it
where the resource should run.

Pacemaker will still probe the sql resources on the www hosts and return
rc 5, but this is expected and harmless.


On 11 Nov 2014 13:22, Hauke Homburg hhomb...@w3-creative.de wrote:

[...]

Hello Alexandre,

Why can't I set infinity for the SQL server nodes so that the SQL
daemon starts only on the sql nodes? I thought that would be enough?


Greetings

Hauke

Re: [Pacemaker] Daemon Start attempt on wrong Server

2014-11-11 Thread Andrei Borzenkov
On Tue, 11 Nov 2014 16:19:56 +0100
Hauke Homburg hhomb...@w3-creative.de wrote:

 On 11.11.2014 13:34, Alexandre wrote:

 [...]

Re: [Pacemaker] Daemon Start attempt on wrong Server

2014-11-11 Thread Hauke Homburg

On 11.11.2014 16:25, Andrei Borzenkov wrote:

On Tue, 11 Nov 2014 16:19:56 +0100
Hauke Homburg hhomb...@w3-creative.de wrote:

[...]

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-11 Thread Sihan Goi
Hi,

I'm fluent in English, so I doubt it's a language barrier. I have reasonable
experience as a Linux user, though not extensive experience with the various
system commands, and I have zero experience with HA. I'm in fact trying to
keep things as simple as possible by following the Clusters from
Scratch guide step by step, and only modifying/omitting steps when they
don't work.

I know a block device (like /dev/sda) is simply a device (such as a hard
disk) that appears like a file in Linux, allowing users buffered access to
the device.
I know a file system is like FAT/NTFS/ext2/etc.
I know a mount point is a directory onto which you can mount a device or
image containing a file system. Once mounted, the entire file system appears
with the mount point as its root directory.
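For instance (assuming the device name from the guide, and commands
available on CentOS 6.5), this is roughly how one could check all three
on a node:

blkid /dev/drbd1              # the block device and the file system on it
mount | grep /var/www/html    # whether anything is mounted on the mount point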

I set up DRBD almost exactly as instructed in Chapter 7 of
Clusters from Scratch. The only differences are in our setups: the guide
assumes Fedora 13 and DRBD 8.3, while I'm using CentOS 6.5 and DRBD 8.4.

Since I was following the guide from start to finish, /var/www/html already
has index.html in there. node01 has its own index.html, and node02
has its own index.html, each with different content. The guide did not
instruct me to delete these files, and it configures the mount point
to be /var/www/html (Chapter 7.4) with an ext4 file system, hence mounting
the image onto a directory that already has files in it. Is this a problem?


On Tue, Nov 11, 2014 at 6:07 PM, Lars Ellenberg lars.ellenb...@linbit.com
wrote:

 On Tue, Nov 11, 2014 at 12:27:23PM +0800, Sihan Goi wrote:
  Hi,
 
  DocumentRoot is still set to /var/www/html
  ls -al /var/www/html shows different things on the 2 nodes
  node01:
 
  total 28
  drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
  drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
  -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
  drwx------. 2 root root 16384 Oct 28 17:59 lost+found
 
  node02 only has index.html, no lost+found, and it's a different version
 of
  the file.

 I'm unsure if there is just a language barrier,
 or if you just don't have enough experience with Linux in general,
 or if you are making things more complicated than they are.

 Do you know
  * what a block device is?
  * what a file system is?
  * what a mount point is?
  * that a mount point need not be empty, even though it typically is?
  * what it means to mount a file system to a mount point?

 Assuming you set up DRBD in a sane way,
 and it is mounted on *one* node (the node where it is Primary),
 then on the *other* node, where it is NOT mounted,
 you will only see the mount point,
 and whatever happens to be in there.

 You probably should clear out the contents of that mount point,
 so that you'd have an empty mount point.

 Or, if you like, replace it with some dummy content
 that clearly shows that this is the mount point,
 and not the file system that is intended to be mounted there.
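 A quick sketch of how to see the difference (assuming the device and mount
 point from the guide, and with the cluster's Filesystem resource stopped so
 nothing fights over the mount; run on the node where DRBD is Primary):

   mount /dev/drbd/by-res/wwwdata /var/www/html  # shows the DRBD file system
   ls -al /var/www/html
   umount /var/www/html
   ls -al /var/www/html   # now shows only the mount point's own contents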

  Status URL is enabled in both nodes.

 As for the 'DocumentRoot must be a directory' error,
 please double check for typos...


  On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net wrote:
 
  
On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com wrote:
   
Hi,
   
I've never used crm_report before. I just read the man file and
   generated a tarball from 1-2 hours before I reconfigured all the DRBD
   related resources. I've put the tarball here -
  
 https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0
   
Hope you can help figure out what I'm doing wrong. Thanks for the
 help!
  
   Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start
 for
   /dev/drbd/by-res/wwwdata on /var/www/html
   Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem with
   ordered data mode. Opts:
   Oct 28 18:13:39 node02 crmd[9870]:   notice: process_lrm_event: LRM
   operation WebFS_start_0 (call=164, rc=0, cib-update=298,
 confirmed=true) ok
   Oct 28 18:13:39 node02 crmd[9870]:   notice: te_rsc_command: Initiating
   action 7: start WebSite_start_0 on node02 (local)
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error on
 line
   292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
  
   Is DocumentRoot still set to /var/www/html?
   If so, what happens if you run 'ls -al /var/www/html' in a shell?
  
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not running
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for apache
   /etc/httpd/conf/httpd.conf to come up
  
   Did you enable the status url?
  
  
 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html


 --
 : Lars Ellenberg
 : http://www.LINBIT.com | Your Way to High Availability
 : DRBD, Linux-HA  and  Pacemaker support and consulting

 DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


[Pacemaker] Split Brain on DRBD Dual Primary

2014-11-11 Thread Ho, Alamsyah - ACE Life Indonesia
Hi All,

In the October archives, I saw the issue reported by Felix Zachlod at
http://oss.clusterlabs.org/pipermail/pacemaker/2014-October/022653.html, and the
same thing is now happening to me on a dual-primary DRBD node.

My OS is RHEL 6.6, and the software versions I use are
pacemaker-1.1.12-4.el6.x86_64
corosync-1.4.7-1.el6.x86_64
cman-3.0.12.1-68.el6.x86_64
drbd84-utils-8.9.1-1.el6.elrepo.x86_64
kmod-drbd84-8.4.5-2.el6.elrepo.x86_64
gfs2-utils-3.0.12.1-68.el6.x86_64

First, I will explain my existing resources. I have 3 resources: drbd,
dlm for gfs2, and HomeFS.

Master: HomeDataClone
  Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 
notify=true interval=0s
  Resource: HomeData (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=homedata
   Operations: start interval=0s timeout=240 (HomeData-start-timeout-240)
   promote interval=0s (HomeData-promote-interval-0s)
   demote interval=0s timeout=90 (HomeData-demote-timeout-90)
   stop interval=0s timeout=100 (HomeData-stop-timeout-100)
   monitor interval=60s (HomeData-monitor-interval-60s)
Clone: HomeFS-clone
  Meta Attrs: start-delay=30s target-role=Stopped
  Resource: HomeFS (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd/by-res/homedata directory=/home fstype=gfs2
   Operations: start interval=0s timeout=60 (HomeFS-start-timeout-60)
   stop interval=0s timeout=60 (HomeFS-stop-timeout-60)
   monitor interval=20 timeout=40 (HomeFS-monitor-interval-20)
Clone: dlm-clone
  Meta Attrs: clone-max=2 clone-node-max=1 start-delay=0s
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: start interval=0s timeout=90 (dlm-start-timeout-90)
   stop interval=0s timeout=100 (dlm-stop-timeout-100)
   monitor interval=60s (dlm-monitor-interval-60s)


But when I start the cluster under normal conditions, it causes a DRBD
split brain on each node. From the log I can see it is the same case as
Felix's, caused by Pacemaker promoting DRBD to primary while it was still
waiting for the handshake connection to the peer.
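For context, the usual guard against this promote-before-connect race in a
dual-primary setup is DRBD-level fencing wired into Pacemaker; a sketch in
DRBD 8.4 syntax (resource name taken from the log below, the rest assumed;
note it also requires stonith-enabled=true on the Pacemaker side):

resource homedata {
  net {
    allow-two-primaries yes;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  disk {
    # refuse to promote while the peer's state is unknown
    fencing resource-and-stonith;
  }
  handlers {
    # shipped with drbd84-utils; they add/remove a Pacemaker constraint
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}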

Nov 12 11:37:32 node002 kernel: block drbd1: disk( Attaching -> UpToDate )
Nov 12 11:37:32 node002 kernel: block drbd1: attached to UUIDs 
C9630089EC3B58CC::B4653C665EBC0DBB:B4643C665EBC0DBA
Nov 12 11:37:32 node002 kernel: drbd homedata: conn( StandAlone -> Unconnected )
Nov 12 11:37:32 node002 kernel: drbd homedata: Starting receiver thread (from 
drbd_w_homedata [22531])
Nov 12 11:37:32 node002 kernel: drbd homedata: receiver (re)started
Nov 12 11:37:32 node002 kernel: drbd homedata: conn( Unconnected -> 
WFConnection )
Nov 12 11:37:32 node002 attrd[22340]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: master-HomeData (1000)
Nov 12 11:37:32 node002 attrd[22340]:   notice: attrd_perform_update: Sent 
update 17: master-HomeData=1000
Nov 12 11:37:32 node002 crmd[22342]:   notice: process_lrm_event: Operation 
HomeData_start_0: ok (node=node002, call=18, rc=0, cib-update=13, 
confirmed=true)
Nov 12 11:37:33 node002 crmd[22342]:   notice: process_lrm_event: Operation 
HomeData_notify_0: ok (node=node002, call=19, rc=0, cib-update=0, 
confirmed=true)
Nov 12 11:37:33 node002 crmd[22342]:   notice: process_lrm_event: Operation 
HomeData_notify_0: ok (node=node002, call=20, rc=0, cib-update=0, 
confirmed=true)
Nov 12 11:37:33 node002 kernel: block drbd1: role( Secondary -> Primary )
Nov 12 11:37:33 node002 kernel: block drbd1: new current UUID 
58F02AE0E03C1C91:C9630089EC3B58CC:B4653C665EBC0DBB:B4643C665EBC0DBA
Nov 12 11:37:33 node002 crmd[22342]:   notice: process_lrm_event: Operation 
HomeData_promote_0: ok (node=node002, call=21, rc=0, cib-update=14, 
confirmed=true)
Nov 12 11:37:33 node002 attrd[22340]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: master-HomeData (1)
Nov 12 11:37:33 node002 attrd[22340]:   notice: attrd_perform_update: Sent 
update 23: master-HomeData=1
Nov 12 11:37:33 node002 crmd[22342]:   notice: process_lrm_event: Operation 
HomeData_notify_0: ok (node=node002, call=22, rc=0, cib-update=0, 
confirmed=true)
Nov 12 11:37:33 node002 kernel: drbd homedata: Handshake successful: Agreed 
network protocol version 101
Nov 12 11:37:33 node002 kernel: drbd homedata: Agreed to support TRIM on 
protocol level
Nov 12 11:37:33 node002 kernel: drbd homedata: Peer authenticated using 20 
bytes HMAC
Nov 12 11:37:33 node002 kernel: drbd homedata: conn( WFConnection -> 
WFReportParams )
Nov 12 11:37:33 node002 kernel: drbd homedata: Starting asender thread (from 
drbd_r_homedata [22543])
Nov 12 11:37:33 node002 kernel: block drbd1: drbd_sync_handshake:
Nov 12 11:37:33 node002 kernel: block drbd1: self 
58F02AE0E03C1C91:C9630089EC3B58CC:B4653C665EBC0DBB:B4643C665EBC0DBA bits:0 
flags:0
Nov 12 11:37:33 node002 kernel: block drbd1: peer 

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-11 Thread Vladislav Bogdanov
11.11.2014 07:27, Sihan Goi wrote:
 Hi,
 
 DocumentRoot is still set to /var/www/html
 ls -al /var/www/html shows different things on the 2 nodes
 node01:
 
 total 28
 drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
 drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
 -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
 drwx------. 2 root root 16384 Oct 28 17:59 lost+found
 
 node02 only has index.html, no lost+found, and it's a different version
 of the file.
 

It looks like apache is unable to stat its document root.
Could you please show the output of two commands:

getenforce
ls -dZ /var/www/html

on both nodes while the fs is mounted on one of them?
If you see 'Enforcing', and the last part of the selinux context of a
mounted fs root is not httpd_sys_content_t, then run
'restorecon -R /var/www/html' on that node.
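For reference, a correctly labeled docroot on a stock RHEL/CentOS system
would look roughly like this (a sketch, not output from the poster's nodes):

getenforce
Enforcing
ls -dZ /var/www/html
drwxr-xr-x. root root system_u:object_r:httpd_sys_content_t:s0 /var/www/html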

 Status URL is enabled in both nodes.
 
 
 On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net wrote:

 [...]
 


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Losing corosync communication clusterwide

2014-11-11 Thread Andrew Beekhof

 On 11 Nov 2014, at 10:12 pm, Daniel Dehennin daniel.dehen...@baby-gnu.org 
 wrote:
 
 Andrew Beekhof and...@beekhof.net writes:
 
 
 [...]
 
 I have fencing configured and working, modulo fencing VMs on dead host[1].
 
 Are you saying that the host and the VMs running inside it are both part of 
 the same cluster?
 
 Yes, one of the VMs needs to access the GFS2 filesystem like the nodes;
 the other VM is a quorum node (standby=on).

That sounds like a recipe for disaster, to be honest.
If you want VMs to be part of a cluster, it would be advisable to have
their host(s) be in a different one.
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org