Re: [ClusterLabs] iSCSI on ZFS on DRBD

2016-11-22 Thread Jason A Ramsey
The way that Pacemaker interacts with services is using resource agents. These 
resource agents are bash scripts that you can modify to your heart’s content to 
do the things you want to do. Having worked with the ocf:heartbeat:iSCSITarget 
and ocf:heartbeat:iSCSILogicalUnit quite a lot in the last several months, I 
can tell you that they only support iet, tgt, lio, and lio-t implementations of 
the standard out of the box. I’m sure you could make modifications to them (I 
have to support very specific use cases on my NAS cluster, and I’m definitely 
not a code monkey) as needed. Just take a peek at the resource agent files 
relevant to what you’re doing and go from there (on my systems they are located 
at /usr/lib/ocf/resource.d). Good luck!
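
If you do end up modifying one, a reasonably safe pattern (just a sketch -- the 
"custom" provider name and the pcs parameters below are made up for 
illustration) is to copy the agent into your own provider directory instead of 
editing the heartbeat copy in place:

mkdir -p /usr/lib/ocf/resource.d/custom
cp /usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit \
   /usr/lib/ocf/resource.d/custom/iSCSILogicalUnit
# edit the copy, then reference it as ocf:custom:iSCSILogicalUnit, e.g.:
# pcs resource create example-lun ocf:custom:iSCSILogicalUnit \
#     target_iqn=iqn.2016-11.example.org:tgt lun=1 path=/dev/zvol/tank/lun1

That way a package update doesn't clobber your changes.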

--

[ jR ]

  there is no path to greatness; greatness is the path

From: Mark Adams 
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 

Date: Tuesday, November 22, 2016 at 11:59 AM
To: "Users@clusterlabs.org" 
Subject: [ClusterLabs] iSCSI on ZFS on DRBD

Hi All,

Looking for some opinions on this, if anyone has any. I'm looking at this 
solution to be for proxmox vm nodes using zfsoniscsi.

Just as background for people that haven't looked at proxmox before: it logs on 
to the iscsi server via ssh, creates a zfs dataset, then adds iscsi config to 
/etc/ietd.conf so that the dataset is available as a LUN. This works fine when 
you've got a single iscsi host, but I haven't figured out a way to use it with 
pacemaker/corosync.

Is there any way to have ISCSILogicalUnit read its LUNs from a config file 
instead of specifying each one in the cluster config? Or are there any other 
resource agents that might be more suitable for this job? I could write my own 
"watcher" script I guess, but does anyone think this is a dangerous idea?

Is the only sensible thing really to make proxmox zfsonlinux pacemaker/corosync 
"aware" so that its scripts can create the LUNs through pcs instead of adding 
the config to ietd.conf?

Is anyone using zfs/iscsi/drbd in some other configuration and had success?

Looking forward to all ideas!

Regards,
Mark
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] OS Patching Process

2016-11-22 Thread Jason A Ramsey
I’ve done the opposite:

lvm on top of drbd -> iscsi lun

but I’m not trying to resize anything. I just want to patch the OS of the nodes 
and reboot them in sequence without breaking things (and, preferably, without 
taking the cluster offline).
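
For what it's worth, the sequence I have in mind is roughly the following (a 
sketch only, not a validated procedure -- I'd welcome confirmation that it's 
sane):

pcs cluster standby node2      # move everything off the node being patched
yum update -y && reboot        # on node2
pcs cluster unstandby node2
cat /proc/drbd                 # wait until the mirror is UpToDate/UpToDate again
# then repeat the same steps for node1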

--
 
[ jR ]

  there is no path to greatness; greatness is the path

On 11/22/16, 11:47 AM, "emmanuel segura" <emi2f...@gmail.com> wrote:

I have been using this mode: iscsi_disks -> lvm volume ->
drbd_on_top_of_lvm -> filesystem

resize: add_one_iscsi_device_to_every_cluster_node_first ->
now_add_device_the_volume_group_on_every_cluster_node ->
now_resize_the_volume_on_every_cluster_node : now you have every
cluster with the same logical volume size, now you can resize drbd and
filesystem on the active node
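
in commands that is roughly (example device and volume names, adapt to your 
setup):

pvcreate /dev/sdX                        # the new iscsi disk, on every node
vgextend vg_data /dev/sdX                # on every node
lvextend -L +100G /dev/vg_data/lv_drbd   # on every node
drbdadm resize r0                        # on the active node, after all backing LVs grew
resize2fs /dev/drbd0                     # grow the filesystem (xfs_growfs for xfs)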
    
    2016-11-22 17:35 GMT+01:00 Jason A Ramsey <ja...@eramsey.org>:
> Can anyone recommend a bulletproof process for OS patching a pacemaker
> cluster that manages a drbd mirror (with LVM on top of the drbd and luns
> defined for an iscsi target cluster if that matters)? Any time I’ve tried to
> mess with the cluster, it seems like I manage to corrupt my drbd filesystem,
> and now that I have actual data on the thing, that’s kind of a scary
> proposition. Thanks in advance!
>
>
>
> --
>
>
>
> [ jR ]
>
>
>
>   there is no path to greatness; greatness is the path
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




[ClusterLabs] OS Patching Process

2016-11-22 Thread Jason A Ramsey
Can anyone recommend a bulletproof process for OS patching a pacemaker cluster 
that manages a drbd mirror (with LVM on top of the drbd and luns defined for an 
iscsi target cluster if that matters)? Any time I’ve tried to mess with the 
cluster, it seems like I manage to corrupt my drbd filesystem, and now that I 
have actual data on the thing, that’s kind of a scary proposition. Thanks in 
advance!

--

[ jR ]

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] DRBD Insufficient Privileges Error

2016-11-22 Thread Jason A Ramsey
Did you install the drbd-pacemaker package? That’s the package that contains 
the resource agent.
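
A quick way to check (the package name can vary between distributions and 
repos, so treat this as a rough check rather than gospel):

rpm -qa | grep -i drbd
ls -l /usr/lib/ocf/resource.d/linbit/drbd

If that second path is missing, ocf:linbit:drbd has nothing to run.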

--

[ jR ]

  there is no path to greatness; greatness is the path

From: Jasim Alam 
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 

Date: Sunday, November 20, 2016 at 2:58 PM
To: "Users@clusterlabs.org" 
Subject: [ClusterLabs] DRBD Insufficient Privileges Error

Hi,

I am trying to set up a two-node H/A cluster with DRBD. Following is my 
configuration:

[root@node-1 ~]# pcs config
Cluster Name: Cluster-1
Corosync Nodes:
node-1 node-2
Pacemaker Nodes:
node-1 node-2

Resources:
 Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=103.9.185.211 cidr_netmask=32
  Operations: start interval=0s timeout=20s (vip-start-interval-0s)
  stop interval=0s timeout=20s (vip-stop-interval-0s)
  monitor interval=30s (vip-monitor-interval-30s)
Resource: apache (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf 
statusurl=http://localhost/server-status
  Operations: start interval=0s timeout=40s (apache-start-interval-0s)
  stop interval=0s timeout=60s (apache-stop-interval-0s)
  monitor interval=1min (apache-monitor-interval-1min)
Master: StorageClone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 
notify=true
  Resource: storage (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=drbd0
   Operations: start interval=0s timeout=240 (storage-start-interval-0s)
   promote interval=0s timeout=90 (storage-promote-interval-0s)
   demote interval=0s timeout=90 (storage-demote-interval-0s)
   stop interval=0s timeout=100 (storage-stop-interval-0s)
   monitor interval=60s (storage-monitor-interval-60s)

Stonith Devices:
Fencing Levels:

Location Constraints:
  Resource: apache
Enabled on: node-1 (score:50) (id:location-apache-node-1-50)
Ordering Constraints:
  start vip then start apache (kind:Mandatory) (id:order-vip-apache-mandatory)
Colocation Constraints:
  vip with apache (score:INFINITY) (id:colocation-vip-apache-INFINITY)

Resources Defaults:
No defaults set
Operations Defaults:
No defaults set

Cluster Properties:
cluster-infrastructure: corosync
cluster-name: Cluster-1
dc-version: 1.1.13-10.el7_2.4-44eb2dd
have-watchdog: false
no-quorum-policy: ignore
stonith-enabled: false

The problem is that I am getting an insufficient privileges error on the second node:

[root@node-1 ~]# pcs status
Cluster name: PSD-1
Last updated: Mon Nov 21 01:44:52 2016  Last change: Mon Nov 21 
01:19:17 2016 by root via cibadmin on node-1
Stack: corosync
Current DC: node-1 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 4 resources configured

Online: [ node-1 node-2 ]

Full list of resources:

vip(ocf::heartbeat:IPaddr2):   Started node-1
apache (ocf::heartbeat:apache):Started node-1
Master/Slave Set: StorageClone [storage]
 storage(ocf::linbit:drbd): FAILED node-2 (unmanaged)
 Masters: [ node-1 ]

Failed Actions:
* storage_stop_0 on node-2 'insufficient privileges' (4): call=16, 
status=complete, exitreason='none',
last-rc-change='Mon Nov 21 01:19:17 2016', queued=0ms, exec=2ms


PCSD Status:
  node-1: Online
  node-2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

but DRBD seems okay for both nodes

  [root@node-1 ~]# drbd-overview
 0:drbd0/0  Connected Primary/Secondary UpToDate/UpToDate
 [root@node-2 ~]# drbd-overview
 0:drbd0/0  Connected Secondary/Primary UpToDate/UpToDate

Log of node2

[root@node-2 ~]# tail -n 10 /var/log/messages
Nov 21 01:19:17 node-2 crmd[4060]:  notice: State transition S_NOT_DC -> 
S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 21 01:19:17 node-2 crmd[4060]:  notice: State transition S_PENDING -> 
S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond 
]
Nov 21 01:19:17 node-2 crmd[4060]:   error: Failed to retrieve meta-data for 
ocf:linbit:drbd
Nov 21 01:19:17 node-2 crmd[4060]: warning: No metadata found for 
drbd::ocf:linbit: Input/output error (-5)
Nov 21 01:19:17 node-2 crmd[4060]:   error: No metadata for linbit::ocf:drbd
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_monitor_0: 
insufficient privileges (node=node-2, call=14, rc=4, cib-update=17, 
confirmed=true)
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_notify_0: ok 
(node=node-2, call=15, rc=0, cib-update=0, confirmed=true)
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_stop_0: 
insufficient privileges (node=node-2, call=16, rc=4, cib-update=18, 
confirmed=true)
Nov 21 01:20:31 node-2 systemd-logind: Removed session 3.
Nov 21 01:22:58 node-2 systemd-logind: Removed session 2.

Would appreciate any way out of this.

Thanks,
Jasim

[ClusterLabs] ocf:linbit:drbd Deprecated?

2016-09-15 Thread Jason A Ramsey
I note from http://linux-ha.org/doc/man-pages/re-ra-drbd.html that this 
resource agent is deprecated…? What’s the alternative?

I wouldn’t care except I just had to build a whole mess of stuff from source, 
including a new kernel, so I could get an iSCSI target on RHEL6 that supports 
SCSI-3 persistent reservations and page 83h VPD descriptors. Now that I’ve got 
it all done, I find myself missing a Pacemaker resource agent for my DRBD 
volume (!). Where do I find this?

--

[ jR ]

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] (ELI5) Physical disk XXXXXXX does not have the inquiry data (SCSI page 83h VPD descriptor) that is required by failover clustering

2016-09-07 Thread Jason A Ramsey
Anyone that follows this mailing list at all has probably noticed that I’m 
creating a 2-node HA iSCSI Target on RHEL 6 using Pacemaker/Corosync (and CMAN, 
I guess) and the available tgt scsi tools to use as shared file system for some 
Windows Server Failover Cluster nodes. After a great deal of trial and error, 
I’ve finally had success getting this to work, but I’m running into an issue 
with the file systems passing cluster validation on the Windows side. 
Initially, I was getting an error that the iSCSI target didn’t support SCSI-3 
Persistent Reservations, which I was able to get around using the fence_scsi 
stonith module. I’ve found some extremely detailed conversations about the SCSI 
page 83h VPD descriptor error on the internet, but, frankly, I simply don’t 
follow them. Could someone ELI5 how to fix this **without scst, lio, lio-t**. 
Thanks!

# pcs status
Cluster name: cluster
Stack: cman
Current DC: node1 (version 1.1.15-1.9a34920.git.el6-9a34920) - partition with 
quorum
Last updated: Wed Sep  7 11:47:32 2016
Last change: Tue Sep  6 10:55:21 2016 by root via cibadmin on node1

2 nodes configured
8 resources configured

Online: [ node1 node2 ]

Full list of resources:

Master/Slave Set: cluster-fs2o [cluster-fs1o]
 Masters: [ node1 ]
 Slaves: [ node2 ]
cluster-vip(ocf::heartbeat:IPaddr2): Started node1
cluster-lvm  (ocf::heartbeat:LVM):   Started node1
cluster-tgt   (ocf::heartbeat:iSCSITarget):  Started node1
cluster-lun1 (ocf::heartbeat:iSCSILogicalUnit):  Started node1
cluster-lun2 (ocf::heartbeat:iSCSILogicalUnit):  Started node1
cluster-fence   (stonith:fence_scsi):   Started node2

PCSD Status:
  node1: Online
  node2: Online

# cat /etc/cluster/cluster.conf

  [cluster.conf XML contents not preserved in the archive]


# pcs config show
Cluster Name: cluster
Corosync Nodes:
node1 node2
Pacemaker Nodes:
node1 node2

Resources:
Master: cluster-fs2o
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 
notify=true
  Resource: cluster-fs1o (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=targetfs
   Operations: start interval=0s timeout=240 (cluster-fs1o-start-interval-0s)
   promote interval=0s timeout=90 (cluster-fs1o-promote-interval-0s)
   demote interval=0s timeout=90 (cluster-fs1o-demote-interval-0s)
   stop interval=0s timeout=100 (cluster-fs1o-stop-interval-0s)
   monitor interval=10s (cluster-fs1o-monitor-interval-10s)
Resource: cluster-vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.30.96.100 cidr_netmask=32 nic=eth0
  Operations: start interval=0s timeout=20s (cluster-vip-start-interval-0s)
  stop interval=0s timeout=20s (cluster-vip-stop-interval-0s)
  monitor interval=30s (cluster-vip-monitor-interval-30s)
Resource: cluster-lvm (class=ocf provider=heartbeat type=LVM)
  Attributes: volgrpname=targetfs
  Operations: start interval=0s timeout=30 (cluster-lvm-start-interval-0s)
  stop interval=0s timeout=30 (cluster-lvm-stop-interval-0s)
  monitor interval=10s timeout=30 (cluster-lvm-monitor-interval-10s)
Resource: cluster-tgt (class=ocf provider=heartbeat type=iSCSITarget)
  Attributes: iqn=iqn.2016-08.local.hsinauth.test:targetfs tid=1 
incoming_username=iscsi incoming_password=@4TAt9-laObIrdeR
  Operations: start interval=0s timeout=10 (cluster-tgt-start-interval-0s)
  stop interval=0s timeout=10 (cluster-tgt-stop-interval-0s)
  monitor interval=10s timeout=20s 
(cluster-tgt-monitor-interval-10s)
Resource: cluster-lun1 (class=ocf provider=heartbeat type=iSCSILogicalUnit)
  Attributes: target_iqn=iqn.2016-08.local.hsinauth.test:targetfs lun=1 
path=/dev/targetfs/lun1
  Operations: start interval=0s timeout=10 (cluster-lun1-start-interval-0s)
  stop interval=0s timeout=10 (cluster-lun1-stop-interval-0s)
  monitor interval=10 (cluster-lun1-monitor-interval-10)
Resource: cluster-lun2 (class=ocf provider=heartbeat type=iSCSILogicalUnit)
  Attributes: target_iqn=iqn.2016-08.local.hsinauth.test:targetfs lun=2 
path=/dev/targetfs/lun2
  Operations: start interval=0s timeout=10 (cluster-lun2-start-interval-0s)
  stop interval=0s timeout=10 (cluster-lun2-stop-interval-0s)
  monitor interval=10 (cluster-lun2-monitor-interval-10)

Stonith Devices:
Resource: cluster-fence (class=stonith type=fence_scsi)
  Attributes: devices=/dev/targetfs/lun1,/dev/targetfs/lun2
  Meta Attrs: provides=unfencing
  Operations: monitor interval=60s (cluster-fence-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  promote cluster-fs2o then start cluster-lvm (kind:Mandatory) 
(id:order-cluster-fs2o-cluster-lvm-mandatory)
  start cluster-vip then start cluster-lvm (kind:Mandatory) 

[ClusterLabs] Node Fencing and STONITH in Amazon Web Services

2016-08-26 Thread Jason A Ramsey
If you don’t mind, please allow me to walk through my architecture just a bit. 
I know that I am far from an expert on this stuff, but I feel like I have a 
firm grasp on how this all works conceptually. That said, I welcome your 
insights and advice on how to approach this problem—and any ready-made 
solutions to it you might have on hand. :)

We are deploying into two availability zones (AZs) in AWS. Our goal is to be 
able to absorb the loss of an entire AZ and continue to provide services to 
users. Our first problem comes with Microsoft SQL Server running on Windows 
Server Failover Clustering. As you guys likely know, WSFC isn’t polite about 
staying up without a quorum. As such, I figured, oh, hey, I can build a 
two-node Pacemaker-based iSCSI Target cluster and expose luns from it via iSCSI 
so that the WSFC nodes could have a witness filesystem.

So, I’ve managed to make that all happen. Yay! What I’m now trying to suss out 
is how I ensure that I’m covered for availability in the event of any kind of 
outage. As it turns out, I believe that I actually have most of that covered.

(I use “1o” to indicate “primary” and “2o” for “secondary”)

Planned Outage: affected node gracefully demoted from cluster, life goes on, 
everyone is happy
Unplanned Outage (NAS cluster node fails/unreachable): if 2o node failure, 
nothing happens; if 1o node failure, drbd promotes 2o to 1o, constrained vip, 
lvm, tgt, and lun resources automatically flip to 2o node, life goes on, 
everyone is happy

But still—the one thing we built this ridiculously complicated and 
overengineered thing for—I don’t feel like I have a good story when it comes to 
a severed AZ event (loss of perimeter communications, etc.)

Unplanned Outage (AZ connectivity severed): both nodes detect that the other 
node is gone so promote themselves to primary. The unsevered side would 
continue to work as expected, with the witness mounted by the SQL servers in 
that AZ, life goes on and at least the USERS are happy… but the severed side is 
also sojourning on. Both sides of the SQL cluster would think they have quorum 
even if they can’t talk to their peer nodes, so they mark their peers as down 
and keep on keeping on. No users would be connecting to the severed instances, 
but background and system tasks would proceed as normal, potentially writing 
new data to the databases making rejoining the nodes to the cluster a little 
bit tricky to say the least, especially if the severed side’s network comes 
back up and both systems come to realize that they’re not consistent.

So, my problem, I think, is two-fold:


1.   What can I monitor from each of the NAS cluster instances (besides 
connectivity to one another) that would “ALWAYS” be available when things are 
working and NEVER available when they are broken? It seems to me that if I can 
sort out something that meets these criteria (I was thinking, perhaps, a 
RESTful connection to the AWS API, but I’m not entirely sure you can’t get 
responses at API endpoints that may or may not be hosted inside the AZ), then I 
could write a simple monitoring script that runs on both nodes and acts as a 
fencing and STONITH solution (if it detects bad things, shut down; a rough 
sketch follows after item 2). It seems to me that this would prevent the data 
inconsistency, since the severed side’s WSFC would lose its witness file 
system, thus its quorum, and take itself offline.

2.   Have I failed to account for another failure condition that could be 
potentially as/more harmful than anything I’ve thought of already?
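
To make item 1 concrete, here is the sort of check-and-halt loop I have in 
mind. It is only a sketch: the check URL is a placeholder (figuring out what 
actually belongs there is exactly my question), and it is nowhere near a 
proper STONITH agent.

#!/bin/bash
# naive "if we cannot reach the outside world, take ourselves down" loop;
# CHECK_URL is a placeholder, not a known-good endpoint
CHECK_URL="https://ec2.us-east-1.amazonaws.com/"
while true; do
    if ! curl -sf --max-time 5 "$CHECK_URL" >/dev/null; then
        logger "external reachability check failed; stopping cluster services"
        pcs cluster stop --force
        shutdown -h now
        break
    fi
    sleep 15
done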

Anyway, I’m hopeful, someone(s) here can share some of their own experiences 
from the trenches. Thank you for your time (and all the help you guys have been 
in getting this set up already).

--

[ jR ]

  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Error When Creating LVM Resource

2016-08-26 Thread Jason A Ramsey
That makes sense. I wasn’t yet configuring the constraints on the cluster and 
was alarmed by the error messages…especially the ones where it seemed like the 
services weren’t starting anywhere. Eventually, however, that somehow magically 
resolved itself, so I went ahead and added the resource constraints.

Here’s what I added:

# pcs constraint colocation add gctvanas-vip with gctvanas-fs2o INFINITY 
with-rsc-role=Master
# pcs constraint colocation add gctvanas-lvm with gctvanas-fs2o INFINITY 
with-rsc-role=Master
# pcs constraint colocation add gctvanas-tgt with gctvanas-fs2o INFINITY 
with-rsc-role=Master
# pcs constraint colocation add gctvanas-lun1 with gctvanas-fs2o INFINITY 
with-rsc-role=Master
# pcs constraint colocation add gctvanas-lun2 with gctvanas-fs2o INFINITY 
with-rsc-role=Master
# pcs constraint order promote gctvanas-fs2o then start gctvanas-lvm
# pcs constraint order gctvanas-vip then gctvanas-lvm
# pcs constraint order gctvanas-lvm then gctvanas-tgt
# pcs constraint order gctvanas-tgt then gctvanas-lun1
# pcs constraint order gctvanas-tgt then gctvanas-lun2
# pcs constraint
Location Constraints:
Ordering Constraints:
  promote gctvanas-fs2o then start gctvanas-lvm (kind:Mandatory)
  start gctvanas-vip then start gctvanas-lvm (kind:Mandatory)
  start gctvanas-lvm then start gctvanas-tgt (kind:Mandatory)
  start gctvanas-tgt then start gctvanas-lun1 (kind:Mandatory)
  start gctvanas-tgt then start gctvanas-lun2 (kind:Mandatory)
Colocation Constraints:
  gctvanas-vip with gctvanas-fs2o (score:INFINITY) (with-rsc-role:Master)
  gctvanas-lvm with gctvanas-fs2o (score:INFINITY) (with-rsc-role:Master)
  gctvanas-tgt with gctvanas-fs2o (score:INFINITY) (with-rsc-role:Master)
  gctvanas-lun1 with gctvanas-fs2o (score:INFINITY) (with-rsc-role:Master)
  gctvanas-lun2 with gctvanas-fs2o (score:INFINITY) (with-rsc-role:Master)

I think this looks about right… hopefully when I test everything doesn’t go 
t/u. Thanks for the input!

--

[ jR ]
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path

From: Greg Woods <wo...@ucar.edu>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Date: Friday, August 26, 2016 at 2:09 PM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Error When Creating LVM Resource


On Fri, Aug 26, 2016 at 9:32 AM, Jason A Ramsey 
<ja...@eramsey.org> wrote:
Failed Actions:
* gctvanas-lvm_start_0 on node1 'not running' (7): call=42, status=complete, 
exitreason='LVM: targetfs did not activate correctly',
last-rc-change='Fri Aug 26 10:57:22 2016', queued=0ms, exec=577ms
* gctvanas-lvm_start_0 on node2 'unknown error' (1): call=34, status=complete, 
exitreason='Volume group [targetfs] does not exist or contains error!   Volume 
group "targetfs" not found',
last-rc-change='Fri Aug 26 10:57:21 2016', queued=0ms, exec=322ms


I think you need a colocation constraint to prevent it from trying to start the 
LVM resource on the DRBD secondary node. I used to run LVM-over-DRBD clusters 
but don't any more (switched to NFS backend storage), so I don't remember the 
exact syntax, but you certainly don't want the LVM resource to start on node2 
at this point because it will certainly fail.
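
Something along these lines, if memory serves (untested on my end; adapt the 
resource names to yours):

pcs constraint colocation add gctvanas-lvm with gctvanas-fs2o INFINITY with-rsc-role=Master
pcs constraint order promote gctvanas-fs2o then start gctvanas-lvm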

It may not be running on node1 because it failed on node2, so if you can get 
the proper colocation constraint in place, things may work after you do a 
resource cleanup. (I stand ready to be corrected by someone more knowledgeable 
who can spot a configuration problem that I missed).

If you still get failure and the constraint is correct, then I would try 
running the lvcreate command manually on the DRBD primary node to make sure 
that works.

--Greg

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Error When Creating LVM Resource

2016-08-26 Thread Jason A Ramsey
So, I’m setting up a two node cluster that will eventually (hopefully) serve as 
an HA iSCSI Target (Active/Passive) on RHEL 6. I’m using the [incredibly poorly 
written] guide I found on Linbit’s website (“Highly available iSCSI storage 
with DRBD and Pacemaker”). I have somehow gotten pretty far through it, but 
I’ve hit a couple of snags.

Here’s my drbd.conf (/etc/drbd.d/global_common.conf):

global {
usage-count yes;
# minor-count dialog-refresh disable-ip-verification
}

common {
protocol C;

handlers {
# These are EXAMPLE handlers only.
# They may have severe implications,
# like hard resetting the node under certain 
circumstances.
# Be careful when chosing your poison.

# pri-on-incon-degr 
"/usr/lib/drbd/notify-pri-on-incon-degr.sh; 
/usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot 
-f";
# pri-lost-after-sb 
"/usr/lib/drbd/notify-pri-lost-after-sb.sh; 
/usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot 
-f";
# local-io-error 
"/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; 
echo o > /proc/sysrq-trigger ; halt -f";
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain 
"/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync 
"/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target 
"/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target 
/usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}

startup {
# wfc-timeout degr-wfc-timeout 
outdated-wfc-timeout wait-after-sb
}

disk {
# on-io-error fencing use-bmbv no-disk-barrier 
no-disk-flushes
# no-disk-drain no-md-flushes max-bio-bvecs
}

net {
# sndbuf-size rcvbuf-size timeout connect-int 
ping-int ping-timeout max-buffers
# max-epoch-size ko-count allow-two-primaries 
cram-hmac-alg shared-secret
# after-sb-0pri after-sb-1pri after-sb-2pri 
data-integrity-alg no-tcp-cork
}

syncer {
# rate after al-extents use-rle cpu-mask 
verify-alg csums-alg
}
}

And here’s my targetfs.res config (/etc/drbd.d/targetfs.res):

resource targetfs {
protocol C;
meta-disk internal;
device /dev/drbd1;
disk /dev/xvdf;
syncer {
  verify-alg sha1;
  c-plan-ahead 0;
  rate 32M;
}
net {
  allow-two-primaries;
}
on node1 {
  address  10.130.96.120:7789;
}
on node2 {
  address  10.130.97.165:7789;
}
}

These, of course, live on both nodes.

Once I create the drbd md and sync the nodes:

(node1)# drbdadm create-md targetfs
(node2)# drbdadm create-md targetfs
(node1)# drbdadm up targetfs
(node2)# drbdadm up targetfs
(node2)# drbdadm invalidate targetfs
(node1)# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 
2014-11-24 14:51:37

1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-
ns:134213632 nr:0 dw:36 dr:134215040 al:1 bm:8192 lo:0 pe:0 ua:0 ap:0 ep:1 
wo:f oos:0

I run a pvcreate and a vgcreate on node1:

(node1)# pvcreate /dev/drbd/by-res/targetfs
(node1)# vgcreate targetfs /dev/drbd/by-res/targetfs
(node1)# pvs && vgs
  PV VG   Fmt  Attr PSize   PFree
  /dev/drbd1 targetfs lvm2 a--u 127.99g 127.99g
  VG   #PV #LV #SN Attr   VSize   VFree
  targetfs   1   0   0 wz--n- 127.99g 127.99g

pcs cluster configuration goes well enough for a bit:

# pcs cluster setup --name gctvanas node1 node2 --transport udpu
# pcs cluster start --all
# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore
# pcs property set default-resource-stickiness="200"
# pcs resource create gctvanas-vip ocf:heartbeat:IPaddr2 ip=10.30.96.100 
cidr_netmask=32 nic=eth0 op monitor interval=30s
# pcs cluster cib drbd_cfg
# pcs -f drbd_cfg resource create gctvanas-fs1o ocf:linbit:drbd 
drbd_resource=targetfs op monitor interval=10s
# pcs -f drbd_cfg resource master gctvanas-fs2o gctvanas-fs1o master-max=1 
master-node-max=1 clone-max=2 clone-node-max=1 notify=true
# pcs cluster cib-push drbd_cfg
# pcs status
Cluster name: gctvanas
Stack: cman
Current DC: node1 (version 1.1.15-1.9a34920.git.el6-9a34920) - partition with 
quorum
Last updated: Fri Aug 26 11:29:11 2016
Last change: Fri Aug 26 11:29:07 2016 by root via cibadmin on node1

2 

Re: [ClusterLabs] pcs cluster auth returns authentication error

2016-08-25 Thread Jason A Ramsey
Well, I got around the problem, but I don’t understand the solution…

I edited /etc/pam.d/password-auth and commented out the following line:

auth        required      pam_tally2.so onerr=fail audit silent deny=5 unlock_time=900

Anyone have any idea why this was interfering?
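
(My best guess, thinking about it more, is that my earlier failed attempts 
simply tripped the deny=5 lockout; if so, resetting the tally would probably 
have been enough instead of commenting the line out -- untested on my part:)

pam_tally2 --user hacluster --reset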

--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 9:50 PM, "Jason A Ramsey" <ja...@eramsey.org> wrote:

Still stuck, but here’s the output of the command with --debug turned on:

Error: node1: Username and/or password is incorrect
Error: node2: Username and/or password is incorrect
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
--Debug Input Start--
{"username": "hacluster", "local": false, "nodes": ["node1", "node2"], 
"password": "", "force": false}
--Debug Input End--
Return Value: 0
--Debug Output Start--
{
  "status": "ok",
  "data": {
"sync_responses": {
},
"sync_nodes_err": [

],
"auth_responses": {
  "node2": {
"status": "bad_password"
  },
  "node1": {
"status": "bad_password"
  }
},
"sync_successful": true
  },
  "log": [
"I, [2016-08-25T21:46:40.848381 #4825]  INFO -- : PCSD Debugging 
enabled\n",
"D, [2016-08-25T21:46:40.848448 #4825] DEBUG -- : Detected RHEL 6\n",
"I, [2016-08-25T21:46:40.848489 #4825]  INFO -- : Running: 
/usr/sbin/corosync-objctl cluster\n",
"I, [2016-08-25T21:46:40.848526 #4825]  INFO -- : CIB USER: hacluster, 
groups: \n",
"D, [2016-08-25T21:46:40.850328 #4825] DEBUG -- : []\n",
"D, [2016-08-25T21:46:40.850378 #4825] DEBUG -- : [\"Failed to 
initialize the objdb API. Error 6\\n\"]\n",
"D, [2016-08-25T21:46:40.850429 #4825] DEBUG -- : Duration: 
0.001807s\n",
"I, [2016-08-25T21:46:40.850501 #4825]  INFO -- : Return Value: 1\n",
"W, [2016-08-25T21:46:40.850555 #4825]  WARN -- : Cannot read config 
'cluster.conf' from '/etc/cluster/cluster.conf': No such file\n",
"W, [2016-08-25T21:46:40.850609 #4825]  WARN -- : Cannot read config 
'cluster.conf' from '/etc/cluster/cluster.conf': No such file or directory - 
/etc/cluster/cluster.conf\n",
"I, [2016-08-25T21:46:40.851457 #4825]  INFO -- : SRWT Node: node1 
Request: check_auth\n",
"I, [2016-08-25T21:46:40.851554 #4825]  INFO -- : SRWT Node: node2 
Request: check_auth\n"
  ]
}
--Debug Output End--


--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 5:36 PM, "Jason A Ramsey" <ja...@eramsey.org> wrote:

Thanks for the response, Ken. I thought that might be the case, so I 
tried it with selinux disabled (setenforce=0). Same exact error. :-/

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 5:29 PM, "Ken Gaillot" <kgail...@redhat.com> wrote:

On 08/25/2016 03:04 PM, Jason A Ramsey wrote:
> Please help. Just getting this thing stood up on a new set of 
servers
> and getting stymied right out the gate:
> 
>  
> 
> # pcs cluster auth node1 node2
> 
> Username: hacluster
> 
> Password:
> 
>  
> 
> I am **certain** that the password I’m providing is correct. Even 
still
> I get:
> 
>  
> 
> Error: node1: Username and/or password is incorrect
> 
> Error: node2: Username and/or password is incorrect
> 
>  
> 
> I also see this is /var/log/audit/audit.log:
> 
>  
> 
> type=USER_AUTH msg=audit(1472154922.415:69): user pid=1138 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:initrc_t:s0
> msg='op=PAM:authentication acct="hacluster" exe="/usr/bin/ruby"
> hostname=? addr=? terminal=? res=failed'

That's an SELinux error.

Re: [ClusterLabs] pcs cluster auth returns authentication error

2016-08-25 Thread Jason A Ramsey
Still stuck, but here’s the output of the command with --debug turned on:

Error: node1: Username and/or password is incorrect
Error: node2: Username and/or password is incorrect
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
--Debug Input Start--
{"username": "hacluster", "local": false, "nodes": ["node1", "node2"], 
"password": "", "force": false}
--Debug Input End--
Return Value: 0
--Debug Output Start--
{
  "status": "ok",
  "data": {
"sync_responses": {
},
"sync_nodes_err": [

],
"auth_responses": {
  "node2": {
"status": "bad_password"
  },
  "node1": {
"status": "bad_password"
  }
},
"sync_successful": true
  },
  "log": [
"I, [2016-08-25T21:46:40.848381 #4825]  INFO -- : PCSD Debugging enabled\n",
"D, [2016-08-25T21:46:40.848448 #4825] DEBUG -- : Detected RHEL 6\n",
"I, [2016-08-25T21:46:40.848489 #4825]  INFO -- : Running: 
/usr/sbin/corosync-objctl cluster\n",
"I, [2016-08-25T21:46:40.848526 #4825]  INFO -- : CIB USER: hacluster, 
groups: \n",
"D, [2016-08-25T21:46:40.850328 #4825] DEBUG -- : []\n",
"D, [2016-08-25T21:46:40.850378 #4825] DEBUG -- : [\"Failed to initialize 
the objdb API. Error 6\\n\"]\n",
"D, [2016-08-25T21:46:40.850429 #4825] DEBUG -- : Duration: 0.001807s\n",
"I, [2016-08-25T21:46:40.850501 #4825]  INFO -- : Return Value: 1\n",
"W, [2016-08-25T21:46:40.850555 #4825]  WARN -- : Cannot read config 
'cluster.conf' from '/etc/cluster/cluster.conf': No such file\n",
"W, [2016-08-25T21:46:40.850609 #4825]  WARN -- : Cannot read config 
'cluster.conf' from '/etc/cluster/cluster.conf': No such file or directory - 
/etc/cluster/cluster.conf\n",
    "I, [2016-08-25T21:46:40.851457 #4825]  INFO -- : SRWT Node: node1 Request: 
check_auth\n",
"I, [2016-08-25T21:46:40.851554 #4825]  INFO -- : SRWT Node: node2 Request: 
check_auth\n"
  ]
}
--Debug Output End--


--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 5:36 PM, "Jason A Ramsey" <ja...@eramsey.org> wrote:

Thanks for the response, Ken. I thought that might be the case, so I tried 
it with selinux disabled (setenforce=0). Same exact error. :-/

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 5:29 PM, "Ken Gaillot" <kgail...@redhat.com> wrote:

On 08/25/2016 03:04 PM, Jason A Ramsey wrote:
> Please help. Just getting this thing stood up on a new set of servers
> and getting stymied right out the gate:
> 
>  
> 
> # pcs cluster auth node1 node2
> 
> Username: hacluster
> 
> Password:
> 
>  
> 
> I am **certain** that the password I’m providing is correct. Even 
still
> I get:
> 
>  
> 
> Error: node1: Username and/or password is incorrect
> 
> Error: node2: Username and/or password is incorrect
> 
>  
> 
> I also see this is /var/log/audit/audit.log:
> 
>  
> 
> type=USER_AUTH msg=audit(1472154922.415:69): user pid=1138 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:initrc_t:s0
> msg='op=PAM:authentication acct="hacluster" exe="/usr/bin/ruby"
> hostname=? addr=? terminal=? res=failed'

That's an SELinux error. To confirm, try again with SELinux disabled.

I think distributions that package pcs also provide any SELinux policies
it needs. I'm not sure what those are, or the best way to specify them
if you're building pcs yourself, but it shouldn't be difficult to figure
out.

> I’ve gone so far as to change the password to ensure that it didn’t 
have
> any “weird” characters in it, but the error persists. Appreciate the 
help!
> 
>  
> 
> --
> 
>  
> 
> *[ jR ]*
> 
>   @: ja...@eramsey.org
> 
>  
> 
>   /there is no path to greatness; greatness is the path/

___
Users mailing list: Use

Re: [ClusterLabs] pcs cluster auth returns authentication error

2016-08-25 Thread Jason A Ramsey
Thanks for the response, Ken. I thought that might be the case, so I tried it 
with selinux disabled (setenforce=0). Same exact error. :-/

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path


On 8/25/16, 5:29 PM, "Ken Gaillot" <kgail...@redhat.com> wrote:

On 08/25/2016 03:04 PM, Jason A Ramsey wrote:
> Please help. Just getting this thing stood up on a new set of servers
> and getting stymied right out the gate:
> 
>  
> 
> # pcs cluster auth node1 node2
> 
> Username: hacluster
> 
> Password:
> 
>  
> 
> I am **certain** that the password I’m providing is correct. Even still
> I get:
> 
>  
> 
> Error: node1: Username and/or password is incorrect
> 
> Error: node2: Username and/or password is incorrect
> 
>  
> 
> I also see this is /var/log/audit/audit.log:
> 
>  
> 
> type=USER_AUTH msg=audit(1472154922.415:69): user pid=1138 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:initrc_t:s0
> msg='op=PAM:authentication acct="hacluster" exe="/usr/bin/ruby"
> hostname=? addr=? terminal=? res=failed'

That's an SELinux error. To confirm, try again with SELinux disabled.

I think distributions that package pcs also provide any SELinux policies
it needs. I'm not sure what those are, or the best way to specify them
if you're building pcs yourself, but it shouldn't be difficult to figure
out.

> I’ve gone so far as to change the password to ensure that it didn’t have
> any “weird” characters in it, but the error persists. Appreciate the help!
> 
>  
> 
> --
> 
>  
> 
> *[ jR ]*
> 
>   @: ja...@eramsey.org
> 
>  
> 
>   /there is no path to greatness; greatness is the path/

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org





Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-10 Thread Jason A Ramsey
Believe me, I would love to use a more modern dist, but RHEL6 is currently our 
standard image…

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 8/10/16, 2:50 PM, "Jan Pokorný" <jpoko...@redhat.com> wrote:

On 10/08/16 16:52, Jason A Ramsey wrote:
> Installing the openwsman-python package doesn’t work. Configure’ing
> the fence-agents source tree fails because it still can’t find the
> pywsman module. I thought that it might be because it’s looking in
> /usr/lib/python-x/site-packages rather than
> /usr/lib64/python-x/site-packages (or vice versa…I can’t remember…)
> but when I looked at the output, it was definitely looking in the
> directory that had pywsman.py/pyc/pyo or whatever it was.

Hmm, here's why:

# python -c 'import pywsman'
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.6/site-packages/pywsman.py", line 25, in <module>
>     _pywsman = swig_import_helper()
>   File "/usr/lib/python2.6/site-packages/pywsman.py", line 17, in swig_import_helper
>     import _pywsman
> ImportError: /usr/lib64/python2.6/site-packages/_pywsman.so: undefined symbol: SWIG_exception

it should also answer your "it’s not in the default yum repos" point
you've raised as indeed, both libwsman-devel and openwsman-python
are in "optional" repository with RHEL 6, meaning packages as-are,
without liabilities (mostly enabling the supported ones to be built;
openwsman-python in particular can be just a never triggered byproduct
when the important sibling packages, perhaps build prerequisites,
got built).
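
(On a subscribed RHEL 6 box that amounts to roughly the following -- the repo
id here is from memory and may differ per subscription type:)

yum install --enablerepo=rhel-6-server-optional-rpms libwsman-devel openwsman-python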

So either stick with upstream provided version for binaries + bindings
or you may have better luck with RHEL 7 (slash derivatives).

Anyway, you got me (indirectly) to make this PR against FAs:
https://github.com/ClusterLabs/fence-agents/pull/84

-- 
Jan (Poki)


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-10 Thread Jason A Ramsey
Installing the openwsman-python package doesn’t work. Configure’ing the 
fence-agents source tree fails because it still can’t find the pywsman module. 
I thought that it might be because it’s looking in 
/usr/lib/python-x/site-packages rather than /usr/lib64/python-x/site-packages 
(or vice versa…I can’t remember…) but when I looked at the output, it was 
definitely looking in the directory that had pywsman.py/pyc/pyo or whatever it 
was.

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 8/10/16, 12:40 PM, "Jan Pokorný" <jpoko...@redhat.com> wrote:

On 09/08/16 20:20, Jason A Ramsey wrote:
> Here’s the output I now get out of pip install pywsman:
> 
> < stupiderrormessage >
> 
> # pip install pywsman
> DEPRECATION: Python 2.6 is no longer supported by the Python core team, 
please upgrade your Python. A future version of pip will drop support for 
Python 2.6
> Collecting pywsman
> 
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
>   SNIMissingWarning
> 
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
>   InsecurePlatformWarning
>   Using cached pywsman-2.5.2-1.tar.gz
> Building wheels for collected packages: pywsman
>   Running setup.py bdist_wheel for pywsman ... error
>   Complete output from command /usr/bin/python -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-build-bvG1Jf/pywsman/setup.py';exec(compile(getattr(tokenize,
 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
bdist_wheel -d /tmp/tmp3R7Zz5pip-wheel- --python-tag cp26:
>   No version.i.in file found -- Building from sdist.
>   /usr/lib/python2.6/site-packages/setuptools/dist.py:364: UserWarning: 
Normalizing '2.5.2-1' to '2.5.2.post1'
> normalized_version,
>   running bdist_wheel
>   running build
>   running build_ext
>   building '_pywsman' extension
>   swigging openwsman.i to openwsman_wrap.c
>   swig -python -I/tmp/pip-build-bvG1Jf/pywsman -I/usr/include/openwsman 
-features autodoc -o openwsman_wrap.c openwsman.i
>   wsman-client.i:44: Warning(504): Function _WsManClient must have a 
return type.
>   wsman-client.i:61: Warning(504): Function _WsManClient must have a 
return type.
>   creating build
>   creating build/temp.linux-x86_64-2.6
>   gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv 
-DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE 
-fPIC -fwrapv -fPIC -I/tmp/pip-build-bvG1Jf/pywsman -I/usr/include/openwsman 
-I/usr/include/python2.6 -c openwsman.c -o 
build/temp.linux-x86_64-2.6/openwsman.o
>   gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv 
-DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE 
-fPIC -fwrapv -fPIC -I/tmp/pip-build-bvG1Jf/pywsman -I/usr/include/openwsman 
-I/usr/include/python2.6 -c openwsman_wrap.c -o 
build/temp.linux-x86_64-2.6/openwsman_wrap.o
>   openwsman_wrap.c: In function ‘_WsXmlDoc_string’:
>   openwsman_wrap.c:3225: warning: implicit declaration of function 
‘ws_xml_dump_memory_node_tree_enc’
>   openwsman_wrap.c: In function ‘__WsXmlNode_size’:
>   openwsman_wrap.c:3487: warning: implicit declaration of function 
‘ws_xml_get_child_count_by_qname’
>   openwsman_wrap.c: In function ‘epr_t_cmp’:
>   openwsman_wrap.c:3550: warning: passing argument 2 of ‘epr_cmp’ 
discards qualifiers from pointer target type
>   /usr/include/openwsman/wsman-epr.h:163: note: expected ‘struct epr_t *’ 
but argument is of type ‘const struct epr_t *’

Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-10 Thread Jason A Ramsey
Thanks! I actually just figured this out and was about to share my findings 
with the list. Lol. 

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 8/10/16, 11:12 AM, "Dmitri Maziuk" <dmitri.maz...@gmail.com> wrote:

On 2016-08-10 10:04, Jason A Ramsey wrote:

> Traceback (most recent call last):
>
> File "eps/fence_eps", line 14, in 
>
> if sys.version_info.major > 2:
>
> AttributeError: 'tuple' object has no attribute 'major'


Replace with sys.version_info[0]
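
i.e. line 14 of eps/fence_eps would become:

if sys.version_info[0] > 2: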

Dima

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-10 Thread Jason A Ramsey
Okay, so I ripped apart the code and removed references to the fence_amt_ws 
module. Configure worked. Make got really close, but not quite:


PYTHONPATH=/home/ec2-user/pcs/fence-agents/fence/agents/lib:/home/ec2-user/pcs/fence-agents/fence/agents/../lib:/home/ec2-user/pcs/fence-agents/fence/agents/lib
 \
/usr/bin/python eps/fence_eps -o metadata > 
eps/.fence_eps.8.tmp && \
xmllint --noout --relaxng 
/home/ec2-user/pcs/fence-agents/fence/agents/lib/metadata.rng 
eps/.fence_eps.8.tmp && \
xsltproc ../../fence/agents/lib/fence2man.xsl 
eps/.fence_eps.8.tmp > eps/fence_eps.8
Traceback (most recent call last):
  File "eps/fence_eps", line 14, in 
if sys.version_info.major > 2:
AttributeError: 'tuple' object has no attribute 'major'
make[3]: *** [eps/fence_eps.8] Error 1
make[3]: Leaving directory `/home/ec2-user/pcs/fence-agents/fence/agents'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/ec2-user/pcs/fence-agents/fence'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/ec2-user/pcs/fence-agents'
make: *** [all] Error 2

Any ideas on this? Also, do you have any recommendations on making this into an 
RPM?

--

[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path

From: Marek Grac <mg...@redhat.com>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Date: Wednesday, August 10, 2016 at 7:30 AM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

Hi,

* pywsman is required only for fence_amt_ws so if you don't need it, feel free 
to remove this dependency (and agent)
* python2.6 (and RHEL6) is no longer a platform that we support at upstream. We 
aim for python 2.7 and 3.x currently. But it might work on python2.6 (and 
Oyvind accepts patches that fix it)
* for RHEL6 you might want to use our branch 'RHEL6' that has to work with 
python2.6. Maybe you won't get the latest features but it should be enough for 
most installations.

m,

On Tue, Aug 9, 2016 at 10:20 PM, Jason A Ramsey 
<ja...@eramsey.org> wrote:
So, I’ve managed to wade through the majority of dependency hell issues I’ve 
encountered trying to get RPMs built of Pacemaker and its ancillary packages. 
That is, of course, with the exception of the fence-agents source tree (grabbed 
from github). Autogen.sh works great, but when it comes to configure’ing the 
tree, it bombs out on a missing pywsman module. Great, so I need to install 
that. pip install pywsman doesn’t work because of missing openwsman libraries, 
etc. so I go ahead and install the packages I’m pretty sure I need using yum 
(openwsman-client, openwsman-server, libwsman). pip install still craps out, so 
I figure out that the build is looking for the openwsman headers. Cool, find 
(it’s not in the default yum repos) libwsman-devel, which ends up requiring 
sblim-sfcc-devel and probably some other stuff I can’t remember any more 
because my brain is mostly jelly at this point… Anyway, finally get the 
libwsman-devel rpm. Yay! Everything should work now, right? Wrong. Here’s the 
output I now get out of pip install pywsman:

< stupiderrormessage >

# pip install pywsman
DEPRECATION: Python 2.6 is no longer supported by the Python core team, please 
upgrade your Python. A future version of pip will drop support for Python 2.6
Collecting pywsman
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached pywsman-2.5.2-1.tar.gz
Building wheels for collected packages: pywsman
  Running setup.py bdist_wheel for pywsman ... error
  Complete output from command /usr/bin/python -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-build-bvG1Jf/pywsman/setup.py';exec(compile(getatt

Re: [ClusterLabs] Can Pacemaker monitor geographical separated servers

2016-08-10 Thread Jason A Ramsey
I can’t answer all of this because I’m still working out how to do fencing, but 
I’ve been setting up a Pacemaker cluster in Amazon Web Services across two 
separate availability zones. Naturally, this means that I have to bridge 
subnets, so I’ve battled through a good bit of this already.

Imagine that you have a cluster node in each of two IP subnets: 10.100.0.0/24 
and 10.200.0.0/24. This configuration prevents you from doing two things:


1.   You can’t use multicast

2.   You can’t pick an IP in either subnet as the cluster VIP

The way that I got around this was to pick an arbitrary subnet that exists 
_outside_ of all configured subnets in my environment: 10.0.0.0/24. I then 
created routes from each of my cluster node subnets to VIP(s) (I’m trying to 
make my cluster Active/Active, so I want 2) on this subnet:

Destination  Target
10.0.0.100/32 network interface for cluster node 1
10.0.0.101/32 network interface for cluster node 2
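
(In AWS CLI terms those routes amount to roughly the following -- the route 
table and interface IDs here are placeholders:)

aws ec2 create-route --route-table-id rtb-xxxxxxxx \
    --destination-cidr-block 10.0.0.100/32 --network-interface-id eni-node1
aws ec2 create-route --route-table-id rtb-xxxxxxxx \
    --destination-cidr-block 10.0.0.101/32 --network-interface-id eni-node2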

I then set up Pacemaker cluster resources for the VIPs:

pcs resource create cluster_vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 
cidr_netmask=32 nic=eth0 op monitor interval=15s
pcs resource create cluster_vip2 ocf:heartbeat:IPaddr2 ip=10.0.0.101 
cidr_netmask=32 nic=eth0 op monitor interval=15s

The voodoo in this is that you specify the device name of the network interface 
that you’re mapping to rather than just the IP address. Otherwise Pacemaker 
will throw an error about how 10.0.0.100 isn’t an address that exists on the 
cluster nodes. Then you need to make sure that the right VIP is running on the 
right cluster node by probably moving resources around.

pcs resource move cluster_vip1 clusternode1
pcs resource move cluster_vip2 clusternode2

At this point, if everything is working properly, you should be able to ping 
(assuming no firewall rules are in the way) the VIP IPs as long as they are 
associated with the appropriate node and the routes are correct. You’ll find 
that if you move the VIP to another node without updating the routing table, 
the pings will no longer work. Success! Well, almost…

I know logically that to make this work I need to sort out a fencing method 
that detects node failure, plus a fencing script that updates the routing table 
as appropriate, so that traffic moves over properly to the failover node in the 
event of a…well…event… ☺

#!/bin/bash
sudo -u hacluster aws ec2 replace-route --route-table-id rtb- 
--destination-cidr-block 10.0.0.100/32 --network-interface-id eni-
pcs resource move cluster_vip1 clusternode2

Obviously, this would only work in an AWS deployment, and, like I said, I still 
haven’t figured out how to detect an outage to make this failover occur. 
Hopefully, though, this should get you pointed in the right direction.

--

[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path

From: "bhargav M.P" 
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 

Date: Tuesday, August 9, 2016 at 2:40 PM
To: "users@clusterlabs.org" 
Subject: [ClusterLabs] Can Pacemaker monitor geographical separated servers

Hi All,
I have a deployment with two Linux servers that are geographically separated 
and sit on different subnets. I want the servers to work in Active/Standby 
mode, and I would like to use pacemaker/corosync to perform the switchover when 
the active server fails.
My requirement:
I would like a single virtual IP address for accessing the Linux servers, with 
only the active server holding the VIP (virtual IP address).

Can pacemaker transfer the virtual IP address to the new active server when the 
current active fails? If so, how can the same virtual IP address remain 
reachable from the client, given that it has now moved to a different subnet?
If pacemaker cannot support the above use case, what are the possible options I 
should look at?

Thank you so much for the help,
Bhargav




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-10 Thread Jason A Ramsey
Unfortunately, I need the latest features in my use case… Thanks for the tip, 
though.

--

[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path

From: Marek Grac <mg...@redhat.com>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Date: Wednesday, August 10, 2016 at 7:30 AM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Unable to Build fence-agents from Source on RHEL6

Hi,

* pywsman is required only for fence_amt_ws so if you don't need it, feel free 
to remove this dependency (and agent)
* python2.6 (and RHEL6) is no longer a platform that we support at upstream. We 
aim for python 2.7 and 3.x currently. But it might work on python2.6 (and 
Oyvind accepts patches that fix it)
* for RHEL6 you might want to use our branch 'RHEL6' that has to work with 
python2.6. Maybe you won't get the latest features but it should be enough for 
most installations.

m,

On Tue, Aug 9, 2016 at 10:20 PM, Jason A Ramsey 
<ja...@eramsey.org> wrote:
So, I’ve managed to wade through the majority of dependency hell issues I’ve 
encountered trying to get RPMs built of Pacemaker and its ancillary packages. 
That is, of course, with the exception of the fence-agents source tree (grabbed 
from github). Autogen.sh works great, but when it comes to configure’ing the 
tree, it bombs out on a missing pywsman module. Great, so I need to install 
that. pip install pywsman doesn’t work because of missing openwsman libraries, 
etc. so I go ahead and install the packages I’m pretty sure I need using yum 
(openwsman-client, openwsman-server, libwsman). pip install still craps out, so 
I figure out that the build is looking for the openwsman headers. Cool, find 
(it’s not in the default yum repos) libwsman-devel, which ends up requiring 
sblim-sfcc-devel and probably some other stuff I can’t remember any more 
because my brain is mostly jelly at this point… Anyway, finally get the 
libwsman-devel rpm. Yay! Everything should work now, right? Wrong. Here’s the 
output I now get out of pip install pywsman:

< stupiderrormessage >

# pip install pywsman
DEPRECATION: Python 2.6 is no longer supported by the Python core team, please 
upgrade your Python. A future version of pip will drop support for Python 2.6
Collecting pywsman
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached pywsman-2.5.2-1.tar.gz
Building wheels for collected packages: pywsman
  Running setup.py bdist_wheel for pywsman ... error
  Complete output from command /usr/bin/python -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-build-bvG1Jf/pywsman/setup.py';exec(compile(getattr(tokenize,
 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
bdist_wheel -d /tmp/tmp3R7Zz5pip-wheel- --python-tag cp26:
  No version.i.in file found -- Building from sdist.
  /usr/lib/python2.6/site-packages/setuptools/dist.py:364: UserWarning: 
Normalizing '2.5.2-1' to '2.5.2.post1'
normalized_version,
  running bdist_wheel
  running build
  running build_ext
  building '_pywsman' extension
  swigging openwsman.i to openwsman_wrap.c
  swig -python -I/tmp/pip-build-bvG1Jf/pywsman -I/usr/include/openwsman 
-features autodoc -o openwsman_wrap.c openwsman.i
  wsman-client.i:44: Warning(504): Function _WsManClient must have a return 
type.
  wsman-client.i:61: Warning(504): Function _WsManClient must have a return 
type.
  creating build
  creating build/temp.linux-x86_64-2.6
  gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/tmp

[ClusterLabs] Unable to Build fence-agents from Source on RHEL6

2016-08-09 Thread Jason A Ramsey
So, I’ve managed to wade through the majority of dependency hell issues I’ve 
encountered trying to get RPMs built of Pacemaker and its ancillary packages. 
That is, of course, with the exception of the fence-agents source tree (grabbed 
from github). Autogen.sh works great, but when it comes to configure’ing the 
tree, it bombs out on a missing pywsman module. Great, so I need to install 
that. pip install pywsman doesn’t work because of missing openwsman libraries, 
etc. so I go ahead and install the packages I’m pretty sure I need using yum 
(openwsman-client, openwsman-server, libwsman). pip install still craps out, so 
I figure out that the build is looking for the openwsman headers. Cool, find 
(it’s not in the default yum repos) libwsman-devel, which ends up requiring 
sblim-sfcc-devel and probably some other stuff I can’t remember any more 
because my brain is mostly jelly at this point… Anyway, finally get the 
libwsman-devel rpm. Yay! Everything should work now, right? Wrong. Here’s the 
output I now get out of pip install pywsman:

< stupiderrormessage >

# pip install pywsman
DEPRECATION: Python 2.6 is no longer supported by the Python core team, please 
upgrade your Python. A future version of pip will drop support for Python 2.6
Collecting pywsman
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached pywsman-2.5.2-1.tar.gz
Building wheels for collected packages: pywsman
  Running setup.py bdist_wheel for pywsman ... error
  Complete output from command /usr/bin/python -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-build-bvG1Jf/pywsman/setup.py';exec(compile(getattr(tokenize,
 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
bdist_wheel -d /tmp/tmp3R7Zz5pip-wheel- --python-tag cp26:
  No version.i.in file found -- Building from sdist.
  /usr/lib/python2.6/site-packages/setuptools/dist.py:364: UserWarning: 
Normalizing '2.5.2-1' to '2.5.2.post1'
normalized_version,
  running bdist_wheel
  running build
  running build_ext
  building '_pywsman' extension
  swigging openwsman.i to openwsman_wrap.c
  swig -python -I/tmp/pip-build-bvG1Jf/pywsman -I/usr/include/openwsman 
-features autodoc -o openwsman_wrap.c openwsman.i
  wsman-client.i:44: Warning(504): Function _WsManClient must have a return 
type.
  wsman-client.i:61: Warning(504): Function _WsManClient must have a return 
type.
  creating build
  creating build/temp.linux-x86_64-2.6
  gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/tmp/pip-build-bvG1Jf/pywsman 
-I/usr/include/openwsman -I/usr/include/python2.6 -c openwsman.c -o 
build/temp.linux-x86_64-2.6/openwsman.o
  gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/tmp/pip-build-bvG1Jf/pywsman 
-I/usr/include/openwsman -I/usr/include/python2.6 -c openwsman_wrap.c -o 
build/temp.linux-x86_64-2.6/openwsman_wrap.o
  openwsman_wrap.c: In function ‘_WsXmlDoc_string’:
  openwsman_wrap.c:3225: warning: implicit declaration of function 
‘ws_xml_dump_memory_node_tree_enc’
  openwsman_wrap.c: In function ‘__WsXmlNode_size’:
  openwsman_wrap.c:3487: warning: implicit declaration of function 
‘ws_xml_get_child_count_by_qname’
  openwsman_wrap.c: In function ‘epr_t_cmp’:
  openwsman_wrap.c:3550: warning: passing argument 2 of ‘epr_cmp’ discards 
qualifiers from pointer target type
  /usr/include/openwsman/wsman-epr.h:163: note: expected ‘struct epr_t *’ but 
argument is of type ‘const struct epr_t *’
  openwsman_wrap.c: In function ‘epr_t_string’:

Re: [ClusterLabs] STONITH Fencing for Amazon EC2

2016-08-04 Thread Jason A Ramsey
Is there some other [updated] fencing module I can use in this use case?

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 8/2/16, 11:59 AM, "Digimer" <li...@alteeve.ca> wrote:

On 02/08/16 10:02 AM, Jason A Ramsey wrote:
> I’ve found [oldish] references on the internet to a fencing module for 
Amazon EC2, but it doesn’t seem to be included in any of the fencing yum packages 
for CentOS. Is this module not part of the canonical distribution? Is there 
something else I should be looking for?

I *think* it fell behind (fence_ec2, iirc). It might need to be picked
up, updated/tested and then it can be re-added to the official list.

I'm not 100% on this though, so if someone contradicts me, ignore me.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] STONITH Fencing for Amazon EC2

2016-08-02 Thread Jason A Ramsey
I’ve found [oldish] references on the internet to a fencing module for Amazon 
EC2, but it doesn’t seem to be included in any of the fencing yum packages for 
CentOS. Is this module not part of the canonical distribution? Is there 
something else I should be looking for?
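
For reference, a quick way to check what the installed CentOS fence-agents 
packages actually ship (which is what I mean by "not included" above):

yum list installed 'fence-agents*'
ls /usr/sbin/fence_*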

--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker Resource Agent for iSCSILogicalUnit is Incomplete

2016-07-25 Thread Jason A Ramsey
Hi, everyone. You might’ve noticed a few emails from me in the last couple of 
weeks that chronicle my struggle to get an HA iSCSI Target configured in AWS. 
After stumbling through this setup (and struggling all the way), I think I’ve 
finally gotten to the point where I’m ready to create the LUNs for this 
cluster. Unfortunately, I’ve come to realize that the 
ocf:heartbeat:iSCSILogicalUnit resource agent does not seem to completely 
support the iSCSI lio-t (lio using targetcli) specification. I was hoping that 
I could get some help in closing this final loop. Essentially, the resource 
agent (excerpt below) seems only to support the configuration and management of 
block backstores. Other backstores are not accounted for including most 
critically (in my use case) fileio:

lio-t)
    # For lio, we first have to create a target device, then
    # add it to the Target Portal Group as an LU.
    ocf_run targetcli /backstores/block create name=${OCF_RESOURCE_INSTANCE} dev=${OCF_RESKEY_path} || exit $OCF_ERR_GENERIC
    if [ -n "${OCF_RESKEY_scsi_sn}" ]; then
        echo ${OCF_RESKEY_scsi_sn} > /sys/kernel/config/target/core/iblock_${OCF_RESKEY_lio_iblock}/${OCF_RESOURCE_INSTANCE}/wwn/vpd_unit_serial
    fi
    ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/luns create /backstores/block/${OCF_RESOURCE_INSTANCE} ${OCF_RESKEY_lun} || exit $OCF_ERR_GENERIC

    if [ -n "${OCF_RESKEY_allowed_initiators}" ]; then
        for initiator in ${OCF_RESKEY_allowed_initiators}; do
            ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/acls create ${initiator} add_mapped_luns=False || exit $OCF_ERR_GENERIC
            ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/acls/${initiator} create ${OCF_RESKEY_lun} ${OCF_RESKEY_lun} || exit $OCF_ERR_GENERIC
        done
    fi
    ;;

My assumption is that I can hack out the script bits necessary to support the 
fileio backstore, but I’m curious if that’s something that someone has already 
done somewhere. Thanks in advance! I really appreciate all the assistance I’ve 
gotten from this group.
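
For what it's worth, here is a rough, untested sketch of what a fileio variant of 
those first calls might look like (my own guess, reusing the agent's existing 
variables; nothing like this ships in the agent today):

# Hypothetical fileio variant of the backstore/LUN creation. Assumes
# OCF_RESKEY_path points at a backing file on the mounted (DRBD-backed)
# filesystem rather than a raw block device.
ocf_run targetcli /backstores/fileio create name=${OCF_RESOURCE_INSTANCE} file_or_dev=${OCF_RESKEY_path} || exit $OCF_ERR_GENERIC
ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/luns create /backstores/fileio/${OCF_RESOURCE_INSTANCE} ${OCF_RESKEY_lun} || exit $OCF_ERR_GENERIC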

--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Resource Agent ocf:heartbeat:iSCSILogicalUnit

2016-07-22 Thread Jason A Ramsey
Great! Thanks for the pointer! Any ideas on the other stuff I was asking about 
(i.e. how to use any other backstore other than block with Pacemaker)?

--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 7/22/16, 12:24 PM, "Andrei Borzenkov" <arvidj...@gmail.com> wrote:

22.07.2016 18:29, Jason A Ramsey пишет:
> From the command line parameters for the pcs resource create or is it
> something internal (not exposed to the user)? If the former, what
> parameter?
> 


http://www.linux-ha.org/doc/dev-guides/_literal_ocf_resource_instance_literal.html

> --
> 
> [ jR ] @: ja...@eramsey.org
> 
> there is no path to greatness; greatness is the path
> 
> On 7/22/16, 11:08 AM, "Andrei Borzenkov" <arvidj...@gmail.com>
> wrote:
> 
> 22.07.2016 17:43, Jason A Ramsey пишет:
>> Additionally (and this is just a failing on my part), I’m unclear
>> as to where the resource agent is fed the value for 
>> “${OCF_RESOURCE_INSTANCE}” given the limited number of parameters
>> one is permitted to supply with “pcs resource create…”
>> 
> 
> It is supplied automatically by pacemaker.
> 
> 



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Resource Agent ocf:heartbeat:iSCSILogicalUnit

2016-07-22 Thread Jason A Ramsey
From the command line parameters for the pcs resource create or is it something 
internal (not exposed to the user)? If the former, what parameter?

--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 7/22/16, 11:08 AM, "Andrei Borzenkov" <arvidj...@gmail.com> wrote:

22.07.2016 17:43, Jason A Ramsey пишет:
> Additionally (and this is just a failing on my part), I’m
> unclear as to where the resource agent is fed the value for
> “${OCF_RESOURCE_INSTANCE}” given the limited number of parameters one
> is permitted to supply with “pcs resource create…”
>

It is supplied automatically by pacemaker.
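
(Concretely, and purely as an illustration: for a resource created with

pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit ...

the agent is invoked with OCF_RESOURCE_INSTANCE=hdcvbnas_lun0 set in its 
environment, which is why the agent can use it as the backstore name.)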


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Resource Agent ocf:heartbeat:iSCSILogicalUnit

2016-07-22 Thread Jason A Ramsey
I’m struggling to understand how to fully exploit the capabilities of targetcli 
using the Pacemaker resource agent for iSCSILogicalUnit. From this block of 
code:

lio-t)
    # For lio, we first have to create a target device, then
    # add it to the Target Portal Group as an LU.
    ocf_run targetcli /backstores/block create name=${OCF_RESOURCE_INSTANCE} dev=${OCF_RESKEY_path} || exit $OCF_ERR_GENERIC
    if [ -n "${OCF_RESKEY_scsi_sn}" ]; then
        echo ${OCF_RESKEY_scsi_sn} > /sys/kernel/config/target/core/iblock_${OCF_RESKEY_lio_iblock}/${OCF_RESOURCE_INSTANCE}/wwn/vpd_unit_serial
    fi
    ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/luns create /backstores/block/${OCF_RESOURCE_INSTANCE} ${OCF_RESKEY_lun} || exit $OCF_ERR_GENERIC

    if [ -n "${OCF_RESKEY_allowed_initiators}" ]; then
        for initiator in ${OCF_RESKEY_allowed_initiators}; do
            ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/acls create ${initiator} add_mapped_luns=False || exit $OCF_ERR_GENERIC
            ocf_run targetcli /iscsi/${OCF_RESKEY_target_iqn}/tpg1/acls/${initiator} create ${OCF_RESKEY_lun} ${OCF_RESKEY_lun} || exit $OCF_ERR_GENERIC
        done
    fi
    ;;

it looks like I’m only permitted to create a block backstore. Critically 
missing, in this scenario, is the ability to create fileio backstores on things 
like mounted filesystems sitting on top of DRBD. Additionally (and this 
is just a failing on my part), I’m unclear as to where the resource agent is 
fed the value for “${OCF_RESOURCE_INSTANCE}” given the limited number of 
parameters one is permitted to supply with “pcs resource create…”

Can anyone provide any insight please? Thank you in advance!


--
 
[ jR ]
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Setup problem: couldn't find command: tcm_node

2016-07-21 Thread Jason A Ramsey
Okay! This is incredibly helpful. Thank you. After digging through the resource 
agents, it looks like what I actually want is to use the implementation “lio-t” 
since that uses the targetcli utilities.
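
For the archives, the LUN creation command then becomes something along these 
lines (illustrative only, reusing my resource names from earlier in the thread):

pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit \
    target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" \
    path=/dev/drbd1 implementation="lio-t" op monitor interval=15s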

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 7/21/16, 10:31 AM, "Andrei Borzenkov" <arvidj...@gmail.com> wrote:

On Thu, Jul 21, 2016 at 5:18 PM, Jason A Ramsey <ja...@eramsey.org> wrote:
> Thanks for the response, but as I indicated in a previous response, 
lio-utils is deprecated, having been replaced by targetcli. It seems as though 
Pacemaker is trying to invoke a set of utilities/libraries that aren’t 
currently supported to perform iSCSI Target/LUN clustering. My question 
becomes, therefore, how does one retarget Pacemaker at the right/current 
toolset?

One updates resource agent?

# Set a default implementation based on software installed
if have_binary ietadm; then
    OCF_RESKEY_implementation_default="iet"
elif have_binary tgtadm; then
    OCF_RESKEY_implementation_default="tgt"
elif have_binary lio_node; then
    OCF_RESKEY_implementation_default="lio"
elif have_binary targetcli; then
    OCF_RESKEY_implementation_default="lio-t"
fi




The iSCSI target daemon implementation. Must be one of "iet", "tgt",
"lio", or "lio-t". If unspecified, an implementation is selected based on the
availability of management utilities, with "iet" being tried first,
then "tgt", then "lio", then "lio-t".


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Setup problem: couldn't find command: tcm_node

2016-07-21 Thread Jason A Ramsey
Thanks for the response, but as I indicated in a previous response, lio-utils 
is deprecated, having been replaced by targetcli. It seems as though Pacemaker 
is trying to invoke a set of utilities/libraries that aren’t currently 
supported to perform iSCSI Target/LUN clustering. My question becomes, 
therefore, how does one retarget Pacemaker at the right/current toolset?

--
 
[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 7/20/16, 10:55 PM, "Zhu Lingshan" <ls...@suse.com> wrote:

Hi Jason,

tcm_node is in a package called lio-utils. If it is SUSE, you can try to 
zypper in lio-utils.


Thanks,
BR
Zhu Lingsan

On 07/20/2016 11:08 PM, Jason A Ramsey wrote:
> I have been struggling getting a HA iSCSI Target cluster in place for 
literally weeks. I cannot, for whatever reason, get pacemaker to create an 
iSCSILogicalUnit resource. The error message that I’m seeing leads me to 
believe that I’m missing something on the systems (“tcm_node”). Here are my 
setup commands leading up to seeing this error message:
>
> # pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" op monitor interval=15s
>
> # pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 implementation="lio" op monitor interval=15s
>
>
> Failed Actions:
> * hdcvbnas_lun0_stop_0 on hdc1anas002 'not installed' (5): call=321, 
status=complete, exitreason='Setup problem: couldn't find command: tcm_node',
>  last-rc-change='Wed Jul 20 10:51:15 2016', queued=0ms, exec=32ms
>
> This is with the following installed:
>
> pacemaker-cli-1.1.13-10.el7.x86_64
> pacemaker-1.1.13-10.el7.x86_64
> pacemaker-libs-1.1.13-10.el7.x86_64
> pacemaker-cluster-libs-1.1.13-10.el7.x86_64
> corosynclib-2.3.4-7.el7.x86_64
> corosync-2.3.4-7.el7.x86_64
>
> Please please please…any ideas are appreciated. I’ve exhausted all 
avenues of investigation at this point and don’t know what to do. Thank you!
>
> --
>   
> [ jR ]
> @: ja...@eramsey.org
>   
>there is no path to greatness; greatness is the path
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Setup problem: couldn't find command: tcm_node

2016-07-20 Thread Jason A Ramsey
Actually, according to http://linux-iscsi.org/wiki/Lio-utils lio-utils has been 
deprecated and replaced by targetcli.
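
On CentOS/RHEL 7 the practical upshot is to install targetcli itself rather than 
chase lio-utils; a minimal sketch, assuming the stock base repositories:

yum install -y targetcli

(targetcli pulls in python-rtslib, which provides the LIO configfs tooling that 
the resource agent drives in lio-t mode.)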

--
 
[ jR ]
@: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

On 7/20/16, 12:09 PM, "Andrei Borzenkov" <arvidj...@gmail.com> wrote:

20.07.2016 18:08, Jason A Ramsey пишет:
> I have been struggling getting a HA iSCSI Target cluster in place for 
literally weeks. I cannot, for whatever reason, get pacemaker to create an 
iSCSILogicalUnit resource. The error message that I’m seeing leads me to 
believe that I’m missing something on the systems (“tcm_node”). Here are my 
setup commands leading up to seeing this error message:
> 
> # pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" op monitor interval=15s
> 
> # pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 implementation="lio" op monitor interval=15s
> 
> 
> Failed Actions:
> * hdcvbnas_lun0_stop_0 on hdc1anas002 'not installed' (5): call=321, 
status=complete, exitreason='Setup problem: couldn't find command: tcm_node',

tcm_node is part of lio-utils. I am not familiar with RedHat packages,
but I presume that searching for "lio" should reveal something.

> last-rc-change='Wed Jul 20 10:51:15 2016', queued=0ms, exec=32ms
> 
> This is with the following installed:
> 
> pacemaker-cli-1.1.13-10.el7.x86_64
> pacemaker-1.1.13-10.el7.x86_64
> pacemaker-libs-1.1.13-10.el7.x86_64
> pacemaker-cluster-libs-1.1.13-10.el7.x86_64
> corosynclib-2.3.4-7.el7.x86_64
> corosync-2.3.4-7.el7.x86_64
> 
> Please please please…any ideas are appreciated. I’ve exhausted all 
avenues of investigation at this point and don’t know what to do. Thank you!
> 
> --
>  
> [ jR ]
> @: ja...@eramsey.org
>  
>   there is no path to greatness; greatness is the path
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Setup problem: couldn't find command: tcm_node

2016-07-20 Thread Jason A Ramsey
I have been struggling getting a HA iSCSI Target cluster in place for literally 
weeks. I cannot, for whatever reason, get pacemaker to create an 
iSCSILogicalUnit resource. The error message that I’m seeing leads me to 
believe that I’m missing something on the systems (“tcm_node”). Here are my 
setup commands leading up to seeing this error message:

# pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" op monitor interval=15s

# pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 implementation="lio" op monitor interval=15s


Failed Actions:
* hdcvbnas_lun0_stop_0 on hdc1anas002 'not installed' (5): call=321, 
status=complete, exitreason='Setup problem: couldn't find command: tcm_node',
last-rc-change='Wed Jul 20 10:51:15 2016', queued=0ms, exec=32ms

This is with the following installed:

pacemaker-cli-1.1.13-10.el7.x86_64
pacemaker-1.1.13-10.el7.x86_64
pacemaker-libs-1.1.13-10.el7.x86_64
pacemaker-cluster-libs-1.1.13-10.el7.x86_64
corosynclib-2.3.4-7.el7.x86_64
corosync-2.3.4-7.el7.x86_64

Please please please…any ideas are appreciated. I’ve exhausted all avenues of 
investigation at this point and don’t know what to do. Thank you!

--
 
[ jR ]
@: ja...@eramsey.org
 
  there is no path to greatness; greatness is the path

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] HA iSCSITarget Using FileIO

2016-07-13 Thread Jason A Ramsey
Oh, and I forgot to add that when I try to create the LUN with the 
implementation=”lio” parameter, I see the following:

[root@hdc1anas002 ec2-user]# pcs resource create hdcvbnas_lun0 
ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 implementation="lio" op monitor interval=15s


[root@hdc1anas002 ec2-user]# pcs status
Cluster name: hdcvbnas
Last updated: Wed Jul 13 16:29:36 2016 Last change: Wed Jul 
13 16:29:33 2016 by root via cibadmin on hdc1anas002
Stack: corosync
Current DC: hdc1anas002 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 7 resources configured

Online: [ hdc1anas002 hdc1bnas002 ]

Full list of resources:

Master/Slave Set: hdcvbnas_tgtclone [hdcvbnas_tgt]
 Masters: [ hdc1anas002 ]
 Slaves: [ hdc1bnas002 ]
hdcvbnas_tgtfs   (ocf::heartbeat:Filesystem):Started hdc1anas002
hdcvbnas_ip0  (ocf::heartbeat:IPaddr2): Started hdc1anas002
hdcvbnas_ip1  (ocf::heartbeat:IPaddr2): Started hdc1bnas002
hdcvbnas_tgtsvc(ocf::heartbeat:iSCSITarget):  Started 
hdc1bnas002
hdcvbnas_lun0   (ocf::heartbeat:iSCSILogicalUnit):  FAILED hdc1anas002 
(unmanaged)

Failed Actions:
* hdcvbnas_lun0_stop_0 on hdc1anas002 'not installed' (5): call=251, 
status=complete, exitreason='Setup problem: couldn't find command: tcm_node',
last-rc-change='Wed Jul 13 16:29:33 2016', queued=0ms, exec=31ms


--

[ jR ]
  M: +1 (703) 628-2621
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path

From: Jason Ramsey <ja...@eramsey.org>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Date: Wednesday, July 13, 2016 at 4:02 PM
To: "users@clusterlabs.org" <users@clusterlabs.org>
Subject: [ClusterLabs] HA iSCSITarget Using FileIO

I’m having some difficulty setting up a PCS/Corosync HA iSCSI target. I’m able 
to create the iSCSI target resource (it spins up the target service properly 
when the pcs command is issued). However, when I attempt to create a LUN, I get 
nothing but error messages. This works:

pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" op monitor interval=15s

But this does not:

pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" implementation="lio" 
portals="10.0.96.100 10.0.96.101" op monitor interval=15s

Also, even when the first (working) command succeeds, the creation of the LUN does 
not work:

pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 op monitor interval=15s

Here’s the results:

[root@hdc1anas002 ec2-user]# pcs status
Cluster name: hdcvbnas
Last updated: Wed Jul 13 15:59:05 2016 Last change: Wed Jul 
13 15:59:03 2016 by root via cibadmin on hdc1anas002
Stack: corosync
Current DC: hdc1anas002 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 7 resources configured

Online: [ hdc1anas002 hdc1bnas002 ]

Full list of resources:

Master/Slave Set: hdcvbnas_tgtclone [hdcvbnas_tgt]
 Masters: [ hdc1anas002 ]
 Slaves: [ hdc1bnas002 ]
hdcvbnas_tgtfs   (ocf::heartbeat:Filesystem):Started hdc1anas002
hdcvbnas_ip0  (ocf::heartbeat:IPaddr2): Started hdc1anas002
hdcvbnas_ip1  (ocf::heartbeat:IPaddr2): Started hdc1bnas002
hdcvbnas_tgtsvc(ocf::heartbeat:iSCSITarget):  Started 
hdc1bnas002
hdcvbnas_lun0   (ocf::heartbeat:iSCSILogicalUnit):  Stopped

Failed Actions:
* hdcvbnas_lun0_start_0 on hdc1anas002 'unknown error' (1): call=243, 
status=complete, exitreason='none',
last-rc-change='Wed Jul 13 15:59:03 2016', queued=0ms, exec=123ms
* hdcvbnas_lun0_start_0 on hdc1bnas002 'unknown error' (1): call=257, 
status=complete, exitreason='none',
last-rc-change='Wed Jul 13 15:59:03 2016', queued=0ms, exec=124ms


PCSD Status:
  hdc1anas002: Online
  hdc1bnas002: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled


Unfortunately, neither the corosync nor syslog logs provide any information 
that is remotely helpful in troubleshooting this. I appreciate any help you 
might provide.

--

[ jR ]
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] HA iSCSITarget Using FileIO

2016-07-13 Thread Jason A Ramsey
I’m having some difficulty setting up a PCS/Corosync HA iSCSI target. I’m able 
to create the iSCSI target resource (it spins up the target service properly 
when the pcs command is issued). However, when I attempt to create a LUN, I get 
nothing but error messages. This works:

pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" op monitor interval=15s

But this does not:

pcs resource create hdcvbnas_tgtsvc ocf:heartbeat:iSCSITarget 
iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" implementation="lio" 
portals="10.0.96.100 10.0.96.101" op monitor interval=15s

Also, even when the first (working) command succeeds, the creation of the LUN does 
not work:

pcs resource create hdcvbnas_lun0 ocf:heartbeat:iSCSILogicalUnit 
target_iqn="iqn.2016-07.local.hsinawsdev:hdcvadbs-witness" lun="0" 
path=/dev/drbd1 op monitor interval=15s

Here’s the results:

[root@hdc1anas002 ec2-user]# pcs status
Cluster name: hdcvbnas
Last updated: Wed Jul 13 15:59:05 2016 Last change: Wed Jul 
13 15:59:03 2016 by root via cibadmin on hdc1anas002
Stack: corosync
Current DC: hdc1anas002 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 7 resources configured

Online: [ hdc1anas002 hdc1bnas002 ]

Full list of resources:

Master/Slave Set: hdcvbnas_tgtclone [hdcvbnas_tgt]
 Masters: [ hdc1anas002 ]
 Slaves: [ hdc1bnas002 ]
hdcvbnas_tgtfs   (ocf::heartbeat:Filesystem):Started hdc1anas002
hdcvbnas_ip0  (ocf::heartbeat:IPaddr2): Started hdc1anas002
hdcvbnas_ip1  (ocf::heartbeat:IPaddr2): Started hdc1bnas002
hdcvbnas_tgtsvc(ocf::heartbeat:iSCSITarget):  Started 
hdc1bnas002
hdcvbnas_lun0   (ocf::heartbeat:iSCSILogicalUnit):  Stopped

Failed Actions:
* hdcvbnas_lun0_start_0 on hdc1anas002 'unknown error' (1): call=243, 
status=complete, exitreason='none',
last-rc-change='Wed Jul 13 15:59:03 2016', queued=0ms, exec=123ms
* hdcvbnas_lun0_start_0 on hdc1bnas002 'unknown error' (1): call=257, 
status=complete, exitreason='none',
last-rc-change='Wed Jul 13 15:59:03 2016', queued=0ms, exec=124ms


PCSD Status:
  hdc1anas002: Online
  hdc1bnas002: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled


Unfortunately, neither the corosync nor syslog logs provide any information 
that is remotely helpful in troubleshooting this. I appreciate any help you 
might provide.

--

[ jR ]
  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] HA iSCSI Target on Amazon Web Services (Multi-AZ)

2016-05-23 Thread Jason A Ramsey
Anyone have any clues on this? I’m rather stumped at present. Thank you!


--

[ jR ]

  there is no path to greatness; greatness is the path

From: Jason Ramsey <ja...@eramsey.org>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Date: Tuesday, April 26, 2016 at 12:07 PM
To: "users@clusterlabs.org" <users@clusterlabs.org>
Subject: [ClusterLabs] HA iSCSI Target on Amazon Web Services (Multi-AZ)

So, I've been struggling for about 2 weeks to cobble together the bits and bobs 
required to create a highly available iSCSI Target cluster in AWS. I have a 
Pacemaker/Corosync cluster in place using DRBD for block-level replication of 
the EBS volumes used as target storage between the nodes. While I have managed 
to stitch together 4-5 how-tos to get to this point, I find myself struggling 
(conceptualizing and implementing) with how to make the final bits function in 
a multi-AZ implementation. Most guides have you set up instances in the same 
availability zone and the same subnet. This makes it easy to create a secondary 
"vip" address that the nodes can share for the ocf:heartbeat:IPaddr2 resource. 
This is, for obvious reasons, not the case when you're talking multi-AZ because 
IP subnets do not span availability zones. Can anyone walk me through this or 
point me somewhere that will? Thanks!

--


[ jR ]

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] HA iSCSI Target on Amazon Web Services (Multi-AZ)

2016-04-26 Thread Jason A Ramsey
So, I've been struggling for about 2 weeks to cobble together the bits and bobs 
required to create a highly available iSCSI Target cluster in AWS. I have a 
Pacemaker/Corosync cluster in place using DRBD for block-level replication of 
the EBS volumes used as target storage between the nodes. While I have managed 
to stitch together 4-5 how-tos to get to this point, I find myself struggling 
(conceptualizing and implementing) with how to make the final bits function in 
a multi-AZ implementation. Most guides have you set up instances in the same 
availability zone and the same subnet. This makes it easy to create a secondary 
"vip" address that the nodes can share for the ocf:heartbeat:IPaddr2 resource. 
This is, for obvious reasons, not the case when you're talking multi-AZ because 
IP subnets do not span availability zones. Can anyone walk me through this or 
point me somewhere that will? Thanks!

--


[ jR ]

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org