On Tue, Nov 4, 2014 at 6:34 PM, Arman Khalatyan <arm2...@gmail.com> wrote:
> It will be interesting to see your iscsi setup with drbd. Did you get
> split-brain before the failure?
> Did you check if your target went to read-only mode?
> Thanks
> Arman.

I used some of the information provided here, even if it is based on CentOS 5.7 with LVM on top of DRBD, while in my setup I have CentOS 6.5 and DRBD on top of LVM:
http://blogs.mindspew-age.com/2012/04/05/adventures-in-high-availability-ha-iscsi-with-drbd-iscsi-and-pacemaker/

- my drbd resource definition for iSCSI HA:

[root@srvmgmt01 ~]# cat iscsiha.res
resource iscsiha {
  disk {
    disk-flushes no;
    md-flushes no;
    fencing resource-and-stonith;
  }
  device minor 2;
  disk /dev/iscsihavg/iscsihalv;
  syncer {
    rate 30M;
    verify-alg md5;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  on srvmgmt01.localdomain.local {
    address 192.168.230.51:7790;
    meta-disk internal;
  }
  on srvmgmt02.localdomain.local {
    address 192.168.230.52:7790;
    meta-disk internal;
  }
}

- tgtd is set up to start on both nodes at boot; the iscsi and iscsid services are configured to off

- Put the iSCSILogicalUnit and iSCSITarget agents, downloaded from here, under /usr/lib/ocf/resource.d/heartbeat/ on both nodes, as they are not provided in plain CentOS:
http://linux-ha.org/doc/man-pages/re-ra-iSCSITarget.html

- Here below are the pcs steps used to create the group:

pcs cluster cib iscsiha_cfg

pcs -f iscsiha_cfg resource create p_drbd_iscsiha ocf:linbit:drbd drbd_resource=iscsiha \
  op monitor interval="29s" role="Master" timeout="30" op monitor interval="31s" \
  role="Slave" timeout="30" op start interval="0" timeout="240" op stop interval="0" timeout="100"

pcs -f iscsiha_cfg resource master ms_drbd_iscsiha p_drbd_iscsiha \
  master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

pcs -f iscsiha_cfg resource create p_iscsi_store1 ocf:heartbeat:iSCSITarget \
  params implementation="tgt" iqn="iqn.2014-07.local.localdomain:store1" tid="1" \
  allowed_initiators="10.10.1.61 10.10.1.62 10.10.1.63" incoming_username="iscsiuser" incoming_password="iscsipwd" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="60" \
  op monitor interval="30" timeout="60"

pcs -f iscsiha_cfg resource create p_iscsi_store1_lun1 ocf:heartbeat:iSCSILogicalUnit \
  params implementation="tgt" target_iqn="iqn.2014-07.local.localdomain:store1" lun="1" \
  path="/dev/drbd/by-res/iscsiha" \
  op start interval="0" timeout="60" \
  op stop interval="0" timeout="60" \
  op monitor interval="30" timeout="60"

pcs -f iscsiha_cfg resource create p_ip_iscsi ocf:heartbeat:IPaddr2 \
  params ip="10.10.1.71" \
  op start interval="0" timeout="20" \
  op stop interval="0" timeout="20" \
  op monitor interval="30" timeout="20"

pcs -f iscsiha_cfg resource create p_portblock-store1-block ocf:heartbeat:portblock \
  params ip="10.10.1.71" portno="3260" protocol="tcp" action="block"

pcs -f iscsiha_cfg resource create p_portblock-store1-unblock ocf:heartbeat:portblock \
  params ip="10.10.1.71" portno="3260" protocol="tcp" action="unblock" \
  op monitor interval="30s"

pcs -f iscsiha_cfg resource group add g_iscsiha p_portblock-store1-block p_ip_iscsi p_iscsi_store1 \
  p_iscsi_store1_lun1 p_portblock-store1-unblock

pcs -f iscsiha_cfg constraint colocation add Started g_iscsiha with Master ms_drbd_iscsiha INFINITY

pcs -f iscsiha_cfg constraint order promote ms_drbd_iscsiha then start g_iscsiha

pcs cluster cib-push iscsiha_cfg
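As a quick sanity check after the cib-push (just a sketch, not taken from my shell history; it assumes DRBD 8.x as shipped on CentOS 6, where /proc/drbd is available):

# on the node expected to be active for the group
crm_mon -1                # overall cluster state, see the output below
cat /proc/drbd            # should show cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate
drbdadm role iscsiha      # should print Primary/Secondary on the active node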
- output of "crm_mon -1":

 Resource Group: g_iscsiha
     p_portblock-store1-block    (ocf::heartbeat:portblock):         Started srvmgmt01.localdomain.local
     p_ip_iscsi                  (ocf::heartbeat:IPaddr2):           Started srvmgmt01.localdomain.local
     p_iscsi_store1              (ocf::heartbeat:iSCSITarget):       Started srvmgmt01.localdomain.local
     p_iscsi_store1_lun1         (ocf::heartbeat:iSCSILogicalUnit):  Started srvmgmt01.localdomain.local
     p_portblock-store1-unblock  (ocf::heartbeat:portblock):         Started srvmgmt01.localdomain.local

- output of tgtadm on both nodes while srvmgmt01 is active for the group:

[root@srvmgmt01 ~]# tgtadm --mode target --op show
Target 1: iqn.2014-07.local.localdomain:store1
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET 00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: p_iscsi_store1_l
            SCSI SN: 66666a41
            Size: 214738 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/drbd/by-res/iscsiha
            Backing store flags:
    Account information:
        iscsiuser
    ACL information:
        10.10.1.61
        10.10.1.62
        10.10.1.63

on the passive node:

[root@srvmgmt02 heartbeat]# tgtadm --mode target --op show
[root@srvmgmt02 heartbeat]#

Still to be verified (TBV): the performance and tuning values taken from here:
http://www.dbarticles.com/centos-6-iscsi-tgtd-setup-and-performance-adjustments/

My cluster is a basic one for testing, so it is not critical for my environment... at the moment there is only a 1 Gbit/s network, with one adapter for the DRBD replica and one for iSCSI traffic.
I tested with some basic I/O benchmarks on a VM hammering the storage domain (SD) and got about 90-95 MB/s on both the DRBD and iSCSI networks. Also, relocating the iSCSI service while a benchmark was active did not seem to cause problems for the SD or the VM.

- I also opened iptables on the cluster nodes so that the initiators (the oVirt hosts) can connect to the IP alias dedicated to serving iSCSI; in /etc/sysconfig/iptables:

# iSCSI
-A INPUT -p tcp -m tcp -d 10.10.1.71 --dport 3260 -j ACCEPT
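Just for reference, the initiator side would look roughly like this (a sketch only; the oVirt hosts actually perform the login themselves via VDSM when the storage domain is added, and the host running these commands must be one of the allowed initiators 10.10.1.61-63; IQN, portal IP, and CHAP credentials are the ones from the target configuration above):

# discover the target behind the service IP
iscsiadm -m discovery -t sendtargets -p 10.10.1.71:3260

# set CHAP credentials on the node record, then log in
iscsiadm -m node -T iqn.2014-07.local.localdomain:store1 -p 10.10.1.71:3260 \
    -o update -n node.session.auth.authmethod -v CHAP
iscsiadm -m node -T iqn.2014-07.local.localdomain:store1 -p 10.10.1.71:3260 \
    -o update -n node.session.auth.username -v iscsiuser
iscsiadm -m node -T iqn.2014-07.local.localdomain:store1 -p 10.10.1.71:3260 \
    -o update -n node.session.auth.password -v iscsipwd
iscsiadm -m node -T iqn.2014-07.local.localdomain:store1 -p 10.10.1.71:3260 --login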
I still have to recheck the logs to give the exact scenario of what happened to cause the problem... not being a critical system, it is not so well monitored at the moment.

Comments welcome,
Gianluca