Yes, it's active/active, and I found that VMware can switch from path to path with no issues or service impact.
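If you want to check it from the ESXi side, it's just standard multipathing; something like the following (the naa device ID below is a placeholder, not a real one) will list the NMP devices with their paths and switch a device to round robin so I/O actually goes down both targets:

    esxcli storage nmp device list
    esxcli storage nmp device set --device naa.60000000000000000 --psp VMW_PSP_RR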
I posted some config files here: github.com/jak3kaj/misc

One set is from my LIO nodes, both the primary and secondary configs, so you can see what I needed to make unique. The other set (targets.conf) is from my tgt nodes. Both are 4-LUN configs.

Like I said in my previous email, there is no performance difference between LIO and tgt. The only service I'm running on these nodes is a single iSCSI target instance (either LIO or tgt).
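If you don't want to dig through the repo, the tgt side boils down to something like the sketch below (the IQN, serial numbers, and image names here are placeholders rather than my real values, and I've only shown 2 of the 4 LUNs). The point is that both nodes carry an identical target definition with pinned device identities; only the portal IP each node listens on differs:

    # /etc/tgt/targets.conf (sketch) -- same file on both nodes
    <target iqn.2015-01.com.example:rbd-store>
        driver iscsi
        bs-type rbd                      # rbd-enabled tgt talks to the cluster directly
        <backing-store rbd/vmware-lun0>
            lun 1
            scsi_id 35000000aa0000000    # pinned so both targets present
            scsi_sn 0aa0000000           # the same device identity to VMware
        </backing-store>
        <backing-store rbd/vmware-lun1>
            lun 2
            scsi_id 35000000aa0000001
            scsi_sn 0aa0000001
        </backing-store>
    </target>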
Jake

On Wed, Jan 14, 2015 at 8:41 AM, Nick Fisk <n...@fisk.me.uk> wrote:

> Hi Jake,
>
> I can't remember the exact details, but it was something to do with a
> potential problem when using the pacemaker resource agents. I think it
> was to do with a potential hanging issue when one LUN on a shared
> target failed and it then tried to kill all the other LUNs to fail the
> target over to another host. This then leaves the TCM part of LIO
> locking the RBD, which also can't fail over.
>
> That said, I did try multiple LUNs on one target as a test and didn't
> experience any problems.
>
> I'm interested in the way you have your setup configured, though. Are
> you saying you effectively have an active/active configuration with a
> path going to either host, or are you failing the iSCSI IP between
> hosts? If it's the former, have you had any problems with SCSI
> locking/reservations, etc. between the two targets?
>
> I can see the advantage of that configuration, as you reduce/eliminate
> a lot of the trouble I have had with resources failing over.
>
> Nick
>
> From: Jake Young [mailto:jak3...@gmail.com]
> Sent: 14 January 2015 12:50
> To: Nick Fisk
> Cc: Giuseppe Civitella; ceph-users
> Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone?
>
> Nick,
>
> Where did you read that having more than 1 LUN per target causes
> stability problems?
>
> I am running 4 LUNs per target.
>
> For HA I'm running two Linux iSCSI target servers that map the same 4
> rbd images. The two targets have the same serial numbers, T10 address,
> etc. I copy the primary's config to the backup and change the IPs.
> This way VMware thinks they are different target IPs on the same host.
> This has worked very well for me.
>
> One suggestion I have is to try using rbd-enabled tgt. The performance
> is equivalent to LIO, but I found it is much better at recovering from
> a cluster outage. I've had LIO lock up the kernel or simply not
> recognize that the rbd images are available, whereas tgt will
> eventually present the rbd images again.
>
> I have been slowly adding servers and am expanding my test setup to a
> production setup (nice thing about Ceph). I now have 6 OSD hosts with
> 7 disks on each. I'm using the LSI Nytro cache RAID controller, so I
> don't have a separate journal, and I have 40Gb networking. I plan to
> add another 6 OSD hosts in another rack in the next 6 months (and then
> another 6 next year). I'm doing 3x replication, so I want to end up
> with 3 racks.
>
> Jake
>
> On Wednesday, January 14, 2015, Nick Fisk <n...@fisk.me.uk> wrote:
>
> Hi Giuseppe,
>
> I am working on something very similar at the moment. I currently have
> it working on some test hardware, and it seems to be working
> reasonably well.
>
> I say reasonably as I have had a few instabilities, but these are on
> the HA side; the LIO and RBD side of things has been rock solid so
> far. The main problems I have had seem to be around recovering from
> failure, with resources ending up in an unmanaged state. I'm not
> currently using fencing, so this may be part of the cause.
>
> As a brief description of my configuration:
>
> 4 hosts, each having 2 OSDs, also running the monitor role
> 3 additional hosts in a HA cluster which act as iSCSI proxy nodes
>
> I'm using the IP, RBD, iSCSITarget and iSCSILUN resource agents to
> provide a HA iSCSI LUN which maps back to an RBD. All the agents for
> each RBD are in a group so they follow each other between hosts.
>
> I'm using 1 LUN per target, as I read somewhere there are stability
> problems using more than 1 LUN per target.
>
> Performance seems OK; I can get about 1.2k random IOPS out of the
> iSCSI LUN. This seems about right for the Ceph cluster size, so I
> don't think the LIO part is causing any significant overhead.
>
> We should be getting our production hardware shortly, which will have
> 40 OSDs with journals and an SSD caching tier, so within the next
> month or so I will have a better idea of running it in a production
> environment and of the performance of the system.
>
> Hope that helps; if you have any questions, please let me know.
>
> Nick
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Giuseppe Civitella
> Sent: 13 January 2015 11:23
> To: ceph-users
> Subject: [ceph-users] Ceph, LIO, VMWARE anyone?
>
> Hi all,
>
> I'm working on a lab setup regarding Ceph serving rbd images as iSCSI
> datastores to VMware via a LIO box. Is there someone who already did
> something similar and wants to share some knowledge? Any production
> deployments? What about LIO's HA and LUN performance?
>
> Thanks
> Giuseppe
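PS, for anyone trying to reproduce the pacemaker layout Nick describes above: the per-RBD agent group would look roughly like this in crm shell syntax. The names, IPs, IQN and image below are made up; the LUN agent's actual name in resource-agents is iSCSILogicalUnit; and the RBD agent's parameter names may differ depending on which version of Ceph's OCF script you have, so treat this as a sketch, not a working config:

    primitive p_rbd ocf:ceph:rbd \
        params user=admin pool=rbd name=disk1 cephconf=/etc/ceph/ceph.conf
    primitive p_ip ocf:heartbeat:IPaddr2 \
        params ip=192.168.0.50 cidr_netmask=24
    primitive p_target ocf:heartbeat:iSCSITarget \
        params implementation=lio iqn=iqn.2015-01.com.example:disk1
    primitive p_lun ocf:heartbeat:iSCSILogicalUnit \
        params target_iqn=iqn.2015-01.com.example:disk1 lun=1 \
            path=/dev/rbd/rbd/disk1
    # one group per RBD, so the image map, IP, target and LUN
    # always fail over between hosts together, in that order
    group g_iscsi_disk1 p_rbd p_ip p_target p_lun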
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com