Ahh, so in this case only SUSE Enterprise Storage is able to provide iSCSI
connections for MS clusters when HA is required, be it Active/Standby,
Active/Active, or Active/Failover.
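To make the APTPL idea discussed further down concrete, here is a toy sketch
(plain Python, not LIO code; the file name and record layout are made up for
illustration): PR registrations get written to shared storage so a standby
gateway can reload the same state after a target failover.

```python
# Toy sketch of APTPL-style persistence (NOT LIO's actual on-disk format):
# registrations are saved to a file on shared storage by the active gateway
# and reloaded by the standby gateway after failover.
import json
import os
import tempfile

def save_registrations(path, regs):
    """Active gateway persists initiator PR registrations."""
    with open(path, "w") as f:
        json.dump(regs, f)

def load_registrations(path):
    """Standby gateway recovers PR state after taking over."""
    with open(path) as f:
        return json.load(f)

shared_dir = tempfile.mkdtemp()            # stands in for a shared filesystem
pr_file = os.path.join(shared_dir, "aptpl_wwn0")   # illustrative name

# Gateway 1 records an initiator's registration key, then "fails".
save_registrations(pr_file, {"iqn.1998-01.com.vmware:esx1": "0x1234"})

# Gateway 2, mounting the same shared storage, recovers the PR state.
regs = load_registrations(pr_file)
print(regs["iqn.1998-01.com.vmware:esx1"])   # prints 0x1234
```

The point being: this only helps if every gateway can see the same file, which
is why the shared filesystem (and the initiator actually requesting APTPL) is
the crux below.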
On Wed, Oct 11, 2017 at 2:03 PM, Jason Dillaman <jdill...@redhat.com> wrote:
> On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard
> <samuel.soul...@gmail.com> wrote:
> > Hmmm, if you fail over the identity of the LIO configuration, including
> > PGRs (I believe they are files on disk), this would work, no? Using two
> > iSCSI gateways which have shared storage to store the LIO configuration
> > and PGR data.
>
> Are you referring to the Active Persist Through Power Loss (APTPL)
> support in LIO, where it writes the PR metadata to
> "/var/target/pr/aptpl_<wwn>"? I suppose that would work for a
> Pacemaker failover if you had a shared file system mounted between all
> your gateways *and* the initiator requests APTPL mode(?).
>
> > Also, when you said "fails over to another port", did you mean a port
> > on another iSCSI gateway? I believe LIO with multiple target portal IPs
> > on the same node for path redundancy works with PGRs.
>
> Yes, I was referring to the case with multiple active iSCSI gateways,
> which doesn't currently distribute PGRs to all gateways in the group.
>
> > In my scenario, if my assumptions are correct, you would only have one
> > iSCSI gateway available through two target portal IPs (for data path
> > redundancy). If this first iSCSI gateway fails, both target portal IPs
> > fail over to the standby node with the PGR data that is available on
> > shared storage.
> >
> > Sam
> >
> > On Wed, Oct 11, 2017 at 12:52 PM, Jason Dillaman <jdill...@redhat.com>
> > wrote:
> >>
> >> On Wed, Oct 11, 2017 at 12:31 PM, Samuel Soulard
> >> <samuel.soul...@gmail.com> wrote:
> >> > Hi to all,
> >> >
> >> > What if you're using an iSCSI gateway based on LIO and krbd (that is,
> >> > an RBD block device mapped on the iSCSI gateway and published through
> >> > LIO)? The LIO target portal (virtual IP) would fail over to another
> >> > node. This would theoretically provide support for PGRs, since LIO
> >> > does support SPC-3.
> >> > Granted, it is not distributed and is limited to a single node's
> >> > throughput, but this would achieve the high availability required by
> >> > some environments.
> >>
> >> Yes, LIO technically supports PGR, but it's not distributed to other
> >> nodes. If you have a Pacemaker-initiated target failover to another
> >> node, the PGR state would be lost / missing after migration (unless I
> >> am missing something, like a resource agent that attempts to preserve
> >> the PGRs). For initiator-initiated failover (e.g. a target is alive
> >> but the initiator cannot reach it), after it fails over to another
> >> port the PGR data won't be available.
> >>
> >> > Of course, multiple target portals would be awesome, since the
> >> > available throughput would be able to scale linearly, but since this
> >> > isn't here right now, this would provide at least an alternative.
> >>
> >> It would definitely be great to go active/active, but there are
> >> concerns about data-corrupting edge conditions when using MPIO, since
> >> it relies on client-side failure timers that are not coordinated with
> >> the target.
> >>
> >> For example, if an initiator writes to sector X down path A and there
> >> is a delay to the path A target (i.e. the target and initiator timeout
> >> timers are not in sync), and MPIO fails over to path B, quickly
> >> performs the write to sector X, and then performs a second write to
> >> sector X, there is a possibility that path A will eventually unblock
> >> and overwrite the new value in sector X with the old value. The safe
> >> way to handle that would require setting the initiator-side IO
> >> timeouts to such high values as to cause higher-level subsystems to
> >> mark the MPIO path as failed should a failure actually occur.
> >>
> >> The iSCSI MC/S (multiple connections per session) protocol would
> >> address these concerns, since in theory path B could discover that the
> >> retried IO was actually a retry, but alas it's not available in either
> >> the Linux Open-iSCSI or the ESX iSCSI initiators.
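The failover race Jason describes can be sketched in a few lines (a toy
simulation in plain Python, not Ceph/LIO code; all names are illustrative).
The stale request stuck on path A is delivered last and silently reverts the
sector to its old value:

```python
# Toy simulation of the MPIO failover race: a write delayed on path A lands
# after newer writes have already completed via path B.

backing_store = {}  # sector -> value; stands in for the shared target device

def deliver_write(sector, value):
    """A path delivering a write to the shared backing store."""
    backing_store[sector] = value

# 1. Initiator issues a write of "v1" to sector X via path A; the request
#    is delayed somewhere between initiator and target.
delayed_path_a_write = ("X", "v1")

# 2. The initiator-side MPIO timer (uncoordinated with the target) expires,
#    MPIO fails over to path B, retries the write, then issues a newer one.
deliver_write("X", "v1")  # retry of the original write via path B
deliver_write("X", "v2")  # subsequent, newer write via path B

# 3. Path A finally unblocks and its stale request is delivered last.
deliver_write(*delayed_path_a_write)

print(backing_store["X"])  # prints "v1" -- the old value clobbered the new
```

Nothing in the protocol tells path B that the late arrival on path A was a
retry of an already-completed write, which is exactly the gap MC/S would
close.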
> >>
> >> > On Wed, Oct 11, 2017 at 12:26 PM, David Disseldorp <dd...@suse.de>
> >> > wrote:
> >> >>
> >> >> Hi Jason,
> >> >>
> >> >> Thanks for the detailed write-up...
> >> >>
> >> >> On Wed, 11 Oct 2017 08:57:46 -0400, Jason Dillaman wrote:
> >> >>
> >> >> > On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López
> >> >> > <jorp...@unizar.es> wrote:
> >> >> >
> >> >> > > As far as I am able to understand, there are 2 ways of setting
> >> >> > > up iSCSI for Ceph:
> >> >> > >
> >> >> > > 1- using the kernel (lrbd), only available on SUSE, CentOS,
> >> >> > > Fedora...
> >> >> >
> >> >> > The target_core_rbd approach is only utilized by SUSE (and its
> >> >> > derivatives like PetaSAN) as far as I know. This was the initial
> >> >> > approach for Red Hat-derived kernels as well, until the upstream
> >> >> > kernel maintainers indicated that they really do not want a
> >> >> > specialized target backend just for krbd. The next attempt was to
> >> >> > re-use the existing target_core_iblock to interface with krbd via
> >> >> > the kernel's block layer, but that hit similar upstream walls
> >> >> > trying to get support for SCSI command passthrough to the block
> >> >> > layer.
> >> >> >
> >> >> > > 2- using userspace (tcmu, ceph-iscsi-config, ceph-iscsi-cli)
> >> >> >
> >> >> > The TCMU approach is what upstream and Red Hat-derived kernels
> >> >> > will support going forward.
> >> >>
> >> >> SUSE is also in the process of migrating to the upstream tcmu
> >> >> approach, for the reasons that you gave in (1).
> >> >>
> >> >> ...
> >> >>
> >> >> > The TCMU approach also does not currently support SCSI persistent
> >> >> > reservation groups (needed for Windows clustering) because that
> >> >> > support isn't available in the upstream kernel.
> >> >> > The SUSE kernel has an approach that utilizes two round trips to
> >> >> > the OSDs for each IO to simulate PGR support. Earlier this summer
> >> >> > I believe SUSE started to look into how to get generic PGR
> >> >> > support merged into the upstream kernel using corosync/dlm to
> >> >> > synchronize the state between multiple nodes in the target. I am
> >> >> > not sure of the current state of that work, but it would benefit
> >> >> > all LIO targets when complete.
> >> >>
> >> >> Zhu Lingshan (cc'ed) worked on a prototype for tcmu PR support.
> >> >> IIUC, whether DLM or the underlying Ceph cluster gets used for PR
> >> >> state storage is still under consideration.
> >> >>
> >> >> Cheers, David
> >> >> _______________________________________________
> >> >> ceph-users mailing list
> >> >> ceph-users@lists.ceph.com
> >> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >> >
> >>
> >>
> >> --
> >> Jason
> >
> >
> > --
> > Jason
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com