Ahh, so in this case only SUSE Enterprise Storage is able to provide iSCSI
connections for MS clusters when HA is required, be it Active/Standby,
Active/Active, or Active/Failover.
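To make the APTPL idea discussed further down concrete, here is a toy sketch
(plain Python, not LIO code; the file name and record layout are made up for
illustration): PR registrations get written to shared storage so a standby
gateway can reload the same state after a target failover.

```python
# Toy sketch of APTPL-style persistence (NOT LIO's actual on-disk format):
# registrations are saved to a file on shared storage by the active gateway
# and reloaded by the standby gateway after failover.
import json
import os
import tempfile

def save_registrations(path, regs):
    """Active gateway persists initiator PR registrations."""
    with open(path, "w") as f:
        json.dump(regs, f)

def load_registrations(path):
    """Standby gateway recovers PR state after taking over."""
    with open(path) as f:
        return json.load(f)

shared_dir = tempfile.mkdtemp()            # stands in for a shared filesystem
pr_file = os.path.join(shared_dir, "aptpl_wwn0")   # illustrative name

# Gateway 1 records an initiator's registration key, then "fails".
save_registrations(pr_file, {"iqn.1998-01.com.vmware:esx1": "0x1234"})

# Gateway 2, mounting the same shared storage, recovers the PR state.
regs = load_registrations(pr_file)
print(regs["iqn.1998-01.com.vmware:esx1"])   # prints 0x1234
```

The point being: this only helps if every gateway can see the same file, which
is why the shared filesystem (and the initiator actually requesting APTPL) is
the crux below.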
On Wed, Oct 11, 2017 at 2:03 PM, Jason Dillaman <jdill...@redhat.com> wrote:
> On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard
> <samuel.soul...@gmail.com> wrote:
> > Hmmm, if you fail over the identity of the LIO configuration, including
> > PGRs (I believe they are files on disk), this would work, no? Using two
> > iSCSI gateways which have shared storage to store the LIO configuration
> > and PGR data.
>
> Are you referring to the Active Persist Through Power Loss (APTPL)
> support in LIO, where it writes the PR metadata to
> "/var/target/pr/aptpl_<wwn>"? I suppose that would work for a
> Pacemaker failover if you had a shared file system mounted between all
> your gateways *and* the initiator requests APTPL mode(?).
>
> > Also, when you said "fails over to another port", did you mean a port
> > on another iSCSI gateway? I believe LIO with multiple target portal IPs
> > on the same node for path redundancy works with PGRs.
>
> Yes, I was referring to the case with multiple active iSCSI gateways,
> which doesn't currently distribute PGRs to all gateways in the group.
>
> > In my scenario, if my assumptions are correct, you would only have one
> > iSCSI gateway available through two target portal IPs (for data path
> > redundancy). If this first iSCSI gateway fails, both target portal IPs
> > fail over to the standby node with the PGR data that is available on
> > shared storage.
> >
> > Sam
> >
> > On Wed, Oct 11, 2017 at 12:52 PM, Jason Dillaman <jdill...@redhat.com>
> > wrote:
> >>
> >> On Wed, Oct 11, 2017 at 12:31 PM, Samuel Soulard
> >> <samuel.soul...@gmail.com> wrote:
> >> > Hi to all,
> >> >
> >> > What if you're using an iSCSI gateway based on LIO and krbd (that is,
> >> > an RBD block device mapped on the iSCSI gateway and published through
> >> > LIO)? The LIO target portal (virtual IP) would fail over to another
> >> > node. This would theoretically provide support for PGRs, since LIO
> >> > does support SPC-3.
> >> > Granted, it is not distributed and is limited to a single node's
> >> > throughput, but this would achieve the high availability required by
> >> > some environments.
> >>
> >> Yes, LIO technically supports PGR, but it's not distributed to other
> >> nodes. If you have a Pacemaker-initiated target failover to another
> >> node, the PGR state would be lost / missing after migration (unless I
> >> am missing something, like a resource agent that attempts to preserve
> >> the PGRs). For initiator-initiated failover (e.g. a target is alive
> >> but the initiator cannot reach it), after it fails over to another
> >> port the PGR data won't be available.
> >>
> >> > Of course, multiple target portals would be awesome, since the
> >> > available throughput would be able to scale linearly, but since this
> >> > isn't here right now, this would provide at least an alternative.
> >>
> >> It would definitely be great to go active/active, but there are
> >> concerns about data-corrupting edge conditions when using MPIO, since
> >> it relies on client-side failure timers that are not coordinated with
> >> the target.
> >>
> >> For example, if an initiator writes to sector X down path A and there
> >> is a delay to the path A target (i.e. the target and initiator timeout
> >> timers are not in sync), and MPIO fails over to path B, quickly
> >> performs the write to sector X, and then performs a second write to
> >> sector X, there is a possibility that path A will eventually unblock
> >> and overwrite the new value in sector X with the old value. The safe
> >> way to handle that would require setting the initiator-side IO
> >> timeouts to such high values as to cause higher-level subsystems to
> >> mark the MPIO path as failed should a failure actually occur.
> >>
> >> The iSCSI MC/S (multiple connections per session) protocol would
> >> address these concerns, since in theory path B could discover that the
> >> retried IO was actually a retry, but alas it's not available in either
> >> the Linux Open-iSCSI or the ESX iSCSI initiators.
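The failover race Jason describes can be sketched in a few lines (a toy
simulation in plain Python, not Ceph/LIO code; all names are illustrative).
The stale request stuck on path A is delivered last and silently reverts the
sector to its old value:

```python
# Toy simulation of the MPIO failover race: a write delayed on path A lands
# after newer writes have already completed via path B.

backing_store = {}  # sector -> value; stands in for the shared target device

def deliver_write(sector, value):
    """A path delivering a write to the shared backing store."""
    backing_store[sector] = value

# 1. Initiator issues a write of "v1" to sector X via path A; the request
#    is delayed somewhere between initiator and target.
delayed_path_a_write = ("X", "v1")

# 2. The initiator-side MPIO timer (uncoordinated with the target) expires,
#    MPIO fails over to path B, retries the write, then issues a newer one.
deliver_write("X", "v1")  # retry of the original write via path B
deliver_write("X", "v2")  # subsequent, newer write via path B

# 3. Path A finally unblocks and its stale request is delivered last.
deliver_write(*delayed_path_a_write)

print(backing_store["X"])  # prints "v1" -- the old value clobbered the new
```

Nothing in the protocol tells path B that the late arrival on path A was a
retry of an already-completed write, which is exactly the gap MC/S would
close.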
> >>
> >> > On Wed, Oct 11, 2017 at 12:26 PM, David Disseldorp <dd...@suse.de>
> >> > wrote:
> >> >>
> >> >> Hi Jason,
> >> >>
> >> >> Thanks for the detailed write-up...
> >> >>
> >> >> On Wed, 11 Oct 2017 08:57:46 -0400, Jason Dillaman wrote:
> >> >>
> >> >> > On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López
> >> >> > <jorp...@unizar.es> wrote:
> >> >> >
> >> >> > > As far as I am able to understand, there are 2 ways of setting
> >> >> > > up iSCSI for Ceph:
> >> >> > >
> >> >> > > 1- using the kernel (lrbd), only available on SUSE, CentOS,
> >> >> > > Fedora...
> >> >> >
> >> >> > The target_core_rbd approach is only utilized by SUSE (and its
> >> >> > derivatives like PetaSAN) as far as I know. This was the initial
> >> >> > approach for Red Hat-derived kernels as well, until the upstream
> >> >> > kernel maintainers indicated that they really do not want a
> >> >> > specialized target backend just for krbd. The next attempt was to
> >> >> > re-use the existing target_core_iblock to interface with krbd via
> >> >> > the kernel's block layer, but that hit similar upstream walls
> >> >> > trying to get support for SCSI command passthrough to the block
> >> >> > layer.
> >> >> >
> >> >> > > 2- using userspace (tcmu, ceph-iscsi-config, ceph-iscsi-cli)
> >> >> >
> >> >> > The TCMU approach is what upstream and Red Hat-derived kernels
> >> >> > will support going forward.
> >> >>
> >> >> SUSE is also in the process of migrating to the upstream tcmu
> >> >> approach, for the reasons that you gave in (1).
> >> >>
> >> >> ...
> >> >>
> >> >> > The TCMU approach also does not currently support SCSI persistent
> >> >> > reservation groups (needed for Windows clustering) because that
> >> >> > support isn't available in the upstream kernel.
> >> >> > The SUSE kernel has an approach that utilizes two round trips to
> >> >> > the OSDs for each IO to simulate PGR support. Earlier this summer
> >> >> > I believe SUSE started to look into how to get generic PGR
> >> >> > support merged into the upstream kernel using corosync/dlm to
> >> >> > synchronize the state between multiple nodes in the target. I am
> >> >> > not sure of the current state of that work, but it would benefit
> >> >> > all LIO targets when complete.
> >> >>
> >> >> Zhu Lingshan (cc'ed) worked on a prototype for tcmu PR support.
> >> >> IIUC, whether DLM or the underlying Ceph cluster gets used for PR
> >> >> state storage is still under consideration.
> >> >>
> >> >> Cheers, David
> >> >> _______________________________________________
> >> >> ceph-users mailing list
> >> >> ceph-users@lists.ceph.com
> >> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >> >
> >>
> >>
> >> --
> >> Jason
> >
> >
> > --
> > Jason
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com