Re: [ceph-users] cluster network down
On 10/1/19 8:20 AM, Lars Täuber wrote:
> Mon, 30 Sep 2019 15:21:18 +0200
> Janne Johansson ==> Lars Täuber :
>>> I don't remember where I read it, but it was told that the cluster is
>>> migrating its complete traffic over to the public network when the cluster
>>> networks goes down. So this seems not to be the case?
>>
>> Be careful with generalizations like "when a network acts up, it will be
>> completely down and noticeably unreachable for all parts", since networks
>> can break in thousands of not-very-obvious ways which are not 0%-vs-100%
>> but somewhere in between.
>
> Ok. I ask my question in a new way.
> What does ceph do, when I switch off all switches of the cluster network?

-> confused cluster, OSDs down

> Does ceph handle this silently without interruption? Does the heartbeat
> systems use the public network as a failover automatically?

No and no. The OSDs use the cluster network for replication and for
heartbeats between OSDs [1]. If the heartbeats fail, the OSDs report each
other to the mon as down. As far as I know there is no failover to the
public network.

Michel.

[1] https://docs.ceph.com/docs/mimic/rados/configuration/network-config-ref/#cluster-network
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
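The "confused cluster" outcome described in this reply can be sketched with a toy Python model. This is not Ceph code: it assumes, as the reply states, that each OSD heartbeats its peers over the cluster network and reports every peer it cannot reach. The function name `failure_reports`, the OSD names and the link set are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def failure_reports(osds, working_links):
    """Return (reporter, target) pairs a toy monitor would receive.

    working_links is a set of frozenset({a, b}) pairs that can still
    exchange heartbeats over the cluster network (hypothetical model).
    """
    return [
        (reporter, target)
        for reporter, target in permutations(osds, 2)
        if frozenset((reporter, target)) not in working_links
    ]


osds = ["osd.0", "osd.1", "osd.2"]
all_links = {frozenset(pair) for pair in permutations(osds, 2)}

# Cluster network completely off: every OSD accuses every other OSD,
# although all of them are still running.
total_outage = failure_reports(osds, working_links=set())
accusations = Counter(target for _, target in total_outage)

# Partial failure: only the path between osd.0 and osd.1 is broken.
partial = failure_reports(osds, all_links - {frozenset(("osd.0", "osd.1"))})

print(len(total_outage), dict(accusations))
print(partial)
```

In the total outage every OSD is accused by all of its peers at once, so the reports give the monitor no way to tell a dead daemon from a dead network. In the partial case only the two affected OSDs accuse each other, which is the "somewhere in between" situation Janne warns about.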
Re: [ceph-users] cluster network down
On 1 October 2019 08:20:08 CEST, "Lars Täuber" wrote:
>Mon, 30 Sep 2019 15:21:18 +0200
>Janne Johansson ==> Lars Täuber :
>> > I don't remember where I read it, but it was told that the cluster is
>> > migrating its complete traffic over to the public network when the
>> > cluster networks goes down. So this seems not to be the case?
>>
>> Be careful with generalizations like "when a network acts up, it will be
>> completely down and noticeably unreachable for all parts", since networks
>> can break in thousands of not-very-obvious ways which are not 0%-vs-100%
>> but somewhere in between.
>
>Ok. I ask my question in a new way.
>What does ceph do, when I switch off all switches of the cluster network?
>Does ceph handle this silently without interruption? Does the heartbeat
>systems use the public network as a failover automatically?

No, you will be in big trouble if this happens, because the cluster no
longer knows the status of your OSDs, which it needs in order to serve
your client requests. There is no redundant ring to fall back on if your
cluster network fails. To avoid this you could use LACP.

hth
Mehmet

>
>Thanks
>Lars
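For the LACP suggestion, a small Python sketch can check whether a Linux bond is actually running in 802.3ad mode and which member links are up. It assumes the host uses the Linux bonding driver and parses text in the style of /proc/net/bonding/<bond>; the function name `bond_summary`, the interface names and the abbreviated sample text are illustrative, not taken from a real host.

```python
def bond_summary(text):
    """Summarise bonding status text in the /proc/net/bonding/<bond> style."""
    mode = None
    links = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and section titles carry no key/value pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # Only count the status lines that follow a member interface;
            # the bond-level status appears before the first member.
            links[current] = value
    return {"lacp": mode is not None and "802.3ad" in mode, "links": links}


# Abbreviated, made-up sample; a real file contains many more fields.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""

summary = bond_summary(SAMPLE)
print(summary)
```

On a real host the text would come from something like `open("/proc/net/bonding/bond0").read()`, where `bond0` is whatever the bond is called locally. Note that a bond only helps against the loss of a single link or switch, and spreading it over two switches needs support on the switch side; it does not cover the case of all cluster switches being off.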
Re: [ceph-users] cluster network down
Mon, 30 Sep 2019 15:21:18 +0200
Janne Johansson ==> Lars Täuber :
> > I don't remember where I read it, but it was told that the cluster is
> > migrating its complete traffic over to the public network when the cluster
> > networks goes down. So this seems not to be the case?
>
> Be careful with generalizations like "when a network acts up, it will be
> completely down and noticeably unreachable for all parts", since networks
> can break in thousands of not-very-obvious ways which are not 0%-vs-100%
> but somewhere in between.

Ok. Let me ask my question in a different way.
What does ceph do when I switch off all switches of the cluster network?
Does ceph handle this silently without interruption? Does the heartbeat
system automatically use the public network as a failover?

Thanks
Lars
Re: [ceph-users] cluster network down
> I don't remember where I read it, but it was told that the cluster is
> migrating its complete traffic over to the public network when the cluster
> networks goes down. So this seems not to be the case?

Be careful with generalizations like "when a network acts up, it will be
completely down and noticeably unreachable for all parts", since networks
can break in thousands of not-very-obvious ways which are not 0%-vs-100%
but somewhere in between.

--
May the most significant bit of your life be positive.
Re: [ceph-users] cluster network down
Mon, 30 Sep 2019 14:49:48 +0200
Burkhard Linke ==> ceph-users@lists.ceph.com :
> Hi,
>
> On 9/30/19 2:46 PM, Lars Täuber wrote:
> > Hi!
> >
> > What happens when the cluster network goes down completely?
> > Is the cluster silently using the public network without interruption, or
> > does the admin has to act?
>
> The cluster network is used for OSD heartbeats and backfilling/recovery
> traffic. If the heartbeats do not work anymore, the OSDs will start to
> report the other OSDs as down, resulting in a completely confused cluster...
>
> I would avoid an extra cluster network unless it is absolutely necessary.
>
> Regards,
> Burkhard

I don't remember where I read it, but it said that the cluster migrates
all of its traffic over to the public network when the cluster network
goes down. So this seems not to be the case?

Thanks
Lars
Re: [ceph-users] cluster network down
Hi,

On 9/30/19 2:46 PM, Lars Täuber wrote:
> Hi!
>
> What happens when the cluster network goes down completely?
> Is the cluster silently using the public network without interruption, or
> does the admin has to act?

The cluster network is used for OSD heartbeats and backfilling/recovery
traffic. If the heartbeats do not work anymore, the OSDs will start to
report the other OSDs as down, resulting in a completely confused cluster...

I would avoid an extra cluster network unless it is absolutely necessary.

Regards,
Burkhard
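Whether a cluster is exposed to this at all depends on whether a separate cluster network is configured. As a rough sketch, the Python below reads the `public network` and `cluster network` options from the `[global]` section of ceph.conf-style text. The function name `network_settings` and the subnets in the sample are made up, and Python's `configparser` is only an approximation of Ceph's own config parser, so an unusual real ceph.conf may not load this way.

```python
import configparser


def network_settings(conf_text):
    """Return the public/cluster network options found under [global]."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    found = {}
    if parser.has_section("global"):
        for key, value in parser.items("global"):
            # Ceph accepts spaces and underscores interchangeably in option names.
            normalized = key.replace(" ", "_")
            if normalized in ("public_network", "cluster_network"):
                found[normalized] = value
    return found


# Made-up example configuration with a dedicated cluster network.
SAMPLE_CONF = """\
[global]
public network = 192.168.1.0/24
cluster network = 10.0.0.0/24
"""

settings = network_settings(SAMPLE_CONF)
has_separate_cluster_network = "cluster_network" in settings
print(settings, has_separate_cluster_network)
```

If no cluster network is set, OSD replication and heartbeat traffic share the public network, which is the simpler setup recommended above.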
[ceph-users] cluster network down
Hi!

What happens when the cluster network goes down completely?
Is the cluster silently using the public network without interruption, or
does the admin have to act?

Thanks
Lars