[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-14 Thread Paul Emmerich
MTU issues due to the VPN connection? Paul On Wed, Aug 14, 2019 at 1:48 PM Lorenz Kiefner wrote: > > Dear ceph-users, > > I

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-14 Thread Lorenz Kiefner
Hi, this was the first thing I thought of (and yes, there had been some issues, but they are resolved, double-checked!). The MTU is consistent throughout the whole network, and pings of all sizes are handled well. And MTU problems probably wouldn't make a difference between backfills and normal o

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-14 Thread Wido den Hollander
On 8/14/19 5:46 PM, Lorenz Kiefner wrote: > Hi, > > this was the first thing I was thinking about (and yes, there had been > some issues, but they are resolved - double checked!). > > MTU is consistent throughout the whole net and pings in all sizes are > handled well. And MTU problems wouldn'

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-14 Thread Lorenz Kiefner
OK, then Ceph is probably not a good fit for me. I wanted to provide a backup platform for me, my family and my friends. Speed is not relevant, but long-term reliability is. So I'm depending on home internet connections and VPN. At the moment I'm mostly using WireGuard, but I could switch to OpenVPN to re

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-15 Thread Lorenz Kiefner
Oh no, it's not that bad. It's $ ping -s 65000 dest.inati.on on a VPN connection that has an MTU of 1300 via IPv6. So I suspect that I only get an answer when all 51 fragments are fully returned. It's clear that big packets with lots of fragments are more affected by packet loss than 64-byte ping
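
The fragment reasoning above can be checked with a little arithmetic. The following is a minimal sketch, not a measurement: it assumes standard IPv6 fragmentation (40-byte IPv6 header, 8-byte fragment header, fragment data in multiples of 8 bytes), an 8-byte ICMPv6 echo header, and independent per-packet loss, which real links rarely provide. The function names `fragment_count` and `reply_probability` are made up for illustration. Under these assumptions the count for a 65000-byte ping at MTU 1300 comes out in the low fifties, in line with the estimate in the message.

```python
import math

IPV6_HEADER = 40      # fixed IPv6 header size in bytes
FRAGMENT_HEADER = 8   # IPv6 fragment extension header size in bytes
ICMPV6_HEADER = 8     # ICMPv6 echo request/reply header size in bytes


def fragment_count(ping_payload: int, mtu: int) -> int:
    """Number of IPv6 fragments needed for one echo packet (assumed model)."""
    # Data carried per fragment, rounded down to a multiple of 8 bytes.
    per_fragment = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    total = ICMPV6_HEADER + ping_payload
    return math.ceil(total / per_fragment)


def reply_probability(fragments: int, loss: float) -> float:
    """Chance that every fragment survives both request and reply.

    Assumes each packet is lost independently with probability `loss`.
    """
    return (1.0 - loss) ** (2 * fragments)


if __name__ == "__main__":
    n = fragment_count(65000, 1300)
    print(f"fragments per direction: {n}")
    for loss in (0.001, 0.01, 0.05):
        small = reply_probability(1, loss)
        large = reply_probability(n, loss)
        print(f"loss {loss:.1%}: small ping {small:.3f}, large ping {large:.3f}")
```

Running the script prints the reply probability for an unfragmented ping next to that of the fragmented one at a few hypothetical loss rates, which shows how quickly a modest loss rate turns into mostly unanswered large pings.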

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-16 Thread Robert LeBlanc
Personally, I would not try to create a Ceph cluster across consumer Internet links; their upload speed is usually so slow, and Ceph is so chatty, that it would make for a horrible experience. If you are looking for a backup solution, then I would look at some sort of n-way rsync solution, or bt

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-17 Thread Lorenz Kiefner
Hello again, all links are at least 10/50 Mbit/s upstream/downstream, mostly 40/100 Mbit/s, with some VMs at hosting companies running at 1/1 Gbit/s. All my 39 OSDs on 17 hosts in 11 locations (5 of them are connected at the moment by consumer internet links) are nearly in a full mesh network consisting

[ceph-users] Re: Failing heartbeats when no backfill is running

2019-08-19 Thread Robert LeBlanc
The only other thing I can think of is that a firewall is dropping idle connections, although Ceph should be sending heartbeats more often than the 5-minute idle timeout common to most firewalls. Do the logs show the monitor marking the OSDs out, or the OSD peers? That would give you an idea where to look