Re: Cassandra repair process in Low Bandwidth Network

Jeff Jirsa Fri, 15 Sep 2017 10:27:28 -0700

Hi Kishore,

Just to make sure we're all on the same page, I presume you're doing full
repairs using something like 'nodetool repair -pr', which repairs all data
for a given token range across all of your hosts in all of your dcs. Is
that a correct assumption to start?


In addition to throttling inter-dc stream throughput (which you should be
able to set quite low - perhaps as low as 20 Mbps), you may also want to
consider smaller ranges (using a concept we call subrange repair, where
instead of using -pr, you pass -st and -et - which is what tools like
http://cassandra-reaper.io/ do ) - this will keep streams smaller (in terms
of total bytes transferred per streaming session, though you'll have more
sessions). Finally, you can use the -hosts and -dc options to limit repair so
that sessions don't always hit all 3 dcs - for example, you could do a
repair between DC1 and DC2 using -dc, then do a repair of DC1 and DC3 using
-dc - it's a lot more coordination required, but likely helps cut down on
the traffic over your VPN link.
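
To make the subrange idea concrete, here is a rough Python sketch that cuts
one token range into equal pieces and prints a 'nodetool repair -st ... -et ...'
command for each piece. The function names, the keyspace name "my_keyspace"
and the example range 0..1000 are made up for illustration; real start and
end tokens would come from your own ring (e.g. 'nodetool ring'), and this
sketch does not handle a range that wraps around the end of the ring. It is
not how Reaper itself is implemented, just the same general idea.

```python
def split_token_range(start, end, pieces):
    """Split the token range (start, end] into `pieces` contiguous subranges.

    Returns a list of (start_token, end_token) tuples that together cover
    exactly the original range.
    """
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    if start >= end:
        # A range whose start is >= its end wraps around the ring;
        # handling that case is left out of this sketch.
        raise ValueError("wrap-around ranges are not handled in this sketch")
    width = end - start
    if pieces > width:
        raise ValueError("more pieces than tokens in the range")
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // pieces for i in range(pieces + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, start, end, pieces):
    """Build one subrange repair command line per piece (hypothetical helper)."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_token_range(start, end, pieces)
    ]


if __name__ == "__main__":
    # Hypothetical keyspace and token range, for illustration only.
    for cmd in repair_commands("my_keyspace", 0, 1000, 4):
        print(cmd)
```

More pieces means less data per streaming session, at the cost of more
sessions to run and keep track of.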



On Fri, Sep 15, 2017 at 9:09 AM, Mohapatra, Kishore <
kishore.mohapa...@nuance.com> wrote:

> Hi,
>        we have a cassandra cluster with 7 nodes each in 3 datacenters. We
> are using C* 2.1.15.4 version.
> Network bandwidth between DC1 and DC2 is very good (10Gbit/s) and a
> dedicated one. However network pipe between DC1 and DC3 and between DC2 and
> DC3 is very poor and has only 100 MBit/s and also goes thru VPN network.
> Each node contains about 100 Gb of data and has a RF of 3. Whenever we run
> the repair, it fails with streaming errors and never completes. I have
> already tried the streaming timeout parameter to a very high value. But it
> did not help. I could repair either just in the local dc or just the first
> two DCs. Can not repair DC3 when i combine with the other two DCs.
>
> So how can i successfully repair the keyspace in these kind of
> environments ?
>
> I see that there is a parameter to throttle the inter-dc stream thruput,
> which default to 200 MBit/s. So what is the minimum threshold that i could
> set it to without affecting the cluster ?
>
> Is there any other way to work in these kind of environments ?
> I will appreciate your feedback and help on this.
>
>
>
>
>
> Thanks
>
>
>
> *Kishore Mohapatra*
>
> Principal Operations DBA
>
> Seattle, WA
>
> Email : kishore.mohapa...@nuance.com
>
>
>
>
>
