Hi Kishore,

Just to make sure we're all on the same page, I presume you're doing full repairs using something like 'nodetool repair -pr', which repairs all data for a given token range across all of your hosts in all of your DCs. Is that a correct assumption to start?
In addition to throttling inter-DC stream throughput (which you should be able to set quite low, perhaps as low as 20 Mbps), you may also want to consider repairing smaller ranges. This is a concept we call subrange repair: instead of using -pr, you pass -st and -et, which is what tools like http://cassandra-reaper.io/ do. This keeps streams smaller in terms of total bytes transferred per streaming session, though you'll have more sessions.

Finally, you can use the -hosts and -dc options to limit repair so that sessions don't always hit all 3 DCs. For example, you could do a repair between DC1 and DC2 using -dc, then do a repair of DC1 and DC3 using -dc. It requires a lot more coordination, but it likely helps cut down on the traffic over your VPN link.

On Fri, Sep 15, 2017 at 9:09 AM, Mohapatra, Kishore <kishore.mohapa...@nuance.com> wrote:

> Hi,
> We have a Cassandra cluster with 7 nodes each in 3 datacenters. We are
> using C* version 2.1.15.4.
> The network bandwidth between DC1 and DC2 is very good (10 Gbit/s) and
> dedicated. However, the network pipe between DC1 and DC3 and between DC2
> and DC3 is very poor: it has only 100 Mbit/s and also goes through a VPN.
> Each node contains about 100 GB of data and has an RF of 3. Whenever we
> run the repair, it fails with streaming errors and never completes. I have
> already tried setting the streaming timeout parameter to a very high
> value, but it did not help. I can repair either just the local DC or just
> the first two DCs, but I cannot repair DC3 when I combine it with the
> other two DCs.
>
> So how can I successfully repair the keyspace in this kind of
> environment?
>
> I see that there is a parameter to throttle the inter-DC stream
> throughput, which defaults to 200 Mbit/s. So what is the minimum threshold
> that I could set it to without affecting the cluster?
>
> Is there any other way to work in this kind of environment?
> I would appreciate your feedback and help on this.
>
> Thanks
>
> *Kishore Mohapatra*
> Principal Operations DBA
> Seattle, WA
> Email : kishore.mohapa...@nuance.com
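
To make the subrange and per-DC suggestion in the reply more concrete, here is a rough Python sketch that splits one token range into smaller pieces and prints a 'nodetool repair' command line for each piece, restricted to a pair of DCs. It only builds strings and does not run anything. The keyspace name (my_keyspace), the token values, and the number of pieces are all made up for illustration. It assumes the start and end tokens bound a range that the node you run the commands on actually replicates, that the node sits in one of the listed DCs, and that your nodetool version accepts -st, -et and -dc together, so please check 'nodetool help repair' on your version first.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into up to `parts` contiguous subranges."""
    if parts < 1 or start >= end:
        raise ValueError("need parts >= 1 and start < end")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    # Drop empty pieces, which can appear if parts exceeds the range width.
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]


def repair_commands(keyspace, dcs, start, end, parts):
    """Build one 'nodetool repair' command line per subrange, limited to `dcs`."""
    dc_args = " ".join(f"-dc {dc}" for dc in dcs)
    return [
        f"nodetool repair -st {st} -et {et} {dc_args} {keyspace}"
        for st, et in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    # Hypothetical values: one range owned by the local node, cut into 4 pieces,
    # repaired only between DC1 and DC3 (the slow VPN link).
    for cmd in repair_commands("my_keyspace", ["DC1", "DC3"], 0, 1_000_000, 4):
        print(cmd)
```

You would repeat this for each range a node owns, and again for each DC pair (DC1/DC2, then DC1/DC3, and so on). More pieces means less data per streaming session at the cost of more sessions to coordinate, which is the bookkeeping that Reaper automates.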