Cassandra repair process in Low Bandwidth Network

2017-09-15 Thread Mohapatra, Kishore
Hi,
   We have a Cassandra cluster with 7 nodes in each of 3 datacenters, running
C* version 2.1.15.4.
Network bandwidth between DC1 and DC2 is very good (10 Gbit/s) and dedicated.
However, the network pipe between DC1 and DC3 and between DC2 and DC3 is very
poor: it is only 100 Mbit/s and also goes through a VPN. Each node holds about
100 GB of data, and the keyspace has an RF of 3. Whenever we run repair, it
fails with streaming errors and never completes. I have already tried setting
the streaming timeout parameter to a very high value, but it did not help. I
can repair either just the local DC or just the first two DCs; I cannot repair
DC3 when I combine it with the other two DCs.

So how can I successfully repair the keyspace in this kind of environment?

I see that there is a parameter to throttle inter-DC stream throughput, which
defaults to 200 Mbit/s. What is the minimum value I could set it to without
affecting the cluster?

Is there any other way to work in this kind of environment?
I would appreciate your feedback and help on this.


Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com




RE: [EXTERNAL] Re: Cassandra repair process in Low Bandwidth Network

2017-09-15 Thread Mohapatra, Kishore
Hi Jeff,
  Thanks for your reply.
In fact, I have tried all of these options:

  1.  We use Cassandra Reaper for our repairs, which does subrange repair.
  2.  I have also developed a shell script that does exactly what Reaper does,
but which can control how many repair sessions run concurrently.
  3.  I also tried a full repair.
  4.  I tried running repair on two DCs at a time. Repair between DC1 and DC2
goes fine, but repair between DC1 and DC3 or between DC2 and DC3 fails.

So I will try setting inter-DC stream throughput to 20 Mbit/s and see how that
goes.
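
As a sketch of what I plan to run (assuming the 2.1 option and yaml key names;
please correct me if these are wrong):

    # set at runtime on each node (value in Mbit/s); reverts on restart
    nodetool setinterdcstreamthroughput 20

    # or persist it in cassandra.yaml and restart
    inter_dc_stream_throughput_outbound_megabits_per_sec: 20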

Is there anything else that could be done in this case?

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com


From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Friday, September 15, 2017 10:27 AM
To: cassandra 
Subject: [EXTERNAL] Re: Cassandra repair process in Low Bandwidth Network

Hi Kishore,

Just to make sure we're all on the same page, I presume you're doing full
repairs using something like 'nodetool repair -pr', which repairs all data for
a given token range across all of your hosts in all of your DCs. Is that a
correct assumption to start?

In addition to throttling inter-DC stream throughput (which you should be able
to set quite low - perhaps as low as 20 Mbps), you may also want to consider
smaller ranges (using a concept we call subrange repair, where instead of using
-pr, you pass -st and -et - which is what tools like http://cassandra-reaper.io/
do) - this will keep streams smaller (in terms of total bytes transferred per
streaming session, though you'll have more sessions). Finally, you can use the
-hosts and -dc options to limit repair so that sessions don't always hit all 3
DCs - for example, you could do a repair between DC1 and DC2 using -dc, then do
a repair of DC1 and DC3 using -dc - it's a lot more coordination required, but
it likely helps cut down on the traffic over your VPN link.
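
To make that concrete, a rough sketch (the keyspace name, DC names, and token
values are placeholders; double-check the flag spellings against 'nodetool
help repair' on your version):

    # subrange repair: only the span between -st and -et, instead of -pr
    nodetool repair -st 3074457345618258602 -et 3074457345618258702 my_keyspace

    # limit sessions to two DCs at a time to spare the VPN link
    nodetool repair -dc DC1 -dc DC2 my_keyspace
    nodetool repair -dc DC1 -dc DC3 my_keyspace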







RE: [EXTERNAL] Cassandra copy command is giving OverflowError

2017-09-22 Thread Mohapatra, Kishore
Try this - lowering PAGESIZE and raising PAGETIMEOUT makes each page cheaper to
fetch, which usually gets COPY past these timeouts:

COPY keyspace1.table1 TO '/tmp/table1.csv' WITH PAGETIMEOUT=40 AND PAGESIZE=20;

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com


From: AI Rumman [mailto:rumman...@gmail.com]
Sent: Thursday, September 21, 2017 11:24 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra copy command is giving OverflowError


Hi,

This is my first post here. I am copying my write-up from

https://stackoverflow.com/questions/46354677/cassandra-copy-command-is-giving-overflowerror

I am new to Cassandra and am running DataStax Enterprise (DSE) 4.8.14.

My cluster is:

3 nodes - Cassandra
3 nodes - Solr search by DataStax

Table:

CREATE TABLE keyspace1.table1 (
    id bigint,
    is_dir boolean,
    dir text,
    name text,
    created_date timestamp,
    size bigint,
    solr_query text,
    status text,
    PRIMARY KEY (id, is_dir, dir, name)
) WITH CLUSTERING ORDER BY (is_dir ASC, dir ASC, name ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class':
        'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression':
        'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE CUSTOM INDEX keyspace1_table1_created_date_index ON keyspace1.table1
    (created_date) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';

CREATE CUSTOM INDEX keyspace1_table1_size_index ON keyspace1.table1
    (size) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';

CREATE CUSTOM INDEX keyspace1_table1_solr_query_index ON keyspace1.table1
    (solr_query) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';

CREATE CUSTOM INDEX keyspace1_table1_status_index ON keyspace1.table1
    (status) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';

Command executed:

COPY keyspace1.table1 TO '/tmp/table1.csv';

When I executed the copy command, I got the following errors:

:10:Error for (None, -9207222088349382333): OverflowError - date value out of
range (will try again later attempt 1 of 5)
:10:Error for (-2699117314734179807, -2639913056120220671): NoHostAvailable -
('Unable to complete the operation against any hosts', {}) (permanently given
up after 95000 rows and 1 attempts)
:10:Error for (4414337294902011474, 4418434771303296337): NoHostAvailable -
('Unable to complete the operation against any hosts', {}) (will try again
later attempt 1 of 5)
:10:Error for (-835790340821162882, -820685939947495393): NoHostAvailable -
('Unable to complete the operation against any hosts', {}) (permanently given
up after 49000 rows and 1 attempts)

Can anyone please tell me what these errors mean?

Thanks.


Increasing VNodes

2017-10-04 Thread Mohapatra, Kishore
Hi,
We are having a lot of problems with the repair process. We use subrange
repair, but most of the time some ranges fail with a streaming error or some
other kind of error.
So I am wondering if it would help to increase the number of vnodes from 256
(the default) to 512. But increasing the vnodes would be a lot of effort, as it
involves wiping out the data and bootstrapping.
So is there any other way of splitting the ranges into smaller ranges?

We are using version 2.1.15.4 at the moment.

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com




RE: [EXTERNAL] Re: Increasing VNodes

2017-10-04 Thread Mohapatra, Kishore
Thanks a lot for all of your input. We are actually using Cassandra Reaper, but
it is just splitting the ranges into 256 per node.
I will certainly try splitting into smaller ranges by going through the
system.size_estimates table.

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com


From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Wednesday, October 04, 2017 10:27 AM
To: user 
Subject: [EXTERNAL] Re: Increasing VNodes

The site (with the docs) is probably more helpful to learn about how Reaper
works: http://cassandra-reaper.io/

On Oct 4, 2017, at 9:54 AM, Chris Lohfink <clohfin...@gmail.com> wrote:

Increasing the number of tokens will make repairs worse, not better. You can
just split the subranges into smaller chunks; you don't need vnodes to do that.
A simple approach is to iterate through each host's token ranges, split each by
N, and repair them (ie https://github.com/onzra/cassandra_range_repair). To be
more efficient, you can grab ranges and split based on the number of partitions
in the range (ie fetch system.size_estimates and walk that) so you don't split
empty or small ranges a ton unnecessarily, and because not all tables have some
fixed N that is efficient.
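
A minimal sketch of that walk (keyspace/table names are placeholders; column
names as in 2.1's system.size_estimates):

    # list this node's estimated subranges and their partition counts
    cqlsh -e "SELECT range_start, range_end, partitions_count
              FROM system.size_estimates
              WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_table';"

    # then repair each span, splitting any range whose count is large, e.g.:
    nodetool repair -st 3074457345618258602 -et 3074457345618258702 my_keyspace my_table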

Using TLP's Reaper https://github.com/thelastpickle/cassandra-reaper or
DataStax OpsCenter's repair service is the easiest solution without a lot of
effort. Repairs are hard.

Chris

On Wed, Oct 4, 2017 at 11:48 AM, Jeff Jirsa <jji...@gmail.com> wrote:
You don't need to change the number of vnodes; you can manually select
CONTAINED token subranges and pass them in with -st and -et (just try to pick a
range > 2^20 tokens that is fully contained by at least one vnode).
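
For instance (the node IP and token values below are placeholders):

    # consecutive ring tokens owned by this node bound its vnode ranges
    nodetool ring | grep 10.0.0.1

    # repair a span that sits entirely inside one of those vnode ranges
    nodetool repair -st 100000000000000000 -et 100000000001000000 my_keyspace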











RE: [EXTERNAL]

2017-10-23 Thread Mohapatra, Kishore
What is your RF for the keyspace, and how many nodes are there in each DC?

Did you force a read repair to see whether you get the data or an error?

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com


-Original Message-
From: vbhang...@gmail.com [mailto:vbhang...@gmail.com] 
Sent: Sunday, October 22, 2017 11:31 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] 

-- Consistency level: LOCAL_QUORUM
-- It started happening approximately a couple of months back. The issue is
very inconsistent and can't be reproduced. It used to happen only rarely before
(over the last few years).
-- There are very few GC pauses, but they don't coincide with the issue.
-- 99% latency is less than 80 ms and 75% is less than 5 ms.

- Vedant
On 2017-10-22 21:29, Jeff Jirsa  wrote: 
> What consistency level do you use on writes?
> Did this just start or has it always happened ?
> Are you seeing GC pauses at all?
> 
> What’s your 99% write latency? 
> 
> --
> Jeff Jirsa
> 
> 
> > > On Oct 22, 2017, at 9:21 PM, "vbhang...@gmail.com" wrote:
> > 
> > This is for Cassandra 2.1.13. At times there are replication delays across
> > multiple regions. Data is available (getting queried from the command line)
> > in 1 region but not seen in other region(s). This is not consistent. It is
> > a cluster spanning multiple data centers with a total of > 30 nodes. The
> > keyspace is configured to be replicated in all the data centers.
> > 
> > Hints are getting piled up in the source region. This happens especially
> > for large data payloads (approx. 1 KB to a few MB blobs). Network-level
> > congestion or saturation does not seem to be an issue. There is no
> > memory/CPU pressure on individual nodes.
> > 
> > I am sharing cassandra.yaml below; any pointers on what can be tuned are
> > highly appreciated. Let me know if you need any other info.
> > 
> > We tried bumping up hinted_handoff_throttle_in_kb to 30720 (handoff tends
> > to be slower) and max_hints_delivery_threads to 12 on one of the nodes to
> > see if it speeds up hints delivery; there was some improvement but not a
> > whole lot.
> > 
> > Thanks
> > 
> > =
> > # Cassandra storage config YAML
> > 
> > # NOTE:
> > #   See http://wiki.apache.org/cassandra/StorageConfiguration for
> > #   full explanations of configuration directives
> > # /NOTE
> > 
> > # The name of the cluster. This is mainly used to prevent machines in
> > # one logical cluster from joining another.
> > cluster_name: "central"
> > 
> > # This defines the number of tokens randomly assigned to this node on the ring.
> > # The more tokens, relative to other nodes, the larger the proportion of data
> > # that this node will store. You probably want all nodes to have the same number
> > # of tokens assuming they have equal hardware capability.
> > #
> > # If you leave this unspecified, Cassandra will use the default of 1 token
> > # for legacy compatibility, and will use the initial_token as described below.
> > #
> > # Specifying initial_token will override this setting on the node's initial start;
> > # on subsequent starts, this setting will apply even if initial token is set.
> > #
> > # If you already have a cluster with 1 token per node, and wish to migrate to
> > # multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
> > #num_tokens: 256
> > 
> > # initial_token allows you to specify tokens manually.  While you can use
> > # it with vnodes (num_tokens > 1, above) -- in which case you should provide a
> > # comma-separated list -- it's primarily used when adding nodes to legacy
> > # clusters that do not have vnodes enabled.
> > # initial_token:
> > 
> > initial_token: 
> > 
> > # See http://wiki.apache.org/cassandra/HintedHandoff
> > # May either be "true" or "false" to enable globally, or contain a list
> > # of data centers to enable per-datacenter.
> > # hinted_handoff_enabled: DC1,DC2
> > hinted_handoff_enabled: true
> > # this defines the maximum amount of time a dead host will have hints
> > # generated.  After it has been dead this long, new hints for it will not be
> > # created until it has been seen alive and

RE: [EXTERNAL] Lot of hints piling up

2017-10-23 Thread Mohapatra, Kishore
Do you see any errors in the Cassandra log?
Have you checked compactionstats?
Also check the OS-level log messages to see if you are getting hardware-level
errors.
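
For example (log locations are assumptions for a typical package install):

    nodetool compactionstats
    grep -i "ERROR" /var/log/cassandra/system.log
    dmesg | tail -50    # OS-level / hardware errors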

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Ph : 425-691-6417 (cell)
Email : kishore.mohapa...@nuance.com


From: Jai Bheemsen Rao Dhanwada [mailto:jaibheem...@gmail.com]
Sent: Friday, October 20, 2017 9:44 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Lot of hints piling up

Hello,

We have a Cassandra cluster in 3 regions with version 2.1.13, and all of a
sudden we started seeing a lot of hints accumulating on the nodes. We are
pretty sure there is no issue with the network between the regions, and all the
nodes are up and running all the time.

Is there any other reason for the hints accumulation besides the network? E.g.
wide rows or bigger objects?

Any pointers here would be very helpful.

BTW, the hints do get processed after some time.


RE: [EXTERNAL]

2017-10-24 Thread Mohapatra, Kishore
Hi Vedant,
  I was actually referring to a command-line select query with
consistency level ALL. This will force a read repair in the background.
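
Something like this in cqlsh (the partition key value is just a placeholder):

    cqlsh> CONSISTENCY ALL;
    cqlsh> SELECT * FROM my_keyspace.my_table WHERE id = 12345;
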
But as I can see, you have tried with consistency level ONE and it is still
timing out. So what error do you see in the system.log?
A streaming error?
Can you also check how many sstables there are for that table; it seems like
your compaction may not be working.
Is your repair job running fine?
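
For the sstable count, something like this (placeholder names; 2.1's nodetool
calls this cfstats):

    nodetool cfstats my_keyspace.my_table | grep -i "SSTable count"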

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Ph : 425-691-6417 (cell)
Email : kishore.mohapa...@nuance.com


-Original Message-
From: vbhang...@gmail.com [mailto:vbhang...@gmail.com] 
Sent: Monday, October 23, 2017 6:59 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] 

It is RF=3, with 12 nodes in 3 regions and 6 in the other 2, so 48 nodes in
total. Are you suggesting forcing a read repair by reading at consistency ONE,
or by bumping up read_repair_chance?

We have tried from the command line with ONE, but that times out.

RE: [EXTERNAL] Lot of hints piling up

2017-10-24 Thread Mohapatra, Kishore
Check how many sstables there are for the table you are having issues with.
You might be under heap pressure; check your system.log for any OOM or heap
errors.
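
For example (assuming a default package-install log path):

    # OOM errors and long GC pauses (logged by GCInspector)
    grep -iE "OutOfMemory|GCInspector" /var/log/cassandra/system.log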

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com


From: Jai Bheemsen Rao Dhanwada [mailto:jaibheem...@gmail.com]
Sent: Monday, October 23, 2017 11:54 AM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Lot of hints piling up

I do not see any errors in the Cassandra or OS logs, and compactions are
happening at regular intervals and look good.

The issue here is that this is causing replication lag across the datacenters.
