You need to review Java GC, system, network, disk, memory, node, and table 
statistics. A lot can be discerned from visually examining the charts, e.g. 
whether the node that is failing is the one with the most local reads, the one 
with the most writes, or whether the failures are completely unrelated to load.
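
As a rough illustration of that check, here is a minimal Python sketch. The node names, counter names, and numbers are all hypothetical and made up for the example; in practice the values would come from whatever metrics system you have connected.

```python
# Hypothetical per-node counters, made up for illustration only.
SAMPLE = {
    "node1": {"local_reads": 1200, "local_writes": 300, "restarts": 0},
    "node2": {"local_reads": 5400, "local_writes": 350, "restarts": 3},
    "node3": {"local_reads": 1100, "local_writes": 2900, "restarts": 0},
}


def busiest(stats, metric):
    """Return the node name with the highest value for the given metric."""
    return max(stats, key=lambda node: stats[node][metric])


def restarted(stats):
    """Return the sorted names of nodes that restarted at least once."""
    return sorted(node for node, s in stats.items() if s["restarts"] > 0)


def summarize(stats):
    """Report whether the restarting nodes line up with the read or write hot spot."""
    top_reads = busiest(stats, "local_reads")
    top_writes = busiest(stats, "local_writes")
    bounced = restarted(stats)
    return {
        "restarted": bounced,
        "top_reads": top_reads,
        "top_writes": top_writes,
        "restarts_on_top_reads": top_reads in bounced,
        "restarts_on_top_writes": top_writes in bounced,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

If neither flag comes back true, the restarts are probably not explained by read or write load alone, and GC, disk, or network charts are the next place to look.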

Since it’s a distributed system, you need to review the data points together 
for all nodes. Data is the only way to see what’s going on. Connect Prometheus 
/ Grafana, Datadog, New Relic, or something similar to see the patterns across 
the cluster.

https://blog.anant.us/resources-for-monitoring-datastax-cassandra-spark-solr-performance/

I assembled that list recently. I would also add that getting system logs into 
ELK or Splunk could show some patterns that tailing and grepping would 
otherwise not reveal.
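
Short of a full log pipeline, even a small script run over the logs collected from every node can surface a pattern. The sketch below is only an assumption-laden example: the regular expression is written against the batch-size WARN line quoted further down in this thread (assumed to be a single line in the real log file), and the function name is hypothetical.

```python
import re

# Matches the batch-size warning as quoted in this thread, assuming it is
# one line in the actual log. The table name may be quoted or bracketed.
BATCH_WARN = re.compile(
    r'WARN\s+\[(?P<thread>[^\]]+)\]\s+'
    r'(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+'
    r'BatchStatement\.java:\d+ - Batch for '
    r'["\[]?(?P<table>[^"\]\s]+)["\]]?\s+'
    r'is of size (?P<size>[\d.]+)KiB, '
    r'exceeding specified threshold of (?P<threshold>[\d.]+)KiB'
)


def summarize_batch_warnings(lines):
    """Count oversized-batch warnings per table and track the largest batch seen."""
    summary = {}
    for line in lines:
        match = BATCH_WARN.search(line)
        if match is None:
            continue  # not a batch-size warning
        entry = summary.setdefault(match.group("table"), {"count": 0, "max_kib": 0.0})
        entry["count"] += 1
        entry["max_kib"] = max(entry["max_kib"], float(match.group("size")))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python batch_warnings.py < system.log
    for table, info in sorted(summarize_batch_warnings(sys.stdin).items()):
        print(f'{table}: {info["count"]} warnings, largest {info["max_kib"]} KiB')
```

Running the same script against each node's log and comparing the counts shows whether the large batches are concentrated on the nodes that restart.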

Rahul
On Jul 26, 2018, 10:20 AM -0400, R1 J1 <rjsoft...@gmail.com>, wrote:
> Thanks for your prompt replies. No, it is not the same node bouncing each 
> time. When you say it is about to tip over, what can we do to stop that?
>
> Also, about that error: you are correct, it is a warning and might not be 
> contributing to the node bounce issue. It can be silenced by changing 
> batch_size_warn_threshold_in_kb: 5
>
> R1J1
>
> > On Wed, Jul 25, 2018 at 10:32 PM, R1 J1 <rjsoft...@gmail.com> wrote:
> > > Cassandra nodes restart
> > >
> > >
> > >
> > > we see errors typically like these
> > >
> > >
> > > WARN  [Native-Transport-Requests-3] 2018-07-25 20:51:38,520 
> > > BatchStatement.java:301 - Batch for "keyspace.table"
> > >  is of size 19.386KiB, exceeding specified threshold of 5.000KiB by 
> > > 14.386KiB.
> > >
> > >
> > > Regards
> > > R1J1
>
