Thanks a lot for sharing.
The node was added recently. Its bootstrap failed because of too many
tombstones, so we brought the node up with bootstrap disabled. Some
SSTables were never created during bootstrap, so the number of missing
files might be large. I have set the repair thread count to 1; should I
also set the '-seq' flag? In fact, when I set the '-seq' and '-pl' flags
on a very small token range (only two long-integer tokens on a vnode),
the JVM issue was not reproduced, but there were still thousands of
pending writer tasks at the peak.
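
For reference, the subrange repair I ran looked roughly like the
following (the keyspace name and token values are placeholders, and
depending on the Cassandra version '-pl' may also require '-hosts'):

    # sequential pull repair restricted to a tiny token subrange
    nodetool repair -seq -pl -st -3074457345618258603 -et -3074457345618258602 my_keyspace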
I have scaled up the EC2 instance's RAM to 64 GB in total, and the JVM
heap is 24 GB, since the documentation recommends no more than half of
system RAM for the heap. The GC collector is G1. I repaired the node
again after the scale-up, and the JVM issue reproduced. Can I increase
the heap to 40 GB on a 64 GB VM?
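
For reference, my current heap and GC settings in jvm.options are
roughly as follows (the pause target shown is illustrative):

    # jvm.options -- fixed 24 GB heap on the 64 GB instance
    -Xms24G
    -Xmx24G
    # G1 collector
    -XX:+UseG1GC
    -XX:MaxGCPauseMillis=500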

Do you think the issue is related to a materialized view or a big partition?
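(If it helps, I can check partition sizes with something like the
following; the keyspace/table names are placeholders:

    nodetool tablehistograms my_keyspace my_table

which reports partition-size percentiles and the max per table.)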

Thanks

Erick Ramirez <erick.rami...@datastax.com> wrote on Thu, 16 Apr 2020 at 12:51:

> Is this the first time you've repaired your cluster? Because it sounds
> like it isn't coping. First thing you need to make sure of is to *not*
> run repairs in parallel. It can overload your cluster -- only kick off a
> repair one node at a time on small clusters. For larger clusters, you might
> be able to run it on multiple nodes but only on non-adjacent nodes (or
> nodes far enough around the ring from each other) where you absolutely know
> they don't have overlapping token ranges. If this doesn't make sense or is
> too complicated then just repair one node at a time.
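>
> As a rough sketch of "one node at a time" (hostnames and keyspace are
> placeholders):
>
>     # repair each node in turn, waiting for one to finish before starting the next
>     for node in node1 node2 node3; do
>         nodetool -h "$node" repair my_keyspace
>     done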
>
> You should also consider running a partitioner-range repair (with the -pr
> flag) so you're only repairing ranges once. This is the quickest and most
> efficient way to repair since it doesn't repair overlapping token ranges
> multiple times. If you're interested, Jeremiah Jordan wrote a nice blog
> post explaining this in detail [1
> <https://www.datastax.com/blog/2014/07/repair-cassandra>].
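>
> For example (keyspace name is a placeholder):
>
>     nodetool repair -pr my_keyspace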
>
> Third thing to consider is bumping up the heap on the nodes to 20GB. See
> how it goes. If you need to, maybe go as high as 24GB but understand the
> tradeoffs -- larger heaps mean that GC pauses are longer since there is
> more space to clean up. I also try to reserve 8GB of RAM for the operating
> system so on a 32GB system, 24GB is the most I would personally allocate to
> the heap (my opinion, YMMV).
>
> CMS also doesn't cope well with large heap sizes so depending on your use
> case/data model/access patterns/etc, you might need to switch to G1 GC if
> you really need to go upwards of 20GB. To be clear -- I'm not recommending
> that you switch to G1. I'm just saying that in my experience, CMS isn't
> great with large heap sizes. ;)
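>
> If you do end up switching, the stock jvm.options file already carries
> a G1 section; it's roughly a matter of commenting out the CMS lines and
> enabling something like:
>
>     -XX:+UseG1GC
>     -XX:G1RSetUpdatingPauseTimePercent=5
>     -XX:MaxGCPauseMillis=500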
>
> Finally, 4 flush writers may be doing your nodes more harm than good since
> your nodes are on EBS, likely just a single volume. More is not always
> better so there's a word of warning for you. Again, YMMV. Cheers!
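>
> If you do dial it back, that's memtable_flush_writers in
> cassandra.yaml, e.g.:
>
>     # one or two flush writers is usually plenty on a single EBS volume
>     memtable_flush_writers: 2
>
> and you can keep an eye on pending flushes with nodetool tpstats.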
>
> [1] https://www.datastax.com/blog/2014/07/repair-cassandra
>

-- 

Thanks
Guo Bin
