I should also note: this is Cassandra 2.2.5 on CentOS 6.7.

On Wed, Mar 2, 2016 at 1:34 PM, Dan Kinder <dkin...@turnitin.com> wrote:

> Hi y'all,
>
> I am writing to a cluster fairly fast and seeing this odd behavior happen,
> seemingly to single nodes at a time. The node starts to take more and more
> memory (instance has 48GB memory on G1GC). tpstats shows that
> MemtableReclaimMemory Pending starts to grow first, then later
> MutationStage builds up as well. By then most of the memory is being
> consumed, GC pauses are getting longer, and the node slows down; everything
> else slows down too unless I kill the node. Also, the number of Active
> MemtableReclaimMemory threads seems to stay at 1. Also interestingly,
> neither CPU nor disk utilization is pegged while this is going on; it's on
> JBOD and there is
> plenty of headroom there. (Note that there is a decent number of
> compactions going on as well but that is expected on these nodes and this
> particular one is catching up from a high volume of writes).
>
> Anyone have any theories on why this would be happening?
>
>
> $ nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                   192    715481      311327142         0                 0
> ReadStage                         7         0        9142871         0                 0
> RequestResponseStage              1         0      690823199         0                 0
> ReadRepairStage                   0         0        2145627         0                 0
> CounterMutationStage              0         0              0         0                 0
> HintedHandoff                     0         0            144         0                 0
> MiscStage                         0         0              0         0                 0
> CompactionExecutor               12        24          41022         0                 0
> MemtableReclaimMemory             1       102           4263         0                 0
> PendingRangeCalculator            0         0             10         0                 0
> GossipStage                       0         0         148329         0                 0
> MigrationStage                    0         0              0         0                 0
> MemtablePostFlush                 0         0           5233         0                 0
> ValidationExecutor                0         0              0         0                 0
> Sampler                           0         0              0         0                 0
> MemtableFlushWriter               0         0           4270         0                 0
> InternalResponseStage             0         0       16322698         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> Native-Transport-Requests        25         0      547935519         0           2586907
>
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                       0
> MUTATION                287057
> COUNTER_MUTATION             0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                149
>
>
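
In case it helps anyone watching for the same pattern, here is a minimal
sketch in Python (not part of Cassandra) that parses the text printed by
nodetool tpstats and lists the pools whose Pending count is above a
threshold. The names PoolStats, parse_tpstats and backed_up_pools, and the
threshold of 100, are my own choices for illustration. It assumes each pool
is on a single line with six whitespace-separated columns, as in 2.2; other
versions may print a different layout.

```python
from typing import Dict, List, NamedTuple


class PoolStats(NamedTuple):
    """One thread-pool row from `nodetool tpstats` (hypothetical helper type)."""
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


def parse_tpstats(text: str) -> Dict[str, PoolStats]:
    """Parse the thread-pool rows of tpstats text output.

    Assumes rows look like: <pool name> followed by five integer columns.
    Header lines and the dropped-message section do not match and are skipped.
    """
    pools: Dict[str, PoolStats] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            pools[parts[0]] = PoolStats(*(int(p) for p in parts[1:]))
    return pools


def backed_up_pools(pools: Dict[str, PoolStats], threshold: int = 100) -> List[str]:
    """Return the names of pools with more than `threshold` pending tasks, sorted."""
    return sorted(name for name, s in pools.items() if s.pending > threshold)


# A few rows copied from the output in this thread, one pool per line.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                   192    715481      311327142         0                 0
CompactionExecutor               12        24          41022         0                 0
MemtableReclaimMemory             1       102           4263         0                 0
MemtableFlushWriter               0         0           4270         0                 0

Message type           Dropped
MUTATION                287057
"""

if __name__ == "__main__":
    stats = parse_tpstats(SAMPLE)
    for name in backed_up_pools(stats):
        print(name, stats[name].pending)
```

Running something like this on a timer against live nodetool tpstats output
would show whether MemtableReclaimMemory Pending really starts climbing
before MutationStage does.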


-- 
Dan Kinder
Principal Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
