Again, I haven’t read this thread from the beginning, so I don’t know which 
node is which. But if nodes pause for a longish GC, the other nodes will likely 
be saving hints for them (assuming you are writing at the time), and those 
hints will be delivered once the machines become responsive again, which could 
explain the traffic. I’m just guessing, though. Take a look at the hinting 
metrics.
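
A rough sketch of one way to check that, assuming nodetool is on the PATH and 
that hint delivery shows up in "nodetool tpstats" as a pool named HintedHandoff 
(2.1.x) or HintsDispatcher (later versions). The function names and the host 
default are made up for illustration, so treat this as a starting point only:

```python
import subprocess

# Pool names under which hint delivery is assumed to appear in tpstats output.
HINT_POOLS = ("HintedHandoff", "HintsDispatcher")


def parse_tpstats(text, pools=HINT_POOLS):
    """Extract active/pending/completed counts for the hint pools.

    Expects the whitespace-separated table printed by `nodetool tpstats`,
    where each row is: pool name, active, pending, completed, ...
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in pools:
            continue
        try:
            active, pending, completed = (int(f) for f in fields[1:4])
        except ValueError:
            # Not a numeric row; skip it rather than guess.
            continue
        stats[fields[0]] = {
            "active": active,
            "pending": pending,
            "completed": completed,
        }
    return stats


def hint_stats(host="127.0.0.1"):
    """Run `nodetool tpstats` against one node and return its hint pool counts."""
    out = subprocess.run(
        ["nodetool", "-h", host, "tpstats"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tpstats(out)
```

Calling hint_stats() for each node right after a spike and comparing the 
completed counts with a quiet period should show whether hints are being 
replayed toward the boxes that paused.
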
> On Sep 11, 2015, at 2:45 PM, Roman Tkachenko <ro...@mailgunhq.com> wrote:
> 
> I have another datapoint from our monitoring system: it shows a huge increase 
> in outbound network traffic on the affected boxes during these spikes:
> 
> <Screen Shot 2015-09-11 at 12.35.16 PM.png>
> 
> Inbound traffic, on the other hand, increases on the nodes other than these 
> three (purple, yellow and blue), so it does look like some kind of excessive 
> internode communication is going on between these 3 nodes and the rest of the 
> cluster.
> 
> What could these network spikes be a sign of?
> 
> 
> On Thu, Sep 10, 2015 at 12:00 PM, Graham Sanderson <gra...@vast.com> wrote:
> I haven’t been following this thread, but we run beefy machines with an 8 GB 
> new gen and a 12 GB old gen (down from 16 GB since moving memtables off heap; 
> we can probably go lower)…
> 
> Apart from making sure you have all the latest -XX: flags from 
> cassandra-env.sh (and MALLOC_ARENA_MAX), I personally would recommend running 
> latest 2.1.x with
> 
> memory_allocator: JEMallocAllocator
> memtable_allocation_type: offheap_objects
> 
> Some people will probably disagree, but it works great for us (the rare long 
> pauses are under 2 seconds), and if you’re seeing slow GC because of promotion 
> failure of objects 131074 dwords big, then I definitely suggest you give it a 
> try.
> 
>> On Sep 10, 2015, at 1:43 PM, Robert Coli <rc...@eventbrite.com> wrote:
>> 
>> On Thu, Sep 10, 2015 at 10:54 AM, Roman Tkachenko <ro...@mailgunhq.com> wrote:
>> [5 second CMS GC] Is my best shot to play with JVM settings trying to tune 
>> garbage collection then?
>> 
>> Yep. As a minor note, if the machines are that beefy, they probably have a 
>> lot of RAM, so you might wish to consider trying G1 GC and a larger heap.
>> 
>> =Rob
>> 
>>  
> 
> 
