Re: Cassandra 2.0.8 MemoryMeter goes crazy

2014-06-16 Thread Robert Coli
On Mon, Jun 16, 2014 at 11:03 AM, horschi  wrote:

> About running mixed versions:
> I thought running mixed versions is ok. Running repair with mixed versions
> is not though. Right?
>

Running with split major versions for longer than it takes to do a rolling
restart is not supported.

=Rob


Re: Cassandra 2.0.8 MemoryMeter goes crazy

2014-06-16 Thread horschi
Hi Robert,

sorry, I am using our own internal terminology :-)

The entire cluster was upgraded. All 3 nodes of that cluster are on 2.0.8
now.

About the issue:
To me it looks like there is something wrong in the Memtable class. Some
very special edge case on CFs that are updated rarely. I can't say if it is
new to 2.0 or if it already existed in 1.2.

About running mixed versions:
I thought running mixed versions is ok. Running repair with mixed versions
is not though. Right?

kind regards,
Christian



On Mon, Jun 16, 2014 at 7:50 PM, Robert Coli  wrote:

> On Sat, Jun 14, 2014 at 1:02 PM, horschi  wrote:
>
>> this week we upgraded one of our Systems from Cassandra 1.2.16 to 2.0.8.
>> All 3 nodes were upgraded. SSTables are upgraded.
>>
>
> One of your *clusters* or one of your *systems*?
>
> Running with split major versions is not supported.
>
> =Rob
>


Re: Cassandra 2.0.8 MemoryMeter goes crazy

2014-06-16 Thread Robert Coli
On Sat, Jun 14, 2014 at 1:02 PM, horschi  wrote:

> this week we upgraded one of our Systems from Cassandra 1.2.16 to 2.0.8.
> All 3 nodes were upgraded. SSTables are upgraded.
>

One of your *clusters* or one of your *systems*?

Running with split major versions is not supported.

=Rob


Re: Cassandra 2.0.8 MemoryMeter goes crazy

2014-06-16 Thread horschi
Hi again,

before people start replying here: I just reported a Jira ticket:
https://issues.apache.org/jira/browse/CASSANDRA-7401

I think Memtable.maybeUpdateLiveRatio() needs some love.
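To illustrate the suspected problem, here is a minimal sketch of the liveRatio update guard in Python. All names and the clamp range are illustrative assumptions (not the actual Memtable fields): liveRatio is treated as the ratio of in-heap overhead to serialized data size, clamped to [1, 64], and the hypothesis from the log output is that the "0 cells" case keeps being recomputed in a tight loop instead of being skipped.

```python
# Illustrative sketch, NOT Cassandra's actual code. The constants and
# function name mirror the log output above, but are assumptions.
MIN_RATIO, MAX_RATIO = 1.0, 64.0

def maybe_update_live_ratio(live_heap_bytes, serialized_bytes, current_ratio):
    """Return the new liveRatio; keep the old one when nothing was counted."""
    if serialized_bytes == 0:
        # Nothing to measure: skip the update (and, crucially, do not
        # reschedule the calculation immediately -- the reported bug is
        # that this empty case appears to spin, logging hundreds of
        # "0 cells" calculations per second).
        return current_ratio
    raw = live_heap_bytes / float(serialized_bytes)
    return min(MAX_RATIO, max(MIN_RATIO, raw))
```

For a CF like the one above (0 memtable cells), this sketch would simply return the previous ratio instead of recalculating in a loop.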

kind regards,
Christian



On Sat, Jun 14, 2014 at 10:02 PM, horschi  wrote:

> Hi everyone,
>
> this week we upgraded one of our Systems from Cassandra 1.2.16 to 2.0.8.
> All 3 nodes were upgraded. SSTables are upgraded.
>
> Unfortunately, we are now seeing Cassandra hang every 10 hours or so.
>
> We can see the MemoryMeter being very active, every time it is hanging.
> Both in tpstats and in the system.log:
>
>  INFO [MemoryMeter:1] 2014-06-14 19:24:09,488 Memtable.java (line 481)
> CFS(Keyspace='MDS', ColumnFamily='ResponsePortal') liveRatio is 64.0
> (just-counted was 64.0).  calculation took 0ms for 0 cells
>
> This line is logged hundreds of times per second (!) when Cassandra is
> down. The CPU is 100% busy.
>
> Interestingly, this is only logged for this particular column family. This
> CF is used as a queue, which only contains a few entries (datafiles are
> about 4kb, only ~100 keys, usually 1-2 active, 98-99 tombstones).
>
> Table: ResponsePortal
> SSTable count: 1
> Space used (live), bytes: 4863
> Space used (total), bytes: 4863
> SSTable Compression Ratio: 0.9545454545454546
> Number of keys (estimate): 128
> Memtable cell count: 0
> Memtable data size, bytes: 0
> Memtable switch count: 1
> Local read count: 0
> Local read latency: 0.000 ms
> Local write count: 5
> Local write latency: 0.000 ms
> Pending tasks: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used, bytes: 176
> Compacted partition minimum bytes: 43
> Compacted partition maximum bytes: 50
> Compacted partition mean bytes: 50
> Average live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
>
>
> Table: ResponsePortal
> SSTable count: 1
> Space used (live), bytes: 4765
> Space used (total), bytes: 5777
> SSTable Compression Ratio: 0.75
> Number of keys (estimate): 128
> Memtable cell count: 0
> Memtable data size, bytes: 0
> Memtable switch count: 12
> Local read count: 0
> Local read latency: 0.000 ms
> Local write count: 1096
> Local write latency: 0.000 ms
> Pending tasks: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used, bytes: 16
> Compacted partition minimum bytes: 43
> Compacted partition maximum bytes: 50
> Compacted partition mean bytes: 50
> Average live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
>
>
> Has anyone ever seen this or has an idea what could be wrong? It seems
> that 2.0 cannot handle this column family as well as 1.2 could.
>
> Any hints on what could be wrong are greatly appreciated :-)
>
> Cheers,
> Christian
>


Re: Multi-DC Environment Question

2014-06-16 Thread Vasileios Vlachos
Hello again,

Back to this after a while...

As far as I can tell whenever DC2 is unavailable, there is one node from
DC1 that acts as a coordinator. When DC2 is available again, this one node
sends the hints to only one node at DC2, which then sends any replicas to
the other nodes in the local DC (DC2). This ensures efficient cross-DC
bandwidth usage. I was watching "system.hints" on all nodes during this
test and this is the conclusion I came to.
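The observed flow can be sketched as a toy model in Python. This only mirrors the behaviour described above (coordinator stores hints for a down replica, then delivers them once the target is back), not Cassandra's actual implementation; all class and method names are made up for illustration.

```python
# Toy model of the observed hinted-handoff flow. Assumptions, not
# Cassandra internals: one coordinator stores hints per down target
# and replays them when the target comes back up.
class Node:
    def __init__(self, name, dc):
        self.name, self.dc = name, dc
        self.up = True
        self.data = []          # mutations successfully delivered here

class Coordinator(Node):
    def __init__(self, name, dc):
        Node.__init__(self, name, dc)
        self.hints = []         # (target, mutation) stored while target is down

    def write(self, mutation, replicas):
        for r in replicas:
            if r.up:
                r.data.append(mutation)
            else:
                self.hints.append((r, mutation))

    def replay_hints(self):
        # Deliver stored hints to targets that are back up; keep the rest.
        still_down = []
        for target, mutation in self.hints:
            if target.up:
                target.data.append(mutation)
            else:
                still_down.append((target, mutation))
        self.hints = still_down
```

In this model, replay goes to the hinted target node only; any further distribution inside the remote DC (as observed above) would happen locally on that side.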

Two things:
1. If the above is correct, does the same apply when performing
anti-entropy repair (without specifying a particular DC)? I'm just hoping
the answer to this is going to be YES, otherwise the VPN is not going to be
very happy in our case, and we would prefer not to saturate it whenever
running nodetool repair. I suppose we could add a traffic limiter on the
firewalls as a worst-case scenario, but I would appreciate your input if
you know more about this.

2. As I described earlier, in order to test this I was watching the
"system.hints" CF in order to monitor any hints. I was looking to add a
Nagios check for this purpose. For that reason I was looking into the JMX
console. I noticed that when a node stores hints, the attribute
"MemtableColumnsCount" of "MBean
org.apache.cassandra.db:type=ColumnFamilies,keyspace=system,columnfamily=hints"
goes up (although I would expect it to be MemtableRowCount or something?).
This attribute will retain its value until the other node becomes
available and ready to receive the hints. I was looking for another
attribute somewhere to monitor the active hints. I checked:

"MBean
org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=hints,name=PendingTasks",
"MBean org.apache.cassandra.metrics:type=Storage,name=TotalHints",
"MBean org.apache.cassandra.metrics:type=Storage,name=TotalHintsInProgress",
"MBean
org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=HintedHandoff,name=ActiveTasks"
and even
"MBean
org.apache.cassandra.metrics:type=HintedHandOffManager,name=Hints_not_stored-/10.2.1.100"
(this one will never go back to zero).

None of these attributes increased while hints were being sent (or at
least I didn't catch it, perhaps because it happened too fast?). Does
anyone know what all these attributes represent? It looks like there are
more specific hint attributes on a per-CF basis, but I was looking for a
more generic one to begin with. Any help would be much appreciated.
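As a fallback for a Nagios check, the hinted-handoff thread pool can also be read by parsing `nodetool tpstats` output instead of polling JMX directly. A sketch in Python follows; the column layout (pool name, active, pending, ...) is assumed from 1.2-era output and should be checked against your version, and the thresholds are illustrative.

```python
# Hedged sketch of a Nagios-style check built on `nodetool tpstats`.
# The HintedHandoff column positions are an assumption; verify them
# against your Cassandra version before relying on this.
import subprocess
import sys

def parse_hinted_handoff(tpstats_output):
    """Return (active, pending) for the HintedHandoff pool, or None."""
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "HintedHandoff":
            return int(fields[1]), int(fields[2])
    return None

def main():
    out = subprocess.check_output(["nodetool", "tpstats"]).decode()
    stats = parse_hinted_handoff(out)
    if stats is None:
        print("UNKNOWN: HintedHandoff pool not found in tpstats output")
        sys.exit(3)          # Nagios UNKNOWN
    active, pending = stats
    if pending > 0 or active > 0:
        print("WARNING: HintedHandoff active=%d pending=%d" % (active, pending))
        sys.exit(1)          # Nagios WARNING
    print("OK: no hints in flight")
    sys.exit(0)

if __name__ == "__main__":
    main()
```

Wiring `main()` in as the plugin entry point gives standard Nagios exit codes; the parsing is kept separate so it can be tested without a live node.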

Thanks in advance,

Vasilis


On Wed, Jun 4, 2014 at 1:42 PM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

> Hello Matt,
>
> nodetool status:
>
> Datacenter: MAN
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Owns (effective) Host ID Token Rack
> UN 10.2.1.103 89.34 KB 99.2% b7f8bc93-bf39-475c-a251-8fbe2c7f7239 -9211685935328163899 RAC1
> UN 10.2.1.102 86.32 KB 0.7% 1f8937e1-9ecb-4e59-896e-6d6ac42dc16d -3511707179720619260 RAC1
> Datacenter: DER
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Owns (effective) Host ID Token Rack
> UN 10.2.1.101 75.43 KB 0.2% e71c7ee7-d852-4819-81c0-e993ca87dd5c -1277931707251349874 RAC1
> UN 10.2.1.100 104.53 KB 99.8% 7333b664-ce2d-40cf-986f-d4b4d4023726 -9204412570946850701 RAC1
>
> I do not know why the cluster is not balanced at the moment, but it holds
> almost no data. I will populate it soon and see how that goes. The output
> of 'nodetool ring' just lists all the tokens assigned to each individual
> node, and as you can imagine it would be pointless to paste it here. I just
> did 'nodetool ring | awk ... | unique | wc -l' and it works out to be 1024
> as expected (4 nodes x 256 tokens each).
>
> Still have not got the answers to the other questions though...
>
> Thanks,
>
> Vasilis
>
>
> On Wed, Jun 4, 2014 at 12:28 AM, Matthew Allen 
> wrote:
>
>> Thanks Vasileios.  I think I need to make a call as to whether to switch
>> to vnodes or stick with tokens for my Multi-DC cluster.
>>
>> Would you be able to show a nodetool ring/status from your cluster to see
>> what the token assignment looks like ?
>>
>> Thanks
>>
>> Matt
>>
>>
>> On Wed, Jun 4, 2014 at 8:31 AM, Vasileios Vlachos <
>> vasileiosvlac...@gmail.com> wrote:
>>
>>>  I should have said that earlier really... I am using 1.2.16 and Vnodes
>>> are enabled.
>>>
>>> Thanks,
>>>
>>> Vasilis
>>>
>>> --
>>> Kind Regards,
>>>
>>> Vasileios Vlachos
>>>
>>>
>>
>