This looks like an artifact of the way ownership is calculated for the
OrderPreservingPartitioner (OPP). See
https://github.com/apache/cassandra/blob/cassandra-0.8.4/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java#L177
The calculation was changed in this ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2800
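For intuition, here is a toy sketch (my own illustration, not Cassandra's
actual code) of why ownership estimated from locally held keys comes out
skewed: with RF=3 each node also stores replicas of the two preceding
ranges, so from its own data those ranges look "owned" while every other
range shows roughly 0%. The node list, key counts and the
local_ownership_view helper are all hypothetical.

```python
# Toy model: each node estimates range ownership from the keys it holds
# locally, which is all an order-preserving partitioner can do.

NODES = ["190", "191", "192", "194", "196"]  # ring order, equal ranges
RF = 3                                        # assumed replication factor
KEYS_PER_RANGE = 1000                         # assume perfectly balanced keys

def local_ownership_view(viewer_idx):
    """Ownership percentages as estimated by one node from its own data."""
    n = len(NODES)
    # With RF=3 the viewer replicates its own range plus the ranges of
    # the RF-1 preceding nodes on the ring.
    held = {(viewer_idx - i) % n for i in range(RF)}
    counts = [KEYS_PER_RANGE if i in held else 0 for i in range(n)]
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]

# Node 192 (index 2) sees data only for 190, 191 and itself, so those
# three ranges split the ownership and the rest show 0%.
print(local_ownership_view(2))  # -> [33.3, 33.3, 33.3, 0.0, 0.0]
```

Each node runs the same estimate over a different local data set, which is
why every server prints a different ring view.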
The change applied in CASSANDRA-2800 was not applied to
AbstractByteOrderedPartitioner. It looks like it should have been; I'll
chase that up.

When each node calculates the ownership of the token ranges (for the OPP
and BOP) it is based on the number of keys that node has in each range,
as there is no way for the OPP to know the range of values the keys may
take. If you look at the output from the 192 node, it shows most of the
ownership on 192, 191 and 190, so I'm assuming RF=3 and that 192 also
has data from the ranges owned by 191 and 190.

IMHO you can ignore this. You can use the load and the number-of-keys
estimate from cfstats to get an idea of what's happening.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19/08/2011, at 9:42 PM, Thibaut Britz wrote:

> Hi,
>
> We were using apache-cassandra-2011-06-28_08-04-46.jar in production
> so far and wanted to upgrade to 0.8.4.
>
> Our cluster was well balanced and we only saved keys with a lower-case
> md5 prefix (order-preserving partitioner). Each node owned 20% of the
> tokens, which was also what nodetool -h localhost ring displayed on
> each node.
>
> After upgrading, our well-balanced cluster shows completely wrong
> percentages for who owns which keys:
>
> *.*.*.190:
> Address         DC          Rack   Status State  Load      Owns    Token
>                                                                    ffffffffffffffff
> *.*.*.190       datacenter1 rack1  Up     Normal 87.95 GB  34.57%  2a
> *.*.*.191       datacenter1 rack1  Up     Normal 84.3 GB   0.02%   55
> *.*.*.192       datacenter1 rack1  Up     Normal 79.46 GB  0.02%   80
> *.*.*.194       datacenter1 rack1  Up     Normal 68.16 GB  0.02%   aa
> *.*.*.196       datacenter1 rack1  Up     Normal 79.9 GB   65.36%  ffffffffffffffff
>
> *.*.*.191:
> Address         DC          Rack   Status State  Load      Owns    Token
>                                                                    ffffffffffffffff
> *.*.*.190       datacenter1 rack1  Up     Normal 87.95 GB  36.46%  2a
> *.*.*.191       datacenter1 rack1  Up     Normal 84.3 GB   26.02%  55
> *.*.*.192       datacenter1 rack1  Up     Normal 79.46 GB  0.02%   80
> *.*.*.194       datacenter1 rack1  Up     Normal 68.16 GB  0.02%   aa
> *.*.*.196       datacenter1 rack1  Up     Normal 79.9 GB   37.48%  ffffffffffffffff
>
> *.*.*.192:
> Address         DC          Rack   Status State  Load      Owns    Token
>                                                                    ffffffffffffffff
> *.*.*.190       datacenter1 rack1  Up     Normal 87.95 GB  38.16%  2a
> *.*.*.191       datacenter1 rack1  Up     Normal 84.3 GB   27.61%  55
> *.*.*.192       datacenter1 rack1  Up     Normal 79.46 GB  34.17%  80
> *.*.*.194       datacenter1 rack1  Up     Normal 68.16 GB  0.02%   aa
> *.*.*.196       datacenter1 rack1  Up     Normal 79.9 GB   0.02%   ffffffffffffffff
>
> *.*.*.194:
> Address         DC          Rack   Status State  Load      Owns    Token
>                                                                    ffffffffffffffff
> *.*.*.190       datacenter1 rack1  Up     Normal 87.95 GB  0.03%   2a
> *.*.*.191       datacenter1 rack1  Up     Normal 84.3 GB   31.43%  55
> *.*.*.192       datacenter1 rack1  Up     Normal 79.46 GB  39.69%  80
> *.*.*.194       datacenter1 rack1  Up     Normal 68.16 GB  28.82%  aa
> *.*.*.196       datacenter1 rack1  Up     Normal 79.9 GB   0.03%   ffffffffffffffff
>
> *.*.*.196:
> Address         DC          Rack   Status State  Load      Owns    Token
>                                                                    ffffffffffffffff
> *.*.*.190       datacenter1 rack1  Up     Normal 87.95 GB  0.02%   2a
> *.*.*.191       datacenter1 rack1  Up     Normal 84.3 GB   0.02%   55
> *.*.*.192       datacenter1 rack1  Up     Normal 79.46 GB  0.02%   80
> *.*.*.194       datacenter1 rack1  Up     Normal 68.16 GB  27.52%  aa
> *.*.*.196       datacenter1 rack1  Up     Normal 79.9 GB   72.42%  ffffffffffffffff
>
> Interestingly, each server shows something completely different.
>
> Removing the locationInfo files didn't help.
> -Dcassandra.load_ring_state=false didn't help either.
>
> Our cassandra.yaml is at http://pastebin.com/pCVCt3RM
>
> Any idea on what might cause this? Is it safe to suspect that
> operating under this distribution will cause severe data loss? Or can
> I safely ignore this?
>
> Thanks,
> Thibaut