We're also seeing something similar since upgrading to 1.0.0.

We have a 6-node cluster with a replication factor of 3. Three of the nodes
are older machines running 32-bit Windows Server 2008 with 32-bit Java, and
three are newer machines running 64-bit Windows Server 2008 R2 with 64-bit
Java. We are *not* using compression and we are *not* using leveled
compaction, and we also see that nodetool ring and nodetool info report the
wrong load: it grows faster than actual disk usage. Restarting a node
restores the reported load to the correct number.
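
For reference, this is roughly how we compare the two numbers (unix-style
commands shown for brevity; on the Windows nodes we just check the data
directory size in Explorer, and the host and path below are only examples):

    # Load as reported by Cassandra -- this is the number that keeps growing
    nodetool -h 10.0.0.57 ring
    nodetool -h 10.0.0.57 info

    # Actual on-disk size of the data directory -- this stays correct
    du -sh /path/to/cassandra/data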

However, this only happens on the newer nodes running 64-bit Java, not on
the older nodes running 32-bit Java.

Nodetool ring reports:
10.0.0.57       datacenter1 rack1       Up     Normal  25.7 GB     16.67%
10.0.0.50       datacenter1 rack1       Up     Normal  12.34 GB    16.67%
10.0.0.58       datacenter1 rack1       Up     Normal  11.74 GB    16.67%
10.0.0.51       datacenter1 rack1       Up     Normal  12.25 GB    16.67%
10.0.0.56       datacenter1 rack1       Up     Normal  17.94 GB    16.67%
10.0.0.52       datacenter1 rack1       Up     Normal  12.56 GB    16.67%

.56, .57, and .58 are the newer nodes. I restarted .58, and it now reports
the correct size, while .57 and .56 still report the wrong size. This is
after about a week of uptime for all nodes, and by now the bug makes the
newer nodes report roughly twice the actual data size.

Running compaction does not correct the reported load number; only
restarting Cassandra fixes it.
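
Roughly what we tried (the host is just an example from our ring; on
Windows the restart is done through the service manager):

    # A major compaction does not change the reported load:
    nodetool -h 10.0.0.56 compact
    nodetool -h 10.0.0.56 ring

    # Only restarting the Cassandra service on the node brings the
    # reported load back in line with actual disk usage.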

I hope this helps a little bit at least.


/Henrik Schröder

On Thu, Oct 20, 2011 at 18:53, Dan Hendry <dan.hendry.j...@gmail.com> wrote:

> I have been playing around with Cassandra 1.0.0 in our test environment and
> it seems pretty sweet so far. I have however come across what appears to be
> a bug tracking node load. I have enabled compression and levelled compaction
> on all CFs (scrub + snapshot deletion) and the nodes have been operating
> normally for a day or two. I started getting concerned when the load as
> reported by nodetool ring kept increasing (it seems monotonically) despite
> seeing a compression ratio of ~2.5x (as a side note, I find it strange that
> Cassandra does not provide the compression ratio via JMX or in the logs). I
> initially thought there might be a bug in cleaning up obsolete SSTables but
> I then noticed the following discrepancy:
>
> Nodetool ring reports:
> 10.112.27.65    datacenter1 rack1       Up     Normal  8.64 GB         50.00%  170141183460469231731687303715884105727
>
> Yet du . -h reports only 2.4G in the data directory.
>
> After restarting the node, nodetool ring reports a more accurate:
> 10.112.27.65    datacenter1 rack1       Up     Normal  2.35 GB         50.00%  170141183460469231731687303715884105727
>
> Again, both compression and levelled compaction have been enabled on all
> CFs. Is this a known issue or has anybody else observed a similar pattern?
>
> Dan Hendry
>
> (403) 660-2297
>
