Tombstones should eventually compact away in most cases, but if you've
recently changed topology (added nodes, removed nodes, etc.), you should run
"nodetool cleanup" to remove data the nodes no longer own. Start by running
it on one instance at a time: it's a form of compaction and can impact disk
space and latencies.
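
If it helps, here is a rough sketch of running cleanup one node at a time
from a single admin box. It assumes "nodetool" is on the PATH, that JMX on
each node is reachable from where you run it, and that the cleanup call
blocks until the node has finished. The function names and the host list
are made up for illustration, so substitute your own.

```python
import subprocess
from typing import Callable, Iterable, List


def cleanup_command(host: str, keyspace: str = "") -> List[str]:
    """Build the nodetool invocation for one node (optionally one keyspace)."""
    cmd = ["nodetool", "-h", host, "cleanup"]
    if keyspace:
        cmd.append(keyspace)
    return cmd


def rolling_cleanup(
    hosts: Iterable[str],
    keyspace: str = "",
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> List[str]:
    """Run cleanup on each host in turn and return the hosts that completed.

    With the default runner, a non-zero exit raises CalledProcessError,
    which stops the loop before the next node is touched.
    """
    done = []
    for host in hosts:
        run(cleanup_command(host, keyspace))  # one node at a time
        done.append(host)
    return done
```

Something like rolling_cleanup(["10.0.0.1", "10.0.0.2"], "my_keyspace")
(hypothetical addresses and keyspace name) would then walk the nodes in
order. Watch disk usage and latencies on the first node before letting it
continue through the rest.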


On Mon, Aug 7, 2017 at 2:04 PM, Chuck Reynolds <creyno...@ancestry.com>
wrote:

> Yes it’s the total size.
>
>
>
> Could it be that tombstones or data that nodes no longer own is not being
> copied/streamed to the data center in AWS?
>
>
>
> *From: *Jeff Jirsa <jji...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Monday, August 7, 2017 at 2:51 PM
> *To: *cassandra <user@cassandra.apache.org>
> *Subject: *Re: Different data size between datacenters
>
>
>
> And when you say the data size is smaller, you mean per node? Or sum of
> all nodes in the datacenter?
>
>
>
> With 185 hosts in AWS vs 135 in your DC, I would expect your DC hosts to
> have 30% less data per host than AWS.
>
>
>
> If instead they have twice as much, it sounds like it's balancing by # of
> tokens instead, which may be an indication that you're somehow using
> SimpleStrategy, or your NetworkTopologyStrategy is somehow misconfigured
> for one or more keyspaces.
>
>
>
> Can you paste your keyspace replication strategy lines, anonymized as
> needed?
>
>
>
>
>
> On Mon, Aug 7, 2017 at 1:46 PM, Chuck Reynolds <creyno...@ancestry.com>
> wrote:
>
> Yes to the NetworkTopologyStrategy.
>
>
>
> *From: *Jeff Jirsa <jji...@gmail.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Monday, August 7, 2017 at 2:39 PM
> *To: *cassandra <user@cassandra.apache.org>
> *Subject: *Re: Different data size between datacenters
>
>
>
> You're using NetworkTopologyStrategy and not SimpleStrategy, correct?
>
>
>
>
>
> On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds <creyno...@ancestry.com>
> wrote:
>
> I have a cluster that spans two datacenters running Cassandra 2.1.12.  135
> nodes in my data center and about 185 in AWS.
>
>
>
> The size of the second data center (AWS) is quite a bit smaller.
> Replication is the same in both datacenters.  Is there a logical
> explanation for this?
>
>
>
> thanks
>
>
>
>
>
