Hi Luke,

You mentioned that the replication factor was increased from 1 to 2. In
that case, was the node with IP 10.128.0.20 carrying around 3 GB of data
earlier?

You can run nodetool repair with the -local option to initiate a repair
restricted to the local datacenter, gce-us-central1.
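
For example, assuming the affected keyspace is called my_keyspace (a
placeholder; substitute your actual keyspace name), the command would look
roughly like this:

    nodetool repair -local my_keyspace

The -local flag limits the repair to replicas in the datacenter of the
node you run it on.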

Also, you may suspect that if a lot of data was deleted while the node
was down, it may have a lot of tombstones that do not need to be
replicated to the other node. To verify this, you can connect to either
10.128.0.3 or 10.128.0.20, issue a select count(*) query on the column
families (with the amount of data you have it should not be an issue)
with tracing on and consistency ALL, and store the output in a file. That
will give you a fair idea of how many deleted cells the nodes have. I
tried searching for a reference on whether tombstones are streamed during
repair, but I didn't find evidence of it. However, I see no reason why
they would be, because if the node didn't have the data then streaming
tombstones does not make a lot of sense.
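
As a rough sketch, assuming a keyspace named my_keyspace and a column
family named my_table (both placeholders; substitute your own names), it
could look something like this in cqlsh:

    cqlsh 10.128.0.3
    cqlsh> CONSISTENCY ALL;
    cqlsh> TRACING ON;
    cqlsh> SELECT count(*) FROM my_keyspace.my_table;

In the trace output, look for the entries that report live and tombstone
cell counts (something along the lines of "Read N live rows and M
tombstone cells"). To capture everything in a file, you could run the same
statements non-interactively and redirect the output, for example:

    cqlsh 10.128.0.3 -e "CONSISTENCY ALL; TRACING ON; SELECT count(*) FROM my_keyspace.my_table;" > count_trace.txt

I haven't verified the -e form with tracing on your version, so treat it
as a starting point.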

Regards,
Bhuvan

On Tue, May 24, 2016 at 11:06 PM, Luke Jolly <l...@getadmiral.com> wrote:

> Here's my setup:
>
> Datacenter: gce-us-central1
> ===========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.128.0.3   6.4 GB     256     100.0%            3317a3de-9113-48e2-9a85-bbf756d7a4a6  default
> UN  10.128.0.20  943.08 MB  256     100.0%            958348cb-8205-4630-8b96-0951bf33f3d3  default
> Datacenter: gce-us-east1
> ========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.142.0.14  6.4 GB     256     100.0%            c3a5c39d-e1c9-4116-903d-b6d1b23fb652  default
> UN  10.142.0.13  5.55 GB    256     100.0%            d0d9c30e-1506-4b95-be64-3dd4d78f0583  default
>
> And my replication settings are:
>
> {'class': 'NetworkTopologyStrategy', 'aws-us-west': '2',
> 'gce-us-central1': '2', 'gce-us-east1': '2'}
>
> As you can see, 10.128.0.20 in the gce-us-central1 DC only has a load of
> 943 MB even though it's supposed to own 100% and should have 6.4 GB. Also,
> 10.142.0.13 seems not to have everything either, as it only has a load of
> 5.55 GB.
>
> On Mon, May 23, 2016 at 7:28 PM, kurt Greaves <k...@instaclustr.com>
> wrote:
>
>> Do you have 1 node in each DC or 2? If you're saying you have 1 node in
>> each DC, then an RF of 2 doesn't make sense. Can you clarify what your
>> setup is?
>>
>> On 23 May 2016 at 19:31, Luke Jolly <l...@getadmiral.com> wrote:
>>
>>> I am running 3.0.5 with 2 nodes in two DCs, gce-us-central1 and
>>> gce-us-east1.  I increased the replication factor of gce-us-central1 from 1
>>> to 2.  Then I ran 'nodetool repair -dc gce-us-central1'.  The "Owns"
>>> for the node switched to 100% as it should, but the Load showed that it
>>> didn't actually sync the data.  I then ran a full 'nodetool repair' and it
>>> still didn't fix it.  This scares me, as I thought 'nodetool repair' was a
>>> way to ensure consistency and that all the nodes were synced, but it doesn't
>>> seem to be.  Outside of that command, I have no idea how I would ensure all
>>> the data was synced or how to get the data correctly synced without
>>> decommissioning the node and re-adding it.
>>>
>>
>>
>>
>> --
>> Kurt Greaves
>> k...@instaclustr.com
>> www.instaclustr.com
>>
>
>
