In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-10 Thread Carlos A
Hello all,

I have a small dev environment with 4 machines. I removed one of them (.33)
from the cluster because I wanted to upgrade its HD to an SSD. I then
reinstalled it and tried to rejoin. It has been in UJ status for a week now
with no change.

I tried a node repair etc., but nothing changed.

nodetool status output

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  192.168.1.30  16.13 MB   256     ?     0e524b1c-b254-45d0-98ee-63b8f34a8531  RAC1
UN  192.168.1.31  20.12 MB   256     ?     1f8000f5-026c-42c7-8189-cf19fbede566  RAC1
UN  192.168.1.32  17.73 MB   256     ?     7b06f9e9-7c41-4364-ab18-f6976fd359e4  RAC1
UJ  192.168.1.33  877.6 KB   256     ?     7a1507b5-198e-4a3a-a9fd-7af9e588fde2  RAC1

Note: Non-system keyspaces don't have the same replication settings,
effective ownership information is meaningless
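
A rough Python sketch of how output like the above could be checked for nodes
that are still joining. It assumes each node row sits on one line, as nodetool
prints it. The names parse_status and joining_nodes are made up for
illustration, and the sample text is copied from this message.

```python
import re
from typing import List, NamedTuple


class NodeStatus(NamedTuple):
    status: str   # 'U' (up) or 'D' (down)
    state: str    # 'N' normal, 'L' leaving, 'J' joining, 'M' moving
    address: str
    host_id: str


# A node row starts with the two-letter status/state code, then the address,
# and contains a UUID-shaped host ID somewhere later on the line.
ROW = re.compile(
    r"^([UD])([NLJM])\s+(\S+)\s+.*?"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)

SAMPLE_OUTPUT = """\
Datacenter: DC1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  192.168.1.30  16.13 MB   256     ?     0e524b1c-b254-45d0-98ee-63b8f34a8531  RAC1
UN  192.168.1.31  20.12 MB   256     ?     1f8000f5-026c-42c7-8189-cf19fbede566  RAC1
UN  192.168.1.32  17.73 MB   256     ?     7b06f9e9-7c41-4364-ab18-f6976fd359e4  RAC1
UJ  192.168.1.33  877.6 KB   256     ?     7a1507b5-198e-4a3a-a9fd-7af9e588fde2  RAC1
"""


def parse_status(text: str) -> List[NodeStatus]:
    """Extract one NodeStatus per node row; header lines are skipped."""
    nodes = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            nodes.append(NodeStatus(*match.groups()))
    return nodes


def joining_nodes(text: str) -> List[str]:
    """Return the addresses of all nodes whose state is J (joining)."""
    return [node.address for node in parse_status(text) if node.state == "J"]


print(joining_nodes(SAMPLE_OUTPUT))  # ['192.168.1.33']
```

Running such a check periodically would show whether the node ever leaves the
J state; it does not explain why it is stuck.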

Any tips on fixing this?

Thanks,

C.


Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-12 Thread DuyHai Doan
What is your Cassandra version? Earlier versions had some streaming issues
that could leave the joining process stuck.



Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-12 Thread DuyHai Doan
Oh, sorry, I did not notice the version in the title. Did you check
system.log for any exception related to data streaming? What is the output
of "nodetool tpstats"?
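
The log check could be done with a quick filter like the Python sketch below.
The keyword heuristics and the function names are assumptions made for
illustration, not something confirmed in this thread; the exact log wording
differs between Cassandra versions.

```python
import re

# Assumed heuristics: a line is interesting if it mentions streaming and is
# either logged at WARN/ERROR level or contains an exception class name.
STREAM = re.compile(r"stream", re.IGNORECASE)
PROBLEM = re.compile(r"\b(WARN|ERROR)\b|Exception")


def streaming_problems(lines):
    """Return the log lines that look like streaming-related warnings or errors."""
    return [
        line.rstrip("\n")
        for line in lines
        if STREAM.search(line) and PROBLEM.search(line)
    ]


def scan_log(path):
    """Read a log file and return its suspicious lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return streaming_problems(handle)
```

For example, scan_log("/var/log/cassandra/system.log") on a package install;
that path is an assumption, so adjust it to where your system.log lives.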



Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-16 Thread Carlos Fernando Scheidecker Antunes
DuyHai,

Nothing wrong in the logs either.



> nodetool tpstats
> Pool Name                   Active   Pending   Completed   Blocked   All time blocked
> MutationStage                    0         0       11464         0                  0
> ViewMutationStage                0         0           0         0                  0
> ReadStage                        0         0           0         0                  0
> RequestResponseStage             0         0          10         0                  0
> ReadRepairStage                  0         0           0         0                  0
> CounterMutationStage             0         0           0         0                  0
> MiscStage                        0         0           0         0                  0
> CompactionExecutor               0         0         683         0                  0
> MemtableReclaimMemory            0         0         357         0                  0
> PendingRangeCalculator           0         0           5         0                  0
> GossipStage                      0         0     2682208         0                  0
> SecondaryIndexManagement         0         0           0         0                  0
> HintsDispatcher                  0         0           0         0                  0
> MigrationStage                   0         0           0         0                  0
> MemtablePostFlush                0         0         375         0                  0
> ValidationExecutor               0         0           0         0                  0
> Sampler                          0         0           0         0                  0
> MemtableFlushWriter              0         0         357         0                  0
> InternalResponseStage            0         0           0         0                  0
> AntiEntropyStage                 0         0           0         0                  0
> CacheCleanupExecutor             0         0           0         0                  0
>
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                       0
> HINT                         0
> MUTATION                     0
> COUNTER_MUTATION             0
> BATCH_STORE                  0
> BATCH_REMOVE                 0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                  0
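
A minimal Python sketch for reading tpstats output like the above and listing
any thread pool with active, pending or blocked tasks, plus any dropped
message types. The function names are hypothetical, the sample rows are taken
from this message, and the parser assumes one pool per line with five numeric
columns.

```python
def parse_tpstats(text):
    """Split tpstats output into (pools, dropped) dictionaries."""
    pools, dropped = {}, {}
    for raw in text.splitlines():
        # Tolerate mail-style "> " quoting in front of each line.
        parts = raw.lstrip("> ").split()
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            name = parts[0]
            active, pending, completed, blocked, all_blocked = map(int, parts[1:])
            pools[name] = {
                "active": active,
                "pending": pending,
                "completed": completed,
                "blocked": blocked,
                "all_time_blocked": all_blocked,
            }
        elif len(parts) == 2 and parts[1].isdigit():
            dropped[parts[0]] = int(parts[1])
    return pools, dropped


def busy_pools(pools):
    """Names of pools that currently have active, pending or blocked tasks."""
    return [
        name
        for name, stats in pools.items()
        if stats["active"] or stats["pending"] or stats["blocked"]
    ]


def dropped_types(dropped):
    """Message types with a non-zero dropped count."""
    return [name for name, count in dropped.items() if count > 0]


SAMPLE_TPSTATS = """\
Pool Name                   Active   Pending   Completed   Blocked   All time blocked
MutationStage                    0         0       11464         0                  0
GossipStage                      0         0     2682208         0                  0

Message type           Dropped
READ                         0
MUTATION                     0
"""

pools, dropped = parse_tpstats(SAMPLE_TPSTATS)
print(busy_pools(pools), dropped_types(dropped))  # [] []
```

For the numbers posted here both lists come back empty, which matches the
impression that the node is sitting idle rather than being overloaded.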





Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-17 Thread Kai Wang
Carlos,

So you essentially replaced the .33 node. Did you follow this procedure?
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
The link is for 2.x; I'm not sure about 3.x. What if you change the new
node's address to .34?





Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

2016-01-17 Thread daemeon reiydelle
What do the logs say on the seed node (and on the UJ node)?

Look for timeout messages.

I have seen this problem when there was high network utilization between the
seed and the joining node, and also when there were routing issues.
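
A simple way to rule out basic routing or firewall problems is to test whether
each node's inter-node port accepts TCP connections from the joining node.
This is only a sketch: it assumes the default storage port 7000 (check
storage_port in cassandra.yaml), the helper names are invented, and a
successful connect says nothing about bandwidth or latency.

```python
import socket


def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def unreachable(hosts, port=7000, timeout=3.0):
    """Return the subset of hosts that could not be reached on the given port."""
    return [host for host in hosts if not can_connect(host, port, timeout)]
```

Run from the .33 machine, unreachable(["192.168.1.30", "192.168.1.31",
"192.168.1.32"]) would list any peer it cannot reach on that port.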



