Re: Weird replace_address issue in 1.2

2015-10-30 Thread Carlos Alonso
Well, everything turned out fine.

Once the streams finished, the node joined the ring cleanly and the other
nodes removed the old one. :)

Thanks!

Carlos Alonso | Software Engineer | @calonso 

On 29 October 2015 at 21:45, Robert Coli wrote:

> On Thu, Oct 29, 2015 at 6:41 AM, Carlos Alonso wrote:
>
>> Tonight one of the nodes in our 3-node cluster died. This is Cassandra
>> 1.2 with RF = 3.
>>
>> After bringing up a new node and starting it with -Dreplace_address set
>> to the dead node's address, several odd things are happening.
>>
>> On the new node, nodetool status shows the expected ring (itself along
>> with the other two working nodes), but its status is UN, when I would
>> expect UJ because it is still joining, right?
>>
>
> I predict that this node has itself in its own seed list, and therefore
> cannot bootstrap. I have no idea what is going on with the streams you
> report.
>
> What version of 1.2.x?
>
> As an aside, you are likely to get faster help debugging such issues if
> you join the #cassandra irc channel on freenode.
>
> =Rob
>
>


Weird replace_address issue in 1.2

2015-10-29 Thread Carlos Alonso
Hi guys.

Tonight one of the nodes in our 3-node cluster died. This is Cassandra 1.2
with RF = 3.

After bringing up a new node and starting it with -Dreplace_address set to
the dead node's address, several odd things are happening.

On the new node, nodetool status shows the expected ring (itself along with
the other two working nodes), but its status is UN, when I would expect UJ
because it is still joining, right?
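
For reference, the two letters that nodetool status prints in front of each
node are the status (Up/Down) followed by the state
(Normal/Leaving/Joining/Moving). A minimal Python sketch of that decoding
follows; decode_status is a hypothetical helper written for illustration and
is not part of nodetool:

```python
# Decode the two-letter code shown in the first column of `nodetool status`.
# First letter: status. Second letter: state.
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def decode_status(code):
    """Return (status, state) for a two-letter code such as 'UN' or 'UJ'.

    Hypothetical helper for illustration only.
    """
    if len(code) != 2 or code[0] not in STATUS or code[1] not in STATE:
        raise ValueError("unrecognised nodetool status code: %r" % (code,))
    return STATUS[code[0]], STATE[code[1]]


if __name__ == "__main__":
    # The three codes mentioned in this thread.
    for code in ("UN", "UJ", "DN"):
        print(code, decode_status(code))
```

So UN means "up and already a normal ring member", while UJ would mean "up
and still joining".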

Also on this node, the logs show nothing about received streams, but
streams are being received: the used disk space is growing and nodetool
netstats shows progress.

On this same node, these log messages keep appearing:

 INFO [GossipStage:1] 2015-10-29 13:23:10,719 Gossiper.java (line 843) Node / is now part of the cluster
 INFO [GossipStage:1] 2015-10-29 13:23:10,721 Gossiper.java (line 809) InetAddress / is now UP
 WARN [GossipStage:1] 2015-10-29 13:23:10,723 StorageService.java (line 1469) Not updating token metadata for / because I am replacing it
 INFO [GossipStage:1] 2015-10-29 13:23:10,723 StorageService.java (line 1567) Nodes / and / have the same token 115915760983105627952720478187817787338.  Ignoring /
 INFO [GossipTasks:1] 2015-10-29 13:23:41,350 Gossiper.java (line 622) FatClient / has been silent for 3ms, removing from gossip

Switching to the old working nodes, nodetool status shows the old ring,
with the failed node as DN, but the new node doesn't appear on either of
them.

Streams are flowing from one of them to the newcomer, and both netstats and
the logs show it.

Gossipinfo on the new node shows the two working nodes as normal and itself
as hibernating. The other nodes show the same, but they also show the dead
node as normal.

Is this disagreement between nodetool status and gossipinfo normal, and are
the newcomer's log messages expected? Has anyone seen this before?

Regards

Carlos Alonso | Software Engineer | @calonso 


Re: Weird replace_address issue in 1.2

2015-10-29 Thread Robert Coli
On Thu, Oct 29, 2015 at 6:41 AM, Carlos Alonso wrote:

> Tonight one of the nodes in our 3-node cluster died. This is Cassandra 1.2
> with RF = 3.
>
> After bringing up a new node and starting it with -Dreplace_address set to
> the dead node's address, several odd things are happening.
>
> On the new node, nodetool status shows the expected ring (itself along
> with the other two working nodes), but its status is UN, when I would
> expect UJ because it is still joining, right?
>

I predict that this node has itself in its own seed list, and therefore
cannot bootstrap. I have no idea what is going on with the streams you
report.
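
One way to check the seed-list theory is to compare the node's own address
with the seeds value under seed_provider in its cassandra.yaml. A rough
Python sketch, assuming the seeds string has already been copied out of the
file by hand; is_own_seed and the addresses are made up for illustration:

```python
def is_own_seed(seeds, own_address):
    """Return True if own_address appears in the comma-separated seeds string.

    `seeds` is the value of the `seeds` parameter as written in
    cassandra.yaml, e.g. "10.0.0.1,10.0.0.2". Hypothetical helper, not part
    of Cassandra.
    """
    entries = [entry.strip() for entry in seeds.split(",") if entry.strip()]
    return own_address in entries


if __name__ == "__main__":
    # Made-up addresses: the replacement node is 10.0.0.3.
    print(is_own_seed("10.0.0.1, 10.0.0.2", "10.0.0.3"))
    print(is_own_seed("10.0.0.1,10.0.0.3", "10.0.0.3"))
```

If this returns True for the replacement node, removing its own address from
the seeds value before starting it is the usual fix.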

What version of 1.2.x?

As an aside, you are likely to get faster help debugging such issues if you
join the #cassandra irc channel on freenode.

=Rob