[jira] [Updated] (CASSANDRA-6622) Streaming session failures during node replace of same address

2014-02-11 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6622:


Attachment: 6622-v2.txt

> Streaming session failures during node replace of same address
> --
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
>Assignee: Brandon Williams
> Fix For: 1.2.16, 2.0.6
>
> Attachments: 0001-don-t-signal-restart-of-dead-states.txt, 
> 6622-2.0.txt, 6622-v2.txt, 6622_logs.tgz, logs.tgz
>
>
> When using replace_address, Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes are seeing streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on peer nodes convicts the replacing node by 
> closing the stream handler.  
> I think, making the StorageService thread on the replacing node, sleep for 
> BROADCAST_INTERVAL before bootstrapping, would avoid this scenario.
> Relevant logs from peer node (see that the Gossiper on peer node mark the 
> replacing node as down, 2 secs after  the streaming init request):
> {noformat}
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6622) Streaming session failures during node replace of same address

2014-02-11 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6622:


Attachment: (was: 6622-v2.txt)

> Streaming session failures during node replace of same address
> --
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
>Assignee: Brandon Williams
> Fix For: 1.2.16, 2.0.6
>
> Attachments: 0001-don-t-signal-restart-of-dead-states.txt, 
> 6622-2.0.txt, 6622-v2.txt, 6622_logs.tgz, logs.tgz
>
>
> When using replace_address, Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes are seeing streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on peer nodes convicts the replacing node by 
> closing the stream handler.  
> I think, making the StorageService thread on the replacing node, sleep for 
> BROADCAST_INTERVAL before bootstrapping, would avoid this scenario.
> Relevant logs from peer node (see that the Gossiper on peer node mark the 
> replacing node as down, 2 secs after  the streaming init request):
> {noformat}
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6622) Streaming session failures during node replace of same address

2014-02-11 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6622:


Attachment: 6622-v2.txt

So the problem isn't any gossip event breaking the stream, it's that streaming 
is subscribed to the FD directly in order to wait  for double the phi, but this 
notifies every second regardless of up/down state and ends up racing the stream 
request before the hibernate state has reached the source node.  We won't be 
able to fix that without CASSANDRA-3569, so we'll add the RING_DELAY sleep.  v2 
conditionally determines sleep length based on whether the IP is the same or 
not.

> Streaming session failures during node replace of same address
> --
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
>Assignee: Brandon Williams
> Fix For: 1.2.16, 2.0.6
>
> Attachments: 0001-don-t-signal-restart-of-dead-states.txt, 
> 6622-2.0.txt, 6622-v2.txt, 6622_logs.tgz, logs.tgz
>
>
> When using replace_address, Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes are seeing streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on peer nodes convicts the replacing node by 
> closing the stream handler.  
> I think, making the StorageService thread on the replacing node, sleep for 
> BROADCAST_INTERVAL before bootstrapping, would avoid this scenario.
> Relevant logs from peer node (see that the Gossiper on peer node mark the 
> replacing node as down, 2 secs after  the streaming init request):
> {noformat}
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6622) Streaming session failures during node replace of same address

2014-02-08 Thread Ravi Prasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prasad updated CASSANDRA-6622:
---

Attachment: 6622_logs.tgz

> Streaming session failures during node replace of same address
> --
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
>Assignee: Brandon Williams
> Attachments: 0001-don-t-signal-restart-of-dead-states.txt, 
> 6622-2.0.txt, 6622_logs.tgz, logs.tgz
>
>
> When using replace_address, Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes are seeing streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on peer nodes convicts the replacing node by 
> closing the stream handler.  
> I think, making the StorageService thread on the replacing node, sleep for 
> BROADCAST_INTERVAL before bootstrapping, would avoid this scenario.
> Relevant logs from peer node (see that the Gossiper on peer node mark the 
> replacing node as down, 2 secs after  the streaming init request):
> {noformat}
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6622) Streaming session failures during node replace of same address

2014-02-06 Thread Ravi Prasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prasad updated CASSANDRA-6622:
---

Summary: Streaming session failures during node replace of same address  
(was: Streaming session failures during node replace using replace_address)

> Streaming session failures during node replace of same address
> --
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
>Assignee: Brandon Williams
> Attachments: 0001-don-t-signal-restart-of-dead-states.txt, 
> 6622-2.0.txt, logs.tgz
>
>
> When using replace_address, Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes are seeing streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on peer nodes convicts the replacing node by 
> closing the stream handler.  
> I think, making the StorageService thread on the replacing node, sleep for 
> BROADCAST_INTERVAL before bootstrapping, would avoid this scenario.
> Relevant logs from peer node (see that the Gossiper on peer node mark the 
> replacing node as down, 2 secs after  the streaming init request):
> {noformat}
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)